feat: synthetic data generation cascade down topic tree #277

Open: wants to merge 4 commits into main

leonardmq
Contributor

What does this PR do?

  • Add "Add to all" buttons to the Synthetic data page so the user can generate samples for every leaf subtopic descended from a given topic node; I call this "Cascade mode"
  • Add optional parallelization of synthetic data generation in Cascade mode
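As a rough illustration of what "every leaf subtopic descended from a given topic node" means, here is a minimal sketch. The `TopicNode` shape and the `collectLeafPaths` name are assumptions made for this example and may not match the types used in the PR.

```typescript
// Hypothetical node shape; the real tree type in this PR may differ.
interface TopicNode {
  topic: string;
  sub_topics: TopicNode[];
}

// Return the path (list of topic names, relative to `node`) of every leaf
// below `node`. A node without children is treated as its own leaf, so
// calling this on a leaf yields a single empty path.
function collectLeafPaths(node: TopicNode, prefix: string[] = []): string[][] {
  if (node.sub_topics.length === 0) {
    return [prefix];
  }
  const paths: string[][] = [];
  for (const child of node.sub_topics) {
    for (const path of collectLeafPaths(child, [...prefix, child.topic])) {
      paths.push(path);
    }
  }
  return paths;
}
```

Cascade mode would then run one generation per returned path instead of one generation for the selected node.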

The parallelization is modeled on the "Save all" action, with similar error handling. I tried to keep each worker's errors mapped to its topic so that we can later extend "Try again" to retry only the failed topics.
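The idea of bounded parallelism with errors kept per topic can be sketched roughly as follows. This is not the code from the PR: `generateForLeaves`, `TopicResult`, and the `generate` callback are hypothetical names, and the callback stands in for whatever request actually produces the samples.

```typescript
// One entry per leaf topic, so failures stay attached to the topic that caused them.
type TopicResult =
  | { path: string[]; ok: true }
  | { path: string[]; ok: false; error: string };

// Run `generate` for each leaf path with at most `concurrency` calls in flight.
// A failure in one topic is recorded and does not stop the other topics.
async function generateForLeaves(
  leafPaths: string[][],
  concurrency: number,
  generate: (path: string[]) => Promise<void>, // hypothetical generation call
): Promise<TopicResult[]> {
  const results: TopicResult[] = new Array(leafPaths.length);
  let next = 0; // shared cursor; safe because JavaScript runs one task at a time

  async function worker(): Promise<void> {
    while (next < leafPaths.length) {
      const index = next++;
      const path = leafPaths[index];
      try {
        await generate(path);
        results[index] = { path, ok: true };
      } catch (e) {
        const error = e instanceof Error ? e.message : String(e);
        results[index] = { path, ok: false, error };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, leafPaths.length));
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}
```

Filtering the returned list for entries with `ok: false` gives exactly the topics a later retry would need to run again.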

Related Issues

N/A

Contributor License Agreement

I, @leonardmq, confirm that I have read and agree to the Contributors License Agreement.

Checklists

  • Tests have been run locally and passed
  • New tests have been added to any work in /lib

model_name: string,
model_provider: string,
) {
// Add ignoring dupes and empty strings
Contributor Author

Was there a plan to skip the duplicates?
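For context on the question above, the behavior described by the code comment ("Add ignoring dupes and empty strings") could look roughly like this. The function name `mergeTopics` is hypothetical, and trimming whitespace before comparing is an assumption; the function under review may behave differently.

```typescript
// Append `incoming` topics to `existing`, skipping empty strings and
// anything already present. Order of first appearance is preserved.
function mergeTopics(existing: string[], incoming: string[]): string[] {
  const seen = new Set<string>(existing);
  const merged = existing.slice();
  for (const raw of incoming) {
    const topic = raw.trim(); // assumption: whitespace-only entries count as empty
    if (topic === "" || seen.has(topic)) {
      continue;
    }
    seen.add(topic);
    merged.push(topic);
  }
  return merged;
}
```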

{model}
{num_samples_to_generate}
{custom_topics_string}
on_completed={handleGenerateSamplesCompleted}
Contributor Author

I wonder whether Svelte has a more idiomatic way to control modal visibility (when the modal decides to close itself) than passing in a callback from the parent component, as done here.

@leonardmq
Contributor Author

The "Add data to all" button is added to the root of the tree and to each topic that has one or more child topics:
[screenshot]

The "Add data to all" button is hidden on topics that have no children:
[screenshot]

The "Add data to all" modal is the same as the "Add data" modal, with the text slightly altered to match what it does and with an added parallelization field:
[screenshot]

If the generation succeeds, the modal closes.

If the generation runs into errors, the errors from each worker are collected and displayed at the end (the specific error message is a little clunky and might need rephrasing):
[screenshot]

When a non-cascade (i.e. single-topic) generation fails, the error is shown directly, like this:
[screenshot]
