Upload a Dataset

Train accepts labeled datasets in JSONL or CSV format. You can upload a dataset manually or import it directly from an Eval export.

Supported Formats

Format | Use case
JSONL (instruction-response pairs) | General fine-tuning
CSV (prompt, completion columns) | Simple tabular data
Eval export | Imported automatically from an Eval benchmark run
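
As a rough illustration of the two file formats, the sketch below writes the same examples as JSONL and as CSV using only the Python standard library. The JSONL key names `instruction` and `response` are assumptions inferred from the "instruction-response pairs" description, not confirmed field names; the CSV column names `prompt` and `completion` come from the table above.

```python
import csv
import io
import json

# Hypothetical examples; the key names "instruction" and "response" are assumed.
examples = [
    {"instruction": "Translate 'hello' to French.", "response": "bonjour"},
    {"instruction": "What is 2 + 2?", "response": "4"},
]


def to_jsonl(rows):
    """Serialize rows as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(row, ensure_ascii=False) for row in rows) + "\n"


def to_csv(rows):
    """Serialize rows as CSV with 'prompt' and 'completion' columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["prompt", "completion"], lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        writer.writerow({"prompt": row["instruction"], "completion": row["response"]})
    return buffer.getvalue()


jsonl_text = to_jsonl(examples)
csv_text = to_csv(examples)
```

Writing `jsonl_text` or `csv_text` to a file gives something that could be selected in the upload dialog, provided the assumed key names match what Train expects.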

Upload Steps

  1. Go to Train → Datasets → Upload.
  2. Select your file or drag and drop.
  3. Choose the dataset format.
  4. Preview the parsed examples — confirm the prompt/response split looks correct.
  5. Name the dataset and save.
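
Before uploading, it may help to run a local check that mirrors the preview in step 4. The following is a minimal sketch, not part of Train itself: `preview_jsonl` is a hypothetical helper, and the default key names `instruction` and `response` are assumptions about the JSONL layout.

```python
import json


def preview_jsonl(text, limit=3, prompt_key="instruction", response_key="response"):
    """Return the first `limit` (prompt, response) pairs, raising on malformed lines."""
    pairs = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)  # raises json.JSONDecodeError on invalid JSON
        missing = [key for key in (prompt_key, response_key) if key not in record]
        if missing:
            raise ValueError(f"line {line_number}: missing keys {missing}")
        pairs.append((record[prompt_key], record[response_key]))
        if len(pairs) >= limit:
            break
    return pairs


# Hypothetical two-line JSONL sample.
sample = (
    '{"instruction": "Say hi", "response": "hi"}\n'
    '{"instruction": "Say bye", "response": "bye"}\n'
)
for prompt, response in preview_jsonl(sample):
    print(f"PROMPT: {prompt!r} -> RESPONSE: {response!r}")
```

A line that is missing either key raises a `ValueError` naming the line number, which makes a bad prompt/response split easier to find before the file reaches the upload preview.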

Dataset Quality Tips

  • Minimum size: 50 examples for a meaningful fine-tune; 200+ for reliable results.
  • Diversity: include varied phrasings of the same task rather than near-identical copies.
  • Clean labels: incorrect or inconsistent responses in the training set directly hurt output quality.
  • Balance: if the task has multiple subtypes, spread examples roughly evenly across them.
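
The tips above can be turned into a rough local check. The sketch below is an illustration only, not a Train feature: it assumes rows with `prompt` and `completion` fields, plus a hypothetical `subtype` field used to measure balance. The size thresholds are the 50 and 200 figures from the list.

```python
from collections import Counter

MIN_EXAMPLES = 50        # "meaningful fine-tune" threshold from the tips above
RELIABLE_EXAMPLES = 200  # "reliable results" threshold from the tips above


def quality_report(rows, subtype_key="subtype"):
    """Summarize size, duplicate prompts, empty responses, and subtype balance."""
    prompts = [row["prompt"].strip().lower() for row in rows]
    duplicates = len(prompts) - len(set(prompts))  # exact repeats after normalizing
    empty_responses = sum(1 for row in rows if not row["completion"].strip())
    subtype_counts = Counter(row.get(subtype_key, "unlabeled") for row in rows)

    if len(rows) >= RELIABLE_EXAMPLES:
        size = "reliable"
    elif len(rows) >= MIN_EXAMPLES:
        size = "minimal"
    else:
        size = "too small"

    return {
        "size": size,
        "duplicates": duplicates,
        "empty_responses": empty_responses,
        "subtype_counts": dict(subtype_counts),
    }


# Hypothetical dataset: 60 examples across two subtypes, plus one exact copy.
rows = [
    {
        "prompt": f"Summarize ticket {i}",
        "completion": "ok",
        "subtype": "billing" if i % 2 else "shipping",
    }
    for i in range(60)
]
rows.append(dict(rows[0]))
report = quality_report(rows)
```

This only catches exact duplicate prompts and empty responses. Checking that labels are correct and consistent still needs a manual review of the examples.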

Importing from Eval

If you’ve already exported a dataset from a benchmark run, it appears automatically in Train → Datasets — no upload needed. See Export datasets → Train.
