Run a Benchmark
A benchmark run evaluates a model against a set of test cases and produces a structured results report.
Prerequisites
- A model connected or uploaded in Eval (see Upload a model)
- At least one benchmark dataset available in your workspace
Steps
- Open Eval from the left sidebar.
- Select a benchmark from the benchmark library or upload your own.
- Choose a model to evaluate — pick from connected APIs or uploaded models.
- Configure run settings:
  - Temperature and sampling parameters
  - Max tokens per response
  - Timeout per question
- Click Run to start the evaluation.
- Monitor progress in the run dashboard — results stream in as each test case completes.
- View the report when the run finishes (see Read results).
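The run settings from the configuration step can be kept in a small, validated structure outside the UI so the same values are easy to reuse. The following is a minimal sketch, not part of the product: `RunConfig`, its field names, and its default values are all hypothetical and chosen only for illustration. They do not correspond to a documented Eval API or file format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the run settings listed above (not an Eval API)."""

    temperature: float = 0.0        # sampling temperature
    top_p: float = 1.0              # example of an additional sampling parameter
    max_tokens: int = 512           # max tokens per response
    timeout_seconds: float = 60.0   # timeout per question

    def __post_init__(self) -> None:
        # Reject obviously invalid values before a run is started.
        if self.temperature < 0:
            raise ValueError("temperature must be non-negative")
        if not 0 < self.top_p <= 1:
            raise ValueError("top_p must be in (0, 1]")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        if self.timeout_seconds <= 0:
            raise ValueError("timeout_seconds must be positive")

    def to_json(self) -> str:
        # Serialize with sorted keys so saved configs diff cleanly.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "RunConfig":
        # Rebuild (and re-validate) a config from its JSON form.
        return cls(**json.loads(text))


config = RunConfig(temperature=0.2, max_tokens=256, timeout_seconds=30.0)
restored = RunConfig.from_json(config.to_json())
print(restored == config)
```

Validating in `__post_init__` means a mistyped value fails immediately rather than partway through a run, and the JSON round trip gives a record of the settings that can be entered again for a later evaluation.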
Tips
- Run the same benchmark against multiple models to compare them side by side.
- Start with a small benchmark (10–50 cases) to validate your setup before scaling up.
- Save run configs to re-run the same evaluation after each training iteration.
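To make the side-by-side comparison concrete, here is a rough sketch of how per-case outcomes for several models could be summarized once the results are available as data. Everything in it is an assumption for illustration: the model names, the pass/fail lists, and the `pass_rate` and `compare_models` helpers are made up, and the actual report format in Eval may differ.

```python
from typing import Dict, List, Tuple


def pass_rate(outcomes: List[bool]) -> float:
    """Fraction of test cases that passed; 0.0 for an empty run."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def compare_models(results: Dict[str, List[bool]]) -> List[Tuple[str, float, int]]:
    """Return (model, pass rate, case count) rows, best pass rate first."""
    rows = [(model, pass_rate(outcomes), len(outcomes))
            for model, outcomes in results.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Placeholder outcomes for the same benchmark run against two hypothetical models.
results = {
    "model-a": [True, True, False, True],
    "model-b": [True, False, False, True],
}

for model, rate, count in compare_models(results):
    print(f"{model:<10} {rate:6.1%}  ({count} cases)")
```

A comparison like this is only meaningful when every model is run on the same benchmark with the same run settings, which is one reason to save the configuration and reuse it.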