Run Inference

Train provides a built-in inference endpoint so you can test a fine-tuned model immediately after training, without merging or deploying first.

When to Use

  • Quick sanity check right after a training run completes.
  • Comparing adapter vs base model responses side by side.
  • Validating the model before committing to a merge.

Steps

  1. Go to Train → Runs and open a completed run.
  2. Click Test inference.
  3. Enter a prompt in the chat interface.
  4. Review the response, which is generated with the adapter applied to the base model.

To compare the same prompt without the adapter, toggle Base model only.

API Access

The inference endpoint is also available as an API during the run’s active window:
curl -X POST https://train.benchgen.ai/inference/{run-id} \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Your test prompt here", "max_tokens": 256}'
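
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the URL, headers, and JSON fields are copied from the curl example above, while the function names `build_inference_request` and `run_inference`, the `timeout` parameter, and the assumption that the endpoint returns a JSON body are hypothetical.

```python
import json
import urllib.request

# Base URL copied from the curl example above.
BASE_URL = "https://train.benchgen.ai/inference"


def build_inference_request(run_id, api_key, prompt, max_tokens=256):
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{run_id}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def run_inference(run_id, api_key, prompt, max_tokens=256, timeout=60):
    """Send the request and return the parsed response.

    Assumption: the endpoint responds with JSON. The response schema is not
    documented on this page, so the parsed object is returned unchanged.
    """
    request = build_inference_request(run_id, api_key, prompt, max_tokens)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Call `run_inference` with your own run ID and API key while the run's active window is open. Keeping request construction separate from sending lets you inspect the URL, headers, and payload before making a network call.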

Next Steps