Run Inference

Train provides a built-in inference endpoint so you can test a fine-tuned model immediately after training — without merging or deploying first.

When to Use

Quick sanity check right after a training run completes.
Comparing adapter vs base model responses side by side.
Validating the model before committing to a merge.

Steps

1. Go to Train → Runs and open a completed run.
2. Click Test inference.
3. Enter a prompt in the chat interface. The response is generated using the adapter applied to the base model.

You can toggle Base model only to compare the same prompt without the adapter.
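
If you copy both responses out of the chat interface, a plain text diff can make the comparison easier to scan. The sketch below uses Python's standard `difflib` module; the function name `diff_responses` and the two labels are illustrative and not part of Train.

```python
import difflib


def diff_responses(base_text: str, adapter_text: str) -> str:
    """Return a unified diff between the base-model and adapter responses."""
    diff = difflib.unified_diff(
        base_text.splitlines(),
        adapter_text.splitlines(),
        fromfile="base model only",   # illustrative label
        tofile="adapter applied",     # illustrative label
        lineterm="",
    )
    # An empty string means the two responses are identical.
    return "\n".join(diff)


if __name__ == "__main__":
    base = "The capital of France is Paris.\nIt is a large city."
    tuned = "The capital of France is Paris.\nPopulation: about 2 million."
    print(diff_responses(base, tuned))
```

Lines prefixed with `-` appear only in the base-model response, and lines prefixed with `+` appear only in the adapter response.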

API Access

The inference endpoint is also available as an API during the run’s active window. Replace {run-id} with the ID of your run and YOUR_API_KEY with your API key:

curl -X POST https://train.benchgen.ai/inference/{run-id} \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Your test prompt here", "max_tokens": 256}'
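
The same request can be made from a script. The following is a minimal Python sketch that mirrors the curl call above using only the standard library. The URL, the headers, and the `prompt` and `max_tokens` fields come from that example; the function names, the `BENCHGEN_API_KEY` environment variable, and the timeout value are assumptions made for illustration. The response format is not documented here, so the sketch returns the raw response body rather than parsing it.

```python
import json
import os
import urllib.parse
import urllib.request

# URL pattern taken from the curl example; {run_id} is filled in per call.
INFERENCE_URL = "https://train.benchgen.ai/inference/{run_id}"


def build_inference_request(run_id, prompt, api_key, max_tokens=256):
    """Build (but do not send) a POST request equivalent to the curl example."""
    payload = json.dumps({"prompt": prompt, "max_tokens": max_tokens})
    return urllib.request.Request(
        INFERENCE_URL.format(run_id=urllib.parse.quote(run_id, safe="")),
        data=payload.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_inference(run_id, prompt, api_key, max_tokens=256, timeout=60):
    """Send the request and return the raw response body as text.

    The timeout is an arbitrary choice; urlopen raises HTTPError on
    non-2xx responses, which the caller is expected to handle.
    """
    request = build_inference_request(run_id, prompt, api_key, max_tokens)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # BENCHGEN_API_KEY is a hypothetical variable name, not one defined by Train.
    key = os.environ["BENCHGEN_API_KEY"]
    print(run_inference("your-run-id", "Your test prompt here", key))
```

Building the request separately from sending it lets you inspect the URL, headers, and body before making a call against a live run.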

Next Steps

Happy with the results? Merge the adapter for a portable checkpoint.
Want to measure improvement formally? Run a benchmark in Eval.
