Documentation Index

Fetch the complete documentation index at: https://docs.oumi.ai/llms.txt

Use this file to discover all available pages before exploring further.

To run an evaluation against an existing model, go to Evaluations and click Run an Evaluation. In the Builder, select Judge-Based Evaluation. Under the INPUTS tab, provide the following information:
  • Model - A hosted or custom model for evaluation.
  • Evaluators - One or more evaluators to score model outputs.
  • Dataset - The dataset to evaluate against.
  • Failure Mode Analysis (optional) - Whether to generate failure modes automatically.
  • Inference Configurations (optional) - Inference parameters like Temperature, Max Tokens, Seed, Requests Per Minute.
After confirming and launching the evaluation job, you can view the results on the Evaluations page.
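The inputs above can be sketched as a single configuration payload. This is an illustrative sketch only; the field names below are hypothetical and do not reflect the platform's actual request schema:

```python
# Hypothetical sketch of the evaluation inputs listed above.
# All field names are illustrative, not the platform's real schema.
evaluation_request = {
    "model": "my-hosted-model",           # hosted or custom model to evaluate
    "evaluators": ["helpfulness-judge"],  # one or more evaluators that score outputs
    "dataset": "my-eval-dataset",         # dataset to evaluate against
    "failure_mode_analysis": True,        # optional: generate failure modes automatically
    "inference": {                        # optional inference parameters
        "temperature": 0.0,
        "max_tokens": 1024,
        "seed": 42,
        "requests_per_minute": 60,
    },
}

# Model, evaluators, and dataset are the required inputs;
# the rest are optional.
required = {"model", "evaluators", "dataset"}
assert required <= evaluation_request.keys()
```

Grouping the optional inference parameters under their own key mirrors how the UI separates required inputs from the optional Inference Configurations section.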