Creating Custom Checks

If the predefined checks do not serve your needs, you can create your own. Okareo supports two types of custom checks:

  • Code-based checks — Python code with an evaluate method. See Code Checks for the full contract (allowed parameters, return types, allowed imports).
  • Model checks — A prompt template evaluated by a judge LLM. See Model Checks for template variables and prompt structure guidance.

Custom Code Checks

Generating code checks

The Okareo SDK provides a generate_check method that uses an LLM to generate an evaluate method from a natural-language description.

const generated_check = await okareo.generate_check({
    project_id,
    name: "demo.summaryUnder256",
    description: "Pass if model_output contains at least one line of natural language.",
    output_data_type: "bool",
    requires_scenario_input: true,
    requires_scenario_result: true,
});

return await okareo.upload_check({
    project_id,
    ...generated_check
} as UploadEvaluatorProps);

note

Ensure that requires_scenario_input and requires_scenario_result are correctly configured for your check. If your check relies on the scenario_input, set requires_scenario_input to true; likewise, set requires_scenario_result to true if it relies on the scenario_result.

Uploading code checks

Given generated (or hand-written) check code, the Okareo SDK provides the upload_check method to register it.

const upload_check: any = await okareo.upload_check({
    name: 'Example Uploaded Check',
    project_id,
    description: "Pass if the model result length is within 10% of the expected result.",
    requires_scenario_input: false,
    requires_scenario_result: true,
    output_data_type: "bool",
    file_path: "tests/example_eval.py",
    update: true
});

note

Your evaluate function must be saved locally in a .py file, and file_path should point to that file (a sketch of such a file follows below).

For the full code check contract — allowed parameters, return types, allowed imports, and restrictions — see Code Checks.
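
For illustration, the file referenced above (tests/example_eval.py) could look something like the sketch below. This is a minimal example, assuming evaluate receives the model output and the expected scenario result as strings and returns a bool; confirm the exact parameter names, return types, and allowed imports against the Code Checks contract.

# Minimal sketch of tests/example_eval.py -- the evaluate signature shown here is an assumption;
# see Code Checks for the exact contract.
def evaluate(model_output: str, scenario_result: str) -> bool:
    # Pass if the model output's length is within 10% of the expected result's length.
    expected_len = len(scenario_result)
    if expected_len == 0:
        return len(model_output) == 0
    return abs(len(model_output) - expected_len) <= 0.1 * expected_len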


Custom Model Checks

You can also create custom model checks by providing a prompt template. The prompt template is a set of instructions for a judge LLM that evaluates model output at runtime.

Creating a model check via the SDK

const check = await okareo.create_or_update_check({
    name: "custom_coherence_check",
    description: "Rate the coherence of the model output on a 1-5 scale",
    check: {
        type: "model",
        prompt_template: `You will be given a Model Output.
Rate the output on one metric: Coherence (1-5).

Evaluation Criteria:
Coherence (1-5) - how well-structured and logically organized the output is.

Evaluation Steps:
1. Read the Model Output carefully.
2. Assess whether ideas flow logically and are well-organized.
3. Assign a Coherence score from 1 to 5.

Model Output:
{generation}

Evaluation Form (scores ONLY, one number):
- Coherence (1-5):`,
        check_type: CheckOutputType.SCORE,
    },
});

For the full template variables list, prompt structure guidance, and output format details, see Model Checks.


Running custom checks

Once a custom check has been created (code-based or model-based), you can use it in an evaluation by adding the check's name or ID to your list of checks:

// provide a list of checks by name or ID
const eval_results: any = await model.run_test({
    model_api_key: OPENAI_API_KEY,
    name: 'Evaluation Name',
    tags: ["Example", `Build:${UNIQUE_BUILD_ID}`],
    project_id,
    scenario_id,
    calculate_metrics: true,
    type: TestRunType.NL_GENERATION,
    checks: [
        "check_name_1",
        "check_name_2",
        // ...additional check names or IDs
    ],
} as RunTestProps);