okareo
BaseGenerationSchema Objects
class BaseGenerationSchema(PydanticBaseModel)
A base schema class for specifying structured outputs to synthetic data generators.
Okareo Objects
class Okareo()
A class for interacting with Okareo API and for formatting request data.
seed_data_from_list
@staticmethod
def seed_data_from_list(data_list: List[SeedDataRow]) -> List[SeedData]
Create a list of SeedData objects from a list of dictionaries.
Each dictionary in the input list must have 'input' and 'result' keys.
Arguments:
- data_list (List[SeedDataRow]): A list of dictionaries, where each dictionary contains 'input' and 'result' keys.
Returns:
List[SeedData]: A list of SeedData objects created from the input dictionaries.
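Because every row must carry both 'input' and 'result', it can help to check the rows before calling seed_data_from_list. The following is a minimal sketch using only the standard library; check_seed_rows and REQUIRED_KEYS are hypothetical names introduced here for illustration and are not part of the Okareo SDK.

```python
from typing import Any, Dict, List

# Keys that the documentation above says each seed row must contain.
REQUIRED_KEYS = ("input", "result")


def check_seed_rows(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return rows unchanged, or raise ValueError naming the first bad row."""
    for index, row in enumerate(rows):
        missing = [key for key in REQUIRED_KEYS if key not in row]
        if missing:
            raise ValueError(f"row {index} is missing keys: {missing}")
    return rows


rows = check_seed_rows([
    {"input": {"animal": "fish", "color": "red"}, "result": "red"},
    {"input": {"animal": "dog", "color": "blue"}, "result": "blue"},
])
# The validated rows could then be passed to Okareo.seed_data_from_list(rows).
```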
get_projects
def get_projects() -> List[ProjectResponse]
Get a list of all Okareo projects available to the user.
Returns:
List[ProjectResponse]: A list of ProjectResponse objects accessible to the user.
Raises:
- TypeError: If the API response is an error.
- ValueError: If no response is received from the API.
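A common follow-up to get_projects is picking one project out of the returned list. The sketch below assumes each ProjectResponse exposes name and id attributes (an assumption, not confirmed by this reference) and uses SimpleNamespace stand-ins so it runs without an API key; find_project_by_name is a hypothetical helper.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Optional


def find_project_by_name(projects: Iterable[Any], name: str) -> Optional[Any]:
    """Return the first project whose .name equals name, or None if absent."""
    for project in projects:
        if getattr(project, "name", None) == name:
            return project
    return None


# Stand-in objects for illustration; in real use, pass okareo_client.get_projects().
projects = [
    SimpleNamespace(id="p-1", name="Global"),
    SimpleNamespace(id="p-2", name="Voice Agents"),
]
match = find_project_by_name(projects, "Voice Agents")
```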
create_project
def create_project(name: str,
tags: Union[Unset, List[str]] = UNSET) -> ProjectResponse
Create a new Okareo project.
Arguments:
- name (str): The name of the new project.
- tags (Union[Unset, List[str]], optional): Optional list of tags to associate with the project.
Returns:
ProjectResponse: The created ProjectResponse object.
Raises:
- TypeError: If the API response is an error.
- ValueError: If no response is received from the API.
register_model
def register_model(
name: str,
tags: Union[List[str], None] = None,
project_id: Union[str, UUID, None] = None,
model: Union[None, BaseModel, List[BaseModel]] = None,
update: bool = False,
sensitive_fields: Union[List[str], None] = None) -> ModelUnderTest
Register a new Model Under Test (MUT) to use in an Okareo evaluation.
Arguments:
- name (str): The name of the model. Model names must be unique within a project. Using the same name will return or update the existing model.
- tags (Union[List[str], None], optional): Optional list of tags to associate with the model.
- project_id (Union[str, UUID, None], optional): The project ID to associate the model with.
- model (Union[None, BaseModel, List[BaseModel]], optional): The model or list of models to register.
- update (bool, optional): Whether to update an existing model with the same name. Defaults to False.
- sensitive_fields (Union[List[str], None], optional): A list of sensitive fields to mask in the model parameters. Defaults to None.
Returns:
ModelUnderTest: The registered ModelUnderTest object.
Raises:
- TypeError: If the API response is an error.
- ValueError: If no response is received from the API.
get_model
def get_model(name: str, version: str | int = "latest") -> ModelUnderTest
Fetch a model under test based on the name and version.
Arguments:
- name (str): The name of the model to fetch.
- version (str | int, optional): The version of the model to fetch. Defaults to "latest".
create_scenario_set
def create_scenario_set(
create_request: ScenarioSetCreate) -> ScenarioSetResponse
Create a new scenario set to use in an Okareo evaluation or as a seed for synthetic data generation.
Arguments:
- create_request (ScenarioSetCreate): The request object containing scenario set details and seed data, such as name and seed_data.
Returns:
ScenarioSetResponse: The created ScenarioSetResponse object.
Raises:
- ValueError: If the seed data is empty or if no response is received from the API.
- TypeError: If the API response is an error.
Example:
from okareo_api_client.models import ScenarioSetCreate
seed_data = okareo_client.seed_data_from_list([
    {"input": {"animal": "fish", "color": "red"}, "result": "red"},
    {"input": {"animal": "dog", "color": "blue"}, "result": "blue"},
    {"input": {"animal": "cat", "color": "green"}, "result": "green"},
])
create_request = ScenarioSetCreate(name="My Scenario Set", seed_data=seed_data)
okareo_client.create_scenario_set(create_request)
upload_scenario_set
def upload_scenario_set(
scenario_name: str,
file_path: str,
project_id: Union[Unset, str, UUID] = UNSET) -> ScenarioSetResponse
Upload a file as a scenario set to use in an Okareo evaluation or as a seed for synthetic data generation.
Arguments:
- scenario_name (str): The name to assign to the uploaded scenario set.
- file_path (str): The path to the file to upload.
- project_id (Union[Unset, str, UUID], optional): The project ID to associate with the scenario set.
Returns:
ScenarioSetResponse: The created ScenarioSetResponse object.
Raises:
- UnexpectedStatus: If the API returns an unexpected status.
- TypeError: If the API response is an error.
- ValueError: If no response is received from the API.
Example:
project_id = "your_project_id"  # Optional, can be None
okareo_client.upload_scenario_set(
    scenario_name="My Uploaded Scenario Set",
    file_path="/path/to/scenario_set_file.json",
    project_id=project_id or None,
)
download_scenario_set
def download_scenario_set(scenario: Union[ScenarioSetResponse, str],
file_path: str = "") -> Any
Download a scenario set from Okareo to the client's local filesystem.
Arguments:
- scenario (Union[ScenarioSetResponse, str]): The scenario set to download.
- file_path (str, optional): The path where the file will be saved. If not provided, the scenario set name is used.
Returns:
File: The downloaded file object.
Example:
# scenario_set is a ScenarioSetResponse, e.g. the return value of create_scenario_set
response_file = okareo_client.download_scenario_set(scenario_set)
with open(response_file.name) as scenario_file:
    for line in scenario_file:
        print(line)
generate_scenarios
def generate_scenarios(
source_scenario: Union[str, UUID, ScenarioSetResponse],
name: str,
number_examples: int,
project_id: Union[Unset, str, UUID] = UNSET,
generation_type: Union[Unset,
ScenarioType] = ScenarioType.REPHRASE_INVARIANT
) -> ScenarioSetResponse
Generate a synthetic scenario set based on an existing seed scenario.
Arguments:
- source_scenario (Union[str, UUID, ScenarioSetResponse]): The source scenario set or its ID to generate from.
- name (str): The name for the new generated scenario set.
- number_examples (int): The number of synthetic examples to generate per seed scenario row.
- project_id (Union[Unset, str, UUID], optional): The project ID to associate with the generated scenario set.
- generation_type (Union[Unset, ScenarioType], optional): The type of scenario generation to use. Defaults to ScenarioType.REPHRASE_INVARIANT.
Returns:
ScenarioSetResponse: The generated synthetic scenario set.
Raises:
- TypeError: If the API response is an error.
- ValueError: If no response is received from the API.
Example:
source_scenario = "source_scenario_id"  # or ScenarioSetResponse object
generated_set = okareo_client.generate_scenarios(
    source_scenario=source_scenario,
    name="Generated Scenario Set",
    number_examples=100,
    project_id="your_project_id",
    generation_type=ScenarioType.REPHRASE_INVARIANT,
)
print(generated_set.app_link)  # Prints the link to the generated scenario set
generate_scenario_set
def generate_scenario_set(
create_request: ScenarioSetGenerate) -> ScenarioSetResponse
Generate a synthetic scenario set based on an existing seed scenario and a ScenarioSetGenerate object. Offers more controls than the comparable generate_scenarios method.
Arguments:
- create_request (ScenarioSetGenerate): The request object specifying scenario generation parameters.
Returns:
ScenarioSetResponse: The generated synthetic scenario set.
Example:
generate_request = ScenarioSetGenerate(
    source_scenario_id="seed_scenario_id",
    name="My Synthetic Scenario Set",
    number_examples=50,
    project_id="your_project_id",
    generation_type=ScenarioType.REPHRASE_INVARIANT,
)
generated_set = okareo_client.generate_scenario_set(generate_request)
print(generated_set.app_link)  # Prints the link to the generated scenario set
get_scenario_data_points
def get_scenario_data_points(
scenario_id: Union[str, UUID]) -> List[ScenarioDataPoinResponse]
Fetch the scenario data points associated with a scenario set with scenario_id.
Arguments:
- scenario_id (Union[str, UUID]): The ID of the scenario set to fetch data points for.
Returns:
List[ScenarioDataPoinResponse]: A list of scenario data point responses associated with the scenario set.
Example:
okareo_client = Okareo(api_key="your_api_key")
scenario_id = "your_scenario_id"
data_points = okareo_client.get_scenario_data_points(scenario_id)
for dp in data_points:
    print(dp.input_, dp.result)
find_test_data_points
def find_test_data_points(
test_data_point_payload: FindTestDataPointPayload
) -> List[Union[TestDataPointItem, FullDataPointItem]]
Fetch the test run data points that match the criteria specified in the payload.
Arguments:
- test_data_point_payload (FindTestDataPointPayload): The payload specifying the test data point search criteria.
Returns:
List[Union[TestDataPointItem, FullDataPointItem]]: A list of test or full data point items.
Raises:
- TypeError: If the API response is an error.
Example:
from okareo_api_client.models.find_test_data_point_payload import (
    FindTestDataPointPayload,
)
test_run_id = "your_test_run_id"  # Replace with your actual test run ID
payload = FindTestDataPointPayload(
    test_run_id=test_run_id,
)
data_points = okareo_client.find_test_data_points(payload)
for dp in data_points:
    print(dp)
find_datapoints
def find_datapoints(
datapoint_search: DatapointSearch) -> List[DatapointListItem]
Fetch the datapoints specified by a Datapoint Search.
Arguments:
- datapoint_search (DatapointSearch): The search criteria for fetching datapoints.
Returns:
List[DatapointListItem]: A list of datapoint items matching the search.
Raises:
- TypeError: If the API response is an error.
Example:
import secrets
from okareo_api_client.models.datapoint_search import DatapointSearch
# Search based on a test run ID
test_run_id = "your_test_run_id"  # Replace with your actual test run ID
search = DatapointSearch(
    test_run_id=test_run_id,
)
datapoints = okareo_client.find_datapoints(search)
for dp in datapoints:
    print(dp)
# Search based on a context token from a logger
context_token = secrets.token_hex(5)  # random 10-character token
logger_config = {
    "api_key": "<API_KEY>",
    "tags": ["logger-test"],
    "context_token": context_token,
}
# Use the logger config to log completions from CrewAI or Autogen
...
# Search for the logged datapoints by the context token
search = DatapointSearch(
    context_token=context_token,
)
datapoints = okareo_client.find_datapoints(search)
for dp in datapoints:
    print(dp)
find_datapoints_filter
def find_datapoints_filter(
datapoint_search: DatapointFilterSearchPayload
) -> List[DatapointListItem]
Fetch the datapoints that match the filters in a DatapointFilterSearchPayload.
Arguments:
- datapoint_search (DatapointFilterSearchPayload): The search criteria for fetching datapoints.
Returns:
List[DatapointListItem]: A list of datapoint items matching the search.
Raises:
- TypeError: If the API response is an error.
Example:
from okareo_api_client.models.datapoint_filter_search_payload import DatapointFilterSearchPayload
from okareo_api_client.models.filter_condition import FilterCondition
from okareo_api_client.models.comparison_operator import ComparisonOperator
# Assumed import path for DatapointField, following the module naming pattern above
from okareo_api_client.models.datapoint_field import DatapointField
# Search based on a test run ID
test_run_id = "your_test_run_id"  # Replace with your actual test run ID
search = DatapointFilterSearchPayload(
    test_run_id=test_run_id,
)
datapoints = okareo_client.find_datapoints_filter(search)
for dp in datapoints:
    print(dp)
# Find datapoints based on filters on datapoint fields
search = DatapointFilterSearchPayload(
    filters=[FilterCondition(
        field=DatapointField.TEST_RUN_ID,
        operator=ComparisonOperator.EQUAL,
        value=test_run_id,
    )]
)
datapoints = okareo_client.find_datapoints_filter(search)
for dp in datapoints:
    print(dp)
generate_check
def generate_check(
create_check: EvaluatorSpecRequest) -> EvaluatorGenerateResponse
Generate the contents of a Check based on an EvaluatorSpecRequest. Can be used to generate a behavioral (model-based) or a deterministic (code-based) check. Check names must be unique within a project.
Arguments:
- create_check (EvaluatorSpecRequest): The specification for the check to generate.
Returns:
EvaluatorGenerateResponse: The generated check response.
Example:
from okareo import Okareo
from okareo.checks import CheckOutputType, ModelBasedCheck
from okareo_api_client.models.evaluator_spec_request import EvaluatorSpecRequest
# Generate a behavioral model-based check
spec = EvaluatorSpecRequest(
    description="Checks if the output contains toxic language.",
    requires_scenario_input=False,
    requires_scenario_result=False,
    output_data_type="bool",  # bool, int, float
)
okareo_client = Okareo(api_key="your_api_key")
generated_check = okareo_client.generate_check(spec)
# Inspect the generated check to ensure it meets your requirements
print(generated_check)
# Upload the generated check to Okareo to use in evaluations
toxicity_check = okareo_client.create_or_update_check(
    name="toxicity_check",
    description=generated_check.description,
    check=ModelBasedCheck(  # type: ignore
        prompt_template=generated_check.generated_prompt,
        check_type=CheckOutputType.PASS_FAIL,
    ),
)
# Inspect the uploaded check
print(toxicity_check)
get_all_checks
def get_all_checks(all_versions: bool = False) -> List[EvaluatorBriefResponse]
Fetch all available checks.
Arguments:
- all_versions (bool, optional): If True, return all versions of every check (full version history). Defaults to False (latest version only).
Returns:
List[EvaluatorBriefResponse]: A list of EvaluatorBriefResponse objects representing all available checks.
Example:
checks = okareo_client.get_all_checks()
for check in checks:
    print(check.name, check.id)
# Include full version history
all_checks = okareo_client.get_all_checks(all_versions=True)
get_check
def get_check(
check_id: Union[str, UUID],
version: Union[str, int, None] = None) -> EvaluatorDetailedResponse
Fetch details for a specific check by UUID or by name.
Arguments:
- check_id: A check UUID (str or UUID object) or a check name (str). When a name is given, the method resolves it to a UUID via the list endpoint.
- version: Optional version number or the string "latest". Only used when check_id is a name. None and "latest" both resolve to the most recent version.
Returns:
EvaluatorDetailedResponse: The detailed response for the specified check.
Raises:
- ValueError: If no check matches the given name/version.
Example:
# By UUID (existing behaviour)
check = okareo_client.get_check("your_check_uuid")
# By name (latest version)
check = okareo_client.get_check("my_check")
check = okareo_client.get_check("my_check", version="latest")
# By name + pinned version
check = okareo_client.get_check("my_check", version=1)
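The version rule described for get_check (None and "latest" both mean the most recent version, a number pins a specific one) can be illustrated with a small standalone function. This is only a sketch of the documented rule, not the SDK's implementation; resolve_version and the {version_number: check_id} mapping are hypothetical.

```python
from typing import Dict, Union


def resolve_version(available: Dict[int, str],
                    version: Union[str, int, None] = None) -> str:
    """Pick a check ID from a {version_number: check_id} mapping."""
    if not available:
        raise ValueError("no versions available")
    # None and "latest" both resolve to the highest version number.
    if version is None or version == "latest":
        return available[max(available)]
    number = int(version)
    if number not in available:
        raise ValueError(f"no check matches version {number}")
    return available[number]


# Hypothetical version history for a check named "my_check".
versions = {1: "uuid-v1", 2: "uuid-v2"}
latest_id = resolve_version(versions)
pinned_id = resolve_version(versions, 1)
```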
delete_check
def delete_check(check_id: Union[str, UUID], check_name: str) -> str
Deletes a check identified by its ID and name.
Arguments:
- check_id (Union[str, UUID]): The unique identifier of the check to delete.
- check_name (str): The name of the check to delete.
Returns:
str: A message indicating the result of the deletion.
Example:
result = okareo_client.delete_check(check_id="abc123", check_name="MyCheck")
print(result) # Output: Check deletion was successful
create_or_update_check
def create_or_update_check(
name: str,
description: str,
check: BaseCheck,
tags: Optional[List[str]] = None) -> EvaluatorDetailedResponse
Create a new check or update an existing one. If a check named 'name' already exists, this method updates it; otherwise, it creates a new check.
Arguments:
- name (str): The unique name of the check to create or update.
- description (str): A human-readable description of the check.
- check (BaseCheck): An instance of BaseCheck containing the check configuration.
- tags (Optional[List[str]]): Optional list of string tags to associate with the check.
Returns:
EvaluatorDetailedResponse: The detailed response from the evaluator after creating or updating the check.
Raises:
- AssertionError: If the response is not an instance of EvaluatorDetailedResponse.
- ValueError: If the response validation fails.
Example:
from okareo.checks import CheckOutputType, ModelBasedCheck
my_check = ModelBasedCheck(
    prompt_template="Only output the number of words in the following text: {scenario_input} {generation}",
    check_type=CheckOutputType.SCORE,  # a word count is a numeric score, not a pass/fail result
)
response = okareo_client.create_or_update_check(
    name="my_word_count_check",
    description="Custom check for counting combined total number of words in input and output.",
    check=my_check,
    tags=["prod", "v1"],
)
print(response)
create_trace_eval
def create_trace_eval(group: Any, context_token: str) -> Any
Create a trace evaluation for a group.
Arguments:
- group (Any): The ID of the group.
- context_token (str): The context token for the trace.
Returns:
The created trace evaluation details.
Raises:
- OkareoAPIException: If the API request fails.
evaluate
def evaluate(name: str,
test_run_type: TestRunType,
scenario_id: Union[Unset, str] = UNSET,
datapoint_ids: Union[Unset, list[str]] = UNSET,
filter_group_id: Union[Unset, str] = UNSET,
tags: Union[Unset, list[str]] = UNSET,
metrics_kwargs: Union[Dict[str, Any], Unset] = UNSET,
checks: Union[Unset, list[str]] = UNSET) -> TestRunItem
Evaluate datapoints using the specified parameters.
Arguments:
- scenario_id: ID of the scenario set
- metrics_kwargs: Dictionary of metrics to be measured
- name: Name of the test run
- test_run_type: Type of test run
- tags: Tags for filtering test runs
- checks: List of checks to include
- datapoint_ids: List of datapoint IDs to filter by
- filter_group_id: ID of the datapoint filter group to apply
Returns:
TestRunItem: The evaluation results as a TestRunItem object.
Example:
checks = ["model_refusal"]  # one or more checks to apply in the evaluation
test_run = okareo_client.evaluate(
    name="My Test Run",
    test_run_type=TestRunType.NL_GENERATION,
    checks=checks,
    datapoint_ids=["datapoint_id_1", "datapoint_id_2"],
)
print(test_run.app_link)  # View link to eval results in Okareo app
create_or_update_driver
def create_or_update_driver(driver: Driver) -> Driver
Create or update a driver.
Arguments:
- driver (Driver): The driver to create or update.
Returns:
The created or updated driver.
get_driver_by_name
def get_driver_by_name(driver_name: str) -> Driver
Get a driver by its name.
Arguments:
- driver_name (str): The name of the driver to retrieve.
Returns:
The driver with the specified name.
create_or_update_target
def create_or_update_target(
target: Target,
tags: Optional[List[str]] = None,
project_id: Optional[str] = None,
sensitive_fields: Union[List[str], None] = None) -> Target
Create or update a target.
Arguments:
- target (Target): The target to create or update.
Returns:
The created or updated target.
generate_driver_prompt
def generate_driver_prompt(user_input: str,
prior_prompt: Optional[str] = None,
language: Optional[str] = None,
**driver_kwargs: Any) -> Driver
Generate a structured driver prompt from a one-sentence description.
Arguments:
- user_input: Natural language description of the caller persona.
- prior_prompt: Optional existing prompt to refine.
- language: BCP-47 language code (e.g. "en", "es", "fr-CA").
- **driver_kwargs: Extra fields forwarded to the returned Driver (e.g. voice_instructions, temperature, voice).
Returns:
A Driver with the AI-generated name and prompt_template.
find_test_runs
def find_test_runs(name: Optional[str] = None,
tags: Optional[list] = None,
project_id: Optional[str] = None,
return_model_metrics: bool = False) -> list
Find test runs, optionally filtering by name or tags.
Arguments:
- name: Filter results to runs with this exact name (client-side).
- tags: Filter results to runs with these tags (server-side).
- project_id: Scope to a specific project.
- return_model_metrics: Include model_metrics in the response.
Returns:
List of test run dicts from the server.
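Since the name filter is an exact match applied on the client, its effect on the returned list of dicts can be pictured as below. This is a sketch of that behavior only, assuming each test run dict has a "name" key; filter_runs_by_name and the sample data are hypothetical.

```python
from typing import Any, Dict, List, Optional


def filter_runs_by_name(runs: List[Dict[str, Any]],
                        name: Optional[str]) -> List[Dict[str, Any]]:
    """Keep only runs whose 'name' equals name exactly; None keeps every run."""
    if name is None:
        return list(runs)
    return [run for run in runs if run.get("name") == name]


# Sample dicts standing in for a server response.
runs = [
    {"id": "run-1", "name": "nightly"},
    {"id": "run-2", "name": "nightly-voice"},
    {"id": "run-3", "name": "nightly"},
]
nightly_runs = filter_runs_by_name(runs, "nightly")
```

Note that "nightly-voice" is excluded: an exact match does not include names that merely start with the filter string.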
re_evaluate
def re_evaluate(test_run_id: str,
checks: list,
name: Optional[str] = None,
tags: Optional[list] = None) -> TestRunItem
Re-evaluate an existing test run with different checks.
No new simulation or phone call is made. The existing conversation data is re-scored with the specified checks.
Arguments:
- test_run_id: ID of the source test run to re-evaluate.
- checks: List of check names or IDs to apply.
- name: Optional name for the new re-evaluated run.
- tags: Optional tags for the new run.
Returns:
The newly created TestRunItem with re-evaluated results.
download_call_recording
def download_call_recording(call_sid: str) -> bytes
Download a voice call recording by its Twilio CallSid.
Arguments:
- call_sid (str): The Twilio CallSid from datapoint metadata (e.g. dp.model_metadata.additional_properties["call_sid"]).
Returns:
Raw WAV audio bytes.
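The returned bytes can be written straight to disk. The sketch below saves WAV bytes after a basic RIFF/WAVE header check; save_wav and make_silent_wav are hypothetical helpers, and the silent clip generated in memory stands in for a real download so the example runs without network access.

```python
import io
import tempfile
import wave
from pathlib import Path


def save_wav(audio_bytes: bytes, path: Path) -> Path:
    """Write WAV bytes to path after a basic RIFF/WAVE header check."""
    if audio_bytes[:4] != b"RIFF" or audio_bytes[8:12] != b"WAVE":
        raise ValueError("data does not look like a WAV file")
    path.write_bytes(audio_bytes)
    return path


def make_silent_wav(frames: int = 800, rate: int = 8000) -> bytes:
    """Build a short silent mono WAV in memory, standing in for a download."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 16-bit samples
        wav_file.setframerate(rate)
        wav_file.writeframes(b"\x00\x00" * frames)
    return buffer.getvalue()


# In real use: audio = okareo_client.download_call_recording(call_sid)
audio = make_silent_wav()
saved = save_wav(audio, Path(tempfile.mkdtemp()) / "call.wav")
with wave.open(str(saved), "rb") as wav_file:
    frame_count = wav_file.getnframes()
```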
download_voice
def download_voice(file_url: str) -> bytes
Download a voice file from Okareo.
Files are always stored as MP3 on the server.
Arguments:
- file_url (str): The file URL returned by upload_voice() (e.g. https://api.okareo.com/v0/voice/file/{project_id}/{file_id}).
Returns:
Raw MP3 audio bytes.
create_scenario_set_with_audio_files
def create_scenario_set_with_audio_files(
name: str,
data_list: List[Dict[str, str]],
project_id: Optional[Union[str, UUID]] = None) -> ScenarioSetResponse
Upload local audio files and create a scenario set in one call.
Each item's 'input' field should be a local file path to a WAV or MP3 file. The file is uploaded to Okareo (coerced to MP3), and 'input' is replaced with the returned file URL before creating the scenario.
Arguments:
- name: Scenario set name.
- data_list: List of dicts with 'input' (local file path) and 'result' (expected transcript string).
- project_id: Optional project to associate files and scenario with.
Returns:
ScenarioSetResponse from the created scenario set.
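The data_list shape described above can be assembled from (file path, transcript) pairs. This is a minimal sketch; build_audio_rows is a hypothetical helper, and the extension check simply mirrors the WAV/MP3 requirement stated in the description rather than any validation the SDK is known to perform.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Tuple

# File types the description above says are accepted.
ALLOWED_SUFFIXES = {".wav", ".mp3"}


def build_audio_rows(pairs: Iterable[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Turn (local_path, expected_transcript) pairs into data_list rows."""
    rows = []
    for file_path, transcript in pairs:
        if Path(file_path).suffix.lower() not in ALLOWED_SUFFIXES:
            raise ValueError(f"unsupported audio file: {file_path}")
        rows.append({"input": file_path, "result": transcript})
    return rows


data_list = build_audio_rows([
    ("recordings/greeting.wav", "Hello, how can I help you today?"),
    ("recordings/billing.MP3", "I have a question about my invoice."),
])
# data_list could then be passed to create_scenario_set_with_audio_files.
```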
ingest_conversations
def ingest_conversations(
project_id: Union[str, UUID],
conversations: List[Dict[str, Any]],
mut_id: Union[str, UUID, None] = None) -> Dict[str, Any]
Ingest voice conversations for monitoring.
Accepts one or more conversations from voice platforms (Retell, Twilio, VAPI, etc.) and enqueues them for async processing. Each conversation's turns will become Datapoint rows, and configured monitors will automatically match and run checks.
This is the monitoring path, not the simulation path. No ScenarioSets are created. The mut_id is optional - when omitted, datapoints are created without MUT association and rely entirely on monitor/filter group matching.
Arguments:
- project_id: Okareo project ID.
- conversations: List of conversation dictionaries, each containing:
  - source_platform (str): Platform source ('retell', 'twilio', 'vapi', 'elevenlabs', or 'custom')
  - call_id (str): Platform-specific call identifier
  - context_token (str, optional): Context token for correlation (defaults to call_id)
  - audio (dict, optional): Preferred audio shape with one of:
    - {"type": "url", "url": "https://..."}
    - {"type": "voice_file_id", "voice_file_id": "uuid"}
    - {"type": "inline_b64", "inline_b64": "..."}
  - recording_url (str, optional): Legacy compatibility alias for audio URL
  - recording_bytes_b64 (str, optional): Legacy compatibility alias for inline base64 audio
  - transcript (list, optional): Pre-parsed transcript as list of turns with 'role' and 'content'
  - diarization (bool, optional): When transcript is absent, controls whether Okareo runs diarization + ASR. Defaults to True.
  - metadata (dict, optional): Platform-specific metadata
  - tags (list, optional): Tags for monitor matching
  - first_turn (str, optional): For audio-only diarization ('user' or 'assistant' spoke first, defaults to 'assistant')
- mut_id: Optional model under test ID. If not provided, datapoints are created without MUT association (monitoring path).
Returns:
Dict with 'status' and list of conversation identifiers.
Raises:
- httpx.HTTPStatusError: If the API returns an error status.
Example (monitoring-only, no MUT):
okareo_client.ingest_conversations(
    project_id="your-project-id",
    conversations=[
        {
            "source_platform": "retell",
            "call_id": "call-123",
            "audio": {
                "type": "url",
                "url": "https://retell.ai/recordings/call-123.mp3",
            },
            "tags": ["support", "billing"],
            "metadata": {"customer_id": "cust-456"},
        }
    ],
)
Example (with MUT association):
okareo_client.ingest_conversations(
    project_id="your-project-id",
    mut_id="your-mut-id",
    conversations=[
        {
            "source_platform": "custom",
            "call_id": "call-456",
            "transcript": [
                {"role": "user", "content": "Hello", "timestamp_ms": 0},
                {"role": "assistant", "content": "Hi, how can I help?", "timestamp_ms": 1000},
            ],
        }
    ],
)
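When conversations come from several platforms, a small builder can keep the dictionaries consistent with the fields listed in the arguments of ingest_conversations. This is a sketch under stated assumptions: build_conversation is a hypothetical helper, it covers only a subset of the documented fields, and the platform check reflects the values listed in this reference rather than validation known to run on the server.

```python
from typing import Any, Dict, List, Optional

# Platform values listed for source_platform in the arguments above.
SOURCE_PLATFORMS = {"retell", "twilio", "vapi", "elevenlabs", "custom"}


def build_conversation(source_platform: str,
                       call_id: str,
                       audio_url: Optional[str] = None,
                       transcript: Optional[List[Dict[str, Any]]] = None,
                       tags: Optional[List[str]] = None) -> Dict[str, Any]:
    """Assemble one conversation dict using the documented field names."""
    if source_platform not in SOURCE_PLATFORMS:
        raise ValueError(f"unknown source_platform: {source_platform}")
    conversation: Dict[str, Any] = {
        "source_platform": source_platform,
        "call_id": call_id,
    }
    if audio_url is not None:
        # Preferred audio shape for a remote recording.
        conversation["audio"] = {"type": "url", "url": audio_url}
    if transcript is not None:
        conversation["transcript"] = transcript
    if tags:
        conversation["tags"] = list(tags)
    return conversation


conversation = build_conversation(
    "retell",
    "call-123",
    audio_url="https://example.com/recordings/call-123.mp3",
    tags=["support"],
)
# The result could be passed as conversations=[conversation].
```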