
Overview

What problem are we solving?

AI/ML is becoming increasingly common in software development. Deterministic code, which always produces the same output given the same input, is relatively easy to test. Non-deterministic software components, which can produce different outputs given the same input, require new approaches to testing.

Manual testing and hand-monitored production feedback loops can be used to improve models, but doing so is arduous, time-consuming, and risky.

Enter Okareo. Our focus is on helping you establish reliable AI throughout your development lifecycle.

[Okareo diagram]

Getting Started

When you are chasing reliability, model evaluation can get complicated fast. Let's start with something simple to get a feel for Okareo.

Okareo Basics

To use Okareo you will need an API token, some data, and a model to test.
Okareo can evaluate a wide range of models. The process is very similar in each case, though the output and analytics can differ dramatically.
The following is a general outline of how to evaluate a model. For specific examples and instructions, please refer to the guides and examples.

Step 1: Get an API Token

  1. If you haven't already, sign up for Okareo.
  2. Navigate to Settings > API Token and provision a token.
  3. We suggest making the token available in your environment as OKAREO_API_KEY:
export OKAREO_API_KEY="<YOUR_TOKEN>"

Step 2: Install Okareo

Okareo is just an easy pip, yarn, or npm install away. We also expose all of our capabilities via API if you prefer.

pip install okareo
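
Once installed, the Python SDK is driven through a client object. Here is a minimal setup sketch, assuming the Okareo client class used in the snippets below and the OKAREO_API_KEY variable from Step 1:

import os
from okareo import Okareo

# Create the client used by the register/create/run calls in the steps below
okareo = Okareo(os.environ["OKAREO_API_KEY"])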

Step 3: Register a Model

Everyone's AI/model evaluation needs are different. We have provided some common examples you can build on.

# Model endpoints can be custom-made by you,
# or you can use one of our premade endpoints for OpenAI, Cohere, Pinecone, Qdrant, and more.
model_under_test = okareo.register_model(
    name="Example Classifier",
    project_id="",
    model=<CUSTOM, OpenAI, Cohere, Pinecone, Qdrant, ...>,
)
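
If you are bringing your own model, registration typically wraps your inference call in a custom model class. Here is a minimal sketch, assuming the SDK's CustomModel and ModelInvocation interface (check the guides for the exact signatures; my_inference is a hypothetical stand-in for your own code):

from okareo.model_under_test import CustomModel, ModelInvocation

class ExampleClassifier(CustomModel):
    def invoke(self, input_value):
        # my_inference is a hypothetical stand-in for your own model call
        prediction = my_inference(input_value)
        return ModelInvocation(
            model_prediction=prediction,
            model_input=input_value,
        )

model_under_test = okareo.register_model(
    name="Example Classifier",
    model=ExampleClassifier(name="Example Classifier"),
)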

Step 4: Create a Scenario

Okareo scenarios can be used as defined or as seeds for synthetically generating variations to stretch your model and discover edges.

# Import paths shown here follow the Python SDK; check the guides for your version.
from okareo_api_client.models import ScenarioSetCreate, SeedData, ScenarioType

# Define a collection of scenario data
scenario_set = ScenarioSetCreate(
    name="Scenario Name",
    number_examples=10,
    generation_type=ScenarioType.REPHRASE_INVARIANT,
    seed_data=[
        SeedData(
            input_={JSON} | "String" | Custom...,
            result={JSON} | "String" | Custom...,
        ),
        ...
    ],
)
# Create the Scenario Set in Okareo from the scenario data
scenario = okareo.create_scenario_set(scenario_set)
print(f"{scenario.app_link}")
Info: All Okareo primary objects (models, scenarios, and evaluations) can be accessed through the UI. Just print or share the .app_link property.
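
To make the template above concrete, here is the same call filled in for a hypothetical sentiment classifier; the inputs and labels are illustrative only:

sentiment_scenario_set = ScenarioSetCreate(
    name="Sentiment Seeds",
    number_examples=10,
    generation_type=ScenarioType.REPHRASE_INVARIANT,
    seed_data=[
        SeedData(input_="The checkout flow was painless.", result="positive"),
        SeedData(input_="My order arrived broken and support never replied.", result="negative"),
    ],
)
sentiment_scenario = okareo.create_scenario_set(sentiment_scenario_set)
print(f"{sentiment_scenario.app_link}")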

Step 5: Run an Evaluation

Okareo can handle the round-trip evaluation from the cloud. This makes it easy to run evaluations of any size or length from CI or from your local workspace.

from okareo_api_client.models.test_run_type import TestRunType

evaluation = model_under_test.run_test_v2(
    name="Example Classifier Run",
    scenario=scenario,
    api_key=<MODEL_API_KEY>,  # This is based on the model you defined
    test_run_type=TestRunType.MULTI_CLASS_CLASSIFICATION,
    calculate_metrics=True,
)
print(f"{evaluation.app_link}")