Scenarios

What is a scenario?

In Okareo, a scenario is a collection of data points, each defined by an input and a corresponding expected result. A single data point of a scenario can be represented as a JSON object or a Python dict.

Scenarios describe the expected inputs and results of models, and they allow you to:

  • Evaluate classification, retrieval, or generation models via Okareo's evaluations.
  • Create synthetic data in Okareo via scenario generators.

Cookbook examples that showcase Okareo scenarios are available here:

Note: Try creating and generating scenarios for yourself with the companion Jupyter notebook, scenarios.ipynb.

Formatting scenario data

The format of each input should match the format your model expects. The format of each result depends on the type of evaluation you want to run on the scenario.

Classification

In classification scenarios, the results correspond to the expected category or label that the model should assign to the input. For example, a point in a classification scenario could look like the following:

{
    "input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
    "result": "rewards"
}

Here rewards indicates that the input should be classified into the rewards category. See the Get started with Classification page for more on classification evaluations.

Retrieval

For a retrieval evaluation, each result is a list of one or more viable document IDs that should be returned for the associated input, like the following:

{
    "input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
    "result": ["35a4fd5b-453e-4ca6-9536-f20db7303344"]
}

See our Retrieval Testing guide for more details on setting up scenarios for retrieval evaluations.

Generation

Evaluation of generative models can be either referenced or reference-free. Referenced evaluations compare the generative model's output to one or more references, so the result field should contain the reference(s). For example:

{
    "input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
    "result": "With WebBizz Rewards, customers can earn points with each purchase and avail exclusive discounts."
}

When performing referenced evaluations, the reference in the result field will be compared against the model's outputs. The content of the reference depends on your use case and can vary from written responses to edited versions of the model's outputs.

For reference-free evaluations, the content of the result field is not used, so any placeholder value can be provided, e.g.

{
    "input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
    "result": "<YOUR_PLACEHOLDER_STRING_HERE>"
}

Get started on setting up such scenarios with our Generation evaluation guide.
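
The three result formats described above can be summarized in a small check. This is a minimal sketch and not part of the Okareo SDK: the helper name check_data_point and the kind labels are hypothetical, and the shapes it accepts are only those shown in the examples on this page.

```python
def check_data_point(point, kind):
    """Return True if `point` has the result shape described above for `kind`.

    `kind` is a hypothetical label: "classification", "retrieval",
    or "generation".
    """
    if "input" not in point or "result" not in point:
        return False
    result = point["result"]
    if kind == "classification":
        # a single expected category or label
        return isinstance(result, str)
    if kind == "retrieval":
        # a non-empty list of expected document IDs
        return (
            isinstance(result, list)
            and len(result) > 0
            and all(isinstance(doc_id, str) for doc_id in result)
        )
    if kind == "generation":
        # one or more references, or a placeholder for reference-free runs
        return isinstance(result, (str, list))
    raise ValueError(f"unknown kind: {kind}")


point = {
    "input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
    "result": ["35a4fd5b-453e-4ca6-9536-f20db7303344"],
}
is_valid_retrieval_point = check_data_point(point, "retrieval")
```

Running a check like this before creating a scenario can catch a retrieval result that was accidentally written as a bare string instead of a list.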

Seed scenarios

To get started in Okareo, you will need to begin with a Seed scenario, so-called since it can serve as the "seed" for Generated scenarios. Any scenario that has been uploaded to or created in Okareo can serve as the Seed for a Generated scenario.

As of now, there are three paths to creating/designating a Seed scenario:

  1. An uploaded file (.jsonl)
  2. A static definition
  3. An existing scenario (Seed or Generated)

Creating seed scenarios

To create a seed scenario with a .jsonl file, you can use the following:

# assumes `okareo` is an already-initialized Okareo client
seed_scenario = okareo.upload_scenario_set(
    file_path="./path/to/your/file.jsonl",
    scenario_name="your_scenario_name"
)
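
If your data points start out as Python dicts, a file for upload can be produced with the standard library. This sketch assumes that each line of the .jsonl file is one JSON object with the input and result keys shown earlier; the helper name write_jsonl and the file name seed.jsonl are illustrative only.

```python
import json
import os
import tempfile


def write_jsonl(points, file_path):
    """Write one JSON object per line (the JSON Lines layout)."""
    with open(file_path, "w", encoding="utf-8") as f:
        for point in points:
            f.write(json.dumps(point) + "\n")


points = [
    {"input": "input1", "result": "result1"},
    {"input": "input2", "result": "result2"},
]

# a temporary directory keeps the example from leaving files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "seed.jsonl")
    write_jsonl(points, file_path)
    with open(file_path, encoding="utf-8") as f:
        lines = f.read().splitlines()
```

In real use you would write to a persistent path and pass that path as file_path to okareo.upload_scenario_set.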

To create a seed scenario via a static definition, you can use the following:

from okareo_api_client.models import ScenarioSetCreate, ScenarioType, SeedData

# list of statically defined seed data
seed_data = [
    SeedData(input_="input1", result="result1"),
    SeedData(input_="input2", result="result2"),
    SeedData(input_="input3", result="result3")
]

# request for scenario set creation
scenario_set_create = ScenarioSetCreate(
    name="your_static_scenario_name",
    generation_type=ScenarioType.SEED,
    seed_data=seed_data
)

static_scenario = okareo.create_scenario_set(scenario_set_create)

Finally, to use a previously created scenario as a seed, you can call okareo.generate_scenarios with that scenario's scenario_id:

# use the previously created `static_scenario` to seed a generated scenario

new_generated_scenario = okareo.generate_scenarios(
    source_scenario=static_scenario.scenario_id,
    name="generated_seed_scenario"
)

Generating synthetic scenarios

Assuming you have an existing scenario to use as a Seed, Okareo lets you automatically generate synthetic test cases based on a suite of scenario generators.

Generated scenarios can be a powerful tool to improve your model evaluation pipeline by allowing you to:

  • Create new test cases automatically
  • Ensure robustness to input perturbations/human error

Here, we describe our available scenario generators in more detail and offer a few examples of potential use cases. You can try these generators for yourself by checking out scenarios.ipynb.

Rephrasing

The Rephrasing generator rewords each sentence of the input while keeping the same content. This can be useful when you want to ensure that your model returns the same results under semantically identical inputs.

Example

--------Seed #0--------
WebBizz is dedicated to providing our customers with a seamless online shopping experience. Our platform is designed with user-friendly interfaces to help you browse and select the best products suitable for your needs...
-----Generated #0------
WebBizz prioritizes a smooth digital shopping journey for our customers. Our platform is tailored with straightforward interfaces for easier product browsing and selection...

Relevant Terms

The Relevant Terms generator returns three terms based on tf-idf, meaning the terms are frequent in the document and relatively infrequent in the larger corpus of the scenario's inputs. This can be useful when you'd like to produce keyword-based queries, a pattern typical of search engine users.

Example

--------Seed #2--------
WebBizz places immense value on its dedicated clientele, recognizing their loyalty through the exclusive 'Premium Club' membership. This special program is designed to enrich the shopping experience, providing a suite of benefits tailored to our valued members. Among the advantages, members enjoy complimentary shipping, granting them a seamless and cost-effective way to receive their purchases. Additionally, the 'Premium Club' offers early access to sales, allowing members to avail themselves of promotional offers before they are opened to the general public.
-----Generated #0------
offers members club
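
To make the tf-idf idea concrete, here is a minimal sketch that scores the terms of one document against a small corpus and keeps the top three. It is not Okareo's implementation: the tokenizer, the smoothed idf formula, the toy corpus, and the function name top_terms are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of letters (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def top_terms(doc_index, corpus, k=3):
    """Return the k terms of corpus[doc_index] with the highest tf-idf score."""
    tokenized = [tokenize(doc) for doc in corpus]
    n_docs = len(tokenized)
    # document frequency: in how many documents each term appears
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # term frequency within the chosen document
    tf = Counter(tokenized[doc_index])
    scores = {
        # smoothed idf keeps scores finite and non-negative
        term: count * math.log((1 + n_docs) / (1 + df[term]))
        for term, count in tf.items()
    }
    # sort by score (descending), then alphabetically for a stable order
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:k]


# toy corpus standing in for the inputs of a scenario
corpus = [
    "club members get offers and club members get free shipping",
    "orders ship in two days and returns are free",
    "returns are accepted in thirty days",
]
terms = top_terms(0, corpus)
print(terms)
```

Terms that are repeated in the first document but absent from the others ("club", "get", "members") outrank terms such as "and" or "free" that also appear elsewhere in the corpus.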

Misspellings

The Misspellings generator lets you create scenarios with human-like errors. This can be useful if your model will be used in a context where inputs are likely to be error-prone. For example, you may be evaluating a model used in a conversational context (e.g., as a customer service chatbot).

Example

--------Seed #0--------
The quick brown fox jumps over the lazy dog
-----Generated #0------
The quick brown fox jumps over the lazt dog
-----Generated #1------
The quick brown fox humps over the lazy dog
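
Both generated examples above swap one letter for a neighboring key ("y" to "t", "j" to "h"). The sketch below shows one way such a perturbation could be produced locally. It is an illustration only, not how Okareo's generator works; the NEIGHBORS map is an approximate QWERTY adjacency table and misspell is a hypothetical helper.

```python
import random

# approximate map of adjacent keys on a QWERTY layout (lowercase only)
NEIGHBORS = {
    "q": "wa", "w": "qes", "e": "wrd", "r": "etf", "t": "ryg",
    "y": "tuh", "u": "yij", "i": "uok", "o": "ipl", "p": "o",
    "a": "qsz", "s": "awdx", "d": "sefc", "f": "drgv", "g": "fthb",
    "h": "gyjn", "j": "hukm", "k": "jil", "l": "ko",
    "z": "ax", "x": "zsc", "c": "xdv", "v": "cfb", "b": "vgn",
    "n": "bhm", "m": "nj",
}


def misspell(text, rng):
    """Replace one lowercase letter with a key adjacent to it."""
    positions = [i for i, ch in enumerate(text) if ch in NEIGHBORS]
    if not positions:
        return text  # nothing to perturb
    i = rng.choice(positions)
    replacement = rng.choice(NEIGHBORS[text[i]])
    return text[:i] + replacement + text[i + 1:]


seed_text = "The quick brown fox jumps over the lazy dog"
rng = random.Random(0)  # fixed seed so repeated runs give the same variant
variant = misspell(seed_text, rng)
print(variant)
```

Because a key never lists itself as a neighbor, each call changes exactly one character and preserves the length of the input.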

Contractions

The Contractions generator attempts to shorten words in a human-like way. Similar to Misspellings, this generator can be beneficial if your model will be seeing conversational inputs.

Example

--------Seed #0--------
The quick brown fox jumps over the lazy dog
-----Generated #0------
The quick brwn fox jumps over the lazy dog

Reverse Questions

The Reverse Questions generator poses questions based on the contents of inputs in the seed scenario. This generator is particularly useful when assessing the robustness of a retrieval model.

Suppose you have a database of articles and you would like to generate questions that a user might pose to a chatbot. The Reverse Questions generator can help you cover a wide range of questions that potential customers might ask, allowing you to evaluate the chatbot's robustness on corner cases.

Example

--------Seed #0--------
WebBizz is dedicated to providing our customers with a seamless online shopping experience. Our platform is designed with user-friendly interfaces to help you browse and select the best products suitable for your needs. We offer a wide range of products from top brands and new entrants, ensuring diversity and quality in our offerings. Our 24/7 customer support is ready to assist you with any queries, from product details, shipping timelines, to payment methods. We also have a dedicated FAQ section addressing common concerns. Always ensure you are logged in to enjoy personalized product recommendations and faster checkout processes.
-----Generated #0------
What features does WebBizz offer to enhance the customer's online shopping experience?

Conditionals

The Conditionals generator assumes that the input values are questions and rewords each question to emphasize a particular clause. This can be used in conjunction with the Reverse Questions generator to further expand your test coverage in a retrieval scenario.

Example

--------Seed #4--------
What is the primary benefit of joining the WebBizz Rewards program?
-----Generated #0------
Should you decide to join the WebBizz Rewards program, what would be the primary benefit?

Generator usage

To use a scenario generator, you can use the following template:

from okareo_api_client.models import ScenarioType

# assuming you have an available seed scenario `source_scenario`
okareo.generate_scenarios(
    source_scenario=source_scenario.scenario_id,
    name="generated_scenario",
    num_examples=1,
    generation_type=ScenarioType.REPHRASE_INVARIANT
)

For each input in the seed scenario, the generator will attempt to generate num_examples variations of that input.

The generator type is set by the ScenarioType enum passed as generation_type, and the above example uses the Rephrasing generator. To use a different generator, set generation_type to one of the ScenarioType values in the table below.

| Generator | ScenarioType | Brief Description |
| --- | --- | --- |
| Rephrasing | REPHRASE_INVARIANT | Changes the wording of each sentence per input. |
| Relevant Terms | TERM_RELEVANCE_INVARIANT | Returns relevant/uniquely identifying words from inputs. |
| Misspellings | COMMON_MISSPELLINGS | Adds human-like typing errors to inputs. |
| Contractions | COMMON_CONTRACTIONS | Removes characters from input. |
| Reverse Questions | TEXT_REVERSE_QUESTION | Creates questions where an input contains the relevant answer. |
| Conditionals | CONDITIONAL | Changes questions in inputs to emphasize a specific condition. |

Chaining generators

Composing multiple generators into a chain can help you test different model behaviors. For example, suppose you have trained a retrieval model on user questions. You might want to see whether the model performs well on keyword queries, with and without errors. You might set up a chain of generators as follows:

from okareo_api_client.models import ScenarioSetCreate, ScenarioType, SeedData

# static definition for retrieval questions as seed data
seed_data = [
    SeedData(input_="What type of products does WebBizz offer?", result=["75eaa363-dfcc-499f-b2af-1407b43cb133"]),
    # ...
]

# upload the seed data
scenario_set_create = ScenarioSetCreate(
    seed_data=seed_data,
    name="Chain Step #1: Seed Questions",
    generation_type=ScenarioType.SEED
)

questions_scenario = okareo.create_scenario_set(scenario_set_create)

# first generator uses uploaded scenario as seed
term_relev_scenario = okareo.generate_scenarios(
    source_scenario=questions_scenario.scenario_id,
    name="Chain Step #2: Term Relevance",
    generation_type=ScenarioType.TERM_RELEVANCE_INVARIANT
)

# second generator uses the first generator's output as a seed
misspellings_scenario = okareo.generate_scenarios(
    source_scenario=term_relev_scenario.scenario_id,
    name="Chain Step #3: Misspellings",
    generation_type=ScenarioType.COMMON_MISSPELLINGS
)

# third generator uses the second generator's output as a seed
contractions_scenario = okareo.generate_scenarios(
    source_scenario=misspellings_scenario.scenario_id,
    name="Chain Step #4: Contractions",
    generation_type=ScenarioType.COMMON_CONTRACTIONS
)

Now all the steps of the chain are available to use in evaluating your retrieval model.
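
When a chain grows beyond a few steps, the repeated calls can be folded into a loop. The helper below is a minimal sketch, not part of the Okareo SDK: run_chain is a hypothetical name, and it assumes only what the examples above show, namely that generate_scenarios accepts the source_scenario, name, and generation_type keyword arguments and returns an object with a scenario_id attribute.

```python
def run_chain(client, seed_scenario_id, steps):
    """Run generators in order, feeding each output scenario into the next.

    `client` must expose generate_scenarios(...) with the keyword arguments
    used in the examples above; `steps` is a list of
    (name, generation_type) pairs.
    """
    scenarios = []
    source_id = seed_scenario_id
    for name, generation_type in steps:
        scenario = client.generate_scenarios(
            source_scenario=source_id,
            name=name,
            generation_type=generation_type,
        )
        scenarios.append(scenario)
        # the next generator is seeded with this step's output
        source_id = scenario.scenario_id
    return scenarios


# Example call, assuming the `okareo` client and `questions_scenario` above:
# chain = run_chain(okareo, questions_scenario.scenario_id, [
#     ("Chain Step #2: Term Relevance", ScenarioType.TERM_RELEVANCE_INVARIANT),
#     ("Chain Step #3: Misspellings", ScenarioType.COMMON_MISSPELLINGS),
#     ("Chain Step #4: Contractions", ScenarioType.COMMON_CONTRACTIONS),
# ])
```

The returned list keeps every intermediate scenario, so each step of the chain remains available for evaluation.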