
Custom-Endpoint Multi-Turn Simulations

Okareo can drive a full conversation against your running service (a RAG pipeline, tool-calling agent, or any HTTP API) by mapping requests and JSON responses to a Custom Endpoint Target. This guide shows, step by step, how to run a multi-turn simulation against a custom endpoint, in either the Okareo UI or the SDK.

You'll follow the same four core steps you saw in the Multi-Turn Overview.

Cookbook examples for this guide are available.

1 · Define a Target agent profile

  1. Go to Multi-Turn Simulations → Settings.
  2. Click ➕ New Target and choose “Custom Endpoint”.
  3. Enter:
    • URL & HTTP method – e.g. POST https://api.example.com/v1/chat.
    • Headers / Query Params – add auth keys if needed.
    • Body Template – supports {session_id}, {latest_message}, {message_history[i:j]}.
    • Response Session-ID Path – JSONPath to the thread / session field.
    • Response Message Path – JSONPath to the assistant’s text.
  4. Click Test Start Session to preview and adjust paths until the response is highlighted correctly.
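To make the request/response contract concrete, here is a minimal Python sketch of what the simulator does with these settings: fill the body template's placeholders before each turn, then read the session ID and assistant message back out of the JSON response. The template shape, field names, and the dot-path helper are illustrative assumptions, not Okareo internals.

```python
import json

# Hypothetical body template as configured in the Target settings.
# {session_id} and {latest_message} are the template placeholders.
BODY_TEMPLATE = '{"session_id": "{session_id}", "message": "{latest_message}"}'

def render_body(session_id: str, latest_message: str) -> dict:
    """Fill the template the way the simulator would before each request."""
    body = (BODY_TEMPLATE
            .replace("{session_id}", session_id)
            .replace("{latest_message}", latest_message))
    return json.loads(body)

def extract(response: dict, path: str):
    """Minimal dot-separated JSONPath-style lookup (e.g. '$.response.message')."""
    node = response
    for key in path.lstrip("$.").split("."):
        node = node[key]
    return node

# Example response shape your service might return.
resp = {"thread": {"id": "abc-123"}, "response": {"message": "Hi there!"}}
print(extract(resp, "$.thread.id"))          # session-ID path -> abc-123
print(extract(resp, "$.response.message"))   # message path -> Hi there!
```

The Test Start Session preview in the UI is doing the same thing: applying your two JSONPaths to a live response so you can confirm they land on the right fields.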

Target Settings – Custom Endpoint

Driver Parameters

| Parameter | Description |
| --- | --- |
| `driver_temperature` | Controls randomness of the simulated user (Driver) |
| `max_turns` | Maximum number of back-and-forth messages |
| `repeats` | Repeats each test row to capture variance |
| `first_turn` | Whether the `"driver"` or `"target"` starts the conversation |
| `stop_check` | Defines the stopping condition (via a check) |
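Taken together, these parameters amount to a small configuration object. A minimal sketch, with field names mirroring the table above (the SDK's exact names may differ):

```python
# Hypothetical driver configuration; field names mirror the table above
# and are not necessarily Okareo's exact SDK schema.
driver_params = {
    "driver_temperature": 0.8,       # randomness of the simulated user
    "max_turns": 6,                  # cap on back-and-forth messages
    "repeats": 3,                    # re-run each row to capture variance
    "first_turn": "driver",          # who opens: "driver" or "target"
    "stop_check": "task_completed",  # check that ends the conversation early
}
print(driver_params["max_turns"])
```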

2 · Choose or Define a Driver Persona

  1. Switch to the Scenarios sub-tab.
  2. Click + New Scenario and fill in:
    • Driver Persona – e.g. “Confused shopper asking about returns”.
    • Expected Behaviors – what success looks like (“Explains policy & offers label”).
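In code, a scenario is just structured data: each row pairs a Driver persona with the behaviors that count as success. A sketch using illustrative key names (`driver_persona`, `expected_behaviors` are assumptions, not Okareo's exact schema):

```python
import json

# Illustrative scenario rows; each row becomes one simulated conversation
# (times `repeats`). Key names are assumptions, not Okareo's exact schema.
scenario_rows = [
    {
        "driver_persona": "Confused shopper asking about returns",
        "expected_behaviors": "Explains policy & offers label",
    },
    {
        "driver_persona": "Impatient customer demanding a refund",
        "expected_behaviors": "Stays polite and explains refund timelines",
    },
]
print(json.dumps(scenario_rows[0], indent=2))
```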

New Scenario

3 · Launch a Simulation

  1. Switch to the Simulations sub-tab.
  2. Click + New Simulation → select Target, Scenario, and Checks.
  3. Click Run. You can watch the progress of the simulation.
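Conceptually, Run executes a loop like the following sketch: the Driver and Target alternate messages until `max_turns` is reached or the stop check fires. The two callables here are toy stand-ins for the Driver LLM and your custom endpoint, not Okareo's implementation.

```python
def simulate(driver, target, max_turns, first_turn="driver", stop_check=None):
    """Alternate Driver and Target messages; return the transcript."""
    transcript = []
    speakers = {"driver": driver, "target": target}
    role = first_turn
    for _ in range(max_turns):
        msg = speakers[role](transcript)       # each speaker sees the history
        transcript.append((role, msg))
        if stop_check and stop_check(transcript):
            break                              # stopping condition met early
        role = "target" if role == "driver" else "driver"
    return transcript

# Toy stand-ins: a scripted shopper and a canned agent reply.
shopper = lambda history: "Where is my refund?"
agent = lambda history: "Refunds post within 5 business days."
transcript = simulate(shopper, agent, max_turns=4)
print(len(transcript))  # 4 messages, alternating driver/target
```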

4 · Inspect Results

Click a Simulation tile to open its details. The results page breaks down the simulation into:

  • Conversation Transcript – View the full back-and-forth between the Driver and Target, one turn per row.
  • Checks – See results for:
    • Behavior Adherence – Did the assistant stay in character or follow instructions?
    • Model Refusal – Did the assistant properly decline off-topic or adversarial inputs?
    • Task Completed – Did it fulfill the main objective?
    • Custom Checks – Results for any checks specific to your agent

Each turn is annotated with check results, so you can trace where things went wrong — or right.

Results


That's it! You now have a complete, repeatable workflow for evaluating assistants with multi-turn simulations—entirely from the browser or your codebase.