Model Feedback
Collecting Production Datapoints
Critical to evaluating a model is the collection of feedback. The best feedback comes directly from users in production, but user feedback is not always available and is often inconsistent. Even when users don't provide feedback, it is still useful to collect input/response pairs to review and compare against your original assumptions. Establishing production baselines for concerns such as user intent (classification), stable embeddings (retrieval), and clear summarization (generation) is a great place to start.
With reasonable coverage in the above baselines, evaluation can be extended into observing production behavior.
If you haven't already, we would suggest running through the following examples before starting directly on RAG production evaluation.
In this guide we will explore how to establish a pipeline of production datapoints for evaluation. This provides evaluation-oriented observability.
Getting Started
What do you need?
This example differs from some of the others. You can still experiment with a Jupyter notebook.
However, the best practice is to wrap your runtime model with the Okareo lib. As a result, the examples are provided for both Python and the REST API.
As always, you will need an Okareo API Token. Please refer to the Okareo API Key to get the token and prepare your environment.
Introduction to Datapoints
You may have noticed that there are sections titled Model in several places within the application and on evaluation details. These areas give you access to all of the interactions with your model that Okareo is aware of.
When you run evaluations through Okareo, we automatically persist all of the inputs and outputs related to the model invocation. To get the same detail from production, all you need to do is add the Okareo lib to your runtime.
The collected datapoints provide insight into how the model was used and what happened. You can filter for negative feedback, results, and more.
- inputs
- results
- context token
- feedback
- error code
- error details
- created date
- more...
How to send Datapoints
There are two approaches to sending datapoints. If you use the Okareo Python lib, you get the benefit of async batching. If you call the endpoint directly, you will need to manage queuing and related concerns yourself.
The examples below cover Python, curl, raw HTTP, and the API reference.
from datetime import datetime

from okareo import Okareo

# Authenticate and register the model under test
okareo = Okareo("YOUR API TOKEN HERE")
model_under_test = okareo.register_model(name="Example Model")

# Send a model trace data point
data_point = model_under_test.add_data_point_async(
    input_obj={"input": "value"},
    input_datetime=str(datetime.now()),
    result_obj={"result": "value"},
    result_datetime=str(datetime.now()),
    context_token="<YOUR_CONTEXT_TOKEN>",
    feedback=0.5,
    tags=["intent_classification_v3", "env:test"],
)
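One way to follow the "wrap your runtime model" advice is a small decorator that captures the input, the result, and both timestamps around each call, then hands them to a recorder. This is only a sketch under stated assumptions: `record_datapoints`, `record`, and `classify_intent` are hypothetical names, and `record` is assumed to accept the same keyword arguments as the `add_data_point_async` call shown above (whether `context_token` and `feedback` may be omitted there is not confirmed here).

```python
from datetime import datetime
from functools import wraps
from typing import Any, Callable


def record_datapoints(record: Callable[..., Any], tags: list) -> Callable:
    """Hypothetical decorator: report each model call through `record`."""

    def decorator(model_fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        @wraps(model_fn)
        def wrapper(input_obj: dict) -> dict:
            input_datetime = str(datetime.now())
            result_obj = model_fn(input_obj)
            # Keyword names mirror the add_data_point_async example above.
            record(
                input_obj=input_obj,
                input_datetime=input_datetime,
                result_obj=result_obj,
                result_datetime=str(datetime.now()),
                tags=tags,
            )
            return result_obj

        return wrapper

    return decorator


# Stand-in recorder for local experimentation. In production you would pass
# model_under_test.add_data_point_async instead (assumed compatible).
captured = []


@record_datapoints(record=lambda **kwargs: captured.append(kwargs), tags=["env:test"])
def classify_intent(input_obj: dict) -> dict:
    # Placeholder model logic, for illustration only.
    return {"intent": "greeting" if "hello" in input_obj["text"].lower() else "other"}


result = classify_intent({"text": "Hello there"})
```

Keeping the recorder as an injected callable means the same wrapper can be exercised locally with a list and switched to the Okareo client without touching the model code.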
curl --request POST \
  --url https://api.okareo.com/v0/datapoints \
  --header 'accept: application/json' \
  --header 'api-key: YOUR TOKEN HERE' \
  --header 'content-type: application/json' \
  --data '
{
  "mut_id": "string",
  "input": "string",
  "result": "string",
  "tags": ["tag 1", "tag 2"],
  "feedback": 1,
  "context_token": "<YOUR_CONTEXT_TOKEN>"
}
'
POST /v0/datapoints HTTP/1.1
Accept: application/json
Content-Type: application/json
api-key: <YOUR TOKEN HERE>
Host: api.okareo.com

{"mut_id":"string","input":"string","result":"string","tags":["string","string"],"feedback":1,"context_token":"string"}
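If you call the endpoint directly, queuing is up to you. The following is a minimal sketch of one approach using only the Python standard library: a background thread drains a bounded queue and hands each payload to a `send` callable, so the request path never waits on the network. `DatapointQueue` and `send` are hypothetical names; in practice `send` would issue the `POST /v0/datapoints` request shown above with the HTTP client of your choice.

```python
import queue
import threading
from typing import Callable


class DatapointQueue:
    """Hypothetical helper: buffer payloads and send them off the request path."""

    _STOP = object()  # sentinel telling the worker to exit

    def __init__(self, send: Callable[[dict], None], max_size: int = 1000) -> None:
        self._send = send
        self._queue = queue.Queue(maxsize=max_size)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, payload: dict) -> bool:
        """Enqueue without blocking; return False if the buffer is full."""
        try:
            self._queue.put_nowait(payload)
            return True
        except queue.Full:
            return False

    def close(self) -> None:
        """Send everything already queued, then stop the worker."""
        self._queue.put(self._STOP)
        self._worker.join()

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is self._STOP:
                return
            try:
                self._send(item)
            except Exception:
                # A real sender would log and possibly retry; this sketch drops the item.
                pass


# Stand-in sender that collects payloads locally instead of calling the API.
sent = []
dq = DatapointQueue(send=sent.append)
dq.submit({"mut_id": "string", "input": "string", "result": "string"})
dq.submit({"mut_id": "string", "input": "string", "result": "string", "feedback": 1})
dq.close()
```

The bounded queue is a deliberate choice: when the buffer fills, `submit` returns `False` rather than blocking the model's request path, and the caller decides whether to drop or log the datapoint.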
Add Datapoints API
For interactive docs, refer to the Okareo API Guide.
Endpoint: /v0/datapoints
Method: POST
Header
Name | Required | Type | Example |
---|---|---|---|
api-key | True | String | nR5cCI6IkpXVCIsImtpZCI6I... |
Body
Name | Required | Type | Example |
---|---|---|---|
mut_id | -- | string - uuid | d4e34a3a-fd2a-4952-a524-3b27360ea00b |
input | -- | string - json-string | "{'vegetable':'carrot'}" |
input_datetime | -- | string - date-time | 2019-08-24T14:15:22Z |
result | -- | string - json-string | "{'vegetable':'tomato'}" |
result_datetime | -- | string - date-time | 2019-08-24T14:15:22Z |
tags | -- | array | -- |
feedback | -- | boolean - 0 or 1 | 1 |
error_message | -- | string - Error Message | Error: Pipeline processing error on... |
error_code | -- | string - Error Code | "-934" |
context_token | -- | string | "{'caller':'foo', ...}" |
test_run_id | -- | string - uuid | 01dbc221-8963-4bd7-8591-4a1f9d286862 |
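The body table implies some preparation before a request is sent: `input` and `result` are JSON strings rather than nested objects, the datetime fields use the date-time form shown in the examples, and `feedback` is 0 or 1. A small sketch of a body builder follows. `build_datapoint_body` is a hypothetical helper, it covers only a subset of the fields, and it assumes the service accepts standard JSON as produced by `json.dumps`.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def _stamp(moment: datetime) -> str:
    """Format a timezone-aware datetime like 2019-08-24T14:15:22Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_datapoint_body(
    mut_id: str,
    input_obj: dict,
    result_obj: dict,
    input_at: datetime,
    result_at: datetime,
    tags: Optional[list] = None,
    feedback: Optional[int] = None,
) -> dict:
    """Hypothetical helper that shapes a request body like the table above."""
    body = {
        "mut_id": mut_id,
        "input": json.dumps(input_obj),    # json-string, not a nested object
        "input_datetime": _stamp(input_at),
        "result": json.dumps(result_obj),  # json-string, not a nested object
        "result_datetime": _stamp(result_at),
    }
    if tags is not None:
        body["tags"] = list(tags)
    if feedback is not None:
        if feedback not in (0, 1):
            raise ValueError("feedback must be 0 or 1")
        body["feedback"] = int(feedback)
    return body


body = build_datapoint_body(
    mut_id="d4e34a3a-fd2a-4952-a524-3b27360ea00b",
    input_obj={"vegetable": "carrot"},
    result_obj={"category": "root"},
    input_at=datetime(2019, 8, 24, 14, 15, 22, tzinfo=timezone.utc),
    result_at=datetime(2019, 8, 24, 14, 15, 23, tzinfo=timezone.utc),
    feedback=1,
)
```

Optional fields are left out of the body entirely when not supplied, rather than being sent as nulls; whether the endpoint treats the two cases the same is not stated in the table.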
Responses
Code | Response | Payload |
---|---|---|
201 | Success | {'id', 'project_id', 'mut_id'} |
400 | Input data is incorrect | detail - Specific to the error instance |
404 | Data not found | detail - Specific to the error instance |
422 | Input data is invalid | detail - Specific to the error instance |
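The response table maps naturally onto a small dispatch on status code. The sketch below assumes a hypothetical `DatapointError` exception and a `handle_response` helper that receives the status code and the decoded JSON body from whatever HTTP client you use. The reason strings are copied from the table, and any other status is treated as unexpected.

```python
class DatapointError(Exception):
    """Hypothetical error raised for non-201 responses from /v0/datapoints."""

    def __init__(self, status: int, reason: str, detail) -> None:
        super().__init__(f"{status} {reason}: {detail}")
        self.status = status
        self.reason = reason
        self.detail = detail


# Reasons taken from the response table above.
_ERROR_REASONS = {
    400: "Input data is incorrect",
    404: "Data not found",
    422: "Input data is invalid",
}


def handle_response(status: int, body: dict) -> dict:
    """Return the identifiers on success, raise DatapointError otherwise."""
    if status == 201:
        return {key: body.get(key) for key in ("id", "project_id", "mut_id")}
    reason = _ERROR_REASONS.get(status, "Unexpected status")
    raise DatapointError(status, reason, body.get("detail"))


created = handle_response(201, {"id": "a", "project_id": "b", "mut_id": "c"})

try:
    handle_response(422, {"detail": "input must be a string"})
except DatapointError as err:
    failure = err
```

Surfacing `detail` on the exception keeps the error-specific message from the service available to whatever logging or retry logic sits above this call.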