Okareo Proxy

The Okareo Proxy is a drop-in layer for integrating and monitoring Large Language Model (LLM) requests. It is available as a hosted service at proxy.okareo.com and as a self-hosted version run via the Okareo Command Line Interface (CLI). This flexibility lets teams choose the deployment method that best fits their needs while retaining full observability and evaluation capabilities.

Why Use the Okareo Proxy?

As AI and machine learning become integral to software development, the need for reliable and efficient testing of non-deterministic components grows. The Okareo Proxy addresses this challenge by enabling real-time AI observability, synthetic data generation, and agent simulation, all in one platform. By routing LLM requests through the proxy, you can monitor interactions, evaluate performance, and quickly iterate on your AI systems. The proxy also acts as a gateway and provides access to any model provider or local model without requiring a rewrite of your application.

Key Features

  • Real-Time Error Tracking: Evaluate LLM interactions in real time and generate alerts when critical issues occur.
  • Behavioral Observability: Observe conversations and LLM behaviors quickly and easily.
  • Gateway: Use the OpenAI spec to access any inference provider or local model.
  • Cloud or Self-Hosted: Use the Okareo managed cloud proxy or run the proxy yourself using the Okareo CLI.

Using the Cloud Proxy

The hosted cloud proxy is accessible at https://proxy.okareo.com, providing a straightforward way to integrate Okareo's capabilities into your applications without the need for local setup.

To use the hosted cloud proxy from the OpenAI library, set base_url to https://proxy.okareo.com and pass your OKAREO_API_KEY as the api-key header in default_headers. Your model provider's key still goes in api_key.

from openai import OpenAI

openai = OpenAI(
    # Route all requests through the Okareo proxy
    base_url="https://proxy.okareo.com",
    # Okareo API key, sent as a header, enables monitoring and evaluation
    default_headers={"api-key": "<OKAREO_API_KEY>"},
    # Key for the underlying model provider (e.g. OpenAI, Anthropic)
    api_key="<YOUR_LLM_PROVIDER_KEY>",
)

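Once the client is configured, requests work exactly as they would against the OpenAI API directly. The sketch below assumes a standard chat-completion call; the model id and prompt are placeholders:

response = openai.chat.completions.create(
    model="gpt-4",  # any model id supported by the proxy (see the table below)
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
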
Sample of available models by provider

This is a short list of models natively supported by the Okareo proxy. You can also add your own models by self-hosting and adding a simple configuration file.

warning

Make sure to include the credentials (usually an API key) specific to the model you are trying to access.

| Provider | Sample of Available Models |
| --- | --- |
| OpenAI | gpt-4, gpt-4-32k, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-instruct, text-davinci-003 |
| Anthropic | anthropic/claude-4, anthropic/claude-3.7, anthropic/claude-3.5, anthropic/claude-2.1, anthropic/claude-instant-1.2 |
| Google (Gemini) | gemini/gemini-2.5-pro-preview, gemini/gemini-2.5-flash-preview, gemini/gemini-2.0-pro, gemini/gemini-2.0-flash, gemini/gemini-1.5-pro |
| Azure (OpenAI) | azure/gpt-4, azure/gpt-4-32k, azure/gpt-3.5-turbo, azure/gpt-3.5-turbo-16k, azure/text-davinci-003 |
| AWS Bedrock | bedrock/anthropic.claude-v2, bedrock/amazon.titan-text-express-v1, bedrock/amazon.titan-text-lite-v1, bedrock/ai21.j2-ultra-v1, bedrock/ai21.j2-mid-v1 |
| Hugging Face | huggingface/meta-llama/Llama-2-70b-chat-hf, huggingface/tiiuae/falcon-40b-instruct, huggingface/mosaicml/mpt-30b-chat, huggingface/lmsys/vicuna-33b-v1.5, huggingface/mistralai/Mistral-7B-v0.1 |
| Cohere | cohere/command-a-03-2025, cohere/command-r-plus, cohere/command-r, cohere/command-light, cohere/command-nightly |
| AI21 Labs | ai21/j2-ultra, ai21/j2-mid, ai21/j2-light |
| IBM watsonx.ai | ibm/granite-13b-chat-v2, ibm/granite-13b-instruct-v2, ibm/granite-13b-chat-v1, ibm/granite-13b-instruct-v1 |
| Meta (via Groq) | groq/Llama-4-Scout-17B-16E-Instruct-FP8, groq/Llama-4-Maverick-17B-128E-Instruct-FP8, groq/Llama-3.3-70B-Instruct, groq/Llama-3.3-8B-Instruct |
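
To target a different provider through the same gateway, pass the provider-prefixed model id from the table and supply that provider's credentials in api_key, as in the setup above. A minimal sketch, assuming an Anthropic model:

anthropic_client = OpenAI(
    base_url="https://proxy.okareo.com",
    default_headers={"api-key": "<OKAREO_API_KEY>"},
    api_key="<YOUR_ANTHROPIC_API_KEY>",  # credentials for the provider you are calling
)
response = anthropic_client.chat.completions.create(
    model="anthropic/claude-3.5",  # provider-prefixed model id from the table above
    messages=[{"role": "user", "content": "Summarize the Okareo proxy in one sentence."}],
)
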
tip

There are too many models to list them all, so if you don't see what you are looking for, email support@okareo.com and we'll send you the magic model_id key.