Python Examples
Runnable Python notebooks and toolkits, grouped by use case. Click any card to open in Colab or GitHub.
Safety & Red Teaming
Voice Simulation
Python
Describe a caller in one sentence, get a real phone conversation
Python
Production-quality persona with voice and tone control
Python
Apply quality checks: resolution, consistency, loops
Python
Run multiple callers, get recordings and transcripts
Python
Add new checks to an existing run without placing a new call
Python
Inject real-world conditions: noise, barge-in
Python
Threshold-based pass/fail for CI pipelines
Python
Stress-test with 20+ concurrent calls
Quality & Evaluation
Python
First Multi-Turn Simulation Guide
Python
Conversation Simulation with Custom Endpoint
Python
Simulating Multi-Turn Conversations with an OpenAI Model
Python
Prompt-Target Multi-Turn Simulation (Python)
Python
Custom Model Multi-Turn Simulation (Python)
Python
Tool-Use / Function-Calling Multi-Turn Simulation (Python)
Python
Using Built-in “Checks” to Score Model Behaviour
Python
Classification Evaluation: Labelling 'Pricing vs Returns vs Complaints' Scenarios
Python
Generation Evaluation: Assessing Text Quality in a RAG Pipeline
Python
Scenario Management Walk-through for Uploading Seed Cases
Python
Comparing Dense vs Sparse Embedding Models for Semantic Search
Python
Retrieval Evaluation with Cohere Embeddings and a Pinecone Index
Python
End-to-End Retrieval Evaluation Workflow
Python