A cinematic local proxy that watches every provider stream, repairs broken protocol frames, and syncs the fix across your swarm before your agents ever notice.
Okto runs as a local HTTPS proxy for agent CLIs. It watches raw provider streams, detects malformed chunks before your client sees them, patches repeat failures in real time, then stores the fix as a reusable mutation. When the same failure appears again, the repair is already loaded. Your agent keeps running while the protocol layer is repaired underneath it.
Five things, working together, in the background — all automatic.
Acts as a local proxy between your AI tool (Claude, Gemini, Cursor…) and the provider's servers. Every message passes through it — exactly like a smart router.
Knows what "correct" AI streaming data looks like. The instant it sees anything wrong — missing prefix, broken JSON, truncated stream — it flags it automatically.
Rewrites the broken data chunk on the fly — before your agent ever sees it. The agent receives clean, correct data and continues working without interruption.
Every fix gets stored permanently as a "mutation" — a rule for how to repair this type of broken data in the future. Next session, the fix is already there. No config needed.
Optionally connects to other Okto instances via the RNA bus. When one machine discovers a fix, every machine on your team gets it within seconds, like a herd immune system.
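To make the detect, repair, and remember steps concrete, here is a minimal sketch. It is not Okto's implementation: the function names (`classify_chunk`, `repair_chunk`), the anomaly labels, and the in-memory `genome` dict are hypothetical, chosen only to show how a missing SSE prefix or broken JSON could be flagged and how a repair could be counted for reuse.

```python
import json

SSE_PREFIX = "data: "

def classify_chunk(line: str) -> str:
    """Return 'ok' or a hypothetical anomaly label for one SSE line."""
    if line.startswith(SSE_PREFIX):
        payload = line[len(SSE_PREFIX):]
    elif line.lstrip().startswith("{"):
        # Bare JSON with no "data: " prefix in front of it.
        return "missing_prefix"
    else:
        # Comments, "event:" lines, and blank keep-alives pass through.
        return "ok"
    if payload.strip() == "[DONE]":
        return "ok"
    try:
        json.loads(payload)
    except json.JSONDecodeError:
        return "broken_json"
    return "ok"

def repair_chunk(line: str, genome: dict) -> str:
    """Repair what this sketch knows how to repair and record the anomaly."""
    anomaly = classify_chunk(line)
    if anomaly == "ok":
        return line
    # Remember how often each anomaly was seen (stand-in for a stored mutation).
    genome[anomaly] = genome.get(anomaly, 0) + 1
    if anomaly == "missing_prefix":
        return SSE_PREFIX + line.lstrip()
    # No repair for broken JSON in this sketch; pass it through unchanged.
    return line
```

Calling `repair_chunk('{"text": "hi"}', genome)` returns the same payload with the `data: ` prefix restored and leaves a `missing_prefix` count in `genome`.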
Every message your AI agent sends or receives travels through this pipeline.
From casual debugging to production swarm deployments.
Open the Traffic Inspector dashboard while running your agent. When the crash happens, click the session to see the exact byte chunk that caused it — the raw bytes, the anomaly type, and what the fix would have been.
Sometimes Claude returns nothing and your agent just hangs. Okto shows you the raw SSE frames so you can see whether the provider returned 0 bytes, a malformed "data: [DONE]" terminator, or a network timeout.
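If you want to triage a captured response yourself, the three cases above can be told apart with a few lines of code. This is a rough sketch, assuming you already have the raw bytes and know whether the connection timed out; `diagnose_empty_stream` and its return labels are hypothetical names, not part of Okto.

```python
def is_done_marker(line: str) -> bool:
    """True for a well-formed 'data: [DONE]' terminator line."""
    return line.startswith("data:") and line[5:].strip() == "[DONE]"

def diagnose_empty_stream(raw: bytes, timed_out: bool = False) -> str:
    """Classify why a stream produced no usable content."""
    if timed_out:
        return "network_timeout"
    if not raw:
        return "zero_bytes"
    text = raw.decode("utf-8", errors="replace")
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    has_content = False
    for ln in lines:
        if is_done_marker(ln):
            continue
        if ln.startswith("data:") and ln[5:].lstrip().startswith("{"):
            # Crude heuristic: a data line carrying a JSON object is content.
            has_content = True
        elif "[DONE" in ln:
            # Mentions the terminator but is not a valid terminator line.
            return "malformed_done"
    return "has_content" if has_content else "no_content"
```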
Hit "Replay with genome" in the dashboard to re-run a recorded session's raw chunks through the state machine. See whether your current genome fixes the bug without touching the provider again.
Provider docs don't always match reality. Okto captures the literal bytes coming over the wire — including undocumented fields, edge-case headers, and content-type mismatches you'd never see in a client SDK log.
If your agent fails to parse a response only 1 in 50 times, Okto captures every chunk of every session. Filter for anomalous chunks and find the exact malformed JSON fragment that's slipping through.
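Outside the dashboard, the same filtering idea can be sketched against a saved session file. The record layout here is an assumption: one JSON object per line with a hypothetical `anomaly` field that is null for clean chunks. Check a real session file for the actual field names before relying on this.

```python
import json
from typing import Iterable, Iterator

def anomalous_chunks(jsonl_lines: Iterable[str]) -> Iterator[dict]:
    """Yield only the records whose (assumed) 'anomaly' field is set."""
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if record.get("anomaly"):
            yield record
```

Because the function takes any iterable of lines, you can pass an open file handle directly, for example `anomalous_chunks(open("session.jsonl"))`, where the file name is a placeholder.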
Use okto fabricate to generate synthetic sessions that look real but contain injected chaos — truncated streams, bare JSON without SSE prefixes, model swaps mid-stream — before real providers send them.
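For a feel of what injected chaos looks like at the chunk level, here is a small illustration of two of those failure modes. It does not show how okto fabricate works internally; `truncate_stream` and `strip_prefix_at` are hypothetical helpers that operate on a plain list of SSE lines.

```python
from typing import List

def truncate_stream(chunks: List[str], keep: int, partial: int = 0) -> List[str]:
    """Keep the first `keep` chunks, plus `partial` characters of the next one."""
    out = list(chunks[:keep])
    if partial and keep < len(chunks):
        out.append(chunks[keep][:partial])  # a chunk cut off mid-payload
    return out

def strip_prefix_at(chunks: List[str], index: int) -> List[str]:
    """Return a copy where one chunk has lost its 'data: ' prefix (bare JSON)."""
    out = list(chunks)
    if out[index].startswith("data: "):
        out[index] = out[index][len("data: "):]
    return out

clean = ['data: {"t": "a"}', 'data: {"t": "b"}', 'data: [DONE]']
truncated = truncate_stream(clean, keep=1, partial=8)
bare = strip_prefix_at(clean, index=1)
```

Feeding `truncated` or `bare` to your agent's parser is a quick way to see whether it fails loudly, hangs, or recovers.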
Planning to move from Anthropic to OpenAI? Route both through Okto simultaneously. Compare their streaming protocols side by side and see which mutations your genome needs before your agent handles both.
Add Okto to your CI pipeline. Every pull request routes test API calls through it. If a code change causes the agent to mishandle an SSE stream, the anomaly appears immediately — before it ships.
Switching from LangChain to a custom agent loop? Replay your anomaly library through the fabricator against the new framework. If it crashes on chunk 4, you know before you deploy.
Every session is saved as a JSONL file. Curate the interesting ones — sessions with rare anomalies — into a library. Use them as fixtures to test future versions of your agent or genome.
Deploy Okto as a sidecar proxy on your server. When a provider starts sending broken data during an outage, the proxy fixes it automatically — your agent keeps working, your users see nothing.
Run a broker with okto broker. Every developer's instance joins the RNA bus. One person's agent discovers a fix — everyone's genome updates within seconds. No Slack message required.
When you hit a rate limit, providers send very specific HTTP 429 responses with retry headers. Okto captures these at the wire level, so you can see exactly when, how often, and for how long you're throttled.
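Once you have the captured headers, turning them into a wait time is straightforward. This sketch assumes the provider sends the standard HTTP `Retry-After` header, which carries either a number of seconds or an HTTP date; vendor-specific rate-limit headers are not covered, and `retry_after_seconds` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

def retry_after_seconds(value: str, now: Optional[datetime] = None) -> float:
    """Convert a Retry-After header value into seconds to wait."""
    value = value.strip()
    if value.isdigit():
        # delay-seconds form, e.g. "120"
        return float(value)
    # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    when = parsedate_to_datetime(value)
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means you can retry right away.
    return max(0.0, (when - now).total_seconds())
```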
When a provider has a partial outage — serving malformed chunks to some requests — the genome kicks in immediately. Your agent continues working while other teams are filing bug reports.
For compliance or security purposes, Okto gives you a complete record of every byte exchanged between your systems and AI providers — something no client SDK offers out of the box.
Anthropic, OpenAI, Google, Mistral — they all claim SSE, but each has quirks. Route the same prompts through each provider and compare their raw streaming behavior: chunk sizes, timing, event formats.
Client SDKs hide connection overhead, retry logic, and buffering. Okto measures time-to-first-chunk at the TCP level, giving you the real latency picture for benchmarking providers or regions.
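The metric itself is simple to express. The sketch below times the first non-empty chunk from any iterable of byte chunks. Note the limitation: it measures at the application level, wherever you wrap the stream, not at the TCP level described above, and `time_to_first_chunk` is a hypothetical helper rather than an Okto API.

```python
import time
from typing import Callable, Iterable, Optional, Tuple

def time_to_first_chunk(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], int]:
    """Return (seconds until the first non-empty chunk, total bytes seen)."""
    start = clock()
    first: Optional[float] = None
    total = 0
    for chunk in chunks:
        if first is None and chunk:
            first = clock() - start  # stop the timer on the first real data
        total += len(chunk)
    return first, total
```

The `clock` parameter exists so the function can be tested with a fake timer; in normal use you leave it at the default.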
Protocol anomalies from production AI providers are hard to reproduce in a lab. Okto records them faithfully as they occur. Build a dataset of real-world AI streaming edge cases over time.
The genome is an append-only ledger — you can inspect and extend it directly. Write custom mutations to transform, filter, or annotate AI responses at the byte level for experiments your framework can't reach.
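As a mental model for an append-only ledger of mutations, consider the sketch below. The field names mirror the columns printed by okto genome list (ID, client, confidence, description), but the `Mutation` and `Ledger` classes and the JSONL serialization are assumptions for illustration; the genome's real on-disk format is not described here.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class Mutation:
    id: str
    client: str
    confidence: float
    description: str

class Ledger:
    """Append-only: entries can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._entries: List[Mutation] = []

    def append(self, mutation: Mutation) -> None:
        self._entries.append(mutation)

    def entries(self) -> Tuple[Mutation, ...]:
        # Return an immutable snapshot so callers cannot modify history.
        return tuple(self._entries)

    def to_jsonl(self) -> str:
        # One JSON object per line (an assumed format, not Okto's).
        return "\n".join(json.dumps(asdict(m)) for m in self._entries)

ledger = Ledger()
ledger.append(Mutation("a3f2c1d8", "claude", 1.00, "sse prefix repair"))
ledger.append(Mutation("custom01", "claude", 0.50, "annotate tool calls"))
```

The second entry, with the made-up ID `custom01`, stands in for a custom mutation you might write for an experiment.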
Connect dozens of Okto instances at different locations to a shared broker. When a provider has a regional issue, the anomaly surfaces globally in seconds — like a nervous system for AI infrastructure.
No configuration files. No cloud account. Just run it.
$ pip install okto
$ okto ca trust    # trust the local TLS cert
$ okto run --client claude
✓ Genome loaded: 0 mutations
✓ Proxy active on port 8877
✓ Dashboard live at http://localhost:7878
# On your shared server:
$ okto broker --port 9373

# Each developer runs:
$ okto run \
    --broker ws://shared-server:9373

# Now fixes spread automatically
✓ Connected to RNA broker
# Simulate a truncated stream:
$ okto fabricate \
    -s stream_truncation \
    -o chaos.jsonl

# Other scenarios available:
bare_json
model_swap
iam_hallucination
reflection_loop
infinite_context
# See all learned fixes:
$ okto genome list
ID        Client  Confidence  Description
a3f2c1d8  claude  1.00        sse prefix repair
b9e7a2c1  claude  0.92        json parse repair

# Reset if needed:
$ okto genome clear
The dashboard is available at http://localhost:7878 when you run okto run.
Every session, anomaly, mutation, and replay is one click away — no setup required.