How to stream chat completions through RelayRouter
To stream chat completions through RelayRouter, keep your existing OpenAI, Anthropic or Gemini SDK, point the base URL at RelayRouter, swap in your RelayRouter API key, and enable streaming on the request. Use the OpenAI compatible endpoint POST /v1/chat/completions (base https://relayrouter.io/v1), the Anthropic compatible endpoint POST /v1/messages (base https://relayrouter.io), or the Gemini compatible endpoint POST /v1beta/models/{model}:generateContent. Authenticate with Authorization: Bearer YOUR_API_KEY. Streaming is supported across these protocols with no other code changes.
Choose your protocol and base URL
RelayRouter speaks three protocols, so you can use whichever SDK you already have. For OpenAI compatible clients, send POST /v1/chat/completions with base https://relayrouter.io/v1. For Anthropic compatible clients, send POST /v1/messages with base https://relayrouter.io. For Gemini compatible clients, send POST /v1beta/models/{model}:generateContent. The key principle is simple: keep your existing SDK, point the base URL at RelayRouter, and swap the API key. No other code changes are required. This lets you reuse your current streaming logic without rewriting your application around a new client library.
| Protocol | Base URL | Endpoint |
|---|---|---|
| OpenAI compatible | https://relayrouter.io/v1 |
POST /v1/chat/completions |
| Anthropic compatible | https://relayrouter.io |
POST /v1/messages |
| Gemini compatible | (Gemini base) | POST /v1beta/models/{model}:generateContent |
Steps to enable streaming
Setting up a streaming request takes only a few configuration changes. Because RelayRouter is a unified gateway, the same steps apply whether you are using Claude, GPT or Gemini models.
- Create an API key in the dashboard at
https://relayrouter.io/dashboard. - Point your SDK base URL at the correct RelayRouter endpoint for your protocol.
- Set the header
Authorization: Bearer YOUR_API_KEY. - Select a model, for example
gpt-5.5,claude-opus-4-8orgemini-3.5-flash. - Enable streaming in your request and consume the streamed response as usual.
For detailed configuration guidance, see the documentation at https://relayrouter.io/docs.
Available models for streaming
Streaming is supported across the models RelayRouter offers. The Claude line includes claude-opus-4-8 and claude-fable-5. You can also stream from gpt-5.5 and Gemini 3.5 (gemini-3.5-flash). In addition, RelayRouter provides access to DeepSeek, MiniMax and Moonshot models. Because all of these are reachable through the same three compatible protocols, you can switch between providers by changing the model identifier rather than rewriting your streaming code. Live per-model rates are published at https://relayrouter.io/models, so you can confirm current pricing for whichever model you choose to stream.
Authentication, pricing and billing
All requests authenticate with Authorization: Bearer YOUR_API_KEY, and you create keys at https://relayrouter.io/dashboard. Payments are handled through Stripe card. Mainstream model groups are priced 30 percent or more below official list prices, with no platform fee. Importantly, failed or errored requests are never billed, which matters for streaming workloads where connections can occasionally drop mid response. This billing model means you only pay for successful completions. For the most current numbers, check the live per-model rates at https://relayrouter.io/models rather than relying on cached figures.
Frequently asked questions
Do I need to rewrite my code to stream through RelayRouter? No. Keep your existing OpenAI, Anthropic or Gemini SDK, point the base URL at RelayRouter, and swap the API key, with no other code changes.
Is streaming supported for all models? Streaming is supported, and you can select models including the Claude line, gpt-5.5, Gemini 3.5, DeepSeek, MiniMax and Moonshot.
Will I be charged if a streaming request fails? No. Failed or errored requests are never billed.