Documentation
One OpenAI-compatible endpoint for every model. If you've used the OpenAI SDK, you already know VietToken — just change the base URL.
Overview
VietToken is a gateway that routes your requests to dozens of LLM providers behind a single API and key. It speaks the OpenAI Chat Completions format, streams tokens straight through (low latency), and fails over automatically across keys and providers.
Base URL: https://api.viettoken.app/v1
Key ideas
- One endpoint: switch models by changing the model string; no new SDK, no re-auth.
- Streaming first: Server-Sent Events pass through unbuffered for fast first tokens.
- Failover: if a model or key fails, traffic reroutes within the same group.
Quickstart
From sign-up to your first streamed token in under a minute.
1. Sign up free. You get trial credit, no card required.
2. Top up once and spend on any model. No subscription.
3. Create an API key under Dashboard → API Keys → Create. Copy it once and store it safely.
4. Point your OpenAI SDK at the VietToken base URL and call any model.
# Chat completion
curl https://api.viettoken.app/v1/chat/completions \
  -H "Authorization: Bearer $VIETTOKEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role":"user","content":"Hello!"}]
  }'

import os

from openai import OpenAI

# Read the key from the environment; a literal "$VIETTOKEN_API_KEY" string is not expanded in Python.
client = OpenAI(
    base_url="https://api.viettoken.app/v1",
    api_key=os.environ["VIETTOKEN_API_KEY"],
)
resp = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.viettoken.app/v1",
  apiKey: process.env.VIETTOKEN_API_KEY,
});
const resp = await client.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(resp.choices[0].message.content);
Authentication
Authenticate every request with a Bearer token in the Authorization header. Create and revoke keys in the dashboard. Treat keys like passwords — never ship them in client-side code.
Authorization: Bearer $VIETTOKEN_API_KEY
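If you are not using an SDK, the header can be built by hand. The sketch below is one possible way to do it, not an official helper: `auth_headers` is a hypothetical name, and it assumes the key lives in the `VIETTOKEN_API_KEY` environment variable.

```python
import os


def auth_headers(env=os.environ):
    """Build request headers from the VIETTOKEN_API_KEY environment variable.

    `auth_headers` is an illustrative helper, not part of any VietToken SDK.
    """
    key = env.get("VIETTOKEN_API_KEY", "").strip()
    if not key:
        # Fail early instead of sending a request that would come back as 401.
        raise RuntimeError("VIETTOKEN_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Passing `env` as a parameter keeps the key out of source code and makes the helper easy to exercise with a plain dictionary.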
Store the key in an environment variable (VIETTOKEN_API_KEY) and load it at runtime.
Chat completions
The core endpoint. Send a list of messages and a model id; get a completion back. Fully OpenAI-compatible — temperature, max_tokens, tools, JSON mode and more all work.
| Parameter | Description |
|---|---|
| model | Model id, e.g. anthropic/claude-sonnet-4. |
| messages | Conversation as role/content objects. |
| stream | Set true to stream tokens via SSE. |
| temperature | Sampling randomness (0–2). Optional. |
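To make the parameters concrete, here is a minimal sketch that assembles a request using only the Python standard library. `build_chat_request` and `BASE_URL` are hypothetical names chosen for illustration; the range check on `temperature` simply mirrors the 0–2 range in the table and is done client-side here as an assumption, not as documented gateway behavior.

```python
import json
import urllib.request

BASE_URL = "https://api.viettoken.app/v1"


def build_chat_request(api_key, model, messages, stream=False, temperature=None):
    """Return an unsent urllib Request for POST /chat/completions."""
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    body = {"model": model, "messages": messages, "stream": stream}
    if temperature is not None:
        body["temperature"] = temperature  # optional, so only sent when set
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it is then a matter of `urllib.request.urlopen(req)`. Switching to a different model only changes the `model` argument; the URL and headers stay the same.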
Streaming
Set stream: true to receive tokens as Server-Sent Events. VietToken passes the stream through unbuffered, so first tokens arrive fast.
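When you read the stream without an SDK, each event arrives as a `data:` line. The sketch below assumes VietToken uses the same framing as OpenAI (JSON chunks carrying `choices[0].delta.content`, terminated by a `data: [DONE]` line); `iter_stream_text` is a hypothetical helper name.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of decoded SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # OpenAI-style end-of-stream sentinel (assumed here)
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # some chunks may carry no choices at all
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            yield text
```

Feed it the decoded lines of the HTTP response body and join or print the fragments as they arrive.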
stream = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
List models
Fetch every model available to your key. Use it to populate dropdowns or validate a model id before calling.
curl https://api.viettoken.app/v1/models \
  -H "Authorization: Bearer $VIETTOKEN_API_KEY"
Prefer a visual list? Browse the full catalog with prices on the Models page.
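Validating a model id before a call could look like the sketch below. It assumes the response follows the OpenAI list shape, `{"object": "list", "data": [{"id": ...}, ...]}`; both function names are hypothetical.

```python
def available_model_ids(models_response):
    """Collect model ids from a parsed GET /models response."""
    return {
        entry["id"]
        for entry in models_response.get("data", [])
        if "id" in entry
    }


def require_model(model_id, models_response):
    """Raise before calling the API if this key cannot see the model."""
    if model_id not in available_model_ids(models_response):
        raise ValueError(f"model {model_id!r} is not available to this key")
    return model_id
```

The same set of ids can feed a dropdown; caching it for a short while avoids refetching the list on every request.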
Errors
VietToken uses standard HTTP status codes. Error bodies follow the OpenAI shape with a message and type.
| Code | Meaning |
|---|---|
| 401 | Invalid or missing API key. |
| 402 | Insufficient credits; top up to continue. |
| 429 | Rate limited; retry with backoff. |
| 5xx | Upstream issue; VietToken automatically retries with the next provider. |
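One way to act on these codes in client code is sketched below. It assumes the body uses the OpenAI envelope, `{"error": {"message": ..., "type": ...}}`; the action labels and the `describe_error` name are made up for this example, and treating a 5xx that reaches the client as "retry later" is an assumption rather than documented behavior.

```python
import json

# Illustrative action labels keyed by the status codes in the table above.
ACTIONS = {
    401: "check_api_key",
    402: "top_up",
    429: "retry_with_backoff",
}


def describe_error(status, body_text):
    """Return (action, error_type, message) for a failed response."""
    if status in ACTIONS:
        action = ACTIONS[status]
    elif 500 <= status <= 599:
        action = "retry_later"
    else:
        action = "inspect"
    try:
        parsed = json.loads(body_text)
    except ValueError:
        parsed = None  # body was not JSON (e.g. an HTML error page)
    err = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(err, dict):
        err = {}
    return action, err.get("type"), err.get("message")
```

Keeping the status-to-action mapping in one table makes it easy to log the `type` and `message` while branching only on the status code.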
Custom providers
Bring your own model: add any OpenAI-compatible endpoint (self-hosted vLLM/Ollama, a private deployment, or another gateway) as a custom provider in the dashboard. Give it a base URL and key, then call it by its model id like any other.
Rate limits
Limits depend on your plan and balance. The gateway load-balances across multiple keys per provider and fails over on 429/5xx, so you get higher effective throughput than a single upstream key.
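If a 429 still reaches your client, a retry schedule along these lines is a common choice. This is a generic sketch of capped exponential backoff with full jitter, not a VietToken-specific recommendation; the default values and the `backoff_delays` name are assumptions.

```python
import random


def backoff_delays(max_retries=5, base=0.5, cap=30.0, rng=random.random):
    """Yield sleep durations in seconds: capped exponential backoff with full jitter."""
    for attempt in range(max_retries):
        ceiling = min(cap, base * (2 ** attempt))  # 0.5, 1, 2, 4, ... up to cap
        yield rng() * ceiling  # a random point in [0, ceiling) spreads out retries
```

Call `time.sleep(delay)` for each yielded value between attempts, and stop as soon as a request succeeds.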