API Reference
TokenSense is a transparent proxy — you call it the same way you'd call your AI provider directly. This page covers the endpoint format, authentication, and code examples.
Base URL
Your TokenSense endpoint is shown on your dashboard home page. All requests go through this URL:
https://api.tokensense.io/chat/completions
Replace /chat/completions with the appropriate path for your provider and model type.
Supported endpoints
| Endpoint | Method | Description |
|---|---|---|
| /v1/chat/completions | POST | Chat completions (OpenAI format, all providers) |
| /v1/messages | POST | Native Anthropic Messages API |
| /v1beta/models/{model}:{action} | POST | Native Gemini API (generateContent, streamGenerateContent) |
| /v1/embeddings | POST | Text embeddings (OpenAI, xAI, Mistral) |
| /v1/images/generations | POST | Image generation (OpenAI, Google Imagen, fal Flux) |
| /v1/audio/speech | POST | Text-to-speech (OpenAI) |
| /v1/audio/transcriptions | POST | Audio transcription (OpenAI Whisper) |
| /v1/models | GET | List available models |
| /health | GET | Health check (no auth required) |
The most common endpoint is /v1/chat/completions. It accepts OpenAI-format requests for every provider and translates them automatically where the provider's native format differs (Anthropic, Gemini); xAI and Mistral requests pass through unchanged. Use the native endpoints (/v1/messages, /v1beta/models/...) when you need provider-specific features that don't map to the OpenAI format.
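The native Gemini route is the only templated path in the table, so a small helper can make the {model} and {action} placeholders concrete. This is a minimal sketch: gemini_path is a hypothetical helper name, and the model string you pass is whatever your catalog lists (see /v1/models).

```python
from urllib.parse import quote

# The two actions named in the endpoint table above.
GEMINI_ACTIONS = ("generateContent", "streamGenerateContent")


def gemini_path(model: str, action: str = "generateContent") -> str:
    """Fill in the /v1beta/models/{model}:{action} template."""
    if action not in GEMINI_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    # Percent-encode the model name so it is safe inside a URL path segment.
    return f"/v1beta/models/{quote(model, safe='')}:{action}"
```

Append the returned path to the endpoint host shown on your dashboard.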
Authentication
Every request must include a Bearer token with your TokenSense API key:
Authorization: Bearer YOUR_TOKENSENSE_KEY
Your TokenSense API key is not your provider key. You add provider keys separately in Settings → Providers. TokenSense uses them to forward requests on your behalf.
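One way to keep the key out of source code is to read it from the environment and build the header once. This is a sketch only: auth_headers is a hypothetical helper, and TOKENSENSE_API_KEY is a variable name chosen for this example, not one TokenSense defines.

```python
import os


def auth_headers(api_key: str) -> dict:
    """Build the headers for a JSON POST request to TokenSense."""
    if not api_key:
        raise ValueError("TokenSense API key is missing")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# TOKENSENSE_API_KEY is a hypothetical variable name; the fallback is the
# placeholder used throughout this page and will not authenticate.
headers = auth_headers(os.environ.get("TOKENSENSE_API_KEY", "YOUR_TOKENSENSE_KEY"))
```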
Supported formats
TokenSense detects the provider from the model name in your request and routes automatically. You can send requests in any of these formats:
- OpenAI format: /v1/chat/completions with any model. TokenSense translates automatically for all providers.
- Native Anthropic: /v1/messages for the Anthropic Messages API (passthrough, no translation).
- Native Gemini: /v1beta/models/{model}:{action} for the Gemini API (passthrough, no translation).
xAI and Mistral are natively OpenAI-compatible — requests pass through directly via /v1/chat/completions with zero translation.
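To make the difference between the three formats concrete, the sketch below shows a minimal request body for each. The body shapes follow the providers' public API formats. The Anthropic model value is a placeholder, and this page does not say whether the native endpoints need extra provider headers, so treat that as an open question rather than something the example settles.

```python
import json

# OpenAI format, sent to /v1/chat/completions.
openai_format = {
    "model": "gpt-5-mini",
    "messages": [{"role": "user", "content": "Hello, world!"}],
}

# Native Anthropic Messages format, sent to /v1/messages.
# The model value is a placeholder; max_tokens is required by the Messages API.
anthropic_native = {
    "model": "YOUR_ANTHROPIC_MODEL",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello, world!"}],
}

# Native Gemini format, sent to /v1beta/models/{model}:generateContent.
# The model goes in the URL path, not in the body.
gemini_native = {
    "contents": [{"parts": [{"text": "Hello, world!"}]}],
}

bodies = {
    "/v1/chat/completions": openai_format,
    "/v1/messages": anthropic_native,
    "/v1beta/models/{model}:generateContent": gemini_native,
}

for path, body in bodies.items():
    print(path, json.dumps(body))
```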
Attribution headers
Add these optional headers to track costs per workflow and step in your dashboard:
| Header | Purpose | Example |
|---|---|---|
| x-workflow-tag | Identifies the workflow | lead-enrichment |
| x-step | Identifies the step within a workflow | summarize |
| x-execution-id | Groups requests from the same run | exec-12345 |
| x-source | Identifies the tool or platform | make |
| x-project | Associates with a project | client-a |
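Since all five headers are optional, a helper that emits only the ones you set keeps call sites tidy. This is a minimal sketch; attribution_headers is a hypothetical function name, and the header names come from the table above.

```python
from typing import Optional


def attribution_headers(
    workflow_tag: Optional[str] = None,
    step: Optional[str] = None,
    execution_id: Optional[str] = None,
    source: Optional[str] = None,
    project: Optional[str] = None,
) -> dict:
    """Return only the attribution headers that were given a value."""
    candidates = {
        "x-workflow-tag": workflow_tag,
        "x-step": step,
        "x-execution-id": execution_id,
        "x-source": source,
        "x-project": project,
    }
    return {name: value for name, value in candidates.items() if value is not None}
```

Merge the result into your request headers alongside Authorization, for example attribution_headers(workflow_tag="lead-enrichment", step="summarize").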
Response format
TokenSense returns the exact same response as the upstream provider. Your code doesn't need any changes to parse the response — it's identical to calling the provider directly.
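For example, code that already reads an OpenAI chat completion keeps working unchanged. The sketch below assumes an OpenAI-format response from /v1/chat/completions; sample_response is illustrative data, not captured output, and first_reply is a hypothetical helper.

```python
# Illustrative data in the OpenAI chat completion shape (not real output).
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hi there!"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 4, "completion_tokens": 3, "total_tokens": 7},
}


def first_reply(data: dict) -> str:
    """Extract the assistant text from an OpenAI-format chat completion."""
    return data["choices"][0]["message"]["content"]


print(first_reply(sample_response))
```

Responses from the native endpoints follow the Anthropic or Gemini shapes instead, so parse those with the field names your provider documents.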
Code examples
cURL
curl https://api.tokensense.io/chat/completions \
  -H "Authorization: Bearer YOUR_TOKENSENSE_KEY" \
  -H "Content-Type: application/json" \
  -H "x-workflow-tag: my-workflow" \
  -d '{
    "model": "gpt-5-mini",
    "messages": [
      {"role": "user", "content": "Hello, world!"}
    ]
  }'
Python
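import requests

# Send an OpenAI-format chat request through TokenSense.
response = requests.post(
    "https://api.tokensense.io/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_TOKENSENSE_KEY",
        "Content-Type": "application/json",
        # Optional attribution header for per-workflow cost tracking.
        "x-workflow-tag": "my-workflow",
    },
    json={
        "model": "gpt-5-mini",
        "messages": [
            {"role": "user", "content": "Hello, world!"}
        ],
    },
)
# The body is the upstream provider's response, unchanged.
print(response.json())
Node.js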
// Top-level await requires an ES module (or wrap this in an async function).
const response = await fetch(
  "https://api.tokensense.io/chat/completions",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer YOUR_TOKENSENSE_KEY",
      "Content-Type": "application/json",
      // Optional attribution header for per-workflow cost tracking.
      "x-workflow-tag": "my-workflow",
    },
    body: JSON.stringify({
      model: "gpt-5-mini",
      messages: [
        { role: "user", content: "Hello, world!" },
      ],
    }),
  }
);
const data = await response.json();
console.log(data);
Error responses
| Code | Meaning | Fix |
|---|---|---|
| 400 | Invalid request body or missing fields | Check your request payload format |
| 401 | Invalid or missing API key | Check key in dashboard |
| 402 | Budget exceeded or subscription locked | Increase budget or update billing |
| 403 | Missing provider key or routing policy block | Add provider key in Settings → Providers |
| 404 | Model not found in catalog | Check model name, use /v1/models to list |
| 429 | Rate limit exceeded (tiered by plan) | Check Retry-After header, slow down |
| 502 | Upstream provider error | Retry with backoff |
| 504 | Provider timeout | Try a faster model |
See the full Error Codes reference for detailed troubleshooting.
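The retry guidance for 429 and 502 can be sketched as a small function that decides how long to wait. This is one possible policy, not TokenSense's own: retry_delay is a hypothetical helper, the base and cap values are arbitrary defaults, and only the numeric (seconds) form of Retry-After is handled. Other statuses, including 504, are treated as non-retryable here because the table suggests a different fix for them.

```python
from typing import Mapping, Optional

# Statuses the table above suggests retrying.
RETRYABLE = {429, 502}


def retry_delay(
    status: int,
    headers: Mapping[str, str],
    attempt: int,
    base: float = 1.0,
    cap: float = 30.0,
) -> Optional[float]:
    """Return seconds to wait before retry number `attempt` (0-based), or None."""
    if status not in RETRYABLE:
        return None
    if status == 429:
        # Header names are case-insensitive, so normalize before the lookup.
        lowered = {name.lower(): value for name, value in headers.items()}
        retry_after = lowered.get("retry-after")
        if retry_after is not None:
            try:
                return max(0.0, float(retry_after))
            except ValueError:
                pass  # Non-numeric (HTTP-date) value: fall back to backoff.
    # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`.
    return min(cap, base * 2 ** attempt)
```

In a request loop, sleep for the returned number of seconds and stop retrying when the function returns None or a maximum attempt count is reached. Adding random jitter to the backoff is a common refinement when many clients retry at once.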
