API Documentation
Everything you need to integrate Optamil into your application. Our API is fully OpenAI-compatible: change one line and start saving.
Getting Started
Get up and running with Optamil in under 2 minutes.
1. Get your API key
Sign up at optamil.com to receive your free API key. No credit card required.
2. Make your first request
Optamil is a drop-in replacement for OpenAI. Just change the base URL:
```bash
curl https://api.optamil.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "user", "content": "Explain quantum computing in one sentence."}
    ]
  }'
```
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.optamil.com/v1",
    api_key="YOUR_API_KEY"
)

response = client.chat.completions.create(
    model="auto",  # Neural Router picks the best model
    messages=[
        {"role": "user", "content": "Explain quantum computing in one sentence."}
    ]
)

print(response.choices[0].message.content)
```
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.optamil.com/v1",
  apiKey: "YOUR_API_KEY",
});

const response = await client.chat.completions.create({
  model: "auto", // Neural Router picks the best model
  messages: [
    { role: "user", content: "Explain quantum computing in one sentence." }
  ],
});

console.log(response.choices[0].message.content);
```
Tip: Set `model: "auto"` to let Neural Router automatically select the optimal model for each query. This is how most users achieve up to 94% cost savings.
Authentication
All API requests require authentication. Include your API key in one of these headers:
| Method | Header | Example |
|---|---|---|
| Bearer Token (recommended) | `Authorization` | `Bearer sk-optamil-...` |
| API Key Header | `X-API-Key` | `sk-optamil-...` |
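Either header style works with any HTTP client, not just the SDKs. A minimal standard-library sketch showing both forms (the key is a placeholder, and `urlopen` is deliberately not called here):

```python
import json
import urllib.request

API_KEY = "sk-optamil-..."  # placeholder key, not a real credential

payload = json.dumps({
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello"}],
}).encode()

# Bearer token style (recommended)
req = urllib.request.Request(
    "https://api.optamil.com/v1/chat/completions",
    data=payload,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

# The X-API-Key style is equivalent:
#   headers={"X-API-Key": API_KEY, "Content-Type": "application/json"}

# urllib.request.urlopen(req) would send the request; omitted here.
print(req.get_header("Authorization"))
```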
Chat Completions
Creates a chat completion. Fully compatible with the OpenAI Chat Completions API.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model ID or `"auto"` for Neural Router (recommended). See available models. |
| `messages` | array | Yes | Array of message objects with `role` (`"system"`, `"user"`, `"assistant"`) and `content`. |
| `max_tokens` | integer | No | Maximum tokens to generate. Defaults to the model maximum. |
| `temperature` | number | No | Sampling temperature (0.0 - 2.0). Lower = more deterministic. Default: 1.0. |
| `stream` | boolean | No | If `true`, returns a stream of server-sent events. Default: `false`. |
| `top_p` | number | No | Nucleus sampling (0.0 - 1.0). Alternative to temperature. Default: 1.0. |
| `stop` | string/array | No | Up to 4 stop sequences where generation halts. |
| `response_format` | object | No | Set `{"type": "json_object"}` for guaranteed JSON output. |
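To see how these parameters combine, here is an illustrative request body using several of the optional fields (the values are arbitrary examples, not recommendations):

```python
import json

# Example request body combining the optional parameters above.
body = {
    "model": "auto",
    "messages": [
        {"role": "system", "content": "You are terse."},
        {"role": "user", "content": "Summarize photosynthesis."},
    ],
    "max_tokens": 256,    # cap the completion length
    "temperature": 0.3,   # lower = more deterministic (range 0.0 - 2.0)
    "stop": ["\n\n"],     # up to 4 stop sequences
    "stream": False,
}

print(json.dumps(body, indent=2))
```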
Response
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1712345678,
  "model": "deepseek-v3",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses quantum mechanical phenomena..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 42,
    "total_tokens": 56
  }
}
```
When you request `model: "auto"`, the `model` field in the response shows which model Neural Router actually selected. This is useful for debugging and understanding routing decisions.
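For example, reading the routed model out of the sample response above (shown here as a plain dict so it runs offline):

```python
# Sample response payload; when model="auto" was requested, the
# "model" field reports what Neural Router actually chose.
response = {
    "id": "chatcmpl-abc123",
    "model": "deepseek-v3",
    "choices": [{"index": 0,
                 "message": {"role": "assistant", "content": "..."},
                 "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 14, "completion_tokens": 42, "total_tokens": 56},
}

routed_model = response["model"]
print(f"Routed to: {routed_model}")  # Routed to: deepseek-v3
```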
Available Models
Use "auto" for intelligent routing, or specify a model directly:
| Tier | Models | Cost / 1M tokens | Best For |
|---|---|---|---|
| Free | `gemini-2.5-flash`, `gemini-2.5-pro`, `llama-3.3-70b` | $0.00 | Simple queries, classification, summarization |
| Budget | `deepseek-v3`, `deepseek-r1`, `qwen-2.5-72b` | $0.14 | Code generation, analysis, reasoning |
| Mid | `gpt-4.1-mini`, `claude-3.5-haiku` | $1.50 | Complex tasks, nuanced writing |
| Premium | `claude-sonnet-4`, `gpt-5`, `gpt-4.1` | $5.00 | Expert reasoning, creative work |
| Ultra | `claude-opus-4`, `o3`, `o4-mini` | $15.00 | Frontier tasks, research, max capability |
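The per-tier prices make back-of-envelope cost estimates straightforward. A sketch using the prices from the table, under the assumption that the listed rate applies to total (prompt + completion) tokens:

```python
# $/1M-token prices copied from the tier table above.
# Assumption: the listed price covers total (prompt + completion) tokens.
TIER_PRICE_PER_M = {
    "free": 0.00, "budget": 0.14, "mid": 1.50, "premium": 5.00, "ultra": 15.00,
}

def estimate_cost(total_tokens: int, tier: str) -> float:
    """Approximate dollar cost of a request at the given tier."""
    return total_tokens / 1_000_000 * TIER_PRICE_PER_M[tier]

# A 2,000-token request at each tier:
for tier in TIER_PRICE_PER_M:
    print(f"{tier:8s} ${estimate_cost(2000, tier):.4f}")
```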
List Models
Returns the list of all available models and their metadata.
```bash
curl https://api.optamil.com/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
```
Response
```json
{
  "object": "list",
  "data": [
    {
      "id": "auto",
      "object": "model",
      "owned_by": "optamil",
      "description": "Neural Router - automatic model selection"
    },
    {
      "id": "deepseek-v3",
      "object": "model",
      "owned_by": "deepseek"
    }
    // ... 55+ models
  ]
}
```
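Because the listing is plain JSON, it is easy to post-process client-side (the OpenAI SDK's `client.models.list()` should also work, since the endpoint is OpenAI-compatible). A sketch grouping the sample payload above by provider:

```python
# Group the models from the sample /v1/models payload by provider.
models_payload = {
    "object": "list",
    "data": [
        {"id": "auto", "object": "model", "owned_by": "optamil",
         "description": "Neural Router - automatic model selection"},
        {"id": "deepseek-v3", "object": "model", "owned_by": "deepseek"},
    ],
}

by_provider: dict[str, list[str]] = {}
for m in models_payload["data"]:
    by_provider.setdefault(m["owned_by"], []).append(m["id"])

print(by_provider)  # {'optamil': ['auto'], 'deepseek': ['deepseek-v3']}
```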
Usage
Returns your current billing period usage, remaining credits, and tier breakdown.
```bash
curl https://api.optamil.com/v1/usage \
  -H "Authorization: Bearer YOUR_API_KEY"
```
Response
```json
{
  "credits_used": 342,
  "credits_remaining": 158,
  "credits_total": 500,
  "billing_period_start": "2026-04-01T00:00:00Z",
  "billing_period_end": "2026-04-30T23:59:59Z",
  "tier_breakdown": {
    "free": 240,
    "budget": 68,
    "mid": 24,
    "premium": 8,
    "ultra": 2
  }
}
```
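A small sketch that turns the sample usage payload above into a quick budget check (field names come from the response; the computation is mine):

```python
# Sample /v1/usage payload from above, as a plain dict.
usage = {
    "credits_used": 342,
    "credits_remaining": 158,
    "credits_total": 500,
    "tier_breakdown": {"free": 240, "budget": 68, "mid": 24,
                       "premium": 8, "ultra": 2},
}

pct_used = usage["credits_used"] / usage["credits_total"] * 100
print(f"{pct_used:.1f}% of credits used")  # 68.4% of credits used

# Sanity check: the tier breakdown accounts for all credits used.
assert sum(usage["tier_breakdown"].values()) == usage["credits_used"]
```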
Rate Limits
Rate limits vary by plan. When exceeded, the API returns 429 Too Many Requests with a Retry-After header.
| Plan | Requests/min | Requests/day | Tokens/min |
|---|---|---|---|
| Free | 10 | 200 | 40,000 |
| Starter | 60 | 5,000 | 200,000 |
| Pro | 300 | 50,000 | 1,000,000 |
| Business | 1,000 | 200,000 | 5,000,000 |
| Enterprise | Custom | Custom | Custom |
Rate-limit status is reported in the response headers `X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset`.
Error Codes
Optamil uses standard HTTP status codes. Errors return a JSON body with details.
| Code | Meaning | What To Do |
|---|---|---|
| 400 | Bad Request | Check your request body format, required fields, and parameter types. |
| 401 | Unauthorized | Invalid or missing API key. Check your `Authorization` header. |
| 402 | Insufficient Credits | You have exceeded your credit allocation. Upgrade your plan or wait for the next billing cycle. |
| 429 | Rate Limited | Too many requests. Wait for `Retry-After` seconds, then retry with exponential backoff. |
| 500 | Internal Error | Temporary server issue. Retry after a brief delay. If persistent, contact support. |
| 503 | Service Unavailable | Model temporarily unavailable. Neural Router auto-fails over; retry the request. |
Error Response Format
```json
{
  "error": {
    "message": "Insufficient credits. Please upgrade your plan.",
    "type": "insufficient_credits",
    "code": 402
  }
}
```
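Client-side, you can branch on the error body to decide whether a retry makes sense. A sketch that follows the error-code table above (the helper name and return shape are mine):

```python
# Per the error-code table, these statuses are worth retrying.
RETRYABLE = {429, 500, 503}

def handle_error(error_body: dict) -> tuple[bool, str]:
    """Return (should_retry, reason) for an error body like the one above."""
    err = error_body["error"]
    if err["code"] in RETRYABLE:
        return True, f"transient ({err['type']}): retry with backoff"
    return False, f"permanent ({err['type']}): {err['message']}"

sample = {"error": {"message": "Insufficient credits. Please upgrade your plan.",
                    "type": "insufficient_credits", "code": 402}}
print(handle_error(sample))
```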
Code Examples
Streaming Response
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.optamil.com/v1",
    api_key="YOUR_API_KEY"
)

stream = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Write a haiku about AI."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.optamil.com/v1",
  apiKey: "YOUR_API_KEY",
});

const stream = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Write a haiku about AI." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
```
```bash
curl -N https://api.optamil.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Write a haiku about AI."}],
    "stream": true
  }'
```
JSON Mode
```python
response = client.chat.completions.create(
    model="auto",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Respond in JSON."},
        {"role": "user", "content": "List 3 programming languages with their year of creation."}
    ]
)
# Returns: {"languages": [{"name": "Python", "year": 1991}, ...]}
```
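Because JSON mode guarantees the content string is valid JSON, it can be parsed directly; a sketch using an illustrative content string in the shape shown above:

```python
import json

# Illustrative content string a JSON-mode completion might return.
content = '{"languages": [{"name": "Python", "year": 1991}]}'

data = json.loads(content)  # safe: JSON mode guarantees valid JSON
print(data["languages"][0]["name"])  # Python
```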
Specifying a Model Directly
# Skip Neural Router — use a specific model
response = client.chat.completions.create(
model="claude-sonnet-4", # Direct model access
messages=[
{"role": "user", "content": "Analyze the implications of quantum error correction."}
],
max_tokens=2000,
temperature=0.7
)