API Documentation

Everything you need to integrate Optamil into your application. Our API is fully OpenAI-compatible: change one line of code and start saving.

Getting Started

Get up and running with Optamil in under 2 minutes.

1. Get your API key

Sign up at optamil.com to receive your free API key. No credit card required.

2. Make your first request

Optamil is a drop-in replacement for OpenAI. Just change the base URL:

```bash
curl https://api.optamil.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "user", "content": "Explain quantum computing in one sentence."}
    ]
  }'
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.optamil.com/v1",
    api_key="YOUR_API_KEY"
)

response = client.chat.completions.create(
    model="auto",  # Neural Router picks the best model
    messages=[
        {"role": "user", "content": "Explain quantum computing in one sentence."}
    ]
)

print(response.choices[0].message.content)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI({
    baseURL: "https://api.optamil.com/v1",
    apiKey: "YOUR_API_KEY",
});

const response = await client.chat.completions.create({
    model: "auto",  // Neural Router picks the best model
    messages: [
        { role: "user", content: "Explain quantum computing in one sentence." }
    ],
});

console.log(response.choices[0].message.content);
```

Tip: Set model: "auto" to let Neural Router automatically select the optimal model for each query. This is how most users achieve 94% cost savings.

Authentication

All API requests require authentication. Include your API key in one of these headers:

| Method | Header | Example |
| --- | --- | --- |
| Bearer Token (recommended) | Authorization | Bearer sk-optamil-... |
| API Key Header | X-API-Key | sk-optamil-... |
Security: Never expose your API key in client-side code or public repositories. Use environment variables or a backend proxy for browser-based applications.
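
As a sketch of that advice: read the key from an environment variable rather than hard-coding it. The variable name `OPTAMIL_API_KEY` below is illustrative, not something the API mandates; the two header shapes match the table above.

```python
import os

def auth_headers(use_bearer: bool = True) -> dict:
    """Build auth headers from an environment variable instead of
    hard-coding the key. OPTAMIL_API_KEY is an illustrative name."""
    key = os.environ.get("OPTAMIL_API_KEY", "sk-optamil-example")
    if use_bearer:
        # Recommended: standard Bearer token
        return {"Authorization": f"Bearer {key}"}
    # Alternative: dedicated API-key header
    return {"X-API-Key": key}
```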

Chat Completions

POST /v1/chat/completions

Creates a chat completion. Fully compatible with the OpenAI Chat Completions API.

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID or "auto" for Neural Router (recommended). See available models. |
| messages | array | Yes | Array of message objects with role ("system", "user", "assistant") and content. |
| max_tokens | integer | No | Maximum tokens to generate. Defaults to model maximum. |
| temperature | number | No | Sampling temperature (0.0 - 2.0). Lower = more deterministic. Default: 1.0. |
| stream | boolean | No | If true, returns a stream of server-sent events. Default: false. |
| top_p | number | No | Nucleus sampling (0.0 - 1.0). Alternative to temperature. Default: 1.0. |
| stop | string/array | No | Up to 4 stop sequences where generation halts. |
| response_format | object | No | Set {"type": "json_object"} for guaranteed JSON output. |
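
The parameters above can be assembled into a request body with a small helper. This is an illustrative sketch, not part of any official SDK; the allowed set simply mirrors the table.

```python
def build_chat_payload(messages, model="auto", **opts):
    """Assemble a /v1/chat/completions request body. Optional keys
    mirror the parameter table above (illustrative helper)."""
    allowed = {"max_tokens", "temperature", "stream", "top_p",
               "stop", "response_format"}
    unknown = set(opts) - allowed
    if unknown:
        # Catch typos before the API returns a 400
        raise ValueError(f"unsupported parameters: {unknown}")
    return {"model": model, "messages": messages, **opts}
```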

Response

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1712345678,
  "model": "deepseek-v3",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses quantum mechanical phenomena..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 42,
    "total_tokens": 56
  }
}
```

Note: When using model: "auto", the response model field shows which model was actually selected by Neural Router. This is useful for debugging and understanding routing decisions.
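
For example, pulling the routed model and token counts out of the sample response above (shown here as a plain dict, abbreviated):

```python
# The sample response from above, as a dict (abbreviated)
resp = {
    "model": "deepseek-v3",
    "choices": [{"index": 0,
                 "message": {"role": "assistant",
                             "content": "Quantum computing uses ..."},
                 "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 14, "completion_tokens": 42,
              "total_tokens": 56},
}

routed_model = resp["model"]                       # model Neural Router chose
answer = resp["choices"][0]["message"]["content"]  # assistant reply
tokens = resp["usage"]["total_tokens"]             # billing-relevant count
```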

Available Models

Use "auto" for intelligent routing, or specify a model directly:

| Tier | Models | Cost / 1M tokens | Best For |
| --- | --- | --- | --- |
| Free | gemini-2.5-flash, gemini-2.5-pro, llama-3.3-70b | $0.00 | Simple queries, classification, summarization |
| Budget | deepseek-v3, deepseek-r1, qwen-2.5-72b | $0.14 | Code generation, analysis, reasoning |
| Mid | gpt-4.1-mini, claude-3.5-haiku | $1.50 | Complex tasks, nuanced writing |
| Premium | claude-sonnet-4, gpt-5, gpt-4.1 | $5.00 | Expert reasoning, creative work |
| Ultra | claude-opus-4, o3, o4-mini | $15.00 | Frontier tasks, research, max capability |
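
Using the per-tier prices above, a rough cost estimate is tokens / 1M times the tier rate. A minimal sketch, assuming a single blended rate per tier (real billing may differ per model):

```python
# Per-1M-token prices copied from the tier table above
# (illustrative: assumes one blended rate per tier)
TIER_PRICE_PER_1M = {"free": 0.00, "budget": 0.14, "mid": 1.50,
                     "premium": 5.00, "ultra": 15.00}

def estimate_cost(total_tokens: int, tier: str) -> float:
    """Rough dollar cost for a number of tokens on a given tier."""
    return total_tokens / 1_000_000 * TIER_PRICE_PER_1M[tier]
```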

List Models

GET /v1/models

Returns the list of all available models and their metadata.

```bash
curl https://api.optamil.com/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
```

Response

```json
{
  "object": "list",
  "data": [
    {
      "id": "auto",
      "object": "model",
      "owned_by": "optamil",
      "description": "Neural Router - automatic model selection"
    },
    {
      "id": "deepseek-v3",
      "object": "model",
      "owned_by": "deepseek"
    }
    // ... 55+ models
  ]
}
```
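
Once fetched, the model list is ordinary JSON and easy to filter client-side. A quick sketch over an abbreviated copy of the sample payload above:

```python
# Abbreviated sample of the /v1/models response from above
models = {"object": "list", "data": [
    {"id": "auto", "object": "model", "owned_by": "optamil"},
    {"id": "deepseek-v3", "object": "model", "owned_by": "deepseek"},
]}

# All model IDs, and just the third-party (non-Optamil) ones
ids = [m["id"] for m in models["data"]]
third_party = [m["id"] for m in models["data"] if m["owned_by"] != "optamil"]
```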

Usage

GET /v1/usage

Returns your current billing period usage, remaining credits, and tier breakdown.

```bash
curl https://api.optamil.com/v1/usage \
  -H "Authorization: Bearer YOUR_API_KEY"
```

Response

```json
{
  "credits_used": 342,
  "credits_remaining": 158,
  "credits_total": 500,
  "billing_period_start": "2026-04-01T00:00:00Z",
  "billing_period_end": "2026-04-30T23:59:59Z",
  "tier_breakdown": {
    "free": 240,
    "budget": 68,
    "mid": 24,
    "premium": 8,
    "ultra": 2
  }
}
```
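
The usage figures are simple arithmetic over the response. For example, the percentage of credits consumed (sample values from above):

```python
# Sample values from the usage response above (abbreviated)
usage = {"credits_used": 342, "credits_remaining": 158,
         "credits_total": 500}

# Percentage of this billing period's credits already spent
pct_used = 100 * usage["credits_used"] / usage["credits_total"]
```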

Rate Limits

Rate limits vary by plan. When exceeded, the API returns 429 Too Many Requests with a Retry-After header.

| Plan | Requests/min | Requests/day | Tokens/min |
| --- | --- | --- | --- |
| Free | 10 | 200 | 40,000 |
| Starter | 60 | 5,000 | 200,000 |
| Pro | 300 | 50,000 | 1,000,000 |
| Business | 1,000 | 200,000 | 5,000,000 |
| Enterprise | Custom | Custom | Custom |

Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
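
A 429 is best handled by honoring Retry-After when the server sends it, and otherwise backing off exponentially. A minimal sketch of the delay schedule (capped; jitter omitted for brevity):

```python
def backoff_delays(retries: int, base: float = 1.0, cap: float = 60.0):
    """Exponential backoff schedule for retrying 429 responses.
    Prefer the Retry-After header value when the server provides one."""
    # Doubling delays: base, 2*base, 4*base, ... capped at `cap` seconds
    return [min(cap, base * 2 ** i) for i in range(retries)]
```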

Error Codes

Optamil uses standard HTTP status codes. Errors return a JSON body with details.

| Code | Meaning | What To Do |
| --- | --- | --- |
| 400 | Bad Request | Check your request body format, required fields, and parameter types. |
| 401 | Unauthorized | Invalid or missing API key. Check your Authorization header. |
| 402 | Insufficient Credits | You have exceeded your credit allocation. Upgrade your plan or wait for the next billing cycle. |
| 429 | Rate Limited | Too many requests. Wait for Retry-After seconds, then retry with exponential backoff. |
| 500 | Internal Error | Temporary server issue. Retry after a brief delay. If persistent, contact support. |
| 503 | Service Unavailable | Model temporarily unavailable. Neural Router auto-fails over; retry the request. |

Error Response Format

```json
{
  "error": {
    "message": "Insufficient credits. Please upgrade your plan.",
    "type": "insufficient_credits",
    "code": 402
  }
}
```
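
One way to use this shape is to classify an error as retryable or not before giving up. An illustrative sketch based on the table above (the retryable set is an assumption drawn from that table, not an official client behavior):

```python
# Status codes the table above suggests are worth retrying
RETRYABLE = {429, 500, 503}

def should_retry(error: dict) -> bool:
    """Decide from an error body (matching the format above)
    whether retrying the request can help."""
    return error["error"]["code"] in RETRYABLE
```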

Code Examples

Streaming Response

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.optamil.com/v1",
    api_key="YOUR_API_KEY"
)

stream = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Write a haiku about AI."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

```javascript
import OpenAI from "openai";

const client = new OpenAI({
    baseURL: "https://api.optamil.com/v1",
    apiKey: "YOUR_API_KEY",
});

const stream = await client.chat.completions.create({
    model: "auto",
    messages: [{ role: "user", content: "Write a haiku about AI." }],
    stream: true,
});

for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
```

```bash
curl -N https://api.optamil.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Write a haiku about AI."}],
    "stream": true
  }'
```

JSON Mode

```python
response = client.chat.completions.create(
    model="auto",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Respond in JSON."},
        {"role": "user", "content": "List 3 programming languages with their year of creation."}
    ]
)

# Returns: {"languages": [{"name": "Python", "year": 1991}, ...]}
```
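
Since JSON mode returns the JSON as a string in message.content, parse it before use. The literal below stands in for a real response:

```python
import json

# Stand-in for response.choices[0].message.content in JSON mode
content = '{"languages": [{"name": "Python", "year": 1991}]}'
data = json.loads(content)  # now a regular dict
```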

Specifying a Model Directly

```python
# Skip Neural Router — use a specific model
response = client.chat.completions.create(
    model="claude-sonnet-4",  # Direct model access
    messages=[
        {"role": "user", "content": "Analyze the implications of quantum error correction."}
    ],
    max_tokens=2000,
    temperature=0.7
)
```