API Documentation
Everything you need to integrate Optamil into your application. Our API is fully OpenAI-compatible: change one line and start saving.
Getting Started
Get up and running with Optamil in under 2 minutes.
1. Get your API key
Sign up at optamil.com to receive your free API key. No credit card required.
2. Make your first request
Optamil is a drop-in replacement for OpenAI. Just change the base URL:
```bash
curl https://api.optamil.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "user", "content": "Explain quantum computing in one sentence."}
    ]
  }'
```
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.optamil.com/v1",
    api_key="YOUR_API_KEY"
)

response = client.chat.completions.create(
    model="auto",  # Neural Router picks the best model
    messages=[
        {"role": "user", "content": "Explain quantum computing in one sentence."}
    ]
)

print(response.choices[0].message.content)
```
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.optamil.com/v1",
  apiKey: "YOUR_API_KEY",
});

const response = await client.chat.completions.create({
  model: "auto", // Neural Router picks the best model
  messages: [
    { role: "user", content: "Explain quantum computing in one sentence." }
  ],
});

console.log(response.choices[0].message.content);
```
Tip: Set `model: "auto"` to let Neural Router automatically select the optimal model for each query. This is how most users achieve up to 94% cost savings.
Authentication
All API requests require authentication. Include your API key in one of these headers:
| Method | Header | Example |
|---|---|---|
| Bearer Token (recommended) | `Authorization` | `Bearer sk-optamil-...` |
| API Key Header | `X-API-Key` | `sk-optamil-...` |
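Either header style works with any HTTP client, not just the SDKs. A minimal standard-library sketch showing both forms (the key is a placeholder, and `urlopen` is deliberately not called here):

```python
import json
import urllib.request

API_KEY = "sk-optamil-..."  # placeholder key, not a real credential

payload = json.dumps({
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello"}],
}).encode()

# Bearer token style (recommended)
req = urllib.request.Request(
    "https://api.optamil.com/v1/chat/completions",
    data=payload,
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

# The X-API-Key style is equivalent:
#   headers={"X-API-Key": API_KEY, "Content-Type": "application/json"}

# urllib.request.urlopen(req) would send the request; omitted here.
print(req.get_header("Authorization"))
```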
Chat Completions
Creates a chat completion. Fully compatible with the OpenAI Chat Completions API.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model ID or `"auto"` for Neural Router (recommended). See available models. |
| `messages` | array | Yes | Array of message objects with `role` (`"system"`, `"user"`, `"assistant"`) and `content`. |
| `max_tokens` | integer | No | Maximum tokens to generate. Defaults to the model maximum. |
| `temperature` | number | No | Sampling temperature (0.0 - 2.0). Lower = more deterministic. Default: 1.0. |
| `stream` | boolean | No | If `true`, returns a stream of server-sent events. Default: `false`. |
| `top_p` | number | No | Nucleus sampling (0.0 - 1.0). Alternative to temperature. Default: 1.0. |
| `stop` | string/array | No | Up to 4 stop sequences where generation halts. |
| `response_format` | object | No | Set `{"type": "json_object"}` for guaranteed JSON output. |
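To see how these parameters combine, here is an illustrative request body using several of the optional fields (the values are arbitrary examples, not recommendations):

```python
import json

# Example request body combining the optional parameters above.
body = {
    "model": "auto",
    "messages": [
        {"role": "system", "content": "You are terse."},
        {"role": "user", "content": "Summarize photosynthesis."},
    ],
    "max_tokens": 256,    # cap the completion length
    "temperature": 0.3,   # lower = more deterministic (range 0.0 - 2.0)
    "stop": ["\n\n"],     # up to 4 stop sequences
    "stream": False,
}

print(json.dumps(body, indent=2))
```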
Response
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1712345678,
  "model": "deepseek-v3",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses quantum mechanical phenomena..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 42,
    "total_tokens": 56
  }
}
```
When you request `model: "auto"`, the `model` field in the response shows which model Neural Router actually selected. This is useful for debugging and understanding routing decisions.
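For example, reading the routed model out of the sample response above (shown here as a plain dict so it runs offline):

```python
# Sample response payload; when model="auto" was requested, the
# "model" field reports what Neural Router actually chose.
response = {
    "id": "chatcmpl-abc123",
    "model": "deepseek-v3",
    "choices": [{"index": 0,
                 "message": {"role": "assistant", "content": "..."},
                 "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 14, "completion_tokens": 42, "total_tokens": 56},
}

routed_model = response["model"]
print(f"Routed to: {routed_model}")  # Routed to: deepseek-v3
```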
Available Models
Use "auto" for intelligent routing, or specify a model directly:
| Tier | Models | Cost / 1M tokens | Best For |
|---|---|---|---|
| Free | `gemini-2.5-flash`, `gemini-2.5-pro`, `llama-3.3-70b` | $0.00 | Simple queries, classification, summarization |
| Budget | `deepseek-v3`, `deepseek-r1`, `qwen-2.5-72b` | $0.14 | Code generation, analysis, reasoning |
| Mid | `gpt-4.1-mini`, `claude-3.5-haiku` | $1.50 | Complex tasks, nuanced writing |
| Premium | `claude-sonnet-4`, `gpt-5`, `gpt-4.1` | $5.00 | Expert reasoning, creative work |
| Ultra | `claude-opus-4`, `o3`, `o4-mini` | $15.00 | Frontier tasks, research, max capability |
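The per-tier prices make back-of-envelope cost estimates straightforward. A sketch using the prices from the table, under the assumption that the listed rate applies to total (prompt + completion) tokens:

```python
# $/1M-token prices copied from the tier table above.
# Assumption: the listed price covers total (prompt + completion) tokens.
TIER_PRICE_PER_M = {
    "free": 0.00, "budget": 0.14, "mid": 1.50, "premium": 5.00, "ultra": 15.00,
}

def estimate_cost(total_tokens: int, tier: str) -> float:
    """Approximate dollar cost of a request at the given tier."""
    return total_tokens / 1_000_000 * TIER_PRICE_PER_M[tier]

# A 2,000-token request at each tier:
for tier in TIER_PRICE_PER_M:
    print(f"{tier:8s} ${estimate_cost(2000, tier):.4f}")
```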
List Models
Returns the list of all available models and their metadata.
```bash
curl https://api.optamil.com/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
```
Response
```json
{
  "object": "list",
  "data": [
    {
      "id": "auto",
      "object": "model",
      "owned_by": "optamil",
      "description": "Neural Router - automatic model selection"
    },
    {
      "id": "deepseek-v3",
      "object": "model",
      "owned_by": "deepseek"
    }
    // ... 55+ models
  ]
}
```
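Because the listing is plain JSON, it is easy to post-process client-side (the OpenAI SDK's `client.models.list()` should also work, since the endpoint is OpenAI-compatible). A sketch grouping the sample payload above by provider:

```python
# Group the models from the sample /v1/models payload by provider.
models_payload = {
    "object": "list",
    "data": [
        {"id": "auto", "object": "model", "owned_by": "optamil",
         "description": "Neural Router - automatic model selection"},
        {"id": "deepseek-v3", "object": "model", "owned_by": "deepseek"},
    ],
}

by_provider: dict[str, list[str]] = {}
for m in models_payload["data"]:
    by_provider.setdefault(m["owned_by"], []).append(m["id"])

print(by_provider)  # {'optamil': ['auto'], 'deepseek': ['deepseek-v3']}
```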
Usage
Returns your current billing period usage, remaining credits, and tier breakdown.
```bash
curl https://api.optamil.com/v1/usage \
  -H "Authorization: Bearer YOUR_API_KEY"
```
Response
```json
{
  "credits_used": 342,
  "credits_remaining": 158,
  "credits_total": 500,
  "billing_period_start": "2026-04-01T00:00:00Z",
  "billing_period_end": "2026-04-30T23:59:59Z",
  "tier_breakdown": {
    "free": 240,
    "budget": 68,
    "mid": 24,
    "premium": 8,
    "ultra": 2
  }
}
```
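A small sketch that turns the sample usage payload above into a quick budget check (field names come from the response; the computation is mine):

```python
# Sample /v1/usage payload from above, as a plain dict.
usage = {
    "credits_used": 342,
    "credits_remaining": 158,
    "credits_total": 500,
    "tier_breakdown": {"free": 240, "budget": 68, "mid": 24,
                       "premium": 8, "ultra": 2},
}

pct_used = usage["credits_used"] / usage["credits_total"] * 100
print(f"{pct_used:.1f}% of credits used")  # 68.4% of credits used

# Sanity check: the tier breakdown accounts for all credits used.
assert sum(usage["tier_breakdown"].values()) == usage["credits_used"]
```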
Rate Limits
Rate limits vary by plan. When exceeded, the API returns 429 Too Many Requests with a Retry-After header.
| Plan | Requests/min | Requests/day | Tokens/min |
|---|---|---|---|
| Free | 10 | 200 | 40,000 |
| Starter | 60 | 5,000 | 200,000 |
| Pro | 300 | 50,000 | 1,000,000 |
| Business | 1,000 | 200,000 | 5,000,000 |
| Enterprise | Custom | Custom | Custom |
Rate-limit status is reported in the response headers `X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset`.
Error Codes
Optamil uses standard HTTP status codes. Errors return a JSON body with details.
| Code | Meaning | What To Do |
|---|---|---|
| 400 | Bad Request | Check your request body format, required fields, and parameter types. |
| 401 | Unauthorized | Invalid or missing API key. Check your `Authorization` header. |
| 402 | Insufficient Credits | You have exceeded your credit allocation. Upgrade your plan or wait for the next billing cycle. |
| 429 | Rate Limited | Too many requests. Wait for `Retry-After` seconds, then retry with exponential backoff. |
| 500 | Internal Error | Temporary server issue. Retry after a brief delay. If persistent, contact support. |
| 503 | Service Unavailable | Model temporarily unavailable. Neural Router auto-fails over; retry the request. |
Error Response Format
```json
{
  "error": {
    "message": "Insufficient credits. Please upgrade your plan.",
    "type": "insufficient_credits",
    "code": 402
  }
}
```
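Client-side, you can branch on the error body to decide whether a retry makes sense. A sketch that follows the error-code table above (the helper name and return shape are mine):

```python
# Per the error-code table, these statuses are worth retrying.
RETRYABLE = {429, 500, 503}

def handle_error(error_body: dict) -> tuple[bool, str]:
    """Return (should_retry, reason) for an error body like the one above."""
    err = error_body["error"]
    if err["code"] in RETRYABLE:
        return True, f"transient ({err['type']}): retry with backoff"
    return False, f"permanent ({err['type']}): {err['message']}"

sample = {"error": {"message": "Insufficient credits. Please upgrade your plan.",
                    "type": "insufficient_credits", "code": 402}}
print(handle_error(sample))
```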
Code Examples
Streaming Response
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.optamil.com/v1",
    api_key="YOUR_API_KEY"
)

stream = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Write a haiku about AI."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.optamil.com/v1",
  apiKey: "YOUR_API_KEY",
});

const stream = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Write a haiku about AI." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
```
```bash
curl -N https://api.optamil.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Write a haiku about AI."}],
    "stream": true
  }'
```
JSON Mode
```python
response = client.chat.completions.create(
    model="auto",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Respond in JSON."},
        {"role": "user", "content": "List 3 programming languages with their year of creation."}
    ]
)
# Returns: {"languages": [{"name": "Python", "year": 1991}, ...]}
```
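Because JSON mode guarantees the content string is valid JSON, it can be parsed directly; a sketch using an illustrative content string in the shape shown above:

```python
import json

# Illustrative content string a JSON-mode completion might return.
content = '{"languages": [{"name": "Python", "year": 1991}]}'

data = json.loads(content)  # safe: JSON mode guarantees valid JSON
print(data["languages"][0]["name"])  # Python
```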
Specifying a Model Directly
# Skip Neural Router — use a specific model
response = client.chat.completions.create(
model="claude-sonnet-4", # Direct model access
messages=[
{"role": "user", "content": "Analyze the implications of quantum error correction."}
],
max_tokens=2000,
temperature=0.7
)