
API Documentation

aiapi.cheap is fully compatible with the Anthropic Messages API. Just change the base URL and API key.

Quick Start

  1. Create an account at aiapi.cheap/dashboard
  2. Top up your balance with crypto (USDT, BTC, ETH)
  3. Generate an API key from the dashboard
  4. Replace your Anthropic base URL and API key
  5. Start making requests — that's it!

Authentication

All API requests require a valid API key in the Authorization header. Keys start with sk-cc-.

```http
Authorization: Bearer sk-cc-your-api-key-here
```
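For raw HTTP clients, here is a small sketch of assembling the headers, with a sanity check on the key prefix (the build_headers helper is illustrative, not part of any SDK):

```python
# Illustrative helper: build request headers for aiapi.cheap.
def build_headers(api_key: str) -> dict:
    """Return the HTTP headers an aiapi.cheap request needs."""
    if not api_key.startswith("sk-cc-"):
        raise ValueError("aiapi.cheap keys start with 'sk-cc-'")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = build_headers("sk-cc-example-key")
```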

Base URL

We expose two compatible APIs. Pick the one that matches the SDK or tool you're using.

For Anthropic SDKs & Claude Code

```text
https://aiapi.cheap/api/proxy
```

Set as base_url / baseURL / ANTHROPIC_BASE_URL. The SDK appends /v1/messages automatically — do not add /v1 yourself.

Works with: official anthropic Python & Node SDKs, Claude Code CLI, Cursor (Anthropic provider), Cline (Anthropic provider).

For OpenAI-compatible tools

```text
https://aiapi.cheap/api/proxy/v1
```

Set as base_url / baseURL. The OpenAI client appends /chat/completions automatically — the /v1 suffix IS required here.

Works with: official openai Python & Node SDKs, Kilo Code, Cline (OpenAI Compatible), Cursor (OpenAI Compatible), Cherry Studio, LobeChat, LibreChat, Continue, Roo Code, SwiftRouter.

Quick reference

| Your tool | Base URL to enter |
|---|---|
| Anthropic SDK / Claude Code | https://aiapi.cheap/api/proxy |
| Kilo Code, Continue, Cherry Studio, etc. | https://aiapi.cheap/api/proxy/v1 |
| Raw curl — Anthropic format | https://aiapi.cheap/api/proxy/v1/messages |
| Raw curl — OpenAI format | https://aiapi.cheap/api/proxy/v1/chat/completions |

Available Models

Prices are USD per million tokens, shown as input / output.

| Model ID | Official | Basic (70% off) | Pro (80% off) | Context |
|---|---|---|---|---|
| claude-opus-4-7 | $5 / $25 | $1.50 / $7.50 | $1.00 / $5.00 | 200K |
| claude-sonnet-4-6 | $3 / $15 | $0.90 / $4.50 | $0.60 / $3.00 | 200K |
| claude-haiku-4-5 | $1 / $5 | $0.30 / $1.50 | $0.20 / $1.00 | 200K |
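As a rough sanity check on the table, here is a sketch of estimating a request's cost, assuming the prices are USD per million tokens shown as input / output (the helper and price table below are illustrative, not part of any SDK):

```python
# Illustrative cost estimate on the Pro plan, assuming table prices
# are USD per 1M tokens (input, output). Not an official billing formula.
PRO_PRICES = {
    "claude-opus-4-7": (1.00, 5.00),
    "claude-sonnet-4-6": (0.60, 3.00),
    "claude-haiku-4-5": (0.20, 1.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request on the Pro plan."""
    inp, out = PRO_PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# e.g. 10K input + 2K output tokens on Sonnet:
cost = estimate_cost("claude-sonnet-4-6", 10_000, 2_000)
```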

Messages API

POST /v1/messages — Create a message (Anthropic Messages format).

Request Body

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID |
| messages | array | Yes | Array of message objects |
| max_tokens | integer | No | Max output tokens (default: 4096) |
| system | string \| array | No | System prompt (string or array with cache_control) |
| temperature | float | No | 0.0 to 1.0 |
| stream | boolean | No | Enable SSE streaming |
| thinking | object | No | Enable extended thinking (see below) |
| tools | array | No | Tool definitions for function calling |
| tool_choice | object | No | Control tool selection behavior |

Response

response.json

```json
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-6",
  "content": [
    {
      "type": "text",
      "text": "Hello! How can I help you today?"
    }
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 15
  }
}
```

OpenAI-Compatible API

POST /v1/chat/completions — Same Claude models, OpenAI ChatCompletions wire format.

This endpoint exists so any tool that speaks OpenAI's ChatCompletions API can talk to Claude through us without modification — drop us in as a custom OpenAI provider.

When to use this

Use it whenever your tool asks for an "OpenAI Compatible" / "OpenAI-style" provider. We handle the translation in both directions internally — system prompts, tool calls, streaming, thinking blocks, and prompt caching all map cleanly.

Configuration

| Field | Value |
|---|---|
| Base URL | https://aiapi.cheap/api/proxy/v1 |
| API Key | sk-cc-… (same key as the Anthropic side) |
| Model | claude-opus-4-7 · claude-sonnet-4-6 · claude-haiku-4-5 |

Python (OpenAI SDK)

openai_example.py

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://aiapi.cheap/api/proxy/v1",
    api_key="sk-cc-your-api-key",
)

resp = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ],
)
print(resp.choices[0].message.content)
```

Node.js (OpenAI SDK)

openai_example.ts

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://aiapi.cheap/api/proxy/v1",
  apiKey: "sk-cc-your-api-key",
});

const resp = await client.chat.completions.create({
  model: "claude-sonnet-4-6",
  messages: [
    { role: "user", content: "Explain quantum computing" }
  ],
});
console.log(resp.choices[0].message.content);
```

cURL

terminal

```bash
curl -X POST https://aiapi.cheap/api/proxy/v1/chat/completions \
  -H "Authorization: Bearer sk-cc-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

Editor extension setup (Kilo Code, Cline, Continue, Cursor "OpenAI Compat")

  1. In the extension settings, choose OpenAI Compatible provider.
  2. Set Base URL to https://aiapi.cheap/api/proxy/v1. The /v1 suffix is required — without it the extension hits a 404.
  3. Paste your sk-cc-… key into the API key field.
  4. Add models manually if the extension does not auto-discover: claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5.

Streaming

Set "stream": true to receive Server-Sent Events (SSE). Streaming works with every model and is compatible with extended thinking and prompt caching.

streaming.py

```python
# `client` is the Anthropic SDK client (see the Python SDK section below).
with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a haiku about Vietnam"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```
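On the wire, the stream is a sequence of SSE events. For tools that do not use an SDK, here is a minimal sketch of pulling text out of raw "data:" lines, assuming Anthropic-style content_block_delta / text_delta event shapes (the extract_text helper and sample lines are illustrative):

```python
import json

def extract_text(sse_lines):
    """Yield text fragments from content_block_delta SSE data lines."""
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip event-name lines, comments, keep-alives
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                yield delta.get("text", "")

# Simplified sample of what the wire format looks like:
sample = [
    'event: content_block_delta',
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hel"}}',
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "lo"}}',
]
text = "".join(extract_text(sample))
```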

Prompt Caching

Cache long system prompts to save up to 90% on repeated input tokens. Works with all models and streaming.

How it works

  1. First request — system prompt is cached. You pay 1.25× input price (cache write).
  2. Next requests — same system prompt is read from cache. You pay only 0.1× input price (90% off).
  3. Cache lasts 5 minutes (refreshed on each use).

Minimum cacheable size: 1,024 tokens for Sonnet 4.x and Opus 4.x, 2,048 tokens for Haiku 4.5. Below the threshold the request still works but nothing is cached.
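The cache arithmetic in the steps above can be sketched as follows (the 1.25x write and 0.1x read multipliers come from the list above; the helper itself is purely illustrative):

```python
# Illustrative comparison: total input cost for `calls` requests that
# reuse the same cached prompt vs. sending it uncached every time.
# Cache write costs 1.25x the normal input price, cache reads 0.1x.
def cached_vs_uncached(prompt_tokens: int, calls: int, price_per_token: float):
    uncached = calls * prompt_tokens * price_per_token
    cached = prompt_tokens * price_per_token * (1.25 + 0.1 * (calls - 1))
    return uncached, cached

# e.g. a 10K-token system prompt reused across 5 calls:
uncached, cached = cached_vs_uncached(10_000, calls=5, price_per_token=1e-6)
```

Caching pays off from the second call onward, since 1.25 + 0.1 is already cheaper than paying full price twice.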

Pass system as an array with cache_control:

caching.py

```python
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "Your long system prompt here...",
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[{"role": "user", "content": "Hello"}]
)

# Response usage will include:
# cache_creation_input_tokens (first call)
# cache_read_input_tokens (subsequent calls)
```

Extended Thinking

Enable extended thinking to let Claude reason through complex problems before answering. Works with streaming.

thinking.py

```python
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=8000,
    thinking={
        "type": "enabled",
        "budget_tokens": 5000
    },
    messages=[{"role": "user", "content": "Solve this step by step..."}]
)

# Response includes thinking + text content blocks:
# content[0] = {"type": "thinking", "thinking": "..."}
# content[1] = {"type": "text", "text": "..."}
```

budget_tokens controls max thinking tokens. Thinking tokens are billed as output tokens. Works on Claude Sonnet 4.6 and Opus 4.7.
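Since the response interleaves thinking and text blocks, a small sketch of separating the reasoning from the visible answer (split_content and the sample blocks are illustrative, mirroring the shapes shown above):

```python
# Illustrative helper: split a response's content list into the
# model's reasoning and its visible answer.
def split_content(content_blocks):
    thinking = [b["thinking"] for b in content_blocks if b["type"] == "thinking"]
    text = [b["text"] for b in content_blocks if b["type"] == "text"]
    return "\n".join(thinking), "\n".join(text)

blocks = [
    {"type": "thinking", "thinking": "Let me work through this..."},
    {"type": "text", "text": "The answer is 42."},
]
reasoning, answer = split_content(blocks)
```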

Python SDK

example.py

```python
from anthropic import Anthropic

client = Anthropic(
    base_url="https://aiapi.cheap/api/proxy",
    api_key="sk-cc-your-api-key"
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain quantum computing"}
    ]
)

print(message.content[0].text)
```

Node.js SDK

example.ts

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: "https://aiapi.cheap/api/proxy",
  apiKey: "sk-cc-your-api-key",
});

const message = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "Explain quantum computing" }
  ],
});

console.log(message.content[0].text);
```

cURL

terminal

```bash
curl -X POST https://aiapi.cheap/api/proxy/v1/messages \
  -H "Authorization: Bearer sk-cc-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

Error Handling

| Status | Error Type | Description |
|---|---|---|
| 401 | authentication_error | Invalid or missing API key |
| 400 | invalid_request_error | Invalid model or malformed request |
| 402 | insufficient_balance | Balance is $0. Top up required. |
| 429 | rate_limit_error | Too many requests |
| 500 | internal_error | Server error |
| 502 | api_error | Model temporarily unavailable |
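A sketch of a client-side retry policy over these statuses, assuming 429/500/502 are transient and everything else should fail fast (the helper names and backoff schedule are illustrative, not part of any SDK):

```python
# Illustrative retry policy for the statuses above: retry transient
# errors with exponential backoff; surface auth/billing errors at once.
RETRYABLE = {429, 500, 502}

def should_retry(status: int, attempt: int, max_attempts: int = 3) -> bool:
    """True if this status is transient and we have attempts left."""
    return status in RETRYABLE and attempt < max_attempts

def backoff_seconds(attempt: int) -> float:
    """Exponential backoff: 1s, 2s, 4s, ..."""
    return 2.0 ** attempt
```

A 402 should never be retried: the request will keep failing until the balance is topped up.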

Rate Limits

| Plan | Price | Requests/min | Tokens/min | Discount |
|---|---|---|---|---|
| Basic | Free | 200 | 1,000,000 | 70% off |
| Pro | $19 lifetime | 500 | 2,000,000 | 80% off |
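To stay under the requests/min cap client-side, here is a sliding-window limiter sketch (RequestLimiter is illustrative, not part of any SDK; pass 200 for Basic or 500 for Pro):

```python
from collections import deque

class RequestLimiter:
    """Illustrative sliding-window limiter for the per-minute caps above."""

    def __init__(self, max_per_minute: int):
        self.max = max_per_minute
        self.times = deque()  # timestamps of requests in the last 60s

    def allow(self, now: float) -> bool:
        """Record a request at `now` (seconds) if it fits under the cap."""
        while self.times and now - self.times[0] >= 60:
            self.times.popleft()  # drop requests older than one minute
        if len(self.times) < self.max:
            self.times.append(now)
            return True
        return False

# Tiny demo with a cap of 2/min: third request is rejected, but a
# request a minute later goes through again.
limiter = RequestLimiter(max_per_minute=2)
results = [limiter.allow(t) for t in (0.0, 1.0, 2.0, 61.0)]
```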

Claude Code Integration

Use aiapi.cheap directly with Claude Code:

~/.bashrc

```bash
export ANTHROPIC_API_KEY="sk-cc-your-api-key"
export ANTHROPIC_BASE_URL="https://aiapi.cheap/api/proxy"
```

Then run claude in your terminal as usual. All requests route through aiapi.cheap, at up to 80% off on the Pro plan.


Need Help?

Contact us at support@aiapi.cheap.