
Claude Python SDK Tutorial — Full Guide with Code Examples

Complete guide to using the Anthropic Python SDK with claudeapi.cheap. Covers basic usage, streaming, async, error handling, and multi-turn conversations with code examples.

Getting Started

The official Anthropic Python SDK makes it straightforward to interact with Claude models from any Python application. By pairing it with claudeapi.cheap, you get the same SDK experience at up to 80% off.

This tutorial walks through everything you need: installation, basic usage, streaming, async patterns, error handling, and multi-turn conversations.

Installation

Install the official Anthropic SDK from PyPI:

pip install anthropic
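
To confirm the install worked, you can print the SDK version (the package exposes a __version__ attribute):

python -c "import anthropic; print(anthropic.__version__)"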

You will also need a claudeapi.cheap account. Sign up here if you have not already, then grab your API key from the dashboard.

Basic Usage

Connecting to claudeapi.cheap requires just two changes from the standard Anthropic setup: the base_url and api_key parameters.

from anthropic import Anthropic

client = Anthropic(
    base_url="https://claudeapi.cheap/api/proxy",
    api_key="sk-cc-your-api-key"
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain what a REST API is in simple terms."}
    ]
)

print(message.content[0].text)

That is it. The response format, model behavior, and capabilities are identical to the official API.
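
Beyond the text itself, the response object carries useful metadata you can inspect. A quick illustration using standard attributes from the SDK's response model:

# Token usage, handy for tracking cost
print(message.usage.input_tokens, message.usage.output_tokens)

# Why generation stopped: "end_turn", "max_tokens", etc.
print(message.stop_reason)

# The model that actually served the request
print(message.model)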

Using Environment Variables

For cleaner code and better security, set your credentials as environment variables instead of hardcoding them:

export ANTHROPIC_API_KEY="sk-cc-your-api-key"
export ANTHROPIC_BASE_URL="https://claudeapi.cheap/api/proxy"

Then your Python code is identical to what any standard Anthropic tutorial shows:

from anthropic import Anthropic

client = Anthropic()  # reads from environment variables automatically

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What is machine learning?"}
    ]
)

print(message.content[0].text)

This is the recommended approach for production applications.
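
If you want to fail fast when the variables are missing, a small startup check works well. This is just one pattern, not something the SDK requires:

import os

# Verify credentials are present before constructing the client
for var in ("ANTHROPIC_API_KEY", "ANTHROPIC_BASE_URL"):
    if not os.environ.get(var):
        raise RuntimeError(f"Missing required environment variable: {var}")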

System Prompts

System prompts let you set the behavior and personality of Claude for your application:

message = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=2048,
    system="You are a senior Python developer. Give concise, practical answers with code examples.",
    messages=[
        {"role": "user", "content": "What is the best way to handle exceptions in Python?"}
    ]
)

print(message.content[0].text)

Streaming Responses

For long responses or real-time UI updates, use streaming to receive tokens as they are generated:

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a Python function to merge two sorted lists."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Streaming is especially useful for chatbot interfaces, CLI tools, and any application where perceived latency matters.
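
If you also need the complete message after streaming finishes (for logging or token accounting), the stream helper can assemble it for you. A minimal sketch using the SDK's get_final_message():

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize PEP 8 in three bullets."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    # Assemble the full Message object once streaming completes
    final = stream.get_final_message()

print(f"\nTokens used: {final.usage.output_tokens}")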

Async Usage

For web servers, background workers, and any async Python application, use the AsyncAnthropic client:

import asyncio
from anthropic import AsyncAnthropic

async_client = AsyncAnthropic(
    base_url="https://claudeapi.cheap/api/proxy",
    api_key="sk-cc-your-api-key"
)

async def ask_claude(question: str) -> str:
    message = await async_client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": question}]
    )
    return message.content[0].text

async def main():
    # Run multiple requests concurrently
    questions = [
        "What is a decorator in Python?",
        "Explain list comprehensions.",
        "What is the GIL?"
    ]
    results = await asyncio.gather(
        *[ask_claude(q) for q in questions]
    )
    for q, a in zip(questions, results):
        print(f"Q: {q}")
        print(f"A: {a}\n")

asyncio.run(main())

The async client is ideal for FastAPI, aiohttp, or any asyncio-based application where you want to handle multiple requests without blocking.
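
One caveat: an unbounded asyncio.gather can trip rate limits if you fan out many requests at once. A common pattern (an illustration, not an SDK feature) is to cap concurrency with a semaphore:

semaphore = asyncio.Semaphore(5)  # at most 5 requests in flight

async def ask_claude_bounded(question: str) -> str:
    # Reuses ask_claude() from the example above
    async with semaphore:
        return await ask_claude(question)

# Then gather as before, but through the bounded wrapper:
# results = await asyncio.gather(*[ask_claude_bounded(q) for q in questions])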

Multi-Turn Conversations

To build a conversational experience, pass the full message history with each request:

conversation = []

def chat(user_message: str) -> str:
    conversation.append({"role": "user", "content": user_message})

    message = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="You are a helpful coding tutor.",
        messages=conversation
    )

    assistant_reply = message.content[0].text
    conversation.append({"role": "assistant", "content": assistant_reply})
    return assistant_reply

# Example conversation
print(chat("What is a Python dictionary?"))
print(chat("Show me an example with nested dictionaries."))
print(chat("How do I safely access nested keys?"))

Each follow-up message includes the full conversation context, so Claude can reference earlier exchanges.
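
Long conversations eventually approach the model's context window, so production chatbots usually trim or summarize older history. A minimal sketch of one trimming approach (the cutoff of 20 messages is an arbitrary example, not a limit from the API):

MAX_MESSAGES = 20  # arbitrary example cutoff, tune for your use case

def trimmed_history(conversation: list) -> list:
    # Keep only the most recent messages
    recent = conversation[-MAX_MESSAGES:]
    # The Messages API expects the history to start with a user turn
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent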

Error Handling

Production applications should always handle API errors gracefully:

from anthropic import APIError, RateLimitError, AuthenticationError
import time

def call_claude_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            message = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                messages=messages
            )
            return message.content[0].text

        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            wait_time = 2 ** attempt
            print(f"Rate limited. Retrying in {wait_time}s...")
            time.sleep(wait_time)

        except AuthenticationError:
            print("Invalid API key. Check your credentials.")
            raise

        except APIError as e:
            # status_code only exists on HTTP-status errors, so guard the lookup
            status = getattr(e, "status_code", "unknown")
            print(f"API error (status {status}): {e.message}")
            if attempt == max_retries - 1:
                raise

    raise RuntimeError("Max retries exceeded")
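
Note that the SDK also ships with built-in retry behavior for transient failures; you can configure it at client construction instead of (or alongside) a hand-rolled loop like the one above:

client = Anthropic(
    base_url="https://claudeapi.cheap/api/proxy",
    api_key="sk-cc-your-api-key",
    max_retries=3,   # automatic retries with backoff on retryable errors
    timeout=60.0     # per-request timeout in seconds
)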

For a detailed list of error codes and what they mean, see our error codes reference.

Available Models

All Claude models are available through claudeapi.cheap:

| Model | Best For | Speed |
|-------|----------|-------|
| claude-opus-4-7 | Newest flagship; advanced reasoning, coding, research | Slowest |
| claude-opus-4-6 | Complex reasoning, research, architecture | Slowest |
| claude-sonnet-4-6 | Everyday coding, writing, analysis | Balanced |
| claude-haiku-4-5 | Quick tasks, classification, extraction | Fastest |

For detailed pricing across all models, check our pricing comparison.

Next Steps

Now that you have the Python SDK working with claudeapi.cheap, here are some directions to explore:

  • Read about streaming vs non-streaming to choose the right approach for your app
  • Learn about rate limits and how to work within them
  • Explore cost-saving strategies to optimize your API spending
  • Set up Claude Code for AI-assisted development in your terminal
  • Get your API key at claudeapi.cheap