
Claude Python SDK Tutorial — Full Guide with Code Examples

Complete guide to using the Anthropic Python SDK with claudeapi.cheap. Covers basic usage, streaming, async, error handling, and multi-turn conversations with code examples.

Getting Started

The official Anthropic Python SDK makes it straightforward to interact with Claude models from any Python application. By pairing it with claudeapi.cheap, you get the same SDK experience at up to 80% off.

This tutorial walks through everything you need: installation, basic usage, streaming, async patterns, error handling, and multi-turn conversations.

Installation

Install the official Anthropic SDK from PyPI:

pip install anthropic
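
To confirm the install worked, you can print the SDK version (the package exposes a __version__ attribute):

python -c "import anthropic; print(anthropic.__version__)"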

You will also need a claudeapi.cheap account. Sign up here if you have not already, then grab your API key from the dashboard.

Basic Usage

Connecting to claudeapi.cheap requires just two changes from the standard Anthropic setup: the base_url and api_key parameters.

from anthropic import Anthropic

client = Anthropic(
    base_url="https://claudeapi.cheap/api/proxy",
    api_key="sk-cc-your-api-key"
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain what a REST API is in simple terms."}
    ]
)

print(message.content[0].text)

That is it. The response format, model behavior, and capabilities are identical to the official API.
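
Beyond the text itself, the response object carries useful metadata you can inspect. A quick illustration using standard attributes from the SDK's response model:

# Token usage, handy for tracking cost
print(message.usage.input_tokens, message.usage.output_tokens)

# Why generation stopped: "end_turn", "max_tokens", etc.
print(message.stop_reason)

# The model that actually served the request
print(message.model)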

Using Environment Variables

For cleaner code and better security, set your credentials as environment variables instead of hardcoding them:

export ANTHROPIC_API_KEY="sk-cc-your-api-key"
export ANTHROPIC_BASE_URL="https://claudeapi.cheap/api/proxy"

Then your Python code is identical to what any standard Anthropic tutorial shows:

from anthropic import Anthropic

client = Anthropic()  # reads from environment variables automatically

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What is machine learning?"}
    ]
)

print(message.content[0].text)

This is the recommended approach for production applications.
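
If you want to fail fast when the variables are missing, a small startup check works well. This is just one pattern, not something the SDK requires:

import os

# Verify credentials are present before constructing the client
for var in ("ANTHROPIC_API_KEY", "ANTHROPIC_BASE_URL"):
    if not os.environ.get(var):
        raise RuntimeError(f"Missing required environment variable: {var}")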

System Prompts

System prompts let you set the behavior and personality of Claude for your application:

message = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=2048,
    system="You are a senior Python developer. Give concise, practical answers with code examples.",
    messages=[
        {"role": "user", "content": "What is the best way to handle exceptions in Python?"}
    ]
)

print(message.content[0].text)

Streaming Responses

For long responses or real-time UI updates, use streaming to receive tokens as they are generated:

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a Python function to merge two sorted lists."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Streaming is especially useful for chatbot interfaces, CLI tools, and any application where perceived latency matters.
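
If you also need the complete message after streaming finishes (for logging or token accounting), the stream helper can assemble it for you. A minimal sketch using the SDK's get_final_message():

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize PEP 8 in three bullets."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    # Assemble the full Message object once streaming completes
    final = stream.get_final_message()

print(f"\nTokens used: {final.usage.output_tokens}")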

Async Usage

For web servers, background workers, and any async Python application, use the AsyncAnthropic client:

import asyncio
from anthropic import AsyncAnthropic

async_client = AsyncAnthropic(
    base_url="https://claudeapi.cheap/api/proxy",
    api_key="sk-cc-your-api-key"
)

async def ask_claude(question: str) -> str:
    message = await async_client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": question}]
    )
    return message.content[0].text

async def main():
    # Run multiple requests concurrently
    questions = [
        "What is a decorator in Python?",
        "Explain list comprehensions.",
        "What is the GIL?"
    ]
    results = await asyncio.gather(
        *[ask_claude(q) for q in questions]
    )
    for q, a in zip(questions, results):
        print(f"Q: {q}")
        print(f"A: {a}\n")

asyncio.run(main())

The async client is ideal for FastAPI, aiohttp, or any asyncio-based application where you want to handle multiple requests without blocking.
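
One caveat: an unbounded asyncio.gather can trip rate limits if you fan out many requests at once. A common pattern (an illustration, not an SDK feature) is to cap concurrency with a semaphore:

semaphore = asyncio.Semaphore(5)  # at most 5 requests in flight

async def ask_claude_bounded(question: str) -> str:
    # Reuses ask_claude() from the example above
    async with semaphore:
        return await ask_claude(question)

# Then gather as before, but through the bounded wrapper:
# results = await asyncio.gather(*[ask_claude_bounded(q) for q in questions])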

Multi-Turn Conversations

To build a conversational experience, pass the full message history with each request:

conversation = []

def chat(user_message: str) -> str:
    conversation.append({"role": "user", "content": user_message})

    message = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="You are a helpful coding tutor.",
        messages=conversation
    )

    assistant_reply = message.content[0].text
    conversation.append({"role": "assistant", "content": assistant_reply})
    return assistant_reply

# Example conversation
print(chat("What is a Python dictionary?"))
print(chat("Show me an example with nested dictionaries."))
print(chat("How do I safely access nested keys?"))

Each follow-up message includes the full conversation context, so Claude can reference earlier exchanges.
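
Long conversations eventually approach the model's context window, so production chatbots usually trim or summarize older history. A minimal sketch of one trimming approach (the cutoff of 20 messages is an arbitrary example, not a limit from the API):

MAX_MESSAGES = 20  # arbitrary example cutoff, tune for your use case

def trimmed_history(conversation: list) -> list:
    # Keep only the most recent messages
    recent = conversation[-MAX_MESSAGES:]
    # The Messages API expects the history to start with a user turn
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent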

Error Handling

Production applications should always handle API errors gracefully:

from anthropic import APIError, RateLimitError, AuthenticationError
import time

def call_claude_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            message = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                messages=messages
            )
            return message.content[0].text

        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            wait_time = 2 ** attempt
            print(f"Rate limited. Retrying in {wait_time}s...")
            time.sleep(wait_time)

        except AuthenticationError:
            print("Invalid API key. Check your credentials.")
            raise

        except APIError as e:
            # status_code only exists on HTTP-status errors, so guard the lookup
            status = getattr(e, "status_code", "unknown")
            print(f"API error (status {status}): {e.message}")
            if attempt == max_retries - 1:
                raise

    raise RuntimeError("Max retries exceeded")
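
Note that the SDK also ships with built-in retry behavior for transient failures; you can configure it at client construction instead of (or alongside) a hand-rolled loop like the one above:

client = Anthropic(
    base_url="https://claudeapi.cheap/api/proxy",
    api_key="sk-cc-your-api-key",
    max_retries=3,   # automatic retries with backoff on retryable errors
    timeout=60.0     # per-request timeout in seconds
)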

For a detailed list of error codes and what they mean, see our error codes reference.

Available Models

All Claude models are available through claudeapi.cheap:

| Model | Best For | Speed |
|-------|----------|-------|
| claude-opus-4-7 | Newest flagship; advanced reasoning, coding, research | Slowest |
| claude-opus-4-6 | Complex reasoning, research, architecture | Slowest |
| claude-sonnet-4-6 | Everyday coding, writing, analysis | Balanced |
| claude-haiku-4-5 | Quick tasks, classification, extraction | Fastest |

For detailed pricing across all models, check our pricing comparison.

Next Steps

Now that you have the Python SDK working with claudeapi.cheap, here are some directions to explore:

  • Read about streaming vs non-streaming to choose the right approach for your app
  • Learn about rate limits and how to work within them
  • Explore cost-saving strategies to optimize your API spending
  • Set up Claude Code for AI-assisted development in your terminal
  • Get your API key at claudeapi.cheap