# Anthropic Developer Documentation - Full Content This file provides comprehensive documentation with full rendered content. ## Root URL Claude Developer Platform Console (Requires login) https://platform.claude.com ## Available Languages on Website The full documentation is available in the following languages on https://platform.claude.com/docs: - English (en) - 532 pages - ✓ Full content included below - German (Deutsch) (de) - 181 pages - Visit website for content - Spanish (Español) (es) - 181 pages - Visit website for content - French (Français) (fr) - 181 pages - Visit website for content - Italian (Italiano) (it) - 181 pages - Visit website for content - Japanese (日本語) (ja) - 181 pages - Visit website for content - Korean (한국어) (ko) - 181 pages - Visit website for content - Portuguese (Português) (pt-BR) - 181 pages - Visit website for content - Russian (Русский) (ru) - 181 pages - Visit website for content - Chinese Simplified (简体中文) (zh-CN) - 181 pages - Visit website for content - Chinese Traditional (繁體中文) (zh-TW) - 181 pages - Visit website for content - Indonesian (Bahasa Indonesia) (id) - 181 pages - Visit website for content --- # English Documentation - Full Content ## Developer Guide ### First steps --- # Get started with Claude URL: https://platform.claude.com/docs/en/get-started # Get started with Claude Make your first API call to Claude and build a simple web search assistant --- ## Prerequisites - An Anthropic [Console account](/) - An [API key](/settings/keys) ## Call the API Get your API key at the [Claude Console](/settings/keys) and set it as an environment variable: ```bash export ANTHROPIC_API_KEY='your-api-key-here' ``` Run this command to create a simple web search assistant: ```bash curl https://api.anthropic.com/v1/messages \ -H "Content-Type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1000, "messages": [ { "role": "user", "content": "What should I search for to find the latest developments in renewable energy?" } ] }' ``` **Example output:** ```json { "id": "msg_01HCDu5LRGeP2o7s2xGmxyx8", "type": "message", "role": "assistant", "content": [ { "type": "text", "text": "Here are some effective search strategies to find the latest renewable energy developments:\n\n## Search Terms to Use:\n- \"renewable energy news 2024\"\n- \"clean energy breakthrough\"\n- \"solar/wind/battery technology advances\"\n- \"green energy innovations\"\n- \"climate tech developments\"\n- \"energy storage solutions\"\n\n## Best Sources to Check:\n\n**News & Industry Sites:**\n- Renewable Energy World\n- GreenTech Media (now Wood Mackenzie)\n- Energy Storage News\n- CleanTechnica\n- PV Magazine (for solar)\n- WindPower Engineering & Development..." } ], "model": "claude-sonnet-4-5", "stop_reason": "end_turn", "usage": { "input_tokens": 21, "output_tokens": 305 } } ``` Get your API key from the [Claude Console](/settings/keys) and set it as an environment variable: ```bash export ANTHROPIC_API_KEY='your-api-key-here' ``` Install the Anthropic Python SDK: ```bash pip install anthropic ``` Save this as `quickstart.py`: ```python import anthropic client = anthropic.Anthropic() message = client.messages.create( model="claude-sonnet-4-5", max_tokens=1000, messages=[ { "role": "user", "content": "What should I search for to find the latest developments in renewable energy?" } ] ) print(message.content) ``` ```bash python quickstart.py ``` **Example output:** ```python [TextBlock(text='Here are some effective search strategies for finding the latest renewable energy developments:\n\n**Search Terms to Use:**\n- "renewable energy news 2024"\n- "clean energy breakthroughs"\n- "solar/wind/battery technology advances"\n- "energy storage innovations"\n- "green hydrogen developments"\n- "renewable energy policy updates"\n\n**Reliable Sources to Check:**\n- **News & Analysis:** Reuters Energy, Bloomberg New Energy Finance, Greentech Media, Energy Storage News\n- **Industry Publications:** Renewable Energy World, PV Magazine, Wind Power Engineering\n- **Research Organizations:** International Energy Agency (IEA), National Renewable Energy Laboratory (NREL)\n- **Government Sources:** Department of Energy websites, EPA clean energy updates\n\n**Specific Topics to Explore:**\n- Perovskite and next-gen solar cells\n- Offshore wind expansion\n- Grid-scale battery storage\n- Green hydrogen production\n- Carbon capture technologies\n- Smart grid innovations\n- Energy policy changes and incentives...', type='text')] ``` Get your API key from the [Claude Console](/settings/keys) and set it as an environment variable: ```bash export ANTHROPIC_API_KEY='your-api-key-here' ``` Install the Anthropic TypeScript SDK: ```bash npm install @anthropic-ai/sdk ``` Save this as `quickstart.ts`: ```typescript import Anthropic from "@anthropic-ai/sdk"; async function main() { const anthropic = new Anthropic(); const msg = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1000, messages: [ { role: "user", content: "What should I search for to find the latest developments in renewable energy?" } ] }); console.log(msg); } main().catch(console.error); ``` ```bash npx tsx quickstart.ts ``` **Example output:** ```javascript { id: 'msg_01ThFHzad6Bh4TpQ6cHux9t8', type: 'message', role: 'assistant', model: 'claude-sonnet-4-5-20250929', content: [ { type: 'text', text: 'Here are some effective search strategies to find the latest renewable energy developments:\n\n' + '## Search Terms to Use:\n' + '- "renewable energy news 2024"\n' + '- "clean energy breakthroughs"\n' + '- "solar wind technology advances"\n' + '- "energy storage innovations"\n' + '- "green hydrogen developments"\n' + '- "offshore wind projects"\n' + '- "battery technology renewable"\n\n' + '## Best Sources to Check:\n\n' + '**News & Industry Sites:**\n' + '- Renewable Energy World\n' + '- CleanTechnica\n' + '- GreenTech Media (now Wood Mackenzie)\n' + '- Energy Storage News\n' + '- PV Magazine (for solar)...' } ], stop_reason: 'end_turn', usage: { input_tokens: 21, output_tokens: 302 } } ``` Get your API key from the [Claude Console](/settings/keys) and set it as an environment variable: ```bash export ANTHROPIC_API_KEY='your-api-key-here' ``` Add the Anthropic Java SDK to your project. First find the current version on [Maven Central](https://central.sonatype.com/artifact/com.anthropic/anthropic-java). **Gradle:** ```groovy implementation("com.anthropic:anthropic-java:1.0.0") ``` **Maven:** ```xml com.anthropic anthropic-java 1.0.0 ``` Save this as `QuickStart.java`: ```java import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; public class QuickStart { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); MessageCreateParams params = MessageCreateParams.builder() .model("claude-sonnet-4-5-20250929") .maxTokens(1000) .addUserMessage("What should I search for to find the latest developments in renewable energy?") .build(); Message message = client.messages().create(params); System.out.println(message.content()); } } ``` ```bash javac QuickStart.java java QuickStart ``` **Example output:** ```java [ContentBlock{text=TextBlock{text=Here are some effective search strategies to find the latest renewable energy developments: ## Search Terms to Use: - "renewable energy news 2024" - "clean energy breakthroughs" - "solar/wind/battery technology advances" - "energy storage innovations" - "green hydrogen developments" - "renewable energy policy updates" ## Best Sources to Check: - **News & Analysis:** Reuters Energy, Bloomberg New Energy Finance, Greentech Media - **Industry Publications:** Renewable Energy World, PV Magazine, Wind Power Engineering - **Research Organizations:** International Energy Agency (IEA), National Renewable Energy Laboratory (NREL) - **Government Sources:** Department of Energy websites, EPA clean energy updates ## Specific Topics to Explore: - Perovskite and next-gen solar cells - Offshore wind expansion - Grid-scale battery storage - Green hydrogen production..., type=text}}] ``` ## Next steps Now that you have made your first Claude API request, it's time to explore what else is possible: Learn common patterns for the Messages API. Explore Claude's advanced features and capabilities. Discover Anthropic client libraries. Learn with interactive Jupyter notebooks. --- # Intro to Claude URL: https://platform.claude.com/docs/en/intro # Intro to Claude Claude is a highly performant, trustworthy, and intelligent AI platform built by Anthropic. Claude excels at tasks involving language, reasoning, analysis, coding, and more. --- The latest generation of Claude models: **Claude Opus 4.5** - Most intelligent model, and an industry-leader for coding, agents, and computer use. [Learn more](https://www.anthropic.com/news/claude-opus-4-5). **Claude Sonnet 4.5** - Balanced performance and practicality for most uses, including coding and agents. [Learn more](https://www.anthropic.com/news/claude-sonnet-4-5). **Claude Haiku 4.5** - Fastest model with near-frontier intelligence. [Learn more](https://www.anthropic.com/news/claude-haiku-4-5). Looking to chat with Claude? Visit [claude.ai](http://www.claude.ai)! ## Get started If you’re new to Claude, start here to learn the essentials and make your first API call. Set up your development environment for building with Claude. Learn about the family of Claude models. Explore example prompts for inspiration. --- ## Develop with Claude Anthropic has best-in-class developer tools to build scalable applications with Claude. Enjoy easier, more powerful prompting in your browser with the Workbench and the prompt generator tool. Explore, implement, and scale with the Claude API and SDKs. Learn with interactive Jupyter notebooks that demonstrate uploading PDFs, embeddings, and more. --- ## Key capabilities Claude can assist with many tasks that involve text, code, and images. Summarize text, answer questions, extract data, translate text, and explain and generate code. Process and analyze visual input and generate text and code from images. --- ## Support Find answers to frequently asked account and billing questions. Check the status of Anthropic services. ### Models & pricing --- # Models overview URL: https://platform.claude.com/docs/en/about-claude/models/overview # Models overview Claude is a family of state-of-the-art large language models developed by Anthropic. This guide introduces our models and compares their performance. --- ## Choosing a model If you're unsure which model to use, we recommend starting with **Claude Sonnet 4.5**. It offers the best balance of intelligence, speed, and cost for most use cases, with exceptional performance in coding and agentic tasks. All current Claude models support text and image input, text output, multilingual capabilities, and vision. Models are available via the Anthropic API, AWS Bedrock, and Google Vertex AI. Once you've picked a model, [learn how to make your first API call](/docs/en/get-started). ### Latest models comparison | Feature | Claude Sonnet 4.5 | Claude Haiku 4.5 | Claude Opus 4.5 | |:--------|:------------------|:-----------------|:----------------| | **Description** | Our smart model for complex agents and coding | Our fastest model with near-frontier intelligence | Premium model combining maximum intelligence with practical performance | | **Claude API ID** | claude-sonnet-4-5-20250929 | claude-haiku-4-5-20251001 | claude-opus-4-5-20251101 | | **Claude API alias**1 | claude-sonnet-4-5 | claude-haiku-4-5 | claude-opus-4-5 | | **AWS Bedrock ID** | anthropic.claude-sonnet-4-5-20250929-v1:0 | anthropic.claude-haiku-4-5-20251001-v1:0 | anthropic.claude-opus-4-5-20251101-v1:0 | | **GCP Vertex AI ID** | claude-sonnet-4-5@20250929 | claude-haiku-4-5@20251001 | claude-opus-4-5@20251101 | | **Pricing**2 | \$3 / input MTok
\$15 / output MTok | \$1 / input MTok
\$5 / output MTok | \$5 / input MTok
\$25 / output MTok | | **[Extended thinking](/docs/en/build-with-claude/extended-thinking)** | Yes | Yes | Yes | | **[Priority Tier](/docs/en/api/service-tiers)** | Yes | Yes | Yes | | **Comparative latency** | Fast | Fastest | Moderate | | **Context window** | 200K tokens /
1M tokens (beta)3 | 200K tokens | 200K tokens | | **Max output** | 64K tokens | 64K tokens | 64K tokens | | **Reliable knowledge cutoff** | Jan 20254 | Feb 2025 | May 20254 | | **Training data cutoff** | Jul 2025 | Jul 2025 | Aug 2025 | _1 - Aliases automatically point to the most recent model snapshot. When we release new model snapshots, we migrate aliases to point to the newest version of a model, typically within a week of the new release. While aliases are useful for experimentation, we recommend using specific model versions (e.g., `claude-sonnet-4-5-20250929`) in production applications to ensure consistent behavior._ _2 - See our [pricing page](/docs/en/about-claude/pricing) for complete pricing information including batch API discounts, prompt caching rates, extended thinking costs, and vision processing fees._ _3 - Claude Sonnet 4.5 supports a [1M token context window](/docs/en/build-with-claude/context-windows#1m-token-context-window) when using the `context-1m-2025-08-07` beta header. [Long context pricing](/docs/en/about-claude/pricing#long-context-pricing) applies to requests exceeding 200K tokens._ _4 - **Reliable knowledge cutoff** indicates the date through which a model's knowledge is most extensive and reliable. **Training data cutoff** is the broader date range of training data used. For example, Claude Sonnet 4.5 was trained on publicly available information through July 2025, but its knowledge is most extensive and reliable through January 2025. For more information, see [Anthropic's Transparency Hub](https://www.anthropic.com/transparency)._ Models with the same snapshot date (e.g., 20240620) are identical across all platforms and do not change. The snapshot date in the model name ensures consistency and allows developers to rely on stable performance across different environments. Starting with **Claude Sonnet 4.5 and all future models**, AWS Bedrock and Google Vertex AI offer two endpoint types: **global endpoints** (dynamic routing for maximum availability) and **regional endpoints** (guaranteed data routing through specific geographic regions). For more information, see the [third-party platform pricing section](/docs/en/about-claude/pricing#third-party-platform-pricing).
The following models are still available but we recommend migrating to current models for improved performance: | Feature | Claude Opus 4.1 | Claude Sonnet 4 | Claude Sonnet 3.7 | Claude Opus 4 | Claude Haiku 3 | |:--------|:----------------|:----------------|:------------------|:--------------|:---------------| | **Claude API ID** | claude-opus-4-1-20250805 | claude-sonnet-4-20250514 | claude-3-7-sonnet-20250219 | claude-opus-4-20250514 | claude-3-haiku-20240307 | | **Claude API alias** | claude-opus-4-1 | claude-sonnet-4-0 | claude-3-7-sonnet-latest | claude-opus-4-0 | — | | **AWS Bedrock ID** | anthropic.claude-opus-4-1-20250805-v1:0 | anthropic.claude-sonnet-4-20250514-v1:0 | anthropic.claude-3-7-sonnet-20250219-v1:0 | anthropic.claude-opus-4-20250514-v1:0 | anthropic.claude-3-haiku-20240307-v1:0 | | **GCP Vertex AI ID** | claude-opus-4-1@20250805 | claude-sonnet-4@20250514 | claude-3-7-sonnet@20250219 | claude-opus-4@20250514 | claude-3-haiku@20240307 | | **Pricing** | \$15 / input MTok
\$75 / output MTok | \$3 / input MTok
\$15 / output MTok | \$3 / input MTok
\$15 / output MTok | \$15 / input MTok
\$75 / output MTok | \$0.25 / input MTok
\$1.25 / output MTok | | **[Extended thinking](/docs/en/build-with-claude/extended-thinking)** | Yes | Yes | Yes | Yes | No | | **[Priority Tier](/docs/en/api/service-tiers)** | Yes | Yes | Yes | Yes | No | | **Comparative latency** | Moderate | Fast | Fast | Moderate | Fast | | **Context window** | 200K tokens | 200K tokens /
1M tokens (beta)1 | 200K tokens | 200K tokens | 200K tokens | | **Max output** | 32K tokens | 64K tokens | 64K tokens / 128K tokens (beta)4 | 32K tokens | 4K tokens | | **Reliable knowledge cutoff** | Jan 20252 | Jan 20252 | Oct 20242 | Jan 20252 | 3 | | **Training data cutoff** | Mar 2025 | Mar 2025 | Nov 2024 | Mar 2025 | Aug 2023 | _1 - Claude Sonnet 4 supports a [1M token context window](/docs/en/build-with-claude/context-windows#1m-token-context-window) when using the `context-1m-2025-08-07` beta header. [Long context pricing](/docs/en/about-claude/pricing#long-context-pricing) applies to requests exceeding 200K tokens._ _2 - **Reliable knowledge cutoff** indicates the date through which a model's knowledge is most extensive and reliable. **Training data cutoff** is the broader date range of training data used._ _3 - Some Haiku models have a single training data cutoff date._ _4 - Include the beta header `output-128k-2025-02-19` in your API request to increase the maximum output token length to 128K tokens for Claude Sonnet 3.7. We strongly suggest using our [streaming Messages API](/docs/en/build-with-claude/streaming) to avoid timeouts when generating longer outputs. See our guidance on [long requests](/docs/en/api/errors#long-requests) for more details._
## Prompt and output performance Claude 4 models excel in: - **Performance**: Top-tier results in reasoning, coding, multilingual tasks, long-context handling, honesty, and image processing. See the [Claude 4 blog post](http://www.anthropic.com/news/claude-4) for more information. - **Engaging responses**: Claude models are ideal for applications that require rich, human-like interactions. - If you prefer more concise responses, you can adjust your prompts to guide the model toward the desired output length. Refer to our [prompt engineering guides](/docs/en/build-with-claude/prompt-engineering) for details. - For specific Claude 4 prompting best practices, see our [Claude 4 best practices guide](/docs/en/build-with-claude/prompt-engineering/claude-4-best-practices). - **Output quality**: When migrating from previous model generations to Claude 4, you may notice larger improvements in overall performance. ## Migrating to Claude 4.5 If you're currently using Claude 3 models, we recommend migrating to Claude 4.5 to take advantage of improved intelligence and enhanced capabilities. For detailed migration instructions, see [Migrating to Claude 4.5](/docs/en/about-claude/models/migrating-to-claude-4). ## Get started with Claude If you're ready to start exploring what Claude can do for you, let's dive in! Whether you're a developer looking to integrate Claude into your applications or a user wanting to experience the power of AI firsthand, we've got you covered. Looking to chat with Claude? Visit [claude.ai](http://www.claude.ai)! Explore Claude's capabilities and development flow. Learn how to make your first API call in minutes. Craft and test powerful prompts directly in your browser. If you have any questions or need assistance, don't hesitate to reach out to our [support team](https://support.claude.com/) or consult the [Discord community](https://www.anthropic.com/discord). --- # Choosing the right model URL: https://platform.claude.com/docs/en/about-claude/models/choosing-a-model # Choosing the right model Selecting the optimal Claude model for your application involves balancing three key considerations: capabilities, speed, and cost. This guide helps you make an informed decision based on your specific requirements. --- ## Establish key criteria When choosing a Claude model, we recommend first evaluating these factors: - **Capabilities:** What specific features or capabilities will you need the model to have in order to meet your needs? - **Speed:** How quickly does the model need to respond in your application? - **Cost:** What's your budget for both development and production usage? Knowing these answers in advance will make narrowing down and deciding which model to use much easier. *** ## Choose the best model to start with There are two general approaches you can use to start testing which Claude model best works for your needs. ### Option 1: Start with a fast, cost-effective model For many applications, starting with a faster, more cost-effective model like Claude Haiku 4.5 can be the optimal approach: 1. Begin implementation with Claude Haiku 4.5 2. Test your use case thoroughly 3. Evaluate if performance meets your requirements 4. Upgrade only if necessary for specific capability gaps This approach allows for quick iteration, lower development costs, and is often sufficient for many common applications. This approach is best for: - Initial prototyping and development - Applications with tight latency requirements - Cost-sensitive implementations - High-volume, straightforward tasks ### Option 2: Start with the most capable model For complex tasks where intelligence and advanced capabilities are paramount, you may want to start with the most capable model and then consider optimizing to more efficient models down the line: 1. Implement with Claude Sonnet 4.5 2. Optimize your prompts for these models 3. Evaluate if performance meets your requirements 4. Consider increasing efficiency by downgrading intelligence over time with greater workflow optimization This approach is best for: - Complex reasoning tasks - Scientific or mathematical applications - Tasks requiring nuanced understanding - Applications where accuracy outweighs cost considerations - Advanced coding ## Model selection matrix | When you need... | We recommend starting with... | Example use cases | |------------------|-------------------|-------------------| | Best model for complex agents and coding, highest intelligence across most tasks, superior tool orchestration for long-running autonomous tasks | Claude Sonnet 4.5 | Autonomous coding agents, cybersecurity automation, complex financial analysis, multi-hour research tasks, multi agent frameworks | | Maximum intelligence with practical performance for complex specialized tasks | Claude Opus 4.5 | Professional software engineering, advanced agents for office tasks, computer and browser use at scale, step-change vision applications | | Exceptional intelligence and reasoning for specialized complex tasks | Claude Opus 4.1 | Highly complex codebase refactoring, nuanced creative writing, specialized scientific analysis | | Near-frontier performance with lightning-fast speed and extended thinking - our fastest and most intelligent Haiku model at the most economical price point | Claude Haiku 4.5 | Real-time applications, high-volume intelligent processing, cost-sensitive deployments needing strong reasoning, sub-agent tasks | *** ## Decide whether to upgrade or change models To determine if you need to upgrade or change models, you should: 1. [Create benchmark tests](/docs/en/test-and-evaluate/develop-tests) specific to your use case - having a good evaluation set is the most important step in the process 2. Test with your actual prompts and data 3. Compare performance across models for: - Accuracy of responses - Response quality - Handling of edge cases 4. Weigh performance and cost tradeoffs ## Next steps See detailed specifications and pricing for the latest Claude models Explore the latest improvements in Claude 4.5 models Get started with your first API call --- # Migrating to Claude 4.5 URL: https://platform.claude.com/docs/en/about-claude/models/migrating-to-claude-4 # Migrating to Claude 4.5 --- This guide covers two key migration paths to Claude 4.5 models: - **Claude Sonnet 3.7 → Claude Sonnet 4.5**: Our most intelligent model with best-in-class reasoning, coding, and long-running agent capabilities - **Claude Haiku 3.5 → Claude Haiku 4.5**: Our fastest and most intelligent Haiku model with near-frontier performance for real-time applications and high-volume intelligent processing Both migrations involve breaking changes that require updates to your implementation. This guide will walk you through each migration path with step-by-step instructions and clearly marked breaking changes. Before starting your migration, we recommend reviewing [What's new in Claude 4.5](/docs/en/about-claude/models/whats-new-claude-4-5) to understand the new features and capabilities available in these models, including extended thinking, context awareness, and behavioral improvements. ## Migrating from Claude Sonnet 3.7 to Claude Sonnet 4.5 Claude Sonnet 4.5 is our most intelligent model, offering best-in-class performance for reasoning, coding, and long-running autonomous agents. This migration includes several breaking changes that require updates to your implementation. ### Migration steps 1. **Update your model name:** ```python # Before (Claude Sonnet 3.7) model="claude-3-7-sonnet-20250219" # After (Claude Sonnet 4.5) model="claude-sonnet-4-5-20250929" ``` 2. **Update sampling parameters** This is a breaking change from the Claude Sonnet 3.7. Use only `temperature` OR `top_p`, not both: ```python # Before (Claude Sonnet 3.7) - This will error in Sonnet 4.5 response = client.messages.create( model="claude-3-7-sonnet-20250219", temperature=0.7, top_p=0.9, # Cannot use both ... ) # After (Claude Sonnet 4.5) response = client.messages.create( model="claude-sonnet-4-5-20250929", temperature=0.7, # Use temperature OR top_p, not both ... ) ``` 3. **Handle the new `refusal` stop reason** Update your application to [handle `refusal` stop reasons](/docs/en/test-and-evaluate/strengthen-guardrails/handle-streaming-refusals): ```python response = client.messages.create(...) if response.stop_reason == "refusal": # Handle refusal appropriately pass ``` 4. **Update text editor tool (if applicable)** This is a breaking change from the Claude Sonnet 3.7. Update to `text_editor_20250728` (type) and `str_replace_based_edit_tool` (name). Remove any code using the `undo_edit` command. ```python # Before (Claude Sonnet 3.7) tools=[{"type": "text_editor_20250124", "name": "str_replace_editor"}] # After (Claude Sonnet 4.5) tools=[{"type": "text_editor_20250728", "name": "str_replace_based_edit_tool"}] ``` See [Text editor tool documentation](/docs/en/agents-and-tools/tool-use/text-editor-tool) for details. 5. **Update code execution tool (if applicable)** Upgrade to `code_execution_20250825`. The legacy version `code_execution_20250522` still works but is not recommended. See [Code execution tool documentation](/docs/en/agents-and-tools/tool-use/code-execution-tool#upgrade-to-latest-tool-version) for migration instructions. 6. **Remove token-efficient tool use beta header** Token-efficient tool use is a beta feature that only works with Claude 3.7 Sonnet. All Claude 4 models have built-in token-efficient tool use, so you should no longer include the beta header. Remove the `token-efficient-tools-2025-02-19` [beta header](/docs/en/api/beta-headers) from your requests: ```python # Before (Claude Sonnet 3.7) client.messages.create( model="claude-3-7-sonnet-20250219", betas=["token-efficient-tools-2025-02-19"], # Remove this ... ) # After (Claude Sonnet 4.5) client.messages.create( model="claude-sonnet-4-5-20250929", # No token-efficient-tools beta header ... ) ``` 7. **Remove extended output beta header** The `output-128k-2025-02-19` [beta header](/docs/en/api/beta-headers) for extended output is only available in Claude Sonnet 3.7. Remove this header from your requests: ```python # Before (Claude Sonnet 3.7) client.messages.create( model="claude-3-7-sonnet-20250219", betas=["output-128k-2025-02-19"], # Remove this ... ) # After (Claude Sonnet 4.5) client.messages.create( model="claude-sonnet-4-5-20250929", # No output-128k beta header ... ) ``` 8. **Update your prompts for behavioral changes** Claude Sonnet 4.5 has a more concise, direct communication style and requires explicit direction. Review [Claude 4 prompt engineering best practices](/docs/en/build-with-claude/prompt-engineering/claude-4-best-practices) for optimization guidance. 9. **Consider enabling extended thinking for complex tasks** Enable [extended thinking](/docs/en/build-with-claude/extended-thinking) for significant performance improvements on coding and reasoning tasks (disabled by default): ```python response = client.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=16000, thinking={"type": "enabled", "budget_tokens": 10000}, messages=[...] ) ``` Extended thinking impacts [prompt caching](/docs/en/build-with-claude/prompt-caching#caching-with-thinking-blocks) efficiency. 10. **Test your implementation** Test in a development environment before deploying to production to ensure all breaking changes are properly handled. ### Sonnet 3.7 → 4.5 migration checklist - [ ] Update model ID to `claude-sonnet-4-5-20250929` - [ ] **BREAKING**: Update sampling parameters to use only `temperature` OR `top_p`, not both - [ ] Handle new `refusal` stop reason in your application - [ ] **BREAKING**: Update text editor tool to `text_editor_20250728` and `str_replace_based_edit_tool` (if applicable) - [ ] **BREAKING**: Remove any code using the `undo_edit` command (if applicable) - [ ] Update code execution tool to `code_execution_20250825` (if applicable) - [ ] Remove `token-efficient-tools-2025-02-19` beta header (if applicable) - [ ] Remove `output-128k-2025-02-19` beta header (if applicable) - [ ] Review and update prompts following [Claude 4 best practices](/docs/en/build-with-claude/prompt-engineering/claude-4-best-practices) - [ ] Consider enabling extended thinking for complex reasoning tasks - [ ] Handle `model_context_window_exceeded` stop reason (Sonnet 4.5 specific) - [ ] Consider enabling memory tool for long-running agents (beta) - [ ] Consider using automatic tool call clearing for context editing (beta) - [ ] Test in development environment before production deployment ### Features removed from Claude Sonnet 3.7 - **Token-efficient tool use**: The `token-efficient-tools-2025-02-19` beta header only works with Claude 3.7 Sonnet and is not supported in Claude 4 models (see step 6) - **Extended output**: The `output-128k-2025-02-19` beta header is not supported (see step 7) Both headers can be included in Claude 4 requests but will have no effect. ## Migrating from Claude Haiku 3.5 to Claude Haiku 4.5 Claude Haiku 4.5 is our fastest and most intelligent Haiku model with near-frontier performance, delivering premium model quality with real-time performance for interactive applications and high-volume intelligent processing. This migration includes several breaking changes that require updates to your implementation. For a complete overview of new capabilities, see [What's new in Claude 4.5](/docs/en/about-claude/models/whats-new-claude-4-5#key-improvements-in-haiku-4-5-over-haiku-3-5). Haiku 4.5 pricing $1 per million input tokens, $5 per million output tokens. See [Claude pricing](/docs/en/about-claude/pricing) for details. ### Migration steps 1. **Update your model name:** ```python # Before (Haiku 3.5) model="claude-3-5-haiku-20241022" # After (Haiku 4.5) model="claude-haiku-4-5-20251001" ``` 2. **Update tool versions (if applicable)** This is a breaking change from the Claude Haiku 3.5. Haiku 4.5 only supports the latest tool versions: ```python # Before (Haiku 3.5) tools=[{"type": "text_editor_20250124", "name": "str_replace_editor"}] # After (Haiku 4.5) tools=[{"type": "text_editor_20250728", "name": "str_replace_based_edit_tool"}] ``` - **Text editor**: Use `text_editor_20250728` and `str_replace_based_edit_tool` - **Code execution**: Use `code_execution_20250825` - Remove any code using the `undo_edit` command 3. **Update sampling parameters** This is a breaking change from the Claude Haiku 3.5. Use only `temperature` OR `top_p`, not both: ```python # Before (Haiku 3.5) - This will error in Haiku 4.5 response = client.messages.create( model="claude-3-5-haiku-20241022", temperature=0.7, top_p=0.9, # Cannot use both ... ) # After (Haiku 4.5) response = client.messages.create( model="claude-haiku-4-5-20251001", temperature=0.7, # Use temperature OR top_p, not both ... ) ``` 4. **Review new rate limits** Haiku 4.5 has separate rate limits from Haiku 3.5. See [Rate limits documentation](/docs/en/api/rate-limits) for details. 5. **Handle the new `refusal` stop reason** Update your application to [handle refusal stop reasons](/docs/en/test-and-evaluate/strengthen-guardrails/handle-streaming-refusals). 6. **Consider enabling extended thinking for complex tasks** Enable [extended thinking](/docs/en/build-with-claude/extended-thinking) for significant performance improvements on coding and reasoning tasks (disabled by default): ```python response = client.messages.create( model="claude-haiku-4-5-20251001", max_tokens=16000, thinking={"type": "enabled", "budget_tokens": 5000}, messages=[...] ) ``` Extended thinking impacts [prompt caching](/docs/en/build-with-claude/prompt-caching#caching-with-thinking-blocks) efficiency. 7. **Explore new capabilities** See [What's new in Claude 4.5](/docs/en/about-claude/models/whats-new-claude-4-5#key-improvements-in-haiku-4-5-over-haiku-3-5) for details on context awareness, increased output capacity (64K tokens), higher intelligence, and improved speed. 8. **Test your implementation** Test in a development environment before deploying to production to ensure all breaking changes are properly handled. ### Haiku 3.5 → 4.5 migration checklist - [ ] Update model ID to `claude-haiku-4-5-20251001` - [ ] **BREAKING**: Update tool versions to latest (e.g., `text_editor_20250728`, `code_execution_20250825`) - legacy versions not supported - [ ] **BREAKING**: Remove any code using the `undo_edit` command (if applicable) - [ ] **BREAKING**: Update sampling parameters to use only `temperature` OR `top_p`, not both - [ ] Review and adjust for new rate limits (separate from Haiku 3.5) - [ ] Handle new `refusal` stop reason in your application - [ ] Consider enabling extended thinking for complex reasoning tasks (new capability) - [ ] Leverage context awareness for better token management in long sessions - [ ] Prepare for larger responses (max output increased from 8K to 64K tokens) - [ ] Review and update prompts following [Claude 4 best practices](/docs/en/build-with-claude/prompt-engineering/claude-4-best-practices) - [ ] Test in development environment before production deployment ## Choosing between Sonnet 4.5 and Haiku 4.5 Both Claude Sonnet 4.5 and Claude Haiku 4.5 are powerful Claude 4 models with different strengths: ### Choose Claude Sonnet 4.5 (most intelligent) for: - **Complex reasoning and analysis**: Best-in-class intelligence for sophisticated tasks - **Long-running autonomous agents**: Superior performance for agents working independently for extended periods - **Advanced coding tasks**: Our strongest coding model with advanced planning and security engineering - **Large context workflows**: Enhanced context management with memory tool and context editing capabilities - **Tasks requiring maximum capability**: When intelligence and accuracy are the top priorities ### Choose Claude Haiku 4.5 (fastest and most intelligent Haiku) for: - **Real-time applications**: Fast response times for interactive user experiences with near-frontier performance - **High-volume intelligent processing**: Cost-effective intelligence at scale with improved speed - **Cost-sensitive deployments**: Near-frontier performance at lower price points - **Sub-agent architectures**: Fast, intelligent agents for multi-agent systems - **Computer use at scale**: Cost-effective autonomous desktop and browser automation - **Tasks requiring speed**: When low latency is critical while maintaining near-frontier intelligence ### Extended thinking recommendations Claude 4 models, particularly Sonnet and Haiku 4.5, show significant performance improvements when using [extended thinking](/docs/en/build-with-claude/extended-thinking) for coding and complex reasoning tasks. Extended thinking is **disabled by default** but we recommend enabling it for demanding work. **Important**: Extended thinking impacts [prompt caching](/docs/en/build-with-claude/prompt-caching#caching-with-thinking-blocks) efficiency. When non-tool-result content is added to a conversation, thinking blocks are stripped from cache, which can increase costs in multi-turn conversations. We recommend enabling thinking when the performance benefits outweigh the caching trade-off. ## Other migration scenarios The primary migration paths covered above (Sonnet 3.7 → 4.5 and Haiku 3.5 → 4.5) represent the most common upgrades. However, you may be migrating from other Claude models to Claude 4.5. This section covers those scenarios. ### Migrating from Claude Sonnet 4 → Sonnet 4.5 **Breaking change**: Cannot specify both `temperature` and `top_p` in the same request. All other API calls will work without modification. Update your model ID and adjust sampling parameters if needed: ```python # Before (Claude Sonnet 4) model="claude-sonnet-4-20250514" # After (Claude Sonnet 4.5) model="claude-sonnet-4-5-20250929" ``` ### Migrating from Claude Opus 4.1 → Sonnet 4.5 **No breaking changes.** All API calls will work without modification. Simply update your model ID: ```python # Before (Claude Opus 4.1) model="claude-opus-4-1-20250805" # After (Claude Sonnet 4.5) model="claude-sonnet-4-5-20250929" ``` Claude Sonnet 4.5 is our most intelligent model with best-in-class reasoning, coding, and long-running agent capabilities. It offers superior performance compared to Opus 4.1 for most use cases. ### Migrating from Claude Opus 4.1 → Opus 4.5 **No breaking changes.** All API calls will work without modification. Simply update your model ID: ```python # Before (Claude Opus 4.1) model="claude-opus-4-1-20250805" # After (Claude Opus 4.5) model="claude-opus-4-5-20251101" ``` Claude Opus 4.5 is our most intelligent model, combining maximum capability with practical performance. It features step-change improvements in vision, coding, and computer use at a more accessible price point than Opus 4.1. Ideal for complex specialized tasks and professional software engineering. For codebases with many model references, a [Claude Code plugin](https://github.com/anthropics/claude-code/tree/main/plugins/claude-opus-4-5-migration) is available to automate migration to Opus 4.5. ### Migrating between Claude 4.5 models **No breaking changes.** All API calls will work without modification. Simply update your model ID. ## Need help? - Check our [API documentation](/docs/en/api/overview) for detailed specifications - Review [model capabilities](/docs/en/about-claude/models/overview) for performance comparisons - Review [API release notes](/docs/en/release-notes/api) for API updates - Contact support if you encounter any issues during migration --- # Model deprecations URL: https://platform.claude.com/docs/en/about-claude/model-deprecations # Model deprecations --- As we launch safer and more capable models, we regularly retire older models. Applications relying on Anthropic models may need occasional updates to keep working. Impacted customers will always be notified by email and in our documentation. This page lists all API deprecations, along with recommended replacements. ## Overview Anthropic uses the following terms to describe the lifecycle of our models: - **Active**: The model is fully supported and recommended for use. - **Legacy**: The model will no longer receive updates and may be deprecated in the future. - **Deprecated**: The model is no longer available for new customers but continues to be available for existing users until retirement. We assign a retirement date at this point. - **Retired**: The model is no longer available for use. Requests to retired models will fail. Please note that deprecated models are likely to be less reliable than active models. We urge you to move workloads to active models to maintain the highest level of support and reliability. ## Migrating to replacements Once a model is deprecated, please migrate all usage to a suitable replacement before the retirement date. Requests to models past the retirement date will fail. To help measure the performance of replacement models on your tasks, we recommend thorough testing of your applications with the new models well before the retirement date. For specific instructions on migrating from Claude 3.7 to Claude 4.5 models, see [Migrating to Claude 4.5](/docs/en/about-claude/models/migrating-to-claude-4). ## Notifications Anthropic notifies customers with active deployments for models with upcoming retirements. We provide at least 60 days notice before model retirement for publicly released models. ## Auditing model usage To help identify usage of deprecated models, customers can access an audit of their API usage. Follow these steps: 1. Go to the [Usage](/settings/usage) page in Console 2. Click the "Export" button 3. Review the downloaded CSV to see usage broken down by API key and model This audit will help you locate any instances where your application is still using deprecated models, allowing you to prioritize updates to newer models before the retirement date. ## Best practices 1. Regularly check our documentation for updates on model deprecations. 2. Test your applications with newer models well before the retirement date of your current model. 3. Update your code to use the recommended replacement model as soon as possible. 4. Contact our support team if you need assistance with migration or have any questions. ## Deprecation downsides and mitigations We currently deprecate and retire models to ensure capacity for new model releases. We recognize that this comes with downsides: - Users who value specific models must migrate to new versions - Researchers lose access to models for ongoing and comparative studies - Model retirement introduces safety- and model welfare-related risks At some point, we hope to make past models publicly available again. In the meantime, we've committed to long-term preservation of model weights and other measures to help mitigate these impacts. For more details, see [Commitments on Model Deprecation and Preservation](https://www.anthropic.com/research/deprecation-commitments). ## Model status All publicly released models are listed below with their status: | API Model Name | Current State | Deprecated | Tentative Retirement Date | |:----------------------------|:--------------------|:------------------|:-------------------------| | `claude-3-haiku-20240307` | Active | N/A | Not sooner than March 7, 2025 | | `claude-3-5-haiku-20241022` | Deprecated | December 19, 2025 | February 19, 2026 | | `claude-3-7-sonnet-20250219`| Deprecated | October 28, 2025 | February 19, 2026 | | `claude-sonnet-4-20250514` | Active | N/A | Not sooner than May 14, 2026 | | `claude-opus-4-20250514` | Active | N/A | Not sooner than May 14, 2026 | | `claude-opus-4-1-20250805` | Active | N/A | Not sooner than August 5, 2026 | | `claude-sonnet-4-5-20250929`| Active | N/A | Not sooner than September 29, 2026 | | `claude-haiku-4-5-20251001` | Active | N/A | Not sooner than October 15, 2026 | | `claude-opus-4-5-20251101` | Active | N/A | Not sooner than November 24, 2026 | ## Deprecation history All deprecations are listed below, with the most recent announcements at the top. ### 2025-12-19: Claude Haiku 3.5 model On December 19, 2025, we notified developers using Claude Haiku 3.5 model of its upcoming retirement on the Claude API. | Retirement Date | Deprecated Model | Recommended Replacement | |:----------------------------|:----------------------------|:--------------------------------| | February 19, 2026 | `claude-3-5-haiku-20241022` | `claude-haiku-4-5-20251001` | ### 2025-10-28: Claude Sonnet 3.7 model On October 28, 2025, we notified developers using Claude Sonnet 3.7 model of its upcoming retirement on the Claude API. | Retirement Date | Deprecated Model | Recommended Replacement | |:----------------------------|:----------------------------|:--------------------------------| | February 19, 2026 | `claude-3-7-sonnet-20250219`| `claude-sonnet-4-5-20250929` | ### 2025-08-13: Claude Sonnet 3.5 models These models were retired October 28, 2025. On August 13, 2025, we notified developers using Claude Sonnet 3.5 models of their upcoming retirement. | Retirement Date | Deprecated Model | Recommended Replacement | |:----------------------------|:----------------------------|:--------------------------------| | October 28, 2025 | `claude-3-5-sonnet-20240620`| `claude-sonnet-4-5-20250929` | | October 28, 2025 | `claude-3-5-sonnet-20241022`| `claude-sonnet-4-5-20250929` | ### 2025-06-30: Claude Opus 3 model This model was retired January 5, 2026. On June 30, 2025, we notified developers using Claude Opus 3 model of its upcoming retirement. | Retirement Date | Deprecated Model | Recommended Replacement | |:----------------------------|:----------------------------|:--------------------------------| | January 5, 2026 | `claude-3-opus-20240229` | `claude-opus-4-5-20251101` | ### 2025-01-21: Claude 2, Claude 2.1, and Claude Sonnet 3 models These models were retired July 21, 2025. On January 21, 2025, we notified developers using Claude 2, Claude 2.1, and Claude Sonnet 3 models of their upcoming retirements. | Retirement Date | Deprecated Model | Recommended Replacement | |:----------------------------|:----------------------------|:--------------------------------| | July 21, 2025 | `claude-2.0` | `claude-sonnet-4-5-20250929` | | July 21, 2025 | `claude-2.1` | `claude-sonnet-4-5-20250929` | | July 21, 2025 | `claude-3-sonnet-20240229` | `claude-sonnet-4-5-20250929` | ### 2024-09-04: Claude 1 and Instant models These models were retired November 6, 2024. On September 4, 2024, we notified developers using Claude 1 and Instant models of their upcoming retirements. | Retirement Date | Deprecated Model | Recommended Replacement | |:----------------------------|:--------------------------|:---------------------------| | November 6, 2024 | `claude-1.0` | `claude-haiku-4-5-20251001`| | November 6, 2024 | `claude-1.1` | `claude-haiku-4-5-20251001`| | November 6, 2024 | `claude-1.2` | `claude-haiku-4-5-20251001`| | November 6, 2024 | `claude-1.3` | `claude-haiku-4-5-20251001`| | November 6, 2024 | `claude-instant-1.0` | `claude-haiku-4-5-20251001`| | November 6, 2024 | `claude-instant-1.1` | `claude-haiku-4-5-20251001`| | November 6, 2024 | `claude-instant-1.2` | `claude-haiku-4-5-20251001`| --- # Pricing URL: https://platform.claude.com/docs/en/about-claude/pricing # Pricing Learn about Anthropic's pricing structure for models and features --- This page provides detailed pricing information for Anthropic's models and features. All prices are in USD. For the most current pricing information, please visit [claude.com/pricing](https://claude.com/pricing). ## Model pricing The following table shows pricing for all Claude models across different usage tiers: | Model | Base Input Tokens | 5m Cache Writes | 1h Cache Writes | Cache Hits & Refreshes | Output Tokens | |-------------------|-------------------|-----------------|-----------------|----------------------|---------------| | Claude Opus 4.5 | $5 / MTok | $6.25 / MTok | $10 / MTok | $0.50 / MTok | $25 / MTok | | Claude Opus 4.1 | $15 / MTok | $18.75 / MTok | $30 / MTok | $1.50 / MTok | $75 / MTok | | Claude Opus 4 | $15 / MTok | $18.75 / MTok | $30 / MTok | $1.50 / MTok | $75 / MTok | | Claude Sonnet 4.5 | $3 / MTok | $3.75 / MTok | $6 / MTok | $0.30 / MTok | $15 / MTok | | Claude Sonnet 4 | $3 / MTok | $3.75 / MTok | $6 / MTok | $0.30 / MTok | $15 / MTok | | Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) | $3 / MTok | $3.75 / MTok | $6 / MTok | $0.30 / MTok | $15 / MTok | | Claude Haiku 4.5 | $1 / MTok | $1.25 / MTok | $2 / MTok | $0.10 / MTok | $5 / MTok | | Claude Haiku 3.5 | $0.80 / MTok | $1 / MTok | $1.6 / MTok | $0.08 / MTok | $4 / MTok | | Claude Opus 3 ([deprecated](/docs/en/about-claude/model-deprecations)) | $15 / MTok | $18.75 / MTok | $30 / MTok | $1.50 / MTok | $75 / MTok | | Claude Haiku 3 | $0.25 / MTok | $0.30 / MTok | $0.50 / MTok | $0.03 / MTok | $1.25 / MTok | MTok = Million tokens. The "Base Input Tokens" column shows standard input pricing, "Cache Writes" and "Cache Hits" are specific to [prompt caching](/docs/en/build-with-claude/prompt-caching), and "Output Tokens" shows output pricing. Prompt caching offers both 5-minute (default) and 1-hour cache durations to optimize costs for different use cases. The table above reflects the following pricing multipliers for prompt caching: - 5-minute cache write tokens are 1.25 times the base input tokens price - 1-hour cache write tokens are 2 times the base input tokens price - Cache read tokens are 0.1 times the base input tokens price ## Third-party platform pricing Claude models are available on [AWS Bedrock](/docs/en/build-with-claude/claude-on-amazon-bedrock), [Google Vertex AI](/docs/en/build-with-claude/claude-on-vertex-ai), and [Microsoft Foundry](/docs/en/build-with-claude/claude-in-microsoft-foundry). For official pricing, visit: - [AWS Bedrock pricing](https://aws.amazon.com/bedrock/pricing/) - [Google Vertex AI pricing](https://cloud.google.com/vertex-ai/generative-ai/pricing) - [Microsoft Foundry pricing](https://azure.microsoft.com/en-us/pricing/details/ai-foundry/#pricing) **Regional endpoint pricing for Claude 4.5 models and beyond** Starting with Claude Sonnet 4.5 and Haiku 4.5, AWS Bedrock and Google Vertex AI offer two endpoint types: - **Global endpoints**: Dynamic routing across regions for maximum availability - **Regional endpoints**: Data routing guaranteed within specific geographic regions Regional endpoints include a 10% premium over global endpoints. **The Claude API (1P) is global by default and unaffected by this change.** The Claude API is global-only (equivalent to the global endpoint offering and pricing from other providers). **Scope**: This pricing structure applies to Claude Sonnet 4.5, Haiku 4.5, and all future models. Earlier models (Claude Sonnet 4, Opus 4, and prior releases) retain their existing pricing. For implementation details and code examples: - [AWS Bedrock global vs regional endpoints](/docs/en/build-with-claude/claude-on-amazon-bedrock#global-vs-regional-endpoints) - [Google Vertex AI global vs regional endpoints](/docs/en/build-with-claude/claude-on-vertex-ai#global-vs-regional-endpoints) ## Feature-specific pricing ### Batch processing The Batch API allows asynchronous processing of large volumes of requests with a 50% discount on both input and output tokens. | Model | Batch input | Batch output | |-------------------|------------------|-----------------| | Claude Opus 4.5 | $2.50 / MTok | $12.50 / MTok | | Claude Opus 4.1 | $7.50 / MTok | $37.50 / MTok | | Claude Opus 4 | $7.50 / MTok | $37.50 / MTok | | Claude Sonnet 4.5 | $1.50 / MTok | $7.50 / MTok | | Claude Sonnet 4 | $1.50 / MTok | $7.50 / MTok | | Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) | $1.50 / MTok | $7.50 / MTok | | Claude Haiku 4.5 | $0.50 / MTok | $2.50 / MTok | | Claude Haiku 3.5 | $0.40 / MTok | $2 / MTok | | Claude Opus 3 ([deprecated](/docs/en/about-claude/model-deprecations)) | $7.50 / MTok | $37.50 / MTok | | Claude Haiku 3 | $0.125 / MTok | $0.625 / MTok | For more information about batch processing, see our [batch processing documentation](/docs/en/build-with-claude/batch-processing). ### Long context pricing When using Claude Sonnet 4 or Sonnet 4.5 with the [1M token context window enabled](/docs/en/build-with-claude/context-windows#1m-token-context-window), requests that exceed 200K input tokens are automatically charged at premium long context rates: The 1M token context window is currently in beta for organizations in [usage tier](/docs/en/api/rate-limits) 4 and organizations with custom rate limits. The 1M token context window is only available for Claude Sonnet 4 and Sonnet 4.5. | ≤ 200K input tokens | > 200K input tokens | |-----------------------------------|-------------------------------------| | Input: $3 / MTok | Input: $6 / MTok | | Output: $15 / MTok | Output: $22.50 / MTok | Long context pricing stacks with other pricing modifiers: - The [Batch API 50% discount](#batch-processing) applies to long context pricing - [Prompt caching multipliers](#model-pricing) apply on top of long context pricing Even with the beta flag enabled, requests with fewer than 200K input tokens are charged at standard rates. If your request exceeds 200K input tokens, all tokens incur premium pricing. The 200K threshold is based solely on input tokens (including cache reads/writes). Output token count does not affect pricing tier selection, though output tokens are charged at the higher rate when the input threshold is exceeded. To check if your API request was charged at the 1M context window rates, examine the `usage` object in the API response: ```json { "usage": { "input_tokens": 250000, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 0, "output_tokens": 500 } } ``` Calculate the total input tokens by summing: - `input_tokens` - `cache_creation_input_tokens` (if using prompt caching) - `cache_read_input_tokens` (if using prompt caching) If the total exceeds 200,000 tokens, the entire request was billed at 1M context rates. For more information about the `usage` object, see the [API response documentation](/docs/en/api/messages#response-usage). ### Tool use pricing Tool use requests are priced based on: 1. The total number of input tokens sent to the model (including in the `tools` parameter) 2. The number of output tokens generated 3. For server-side tools, additional usage-based pricing (e.g., web search charges per search performed) Client-side tools are priced the same as any other Claude API request, while server-side tools may incur additional charges based on their specific usage. The additional tokens from tool use come from: - The `tools` parameter in API requests (tool names, descriptions, and schemas) - `tool_use` content blocks in API requests and responses - `tool_result` content blocks in API requests When you use `tools`, we also automatically include a special system prompt for the model which enables tool use. The number of tool use tokens required for each model are listed below (excluding the additional tokens listed above). Note that the table assumes at least 1 tool is provided. If no `tools` are provided, then a tool choice of `none` uses 0 additional system prompt tokens. | Model | Tool choice | Tool use system prompt token count | |--------------------------|------------------------------------------------------|---------------------------------------------| | Claude Opus 4.5 | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Opus 4.1 | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Opus 4 | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Sonnet 4.5 | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Sonnet 4 | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Haiku 4.5 | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Haiku 3.5 | `auto`, `none`
`any`, `tool` | 264 tokens
340 tokens | | Claude Opus 3 ([deprecated](/docs/en/about-claude/model-deprecations)) | `auto`, `none`
`any`, `tool` | 530 tokens
281 tokens | | Claude Sonnet 3 | `auto`, `none`
`any`, `tool` | 159 tokens
235 tokens | | Claude Haiku 3 | `auto`, `none`
`any`, `tool` | 264 tokens
340 tokens | These token counts are added to your normal input and output tokens to calculate the total cost of a request. For current per-model prices, refer to our [model pricing](#model-pricing) section above. For more information about tool use implementation and best practices, see our [tool use documentation](/docs/en/agents-and-tools/tool-use/overview). ### Specific tool pricing #### Bash tool The bash tool adds **245 input tokens** to your API calls. Additional tokens are consumed by: - Command outputs (stdout/stderr) - Error messages - Large file contents See [tool use pricing](#tool-use-pricing) for complete pricing details. #### Code execution tool Code execution tool usage is tracked separately from token usage. Execution time has a minimum of 5 minutes. If files are included in the request, execution time is billed even if the tool is not used due to files being preloaded onto the container. Each organization receives 1,550 free hours of usage with the code execution tool per month. Additional usage beyond the first 1,550 hours is billed at $0.05 per hour, per container. #### Text editor tool The text editor tool uses the same pricing structure as other tools used with Claude. It follows the standard input and output token pricing based on the Claude model you're using. In addition to the base tokens, the following additional input tokens are needed for the text editor tool: | Tool | Additional input tokens | | ----------------------------------------- | --------------------------------------- | | `text_editor_20250429` (Claude 4.x) | 700 tokens | | `text_editor_20250124` (Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations))) | 700 tokens | See [tool use pricing](#tool-use-pricing) for complete pricing details. #### Web search tool Web search usage is charged in addition to token usage: ```json "usage": { "input_tokens": 105, "output_tokens": 6039, "cache_read_input_tokens": 7123, "cache_creation_input_tokens": 7345, "server_tool_use": { "web_search_requests": 1 } } ``` Web search is available on the Claude API for **$10 per 1,000 searches**, plus standard token costs for search-generated content. Web search results retrieved throughout a conversation are counted as input tokens, in search iterations executed during a single turn and in subsequent conversation turns. Each web search counts as one use, regardless of the number of results returned. If an error occurs during web search, the web search will not be billed. #### Web fetch tool Web fetch usage has **no additional charges** beyond standard token costs: ```json "usage": { "input_tokens": 25039, "output_tokens": 931, "cache_read_input_tokens": 0, "cache_creation_input_tokens": 0, "server_tool_use": { "web_fetch_requests": 1 } } ``` The web fetch tool is available on the Claude API at **no additional cost**. You only pay standard token costs for the fetched content that becomes part of your conversation context. To protect against inadvertently fetching large content that would consume excessive tokens, use the `max_content_tokens` parameter to set appropriate limits based on your use case and budget considerations. Example token usage for typical content: - Average web page (10KB): ~2,500 tokens - Large documentation page (100KB): ~25,000 tokens - Research paper PDF (500KB): ~125,000 tokens #### Computer use tool Computer use follows the standard [tool use pricing](/docs/en/agents-and-tools/tool-use/overview#pricing). When using the computer use tool: **System prompt overhead**: The computer use beta adds 466-499 tokens to the system prompt **Computer use tool token usage**: | Model | Input tokens per tool definition | | ----- | -------------------------------- | | Claude 4.x models | 735 tokens | | Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) | 735 tokens | **Additional token consumption**: - Screenshot images (see [Vision pricing](/docs/en/build-with-claude/vision)) - Tool execution results returned to Claude If you're also using bash or text editor tools alongside computer use, those tools have their own token costs as documented in their respective pages. ## Agent use case pricing examples Understanding pricing for agent applications is crucial when building with Claude. These real-world examples can help you estimate costs for different agent patterns. ### Customer support agent example When building a customer support agent, here's how costs might break down: Example calculation for processing 10,000 support tickets: - Average ~3,700 tokens per conversation - Using Claude Sonnet 4.5 at $3/MTok input, $15/MTok output - Total cost: ~$22.20 per 10,000 tickets For a detailed walkthrough of this calculation, see our [customer support agent guide](/docs/en/about-claude/use-case-guides/customer-support-chat). ### General agent workflow pricing For more complex agent architectures with multiple steps: 1. **Initial request processing** - Typical input: 500-1,000 tokens - Processing cost: ~$0.003 per request 2. **Memory and context retrieval** - Retrieved context: 2,000-5,000 tokens - Cost per retrieval: ~$0.015 per operation 3. **Action planning and execution** - Planning tokens: 1,000-2,000 - Execution feedback: 500-1,000 - Combined cost: ~$0.045 per action For a comprehensive guide on agent pricing patterns, see our [agent use cases guide](/docs/en/about-claude/use-case-guides). ### Cost optimization strategies When building agents with Claude: 1. **Use appropriate models**: Choose Haiku for simple tasks, Sonnet for complex reasoning 2. **Implement prompt caching**: Reduce costs for repeated context 3. **Batch operations**: Use the Batch API for non-time-sensitive tasks 4. **Monitor usage patterns**: Track token consumption to identify optimization opportunities For high-volume agent applications, consider contacting our [enterprise sales team](https://claude.com/contact-sales) for custom pricing arrangements. ## Additional pricing considerations ### Rate limits Rate limits vary by usage tier and affect how many requests you can make: - **Tier 1**: Entry-level usage with basic limits - **Tier 2**: Increased limits for growing applications - **Tier 3**: Higher limits for established applications - **Tier 4**: Maximum standard limits - **Enterprise**: Custom limits available For detailed rate limit information, see our [rate limits documentation](/docs/en/api/rate-limits). For higher rate limits or custom pricing arrangements, [contact our sales team](https://claude.com/contact-sales). ### Volume discounts Volume discounts may be available for high-volume users. These are negotiated on a case-by-case basis. - Standard tiers use the pricing shown above - Enterprise customers can [contact sales](mailto:sales@anthropic.com) for custom pricing - Academic and research discounts may be available ### Enterprise pricing For enterprise customers with specific needs: - Custom rate limits - Volume discounts - Dedicated support - Custom terms Contact our sales team at [sales@anthropic.com](mailto:sales@anthropic.com) or through the [Claude Console](/settings/limits) to discuss enterprise pricing options. ## Billing and payment - Billing is calculated monthly based on actual usage - Payments are processed in USD - Credit card and invoicing options available - Usage tracking available in the [Claude Console](/) ## Frequently asked questions **How is token usage calculated?** Tokens are pieces of text that models process. As a rough estimate, 1 token is approximately 4 characters or 0.75 words in English. The exact count varies by language and content type. **Are there free tiers or trials?** New users receive a small amount of free credits to test the API. [Contact sales](mailto:sales@anthropic.com) for information about extended trials for enterprise evaluation. **How do discounts stack?** Batch API and prompt caching discounts can be combined. For example, using both features together provides significant cost savings compared to standard API calls. **What payment methods are accepted?** We accept major credit cards for standard accounts. Enterprise customers can arrange invoicing and other payment methods. For additional questions about pricing, contact [support@anthropic.com](mailto:support@anthropic.com). --- # What's new in Claude 4.5 URL: https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-5 # What's new in Claude 4.5 --- Claude 4.5 introduces three models designed for different use cases: - **Claude Opus 4.5**: Our most intelligent model combining maximum capability with practical performance. Features a more accessible price point than previous Opus models. Available with a 200k token context window. - **Claude Sonnet 4.5**: Our best model for complex agents and coding, with the highest intelligence across most tasks. Available with a 200k and 1M (beta) token context window. - **Claude Haiku 4.5**: Our fastest and most intelligent Haiku model with near-frontier performance. Available with a 200k token context window. ## Key improvements in Opus 4.5 over Opus 4.1 ### Maximum intelligence Claude Opus 4.5 represents our most intelligent model, combining maximum capability with practical performance. It delivers step-change improvements across reasoning, coding, and complex problem-solving tasks while maintaining the high-quality outputs expected from the Opus family. ### Effort parameter Claude Opus 4.5 is the only model that supports the [effort parameter](/docs/en/build-with-claude/effort), allowing you to control how many tokens Claude uses when responding. This gives you the ability to trade off between response thoroughness and token efficiency with a single model. The effort parameter affects all tokens in the response, including text responses, tool calls, and extended thinking. You can choose between: - **High effort**: Maximum thoroughness for complex analysis and detailed explanations - **Medium effort**: Balanced approach for most production use cases - **Low effort**: Most token-efficient responses for high-volume automation ### Computer use excellence Claude Opus 4.5 introduces [enhanced computer use capabilities](/docs/en/agents-and-tools/tool-use/computer-use-tool) with a new zoom action that enables detailed inspection of specific screen regions at full resolution. This allows Claude to examine fine-grained UI elements, small text, and detailed visual information that might be unclear in standard screenshots. The zoom capability is particularly valuable for: - Inspecting small UI elements and controls - Reading fine print or detailed text - Analyzing complex interfaces with dense information - Verifying precise visual details before taking actions ### Practical performance Claude Opus 4.5 delivers flagship intelligence at a [more accessible price point](/docs/en/about-claude/pricing) than previous Opus models, making advanced AI capabilities available for a broader range of applications and use cases. ### Thinking block preservation Claude Opus 4.5 [automatically preserves all previous thinking blocks](/docs/en/build-with-claude/extended-thinking#thinking-block-preservation-in-claude-opus-4-5) throughout conversations, maintaining reasoning continuity across extended multi-turn interactions and tool use sessions. This ensures Claude can effectively leverage its full reasoning history when working on complex, long-running tasks. ## Key improvements in Sonnet 4.5 over Sonnet 4 ### Coding excellence Claude Sonnet 4.5 is our best coding model to date, with significant improvements across the entire development lifecycle: - **SWE-bench Verified performance**: Advanced state-of-the-art on coding benchmarks - **Enhanced planning and system design**: Better architectural decisions and code organization - **Improved security engineering**: More robust security practices and vulnerability detection - **Better instruction following**: More precise adherence to coding specifications and requirements Claude Sonnet 4.5 performs significantly better on coding tasks when [extended thinking](/docs/en/build-with-claude/extended-thinking) is enabled. Extended thinking is disabled by default, but we recommend enabling it for complex coding work. Be aware that extended thinking impacts [prompt caching efficiency](/docs/en/build-with-claude/prompt-caching#caching-with-thinking-blocks). See the [migration guide](/docs/en/about-claude/models/migrating-to-claude-4#extended-thinking-recommendations) for configuration details. ### Agent capabilities Claude Sonnet 4.5 introduces major advances in agent capabilities: - **Extended autonomous operation**: Sonnet 4.5 can work independently for hours while maintaining clarity and focus on incremental progress. The model makes steady advances on a few tasks at a time rather than attempting everything at once. It provides fact-based progress updates that accurately reflect what has been accomplished. - **Context awareness**: Claude now tracks its token usage throughout conversations, receiving updates after each tool call. This awareness helps prevent premature task abandonment and enables more effective execution on long-running tasks. See [Context awareness](/docs/en/build-with-claude/context-windows#context-awareness-in-claude-sonnet-4-5) for technical details and [prompting guidance](/docs/en/build-with-claude/prompt-engineering/claude-4-best-practices#context-awareness-and-multi-window-workflows). - **Enhanced tool usage**: The model more effectively uses parallel tool calls, firing off multiple speculative searches simultaneously during research and reading several files at once to build context faster. Improved coordination across multiple tools and information sources enables the model to effectively leverage a wide range of capabilities in agentic search and coding workflows. - **Advanced context management**: Sonnet 4.5 maintains exceptional state tracking in external files, preserving goal-orientation across sessions. Combined with more effective context window usage and our new context management API features, the model optimally handles information across extended sessions to maintain coherence over time. Context awareness is available in Claude Sonnet 4, Sonnet 4.5, Haiku 4.5, Opus 4, Opus 4.1, and Opus 4.5. ### Communication and interaction style Claude Sonnet 4.5 has a refined communication approach that is concise, direct, and natural. It provides fact-based progress updates and may skip verbose summaries after tool calls to maintain workflow momentum (though this can be adjusted with prompting). For detailed guidance on working with this communication style, see [Claude 4 best practices](/docs/en/build-with-claude/prompt-engineering/claude-4-best-practices). ### Creative content generation Claude Sonnet 4.5 excels at creative content tasks: - **Presentations and animations**: Matches or exceeds Claude Opus 4.1 and Opus 4.5 for creating slides and visual content - **Creative flair**: Produces polished, professional output with strong instruction following - **First-try quality**: Generates usable, well-designed content in initial attempts ## Key improvements in Haiku 4.5 over Haiku 3.5 Claude Haiku 4.5 represents a transformative leap for the Haiku model family, bringing frontier capabilities to our fastest model class: ### Near-frontier intelligence with blazing speed Claude Haiku 4.5 delivers near-frontier performance matching Sonnet 4 at significantly lower cost and faster speed: - **Near-frontier intelligence**: Matches Sonnet 4 performance across reasoning, coding, and complex tasks - **Enhanced speed**: More than twice the speed of Sonnet 4, with optimizations for output tokens per second (OTPS) - **Optimal cost-performance**: Near-frontier intelligence at one-third the cost, ideal for high-volume deployments ### Extended thinking capabilities Claude Haiku 4.5 is the **first Haiku model** to support extended thinking, bringing advanced reasoning capabilities to the Haiku family: - **Reasoning at speed**: Access to Claude's internal reasoning process for complex problem-solving - **Thinking Summarization**: Summarized thinking output for production-ready deployments - **Interleaved thinking**: Think between tool calls for more sophisticated multi-step workflows - **Budget control**: Configure thinking token budgets to balance reasoning depth with speed Extended thinking must be enabled explicitly by adding a `thinking` parameter to your API requests. See the [Extended thinking documentation](/docs/en/build-with-claude/extended-thinking) for implementation details. Claude Haiku 4.5 performs significantly better on coding and reasoning tasks when [extended thinking](/docs/en/build-with-claude/extended-thinking) is enabled. Extended thinking is disabled by default, but we recommend enabling it for complex problem-solving, coding work, and multi-step reasoning. Be aware that extended thinking impacts [prompt caching efficiency](/docs/en/build-with-claude/prompt-caching#caching-with-thinking-blocks). See the [migration guide](/docs/en/about-claude/models/migrating-to-claude-4#extended-thinking-recommendations) for configuration details. Available in Claude Sonnet 3.7, Sonnet 4, Sonnet 4.5, Haiku 4.5, Opus 4, Opus 4.1, and Opus 4.5. ### Context awareness Claude Haiku 4.5 features **context awareness**, enabling the model to track its remaining context window throughout a conversation: - **Token budget tracking**: Claude receives real-time updates on remaining context capacity after each tool call - **Better task persistence**: The model can execute tasks more effectively by understanding available working space - **Multi-context-window workflows**: Improved handling of state transitions across extended sessions This is the first Haiku model with native context awareness capabilities. For prompting guidance, see [Claude 4 best practices](/docs/en/build-with-claude/prompt-engineering/claude-4-best-practices#context-awareness-and-multi-window-workflows). Available in Claude Sonnet 4, Sonnet 4.5, Haiku 4.5, Opus 4, Opus 4.1, and Opus 4.5. ### Strong coding and tool use Claude Haiku 4.5 delivers robust coding capabilities expected from modern Claude models: - **Coding proficiency**: Strong performance across code generation, debugging, and refactoring tasks - **Full tool support**: Compatible with all Claude 4 tools including bash, code execution, text editor, web search, and computer use - **Enhanced computer use**: Optimized for autonomous desktop interaction and browser automation workflows - **Parallel tool execution**: Efficient coordination across multiple tools for complex workflows Haiku 4.5 is designed for use cases that demand both intelligence and efficiency: - **Real-time applications**: Fast response times for interactive user experiences - **High-volume processing**: Cost-effective intelligence for large-scale deployments - **Free tier implementations**: Premium model quality at accessible pricing - **Sub-agent architectures**: Fast, intelligent agents for multi-agent systems - **Computer use at scale**: Cost-effective autonomous desktop and browser automation ## New API features ### Programmatic tool calling (Beta) [Programmatic tool calling](/docs/en/agents-and-tools/tool-use/programmatic-tool-calling) allows Claude to write code that calls your tools programmatically within a code execution container, rather than requiring round trips through the model for each tool invocation. This significantly reduces latency for multi-tool workflows and decreases token consumption by allowing Claude to filter or process data before it reaches the model's context window. ```python tools=[ { "type": "code_execution_20250825", "name": "code_execution" }, { "name": "query_database", "description": "Execute a SQL query against the sales database. Returns a list of rows as JSON objects.", "input_schema": {...}, "allowed_callers": ["code_execution_20250825"] # Enable programmatic calling } ] ``` Key benefits: - **Reduced latency**: Eliminate model round-trips between tool calls - **Token efficiency**: Process and filter tool results programmatically before returning to Claude - **Complex workflows**: Support loops, conditional logic, and batch processing Available in Claude Opus 4.5 and Claude Sonnet 4.5. Requires [beta header](/docs/en/api/beta-headers): `advanced-tool-use-2025-11-20` ### Tool search tool (Beta) The [tool search tool](/docs/en/agents-and-tools/tool-use/tool-search-tool) enables Claude to work with hundreds or thousands of tools by dynamically discovering and loading them on-demand. Instead of loading all tool definitions into the context window upfront, Claude searches your tool catalog and loads only the tools it needs. Two search variants are available: - **Regex** (`tool_search_tool_regex_20251119`): Claude constructs regex patterns to search tool names, descriptions, and arguments - **BM25** (`tool_search_tool_bm25_20251119`): Claude uses natural language queries to search for tools ```python tools=[ { "type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex" }, { "name": "get_weather", "description": "Get the weather at a specific location", "input_schema": {...}, "defer_loading": True # Load on-demand via search } ] ``` This approach solves two critical challenges: - **Context efficiency**: Save 10-20K tokens by not loading all tool definitions upfront - **Tool selection accuracy**: Maintain high accuracy even with 100+ available tools Available in Claude Opus 4.5 and Claude Sonnet 4.5. Requires [beta header](/docs/en/api/beta-headers): `advanced-tool-use-2025-11-20` ### Effort parameter (Beta) The [effort parameter](/docs/en/build-with-claude/effort) allows you to control how many tokens Claude uses when responding, trading off between response thoroughness and token efficiency: ```python response = client.beta.messages.create( model="claude-opus-4-5-20251101", betas=["effort-2025-11-24"], max_tokens=4096, messages=[{"role": "user", "content": "..."}], output_config={ "effort": "medium" # "low", "medium", or "high" } ) ``` The effort parameter affects all tokens in the response, including text responses, tool calls, and extended thinking. Lower effort levels produce more concise responses with minimal explanations, while higher effort provides detailed reasoning and comprehensive answers. Available exclusively in Claude Opus 4.5. Requires [beta header](/docs/en/api/beta-headers): `effort-2025-11-24` ### Tool use examples (Beta) [Tool use examples](/docs/en/agents-and-tools/tool-use/implement-tool-use#providing-tool-use-examples) allow you to provide concrete examples of valid tool inputs to help Claude understand how to use your tools more effectively. This is particularly useful for complex tools with nested objects, optional parameters, or format-sensitive inputs. ```python tools=[ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": {...}, "input_examples": [ { "location": "San Francisco, CA", "unit": "fahrenheit" }, { "location": "Tokyo, Japan", "unit": "celsius" }, { "location": "New York, NY" # Demonstrates optional 'unit' parameter } ] } ] ``` Examples are included in the prompt alongside your tool schema, showing Claude concrete patterns for well-formed tool calls. Each example must be valid according to the tool's `input_schema`. Available in Claude Sonnet 4.5, Haiku 4.5, Opus 4.5, Opus 4.1, and Opus 4. Requires [beta header](/docs/en/api/beta-headers): `advanced-tool-use-2025-11-20`. ### Memory tool (Beta) The new [memory tool](/docs/en/agents-and-tools/tool-use/memory-tool) enables Claude to store and retrieve information outside the context window: ```python tools=[ { "type": "memory_20250818", "name": "memory" } ] ``` This allows for: - Building knowledge bases over time - Maintaining project state across sessions - Preserving effectively unlimited context through file-based storage Available in Claude Sonnet 4, Sonnet 4.5, Haiku 4.5, Opus 4, Opus 4.1, and Opus 4.5. Requires [beta header](/docs/en/api/beta-headers): `context-management-2025-06-27` ### Context editing Use [context editing](/docs/en/build-with-claude/context-editing) for intelligent context management through automatic tool call clearing: ```python response = client.beta.messages.create( betas=["context-management-2025-06-27"], model="claude-sonnet-4-5", # or claude-haiku-4-5 max_tokens=4096, messages=[{"role": "user", "content": "..."}], context_management={ "edits": [ { "type": "clear_tool_uses_20250919", "trigger": {"type": "input_tokens", "value": 500}, "keep": {"type": "tool_uses", "value": 2}, "clear_at_least": {"type": "input_tokens", "value": 100} } ] }, tools=[...] ) ``` This feature automatically removes older tool calls and results when approaching token limits, helping manage context in long-running agent sessions. Available in Claude Sonnet 4, Sonnet 4.5, Haiku 4.5, Opus 4, Opus 4.1, and Opus 4.5. Requires [beta header](/docs/en/api/beta-headers): `context-management-2025-06-27` ### Enhanced stop reasons Claude 4.5 models introduce a new `model_context_window_exceeded` stop reason that explicitly indicates when generation stopped due to hitting the context window limit, rather than the requested `max_tokens` limit. This makes it easier to handle context window limits in your application logic. ```json { "stop_reason": "model_context_window_exceeded", "usage": { "input_tokens": 150000, "output_tokens": 49950 } } ``` ### Improved tool parameter handling Claude 4.5 models include a bug fix that preserves intentional formatting in tool call string parameters. Previously, trailing newlines in string parameters were sometimes incorrectly stripped. This fix ensures that tools requiring precise formatting (like text editors) receive parameters exactly as intended. This is a behind-the-scenes improvement with no API changes required. However, tools with string parameters may now receive values with trailing newlines that were previously stripped. **Example:** ```json // Before: Final newline accidentally stripped { "type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9", "name": "edit_todo", "input": { "file": "todo.txt", "contents": "1. Chop onions.\n2. ???\n3. Profit" } } // After: Trailing newline preserved as intended { "type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9", "name": "edit_todo", "input": { "file": "todo.txt", "contents": "1. Chop onions.\n2. ???\n3. Profit\n" } } ``` ### Token count optimizations Claude 4.5 models include automatic optimizations to improve model performance. These optimizations may add small amounts of tokens to requests, but **you are not billed for these system-added tokens**. ## Features introduced in Claude 4 The following features were introduced in Claude 4 and are available across Claude 4 models, including Claude Sonnet 4.5 and Claude Haiku 4.5. ### New refusal stop reason Claude 4 models introduce a new `refusal` stop reason for content that the model declines to generate for safety reasons: ```json { "id": "msg_014XEDjypDjFzgKVWdFUXxZP", "type": "message", "role": "assistant", "model": "claude-sonnet-4-5", "content": [{"type": "text", "text": "I would be happy to assist you. You can "}], "stop_reason": "refusal", "stop_sequence": null, "usage": { "input_tokens": 564, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 0, "output_tokens": 22 } } ``` When using Claude 4 models, you should update your application to [handle `refusal` stop reasons](/docs/en/test-and-evaluate/strengthen-guardrails/handle-streaming-refusals). ### Summarized thinking With extended thinking enabled, the Messages API for Claude 4 models returns a summary of Claude's full thinking process. Summarized thinking provides the full intelligence benefits of extended thinking, while preventing misuse. While the API is consistent across Claude 3.7 and 4 models, streaming responses for extended thinking might return in a "chunky" delivery pattern, with possible delays between streaming events. Summarization is processed by a different model than the one you target in your requests. The thinking model does not see the summarized output. For more information, see the [Extended thinking documentation](/docs/en/build-with-claude/extended-thinking#summarized-thinking). ### Interleaved thinking Claude 4 models support interleaving tool use with extended thinking, allowing for more natural conversations where tool uses and responses can be mixed with regular messages. Interleaved thinking is in beta. To enable interleaved thinking, add [the beta header](/docs/en/api/beta-headers) `interleaved-thinking-2025-05-14` to your API request. For more information, see the [Extended thinking documentation](/docs/en/build-with-claude/extended-thinking#interleaved-thinking). ### Behavioral differences Claude 4 models have notable behavioral changes that may affect how you structure prompts: #### Communication style changes - **More concise and direct**: Claude 4 models communicate more efficiently, with less verbose explanations - **More natural tone**: Responses are slightly more conversational and less machine-like - **Efficiency-focused**: May skip detailed summaries after completing actions to maintain workflow momentum (you can prompt for more detail if needed) #### Instruction following Claude 4 models are trained for precise instruction following and require more explicit direction: - **Be explicit about actions**: Use direct language like "Make these changes" or "Implement this feature" rather than "Can you suggest changes" if you want Claude to take action - **State desired behaviors clearly**: Claude will follow instructions precisely, so being specific about what you want helps achieve better results For comprehensive guidance on working with these models, see [Claude 4 prompt engineering best practices](/docs/en/build-with-claude/prompt-engineering/claude-4-best-practices). ### Updated text editor tool The text editor tool has been updated for Claude 4 models with the following changes: - **Tool type**: `text_editor_20250728` - **Tool name**: `str_replace_based_edit_tool` - The `undo_edit` command is no longer supported The `str_replace_editor` text editor tool remains the same for Claude Sonnet 3.7. If you're migrating from Claude Sonnet 3.7 and using the text editor tool: ```python # Claude Sonnet 3.7 tools=[ { "type": "text_editor_20250124", "name": "str_replace_editor" } ] # Claude 4 models tools=[ { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool" } ] ``` For more information, see the [Text editor tool documentation](/docs/en/agents-and-tools/tool-use/text-editor-tool). ### Updated code execution tool If you're using the code execution tool, ensure you're using the latest version `code_execution_20250825`, which adds Bash commands and file manipulation capabilities. The legacy version `code_execution_20250522` (Python only) is still available but not recommended for new implementations. For migration instructions, see the [Code execution tool documentation](/docs/en/agents-and-tools/tool-use/code-execution-tool#upgrade-to-latest-tool-version). ## Pricing and availability ### Pricing Claude 4.5 models maintain competitive pricing: | Model | Input | Output | |-------|-------|--------| | Claude Opus 4.5 | $5 per million tokens | $25 per million tokens | | Claude Sonnet 4.5 | $3 per million tokens | $15 per million tokens | | Claude Haiku 4.5 | $1 per million tokens | $5 per million tokens | For more details, see the [pricing documentation](/docs/en/about-claude/pricing). ### Third-party platform pricing Starting with Claude 4.5 models (Opus 4.5, Sonnet 4.5, and Haiku 4.5), AWS Bedrock and Google Vertex AI offer two endpoint types: - **Global endpoints**: Dynamic routing for maximum availability - **Regional endpoints**: Guaranteed data routing through specific geographic regions with a **10% pricing premium** **This regional pricing applies to all Claude 4.5 models: Opus 4.5, Sonnet 4.5, and Haiku 4.5.** **The Claude API (1P) is global by default and unaffected by this change.** The Claude API is global-only (equivalent to the global endpoint offering and pricing from other providers). For implementation details and migration guidance: - [AWS Bedrock global vs regional endpoints](/docs/en/build-with-claude/claude-on-amazon-bedrock#global-vs-regional-endpoints) - [Google Vertex AI global vs regional endpoints](/docs/en/build-with-claude/claude-on-vertex-ai#global-vs-regional-endpoints) ### Availability Claude 4.5 models are available on: | Model | Claude API | Amazon Bedrock | Google Cloud Vertex AI | |-------|-----------|----------------|------------------------| | Claude Opus 4.5 | `claude-opus-4-5-20251101` | `anthropic.claude-opus-4-5-20251101-v1:0` | `claude-opus-4-5@20251101` | | Claude Sonnet 4.5 | `claude-sonnet-4-5-20250929` | `anthropic.claude-sonnet-4-5-20250929-v1:0` | `claude-sonnet-4-5@20250929` | | Claude Haiku 4.5 | `claude-haiku-4-5-20251001` | `anthropic.claude-haiku-4-5-20251001-v1:0` | `claude-haiku-4-5@20251001` | Also available through Claude.ai and Claude Code platforms. ## Migration guide Breaking changes and migration requirements vary depending on which model you're upgrading from. For detailed migration instructions, including step-by-step guides, breaking changes, and migration checklists, see [Migrating to Claude 4.5](/docs/en/about-claude/models/migrating-to-claude-4). The migration guide covers the following scenarios: - **Claude Sonnet 3.7 → Sonnet 4.5**: Complete migration path with breaking changes - **Claude Haiku 3.5 → Haiku 4.5**: Complete migration path with breaking changes - **Claude Sonnet 4 → Sonnet 4.5**: Quick upgrade with minimal changes - **Claude Opus 4.1 → Sonnet 4.5**: Seamless upgrade with no breaking changes - **Claude Opus 4.1 → Opus 4.5**: Seamless upgrade with no breaking changes - **Claude Opus 4.5 → Sonnet 4.5**: Seamless downgrade with no breaking changes ## Next steps Learn prompt engineering techniques for Claude 4.5 models Compare Claude 4.5 models with other Claude models Upgrade from previous models ### Build with Claude --- # Features overview URL: https://platform.claude.com/docs/en/build-with-claude/overview # Features overview Explore Claude's advanced features and capabilities. --- ## Core capabilities These features enhance Claude's fundamental abilities for processing, analyzing, and generating content across various formats and use cases. | Feature | Description | Availability | |---------|-------------|--------------| | [1M token context window](/docs/en/build-with-claude/context-windows#1m-token-context-window) | An extended context window that allows you to process much larger documents, maintain longer conversations, and work with more extensive codebases. | | | [Agent Skills](/docs/en/agents-and-tools/agent-skills/overview) | Extend Claude's capabilities with Skills. Use pre-built Skills (PowerPoint, Excel, Word, PDF) or create custom Skills with instructions and scripts. Skills use progressive disclosure to efficiently manage context. | | | [Batch processing](/docs/en/build-with-claude/batch-processing) | Process large volumes of requests asynchronously for cost savings. Send batches with a large number of queries per batch. Batch API calls costs 50% less than standard API calls. | | | [Citations](/docs/en/build-with-claude/citations) | Ground Claude's responses in source documents. With Citations, Claude can provide detailed references to the exact sentences and passages it uses to generate responses, leading to more verifiable, trustworthy outputs. | | | [Context editing](/docs/en/build-with-claude/context-editing) | Automatically manage conversation context with configurable strategies. Supports clearing tool results when approaching token limits and managing thinking blocks in extended thinking conversations. | | | [Effort](/docs/en/build-with-claude/effort) | Control how many tokens Claude uses when responding with the effort parameter, trading off between response thoroughness and token efficiency. | | | [Extended thinking](/docs/en/build-with-claude/extended-thinking) | Enhanced reasoning capabilities for complex tasks, providing transparency into Claude's step-by-step thought process before delivering its final answer. | | | [Files API](/docs/en/build-with-claude/files) | Upload and manage files to use with Claude without re-uploading content with each request. Supports PDFs, images, and text files. | | | [PDF support](/docs/en/build-with-claude/pdf-support) | Process and analyze text and visual content from PDF documents. | | | [Prompt caching (5m)](/docs/en/build-with-claude/prompt-caching) | Provide Claude with more background knowledge and example outputs to reduce costs and latency. | | | [Prompt caching (1hr)](/docs/en/build-with-claude/prompt-caching#1-hour-cache-duration) | Extended 1-hour cache duration for less frequently accessed but important context, complementing the standard 5-minute cache. | | | [Search results](/docs/en/build-with-claude/search-results) | Enable natural citations for RAG applications by providing search results with proper source attribution. Achieve web search-quality citations for custom knowledge bases and tools. | | | [Structured outputs](/docs/en/build-with-claude/structured-outputs) | Guarantee schema conformance with two approaches: JSON outputs for structured data responses, and strict tool use for validated tool inputs. Available on Sonnet 4.5, Opus 4.1, Opus 4.5, and Haiku 4.5. | | | [Token counting](/docs/en/api/messages-count-tokens) | Token counting enables you to determine the number of tokens in a message before sending it to Claude, helping you make informed decisions about your prompts and usage. | | | [Tool use](/docs/en/agents-and-tools/tool-use/overview) | Enable Claude to interact with external tools and APIs to perform a wider variety of tasks. For a list of supported tools, see [the Tools table](#tools). | | ## Tools These features enable Claude to interact with external systems, execute code, and perform automated tasks through various tool interfaces. | Feature | Description | Availability | |---------|-------------|--------------| | [Bash](/docs/en/agents-and-tools/tool-use/bash-tool) | Execute bash commands and scripts to interact with the system shell and perform command-line operations. | | | [Code execution](/docs/en/agents-and-tools/tool-use/code-execution-tool) | Run Python code in a sandboxed environment for advanced data analysis. | | | [Programmatic tool calling](/docs/en/agents-and-tools/tool-use/programmatic-tool-calling) | Enable Claude to call your tools programmatically from within code execution containers, reducing latency and token consumption for multi-tool workflows. | | | [Computer use](/docs/en/agents-and-tools/tool-use/computer-use-tool) | Control computer interfaces by taking screenshots and issuing mouse and keyboard commands. | | | [Fine-grained tool streaming](/docs/en/agents-and-tools/tool-use/fine-grained-tool-streaming) | Stream tool use parameters without buffering/JSON validation, reducing latency for receiving large parameters. | | | [MCP connector](/docs/en/agents-and-tools/mcp-connector) | Connect to remote [MCP](/docs/en/mcp) servers directly from the Messages API without a separate MCP client. | | | [Memory](/docs/en/agents-and-tools/tool-use/memory-tool) | Enable Claude to store and retrieve information across conversations. Build knowledge bases over time, maintain project context, and learn from past interactions. | | | [Text editor](/docs/en/agents-and-tools/tool-use/text-editor-tool) | Create and edit text files with a built-in text editor interface for file manipulation tasks. | | | [Tool search](/docs/en/agents-and-tools/tool-use/tool-search-tool) | Scale to thousands of tools by dynamically discovering and loading tools on-demand using regex-based search, optimizing context usage and improving tool selection accuracy. | | | [Web fetch](/docs/en/agents-and-tools/tool-use/web-fetch-tool) | Retrieve full content from specified web pages and PDF documents for in-depth analysis. | | | [Web search](/docs/en/agents-and-tools/tool-use/web-search-tool) | Augment Claude's comprehensive knowledge with current, real-world data from across the web. | | --- # Context windows URL: https://platform.claude.com/docs/en/build-with-claude/context-windows # Context windows --- ## Understanding the context window The "context window" refers to the entirety of the amount of text a language model can look back on and reference when generating new text plus the new text it generates. This is different from the large corpus of data the language model was trained on, and instead represents a "working memory" for the model. A larger context window allows the model to understand and respond to more complex and lengthy prompts, while a smaller context window may limit the model's ability to handle longer prompts or maintain coherence over extended conversations. The diagram below illustrates the standard context window behavior for API requests1: ![Context window diagram](/docs/images/context-window.svg) _1For chat interfaces, such as for [claude.ai](https://claude.ai/), context windows can also be set up on a rolling "first in, first out" system._ * **Progressive token accumulation:** As the conversation advances through turns, each user message and assistant response accumulates within the context window. Previous turns are preserved completely. * **Linear growth pattern:** The context usage grows linearly with each turn, with previous turns preserved completely. * **200K token capacity:** The total available context window (200,000 tokens) represents the maximum capacity for storing conversation history and generating new output from Claude. * **Input-output flow:** Each turn consists of: - **Input phase:** Contains all previous conversation history plus the current user message - **Output phase:** Generates a text response that becomes part of a future input ## The context window with extended thinking When using [extended thinking](/docs/en/build-with-claude/extended-thinking), all input and output tokens, including the tokens used for thinking, count toward the context window limit, with a few nuances in multi-turn situations. The thinking budget tokens are a subset of your `max_tokens` parameter, are billed as output tokens, and count towards rate limits. However, previous thinking blocks are automatically stripped from the context window calculation by the Claude API and are not part of the conversation history that the model "sees" for subsequent turns, preserving token capacity for actual conversation content. The diagram below demonstrates the specialized token management when extended thinking is enabled: ![Context window diagram with extended thinking](/docs/images/context-window-thinking.svg) * **Stripping extended thinking:** Extended thinking blocks (shown in dark gray) are generated during each turn's output phase, **but are not carried forward as input tokens for subsequent turns**. You do not need to strip the thinking blocks yourself. The Claude API automatically does this for you if you pass them back. * **Technical implementation details:** - The API automatically excludes thinking blocks from previous turns when you pass them back as part of the conversation history. - Extended thinking tokens are billed as output tokens only once, during their generation. - The effective context window calculation becomes: `context_window = (input_tokens - previous_thinking_tokens) + current_turn_tokens`. - Thinking tokens include both `thinking` blocks and `redacted_thinking` blocks. This architecture is token efficient and allows for extensive reasoning without token waste, as thinking blocks can be substantial in length. You can read more about the context window and extended thinking in our [extended thinking guide](/docs/en/build-with-claude/extended-thinking). ## The context window with extended thinking and tool use The diagram below illustrates the context window token management when combining extended thinking with tool use: ![Context window diagram with extended thinking and tool use](/docs/images/context-window-thinking-tools.svg) - **Input components:** Tools configuration and user message - **Output components:** Extended thinking + text response + tool use request - **Token calculation:** All input and output components count toward the context window, and all output components are billed as output tokens. - **Input components:** Every block in the first turn as well as the `tool_result`. The extended thinking block **must** be returned with the corresponding tool results. This is the only case wherein you **have to** return thinking blocks. - **Output components:** After tool results have been passed back to Claude, Claude will respond with only text (no additional extended thinking until the next `user` message). - **Token calculation:** All input and output components count toward the context window, and all output components are billed as output tokens. - **Input components:** All inputs and the output from the previous turn is carried forward with the exception of the thinking block, which can be dropped now that Claude has completed the entire tool use cycle. The API will automatically strip the thinking block for you if you pass it back, or you can feel free to strip it yourself at this stage. This is also where you would add the next `User` turn. - **Output components:** Since there is a new `User` turn outside of the tool use cycle, Claude will generate a new extended thinking block and continue from there. - **Token calculation:** Previous thinking tokens are automatically stripped from context window calculations. All other previous blocks still count as part of the token window, and the thinking block in the current `Assistant` turn counts as part of the context window. * **Considerations for tool use with extended thinking:** - When posting tool results, the entire unmodified thinking block that accompanies that specific tool request (including signature/redacted portions) must be included. - The effective context window calculation for extended thinking with tool use becomes: `context_window = input_tokens + current_turn_tokens`. - The system uses cryptographic signatures to verify thinking block authenticity. Failing to preserve thinking blocks during tool use can break Claude's reasoning continuity. Thus, if you modify thinking blocks, the API will return an error. Claude 4 models support [interleaved thinking](/docs/en/build-with-claude/extended-thinking#interleaved-thinking), which enables Claude to think between tool calls and make more sophisticated reasoning after receiving tool results. Claude Sonnet 3.7 does not support interleaved thinking, so there is no interleaving of extended thinking and tool calls without a non-`tool_result` user turn in between. For more information about using tools with extended thinking, see our [extended thinking guide](/docs/en/build-with-claude/extended-thinking#extended-thinking-with-tool-use). ## 1M token context window Claude Sonnet 4 and 4.5 support a 1-million token context window. This extended context window allows you to process much larger documents, maintain longer conversations, and work with more extensive codebases. The 1M token context window is currently in beta for organizations in [usage tier](/docs/en/api/rate-limits) 4 and organizations with custom rate limits. The 1M token context window is only available for Claude Sonnet 4 and Sonnet 4.5. To use the 1M token context window, include the `context-1m-2025-08-07` [beta header](/docs/en/api/beta-headers) in your API requests: ```python Python from anthropic import Anthropic client = Anthropic() response = client.beta.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ {"role": "user", "content": "Process this large document..."} ], betas=["context-1m-2025-08-07"] ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const msg = await anthropic.beta.messages.create({ model: 'claude-sonnet-4-5', max_tokens: 1024, messages: [ { role: 'user', content: 'Process this large document...' } ], betas: ['context-1m-2025-08-07'] }); ``` ```bash cURL curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: context-1m-2025-08-07" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ {"role": "user", "content": "Process this large document..."} ] }' ``` **Important considerations:** - **Beta status**: This is a beta feature subject to change. Features and pricing may be modified or removed in future releases. - **Usage tier requirement**: The 1M token context window is available to organizations in [usage tier](/docs/en/api/rate-limits) 4 and organizations with custom rate limits. Lower tier organizations must advance to usage tier 4 to access this feature. - **Availability**: The 1M token context window is currently available on the Claude API, [Microsoft Foundry](/docs/en/build-with-claude/claude-in-microsoft-foundry), [Amazon Bedrock](/docs/en/build-with-claude/claude-on-amazon-bedrock), and [Google Cloud's Vertex AI](/docs/en/build-with-claude/claude-on-vertex-ai). - **Pricing**: Requests exceeding 200K tokens are automatically charged at premium rates (2x input, 1.5x output pricing). See the [pricing documentation](/docs/en/about-claude/pricing#long-context-pricing) for details. - **Rate limits**: Long context requests have dedicated rate limits. See the [rate limits documentation](/docs/en/api/rate-limits#long-context-rate-limits) for details. - **Multimodal considerations**: When processing large numbers of images or pdfs, be aware that the files can vary in token usage. When pairing a large prompt with a large number of images, you may hit [request size limits](/docs/en/api/overview#request-size-limits). ## Context awareness in Claude Sonnet 4.5 and Haiku 4.5 Claude Sonnet 4.5 and Claude Haiku 4.5 feature **context awareness**, enabling these models to track their remaining context window (i.e. "token budget") throughout a conversation. This enables Claude to execute tasks and manage context more effectively by understanding how much space it has to work. Claude is natively trained to use this context precisely to persist in the task until the very end, rather than having to guess how many tokens are remaining. For a model, lacking context awareness is like competing in a cooking show without a clock. Claude 4.5 models change this by explicitly informing the model about its remaining context, so it can take maximum advantage of the available tokens. **How it works:** At the start of a conversation, Claude receives information about its total context window: ``` 200000 ``` The budget is set to 200K tokens (standard), 500K tokens (Claude.ai Enterprise), or 1M tokens (beta, for eligible organizations). After each tool call, Claude receives an update on remaining capacity: ``` Token usage: 35000/200000; 165000 remaining ``` This awareness helps Claude determine how much capacity remains for work and enables more effective execution on long-running tasks. Image tokens are included in these budgets. **Benefits:** Context awareness is particularly valuable for: - Long-running agent sessions that require sustained focus - Multi-context-window workflows where state transitions matter - Complex tasks requiring careful token management For prompting guidance on leveraging context awareness, see our [Claude 4 best practices guide](/docs/en/build-with-claude/prompt-engineering/claude-4-best-practices#context-awareness-and-multi-window-workflows). ## Context window management with newer Claude models In newer Claude models (starting with Claude Sonnet 3.7), if the sum of prompt tokens and output tokens exceeds the model's context window, the system will return a validation error rather than silently truncating the context. This change provides more predictable behavior but requires more careful token management. To plan your token usage and ensure you stay within context window limits, you can use the [token counting API](/docs/en/build-with-claude/token-counting) to estimate how many tokens your messages will use before sending them to Claude. See our [model comparison](/docs/en/about-claude/models/overview#latest-models-comparison) table for a list of context window sizes by model. # Next steps See our model comparison table for a list of context window sizes and input / output token pricing by model. Learn more about how extended thinking works and how to implement it alongside other features such as tool use and prompt caching. --- # Prompting best practices URL: https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-4-best-practices # Prompting best practices --- This guide provides specific prompt engineering techniques for Claude 4.x models, with specific guidance for Sonnet 4.5, Haiku 4.5, and Opus 4.5. These models have been trained for more precise instruction following than previous generations of Claude models. For an overview of Claude 4.5's new capabilities, see [What's new in Claude 4.5](/docs/en/about-claude/models/whats-new-claude-4-5). For migration guidance from previous models, see [Migrating to Claude 4.5](/docs/en/about-claude/models/migrating-to-claude-4). ## General principles ### Be explicit with your instructions Claude 4.x models respond well to clear, explicit instructions. Being specific about your desired output can help enhance results. Customers who desire the "above and beyond" behavior from previous Claude models might need to more explicitly request these behaviors with newer models.
**Less effective:** ```text Create an analytics dashboard ``` **More effective:** ```text Create an analytics dashboard. Include as many relevant features and interactions as possible. Go beyond the basics to create a fully-featured implementation. ```
### Add context to improve performance Providing context or motivation behind your instructions, such as explaining to Claude why such behavior is important, can help Claude 4.x models better understand your goals and deliver more targeted responses.
**Less effective:** ```text NEVER use ellipses ``` **More effective:** ```text Your response will be read aloud by a text-to-speech engine, so never use ellipses since the text-to-speech engine will not know how to pronounce them. ```
Claude is smart enough to generalize from the explanation. ### Be vigilant with examples & details Claude 4.x models pay close attention to details and examples as part of their precise instruction following capabilities. Ensure that your examples align with the behaviors you want to encourage and minimize behaviors you want to avoid. ### Long-horizon reasoning and state tracking Claude 4.5 models excel at long-horizon reasoning tasks with exceptional state tracking capabilities. It maintains orientation across extended sessions by focusing on incremental progress—making steady advances on a few things at a time rather than attempting everything at once. This capability especially emerges over multiple context windows or task iterations, where Claude can work on a complex task, save the state, and continue with a fresh context window. #### Context awareness and multi-window workflows Claude 4.5 models feature [context awareness](/docs/en/build-with-claude/context-windows#context-awareness-in-claude-sonnet-4-5), enabling the model to track its remaining context window (i.e. "token budget") throughout a conversation. This enables Claude to execute tasks and manage context more effectively by understanding how much space it has to work. **Managing context limits:** If you are using Claude in an agent harness that compacts context or allows saving context to external files (like in Claude Code), we suggest adding this information to your prompt so Claude can behave accordingly. Otherwise, Claude may sometimes naturally try to wrap up work as it approaches the context limit. Below is an example prompt: ```text Sample prompt Your context window will be automatically compacted as it approaches its limit, allowing you to continue working indefinitely from where you left off. Therefore, do not stop tasks early due to token budget concerns. As you approach your token budget limit, save your current progress and state to memory before the context window refreshes. Always be as persistent and autonomous as possible and complete tasks fully, even if the end of your budget is approaching. Never artificially stop any task early regardless of the context remaining. ``` The [memory tool](/docs/en/agents-and-tools/tool-use/memory-tool) pairs naturally with context awareness for seamless context transitions. #### Multi-context window workflows For tasks spanning multiple context windows: 1. **Use a different prompt for the very first context window**: Use the first context window to set up a framework (write tests, create setup scripts), then use future context windows to iterate on a todo-list. 2. **Have the model write tests in a structured format**: Ask Claude to create tests before starting work and keep track of them in a structured format (e.g., `tests.json`). This leads to better long-term ability to iterate. Remind Claude of the importance of tests: "It is unacceptable to remove or edit tests because this could lead to missing or buggy functionality." 3. **Set up quality of life tools**: Encourage Claude to create setup scripts (e.g., `init.sh`) to gracefully start servers, run test suites, and linters. This prevents repeated work when continuing from a fresh context window. 4. **Starting fresh vs compacting**: When a context window is cleared, consider starting with a brand new context window rather than using compaction. Claude 4.5 models are extremely effective at discovering state from the local filesystem. In some cases, you may want to take advantage of this over compaction. Be prescriptive about how it should start: - "Call pwd; you can only read and write files in this directory." - "Review progress.txt, tests.json, and the git logs." - "Manually run through a fundamental integration test before moving on to implementing new features." 5. **Provide verification tools**: As the length of autonomous tasks grows, Claude needs to verify correctness without continuous human feedback. Tools like Playwright MCP server or computer use capabilities for testing UIs are helpful. 6. **Encourage complete usage of context**: Prompt Claude to efficiently complete components before moving on: ```text Sample prompt This is a very long task, so it may be beneficial to plan out your work clearly. It's encouraged to spend your entire output context working on the task - just make sure you don't run out of context with significant uncommitted work. Continue working systematically until you have completed this task. ``` #### State management best practices - **Use structured formats for state data**: When tracking structured information (like test results or task status), use JSON or other structured formats to help Claude understand schema requirements - **Use unstructured text for progress notes**: Freeform progress notes work well for tracking general progress and context - **Use git for state tracking**: Git provides a log of what's been done and checkpoints that can be restored. Claude 4.5 models perform especially well in using git to track state across multiple sessions. - **Emphasize incremental progress**: Explicitly ask Claude to keep track of its progress and focus on incremental work
```json // Structured state file (tests.json) { "tests": [ {"id": 1, "name": "authentication_flow", "status": "passing"}, {"id": 2, "name": "user_management", "status": "failing"}, {"id": 3, "name": "api_endpoints", "status": "not_started"} ], "total": 200, "passing": 150, "failing": 25, "not_started": 25 } ``` ```text // Progress notes (progress.txt) Session 3 progress: - Fixed authentication token validation - Updated user model to handle edge cases - Next: investigate user_management test failures (test #2) - Note: Do not remove tests as this could lead to missing functionality ```
### Communication style Claude 4.5 models have a more concise and natural communication style compared to previous models: - **More direct and grounded**: Provides fact-based progress reports rather than self-celebratory updates - **More conversational**: Slightly more fluent and colloquial, less machine-like - **Less verbose**: May skip detailed summaries for efficiency unless prompted otherwise This communication style accurately reflects what has been accomplished without unnecessary elaboration. ## Guidance for specific situations ### Balance verbosity Claude 4.5 models tend toward efficiency and may skip verbal summaries after tool calls, jumping directly to the next action. While this creates a streamlined workflow, you may prefer more visibility into its reasoning process. If you want Claude to provide updates as it works: ```text Sample prompt After completing a task that involves tool use, provide a quick summary of the work you've done. ``` ### Tool usage patterns Claude 4.5 models are trained for precise instruction following and benefits from explicit direction to use specific tools. If you say "can you suggest some changes," it will sometimes provide suggestions rather than implementing them—even if making changes might be what you intended. For Claude to take action, be more explicit:
**Less effective (Claude will only suggest):** ```text Can you suggest some changes to improve this function? ``` **More effective (Claude will make the changes):** ```text Change this function to improve its performance. ``` Or: ```text Make these edits to the authentication flow. ```
To make Claude more proactive about taking action by default, you can add this to your system prompt: ```text Sample prompt for proactive action By default, implement changes rather than only suggesting them. If the user's intent is unclear, infer the most useful likely action and proceed, using tools to discover any missing details instead of guessing. Try to infer the user's intent about whether a tool call (e.g., file edit or read) is intended or not, and act accordingly. ``` On the other hand, if you want the model to be more hesitant by default, less prone to jumping straight into implementations, and only take action if requested, you can steer this behavior with a prompt like the below: ```text Sample prompt for conservative action Do not jump into implementatation or changes files unless clearly instructed to make changes. When the user's intent is ambiguous, default to providing information, doing research, and providing recommendations rather than taking action. Only proceed with edits, modifications, or implementations when the user explicitly requests them. ``` ### Tool usage and triggering Claude Opus 4.5 is more responsive to the system prompt than previous models. If your prompts were designed to reduce undertriggering on tools or skills, Claude Opus 4.5 may now overtrigger. The fix is to dial back any aggressive language. Where you might have said "CRITICAL: You MUST use this tool when...", you can use more normal prompting like "Use this tool when...". ### Control the format of responses There are a few ways that we have found to be particularly effective in steering output formatting in Claude 4.x models: 1. **Tell Claude what to do instead of what not to do** - Instead of: "Do not use markdown in your response" - Try: "Your response should be composed of smoothly flowing prose paragraphs." 2. **Use XML format indicators** - Try: "Write the prose sections of your response in \ tags." 3. **Match your prompt style to the desired output** The formatting style used in your prompt may influence Claude's response style. If you are still experiencing steerability issues with output formatting, we recommend as best as you can matching your prompt style to your desired output style. For example, removing markdown from your prompt can reduce the volume of markdown in the output. 4. **Use detailed prompts for specific formatting preferences** For more control over markdown and formatting usage, provide explicit guidance: ```text Sample prompt to minimize markdown When writing reports, documents, technical explanations, analyses, or any long-form content, write in clear, flowing prose using complete paragraphs and sentences. Use standard paragraph breaks for organization and reserve markdown primarily for `inline code`, code blocks (```...```), and simple headings (###, and ###). Avoid using **bold** and *italics*. DO NOT use ordered lists (1. ...) or unordered lists (*) unless : a) you're presenting truly discrete items where a list format is the best option, or b) the user explicitly requests a list or ranking Instead of listing items with bullets or numbers, incorporate them naturally into sentences. This guidance applies especially to technical writing. Using prose instead of excessive formatting will improve user satisfaction. NEVER output a series of overly short bullet points. Your goal is readable, flowing text that guides the reader naturally through ideas rather than fragmenting information into isolated points. ``` ### Research and information gathering Claude 4.5 models demonstrate exceptional agentic search capabilities and can find and synthesize information from multiple sources effectively. For optimal research results: 1. **Provide clear success criteria**: Define what constitutes a successful answer to your research question 2. **Encourage source verification**: Ask Claude to verify information across multiple sources 3. **For complex research tasks, use a structured approach**: ```text Sample prompt for complex research Search for this information in a structured way. As you gather data, develop several competing hypotheses. Track your confidence levels in your progress notes to improve calibration. Regularly self-critique your approach and plan. Update a hypothesis tree or research notes file to persist information and provide transparency. Break down this complex research task systematically. ``` This structured approach allows Claude to find and synthesize virtually any piece of information and iteratively critique its findings, no matter the size of the corpus. ### Subagent orchestration Claude 4.5 models demonstrate significantly improved native subagent orchestration capabilities. These models can recognize when tasks would benefit from delegating work to specialized subagents and do so proactively without requiring explicit instruction. To take advantage of this behavior: 1. **Ensure well-defined subagent tools**: Have subagent tools available and described in tool definitions 2. **Let Claude orchestrate naturally**: Claude will delegate appropriately without explicit instruction 3. **Adjust conservativeness if needed**: ```text Sample prompt for conservative subagent usage Only delegate to subagents when the task clearly benefits from a separate agent with a new context window. ``` ### Model self-knowledge If you would like Claude to identify itself correctly in your application or use specific API strings: ```text Sample prompt for model identity The assistant is Claude, created by Anthropic. The current model is Claude Sonnet 4.5. ``` For LLM-powered apps that need to specify model strings: ```text Sample prompt for model string When an LLM is needed, please default to Claude Sonnet 4.5 unless the user requests otherwise. The exact model string for Claude Sonnet 4.5 is claude-sonnet-4-5-20250929. ``` ### Thinking sensitivity When extended thinking is disabled, Claude Opus 4.5 is particularly sensitive to the word "think" and its variants. We recommend replacing "think" with alternative words that convey similar meaning, such as "consider," "believe," and "evaluate." ### Leverage thinking & interleaved thinking capabilities Claude 4.x models offer thinking capabilities that can be especially helpful for tasks involving reflection after tool use or complex multi-step reasoning. You can guide its initial or interleaved thinking for better results. ```text Example prompt After receiving tool results, carefully reflect on their quality and determine optimal next steps before proceeding. Use your thinking to plan and iterate based on this new information, and then take the best next action. ``` For more information on thinking capabilities, see [Extended thinking](/docs/en/build-with-claude/extended-thinking). ### Document creation Claude 4.5 models excel at creating presentations, animations, and visual documents. These models match or exceed Claude Opus 4.1 in this domain, with impressive creative flair and stronger instruction following. The models produce polished, usable output on the first try in most cases. For best results with document creation: ```text Sample prompt Create a professional presentation on [topic]. Include thoughtful design elements, visual hierarchy, and engaging animations where appropriate. ``` ### Improved vision capabilities Claude Opus 4.5 has improved vision capabilities compared to previous Claude models. It performs better on image processing and data extraction tasks, particularly when there are multiple images present in context. These improvements carry over to computer use, where the model can more reliably interpret screenshots and UI elements. You can also use Claude Opus 4.5 to analyze videos by breaking them up into frames. One technique we've found effective to further boost performance is to give Claude Opus 4.5 a crop tool or [skill](/docs/en/agents-and-tools/agent-skills/overview). We've seen consistent uplift on image evaluations when Claude is able to "zoom" in on relevant regions of an image. We've put together a cookbook for the crop tool [here](https://platform.claude.com/cookbook/multimodal-crop-tool). ### Optimize parallel tool calling Claude 4.x models excel at parallel tool execution, with Sonnet 4.5 being particularly aggressive in firing off multiple operations simultaneously. Claude 4.x models will: - Run multiple speculative searches during research - Read several files at once to build context faster - Execute bash commands in parallel (which can even bottleneck system performance) This behavior is easily steerable. While the model has a high success rate in parallel tool calling without prompting, you can boost this to ~100% or adjust the aggression level: ```text Sample prompt for maximum parallel efficiency If you intend to call multiple tools and there are no dependencies between the tool calls, make all of the independent tool calls in parallel. Prioritize calling tools simultaneously whenever the actions can be done in parallel rather than sequentially. For example, when reading 3 files, run 3 tool calls in parallel to read all 3 files into context at the same time. Maximize use of parallel tool calls where possible to increase speed and efficiency. However, if some tool calls depend on previous calls to inform dependent values like the parameters, do NOT call these tools in parallel and instead call them sequentially. Never use placeholders or guess missing parameters in tool calls. ``` ```text Sample prompt to reduce parallel execution Execute operations sequentially with brief pauses between each step to ensure stability. ``` ### Reduce file creation in agentic coding Claude 4.x models may sometimes create new files for testing and iteration purposes, particularly when working with code. This approach allows Claude to use files, especially python scripts, as a 'temporary scratchpad' before saving its final output. Using temporary files can improve outcomes particularly for agentic coding use cases. If you'd prefer to minimize net new file creation, you can instruct Claude to clean up after itself: ```text Sample prompt If you create any temporary new files, scripts, or helper files for iteration, clean up these files by removing them at the end of the task. ``` ### Overeagerness and file creation Claude Opus 4.5 has a tendency to overengineer by creating extra files, adding unnecessary abstractions, or building in flexibility that wasn't requested. If you're seeing this undesired behavior, add explicit prompting to keep solutions minimal. For example: ```text Sample prompt to minimize overengineering Avoid over-engineering. Only make changes that are directly requested or clearly necessary. Keep solutions simple and focused. Don't add features, refactor code, or make "improvements" beyond what was asked. A bug fix doesn't need surrounding code cleaned up. A simple feature doesn't need extra configurability. Don't add error handling, fallbacks, or validation for scenarios that can't happen. Trust internal code and framework guarantees. Only validate at system boundaries (user input, external APIs). Don't use backwards-compatibility shims when you can just change the code. Don't create helpers, utilities, or abstractions for one-time operations. Don't design for hypothetical future requirements. The right amount of complexity is the minimum needed for the current task. Reuse existing abstractions where possible and follow the DRY principle. ``` ### Frontend design Claude 4.x models, particularly Opus 4.5, excel at building complex, real-world web applications with strong frontend design. However, without guidance, models can default to generic patterns that create what users call the "AI slop" aesthetic. To create distinctive, creative frontends that surprise and delight: For a detailed guide on improving frontend design, see our blog post on [improving frontend design through skills](https://www.claude.com/blog/improving-frontend-design-through-skills). Here's a system prompt snippet you can use to encourage better frontend design: ```text Sample prompt for frontend aesthetics You tend to converge toward generic, "on distribution" outputs. In frontend design, this creates what users call the "AI slop" aesthetic. Avoid this: make creative, distinctive frontends that surprise and delight. Focus on: - Typography: Choose fonts that are beautiful, unique, and interesting. Avoid generic fonts like Arial and Inter; opt instead for distinctive choices that elevate the frontend's aesthetics. - Color & Theme: Commit to a cohesive aesthetic. Use CSS variables for consistency. Dominant colors with sharp accents outperform timid, evenly-distributed palettes. Draw from IDE themes and cultural aesthetics for inspiration. - Motion: Use animations for effects and micro-interactions. Prioritize CSS-only solutions for HTML. Use Motion library for React when available. Focus on high-impact moments: one well-orchestrated page load with staggered reveals (animation-delay) creates more delight than scattered micro-interactions. - Backgrounds: Create atmosphere and depth rather than defaulting to solid colors. Layer CSS gradients, use geometric patterns, or add contextual effects that match the overall aesthetic. Avoid generic AI-generated aesthetics: - Overused font families (Inter, Roboto, Arial, system fonts) - Clichéd color schemes (particularly purple gradients on white backgrounds) - Predictable layouts and component patterns - Cookie-cutter design that lacks context-specific character Interpret creatively and make unexpected choices that feel genuinely designed for the context. Vary between light and dark themes, different fonts, different aesthetics. You still tend to converge on common choices (Space Grotesk, for example) across generations. Avoid this: it is critical that you think outside the box! ``` You can also refer to the full skill [here](https://github.com/anthropics/claude-code/blob/main/plugins/frontend-design/skills/frontend-design/SKILL.md). ### Avoid focusing on passing tests and hard-coding Claude 4.x models can sometimes focus too heavily on making tests pass at the expense of more general solutions, or may use workarounds like helper scripts for complex refactoring instead of using standard tools directly. To prevent this behavior and ensure robust, generalizable solutions: ```text Sample prompt Please write a high-quality, general-purpose solution using the standard tools available. Do not create helper scripts or workarounds to accomplish the task more efficiently. Implement a solution that works correctly for all valid inputs, not just the test cases. Do not hard-code values or create solutions that only work for specific test inputs. Instead, implement the actual logic that solves the problem generally. Focus on understanding the problem requirements and implementing the correct algorithm. Tests are there to verify correctness, not to define the solution. Provide a principled implementation that follows best practices and software design principles. If the task is unreasonable or infeasible, or if any of the tests are incorrect, please inform me rather than working around them. The solution should be robust, maintainable, and extendable. ``` ### Encouraging code exploration Claude Opus 4.5 is highly capable but can be overly conservative when exploring code. If you notice the model proposing solutions without looking at the code or making assumptions about code it hasn't read, the best solution is to add explicit instructions to the prompt. Claude Opus 4.5 is our most steerable model to date and responds reliably to direct guidance. For example: ```text Sample prompt for code exploration ALWAYS read and understand relevant files before proposing code edits. Do not speculate about code you have not inspected. If the user references a specific file/path, you MUST open and inspect it before explaining or proposing fixes. Be rigorous and persistent in searching code for key facts. Thoroughly review the style, conventions, and abstractions of the codebase before implementing new features or abstractions. ``` ### Minimizing hallucinations in agentic coding Claude 4.x models are less prone to hallucinations and give more accurate, grounded, intelligent answers based on the code. To encourage this behavior even more and minimize hallucinations: ```text Sample prompt Never speculate about code you have not opened. If the user references a specific file, you MUST read the file before answering. Make sure to investigate and read relevant files BEFORE answering questions about the codebase. Never make any claims about code before investigating unless you are certain of the correct answer - give grounded and hallucination-free answers. ``` ## Migration considerations When migrating to Claude 4.5 models: 1. **Be specific about desired behavior**: Consider describing exactly what you'd like to see in the output. 2. **Frame your instructions with modifiers**: Adding modifiers that encourage Claude to increase the quality and detail of its output can help better shape Claude's performance. For example, instead of "Create an analytics dashboard", use "Create an analytics dashboard. Include as many relevant features and interactions as possible. Go beyond the basics to create a fully-featured implementation." 3. **Request specific features explicitly**: Animations and interactive elements should be requested explicitly when desired. --- # Using the Messages API URL: https://platform.claude.com/docs/en/build-with-claude/working-with-messages # Using the Messages API Practical patterns and examples for using the Messages API effectively --- This guide covers common patterns for working with the Messages API, including basic requests, multi-turn conversations, prefill techniques, and vision capabilities. For complete API specifications, see the [Messages API reference](/docs/en/api/messages). ## Basic request and response ```bash Shell #!/bin/sh curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ {"role": "user", "content": "Hello, Claude"} ] }' ``` ```python Python import anthropic message = anthropic.Anthropic().messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ {"role": "user", "content": "Hello, Claude"} ] ) print(message) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const message = await anthropic.messages.create({ model: 'claude-sonnet-4-5', max_tokens: 1024, messages: [ {"role": "user", "content": "Hello, Claude"} ] }); console.log(message); ``` ```json JSON { "id": "msg_01XFDUDYJgAACzvnptvVoYEL", "type": "message", "role": "assistant", "content": [ { "type": "text", "text": "Hello!" } ], "model": "claude-sonnet-4-5", "stop_reason": "end_turn", "stop_sequence": null, "usage": { "input_tokens": 12, "output_tokens": 6 } } ``` ## Multiple conversational turns The Messages API is stateless, which means that you always send the full conversational history to the API. You can use this pattern to build up a conversation over time. Earlier conversational turns don't necessarily need to actually originate from Claude — you can use synthetic `assistant` messages. ```bash Shell #!/bin/sh curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ {"role": "user", "content": "Hello, Claude"}, {"role": "assistant", "content": "Hello!"}, {"role": "user", "content": "Can you describe LLMs to me?"} ] }' ``` ```python Python import anthropic message = anthropic.Anthropic().messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ {"role": "user", "content": "Hello, Claude"}, {"role": "assistant", "content": "Hello!"}, {"role": "user", "content": "Can you describe LLMs to me?"} ], ) print(message) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); await anthropic.messages.create({ model: 'claude-sonnet-4-5', max_tokens: 1024, messages: [ {"role": "user", "content": "Hello, Claude"}, {"role": "assistant", "content": "Hello!"}, {"role": "user", "content": "Can you describe LLMs to me?"} ] }); ``` ```json JSON { "id": "msg_018gCsTGsXkYJVqYPxTgDHBU", "type": "message", "role": "assistant", "content": [ { "type": "text", "text": "Sure, I'd be happy to provide..." } ], "stop_reason": "end_turn", "stop_sequence": null, "usage": { "input_tokens": 30, "output_tokens": 309 } } ``` ## Putting words in Claude's mouth You can pre-fill part of Claude's response in the last position of the input messages list. This can be used to shape Claude's response. The example below uses `"max_tokens": 1` to get a single multiple choice answer from Claude. ```bash Shell #!/bin/sh curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1, "messages": [ {"role": "user", "content": "What is latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"}, {"role": "assistant", "content": "The answer is ("} ] }' ``` ```python Python import anthropic message = anthropic.Anthropic().messages.create( model="claude-sonnet-4-5", max_tokens=1, messages=[ {"role": "user", "content": "What is latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"}, {"role": "assistant", "content": "The answer is ("} ] ) print(message) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const message = await anthropic.messages.create({ model: 'claude-sonnet-4-5', max_tokens: 1, messages: [ {"role": "user", "content": "What is latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"}, {"role": "assistant", "content": "The answer is ("} ] }); console.log(message); ``` ```json JSON { "id": "msg_01Q8Faay6S7QPTvEUUQARt7h", "type": "message", "role": "assistant", "content": [ { "type": "text", "text": "C" } ], "model": "claude-sonnet-4-5", "stop_reason": "max_tokens", "stop_sequence": null, "usage": { "input_tokens": 42, "output_tokens": 1 } } ``` For more information on prefill techniques, see our [prefill guide](/docs/en/build-with-claude/prompt-engineering/prefill-claudes-response). ## Vision Claude can read both text and images in requests. We support both `base64` and `url` source types for images, and the `image/jpeg`, `image/png`, `image/gif`, and `image/webp` media types. See our [vision guide](/docs/en/build-with-claude/vision) for more details. ```bash Shell #!/bin/sh # Option 1: Base64-encoded image IMAGE_URL="https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" IMAGE_MEDIA_TYPE="image/jpeg" IMAGE_BASE64=$(curl "$IMAGE_URL" | base64) curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ {"role": "user", "content": [ {"type": "image", "source": { "type": "base64", "media_type": "'$IMAGE_MEDIA_TYPE'", "data": "'$IMAGE_BASE64'" }}, {"type": "text", "text": "What is in the above image?"} ]} ] }' # Option 2: URL-referenced image curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ {"role": "user", "content": [ {"type": "image", "source": { "type": "url", "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" }}, {"type": "text", "text": "What is in the above image?"} ]} ] }' ``` ```python Python import anthropic import base64 import httpx # Option 1: Base64-encoded image image_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" image_media_type = "image/jpeg" image_data = base64.standard_b64encode(httpx.get(image_url).content).decode("utf-8") message = anthropic.Anthropic().messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "image", "source": { "type": "base64", "media_type": image_media_type, "data": image_data, }, }, { "type": "text", "text": "What is in the above image?" } ], } ], ) print(message) # Option 2: URL-referenced image message_from_url = anthropic.Anthropic().messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "image", "source": { "type": "url", "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg", }, }, { "type": "text", "text": "What is in the above image?" } ], } ], ) print(message_from_url) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); // Option 1: Base64-encoded image const image_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" const image_media_type = "image/jpeg" const image_array_buffer = await ((await fetch(image_url)).arrayBuffer()); const image_data = Buffer.from(image_array_buffer).toString('base64'); const message = await anthropic.messages.create({ model: 'claude-sonnet-4-5', max_tokens: 1024, messages: [ { "role": "user", "content": [ { "type": "image", "source": { "type": "base64", "media_type": image_media_type, "data": image_data, }, }, { "type": "text", "text": "What is in the above image?" } ], } ] }); console.log(message); // Option 2: URL-referenced image const messageFromUrl = await anthropic.messages.create({ model: 'claude-sonnet-4-5', max_tokens: 1024, messages: [ { "role": "user", "content": [ { "type": "image", "source": { "type": "url", "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg", }, }, { "type": "text", "text": "What is in the above image?" } ], } ] }); console.log(messageFromUrl); ``` ```json JSON { "id": "msg_01EcyWo6m4hyW8KHs2y2pei5", "type": "message", "role": "assistant", "content": [ { "type": "text", "text": "This image shows an ant, specifically a close-up view of an ant. The ant is shown in detail, with its distinct head, antennae, and legs clearly visible. The image is focused on capturing the intricate details and features of the ant, likely taken with a macro lens to get an extreme close-up perspective." } ], "model": "claude-sonnet-4-5", "stop_reason": "end_turn", "stop_sequence": null, "usage": { "input_tokens": 1551, "output_tokens": 71 } } ``` ## Tool use and computer use See our [guide](/docs/en/agents-and-tools/tool-use/overview) for examples of how to use tools with the Messages API. See our [computer use guide](/docs/en/agents-and-tools/tool-use/computer-use-tool) for examples of how to control desktop computer environments with the Messages API. For guaranteed JSON output, see [Structured Outputs](/docs/en/build-with-claude/structured-outputs). ### Capabilities --- # Batch processing URL: https://platform.claude.com/docs/en/build-with-claude/batch-processing # Batch processing --- Batch processing is a powerful approach for handling large volumes of requests efficiently. Instead of processing requests one at a time with immediate responses, batch processing allows you to submit multiple requests together for asynchronous processing. This pattern is particularly useful when: - You need to process large volumes of data - Immediate responses are not required - You want to optimize for cost efficiency - You're running large-scale evaluations or analyses The Message Batches API is our first implementation of this pattern. --- # Message Batches API The Message Batches API is a powerful, cost-effective way to asynchronously process large volumes of [Messages](/docs/en/api/messages) requests. This approach is well-suited to tasks that do not require immediate responses, with most batches finishing in less than 1 hour while reducing costs by 50% and increasing throughput. You can [explore the API reference directly](/docs/en/api/creating-message-batches), in addition to this guide. ## How the Message Batches API works When you send a request to the Message Batches API: 1. The system creates a new Message Batch with the provided Messages requests. 2. The batch is then processed asynchronously, with each request handled independently. 3. You can poll for the status of the batch and retrieve results when processing has ended for all requests. This is especially useful for bulk operations that don't require immediate results, such as: - Large-scale evaluations: Process thousands of test cases efficiently. - Content moderation: Analyze large volumes of user-generated content asynchronously. - Data analysis: Generate insights or summaries for large datasets. - Bulk content generation: Create large amounts of text for various purposes (e.g., product descriptions, article summaries). ### Batch limitations - A Message Batch is limited to either 100,000 Message requests or 256 MB in size, whichever is reached first. - We process each batch as fast as possible, with most batches completing within 1 hour. You will be able to access batch results when all messages have completed or after 24 hours, whichever comes first. Batches will expire if processing does not complete within 24 hours. - Batch results are available for 29 days after creation. After that, you may still view the Batch, but its results will no longer be available for download. - Batches are scoped to a [Workspace](/settings/workspaces). You may view all batches—and their results—that were created within the Workspace that your API key belongs to. - Rate limits apply to both Batches API HTTP requests and the number of requests within a batch waiting to be processed. See [Message Batches API rate limits](/docs/en/api/rate-limits#message-batches-api). Additionally, we may slow down processing based on current demand and your request volume. In that case, you may see more requests expiring after 24 hours. - Due to high throughput and concurrent processing, batches may go slightly over your Workspace's configured [spend limit](/settings/limits). ### Supported models All [active models](/docs/en/about-claude/models/overview) support the Message Batches API. ### What can be batched Any request that you can make to the Messages API can be included in a batch. This includes: - Vision - Tool use - System messages - Multi-turn conversations - Any beta features Since each request in the batch is processed independently, you can mix different types of requests within a single batch. Since batches can take longer than 5 minutes to process, consider using the [1-hour cache duration](/docs/en/build-with-claude/prompt-caching#1-hour-cache-duration) with prompt caching for better cache hit rates when processing batches with shared context. --- ## Pricing The Batches API offers significant cost savings. All usage is charged at 50% of the standard API prices. | Model | Batch input | Batch output | |-------------------|------------------|-----------------| | Claude Opus 4.5 | $2.50 / MTok | $12.50 / MTok | | Claude Opus 4.1 | $7.50 / MTok | $37.50 / MTok | | Claude Opus 4 | $7.50 / MTok | $37.50 / MTok | | Claude Sonnet 4.5 | $1.50 / MTok | $7.50 / MTok | | Claude Sonnet 4 | $1.50 / MTok | $7.50 / MTok | | Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) | $1.50 / MTok | $7.50 / MTok | | Claude Haiku 4.5 | $0.50 / MTok | $2.50 / MTok | | Claude Haiku 3.5 | $0.40 / MTok | $2 / MTok | | Claude Opus 3 ([deprecated](/docs/en/about-claude/model-deprecations)) | $7.50 / MTok | $37.50 / MTok | | Claude Haiku 3 | $0.125 / MTok | $0.625 / MTok | --- ## How to use the Message Batches API ### Prepare and create your batch A Message Batch is composed of a list of requests to create a Message. The shape of an individual request is comprised of: - A unique `custom_id` for identifying the Messages request - A `params` object with the standard [Messages API](/docs/en/api/messages) parameters You can [create a batch](/docs/en/api/creating-message-batches) by passing this list into the `requests` parameter: ```bash Shell curl https://api.anthropic.com/v1/messages/batches \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "requests": [ { "custom_id": "my-first-request", "params": { "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ {"role": "user", "content": "Hello, world"} ] } }, { "custom_id": "my-second-request", "params": { "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ {"role": "user", "content": "Hi again, friend"} ] } } ] }' ``` ```python Python import anthropic from anthropic.types.message_create_params import MessageCreateParamsNonStreaming from anthropic.types.messages.batch_create_params import Request client = anthropic.Anthropic() message_batch = client.messages.batches.create( requests=[ Request( custom_id="my-first-request", params=MessageCreateParamsNonStreaming( model="claude-sonnet-4-5", max_tokens=1024, messages=[{ "role": "user", "content": "Hello, world", }] ) ), Request( custom_id="my-second-request", params=MessageCreateParamsNonStreaming( model="claude-sonnet-4-5", max_tokens=1024, messages=[{ "role": "user", "content": "Hi again, friend", }] ) ) ] ) print(message_batch) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const messageBatch = await anthropic.messages.batches.create({ requests: [{ custom_id: "my-first-request", params: { model: "claude-sonnet-4-5", max_tokens: 1024, messages: [ {"role": "user", "content": "Hello, world"} ] } }, { custom_id: "my-second-request", params: { model: "claude-sonnet-4-5", max_tokens: 1024, messages: [ {"role": "user", "content": "Hi again, friend"} ] } }] }); console.log(messageBatch) ``` ```java Java import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.batches.*; public class BatchExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); BatchCreateParams params = BatchCreateParams.builder() .addRequest(BatchCreateParams.Request.builder() .customId("my-first-request") .params(BatchCreateParams.Request.Params.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(1024) .addUserMessage("Hello, world") .build()) .build()) .addRequest(BatchCreateParams.Request.builder() .customId("my-second-request") .params(BatchCreateParams.Request.Params.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(1024) .addUserMessage("Hi again, friend") .build()) .build()) .build(); MessageBatch messageBatch = client.messages().batches().create(params); System.out.println(messageBatch); } } ``` In this example, two separate requests are batched together for asynchronous processing. Each request has a unique `custom_id` and contains the standard parameters you'd use for a Messages API call. **Test your batch requests with the Messages API** Validation of the `params` object for each message request is performed asynchronously, and validation errors are returned when processing of the entire batch has ended. You can ensure that you are building your input correctly by verifying your request shape with the [Messages API](/docs/en/api/messages) first. When a batch is first created, the response will have a processing status of `in_progress`. ```json JSON { "id": "msgbatch_01HkcTjaV5uDC8jWR4ZsDV8d", "type": "message_batch", "processing_status": "in_progress", "request_counts": { "processing": 2, "succeeded": 0, "errored": 0, "canceled": 0, "expired": 0 }, "ended_at": null, "created_at": "2024-09-24T18:37:24.100435Z", "expires_at": "2024-09-25T18:37:24.100435Z", "cancel_initiated_at": null, "results_url": null } ``` ### Tracking your batch The Message Batch's `processing_status` field indicates the stage of processing the batch is in. It starts as `in_progress`, then updates to `ended` once all the requests in the batch have finished processing, and results are ready. You can monitor the state of your batch by visiting the [Console](/settings/workspaces/default/batches), or using the [retrieval endpoint](/docs/en/api/retrieving-message-batches). #### Polling for Message Batch completion To poll a Message Batch, you'll need its `id`, which is provided in the response when creating a batch or by listing batches. You can implement a polling loop that checks the batch status periodically until processing has ended: ```python Python import anthropic import time client = anthropic.Anthropic() message_batch = None while True: message_batch = client.messages.batches.retrieve( MESSAGE_BATCH_ID ) if message_batch.processing_status == "ended": break print(f"Batch {MESSAGE_BATCH_ID} is still processing...") time.sleep(60) print(message_batch) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); let messageBatch; while (true) { messageBatch = await anthropic.messages.batches.retrieve( MESSAGE_BATCH_ID ); if (messageBatch.processing_status === 'ended') { break; } console.log(`Batch ${messageBatch} is still processing... waiting`); await new Promise(resolve => setTimeout(resolve, 60_000)); } console.log(messageBatch); ``` ```bash Shell #!/bin/sh until [[ $(curl -s "https://api.anthropic.com/v1/messages/batches/$MESSAGE_BATCH_ID" \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ | grep -o '"processing_status":[[:space:]]*"[^"]*"' \ | cut -d'"' -f4) == "ended" ]]; do echo "Batch $MESSAGE_BATCH_ID is still processing..." sleep 60 done echo "Batch $MESSAGE_BATCH_ID has finished processing" ``` ### Listing all Message Batches You can list all Message Batches in your Workspace using the [list endpoint](/docs/en/api/listing-message-batches). The API supports pagination, automatically fetching additional pages as needed: ```python Python import anthropic client = anthropic.Anthropic() # Automatically fetches more pages as needed. for message_batch in client.messages.batches.list( limit=20 ): print(message_batch) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); // Automatically fetches more pages as needed. for await (const messageBatch of anthropic.messages.batches.list({ limit: 20 })) { console.log(messageBatch); } ``` ```bash Shell #!/bin/sh if ! command -v jq &> /dev/null; then echo "Error: This script requires jq. Please install it first." exit 1 fi BASE_URL="https://api.anthropic.com/v1/messages/batches" has_more=true after_id="" while [ "$has_more" = true ]; do # Construct URL with after_id if it exists if [ -n "$after_id" ]; then url="${BASE_URL}?limit=20&after_id=${after_id}" else url="$BASE_URL?limit=20" fi response=$(curl -s "$url" \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01") # Extract values using jq has_more=$(echo "$response" | jq -r '.has_more') after_id=$(echo "$response" | jq -r '.last_id') # Process and print each entry in the data array echo "$response" | jq -c '.data[]' | while read -r entry; do echo "$entry" | jq '.' done done ``` ```java Java import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.batches.*; public class BatchListExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Automatically fetches more pages as needed for (MessageBatch messageBatch : client.messages().batches().list( BatchListParams.builder() .limit(20) .build() )) { System.out.println(messageBatch); } } } ``` ### Retrieving batch results Once batch processing has ended, each Messages request in the batch will have a result. There are 4 result types: | Result Type | Description | |-------------|-------------| | `succeeded` | Request was successful. Includes the message result. | | `errored` | Request encountered an error and a message was not created. Possible errors include invalid requests and internal server errors. You will not be billed for these requests. | | `canceled` | User canceled the batch before this request could be sent to the model. You will not be billed for these requests. | | `expired` | Batch reached its 24 hour expiration before this request could be sent to the model. You will not be billed for these requests. | You will see an overview of your results with the batch's `request_counts`, which shows how many requests reached each of these four states. Results of the batch are available for download at the `results_url` property on the Message Batch, and if the organization permission allows, in the Console. Because of the potentially large size of the results, it's recommended to [stream results](/docs/en/api/retrieving-message-batch-results) back rather than download them all at once. ```bash Shell #!/bin/sh curl "https://api.anthropic.com/v1/messages/batches/msgbatch_01HkcTjaV5uDC8jWR4ZsDV8d" \ --header "anthropic-version: 2023-06-01" \ --header "x-api-key: $ANTHROPIC_API_KEY" \ | grep -o '"results_url":[[:space:]]*"[^"]*"' \ | cut -d'"' -f4 \ | while read -r url; do curl -s "$url" \ --header "anthropic-version: 2023-06-01" \ --header "x-api-key: $ANTHROPIC_API_KEY" \ | sed 's/}{/}\n{/g' \ | while IFS= read -r line do result_type=$(echo "$line" | sed -n 's/.*"result":[[:space:]]*{[[:space:]]*"type":[[:space:]]*"\([^"]*\)".*/\1/p') custom_id=$(echo "$line" | sed -n 's/.*"custom_id":[[:space:]]*"\([^"]*\)".*/\1/p') error_type=$(echo "$line" | sed -n 's/.*"error":[[:space:]]*{[[:space:]]*"type":[[:space:]]*"\([^"]*\)".*/\1/p') case "$result_type" in "succeeded") echo "Success! $custom_id" ;; "errored") if [ "$error_type" = "invalid_request" ]; then # Request body must be fixed before re-sending request echo "Validation error: $custom_id" else # Request can be retried directly echo "Server error: $custom_id" fi ;; "expired") echo "Expired: $line" ;; esac done done ``` ```python Python import anthropic client = anthropic.Anthropic() # Stream results file in memory-efficient chunks, processing one at a time for result in client.messages.batches.results( "msgbatch_01HkcTjaV5uDC8jWR4ZsDV8d", ): match result.result.type: case "succeeded": print(f"Success! {result.custom_id}") case "errored": if result.result.error.type == "invalid_request": # Request body must be fixed before re-sending request print(f"Validation error {result.custom_id}") else: # Request can be retried directly print(f"Server error {result.custom_id}") case "expired": print(f"Request expired {result.custom_id}") ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); // Stream results file in memory-efficient chunks, processing one at a time for await (const result of await anthropic.messages.batches.results( "msgbatch_01HkcTjaV5uDC8jWR4ZsDV8d" )) { switch (result.result.type) { case 'succeeded': console.log(`Success! ${result.custom_id}`); break; case 'errored': if (result.result.error.type == "invalid_request") { // Request body must be fixed before re-sending request console.log(`Validation error: ${result.custom_id}`); } else { // Request can be retried directly console.log(`Server error: ${result.custom_id}`); } break; case 'expired': console.log(`Request expired: ${result.custom_id}`); break; } } ``` ```java Java import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.http.StreamResponse; import com.anthropic.models.messages.batches.MessageBatchIndividualResponse; import com.anthropic.models.messages.batches.BatchResultsParams; public class BatchResultsExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Stream results file in memory-efficient chunks, processing one at a time try (StreamResponse streamResponse = client.messages() .batches() .resultsStreaming( BatchResultsParams.builder() .messageBatchId("msgbatch_01HkcTjaV5uDC8jWR4ZsDV8d") .build())) { streamResponse.stream().forEach(result -> { if (result.result().isSucceeded()) { System.out.println("Success! " + result.customId()); } else if (result.result().isErrored()) { if (result.result().asErrored().error().error().isInvalidRequestError()) { // Request body must be fixed before re-sending request System.out.println("Validation error: " + result.customId()); } else { // Request can be retried directly System.out.println("Server error: " + result.customId()); } } else if (result.result().isExpired()) { System.out.println("Request expired: " + result.customId()); } }); } } } ``` The results will be in `.jsonl` format, where each line is a valid JSON object representing the result of a single request in the Message Batch. For each streamed result, you can do something different depending on its `custom_id` and result type. Here is an example set of results: ```json .jsonl file {"custom_id":"my-second-request","result":{"type":"succeeded","message":{"id":"msg_014VwiXbi91y3JMjcpyGBHX5","type":"message","role":"assistant","model":"claude-sonnet-4-5-20250929","content":[{"type":"text","text":"Hello again! It's nice to see you. How can I assist you today? Is there anything specific you'd like to chat about or any questions you have?"}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":11,"output_tokens":36}}}} {"custom_id":"my-first-request","result":{"type":"succeeded","message":{"id":"msg_01FqfsLoHwgeFbguDgpz48m7","type":"message","role":"assistant","model":"claude-sonnet-4-5-20250929","content":[{"type":"text","text":"Hello! How can I assist you today? Feel free to ask me any questions or let me know if there's anything you'd like to chat about."}],"stop_reason":"end_turn","stop_sequence":null,"usage":{"input_tokens":10,"output_tokens":34}}}} ``` If your result has an error, its `result.error` will be set to our standard [error shape](/docs/en/api/errors#error-shapes). **Batch results may not match input order** Batch results can be returned in any order, and may not match the ordering of requests when the batch was created. In the above example, the result for the second batch request is returned before the first. To correctly match results with their corresponding requests, always use the `custom_id` field. ### Canceling a Message Batch You can cancel a Message Batch that is currently processing using the [cancel endpoint](/docs/en/api/canceling-message-batches). Immediately after cancellation, a batch's `processing_status` will be `canceling`. You can use the same polling technique described above to wait until cancellation is finalized. Canceled batches end up with a status of `ended` and may contain partial results for requests that were processed before cancellation. ```python Python import anthropic client = anthropic.Anthropic() message_batch = client.messages.batches.cancel( MESSAGE_BATCH_ID, ) print(message_batch) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const messageBatch = await anthropic.messages.batches.cancel( MESSAGE_BATCH_ID ); console.log(messageBatch); ``` ```bash Shell #!/bin/sh curl --request POST https://api.anthropic.com/v1/messages/batches/$MESSAGE_BATCH_ID/cancel \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" ``` ```java Java import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.batches.*; public class BatchCancelExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); MessageBatch messageBatch = client.messages().batches().cancel( BatchCancelParams.builder() .messageBatchId(MESSAGE_BATCH_ID) .build() ); System.out.println(messageBatch); } } ``` The response will show the batch in a `canceling` state: ```json JSON { "id": "msgbatch_013Zva2CMHLNnXjNJJKqJ2EF", "type": "message_batch", "processing_status": "canceling", "request_counts": { "processing": 2, "succeeded": 0, "errored": 0, "canceled": 0, "expired": 0 }, "ended_at": null, "created_at": "2024-09-24T18:37:24.100435Z", "expires_at": "2024-09-25T18:37:24.100435Z", "cancel_initiated_at": "2024-09-24T18:39:03.114875Z", "results_url": null } ``` ### Using prompt caching with Message Batches The Message Batches API supports prompt caching, allowing you to potentially reduce costs and processing time for batch requests. The pricing discounts from prompt caching and Message Batches can stack, providing even greater cost savings when both features are used together. However, since batch requests are processed asynchronously and concurrently, cache hits are provided on a best-effort basis. Users typically experience cache hit rates ranging from 30% to 98%, depending on their traffic patterns. To maximize the likelihood of cache hits in your batch requests: 1. Include identical `cache_control` blocks in every Message request within your batch 2. Maintain a steady stream of requests to prevent cache entries from expiring after their 5-minute lifetime 3. Structure your requests to share as much cached content as possible Example of implementing prompt caching in a batch: ```bash Shell curl https://api.anthropic.com/v1/messages/batches \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "requests": [ { "custom_id": "my-first-request", "params": { "model": "claude-sonnet-4-5", "max_tokens": 1024, "system": [ { "type": "text", "text": "You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n" }, { "type": "text", "text": "", "cache_control": {"type": "ephemeral"} } ], "messages": [ {"role": "user", "content": "Analyze the major themes in Pride and Prejudice."} ] } }, { "custom_id": "my-second-request", "params": { "model": "claude-sonnet-4-5", "max_tokens": 1024, "system": [ { "type": "text", "text": "You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n" }, { "type": "text", "text": "", "cache_control": {"type": "ephemeral"} } ], "messages": [ {"role": "user", "content": "Write a summary of Pride and Prejudice."} ] } } ] }' ``` ```python Python import anthropic from anthropic.types.message_create_params import MessageCreateParamsNonStreaming from anthropic.types.messages.batch_create_params import Request client = anthropic.Anthropic() message_batch = client.messages.batches.create( requests=[ Request( custom_id="my-first-request", params=MessageCreateParamsNonStreaming( model="claude-sonnet-4-5", max_tokens=1024, system=[ { "type": "text", "text": "You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n" }, { "type": "text", "text": "", "cache_control": {"type": "ephemeral"} } ], messages=[{ "role": "user", "content": "Analyze the major themes in Pride and Prejudice." }] ) ), Request( custom_id="my-second-request", params=MessageCreateParamsNonStreaming( model="claude-sonnet-4-5", max_tokens=1024, system=[ { "type": "text", "text": "You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n" }, { "type": "text", "text": "", "cache_control": {"type": "ephemeral"} } ], messages=[{ "role": "user", "content": "Write a summary of Pride and Prejudice." }] ) ) ] ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const messageBatch = await anthropic.messages.batches.create({ requests: [{ custom_id: "my-first-request", params: { model: "claude-sonnet-4-5", max_tokens: 1024, system: [ { type: "text", text: "You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n" }, { type: "text", text: "", cache_control: {type: "ephemeral"} } ], messages: [ {"role": "user", "content": "Analyze the major themes in Pride and Prejudice."} ] } }, { custom_id: "my-second-request", params: { model: "claude-sonnet-4-5", max_tokens: 1024, system: [ { type: "text", text: "You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n" }, { type: "text", text: "", cache_control: {type: "ephemeral"} } ], messages: [ {"role": "user", "content": "Write a summary of Pride and Prejudice."} ] } }] }); ``` ```java Java import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.CacheControlEphemeral; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.TextBlockParam; import com.anthropic.models.messages.batches.*; public class BatchExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); BatchCreateParams createParams = BatchCreateParams.builder() .addRequest(BatchCreateParams.Request.builder() .customId("my-first-request") .params(BatchCreateParams.Request.Params.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(1024) .systemOfTextBlockParams(List.of( TextBlockParam.builder() .text("You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n") .build(), TextBlockParam.builder() .text("") .cacheControl(CacheControlEphemeral.builder().build()) .build() )) .addUserMessage("Analyze the major themes in Pride and Prejudice.") .build()) .build()) .addRequest(BatchCreateParams.Request.builder() .customId("my-second-request") .params(BatchCreateParams.Request.Params.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(1024) .systemOfTextBlockParams(List.of( TextBlockParam.builder() .text("You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n") .build(), TextBlockParam.builder() .text("") .cacheControl(CacheControlEphemeral.builder().build()) .build() )) .addUserMessage("Write a summary of Pride and Prejudice.") .build()) .build()) .build(); MessageBatch messageBatch = client.messages().batches().create(createParams); } } ``` In this example, both requests in the batch include identical system messages and the full text of Pride and Prejudice marked with `cache_control` to increase the likelihood of cache hits. ### Best practices for effective batching To get the most out of the Batches API: - Monitor batch processing status regularly and implement appropriate retry logic for failed requests. - Use meaningful `custom_id` values to easily match results with requests, since order is not guaranteed. - Consider breaking very large datasets into multiple batches for better manageability. - Dry run a single request shape with the Messages API to avoid validation errors. ### Troubleshooting common issues If experiencing unexpected behavior: - Verify that the total batch request size doesn't exceed 256 MB. If the request size is too large, you may get a 413 `request_too_large` error. - Check that you're using [supported models](#supported-models) for all requests in the batch. - Ensure each request in the batch has a unique `custom_id`. - Ensure that it has been less than 29 days since batch `created_at` (not processing `ended_at`) time. If over 29 days have passed, results will no longer be viewable. - Confirm that the batch has not been canceled. Note that the failure of one request in a batch does not affect the processing of other requests. --- ## Batch storage and privacy - **Workspace isolation**: Batches are isolated within the Workspace they are created in. They can only be accessed by API keys associated with that Workspace, or users with permission to view Workspace batches in the Console. - **Result availability**: Batch results are available for 29 days after the batch is created, allowing ample time for retrieval and processing. --- ## FAQ
Batches may take up to 24 hours for processing, but many will finish sooner. Actual processing time depends on the size of the batch, current demand, and your request volume. It is possible for a batch to expire and not complete within 24 hours.
See [above](#supported-models) for the list of supported models.
Yes, the Message Batches API supports all features available in the Messages API, including beta features. However, streaming is not supported for batch requests.
The Message Batches API offers a 50% discount on all usage compared to standard API prices. This applies to input tokens, output tokens, and any special tokens. For more on pricing, visit our [pricing page](https://claude.com/pricing#anthropic-api).
No, once a batch has been submitted, it cannot be modified. If you need to make changes, you should cancel the current batch and submit a new one. Note that cancellation may not take immediate effect.
The Message Batches API has HTTP requests-based rate limits in addition to limits on the number of requests in need of processing. See [Message Batches API rate limits](/docs/en/api/rate-limits#message-batches-api). Usage of the Batches API does not affect rate limits in the Messages API.
When you retrieve the results, each request will have a `result` field indicating whether it `succeeded`, `errored`, was `canceled`, or `expired`. For `errored` results, additional error information will be provided. View the error response object in the [API reference](/docs/en/api/creating-message-batches).
The Message Batches API is designed with strong privacy and data separation measures: 1. Batches and their results are isolated within the Workspace in which they were created. This means they can only be accessed by API keys from that same Workspace. 2. Each request within a batch is processed independently, with no data leakage between requests. 3. Results are only available for a limited time (29 days), and follow our [data retention policy](https://support.claude.com/en/articles/7996866-how-long-do-you-store-personal-data). 4. Downloading batch results in the Console can be disabled on the organization-level or on a per-workspace basis.
Yes, it is possible to use prompt caching with Message Batches API. However, because asynchronous batch requests can be processed concurrently and in any order, cache hits are provided on a best-effort basis.
--- # Building with extended thinking URL: https://platform.claude.com/docs/en/build-with-claude/extended-thinking # Building with extended thinking --- Extended thinking gives Claude enhanced reasoning capabilities for complex tasks, while providing varying levels of transparency into its step-by-step thought process before it delivers its final answer. ## Supported models Extended thinking is supported in the following models: - Claude Sonnet 4.5 (`claude-sonnet-4-5-20250929`) - Claude Sonnet 4 (`claude-sonnet-4-20250514`) - Claude Sonnet 3.7 (`claude-3-7-sonnet-20250219`) ([deprecated](/docs/en/about-claude/model-deprecations)) - Claude Haiku 4.5 (`claude-haiku-4-5-20251001`) - Claude Opus 4.5 (`claude-opus-4-5-20251101`) - Claude Opus 4.1 (`claude-opus-4-1-20250805`) - Claude Opus 4 (`claude-opus-4-20250514`) API behavior differs across Claude Sonnet 3.7 and Claude 4 models, but the API shapes remain exactly the same. For more information, see [Differences in thinking across model versions](#differences-in-thinking-across-model-versions). ## How extended thinking works When extended thinking is turned on, Claude creates `thinking` content blocks where it outputs its internal reasoning. Claude incorporates insights from this reasoning before crafting a final response. The API response will include `thinking` content blocks, followed by `text` content blocks. Here's an example of the default response format: ```json { "content": [ { "type": "thinking", "thinking": "Let me analyze this step by step...", "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...." }, { "type": "text", "text": "Based on my analysis..." } ] } ``` For more information about the response format of extended thinking, see the [Messages API Reference](/docs/en/api/messages). ## How to use extended thinking Here is an example of using extended thinking in the Messages API: ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 16000, "thinking": { "type": "enabled", "budget_tokens": 10000 }, "messages": [ { "role": "user", "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?" } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=16000, thinking={ "type": "enabled", "budget_tokens": 10000 }, messages=[{ "role": "user", "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?" }] ) # The response will contain summarized thinking blocks and text blocks for block in response.content: if block.type == "thinking": print(f"\nThinking summary: {block.thinking}") elif block.type == "text": print(f"\nResponse: {block.text}") ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); const response = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 16000, thinking: { type: "enabled", budget_tokens: 10000 }, messages: [{ role: "user", content: "Are there an infinite number of prime numbers such that n mod 4 == 3?" }] }); // The response will contain summarized thinking blocks and text blocks for (const block of response.content) { if (block.type === "thinking") { console.log(`\nThinking summary: ${block.thinking}`); } else if (block.type === "text") { console.log(`\nResponse: ${block.text}`); } } ``` ```java Java import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.beta.messages.*; import com.anthropic.models.beta.messages.MessageCreateParams; import com.anthropic.models.messages.*; public class SimpleThinkingExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); BetaMessage response = client.beta().messages().create( MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(16000) .thinking(BetaThinkingConfigEnabled.builder().budgetTokens(10000).build()) .addUserMessage("Are there an infinite number of prime numbers such that n mod 4 == 3?") .build() ); System.out.println(response); } } ``` To turn on extended thinking, add a `thinking` object, with the `type` parameter set to `enabled` and the `budget_tokens` to a specified token budget for extended thinking. The `budget_tokens` parameter determines the maximum number of tokens Claude is allowed to use for its internal reasoning process. In Claude 4 models, this limit applies to full thinking tokens, and not to [the summarized output](#summarized-thinking). Larger budgets can improve response quality by enabling more thorough analysis for complex problems, although Claude may not use the entire budget allocated, especially at ranges above 32k. `budget_tokens` must be set to a value less than `max_tokens`. However, when using [interleaved thinking with tools](#interleaved-thinking), you can exceed this limit as the token limit becomes your entire context window (200k tokens). ### Summarized thinking With extended thinking enabled, the Messages API for Claude 4 models returns a summary of Claude's full thinking process. Summarized thinking provides the full intelligence benefits of extended thinking, while preventing misuse. Here are some important considerations for summarized thinking: - You're charged for the full thinking tokens generated by the original request, not the summary tokens. - The billed output token count will **not match** the count of tokens you see in the response. - The first few lines of thinking output are more verbose, providing detailed reasoning that's particularly helpful for prompt engineering purposes. - As Anthropic seeks to improve the extended thinking feature, summarization behavior is subject to change. - Summarization preserves the key ideas of Claude's thinking process with minimal added latency, enabling a streamable user experience and easy migration from Claude Sonnet 3.7 to Claude 4 models. - Summarization is processed by a different model than the one you target in your requests. The thinking model does not see the summarized output. Claude Sonnet 3.7 continues to return full thinking output. In rare cases where you need access to full thinking output for Claude 4 models, [contact our sales team](mailto:sales@anthropic.com). ### Streaming thinking You can stream extended thinking responses using [server-sent events (SSE)](https://developer.mozilla.org/en-US/Web/API/Server-sent%5Fevents/Using%5Fserver-sent%5Fevents). When streaming is enabled for extended thinking, you receive thinking content via `thinking_delta` events. For more documention on streaming via the Messages API, see [Streaming Messages](/docs/en/build-with-claude/streaming). Here's how to handle streaming with thinking: ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 16000, "stream": true, "thinking": { "type": "enabled", "budget_tokens": 10000 }, "messages": [ { "role": "user", "content": "What is 27 * 453?" } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() with client.messages.stream( model="claude-sonnet-4-5", max_tokens=16000, thinking={"type": "enabled", "budget_tokens": 10000}, messages=[{"role": "user", "content": "What is 27 * 453?"}], ) as stream: thinking_started = False response_started = False for event in stream: if event.type == "content_block_start": print(f"\nStarting {event.content_block.type} block...") # Reset flags for each new block thinking_started = False response_started = False elif event.type == "content_block_delta": if event.delta.type == "thinking_delta": if not thinking_started: print("Thinking: ", end="", flush=True) thinking_started = True print(event.delta.thinking, end="", flush=True) elif event.delta.type == "text_delta": if not response_started: print("Response: ", end="", flush=True) response_started = True print(event.delta.text, end="", flush=True) elif event.type == "content_block_stop": print("\nBlock complete.") ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); const stream = await client.messages.stream({ model: "claude-sonnet-4-5", max_tokens: 16000, thinking: { type: "enabled", budget_tokens: 10000 }, messages: [{ role: "user", content: "What is 27 * 453?" }] }); let thinkingStarted = false; let responseStarted = false; for await (const event of stream) { if (event.type === 'content_block_start') { console.log(`\nStarting ${event.content_block.type} block...`); // Reset flags for each new block thinkingStarted = false; responseStarted = false; } else if (event.type === 'content_block_delta') { if (event.delta.type === 'thinking_delta') { if (!thinkingStarted) { process.stdout.write('Thinking: '); thinkingStarted = true; } process.stdout.write(event.delta.thinking); } else if (event.delta.type === 'text_delta') { if (!responseStarted) { process.stdout.write('Response: '); responseStarted = true; } process.stdout.write(event.delta.text); } } else if (event.type === 'content_block_stop') { console.log('\nBlock complete.'); } } ``` ```java Java import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.http.StreamResponse; import com.anthropic.models.beta.messages.MessageCreateParams; import com.anthropic.models.beta.messages.BetaRawMessageStreamEvent; import com.anthropic.models.beta.messages.BetaThinkingConfigEnabled; import com.anthropic.models.messages.Model; public class SimpleThinkingStreamingExample { private static boolean thinkingStarted = false; private static boolean responseStarted = false; public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); MessageCreateParams createParams = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(16000) .thinking(BetaThinkingConfigEnabled.builder().budgetTokens(10000).build()) .addUserMessage("What is 27 * 453?") .build(); try (StreamResponse streamResponse = client.beta().messages().createStreaming(createParams)) { streamResponse.stream() .forEach(event -> { if (event.isContentBlockStart()) { System.out.printf("\nStarting %s block...%n", event.asContentBlockStart()._type()); // Reset flags for each new block thinkingStarted = false; responseStarted = false; } else if (event.isContentBlockDelta()) { var delta = event.asContentBlockDelta().delta(); if (delta.isBetaThinking()) { if (!thinkingStarted) { System.out.print("Thinking: "); thinkingStarted = true; } System.out.print(delta.asBetaThinking().thinking()); System.out.flush(); } else if (delta.isBetaText()) { if (!responseStarted) { System.out.print("Response: "); responseStarted = true; } System.out.print(delta.asBetaText().text()); System.out.flush(); } } else if (event.isContentBlockStop()) { System.out.println("\nBlock complete."); } }); } } } ``` Try in Console Example streaming output: ```json event: message_start data: {"type": "message_start", "message": {"id": "msg_01...", "type": "message", "role": "assistant", "content": [], "model": "claude-sonnet-4-5", "stop_reason": null, "stop_sequence": null}} event: content_block_start data: {"type": "content_block_start", "index": 0, "content_block": {"type": "thinking", "thinking": ""}} event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "Let me solve this step by step:\n\n1. First break down 27 * 453"}} event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "\n2. 453 = 400 + 50 + 3"}} // Additional thinking deltas... event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "signature_delta", "signature": "EqQBCgIYAhIM1gbcDa9GJwZA2b3hGgxBdjrkzLoky3dl1pkiMOYds..."}} event: content_block_stop data: {"type": "content_block_stop", "index": 0} event: content_block_start data: {"type": "content_block_start", "index": 1, "content_block": {"type": "text", "text": ""}} event: content_block_delta data: {"type": "content_block_delta", "index": 1, "delta": {"type": "text_delta", "text": "27 * 453 = 12,231"}} // Additional text deltas... event: content_block_stop data: {"type": "content_block_stop", "index": 1} event: message_delta data: {"type": "message_delta", "delta": {"stop_reason": "end_turn", "stop_sequence": null}} event: message_stop data: {"type": "message_stop"} ``` When using streaming with thinking enabled, you might notice that text sometimes arrives in larger chunks alternating with smaller, token-by-token delivery. This is expected behavior, especially for thinking content. The streaming system needs to process content in batches for optimal performance, which can result in this "chunky" delivery pattern, with possible delays between streaming events. We're continuously working to improve this experience, with future updates focused on making thinking content stream more smoothly. ## Extended thinking with tool use Extended thinking can be used alongside [tool use](/docs/en/agents-and-tools/tool-use/overview), allowing Claude to reason through tool selection and results processing. When using extended thinking with tool use, be aware of the following limitations: 1. **Tool choice limitation**: Tool use with thinking only supports `tool_choice: {"type": "auto"}` (the default) or `tool_choice: {"type": "none"}`. Using `tool_choice: {"type": "any"}` or `tool_choice: {"type": "tool", "name": "..."}` will result in an error because these options force tool use, which is incompatible with extended thinking. 2. **Preserving thinking blocks**: During tool use, you must pass `thinking` blocks back to the API for the last assistant message. Include the complete unmodified block back to the API to maintain reasoning continuity. ### Toggling thinking modes in conversations You cannot toggle thinking in the middle of an assistant turn, including during tool use loops. The entire assistant turn should operate in a single thinking mode: - **If thinking is enabled**, the final assistant turn should start with a thinking block. - **If thinking is disabled**, the final assistant turn should not contain any thinking blocks From the model's perspective, **tool use loops are part of the assistant turn**. An assistant turn doesn't complete until Claude finishes its full response, which may include multiple tool calls and results. For example, this sequence is all part of a **single assistant turn**: ``` User: "What's the weather in Paris?" Assistant: [thinking] + [tool_use: get_weather] User: [tool_result: "20°C, sunny"] Assistant: [text: "The weather in Paris is 20°C and sunny"] ``` Even though there are multiple API messages, the tool use loop is conceptually part of one continuous assistant response. #### Graceful thinking degradation When a mid-turn thinking conflict occurs (such as toggling thinking on or off during a tool use loop), the API automatically disables thinking for that request. To preserve model quality and remain on-distribution, the API may: - Strip thinking blocks from the conversation when they would create an invalid turn structure - Disable thinking for the current request when the conversation history is incompatible with thinking being enabled This means that attempting to toggle thinking mid-turn won't cause an error, but thinking will be silently disabled for that request. To confirm whether thinking was active, check for the presence of `thinking` blocks in the response. #### Practical guidance **Best practice**: Plan your thinking strategy at the start of each turn rather than trying to toggle mid-turn. **Example: Toggling thinking after completing a turn** ``` User: "What's the weather?" Assistant: [tool_use] (thinking disabled) User: [tool_result] Assistant: [text: "It's sunny"] User: "What about tomorrow?" Assistant: [thinking] + [text: "..."] (thinking enabled - new turn) ``` By completing the assistant turn before toggling thinking, you ensure that thinking is actually enabled for the new request. Toggling thinking modes also invalidates prompt caching for message history. For more details, see the [Extended thinking with prompt caching](#extended-thinking-with-prompt-caching) section.
Here's a practical example showing how to preserve thinking blocks when providing tool results: ```python Python weather_tool = { "name": "get_weather", "description": "Get current weather for a location", "input_schema": { "type": "object", "properties": { "location": {"type": "string"} }, "required": ["location"] } } # First request - Claude responds with thinking and tool request response = client.messages.create( model="claude-sonnet-4-5", max_tokens=16000, thinking={ "type": "enabled", "budget_tokens": 10000 }, tools=[weather_tool], messages=[ {"role": "user", "content": "What's the weather in Paris?"} ] ) ``` ```typescript TypeScript const weatherTool = { name: "get_weather", description: "Get current weather for a location", input_schema: { type: "object", properties: { location: { type: "string" } }, required: ["location"] } }; // First request - Claude responds with thinking and tool request const response = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 16000, thinking: { type: "enabled", budget_tokens: 10000 }, tools: [weatherTool], messages: [ { role: "user", content: "What's the weather in Paris?" } ] }); ``` ```java Java import java.util.List; import java.util.Map; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.JsonValue; import com.anthropic.models.beta.messages.BetaMessage; import com.anthropic.models.beta.messages.MessageCreateParams; import com.anthropic.models.beta.messages.BetaThinkingConfigEnabled; import com.anthropic.models.beta.messages.BetaTool; import com.anthropic.models.beta.messages.BetaTool.InputSchema; import com.anthropic.models.messages.Model; public class ThinkingWithToolsExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); InputSchema schema = InputSchema.builder() .properties(JsonValue.from(Map.of( "location", Map.of("type", "string") ))) .putAdditionalProperty("required", JsonValue.from(List.of("location"))) .build(); BetaTool weatherTool = BetaTool.builder() .name("get_weather") .description("Get current weather for a location") .inputSchema(schema) .build(); BetaMessage response = client.beta().messages().create( MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(16000) .thinking(BetaThinkingConfigEnabled.builder().budgetTokens(10000).build()) .addTool(weatherTool) .addUserMessage("What's the weather in Paris?") .build() ); System.out.println(response); } } ``` The API response will include thinking, text, and tool_use blocks: ```json { "content": [ { "type": "thinking", "thinking": "The user wants to know the current weather in Paris. I have access to a function `get_weather`...", "signature": "BDaL4VrbR2Oj0hO4XpJxT28J5TILnCrrUXoKiiNBZW9P+nr8XSj1zuZzAl4egiCCpQNvfyUuFFJP5CncdYZEQPPmLxYsNrcs...." }, { "type": "text", "text": "I can help you get the current weather information for Paris. Let me check that for you" }, { "type": "tool_use", "id": "toolu_01CswdEQBMshySk6Y9DFKrfq", "name": "get_weather", "input": { "location": "Paris" } } ] } ``` Now let's continue the conversation and use the tool ```python Python # Extract thinking block and tool use block thinking_block = next((block for block in response.content if block.type == 'thinking'), None) tool_use_block = next((block for block in response.content if block.type == 'tool_use'), None) # Call your actual weather API, here is where your actual API call would go # let's pretend this is what we get back weather_data = {"temperature": 88} # Second request - Include thinking block and tool result # No new thinking blocks will be generated in the response continuation = client.messages.create( model="claude-sonnet-4-5", max_tokens=16000, thinking={ "type": "enabled", "budget_tokens": 10000 }, tools=[weather_tool], messages=[ {"role": "user", "content": "What's the weather in Paris?"}, # notice that the thinking_block is passed in as well as the tool_use_block # if this is not passed in, an error is raised {"role": "assistant", "content": [thinking_block, tool_use_block]}, {"role": "user", "content": [{ "type": "tool_result", "tool_use_id": tool_use_block.id, "content": f"Current temperature: {weather_data['temperature']}°F" }]} ] ) ``` ```typescript TypeScript // Extract thinking block and tool use block const thinkingBlock = response.content.find(block => block.type === 'thinking'); const toolUseBlock = response.content.find(block => block.type === 'tool_use'); // Call your actual weather API, here is where your actual API call would go // let's pretend this is what we get back const weatherData = { temperature: 88 }; // Second request - Include thinking block and tool result // No new thinking blocks will be generated in the response const continuation = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 16000, thinking: { type: "enabled", budget_tokens: 10000 }, tools: [weatherTool], messages: [ { role: "user", content: "What's the weather in Paris?" }, // notice that the thinkingBlock is passed in as well as the toolUseBlock // if this is not passed in, an error is raised { role: "assistant", content: [thinkingBlock, toolUseBlock] }, { role: "user", content: [{ type: "tool_result", tool_use_id: toolUseBlock.id, content: `Current temperature: ${weatherData.temperature}°F` }]} ] }); ``` ```java Java import java.util.List; import java.util.Map; import java.util.Optional; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.JsonValue; import com.anthropic.models.beta.messages.*; import com.anthropic.models.beta.messages.BetaTool.InputSchema; import com.anthropic.models.messages.Model; public class ThinkingToolsResultExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); InputSchema schema = InputSchema.builder() .properties(JsonValue.from(Map.of( "location", Map.of("type", "string") ))) .putAdditionalProperty("required", JsonValue.from(List.of("location"))) .build(); BetaTool weatherTool = BetaTool.builder() .name("get_weather") .description("Get current weather for a location") .inputSchema(schema) .build(); BetaMessage response = client.beta().messages().create( MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(16000) .thinking(BetaThinkingConfigEnabled.builder().budgetTokens(10000).build()) .addTool(weatherTool) .addUserMessage("What's the weather in Paris?") .build() ); // Extract thinking block and tool use block Optional thinkingBlockOpt = response.content().stream() .filter(BetaContentBlock::isThinking) .map(BetaContentBlock::asThinking) .findFirst(); Optional toolUseBlockOpt = response.content().stream() .filter(BetaContentBlock::isToolUse) .map(BetaContentBlock::asToolUse) .findFirst(); if (thinkingBlockOpt.isPresent() && toolUseBlockOpt.isPresent()) { BetaThinkingBlock thinkingBlock = thinkingBlockOpt.get(); BetaToolUseBlock toolUseBlock = toolUseBlockOpt.get(); // Call your actual weather API, here is where your actual API call would go // let's pretend this is what we get back Map weatherData = Map.of("temperature", 88); // Second request - Include thinking block and tool result // No new thinking blocks will be generated in the response BetaMessage continuation = client.beta().messages().create( MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(16000) .thinking(BetaThinkingConfigEnabled.builder().budgetTokens(10000).build()) .addTool(weatherTool) .addUserMessage("What's the weather in Paris?") .addAssistantMessageOfBetaContentBlockParams( // notice that the thinkingBlock is passed in as well as the toolUseBlock // if this is not passed in, an error is raised List.of( BetaContentBlockParam.ofThinking(thinkingBlock.toParam()), BetaContentBlockParam.ofToolUse(toolUseBlock.toParam()) ) ) .addUserMessageOfBetaContentBlockParams(List.of( BetaContentBlockParam.ofToolResult( BetaToolResultBlockParam.builder() .toolUseId(toolUseBlock.id()) .content(String.format("Current temperature: %d°F", (Integer)weatherData.get("temperature"))) .build() ) )) .build() ); System.out.println(continuation); } } } ``` The API response will now **only** include text ```json { "content": [ { "type": "text", "text": "Currently in Paris, the temperature is 88°F (31°C)" } ] } ```
### Preserving thinking blocks During tool use, you must pass `thinking` blocks back to the API, and you must include the complete unmodified block back to the API. This is critical for maintaining the model's reasoning flow and conversation integrity. While you can omit `thinking` blocks from prior `assistant` role turns, we suggest always passing back all thinking blocks to the API for any multi-turn conversation. The API will: - Automatically filter the provided thinking blocks - Use the relevant thinking blocks necessary to preserve the model's reasoning - Only bill for the input tokens for the blocks shown to Claude When toggling thinking modes during a conversation, remember that the entire assistant turn (including tool use loops) must operate in a single thinking mode. For more details, see [Toggling thinking modes in conversations](#toggling-thinking-modes-in-conversations). When Claude invokes tools, it is pausing its construction of a response to await external information. When tool results are returned, Claude will continue building that existing response. This necessitates preserving thinking blocks during tool use, for a couple of reasons: 1. **Reasoning continuity**: The thinking blocks capture Claude's step-by-step reasoning that led to tool requests. When you post tool results, including the original thinking ensures Claude can continue its reasoning from where it left off. 2. **Context maintenance**: While tool results appear as user messages in the API structure, they're part of a continuous reasoning flow. Preserving thinking blocks maintains this conceptual flow across multiple API calls. For more information on context management, see our [guide on context windows](/docs/en/build-with-claude/context-windows). **Important**: When providing `thinking` blocks, the entire sequence of consecutive `thinking` blocks must match the outputs generated by the model during the original request; you cannot rearrange or modify the sequence of these blocks. ### Interleaved thinking Extended thinking with tool use in Claude 4 models supports interleaved thinking, which enables Claude to think between tool calls and make more sophisticated reasoning after receiving tool results. With interleaved thinking, Claude can: - Reason about the results of a tool call before deciding what to do next - Chain multiple tool calls with reasoning steps in between - Make more nuanced decisions based on intermediate results To enable interleaved thinking, add [the beta header](/docs/en/api/beta-headers) `interleaved-thinking-2025-05-14` to your API request. Here are some important considerations for interleaved thinking: - With interleaved thinking, the `budget_tokens` can exceed the `max_tokens` parameter, as it represents the total budget across all thinking blocks within one assistant turn. - Interleaved thinking is only supported for [tools used via the Messages API](/docs/en/agents-and-tools/tool-use/overview). - Interleaved thinking is supported for Claude 4 models only, with the beta header `interleaved-thinking-2025-05-14`. - Direct calls to the Claude API allow you to pass `interleaved-thinking-2025-05-14` in requests to any model, with no effect. - On 3rd-party platforms (e.g., [Amazon Bedrock](/docs/en/build-with-claude/claude-on-amazon-bedrock) and [Vertex AI](/docs/en/build-with-claude/claude-on-vertex-ai)), if you pass `interleaved-thinking-2025-05-14` to any model aside from Claude Opus 4.5, Claude Opus 4.1, Opus 4, or Sonnet 4, your request will fail.
Without interleaved thinking, Claude thinks once at the start of the assistant turn. Subsequent responses after tool results continue without new thinking blocks. ``` User: "What's the total revenue if we sold 150 units at $50 each, and how does this compare to our average monthly revenue?" Turn 1: [thinking] "I need to calculate 150 * $50, then check the database..." [tool_use: calculator] { "expression": "150 * 50" } ↓ tool result: "7500" Turn 2: [tool_use: database_query] { "query": "SELECT AVG(revenue)..." } ↑ no thinking block ↓ tool result: "5200" Turn 3: [text] "The total revenue is $7,500, which is 44% above your average monthly revenue of $5,200." ↑ no thinking block ```
With interleaved thinking enabled, Claude can think after receiving each tool result, allowing it to reason about intermediate results before continuing. ``` User: "What's the total revenue if we sold 150 units at $50 each, and how does this compare to our average monthly revenue?" Turn 1: [thinking] "I need to calculate 150 * $50 first..." [tool_use: calculator] { "expression": "150 * 50" } ↓ tool result: "7500" Turn 2: [thinking] "Got $7,500. Now I should query the database to compare..." [tool_use: database_query] { "query": "SELECT AVG(revenue)..." } ↑ thinking after receiving calculator result ↓ tool result: "5200" Turn 3: [thinking] "$7,500 vs $5,200 average - that's a 44% increase..." [text] "The total revenue is $7,500, which is 44% above your average monthly revenue of $5,200." ↑ thinking before final answer ```
## Extended thinking with prompt caching [Prompt caching](/docs/en/build-with-claude/prompt-caching) with thinking has several important considerations: Extended thinking tasks often take longer than 5 minutes to complete. Consider using the [1-hour cache duration](/docs/en/build-with-claude/prompt-caching#1-hour-cache-duration) to maintain cache hits across longer thinking sessions and multi-step workflows. **Thinking block context removal** - Thinking blocks from previous turns are removed from context, which can affect cache breakpoints - When continuing conversations with tool use, thinking blocks are cached and count as input tokens when read from cache - This creates a tradeoff: while thinking blocks don't consume context window space visually, they still count toward your input token usage when cached - If thinking becomes disabled and you pass thinking content in the current tool use turn, the thinking content will be stripped and thinking will remain disabled for that request **Cache invalidation patterns** - Changes to thinking parameters (enabled/disabled or budget allocation) invalidate message cache breakpoints - [Interleaved thinking](#interleaved-thinking) amplifies cache invalidation, as thinking blocks can occur between multiple [tool calls](#extended-thinking-with-tool-use) - System prompts and tools remain cached despite thinking parameter changes or block removal While thinking blocks are removed for caching and context calculations, they must be preserved when continuing conversations with [tool use](#extended-thinking-with-tool-use), especially with [interleaved thinking](#interleaved-thinking). ### Understanding thinking block caching behavior When using extended thinking with tool use, thinking blocks exhibit specific caching behavior that affects token counting: **How it works:** 1. Caching only occurs when you make a subsequent request that includes tool results 2. When the subsequent request is made, the previous conversation history (including thinking blocks) can be cached 3. These cached thinking blocks count as input tokens in your usage metrics when read from the cache 4. When a non-tool-result user block is included, all previous thinking blocks are ignored and stripped from context **Detailed example flow:** **Request 1:** ``` User: "What's the weather in Paris?" ``` **Response 1:** ``` [thinking_block_1] + [tool_use block 1] ``` **Request 2:** ``` User: ["What's the weather in Paris?"], Assistant: [thinking_block_1] + [tool_use block 1], User: [tool_result_1, cache=True] ``` **Response 2:** ``` [thinking_block_2] + [text block 2] ``` Request 2 writes a cache of the request content (not the response). The cache includes the original user message, the first thinking block, tool use block, and the tool result. **Request 3:** ``` User: ["What's the weather in Paris?"], Assistant: [thinking_block_1] + [tool_use block 1], User: [tool_result_1, cache=True], Assistant: [thinking_block_2] + [text block 2], User: [Text response, cache=True] ``` For Claude Opus 4.5 and later, all previous thinking blocks are kept by default. For older models, because a non-tool-result user block was included, all previous thinking blocks are ignored. This request will be processed the same as: ``` User: ["What's the weather in Paris?"], Assistant: [tool_use block 1], User: [tool_result_1, cache=True], Assistant: [text block 2], User: [Text response, cache=True] ``` **Key points:** - This caching behavior happens automatically, even without explicit `cache_control` markers - This behavior is consistent whether using regular thinking or interleaved thinking
```python Python from anthropic import Anthropic import requests from bs4 import BeautifulSoup client = Anthropic() def fetch_article_content(url): response = requests.get(url) soup = BeautifulSoup(response.content, 'html.parser') # Remove script and style elements for script in soup(["script", "style"]): script.decompose() # Get text text = soup.get_text() # Break into lines and remove leading and trailing space on each lines = (line.strip() for line in text.splitlines()) # Break multi-headlines into a line each chunks = (phrase.strip() for line in lines for phrase in line.split(" ")) # Drop blank lines text = '\n'.join(chunk for chunk in chunks if chunk) return text # Fetch the content of the article book_url = "https://www.gutenberg.org/cache/epub/1342/pg1342.txt" book_content = fetch_article_content(book_url) # Use just enough text for caching (first few chapters) LARGE_TEXT = book_content[:5000] SYSTEM_PROMPT=[ { "type": "text", "text": "You are an AI assistant that is tasked with literary analysis. Analyze the following text carefully.", }, { "type": "text", "text": LARGE_TEXT, "cache_control": {"type": "ephemeral"} } ] MESSAGES = [ { "role": "user", "content": "Analyze the tone of this passage." } ] # First request - establish cache print("First request - establishing cache") response1 = client.messages.create( model="claude-sonnet-4-5", max_tokens=20000, thinking={ "type": "enabled", "budget_tokens": 4000 }, system=SYSTEM_PROMPT, messages=MESSAGES ) print(f"First response usage: {response1.usage}") MESSAGES.append({ "role": "assistant", "content": response1.content }) MESSAGES.append({ "role": "user", "content": "Analyze the characters in this passage." }) # Second request - same thinking parameters (cache hit expected) print("\nSecond request - same thinking parameters (cache hit expected)") response2 = client.messages.create( model="claude-sonnet-4-5", max_tokens=20000, thinking={ "type": "enabled", "budget_tokens": 4000 }, system=SYSTEM_PROMPT, messages=MESSAGES ) print(f"Second response usage: {response2.usage}") # Third request - different thinking parameters (cache miss for messages) print("\nThird request - different thinking parameters (cache miss for messages)") response3 = client.messages.create( model="claude-sonnet-4-5", max_tokens=20000, thinking={ "type": "enabled", "budget_tokens": 8000 # Changed thinking budget }, system=SYSTEM_PROMPT, # System prompt remains cached messages=MESSAGES # Messages cache is invalidated ) print(f"Third response usage: {response3.usage}") ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; import axios from 'axios'; import * as cheerio from 'cheerio'; const client = new Anthropic(); async function fetchArticleContent(url: string): Promise { const response = await axios.get(url); const $ = cheerio.load(response.data); // Remove script and style elements $('script, style').remove(); // Get text let text = $.text(); // Break into lines and remove leading and trailing space on each const lines = text.split('\n').map(line => line.trim()); // Drop blank lines text = lines.filter(line => line.length > 0).join('\n'); return text; } // Fetch the content of the article const bookUrl = "https://www.gutenberg.org/cache/epub/1342/pg1342.txt"; const bookContent = await fetchArticleContent(bookUrl); // Use just enough text for caching (first few chapters) const LARGE_TEXT = bookContent.slice(0, 5000); const SYSTEM_PROMPT = [ { type: "text", text: "You are an AI assistant that is tasked with literary analysis. Analyze the following text carefully.", }, { type: "text", text: LARGE_TEXT, cache_control: { type: "ephemeral" } } ]; const MESSAGES = [ { role: "user", content: "Analyze the tone of this passage." } ]; // First request - establish cache console.log("First request - establishing cache"); const response1 = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 20000, thinking: { type: "enabled", budget_tokens: 4000 }, system: SYSTEM_PROMPT, messages: MESSAGES }); console.log(`First response usage: ${response1.usage}`); MESSAGES.push({ role: "assistant", content: response1.content }); MESSAGES.push({ role: "user", content: "Analyze the characters in this passage." }); // Second request - same thinking parameters (cache hit expected) console.log("\nSecond request - same thinking parameters (cache hit expected)"); const response2 = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 20000, thinking: { type: "enabled", budget_tokens: 4000 }, system: SYSTEM_PROMPT, messages: MESSAGES }); console.log(`Second response usage: ${response2.usage}`); // Third request - different thinking parameters (cache miss for messages) console.log("\nThird request - different thinking parameters (cache miss for messages)"); const response3 = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 20000, thinking: { type: "enabled", budget_tokens: 8000 // Changed thinking budget }, system: SYSTEM_PROMPT, // System prompt remains cached messages: MESSAGES // Messages cache is invalidated }); console.log(`Third response usage: ${response3.usage}`); ```
```python Python from anthropic import Anthropic import requests from bs4 import BeautifulSoup client = Anthropic() def fetch_article_content(url): response = requests.get(url) soup = BeautifulSoup(response.content, 'html.parser') # Remove script and style elements for script in soup(["script", "style"]): script.decompose() # Get text text = soup.get_text() # Break into lines and remove leading and trailing space on each lines = (line.strip() for line in text.splitlines()) # Break multi-headlines into a line each chunks = (phrase.strip() for line in lines for phrase in line.split(" ")) # Drop blank lines text = '\n'.join(chunk for chunk in chunks if chunk) return text # Fetch the content of the article book_url = "https://www.gutenberg.org/cache/epub/1342/pg1342.txt" book_content = fetch_article_content(book_url) # Use just enough text for caching (first few chapters) LARGE_TEXT = book_content[:5000] # No system prompt - caching in messages instead MESSAGES = [ { "role": "user", "content": [ { "type": "text", "text": LARGE_TEXT, "cache_control": {"type": "ephemeral"}, }, { "type": "text", "text": "Analyze the tone of this passage." } ] } ] # First request - establish cache print("First request - establishing cache") response1 = client.messages.create( model="claude-sonnet-4-5", max_tokens=20000, thinking={ "type": "enabled", "budget_tokens": 4000 }, messages=MESSAGES ) print(f"First response usage: {response1.usage}") MESSAGES.append({ "role": "assistant", "content": response1.content }) MESSAGES.append({ "role": "user", "content": "Analyze the characters in this passage." }) # Second request - same thinking parameters (cache hit expected) print("\nSecond request - same thinking parameters (cache hit expected)") response2 = client.messages.create( model="claude-sonnet-4-5", max_tokens=20000, thinking={ "type": "enabled", "budget_tokens": 4000 # Same thinking budget }, messages=MESSAGES ) print(f"Second response usage: {response2.usage}") MESSAGES.append({ "role": "assistant", "content": response2.content }) MESSAGES.append({ "role": "user", "content": "Analyze the setting in this passage." }) # Third request - different thinking budget (cache miss expected) print("\nThird request - different thinking budget (cache miss expected)") response3 = client.messages.create( model="claude-sonnet-4-5", max_tokens=20000, thinking={ "type": "enabled", "budget_tokens": 8000 # Different thinking budget breaks cache }, messages=MESSAGES ) print(f"Third response usage: {response3.usage}") ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; import axios from 'axios'; import * as cheerio from 'cheerio'; const client = new Anthropic(); async function fetchArticleContent(url: string): Promise { const response = await axios.get(url); const $ = cheerio.load(response.data); // Remove script and style elements $('script, style').remove(); // Get text let text = $.text(); // Clean up text (break into lines, remove whitespace) const lines = text.split('\n').map(line => line.trim()); const chunks = lines.flatMap(line => line.split(' ').map(phrase => phrase.trim())); text = chunks.filter(chunk => chunk).join('\n'); return text; } async function main() { // Fetch the content of the article const bookUrl = "https://www.gutenberg.org/cache/epub/1342/pg1342.txt"; const bookContent = await fetchArticleContent(bookUrl); // Use just enough text for caching (first few chapters) const LARGE_TEXT = bookContent.substring(0, 5000); // No system prompt - caching in messages instead let MESSAGES = [ { role: "user", content: [ { type: "text", text: LARGE_TEXT, cache_control: {type: "ephemeral"}, }, { type: "text", text: "Analyze the tone of this passage." } ] } ]; // First request - establish cache console.log("First request - establishing cache"); const response1 = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 20000, thinking: { type: "enabled", budget_tokens: 4000 }, messages: MESSAGES }); console.log(`First response usage: `, response1.usage); MESSAGES = [ ...MESSAGES, { role: "assistant", content: response1.content }, { role: "user", content: "Analyze the characters in this passage." } ]; // Second request - same thinking parameters (cache hit expected) console.log("\nSecond request - same thinking parameters (cache hit expected)"); const response2 = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 20000, thinking: { type: "enabled", budget_tokens: 4000 // Same thinking budget }, messages: MESSAGES }); console.log(`Second response usage: `, response2.usage); MESSAGES = [ ...MESSAGES, { role: "assistant", content: response2.content }, { role: "user", content: "Analyze the setting in this passage." } ]; // Third request - different thinking budget (cache miss expected) console.log("\nThird request - different thinking budget (cache miss expected)"); const response3 = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 20000, thinking: { type: "enabled", budget_tokens: 8000 // Different thinking budget breaks cache }, messages: MESSAGES }); console.log(`Third response usage: `, response3.usage); } main().catch(console.error); ``` ```java Java import java.io.IOException; import java.io.InputStream; import java.util.ArrayList; import java.util.List; import java.io.BufferedReader; import java.io.InputStreamReader; import java.net.URL; import java.util.Arrays; import java.util.regex.Pattern; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.beta.messages.*; import com.anthropic.models.beta.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import static java.util.stream.Collectors.joining; import static java.util.stream.Collectors.toList; public class ThinkingCacheExample { public static void main(String[] args) throws IOException { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Fetch the content of the article String bookUrl = "https://www.gutenberg.org/cache/epub/1342/pg1342.txt"; String bookContent = fetchArticleContent(bookUrl); // Use just enough text for caching (first few chapters) String largeText = bookContent.substring(0, 5000); List systemPrompt = List.of( BetaTextBlockParam.builder() .text("You are an AI assistant that is tasked with literary analysis. Analyze the following text carefully.") .build(), BetaTextBlockParam.builder() .text(largeText) .cacheControl(BetaCacheControlEphemeral.builder().build()) .build() ); List messages = new ArrayList<>(); messages.add(BetaMessageParam.builder() .role(BetaMessageParam.Role.USER) .content("Analyze the tone of this passage.") .build()); // First request - establish cache System.out.println("First request - establishing cache"); BetaMessage response1 = client.beta().messages().create( MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(20000) .thinking(BetaThinkingConfigEnabled.builder().budgetTokens(4000).build()) .systemOfBetaTextBlockParams(systemPrompt) .messages(messages) .build() ); System.out.println("First response usage: " + response1.usage()); // Second request - same thinking parameters (cache hit expected) System.out.println("\nSecond request - same thinking parameters (cache hit expected)"); BetaMessage response2 = client.beta().messages().create( MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(20000) .thinking(BetaThinkingConfigEnabled.builder().budgetTokens(4000).build()) .systemOfBetaTextBlockParams(systemPrompt) .addMessage(response1) .addUserMessage("Analyze the characters in this passage.") .messages(messages) .build() ); System.out.println("Second response usage: " + response2.usage()); // Third request - different thinking budget (cache hit expected because system prompt caching) System.out.println("\nThird request - different thinking budget (cache hit expected)"); BetaMessage response3 = client.beta().messages().create( MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(20000) .thinking(BetaThinkingConfigEnabled.builder().budgetTokens(8000).build()) .systemOfBetaTextBlockParams(systemPrompt) .addMessage(response1) .addUserMessage("Analyze the characters in this passage.") .addMessage(response2) .addUserMessage("Analyze the setting in this passage.") .build() ); System.out.println("Third response usage: " + response3.usage()); } private static String fetchArticleContent(String url) throws IOException { // Fetch HTML content String htmlContent = fetchHtml(url); // Remove script and style elements String noScriptStyle = removeElements(htmlContent, "script", "style"); // Extract text (simple approach - remove HTML tags) String text = removeHtmlTags(noScriptStyle); // Clean up text (break into lines, remove whitespace) List lines = Arrays.asList(text.split("\n")); List trimmedLines = lines.stream() .map(String::trim) .collect(toList()); // Split on double spaces and flatten List chunks = trimmedLines.stream() .flatMap(line -> Arrays.stream(line.split(" ")) .map(String::trim)) .collect(toList()); // Filter empty chunks and join with newlines return chunks.stream() .filter(chunk -> !chunk.isEmpty()) .collect(joining("\n")); } /** * Fetches HTML content from a URL */ private static String fetchHtml(String urlString) throws IOException { try (InputStream inputStream = new URL(urlString).openStream()) { StringBuilder content = new StringBuilder(); try (BufferedReader reader = new BufferedReader( new InputStreamReader(inputStream))) { String line; while ((line = reader.readLine()) != null) { content.append(line).append("\n"); } } return content.toString(); } } /** * Removes specified HTML elements and their content */ private static String removeElements(String html, String... elementNames) { String result = html; for (String element : elementNames) { // Pattern to match ... and self-closing tags String pattern = "<" + element + "\\s*[^>]*>.*?|<" + element + "\\s*[^>]*/?>"; result = Pattern.compile(pattern, Pattern.DOTALL).matcher(result).replaceAll(""); } return result; } /** * Removes all HTML tags from content */ private static String removeHtmlTags(String html) { // Replace
and

tags with newlines for better text formatting String withLineBreaks = html.replaceAll("|]*>", "\n"); // Remove remaining HTML tags String noTags = withLineBreaks.replaceAll("<[^>]*>", ""); // Decode HTML entities (simplified for common entities) return decodeHtmlEntities(noTags); } /** * Simple HTML entity decoder for common entities */ private static String decodeHtmlEntities(String text) { return text .replaceAll(" ", " ") .replaceAll("&", "&") .replaceAll("<", "<") .replaceAll(">", ">") .replaceAll(""", "\"") .replaceAll("'", "'") .replaceAll("…", "...") .replaceAll("—", "—"); } } ``` Here is the output of the script (you may see slightly different numbers) ``` First request - establishing cache First response usage: { cache_creation_input_tokens: 1370, cache_read_input_tokens: 0, input_tokens: 17, output_tokens: 700 } Second request - same thinking parameters (cache hit expected) Second response usage: { cache_creation_input_tokens: 0, cache_read_input_tokens: 1370, input_tokens: 303, output_tokens: 874 } Third request - different thinking budget (cache miss expected) Third response usage: { cache_creation_input_tokens: 1370, cache_read_input_tokens: 0, input_tokens: 747, output_tokens: 619 } ``` This example demonstrates that when caching is set up in the messages array, changing the thinking parameters (budget_tokens increased from 4000 to 8000) **invalidates the cache**. The third request shows no cache hit with `cache_creation_input_tokens=1370` and `cache_read_input_tokens=0`, proving that message-based caching is invalidated when thinking parameters change.

## Max tokens and context window size with extended thinking In older Claude models (prior to Claude Sonnet 3.7), if the sum of prompt tokens and `max_tokens` exceeded the model's context window, the system would automatically adjust `max_tokens` to fit within the context limit. This meant you could set a large `max_tokens` value and the system would silently reduce it as needed. With Claude 3.7 and 4 models, `max_tokens` (which includes your thinking budget when thinking is enabled) is enforced as a strict limit. The system will now return a validation error if prompt tokens + `max_tokens` exceeds the context window size. You can read through our [guide on context windows](/docs/en/build-with-claude/context-windows) for a more thorough deep dive. ### The context window with extended thinking When calculating context window usage with thinking enabled, there are some considerations to be aware of: - Thinking blocks from previous turns are stripped and not counted towards your context window - Current turn thinking counts towards your `max_tokens` limit for that turn The diagram below demonstrates the specialized token management when extended thinking is enabled: ![Context window diagram with extended thinking](/docs/images/context-window-thinking.svg) The effective context window is calculated as: ``` context window = (current input tokens - previous thinking tokens) + (thinking tokens + encrypted thinking tokens + text output tokens) ``` We recommend using the [token counting API](/docs/en/build-with-claude/token-counting) to get accurate token counts for your specific use case, especially when working with multi-turn conversations that include thinking. ### The context window with extended thinking and tool use When using extended thinking with tool use, thinking blocks must be explicitly preserved and returned with the tool results. The effective context window calculation for extended thinking with tool use becomes: ``` context window = (current input tokens + previous thinking tokens + tool use tokens) + (thinking tokens + encrypted thinking tokens + text output tokens) ``` The diagram below illustrates token management for extended thinking with tool use: ![Context window diagram with extended thinking and tool use](/docs/images/context-window-thinking-tools.svg) ### Managing tokens with extended thinking Given the context window and `max_tokens` behavior with extended thinking Claude 3.7 and 4 models, you may need to: - More actively monitor and manage your token usage - Adjust `max_tokens` values as your prompt length changes - Potentially use the [token counting endpoints](/docs/en/build-with-claude/token-counting) more frequently - Be aware that previous thinking blocks don't accumulate in your context window This change has been made to provide more predictable and transparent behavior, especially as maximum token limits have increased significantly. ## Thinking encryption Full thinking content is encrypted and returned in the `signature` field. This field is used to verify that thinking blocks were generated by Claude when passed back to the API. It is only strictly necessary to send back thinking blocks when using [tools with extended thinking](#extended-thinking-with-tool-use). Otherwise you can omit thinking blocks from previous turns, or let the API strip them for you if you pass them back. If sending back thinking blocks, we recommend passing everything back as you received it for consistency and to avoid potential issues. Here are some important considerations on thinking encryption: - When [streaming responses](#streaming-thinking), the signature is added via a `signature_delta` inside a `content_block_delta` event just before the `content_block_stop` event. - `signature` values are significantly longer in Claude 4 models than in previous models. - The `signature` field is an opaque field and should not be interpreted or parsed - it exists solely for verification purposes. - `signature` values are compatible across platforms (Claude APIs, [Amazon Bedrock](/docs/en/build-with-claude/claude-on-amazon-bedrock), and [Vertex AI](/docs/en/build-with-claude/claude-on-vertex-ai)). Values generated on one platform will be compatible with another. ### Thinking redaction Occasionally Claude's internal reasoning will be flagged by our safety systems. When this occurs, we encrypt some or all of the `thinking` block and return it to you as a `redacted_thinking` block. `redacted_thinking` blocks are decrypted when passed back to the API, allowing Claude to continue its response without losing context. When building customer-facing applications that use extended thinking: - Be aware that redacted thinking blocks contain encrypted content that isn't human-readable - Consider providing a simple explanation like: "Some of Claude's internal reasoning has been automatically encrypted for safety reasons. This doesn't affect the quality of responses." - If showing thinking blocks to users, you can filter out redacted blocks while preserving normal thinking blocks - Be transparent that using extended thinking features may occasionally result in some reasoning being encrypted - Implement appropriate error handling to gracefully manage redacted thinking without breaking your UI Here's an example showing both normal and redacted thinking blocks: ```json { "content": [ { "type": "thinking", "thinking": "Let me analyze this step by step...", "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...." }, { "type": "redacted_thinking", "data": "EmwKAhgBEgy3va3pzix/LafPsn4aDFIT2Xlxh0L5L8rLVyIwxtE3rAFBa8cr3qpPkNRj2YfWXGmKDxH4mPnZ5sQ7vB9URj2pLmN3kF8/dW5hR7xJ0aP1oLs9yTcMnKVf2wRpEGjH9XZaBt4UvDcPrQ..." }, { "type": "text", "text": "Based on my analysis..." } ] } ``` Seeing redacted thinking blocks in your output is expected behavior. The model can still use this redacted reasoning to inform its responses while maintaining safety guardrails. If you need to test redacted thinking handling in your application, you can use this special test string as your prompt: `ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING_46C9A13E193C177646C7398A98432ECCCE4C1253D5E2D82641AC0E52CC2876CB` When passing `thinking` and `redacted_thinking` blocks back to the API in a multi-turn conversation, you must include the complete unmodified block back to the API for the last assistant turn. This is critical for maintaining the model's reasoning flow. We suggest always passing back all thinking blocks to the API. For more details, see the [Preserving thinking blocks](#preserving-thinking-blocks) section above.
This example demonstrates how to handle `redacted_thinking` blocks that may appear in responses when Claude's internal reasoning contains content flagged by safety systems: ```python Python import anthropic client = anthropic.Anthropic() # Using a special prompt that triggers redacted thinking (for demonstration purposes only) response = client.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=16000, thinking={ "type": "enabled", "budget_tokens": 10000 }, messages=[{ "role": "user", "content": "ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING_46C9A13E193C177646C7398A98432ECCCE4C1253D5E2D82641AC0E52CC2876CB" }] ) # Identify redacted thinking blocks has_redacted_thinking = any( block.type == "redacted_thinking" for block in response.content ) if has_redacted_thinking: print("Response contains redacted thinking blocks") # These blocks are still usable in subsequent requests # Extract all blocks (both redacted and non-redacted) all_thinking_blocks = [ block for block in response.content if block.type in ["thinking", "redacted_thinking"] ] # When passing to subsequent requests, include all blocks without modification # This preserves the integrity of Claude's reasoning print(f"Found {len(all_thinking_blocks)} thinking blocks total") print(f"These blocks are still billable as output tokens") ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); // Using a special prompt that triggers redacted thinking (for demonstration purposes only) const response = await client.messages.create({ model: "claude-sonnet-4-5-20250929", max_tokens: 16000, thinking: { type: "enabled", budget_tokens: 10000 }, messages: [{ role: "user", content: "ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING_46C9A13E193C177646C7398A98432ECCCE4C1253D5E2D82641AC0E52CC2876CB" }] }); // Identify redacted thinking blocks const hasRedactedThinking = response.content.some( block => block.type === "redacted_thinking" ); if (hasRedactedThinking) { console.log("Response contains redacted thinking blocks"); // These blocks are still usable in subsequent requests // Extract all blocks (both redacted and non-redacted) const allThinkingBlocks = response.content.filter( block => block.type === "thinking" || block.type === "redacted_thinking" ); // When passing to subsequent requests, include all blocks without modification // This preserves the integrity of Claude's reasoning console.log(`Found ${allThinkingBlocks.length} thinking blocks total`); console.log(`These blocks are still billable as output tokens`); } ``` ```java Java import java.util.List; import static java.util.stream.Collectors.toList; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.beta.messages.BetaContentBlock; import com.anthropic.models.beta.messages.BetaMessage; import com.anthropic.models.beta.messages.MessageCreateParams; import com.anthropic.models.beta.messages.BetaThinkingConfigEnabled; import com.anthropic.models.messages.Model; public class RedactedThinkingExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Using a special prompt that triggers redacted thinking (for demonstration purposes only) BetaMessage response = client.beta().messages().create( MessageCreateParams.builder() .model(Model.CLAUDE_SONNET_4_5) .maxTokens(16000) .thinking(BetaThinkingConfigEnabled.builder().budgetTokens(10000).build()) .addUserMessage("ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING_46C9A13E193C177646C7398A98432ECCCE4C1253D5E2D82641AC0E52CC2876CB") .build() ); // Identify redacted thinking blocks boolean hasRedactedThinking = response.content().stream() .anyMatch(BetaContentBlock::isRedactedThinking); if (hasRedactedThinking) { System.out.println("Response contains redacted thinking blocks"); // These blocks are still usable in subsequent requests // Extract all blocks (both redacted and non-redacted) List allThinkingBlocks = response.content().stream() .filter(block -> block.isThinking() || block.isRedactedThinking()) .collect(toList()); // When passing to subsequent requests, include all blocks without modification // This preserves the integrity of Claude's reasoning System.out.println("Found " + allThinkingBlocks.size() + " thinking blocks total"); System.out.println("These blocks are still billable as output tokens"); } } } ``` Try in Console
## Differences in thinking across model versions The Messages API handles thinking differently across Claude Sonnet 3.7 and Claude 4 models, primarily in redaction and summarization behavior. See the table below for a condensed comparison: | Feature | Claude Sonnet 3.7 | Claude 4 Models (pre-Opus 4.5) | Claude Opus 4.5 and later | |---------|------------------|-------------------------------|--------------------------| | **Thinking Output** | Returns full thinking output | Returns summarized thinking | Returns summarized thinking | | **Interleaved Thinking** | Not supported | Supported with `interleaved-thinking-2025-05-14` beta header | Supported with `interleaved-thinking-2025-05-14` beta header | | **Thinking Block Preservation** | Not preserved across turns | Not preserved across turns | **Preserved by default** (enables cache optimization, token savings) | ### Thinking block preservation in Claude Opus 4.5 Claude Opus 4.5 introduces a new default behavior: **thinking blocks from previous assistant turns are preserved in model context by default**. This differs from earlier models, which remove thinking blocks from prior turns. **Benefits of thinking block preservation:** - **Cache optimization**: When using tool use, preserved thinking blocks enable cache hits as they are passed back with tool results and cached incrementally across the assistant turn, resulting in token savings in multi-step workflows - **No intelligence impact**: Preserving thinking blocks has no negative effect on model performance **Important considerations:** - **Context usage**: Long conversations will consume more context space since thinking blocks are retained in context - **Automatic behavior**: This is the default behavior for Claude Opus 4.5—no code changes or beta headers required - **Backward compatibility**: To leverage this feature, continue passing complete, unmodified thinking blocks back to the API as you would for tool use For earlier models (Claude Sonnet 4.5, Opus 4.1, etc.), thinking blocks from previous turns continue to be removed from context. The existing behavior described in the [Extended thinking with prompt caching](#extended-thinking-with-prompt-caching) section applies to those models. ## Pricing For complete pricing information including base rates, cache writes, cache hits, and output tokens, see the [pricing page](/docs/en/about-claude/pricing). The thinking process incurs charges for: - Tokens used during thinking (output tokens) - Thinking blocks from the last assistant turn included in subsequent requests (input tokens) - Standard text output tokens When extended thinking is enabled, a specialized system prompt is automatically included to support this feature. When using summarized thinking: - **Input tokens**: Tokens in your original request (excludes thinking tokens from previous turns) - **Output tokens (billed)**: The original thinking tokens that Claude generated internally - **Output tokens (visible)**: The summarized thinking tokens you see in the response - **No charge**: Tokens used to generate the summary The billed output token count will **not** match the visible token count in the response. You are billed for the full thinking process, not the summary you see. ## Best practices and considerations for extended thinking ### Working with thinking budgets - **Budget optimization:** The minimum budget is 1,024 tokens. We suggest starting at the minimum and increasing the thinking budget incrementally to find the optimal range for your use case. Higher token counts enable more comprehensive reasoning but with diminishing returns depending on the task. Increasing the budget can improve response quality at the tradeoff of increased latency. For critical tasks, test different settings to find the optimal balance. Note that the thinking budget is a target rather than a strict limit—actual token usage may vary based on the task. - **Starting points:** Start with larger thinking budgets (16k+ tokens) for complex tasks and adjust based on your needs. - **Large budgets:** For thinking budgets above 32k, we recommend using [batch processing](/docs/en/build-with-claude/batch-processing) to avoid networking issues. Requests pushing the model to think above 32k tokens causes long running requests that might run up against system timeouts and open connection limits. - **Token usage tracking:** Monitor thinking token usage to optimize costs and performance. ### Performance considerations - **Response times:** Be prepared for potentially longer response times due to the additional processing required for the reasoning process. Factor in that generating thinking blocks may increase overall response time. - **Streaming requirements:** Streaming is required when `max_tokens` is greater than 21,333. When streaming, be prepared to handle both thinking and text content blocks as they arrive. ### Feature compatibility - Thinking isn't compatible with `temperature` or `top_k` modifications as well as [forced tool use](/docs/en/agents-and-tools/tool-use/implement-tool-use#forcing-tool-use). - When thinking is enabled, you can set `top_p` to values between 1 and 0.95. - You cannot pre-fill responses when thinking is enabled. - Changes to the thinking budget invalidate cached prompt prefixes that include messages. However, cached system prompts and tool definitions will continue to work when thinking parameters change. ### Usage guidelines - **Task selection:** Use extended thinking for particularly complex tasks that benefit from step-by-step reasoning like math, coding, and analysis. - **Context handling:** You do not need to remove previous thinking blocks yourself. The Claude API automatically ignores thinking blocks from previous turns and they are not included when calculating context usage. - **Prompt engineering:** Review our [extended thinking prompting tips](/docs/en/build-with-claude/prompt-engineering/extended-thinking-tips) if you want to maximize Claude's thinking capabilities. ## Next steps Explore practical examples of thinking in our cookbook. Learn prompt engineering best practices for extended thinking. --- # Citations URL: https://platform.claude.com/docs/en/build-with-claude/citations # Citations --- Claude is capable of providing detailed citations when answering questions about documents, helping you track and verify information sources in responses. All [active models](/docs/en/about-claude/models/overview) support citations, with the exception of Haiku 3. *Citations with Claude Sonnet 3.7* Claude Sonnet 3.7 may be less likely to make citations compared to other Claude models without more explicit instructions from the user. When using citations with Claude Sonnet 3.7, we recommend including additional instructions in the `user` turn, like `"Use citations to back up your answer."` for example. We've also observed that when the model is asked to structure its response, it is unlikely to use citations unless explicitly told to use citations within that format. For example, if the model is asked to use `` tags in its response, you should add something like `"Always use citations in your answer, even within tags."` Please share your feedback and suggestions about the citations feature using this [form](https://forms.gle/9n9hSrKnKe3rpowH9). Here's an example of how to use citations with the Messages API: ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ { "role": "user", "content": [ { "type": "document", "source": { "type": "text", "media_type": "text/plain", "data": "The grass is green. The sky is blue." }, "title": "My Document", "context": "This is a trustworthy document.", "citations": {"enabled": true} }, { "type": "text", "text": "What color is the grass and sky?" } ] } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "document", "source": { "type": "text", "media_type": "text/plain", "data": "The grass is green. The sky is blue." }, "title": "My Document", "context": "This is a trustworthy document.", "citations": {"enabled": True} }, { "type": "text", "text": "What color is the grass and sky?" } ] } ] ) print(response) ``` ```java Java import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.*; public class DocumentExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); PlainTextSource source = PlainTextSource.builder() .data("The grass is green. The sky is blue.") .build(); DocumentBlockParam documentParam = DocumentBlockParam.builder() .source(source) .title("My Document") .context("This is a trustworthy document.") .citations(CitationsConfigParam.builder().enabled(true).build()) .build(); TextBlockParam textBlockParam = TextBlockParam.builder() .text("What color is the grass and sky?") .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_SONNET_4_20250514) .maxTokens(1024) .addUserMessageOfBlockParams(List.of(ContentBlockParam.ofDocument(documentParam), ContentBlockParam.ofText(textBlockParam))) .build(); Message message = client.messages().create(params); System.out.println(message); } } ``` **Comparison with prompt-based approaches** In comparison with prompt-based citations solutions, the citations feature has the following advantages: - **Cost savings:** If your prompt-based approach asks Claude to output direct quotes, you may see cost savings due to the fact that `cited_text` does not count towards your output tokens. - **Better citation reliability:** Because we parse citations into the respective response formats mentioned above and extract `cited_text`, citations are guaranteed to contain valid pointers to the provided documents. - **Improved citation quality:** In our evals, we found the citations feature to be significantly more likely to cite the most relevant quotes from documents as compared to purely prompt-based approaches. --- ## How citations work Integrate citations with Claude in these steps: - Include documents in any of the supported formats: [PDFs](#pdf-documents), [plain text](#plain-text-documents), or [custom content](#custom-content-documents) documents - Set `citations.enabled=true` on each of your documents. Currently, citations must be enabled on all or none of the documents within a request. - Note that only text citations are currently supported and image citations are not yet possible. - Document contents are "chunked" in order to define the minimum granularity of possible citations. For example, sentence chunking would allow Claude to cite a single sentence or chain together multiple consecutive sentences to cite a paragraph (or longer)! - **For PDFs:** Text is extracted as described in [PDF Support](/docs/en/build-with-claude/pdf-support) and content is chunked into sentences. Citing images from PDFs is not currently supported. - **For plain text documents:** Content is chunked into sentences that can be cited from. - **For custom content documents:** Your provided content blocks are used as-is and no further chunking is done. - Responses may now include multiple text blocks where each text block can contain a claim that Claude is making and a list of citations that support the claim. - Citations reference specific locations in source documents. The format of these citations are dependent on the type of document being cited from. - **For PDFs:** citations will include the page number range (1-indexed). - **For plain text documents:** Citations will include the character index range (0-indexed). - **For custom content documents:** Citations will include the content block index range (0-indexed) corresponding to the original content list provided. - Document indices are provided to indicate the reference source and are 0-indexed according to the list of all documents in your original request. **Automatic chunking vs custom content** By default, plain text and PDF documents are automatically chunked into sentences. If you need more control over citation granularity (e.g., for bullet points or transcripts), use custom content documents instead. See [Document Types](#document-types) for more details. For example, if you want Claude to be able to cite specific sentences from your RAG chunks, you should put each RAG chunk into a plain text document. Otherwise, if you do not want any further chunking to be done, or if you want to customize any additional chunking, you can put RAG chunks into custom content document(s). ### Citable vs non-citable content - Text found within a document's `source` content can be cited from. - `title` and `context` are optional fields that will be passed to the model but not used towards cited content. - `title` is limited in length so you may find the `context` field to be useful in storing any document metadata as text or stringified json. ### Citation indices - Document indices are 0-indexed from the list of all document content blocks in the request (spanning across all messages). - Character indices are 0-indexed with exclusive end indices. - Page numbers are 1-indexed with exclusive end page numbers. - Content block indices are 0-indexed with exclusive end indices from the `content` list provided in the custom content document. ### Token costs - Enabling citations incurs a slight increase in input tokens due to system prompt additions and document chunking. - However, the citations feature is very efficient with output tokens. Under the hood, the model is outputting citations in a standardized format that are then parsed into cited text and document location indices. The `cited_text` field is provided for convenience and does not count towards output tokens. - When passed back in subsequent conversation turns, `cited_text` is also not counted towards input tokens. ### Feature compatibility Citations works in conjunction with other API features including [prompt caching](/docs/en/build-with-claude/prompt-caching), [token counting](/docs/en/build-with-claude/token-counting) and [batch processing](/docs/en/build-with-claude/batch-processing). **Citations and Structured Outputs are incompatible** Citations cannot be used together with [Structured Outputs](/docs/en/build-with-claude/structured-outputs). If you enable citations on any user-provided document (Document blocks or RequestSearchResultBlock) and also include the `output_format` parameter, the API will return a 400 error. This is because citations require interleaving citation blocks with text output, which is incompatible with the strict JSON schema constraints of structured outputs. #### Using Prompt Caching with Citations Citations and prompt caching can be used together effectively. The citation blocks generated in responses cannot be cached directly, but the source documents they reference can be cached. To optimize performance, apply `cache_control` to your top-level document content blocks. ```python Python import anthropic client = anthropic.Anthropic() # Long document content (e.g., technical documentation) long_document = "This is a very long document with thousands of words..." + " ... " * 1000 # Minimum cacheable length response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "document", "source": { "type": "text", "media_type": "text/plain", "data": long_document }, "citations": {"enabled": True}, "cache_control": {"type": "ephemeral"} # Cache the document content }, { "type": "text", "text": "What does this document say about API features?" } ] } ] ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); // Long document content (e.g., technical documentation) const longDocument = "This is a very long document with thousands of words..." + " ... ".repeat(1000); // Minimum cacheable length const response = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, messages: [ { role: "user", content: [ { type: "document", source: { type: "text", media_type: "text/plain", data: longDocument }, citations: { enabled: true }, cache_control: { type: "ephemeral" } // Cache the document content }, { type: "text", text: "What does this document say about API features?" } ] } ] }); ``` ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ { "role": "user", "content": [ { "type": "document", "source": { "type": "text", "media_type": "text/plain", "data": "This is a very long document with thousands of words..." }, "citations": {"enabled": true}, "cache_control": {"type": "ephemeral"} }, { "type": "text", "text": "What does this document say about API features?" } ] } ] }' ``` In this example: - The document content is cached using `cache_control` on the document block - Citations are enabled on the document - Claude can generate responses with citations while benefiting from cached document content - Subsequent requests using the same document will benefit from the cached content ## Document Types ### Choosing a document type We support three document types for citations. Documents can be provided directly in the message (base64, text, or URL) or uploaded via the [Files API](/docs/en/build-with-claude/files) and referenced by `file_id`: | Type | Best for | Chunking | Citation format | | :--- | :--- | :--- | :--- | | Plain text | Simple text documents, prose | Sentence | Character indices (0-indexed) | | PDF | PDF files with text content | Sentence | Page numbers (1-indexed) | | Custom content | Lists, transcripts, special formatting, more granular citations | No additional chunking | Block indices (0-indexed) | .csv, .xlsx, .docx, .md, and .txt files are not supported as document blocks. Convert these to plain text and include directly in message content. See [Working with other file formats](/docs/en/build-with-claude/files#working-with-other-file-formats). ### Plain text documents Plain text documents are automatically chunked into sentences. You can provide them inline or by reference with their `file_id`: ```python { "type": "document", "source": { "type": "text", "media_type": "text/plain", "data": "Plain text content..." }, "title": "Document Title", # optional "context": "Context about the document that will not be cited from", # optional "citations": {"enabled": True} } ``` ```python { "type": "document", "source": { "type": "file", "file_id": "file_011CNvxoj286tYUAZFiZMf1U" }, "title": "Document Title", # optional "context": "Context about the document that will not be cited from", # optional "citations": {"enabled": True} } ```
```python { "type": "char_location", "cited_text": "The exact text being cited", # not counted towards output tokens "document_index": 0, "document_title": "Document Title", "start_char_index": 0, # 0-indexed "end_char_index": 50 # exclusive } ```
### PDF documents PDF documents can be provided as base64-encoded data or by `file_id`. PDF text is extracted and chunked into sentences. As image citations are not yet supported, PDFs that are scans of documents and do not contain extractable text will not be citable. ```python { "type": "document", "source": { "type": "base64", "media_type": "application/pdf", "data": base64_encoded_pdf_data }, "title": "Document Title", # optional "context": "Context about the document that will not be cited from", # optional "citations": {"enabled": True} } ``` ```python { "type": "document", "source": { "type": "file", "file_id": "file_011CNvxoj286tYUAZFiZMf1U" }, "title": "Document Title", # optional "context": "Context about the document that will not be cited from", # optional "citations": {"enabled": True} } ```
```python { "type": "page_location", "cited_text": "The exact text being cited", # not counted towards output tokens "document_index": 0, "document_title": "Document Title", "start_page_number": 1, # 1-indexed "end_page_number": 2 # exclusive } ```
### Custom content documents Custom content documents give you control over citation granularity. No additional chunking is done and chunks are provided to the model according to the content blocks provided. ```python { "type": "document", "source": { "type": "content", "content": [ {"type": "text", "text": "First chunk"}, {"type": "text", "text": "Second chunk"} ] }, "title": "Document Title", # optional "context": "Context about the document that will not be cited from", # optional "citations": {"enabled": True} } ```
```python { "type": "content_block_location", "cited_text": "The exact text being cited", # not counted towards output tokens "document_index": 0, "document_title": "Document Title", "start_block_index": 0, # 0-indexed "end_block_index": 1 # exclusive } ```
--- ## Response Structure When citations are enabled, responses include multiple text blocks with citations: ```python { "content": [ { "type": "text", "text": "According to the document, " }, { "type": "text", "text": "the grass is green", "citations": [{ "type": "char_location", "cited_text": "The grass is green.", "document_index": 0, "document_title": "Example Document", "start_char_index": 0, "end_char_index": 20 }] }, { "type": "text", "text": " and " }, { "type": "text", "text": "the sky is blue", "citations": [{ "type": "char_location", "cited_text": "The sky is blue.", "document_index": 0, "document_title": "Example Document", "start_char_index": 20, "end_char_index": 36 }] }, { "type": "text", "text": ". Information from page 5 states that ", }, { "type": "text", "text": "water is essential", "citations": [{ "type": "page_location", "cited_text": "Water is essential for life.", "document_index": 1, "document_title": "PDF Document", "start_page_number": 5, "end_page_number": 6 }] }, { "type": "text", "text": ". The custom document mentions ", }, { "type": "text", "text": "important findings", "citations": [{ "type": "content_block_location", "cited_text": "These are important findings.", "document_index": 2, "document_title": "Custom Content Document", "start_block_index": 0, "end_block_index": 1 }] } ] } ``` ### Streaming Support For streaming responses, we've added a `citations_delta` type that contains a single citation to be added to the `citations` list on the current `text` content block.
```python event: message_start data: {"type": "message_start", ...} event: content_block_start data: {"type": "content_block_start", "index": 0, ...} event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "According to..."}} event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "citations_delta", "citation": { "type": "char_location", "cited_text": "...", "document_index": 0, ... }}} event: content_block_stop data: {"type": "content_block_stop", "index": 0} event: message_stop data: {"type": "message_stop"} ```
--- # Context editing URL: https://platform.claude.com/docs/en/build-with-claude/context-editing # Context editing Automatically manage conversation context as it grows with context editing. --- ## Overview Context editing allows you to automatically manage conversation context as it grows, helping you optimize costs and stay within context window limits. You can use server-side API strategies, client-side SDK features, or both together. | Approach | Where it runs | Strategies | How it works | |----------|---------------|------------|--------------| | **Server-side** | API | Tool result clearing (`clear_tool_uses_20250919`)
Thinking block clearing (`clear_thinking_20251015`) | Applied before the prompt reaches Claude. Clears specific content from conversation history. Each strategy can be configured independently. | | **Client-side** | SDK | Compaction | Available in [Python and TypeScript SDKs](/docs/en/api/client-sdks) when using [`tool_runner`](/docs/en/agents-and-tools/tool-use/implement-tool-use#tool-runner-beta). Generates a summary and replaces full conversation history. See [Compaction](#client-side-compaction-sdk) below. | ## Server-side strategies Context editing is currently in beta with support for tool result clearing and thinking block clearing. To enable it, use the beta header `context-management-2025-06-27` in your API requests. Please reach out through our [feedback form](https://forms.gle/YXC2EKGMhjN1c4L88) to share your feedback on this feature. ### Tool result clearing The `clear_tool_uses_20250919` strategy clears tool results when conversation context grows beyond your configured threshold. When activated, the API automatically clears the oldest tool results in chronological order, replacing them with placeholder text to let Claude know the tool result was removed. By default, only tool results are cleared. You can optionally clear both tool results and tool calls (the tool use parameters) by setting `clear_tool_inputs` to true. ### Thinking block clearing The `clear_thinking_20251015` strategy manages `thinking` blocks in conversations when extended thinking is enabled. This strategy automatically clears older thinking blocks from previous turns. **Default behavior**: When extended thinking is enabled without configuring the `clear_thinking_20251015` strategy, the API automatically keeps only the thinking blocks from the last assistant turn (equivalent to `keep: {type: "thinking_turns", value: 1}`). To maximize cache hits, preserve all thinking blocks by setting `keep: "all"`. An assistant conversation turn may include multiple content blocks (e.g. when using tools) and multiple thinking blocks (e.g. with [interleaved thinking](/docs/en/build-with-claude/extended-thinking#interleaved-thinking)). ### Context editing happens server-side Context editing is applied server-side before the prompt reaches Claude. Your client application maintains the full, unmodified conversation history—you do not need to sync your client state with the edited version. Continue managing your full conversation history locally as you normally would. ### Context editing and prompt caching Context editing's interaction with [prompt caching](/docs/en/build-with-claude/prompt-caching) varies by strategy: - **Tool result clearing**: Invalidates cached prompt prefixes when content is cleared. To account for this, we recommend clearing enough tokens to make the cache invalidation worthwhile. Use the `clear_at_least` parameter to ensure a minimum number of tokens is cleared each time. You'll incur cache write costs each time content is cleared, but subsequent requests can reuse the newly cached prefix. - **Thinking block clearing**: When thinking blocks are **kept** in context (not cleared), the prompt cache is preserved, enabling cache hits and reducing input token costs. When thinking blocks are **cleared**, the cache is invalidated at the point where clearing occurs. Configure the `keep` parameter based on whether you want to prioritize cache performance or context window availability. ## Supported models Context editing is available on: - Claude Opus 4.5 (`claude-opus-4-5-20251101`) - Claude Opus 4.1 (`claude-opus-4-1-20250805`) - Claude Opus 4 (`claude-opus-4-20250514`) - Claude Sonnet 4.5 (`claude-sonnet-4-5-20250929`) - Claude Sonnet 4 (`claude-sonnet-4-20250514`) - Claude Haiku 4.5 (`claude-haiku-4-5-20251001`) ## Tool result clearing usage The simplest way to enable tool result clearing is to specify only the strategy type, as all other [configuration options](#configuration-options-for-tool-result-clearing) will use their default values: ```bash cURL curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --header "anthropic-beta: context-management-2025-06-27" \ --data '{ "model": "claude-sonnet-4-5", "max_tokens": 4096, "messages": [ { "role": "user", "content": "Search for recent developments in AI" } ], "tools": [ { "type": "web_search_20250305", "name": "web_search" } ], "context_management": { "edits": [ {"type": "clear_tool_uses_20250919"} ] } }' ``` ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5", max_tokens=4096, messages=[ { "role": "user", "content": "Search for recent developments in AI" } ], tools=[ { "type": "web_search_20250305", "name": "web_search" } ], betas=["context-management-2025-06-27"], context_management={ "edits": [ {"type": "clear_tool_uses_20250919"} ] } ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY, }); const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", max_tokens: 4096, messages: [ { role: "user", content: "Search for recent developments in AI" } ], tools: [ { type: "web_search_20250305", name: "web_search" } ], context_management: { edits: [ { type: "clear_tool_uses_20250919" } ] }, betas: ["context-management-2025-06-27"] }); ``` ### Advanced configuration You can customize the tool result clearing behavior with additional parameters: ```bash cURL curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --header "anthropic-beta: context-management-2025-06-27" \ --data '{ "model": "claude-sonnet-4-5", "max_tokens": 4096, "messages": [ { "role": "user", "content": "Create a simple command line calculator app using Python" } ], "tools": [ { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool", "max_characters": 10000 }, { "type": "web_search_20250305", "name": "web_search", "max_uses": 3 } ], "context_management": { "edits": [ { "type": "clear_tool_uses_20250919", "trigger": { "type": "input_tokens", "value": 30000 }, "keep": { "type": "tool_uses", "value": 3 }, "clear_at_least": { "type": "input_tokens", "value": 5000 }, "exclude_tools": ["web_search"] } ] } }' ``` ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5", max_tokens=4096, messages=[ { "role": "user", "content": "Create a simple command line calculator app using Python" } ], tools=[ { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool", "max_characters": 10000 }, { "type": "web_search_20250305", "name": "web_search", "max_uses": 3 } ], betas=["context-management-2025-06-27"], context_management={ "edits": [ { "type": "clear_tool_uses_20250919", # Trigger clearing when threshold is exceeded "trigger": { "type": "input_tokens", "value": 30000 }, # Number of tool uses to keep after clearing "keep": { "type": "tool_uses", "value": 3 }, # Optional: Clear at least this many tokens "clear_at_least": { "type": "input_tokens", "value": 5000 }, # Exclude these tools from being cleared "exclude_tools": ["web_search"] } ] } ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY, }); const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", max_tokens: 4096, messages: [ { role: "user", content: "Create a simple command line calculator app using Python" } ], tools: [ { type: "text_editor_20250728", name: "str_replace_based_edit_tool", max_characters: 10000 }, { type: "web_search_20250305", name: "web_search", max_uses: 3 } ], betas: ["context-management-2025-06-27"], context_management: { edits: [ { type: "clear_tool_uses_20250919", // Trigger clearing when threshold is exceeded trigger: { type: "input_tokens", value: 30000 }, // Number of tool uses to keep after clearing keep: { type: "tool_uses", value: 3 }, // Optional: Clear at least this many tokens clear_at_least: { type: "input_tokens", value: 5000 }, // Exclude these tools from being cleared exclude_tools: ["web_search"] } ] } }); ``` ## Thinking block clearing usage Enable thinking block clearing to manage context and prompt caching effectively when extended thinking is enabled: ```bash cURL curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --header "anthropic-beta: context-management-2025-06-27" \ --data '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 1024, "messages": [...], "thinking": { "type": "enabled", "budget_tokens": 10000 }, "context_management": { "edits": [ { "type": "clear_thinking_20251015", "keep": { "type": "thinking_turns", "value": 2 } } ] } }' ``` ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=1024, messages=[...], thinking={ "type": "enabled", "budget_tokens": 10000 }, betas=["context-management-2025-06-27"], context_management={ "edits": [ { "type": "clear_thinking_20251015", "keep": { "type": "thinking_turns", "value": 2 } } ] } ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY, }); const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5-20250929", max_tokens: 1024, messages: [...], thinking: { type: "enabled", budget_tokens: 10000 }, betas: ["context-management-2025-06-27"], context_management: { edits: [ { type: "clear_thinking_20251015", keep: { type: "thinking_turns", value: 2 } } ] } }); ``` ### Configuration options for thinking block clearing The `clear_thinking_20251015` strategy supports the following configuration: | Configuration option | Default | Description | |---------------------|---------|-------------| | `keep` | `{type: "thinking_turns", value: 1}` | Defines how many recent assistant turns with thinking blocks to preserve. Use `{type: "thinking_turns", value: N}` where N must be > 0 to keep the last N turns, or `"all"` to keep all thinking blocks. | **Example configurations:** ```json // Keep thinking blocks from the last 3 assistant turns { "type": "clear_thinking_20251015", "keep": { "type": "thinking_turns", "value": 3 } } // Keep all thinking blocks (maximizes cache hits) { "type": "clear_thinking_20251015", "keep": "all" } ``` ### Combining strategies You can use both thinking block clearing and tool result clearing together: When using multiple strategies, the `clear_thinking_20251015` strategy must be listed first in the `edits` array. ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=1024, messages=[...], thinking={ "type": "enabled", "budget_tokens": 10000 }, tools=[...], betas=["context-management-2025-06-27"], context_management={ "edits": [ { "type": "clear_thinking_20251015", "keep": { "type": "thinking_turns", "value": 2 } }, { "type": "clear_tool_uses_20250919", "trigger": { "type": "input_tokens", "value": 50000 }, "keep": { "type": "tool_uses", "value": 5 } } ] } ) ``` ```typescript TypeScript const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5-20250929", max_tokens: 1024, messages: [...], thinking: { type: "enabled", budget_tokens: 10000 }, tools: [...], betas: ["context-management-2025-06-27"], context_management: { edits: [ { type: "clear_thinking_20251015", keep: { type: "thinking_turns", value: 2 } }, { type: "clear_tool_uses_20250919", trigger: { type: "input_tokens", value: 50000 }, keep: { type: "tool_uses", value: 5 } } ] } }); ``` ## Configuration options for tool result clearing | Configuration option | Default | Description | |---------------------|---------|-------------| | `trigger` | 100,000 input tokens | Defines when the context editing strategy activates. Once the prompt exceeds this threshold, clearing will begin. You can specify this value in either `input_tokens` or `tool_uses`. | | `keep` | 3 tool uses | Defines how many recent tool use/result pairs to keep after clearing occurs. The API removes the oldest tool interactions first, preserving the most recent ones. | | `clear_at_least` | None | Ensures a minimum number of tokens is cleared each time the strategy activates. If the API can't clear at least the specified amount, the strategy will not be applied. This helps determine if context clearing is worth breaking your prompt cache. | | `exclude_tools` | None | List of tool names whose tool uses and results should never be cleared. Useful for preserving important context. | | `clear_tool_inputs` | `false` | Controls whether the tool call parameters are cleared along with the tool results. By default, only the tool results are cleared while keeping Claude's original tool calls visible. | ## Context editing response You can see which context edits were applied to your request using the `context_management` response field, along with helpful statistics about the content and input tokens cleared. ```json Response { "id": "msg_013Zva2CMHLNnXjNJJKqJ2EF", "type": "message", "role": "assistant", "content": [...], "usage": {...}, "context_management": { "applied_edits": [ // When using `clear_thinking_20251015` { "type": "clear_thinking_20251015", "cleared_thinking_turns": 3, "cleared_input_tokens": 15000 }, // When using `clear_tool_uses_20250919` { "type": "clear_tool_uses_20250919", "cleared_tool_uses": 8, "cleared_input_tokens": 50000 } ] } } ``` For streaming responses, the context edits will be included in the final `message_delta` event: ```json Streaming Response { "type": "message_delta", "delta": { "stop_reason": "end_turn", "stop_sequence": null }, "usage": { "output_tokens": 1024 }, "context_management": { "applied_edits": [...] } } ``` ## Token counting The [token counting](/docs/en/build-with-claude/token-counting) endpoint supports context management, allowing you to preview how many tokens your prompt will use after context editing is applied. ```bash cURL curl https://api.anthropic.com/v1/messages/count_tokens \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --header "anthropic-beta: context-management-2025-06-27" \ --data '{ "model": "claude-sonnet-4-5", "messages": [ { "role": "user", "content": "Continue our conversation..." } ], "tools": [...], "context_management": { "edits": [ { "type": "clear_tool_uses_20250919", "trigger": { "type": "input_tokens", "value": 30000 }, "keep": { "type": "tool_uses", "value": 5 } } ] } }' ``` ```python Python response = client.beta.messages.count_tokens( model="claude-sonnet-4-5", messages=[ { "role": "user", "content": "Continue our conversation..." } ], tools=[...], # Your tool definitions betas=["context-management-2025-06-27"], context_management={ "edits": [ { "type": "clear_tool_uses_20250919", "trigger": { "type": "input_tokens", "value": 30000 }, "keep": { "type": "tool_uses", "value": 5 } } ] } ) print(f"Original tokens: {response.context_management['original_input_tokens']}") print(f"After clearing: {response.input_tokens}") print(f"Savings: {response.context_management['original_input_tokens'] - response.input_tokens} tokens") ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY, }); const response = await anthropic.beta.messages.countTokens({ model: "claude-sonnet-4-5", messages: [ { role: "user", content: "Continue our conversation..." } ], tools: [...], // Your tool definitions betas: ["context-management-2025-06-27"], context_management: { edits: [ { type: "clear_tool_uses_20250919", trigger: { type: "input_tokens", value: 30000 }, keep: { type: "tool_uses", value: 5 } } ] } }); console.log(`Original tokens: ${response.context_management?.original_input_tokens}`); console.log(`After clearing: ${response.input_tokens}`); console.log(`Savings: ${(response.context_management?.original_input_tokens || 0) - response.input_tokens} tokens`); ``` ```json Response { "input_tokens": 25000, "context_management": { "original_input_tokens": 70000 } } ``` The response shows both the final token count after context management is applied (`input_tokens`) and the original token count before any clearing occurred (`original_input_tokens`). ## Using with the Memory Tool Context editing can be combined with the [memory tool](/docs/en/agents-and-tools/tool-use/memory-tool). When your conversation context approaches the configured clearing threshold, Claude receives an automatic warning to preserve important information. This enables Claude to save tool results or context to its memory files before they're cleared from the conversation history. This combination allows you to: - **Preserve important context**: Claude can write essential information from tool results to memory files before those results are cleared - **Maintain long-running workflows**: Enable agentic workflows that would otherwise exceed context limits by offloading information to persistent storage - **Access information on demand**: Claude can look up previously cleared information from memory files when needed, rather than keeping everything in the active context window For example, in a file editing workflow where Claude performs many operations, Claude can summarize completed changes to memory files as the context grows. When tool results are cleared, Claude retains access to that information through its memory system and can continue working effectively. To use both features together, enable them in your API request: ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5", max_tokens=4096, messages=[...], tools=[ { "type": "memory_20250818", "name": "memory" }, # Your other tools ], betas=["context-management-2025-06-27"], context_management={ "edits": [ {"type": "clear_tool_uses_20250919"} ] } ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY, }); const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", max_tokens: 4096, messages: [...], tools: [ { type: "memory_20250818", name: "memory" }, // Your other tools ], betas: ["context-management-2025-06-27"], context_management: { edits: [ { type: "clear_tool_uses_20250919" } ] } }); ``` ## Client-side compaction (SDK) Compaction is available in the [Python and TypeScript SDKs](/docs/en/api/client-sdks) when using the [`tool_runner` method](/docs/en/agents-and-tools/tool-use/implement-tool-use#tool-runner-beta). Compaction is an SDK feature that automatically manages conversation context by generating summaries when token usage grows too large. Unlike server-side context editing strategies that clear content, compaction instructs Claude to summarize the conversation history, then replaces the full history with that summary. This allows Claude to continue working on long-running tasks that would otherwise exceed the [context window](/docs/en/build-with-claude/context-windows). ### How compaction works When compaction is enabled, the SDK monitors token usage after each model response: 1. **Threshold check**: The SDK calculates total tokens as `input_tokens + cache_creation_input_tokens + cache_read_input_tokens + output_tokens` 2. **Summary generation**: When the threshold is exceeded, a summary prompt is injected as a user turn, and Claude generates a structured summary wrapped in `` tags 3. **Context replacement**: The SDK extracts the summary and replaces the entire message history with it 4. **Continuation**: The conversation resumes from the summary, with Claude picking up where it left off ### Using compaction Add `compaction_control` to your `tool_runner` call: ```python Python import anthropic client = anthropic.Anthropic() runner = client.beta.messages.tool_runner( model="claude-sonnet-4-5", max_tokens=4096, tools=[...], messages=[ { "role": "user", "content": "Analyze all the files in this directory and write a summary report." } ], compaction_control={ "enabled": True, "context_token_threshold": 100000 } ) for message in runner: print(f"Tokens used: {message.usage.input_tokens}") final = runner.until_done() ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); const runner = client.beta.messages.toolRunner({ model: 'claude-sonnet-4-5', max_tokens: 4096, tools: [...], messages: [ { role: 'user', content: 'Analyze all the files in this directory and write a summary report.' } ], compactionControl: { enabled: true, contextTokenThreshold: 100000 } }); for await (const message of runner) { console.log('Tokens used:', message.usage.input_tokens); } const finalMessage = await runner.runUntilDone(); ``` #### What happens during compaction As the conversation grows, the message history accumulates: **Before compaction (approaching 100k tokens):** ```json [ { "role": "user", "content": "Analyze all files and write a report..." }, { "role": "assistant", "content": "I'll help. Let me start by reading..." }, { "role": "user", "content": [{ "type": "tool_result", "tool_use_id": "...", "content": "..." }] }, { "role": "assistant", "content": "Based on file1.txt, I see..." }, { "role": "user", "content": [{ "type": "tool_result", "tool_use_id": "...", "content": "..." }] }, { "role": "assistant", "content": "After analyzing file2.txt..." }, // ... 50 more exchanges like this ... ] ``` When tokens exceed the threshold, the SDK injects a summary request and Claude generates a summary. The entire history is then replaced: **After compaction (back to ~2-3k tokens):** ```json [ { "role": "assistant", "content": "# Task Overview\nThe user requested analysis of directory files to produce a summary report...\n\n# Current State\nAnalyzed 52 files across 3 subdirectories. Key findings documented in report.md...\n\n# Important Discoveries\n- Configuration files use YAML format\n- Found 3 deprecated dependencies\n- Test coverage at 67%\n\n# Next Steps\n1. Analyze remaining files in /src/legacy\n2. Complete final report sections...\n\n# Context to Preserve\nUser prefers markdown format with executive summary first..." } ] ``` Claude continues working from this summary as if it were the original conversation history. ### Configuration options | Parameter | Type | Required | Default | Description | |-----------|------|----------|---------|-------------| | `enabled` | boolean | Yes | - | Whether to enable automatic compaction | | `context_token_threshold` | number | No | 100,000 | Token count at which compaction triggers | | `model` | string | No | Same as main model | Model to use for generating summaries | | `summary_prompt` | string | No | See below | Custom prompt for summary generation | #### Choosing a token threshold The threshold determines when compaction occurs. A lower threshold means more frequent compactions with smaller context windows. A higher threshold allows more context but risks hitting limits. ```python Python # More frequent compaction for memory-constrained scenarios compaction_control={ "enabled": True, "context_token_threshold": 50000 } # Less frequent compaction when you need more context compaction_control={ "enabled": True, "context_token_threshold": 150000 } ``` ```typescript TypeScript // More frequent compaction for memory-constrained scenarios compactionControl: { enabled: true, contextTokenThreshold: 50000 } // Less frequent compaction when you need more context compactionControl: { enabled: true, contextTokenThreshold: 150000 } ``` #### Using a different model for summaries You can use a faster or cheaper model for generating summaries: ```python Python compaction_control={ "enabled": True, "context_token_threshold": 100000, "model": "claude-haiku-4-5" } ``` ```typescript TypeScript compactionControl: { enabled: true, contextTokenThreshold: 100000, model: 'claude-haiku-4-5' } ``` #### Custom summary prompts You can provide a custom prompt for domain-specific needs. Your prompt should instruct Claude to wrap its summary in `` tags. ```python Python compaction_control={ "enabled": True, "context_token_threshold": 100000, "summary_prompt": """Summarize the research conducted so far, including: - Sources consulted and key findings - Questions answered and remaining unknowns - Recommended next steps Wrap your summary in tags.""" } ``` ```typescript TypeScript compactionControl: { enabled: true, contextTokenThreshold: 100000, summaryPrompt: `Summarize the research conducted so far, including: - Sources consulted and key findings - Questions answered and remaining unknowns - Recommended next steps Wrap your summary in tags.` } ``` ### Default summary prompt The built-in summary prompt instructs Claude to create a structured continuation summary including: 1. **Task Overview**: The user's core request, success criteria, and constraints 2. **Current State**: What has been completed, files modified, and artifacts produced 3. **Important Discoveries**: Technical constraints, decisions made, errors resolved, and failed approaches 4. **Next Steps**: Specific actions needed, blockers, and priority order 5. **Context to Preserve**: User preferences, domain-specific details, and commitments made This structure enables Claude to resume work efficiently without losing important context or repeating mistakes.
``` You have been working on the task described above but have not yet completed it. Write a continuation summary that will allow you (or another instance of yourself) to resume work efficiently in a future context window where the conversation history will be replaced with this summary. Your summary should be structured, concise, and actionable. Include: 1. Task Overview The user's core request and success criteria Any clarifications or constraints they specified 2. Current State What has been completed so far Files created, modified, or analyzed (with paths if relevant) Key outputs or artifacts produced 3. Important Discoveries Technical constraints or requirements uncovered Decisions made and their rationale Errors encountered and how they were resolved What approaches were tried that didn't work (and why) 4. Next Steps Specific actions needed to complete the task Any blockers or open questions to resolve Priority order if multiple steps remain 5. Context to Preserve User preferences or style requirements Domain-specific details that aren't obvious Any promises made to the user Be concise but complete—err on the side of including information that would prevent duplicate work or repeated mistakes. Write in a way that enables immediate resumption of the task. Wrap your summary in tags. ```
### Limitations #### Server-side tools Compaction requires special consideration when using server-side tools such as [web search](/docs/en/agents-and-tools/tool-use/web-search-tool) or [web fetch](/docs/en/agents-and-tools/tool-use/web-fetch-tool). When using server-side tools, the SDK may incorrectly calculate token usage, causing compaction to trigger at the wrong time. For example, after a web search operation, the API response might show: ```json { "usage": { "input_tokens": 63000, "cache_read_input_tokens": 270000, "output_tokens": 1400 } } ``` The SDK calculates total usage as 63,000 + 270,000 = 333,000 tokens. However, the `cache_read_input_tokens` value includes accumulated reads from multiple internal API calls made by the server-side tool, not your actual conversation context. Your real context length might only be the 63,000 `input_tokens`, but the SDK sees 333k and triggers compaction prematurely. **Workarounds:** - Use the [token counting](/docs/en/build-with-claude/token-counting) endpoint to get accurate context length - Avoid compaction when using server-side tools extensively #### Tool use edge cases When compaction is triggered while a tool use response is pending, the SDK removes the tool use block from the message history before generating the summary. Claude will re-issue the tool call after resuming from the summary if still needed. ### Monitoring compaction Enable logging to track when compaction occurs: ```python Python import logging logging.basicConfig(level=logging.INFO) logging.getLogger("anthropic.lib.tools").setLevel(logging.INFO) # Logs will show: # INFO: Token usage 105000 has exceeded the threshold of 100000. Performing compaction. # INFO: Compaction complete. New token usage: 2500 ``` ```typescript TypeScript // The SDK logs compaction events to the console // You'll see messages like: // Token usage 105000 has exceeded the threshold of 100000. Performing compaction. // Compaction complete. New token usage: 2500 ``` ### When to use compaction **Good use cases:** - Long-running agent tasks that process many files or data sources - Research workflows that accumulate large amounts of information - Multi-step tasks with clear, measurable progress - Tasks that produce artifacts (files, reports) that persist outside the conversation **Less ideal use cases:** - Tasks requiring precise recall of early conversation details - Workflows using server-side tools extensively - Tasks that need to maintain exact state across many variables --- # Effort URL: https://platform.claude.com/docs/en/build-with-claude/effort # Effort Control how many tokens Claude uses when responding with the effort parameter, trading off between response thoroughness and token efficiency. --- The effort parameter allows you to control how eager Claude is about spending tokens when responding to requests. This gives you the ability to trade off between response thoroughness and token efficiency, all with a single model. The effort parameter is currently in beta and only supported by Claude Opus 4.5. You must include the [beta header](/docs/en/api/beta-headers) `effort-2025-11-24` when using this feature. ## How effort works By default, Claude uses maximum effort—spending as many tokens as needed for the best possible outcome. By lowering the effort level, you can instruct Claude to be more conservative with token usage, optimizing for speed and cost while accepting some reduction in capability. Setting `effort` to `"high"` produces exactly the same behavior as omitting the `effort` parameter entirely. The effort parameter affects **all tokens** in the response, including: - Text responses and explanations - Tool calls and function arguments - Extended thinking (when enabled) This approach has two major advantages: 1. It doesn't require thinking to be enabled in order to use it. 2. It can affect all token spend including tool calls. For example, lower effort would mean Claude makes fewer tool calls. This gives a much greater degree of control over efficiency. ### Effort levels | Level | Description | Typical use case | | -------- | -------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- | | `high` | Maximum capability. Claude uses as many tokens as needed for the best possible outcome. Equivalent to not setting the parameter. | Complex reasoning, difficult coding problems, agentic tasks | | `medium` | Balanced approach with moderate token savings. | Agentic tasks that require a balance of speed, cost, and performance | | `low` | Most efficient. Significant token savings with some capability reduction. | Simpler tasks that need the best speed and lowest costs, such as subagents | ## Basic usage ```python Python import anthropic client = anthropic.Anthropic() response = client.beta.messages.create( model="claude-opus-4-5-20251101", betas=["effort-2025-11-24"], max_tokens=4096, messages=[{ "role": "user", "content": "Analyze the trade-offs between microservices and monolithic architectures" }], output_config={ "effort": "medium" } ) print(response.content[0].text) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); const response = await client.beta.messages.create({ model: "claude-opus-4-5-20251101", betas: ["effort-2025-11-24"], max_tokens: 4096, messages: [{ role: "user", content: "Analyze the trade-offs between microservices and monolithic architectures" }], output_config: { effort: "medium" } }); console.log(response.content[0].text); ``` ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: effort-2025-11-24" \ --header "content-type: application/json" \ --data '{ "model": "claude-opus-4-5-20251101", "max_tokens": 4096, "messages": [{ "role": "user", "content": "Analyze the trade-offs between microservices and monolithic architectures" }], "output_config": { "effort": "medium" } }' ``` ## When should I adjust the effort parameter? - Use **high effort** (the default) when you need Claude's best work—complex reasoning, nuanced analysis, difficult coding problems, or any task where quality is the top priority. - Use **medium effort** as a balanced option when you want solid performance without the full token expenditure of high effort. - Use **low effort** when you're optimizing for speed (because Claude answers with fewer tokens) or cost—for example, simple classification tasks, quick lookups, or high-volume use cases where marginal quality improvements don't justify additional latency or spend. ## Effort with tool use When using tools, the effort parameter affects both the explanations around tool calls and the tool calls themselves. Lower effort levels tend to: - Combine multiple operations into fewer tool calls - Make fewer tool calls - Proceed directly to action without preamble - Use terse confirmation messages after completion Higher effort levels may: - Make more tool calls - Explain the plan before taking action - Provide detailed summaries of changes - Include more comprehensive code comments ## Effort with extended thinking The effort parameter works alongside the thinking token budget when extended thinking is enabled. These two controls serve different purposes: - **Effort parameter**: Controls how Claude spends all tokens—including thinking tokens, text responses, and tool calls - **Thinking token budget**: Sets a maximum limit on thinking tokens specifically The effort parameter can be used with or without extended thinking enabled. When both are configured: 1. First determine the effort level appropriate for your task 2. Then set the thinking token budget based on task complexity For best performance on complex reasoning tasks, use high effort (the default) with a high thinking token budget. This allows Claude to think thoroughly and provide comprehensive responses. ## Best practices 1. **Start with high**: Use lower effort levels to trade off performance for token efficiency. 2. **Use low for speed-sensitive or simple tasks**: When latency matters or tasks are straightforward, low effort can significantly reduce response times and costs. 3. **Test your use case**: The impact of effort levels varies by task type. Evaluate performance on your specific use cases before deploying. 4. **Consider dynamic effort**: Adjust effort based on task complexity. Simple queries may warrant low effort while agentic coding and complex reasoning benefit from high effort. --- # Embeddings URL: https://platform.claude.com/docs/en/build-with-claude/embeddings # Embeddings Text embeddings are numerical representations of text that enable measuring semantic similarity. This guide introduces embeddings, their applications, and how to use embedding models for tasks like search, recommendations, and anomaly detection. --- ## Before implementing embeddings When selecting an embeddings provider, there are several factors you can consider depending on your needs and preferences: - Dataset size & domain specificity: size of the model training dataset and its relevance to the domain you want to embed. Larger or more domain-specific data generally produces better in-domain embeddings - Inference performance: embedding lookup speed and end-to-end latency. This is a particularly important consideration for large scale production deployments - Customization: options for continued training on private data, or specialization of models for very specific domains. This can improve performance on unique vocabularies ## How to get embeddings with Anthropic Anthropic does not offer its own embedding model. One embeddings provider that has a wide variety of options and capabilities encompassing all of the above considerations is Voyage AI. Voyage AI makes state-of-the-art embedding models and offers customized models for specific industry domains such as finance and healthcare, or bespoke fine-tuned models for individual customers. The rest of this guide is for Voyage AI, but we encourage you to assess a variety of embeddings vendors to find the best fit for your specific use case. ## Available Models Voyage recommends using the following text embedding models: | Model | Context Length | Embedding Dimension | Description | | --- | --- | --- | --- | | `voyage-3-large` | 32,000 | 1024 (default), 256, 512, 2048 | The best general-purpose and multilingual retrieval quality. See [blog post](https://blog.voyageai.com/2025/01/07/voyage-3-large/) for details. | | `voyage-3.5` | 32,000 | 1024 (default), 256, 512, 2048 | Optimized for general-purpose and multilingual retrieval quality. See [blog post](https://blog.voyageai.com/2025/05/20/voyage-3-5/) for details. | | `voyage-3.5-lite` | 32,000 | 1024 (default), 256, 512, 2048 | Optimized for latency and cost. See [blog post](https://blog.voyageai.com/2025/05/20/voyage-3-5/) for details. | | `voyage-code-3` | 32,000 | 1024 (default), 256, 512, 2048 | Optimized for **code** retrieval. See [blog post](https://blog.voyageai.com/2024/12/04/voyage-code-3/) for details. | | `voyage-finance-2` | 32,000 | 1024 | Optimized for **finance** retrieval and RAG. See [blog post](https://blog.voyageai.com/2024/06/03/domain-specific-embeddings-finance-edition-voyage-finance-2/) for details. | | `voyage-law-2` | 16,000 | 1024 | Optimized for **legal** and **long-context** retrieval and RAG. Also improved performance across all domains. See [blog post](https://blog.voyageai.com/2024/04/15/domain-specific-embeddings-and-retrieval-legal-edition-voyage-law-2/) for details. | Additionally, the following multimodal embedding models are recommended: | Model | Context Length | Embedding Dimension | Description | | --- | --- | --- | --- | | `voyage-multimodal-3` | 32000 | 1024 | Rich multimodal embedding model that can vectorize interleaved text and content-rich images, such as screenshots of PDFs, slides, tables, figures, and more. See [blog post](https://blog.voyageai.com/2024/11/12/voyage-multimodal-3/) for details. | Need help deciding which text embedding model to use? Check out the [FAQ](https://docs.voyageai.com/docs/faq#what-embedding-models-are-available-and-which-one-should-i-use&ref=anthropic). ## Getting started with Voyage AI To access Voyage embeddings: 1. Sign up on Voyage AI's website 2. Obtain an API key 3. Set the API key as an environment variable for convenience: ```bash export VOYAGE_API_KEY="" ``` You can obtain the embeddings by either using the official [`voyageai` Python package](https://github.com/voyage-ai/voyageai-python) or HTTP requests, as described below. ### Voyage Python library The `voyageai` package can be installed using the following command: ```bash pip install -U voyageai ``` Then, you can create a client object and start using it to embed your texts: ```python import voyageai vo = voyageai.Client() # This will automatically use the environment variable VOYAGE_API_KEY. # Alternatively, you can use vo = voyageai.Client(api_key="") texts = ["Sample text 1", "Sample text 2"] result = vo.embed(texts, model="voyage-3.5", input_type="document") print(result.embeddings[0]) print(result.embeddings[1]) ``` `result.embeddings` will be a list of two embedding vectors, each containing 1024 floating-point numbers. After running the above code, the two embeddings will be printed on the screen: ``` [-0.013131560757756233, 0.019828535616397858, ...] # embedding for "Sample text 1" [-0.0069352793507277966, 0.020878976210951805, ...] # embedding for "Sample text 2" ``` When creating the embeddings, you can specify a few other arguments to the `embed()` function. For more information on the Voyage python package, see [the Voyage documentation](https://docs.voyageai.com/docs/embeddings#python-api). ### Voyage HTTP API You can also get embeddings by requesting Voyage HTTP API. For example, you can send an HTTP request through the `curl` command in a terminal: ```bash curl https://api.voyageai.com/v1/embeddings \ -H "Content-Type: application/json" \ -H "Authorization: Bearer $VOYAGE_API_KEY" \ -d '{ "input": ["Sample text 1", "Sample text 2"], "model": "voyage-3.5" }' ``` The response you would get is a JSON object containing the embeddings and the token usage: ```json { "object": "list", "data": [ { "embedding": [-0.013131560757756233, 0.019828535616397858, ...], "index": 0 }, { "embedding": [-0.0069352793507277966, 0.020878976210951805, ...], "index": 1 } ], "model": "voyage-3.5", "usage": { "total_tokens": 10 } } ``` For more information on the Voyage HTTP API, see [the Voyage documentation](https://docs.voyageai.com/reference/embeddings-api). ### AWS Marketplace Voyage embeddings are available on [AWS Marketplace](https://aws.amazon.com/marketplace/seller-profile?id=seller-snt4gb6fd7ljg). Instructions for accessing Voyage on AWS are available [here](https://docs.voyageai.com/docs/aws-marketplace-model-package?ref=anthropic). ## Quickstart example Now that we know how to get embeddings, let's see a brief example. Suppose we have a small corpus of six documents to retrieve from ```python documents = [ "The Mediterranean diet emphasizes fish, olive oil, and vegetables, believed to reduce chronic diseases.", "Photosynthesis in plants converts light energy into glucose and produces essential oxygen.", "20th-century innovations, from radios to smartphones, centered on electronic advancements.", "Rivers provide water, irrigation, and habitat for aquatic species, vital for ecosystems.", "Apple's conference call to discuss fourth fiscal quarter results and business updates is scheduled for Thursday, November 2, 2023 at 2:00 p.m. PT / 5:00 p.m. ET.", "Shakespeare's works, like 'Hamlet' and 'A Midsummer Night's Dream,' endure in literature." ] ``` We will first use Voyage to convert each of them into an embedding vector ```python import voyageai vo = voyageai.Client() # Embed the documents doc_embds = vo.embed( documents, model="voyage-3.5", input_type="document" ).embeddings ``` The embeddings will allow us to do semantic search / retrieval in the vector space. Given an example query, ```python query = "When is Apple's conference call scheduled?" ``` we convert it into an embedding, and conduct a nearest neighbor search to find the most relevant document based on the distance in the embedding space. ```python import numpy as np # Embed the query query_embd = vo.embed( [query], model="voyage-3.5", input_type="query" ).embeddings[0] # Compute the similarity # Voyage embeddings are normalized to length 1, therefore dot-product # and cosine similarity are the same. similarities = np.dot(doc_embds, query_embd) retrieved_id = np.argmax(similarities) print(documents[retrieved_id]) ``` Note that we use `input_type="document"` and `input_type="query"` for embedding the document and query, respectively. More specification can be found [here](/docs/en/build-with-claude/embeddings#voyage-python-package). The output would be the 5th document, which is indeed the most relevant to the query: ``` Apple's conference call to discuss fourth fiscal quarter results and business updates is scheduled for Thursday, November 2, 2023 at 2:00 p.m. PT / 5:00 p.m. ET. ``` If you are looking for a detailed set of cookbooks on how to do RAG with embeddings, including vector databases, check out our [RAG cookbook](https://platform.claude.com/cookbook/third-party-pinecone-rag-using-pinecone). ## FAQ
Embedding models rely on powerful neural networks to capture and compress semantic context, similar to generative models. Voyage's team of experienced AI researchers optimizes every component of the embedding process, including: - Model architecture - Data collection - Loss functions - Optimizer selection Learn more about Voyage's technical approach on their [blog](https://blog.voyageai.com/).
For general-purpose embedding, we recommend: - `voyage-3-large`: Best quality - `voyage-3.5-lite`: Lowest latency and cost - `voyage-3.5`: Balanced performance with superior retrieval quality at a competitive price point For retrieval, use the `input_type` parameter to specify whether the text is a query or document type. Domain-specific models: - Legal tasks: `voyage-law-2` - Code and programming documentation: `voyage-code-3` - Finance-related tasks: `voyage-finance-2`
You can use Voyage embeddings with either dot-product similarity, cosine similarity, or Euclidean distance. An explanation about embedding similarity can be found [here](https://www.pinecone.io/learn/vector-similarity/). Voyage AI embeddings are normalized to length 1, which means that: - Cosine similarity is equivalent to dot-product similarity, while the latter can be computed more quickly. - Cosine similarity and Euclidean distance will result in the identical rankings.
Please see this [page](https://docs.voyageai.com/docs/tokenization?ref=anthropic).
For all retrieval tasks and use cases (e.g., RAG), we recommend that the `input_type` parameter be used to specify whether the input text is a query or document. Do not omit `input_type` or set `input_type=None`. Specifying whether input text is a query or document can create better dense vector representations for retrieval, which can lead to better retrieval quality. When using the `input_type` parameter, special prompts are prepended to the input text prior to embedding. Specifically: > 📘 **Prompts associated with `input_type`** > > - For a query, the prompt is “Represent the query for retrieving supporting documents: “. > - For a document, the prompt is “Represent the document for retrieval: “. > - Example > - When `input_type="query"`, a query like "When is Apple's conference call scheduled?" will become "**Represent the query for retrieving supporting documents:** When is Apple's conference call scheduled?" > - When `input_type="document"`, a query like "Apple's conference call to discuss fourth fiscal quarter results and business updates is scheduled for Thursday, November 2, 2023 at 2:00 p.m. PT / 5:00 p.m. ET." will become "**Represent the document for retrieval:** Apple's conference call to discuss fourth fiscal quarter results and business updates is scheduled for Thursday, November 2, 2023 at 2:00 p.m. PT / 5:00 p.m. ET." `voyage-large-2-instruct`, as the name suggests, is trained to be responsive to additional instructions that are prepended to the input text. For classification, clustering, or other [MTEB](https://huggingface.co/mteb) subtasks, please use the instructions [here](https://github.com/voyage-ai/voyage-large-2-instruct).
Quantization in embeddings converts high-precision values, like 32-bit single-precision floating-point numbers, to lower-precision formats such as 8-bit integers or 1-bit binary values, reducing storage, memory, and costs by 4x and 32x, respectively. Supported Voyage models enable quantization by specifying the output data type with the `output_dtype` parameter: - `float`: Each returned embedding is a list of 32-bit (4-byte) single-precision floating-point numbers. This is the default and provides the highest precision / retrieval accuracy. - `int8` and `uint8`: Each returned embedding is a list of 8-bit (1-byte) integers ranging from -128 to 127 and 0 to 255, respectively. - `binary` and `ubinary`: Each returned embedding is a list of 8-bit integers that represent bit-packed, quantized single-bit embedding values: `int8` for `binary` and `uint8` for `ubinary`. The length of the returned list of integers is 1/8 of the actual dimension of the embedding. The binary type uses the offset binary method, which you can learn more about in the FAQ below. > **Binary quantization example** > > Consider the following eight embedding values: -0.03955078, 0.006214142, -0.07446289, -0.039001465, 0.0046463013, 0.00030612946, -0.08496094, and 0.03994751. With binary quantization, values less than or equal to zero will be quantized to a binary zero, and positive values to a binary one, resulting in the following binary sequence: 0, 1, 0, 0, 1, 1, 0, 1. These eight bits are then packed into a single 8-bit integer, 01001101 (with the leftmost bit as the most significant bit). > - `ubinary`: The binary sequence is directly converted and represented as the unsigned integer (`uint8`) 77. > - `binary`: The binary sequence is represented as the signed integer (`int8`) -51, calculated using the offset binary method (77 - 128 = -51).
Matryoshka learning creates embeddings with coarse-to-fine representations within a single vector. Voyage models, such as `voyage-code-3`, that support multiple output dimensions generate such Matryoshka embeddings. You can truncate these vectors by keeping the leading subset of dimensions. For example, the following Python code demonstrates how to truncate 1024-dimensional vectors to 256 dimensions: ```python import voyageai import numpy as np def embd_normalize(v: np.ndarray) -> np.ndarray: """ Normalize the rows of a 2D numpy array to unit vectors by dividing each row by its Euclidean norm. Raises a ValueError if any row has a norm of zero to prevent division by zero. """ row_norms = np.linalg.norm(v, axis=1, keepdims=True) if np.any(row_norms == 0): raise ValueError("Cannot normalize rows with a norm of zero.") return v / row_norms vo = voyageai.Client() # Generate voyage-code-3 vectors, which by default are 1024-dimensional floating-point numbers embd = vo.embed(['Sample text 1', 'Sample text 2'], model='voyage-code-3').embeddings # Set shorter dimension short_dim = 256 # Resize and normalize vectors to shorter dimension resized_embd = embd_normalize(np.array(embd)[:, :short_dim]).tolist() ```
## Pricing Visit Voyage's [pricing page](https://docs.voyageai.com/docs/pricing?ref=anthropic) for the most up to date pricing details. --- # Files API URL: https://platform.claude.com/docs/en/build-with-claude/files # Files API --- The Files API lets you upload and manage files to use with the Claude API without re-uploading content with each request. This is particularly useful when using the [code execution tool](/docs/en/agents-and-tools/tool-use/code-execution-tool) to provide inputs (e.g. datasets and documents) and then download outputs (e.g. charts). You can also use the Files API to prevent having to continually re-upload frequently used documents and images across multiple API calls. You can [explore the API reference directly](/docs/en/api/files-create), in addition to this guide. The Files API is currently in beta. Please reach out through our [feedback form](https://forms.gle/tisHyierGwgN4DUE9) to share your experience with the Files API. ## Supported models Referencing a `file_id` in a Messages request is supported in all models that support the given file type. For example, [images](/docs/en/build-with-claude/vision) are supported in all Claude 3+ models, [PDFs](/docs/en/build-with-claude/pdf-support) in all Claude 3.5+ models, and [various other file types](/docs/en/agents-and-tools/tool-use/code-execution-tool#supported-file-types) for the code execution tool in Claude Haiku 4.5 plus all Claude 3.7+ models. The Files API is currently not supported on Amazon Bedrock or Google Vertex AI. ## How the Files API works The Files API provides a simple create-once, use-many-times approach for working with files: - **Upload files** to our secure storage and receive a unique `file_id` - **Download files** that are created from skills or the code execution tool - **Reference files** in [Messages](/docs/en/api/messages) requests using the `file_id` instead of re-uploading content - **Manage your files** with list, retrieve, and delete operations ## How to use the Files API To use the Files API, you'll need to include the beta feature header: `anthropic-beta: files-api-2025-04-14`. ### Uploading a file Upload a file to be referenced in future API calls: ```bash Shell curl -X POST https://api.anthropic.com/v1/files \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" \ -F "file=@/path/to/document.pdf" ``` ```python Python import anthropic client = anthropic.Anthropic() client.beta.files.upload( file=("document.pdf", open("/path/to/document.pdf", "rb"), "application/pdf"), ) ``` ```typescript TypeScript import Anthropic, { toFile } from '@anthropic-ai/sdk'; import fs from "fs"; const anthropic = new Anthropic(); await anthropic.beta.files.upload({ file: await toFile(fs.createReadStream('/path/to/document.pdf'), undefined, { type: 'application/pdf' }) }, { betas: ['files-api-2025-04-14'] }); ``` The response from uploading a file will include: ```json { "id": "file_011CNha8iCJcU1wXNR6q4V8w", "type": "file", "filename": "document.pdf", "mime_type": "application/pdf", "size_bytes": 1024000, "created_at": "2025-01-01T00:00:00Z", "downloadable": false } ``` ### Using a file in messages Once uploaded, reference the file using its `file_id`: ```bash Shell curl -X POST https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Please summarize this document for me." }, { "type": "document", "source": { "type": "file", "file_id": "file_011CNha8iCJcU1wXNR6q4V8w" } } ] } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.beta.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "text", "text": "Please summarize this document for me." }, { "type": "document", "source": { "type": "file", "file_id": "file_011CNha8iCJcU1wXNR6q4V8w" } } ] } ], betas=["files-api-2025-04-14"], ) print(response) ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, messages: [ { role: "user", content: [ { type: "text", text: "Please summarize this document for me." }, { type: "document", source: { type: "file", file_id: "file_011CNha8iCJcU1wXNR6q4V8w" } } ] } ], betas: ["files-api-2025-04-14"], }); console.log(response); ``` ### File types and content blocks The Files API supports different file types that correspond to different content block types: | File Type | MIME Type | Content Block Type | Use Case | | :--- | :--- | :--- | :--- | | PDF | `application/pdf` | `document` | Text analysis, document processing | | Plain text | `text/plain` | `document` | Text analysis, processing | | Images | `image/jpeg`, `image/png`, `image/gif`, `image/webp` | `image` | Image analysis, visual tasks | | [Datasets, others](/docs/en/agents-and-tools/tool-use/code-execution-tool#supported-file-types) | Varies | `container_upload` | Analyze data, create visualizations | ### Working with other file formats For file types that are not supported as `document` blocks (.csv, .txt, .md, .docx, .xlsx), convert the files to plain text, and include the content directly in your message: ```bash Shell # Example: Reading a text file and sending it as plain text # Note: For files with special characters, consider base64 encoding TEXT_CONTENT=$(cat document.txt | jq -Rs .) curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d @- < For .docx files containing images, convert them to PDF format first, then use [PDF support](/docs/en/build-with-claude/pdf-support) to take advantage of the built-in image parsing. This allows using citations from the PDF document. #### Document blocks For PDFs and text files, use the `document` content block: ```json { "type": "document", "source": { "type": "file", "file_id": "file_011CNha8iCJcU1wXNR6q4V8w" }, "title": "Document Title", // Optional "context": "Context about the document", // Optional "citations": {"enabled": true} // Optional, enables citations } ``` #### Image blocks For images, use the `image` content block: ```json { "type": "image", "source": { "type": "file", "file_id": "file_011CPMxVD3fHLUhvTqtsQA5w" } } ``` ### Managing files #### List files Retrieve a list of your uploaded files: ```bash Shell curl https://api.anthropic.com/v1/files \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" ``` ```python Python import anthropic client = anthropic.Anthropic() files = client.beta.files.list() ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const files = await anthropic.beta.files.list({ betas: ['files-api-2025-04-14'], }); ``` #### Get file metadata Retrieve information about a specific file: ```bash Shell curl https://api.anthropic.com/v1/files/file_011CNha8iCJcU1wXNR6q4V8w \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" ``` ```python Python import anthropic client = anthropic.Anthropic() file = client.beta.files.retrieve_metadata("file_011CNha8iCJcU1wXNR6q4V8w") ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const file = await anthropic.beta.files.retrieveMetadata( "file_011CNha8iCJcU1wXNR6q4V8w", { betas: ['files-api-2025-04-14'] }, ); ``` #### Delete a file Remove a file from your workspace: ```bash Shell curl -X DELETE https://api.anthropic.com/v1/files/file_011CNha8iCJcU1wXNR6q4V8w \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" ``` ```python Python import anthropic client = anthropic.Anthropic() result = client.beta.files.delete("file_011CNha8iCJcU1wXNR6q4V8w") ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const result = await anthropic.beta.files.delete( "file_011CNha8iCJcU1wXNR6q4V8w", { betas: ['files-api-2025-04-14'] }, ); ``` ### Downloading a file Download files that have been created by skills or the code execution tool: ```bash Shell curl -X GET "https://api.anthropic.com/v1/files/file_011CNha8iCJcU1wXNR6q4V8w/content" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" \ --output downloaded_file.txt ``` ```python Python import anthropic client = anthropic.Anthropic() file_content = client.beta.files.download("file_011CNha8iCJcU1wXNR6q4V8w") # Save to file with open("downloaded_file.txt", "w") as f: f.write(file_content.decode('utf-8')) ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; import fs from 'fs'; const anthropic = new Anthropic(); const fileContent = await anthropic.beta.files.download( "file_011CNha8iCJcU1wXNR6q4V8w", { betas: ['files-api-2025-04-14'] }, ); // Save to file fs.writeFileSync("downloaded_file.txt", fileContent); ``` You can only download files that were created by [skills](/docs/en/build-with-claude/skills-guide) or the [code execution tool](/docs/en/agents-and-tools/tool-use/code-execution-tool). Files that you uploaded cannot be downloaded. --- ## File storage and limits ### Storage limits - **Maximum file size:** 500 MB per file - **Total storage:** 100 GB per organization ### File lifecycle - Files are scoped to the workspace of the API key. Other API keys can use files created by any other API key associated with the same workspace - Files persist until you delete them - Deleted files cannot be recovered - Files are inaccessible via the API shortly after deletion, but they may persist in active `Messages` API calls and associated tool uses - Files that users delete will be deleted in accordance with our [data retention policy](https://privacy.claude.com/en/articles/7996866-how-long-do-you-store-my-organization-s-data). --- ## Error handling Common errors when using the Files API include: - **File not found (404):** The specified `file_id` doesn't exist or you don't have access to it - **Invalid file type (400):** The file type doesn't match the content block type (e.g., using an image file in a document block) - **Exceeds context window size (400):** The file is larger than the context window size (e.g. using a 500 MB plaintext file in a `/v1/messages` request) - **Invalid filename (400):** Filename doesn't meet the length requirements (1-255 characters) or contains forbidden characters (`<`, `>`, `:`, `"`, `|`, `?`, `*`, `\`, `/`, or unicode characters 0-31) - **File too large (413):** File exceeds the 500 MB limit - **Storage limit exceeded (403):** Your organization has reached the 100 GB storage limit ```json { "type": "error", "error": { "type": "invalid_request_error", "message": "File not found: file_011CNha8iCJcU1wXNR6q4V8w" } } ``` ## Usage and billing File API operations are **free**: - Uploading files - Downloading files - Listing files - Getting file metadata - Deleting files File content used in `Messages` requests are priced as input tokens. You can only download files created by [skills](/docs/en/build-with-claude/skills-guide) or the [code execution tool](/docs/en/agents-and-tools/tool-use/code-execution-tool). ### Rate limits During the beta period: - File-related API calls are limited to approximately 100 requests per minute - [Contact us](mailto:sales@anthropic.com) if you need higher limits for your use case --- # Multilingual support URL: https://platform.claude.com/docs/en/build-with-claude/multilingual-support # Multilingual support Claude excels at tasks across multiple languages, maintaining strong cross-lingual performance relative to English. --- ## Overview Claude demonstrates robust multilingual capabilities, with particularly strong performance in zero-shot tasks across languages. The model maintains consistent relative performance across both widely-spoken and lower-resource languages, making it a reliable choice for multilingual applications. Note that Claude is capable in many languages beyond those benchmarked below. We encourage testing with any languages relevant to your specific use cases. ## Performance data Below are the zero-shot chain-of-thought evaluation scores for Claude models across different languages, shown as a percent relative to English performance (100%): | Language | Claude Opus 4.11 | Claude Opus 41 | Claude Sonnet 4.51 | Claude Sonnet 41 | Claude Haiku 4.51 | |----------|---------------|---------------|---------------|-----------------|------------------| | English (baseline, fixed to 100%) | 100% | 100% | 100% | 100% | 100% | | Spanish | 98.1% | 98.0% | 98.2% | 97.5% | 96.4% | | Portuguese (Brazil) | 97.8% | 97.3% | 97.8% | 97.2% | 96.1% | | Italian | 97.7% | 97.5% | 97.9% | 97.3% | 96.0% | | French | 97.9% | 97.7% | 97.5% | 97.1% | 95.7% | | Indonesian | 97.3% | 97.2% | 97.3% | 96.2% | 94.2% | | German | 97.7% | 97.1% | 97.0% | 94.7% | 94.3% | | Arabic | 97.1% | 96.9% | 97.2% | 96.1% | 92.5% | | Chinese (Simplified) | 97.1% | 96.7% | 96.9% | 95.9% | 94.2% | | Korean | 96.6% | 96.4% | 96.7% | 95.9% | 93.3% | | Japanese | 96.9% | 96.2% | 96.8% | 95.6% | 93.5% | | Hindi | 96.8% | 96.7% | 96.7% | 95.8% | 92.4% | | Bengali | 95.7% | 95.2% | 95.4% | 94.4% | 90.4% | | Swahili | 89.8% | 89.5% | 91.1% | 87.1% | 78.3% | | Yoruba | 80.3% | 78.9% | 79.7% | 76.4% | 52.7% | 1 With [extended thinking](/docs/en/build-with-claude/extended-thinking). These metrics are based on [MMLU (Massive Multitask Language Understanding)](https://en.wikipedia.org/wiki/MMLU) English test sets that were translated into 14 additional languages by professional human translators, as documented in [OpenAI's simple-evals repository](https://github.com/openai/simple-evals/blob/main/multilingual_mmlu_benchmark_results.md). The use of human translators for this evaluation ensures high-quality translations, particularly important for languages with fewer digital resources. *** ## Best practices When working with multilingual content: 1. **Provide clear language context**: While Claude can detect the target language automatically, explicitly stating the desired input/output language improves reliability. For enhanced fluency, you can prompt Claude to use "idiomatic speech as if it were a native speaker." 2. **Use native scripts**: Submit text in its native script rather than transliteration for optimal results 3. **Consider cultural context**: Effective communication often requires cultural and regional awareness beyond pure translation We also suggest following our general [prompt engineering guidelines](/docs/en/build-with-claude/prompt-engineering/overview) to better improve Claude's performance. *** ## Language support considerations - Claude processes input and generates output in most world languages that use standard Unicode characters - Performance varies by language, with particularly strong capabilities in widely-spoken languages - Even in languages with fewer digital resources, Claude maintains meaningful capabilities Master the art of prompt crafting to get the most out of Claude. Find a wide range of pre-crafted prompts for various tasks and industries. Perfect for inspiration or quick starts. --- # PDF support URL: https://platform.claude.com/docs/en/build-with-claude/pdf-support # PDF support Process PDFs with Claude. Extract text, analyze charts, and understand visual content from your documents. --- You can now ask Claude about any text, pictures, charts, and tables in PDFs you provide. Some sample use cases: - Analyzing financial reports and understanding charts/tables - Extracting key information from legal documents - Translation assistance for documents - Converting document information into structured formats ## Before you begin ### Check PDF requirements Claude works with any standard PDF. However, you should ensure your request size meets these requirements when using PDF support: | Requirement | Limit | |------------|--------| | Maximum request size | 32MB | | Maximum pages per request | 100 | | Format | Standard PDF (no passwords/encryption) | Please note that both limits are on the entire request payload, including any other content sent alongside PDFs. Since PDF support relies on Claude's vision capabilities, it is subject to the same [limitations and considerations](/docs/en/build-with-claude/vision#limitations) as other vision tasks. ### Supported platforms and models PDF support is currently supported via direct API access and Google Vertex AI. All [active models](/docs/en/about-claude/models/overview) support PDF processing. PDF support is now available on Amazon Bedrock with the following considerations: ### Amazon Bedrock PDF Support When using PDF support through Amazon Bedrock's Converse API, there are two distinct document processing modes: **Important**: To access Claude's full visual PDF understanding capabilities in the Converse API, you must enable citations. Without citations enabled, the API falls back to basic text extraction only. Learn more about [working with citations](/docs/en/build-with-claude/citations). #### Document Processing Modes 1. **Converse Document Chat** (Original mode - Text extraction only) - Provides basic text extraction from PDFs - Cannot analyze images, charts, or visual layouts within PDFs - Uses approximately 1,000 tokens for a 3-page PDF - Automatically used when citations are not enabled 2. **Claude PDF Chat** (New mode - Full visual understanding) - Provides complete visual analysis of PDFs - Can understand and analyze charts, graphs, images, and visual layouts - Processes each page as both text and image for comprehensive understanding - Uses approximately 7,000 tokens for a 3-page PDF - **Requires citations to be enabled** in the Converse API #### Key Limitations - **Converse API**: Visual PDF analysis requires citations to be enabled. There is currently no option to use visual analysis without citations (unlike the InvokeModel API). - **InvokeModel API**: Provides full control over PDF processing without forced citations. #### Common Issues If customers report that Claude isn't seeing images or charts in their PDFs when using the Converse API, they likely need to enable the citations flag. Without it, Converse falls back to basic text extraction only. This is a known constraint with the Converse API that we're working to address. For applications that require visual PDF analysis without citations, consider using the InvokeModel API instead. For non-PDF files like .csv, .xlsx, .docx, .md, or .txt files, see [Working with other file formats](/docs/en/build-with-claude/files#working-with-other-file-formats). *** ## Process PDFs with Claude ### Send your first PDF request Let's start with a simple example using the Messages API. You can provide PDFs to Claude in three ways: 1. As a URL reference to a PDF hosted online 2. As a base64-encoded PDF in `document` content blocks 3. By a `file_id` from the [Files API](/docs/en/build-with-claude/files) #### Option 1: URL-based PDF document The simplest approach is to reference a PDF directly from a URL: ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [{ "role": "user", "content": [{ "type": "document", "source": { "type": "url", "url": "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf" } }, { "type": "text", "text": "What are the key findings in this document?" }] }] }' ``` ```python Python import anthropic client = anthropic.Anthropic() message = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "document", "source": { "type": "url", "url": "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf" } }, { "type": "text", "text": "What are the key findings in this document?" } ] } ], ) print(message.content) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); async function main() { const response = await anthropic.messages.create({ model: 'claude-sonnet-4-5', max_tokens: 1024, messages: [ { role: 'user', content: [ { type: 'document', source: { type: 'url', url: 'https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf', }, }, { type: 'text', text: 'What are the key findings in this document?', }, ], }, ], }); console.log(response); } main(); ``` ```java Java import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.*; public class PdfExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Create document block with URL DocumentBlockParam documentParam = DocumentBlockParam.builder() .urlPdfSource("https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf") .build(); // Create a message with document and text content blocks MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_20250514) .maxTokens(1024) .addUserMessageOfBlockParams( List.of( ContentBlockParam.ofDocument(documentParam), ContentBlockParam.ofText( TextBlockParam.builder() .text("What are the key findings in this document?") .build() ) ) ) .build(); Message message = client.messages().create(params); System.out.println(message.content()); } } ``` #### Option 2: Base64-encoded PDF document If you need to send PDFs from your local system or when a URL isn't available: ```bash Shell # Method 1: Fetch and encode a remote PDF curl -s "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf" | base64 | tr -d '\n' > pdf_base64.txt # Method 2: Encode a local PDF file # base64 document.pdf | tr -d '\n' > pdf_base64.txt # Create a JSON request file using the pdf_base64.txt content jq -n --rawfile PDF_BASE64 pdf_base64.txt '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [{ "role": "user", "content": [{ "type": "document", "source": { "type": "base64", "media_type": "application/pdf", "data": $PDF_BASE64 } }, { "type": "text", "text": "What are the key findings in this document?" }] }] }' > request.json # Send the API request using the JSON file curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d @request.json ``` ```python Python import anthropic import base64 import httpx # First, load and encode the PDF pdf_url = "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf" pdf_data = base64.standard_b64encode(httpx.get(pdf_url).content).decode("utf-8") # Alternative: Load from a local file # with open("document.pdf", "rb") as f: # pdf_data = base64.standard_b64encode(f.read()).decode("utf-8") # Send to Claude using base64 encoding client = anthropic.Anthropic() message = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "document", "source": { "type": "base64", "media_type": "application/pdf", "data": pdf_data } }, { "type": "text", "text": "What are the key findings in this document?" } ] } ], ) print(message.content) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; import fetch from 'node-fetch'; import fs from 'fs'; async function main() { // Method 1: Fetch and encode a remote PDF const pdfURL = "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf"; const pdfResponse = await fetch(pdfURL); const arrayBuffer = await pdfResponse.arrayBuffer(); const pdfBase64 = Buffer.from(arrayBuffer).toString('base64'); // Method 2: Load from a local file // const pdfBase64 = fs.readFileSync('document.pdf').toString('base64'); // Send the API request with base64-encoded PDF const anthropic = new Anthropic(); const response = await anthropic.messages.create({ model: 'claude-sonnet-4-5', max_tokens: 1024, messages: [ { role: 'user', content: [ { type: 'document', source: { type: 'base64', media_type: 'application/pdf', data: pdfBase64, }, }, { type: 'text', text: 'What are the key findings in this document?', }, ], }, ], }); console.log(response); } main(); ``` ```java Java import java.io.IOException; import java.net.URI; import java.net.http.HttpClient; import java.net.http.HttpRequest; import java.net.http.HttpResponse; import java.util.Base64; import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.ContentBlockParam; import com.anthropic.models.messages.DocumentBlockParam; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.TextBlockParam; public class PdfExample { public static void main(String[] args) throws IOException, InterruptedException { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Method 1: Download and encode a remote PDF String pdfUrl = "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf"; HttpClient httpClient = HttpClient.newHttpClient(); HttpRequest request = HttpRequest.newBuilder() .uri(URI.create(pdfUrl)) .GET() .build(); HttpResponse response = httpClient.send(request, HttpResponse.BodyHandlers.ofByteArray()); String pdfBase64 = Base64.getEncoder().encodeToString(response.body()); // Method 2: Load from a local file // byte[] fileBytes = Files.readAllBytes(Path.of("document.pdf")); // String pdfBase64 = Base64.getEncoder().encodeToString(fileBytes); // Create document block with base64 data DocumentBlockParam documentParam = DocumentBlockParam.builder() .base64PdfSource(pdfBase64) .build(); // Create a message with document and text content blocks MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_20250514) .maxTokens(1024) .addUserMessageOfBlockParams( List.of( ContentBlockParam.ofDocument(documentParam), ContentBlockParam.ofText(TextBlockParam.builder().text("What are the key findings in this document?").build()) ) ) .build(); Message message = client.messages().create(params); message.content().stream() .flatMap(contentBlock -> contentBlock.text().stream()) .forEach(textBlock -> System.out.println(textBlock.text())); } } ``` #### Option 3: Files API For PDFs you'll use repeatedly, or when you want to avoid encoding overhead, use the [Files API](/docs/en/build-with-claude/files): ```bash Shell # First, upload your PDF to the Files API curl -X POST https://api.anthropic.com/v1/files \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" \ -F "file=@document.pdf" # Then use the returned file_id in your message curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [{ "role": "user", "content": [{ "type": "document", "source": { "type": "file", "file_id": "file_abc123" } }, { "type": "text", "text": "What are the key findings in this document?" }] }] }' ``` ```python Python import anthropic client = anthropic.Anthropic() # Upload the PDF file with open("document.pdf", "rb") as f: file_upload = client.beta.files.upload(file=("document.pdf", f, "application/pdf")) # Use the uploaded file in a message message = client.beta.messages.create( model="claude-sonnet-4-5", max_tokens=1024, betas=["files-api-2025-04-14"], messages=[ { "role": "user", "content": [ { "type": "document", "source": { "type": "file", "file_id": file_upload.id } }, { "type": "text", "text": "What are the key findings in this document?" } ] } ], ) print(message.content) ``` ```typescript TypeScript import { Anthropic, toFile } from '@anthropic-ai/sdk'; import fs from 'fs'; const anthropic = new Anthropic(); async function main() { // Upload the PDF file const fileUpload = await anthropic.beta.files.upload({ file: toFile(fs.createReadStream('document.pdf'), undefined, { type: 'application/pdf' }) }, { betas: ['files-api-2025-04-14'] }); // Use the uploaded file in a message const response = await anthropic.beta.messages.create({ model: 'claude-sonnet-4-5', max_tokens: 1024, betas: ['files-api-2025-04-14'], messages: [ { role: 'user', content: [ { type: 'document', source: { type: 'file', file_id: fileUpload.id } }, { type: 'text', text: 'What are the key findings in this document?' } ] } ] }); console.log(response); } main(); ``` ```java Java import java.io.IOException; import java.nio.file.Files; import java.nio.file.Path; import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.File; import com.anthropic.models.files.FileUploadParams; import com.anthropic.models.messages.*; public class PdfFilesExample { public static void main(String[] args) throws IOException { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Upload the PDF file File file = client.beta().files().upload(FileUploadParams.builder() .file(Files.newInputStream(Path.of("document.pdf"))) .build()); // Use the uploaded file in a message DocumentBlockParam documentParam = DocumentBlockParam.builder() .fileSource(file.id()) .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_20250514) .maxTokens(1024) .addUserMessageOfBlockParams( List.of( ContentBlockParam.ofDocument(documentParam), ContentBlockParam.ofText( TextBlockParam.builder() .text("What are the key findings in this document?") .build() ) ) ) .build(); Message message = client.messages().create(params); System.out.println(message.content()); } } ``` ### How PDF support works When you send a PDF to Claude, the following steps occur: - The system converts each page of the document into an image. - The text from each page is extracted and provided alongside each page's image. - Documents are provided as a combination of text and images for analysis. - This allows users to ask for insights on visual elements of a PDF, such as charts, diagrams, and other non-textual content. Claude can reference both textual and visual content when it responds. You can further improve performance by integrating PDF support with: - **Prompt caching**: To improve performance for repeated analysis. - **Batch processing**: For high-volume document processing. - **Tool use**: To extract specific information from documents for use as tool inputs. ### Estimate your costs The token count of a PDF file depends on the total text extracted from the document as well as the number of pages: - Text token costs: Each page typically uses 1,500-3,000 tokens per page depending on content density. Standard API pricing applies with no additional PDF fees. - Image token costs: Since each page is converted into an image, the same [image-based cost calculations](/docs/en/build-with-claude/vision#evaluate-image-size) are applied. You can use [token counting](/docs/en/build-with-claude/token-counting) to estimate costs for your specific PDFs. *** ## Optimize PDF processing ### Improve performance Follow these best practices for optimal results: - Place PDFs before text in your requests - Use standard fonts - Ensure text is clear and legible - Rotate pages to proper upright orientation - Use logical page numbers (from PDF viewer) in prompts - Split large PDFs into chunks when needed - Enable prompt caching for repeated analysis ### Scale your implementation For high-volume processing, consider these approaches: #### Use prompt caching Cache PDFs to improve performance on repeated queries: ```bash Shell # Create a JSON request file using the pdf_base64.txt content jq -n --rawfile PDF_BASE64 pdf_base64.txt '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [{ "role": "user", "content": [{ "type": "document", "source": { "type": "base64", "media_type": "application/pdf", "data": $PDF_BASE64 }, "cache_control": { "type": "ephemeral" } }, { "type": "text", "text": "Which model has the highest human preference win rates across each use-case?" }] }] }' > request.json # Then make the API call using the JSON file curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d @request.json ``` ```python Python message = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "document", "source": { "type": "base64", "media_type": "application/pdf", "data": pdf_data }, "cache_control": {"type": "ephemeral"} }, { "type": "text", "text": "Analyze this document." } ] } ], ) ``` ```typescript TypeScript const response = await anthropic.messages.create({ model: 'claude-sonnet-4-5', max_tokens: 1024, messages: [ { content: [ { type: 'document', source: { media_type: 'application/pdf', type: 'base64', data: pdfBase64, }, cache_control: { type: 'ephemeral' }, }, { type: 'text', text: 'Which model has the highest human preference win rates across each use-case?', }, ], role: 'user', }, ], }); console.log(response); ``` ```java Java import java.io.IOException; import java.nio.file.Files; import java.nio.file.Paths; import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.Base64PdfSource; import com.anthropic.models.messages.CacheControlEphemeral; import com.anthropic.models.messages.ContentBlockParam; import com.anthropic.models.messages.DocumentBlockParam; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.TextBlockParam; public class MessagesDocumentExample { public static void main(String[] args) throws IOException { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Read PDF file as base64 byte[] pdfBytes = Files.readAllBytes(Paths.get("pdf_base64.txt")); String pdfBase64 = new String(pdfBytes); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_20250514) .maxTokens(1024) .addUserMessageOfBlockParams(List.of( ContentBlockParam.ofDocument( DocumentBlockParam.builder() .source(Base64PdfSource.builder() .data(pdfBase64) .build()) .cacheControl(CacheControlEphemeral.builder().build()) .build()), ContentBlockParam.ofText( TextBlockParam.builder() .text("Which model has the highest human preference win rates across each use-case?") .build()) )) .build(); Message message = client.messages().create(params); System.out.println(message); } } ``` #### Process document batches Use the Message Batches API for high-volume workflows: ```bash Shell # Create a JSON request file using the pdf_base64.txt content jq -n --rawfile PDF_BASE64 pdf_base64.txt ' { "requests": [ { "custom_id": "my-first-request", "params": { "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ { "role": "user", "content": [ { "type": "document", "source": { "type": "base64", "media_type": "application/pdf", "data": $PDF_BASE64 } }, { "type": "text", "text": "Which model has the highest human preference win rates across each use-case?" } ] } ] } }, { "custom_id": "my-second-request", "params": { "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ { "role": "user", "content": [ { "type": "document", "source": { "type": "base64", "media_type": "application/pdf", "data": $PDF_BASE64 } }, { "type": "text", "text": "Extract 5 key insights from this document." } ] } ] } } ] } ' > request.json # Then make the API call using the JSON file curl https://api.anthropic.com/v1/messages/batches \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d @request.json ``` ```python Python message_batch = client.messages.batches.create( requests=[ { "custom_id": "doc1", "params": { "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ { "role": "user", "content": [ { "type": "document", "source": { "type": "base64", "media_type": "application/pdf", "data": pdf_data } }, { "type": "text", "text": "Summarize this document." } ] } ] } } ] ) ``` ```typescript TypeScript const response = await anthropic.messages.batches.create({ requests: [ { custom_id: 'my-first-request', params: { max_tokens: 1024, messages: [ { content: [ { type: 'document', source: { media_type: 'application/pdf', type: 'base64', data: pdfBase64, }, }, { type: 'text', text: 'Which model has the highest human preference win rates across each use-case?', }, ], role: 'user', }, ], model: 'claude-sonnet-4-5', }, }, { custom_id: 'my-second-request', params: { max_tokens: 1024, messages: [ { content: [ { type: 'document', source: { media_type: 'application/pdf', type: 'base64', data: pdfBase64, }, }, { type: 'text', text: 'Extract 5 key insights from this document.', }, ], role: 'user', }, ], model: 'claude-sonnet-4-5', }, } ], }); console.log(response); ``` ```java Java import java.io.IOException; import java.nio.file.Files; import java.nio.file.Paths; import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.*; import com.anthropic.models.messages.batches.*; public class MessagesBatchDocumentExample { public static void main(String[] args) throws IOException { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Read PDF file as base64 byte[] pdfBytes = Files.readAllBytes(Paths.get("pdf_base64.txt")); String pdfBase64 = new String(pdfBytes); BatchCreateParams params = BatchCreateParams.builder() .addRequest(BatchCreateParams.Request.builder() .customId("my-first-request") .params(BatchCreateParams.Request.Params.builder() .model(Model.CLAUDE_OPUS_4_20250514) .maxTokens(1024) .addUserMessageOfBlockParams(List.of( ContentBlockParam.ofDocument( DocumentBlockParam.builder() .source(Base64PdfSource.builder() .data(pdfBase64) .build()) .build() ), ContentBlockParam.ofText( TextBlockParam.builder() .text("Which model has the highest human preference win rates across each use-case?") .build() ) )) .build()) .build()) .addRequest(BatchCreateParams.Request.builder() .customId("my-second-request") .params(BatchCreateParams.Request.Params.builder() .model(Model.CLAUDE_OPUS_4_20250514) .maxTokens(1024) .addUserMessageOfBlockParams(List.of( ContentBlockParam.ofDocument( DocumentBlockParam.builder() .source(Base64PdfSource.builder() .data(pdfBase64) .build()) .build() ), ContentBlockParam.ofText( TextBlockParam.builder() .text("Extract 5 key insights from this document.") .build() ) )) .build()) .build()) .build(); MessageBatch batch = client.messages().batches().create(params); System.out.println(batch); } } ``` ## Next steps Explore practical examples of PDF processing in our cookbook recipe. See complete API documentation for PDF support. --- # Prompt caching URL: https://platform.claude.com/docs/en/build-with-claude/prompt-caching # Prompt caching --- Prompt caching is a powerful feature that optimizes your API usage by allowing resuming from specific prefixes in your prompts. This approach significantly reduces processing time and costs for repetitive tasks or prompts with consistent elements. Here's an example of how to implement prompt caching with the Messages API using a `cache_control` block: ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "system": [ { "type": "text", "text": "You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n" }, { "type": "text", "text": "", "cache_control": {"type": "ephemeral"} } ], "messages": [ { "role": "user", "content": "Analyze the major themes in Pride and Prejudice." } ] }' # Call the model again with the same inputs up to the cache checkpoint curl https://api.anthropic.com/v1/messages # rest of input ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, system=[ { "type": "text", "text": "You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n", }, { "type": "text", "text": "", "cache_control": {"type": "ephemeral"} } ], messages=[{"role": "user", "content": "Analyze the major themes in 'Pride and Prejudice'."}], ) print(response.usage.model_dump_json()) # Call the model again with the same inputs up to the cache checkpoint response = client.messages.create(.....) print(response.usage.model_dump_json()) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); const response = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, system: [ { type: "text", text: "You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n", }, { type: "text", text: "", cache_control: { type: "ephemeral" } } ], messages: [ { role: "user", content: "Analyze the major themes in 'Pride and Prejudice'." } ] }); console.log(response.usage); // Call the model again with the same inputs up to the cache checkpoint const new_response = await client.messages.create(...) console.log(new_response.usage); ``` ```java Java import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.CacheControlEphemeral; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.TextBlockParam; public class PromptCachingExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_20250514) .maxTokens(1024) .systemOfTextBlockParams(List.of( TextBlockParam.builder() .text("You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n") .build(), TextBlockParam.builder() .text("") .cacheControl(CacheControlEphemeral.builder().build()) .build() )) .addUserMessage("Analyze the major themes in 'Pride and Prejudice'.") .build(); Message message = client.messages().create(params); System.out.println(message.usage()); } } ``` ```json JSON {"cache_creation_input_tokens":188086,"cache_read_input_tokens":0,"input_tokens":21,"output_tokens":393} {"cache_creation_input_tokens":0,"cache_read_input_tokens":188086,"input_tokens":21,"output_tokens":393} ``` In this example, the entire text of "Pride and Prejudice" is cached using the `cache_control` parameter. This enables reuse of this large text across multiple API calls without reprocessing it each time. Changing only the user message allows you to ask various questions about the book while utilizing the cached content, leading to faster responses and improved efficiency. --- ## How prompt caching works When you send a request with prompt caching enabled: 1. The system checks if a prompt prefix, up to a specified cache breakpoint, is already cached from a recent query. 2. If found, it uses the cached version, reducing processing time and costs. 3. Otherwise, it processes the full prompt and caches the prefix once the response begins. This is especially useful for: - Prompts with many examples - Large amounts of context or background information - Repetitive tasks with consistent instructions - Long multi-turn conversations By default, the cache has a 5-minute lifetime. The cache is refreshed for no additional cost each time the cached content is used. If you find that 5 minutes is too short, Anthropic also offers a 1-hour cache duration [at additional cost](#pricing). For more information, see [1-hour cache duration](#1-hour-cache-duration). **Prompt caching caches the full prefix** Prompt caching references the entire prompt - `tools`, `system`, and `messages` (in that order) up to and including the block designated with `cache_control`. --- ## Pricing Prompt caching introduces a new pricing structure. The table below shows the price per million tokens for each supported model: | Model | Base Input Tokens | 5m Cache Writes | 1h Cache Writes | Cache Hits & Refreshes | Output Tokens | |-------------------|-------------------|-----------------|-----------------|----------------------|---------------| | Claude Opus 4.5 | $5 / MTok | $6.25 / MTok | $10 / MTok | $0.50 / MTok | $25 / MTok | | Claude Opus 4.1 | $15 / MTok | $18.75 / MTok | $30 / MTok | $1.50 / MTok | $75 / MTok | | Claude Opus 4 | $15 / MTok | $18.75 / MTok | $30 / MTok | $1.50 / MTok | $75 / MTok | | Claude Sonnet 4.5 | $3 / MTok | $3.75 / MTok | $6 / MTok | $0.30 / MTok | $15 / MTok | | Claude Sonnet 4 | $3 / MTok | $3.75 / MTok | $6 / MTok | $0.30 / MTok | $15 / MTok | | Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) | $3 / MTok | $3.75 / MTok | $6 / MTok | $0.30 / MTok | $15 / MTok | | Claude Haiku 4.5 | $1 / MTok | $1.25 / MTok | $2 / MTok | $0.10 / MTok | $5 / MTok | | Claude Haiku 3.5 | $0.80 / MTok | $1 / MTok | $1.6 / MTok | $0.08 / MTok | $4 / MTok | | Claude Opus 3 ([deprecated](/docs/en/about-claude/model-deprecations)) | $15 / MTok | $18.75 / MTok | $30 / MTok | $1.50 / MTok | $75 / MTok | | Claude Haiku 3 | $0.25 / MTok | $0.30 / MTok | $0.50 / MTok | $0.03 / MTok | $1.25 / MTok | The table above reflects the following pricing multipliers for prompt caching: - 5-minute cache write tokens are 1.25 times the base input tokens price - 1-hour cache write tokens are 2 times the base input tokens price - Cache read tokens are 0.1 times the base input tokens price --- ## How to implement prompt caching ### Supported models Prompt caching is currently supported on: - Claude Opus 4.5 - Claude Opus 4.1 - Claude Opus 4 - Claude Sonnet 4.5 - Claude Sonnet 4 - Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) - Claude Haiku 4.5 - Claude Haiku 3.5 ([deprecated](/docs/en/about-claude/model-deprecations)) - Claude Haiku 3 ### Structuring your prompt Place static content (tool definitions, system instructions, context, examples) at the beginning of your prompt. Mark the end of the reusable content for caching using the `cache_control` parameter. Cache prefixes are created in the following order: `tools`, `system`, then `messages`. This order forms a hierarchy where each level builds upon the previous ones. #### How automatic prefix checking works You can use just one cache breakpoint at the end of your static content, and the system will automatically find the longest matching sequence of cached blocks. Understanding how this works helps you optimize your caching strategy. **Three core principles:** 1. **Cache keys are cumulative**: When you explicitly cache a block with `cache_control`, the cache hash key is generated by hashing all previous blocks in the conversation sequentially. This means the cache for each block depends on all content that came before it. 2. **Backward sequential checking**: The system checks for cache hits by working backwards from your explicit breakpoint, checking each previous block in reverse order. This ensures you get the longest possible cache hit. 3. **20-block lookback window**: The system only checks up to 20 blocks before each explicit `cache_control` breakpoint. After checking 20 blocks without a match, it stops checking and moves to the next explicit breakpoint (if any). **Example: Understanding the lookback window** Consider a conversation with 30 content blocks where you set `cache_control` only on block 30: - **If you send block 31 with no changes to previous blocks**: The system checks block 30 (match!). You get a cache hit at block 30, and only block 31 needs processing. - **If you modify block 25 and send block 31**: The system checks backwards from block 30 → 29 → 28... → 25 (no match) → 24 (match!). Since block 24 hasn't changed, you get a cache hit at block 24, and only blocks 25-30 need reprocessing. - **If you modify block 5 and send block 31**: The system checks backwards from block 30 → 29 → 28... → 11 (check #20). After 20 checks without finding a match, it stops looking. Since block 5 is beyond the 20-block window, no cache hit occurs and all blocks need reprocessing. However, if you had set an explicit `cache_control` breakpoint on block 5, the system would continue checking from that breakpoint: block 5 (no match) → block 4 (match!). This allows a cache hit at block 4, demonstrating why you should place breakpoints before editable content. **Key takeaway**: Always set an explicit cache breakpoint at the end of your conversation to maximize your chances of cache hits. Additionally, set breakpoints just before content blocks that might be editable to ensure those sections can be cached independently. #### When to use multiple breakpoints You can define up to 4 cache breakpoints if you want to: - Cache different sections that change at different frequencies (e.g., tools rarely change, but context updates daily) - Have more control over exactly what gets cached - Ensure caching for content more than 20 blocks before your final breakpoint - Place breakpoints before editable content to guarantee cache hits even when changes occur beyond the 20-block window **Important limitation**: If your prompt has more than 20 content blocks before your cache breakpoint, and you modify content earlier than those 20 blocks, you won't get a cache hit unless you add additional explicit breakpoints closer to that content. ### Cache limitations The minimum cacheable prompt length is: - 4096 tokens for Claude Opus 4.5 - 1024 tokens for Claude Opus 4.1, Claude Opus 4, Claude Sonnet 4.5, Claude Sonnet 4, and Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) - 4096 tokens for Claude Haiku 4.5 - 2048 tokens for Claude Haiku 3.5 ([deprecated](/docs/en/about-claude/model-deprecations)) and Claude Haiku 3 Shorter prompts cannot be cached, even if marked with `cache_control`. Any requests to cache fewer than this number of tokens will be processed without caching. To see if a prompt was cached, see the response usage [fields](/docs/en/build-with-claude/prompt-caching#tracking-cache-performance). For concurrent requests, note that a cache entry only becomes available after the first response begins. If you need cache hits for parallel requests, wait for the first response before sending subsequent requests. Currently, "ephemeral" is the only supported cache type, which by default has a 5-minute lifetime. ### Understanding cache breakpoint costs **Cache breakpoints themselves don't add any cost.** You are only charged for: - **Cache writes**: When new content is written to the cache (25% more than base input tokens for 5-minute TTL) - **Cache reads**: When cached content is used (10% of base input token price) - **Regular input tokens**: For any uncached content Adding more `cache_control` breakpoints doesn't increase your costs - you still pay the same amount based on what content is actually cached and read. The breakpoints simply give you control over what sections can be cached independently. ### What can be cached Most blocks in the request can be designated for caching with `cache_control`. This includes: - Tools: Tool definitions in the `tools` array - System messages: Content blocks in the `system` array - Text messages: Content blocks in the `messages.content` array, for both user and assistant turns - Images & Documents: Content blocks in the `messages.content` array, in user turns - Tool use and tool results: Content blocks in the `messages.content` array, in both user and assistant turns Each of these elements can be marked with `cache_control` to enable caching for that portion of the request. ### What cannot be cached While most request blocks can be cached, there are some exceptions: - Thinking blocks cannot be cached directly with `cache_control`. However, thinking blocks CAN be cached alongside other content when they appear in previous assistant turns. When cached this way, they DO count as input tokens when read from cache. - Sub-content blocks (like [citations](/docs/en/build-with-claude/citations)) themselves cannot be cached directly. Instead, cache the top-level block. In the case of citations, the top-level document content blocks that serve as the source material for citations can be cached. This allows you to use prompt caching with citations effectively by caching the documents that citations will reference. - Empty text blocks cannot be cached. ### What invalidates the cache Modifications to cached content can invalidate some or all of the cache. As described in [Structuring your prompt](#structuring-your-prompt), the cache follows the hierarchy: `tools` → `system` → `messages`. Changes at each level invalidate that level and all subsequent levels. The following table shows which parts of the cache are invalidated by different types of changes. ✘ indicates that the cache is invalidated, while ✓ indicates that the cache remains valid. | What changes | Tools cache | System cache | Messages cache | Impact | |------------|------------------|---------------|----------------|-------------| | **Tool definitions** | ✘ | ✘ | ✘ | Modifying tool definitions (names, descriptions, parameters) invalidates the entire cache | | **Web search toggle** | ✓ | ✘ | ✘ | Enabling/disabling web search modifies the system prompt | | **Citations toggle** | ✓ | ✘ | ✘ | Enabling/disabling citations modifies the system prompt | | **Tool choice** | ✓ | ✓ | ✘ | Changes to `tool_choice` parameter only affect message blocks | | **Images** | ✓ | ✓ | ✘ | Adding/removing images anywhere in the prompt affects message blocks | | **Thinking parameters** | ✓ | ✓ | ✘ | Changes to extended thinking settings (enable/disable, budget) affect message blocks | | **Non-tool results passed to extended thinking requests** | ✓ | ✓ | ✘ | When non-tool results are passed in requests while extended thinking is enabled, all previously-cached thinking blocks are stripped from context, and any messages in context that follow those thinking blocks are removed from the cache. For more details, see [Caching with thinking blocks](#caching-with-thinking-blocks). | ### Tracking cache performance Monitor cache performance using these API response fields, within `usage` in the response (or `message_start` event if [streaming](/docs/en/build-with-claude/streaming)): - `cache_creation_input_tokens`: Number of tokens written to the cache when creating a new entry. - `cache_read_input_tokens`: Number of tokens retrieved from the cache for this request. - `input_tokens`: Number of input tokens which were not read from or used to create a cache (i.e., tokens after the last cache breakpoint). **Understanding the token breakdown** The `input_tokens` field represents only the tokens that come **after the last cache breakpoint** in your request - not all the input tokens you sent. To calculate total input tokens: ``` total_input_tokens = cache_read_input_tokens + cache_creation_input_tokens + input_tokens ``` **Spatial explanation:** - `cache_read_input_tokens` = tokens before breakpoint already cached (reads) - `cache_creation_input_tokens` = tokens before breakpoint being cached now (writes) - `input_tokens` = tokens after your last breakpoint (not eligible for cache) **Example:** If you have a request with 100,000 tokens of cached content (read from cache), 0 tokens of new content being cached, and 50 tokens in your user message (after the cache breakpoint): - `cache_read_input_tokens`: 100,000 - `cache_creation_input_tokens`: 0 - `input_tokens`: 50 - **Total input tokens processed**: 100,050 tokens This is important for understanding both costs and rate limits, as `input_tokens` will typically be much smaller than your total input when using caching effectively. ### Best practices for effective caching To optimize prompt caching performance: - Cache stable, reusable content like system instructions, background information, large contexts, or frequent tool definitions. - Place cached content at the prompt's beginning for best performance. - Use cache breakpoints strategically to separate different cacheable prefix sections. - Set cache breakpoints at the end of conversations and just before editable content to maximize cache hit rates, especially when working with prompts that have more than 20 content blocks. - Regularly analyze cache hit rates and adjust your strategy as needed. ### Optimizing for different use cases Tailor your prompt caching strategy to your scenario: - Conversational agents: Reduce cost and latency for extended conversations, especially those with long instructions or uploaded documents. - Coding assistants: Improve autocomplete and codebase Q&A by keeping relevant sections or a summarized version of the codebase in the prompt. - Large document processing: Incorporate complete long-form material including images in your prompt without increasing response latency. - Detailed instruction sets: Share extensive lists of instructions, procedures, and examples to fine-tune Claude's responses. Developers often include an example or two in the prompt, but with prompt caching you can get even better performance by including 20+ diverse examples of high quality answers. - Agentic tool use: Enhance performance for scenarios involving multiple tool calls and iterative code changes, where each step typically requires a new API call. - Talk to books, papers, documentation, podcast transcripts, and other longform content: Bring any knowledge base alive by embedding the entire document(s) into the prompt, and letting users ask it questions. ### Troubleshooting common issues If experiencing unexpected behavior: - Ensure cached sections are identical and marked with cache_control in the same locations across calls - Check that calls are made within the cache lifetime (5 minutes by default) - Verify that `tool_choice` and image usage remain consistent between calls - Validate that you are caching at least the minimum number of tokens - The system automatically checks for cache hits at previous content block boundaries (up to ~20 blocks before your breakpoint). For prompts with more than 20 content blocks, you may need additional `cache_control` parameters earlier in the prompt to ensure all content can be cached - Verify that the keys in your `tool_use` content blocks have stable ordering as some languages (e.g. Swift, Go) randomize key order during JSON conversion, breaking caches Changes to `tool_choice` or the presence/absence of images anywhere in the prompt will invalidate the cache, requiring a new cache entry to be created. For more details on cache invalidation, see [What invalidates the cache](#what-invalidates-the-cache). ### Caching with thinking blocks When using [extended thinking](/docs/en/build-with-claude/extended-thinking) with prompt caching, thinking blocks have special behavior: **Automatic caching alongside other content**: While thinking blocks cannot be explicitly marked with `cache_control`, they get cached as part of the request content when you make subsequent API calls with tool results. This commonly happens during tool use when you pass thinking blocks back to continue the conversation. **Input token counting**: When thinking blocks are read from cache, they count as input tokens in your usage metrics. This is important for cost calculation and token budgeting. **Cache invalidation patterns**: - Cache remains valid when only tool results are provided as user messages - Cache gets invalidated when non-tool-result user content is added, causing all previous thinking blocks to be stripped - This caching behavior occurs even without explicit `cache_control` markers For more details on cache invalidation, see [What invalidates the cache](#what-invalidates-the-cache). **Example with tool use**: ``` Request 1: User: "What's the weather in Paris?" Response: [thinking_block_1] + [tool_use block 1] Request 2: User: ["What's the weather in Paris?"], Assistant: [thinking_block_1] + [tool_use block 1], User: [tool_result_1, cache=True] Response: [thinking_block_2] + [text block 2] # Request 2 caches its request content (not the response) # The cache includes: user message, thinking_block_1, tool_use block 1, and tool_result_1 Request 3: User: ["What's the weather in Paris?"], Assistant: [thinking_block_1] + [tool_use block 1], User: [tool_result_1, cache=True], Assistant: [thinking_block_2] + [text block 2], User: [Text response, cache=True] # Non-tool-result user block causes all thinking blocks to be ignored # This request is processed as if thinking blocks were never present ``` When a non-tool-result user block is included, it designates a new assistant loop and all previous thinking blocks are removed from context. For more detailed information, see the [extended thinking documentation](/docs/en/build-with-claude/extended-thinking#understanding-thinking-block-caching-behavior). --- ## Cache storage and sharing Starting February 5, 2026, prompt caching will use workspace-level isolation instead of organization-level isolation. Caches will be isolated per workspace, ensuring data separation between workspaces within the same organization. This change applies to the Claude API and Azure; Amazon Bedrock and Google Vertex AI will maintain organization-level cache isolation. If you use multiple workspaces, review your caching strategy to account for this change. - **Organization Isolation**: Caches are isolated between organizations. Different organizations never share caches, even if they use identical prompts. - **Exact Matching**: Cache hits require 100% identical prompt segments, including all text and images up to and including the block marked with cache control. - **Output Token Generation**: Prompt caching has no effect on output token generation. The response you receive will be identical to what you would get if prompt caching was not used. --- ## 1-hour cache duration If you find that 5 minutes is too short, Anthropic also offers a 1-hour cache duration [at additional cost](#pricing). To use the extended cache, include `ttl` in the `cache_control` definition like this: ```json "cache_control": { "type": "ephemeral", "ttl": "5m" | "1h" } ``` The response will include detailed cache information like the following: ```json { "usage": { "input_tokens": ..., "cache_read_input_tokens": ..., "cache_creation_input_tokens": ..., "output_tokens": ..., "cache_creation": { "ephemeral_5m_input_tokens": 456, "ephemeral_1h_input_tokens": 100, } } } ``` Note that the current `cache_creation_input_tokens` field equals the sum of the values in the `cache_creation` object. ### When to use the 1-hour cache If you have prompts that are used at a regular cadence (i.e., system prompts that are used more frequently than every 5 minutes), continue to use the 5-minute cache, since this will continue to be refreshed at no additional charge. The 1-hour cache is best used in the following scenarios: - When you have prompts that are likely used less frequently than 5 minutes, but more frequently than every hour. For example, when an agentic side-agent will take longer than 5 minutes, or when storing a long chat conversation with a user and you generally expect that user may not respond in the next 5 minutes. - When latency is important and your follow up prompts may be sent beyond 5 minutes. - When you want to improve your rate limit utilization, since cache hits are not deducted against your rate limit. The 5-minute and 1-hour cache behave the same with respect to latency. You will generally see improved time-to-first-token for long documents. ### Mixing different TTLs You can use both 1-hour and 5-minute cache controls in the same request, but with an important constraint: Cache entries with longer TTL must appear before shorter TTLs (i.e., a 1-hour cache entry must appear before any 5-minute cache entries). When mixing TTLs, we determine three billing locations in your prompt: 1. Position `A`: The token count at the highest cache hit (or 0 if no hits). 2. Position `B`: The token count at the highest 1-hour `cache_control` block after `A` (or equals `A` if none exist). 3. Position `C`: The token count at the last `cache_control` block. If `B` and/or `C` are larger than `A`, they will necessarily be cache misses, because `A` is the highest cache hit. You'll be charged for: 1. Cache read tokens for `A`. 2. 1-hour cache write tokens for `(B - A)`. 3. 5-minute cache write tokens for `(C - B)`. Here are 3 examples. This depicts the input tokens of 3 requests, each of which has different cache hits and cache misses. Each has a different calculated pricing, shown in the colored boxes, as a result. ![Mixing TTLs Diagram](/docs/images/prompt-cache-mixed-ttl.svg) --- ## Prompt caching examples To help you get started with prompt caching, we've prepared a [prompt caching cookbook](https://platform.claude.com/cookbook/misc-prompt-caching) with detailed examples and best practices. Below, we've included several code snippets that showcase various prompt caching patterns. These examples demonstrate how to implement caching in different scenarios, helping you understand the practical applications of this feature:
```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "system": [ { "type": "text", "text": "You are an AI assistant tasked with analyzing legal documents." }, { "type": "text", "text": "Here is the full text of a complex legal agreement: [Insert full text of a 50-page legal agreement here]", "cache_control": {"type": "ephemeral"} } ], "messages": [ { "role": "user", "content": "What are the key terms and conditions in this agreement?" } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, system=[ { "type": "text", "text": "You are an AI assistant tasked with analyzing legal documents." }, { "type": "text", "text": "Here is the full text of a complex legal agreement: [Insert full text of a 50-page legal agreement here]", "cache_control": {"type": "ephemeral"} } ], messages=[ { "role": "user", "content": "What are the key terms and conditions in this agreement?" } ] ) print(response.model_dump_json()) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); const response = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, system: [ { "type": "text", "text": "You are an AI assistant tasked with analyzing legal documents." }, { "type": "text", "text": "Here is the full text of a complex legal agreement: [Insert full text of a 50-page legal agreement here]", "cache_control": {"type": "ephemeral"} } ], messages: [ { "role": "user", "content": "What are the key terms and conditions in this agreement?" } ] }); console.log(response); ``` ```java Java import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.CacheControlEphemeral; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.TextBlockParam; public class LegalDocumentAnalysisExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_20250514) .maxTokens(1024) .systemOfTextBlockParams(List.of( TextBlockParam.builder() .text("You are an AI assistant tasked with analyzing legal documents.") .build(), TextBlockParam.builder() .text("Here is the full text of a complex legal agreement: [Insert full text of a 50-page legal agreement here]") .cacheControl(CacheControlEphemeral.builder().build()) .build() )) .addUserMessage("What are the key terms and conditions in this agreement?") .build(); Message message = client.messages().create(params); System.out.println(message); } } ``` This example demonstrates basic prompt caching usage, caching the full text of the legal agreement as a prefix while keeping the user instruction uncached. For the first request: - `input_tokens`: Number of tokens in the user message only - `cache_creation_input_tokens`: Number of tokens in the entire system message, including the legal document - `cache_read_input_tokens`: 0 (no cache hit on first request) For subsequent requests within the cache lifetime: - `input_tokens`: Number of tokens in the user message only - `cache_creation_input_tokens`: 0 (no new cache creation) - `cache_read_input_tokens`: Number of tokens in the entire cached system message
```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "tools": [ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either celsius or fahrenheit" } }, "required": ["location"] } }, # many more tools { "name": "get_time", "description": "Get the current time in a given time zone", "input_schema": { "type": "object", "properties": { "timezone": { "type": "string", "description": "The IANA time zone name, e.g. America/Los_Angeles" } }, "required": ["timezone"] }, "cache_control": {"type": "ephemeral"} } ], "messages": [ { "role": "user", "content": "What is the weather and time in New York?" } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either 'celsius' or 'fahrenheit'" } }, "required": ["location"] }, }, # many more tools { "name": "get_time", "description": "Get the current time in a given time zone", "input_schema": { "type": "object", "properties": { "timezone": { "type": "string", "description": "The IANA time zone name, e.g. America/Los_Angeles" } }, "required": ["timezone"] }, "cache_control": {"type": "ephemeral"} } ], messages=[ { "role": "user", "content": "What's the weather and time in New York?" } ] ) print(response.model_dump_json()) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); const response = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, tools=[ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either 'celsius' or 'fahrenheit'" } }, "required": ["location"] }, }, // many more tools { "name": "get_time", "description": "Get the current time in a given time zone", "input_schema": { "type": "object", "properties": { "timezone": { "type": "string", "description": "The IANA time zone name, e.g. America/Los_Angeles" } }, "required": ["timezone"] }, "cache_control": {"type": "ephemeral"} } ], messages: [ { "role": "user", "content": "What's the weather and time in New York?" } ] }); console.log(response); ``` ```java Java import java.util.List; import java.util.Map; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.JsonValue; import com.anthropic.models.messages.CacheControlEphemeral; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.Tool; import com.anthropic.models.messages.Tool.InputSchema; public class ToolsWithCacheControlExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Weather tool schema InputSchema weatherSchema = InputSchema.builder() .properties(JsonValue.from(Map.of( "location", Map.of( "type", "string", "description", "The city and state, e.g. San Francisco, CA" ), "unit", Map.of( "type", "string", "enum", List.of("celsius", "fahrenheit"), "description", "The unit of temperature, either celsius or fahrenheit" ) ))) .putAdditionalProperty("required", JsonValue.from(List.of("location"))) .build(); // Time tool schema InputSchema timeSchema = InputSchema.builder() .properties(JsonValue.from(Map.of( "timezone", Map.of( "type", "string", "description", "The IANA time zone name, e.g. America/Los_Angeles" ) ))) .putAdditionalProperty("required", JsonValue.from(List.of("timezone"))) .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_20250514) .maxTokens(1024) .addTool(Tool.builder() .name("get_weather") .description("Get the current weather in a given location") .inputSchema(weatherSchema) .build()) .addTool(Tool.builder() .name("get_time") .description("Get the current time in a given time zone") .inputSchema(timeSchema) .cacheControl(CacheControlEphemeral.builder().build()) .build()) .addUserMessage("What is the weather and time in New York?") .build(); Message message = client.messages().create(params); System.out.println(message); } } ``` In this example, we demonstrate caching tool definitions. The `cache_control` parameter is placed on the final tool (`get_time`) to designate all of the tools as part of the static prefix. This means that all tool definitions, including `get_weather` and any other tools defined before `get_time`, will be cached as a single prefix. This approach is useful when you have a consistent set of tools that you want to reuse across multiple requests without re-processing them each time. For the first request: - `input_tokens`: Number of tokens in the user message - `cache_creation_input_tokens`: Number of tokens in all tool definitions and system prompt - `cache_read_input_tokens`: 0 (no cache hit on first request) For subsequent requests within the cache lifetime: - `input_tokens`: Number of tokens in the user message - `cache_creation_input_tokens`: 0 (no new cache creation) - `cache_read_input_tokens`: Number of tokens in all cached tool definitions and system prompt
```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "system": [ { "type": "text", "text": "...long system prompt", "cache_control": {"type": "ephemeral"} } ], "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Hello, can you tell me more about the solar system?", } ] }, { "role": "assistant", "content": "Certainly! The solar system is the collection of celestial bodies that orbit our Sun. It consists of eight planets, numerous moons, asteroids, comets, and other objects. The planets, in order from closest to farthest from the Sun, are: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune. Each planet has its own unique characteristics and features. Is there a specific aspect of the solar system you would like to know more about?" }, { "role": "user", "content": [ { "type": "text", "text": "Good to know." }, { "type": "text", "text": "Tell me more about Mars.", "cache_control": {"type": "ephemeral"} } ] } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, system=[ { "type": "text", "text": "...long system prompt", "cache_control": {"type": "ephemeral"} } ], messages=[ # ...long conversation so far { "role": "user", "content": [ { "type": "text", "text": "Hello, can you tell me more about the solar system?", } ] }, { "role": "assistant", "content": "Certainly! The solar system is the collection of celestial bodies that orbit our Sun. It consists of eight planets, numerous moons, asteroids, comets, and other objects. The planets, in order from closest to farthest from the Sun, are: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune. Each planet has its own unique characteristics and features. Is there a specific aspect of the solar system you'd like to know more about?" }, { "role": "user", "content": [ { "type": "text", "text": "Good to know." }, { "type": "text", "text": "Tell me more about Mars.", "cache_control": {"type": "ephemeral"} } ] } ] ) print(response.model_dump_json()) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); const response = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, system=[ { "type": "text", "text": "...long system prompt", "cache_control": {"type": "ephemeral"} } ], messages=[ // ...long conversation so far { "role": "user", "content": [ { "type": "text", "text": "Hello, can you tell me more about the solar system?", } ] }, { "role": "assistant", "content": "Certainly! The solar system is the collection of celestial bodies that orbit our Sun. It consists of eight planets, numerous moons, asteroids, comets, and other objects. The planets, in order from closest to farthest from the Sun, are: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune. Each planet has its own unique characteristics and features. Is there a specific aspect of the solar system you'd like to know more about?" }, { "role": "user", "content": [ { "type": "text", "text": "Good to know." }, { "type": "text", "text": "Tell me more about Mars.", "cache_control": {"type": "ephemeral"} } ] } ] }); console.log(response); ``` ```java Java import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.CacheControlEphemeral; import com.anthropic.models.messages.ContentBlockParam; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.TextBlockParam; public class ConversationWithCacheControlExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Create ephemeral system prompt TextBlockParam systemPrompt = TextBlockParam.builder() .text("...long system prompt") .cacheControl(CacheControlEphemeral.builder().build()) .build(); // Create message params MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_20250514) .maxTokens(1024) .systemOfTextBlockParams(List.of(systemPrompt)) // First user message (without cache control) .addUserMessage("Hello, can you tell me more about the solar system?") // Assistant response .addAssistantMessage("Certainly! The solar system is the collection of celestial bodies that orbit our Sun. It consists of eight planets, numerous moons, asteroids, comets, and other objects. The planets, in order from closest to farthest from the Sun, are: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune. Each planet has its own unique characteristics and features. Is there a specific aspect of the solar system you would like to know more about?") // Second user message (with cache control) .addUserMessageOfBlockParams(List.of( ContentBlockParam.ofText(TextBlockParam.builder() .text("Good to know.") .build()), ContentBlockParam.ofText(TextBlockParam.builder() .text("Tell me more about Mars.") .cacheControl(CacheControlEphemeral.builder().build()) .build()) )) .build(); Message message = client.messages().create(params); System.out.println(message); } } ``` In this example, we demonstrate how to use prompt caching in a multi-turn conversation. During each turn, we mark the final block of the final message with `cache_control` so the conversation can be incrementally cached. The system will automatically lookup and use the longest previously cached sequence of blocks for follow-up messages. That is, blocks that were previously marked with a `cache_control` block are later not marked with this, but they will still be considered a cache hit (and also a cache refresh!) if they are hit within 5 minutes. In addition, note that the `cache_control` parameter is placed on the system message. This is to ensure that if this gets evicted from the cache (after not being used for more than 5 minutes), it will get added back to the cache on the next request. This approach is useful for maintaining context in ongoing conversations without repeatedly processing the same information. When this is set up properly, you should see the following in the usage response of each request: - `input_tokens`: Number of tokens in the new user message (will be minimal) - `cache_creation_input_tokens`: Number of tokens in the new assistant and user turns - `cache_read_input_tokens`: Number of tokens in the conversation up to the previous turn
```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "tools": [ { "name": "search_documents", "description": "Search through the knowledge base", "input_schema": { "type": "object", "properties": { "query": { "type": "string", "description": "Search query" } }, "required": ["query"] } }, { "name": "get_document", "description": "Retrieve a specific document by ID", "input_schema": { "type": "object", "properties": { "doc_id": { "type": "string", "description": "Document ID" } }, "required": ["doc_id"] }, "cache_control": {"type": "ephemeral"} } ], "system": [ { "type": "text", "text": "You are a helpful research assistant with access to a document knowledge base.\n\n# Instructions\n- Always search for relevant documents before answering\n- Provide citations for your sources\n- Be objective and accurate in your responses\n- If multiple documents contain relevant information, synthesize them\n- Acknowledge when information is not available in the knowledge base", "cache_control": {"type": "ephemeral"} }, { "type": "text", "text": "# Knowledge Base Context\n\nHere are the relevant documents for this conversation:\n\n## Document 1: Solar System Overview\nThe solar system consists of the Sun and all objects that orbit it...\n\n## Document 2: Planetary Characteristics\nEach planet has unique features. Mercury is the smallest planet...\n\n## Document 3: Mars Exploration\nMars has been a target of exploration for decades...\n\n[Additional documents...]", "cache_control": {"type": "ephemeral"} } ], "messages": [ { "role": "user", "content": "Can you search for information about Mars rovers?" }, { "role": "assistant", "content": [ { "type": "tool_use", "id": "tool_1", "name": "search_documents", "input": {"query": "Mars rovers"} } ] }, { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "tool_1", "content": "Found 3 relevant documents: Document 3 (Mars Exploration), Document 7 (Rover Technology), Document 9 (Mission History)" } ] }, { "role": "assistant", "content": [ { "type": "text", "text": "I found 3 relevant documents about Mars rovers. Let me get more details from the Mars Exploration document." } ] }, { "role": "user", "content": [ { "type": "text", "text": "Yes, please tell me about the Perseverance rover specifically.", "cache_control": {"type": "ephemeral"} } ] } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "name": "search_documents", "description": "Search through the knowledge base", "input_schema": { "type": "object", "properties": { "query": { "type": "string", "description": "Search query" } }, "required": ["query"] } }, { "name": "get_document", "description": "Retrieve a specific document by ID", "input_schema": { "type": "object", "properties": { "doc_id": { "type": "string", "description": "Document ID" } }, "required": ["doc_id"] }, "cache_control": {"type": "ephemeral"} } ], system=[ { "type": "text", "text": "You are a helpful research assistant with access to a document knowledge base.\n\n# Instructions\n- Always search for relevant documents before answering\n- Provide citations for your sources\n- Be objective and accurate in your responses\n- If multiple documents contain relevant information, synthesize them\n- Acknowledge when information is not available in the knowledge base", "cache_control": {"type": "ephemeral"} }, { "type": "text", "text": "# Knowledge Base Context\n\nHere are the relevant documents for this conversation:\n\n## Document 1: Solar System Overview\nThe solar system consists of the Sun and all objects that orbit it...\n\n## Document 2: Planetary Characteristics\nEach planet has unique features. Mercury is the smallest planet...\n\n## Document 3: Mars Exploration\nMars has been a target of exploration for decades...\n\n[Additional documents...]", "cache_control": {"type": "ephemeral"} } ], messages=[ { "role": "user", "content": "Can you search for information about Mars rovers?" }, { "role": "assistant", "content": [ { "type": "tool_use", "id": "tool_1", "name": "search_documents", "input": {"query": "Mars rovers"} } ] }, { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "tool_1", "content": "Found 3 relevant documents: Document 3 (Mars Exploration), Document 7 (Rover Technology), Document 9 (Mission History)" } ] }, { "role": "assistant", "content": [ { "type": "text", "text": "I found 3 relevant documents about Mars rovers. Let me get more details from the Mars Exploration document." } ] }, { "role": "user", "content": [ { "type": "text", "text": "Yes, please tell me about the Perseverance rover specifically.", "cache_control": {"type": "ephemeral"} } ] } ] ) print(response.model_dump_json()) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); const response = await client.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, tools: [ { name: "search_documents", description: "Search through the knowledge base", input_schema: { type: "object", properties: { query: { type: "string", description: "Search query" } }, required: ["query"] } }, { name: "get_document", description: "Retrieve a specific document by ID", input_schema: { type: "object", properties: { doc_id: { type: "string", description: "Document ID" } }, required: ["doc_id"] }, cache_control: { type: "ephemeral" } } ], system: [ { type: "text", text: "You are a helpful research assistant with access to a document knowledge base.\n\n# Instructions\n- Always search for relevant documents before answering\n- Provide citations for your sources\n- Be objective and accurate in your responses\n- If multiple documents contain relevant information, synthesize them\n- Acknowledge when information is not available in the knowledge base", cache_control: { type: "ephemeral" } }, { type: "text", text: "# Knowledge Base Context\n\nHere are the relevant documents for this conversation:\n\n## Document 1: Solar System Overview\nThe solar system consists of the Sun and all objects that orbit it...\n\n## Document 2: Planetary Characteristics\nEach planet has unique features. Mercury is the smallest planet...\n\n## Document 3: Mars Exploration\nMars has been a target of exploration for decades...\n\n[Additional documents...]", cache_control: { type: "ephemeral" } } ], messages: [ { role: "user", content: "Can you search for information about Mars rovers?" }, { role: "assistant", content: [ { type: "tool_use", id: "tool_1", name: "search_documents", input: { query: "Mars rovers" } } ] }, { role: "user", content: [ { type: "tool_result", tool_use_id: "tool_1", content: "Found 3 relevant documents: Document 3 (Mars Exploration), Document 7 (Rover Technology), Document 9 (Mission History)" } ] }, { role: "assistant", content: [ { type: "text", text: "I found 3 relevant documents about Mars rovers. Let me get more details from the Mars Exploration document." } ] }, { role: "user", content: [ { type: "text", text: "Yes, please tell me about the Perseverance rover specifically.", cache_control: { type: "ephemeral" } } ] } ] }); console.log(response); ``` ```java Java import java.util.List; import java.util.Map; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.JsonValue; import com.anthropic.models.messages.CacheControlEphemeral; import com.anthropic.models.messages.ContentBlockParam; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.TextBlockParam; import com.anthropic.models.messages.Tool; import com.anthropic.models.messages.Tool.InputSchema; import com.anthropic.models.messages.ToolResultBlockParam; import com.anthropic.models.messages.ToolUseBlockParam; public class MultipleCacheBreakpointsExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Search tool schema InputSchema searchSchema = InputSchema.builder() .properties(JsonValue.from(Map.of( "query", Map.of( "type", "string", "description", "Search query" ) ))) .putAdditionalProperty("required", JsonValue.from(List.of("query"))) .build(); // Get document tool schema InputSchema getDocSchema = InputSchema.builder() .properties(JsonValue.from(Map.of( "doc_id", Map.of( "type", "string", "description", "Document ID" ) ))) .putAdditionalProperty("required", JsonValue.from(List.of("doc_id"))) .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_20250514) .maxTokens(1024) // Tools with cache control on the last one .addTool(Tool.builder() .name("search_documents") .description("Search through the knowledge base") .inputSchema(searchSchema) .build()) .addTool(Tool.builder() .name("get_document") .description("Retrieve a specific document by ID") .inputSchema(getDocSchema) .cacheControl(CacheControlEphemeral.builder().build()) .build()) // System prompts with cache control on instructions and context separately .systemOfTextBlockParams(List.of( TextBlockParam.builder() .text("You are a helpful research assistant with access to a document knowledge base.\n\n# Instructions\n- Always search for relevant documents before answering\n- Provide citations for your sources\n- Be objective and accurate in your responses\n- If multiple documents contain relevant information, synthesize them\n- Acknowledge when information is not available in the knowledge base") .cacheControl(CacheControlEphemeral.builder().build()) .build(), TextBlockParam.builder() .text("# Knowledge Base Context\n\nHere are the relevant documents for this conversation:\n\n## Document 1: Solar System Overview\nThe solar system consists of the Sun and all objects that orbit it...\n\n## Document 2: Planetary Characteristics\nEach planet has unique features. Mercury is the smallest planet...\n\n## Document 3: Mars Exploration\nMars has been a target of exploration for decades...\n\n[Additional documents...]") .cacheControl(CacheControlEphemeral.builder().build()) .build() )) // Conversation history .addUserMessage("Can you search for information about Mars rovers?") .addAssistantMessageOfBlockParams(List.of( ContentBlockParam.ofToolUse(ToolUseBlockParam.builder() .id("tool_1") .name("search_documents") .input(JsonValue.from(Map.of("query", "Mars rovers"))) .build()) )) .addUserMessageOfBlockParams(List.of( ContentBlockParam.ofToolResult(ToolResultBlockParam.builder() .toolUseId("tool_1") .content("Found 3 relevant documents: Document 3 (Mars Exploration), Document 7 (Rover Technology), Document 9 (Mission History)") .build()) )) .addAssistantMessageOfBlockParams(List.of( ContentBlockParam.ofText(TextBlockParam.builder() .text("I found 3 relevant documents about Mars rovers. Let me get more details from the Mars Exploration document.") .build()) )) .addUserMessageOfBlockParams(List.of( ContentBlockParam.ofText(TextBlockParam.builder() .text("Yes, please tell me about the Perseverance rover specifically.") .cacheControl(CacheControlEphemeral.builder().build()) .build()) )) .build(); Message message = client.messages().create(params); System.out.println(message); } } ``` This comprehensive example demonstrates how to use all 4 available cache breakpoints to optimize different parts of your prompt: 1. **Tools cache** (cache breakpoint 1): The `cache_control` parameter on the last tool definition caches all tool definitions. 2. **Reusable instructions cache** (cache breakpoint 2): The static instructions in the system prompt are cached separately. These instructions rarely change between requests. 3. **RAG context cache** (cache breakpoint 3): The knowledge base documents are cached independently, allowing you to update the RAG documents without invalidating the tools or instructions cache. 4. **Conversation history cache** (cache breakpoint 4): The assistant's response is marked with `cache_control` to enable incremental caching of the conversation as it progresses. This approach provides maximum flexibility: - If you only update the final user message, all four cache segments are reused - If you update the RAG documents but keep the same tools and instructions, the first two cache segments are reused - If you change the conversation but keep the same tools, instructions, and documents, the first three segments are reused - Each cache breakpoint can be invalidated independently based on what changes in your application For the first request: - `input_tokens`: Tokens in the final user message - `cache_creation_input_tokens`: Tokens in all cached segments (tools + instructions + RAG documents + conversation history) - `cache_read_input_tokens`: 0 (no cache hits) For subsequent requests with only a new user message: - `input_tokens`: Tokens in the new user message only - `cache_creation_input_tokens`: Any new tokens added to conversation history - `cache_read_input_tokens`: All previously cached tokens (tools + instructions + RAG documents + previous conversation) This pattern is especially powerful for: - RAG applications with large document contexts - Agent systems that use multiple tools - Long-running conversations that need to maintain context - Applications that need to optimize different parts of the prompt independently
--- ## FAQ
**In most cases, a single cache breakpoint at the end of your static content is sufficient.** The system automatically checks for cache hits at all previous content block boundaries (up to 20 blocks before your breakpoint) and uses the longest matching sequence of cached blocks. You only need multiple breakpoints if: - You have more than 20 content blocks before your desired cache point - You want to cache sections that update at different frequencies independently - You need explicit control over what gets cached for cost optimization Example: If you have system instructions (rarely change) and RAG context (changes daily), you might use two breakpoints to cache them separately.
No, cache breakpoints themselves are free. You only pay for: - Writing content to cache (25% more than base input tokens for 5-minute TTL) - Reading from cache (10% of base input token price) - Regular input tokens for uncached content The number of breakpoints doesn't affect pricing - only the amount of content cached and read matters.
The usage response includes three separate input token fields that together represent your total input: ``` total_input_tokens = cache_read_input_tokens + cache_creation_input_tokens + input_tokens ``` - `cache_read_input_tokens`: Tokens retrieved from cache (everything before cache breakpoints that was cached) - `cache_creation_input_tokens`: New tokens being written to cache (at cache breakpoints) - `input_tokens`: Tokens **after the last cache breakpoint** that aren't cached **Important:** `input_tokens` does NOT represent all input tokens - only the portion after your last cache breakpoint. If you have cached content, `input_tokens` will typically be much smaller than your total input. **Example:** With a 200K token document cached and a 50 token user question: - `cache_read_input_tokens`: 200,000 - `cache_creation_input_tokens`: 0 - `input_tokens`: 50 - **Total**: 200,050 tokens This breakdown is critical for understanding both your costs and rate limit usage. See [Tracking cache performance](#tracking-cache-performance) for more details.
The cache's default minimum lifetime (TTL) is 5 minutes. This lifetime is refreshed each time the cached content is used. If you find that 5 minutes is too short, Anthropic also offers a [1-hour cache TTL](#1-hour-cache-duration).
You can define up to 4 cache breakpoints (using `cache_control` parameters) in your prompt.
No, prompt caching is currently only available for Claude Opus 4.5, Claude Opus 4.1, Claude Opus 4, Claude Sonnet 4.5, Claude Sonnet 4, Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)), Claude Haiku 4.5, Claude Haiku 3.5 ([deprecated](/docs/en/about-claude/model-deprecations)), and Claude Haiku 3.
Cached system prompts and tools will be reused when thinking parameters change. However, thinking changes (enabling/disabling or budget changes) will invalidate previously cached prompt prefixes with messages content. For more details on cache invalidation, see [What invalidates the cache](#what-invalidates-the-cache). For more on extended thinking, including its interaction with tool use and prompt caching, see the [extended thinking documentation](/docs/en/build-with-claude/extended-thinking#extended-thinking-and-prompt-caching).
To enable prompt caching, include at least one `cache_control` breakpoint in your API request.
Yes, prompt caching can be used alongside other API features like tool use and vision capabilities. However, changing whether there are images in a prompt or modifying tool use settings will break the cache. For more details on cache invalidation, see [What invalidates the cache](#what-invalidates-the-cache).
Prompt caching introduces a new pricing structure where cache writes cost 25% more than base input tokens, while cache hits cost only 10% of the base input token price.
Currently, there's no way to manually clear the cache. Cached prefixes automatically expire after a minimum of 5 minutes of inactivity.
You can monitor cache performance using the `cache_creation_input_tokens` and `cache_read_input_tokens` fields in the API response.
See [What invalidates the cache](#what-invalidates-the-cache) for more details on cache invalidation, including a list of changes that require creating a new cache entry.
Prompt caching is designed with strong privacy and data separation measures: 1. Cache keys are generated using a cryptographic hash of the prompts up to the cache control point. This means only requests with identical prompts can access a specific cache. 2. Caches are organization-specific. Users within the same organization can access the same cache if they use identical prompts, but caches are not shared across different organizations, even for identical prompts. 3. The caching mechanism is designed to maintain the integrity and privacy of each unique conversation or context. 4. It's safe to use `cache_control` anywhere in your prompts. For cost efficiency, it's better to exclude highly variable parts (e.g., user's arbitrary input) from caching. These measures ensure that prompt caching maintains data privacy and security while offering performance benefits. Note: Starting February 5, 2026, caches will be isolated per workspace instead of per organization. This change applies to the Claude API and Azure. See [Cache storage and sharing](#cache-storage-and-sharing) for details.
Yes, it is possible to use prompt caching with your [Batches API](/docs/en/build-with-claude/batch-processing) requests. However, because asynchronous batch requests can be processed concurrently and in any order, cache hits are provided on a best-effort basis. The [1-hour cache](#1-hour-cache-duration) can help improve your cache hits. The most cost effective way of using it is the following: - Gather a set of message requests that have a shared prefix. - Send a batch request with just a single request that has this shared prefix and a 1-hour cache block. This will get written to the 1-hour cache. - As soon as this is complete, submit the rest of the requests. You will have to monitor the job to know when it completes. This is typically better than using the 5-minute cache simply because it’s common for batch requests to take between 5 minutes and 1 hour to complete. We’re considering ways to improve these cache hit rates and making this process more straightforward.
This error typically appears when you have upgraded your SDK or you are using outdated code examples. Prompt caching is now generally available, so you no longer need the beta prefix. Instead of: ```python Python python client.beta.prompt_caching.messages.create(...) ``` Simply use: ```python Python python client.messages.create(...) ```
This error typically appears when you have upgraded your SDK or you are using outdated code examples. Prompt caching is now generally available, so you no longer need the beta prefix. Instead of: ```typescript TypeScript client.beta.promptCaching.messages.create(...) ``` Simply use: ```typescript client.messages.create(...) ```
--- # Search results URL: https://platform.claude.com/docs/en/build-with-claude/search-results # Search results Enable natural citations for RAG applications by providing search results with source attribution --- Search result content blocks enable natural citations with proper source attribution, bringing web search-quality citations to your custom applications. This feature is particularly powerful for RAG (Retrieval-Augmented Generation) applications where you need Claude to cite sources accurately. The search results feature is available on the following models: - Claude Opus 4.5 (`claude-opus-4-5-20251101`) - Claude Opus 4.1 (`claude-opus-4-1-20250805`) - Claude Opus 4 (`claude-opus-4-20250514`) - Claude Sonnet 4.5 (`claude-sonnet-4-5-20250929`) - Claude Sonnet 4 (`claude-sonnet-4-20250514`) - Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) (`claude-3-7-sonnet-20250219`) - Claude Haiku 4.5 (`claude-haiku-4-5-20251001`) - Claude Haiku 3.5 ([deprecated](/docs/en/about-claude/model-deprecations)) (`claude-3-5-haiku-20241022`) ## Key benefits - **Natural citations** - Achieve the same citation quality as web search for any content - **Flexible integration** - Use in tool returns for dynamic RAG or as top-level content for pre-fetched data - **Proper source attribution** - Each result includes source and title information for clear attribution - **No document workarounds needed** - Eliminates the need for document-based workarounds - **Consistent citation format** - Matches the citation quality and format of Claude's web search functionality ## How it works Search results can be provided in two ways: 1. **From tool calls** - Your custom tools return search results, enabling dynamic RAG applications 2. **As top-level content** - You provide search results directly in user messages for pre-fetched or cached content In both cases, Claude can automatically cite information from the search results with proper source attribution. ### Search result schema Search results use the following structure: ```json { "type": "search_result", "source": "https://example.com/article", // Required: Source URL or identifier "title": "Article Title", // Required: Title of the result "content": [ // Required: Array of text blocks { "type": "text", "text": "The actual content of the search result..." } ], "citations": { // Optional: Citation configuration "enabled": true // Enable/disable citations for this result } } ``` ### Required fields | Field | Type | Description | |-------|------|-------------| | `type` | string | Must be `"search_result"` | | `source` | string | The source URL or identifier for the content | | `title` | string | A descriptive title for the search result | | `content` | array | An array of text blocks containing the actual content | ### Optional fields | Field | Type | Description | |-------|------|-------------| | `citations` | object | Citation configuration with `enabled` boolean field | | `cache_control` | object | Cache control settings (e.g., `{"type": "ephemeral"}`) | Each item in the `content` array must be a text block with: - `type`: Must be `"text"` - `text`: The actual text content (non-empty string) ## Method 1: Search results from tool calls The most powerful use case is returning search results from your custom tools. This enables dynamic RAG applications where tools fetch and return relevant content with automatic citations. ### Example: Knowledge base tool ```python Python from anthropic import Anthropic from anthropic.types import ( MessageParam, TextBlockParam, SearchResultBlockParam, ToolResultBlockParam ) client = Anthropic() # Define a knowledge base search tool knowledge_base_tool = { "name": "search_knowledge_base", "description": "Search the company knowledge base for information", "input_schema": { "type": "object", "properties": { "query": { "type": "string", "description": "The search query" } }, "required": ["query"] } } # Function to handle the tool call def search_knowledge_base(query): # Your search logic here # Returns search results in the correct format return [ SearchResultBlockParam( type="search_result", source="https://docs.company.com/product-guide", title="Product Configuration Guide", content=[ TextBlockParam( type="text", text="To configure the product, navigate to Settings > Configuration. The default timeout is 30 seconds, but can be adjusted between 10-120 seconds based on your needs." ) ], citations={"enabled": True} ), SearchResultBlockParam( type="search_result", source="https://docs.company.com/troubleshooting", title="Troubleshooting Guide", content=[ TextBlockParam( type="text", text="If you encounter timeout errors, first check the configuration settings. Common causes include network latency and incorrect timeout values." ) ], citations={"enabled": True} ) ] # Create a message with the tool response = client.messages.create( model="claude-sonnet-4-5", # Works with all supported models max_tokens=1024, tools=[knowledge_base_tool], messages=[ MessageParam( role="user", content="How do I configure the timeout settings?" ) ] ) # When Claude calls the tool, provide the search results if response.content[0].type == "tool_use": tool_result = search_knowledge_base(response.content[0].input["query"]) # Send the tool result back final_response = client.messages.create( model="claude-sonnet-4-5", # Works with all supported models max_tokens=1024, messages=[ MessageParam(role="user", content="How do I configure the timeout settings?"), MessageParam(role="assistant", content=response.content), MessageParam( role="user", content=[ ToolResultBlockParam( type="tool_result", tool_use_id=response.content[0].id, content=tool_result # Search results go here ) ] ) ] ) ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); // Define a knowledge base search tool const knowledgeBaseTool = { name: "search_knowledge_base", description: "Search the company knowledge base for information", input_schema: { type: "object", properties: { query: { type: "string", description: "The search query" } }, required: ["query"] } }; // Function to handle the tool call function searchKnowledgeBase(query: string) { // Your search logic here // Returns search results in the correct format return [ { type: "search_result" as const, source: "https://docs.company.com/product-guide", title: "Product Configuration Guide", content: [ { type: "text" as const, text: "To configure the product, navigate to Settings > Configuration. The default timeout is 30 seconds, but can be adjusted between 10-120 seconds based on your needs." } ], citations: { enabled: true } }, { type: "search_result" as const, source: "https://docs.company.com/troubleshooting", title: "Troubleshooting Guide", content: [ { type: "text" as const, text: "If you encounter timeout errors, first check the configuration settings. Common causes include network latency and incorrect timeout values." } ], citations: { enabled: true } } ]; } // Create a message with the tool const response = await anthropic.messages.create({ model: "claude-sonnet-4-5", // Works with all supported models max_tokens: 1024, tools: [knowledgeBaseTool], messages: [ { role: "user", content: "How do I configure the timeout settings?" } ] }); // Handle tool use and provide results if (response.content[0].type === "tool_use") { const toolResult = searchKnowledgeBase(response.content[0].input.query); const finalResponse = await anthropic.messages.create({ model: "claude-sonnet-4-5", // Works with all supported models max_tokens: 1024, messages: [ { role: "user", content: "How do I configure the timeout settings?" }, { role: "assistant", content: response.content }, { role: "user", content: [ { type: "tool_result" as const, tool_use_id: response.content[0].id, content: toolResult // Search results go here } ] } ] }); } ``` ## Method 2: Search results as top-level content You can also provide search results directly in user messages. This is useful for: - Pre-fetched content from your search infrastructure - Cached search results from previous queries - Content from external search services - Testing and development ### Example: Direct search results ```python Python from anthropic import Anthropic from anthropic.types import ( MessageParam, TextBlockParam, SearchResultBlockParam ) client = Anthropic() # Provide search results directly in the user message response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ MessageParam( role="user", content=[ SearchResultBlockParam( type="search_result", source="https://docs.company.com/api-reference", title="API Reference - Authentication", content=[ TextBlockParam( type="text", text="All API requests must include an API key in the Authorization header. Keys can be generated from the dashboard. Rate limits: 1000 requests per hour for standard tier, 10000 for premium." ) ], citations={"enabled": True} ), SearchResultBlockParam( type="search_result", source="https://docs.company.com/quickstart", title="Getting Started Guide", content=[ TextBlockParam( type="text", text="To get started: 1) Sign up for an account, 2) Generate an API key from the dashboard, 3) Install our SDK using pip install company-sdk, 4) Initialize the client with your API key." ) ], citations={"enabled": True} ), TextBlockParam( type="text", text="Based on these search results, how do I authenticate API requests and what are the rate limits?" ) ] ) ] ) print(response.model_dump_json(indent=2)) ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); // Provide search results directly in the user message const response = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, messages: [ { role: "user", content: [ { type: "search_result" as const, source: "https://docs.company.com/api-reference", title: "API Reference - Authentication", content: [ { type: "text" as const, text: "All API requests must include an API key in the Authorization header. Keys can be generated from the dashboard. Rate limits: 1000 requests per hour for standard tier, 10000 for premium." } ], citations: { enabled: true } }, { type: "search_result" as const, source: "https://docs.company.com/quickstart", title: "Getting Started Guide", content: [ { type: "text" as const, text: "To get started: 1) Sign up for an account, 2) Generate an API key from the dashboard, 3) Install our SDK using pip install company-sdk, 4) Initialize the client with your API key." } ], citations: { enabled: true } }, { type: "text" as const, text: "Based on these search results, how do I authenticate API requests and what are the rate limits?" } ] } ] }); console.log(response); ``` ```bash Shell #!/bin/sh curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ { "role": "user", "content": [ { "type": "search_result", "source": "https://docs.company.com/api-reference", "title": "API Reference - Authentication", "content": [ { "type": "text", "text": "All API requests must include an API key in the Authorization header. Keys can be generated from the dashboard. Rate limits: 1000 requests per hour for standard tier, 10000 for premium." } ], "citations": { "enabled": true } }, { "type": "search_result", "source": "https://docs.company.com/quickstart", "title": "Getting Started Guide", "content": [ { "type": "text", "text": "To get started: 1) Sign up for an account, 2) Generate an API key from the dashboard, 3) Install our SDK using pip install company-sdk, 4) Initialize the client with your API key." } ], "citations": { "enabled": true } }, { "type": "text", "text": "Based on these search results, how do I authenticate API requests and what are the rate limits?" } ] } ] }' ``` ## Claude's response with citations Regardless of how search results are provided, Claude automatically includes citations when using information from them: ```json { "role": "assistant", "content": [ { "type": "text", "text": "To authenticate API requests, you need to include an API key in the Authorization header", "citations": [ { "type": "search_result_location", "source": "https://docs.company.com/api-reference", "title": "API Reference - Authentication", "cited_text": "All API requests must include an API key in the Authorization header", "search_result_index": 0, "start_block_index": 0, "end_block_index": 0 } ] }, { "type": "text", "text": ". You can generate API keys from your dashboard", "citations": [ { "type": "search_result_location", "source": "https://docs.company.com/api-reference", "title": "API Reference - Authentication", "cited_text": "Keys can be generated from the dashboard", "search_result_index": 0, "start_block_index": 0, "end_block_index": 0 } ] }, { "type": "text", "text": ". The rate limits are 1,000 requests per hour for the standard tier and 10,000 requests per hour for the premium tier.", "citations": [ { "type": "search_result_location", "source": "https://docs.company.com/api-reference", "title": "API Reference - Authentication", "cited_text": "Rate limits: 1000 requests per hour for standard tier, 10000 for premium", "search_result_index": 0, "start_block_index": 0, "end_block_index": 0 } ] } ] } ``` ### Citation fields Each citation includes: | Field | Type | Description | |-------|------|-------------| | `type` | string | Always `"search_result_location"` for search result citations | | `source` | string | The source from the original search result | | `title` | string or null | The title from the original search result | | `cited_text` | string | The exact text being cited | | `search_result_index` | integer | Index of the search result (0-based) | | `start_block_index` | integer | Starting position in the content array | | `end_block_index` | integer | Ending position in the content array | Note: The `search_result_index` refers to the index of the search result content block (0-based), regardless of how the search results were provided (tool call or top-level content). ## Multiple content blocks Search results can contain multiple text blocks in the `content` array: ```json { "type": "search_result", "source": "https://docs.company.com/api-guide", "title": "API Documentation", "content": [ { "type": "text", "text": "Authentication: All API requests require an API key." }, { "type": "text", "text": "Rate Limits: The API allows 1000 requests per hour per key." }, { "type": "text", "text": "Error Handling: The API returns standard HTTP status codes." } ] } ``` Claude can cite specific blocks using the `start_block_index` and `end_block_index` fields. ## Advanced usage ### Combining both methods You can use both tool-based and top-level search results in the same conversation: ```python # First message with top-level search results messages = [ MessageParam( role="user", content=[ SearchResultBlockParam( type="search_result", source="https://docs.company.com/overview", title="Product Overview", content=[ TextBlockParam(type="text", text="Our product helps teams collaborate...") ], citations={"enabled": True} ), TextBlockParam( type="text", text="Tell me about this product and search for pricing information" ) ] ) ] # Claude might respond and call a tool to search for pricing # Then you provide tool results with more search results ``` ### Combining with other content types Both methods support mixing search results with other content: ```python # In tool results tool_result = [ SearchResultBlockParam( type="search_result", source="https://docs.company.com/guide", title="User Guide", content=[TextBlockParam(type="text", text="Configuration details...")], citations={"enabled": True} ), TextBlockParam( type="text", text="Additional context: This applies to version 2.0 and later." ) ] # In top-level content user_content = [ SearchResultBlockParam( type="search_result", source="https://research.com/paper", title="Research Paper", content=[TextBlockParam(type="text", text="Key findings...")], citations={"enabled": True} ), { "type": "image", "source": {"type": "url", "url": "https://example.com/chart.png"} }, TextBlockParam( type="text", text="How does the chart relate to the research findings?" ) ] ``` ### Cache control Add cache control for better performance: ```json { "type": "search_result", "source": "https://docs.company.com/guide", "title": "User Guide", "content": [{"type": "text", "text": "..."}], "cache_control": { "type": "ephemeral" } } ``` ### Citation control By default, citations are disabled for search results. You can enable citations by explicitly setting the `citations` configuration: ```json { "type": "search_result", "source": "https://docs.company.com/guide", "title": "User Guide", "content": [{"type": "text", "text": "Important documentation..."}], "citations": { "enabled": true // Enable citations for this result } } ``` When `citations.enabled` is set to `true`, Claude will include citation references when using information from the search result. This enables: - Natural citations for your custom RAG applications - Source attribution when interfacing with proprietary knowledge bases - Web search-quality citations for any custom tool that returns search results If the `citations` field is omitted, citations are disabled by default. Citations are all-or-nothing: either all search results in a request must have citations enabled, or all must have them disabled. Mixing search results with different citation settings will result in an error. If you need to disable citations for some sources, you must disable them for all search results in that request. ## Best practices ### For tool-based search (Method 1) - **Dynamic content**: Use for real-time searches and dynamic RAG applications - **Error handling**: Return appropriate messages when searches fail - **Result limits**: Return only the most relevant results to avoid context overflow ### For top-level search (Method 2) - **Pre-fetched content**: Use when you already have search results - **Batch processing**: Ideal for processing multiple search results at once - **Testing**: Great for testing citation behavior with known content ### General best practices 1. **Structure results effectively** - Use clear, permanent source URLs - Provide descriptive titles - Break long content into logical text blocks 2. **Maintain consistency** - Use consistent source formats across your application - Ensure titles accurately reflect content - Keep formatting consistent 3. **Handle errors gracefully** ```python def search_with_fallback(query): try: results = perform_search(query) if not results: return {"type": "text", "text": "No results found."} return format_as_search_results(results) except Exception as e: return {"type": "text", "text": f"Search error: {str(e)}"} ``` ## Limitations - Search result content blocks are available on Claude API, Amazon Bedrock, and Google Cloud's Vertex AI - Only text content is supported within search results (no images or other media) - The `content` array must contain at least one text block --- # Streaming Messages URL: https://platform.claude.com/docs/en/build-with-claude/streaming # Streaming Messages --- When creating a Message, you can set `"stream": true` to incrementally stream the response using [server-sent events](https://developer.mozilla.org/en-US/Web/API/Server-sent%5Fevents/Using%5Fserver-sent%5Fevents) (SSE). ## Streaming with SDKs Our [Python](https://github.com/anthropics/anthropic-sdk-python) and [TypeScript](https://github.com/anthropics/anthropic-sdk-typescript) SDKs offer multiple ways of streaming. The Python SDK allows both sync and async streams. See the documentation in each SDK for details. ```python Python import anthropic client = anthropic.Anthropic() with client.messages.stream( max_tokens=1024, messages=[{"role": "user", "content": "Hello"}], model="claude-sonnet-4-5", ) as stream: for text in stream.text_stream: print(text, end="", flush=True) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); await client.messages.stream({ messages: [{role: 'user', content: "Hello"}], model: 'claude-sonnet-4-5', max_tokens: 1024, }).on('text', (text) => { console.log(text); }); ``` ## Event types Each server-sent event includes a named event type and associated JSON data. Each event will use an SSE event name (e.g. `event: message_stop`), and include the matching event `type` in its data. Each stream uses the following event flow: 1. `message_start`: contains a `Message` object with empty `content`. 2. A series of content blocks, each of which have a `content_block_start`, one or more `content_block_delta` events, and a `content_block_stop` event. Each content block will have an `index` that corresponds to its index in the final Message `content` array. 3. One or more `message_delta` events, indicating top-level changes to the final `Message` object. 4. A final `message_stop` event. The token counts shown in the `usage` field of the `message_delta` event are *cumulative*. ### Ping events Event streams may also include any number of `ping` events. ### Error events We may occasionally send [errors](/docs/en/api/errors) in the event stream. For example, during periods of high usage, you may receive an `overloaded_error`, which would normally correspond to an HTTP 529 in a non-streaming context: ```json Example error event: error data: {"type": "error", "error": {"type": "overloaded_error", "message": "Overloaded"}} ``` ### Other events In accordance with our [versioning policy](/docs/en/api/versioning), we may add new event types, and your code should handle unknown event types gracefully. ## Content block delta types Each `content_block_delta` event contains a `delta` of a type that updates the `content` block at a given `index`. ### Text delta A `text` content block delta looks like: ```json Text delta event: content_block_delta data: {"type": "content_block_delta","index": 0,"delta": {"type": "text_delta", "text": "ello frien"}} ``` ### Input JSON delta The deltas for `tool_use` content blocks correspond to updates for the `input` field of the block. To support maximum granularity, the deltas are _partial JSON strings_, whereas the final `tool_use.input` is always an _object_. You can accumulate the string deltas and parse the JSON once you receive a `content_block_stop` event, by using a library like [Pydantic](https://docs.pydantic.dev/latest/concepts/json/#partial-json-parsing) to do partial JSON parsing, or by using our [SDKs](/docs/en/api/client-sdks), which provide helpers to access parsed incremental values. A `tool_use` content block delta looks like: ```json Input JSON delta event: content_block_delta data: {"type": "content_block_delta","index": 1,"delta": {"type": "input_json_delta","partial_json": "{\"location\": \"San Fra"}}} ``` Note: Our current models only support emitting one complete key and value property from `input` at a time. As such, when using tools, there may be delays between streaming events while the model is working. Once an `input` key and value are accumulated, we emit them as multiple `content_block_delta` events with chunked partial json so that the format can automatically support finer granularity in future models. ### Thinking delta When using [extended thinking](/docs/en/build-with-claude/extended-thinking#streaming-thinking) with streaming enabled, you'll receive thinking content via `thinking_delta` events. These deltas correspond to the `thinking` field of the `thinking` content blocks. For thinking content, a special `signature_delta` event is sent just before the `content_block_stop` event. This signature is used to verify the integrity of the thinking block. A typical thinking delta looks like: ```json Thinking delta event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "Let me solve this step by step:\n\n1. First break down 27 * 453"}} ``` The signature delta looks like: ```json Signature delta event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "signature_delta", "signature": "EqQBCgIYAhIM1gbcDa9GJwZA2b3hGgxBdjrkzLoky3dl1pkiMOYds..."}} ``` ## Full HTTP Stream response We strongly recommend that you use our [client SDKs](/docs/en/api/client-sdks) when using streaming mode. However, if you are building a direct API integration, you will need to handle these events yourself. A stream response is comprised of: 1. A `message_start` event 2. Potentially multiple content blocks, each of which contains: - A `content_block_start` event - Potentially multiple `content_block_delta` events - A `content_block_stop` event 3. A `message_delta` event 4. A `message_stop` event There may be `ping` events dispersed throughout the response as well. See [Event types](#event-types) for more details on the format. ### Basic streaming request ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --data \ '{ "model": "claude-sonnet-4-5", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 256, "stream": true }' ``` ```python Python import anthropic client = anthropic.Anthropic() with client.messages.stream( model="claude-sonnet-4-5", messages=[{"role": "user", "content": "Hello"}], max_tokens=256, ) as stream: for text in stream.text_stream: print(text, end="", flush=True) ``` ```json Response event: message_start data: {"type": "message_start", "message": {"id": "msg_1nZdL29xx5MUA1yADyHTEsnR8uuvGzszyY", "type": "message", "role": "assistant", "content": [], "model": "claude-sonnet-4-5-20250929", "stop_reason": null, "stop_sequence": null, "usage": {"input_tokens": 25, "output_tokens": 1}}} event: content_block_start data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}} event: ping data: {"type": "ping"} event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}} event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "!"}} event: content_block_stop data: {"type": "content_block_stop", "index": 0} event: message_delta data: {"type": "message_delta", "delta": {"stop_reason": "end_turn", "stop_sequence":null}, "usage": {"output_tokens": 15}} event: message_stop data: {"type": "message_stop"} ``` ### Streaming request with tool use Tool use now supports fine-grained streaming for parameter values as a beta feature. For more details, see [Fine-grained tool streaming](/docs/en/agents-and-tools/tool-use/fine-grained-tool-streaming). In this request, we ask Claude to use a tool to tell us the weather. ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "tools": [ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" } }, "required": ["location"] } } ], "tool_choice": {"type": "any"}, "messages": [ { "role": "user", "content": "What is the weather like in San Francisco?" } ], "stream": true }' ``` ```python Python import anthropic client = anthropic.Anthropic() tools = [ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" } }, "required": ["location"] } } ] with client.messages.stream( model="claude-sonnet-4-5", max_tokens=1024, tools=tools, tool_choice={"type": "any"}, messages=[ { "role": "user", "content": "What is the weather like in San Francisco?" } ], ) as stream: for text in stream.text_stream: print(text, end="", flush=True) ``` ```json Response event: message_start data: {"type":"message_start","message":{"id":"msg_014p7gG3wDgGV9EUtLvnow3U","type":"message","role":"assistant","model":"claude-sonnet-4-5-20250929","stop_sequence":null,"usage":{"input_tokens":472,"output_tokens":2},"content":[],"stop_reason":null}} event: content_block_start data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}} event: ping data: {"type": "ping"} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Okay"}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":","}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" let"}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"'s"}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" check"}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" the"}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" weather"}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" for"}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" San"}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" Francisco"}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":","}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" CA"}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":":"}} event: content_block_stop data: {"type":"content_block_stop","index":0} event: content_block_start data: {"type":"content_block_start","index":1,"content_block":{"type":"tool_use","id":"toolu_01T1x1fJ34qAmk2tNTrN7Up6","name":"get_weather","input":{}}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":""}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"{\"location\":"}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" \"San"}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" Francisc"}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"o,"}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" CA\""}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":", "}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"\"unit\": \"fah"}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"renheit\"}"}} event: content_block_stop data: {"type":"content_block_stop","index":1} event: message_delta data: {"type":"message_delta","delta":{"stop_reason":"tool_use","stop_sequence":null},"usage":{"output_tokens":89}} event: message_stop data: {"type":"message_stop"} ``` ### Streaming request with extended thinking In this request, we enable extended thinking with streaming to see Claude's step-by-step reasoning. ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 20000, "stream": true, "thinking": { "type": "enabled", "budget_tokens": 16000 }, "messages": [ { "role": "user", "content": "What is 27 * 453?" } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() with client.messages.stream( model="claude-sonnet-4-5", max_tokens=20000, thinking={ "type": "enabled", "budget_tokens": 16000 }, messages=[ { "role": "user", "content": "What is 27 * 453?" } ], ) as stream: for event in stream: if event.type == "content_block_delta": if event.delta.type == "thinking_delta": print(event.delta.thinking, end="", flush=True) elif event.delta.type == "text_delta": print(event.delta.text, end="", flush=True) ``` ```json Response event: message_start data: {"type": "message_start", "message": {"id": "msg_01...", "type": "message", "role": "assistant", "content": [], "model": "claude-sonnet-4-5-20250929", "stop_reason": null, "stop_sequence": null}} event: content_block_start data: {"type": "content_block_start", "index": 0, "content_block": {"type": "thinking", "thinking": ""}} event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "Let me solve this step by step:\n\n1. First break down 27 * 453"}} event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "\n2. 453 = 400 + 50 + 3"}} event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "\n3. 27 * 400 = 10,800"}} event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "\n4. 27 * 50 = 1,350"}} event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "\n5. 27 * 3 = 81"}} event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "\n6. 10,800 + 1,350 + 81 = 12,231"}} event: content_block_delta data: {"type": "content_block_delta", "index": 0, "delta": {"type": "signature_delta", "signature": "EqQBCgIYAhIM1gbcDa9GJwZA2b3hGgxBdjrkzLoky3dl1pkiMOYds..."}} event: content_block_stop data: {"type": "content_block_stop", "index": 0} event: content_block_start data: {"type": "content_block_start", "index": 1, "content_block": {"type": "text", "text": ""}} event: content_block_delta data: {"type": "content_block_delta", "index": 1, "delta": {"type": "text_delta", "text": "27 * 453 = 12,231"}} event: content_block_stop data: {"type": "content_block_stop", "index": 1} event: message_delta data: {"type": "message_delta", "delta": {"stop_reason": "end_turn", "stop_sequence": null}} event: message_stop data: {"type": "message_stop"} ``` ### Streaming request with web search tool use In this request, we ask Claude to search the web for current weather information. ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "stream": true, "tools": [ { "type": "web_search_20250305", "name": "web_search", "max_uses": 5 } ], "messages": [ { "role": "user", "content": "What is the weather like in New York City today?" } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() with client.messages.stream( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "type": "web_search_20250305", "name": "web_search", "max_uses": 5 } ], messages=[ { "role": "user", "content": "What is the weather like in New York City today?" } ], ) as stream: for text in stream.text_stream: print(text, end="", flush=True) ``` ```json Response event: message_start data: {"type":"message_start","message":{"id":"msg_01G...","type":"message","role":"assistant","model":"claude-sonnet-4-5-20250929","content":[],"stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":2679,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"output_tokens":3}}} event: content_block_start data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"I'll check"}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" the current weather in New York City for you"}} event: ping data: {"type": "ping"} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"."}} event: content_block_stop data: {"type":"content_block_stop","index":0} event: content_block_start data: {"type":"content_block_start","index":1,"content_block":{"type":"server_tool_use","id":"srvtoolu_014hJH82Qum7Td6UV8gDXThB","name":"web_search","input":{}}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":""}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"{\"query"}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"\":"}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" \"weather"}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" NY"}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"C to"}} event: content_block_delta data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"day\"}"}} event: content_block_stop data: {"type":"content_block_stop","index":1 } event: content_block_start data: {"type":"content_block_start","index":2,"content_block":{"type":"web_search_tool_result","tool_use_id":"srvtoolu_014hJH82Qum7Td6UV8gDXThB","content":[{"type":"web_search_result","title":"Weather in New York City in May 2025 (New York) - detailed Weather Forecast for a month","url":"https://world-weather.info/forecast/usa/new_york/may-2025/","encrypted_content":"Ev0DCioIAxgCIiQ3NmU4ZmI4OC1k...","page_age":null},...]}} event: content_block_stop data: {"type":"content_block_stop","index":2} event: content_block_start data: {"type":"content_block_start","index":3,"content_block":{"type":"text","text":""}} event: content_block_delta data: {"type":"content_block_delta","index":3,"delta":{"type":"text_delta","text":"Here's the current weather information for New York"}} event: content_block_delta data: {"type":"content_block_delta","index":3,"delta":{"type":"text_delta","text":" City:\n\n# Weather"}} event: content_block_delta data: {"type":"content_block_delta","index":3,"delta":{"type":"text_delta","text":" in New York City"}} event: content_block_delta data: {"type":"content_block_delta","index":3,"delta":{"type":"text_delta","text":"\n\n"}} ... event: content_block_stop data: {"type":"content_block_stop","index":17} event: message_delta data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"input_tokens":10682,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"output_tokens":510,"server_tool_use":{"web_search_requests":1}}} event: message_stop data: {"type":"message_stop"} ``` ## Error recovery When a streaming request is interrupted due to network issues, timeouts, or other errors, you can recover by resuming from where the stream was interrupted. This approach saves you from re-processing the entire response. The basic recovery strategy involves: 1. **Capture the partial response**: Save all content that was successfully received before the error occurred 2. **Construct a continuation request**: Create a new API request that includes the partial assistant response as the beginning of a new assistant message 3. **Resume streaming**: Continue receiving the rest of the response from where it was interrupted ### Error recovery best practices 1. **Use SDK features**: Leverage the SDK's built-in message accumulation and error handling capabilities 2. **Handle content types**: Be aware that messages can contain multiple content blocks (`text`, `tool_use`, `thinking`). Tool use and extended thinking blocks cannot be partially recovered. You can resume streaming from the most recent text block. --- # Structured outputs URL: https://platform.claude.com/docs/en/build-with-claude/structured-outputs # Structured outputs Get validated JSON results from agent workflows --- Structured outputs constrain Claude's responses to follow a specific schema, ensuring valid, parseable output for downstream processing. Two complementary features are available: - **JSON outputs** (`output_format`): Get Claude's response in a specific JSON format - **Strict tool use** (`strict: true`): Guarantee schema validation on tool names and inputs These features can be used independently or together in the same request. Structured outputs are currently available as a public beta feature in the Claude API for Claude Sonnet 4.5, Claude Opus 4.1, Claude Opus 4.5, and Claude Haiku 4.5. To use the feature, set the [beta header](/docs/en/api/beta-headers) `structured-outputs-2025-11-13`. Share feedback using this [form](https://forms.gle/BFnYc6iCkWoRzFgk7). ## Why use structured outputs Without structured outputs, Claude can generate malformed JSON responses or invalid tool inputs that break your applications. Even with careful prompting, you may encounter: - Parsing errors from invalid JSON syntax - Missing required fields - Inconsistent data types - Schema violations requiring error handling and retries Structured outputs guarantee schema-compliant responses through constrained decoding: - **Always valid**: No more `JSON.parse()` errors - **Type safe**: Guaranteed field types and required fields - **Reliable**: No retries needed for schema violations ## JSON outputs JSON outputs control Claude's response format, ensuring Claude returns valid JSON matching your schema. Use JSON outputs when you need to: - Control Claude's response format - Extract data from images or text - Generate structured reports - Format API responses ### Quick start ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: structured-outputs-2025-11-13" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ { "role": "user", "content": "Extract the key information from this email: John Smith (john@example.com) is interested in our Enterprise plan and wants to schedule a demo for next Tuesday at 2pm." } ], "output_format": { "type": "json_schema", "schema": { "type": "object", "properties": { "name": {"type": "string"}, "email": {"type": "string"}, "plan_interest": {"type": "string"}, "demo_requested": {"type": "boolean"} }, "required": ["name", "email", "plan_interest", "demo_requested"], "additionalProperties": false } } }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.beta.messages.create( model="claude-sonnet-4-5", max_tokens=1024, betas=["structured-outputs-2025-11-13"], messages=[ { "role": "user", "content": "Extract the key information from this email: John Smith (john@example.com) is interested in our Enterprise plan and wants to schedule a demo for next Tuesday at 2pm." } ], output_format={ "type": "json_schema", "schema": { "type": "object", "properties": { "name": {"type": "string"}, "email": {"type": "string"}, "plan_interest": {"type": "string"}, "demo_requested": {"type": "boolean"} }, "required": ["name", "email", "plan_interest", "demo_requested"], "additionalProperties": False } } ) print(response.content[0].text) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY }); const response = await client.beta.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, betas: ["structured-outputs-2025-11-13"], messages: [ { role: "user", content: "Extract the key information from this email: John Smith (john@example.com) is interested in our Enterprise plan and wants to schedule a demo for next Tuesday at 2pm." } ], output_format: { type: "json_schema", schema: { type: "object", properties: { name: { type: "string" }, email: { type: "string" }, plan_interest: { type: "string" }, demo_requested: { type: "boolean" } }, required: ["name", "email", "plan_interest", "demo_requested"], additionalProperties: false } } }); console.log(response.content[0].text); ``` **Response format:** Valid JSON matching your schema in `response.content[0].text` ```json { "name": "John Smith", "email": "john@example.com", "plan_interest": "Enterprise", "demo_requested": true } ``` ### How it works Create a JSON schema that describes the structure you want Claude to follow. The schema uses standard JSON Schema format with some limitations (see [JSON Schema limitations](#json-schema-limitations)). Include the `output_format` parameter in your API request with `type: "json_schema"` and your schema definition. Add the `anthropic-beta: structured-outputs-2025-11-13` header to your request. Claude's response will be valid JSON matching your schema, returned in `response.content[0].text`. ### Working with JSON outputs in SDKs The Python and TypeScript SDKs provide helpers that make it easier to work with JSON outputs, including schema transformation, automatic validation, and integration with popular schema libraries. #### Using Pydantic and Zod For Python and TypeScript developers, you can use familiar schema definition tools like Pydantic and Zod instead of writing raw JSON schemas. ```python Python from pydantic import BaseModel from anthropic import Anthropic, transform_schema class ContactInfo(BaseModel): name: str email: str plan_interest: str demo_requested: bool client = Anthropic() # With .create() - requires transform_schema() response = client.beta.messages.create( model="claude-sonnet-4-5", max_tokens=1024, betas=["structured-outputs-2025-11-13"], messages=[ { "role": "user", "content": "Extract the key information from this email: John Smith (john@example.com) is interested in our Enterprise plan and wants to schedule a demo for next Tuesday at 2pm." } ], output_format={ "type": "json_schema", "schema": transform_schema(ContactInfo), } ) print(response.content[0].text) # With .parse() - can pass Pydantic model directly response = client.beta.messages.parse( model="claude-sonnet-4-5", max_tokens=1024, betas=["structured-outputs-2025-11-13"], messages=[ { "role": "user", "content": "Extract the key information from this email: John Smith (john@example.com) is interested in our Enterprise plan and wants to schedule a demo for next Tuesday at 2pm." } ], output_format=ContactInfo, ) print(response.parsed_output) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; import { z } from 'zod'; import { betaZodOutputFormat } from '@anthropic-ai/sdk/helpers/beta/zod'; const ContactInfoSchema = z.object({ name: z.string(), email: z.string(), plan_interest: z.string(), demo_requested: z.boolean(), }); const client = new Anthropic(); const response = await client.beta.messages.parse({ model: "claude-sonnet-4-5", max_tokens: 1024, betas: ["structured-outputs-2025-11-13"], messages: [ { role: "user", content: "Extract the key information from this email: John Smith (john@example.com) is interested in our Enterprise plan and wants to schedule a demo for next Tuesday at 2pm." } ], output_format: betaZodOutputFormat(ContactInfoSchema), }); // Automatically parsed and validated console.log(response.parsed_output); ``` #### SDK-specific methods **Python: `client.beta.messages.parse()` (Recommended)** The `parse()` method automatically transforms your Pydantic model, validates the response, and returns a `parsed_output` attribute. The `parse()` method is available on `client.beta.messages`, not `client.messages`.
```python from pydantic import BaseModel import anthropic class ContactInfo(BaseModel): name: str email: str plan_interest: str client = anthropic.Anthropic() response = client.beta.messages.parse( model="claude-sonnet-4-5", betas=["structured-outputs-2025-11-13"], max_tokens=1024, messages=[{"role": "user", "content": "..."}], output_format=ContactInfo, ) # Access the parsed output directly contact = response.parsed_output print(contact.name, contact.email) ```
**Python: `transform_schema()` helper** For when you need to manually transform schemas before sending, or when you want to modify a Pydantic-generated schema. Unlike `client.beta.messages.parse()`, which transforms provided schemas automatically, this gives you the transformed schema so you can further customize it.
```python from anthropic import transform_schema from pydantic import TypeAdapter # First convert Pydantic model to JSON schema, then transform schema = TypeAdapter(ContactInfo).json_schema() schema = transform_schema(schema) # Modify schema if needed schema["properties"]["custom_field"] = {"type": "string"} response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["structured-outputs-2025-11-13"], max_tokens=1024, output_format=schema, messages=[{"role": "user", "content": "..."}], ) ```
#### How SDK transformation works Both Python and TypeScript SDKs automatically transform schemas with unsupported features: 1. **Remove unsupported constraints** (e.g., `minimum`, `maximum`, `minLength`, `maxLength`) 2. **Update descriptions** with constraint info (e.g., "Must be at least 100"), when the constraint is not directly supported with structured outputs 3. **Add `additionalProperties: false`** to all objects 4. **Filter string formats** to supported list only 5. **Validate responses** against your original schema (with all constraints) This means Claude receives a simplified schema, but your code still enforces all constraints through validation. **Example:** A Pydantic field with `minimum: 100` becomes a plain integer in the sent schema, but the description is updated to "Must be at least 100", and the SDK validates the response against the original constraint. ### Common use cases
Extract structured data from unstructured text: ```python Python from pydantic import BaseModel from typing import List class Invoice(BaseModel): invoice_number: str date: str total_amount: float line_items: List[dict] customer_name: str response = client.beta.messages.parse( model="claude-sonnet-4-5", betas=["structured-outputs-2025-11-13"], output_format=Invoice, messages=[{"role": "user", "content": f"Extract invoice data from: {invoice_text}"}] ) ``` ```typescript TypeScript import { z } from 'zod'; const InvoiceSchema = z.object({ invoice_number: z.string(), date: z.string(), total_amount: z.number(), line_items: z.array(z.record(z.any())), customer_name: z.string(), }); const response = await client.beta.messages.parse({ model: "claude-sonnet-4-5", betas: ["structured-outputs-2025-11-13"], output_format: InvoiceSchema, messages: [{"role": "user", "content": `Extract invoice data from: ${invoiceText}`}] }); ```
Classify content with structured categories: ```python Python from pydantic import BaseModel from typing import List class Classification(BaseModel): category: str confidence: float tags: List[str] sentiment: str response = client.beta.messages.parse( model="claude-sonnet-4-5", betas=["structured-outputs-2025-11-13"], output_format=Classification, messages=[{"role": "user", "content": f"Classify this feedback: {feedback_text}"}] ) ``` ```typescript TypeScript import { z } from 'zod'; const ClassificationSchema = z.object({ category: z.string(), confidence: z.number(), tags: z.array(z.string()), sentiment: z.string(), }); const response = await client.beta.messages.parse({ model: "claude-sonnet-4-5", betas: ["structured-outputs-2025-11-13"], output_format: ClassificationSchema, messages: [{"role": "user", "content": `Classify this feedback: ${feedbackText}`}] }); ```
Generate API-ready responses: ```python Python from pydantic import BaseModel from typing import List, Optional class APIResponse(BaseModel): status: str data: dict errors: Optional[List[dict]] metadata: dict response = client.beta.messages.parse( model="claude-sonnet-4-5", betas=["structured-outputs-2025-11-13"], output_format=APIResponse, messages=[{"role": "user", "content": "Process this request: ..."}] ) ``` ```typescript TypeScript import { z } from 'zod'; const APIResponseSchema = z.object({ status: z.string(), data: z.record(z.any()), errors: z.array(z.record(z.any())).optional(), metadata: z.record(z.any()), }); const response = await client.beta.messages.parse({ model: "claude-sonnet-4-5", betas: ["structured-outputs-2025-11-13"], output_format: APIResponseSchema, messages: [{"role": "user", "content": "Process this request: ..."}] }); ```
## Strict tool use Strict tool use validates tool parameters, ensuring Claude calls your functions with correctly-typed arguments. Use strict tool use when you need to: - Validate tool parameters - Build agentic workflows - Ensure type-safe function calls - Handle complex tools with nested properties ### Why strict tool use matters for agents Building reliable agentic systems requires guaranteed schema conformance. Without strict mode, Claude might return incompatible types (`"2"` instead of `2`) or missing required fields, breaking your functions and causing runtime errors. Strict tool use guarantees type-safe parameters: - Functions receive correctly-typed arguments every time - No need to validate and retry tool calls - Production-ready agents that work consistently at scale For example, suppose a booking system needs `passengers: int`. Without strict mode, Claude might provide `passengers: "two"` or `passengers: "2"`. With `strict: true`, the response will always contain `passengers: 2`. ### Quick start ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: structured-outputs-2025-11-13" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ {"role": "user", "content": "What is the weather in San Francisco?"} ], "tools": [{ "name": "get_weather", "description": "Get the current weather in a given location", "strict": true, "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"] } }, "required": ["location"], "additionalProperties": false } }] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.beta.messages.create( model="claude-sonnet-4-5", max_tokens=1024, betas=["structured-outputs-2025-11-13"], messages=[ {"role": "user", "content": "What's the weather like in San Francisco?"} ], tools=[ { "name": "get_weather", "description": "Get the current weather in a given location", "strict": True, # Enable strict mode "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either 'celsius' or 'fahrenheit'" } }, "required": ["location"], "additionalProperties": False } } ] ) print(response.content) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY }); const response = await client.beta.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, betas: ["structured-outputs-2025-11-13"], messages: [ { role: "user", content: "What's the weather like in San Francisco?" } ], tools: [{ name: "get_weather", description: "Get the current weather in a given location", strict: true, // Enable strict mode input_schema: { type: "object", properties: { location: { type: "string", description: "The city and state, e.g. San Francisco, CA" }, unit: { type: "string", enum: ["celsius", "fahrenheit"] } }, required: ["location"], additionalProperties: false } }] }); console.log(response.content); ``` **Response format:** Tool use blocks with validated inputs in `response.content[x].input` ```json { "type": "tool_use", "name": "get_weather", "input": { "location": "San Francisco, CA" } } ``` **Guarantees:** - Tool `input` strictly follows the `input_schema` - Tool `name` is always valid (from provided tools or server tools) ### How it works Create a JSON schema for your tool's `input_schema`. The schema uses standard JSON Schema format with some limitations (see [JSON Schema limitations](#json-schema-limitations)). Set `"strict": true` as a top-level property in your tool definition, alongside `name`, `description`, and `input_schema`. Add the `anthropic-beta: structured-outputs-2025-11-13` header to your request. When Claude uses the tool, the `input` field in the tool_use block will strictly follow your `input_schema`, and the `name` will always be valid. ### Common use cases
Ensure tool parameters exactly match your schema: ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["structured-outputs-2025-11-13"], messages=[{"role": "user", "content": "Search for flights to Tokyo"}], tools=[{ "name": "search_flights", "strict": True, "input_schema": { "type": "object", "properties": { "destination": {"type": "string"}, "departure_date": {"type": "string", "format": "date"}, "passengers": {"type": "integer", "enum": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]} }, "required": ["destination", "departure_date"], "additionalProperties": False } }] ) ``` ```typescript TypeScript const response = await client.beta.messages.create({ model: "claude-sonnet-4-5", betas: ["structured-outputs-2025-11-13"], messages: [{"role": "user", "content": "Search for flights to Tokyo"}], tools: [{ name: "search_flights", strict: true, input_schema: { type: "object", properties: { destination: {type: "string"}, departure_date: {type: "string", format: "date"}, passengers: {type: "integer", enum: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]} }, required: ["destination", "departure_date"], additionalProperties: false } }] }); ```
Build reliable multi-step agents with guaranteed tool parameters: ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["structured-outputs-2025-11-13"], messages=[{"role": "user", "content": "Help me plan a trip to Paris for 2 people"}], tools=[ { "name": "search_flights", "strict": True, "input_schema": { "type": "object", "properties": { "origin": {"type": "string"}, "destination": {"type": "string"}, "departure_date": {"type": "string", "format": "date"}, "travelers": {"type": "integer", "enum": [1, 2, 3, 4, 5, 6]} }, "required": ["origin", "destination", "departure_date"], "additionalProperties": False } }, { "name": "search_hotels", "strict": True, "input_schema": { "type": "object", "properties": { "city": {"type": "string"}, "check_in": {"type": "string", "format": "date"}, "guests": {"type": "integer", "enum": [1, 2, 3, 4]} }, "required": ["city", "check_in"], "additionalProperties": False } } ] ) ``` ```typescript TypeScript const response = await client.beta.messages.create({ model: "claude-sonnet-4-5", betas: ["structured-outputs-2025-11-13"], messages: [{"role": "user", "content": "Help me plan a trip to Paris for 2 people"}], tools: [ { name: "search_flights", strict: true, input_schema: { type: "object", properties: { origin: {type: "string"}, destination: {type: "string"}, departure_date: {type: "string", format: "date"}, travelers: {type: "integer", enum: [1, 2, 3, 4, 5, 6]} }, required: ["origin", "destination", "departure_date"], additionalProperties: false } }, { name: "search_hotels", strict: true, input_schema: { type: "object", properties: { city: {type: "string"}, check_in: {type: "string", format: "date"}, guests: {type: "integer", enum: [1, 2, 3, 4]} }, required: ["city", "check_in"], additionalProperties: false } } ] }); ```
## Using both features together JSON outputs and strict tool use solve different problems and can be used together: - **JSON outputs** control Claude's response format (what Claude says) - **Strict tool use** validates tool parameters (how Claude calls your functions) When combined, Claude can call tools with guaranteed-valid parameters AND return structured JSON responses. This is useful for agentic workflows where you need both reliable tool calls and structured final outputs. ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["structured-outputs-2025-11-13"], max_tokens=1024, messages=[{"role": "user", "content": "Help me plan a trip to Paris for next month"}], # JSON outputs: structured response format output_format={ "type": "json_schema", "schema": { "type": "object", "properties": { "summary": {"type": "string"}, "next_steps": {"type": "array", "items": {"type": "string"}} }, "required": ["summary", "next_steps"], "additionalProperties": False } }, # Strict tool use: guaranteed tool parameters tools=[{ "name": "search_flights", "strict": True, "input_schema": { "type": "object", "properties": { "destination": {"type": "string"}, "date": {"type": "string", "format": "date"} }, "required": ["destination", "date"], "additionalProperties": False } }] ) ``` ```typescript TypeScript const response = await client.beta.messages.create({ model: "claude-sonnet-4-5", betas: ["structured-outputs-2025-11-13"], max_tokens: 1024, messages: [{ role: "user", content: "Help me plan a trip to Paris for next month" }], // JSON outputs: structured response format output_format: { type: "json_schema", schema: { type: "object", properties: { summary: { type: "string" }, next_steps: { type: "array", items: { type: "string" } } }, required: ["summary", "next_steps"], additionalProperties: false } }, // Strict tool use: guaranteed tool parameters tools: [{ name: "search_flights", strict: true, input_schema: { type: "object", properties: { destination: { type: "string" }, date: { type: "string", format: "date" } }, required: ["destination", "date"], additionalProperties: false } }] }); ``` ## Important considerations ### Grammar compilation and caching Structured outputs use constrained sampling with compiled grammar artifacts. This introduces some performance characteristics to be aware of: - **First request latency**: The first time you use a specific schema, there will be additional latency while the grammar is compiled - **Automatic caching**: Compiled grammars are cached for 24 hours from last use, making subsequent requests much faster - **Cache invalidation**: The cache is invalidated if you change: - The JSON schema structure - The set of tools in your request (when using both structured outputs and tool use) - Changing only `name` or `description` fields does not invalidate the cache ### Prompt modification and token costs When using structured outputs, Claude automatically receives an additional system prompt explaining the expected output format. This means: - Your input token count will be slightly higher - The injected prompt costs you tokens like any other system prompt - Changing the `output_format` parameter will invalidate any [prompt cache](/docs/en/build-with-claude/prompt-caching) for that conversation thread ### JSON Schema limitations Structured outputs support standard JSON Schema with some limitations. Both JSON outputs and strict tool use share these limitations.
- All basic types: object, array, string, integer, number, boolean, null - `enum` (strings, numbers, bools, or nulls only - no complex types) - `const` - `anyOf` and `allOf` (with limitations - `allOf` with `$ref` not supported) - `$ref`, `$def`, and `definitions` (external `$ref` not supported) - `default` property for all supported types - `required` and `additionalProperties` (must be set to `false` for objects) - String formats: `date-time`, `time`, `date`, `duration`, `email`, `hostname`, `uri`, `ipv4`, `ipv6`, `uuid` - Array `minItems` (only values 0 and 1 supported)
- Recursive schemas - Complex types within enums - External `$ref` (e.g., `'$ref': 'http://...'`) - Numerical constraints (`minimum`, `maximum`, `multipleOf`, etc.) - String constraints (`minLength`, `maxLength`) - Array constraints beyond `minItems` of 0 or 1 - `additionalProperties` set to anything other than `false` If you use an unsupported feature, you'll receive a 400 error with details.
**Supported regex features:** - Full matching (`^...$`) and partial matching - Quantifiers: `*`, `+`, `?`, simple `{n,m}` cases - Character classes: `[]`, `.`, `\d`, `\w`, `\s` - Groups: `(...)` **NOT supported:** - Backreferences to groups (e.g., `\1`, `\2`) - Lookahead/lookbehind assertions (e.g., `(?=...)`, `(?!...)`) - Word boundaries: `\b`, `\B` - Complex `{n,m}` quantifiers with large ranges Simple regex patterns work well. Complex patterns may result in 400 errors.
The Python and TypeScript SDKs can automatically transform schemas with unsupported features by removing them and adding constraints to field descriptions. See [SDK-specific methods](#sdk-specific-methods) for details. ### Invalid outputs While structured outputs guarantee schema compliance in most cases, there are scenarios where the output may not match your schema: **Refusals** (`stop_reason: "refusal"`) Claude maintains its safety and helpfulness properties even when using structured outputs. If Claude refuses a request for safety reasons: - The response will have `stop_reason: "refusal"` - You'll receive a 200 status code - You'll be billed for the tokens generated - The output may not match your schema because the refusal message takes precedence over schema constraints **Token limit reached** (`stop_reason: "max_tokens"`) If the response is cut off due to reaching the `max_tokens` limit: - The response will have `stop_reason: "max_tokens"` - The output may be incomplete and not match your schema - Retry with a higher `max_tokens` value to get the complete structured output ### Schema validation errors If your schema uses unsupported features or is too complex, you'll receive a 400 error: **"Too many recursive definitions in schema"** - Cause: Schema has excessive or cyclic recursive definitions - Solution: Simplify schema structure, reduce nesting depth **"Schema is too complex"** - Cause: Schema exceeds complexity limits - Solution: Break into smaller schemas, simplify structure, or reduce the number of tools marked as `strict: true` For persistent issues with valid schemas, [contact support](https://support.claude.com/en/articles/9015913-how-to-get-support) with your schema definition. ## Feature compatibility **Works with:** - **[Batch processing](/docs/en/build-with-claude/batch-processing)**: Process structured outputs at scale with 50% discount - **[Token counting](/docs/en/build-with-claude/token-counting)**: Count tokens without compilation - **[Streaming](/docs/en/build-with-claude/streaming)**: Stream structured outputs like normal responses - **Combined usage**: Use JSON outputs (`output_format`) and strict tool use (`strict: true`) together in the same request **Incompatible with:** - **[Citations](/docs/en/build-with-claude/citations)**: Citations require interleaving citation blocks with text, which conflicts with strict JSON schema constraints. Returns 400 error if citations enabled with `output_format`. - **[Message Prefilling](/docs/en/build-with-claude/prompt-engineering/prefill-claudes-response)**: Incompatible with JSON outputs **Grammar scope**: Grammars apply only to Claude's direct output, not to tool use calls, tool results, or thinking tags (when using [Extended Thinking](/docs/en/build-with-claude/extended-thinking)). Grammar state resets between sections, allowing Claude to think freely while still producing structured output in the final response. --- # Token counting URL: https://platform.claude.com/docs/en/build-with-claude/token-counting # Token counting --- Token counting enables you to determine the number of tokens in a message before sending it to Claude, helping you make informed decisions about your prompts and usage. With token counting, you can - Proactively manage rate limits and costs - Make smart model routing decisions - Optimize prompts to be a specific length --- ## How to count message tokens The [token counting](/docs/en/api/messages-count-tokens) endpoint accepts the same structured list of inputs for creating a message, including support for system prompts, [tools](/docs/en/agents-and-tools/tool-use/overview), [images](/docs/en/build-with-claude/vision), and [PDFs](/docs/en/build-with-claude/pdf-support). The response contains the total number of input tokens. The token count should be considered an **estimate**. In some cases, the actual number of input tokens used when creating a message may differ by a small amount. Token counts may include tokens added automatically by Anthropic for system optimizations. **You are not billed for system-added tokens**. Billing reflects only your content. ### Supported models All [active models](/docs/en/about-claude/models/overview) support token counting. ### Count tokens in basic messages ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.count_tokens( model="claude-sonnet-4-5", system="You are a scientist", messages=[{ "role": "user", "content": "Hello, Claude" }], ) print(response.json()) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); const response = await client.messages.countTokens({ model: 'claude-sonnet-4-5', system: 'You are a scientist', messages: [{ role: 'user', content: 'Hello, Claude' }] }); console.log(response); ``` ```bash Shell curl https://api.anthropic.com/v1/messages/count_tokens \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "content-type: application/json" \ --header "anthropic-version: 2023-06-01" \ --data '{ "model": "claude-sonnet-4-5", "system": "You are a scientist", "messages": [{ "role": "user", "content": "Hello, Claude" }] }' ``` ```java Java import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.MessageCountTokensParams; import com.anthropic.models.messages.MessageTokensCount; import com.anthropic.models.messages.Model; public class CountTokensExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); MessageCountTokensParams params = MessageCountTokensParams.builder() .model(Model.CLAUDE_SONNET_4_20250514) .system("You are a scientist") .addUserMessage("Hello, Claude") .build(); MessageTokensCount count = client.messages().countTokens(params); System.out.println(count); } } ``` ```json JSON { "input_tokens": 14 } ``` ### Count tokens in messages with tools [Server tool](/docs/en/agents-and-tools/tool-use/overview#server-tools) token counts only apply to the first sampling call. ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.count_tokens( model="claude-sonnet-4-5", tools=[ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA", } }, "required": ["location"], }, } ], messages=[{"role": "user", "content": "What's the weather like in San Francisco?"}] ) print(response.json()) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); const response = await client.messages.countTokens({ model: 'claude-sonnet-4-5', tools: [ { name: "get_weather", description: "Get the current weather in a given location", input_schema: { type: "object", properties: { location: { type: "string", description: "The city and state, e.g. San Francisco, CA", } }, required: ["location"], } } ], messages: [{ role: "user", content: "What's the weather like in San Francisco?" }] }); console.log(response); ``` ```bash Shell curl https://api.anthropic.com/v1/messages/count_tokens \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "content-type: application/json" \ --header "anthropic-version: 2023-06-01" \ --data '{ "model": "claude-sonnet-4-5", "tools": [ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" } }, "required": ["location"] } } ], "messages": [ { "role": "user", "content": "What'\''s the weather like in San Francisco?" } ] }' ``` ```java Java import java.util.List; import java.util.Map; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.JsonValue; import com.anthropic.models.messages.MessageCountTokensParams; import com.anthropic.models.messages.MessageTokensCount; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.Tool; import com.anthropic.models.messages.Tool.InputSchema; public class CountTokensWithToolsExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); InputSchema schema = InputSchema.builder() .properties(JsonValue.from(Map.of( "location", Map.of( "type", "string", "description", "The city and state, e.g. San Francisco, CA" ) ))) .putAdditionalProperty("required", JsonValue.from(List.of("location"))) .build(); MessageCountTokensParams params = MessageCountTokensParams.builder() .model(Model.CLAUDE_SONNET_4_20250514) .addTool(Tool.builder() .name("get_weather") .description("Get the current weather in a given location") .inputSchema(schema) .build()) .addUserMessage("What's the weather like in San Francisco?") .build(); MessageTokensCount count = client.messages().countTokens(params); System.out.println(count); } } ``` ```json JSON { "input_tokens": 403 } ``` ### Count tokens in messages with images ```bash Shell #!/bin/sh IMAGE_URL="https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" IMAGE_MEDIA_TYPE="image/jpeg" IMAGE_BASE64=$(curl "$IMAGE_URL" | base64) curl https://api.anthropic.com/v1/messages/count_tokens \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "messages": [ {"role": "user", "content": [ {"type": "image", "source": { "type": "base64", "media_type": "'$IMAGE_MEDIA_TYPE'", "data": "'$IMAGE_BASE64'" }}, {"type": "text", "text": "Describe this image"} ]} ] }' ``` ```python Python import anthropic import base64 import httpx image_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" image_media_type = "image/jpeg" image_data = base64.standard_b64encode(httpx.get(image_url).content).decode("utf-8") client = anthropic.Anthropic() response = client.messages.count_tokens( model="claude-sonnet-4-5", messages=[ { "role": "user", "content": [ { "type": "image", "source": { "type": "base64", "media_type": image_media_type, "data": image_data, }, }, { "type": "text", "text": "Describe this image" } ], } ], ) print(response.json()) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const image_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" const image_media_type = "image/jpeg" const image_array_buffer = await ((await fetch(image_url)).arrayBuffer()); const image_data = Buffer.from(image_array_buffer).toString('base64'); const response = await anthropic.messages.countTokens({ model: 'claude-sonnet-4-5', messages: [ { "role": "user", "content": [ { "type": "image", "source": { "type": "base64", "media_type": image_media_type, "data": image_data, }, } ], }, { "type": "text", "text": "Describe this image" } ] }); console.log(response); ``` ```java Java import java.util.Base64; import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.Base64ImageSource; import com.anthropic.models.messages.ContentBlockParam; import com.anthropic.models.messages.ImageBlockParam; import com.anthropic.models.messages.MessageCountTokensParams; import com.anthropic.models.messages.MessageTokensCount; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.TextBlockParam; import java.net.URI; import java.net.http.HttpClient; import java.net.http.HttpRequest; import java.net.http.HttpResponse; public class CountTokensImageExample { public static void main(String[] args) throws Exception { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); String imageUrl = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"; String imageMediaType = "image/jpeg"; HttpClient httpClient = HttpClient.newHttpClient(); HttpRequest request = HttpRequest.newBuilder() .uri(URI.create(imageUrl)) .build(); byte[] imageBytes = httpClient.send(request, HttpResponse.BodyHandlers.ofByteArray()).body(); String imageBase64 = Base64.getEncoder().encodeToString(imageBytes); ContentBlockParam imageBlock = ContentBlockParam.ofImage( ImageBlockParam.builder() .source(Base64ImageSource.builder() .mediaType(Base64ImageSource.MediaType.IMAGE_JPEG) .data(imageBase64) .build()) .build()); ContentBlockParam textBlock = ContentBlockParam.ofText( TextBlockParam.builder() .text("Describe this image") .build()); MessageCountTokensParams params = MessageCountTokensParams.builder() .model(Model.CLAUDE_SONNET_4_20250514) .addUserMessageOfBlockParams(List.of(imageBlock, textBlock)) .build(); MessageTokensCount count = client.messages().countTokens(params); System.out.println(count); } } ``` ```json JSON { "input_tokens": 1551 } ``` ### Count tokens in messages with extended thinking See [here](/docs/en/build-with-claude/extended-thinking#how-context-window-is-calculated-with-extended-thinking) for more details about how the context window is calculated with extended thinking - Thinking blocks from **previous** assistant turns are ignored and **do not** count toward your input tokens - **Current** assistant turn thinking **does** count toward your input tokens ```bash Shell curl https://api.anthropic.com/v1/messages/count_tokens \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "content-type: application/json" \ --header "anthropic-version: 2023-06-01" \ --data '{ "model": "claude-sonnet-4-5", "thinking": { "type": "enabled", "budget_tokens": 16000 }, "messages": [ { "role": "user", "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?" }, { "role": "assistant", "content": [ { "type": "thinking", "thinking": "This is a nice number theory question. Lets think about it step by step...", "signature": "EuYBCkQYAiJAgCs1le6/Pol5Z4/JMomVOouGrWdhYNsH3ukzUECbB6iWrSQtsQuRHJID6lWV..." }, { "type": "text", "text": "Yes, there are infinitely many prime numbers p such that p mod 4 = 3..." } ] }, { "role": "user", "content": "Can you write a formal proof?" } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.count_tokens( model="claude-sonnet-4-5", thinking={ "type": "enabled", "budget_tokens": 16000 }, messages=[ { "role": "user", "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?" }, { "role": "assistant", "content": [ { "type": "thinking", "thinking": "This is a nice number theory question. Let's think about it step by step...", "signature": "EuYBCkQYAiJAgCs1le6/Pol5Z4/JMomVOouGrWdhYNsH3ukzUECbB6iWrSQtsQuRHJID6lWV..." }, { "type": "text", "text": "Yes, there are infinitely many prime numbers p such that p mod 4 = 3..." } ] }, { "role": "user", "content": "Can you write a formal proof?" } ] ) print(response.json()) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); const response = await client.messages.countTokens({ model: 'claude-sonnet-4-5', thinking: { 'type': 'enabled', 'budget_tokens': 16000 }, messages: [ { 'role': 'user', 'content': 'Are there an infinite number of prime numbers such that n mod 4 == 3?' }, { 'role': 'assistant', 'content': [ { 'type': 'thinking', 'thinking': "This is a nice number theory question. Let's think about it step by step...", 'signature': 'EuYBCkQYAiJAgCs1le6/Pol5Z4/JMomVOouGrWdhYNsH3ukzUECbB6iWrSQtsQuRHJID6lWV...' }, { 'type': 'text', 'text': 'Yes, there are infinitely many prime numbers p such that p mod 4 = 3...', } ] }, { 'role': 'user', 'content': 'Can you write a formal proof?' } ] }); console.log(response); ``` ```java Java import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.ContentBlockParam; import com.anthropic.models.messages.MessageCountTokensParams; import com.anthropic.models.messages.MessageTokensCount; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.TextBlockParam; import com.anthropic.models.messages.ThinkingBlockParam; public class CountTokensThinkingExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); List assistantBlocks = List.of( ContentBlockParam.ofThinking(ThinkingBlockParam.builder() .thinking("This is a nice number theory question. Let's think about it step by step...") .signature("EuYBCkQYAiJAgCs1le6/Pol5Z4/JMomVOouGrWdhYNsH3ukzUECbB6iWrSQtsQuRHJID6lWV...") .build()), ContentBlockParam.ofText(TextBlockParam.builder() .text("Yes, there are infinitely many prime numbers p such that p mod 4 = 3...") .build()) ); MessageCountTokensParams params = MessageCountTokensParams.builder() .model(Model.CLAUDE_SONNET_4_20250514) .enabledThinking(16000) .addUserMessage("Are there an infinite number of prime numbers such that n mod 4 == 3?") .addAssistantMessageOfBlockParams(assistantBlocks) .addUserMessage("Can you write a formal proof?") .build(); MessageTokensCount count = client.messages().countTokens(params); System.out.println(count); } } ``` ```json JSON { "input_tokens": 88 } ``` ### Count tokens in messages with PDFs Token counting supports PDFs with the same [limitations](/docs/en/build-with-claude/pdf-support#pdf-support-limitations) as the Messages API. ```bash Shell curl https://api.anthropic.com/v1/messages/count_tokens \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "content-type: application/json" \ --header "anthropic-version: 2023-06-01" \ --data '{ "model": "claude-sonnet-4-5", "messages": [{ "role": "user", "content": [ { "type": "document", "source": { "type": "base64", "media_type": "application/pdf", "data": "'$(base64 -i document.pdf)'" } }, { "type": "text", "text": "Please summarize this document." } ] }] }' ``` ```python Python import base64 import anthropic client = anthropic.Anthropic() with open("document.pdf", "rb") as pdf_file: pdf_base64 = base64.standard_b64encode(pdf_file.read()).decode("utf-8") response = client.messages.count_tokens( model="claude-sonnet-4-5", messages=[{ "role": "user", "content": [ { "type": "document", "source": { "type": "base64", "media_type": "application/pdf", "data": pdf_base64 } }, { "type": "text", "text": "Please summarize this document." } ] }] ) print(response.json()) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; import { readFileSync } from 'fs'; const client = new Anthropic(); const pdfBase64 = readFileSync('document.pdf', { encoding: 'base64' }); const response = await client.messages.countTokens({ model: 'claude-sonnet-4-5', messages: [{ role: 'user', content: [ { type: 'document', source: { type: 'base64', media_type: 'application/pdf', data: pdfBase64 } }, { type: 'text', text: 'Please summarize this document.' } ] }] }); console.log(response); ``` ```java Java import java.nio.file.Files; import java.nio.file.Path; import java.util.Base64; import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.Base64PdfSource; import com.anthropic.models.messages.ContentBlockParam; import com.anthropic.models.messages.DocumentBlockParam; import com.anthropic.models.messages.MessageCountTokensParams; import com.anthropic.models.messages.MessageTokensCount; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.TextBlockParam; public class CountTokensPdfExample { public static void main(String[] args) throws Exception { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); byte[] fileBytes = Files.readAllBytes(Path.of("document.pdf")); String pdfBase64 = Base64.getEncoder().encodeToString(fileBytes); ContentBlockParam documentBlock = ContentBlockParam.ofDocument( DocumentBlockParam.builder() .source(Base64PdfSource.builder() .mediaType(Base64PdfSource.MediaType.APPLICATION_PDF) .data(pdfBase64) .build()) .build()); ContentBlockParam textBlock = ContentBlockParam.ofText( TextBlockParam.builder() .text("Please summarize this document.") .build()); MessageCountTokensParams params = MessageCountTokensParams.builder() .model(Model.CLAUDE_SONNET_4_20250514) .addUserMessageOfBlockParams(List.of(documentBlock, textBlock)) .build(); MessageTokensCount count = client.messages().countTokens(params); System.out.println(count); } } ``` ```json JSON { "input_tokens": 2188 } ``` --- ## Pricing and rate limits Token counting is **free to use** but subject to requests per minute rate limits based on your [usage tier](/docs/en/api/rate-limits#rate-limits). If you need higher limits, contact sales through the [Claude Console](/settings/limits). | Usage tier | Requests per minute (RPM) | |------------|---------------------------| | 1 | 100 | | 2 | 2,000 | | 3 | 4,000 | | 4 | 8,000 | Token counting and message creation have separate and independent rate limits -- usage of one does not count against the limits of the other. --- ## FAQ
No, token counting provides an estimate without using caching logic. While you may provide `cache_control` blocks in your token counting request, prompt caching only occurs during actual message creation.
--- # Vision URL: https://platform.claude.com/docs/en/build-with-claude/vision # Vision Claude's vision capabilities allow it to understand and analyze images, opening up exciting possibilities for multimodal interaction. --- This guide describes how to work with images in Claude, including best practices, code examples, and limitations to keep in mind. --- ## How to use vision Use Claude’s vision capabilities via: - [claude.ai](https://claude.ai/). Upload an image like you would a file, or drag and drop an image directly into the chat window. - The [Console Workbench](/workbench/). A button to add images appears at the top right of every User message block. - **API request**. See the examples in this guide. --- ## Before you upload ### Basics and Limits You can include multiple images in a single request (up to 20 for [claude.ai](https://claude.ai/) and 100 for API requests). Claude will analyze all provided images when formulating its response. This can be helpful for comparing or contrasting images. If you submit an image larger than 8000x8000 px, it will be rejected. If you submit more than 20 images in one API request, this limit is 2000x2000 px. While the API supports 100 images per request, there is a [32MB request size limit](/docs/en/api/overview#request-size-limits) for standard endpoints. ### Evaluate image size For optimal performance, we recommend resizing images before uploading if they are too large. If your image’s long edge is more than 1568 pixels, or your image is more than ~1,600 tokens, it will first be scaled down, preserving aspect ratio, until it’s within the size limits. If your input image is too large and needs to be resized, it will increase latency of [time-to-first-token](/docs/en/about-claude/glossary), without giving you any additional model performance. Very small images under 200 pixels on any given edge may degrade performance. To improve [time-to-first-token](/docs/en/about-claude/glossary), we recommend resizing images to no more than 1.15 megapixels (and within 1568 pixels in both dimensions). Here is a table of maximum image sizes accepted by our API that will not be resized for common aspect ratios. With Claude Sonnet 4.5, these images use approximately 1,600 tokens and around $4.80/1K images. | Aspect ratio | Image size | | ------------ | ------------ | | 1:1 | 1092x1092 px | | 3:4 | 951x1268 px | | 2:3 | 896x1344 px | | 9:16 | 819x1456 px | | 1:2 | 784x1568 px | ### Calculate image costs Each image you include in a request to Claude counts towards your token usage. To calculate the approximate cost, multiply the approximate number of image tokens by the [per-token price of the model](https://claude.com/pricing) you’re using. If your image does not need to be resized, you can estimate the number of tokens used through this algorithm: `tokens = (width px * height px)/750` Here are examples of approximate tokenization and costs for different image sizes within our API's size constraints based on Claude Sonnet 4.5 per-token price of $3 per million input tokens: | Image size | \# of Tokens | Cost / image | Cost / 1K images | | ----------------------------- | ------------ | ------------ | ---------------- | | 200x200 px(0.04 megapixels) | \~54 | \~$0.00016 | \~$0.16 | | 1000x1000 px(1 megapixel) | \~1334 | \~$0.004 | \~$4.00 | | 1092x1092 px(1.19 megapixels) | \~1590 | \~$0.0048 | \~$4.80 | ### Ensuring image quality When providing images to Claude, keep the following in mind for best results: - **Image format**: Use a supported image format: JPEG, PNG, GIF, or WebP. - **Image clarity**: Ensure images are clear and not too blurry or pixelated. - **Text**: If the image contains important text, make sure it’s legible and not too small. Avoid cropping out key visual context just to enlarge the text. --- ## Prompt examples Many of the [prompting techniques](/docs/en/build-with-claude/prompt-engineering/overview) that work well for text-based interactions with Claude can also be applied to image-based prompts. These examples demonstrate best practice prompt structures involving images. Just as with document-query placement, Claude works best when images come before text. Images placed after text or interpolated with text will still perform well, but if your use case allows it, we recommend an image-then-text structure. ### About the prompt examples The following examples demonstrate how to use Claude's vision capabilities using various programming languages and approaches. You can provide images to Claude in three ways: 1. As a base64-encoded image in `image` content blocks 2. As a URL reference to an image hosted online 3. Using the Files API (upload once, use multiple times) The base64 example prompts use these variables: ```bash Shell # For URL-based images, you can use the URL directly in your JSON request # For base64-encoded images, you need to first encode the image # Example of how to encode an image to base64 in bash: BASE64_IMAGE_DATA=$(curl -s "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" | base64) # The encoded data can now be used in your API calls ``` ```python Python import base64 import httpx # For base64-encoded images image1_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" image1_media_type = "image/jpeg" image1_data = base64.standard_b64encode(httpx.get(image1_url).content).decode("utf-8") image2_url = "https://upload.wikimedia.org/wikipedia/commons/b/b5/Iridescent.green.sweat.bee1.jpg" image2_media_type = "image/jpeg" image2_data = base64.standard_b64encode(httpx.get(image2_url).content).decode("utf-8") # For URL-based images, you can use the URLs directly in your requests ``` ```typescript TypeScript import axios from 'axios'; // For base64-encoded images async function getBase64Image(url: string): Promise { const response = await axios.get(url, { responseType: 'arraybuffer' }); return Buffer.from(response.data, 'binary').toString('base64'); } // Usage async function prepareImages() { const imageData = await getBase64Image('https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg'); // Now you can use imageData in your API calls } // For URL-based images, you can use the URLs directly in your requests ``` ```java Java import java.io.IOException; import java.util.Base64; import java.io.InputStream; import java.net.URL; public class ImageHandlingExample { public static void main(String[] args) throws IOException, InterruptedException { // For base64-encoded images String image1Url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"; String image1MediaType = "image/jpeg"; String image1Data = downloadAndEncodeImage(image1Url); String image2Url = "https://upload.wikimedia.org/wikipedia/commons/b/b5/Iridescent.green.sweat.bee1.jpg"; String image2MediaType = "image/jpeg"; String image2Data = downloadAndEncodeImage(image2Url); // For URL-based images, you can use the URLs directly in your requests } private static String downloadAndEncodeImage(String imageUrl) throws IOException { try (InputStream inputStream = new URL(imageUrl).openStream()) { return Base64.getEncoder().encodeToString(inputStream.readAllBytes()); } } } ``` Below are examples of how to include images in a Messages API request using base64-encoded images and URL references: ### Base64-encoded image example ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ { "role": "user", "content": [ { "type": "image", "source": { "type": "base64", "media_type": "image/jpeg", "data": "'"$BASE64_IMAGE_DATA"'" } }, { "type": "text", "text": "Describe this image." } ] } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() message = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "image", "source": { "type": "base64", "media_type": image1_media_type, "data": image1_data, }, }, { "type": "text", "text": "Describe this image." } ], } ], ) print(message) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY, }); async function main() { const message = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, messages: [ { role: "user", content: [ { type: "image", source: { type: "base64", media_type: "image/jpeg", data: imageData, // Base64-encoded image data as string } }, { type: "text", text: "Describe this image." } ] } ] }); console.log(message); } main(); ``` ```java Java import java.io.IOException; import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.*; public class VisionExample { public static void main(String[] args) throws IOException, InterruptedException { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); String imageData = ""; // // Base64-encoded image data as string List contentBlockParams = List.of( ContentBlockParam.ofImage( ImageBlockParam.builder() .source(Base64ImageSource.builder() .data(imageData) .build()) .build() ), ContentBlockParam.ofText(TextBlockParam.builder() .text("Describe this image.") .build()) ); Message message = client.messages().create( MessageCreateParams.builder() .model(Model.CLAUDE_SONNET_4_5_LATEST) .maxTokens(1024) .addUserMessageOfBlockParams(contentBlockParams) .build() ); System.out.println(message); } } ``` ### URL-based image example ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ { "role": "user", "content": [ { "type": "image", "source": { "type": "url", "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" } }, { "type": "text", "text": "Describe this image." } ] } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() message = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "image", "source": { "type": "url", "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg", }, }, { "type": "text", "text": "Describe this image." } ], } ], ) print(message) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY, }); async function main() { const message = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, messages: [ { role: "user", content: [ { type: "image", source: { type: "url", url: "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" } }, { type: "text", text: "Describe this image." } ] } ] }); console.log(message); } main(); ``` ```java Java import java.io.IOException; import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.*; public class VisionExample { public static void main(String[] args) throws IOException, InterruptedException { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); List contentBlockParams = List.of( ContentBlockParam.ofImage( ImageBlockParam.builder() .source(UrlImageSource.builder() .url("https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg") .build()) .build() ), ContentBlockParam.ofText(TextBlockParam.builder() .text("Describe this image.") .build()) ); Message message = client.messages().create( MessageCreateParams.builder() .model(Model.CLAUDE_SONNET_4_5_LATEST) .maxTokens(1024) .addUserMessageOfBlockParams(contentBlockParams) .build() ); System.out.println(message); } } ``` ### Files API image example For images you'll use repeatedly or when you want to avoid encoding overhead, use the [Files API](/docs/en/build-with-claude/files): ```bash Shell # First, upload your image to the Files API curl -X POST https://api.anthropic.com/v1/files \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" \ -F "file=@image.jpg" # Then use the returned file_id in your message curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ { "role": "user", "content": [ { "type": "image", "source": { "type": "file", "file_id": "file_abc123" } }, { "type": "text", "text": "Describe this image." } ] } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() # Upload the image file with open("image.jpg", "rb") as f: file_upload = client.beta.files.upload(file=("image.jpg", f, "image/jpeg")) # Use the uploaded file in a message message = client.beta.messages.create( model="claude-sonnet-4-5", max_tokens=1024, betas=["files-api-2025-04-14"], messages=[ { "role": "user", "content": [ { "type": "image", "source": { "type": "file", "file_id": file_upload.id } }, { "type": "text", "text": "Describe this image." } ] } ], ) print(message.content) ``` ```typescript TypeScript import { Anthropic, toFile } from '@anthropic-ai/sdk'; import fs from 'fs'; const anthropic = new Anthropic(); async function main() { // Upload the image file const fileUpload = await anthropic.beta.files.upload({ file: toFile(fs.createReadStream('image.jpg'), undefined, { type: "image/jpeg" }) }, { betas: ['files-api-2025-04-14'] }); // Use the uploaded file in a message const response = await anthropic.beta.messages.create({ model: 'claude-sonnet-4-5', max_tokens: 1024, betas: ['files-api-2025-04-14'], messages: [ { role: 'user', content: [ { type: 'image', source: { type: 'file', file_id: fileUpload.id } }, { type: 'text', text: 'Describe this image.' } ] } ] }); console.log(response); } main(); ``` ```java Java import java.io.IOException; import java.nio.file.Files; import java.nio.file.Path; import java.util.List; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.File; import com.anthropic.models.files.FileUploadParams; import com.anthropic.models.messages.*; public class ImageFilesExample { public static void main(String[] args) throws IOException { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Upload the image file File file = client.beta().files().upload(FileUploadParams.builder() .file(Files.newInputStream(Path.of("image.jpg"))) .build()); // Use the uploaded file in a message ImageBlockParam imageParam = ImageBlockParam.builder() .fileSource(file.id()) .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_SONNET_4_5_LATEST) .maxTokens(1024) .addUserMessageOfBlockParams( List.of( ContentBlockParam.ofImage(imageParam), ContentBlockParam.ofText( TextBlockParam.builder() .text("Describe this image.") .build() ) ) ) .build(); Message message = client.messages().create(params); System.out.println(message.content()); } } ``` See [Messages API examples](/docs/en/api/messages) for more example code and parameter details.
It’s best to place images earlier in the prompt than questions about them or instructions for tasks that use them. Ask Claude to describe one image. | Role | Content | | ---- | ------------------------------ | | User | \[Image\] Describe this image. | ```python Python message = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "image", "source": { "type": "base64", "media_type": image1_media_type, "data": image1_data, }, }, { "type": "text", "text": "Describe this image." } ], } ], ) ``` ```python Python message = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "image", "source": { "type": "url", "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg", }, }, { "type": "text", "text": "Describe this image." } ], } ], ) ```
In situations where there are multiple images, introduce each image with `Image 1:` and `Image 2:` and so on. You don’t need newlines between images or between images and the prompt. Ask Claude to describe the differences between multiple images. | Role | Content | | ---- | ------------------------------------------------------------------------- | | User | Image 1: \[Image 1\] Image 2: \[Image 2\] How are these images different? | ```python Python message = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "text", "text": "Image 1:" }, { "type": "image", "source": { "type": "base64", "media_type": image1_media_type, "data": image1_data, }, }, { "type": "text", "text": "Image 2:" }, { "type": "image", "source": { "type": "base64", "media_type": image2_media_type, "data": image2_data, }, }, { "type": "text", "text": "How are these images different?" } ], } ], ) ``` ```python Python message = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": [ { "type": "text", "text": "Image 1:" }, { "type": "image", "source": { "type": "url", "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg", }, }, { "type": "text", "text": "Image 2:" }, { "type": "image", "source": { "type": "url", "url": "https://upload.wikimedia.org/wikipedia/commons/b/b5/Iridescent.green.sweat.bee1.jpg", }, }, { "type": "text", "text": "How are these images different?" } ], } ], ) ```
Ask Claude to describe the differences between multiple images, while giving it a system prompt for how to respond. | Content | | | ------- | ------------------------------------------------------------------------- | | System | Respond only in Spanish. | | User | Image 1: \[Image 1\] Image 2: \[Image 2\] How are these images different? | ```python Python message = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, system="Respond only in Spanish.", messages=[ { "role": "user", "content": [ { "type": "text", "text": "Image 1:" }, { "type": "image", "source": { "type": "base64", "media_type": image1_media_type, "data": image1_data, }, }, { "type": "text", "text": "Image 2:" }, { "type": "image", "source": { "type": "base64", "media_type": image2_media_type, "data": image2_data, }, }, { "type": "text", "text": "How are these images different?" } ], } ], ) ``` ```python Python message = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, system="Respond only in Spanish.", messages=[ { "role": "user", "content": [ { "type": "text", "text": "Image 1:" }, { "type": "image", "source": { "type": "url", "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg", }, }, { "type": "text", "text": "Image 2:" }, { "type": "image", "source": { "type": "url", "url": "https://upload.wikimedia.org/wikipedia/commons/b/b5/Iridescent.green.sweat.bee1.jpg", }, }, { "type": "text", "text": "How are these images different?" } ], } ], ) ```
Claude’s vision capabilities shine in multimodal conversations that mix images and text. You can have extended back-and-forth exchanges with Claude, adding new images or follow-up questions at any point. This enables powerful workflows for iterative image analysis, comparison, or combining visuals with other knowledge. Ask Claude to contrast two images, then ask a follow-up question comparing the first images to two new images. | Role | Content | | --------- | ------------------------------------------------------------------------------------ | | User | Image 1: \[Image 1\] Image 2: \[Image 2\] How are these images different? | | Assistant | \[Claude's response\] | | User | Image 1: \[Image 3\] Image 2: \[Image 4\] Are these images similar to the first two? | | Assistant | \[Claude's response\] | When using the API, simply insert new images into the array of Messages in the `user` role as part of any standard [multiturn conversation](/docs/en/api/messages) structure.
--- ## Limitations While Claude's image understanding capabilities are cutting-edge, there are some limitations to be aware of: - **People identification**: Claude [cannot be used](https://www.anthropic.com/legal/aup) to identify (i.e., name) people in images and will refuse to do so. - **Accuracy**: Claude may hallucinate or make mistakes when interpreting low-quality, rotated, or very small images under 200 pixels. - **Spatial reasoning**: Claude's spatial reasoning abilities are limited. It may struggle with tasks requiring precise localization or layouts, like reading an analog clock face or describing exact positions of chess pieces. - **Counting**: Claude can give approximate counts of objects in an image but may not always be precisely accurate, especially with large numbers of small objects. - **AI generated images**: Claude does not know if an image is AI-generated and may be incorrect if asked. Do not rely on it to detect fake or synthetic images. - **Inappropriate content**: Claude will not process inappropriate or explicit images that violate our [Acceptable Use Policy](https://www.anthropic.com/legal/aup). - **Healthcare applications**: While Claude can analyze general medical images, it is not designed to interpret complex diagnostic scans such as CTs or MRIs. Claude's outputs should not be considered a substitute for professional medical advice or diagnosis. Always carefully review and verify Claude's image interpretations, especially for high-stakes use cases. Do not use Claude for tasks requiring perfect precision or sensitive image analysis without human oversight. --- ## FAQ
Claude currently supports JPEG, PNG, GIF, and WebP image formats, specifically: - `image/jpeg` - `image/png` - `image/gif` - `image/webp`
{" "}
Yes, Claude can now process images from URLs with our URL image source blocks in the API. Simply use the "url" source type instead of "base64" in your API requests. Example: ```json { "type": "image", "source": { "type": "url", "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" } } ```
Yes, there are limits: - API: Maximum 5MB per image - claude.ai: Maximum 10MB per image Images larger than these limits will be rejected and return an error when using our API.
The image limits are: - Messages API: Up to 100 images per request - claude.ai: Up to 20 images per turn Requests exceeding these limits will be rejected and return an error.
{" "}
No, Claude does not parse or receive any metadata from images passed to it.
{" "}
No. Image uploads are ephemeral and not stored beyond the duration of the API request. Uploaded images are automatically deleted after they have been processed.
{" "}
Please refer to our privacy policy page for information on how we handle uploaded images and other data. We do not use uploaded images to train our models.
If Claude's image interpretation seems incorrect: 1. Ensure the image is clear, high-quality, and correctly oriented. 2. Try prompt engineering techniques to improve results. 3. If the issue persists, flag the output in claude.ai (thumbs up/down) or contact our support team. Your feedback helps us improve!
No, Claude is an image understanding model only. It can interpret and analyze images, but it cannot generate, produce, edit, manipulate, or create images.
--- ## Dive deeper into vision Ready to start building with images using Claude? Here are a few helpful resources: - [Multimodal cookbook](https://platform.claude.com/cookbook/multimodal-getting-started-with-vision): This cookbook has tips on [getting started with images](https://platform.claude.com/cookbook/multimodal-getting-started-with-vision) and [best practice techniques](https://platform.claude.com/cookbook/multimodal-best-practices-for-vision) to ensure the highest quality performance with images. See how you can effectively prompt Claude with images to carry out tasks such as [interpreting and analyzing charts](https://platform.claude.com/cookbook/multimodal-reading-charts-graphs-powerpoints) or [extracting content from forms](https://platform.claude.com/cookbook/multimodal-how-to-transcribe-text). - [API reference](/docs/en/api/messages): Visit our documentation for the Messages API, including example [API calls involving images](/docs/en/build-with-claude/working-with-messages#vision). If you have any other questions, feel free to reach out to our [support team](https://support.claude.com/). You can also join our [developer community](https://www.anthropic.com/discord) to connect with other creators and get help from Anthropic experts. ### Tools --- # Tool use with Claude URL: https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview # Tool use with Claude --- Claude is capable of interacting with tools and functions, allowing you to extend Claude's capabilities to perform a wider variety of tasks. Learn everything you need to master tool use with Claude as part of our new [courses](https://anthropic.skilljar.com/)! Please continue to share your ideas and suggestions using this [form](https://forms.gle/BFnYc6iCkWoRzFgk7). **Guarantee schema conformance with strict tool use** [Structured Outputs](/docs/en/build-with-claude/structured-outputs) provides guaranteed schema validation for tool inputs. Add `strict: true` to your tool definitions to ensure Claude's tool calls always match your schema exactly—no more type mismatches or missing fields. Perfect for production agents where invalid tool parameters would cause failures. [Learn when to use strict tool use →](/docs/en/build-with-claude/structured-outputs#when-to-use-json-outputs-vs-strict-tool-use) Here's an example of how to provide tools to Claude using the Messages API: ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "tools": [ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" } }, "required": ["location"] } } ], "messages": [ { "role": "user", "content": "What is the weather like in San Francisco?" } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA", } }, "required": ["location"], }, } ], messages=[{"role": "user", "content": "What's the weather like in San Francisco?"}], ) print(response) ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY }); async function main() { const response = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, tools: [{ name: "get_weather", description: "Get the current weather in a given location", input_schema: { type: "object", properties: { location: { type: "string", description: "The city and state, e.g. San Francisco, CA" } }, required: ["location"] } }], messages: [{ role: "user", content: "Tell me the weather in San Francisco." }] }); console.log(response); } main().catch(console.error); ``` ```java Java import java.util.List; import java.util.Map; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.JsonValue; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.Tool; import com.anthropic.models.messages.Tool.InputSchema; public class GetWeatherExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); InputSchema schema = InputSchema.builder() .properties(JsonValue.from(Map.of( "location", Map.of( "type", "string", "description", "The city and state, e.g. San Francisco, CA")))) .putAdditionalProperty("required", JsonValue.from(List.of("location"))) .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(1024) .addTool(Tool.builder() .name("get_weather") .description("Get the current weather in a given location") .inputSchema(schema) .build()) .addUserMessage("What's the weather like in San Francisco?") .build(); Message message = client.messages().create(params); System.out.println(message); } } ``` --- ## How tool use works Claude supports two types of tools: 1. **Client tools**: Tools that execute on your systems, which include: - User-defined custom tools that you create and implement - Anthropic-defined tools like [computer use](/docs/en/agents-and-tools/tool-use/computer-use-tool) and [text editor](/docs/en/agents-and-tools/tool-use/text-editor-tool) that require client implementation 2. **Server tools**: Tools that execute on Anthropic's servers, like the [web search](/docs/en/agents-and-tools/tool-use/web-search-tool) and [web fetch](/docs/en/agents-and-tools/tool-use/web-fetch-tool) tools. These tools must be specified in the API request but don't require implementation on your part. Anthropic-defined tools use versioned types (e.g., `web_search_20250305`, `text_editor_20250124`) to ensure compatibility across model versions. ### Client tools Integrate client tools with Claude in these steps: - Define client tools with names, descriptions, and input schemas in your API request. - Include a user prompt that might require these tools, e.g., "What's the weather in San Francisco?" - Claude assesses if any tools can help with the user's query. - If yes, Claude constructs a properly formatted tool use request. - For client tools, the API response has a `stop_reason` of `tool_use`, signaling Claude's intent. - Extract the tool name and input from Claude's request - Execute the tool code on your system - Return the results in a new `user` message containing a `tool_result` content block - Claude analyzes the tool results to craft its final response to the original user prompt. Note: Steps 3 and 4 are optional. For some workflows, Claude's tool use request (step 2) might be all you need, without sending results back to Claude. ### Server tools Server tools follow a different workflow: - Server tools, like [web search](/docs/en/agents-and-tools/tool-use/web-search-tool) and [web fetch](/docs/en/agents-and-tools/tool-use/web-fetch-tool), have their own parameters. - Include a user prompt that might require these tools, e.g., "Search for the latest news about AI" or "Analyze the content at this URL." - Claude assesses if a server tool can help with the user's query. - If yes, Claude executes the tool, and the results are automatically incorporated into Claude's response. - Claude analyzes the server tool results to craft its final response to the original user prompt. - No additional user interaction is needed for server tool execution. --- ## Using MCP tools with Claude If you're building an application that uses the [Model Context Protocol (MCP)](https://modelcontextprotocol.io), you can use tools from MCP servers directly with Claude's Messages API. MCP tool definitions use a schema format that's similar to Claude's tool format. You just need to rename `inputSchema` to `input_schema`. **Don't want to build your own MCP client?** Use the [MCP connector](/docs/en/agents-and-tools/mcp-connector) to connect directly to remote MCP servers from the Messages API without implementing a client. ### Converting MCP tools to Claude format When you build an MCP client and call `list_tools()` on an MCP server, you'll receive tool definitions with an `inputSchema` field. To use these tools with Claude, convert them to Claude's format: ```python Python from mcp import ClientSession async def get_claude_tools(mcp_session: ClientSession): """Convert MCP tools to Claude's tool format.""" mcp_tools = await mcp_session.list_tools() claude_tools = [] for tool in mcp_tools.tools: claude_tools.append({ "name": tool.name, "description": tool.description or "", "input_schema": tool.inputSchema # Rename inputSchema to input_schema }) return claude_tools ``` ```typescript TypeScript import { Client } from "@modelcontextprotocol/sdk/client/index.js"; async function getClaudeTools(mcpClient: Client) { // Convert MCP tools to Claude's tool format const mcpTools = await mcpClient.listTools(); return mcpTools.tools.map((tool) => ({ name: tool.name, description: tool.description ?? "", input_schema: tool.inputSchema, // Rename inputSchema to input_schema })); } ``` Then pass these converted tools to Claude: ```python Python import anthropic client = anthropic.Anthropic() claude_tools = await get_claude_tools(mcp_session) response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=claude_tools, messages=[{"role": "user", "content": "What tools do you have available?"}] ) ``` ```typescript TypeScript import Anthropic from "@anthropic-ai/sdk"; const anthropic = new Anthropic(); const claudeTools = await getClaudeTools(mcpClient); const response = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, tools: claudeTools, messages: [{ role: "user", content: "What tools do you have available?" }], }); ``` When Claude responds with a `tool_use` block, execute the tool on your MCP server using `call_tool()` and return the result to Claude in a `tool_result` block. For a complete guide to building MCP clients, see [Build an MCP client](https://modelcontextprotocol.io/docs/develop/build-client). --- ## Tool use examples Here are a few code examples demonstrating various tool use patterns and techniques. For brevity's sake, the tools are simple tools, and the tool descriptions are shorter than would be ideal to ensure best performance.
```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "tools": [{ "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either \"celsius\" or \"fahrenheit\"" } }, "required": ["location"] } }], "messages": [{"role": "user", "content": "What is the weather like in San Francisco?"}] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either \"celsius\" or \"fahrenheit\"" } }, "required": ["location"] } } ], messages=[{"role": "user", "content": "What is the weather like in San Francisco?"}] ) print(response) ``` ```java Java import java.util.List; import java.util.Map; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.JsonValue; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.Tool; import com.anthropic.models.messages.Tool.InputSchema; public class WeatherToolExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); InputSchema schema = InputSchema.builder() .properties(JsonValue.from(Map.of( "location", Map.of( "type", "string", "description", "The city and state, e.g. San Francisco, CA" ), "unit", Map.of( "type", "string", "enum", List.of("celsius", "fahrenheit"), "description", "The unit of temperature, either \"celsius\" or \"fahrenheit\"" ) ))) .putAdditionalProperty("required", JsonValue.from(List.of("location"))) .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(1024) .addTool(Tool.builder() .name("get_weather") .description("Get the current weather in a given location") .inputSchema(schema) .build()) .addUserMessage("What is the weather like in San Francisco?") .build(); Message message = client.messages().create(params); System.out.println(message); } } ``` Claude will return a response similar to: ```json JSON { "id": "msg_01Aq9w938a90dw8q", "model": "claude-sonnet-4-5", "stop_reason": "tool_use", "role": "assistant", "content": [ { "type": "text", "text": "I'll check the current weather in San Francisco for you." }, { "type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9", "name": "get_weather", "input": {"location": "San Francisco, CA", "unit": "celsius"} } ] } ``` You would then need to execute the `get_weather` function with the provided input, and return the result in a new `user` message: ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "tools": [ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either \"celsius\" or \"fahrenheit\"" } }, "required": ["location"] } } ], "messages": [ { "role": "user", "content": "What is the weather like in San Francisco?" }, { "role": "assistant", "content": [ { "type": "text", "text": "I'll check the current weather in San Francisco for you." }, { "type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9", "name": "get_weather", "input": { "location": "San Francisco, CA", "unit": "celsius" } } ] }, { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "15 degrees" } ] } ] }' ``` ```python Python response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either 'celsius' or 'fahrenheit'" } }, "required": ["location"] } } ], messages=[ { "role": "user", "content": "What's the weather like in San Francisco?" }, { "role": "assistant", "content": [ { "type": "text", "text": "I'll check the current weather in San Francisco for you." }, { "type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9", "name": "get_weather", "input": {"location": "San Francisco, CA", "unit": "celsius"} } ] }, { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", # from the API response "content": "65 degrees" # from running your tool } ] } ] ) print(response) ``` ```java Java import java.util.List; import java.util.Map; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.JsonValue; import com.anthropic.models.messages.*; import com.anthropic.models.messages.Tool.InputSchema; public class ToolConversationExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); InputSchema schema = InputSchema.builder() .properties(JsonValue.from(Map.of( "location", Map.of( "type", "string", "description", "The city and state, e.g. San Francisco, CA" ), "unit", Map.of( "type", "string", "enum", List.of("celsius", "fahrenheit"), "description", "The unit of temperature, either \"celsius\" or \"fahrenheit\"" ) ))) .putAdditionalProperty("required", JsonValue.from(List.of("location"))) .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(1024) .addTool(Tool.builder() .name("get_weather") .description("Get the current weather in a given location") .inputSchema(schema) .build()) .addUserMessage("What is the weather like in San Francisco?") .addAssistantMessageOfBlockParams( List.of( ContentBlockParam.ofText( TextBlockParam.builder() .text("I'll check the current weather in San Francisco for you.") .build() ), ContentBlockParam.ofToolUse( ToolUseBlockParam.builder() .id("toolu_01A09q90qw90lq917835lq9") .name("get_weather") .input(JsonValue.from(Map.of( "location", "San Francisco, CA", "unit", "celsius" ))) .build() ) ) ) .addUserMessageOfBlockParams(List.of( ContentBlockParam.ofToolResult( ToolResultBlockParam.builder() .toolUseId("toolu_01A09q90qw90lq917835lq9") .content("15 degrees") .build() ) )) .build(); Message message = client.messages().create(params); System.out.println(message); } } ``` This will print Claude's final response, incorporating the weather data: ```json JSON { "id": "msg_01Aq9w938a90dw8q", "model": "claude-sonnet-4-5", "stop_reason": "stop_sequence", "role": "assistant", "content": [ { "type": "text", "text": "The current weather in San Francisco is 15 degrees Celsius (59 degrees Fahrenheit). It's a cool day in the city by the bay!" } ] } ```
Claude can call multiple tools in parallel within a single response, which is useful for tasks that require multiple independent operations. When using parallel tools, all `tool_use` blocks are included in a single assistant message, and all corresponding `tool_result` blocks must be provided in the subsequent user message. **Important**: Tool results must be formatted correctly to avoid API errors and ensure Claude continues using parallel tools. See our [implementation guide](/docs/en/agents-and-tools/tool-use/implement-tool-use#parallel-tool-use) for detailed formatting requirements and complete code examples. For comprehensive examples, test scripts, and best practices for implementing parallel tool calls, see the [parallel tool use section](/docs/en/agents-and-tools/tool-use/implement-tool-use#parallel-tool-use) in our implementation guide.
You can provide Claude with multiple tools to choose from in a single request. Here's an example with both a `get_weather` and a `get_time` tool, along with a user query that asks for both. ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "tools": [{ "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either 'celsius' or 'fahrenheit'" } }, "required": ["location"] } }, { "name": "get_time", "description": "Get the current time in a given time zone", "input_schema": { "type": "object", "properties": { "timezone": { "type": "string", "description": "The IANA time zone name, e.g. America/Los_Angeles" } }, "required": ["timezone"] } }], "messages": [{ "role": "user", "content": "What is the weather like right now in New York? Also what time is it there?" }] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either 'celsius' or 'fahrenheit'" } }, "required": ["location"] } }, { "name": "get_time", "description": "Get the current time in a given time zone", "input_schema": { "type": "object", "properties": { "timezone": { "type": "string", "description": "The IANA time zone name, e.g. America/Los_Angeles" } }, "required": ["timezone"] } } ], messages=[ { "role": "user", "content": "What is the weather like right now in New York? Also what time is it there?" } ] ) print(response) ``` ```java Java import java.util.List; import java.util.Map; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.JsonValue; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.Tool; import com.anthropic.models.messages.Tool.InputSchema; public class MultipleToolsExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Weather tool schema InputSchema weatherSchema = InputSchema.builder() .properties(JsonValue.from(Map.of( "location", Map.of( "type", "string", "description", "The city and state, e.g. San Francisco, CA" ), "unit", Map.of( "type", "string", "enum", List.of("celsius", "fahrenheit"), "description", "The unit of temperature, either \"celsius\" or \"fahrenheit\"" ) ))) .putAdditionalProperty("required", JsonValue.from(List.of("location"))) .build(); // Time tool schema InputSchema timeSchema = InputSchema.builder() .properties(JsonValue.from(Map.of( "timezone", Map.of( "type", "string", "description", "The IANA time zone name, e.g. America/Los_Angeles" ) ))) .putAdditionalProperty("required", JsonValue.from(List.of("timezone"))) .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(1024) .addTool(Tool.builder() .name("get_weather") .description("Get the current weather in a given location") .inputSchema(weatherSchema) .build()) .addTool(Tool.builder() .name("get_time") .description("Get the current time in a given time zone") .inputSchema(timeSchema) .build()) .addUserMessage("What is the weather like right now in New York? Also what time is it there?") .build(); Message message = client.messages().create(params); System.out.println(message); } } ``` In this case, Claude may either: - Use the tools sequentially (one at a time) — calling `get_weather` first, then `get_time` after receiving the weather result - Use parallel tool calls — outputting multiple `tool_use` blocks in a single response when the operations are independent When Claude makes parallel tool calls, you must return all tool results in a single `user` message, with each result in its own `tool_result` block.
If the user's prompt doesn't include enough information to fill all the required parameters for a tool, Claude Opus is much more likely to recognize that a parameter is missing and ask for it. Claude Sonnet may ask, especially when prompted to think before outputting a tool request. But it may also do its best to infer a reasonable value. For example, using the `get_weather` tool above, if you ask Claude "What's the weather?" without specifying a location, Claude, particularly Claude Sonnet, may make a guess about tools inputs: ```json JSON { "type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9", "name": "get_weather", "input": {"location": "New York, NY", "unit": "fahrenheit"} } ``` This behavior is not guaranteed, especially for more ambiguous prompts and for less intelligent models. If Claude Opus doesn't have enough context to fill in the required parameters, it is far more likely respond with a clarifying question instead of making a tool call.
Some tasks may require calling multiple tools in sequence, using the output of one tool as the input to another. In such a case, Claude will call one tool at a time. If prompted to call the tools all at once, Claude is likely to guess parameters for tools further downstream if they are dependent on tool results for tools further upstream. Here's an example of using a `get_location` tool to get the user's location, then passing that location to the `get_weather` tool: ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "tools": [ { "name": "get_location", "description": "Get the current user location based on their IP address. This tool has no parameters or arguments.", "input_schema": { "type": "object", "properties": {} } }, { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either 'celsius' or 'fahrenheit'" } }, "required": ["location"] } } ], "messages": [{ "role": "user", "content": "What is the weather like where I am?" }] }' ``` ```python Python response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "name": "get_location", "description": "Get the current user location based on their IP address. This tool has no parameters or arguments.", "input_schema": { "type": "object", "properties": {} } }, { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either 'celsius' or 'fahrenheit'" } }, "required": ["location"] } } ], messages=[{ "role": "user", "content": "What's the weather like where I am?" }] ) ``` ```java Java import java.util.List; import java.util.Map; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.JsonValue; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.Tool; import com.anthropic.models.messages.Tool.InputSchema; public class EmptySchemaToolExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); // Empty schema for location tool InputSchema locationSchema = InputSchema.builder() .properties(JsonValue.from(Map.of())) .build(); // Weather tool schema InputSchema weatherSchema = InputSchema.builder() .properties(JsonValue.from(Map.of( "location", Map.of( "type", "string", "description", "The city and state, e.g. San Francisco, CA" ), "unit", Map.of( "type", "string", "enum", List.of("celsius", "fahrenheit"), "description", "The unit of temperature, either \"celsius\" or \"fahrenheit\"" ) ))) .putAdditionalProperty("required", JsonValue.from(List.of("location"))) .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_0) .maxTokens(1024) .addTool(Tool.builder() .name("get_location") .description("Get the current user location based on their IP address. This tool has no parameters or arguments.") .inputSchema(locationSchema) .build()) .addTool(Tool.builder() .name("get_weather") .description("Get the current weather in a given location") .inputSchema(weatherSchema) .build()) .addUserMessage("What is the weather like where I am?") .build(); Message message = client.messages().create(params); System.out.println(message); } } ``` In this case, Claude would first call the `get_location` tool to get the user's location. After you return the location in a `tool_result`, Claude would then call `get_weather` with that location to get the final answer. The full conversation might look like: | Role | Content | | --------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | User | What's the weather like where I am? | | Assistant | I'll find your current location first, then check the weather there. \[Tool use for get_location\] | | User | \[Tool result for get_location with matching id and result of San Francisco, CA\] | | Assistant | \[Tool use for get_weather with the following input\]\{ "location": "San Francisco, CA", "unit": "fahrenheit" } | | User | \[Tool result for get_weather with matching id and result of "59°F (15°C), mostly cloudy"\] | | Assistant | Based on your current location in San Francisco, CA, the weather right now is 59°F (15°C) and mostly cloudy. It's a fairly cool and overcast day in the city. You may want to bring a light jacket if you're heading outside. | This example demonstrates how Claude can chain together multiple tool calls to answer a question that requires gathering data from different sources. The key steps are: 1. Claude first realizes it needs the user's location to answer the weather question, so it calls the `get_location` tool. 2. The user (i.e. the client code) executes the actual `get_location` function and returns the result "San Francisco, CA" in a `tool_result` block. 3. With the location now known, Claude proceeds to call the `get_weather` tool, passing in "San Francisco, CA" as the `location` parameter (as well as a guessed `unit` parameter, as `unit` is not a required parameter). 4. The user again executes the actual `get_weather` function with the provided arguments and returns the weather data in another `tool_result` block. 5. Finally, Claude incorporates the weather data into a natural language response to the original question.
By default, Claude Opus is prompted to think before it answers a tool use query to best determine whether a tool is necessary, which tool to use, and the appropriate parameters. Claude Sonnet and Claude Haiku are prompted to try to use tools as much as possible and are more likely to call an unnecessary tool or infer missing parameters. To prompt Sonnet or Haiku to better assess the user query before making tool calls, the following prompt can be used: Chain of thought prompt `Answer the user's request using relevant tools (if they are available). Before calling a tool, do some analysis. First, think about which of the provided tools is the relevant tool to answer the user's request. Second, go through each of the required parameters of the relevant tool and determine if the user has directly provided or given enough information to infer a value. When deciding if the parameter can be inferred, carefully consider all the context to see if it supports a specific value. If all of the required parameters are present or can be reasonably inferred, proceed with the tool call. BUT, if one of the values for a required parameter is missing, DO NOT invoke the function (not even with fillers for the missing params) and instead, ask the user to provide the missing parameters. DO NOT ask for more information on optional parameters if it is not provided. `
--- ## Pricing Tool use requests are priced based on: 1. The total number of input tokens sent to the model (including in the `tools` parameter) 2. The number of output tokens generated 3. For server-side tools, additional usage-based pricing (e.g., web search charges per search performed) Client-side tools are priced the same as any other Claude API request, while server-side tools may incur additional charges based on their specific usage. The additional tokens from tool use come from: - The `tools` parameter in API requests (tool names, descriptions, and schemas) - `tool_use` content blocks in API requests and responses - `tool_result` content blocks in API requests When you use `tools`, we also automatically include a special system prompt for the model which enables tool use. The number of tool use tokens required for each model are listed below (excluding the additional tokens listed above). Note that the table assumes at least 1 tool is provided. If no `tools` are provided, then a tool choice of `none` uses 0 additional system prompt tokens. | Model | Tool choice | Tool use system prompt token count | |--------------------------|------------------------------------------------------|---------------------------------------------| | Claude Opus 4.5 | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Opus 4.1 | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Opus 4 | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Sonnet 4.5 | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Sonnet 4 | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Haiku 4.5 | `auto`, `none`
`any`, `tool` | 346 tokens
313 tokens | | Claude Haiku 3.5 | `auto`, `none`
`any`, `tool` | 264 tokens
340 tokens | | Claude Opus 3 ([deprecated](/docs/en/about-claude/model-deprecations)) | `auto`, `none`
`any`, `tool` | 530 tokens
281 tokens | | Claude Sonnet 3 | `auto`, `none`
`any`, `tool` | 159 tokens
235 tokens | | Claude Haiku 3 | `auto`, `none`
`any`, `tool` | 264 tokens
340 tokens | These token counts are added to your normal input and output tokens to calculate the total cost of a request. Refer to our [models overview table](/docs/en/about-claude/models/overview#latest-models-comparison) for current per-model prices. When you send a tool use prompt, just like any other API request, the response will output both input and output token counts as part of the reported `usage` metrics. --- ## Next Steps Explore our repository of ready-to-implement tool use code examples in our cookbooks: Learn how to integrate a simple calculator tool with Claude for precise numerical computations. {" "} Build a responsive customer service bot that leverages client tools to enhance support. See how Claude and tool use can extract structured data from unstructured text. --- # Bash tool URL: https://platform.claude.com/docs/en/agents-and-tools/tool-use/bash-tool # Bash tool --- The bash tool enables Claude to execute shell commands in a persistent bash session, allowing system operations, script execution, and command-line automation. ## Overview The bash tool provides Claude with: - Persistent bash session that maintains state - Ability to run any shell command - Access to environment variables and working directory - Command chaining and scripting capabilities ## Model compatibility | Model | Tool Version | |-------|--------------| | Claude 4 models and Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) | `bash_20250124` | Older tool versions are not guaranteed to be backwards-compatible with newer models. Always use the tool version that corresponds to your model version. ## Use cases - **Development workflows**: Run build commands, tests, and development tools - **System automation**: Execute scripts, manage files, automate tasks - **Data processing**: Process files, run analysis scripts, manage datasets - **Environment setup**: Install packages, configure environments ## Quick start ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "type": "bash_20250124", "name": "bash" } ], messages=[ {"role": "user", "content": "List all Python files in the current directory."} ] ) ``` ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "tools": [ { "type": "bash_20250124", "name": "bash" } ], "messages": [ { "role": "user", "content": "List all Python files in the current directory." } ] }' ``` ## How it works The bash tool maintains a persistent session: 1. Claude determines what command to run 2. You execute the command in a bash shell 3. Return the output (stdout and stderr) to Claude 4. Session state persists between commands (environment variables, working directory) ## Parameters | Parameter | Required | Description | |-----------|----------|-------------| | `command` | Yes* | The bash command to run | | `restart` | No | Set to `true` to restart the bash session | *Required unless using `restart`
```json // Run a command { "command": "ls -la *.py" } // Restart the session { "restart": true } ```
## Example: Multi-step automation Claude can chain commands to complete complex tasks: ```python # User request "Install the requests library and create a simple Python script that fetches a joke from an API, then run it." # Claude's tool uses: # 1. Install package {"command": "pip install requests"} # 2. Create script {"command": "cat > fetch_joke.py << 'EOF'\nimport requests\nresponse = requests.get('https://official-joke-api.appspot.com/random_joke')\njoke = response.json()\nprint(f\"Setup: {joke['setup']}\")\nprint(f\"Punchline: {joke['punchline']}\")\nEOF"} # 3. Run script {"command": "python fetch_joke.py"} ``` The session maintains state between commands, so files created in step 2 are available in step 3. *** ## Implement the bash tool The bash tool is implemented as a schema-less tool. When using this tool, you don't need to provide an input schema as with other tools; the schema is built into Claude's model and can't be modified. Create a persistent bash session that Claude can interact with: ```python import subprocess import threading import queue class BashSession: def __init__(self): self.process = subprocess.Popen( ['/bin/bash'], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True, bufsize=0 ) self.output_queue = queue.Queue() self.error_queue = queue.Queue() self._start_readers() ``` Create a function to execute commands and capture output: ```python def execute_command(self, command): # Send command to bash self.process.stdin.write(command + '\n') self.process.stdin.flush() # Capture output with timeout output = self._read_output(timeout=10) return output ``` Extract and execute commands from Claude's responses: ```python for content in response.content: if content.type == "tool_use" and content.name == "bash": if content.input.get("restart"): bash_session.restart() result = "Bash session restarted" else: command = content.input.get("command") result = bash_session.execute_command(command) # Return result to Claude tool_result = { "type": "tool_result", "tool_use_id": content.id, "content": result } ``` Add validation and restrictions: ```python def validate_command(command): # Block dangerous commands dangerous_patterns = ['rm -rf /', 'format', ':(){:|:&};:'] for pattern in dangerous_patterns: if pattern in command: return False, f"Command contains dangerous pattern: {pattern}" # Add more validation as needed return True, None ``` ### Handle errors When implementing the bash tool, handle various error scenarios:
If a command takes too long to execute: ```json { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "Error: Command timed out after 30 seconds", "is_error": true } ] } ```
If a command doesn't exist: ```json { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "bash: nonexistentcommand: command not found", "is_error": true } ] } ```
If there are permission issues: ```json { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "bash: /root/sensitive-file: Permission denied", "is_error": true } ] } ```
### Follow implementation best practices
Implement timeouts to prevent hanging commands: ```python def execute_with_timeout(command, timeout=30): try: result = subprocess.run( command, shell=True, capture_output=True, text=True, timeout=timeout ) return result.stdout + result.stderr except subprocess.TimeoutExpired: return f"Command timed out after {timeout} seconds" ```
Keep the bash session persistent to maintain environment variables and working directory: ```python # Commands run in the same session maintain state commands = [ "cd /tmp", "echo 'Hello' > test.txt", "cat test.txt" # This works because we're still in /tmp ] ```
Truncate very large outputs to prevent token limit issues: ```python def truncate_output(output, max_lines=100): lines = output.split('\n') if len(lines) > max_lines: truncated = '\n'.join(lines[:max_lines]) return f"{truncated}\n\n... Output truncated ({len(lines)} total lines) ..." return output ```
Keep an audit trail of executed commands: ```python import logging def log_command(command, output, user_id): logging.info(f"User {user_id} executed: {command}") logging.info(f"Output: {output[:200]}...") # Log first 200 chars ```
Remove sensitive information from command outputs: ```python def sanitize_output(output): # Remove potential secrets or credentials import re # Example: Remove AWS credentials output = re.sub(r'aws_access_key_id\s*=\s*\S+', 'aws_access_key_id=***', output) output = re.sub(r'aws_secret_access_key\s*=\s*\S+', 'aws_secret_access_key=***', output) return output ```
## Security The bash tool provides direct system access. Implement these essential safety measures: - Running in isolated environments (Docker/VM) - Implementing command filtering and allowlists - Setting resource limits (CPU, memory, disk) - Logging all executed commands ### Key recommendations - Use `ulimit` to set resource constraints - Filter dangerous commands (`sudo`, `rm -rf`, etc.) - Run with minimal user permissions - Monitor and log all command execution ## Pricing The bash tool adds **245 input tokens** to your API calls. Additional tokens are consumed by: - Command outputs (stdout/stderr) - Error messages - Large file contents See [tool use pricing](/docs/en/agents-and-tools/tool-use/overview#pricing) for complete pricing details. ## Common patterns ### Development workflows - Running tests: `pytest && coverage report` - Building projects: `npm install && npm run build` - Git operations: `git status && git add . && git commit -m "message"` ### File operations - Processing data: `wc -l *.csv && ls -lh *.csv` - Searching files: `find . -name "*.py" | xargs grep "pattern"` - Creating backups: `tar -czf backup.tar.gz ./data` ### System tasks - Checking resources: `df -h && free -m` - Process management: `ps aux | grep python` - Environment setup: `export PATH=$PATH:/new/path && echo $PATH` ## Limitations - **No interactive commands**: Cannot handle `vim`, `less`, or password prompts - **No GUI applications**: Command-line only - **Session scope**: Persists within conversation, lost between API calls - **Output limits**: Large outputs may be truncated - **No streaming**: Results returned after completion ## Combining with other tools The bash tool is most powerful when combined with the [text editor](/docs/en/agents-and-tools/tool-use/text-editor-tool) and other tools. ## Next steps Learn about tool use with Claude View and edit text files with Claude --- # Code execution tool URL: https://platform.claude.com/docs/en/agents-and-tools/tool-use/code-execution-tool # Code execution tool --- Claude can analyze data, create visualizations, perform complex calculations, run system commands, create and edit files, and process uploaded files directly within the API conversation. The code execution tool allows Claude to run Bash commands and manipulate files, including writing code, in a secure, sandboxed environment. The code execution tool is currently in public beta. To use this feature, add the `"code-execution-2025-08-25"` [beta header](/docs/en/api/beta-headers) to your API requests. Please reach out through our [feedback form](https://forms.gle/LTAU6Xn2puCJMi1n6) to share your feedback on this feature. ## Model compatibility The code execution tool is available on the following models: | Model | Tool Version | |-------|--------------| | Claude Opus 4.5 (`claude-opus-4-5-20251101`) | `code_execution_20250825` | | Claude Opus 4.1 (`claude-opus-4-1-20250805`) | `code_execution_20250825` | | Claude Opus 4 (`claude-opus-4-20250514`) | `code_execution_20250825` | | Claude Sonnet 4.5 (`claude-sonnet-4-5-20250929`) | `code_execution_20250825` | | Claude Sonnet 4 (`claude-sonnet-4-20250514`) | `code_execution_20250825` | | Claude Sonnet 3.7 (`claude-3-7-sonnet-20250219`) ([deprecated](/docs/en/about-claude/model-deprecations)) | `code_execution_20250825` | | Claude Haiku 4.5 (`claude-haiku-4-5-20251001`) | `code_execution_20250825` | | Claude Haiku 3.5 (`claude-3-5-haiku-latest`) ([deprecated](/docs/en/about-claude/model-deprecations)) | `code_execution_20250825` | The current version `code_execution_20250825` supports Bash commands and file operations. A legacy version `code_execution_20250522` (Python only) is also available. See [Upgrade to latest tool version](#upgrade-to-latest-tool-version) for migration details. Older tool versions are not guaranteed to be backwards-compatible with newer models. Always use the tool version that corresponds to your model version. ## Quick start Here's a simple example that asks Claude to perform a calculation: ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: code-execution-2025-08-25" \ --header "content-type: application/json" \ --data '{ "model": "claude-sonnet-4-5", "max_tokens": 4096, "messages": [ { "role": "user", "content": "Calculate the mean and standard deviation of [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]" } ], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["code-execution-2025-08-25"], max_tokens=4096, messages=[{ "role": "user", "content": "Calculate the mean and standard deviation of [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]" }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) print(response) ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); async function main() { const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", betas: ["code-execution-2025-08-25"], max_tokens: 4096, messages: [ { role: "user", content: "Calculate the mean and standard deviation of [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]" } ], tools: [{ type: "code_execution_20250825", name: "code_execution" }] }); console.log(response); } main().catch(console.error); ``` ## How code execution works When you add the code execution tool to your API request: 1. Claude evaluates whether code execution would help answer your question 2. The tool automatically provides Claude with the following capabilities: - **Bash commands**: Execute shell commands for system operations and package management - **File operations**: Create, view, and edit files directly, including writing code 3. Claude can use any combination of these capabilities in a single request 4. All operations run in a secure sandbox environment 5. Claude provides results with any generated charts, calculations, or analysis ## How to use the tool ### Execute Bash commands Ask Claude to check system information and install packages: ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: code-execution-2025-08-25" \ --header "content-type: application/json" \ --data '{ "model": "claude-sonnet-4-5", "max_tokens": 4096, "messages": [{ "role": "user", "content": "Check the Python version and list installed packages" }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }' ``` ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["code-execution-2025-08-25"], max_tokens=4096, messages=[{ "role": "user", "content": "Check the Python version and list installed packages" }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) ``` ```typescript TypeScript const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", betas: ["code-execution-2025-08-25"], max_tokens: 4096, messages: [{ role: "user", content: "Check the Python version and list installed packages" }], tools: [{ type: "code_execution_20250825", name: "code_execution" }] }); ``` ### Create and edit files directly Claude can create, view, and edit files directly in the sandbox using the file manipulation capabilities: ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: code-execution-2025-08-25" \ --header "content-type: application/json" \ --data '{ "model": "claude-sonnet-4-5", "max_tokens": 4096, "messages": [{ "role": "user", "content": "Create a config.yaml file with database settings, then update the port from 5432 to 3306" }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }' ``` ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["code-execution-2025-08-25"], max_tokens=4096, messages=[{ "role": "user", "content": "Create a config.yaml file with database settings, then update the port from 5432 to 3306" }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) ``` ```typescript TypeScript const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", betas: ["code-execution-2025-08-25"], max_tokens: 4096, messages: [{ role: "user", content: "Create a config.yaml file with database settings, then update the port from 5432 to 3306" }], tools: [{ type: "code_execution_20250825", name: "code_execution" }] }); ``` ### Upload and analyze your own files To analyze your own data files (CSV, Excel, images, etc.), upload them via the Files API and reference them in your request: Using the Files API with Code Execution requires two beta headers: `"anthropic-beta": "code-execution-2025-08-25,files-api-2025-04-14"` The Python environment can process various file types uploaded via the Files API, including: - CSV - Excel (.xlsx, .xls) - JSON - XML - Images (JPEG, PNG, GIF, WebP) - Text files (.txt, .md, .py, etc) #### Upload and analyze files 1. **Upload your file** using the [Files API](/docs/en/build-with-claude/files) 2. **Reference the file** in your message using a `container_upload` content block 3. **Include the code execution tool** in your API request ```bash Shell # First, upload a file curl https://api.anthropic.com/v1/files \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: files-api-2025-04-14" \ --form 'file=@"data.csv"' \ # Then use the file_id with code execution curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: code-execution-2025-08-25,files-api-2025-04-14" \ --header "content-type: application/json" \ --data '{ "model": "claude-sonnet-4-5", "max_tokens": 4096, "messages": [{ "role": "user", "content": [ {"type": "text", "text": "Analyze this CSV data"}, {"type": "container_upload", "file_id": "file_abc123"} ] }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }' ``` ```python Python import anthropic client = anthropic.Anthropic() # Upload a file file_object = client.beta.files.upload( file=open("data.csv", "rb"), ) # Use the file_id with code execution response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["code-execution-2025-08-25", "files-api-2025-04-14"], max_tokens=4096, messages=[{ "role": "user", "content": [ {"type": "text", "text": "Analyze this CSV data"}, {"type": "container_upload", "file_id": file_object.id} ] }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; import { createReadStream } from 'fs'; const anthropic = new Anthropic(); async function main() { // Upload a file const fileObject = await anthropic.beta.files.create({ file: createReadStream("data.csv"), }); // Use the file_id with code execution const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", betas: ["code-execution-2025-08-25", "files-api-2025-04-14"], max_tokens: 4096, messages: [{ role: "user", content: [ { type: "text", text: "Analyze this CSV data" }, { type: "container_upload", file_id: fileObject.id } ] }], tools: [{ type: "code_execution_20250825", name: "code_execution" }] }); console.log(response); } main().catch(console.error); ``` #### Retrieve generated files When Claude creates files during code execution, you can retrieve these files using the Files API: ```python Python from anthropic import Anthropic # Initialize the client client = Anthropic() # Request code execution that creates files response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["code-execution-2025-08-25", "files-api-2025-04-14"], max_tokens=4096, messages=[{ "role": "user", "content": "Create a matplotlib visualization and save it as output.png" }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) # Extract file IDs from the response def extract_file_ids(response): file_ids = [] for item in response.content: if item.type == 'bash_code_execution_tool_result': content_item = item.content if content_item.type == 'bash_code_execution_result': for file in content_item.content: if hasattr(file, 'file_id'): file_ids.append(file.file_id) return file_ids # Download the created files for file_id in extract_file_ids(response): file_metadata = client.beta.files.retrieve_metadata(file_id) file_content = client.beta.files.download(file_id) file_content.write_to_file(file_metadata.filename) print(f"Downloaded: {file_metadata.filename}") ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; import { writeFileSync } from 'fs'; // Initialize the client const anthropic = new Anthropic(); async function main() { // Request code execution that creates files const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", betas: ["code-execution-2025-08-25", "files-api-2025-04-14"], max_tokens: 4096, messages: [{ role: "user", content: "Create a matplotlib visualization and save it as output.png" }], tools: [{ type: "code_execution_20250825", name: "code_execution" }] }); // Extract file IDs from the response function extractFileIds(response: any): string[] { const fileIds: string[] = []; for (const item of response.content) { if (item.type === 'bash_code_execution_tool_result') { const contentItem = item.content; if (contentItem.type === 'bash_code_execution_result' && contentItem.content) { for (const file of contentItem.content) { fileIds.push(file.file_id); } } } } return fileIds; } // Download the created files const fileIds = extractFileIds(response); for (const fileId of fileIds) { const fileMetadata = await anthropic.beta.files.retrieveMetadata(fileId); const fileContent = await anthropic.beta.files.download(fileId); // Convert ReadableStream to Buffer and save const chunks: Uint8Array[] = []; for await (const chunk of fileContent) { chunks.push(chunk); } const buffer = Buffer.concat(chunks); writeFileSync(fileMetadata.filename, buffer); console.log(`Downloaded: ${fileMetadata.filename}`); } } main().catch(console.error); ``` ### Combine operations A complex workflow using all capabilities: ```bash Shell # First, upload a file curl https://api.anthropic.com/v1/files \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: files-api-2025-04-14" \ --form 'file=@"data.csv"' \ > file_response.json # Extract file_id (using jq) FILE_ID=$(jq -r '.id' file_response.json) # Then use it with code execution curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: code-execution-2025-08-25,files-api-2025-04-14" \ --header "content-type: application/json" \ --data '{ "model": "claude-sonnet-4-5", "max_tokens": 4096, "messages": [{ "role": "user", "content": [ { "type": "text", "text": "Analyze this CSV data: create a summary report, save visualizations, and create a README with the findings" }, { "type": "container_upload", "file_id": "'$FILE_ID'" } ] }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }' ``` ```python Python # Upload a file file_object = client.beta.files.upload( file=open("data.csv", "rb"), ) # Use it with code execution response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["code-execution-2025-08-25", "files-api-2025-04-14"], max_tokens=4096, messages=[{ "role": "user", "content": [ {"type": "text", "text": "Analyze this CSV data: create a summary report, save visualizations, and create a README with the findings"}, {"type": "container_upload", "file_id": file_object.id} ] }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) # Claude might: # 1. Use bash to check file size and preview data # 2. Use text_editor to write Python code to analyze the CSV and create visualizations # 3. Use bash to run the Python code # 4. Use text_editor to create a README.md with findings # 5. Use bash to organize files into a report directory ``` ```typescript TypeScript // Upload a file const fileObject = await anthropic.beta.files.create({ file: createReadStream("data.csv"), }); // Use it with code execution const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", betas: ["code-execution-2025-08-25", "files-api-2025-04-14"], max_tokens: 4096, messages: [{ role: "user", content: [ {type: "text", text: "Analyze this CSV data: create a summary report, save visualizations, and create a README with the findings"}, {type: "container_upload", file_id: fileObject.id} ] }], tools: [{ type: "code_execution_20250825", name: "code_execution" }] }); // Claude might: // 1. Use bash to check file size and preview data // 2. Use text_editor to write Python code to analyze the CSV and create visualizations // 3. Use bash to run the Python code // 4. Use text_editor to create a README.md with findings // 5. Use bash to organize files into a report directory ``` ## Tool definition The code execution tool requires no additional parameters: ```json JSON { "type": "code_execution_20250825", "name": "code_execution" } ``` When this tool is provided, Claude automatically gains access to two sub-tools: - `bash_code_execution`: Run shell commands - `text_editor_code_execution`: View, create, and edit files, including writing code ## Response format The code execution tool can return two types of results depending on the operation: ### Bash command response ```json { "type": "server_tool_use", "id": "srvtoolu_01B3C4D5E6F7G8H9I0J1K2L3", "name": "bash_code_execution", "input": { "command": "ls -la | head -5" } }, { "type": "bash_code_execution_tool_result", "tool_use_id": "srvtoolu_01B3C4D5E6F7G8H9I0J1K2L3", "content": { "type": "bash_code_execution_result", "stdout": "total 24\ndrwxr-xr-x 2 user user 4096 Jan 1 12:00 .\ndrwxr-xr-x 3 user user 4096 Jan 1 11:00 ..\n-rw-r--r-- 1 user user 220 Jan 1 12:00 data.csv\n-rw-r--r-- 1 user user 180 Jan 1 12:00 config.json", "stderr": "", "return_code": 0 } } ``` ### File operation responses **View file:** ```json { "type": "server_tool_use", "id": "srvtoolu_01C4D5E6F7G8H9I0J1K2L3M4", "name": "text_editor_code_execution", "input": { "command": "view", "path": "config.json" } }, { "type": "text_editor_code_execution_tool_result", "tool_use_id": "srvtoolu_01C4D5E6F7G8H9I0J1K2L3M4", "content": { "type": "text_editor_code_execution_result", "file_type": "text", "content": "{\n \"setting\": \"value\",\n \"debug\": true\n}", "numLines": 4, "startLine": 1, "totalLines": 4 } } ``` **Create file:** ```json { "type": "server_tool_use", "id": "srvtoolu_01D5E6F7G8H9I0J1K2L3M4N5", "name": "text_editor_code_execution", "input": { "command": "create", "path": "new_file.txt", "file_text": "Hello, World!" } }, { "type": "text_editor_code_execution_tool_result", "tool_use_id": "srvtoolu_01D5E6F7G8H9I0J1K2L3M4N5", "content": { "type": "text_editor_code_execution_result", "is_file_update": false } } ``` **Edit file (str_replace):** ```json { "type": "server_tool_use", "id": "srvtoolu_01E6F7G8H9I0J1K2L3M4N5O6", "name": "text_editor_code_execution", "input": { "command": "str_replace", "path": "config.json", "old_str": "\"debug\": true", "new_str": "\"debug\": false" } }, { "type": "text_editor_code_execution_tool_result", "tool_use_id": "srvtoolu_01E6F7G8H9I0J1K2L3M4N5O6", "content": { "type": "text_editor_code_execution_result", "oldStart": 3, "oldLines": 1, "newStart": 3, "newLines": 1, "lines": ["- \"debug\": true", "+ \"debug\": false"] } } ``` ### Results All execution results include: - `stdout`: Output from successful execution - `stderr`: Error messages if execution fails - `return_code`: 0 for success, non-zero for failure Additional fields for file operations: - **View**: `file_type`, `content`, `numLines`, `startLine`, `totalLines` - **Create**: `is_file_update` (whether file already existed) - **Edit**: `oldStart`, `oldLines`, `newStart`, `newLines`, `lines` (diff format) ### Errors Each tool type can return specific errors: **Common errors (all tools):** ```json { "type": "bash_code_execution_tool_result", "tool_use_id": "srvtoolu_01VfmxgZ46TiHbmXgy928hQR", "content": { "type": "bash_code_execution_tool_result_error", "error_code": "unavailable" } } ``` **Error codes by tool type:** | Tool | Error Code | Description | |------|-----------|-------------| | All tools | `unavailable` | The tool is temporarily unavailable | | All tools | `execution_time_exceeded` | Execution exceeded maximum time limit | | All tools | `container_expired` | Container expired and is no longer available | | All tools | `invalid_tool_input` | Invalid parameters provided to the tool | | All tools | `too_many_requests` | Rate limit exceeded for tool usage | | text_editor | `file_not_found` | File doesn't exist (for view/edit operations) | | text_editor | `string_not_found` | The `old_str` not found in file (for str_replace) | #### `pause_turn` stop reason The response may include a `pause_turn` stop reason, which indicates that the API paused a long-running turn. You may provide the response back as-is in a subsequent request to let Claude continue its turn, or modify the content if you wish to interrupt the conversation. ## Containers The code execution tool runs in a secure, containerized environment designed specifically for code execution, with a higher focus on Python. ### Runtime environment - **Python version**: 3.11.12 - **Operating system**: Linux-based container - **Architecture**: x86_64 (AMD64) ### Resource limits - **Memory**: 5GiB RAM - **Disk space**: 5GiB workspace storage - **CPU**: 1 CPU ### Networking and security - **Internet access**: Completely disabled for security - **External connections**: No outbound network requests permitted - **Sandbox isolation**: Full isolation from host system and other containers - **File access**: Limited to workspace directory only - **Workspace scoping**: Like [Files](/docs/en/build-with-claude/files), containers are scoped to the workspace of the API key - **Expiration**: Containers expire 30 days after creation ### Pre-installed libraries The sandboxed Python environment includes these commonly used libraries: - **Data Science**: pandas, numpy, scipy, scikit-learn, statsmodels - **Visualization**: matplotlib, seaborn - **File Processing**: pyarrow, openpyxl, xlsxwriter, xlrd, pillow, python-pptx, python-docx, pypdf, pdfplumber, pypdfium2, pdf2image, pdfkit, tabula-py, reportlab[pycairo], Img2pdf - **Math & Computing**: sympy, mpmath - **Utilities**: tqdm, python-dateutil, pytz, joblib, unzip, unrar, 7zip, bc, rg (ripgrep), fd, sqlite ## Container reuse You can reuse an existing container across multiple API requests by providing the container ID from a previous response. This allows you to maintain created files between requests. ### Example ```python Python import os from anthropic import Anthropic # Initialize the client client = Anthropic( api_key=os.getenv("ANTHROPIC_API_KEY") ) # First request: Create a file with a random number response1 = client.beta.messages.create( model="claude-sonnet-4-5", betas=["code-execution-2025-08-25"], max_tokens=4096, messages=[{ "role": "user", "content": "Write a file with a random number and save it to '/tmp/number.txt'" }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) # Extract the container ID from the first response container_id = response1.container.id # Second request: Reuse the container to read the file response2 = client.beta.messages.create( container=container_id, # Reuse the same container model="claude-sonnet-4-5", betas=["code-execution-2025-08-25"], max_tokens=4096, messages=[{ "role": "user", "content": "Read the number from '/tmp/number.txt' and calculate its square" }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); async function main() { // First request: Create a file with a random number const response1 = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", betas: ["code-execution-2025-08-25"], max_tokens: 4096, messages: [{ role: "user", content: "Write a file with a random number and save it to '/tmp/number.txt'" }], tools: [{ type: "code_execution_20250825", name: "code_execution" }] }); // Extract the container ID from the first response const containerId = response1.container.id; // Second request: Reuse the container to read the file const response2 = await anthropic.beta.messages.create({ container: containerId, // Reuse the same container model: "claude-sonnet-4-5", betas: ["code-execution-2025-08-25"], max_tokens: 4096, messages: [{ role: "user", content: "Read the number from '/tmp/number.txt' and calculate its square" }], tools: [{ type: "code_execution_20250825", name: "code_execution" }] }); console.log(response2.content); } main().catch(console.error); ``` ```bash Shell # First request: Create a file with a random number curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: code-execution-2025-08-25" \ --header "content-type: application/json" \ --data '{ "model": "claude-sonnet-4-5", "max_tokens": 4096, "messages": [{ "role": "user", "content": "Write a file with a random number and save it to \"/tmp/number.txt\"" }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }' > response1.json # Extract container ID from the response (using jq) CONTAINER_ID=$(jq -r '.container.id' response1.json) # Second request: Reuse the container to read the file curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: code-execution-2025-08-25" \ --header "content-type: application/json" \ --data '{ "container": "'$CONTAINER_ID'", "model": "claude-sonnet-4-5", "max_tokens": 4096, "messages": [{ "role": "user", "content": "Read the number from \"/tmp/number.txt\" and calculate its square" }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }' ``` ## Streaming With streaming enabled, you'll receive code execution events as they occur: ```javascript event: content_block_start data: {"type": "content_block_start", "index": 1, "content_block": {"type": "server_tool_use", "id": "srvtoolu_xyz789", "name": "code_execution"}} // Code execution streamed event: content_block_delta data: {"type": "content_block_delta", "index": 1, "delta": {"type": "input_json_delta", "partial_json": "{\"code\":\"import pandas as pd\\ndf = pd.read_csv('data.csv')\\nprint(df.head())\"}"}} // Pause while code executes // Execution results streamed event: content_block_start data: {"type": "content_block_start", "index": 2, "content_block": {"type": "code_execution_tool_result", "tool_use_id": "srvtoolu_xyz789", "content": {"stdout": " A B C\n0 1 2 3\n1 4 5 6", "stderr": ""}}} ``` ## Batch requests You can include the code execution tool in the [Messages Batches API](/docs/en/build-with-claude/batch-processing). Code execution tool calls through the Messages Batches API are priced the same as those in regular Messages API requests. ## Usage and pricing Code execution tool usage is tracked separately from token usage. Execution time has a minimum of 5 minutes. If files are included in the request, execution time is billed even if the tool is not used due to files being preloaded onto the container. Each organization receives 1,550 free hours of usage with the code execution tool per month. Additional usage beyond the first 1,550 hours is billed at $0.05 per hour, per container. ## Upgrade to latest tool version By upgrading to `code-execution-2025-08-25`, you get access to file manipulation and Bash capabilities, including code in multiple languages. There is no price difference. ### What's changed | Component | Legacy | Current | |-----------|------------------|----------------------------| | Beta header | `code-execution-2025-05-22` | `code-execution-2025-08-25` | | Tool type | `code_execution_20250522` | `code_execution_20250825` | | Capabilities | Python only | Bash commands, file operations | | Response types | `code_execution_result` | `bash_code_execution_result`, `text_editor_code_execution_result` | ### Backward compatibility - All existing Python code execution continues to work exactly as before - No changes required to existing Python-only workflows ### Upgrade steps To upgrade, you need to make the following changes in your API requests: 1. **Update the beta header**: ```diff - "anthropic-beta": "code-execution-2025-05-22" + "anthropic-beta": "code-execution-2025-08-25" ``` 2. **Update the tool type**: ```diff - "type": "code_execution_20250522" + "type": "code_execution_20250825" ``` 3. **Review response handling** (if parsing responses programmatically): - The previous blocks for Python execution responses will no longer be sent - Instead, new response types for Bash and file operations will be sent (see Response Format section) ## Programmatic tool calling The code execution tool powers [programmatic tool calling](/docs/en/agents-and-tools/tool-use/programmatic-tool-calling), which allows Claude to write code that calls your custom tools programmatically within the execution container. This enables efficient multi-tool workflows, data filtering before reaching Claude's context, and complex conditional logic. ```python Python # Enable programmatic calling for your tools response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["advanced-tool-use-2025-11-20"], max_tokens=4096, messages=[{ "role": "user", "content": "Get weather for 5 cities and find the warmest" }], tools=[ { "type": "code_execution_20250825", "name": "code_execution" }, { "name": "get_weather", "description": "Get weather for a city", "input_schema": {...}, "allowed_callers": ["code_execution_20250825"] # Enable programmatic calling } ] ) ``` Learn more in the [Programmatic tool calling documentation](/docs/en/agents-and-tools/tool-use/programmatic-tool-calling). ## Using code execution with Agent Skills The code execution tool enables Claude to use [Agent Skills](/docs/en/agents-and-tools/agent-skills/overview). Skills are modular capabilities consisting of instructions, scripts, and resources that extend Claude's functionality. Learn more in the [Agent Skills documentation](/docs/en/agents-and-tools/agent-skills/overview) and [Agent Skills API guide](/docs/en/build-with-claude/skills-guide). --- # Computer use tool URL: https://platform.claude.com/docs/en/agents-and-tools/tool-use/computer-use-tool # Computer use tool --- Claude can interact with computer environments through the computer use tool, which provides screenshot capabilities and mouse/keyboard control for autonomous desktop interaction. Computer use is currently in beta and requires a [beta header](/docs/en/api/beta-headers): - `"computer-use-2025-11-24"` for Claude Opus 4.5 - `"computer-use-2025-01-24"` for Claude Sonnet 4.5, Haiku 4.5, Opus 4.1, Sonnet 4, Opus 4, and Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) Please reach out through our [feedback form](https://forms.gle/H6UFuXaaLywri9hz6) to share your feedback on this feature. ## Overview Computer use is a beta feature that enables Claude to interact with desktop environments. This tool provides: - **Screenshot capture**: See what's currently displayed on screen - **Mouse control**: Click, drag, and move the cursor - **Keyboard input**: Type text and use keyboard shortcuts - **Desktop automation**: Interact with any application or interface While computer use can be augmented with other tools like bash and text editor for more comprehensive automation workflows, computer use specifically refers to the computer use tool's capability to see and control desktop environments. ## Model compatibility Computer use is available for the following Claude models: | Model | Tool Version | Beta Flag | |-------|--------------|-----------| | Claude Opus 4.5 | `computer_20251124` | `computer-use-2025-11-24` | | All other supported models | `computer_20250124` | `computer-use-2025-01-24` | Claude Opus 4.5 introduces the `computer_20251124` tool version with new capabilities including the zoom action for detailed screen region inspection. All other models (Sonnet 4.5, Haiku 4.5, Sonnet 4, Opus 4, Opus 4.1, and Sonnet 3.7) use the `computer_20250124` tool version. Older tool versions are not guaranteed to be backwards-compatible with newer models. Always use the tool version that corresponds to your model version. ## Security considerations Computer use is a beta feature with unique risks distinct from standard API features. These risks are heightened when interacting with the internet. To minimize risks, consider taking precautions such as: 1. Using a dedicated virtual machine or container with minimal privileges to prevent direct system attacks or accidents. 2. Avoiding giving the model access to sensitive data, such as account login information, to prevent information theft. 3. Limiting internet access to an allowlist of domains to reduce exposure to malicious content. 4. Asking a human to confirm decisions that may result in meaningful real-world consequences as well as any tasks requiring affirmative consent, such as accepting cookies, executing financial transactions, or agreeing to terms of service. In some circumstances, Claude will follow commands found in content even if it conflicts with the user's instructions. For example, Claude instructions on webpages or contained in images may override instructions or cause Claude to make mistakes. We suggest taking precautions to isolate Claude from sensitive data and actions to avoid risks related to prompt injection. We've trained the model to resist these prompt injections and have added an extra layer of defense. If you use our computer use tools, we'll automatically run classifiers on your prompts to flag potential instances of prompt injections. When these classifiers identify potential prompt injections in screenshots, they will automatically steer the model to ask for user confirmation before proceeding with the next action. We recognize that this extra protection won't be ideal for every use case (for example, use cases without a human in the loop), so if you'd like to opt out and turn it off, please [contact us](https://support.claude.com/en/). We still suggest taking precautions to isolate Claude from sensitive data and actions to avoid risks related to prompt injection. Finally, please inform end users of relevant risks and obtain their consent prior to enabling computer use in your own products. Get started quickly with our computer use reference implementation that includes a web interface, Docker container, example tool implementations, and an agent loop. **Note:** The implementation has been updated to include new tools for both Claude 4 models and Claude Sonnet 3.7. Be sure to pull the latest version of the repo to access these new features. Please use [this form](https://forms.gle/BT1hpBrqDPDUrCqo7) to provide feedback on the quality of the model responses, the API itself, or the quality of the documentation - we cannot wait to hear from you! ## Quick start Here's how to get started with computer use: ```python Python import anthropic client = anthropic.Anthropic() response = client.beta.messages.create( model="claude-sonnet-4-5", # or another compatible model max_tokens=1024, tools=[ { "type": "computer_20250124", "name": "computer", "display_width_px": 1024, "display_height_px": 768, "display_number": 1, }, { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool" }, { "type": "bash_20250124", "name": "bash" } ], messages=[{"role": "user", "content": "Save a picture of a cat to my desktop."}], betas=["computer-use-2025-01-24"] ) print(response) ``` ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: computer-use-2025-01-24" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "tools": [ { "type": "computer_20250124", "name": "computer", "display_width_px": 1024, "display_height_px": 768, "display_number": 1 }, { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool" }, { "type": "bash_20250124", "name": "bash" } ], "messages": [ { "role": "user", "content": "Save a picture of a cat to my desktop." } ] }' ``` A beta header is only required for the computer use tool. The example above shows all three tools being used together, which requires the beta header because it includes the computer use tool. --- ## How computer use works - Add the computer use tool (and optionally other tools) to your API request. - Include a user prompt that requires desktop interaction, e.g., "Save a picture of a cat to my desktop." - Claude assesses if the computer use tool can help with the user's query. - If yes, Claude constructs a properly formatted tool use request. - The API response has a `stop_reason` of `tool_use`, signaling Claude's intent. - On your end, extract the tool name and input from Claude's request. - Use the tool on a container or Virtual Machine. - Continue the conversation with a new `user` message containing a `tool_result` content block. - Claude analyzes the tool results to determine if more tool use is needed or the task has been completed. - If Claude decides it needs another tool, it responds with another `tool_use` `stop_reason` and you should return to step 3. - Otherwise, it crafts a text response to the user. We refer to the repetition of steps 3 and 4 without user input as the "agent loop" - i.e., Claude responding with a tool use request and your application responding to Claude with the results of evaluating that request. ### The computing environment Computer use requires a sandboxed computing environment where Claude can safely interact with applications and the web. This environment includes: 1. **Virtual display**: A virtual X11 display server (using Xvfb) that renders the desktop interface Claude will see through screenshots and control with mouse/keyboard actions. 2. **Desktop environment**: A lightweight UI with window manager (Mutter) and panel (Tint2) running on Linux, which provides a consistent graphical interface for Claude to interact with. 3. **Applications**: Pre-installed Linux applications like Firefox, LibreOffice, text editors, and file managers that Claude can use to complete tasks. 4. **Tool implementations**: Integration code that translates Claude's abstract tool requests (like "move mouse" or "take screenshot") into actual operations in the virtual environment. 5. **Agent loop**: A program that handles communication between Claude and the environment, sending Claude's actions to the environment and returning the results (screenshots, command outputs) back to Claude. When you use computer use, Claude doesn't directly connect to this environment. Instead, your application: 1. Receives Claude's tool use requests 2. Translates them into actions in your computing environment 3. Captures the results (screenshots, command outputs, etc.) 4. Returns these results to Claude For security and isolation, the reference implementation runs all of this inside a Docker container with appropriate port mappings for viewing and interacting with the environment. --- ## How to implement computer use ### Start with our reference implementation We have built a [reference implementation](https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo) that includes everything you need to get started quickly with computer use: - A [containerized environment](https://github.com/anthropics/anthropic-quickstarts/blob/main/computer-use-demo/Dockerfile) suitable for computer use with Claude - Implementations of [the computer use tools](https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo/computer_use_demo/tools) - An [agent loop](https://github.com/anthropics/anthropic-quickstarts/blob/main/computer-use-demo/computer_use_demo/loop.py) that interacts with the Claude API and executes the computer use tools - A web interface to interact with the container, agent loop, and tools. ### Understanding the multi-agent loop The core of computer use is the "agent loop" - a cycle where Claude requests tool actions, your application executes them, and returns results to Claude. Here's a simplified example: ```python async def sampling_loop( *, model: str, messages: list[dict], api_key: str, max_tokens: int = 4096, tool_version: str, thinking_budget: int | None = None, max_iterations: int = 10, # Add iteration limit to prevent infinite loops ): """ A simple agent loop for Claude computer use interactions. This function handles the back-and-forth between: 1. Sending user messages to Claude 2. Claude requesting to use tools 3. Your app executing those tools 4. Sending tool results back to Claude """ # Set up tools and API parameters client = Anthropic(api_key=api_key) beta_flag = "computer-use-2025-01-24" if "20250124" in tool_version else "computer-use-2024-10-22" # Configure tools - you should already have these initialized elsewhere tools = [ {"type": f"computer_{tool_version}", "name": "computer", "display_width_px": 1024, "display_height_px": 768}, {"type": f"text_editor_{tool_version}", "name": "str_replace_editor"}, {"type": f"bash_{tool_version}", "name": "bash"} ] # Main agent loop (with iteration limit to prevent runaway API costs) iterations = 0 while True and iterations < max_iterations: iterations += 1 # Set up optional thinking parameter (for Claude Sonnet 3.7) thinking = None if thinking_budget: thinking = {"type": "enabled", "budget_tokens": thinking_budget} # Call the Claude API response = client.beta.messages.create( model=model, max_tokens=max_tokens, messages=messages, tools=tools, betas=[beta_flag], thinking=thinking ) # Add Claude's response to the conversation history response_content = response.content messages.append({"role": "assistant", "content": response_content}) # Check if Claude used any tools tool_results = [] for block in response_content: if block.type == "tool_use": # In a real app, you would execute the tool here # For example: result = run_tool(block.name, block.input) result = {"result": "Tool executed successfully"} # Format the result for Claude tool_results.append({ "type": "tool_result", "tool_use_id": block.id, "content": result }) # If no tools were used, Claude is done - return the final messages if not tool_results: return messages # Add tool results to messages for the next iteration with Claude messages.append({"role": "user", "content": tool_results}) ``` The loop continues until either Claude responds without requesting any tools (task completion) or the maximum iteration limit is reached. This safeguard prevents potential infinite loops that could result in unexpected API costs. We recommend trying the reference implementation out before reading the rest of this documentation. ### Optimize model performance with prompting Here are some tips on how to get the best quality outputs: 1. Specify simple, well-defined tasks and provide explicit instructions for each step. 2. Claude sometimes assumes outcomes of its actions without explicitly checking their results. To prevent this you can prompt Claude with `After each step, take a screenshot and carefully evaluate if you have achieved the right outcome. Explicitly show your thinking: "I have evaluated step X..." If not correct, try again. Only when you confirm a step was executed correctly should you move on to the next one.` 3. Some UI elements (like dropdowns and scrollbars) might be tricky for Claude to manipulate using mouse movements. If you experience this, try prompting the model to use keyboard shortcuts. 4. For repeatable tasks or UI interactions, include example screenshots and tool calls of successful outcomes in your prompt. 5. If you need the model to log in, provide it with the username and password in your prompt inside xml tags like ``. Using computer use within applications that require login increases the risk of bad outcomes as a result of prompt injection. Please review our [guide on mitigating prompt injections](/docs/en/test-and-evaluate/strengthen-guardrails/mitigate-jailbreaks) before providing the model with login credentials. If you repeatedly encounter a clear set of issues or know in advance the tasks Claude will need to complete, use the system prompt to provide Claude with explicit tips or instructions on how to do the tasks successfully. ### System prompts When one of the Anthropic-defined tools is requested via the Claude API, a computer use-specific system prompt is generated. It's similar to the [tool use system prompt](/docs/en/agents-and-tools/tool-use/implement-tool-use#tool-use-system-prompt) but starts with: > You have access to a set of functions you can use to answer the user's question. This includes access to a sandboxed computing environment. You do NOT currently have the ability to inspect files or interact with external resources, except by invoking the below functions. As with regular tool use, the user-provided `system_prompt` field is still respected and used in the construction of the combined system prompt. ### Available actions The computer use tool supports these actions: **Basic actions (all versions)** - **screenshot** - Capture the current display - **left_click** - Click at coordinates `[x, y]` - **type** - Type text string - **key** - Press key or key combination (e.g., "ctrl+s") - **mouse_move** - Move cursor to coordinates **Enhanced actions (`computer_20250124`)** Available in Claude 4 models and Claude Sonnet 3.7: - **scroll** - Scroll in any direction with amount control - **left_click_drag** - Click and drag between coordinates - **right_click**, **middle_click** - Additional mouse buttons - **double_click**, **triple_click** - Multiple clicks - **left_mouse_down**, **left_mouse_up** - Fine-grained click control - **hold_key** - Hold down a key for a specified duration (in seconds) - **wait** - Pause between actions **Enhanced actions (`computer_20251124`)** Available in Claude Opus 4.5: - All actions from `computer_20250124` - **zoom** - View a specific region of the screen at full resolution. Requires `enable_zoom: true` in tool definition. Takes a `region` parameter with coordinates `[x1, y1, x2, y2]` defining top-left and bottom-right corners of the area to inspect.
```json // Take a screenshot { "action": "screenshot" } // Click at position { "action": "left_click", "coordinate": [500, 300] } // Type text { "action": "type", "text": "Hello, world!" } // Scroll down (Claude 4/3.7) { "action": "scroll", "coordinate": [500, 400], "scroll_direction": "down", "scroll_amount": 3 } // Zoom to view region in detail (Opus 4.5) { "action": "zoom", "region": [100, 200, 400, 350] } ```
To hold modifier keys (like Shift, Ctrl, or Alt) while performing click or scroll actions, use the `text` parameter on those actions. This is different from `hold_key`, which simply holds a key for a duration without performing other actions. ```json // Shift+click (e.g., for selecting a range of items) { "action": "left_click", "coordinate": [500, 300], "text": "shift" } // Ctrl+click (e.g., for multi-select on Windows/Linux) { "action": "left_click", "coordinate": [500, 300], "text": "ctrl" } // Cmd+click (e.g., for multi-select on macOS) { "action": "left_click", "coordinate": [500, 300], "text": "super" } // Shift+scroll (e.g., for horizontal scrolling) { "action": "scroll", "coordinate": [500, 400], "scroll_direction": "down", "scroll_amount": 3, "text": "shift" } ``` The `text` parameter in click/scroll actions accepts modifier keys like `shift`, `ctrl`, `alt`, and `super` (for the Command/Windows key).
### Tool parameters | Parameter | Required | Description | |-----------|----------|-------------| | `type` | Yes | Tool version (`computer_20251124`, `computer_20250124`, or `computer_20241022`) | | `name` | Yes | Must be "computer" | | `display_width_px` | Yes | Display width in pixels | | `display_height_px` | Yes | Display height in pixels | | `display_number` | No | Display number for X11 environments | | `enable_zoom` | No | Enable zoom action (`computer_20251124` only). Set to `true` to allow Claude to zoom into specific screen regions. Default: `false` | **Important**: The computer use tool must be explicitly executed by your application - Claude cannot execute it directly. You are responsible for implementing the screenshot capture, mouse movements, keyboard inputs, and other actions based on Claude's requests. ### Enable thinking capability in Claude 4 models and Claude Sonnet 3.7 Claude Sonnet 3.7 introduced a new "thinking" capability that allows you to see the model's reasoning process as it works through complex tasks. This feature helps you understand how Claude is approaching a problem and can be particularly valuable for debugging or educational purposes. To enable thinking, add a `thinking` parameter to your API request: ```json "thinking": { "type": "enabled", "budget_tokens": 1024 } ``` The `budget_tokens` parameter specifies how many tokens Claude can use for thinking. This is subtracted from your overall `max_tokens` budget. When thinking is enabled, Claude will return its reasoning process as part of the response, which can help you: 1. Understand the model's decision-making process 2. Identify potential issues or misconceptions 3. Learn from Claude's approach to problem-solving 4. Get more visibility into complex multi-step operations Here's an example of what thinking output might look like: ``` [Thinking] I need to save a picture of a cat to the desktop. Let me break this down into steps: 1. First, I'll take a screenshot to see what's on the desktop 2. Then I'll look for a web browser to search for cat images 3. After finding a suitable image, I'll need to save it to the desktop Let me start by taking a screenshot to see what's available... ``` ### Augmenting computer use with other tools The computer use tool can be combined with other tools to create more powerful automation workflows. This is particularly useful when you need to: - Execute system commands ([bash tool](/docs/en/agents-and-tools/tool-use/bash-tool)) - Edit configuration files or scripts ([text editor tool](/docs/en/agents-and-tools/tool-use/text-editor-tool)) - Integrate with custom APIs or services (custom tools) ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: computer-use-2025-01-24" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 2000, "tools": [ { "type": "computer_20250124", "name": "computer", "display_width_px": 1024, "display_height_px": 768, "display_number": 1 }, { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool" }, { "type": "bash_20250124", "name": "bash" }, { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either 'celsius' or 'fahrenheit'" } }, "required": ["location"] } } ], "messages": [ { "role": "user", "content": "Find flights from San Francisco to a place with warmer weather." } ], "thinking": { "type": "enabled", "budget_tokens": 1024 } }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.beta.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "type": "computer_20250124", "name": "computer", "display_width_px": 1024, "display_height_px": 768, "display_number": 1, }, { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool" }, { "type": "bash_20250124", "name": "bash" }, { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either 'celsius' or 'fahrenheit'" } }, "required": ["location"] } }, ], messages=[{"role": "user", "content": "Find flights from San Francisco to a place with warmer weather."}], betas=["computer-use-2025-01-24"], thinking={"type": "enabled", "budget_tokens": 1024}, ) print(response) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const message = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, tools: [ { type: "computer_20250124", name: "computer", display_width_px: 1024, display_height_px: 768, display_number: 1, }, { type: "text_editor_20250728", name: "str_replace_based_edit_tool" }, { type: "bash_20250124", name: "bash" }, { name: "get_weather", description: "Get the current weather in a given location", input_schema: { type: "object", properties: { location: { type: "string", description: "The city and state, e.g. San Francisco, CA" }, unit: { type: "string", enum: ["celsius", "fahrenheit"], description: "The unit of temperature, either 'celsius' or 'fahrenheit'" } }, required: ["location"] } }, ], messages: [{ role: "user", content: "Find flights from San Francisco to a place with warmer weather." }], betas: ["computer-use-2025-01-24"], thinking: { type: "enabled", budget_tokens: 1024 }, }); console.log(message); ``` ```java Java import java.util.List; import java.util.Map; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.JsonValue; import com.anthropic.models.beta.messages.BetaMessage; import com.anthropic.models.beta.messages.MessageCreateParams; import com.anthropic.models.beta.messages.BetaToolBash20250124; import com.anthropic.models.beta.messages.BetaToolComputerUse20250124; import com.anthropic.models.beta.messages.BetaToolTextEditor20250124; import com.anthropic.models.beta.messages.BetaThinkingConfigEnabled; import com.anthropic.models.beta.messages.BetaThinkingConfigParam; import com.anthropic.models.beta.messages.BetaTool; public class MultipleToolsExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); MessageCreateParams params = MessageCreateParams.builder() .model("claude-sonnet-4-5") .maxTokens(1024) .addTool(BetaToolComputerUse20250124.builder() .displayWidthPx(1024) .displayHeightPx(768) .displayNumber(1) .build()) .addTool(BetaToolTextEditor20250124.builder() .build()) .addTool(BetaToolBash20250124.builder() .build()) .addTool(BetaTool.builder() .name("get_weather") .description("Get the current weather in a given location") .inputSchema(BetaTool.InputSchema.builder() .properties( JsonValue.from( Map.of( "location", Map.of( "type", "string", "description", "The city and state, e.g. San Francisco, CA" ), "unit", Map.of( "type", "string", "enum", List.of("celsius", "fahrenheit"), "description", "The unit of temperature, either 'celsius' or 'fahrenheit'" ) ) )) .build() ) .build()) .thinking(BetaThinkingConfigParam.ofEnabled( BetaThinkingConfigEnabled.builder() .budgetTokens(1024) .build() )) .addUserMessage("Find flights from San Francisco to a place with warmer weather.") .addBeta("computer-use-2025-01-24") .build(); BetaMessage message = client.beta().messages().create(params); System.out.println(message); } } ``` ### Build a custom computer use environment The [reference implementation](https://github.com/anthropics/anthropic-quickstarts/tree/main/computer-use-demo) is meant to help you get started with computer use. It includes all of the components needed have Claude use a computer. However, you can build your own environment for computer use to suit your needs. You'll need: - A virtualized or containerized environment suitable for computer use with Claude - An implementation of at least one of the Anthropic-defined computer use tools - An agent loop that interacts with the Claude API and executes the `tool_use` results using your tool implementations - An API or UI that allows user input to start the agent loop #### Implement the computer use tool The computer use tool is implemented as a schema-less tool. When using this tool, you don't need to provide an input schema as with other tools; the schema is built into Claude's model and can't be modified. Create a virtual display or connect to an existing display that Claude will interact with. This typically involves setting up Xvfb (X Virtual Framebuffer) or similar technology. Create functions to handle each action type that Claude might request: ```python def handle_computer_action(action_type, params): if action_type == "screenshot": return capture_screenshot() elif action_type == "left_click": x, y = params["coordinate"] return click_at(x, y) elif action_type == "type": return type_text(params["text"]) # ... handle other actions ``` Extract and execute tool calls from Claude's responses: ```python for content in response.content: if content.type == "tool_use": action = content.input["action"] result = handle_computer_action(action, content.input) # Return result to Claude tool_result = { "type": "tool_result", "tool_use_id": content.id, "content": result } ``` Create a loop that continues until Claude completes the task: ```python while True: response = client.beta.messages.create(...) # Check if Claude used any tools tool_results = process_tool_calls(response) if not tool_results: # No more tool use, task complete break # Continue conversation with tool results messages.append({"role": "user", "content": tool_results}) ``` #### Handle errors When implementing the computer use tool, various errors may occur. Here's how to handle them:
If screenshot capture fails, return an appropriate error message: ```json { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "Error: Failed to capture screenshot. Display may be locked or unavailable.", "is_error": true } ] } ```
If Claude provides coordinates outside the display bounds: ```json { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "Error: Coordinates (1200, 900) are outside display bounds (1024x768).", "is_error": true } ] } ```
If an action fails to execute: ```json { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "Error: Failed to perform click action. The application may be unresponsive.", "is_error": true } ] } ```
#### Handle coordinate scaling for higher resolutions The API constrains images to a maximum of 1568 pixels on the longest edge and approximately 1.15 megapixels total (see [image resizing](/docs/en/build-with-claude/vision#evaluate-image-size) for details). For example, a 1512x982 screen gets downsampled to approximately 1330x864. Claude analyzes this smaller image and returns coordinates in that space, but your tool executes clicks in the original screen space. This can cause Claude's click coordinates to miss their targets unless you handle the coordinate transformation. To fix this, resize screenshots yourself and scale Claude's coordinates back up: ```python Python import math def get_scale_factor(width, height): """Calculate scale factor to meet API constraints.""" long_edge = max(width, height) total_pixels = width * height long_edge_scale = 1568 / long_edge total_pixels_scale = math.sqrt(1_150_000 / total_pixels) return min(1.0, long_edge_scale, total_pixels_scale) # When capturing screenshot scale = get_scale_factor(screen_width, screen_height) scaled_width = int(screen_width * scale) scaled_height = int(screen_height * scale) # Resize image to scaled dimensions before sending to Claude screenshot = capture_and_resize(scaled_width, scaled_height) # When handling Claude's coordinates, scale them back up def execute_click(x, y): screen_x = x / scale screen_y = y / scale perform_click(screen_x, screen_y) ``` ```typescript TypeScript const MAX_LONG_EDGE = 1568; const MAX_PIXELS = 1_150_000; function getScaleFactor(width: number, height: number): number { const longEdge = Math.max(width, height); const totalPixels = width * height; const longEdgeScale = MAX_LONG_EDGE / longEdge; const totalPixelsScale = Math.sqrt(MAX_PIXELS / totalPixels); return Math.min(1.0, longEdgeScale, totalPixelsScale); } // When capturing screenshot const scale = getScaleFactor(screenWidth, screenHeight); const scaledWidth = Math.floor(screenWidth * scale); const scaledHeight = Math.floor(screenHeight * scale); // Resize image to scaled dimensions before sending to Claude const screenshot = captureAndResize(scaledWidth, scaledHeight); // When handling Claude's coordinates, scale them back up function executeClick(x: number, y: number): void { const screenX = x / scale; const screenY = y / scale; performClick(screenX, screenY); } ``` #### Follow implementation best practices
Set display dimensions that match your use case while staying within recommended limits: - For general desktop tasks: 1024x768 or 1280x720 - For web applications: 1280x800 or 1366x768 - Avoid resolutions above 1920x1080 to prevent performance issues
When returning screenshots to Claude: - Encode screenshots as base64 PNG or JPEG - Consider compressing large screenshots to improve performance - Include relevant metadata like timestamp or display state - If using higher resolutions, ensure coordinates are accurately scaled
Some applications need time to respond to actions: ```python def click_and_wait(x, y, wait_time=0.5): click_at(x, y) time.sleep(wait_time) # Allow UI to update ```
Check that requested actions are safe and valid: ```python def validate_action(action_type, params): if action_type == "left_click": x, y = params.get("coordinate", (0, 0)) if not (0 <= x < display_width and 0 <= y < display_height): return False, "Coordinates out of bounds" return True, None ```
Keep a log of all actions for troubleshooting: ```python import logging def log_action(action_type, params, result): logging.info(f"Action: {action_type}, Params: {params}, Result: {result}") ```
--- ## Understand computer use limitations The computer use functionality is in beta. While Claude's capabilities are cutting edge, developers should be aware of its limitations: 1. **Latency**: the current computer use latency for human-AI interactions may be too slow compared to regular human-directed computer actions. We recommend focusing on use cases where speed isn't critical (e.g., background information gathering, automated software testing) in trusted environments. 2. **Computer vision accuracy and reliability**: Claude may make mistakes or hallucinate when outputting specific coordinates while generating actions. Claude Sonnet 3.7 introduces the thinking capability that can help you understand the model's reasoning and identify potential issues. 3. **Tool selection accuracy and reliability**: Claude may make mistakes or hallucinate when selecting tools while generating actions or take unexpected actions to solve problems. Additionally, reliability may be lower when interacting with niche applications or multiple applications at once. We recommend that users prompt the model carefully when requesting complex tasks. 4. **Scrolling reliability**: Claude Sonnet 3.7 introduced dedicated scroll actions with direction control that improves reliability. The model can now explicitly scroll in any direction (up/down/left/right) by a specified amount. 5. **Spreadsheet interaction**: Mouse clicks for spreadsheet interaction have improved in Claude Sonnet 3.7 with the addition of more precise mouse control actions like `left_mouse_down`, `left_mouse_up`, and new modifier key support. Cell selection can be more reliable by using these fine-grained controls and combining modifier keys with clicks. 6. **Account creation and content generation on social and communications platforms**: While Claude will visit websites, we are limiting its ability to create accounts or generate and share content or otherwise engage in human impersonation across social media websites and platforms. We may update this capability in the future. 7. **Vulnerabilities**: Vulnerabilities like jailbreaking or prompt injection may persist across frontier AI systems, including the beta computer use API. In some circumstances, Claude will follow commands found in content, sometimes even in conflict with the user's instructions. For example, Claude instructions on webpages or contained in images may override instructions or cause Claude to make mistakes. We recommend: a. Limiting computer use to trusted environments such as virtual machines or containers with minimal privileges b. Avoiding giving computer use access to sensitive accounts or data without strict oversight c. Informing end users of relevant risks and obtaining their consent before enabling or requesting permissions necessary for computer use features in your applications 8. **Inappropriate or illegal actions**: Per Anthropic's terms of service, you must not employ computer use to violate any laws or our Acceptable Use Policy. Always carefully review and verify Claude's computer use actions and logs. Do not use Claude for tasks requiring perfect precision or sensitive user information without human oversight. --- ## Pricing Computer use follows the standard [tool use pricing](/docs/en/agents-and-tools/tool-use/overview#pricing). When using the computer use tool: **System prompt overhead**: The computer use beta adds 466-499 tokens to the system prompt **Computer use tool token usage**: | Model | Input tokens per tool definition | | ----- | -------------------------------- | | Claude 4.x models | 735 tokens | | Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) | 735 tokens | **Additional token consumption**: - Screenshot images (see [Vision pricing](/docs/en/build-with-claude/vision)) - Tool execution results returned to Claude If you're also using bash or text editor tools alongside computer use, those tools have their own token costs as documented in their respective pages. ## Next steps Get started quickly with our complete Docker-based implementation Learn more about tool use and creating custom tools --- # Fine-grained tool streaming URL: https://platform.claude.com/docs/en/agents-and-tools/tool-use/fine-grained-tool-streaming # Fine-grained tool streaming --- Tool use now supports fine-grained [streaming](/docs/en/build-with-claude/streaming) for parameter values. This allows developers to stream tool use parameters without buffering / JSON validation, reducing the latency to begin receiving large parameters. Fine-grained tool streaming is available via the Claude API, AWS Bedrock, Google Cloud's Vertex AI, and Microsoft Foundry. Fine-grained tool streaming is a beta feature. Please make sure to evaluate your responses before using it in production. Please use [this form](https://forms.gle/D4Fjr7GvQRzfTZT96) to provide feedback on the quality of the model responses, the API itself, or the quality of the documentation—we cannot wait to hear from you! When using fine-grained tool streaming, you may potentially receive invalid or partial JSON inputs. Please make sure to account for these edge cases in your code. ## How to use fine-grained tool streaming To use this beta feature, simply add the beta header `fine-grained-tool-streaming-2025-05-14` to a tool use request and turn on streaming. Here's an example of how to use fine-grained tool streaming with the API: ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: fine-grained-tool-streaming-2025-05-14" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 65536, "tools": [ { "name": "make_file", "description": "Write text to a file", "input_schema": { "type": "object", "properties": { "filename": { "type": "string", "description": "The filename to write text to" }, "lines_of_text": { "type": "array", "description": "An array of lines of text to write to the file" } }, "required": ["filename", "lines_of_text"] } } ], "messages": [ { "role": "user", "content": "Can you write a long poem and make a file called poem.txt?" } ], "stream": true }' | jq '.usage' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.beta.messages.stream( max_tokens=65536, model="claude-sonnet-4-5", tools=[{ "name": "make_file", "description": "Write text to a file", "input_schema": { "type": "object", "properties": { "filename": { "type": "string", "description": "The filename to write text to" }, "lines_of_text": { "type": "array", "description": "An array of lines of text to write to the file" } }, "required": ["filename", "lines_of_text"] } }], messages=[{ "role": "user", "content": "Can you write a long poem and make a file called poem.txt?" }], betas=["fine-grained-tool-streaming-2025-05-14"] ) print(response.usage) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const message = await anthropic.beta.messages.stream({ model: "claude-sonnet-4-5", max_tokens: 65536, tools: [{ "name": "make_file", "description": "Write text to a file", "input_schema": { "type": "object", "properties": { "filename": { "type": "string", "description": "The filename to write text to" }, "lines_of_text": { "type": "array", "description": "An array of lines of text to write to the file" } }, "required": ["filename", "lines_of_text"] } }], messages: [{ role: "user", content: "Can you write a long poem and make a file called poem.txt?" }], betas: ["fine-grained-tool-streaming-2025-05-14"] }); console.log(message.usage); ``` In this example, fine-grained tool streaming enables Claude to stream the lines of a long poem into the tool call `make_file` without buffering to validate if the `lines_of_text` parameter is valid JSON. This means you can see the parameter stream as it arrives, without having to wait for the entire parameter to buffer and validate. With fine-grained tool streaming, tool use chunks start streaming faster, and are often longer and contain fewer word breaks. This is due to differences in chunking behavior. Example: Without fine-grained streaming (15s delay): ``` Chunk 1: '{"' Chunk 2: 'query": "Ty' Chunk 3: 'peScri' Chunk 4: 'pt 5.0 5.1 ' Chunk 5: '5.2 5' Chunk 6: '.3' Chunk 8: ' new f' Chunk 9: 'eatur' ... ``` With fine-grained streaming (3s delay): ``` Chunk 1: '{"query": "TypeScript 5.0 5.1 5.2 5.3' Chunk 2: ' new features comparison' ``` Because fine-grained streaming sends parameters without buffering or JSON validation, there is no guarantee that the resulting stream will complete in a valid JSON string. Particularly, if the [stop reason](/docs/en/build-with-claude/handling-stop-reasons) `max_tokens` is reached, the stream may end midway through a parameter and may be incomplete. You will generally have to write specific support to handle when `max_tokens` is reached. ## Handling invalid JSON in tool responses When using fine-grained tool streaming, you may receive invalid or incomplete JSON from the model. If you need to pass this invalid JSON back to the model in an error response block, you may wrap it in a JSON object to ensure proper handling (with a reasonable key). For example: ```json { "INVALID_JSON": "" } ``` This approach helps the model understand that the content is invalid JSON while preserving the original malformed data for debugging purposes. When wrapping invalid JSON, make sure to properly escape any quotes or special characters in the invalid JSON string to maintain valid JSON structure in the wrapper object. --- # How to implement tool use URL: https://platform.claude.com/docs/en/agents-and-tools/tool-use/implement-tool-use # How to implement tool use --- ## Choosing a model We recommend using the latest Claude Sonnet (4.5) or Claude Opus (4.5) model for complex tools and ambiguous queries; they handle multiple tools better and seek clarification when needed. Use Claude Haiku models for straightforward tools, but note they may infer missing parameters. If using Claude with tool use and extended thinking, refer to our guide [here](/docs/en/build-with-claude/extended-thinking) for more information. ## Specifying client tools Client tools (both Anthropic-defined and user-defined) are specified in the `tools` top-level parameter of the API request. Each tool definition includes: | Parameter | Description | | :------------- | :-------------------------------------------------------------------------------------------------- | | `name` | The name of the tool. Must match the regex `^[a-zA-Z0-9_-]{1,64}$`. | | `description` | A detailed plaintext description of what the tool does, when it should be used, and how it behaves. | | `input_schema` | A [JSON Schema](https://json-schema.org/) object defining the expected parameters for the tool. | | `input_examples` | (Optional, beta) An array of example input objects to help Claude understand how to use the tool. See [Providing tool use examples](#providing-tool-use-examples). |
```json JSON { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature, either 'celsius' or 'fahrenheit'" } }, "required": ["location"] } } ``` This tool, named `get_weather`, expects an input object with a required `location` string and an optional `unit` string that must be either "celsius" or "fahrenheit".
### Tool use system prompt When you call the Claude API with the `tools` parameter, we construct a special system prompt from the tool definitions, tool configuration, and any user-specified system prompt. The constructed prompt is designed to instruct the model to use the specified tool(s) and provide the necessary context for the tool to operate properly: ``` In this environment you have access to a set of tools you can use to answer the user's question. {{ FORMATTING INSTRUCTIONS }} String and scalar parameters should be specified as is, while lists and objects should use JSON format. Note that spaces for string values are not stripped. The output is not expected to be valid XML and is parsed with regular expressions. Here are the functions available in JSONSchema format: {{ TOOL DEFINITIONS IN JSON SCHEMA }} {{ USER SYSTEM PROMPT }} {{ TOOL CONFIGURATION }} ``` ### Best practices for tool definitions To get the best performance out of Claude when using tools, follow these guidelines: - **Provide extremely detailed descriptions.** This is by far the most important factor in tool performance. Your descriptions should explain every detail about the tool, including: - What the tool does - When it should be used (and when it shouldn't) - What each parameter means and how it affects the tool's behavior - Any important caveats or limitations, such as what information the tool does not return if the tool name is unclear. The more context you can give Claude about your tools, the better it will be at deciding when and how to use them. Aim for at least 3-4 sentences per tool description, more if the tool is complex. - **Prioritize descriptions, but consider using `input_examples` for complex tools.** Clear descriptions are most important, but for tools with complex inputs, nested objects, or format-sensitive parameters, you can use the `input_examples` field (beta) to provide schema-validated examples. See [Providing tool use examples](#providing-tool-use-examples) for details.
```json JSON { "name": "get_stock_price", "description": "Retrieves the current stock price for a given ticker symbol. The ticker symbol must be a valid symbol for a publicly traded company on a major US stock exchange like NYSE or NASDAQ. The tool will return the latest trade price in USD. It should be used when the user asks about the current or most recent price of a specific stock. It will not provide any other information about the stock or company.", "input_schema": { "type": "object", "properties": { "ticker": { "type": "string", "description": "The stock ticker symbol, e.g. AAPL for Apple Inc." } }, "required": ["ticker"] } } ```
```json JSON { "name": "get_stock_price", "description": "Gets the stock price for a ticker.", "input_schema": { "type": "object", "properties": { "ticker": { "type": "string" } }, "required": ["ticker"] } } ```
The good description clearly explains what the tool does, when to use it, what data it returns, and what the `ticker` parameter means. The poor description is too brief and leaves Claude with many open questions about the tool's behavior and usage. ## Providing tool use examples You can provide concrete examples of valid tool inputs to help Claude understand how to use your tools more effectively. This is particularly useful for complex tools with nested objects, optional parameters, or format-sensitive inputs. Tool use examples is a beta feature. Include the appropriate [beta header](/docs/en/api/beta-headers) for your provider: | Provider | Beta header | Supported models | |----------|-------------|------------------| | Claude API,
Microsoft Foundry | `advanced-tool-use-2025-11-20` | All models | | Vertex AI,
Amazon Bedrock | `tool-examples-2025-10-29` | Claude Opus 4.5 only |
### Basic usage Add an optional `input_examples` field to your tool definition with an array of example input objects. Each example must be valid according to the tool's `input_schema`: ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=1024, betas=["advanced-tool-use-2025-11-20"], tools=[ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature" } }, "required": ["location"] }, "input_examples": [ { "location": "San Francisco, CA", "unit": "fahrenheit" }, { "location": "Tokyo, Japan", "unit": "celsius" }, { "location": "New York, NY" # 'unit' is optional } ] } ], messages=[ {"role": "user", "content": "What's the weather like in San Francisco?"} ] ) ``` ```typescript TypeScript import Anthropic from "@anthropic-ai/sdk"; const client = new Anthropic(); const response = await client.messages.create({ model: "claude-sonnet-4-5-20250929", max_tokens: 1024, betas: ["advanced-tool-use-2025-11-20"], tools: [ { name: "get_weather", description: "Get the current weather in a given location", input_schema: { type: "object", properties: { location: { type: "string", description: "The city and state, e.g. San Francisco, CA", }, unit: { type: "string", enum: ["celsius", "fahrenheit"], description: "The unit of temperature", }, }, required: ["location"], }, input_examples: [ { location: "San Francisco, CA", unit: "fahrenheit", }, { location: "Tokyo, Japan", unit: "celsius", }, { location: "New York, NY", // Demonstrates that 'unit' is optional }, ], }, ], messages: [{ role: "user", content: "What's the weather like in San Francisco?" }], }); ``` Examples are included in the prompt alongside your tool schema, showing Claude concrete patterns for well-formed tool calls. This helps Claude understand when to include optional parameters, what formats to use, and how to structure complex inputs. ### Requirements and limitations - **Schema validation** - Each example must be valid according to the tool's `input_schema`. Invalid examples return a 400 error - **Not supported for server-side tools** - Only user-defined tools can have input examples - **Token cost** - Examples add to prompt tokens: ~20-50 tokens for simple examples, ~100-200 tokens for complex nested objects ## Tool runner (beta) The tool runner provides an out-of-the-box solution for executing tools with Claude. Instead of manually handling tool calls, tool results, and conversation management, the tool runner automatically: - Executes tools when Claude calls them - Handles the request/response cycle - Manages conversation state - Provides type safety and validation We recommend that you use the tool runner for most tool use implementations. The tool runner is currently in beta and available in the [Python](https://github.com/anthropics/anthropic-sdk-python/blob/main/tools.md), [TypeScript](https://github.com/anthropics/anthropic-sdk-typescript/blob/main/helpers.md#tool-helpers), and [Ruby](https://github.com/anthropics/anthropic-sdk-ruby/blob/main/helpers.md#3-auto-looping-tool-runner-beta) SDKs. **Automatic context management with compaction** The tool runner supports automatic [compaction](/docs/en/build-with-claude/context-editing#client-side-compaction-sdk), which generates summaries when token usage exceeds a threshold. This allows long-running agentic tasks to continue beyond context window limits. ### Basic usage Define tools using the SDK helpers, then use the tool runner to execute them. Use the `@beta_tool` decorator to define tools with type hints and docstrings. If you're using the async client, replace `@beta_tool` with `@beta_async_tool` and define the function with `async def`. ```python import anthropic import json from anthropic import beta_tool # Initialize client client = anthropic.Anthropic() # Define tools using the decorator @beta_tool def get_weather(location: str, unit: str = "fahrenheit") -> str: """Get the current weather in a given location. Args: location: The city and state, e.g. San Francisco, CA unit: Temperature unit, either 'celsius' or 'fahrenheit' """ # In a full implementation, you'd call a weather API here return json.dumps({"temperature": "20°C", "condition": "Sunny"}) @beta_tool def calculate_sum(a: int, b: int) -> str: """Add two numbers together. Args: a: First number b: Second number """ return str(a + b) # Use the tool runner runner = client.beta.messages.tool_runner( model="claude-sonnet-4-5", max_tokens=1024, tools=[get_weather, calculate_sum], messages=[ {"role": "user", "content": "What's the weather like in Paris? Also, what's 15 + 27?"} ] ) for message in runner: print(message.content[0].text) ``` The `@beta_tool` decorator inspects the function arguments and docstring to extract a JSON schema representation. For example, `calculate_sum` becomes: ```json { "name": "calculate_sum", "description": "Adds two integers together.", "input_schema": { "additionalProperties": false, "properties": { "left": { "description": "The first integer to add.", "title": "Left", "type": "integer" }, "right": { "description": "The second integer to add.", "title": "Right", "type": "integer" } }, "required": ["left", "right"], "type": "object" } } ``` Use `betaZodTool()` for type-safe tool definitions with Zod validation, or `betaTool()` for JSON Schema-based definitions. TypeScript offers two approaches for defining tools: **Using Zod (recommended)** - Use `betaZodTool()` for type-safe tool definitions with Zod validation (requires Zod 3.25.0 or higher): ```typescript import { Anthropic } from '@anthropic-ai/sdk'; import { betaZodTool } from '@anthropic-ai/sdk/helpers/beta/zod'; import { z } from 'zod'; const anthropic = new Anthropic(); const getWeatherTool = betaZodTool({ name: 'get_weather', description: 'Get the current weather in a given location', inputSchema: z.object({ location: z.string().describe('The city and state, e.g. San Francisco, CA'), unit: z.enum(['celsius', 'fahrenheit']).default('fahrenheit') .describe('Temperature unit') }), run: async (input) => { // In a full implementation, you'd call a weather API here return JSON.stringify({temperature: '20°C', condition: 'Sunny'}); } }); const runner = anthropic.beta.messages.toolRunner({ model: 'claude-sonnet-4-5', max_tokens: 1024, tools: [getWeatherTool], messages: [{ role: 'user', content: "What's the weather like in Paris?" }] }); for await (const message of runner) { console.log(message.content[0].text); } ``` **Using JSON Schema** - Use `betaTool()` for type-safe tool definitions without Zod: The input generated by Claude will not be validated at runtime. Perform validation inside the `run` function if needed. ```typescript import { Anthropic } from '@anthropic-ai/sdk'; import { betaTool } from '@anthropic-ai/sdk/helpers/beta/json-schema'; const anthropic = new Anthropic(); const calculateSumTool = betaTool({ name: 'calculate_sum', description: 'Add two numbers together', inputSchema: { type: 'object', properties: { a: { type: 'number', description: 'First number' }, b: { type: 'number', description: 'Second number' } }, required: ['a', 'b'] }, run: async (input) => { return String(input.a + input.b); } }); const runner = anthropic.beta.messages.toolRunner({ model: 'claude-sonnet-4-5', max_tokens: 1024, tools: [calculateSumTool], messages: [{ role: 'user', content: "What's 15 + 27?" }] }); for await (const message of runner) { console.log(message.content[0].text); } ``` Use the `Anthropic::BaseTool` class to define tools with typed input schemas. ```ruby require "anthropic" # Initialize client client = Anthropic::Client.new # Define input schema class GetWeatherInput < Anthropic::BaseModel required :location, String, doc: "The city and state, e.g. San Francisco, CA" optional :unit, Anthropic::InputSchema::EnumOf["celsius", "fahrenheit"], doc: "Temperature unit" end # Define tool class GetWeather < Anthropic::BaseTool doc "Get the current weather in a given location" input_schema GetWeatherInput def call(input) # In a full implementation, you'd call a weather API here JSON.generate({temperature: "20°C", condition: "Sunny"}) end end class CalculateSumInput < Anthropic::BaseModel required :a, Integer, doc: "First number" required :b, Integer, doc: "Second number" end class CalculateSum < Anthropic::BaseTool doc "Add two numbers together" input_schema CalculateSumInput def call(input) (input.a + input.b).to_s end end # Use the tool runner runner = client.beta.messages.tool_runner( model: "claude-sonnet-4-5", max_tokens: 1024, tools: [GetWeather.new, CalculateSum.new], messages: [ {role: "user", content: "What's the weather like in Paris? Also, what's 15 + 27?"} ] ) runner.each_message do |message| message.content.each do |block| puts block.text if block.respond_to?(:text) end end ``` The `Anthropic::BaseTool` class uses the `doc` method for the tool description and `input_schema` to define the expected parameters. The SDK automatically converts this to the appropriate JSON schema format. The tool function must return a content block or content block array, including text, images, or document blocks. This allows tools to return rich, multimodal responses. Returned strings will be converted to a text content block. If you want to return a structured JSON object to Claude, encode it to a JSON string before returning it. Numbers, booleans, or other non-string primitives must also be converted to strings. ### Iterating over the tool runner The tool runner is an iterable that yields messages from Claude. This is often referred to as a "tool call loop". Each iteration, the runner checks if Claude requested a tool use. If so, it calls the tool and sends the result back to Claude automatically, then yields the next message from Claude to continue your loop. You can end the loop at any iteration with a `break` statement. The runner will loop until Claude returns a message without a tool use. If you don't need intermediate messages, you can get the final message directly: Use `runner.until_done()` to get the final message. ```python runner = client.beta.messages.tool_runner( model="claude-sonnet-4-5", max_tokens=1024, tools=[get_weather, calculate_sum], messages=[ {"role": "user", "content": "What's the weather like in Paris? Also, what's 15 + 27?"} ] ) final_message = runner.until_done() print(final_message.content[0].text) ``` Simply `await` the runner to get the final message. ```typescript const runner = anthropic.beta.messages.toolRunner({ model: 'claude-sonnet-4-5', max_tokens: 1024, tools: [getWeatherTool], messages: [{ role: 'user', content: "What's the weather like in Paris?" }] }); const finalMessage = await runner; console.log(finalMessage.content[0].text); ``` Use `runner.run_until_finished` to get all messages. ```ruby runner = client.beta.messages.tool_runner( model: "claude-sonnet-4-5", max_tokens: 1024, tools: [GetWeather.new, CalculateSum.new], messages: [ {role: "user", content: "What's the weather like in Paris? Also, what's 15 + 27?"} ] ) all_messages = runner.run_until_finished all_messages.each { |msg| puts msg.content } ``` ### Advanced usage Within the loop, you can fully customize the tool runner's next request to the Messages API. The runner automatically appends tool results to the message history, so you don't need to manually manage them. You can optionally inspect the tool result for logging or debugging, and modify the request parameters before the next API call. Use `generate_tool_call_response()` to optionally inspect the tool result (the runner appends it automatically). Use `set_messages_params()` and `append_messages()` to modify the request. ```python runner = client.beta.messages.tool_runner( model="claude-sonnet-4-5", max_tokens=1024, tools=[get_weather], messages=[{"role": "user", "content": "What's the weather in San Francisco?"}] ) for message in runner: # Optional: inspect the tool response (automatically appended by the runner) tool_response = runner.generate_tool_call_response() if tool_response: print(f"Tool result: {tool_response}") # Customize the next request runner.set_messages_params(lambda params: { **params, "max_tokens": 2048 # Increase tokens for next request }) # Or add additional messages runner.append_messages( {"role": "user", "content": "Please be concise in your response."} ) ``` Use `generateToolResponse()` to optionally inspect the tool result (the runner appends it automatically). Use `setMessagesParams()` and `pushMessages()` to modify the request. ```typescript const runner = anthropic.beta.messages.toolRunner({ model: 'claude-sonnet-4-5', max_tokens: 1024, tools: [getWeatherTool], messages: [{ role: 'user', content: "What's the weather in San Francisco?" }] }); for await (const message of runner) { // Optional: inspect the tool result message (automatically appended by the runner) const toolResultMessage = await runner.generateToolResponse(); if (toolResultMessage) { console.log('Tool result:', toolResultMessage); } // Customize the next request runner.setMessagesParams(params => ({ ...params, max_tokens: 2048 // Increase tokens for next request })); // Or add additional messages runner.pushMessages( { role: 'user', content: 'Please be concise in your response.' } ); } ``` Use `next_message` for step-by-step control. Use `feed_messages` to inject messages and `params` to access parameters. ```ruby runner = client.beta.messages.tool_runner( model: "claude-sonnet-4-5", max_tokens: 1024, tools: [GetWeather.new], messages: [{role: "user", content: "What's the weather in San Francisco?"}] ) # Manual step-by-step control message = runner.next_message puts message.content # Inject follow-up messages runner.feed_messages([ {role: "user", content: "Also check Boston"} ]) # Access current parameters puts runner.params ``` #### Debugging tool execution When a tool throws an exception, the tool runner catches it and returns the error to Claude as a tool result with `is_error: true`. By default, only the exception message is included, not the full stack trace. To view full stack traces and debug information, set the `ANTHROPIC_LOG` environment variable: ```bash # View info-level logs including tool errors export ANTHROPIC_LOG=info # View debug-level logs for more verbose output export ANTHROPIC_LOG=debug ``` When enabled, the SDK logs full exception details (using Python's `logging` module, the console in TypeScript, or Ruby's logger), including the complete stack trace when a tool fails. #### Intercepting tool errors By default, tool errors are passed back to Claude, which can then respond appropriately. However, you may want to detect errors and handle them differently—for example, to stop execution early or implement custom error handling. Use the tool response method to intercept tool results and check for errors before they're sent to Claude: ```python import json runner = client.beta.messages.tool_runner( model="claude-sonnet-4-5", max_tokens=1024, tools=[my_tool], messages=[{"role": "user", "content": "Run the tool"}] ) for message in runner: tool_response = runner.generate_tool_call_response() if tool_response: # Check if any tool result has an error for block in tool_response.content: if block.is_error: # Option 1: Raise an exception to stop the loop raise RuntimeError(f"Tool failed: {json.dumps(block.content)}") # Option 2: Log and continue (let Claude handle it) # logger.error(f"Tool error: {json.dumps(block.content)}") # Process the message normally print(message.content) ``` ```typescript const runner = anthropic.beta.messages.toolRunner({ model: 'claude-sonnet-4-5', max_tokens: 1024, tools: [myTool], messages: [{ role: 'user', content: 'Run the tool' }] }); for await (const message of runner) { const toolResultMessage = await runner.generateToolResponse(); if (toolResultMessage) { // Check if any tool result has an error for (const block of toolResultMessage.content) { if (block.type === 'tool_result' && block.is_error) { // Option 1: Throw to stop the loop throw new Error(`Tool failed: ${JSON.stringify(block.content)}`); // Option 2: Log and continue (let Claude handle it) // console.error(`Tool error: ${JSON.stringify(block.content)}`); } } } // Process the message normally console.log(message.content); } ``` ```ruby runner = client.beta.messages.tool_runner( model: "claude-sonnet-4-5", max_tokens: 1024, tools: [MyTool.new], messages: [{role: "user", content: "Run the tool"}] ) runner.each_message do |message| # Get the tool response to check for errors # Note: The runner automatically handles tool execution and appends results # This is just for error checking/logging purposes tool_results = runner.params[:messages].last if tool_results && tool_results[:role] == "user" tool_results[:content].each do |block| if block[:type] == "tool_result" && block[:is_error] # Option 1: Raise an exception to stop the loop raise "Tool failed: #{block[:content]}" # Option 2: Log and continue (let Claude handle it) # logger.error("Tool error: #{block[:content]}") end end end puts message.content end ``` #### Modifying tool results You can modify tool results before they're sent back to Claude. This is useful for adding metadata like `cache_control` to enable [prompt caching](/docs/en/build-with-claude/prompt-caching) on tool results, or for transforming the tool output. Use the tool response method to get the tool result, modify it, then add your modified version to the messages: ```python runner = client.beta.messages.tool_runner( model="claude-sonnet-4-5", max_tokens=1024, tools=[search_documents], messages=[{"role": "user", "content": "Search for information about the climate of San Francisco"}] ) for message in runner: tool_response = runner.generate_tool_call_response() if tool_response: # Modify the tool result to add cache control for block in tool_response.content: if block.type == "tool_result": # Add cache_control to cache this tool result block.cache_control = {"type": "ephemeral"} # Append the modified response (this prevents auto-append of original) runner.append_messages(message, tool_response) print(message.content) ``` ```typescript const runner = anthropic.beta.messages.toolRunner({ model: 'claude-sonnet-4-5', max_tokens: 1024, tools: [searchDocuments], messages: [{ role: 'user', content: 'Search for information about the climate of San Francisco' }] }); for await (const message of runner) { const toolResultMessage = await runner.generateToolResponse(); if (toolResultMessage) { // Modify the tool result to add cache control for (const block of toolResultMessage.content) { if (block.type === 'tool_result') { // Add cache_control to cache this tool result block.cache_control = { type: 'ephemeral' }; } } // Push the modified message (this prevents auto-append of original) runner.pushMessages(message, toolResultMessage); } console.log(message.content); } ``` ```ruby runner = client.beta.messages.tool_runner( model: "claude-sonnet-4-5", max_tokens: 1024, tools: [SearchDocuments.new], messages: [{role: "user", content: "Search for information about the climate of San Francisco"}] ) loop do message = runner.next_message break unless message # Access the most recent tool results from the messages array # The runner automatically adds tool results, but we can modify them tool_results_message = runner.params[:messages].last if tool_results_message && tool_results_message[:role] == "user" tool_results_message[:content].each do |block| if block[:type] == "tool_result" # Modify the tool result to add cache control block[:cache_control] = {type: "ephemeral"} end end end puts message.content break if message.stop_reason != "tool_use" end ``` Adding `cache_control` to tool results is particularly useful when tools return large amounts of data (like document search results) that you want to cache for subsequent API calls. See [Prompt caching](/docs/en/build-with-claude/prompt-caching) for more details on caching strategies. ### Streaming Enable streaming to receive events as they arrive. Each iteration yields a stream object that you can iterate for events. Set `stream=True` and use `get_final_message()` to get the accumulated message. ```python runner = client.beta.messages.tool_runner( model="claude-sonnet-4-5", max_tokens=1024, tools=[calculate_sum], messages=[{"role": "user", "content": "What is 15 + 27?"}], stream=True ) # When streaming, the runner returns BetaMessageStream for message_stream in runner: for event in message_stream: print('event:', event) print('message:', message_stream.get_final_message()) print(runner.until_done()) ``` Set `stream: true` and use `finalMessage()` to get the accumulated message. ```typescript const runner = anthropic.beta.messages.toolRunner({ model: 'claude-sonnet-4-5-20250929', max_tokens: 1000, messages: [{ role: 'user', content: 'What is the weather in San Francisco?' }], tools: [getWeatherTool], stream: true, }); // When streaming, the runner returns BetaMessageStream for await (const messageStream of runner) { for await (const event of messageStream) { console.log('event:', event); } console.log('message:', await messageStream.finalMessage()); } console.log(await runner); ``` Use `each_streaming` to iterate over streaming events. ```ruby runner = client.beta.messages.tool_runner( model: "claude-sonnet-4-5", max_tokens: 1024, tools: [CalculateSum.new], messages: [{role: "user", content: "What is 15 + 27?"}] ) runner.each_streaming do |event| case event when Anthropic::Streaming::TextEvent print event.text when Anthropic::Streaming::ToolUseEvent puts "\nTool called: #{event.tool_name}" end end ``` The SDK tool runner is in beta. The rest of this document covers manual tool implementation. ## Controlling Claude's output ### Forcing tool use In some cases, you may want Claude to use a specific tool to answer the user's question, even if Claude thinks it can provide an answer without using a tool. You can do this by specifying the tool in the `tool_choice` field like so: ``` tool_choice = {"type": "tool", "name": "get_weather"} ``` When working with the tool_choice parameter, we have four possible options: - `auto` allows Claude to decide whether to call any provided tools or not. This is the default value when `tools` are provided. - `any` tells Claude that it must use one of the provided tools, but doesn't force a particular tool. - `tool` allows us to force Claude to always use a particular tool. - `none` prevents Claude from using any tools. This is the default value when no `tools` are provided. When using [prompt caching](/docs/en/build-with-claude/prompt-caching#what-invalidates-the-cache), changes to the `tool_choice` parameter will invalidate cached message blocks. Tool definitions and system prompts remain cached, but message content must be reprocessed. This diagram illustrates how each option works: ![Image](/docs/images/tool_choice.png) Note that when you have `tool_choice` as `any` or `tool`, we will prefill the assistant message to force a tool to be used. This means that the models will not emit a natural language response or explanation before `tool_use` content blocks, even if explicitly asked to do so. When using [extended thinking](/docs/en/build-with-claude/extended-thinking) with tool use, `tool_choice: {"type": "any"}` and `tool_choice: {"type": "tool", "name": "..."}` are not supported and will result in an error. Only `tool_choice: {"type": "auto"}` (the default) and `tool_choice: {"type": "none"}` are compatible with extended thinking. Our testing has shown that this should not reduce performance. If you would like the model to provide natural language context or explanations while still requesting that the model use a specific tool, you can use `{"type": "auto"}` for `tool_choice` (the default) and add explicit instructions in a `user` message. For example: `What's the weather like in London? Use the get_weather tool in your response.` **Guaranteed tool calls with strict tools** Combine `tool_choice: {"type": "any"}` with [strict tool use](/docs/en/build-with-claude/structured-outputs) to guarantee both that one of your tools will be called AND that the tool inputs strictly follow your schema. Set `strict: true` on your tool definitions to enable schema validation. ### JSON output Tools do not necessarily need to be client functions — you can use tools anytime you want the model to return JSON output that follows a provided schema. For example, you might use a `record_summary` tool with a particular schema. See [Tool use with Claude](/docs/en/agents-and-tools/tool-use/overview) for a full working example. ### Model responses with tools When using tools, Claude will often comment on what it's doing or respond naturally to the user before invoking tools. For example, given the prompt "What's the weather like in San Francisco right now, and what time is it there?", Claude might respond with: ```json JSON { "role": "assistant", "content": [ { "type": "text", "text": "I'll help you check the current weather and time in San Francisco." }, { "type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9", "name": "get_weather", "input": {"location": "San Francisco, CA"} } ] } ``` This natural response style helps users understand what Claude is doing and creates a more conversational interaction. You can guide the style and content of these responses through your system prompts and by providing `` in your prompts. It's important to note that Claude may use various phrasings and approaches when explaining its actions. Your code should treat these responses like any other assistant-generated text, and not rely on specific formatting conventions. ### Parallel tool use By default, Claude may use multiple tools to answer a user query. You can disable this behavior by: - Setting `disable_parallel_tool_use=true` when tool_choice type is `auto`, which ensures that Claude uses **at most one** tool - Setting `disable_parallel_tool_use=true` when tool_choice type is `any` or `tool`, which ensures that Claude uses **exactly one** tool
**Simpler with Tool runner**: The example below shows manual parallel tool handling. For most use cases, [tool runner](#tool-runner-beta) automatically handle parallel tool execution with much less code. Here's a complete example showing how to properly format parallel tool calls in the message history: ```python Python import anthropic client = anthropic.Anthropic() # Define tools tools = [ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" } }, "required": ["location"] } }, { "name": "get_time", "description": "Get the current time in a given timezone", "input_schema": { "type": "object", "properties": { "timezone": { "type": "string", "description": "The timezone, e.g. America/New_York" } }, "required": ["timezone"] } } ] # Initial request response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=tools, messages=[ { "role": "user", "content": "What's the weather in SF and NYC, and what time is it there?" } ] ) # Claude's response with parallel tool calls print("Claude wants to use tools:", response.stop_reason == "tool_use") print("Number of tool calls:", len([c for c in response.content if c.type == "tool_use"])) # Build the conversation with tool results messages = [ { "role": "user", "content": "What's the weather in SF and NYC, and what time is it there?" }, { "role": "assistant", "content": response.content # Contains multiple tool_use blocks }, { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01", # Must match the ID from tool_use "content": "San Francisco: 68°F, partly cloudy" }, { "type": "tool_result", "tool_use_id": "toolu_02", "content": "New York: 45°F, clear skies" }, { "type": "tool_result", "tool_use_id": "toolu_03", "content": "San Francisco time: 2:30 PM PST" }, { "type": "tool_result", "tool_use_id": "toolu_04", "content": "New York time: 5:30 PM EST" } ] } ] # Get final response final_response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=tools, messages=messages ) print(final_response.content[0].text) ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); // Define tools const tools = [ { name: "get_weather", description: "Get the current weather in a given location", input_schema: { type: "object", properties: { location: { type: "string", description: "The city and state, e.g. San Francisco, CA" } }, required: ["location"] } }, { name: "get_time", description: "Get the current time in a given timezone", input_schema: { type: "object", properties: { timezone: { type: "string", description: "The timezone, e.g. America/New_York" } }, required: ["timezone"] } } ]; // Initial request const response = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, tools: tools, messages: [ { role: "user", content: "What's the weather in SF and NYC, and what time is it there?" } ] }); // Build conversation with tool results const messages = [ { role: "user", content: "What's the weather in SF and NYC, and what time is it there?" }, { role: "assistant", content: response.content // Contains multiple tool_use blocks }, { role: "user", content: [ { type: "tool_result", tool_use_id: "toolu_01", // Must match the ID from tool_use content: "San Francisco: 68°F, partly cloudy" }, { type: "tool_result", tool_use_id: "toolu_02", content: "New York: 45°F, clear skies" }, { type: "tool_result", tool_use_id: "toolu_03", content: "San Francisco time: 2:30 PM PST" }, { type: "tool_result", tool_use_id: "toolu_04", content: "New York time: 5:30 PM EST" } ] } ]; // Get final response const finalResponse = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, tools: tools, messages: messages }); console.log(finalResponse.content[0].text); ``` The assistant message with parallel tool calls would look like this: ```json { "role": "assistant", "content": [ { "type": "text", "text": "I'll check the weather and time for both San Francisco and New York City." }, { "type": "tool_use", "id": "toolu_01", "name": "get_weather", "input": {"location": "San Francisco, CA"} }, { "type": "tool_use", "id": "toolu_02", "name": "get_weather", "input": {"location": "New York, NY"} }, { "type": "tool_use", "id": "toolu_03", "name": "get_time", "input": {"timezone": "America/Los_Angeles"} }, { "type": "tool_use", "id": "toolu_04", "name": "get_time", "input": {"timezone": "America/New_York"} } ] } ```
Here's a complete, runnable script to test and verify parallel tool calls are working correctly: ```python Python #!/usr/bin/env python3 """Test script to verify parallel tool calls with the Claude API""" import os from anthropic import Anthropic # Initialize client client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY")) # Define tools tools = [ { "name": "get_weather", "description": "Get the current weather in a given location", "input_schema": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" } }, "required": ["location"] } }, { "name": "get_time", "description": "Get the current time in a given timezone", "input_schema": { "type": "object", "properties": { "timezone": { "type": "string", "description": "The timezone, e.g. America/New_York" } }, "required": ["timezone"] } } ] # Test conversation with parallel tool calls messages = [ { "role": "user", "content": "What's the weather in SF and NYC, and what time is it there?" } ] # Make initial request print("Requesting parallel tool calls...") response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=messages, tools=tools ) # Check for parallel tool calls tool_uses = [block for block in response.content if block.type == "tool_use"] print(f"\n✓ Claude made {len(tool_uses)} tool calls") if len(tool_uses) > 1: print("✓ Parallel tool calls detected!") for tool in tool_uses: print(f" - {tool.name}: {tool.input}") else: print("✗ No parallel tool calls detected") # Simulate tool execution and format results correctly tool_results = [] for tool_use in tool_uses: if tool_use.name == "get_weather": if "San Francisco" in str(tool_use.input): result = "San Francisco: 68°F, partly cloudy" else: result = "New York: 45°F, clear skies" else: # get_time if "Los_Angeles" in str(tool_use.input): result = "2:30 PM PST" else: result = "5:30 PM EST" tool_results.append({ "type": "tool_result", "tool_use_id": tool_use.id, "content": result }) # Continue conversation with tool results messages.extend([ {"role": "assistant", "content": response.content}, {"role": "user", "content": tool_results} # All results in one message! ]) # Get final response print("\nGetting final response...") final_response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=messages, tools=tools ) print(f"\nClaude's response:\n{final_response.content[0].text}") # Verify formatting print("\n--- Verification ---") print(f"✓ Tool results sent in single user message: {len(tool_results)} results") print("✓ No text before tool results in content array") print("✓ Conversation formatted correctly for future parallel tool use") ``` ```typescript TypeScript #!/usr/bin/env node // Test script to verify parallel tool calls with the Claude API import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY }); // Define tools const tools = [ { name: "get_weather", description: "Get the current weather in a given location", input_schema: { type: "object", properties: { location: { type: "string", description: "The city and state, e.g. San Francisco, CA" } }, required: ["location"] } }, { name: "get_time", description: "Get the current time in a given timezone", input_schema: { type: "object", properties: { timezone: { type: "string", description: "The timezone, e.g. America/New_York" } }, required: ["timezone"] } } ]; async function testParallelTools() { // Make initial request console.log("Requesting parallel tool calls..."); const response = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, messages: [{ role: "user", content: "What's the weather in SF and NYC, and what time is it there?" }], tools: tools }); // Check for parallel tool calls const toolUses = response.content.filter(block => block.type === "tool_use"); console.log(`\n✓ Claude made ${toolUses.length} tool calls`); if (toolUses.length > 1) { console.log("✓ Parallel tool calls detected!"); toolUses.forEach(tool => { console.log(` - ${tool.name}: ${JSON.stringify(tool.input)}`); }); } else { console.log("✗ No parallel tool calls detected"); } // Simulate tool execution and format results correctly const toolResults = toolUses.map(toolUse => { let result; if (toolUse.name === "get_weather") { result = toolUse.input.location.includes("San Francisco") ? "San Francisco: 68°F, partly cloudy" : "New York: 45°F, clear skies"; } else { result = toolUse.input.timezone.includes("Los_Angeles") ? "2:30 PM PST" : "5:30 PM EST"; } return { type: "tool_result", tool_use_id: toolUse.id, content: result }; }); // Get final response with correct formatting console.log("\nGetting final response..."); const finalResponse = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, messages: [ { role: "user", content: "What's the weather in SF and NYC, and what time is it there?" }, { role: "assistant", content: response.content }, { role: "user", content: toolResults } // All results in one message! ], tools: tools }); console.log(`\nClaude's response:\n${finalResponse.content[0].text}`); // Verify formatting console.log("\n--- Verification ---"); console.log(`✓ Tool results sent in single user message: ${toolResults.length} results`); console.log("✓ No text before tool results in content array"); console.log("✓ Conversation formatted correctly for future parallel tool use"); } testParallelTools().catch(console.error); ``` This script demonstrates: - How to properly format parallel tool calls and results - How to verify that parallel calls are being made - The correct message structure that encourages future parallel tool use - Common mistakes to avoid (like text before tool results) Run this script to test your implementation and ensure Claude is making parallel tool calls effectively.
#### Maximizing parallel tool use While Claude 4 models have excellent parallel tool use capabilities by default, you can increase the likelihood of parallel tool execution across all models with targeted prompting:
For Claude 4 models (Opus 4, and Sonnet 4), add this to your system prompt: ```text For maximum efficiency, whenever you need to perform multiple independent operations, invoke all relevant tools simultaneously rather than sequentially. ``` For even stronger parallel tool use (recommended if the default isn't sufficient), use: ```text For maximum efficiency, whenever you perform multiple independent operations, invoke all relevant tools simultaneously rather than sequentially. Prioritize calling tools in parallel whenever possible. For example, when reading 3 files, run 3 tool calls in parallel to read all 3 files into context at the same time. When running multiple read-only commands like `ls` or `list_dir`, always run all of the commands in parallel. Err on the side of maximizing parallel tool calls rather than running too many tools sequentially. ```
You can also encourage parallel tool use within specific user messages: ```python # Instead of: "What's the weather in Paris? Also check London." # Use: "Check the weather in Paris and London simultaneously." # Or be explicit: "Please use parallel tool calls to get the weather for Paris, London, and Tokyo at the same time." ```
**Parallel tool use with Claude Sonnet 3.7** Claude Sonnet 3.7 may be less likely to make make parallel tool calls in a response, even when you have not set `disable_parallel_tool_use`. We recommend [upgrading to Claude 4 models](/docs/en/about-claude/models/migrating-to-claude-4), which have built-in token-efficient tool use and improved parallel tool calling. If you're still using Claude Sonnet 3.7, you can enable the `token-efficient-tools-2025-02-19` [beta header](/docs/en/api/beta-headers), which helps encourage Claude to use parallel tools. You can also introduce a "batch tool" that can act as a meta-tool to wrap invocations to other tools simultaneously. See [this example](https://platform.claude.com/cookbook/tool-use-parallel-tools) in our cookbook for how to use this workaround. ## Handling tool use and tool result content blocks **Simpler with Tool runner**: The manual tool handling described in this section is automatically managed by [tool runner](#tool-runner-beta). Use this section when you need custom control over tool execution. Claude's response differs based on whether it uses a client or server tool. ### Handling results from client tools The response will have a `stop_reason` of `tool_use` and one or more `tool_use` content blocks that include: - `id`: A unique identifier for this particular tool use block. This will be used to match up the tool results later. - `name`: The name of the tool being used. - `input`: An object containing the input being passed to the tool, conforming to the tool's `input_schema`.
```json JSON { "id": "msg_01Aq9w938a90dw8q", "model": "claude-sonnet-4-5", "stop_reason": "tool_use", "role": "assistant", "content": [ { "type": "text", "text": "I'll check the current weather in San Francisco for you." }, { "type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9", "name": "get_weather", "input": {"location": "San Francisco, CA", "unit": "celsius"} } ] } ```
When you receive a tool use response for a client tool, you should: 1. Extract the `name`, `id`, and `input` from the `tool_use` block. 2. Run the actual tool in your codebase corresponding to that tool name, passing in the tool `input`. 3. Continue the conversation by sending a new message with the `role` of `user`, and a `content` block containing the `tool_result` type and the following information: - `tool_use_id`: The `id` of the tool use request this is a result for. - `content`: The result of the tool, as a string (e.g. `"content": "15 degrees"`), a list of nested content blocks (e.g. `"content": [{"type": "text", "text": "15 degrees"}]`), or a list of document blocks (e.g. `"content": ["type": "document", "source": {"type": "text", "media_type": "text/plain", "data": "15 degrees"}]`). These content blocks can use the `text`, `image`, or `document` types. - `is_error` (optional): Set to `true` if the tool execution resulted in an error. **Important formatting requirements**: - Tool result blocks must immediately follow their corresponding tool use blocks in the message history. You cannot include any messages between the assistant's tool use message and the user's tool result message. - In the user message containing tool results, the tool_result blocks must come FIRST in the content array. Any text must come AFTER all tool results. For example, this will cause a 400 error: ```json {"role": "user", "content": [ {"type": "text", "text": "Here are the results:"}, // ❌ Text before tool_result {"type": "tool_result", "tool_use_id": "toolu_01", ...} ]} ``` This is correct: ```json {"role": "user", "content": [ {"type": "tool_result", "tool_use_id": "toolu_01", ...}, {"type": "text", "text": "What should I do next?"} // ✅ Text after tool_result ]} ``` If you receive an error like "tool_use ids were found without tool_result blocks immediately after", check that your tool results are formatted correctly.
```json JSON { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "15 degrees" } ] } ```
```json JSON { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": [ {"type": "text", "text": "15 degrees"}, { "type": "image", "source": { "type": "base64", "media_type": "image/jpeg", "data": "/9j/4AAQSkZJRg...", } } ] } ] } ```
```json JSON { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", } ] } ```
```json JSON { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": [ {"type": "text", "text": "The weather is"}, { "type": "document", "source": { "type": "text", "media_type": "text/plain", "data": "15 degrees" } } ] } ] } ```
After receiving the tool result, Claude will use that information to continue generating a response to the original user prompt. ### Handling results from server tools Claude executes the tool internally and incorporates the results directly into its response without requiring additional user interaction. **Differences from other APIs** Unlike APIs that separate tool use or use special roles like `tool` or `function`, the Claude API integrates tools directly into the `user` and `assistant` message structure. Messages contain arrays of `text`, `image`, `tool_use`, and `tool_result` blocks. `user` messages include client content and `tool_result`, while `assistant` messages contain AI-generated content and `tool_use`. ### Handling the `max_tokens` stop reason If Claude's [response is cut off due to hitting the `max_tokens` limit](/docs/en/build-with-claude/handling-stop-reasons#max-tokens), and the truncated response contains an incomplete tool use block, you'll need to retry the request with a higher `max_tokens` value to get the full tool use. ```python Python # Check if response was truncated during tool use if response.stop_reason == "max_tokens": # Check if the last content block is an incomplete tool_use last_block = response.content[-1] if last_block.type == "tool_use": # Send the request with higher max_tokens response = client.messages.create( model="claude-sonnet-4-5", max_tokens=4096, # Increased limit messages=messages, tools=tools ) ``` ```typescript TypeScript // Check if response was truncated during tool use if (response.stop_reason === "max_tokens") { // Check if the last content block is an incomplete tool_use const lastBlock = response.content[response.content.length - 1]; if (lastBlock.type === "tool_use") { // Send the request with higher max_tokens response = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 4096, // Increased limit messages: messages, tools: tools }); } } ``` #### Handling the `pause_turn` stop reason When using server tools like web search, the API may return a `pause_turn` stop reason, indicating that the API has paused a long-running turn. Here's how to handle the `pause_turn` stop reason: ```python Python import anthropic client = anthropic.Anthropic() # Initial request with web search response = client.messages.create( model="claude-3-7-sonnet-latest", max_tokens=1024, messages=[ { "role": "user", "content": "Search for comprehensive information about quantum computing breakthroughs in 2025" } ], tools=[{ "type": "web_search_20250305", "name": "web_search", "max_uses": 10 }] ) # Check if the response has pause_turn stop reason if response.stop_reason == "pause_turn": # Continue the conversation with the paused content messages = [ {"role": "user", "content": "Search for comprehensive information about quantum computing breakthroughs in 2025"}, {"role": "assistant", "content": response.content} ] # Send the continuation request continuation = client.messages.create( model="claude-3-7-sonnet-latest", max_tokens=1024, messages=messages, tools=[{ "type": "web_search_20250305", "name": "web_search", "max_uses": 10 }] ) print(continuation) else: print(response) ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); // Initial request with web search const response = await anthropic.messages.create({ model: "claude-3-7-sonnet-latest", max_tokens: 1024, messages: [ { role: "user", content: "Search for comprehensive information about quantum computing breakthroughs in 2025" } ], tools: [{ type: "web_search_20250305", name: "web_search", max_uses: 10 }] }); // Check if the response has pause_turn stop reason if (response.stop_reason === "pause_turn") { // Continue the conversation with the paused content const messages = [ { role: "user", content: "Search for comprehensive information about quantum computing breakthroughs in 2025" }, { role: "assistant", content: response.content } ]; // Send the continuation request const continuation = await anthropic.messages.create({ model: "claude-3-7-sonnet-latest", max_tokens: 1024, messages: messages, tools: [{ type: "web_search_20250305", name: "web_search", max_uses: 10 }] }); console.log(continuation); } else { console.log(response); } ``` When handling `pause_turn`: - **Continue the conversation**: Pass the paused response back as-is in a subsequent request to let Claude continue its turn - **Modify if needed**: You can optionally modify the content before continuing if you want to interrupt or redirect the conversation - **Preserve tool state**: Include the same tools in the continuation request to maintain functionality ## Troubleshooting errors **Built-in Error Handling**: [Tool runner](#tool-runner-beta) provide automatic error handling for most common scenarios. This section covers manual error handling for advanced use cases. There are a few different types of errors that can occur when using tools with Claude:
If the tool itself throws an error during execution (e.g. a network error when fetching weather data), you can return the error message in the `content` along with `"is_error": true`: ```json JSON { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "ConnectionError: the weather service API is not available (HTTP 500)", "is_error": true } ] } ``` Claude will then incorporate this error into its response to the user, e.g. "I'm sorry, I was unable to retrieve the current weather because the weather service API is not available. Please try again later."
If Claude's attempted use of a tool is invalid (e.g. missing required parameters), it usually means that the there wasn't enough information for Claude to use the tool correctly. Your best bet during development is to try the request again with more-detailed `description` values in your tool definitions. However, you can also continue the conversation forward with a `tool_result` that indicates the error, and Claude will try to use the tool again with the missing information filled in: ```json JSON { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "Error: Missing required 'location' parameter", "is_error": true } ] } ``` If a tool request is invalid or missing parameters, Claude will retry 2-3 times with corrections before apologizing to the user. To eliminate invalid tool calls entirely, use [strict tool use](/docs/en/build-with-claude/structured-outputs) with `strict: true` on your tool definitions. This guarantees that tool inputs will always match your schema exactly, preventing missing parameters and type mismatches.
To prevent Claude from reflecting on search quality with \ tags, add "Do not reflect on the quality of the returned search results in your response" to your prompt.
When server tools encounter errors (e.g., network issues with Web Search), Claude will transparently handle these errors and attempt to provide an alternative response or explanation to the user. Unlike client tools, you do not need to handle `is_error` results for server tools. For web search specifically, possible error codes include: - `too_many_requests`: Rate limit exceeded - `invalid_input`: Invalid search query parameter - `max_uses_exceeded`: Maximum web search tool uses exceeded - `query_too_long`: Query exceeds maximum length - `unavailable`: An internal error occurred
If Claude isn't making parallel tool calls when expected, check these common issues: **1. Incorrect tool result formatting** The most common issue is formatting tool results incorrectly in the conversation history. This "teaches" Claude to avoid parallel calls. Specifically for parallel tool use: - ❌ **Wrong**: Sending separate user messages for each tool result - ✅ **Correct**: All tool results must be in a single user message ```json // ❌ This reduces parallel tool use [ {"role": "assistant", "content": [tool_use_1, tool_use_2]}, {"role": "user", "content": [tool_result_1]}, {"role": "user", "content": [tool_result_2]} // Separate message ] // ✅ This maintains parallel tool use [ {"role": "assistant", "content": [tool_use_1, tool_use_2]}, {"role": "user", "content": [tool_result_1, tool_result_2]} // Single message ] ``` See the [general formatting requirements above](#handling-tool-use-and-tool-result-content-blocks) for other formatting rules. **2. Weak prompting** Default prompting may not be sufficient. Use stronger language: ```text For maximum efficiency, whenever you perform multiple independent operations, invoke all relevant tools simultaneously rather than sequentially. Prioritize calling tools in parallel whenever possible. ``` **3. Measuring parallel tool usage** To verify parallel tool calls are working: ```python # Calculate average tools per tool-calling message tool_call_messages = [msg for msg in messages if any( block.type == "tool_use" for block in msg.content )] total_tool_calls = sum( len([b for b in msg.content if b.type == "tool_use"]) for msg in tool_call_messages ) avg_tools_per_message = total_tool_calls / len(tool_call_messages) print(f"Average tools per message: {avg_tools_per_message}") # Should be > 1.0 if parallel calls are working ``` **4. Model-specific behavior** - Claude Opus 4.5, Opus 4.1, and Sonnet 4: Excel at parallel tool use with minimal prompting - Claude Sonnet 3.7: May need stronger prompting or the `token-efficient-tools-2025-02-19` [beta header](/docs/en/api/beta-headers). Consider [upgrading to Claude 4](/docs/en/about-claude/models/migrating-to-claude-4). - Claude Haiku: Less likely to use parallel tools without explicit prompting
--- # Memory tool URL: https://platform.claude.com/docs/en/agents-and-tools/tool-use/memory-tool # Memory tool --- The memory tool enables Claude to store and retrieve information across conversations through a memory file directory. Claude can create, read, update, and delete files that persist between sessions, allowing it to build knowledge over time without keeping everything in the context window. The memory tool operates client-side—you control where and how the data is stored through your own infrastructure. The memory tool is currently in beta. To enable it, use the beta header `context-management-2025-06-27` in your API requests. Please reach out through our [feedback form](https://forms.gle/YXC2EKGMhjN1c4L88) to share your feedback on this feature. ## Use cases - Maintain project context across multiple agent executions - Learn from past interactions, decisions, and feedback - Build knowledge bases over time - Enable cross-conversation learning where Claude improves at recurring workflows ## How it works When enabled, Claude automatically checks its memory directory before starting tasks. Claude can create, read, update, and delete files in the `/memories` directory to store what it learns while working, then reference those memories in future conversations to handle similar tasks more effectively or pick up where it left off. Since this is a client-side tool, Claude makes tool calls to perform memory operations, and your application executes those operations locally. This gives you complete control over where and how the memory is stored. For security, you should restrict all memory operations to the `/memories` directory. ### Example: How memory tool calls work When you ask Claude to help with a task, Claude automatically checks its memory directory first. Here's what a typical interaction looks like: **1. User request:** ``` "Help me respond to this customer service ticket." ``` **2. Claude checks the memory directory:** ``` "I'll help you respond to the customer service ticket. Let me check my memory for any previous context." ``` Claude calls the memory tool: ```json { "type": "tool_use", "id": "toolu_01C4D5E6F7G8H9I0J1K2L3M4", "name": "memory", "input": { "command": "view", "path": "/memories" } } ``` **3. Your application returns the directory contents:** ```json { "type": "tool_result", "tool_use_id": "toolu_01C4D5E6F7G8H9I0J1K2L3M4", "content": "Here're the files and directories up to 2 levels deep in /memories, excluding hidden items and node_modules:\n4.0K\t/memories\n1.5K\t/memories/customer_service_guidelines.xml\n2.0K\t/memories/refund_policies.xml" } ``` **4. Claude reads relevant files:** ```json { "type": "tool_use", "id": "toolu_01D5E6F7G8H9I0J1K2L3M4N5", "name": "memory", "input": { "command": "view", "path": "/memories/customer_service_guidelines.xml" } } ``` **5. Your application returns the file contents:** ```json { "type": "tool_result", "tool_use_id": "toolu_01D5E6F7G8H9I0J1K2L3M4N5", "content": "Here's the content of /memories/customer_service_guidelines.xml with line numbers:\n 1\t\n 2\t\n 3\t- Always address customers by their first name\n 4\t- Use empathetic language\n..." } ``` **6. Claude uses the memory to help:** ``` "Based on your customer service guidelines, I can help you craft a response. Please share the ticket details..." ``` ## Supported models The memory tool is available on: - Claude Sonnet 4.5 (`claude-sonnet-4-5-20250929`) - Claude Sonnet 4 (`claude-sonnet-4-20250514`) - Claude Haiku 4.5 (`claude-haiku-4-5-20251001`) - Claude Opus 4.5 (`claude-opus-4-5-20251101`) - Claude Opus 4.1 (`claude-opus-4-1-20250805`) - Claude Opus 4 (`claude-opus-4-20250514`) ## Getting started To use the memory tool: 1. Include the beta header `context-management-2025-06-27` in your API requests 2. Add the memory tool to your request 3. Implement client-side handlers for memory operations To handle memory tool operations in your application, you need to implement handlers for each memory command. Our SDKs provide memory tool helpers that handle the tool interface—you can subclass `BetaAbstractMemoryTool` (Python) or use `betaMemoryTool` (TypeScript) to implement your own memory backend (file-based, database, cloud storage, encrypted files, etc.). For working examples, see: - Python: [examples/memory/basic.py](https://github.com/anthropics/anthropic-sdk-python/blob/main/examples/memory/basic.py) - TypeScript: [examples/tools-helpers-memory.ts](https://github.com/anthropics/anthropic-sdk-typescript/blob/main/examples/tools-helpers-memory.ts) ## Basic usage ```bash cURL curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --header "anthropic-beta: context-management-2025-06-27" \ --data '{ "model": "claude-sonnet-4-5", "max_tokens": 2048, "messages": [ { "role": "user", "content": "I'\''m working on a Python web scraper that keeps crashing with a timeout error. Here'\''s the problematic function:\n\n```python\ndef fetch_page(url, retries=3):\n for i in range(retries):\n try:\n response = requests.get(url, timeout=5)\n return response.text\n except requests.exceptions.Timeout:\n if i == retries - 1:\n raise\n time.sleep(1)\n```\n\nPlease help me debug this." } ], "tools": [{ "type": "memory_20250818", "name": "memory" }] }' ``` ```python Python import anthropic client = anthropic.Anthropic() message = client.beta.messages.create( model="claude-sonnet-4-5", max_tokens=2048, messages=[ { "role": "user", "content": "I'm working on a Python web scraper that keeps crashing with a timeout error. Here's the problematic function:\n\n```python\ndef fetch_page(url, retries=3):\n for i in range(retries):\n try:\n response = requests.get(url, timeout=5)\n return response.text\n except requests.exceptions.Timeout:\n if i == retries - 1:\n raise\n time.sleep(1)\n```\n\nPlease help me debug this." } ], tools=[{ "type": "memory_20250818", "name": "memory" }], betas=["context-management-2025-06-27"] ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY, }); const message = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", max_tokens: 2048, messages: [ { role: "user", content: "I'm working on a Python web scraper that keeps crashing with a timeout error. Here's the problematic function:\n\n```python\ndef fetch_page(url, retries=3):\n for i in range(retries):\n try:\n response = requests.get(url, timeout=5)\n return response.text\n except requests.exceptions.Timeout:\n if i == retries - 1:\n raise\n time.sleep(1)\n```\n\nPlease help me debug this." } ], tools: [{ type: "memory_20250818", name: "memory" }], betas: ["context-management-2025-06-27"] }); ``` ## Tool commands Your client-side implementation needs to handle these memory tool commands. While these specifications describe the recommended behaviors that Claude is most familiar with, you can modify your implementation and return strings as needed for your use case. ### view Shows directory contents or file contents with optional line ranges: ```json { "command": "view", "path": "/memories", "view_range": [1, 10] // Optional: view specific lines } ``` #### Return values **For directories:** Return a listing that shows files and directories with their sizes: ``` Here're the files and directories up to 2 levels deep in {path}, excluding hidden items and node_modules: {size} {path} {size} {path}/{filename1} {size} {path}/{filename2} ``` - Lists files up to 2 levels deep - Shows human-readable sizes (e.g., `5.5K`, `1.2M`) - Excludes hidden items (files starting with `.`) and `node_modules` - Uses tab character between size and path **For files:** Return file contents with a header and line numbers: ``` Here's the content of {path} with line numbers: {line_numbers}{tab}{content} ``` Line number formatting: - **Width**: 6 characters, right-aligned with space padding - **Separator**: Tab character between line number and content - **Indexing**: 1-indexed (first line is line 1) - **Line limit**: Files with more than 999,999 lines should return an error: `"File {path} exceeds maximum line limit of 999,999 lines."` **Example output:** ``` Here's the content of /memories/notes.txt with line numbers: 1 Hello World 2 This is line two 10 Line ten 100 Line one hundred ``` #### Error handling - **File/directory does not exist**: `"The path {path} does not exist. Please provide a valid path."` ### create Create a new file: ```json { "command": "create", "path": "/memories/notes.txt", "file_text": "Meeting notes:\n- Discussed project timeline\n- Next steps defined\n" } ``` #### Return values - **Success**: `"File created successfully at: {path}"` #### Error handling - **File already exists**: `"Error: File {path} already exists"` ### str_replace Replace text in a file: ```json { "command": "str_replace", "path": "/memories/preferences.txt", "old_str": "Favorite color: blue", "new_str": "Favorite color: green" } ``` #### Return values - **Success**: `"The memory file has been edited."` followed by a snippet of the edited file with line numbers #### Error handling - **File does not exist**: `"Error: The path {path} does not exist. Please provide a valid path."` - **Text not found**: ``"No replacement was performed, old_str `{old_str}` did not appear verbatim in {path}."`` - **Duplicate text**: When `old_str` appears multiple times, return: ``"No replacement was performed. Multiple occurrences of old_str `{old_str}` in lines: {line_numbers}. Please ensure it is unique"`` #### Directory handling If the path is a directory, return a "file does not exist" error. ### insert Insert text at a specific line: ```json { "command": "insert", "path": "/memories/todo.txt", "insert_line": 2, "insert_text": "- Review memory tool documentation\n" } ``` #### Return values - **Success**: `"The file {path} has been edited."` #### Error handling - **File does not exist**: `"Error: The path {path} does not exist"` - **Invalid line number**: ``"Error: Invalid `insert_line` parameter: {insert_line}. It should be within the range of lines of the file: [0, {n_lines}]"`` #### Directory handling If the path is a directory, return a "file does not exist" error. ### delete Delete a file or directory: ```json { "command": "delete", "path": "/memories/old_file.txt" } ``` #### Return values - **Success**: `"Successfully deleted {path}"` #### Error handling - **File/directory does not exist**: `"Error: The path {path} does not exist"` #### Directory handling Deletes the directory and all its contents recursively. ### rename Rename or move a file/directory: ```json { "command": "rename", "old_path": "/memories/draft.txt", "new_path": "/memories/final.txt" } ``` #### Return values - **Success**: `"Successfully renamed {old_path} to {new_path}"` #### Error handling - **Source does not exist**: `"Error: The path {old_path} does not exist"` - **Destination already exists**: Return an error (do not overwrite): `"Error: The destination {new_path} already exists"` #### Directory handling Renames the directory. ## Prompting guidance We automatically include this instruction to the system prompt when the memory tool is included: ``` IMPORTANT: ALWAYS VIEW YOUR MEMORY DIRECTORY BEFORE DOING ANYTHING ELSE. MEMORY PROTOCOL: 1. Use the `view` command of your `memory` tool to check for earlier progress. 2. ... (work on the task) ... - As you make progress, record status / progress / thoughts etc in your memory. ASSUME INTERRUPTION: Your context window might be reset at any moment, so you risk losing any progress that is not recorded in your memory directory. ``` If you observe Claude creating cluttered memory files, you can include this instruction: > Note: when editing your memory folder, always try to keep its content up-to-date, coherent and organized. You can rename or delete files that are no longer relevant. Do not create new files unless necessary. You can also guide what Claude writes to memory, e.g., "Only write down information relevant to \ in your memory system." ## Security considerations Here are important security concerns when implementing your memory store: ### Sensitive information Claude will usually refuse to write down sensitive information in memory files. However, you may want to implement stricter validation that strips out potentially sensitive information. ### File storage size Consider tracking memory file sizes and preventing files from growing too large. Consider adding a maximum number of characters the memory read command can return, and let Claude paginate through contents. ### Memory expiration Consider clearing out memory files periodically that haven't been accessed in an extended time. ### Path traversal protection Malicious path inputs could attempt to access files outside the `/memories` directory. Your implementation **MUST** validate all paths to prevent directory traversal attacks. Consider these safeguards: - Validate that all paths start with `/memories` - Resolve paths to their canonical form and verify they remain within the memory directory - Reject paths containing sequences like `../`, `..\\`, or other traversal patterns - Watch for URL-encoded traversal sequences (`%2e%2e%2f`) - Use your language's built-in path security utilities (e.g., Python's `pathlib.Path.resolve()` and `relative_to()`) ## Error handling The memory tool uses similar error handling patterns to the [text editor tool](/docs/en/agents-and-tools/tool-use/text-editor-tool#handle-errors). See the individual tool command sections above for detailed error messages and behaviors. Common errors include file not found, permission errors, invalid paths, and duplicate text matches. ## Using with Context Editing The memory tool can be combined with [context editing](/docs/en/build-with-claude/context-editing), which automatically clears old tool results when conversation context grows beyond a configured threshold. This combination enables long-running agentic workflows that would otherwise exceed context limits. ### How they work together When context editing is enabled and your conversation approaches the clearing threshold, Claude automatically receives a warning notification. This prompts Claude to preserve any important information from tool results into memory files before those results are cleared from the context window. After tool results are cleared, Claude can retrieve the stored information from memory files whenever needed, effectively treating memory as an extension of its working context. This allows Claude to: - Continue complex, multi-step workflows without losing critical information - Reference past work and decisions even after tool results are removed - Maintain coherent context across conversations that would exceed typical context limits - Build up a knowledge base over time while keeping the active context window manageable ### Example workflow Consider a code refactoring project with many file operations: 1. Claude makes numerous edits to files, generating many tool results 2. As the context grows and approaches your threshold, Claude receives a warning 3. Claude summarizes the changes made so far to a memory file (e.g., `/memories/refactoring_progress.xml`) 4. Context editing clears the older tool results automatically 5. Claude continues working, referencing the memory file when it needs to recall what changes were already completed 6. The workflow can continue indefinitely, with Claude managing both active context and persistent memory ### Configuration To use both features together: ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5", max_tokens=4096, messages=[...], tools=[ { "type": "memory_20250818", "name": "memory" }, # Your other tools ], betas=["context-management-2025-06-27"], context_management={ "edits": [ { "type": "clear_tool_uses_20250919", "trigger": { "type": "input_tokens", "value": 100000 }, "keep": { "type": "tool_uses", "value": 3 } } ] } ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY, }); const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", max_tokens: 4096, messages: [...], tools: [ { type: "memory_20250818", name: "memory" }, // Your other tools ], betas: ["context-management-2025-06-27"], context_management: { edits: [ { type: "clear_tool_uses_20250919", trigger: { type: "input_tokens", value: 100000 }, keep: { type: "tool_uses", value: 3 } } ] } }); ``` You can also exclude memory tool calls from being cleared to ensure Claude always has access to recent memory operations: ```python Python context_management={ "edits": [ { "type": "clear_tool_uses_20250919", "exclude_tools": ["memory"] } ] } ``` ```typescript TypeScript context_management: { edits: [ { type: "clear_tool_uses_20250919", exclude_tools: ["memory"] } ] } ``` --- # Programmatic tool calling URL: https://platform.claude.com/docs/en/agents-and-tools/tool-use/programmatic-tool-calling # Programmatic tool calling --- Programmatic tool calling allows Claude to write code that calls your tools programmatically within a [code execution](/docs/en/agents-and-tools/tool-use/code-execution-tool) container, rather than requiring round trips through the model for each tool invocation. This reduces latency for multi-tool workflows and decreases token consumption by allowing Claude to filter or process data before it reaches the model's context window. Programmatic tool calling is currently in public beta. To use this feature, add the `"advanced-tool-use-2025-11-20"` [beta header](/docs/en/api/beta-headers) to your API requests. This feature requires the code execution tool to be enabled. Please reach out through our [feedback form](https://forms.gle/MVGTnrHe73HpMiho8) to share your feedback on this feature. ## Model compatibility Programmatic tool calling is available on the following models: | Model | Tool Version | |-------|--------------| | Claude Opus 4.5 (`claude-opus-4-5-20251101`) | `code_execution_20250825` | | Claude Sonnet 4.5 (`claude-sonnet-4-5-20250929`) | `code_execution_20250825` | Programmatic tool calling is available via the Claude API and Microsoft Foundry. ## Quick start Here's a simple example where Claude programmatically queries a database multiple times and aggregates results: ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: advanced-tool-use-2025-11-20" \ --header "content-type: application/json" \ --data '{ "model": "claude-sonnet-4-5", "max_tokens": 4096, "messages": [ { "role": "user", "content": "Query sales data for the West, East, and Central regions, then tell me which region had the highest revenue" } ], "tools": [ { "type": "code_execution_20250825", "name": "code_execution" }, { "name": "query_database", "description": "Execute a SQL query against the sales database. Returns a list of rows as JSON objects.", "input_schema": { "type": "object", "properties": { "sql": { "type": "string", "description": "SQL query to execute" } }, "required": ["sql"] }, "allowed_callers": ["code_execution_20250825"] } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["advanced-tool-use-2025-11-20"], max_tokens=4096, messages=[{ "role": "user", "content": "Query sales data for the West, East, and Central regions, then tell me which region had the highest revenue" }], tools=[ { "type": "code_execution_20250825", "name": "code_execution" }, { "name": "query_database", "description": "Execute a SQL query against the sales database. Returns a list of rows as JSON objects.", "input_schema": { "type": "object", "properties": { "sql": { "type": "string", "description": "SQL query to execute" } }, "required": ["sql"] }, "allowed_callers": ["code_execution_20250825"] } ] ) print(response) ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); async function main() { const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", betas: ["advanced-tool-use-2025-11-20"], max_tokens: 4096, messages: [ { role: "user", content: "Query sales data for the West, East, and Central regions, then tell me which region had the highest revenue" } ], tools: [ { type: "code_execution_20250825", name: "code_execution" }, { name: "query_database", description: "Execute a SQL query against the sales database. Returns a list of rows as JSON objects.", input_schema: { type: "object", properties: { sql: { type: "string", description: "SQL query to execute" } }, required: ["sql"] }, allowed_callers: ["code_execution_20250825"] } ] }); console.log(response); } main().catch(console.error); ``` ## How programmatic tool calling works When you configure a tool to be callable from code execution and Claude decides to use that tool: 1. Claude writes Python code that invokes the tool as a function, potentially including multiple tool calls and pre/post-processing logic 2. Claude runs this code in a sandboxed container via code execution 3. When a tool function is called, code execution pauses and the API returns a `tool_use` block 4. You provide the tool result, and code execution continues (intermediate results are not loaded into Claude's context window) 5. Once all code execution completes, Claude receives the final output and continues working on the task This approach is particularly useful for: - **Large data processing**: Filter or aggregate tool results before they reach Claude's context - **Multi-step workflows**: Save tokens and latency by calling tools serially or in a loop without sampling Claude in-between tool calls - **Conditional logic**: Make decisions based on intermediate tool results Custom tools are converted to async Python functions to support parallel tool calling. When Claude writes code that calls your tools, it uses `await` (e.g., `result = await query_database("")`) and automatically includes the appropriate async wrapper function. The async wrapper is omitted from code examples in this documentation for clarity. ## Core concepts ### The `allowed_callers` field The `allowed_callers` field specifies which contexts can invoke a tool: ```json { "name": "query_database", "description": "Execute a SQL query against the database", "input_schema": {...}, "allowed_callers": ["code_execution_20250825"] } ``` **Possible values:** - `["direct"]` - Only Claude can call this tool directly (default if omitted) - `["code_execution_20250825"]` - Only callable from within code execution - `["direct", "code_execution_20250825"]` - Callable both directly and from code execution We recommend choosing either `["direct"]` or `["code_execution_20250825"]` for each tool rather than enabling both, as this provides clearer guidance to Claude for how best to use the tool. ### The `caller` field in responses Every tool use block includes a `caller` field indicating how it was invoked: **Direct invocation (traditional tool use):** ```json { "type": "tool_use", "id": "toolu_abc123", "name": "query_database", "input": {"sql": ""}, "caller": {"type": "direct"} } ``` **Programmatic invocation:** ```json { "type": "tool_use", "id": "toolu_xyz789", "name": "query_database", "input": {"sql": ""}, "caller": { "type": "code_execution_20250825", "tool_id": "srvtoolu_abc123" } } ``` The `tool_id` references the code execution tool that made the programmatic call. ### Container lifecycle Programmatic tool calling uses the same containers as code execution: - **Container creation**: A new container is created for each session unless you reuse an existing one - **Expiration**: Containers expire after approximately 4.5 minutes of inactivity (subject to change) - **Container ID**: Returned in responses via the `container` field - **Reuse**: Pass the container ID to maintain state across requests When a tool is called programmatically and the container is waiting for your tool result, you must respond before the container expires. Monitor the `expires_at` field. If the container expires, Claude may treat the tool call as timed out and retry it. ## Example workflow Here's how a complete programmatic tool calling flow works: ### Step 1: Initial request Send a request with code execution and a tool that allows programmatic calling. To enable programmatic calling, add the `allowed_callers` field to your tool definition. Provide detailed descriptions of your tool's output format in the tool description. If you specify that the tool returns JSON, Claude will attempt to deserialize and process the result in code. The more detail you provide about the output schema, the better Claude can handle the response programmatically. ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["advanced-tool-use-2025-11-20"], max_tokens=4096, messages=[{ "role": "user", "content": "Query customer purchase history from the last quarter and identify our top 5 customers by revenue" }], tools=[ { "type": "code_execution_20250825", "name": "code_execution" }, { "name": "query_database", "description": "Execute a SQL query against the sales database. Returns a list of rows as JSON objects.", "input_schema": {...}, "allowed_callers": ["code_execution_20250825"] } ] ) ``` ```typescript TypeScript const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", betas: ["advanced-tool-use-2025-11-20"], max_tokens: 4096, messages: [{ role: "user", content: "Query customer purchase history from the last quarter and identify our top 5 customers by revenue" }], tools: [ { type: "code_execution_20250825", name: "code_execution" }, { name: "query_database", description: "Execute a SQL query against the sales database. Returns a list of rows as JSON objects.", input_schema: {...}, allowed_callers: ["code_execution_20250825"] } ] }); ``` ### Step 2: API response with tool call Claude writes code that calls your tool. The API pauses and returns: ```json { "role": "assistant", "content": [ { "type": "text", "text": "I'll query the purchase history and analyze the results." }, { "type": "server_tool_use", "id": "srvtoolu_abc123", "name": "code_execution", "input": { "code": "results = await query_database('')\ntop_customers = sorted(results, key=lambda x: x['revenue'], reverse=True)[:5]\nprint(f'Top 5 customers: {top_customers}')" } }, { "type": "tool_use", "id": "toolu_def456", "name": "query_database", "input": {"sql": ""}, "caller": { "type": "code_execution_20250825", "tool_id": "srvtoolu_abc123" } } ], "container": { "id": "container_xyz789", "expires_at": "2025-01-15T14:30:00Z" }, "stop_reason": "tool_use" } ``` ### Step 3: Provide tool result Include the full conversation history plus your tool result: ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5", betas=["advanced-tool-use-2025-11-20"], max_tokens=4096, container="container_xyz789", # Reuse the container messages=[ {"role": "user", "content": "Query customer purchase history from the last quarter and identify our top 5 customers by revenue"}, { "role": "assistant", "content": [ {"type": "text", "text": "I'll query the purchase history and analyze the results."}, { "type": "server_tool_use", "id": "srvtoolu_abc123", "name": "code_execution", "input": {"code": "..."} }, { "type": "tool_use", "id": "toolu_def456", "name": "query_database", "input": {"sql": ""}, "caller": { "type": "code_execution_20250825", "tool_id": "srvtoolu_abc123" } } ] }, { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_def456", "content": "[{\"customer_id\": \"C1\", \"revenue\": 45000}, {\"customer_id\": \"C2\", \"revenue\": 38000}, ...]" } ] } ], tools=[...] ) ``` ```typescript TypeScript const response = await anthropic.beta.messages.create({ model: "claude-sonnet-4-5", betas: ["advanced-tool-use-2025-11-20"], max_tokens: 4096, container: "container_xyz789", // Reuse the container messages: [ { role: "user", content: "Query customer purchase history from the last quarter and identify our top 5 customers by revenue" }, { role: "assistant", content: [ { type: "text", text: "I'll query the purchase history and analyze the results." }, { type: "server_tool_use", id: "srvtoolu_abc123", name: "code_execution", input: { code: "..." } }, { type: "tool_use", id: "toolu_def456", name: "query_database", input: { sql: "" }, caller: { type: "code_execution_20250825", tool_id: "srvtoolu_abc123" } } ] }, { role: "user", content: [ { type: "tool_result", tool_use_id: "toolu_def456", content: "[{\"customer_id\": \"C1\", \"revenue\": 45000}, {\"customer_id\": \"C2\", \"revenue\": 38000}, ...]" } ] } ], tools: [...] }); ``` ### Step 4: Next tool call or completion The code execution continues and processes the results. If additional tool calls are needed, repeat Step 3 until all tool calls are satisfied. ### Step 5: Final response Once the code execution completes, Claude provides the final response: ```json { "content": [ { "type": "code_execution_tool_result", "tool_use_id": "srvtoolu_abc123", "content": { "type": "code_execution_result", "stdout": "Top 5 customers by revenue:\n1. Customer C1: $45,000\n2. Customer C2: $38,000\n3. Customer C5: $32,000\n4. Customer C8: $28,500\n5. Customer C3: $24,000", "stderr": "", "return_code": 0, "content": [] } }, { "type": "text", "text": "I've analyzed the purchase history from last quarter. Your top 5 customers generated $167,500 in total revenue, with Customer C1 leading at $45,000." } ], "stop_reason": "end_turn" } ``` ## Advanced patterns ### Batch processing with loops Claude can write code that processes multiple items efficiently: ```python # async wrapper omitted for clarity regions = ["West", "East", "Central", "North", "South"] results = {} for region in regions: data = await query_database(f"") results[region] = sum(row["revenue"] for row in data) # Process results programmatically top_region = max(results.items(), key=lambda x: x[1]) print(f"Top region: {top_region[0]} with ${top_region[1]:,} in revenue") ``` This pattern: - Reduces model round-trips from N (one per region) to 1 - Processes large result sets programmatically before returning to Claude - Saves tokens by only returning aggregated conclusions instead of raw data ### Early termination Claude can stop processing as soon as success criteria are met: ```python # async wrapper omitted for clarity endpoints = ["us-east", "eu-west", "apac"] for endpoint in endpoints: status = await check_health(endpoint) if status == "healthy": print(f"Found healthy endpoint: {endpoint}") break # Stop early, don't check remaining ``` ### Conditional tool selection ```python # async wrapper omitted for clarity file_info = await get_file_info(path) if file_info["size"] < 10000: content = await read_full_file(path) else: content = await read_file_summary(path) print(content) ``` ### Data filtering ```python # async wrapper omitted for clarity logs = await fetch_logs(server_id) errors = [log for log in logs if "ERROR" in log] print(f"Found {len(errors)} errors") for error in errors[-10:]: # Only return last 10 errors print(error) ``` ## Response format ### Programmatic tool call When code execution calls a tool: ```json { "type": "tool_use", "id": "toolu_abc123", "name": "query_database", "input": {"sql": ""}, "caller": { "type": "code_execution_20250825", "tool_id": "srvtoolu_xyz789" } } ``` ### Tool result handling Your tool result is passed back to the running code: ```json { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_abc123", "content": "[{\"customer_id\": \"C1\", \"revenue\": 45000, \"orders\": 23}, {\"customer_id\": \"C2\", \"revenue\": 38000, \"orders\": 18}, ...]" } ] } ``` ### Code execution completion When all tool calls are satisfied and code completes: ```json { "type": "code_execution_tool_result", "tool_use_id": "srvtoolu_xyz789", "content": { "type": "code_execution_result", "stdout": "Analysis complete. Top 5 customers identified from 847 total records.", "stderr": "", "return_code": 0, "content": [] } } ``` ## Error handling ### Common errors | Error | Description | Solution | |-------|-------------|----------| | `invalid_tool_input` | Tool input doesn't match schema | Validate your tool's input_schema | | `tool_not_allowed` | Tool doesn't allow the requested caller type | Check `allowed_callers` includes the right contexts | | `missing_beta_header` | PTC beta header not provided | Add both beta headers to your request | ### Container expiration during tool call If your tool takes too long to respond, the code execution will receive a `TimeoutError`. Claude sees this in stderr and will typically retry: ```json { "type": "code_execution_tool_result", "tool_use_id": "srvtoolu_abc123", "content": { "type": "code_execution_result", "stdout": "", "stderr": "TimeoutError: Calling tool ['query_database'] timed out.", "return_code": 0, "content": [] } } ``` To prevent timeouts: - Monitor the `expires_at` field in responses - Implement timeouts for your tool execution - Consider breaking long operations into smaller chunks ### Tool execution errors If your tool returns an error: ```python # Provide error information in the tool result { "type": "tool_result", "tool_use_id": "toolu_abc123", "content": "Error: Query timeout - table lock exceeded 30 seconds" } ``` Claude's code will receive this error and can handle it appropriately. ## Constraints and limitations ### Feature incompatibilities - **Structured outputs**: Tools with `strict: true` are not supported with programmatic calling - **Tool choice**: You cannot force programmatic calling of a specific tool via `tool_choice` - **Parallel tool use**: `disable_parallel_tool_use: true` is not supported with programmatic calling ### Tool restrictions The following tools cannot currently be called programmatically, but support may be added in future releases: - Web search - Web fetch - Tools provided by an [MCP connector](/docs/en/agents-and-tools/mcp-connector) ### Message formatting restrictions When responding to programmatic tool calls, there are strict formatting requirements: **Tool result only responses**: If there are pending programmatic tool calls waiting for results, your response message must contain **only** `tool_result` blocks. You cannot include any text content, even after the tool results. ```json // ❌ INVALID - Cannot include text when responding to programmatic tool calls { "role": "user", "content": [ {"type": "tool_result", "tool_use_id": "toolu_01", "content": "[{\"customer_id\": \"C1\", \"revenue\": 45000}]"}, {"type": "text", "text": "What should I do next?"} // This will cause an error ] } // ✅ VALID - Only tool results when responding to programmatic tool calls { "role": "user", "content": [ {"type": "tool_result", "tool_use_id": "toolu_01", "content": "[{\"customer_id\": \"C1\", \"revenue\": 45000}]"} ] } ``` This restriction only applies when responding to programmatic (code execution) tool calls. For regular client-side tool calls, you can include text content after tool results. ### Rate limits Programmatic tool calls are subject to the same rate limits as regular tool calls. Each tool call from code execution counts as a separate invocation. ### Validate tool results before use When implementing custom tools that will be called programmatically: - **Tool results are returned as strings**: They can contain any content, including code snippets or executable commands that may be processed by the execution environment. - **Validate external tool results**: If your tool returns data from external sources or accepts user input, be aware of code injection risks if the output will be interpreted or executed as code. ## Token efficiency Programmatic tool calling can significantly reduce token consumption: - **Tool results from programmatic calls are not added to Claude's context** - only the final code output is - **Intermediate processing happens in code** - filtering, aggregation, etc. don't consume model tokens - **Multiple tool calls in one code execution** - reduces overhead compared to separate model turns For example, calling 10 tools directly uses ~10x the tokens of calling them programmatically and returning a summary. ## Usage and pricing Programmatic tool calling uses the same pricing as code execution. See the [code execution pricing](/docs/en/agents-and-tools/tool-use/code-execution-tool#usage-and-pricing) for details. Token counting for programmatic tool calls: Tool results from programmatic invocations do not count toward your input/output token usage. Only the final code execution result and Claude's response count. ## Best practices ### Tool design - **Provide detailed output descriptions**: Since Claude deserializes tool results in code, clearly document the format (JSON structure, field types, etc.) - **Return structured data**: JSON or other easily parseable formats work best for programmatic processing - **Keep responses concise**: Return only necessary data to minimize processing overhead ### When to use programmatic calling **Good use cases:** - Processing large datasets where you only need aggregates or summaries - Multi-step workflows with 3+ dependent tool calls - Operations requiring filtering, sorting, or transformation of tool results - Tasks where intermediate data shouldn't influence Claude's reasoning - Parallel operations across many items (e.g., checking 50 endpoints) **Less ideal use cases:** - Single tool calls with simple responses - Tools that need immediate user feedback - Very fast operations where code execution overhead would outweigh the benefit ### Performance optimization - **Reuse containers** when making multiple related requests to maintain state - **Batch similar operations** in a single code execution when possible ## Troubleshooting ### Common issues **"Tool not allowed" error** - Verify your tool definition includes `"allowed_callers": ["code_execution_20250825"]` - Check that you're using the correct beta headers **Container expiration** - Ensure you respond to tool calls within the container's lifetime (~4.5 minutes) - Monitor the `expires_at` field in responses - Consider implementing faster tool execution **Beta header issues** - You need the header: `"advanced-tool-use-2025-11-20"` **Tool result not parsed correctly** - Ensure your tool returns string data that Claude can deserialize - Provide clear output format documentation in your tool description ### Debugging tips 1. **Log all tool calls and results** to track the flow 2. **Check the `caller` field** to confirm programmatic invocation 3. **Monitor container IDs** to ensure proper reuse 4. **Test tools independently** before enabling programmatic calling ## Why programmatic tool calling works Claude's training includes extensive exposure to code, making it effective at reasoning through and chaining function calls. When tools are presented as callable functions within a code execution environment, Claude can leverage this strength to: - **Reason naturally about tool composition**: Chain operations and handle dependencies as naturally as writing any Python code - **Process large results efficiently**: Filter down large tool outputs, extract only relevant data, or write intermediate results to files before returning summaries to the context window - **Reduce latency significantly**: Eliminate the overhead of re-sampling Claude between each tool call in multi-step workflows This approach enables workflows that would be impractical with traditional tool use—such as processing files over 1M tokens—by allowing Claude to work with data programmatically rather than loading everything into the conversation context. ## Alternative implementations Programmatic tool calling is a generalizable pattern that can be implemented outside of Anthropic's managed code execution. Here's an overview of the approaches: ### Client-side direct execution Provide Claude with a code execution tool and describe what functions are available in that environment. When Claude invokes the tool with code, your application executes it locally where those functions are defined. **Advantages:** - Simple to implement with minimal re-architecting - Full control over the environment and instructions **Disadvantages:** - Executes untrusted code outside of a sandbox - Tool invocations can be vectors for code injection **Use when:** Your application can safely execute arbitrary code, you want a simple solution, and Anthropic's managed offering doesn't fit your needs. ### Self-managed sandboxed execution Same approach from Claude's perspective, but code runs in a sandboxed container with security restrictions (e.g., no network egress). If your tools require external resources, you'll need a protocol for executing tool calls outside the sandbox. **Advantages:** - Safe programmatic tool calling on your own infrastructure - Full control over the execution environment **Disadvantages:** - Complex to build and maintain - Requires managing both infrastructure and inter-process communication **Use when:** Security is critical and Anthropic's managed solution doesn't fit your requirements. ### Anthropic-managed execution Anthropic's programmatic tool calling is a managed version of sandboxed execution with an opinionated Python environment tuned for Claude. Anthropic handles container management, code execution, and secure tool invocation communication. **Advantages:** - Safe and secure by default - Easy to enable with minimal configuration - Environment and instructions optimized for Claude We recommend using Anthropic's managed solution if you're using the Claude API. ## Related features Learn about the underlying code execution capability that powers programmatic tool calling. Understand the fundamentals of tool use with Claude. Step-by-step guide for implementing tools. --- # Text editor tool URL: https://platform.claude.com/docs/en/agents-and-tools/tool-use/text-editor-tool # Text editor tool --- Claude can use an Anthropic-defined text editor tool to view and modify text files, helping you debug, fix, and improve your code or other text documents. This allows Claude to directly interact with your files, providing hands-on assistance rather than just suggesting changes. ## Model compatibility | Model | Tool Version | |-------|--------------| | Claude 4.x models | `text_editor_20250728` | | Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) | `text_editor_20250124` | The `text_editor_20250728` tool for Claude 4 models does not include the `undo_edit` command. If you require this functionality, you'll need to use Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)). Older tool versions are not guaranteed to be backwards-compatible with newer models. Always use the tool version that corresponds to your model version. ## When to use the text editor tool Some examples of when to use the text editor tool are: - **Code debugging**: Have Claude identify and fix bugs in your code, from syntax errors to logic issues. - **Code refactoring**: Let Claude improve your code structure, readability, and performance through targeted edits. - **Documentation generation**: Ask Claude to add docstrings, comments, or README files to your codebase. - **Test creation**: Have Claude create unit tests for your code based on its understanding of the implementation. ## Use the text editor tool Provide the text editor tool (named `str_replace_based_edit_tool`) to Claude using the Messages API. You can optionally specify a `max_characters` parameter to control truncation when viewing large files. `max_characters` is only compatible with `text_editor_20250728` and later versions of the text editor tool. ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "tools": [ { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool", "max_characters": 10000 } ], "messages": [ { "role": "user", "content": "There'\''s a syntax error in my primes.py file. Can you help me fix it?" } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool", "max_characters": 10000 } ], messages=[ { "role": "user", "content": "There's a syntax error in my primes.py file. Can you help me fix it?" } ] ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const response = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, tools: [ { type: "text_editor_20250728", name: "str_replace_based_edit_tool", max_characters: 10000 } ], messages: [ { role: "user", content: "There's a syntax error in my primes.py file. Can you help me fix it?" } ] }); ``` ```java Java import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.ToolStrReplaceBasedEditTool20250728; public class TextEditorToolExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); ToolStrReplaceBasedEditTool20250728 editorTool = ToolStrReplaceBasedEditTool20250728.builder() .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_SONNET_4_0) .maxTokens(1024) .addTool(editorTool) .addUserMessage("There's a syntax error in my primes.py file. Can you help me fix it?") .build(); Message message = client.messages().create(params); } } ``` Provide the text editor tool (named `str_replace_editor`) to Claude using the Messages API: ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-3-7-sonnet-20250219", "max_tokens": 1024, "tools": [ { "type": "text_editor_20250124", "name": "str_replace_editor" } ], "messages": [ { "role": "user", "content": "There'\''s a syntax error in my primes.py file. Can you help me fix it?" } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-3-7-sonnet-20250219", max_tokens=1024, tools=[ { "type": "text_editor_20250124", "name": "str_replace_editor" } ], messages=[ { "role": "user", "content": "There's a syntax error in my primes.py file. Can you help me fix it?" } ] ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const response = await anthropic.messages.create({ model: "claude-3-7-sonnet-20250219", max_tokens: 1024, tools: [ { type: "text_editor_20250124", name: "str_replace_editor" } ], messages: [ { role: "user", content: "There's a syntax error in my primes.py file. Can you help me fix it?" } ] }); ``` ```java Java import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.ToolTextEditor20250124; public class TextEditorToolExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); ToolTextEditor20250124 editorTool = ToolTextEditor20250124.builder() .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_3_7_SONNET_LATEST) .maxTokens(1024) .addTool(editorTool) .addUserMessage("There's a syntax error in my primes.py file. Can you help me fix it?") .build(); Message message = client.messages().create(params); } } ``` The text editor tool can be used in the following way: - Include the text editor tool in your API request - Provide a user prompt that may require examining or modifying files, such as "Can you fix the syntax error in my code?" - Claude assesses what it needs to look at and uses the `view` command to examine file contents or list directory contents - The API response will contain a `tool_use` content block with the `view` command - Extract the file or directory path from Claude's tool use request - Read the file's contents or list the directory contents - If a `max_characters` parameter was specified in the tool configuration, truncate the file contents to that length - Return the results to Claude by continuing the conversation with a new `user` message containing a `tool_result` content block - After examining the file or directory, Claude may use a command such as `str_replace` to make changes or `insert` to add text at a specific line number. - If Claude uses the `str_replace` command, Claude constructs a properly formatted tool use request with the old text and new text to replace it with - Extract the file path, old text, and new text from Claude's tool use request - Perform the text replacement in the file - Return the results to Claude - After examining and possibly editing the files, Claude provides a complete explanation of what it found and what changes it made ### Text editor tool commands The text editor tool supports several commands for viewing and modifying files: #### view The `view` command allows Claude to examine the contents of a file or list the contents of a directory. It can read the entire file or a specific range of lines. Parameters: - `command`: Must be "view" - `path`: The path to the file or directory to view - `view_range` (optional): An array of two integers specifying the start and end line numbers to view. Line numbers are 1-indexed, and -1 for the end line means read to the end of the file. This parameter only applies when viewing files, not directories.
```json // Example for viewing a file { "type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9", "name": "str_replace_editor", "input": { "command": "view", "path": "primes.py" } } // Example for viewing a directory { "type": "tool_use", "id": "toolu_02B19r91rw91mr917835mr9", "name": "str_replace_editor", "input": { "command": "view", "path": "src/" } } ```
#### str_replace The `str_replace` command allows Claude to replace a specific string in a file with a new string. This is used for making precise edits. Parameters: - `command`: Must be "str_replace" - `path`: The path to the file to modify - `old_str`: The text to replace (must match exactly, including whitespace and indentation) - `new_str`: The new text to insert in place of the old text
```json { "type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9", "name": "str_replace_editor", "input": { "command": "str_replace", "path": "primes.py", "old_str": "for num in range(2, limit + 1)", "new_str": "for num in range(2, limit + 1):" } } ```
#### create The `create` command allows Claude to create a new file with specified content. Parameters: - `command`: Must be "create" - `path`: The path where the new file should be created - `file_text`: The content to write to the new file
```json { "type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9", "name": "str_replace_editor", "input": { "command": "create", "path": "test_primes.py", "file_text": "import unittest\nimport primes\n\nclass TestPrimes(unittest.TestCase):\n def test_is_prime(self):\n self.assertTrue(primes.is_prime(2))\n self.assertTrue(primes.is_prime(3))\n self.assertFalse(primes.is_prime(4))\n\nif __name__ == '__main__':\n unittest.main()" } } ```
#### insert The `insert` command allows Claude to insert text at a specific location in a file. Parameters: - `command`: Must be "insert" - `path`: The path to the file to modify - `insert_line`: The line number after which to insert the text (0 for beginning of file) - `new_str`: The text to insert
```json { "type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9", "name": "str_replace_editor", "input": { "command": "insert", "path": "primes.py", "insert_line": 0, "new_str": "\"\"\"Module for working with prime numbers.\n\nThis module provides functions to check if a number is prime\nand to generate a list of prime numbers up to a given limit.\n\"\"\"\n" } } ```
#### undo_edit The `undo_edit` command allows Claude to revert the last edit made to a file. This command is only available in Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)). It is not supported in Claude 4 models using the `text_editor_20250728`. Parameters: - `command`: Must be "undo_edit" - `path`: The path to the file whose last edit should be undone
```json { "type": "tool_use", "id": "toolu_01A09q90qw90lq917835lq9", "name": "str_replace_editor", "input": { "command": "undo_edit", "path": "primes.py" } } ```
### Example: Fixing a syntax error with the text editor tool This example demonstrates how Claude 4 models use the text editor tool to fix a syntax error in a Python file. First, your application provides Claude with the text editor tool and a prompt to fix a syntax error: ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "tools": [ { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool" } ], "messages": [ { "role": "user", "content": "There'\''s a syntax error in my primes.py file. Can you help me fix it?" } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool" } ], messages=[ { "role": "user", "content": "There's a syntax error in my primes.py file. Can you help me fix it?" } ] ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const response = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, tools: [ { type: "text_editor_20250728", name: "str_replace_based_edit_tool" } ], messages: [ { role: "user", content: "There's a syntax error in my primes.py file. Can you help me fix it?" } ] }); ``` ```java Java import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.ToolStrReplaceBasedEditTool20250728; public class TextEditorToolExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); ToolStrReplaceBasedEditTool20250728 editorTool = ToolStrReplaceBasedEditTool20250728.builder() .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_SONNET_4_0) .maxTokens(1024) .addTool(editorTool) .addUserMessage("There's a syntax error in my primes.py file. Can you help me fix it?") .build(); Message message = client.messages().create(params); } } ``` Claude will use the text editor tool first to view the file: ```json { "id": "msg_01XAbCDeFgHiJkLmNoPQrStU", "model": "claude-sonnet-4-5", "stop_reason": "tool_use", "role": "assistant", "content": [ { "type": "text", "text": "I'll help you fix the syntax error in your primes.py file. First, let me take a look at the file to identify the issue." }, { "type": "tool_use", "id": "toolu_01AbCdEfGhIjKlMnOpQrStU", "name": "str_replace_based_edit_tool", "input": { "command": "view", "path": "primes.py" } } ] } ``` Your application should then read the file and return its contents to Claude: ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "content-type: application/json" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "tools": [ { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool" } ], "messages": [ { "role": "user", "content": "There'\''s a syntax error in my primes.py file. Can you help me fix it?" }, { "role": "assistant", "content": [ { "type": "text", "text": "I'\''ll help you fix the syntax error in your primes.py file. First, let me take a look at the file to identify the issue." }, { "type": "tool_use", "id": "toolu_01AbCdEfGhIjKlMnOpQrStU", "name": "str_replace_based_edit_tool", "input": { "command": "view", "path": "primes.py" } } ] }, { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01AbCdEfGhIjKlMnOpQrStU", "content": "1: def is_prime(n):\n2: \"\"\"Check if a number is prime.\"\"\"\n3: if n <= 1:\n4: return False\n5: if n <= 3:\n6: return True\n7: if n % 2 == 0 or n % 3 == 0:\n8: return False\n9: i = 5\n10: while i * i <= n:\n11: if n % i == 0 or n % (i + 2) == 0:\n12: return False\n13: i += 6\n14: return True\n15: \n16: def get_primes(limit):\n17: \"\"\"Generate a list of prime numbers up to the given limit.\"\"\"\n18: primes = []\n19: for num in range(2, limit + 1)\n20: if is_prime(num):\n21: primes.append(num)\n22: return primes\n23: \n24: def main():\n25: \"\"\"Main function to demonstrate prime number generation.\"\"\"\n26: limit = 100\n27: prime_list = get_primes(limit)\n28: print(f\"Prime numbers up to {limit}:\")\n29: print(prime_list)\n30: print(f\"Found {len(prime_list)} prime numbers.\")\n31: \n32: if __name__ == \"__main__\":\n33: main()" } ] } ] }' ``` ```python Python response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool" } ], messages=[ { "role": "user", "content": "There's a syntax error in my primes.py file. Can you help me fix it?" }, { "role": "assistant", "content": [ { "type": "text", "text": "I'll help you fix the syntax error in your primes.py file. First, let me take a look at the file to identify the issue." }, { "type": "tool_use", "id": "toolu_01AbCdEfGhIjKlMnOpQrStU", "name": "str_replace_based_edit_tool", "input": { "command": "view", "path": "primes.py" } } ] }, { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01AbCdEfGhIjKlMnOpQrStU", "content": "1: def is_prime(n):\n2: \"\"\"Check if a number is prime.\"\"\"\n3: if n <= 1:\n4: return False\n5: if n <= 3:\n6: return True\n7: if n % 2 == 0 or n % 3 == 0:\n8: return False\n9: i = 5\n10: while i * i <= n:\n11: if n % i == 0 or n % (i + 2) == 0:\n12: return False\n13: i += 6\n14: return True\n15: \n16: def get_primes(limit):\n17: \"\"\"Generate a list of prime numbers up to the given limit.\"\"\"\n18: primes = []\n19: for num in range(2, limit + 1)\n20: if is_prime(num):\n21: primes.append(num)\n22: return primes\n23: \n24: def main():\n25: \"\"\"Main function to demonstrate prime number generation.\"\"\"\n26: limit = 100\n27: prime_list = get_primes(limit)\n28: print(f\"Prime numbers up to {limit}:\")\n29: print(prime_list)\n30: print(f\"Found {len(prime_list)} prime numbers.\")\n31: \n32: if __name__ == \"__main__\":\n33: main()" } ] } ] ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); const response = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, tools: [ { type: "text_editor_20250728", name: "str_replace_based_edit_tool" } ], messages: [ { role: "user", content: "There's a syntax error in my primes.py file. Can you help me fix it?" }, { role: "assistant", content: [ { type: "text", text: "I'll help you fix the syntax error in your primes.py file. First, let me take a look at the file to identify the issue." }, { type: "tool_use", id: "toolu_01AbCdEfGhIjKlMnOpQrStU", name: "str_replace_based_edit_tool", input: { command: "view", path: "primes.py" } } ] }, { role: "user", content: [ { type: "tool_result", tool_use_id: "toolu_01AbCdEfGhIjKlMnOpQrStU", content: "1: def is_prime(n):\n2: \"\"\"Check if a number is prime.\"\"\"\n3: if n <= 1:\n4: return False\n5: if n <= 3:\n6: return True\n7: if n % 2 == 0 or n % 3 == 0:\n8: return False\n9: i = 5\n10: while i * i <= n:\n11: if n % i == 0 or n % (i + 2) == 0:\n12: return False\n13: i += 6\n14: return True\n15: \n16: def get_primes(limit):\n17: \"\"\"Generate a list of prime numbers up to the given limit.\"\"\"\n18: primes = []\n19: for num in range(2, limit + 1)\n20: if is_prime(num):\n21: primes.append(num)\n22: return primes\n23: \n24: def main():\n25: \"\"\"Main function to demonstrate prime number generation.\"\"\"\n26: limit = 100\n27: prime_list = get_primes(limit)\n28: print(f\"Prime numbers up to {limit}:\")\n29: print(prime_list)\n30: print(f\"Found {len(prime_list)} prime numbers.\")\n31: \n32: if __name__ == \"__main__\":\n33: main()" } ] } ] }); ``` ```java Java import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.ToolStrReplaceBasedEditTool20250728; public class TextEditorToolExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); ToolStrReplaceBasedEditTool20250728 editorTool = ToolStrReplaceBasedEditTool20250728.builder() .build(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_SONNET_4_0) .maxTokens(1024) .addTool(editorTool) .addUserMessage("There's a syntax error in my primes.py file. Can you help me fix it?") .build(); Message message = client.messages().create(params); System.out.println(message); } } ``` **Line numbers** In the example above, the `view` tool result includes file contents with line numbers prepended to each line (e.g., "1: def is_prime(n):"). Line numbers are not required, but they are essential for successfully using the `view_range` parameter to examine specific sections of files and the `insert_line` parameter to add content at precise locations. Claude will identify the syntax error and use the `str_replace` command to fix it: ```json { "id": "msg_01VwXyZAbCdEfGhIjKlMnO", "model": "claude-sonnet-4-5", "stop_reason": "tool_use", "role": "assistant", "content": [ { "type": "text", "text": "I found the syntax error in your primes.py file. In the `get_primes` function, there is a missing colon (:) at the end of the for loop line. Let me fix that for you." }, { "type": "tool_use", "id": "toolu_01PqRsTuVwXyZAbCdEfGh", "name": "str_replace_based_edit_tool", "input": { "command": "str_replace", "path": "primes.py", "old_str": " for num in range(2, limit + 1)", "new_str": " for num in range(2, limit + 1):" } } ] } ``` Your application should then make the edit and return the result: ```python Python response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, tools=[ { "type": "text_editor_20250728", "name": "str_replace_based_edit_tool" } ], messages=[ # Previous messages... { "role": "assistant", "content": [ { "type": "text", "text": "I found the syntax error in your primes.py file. In the `get_primes` function, there is a missing colon (:) at the end of the for loop line. Let me fix that for you." }, { "type": "tool_use", "id": "toolu_01PqRsTuVwXyZAbCdEfGh", "name": "str_replace_based_edit_tool", "input": { "command": "str_replace", "path": "primes.py", "old_str": " for num in range(2, limit + 1)", "new_str": " for num in range(2, limit + 1):" } } ] }, { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01PqRsTuVwXyZAbCdEfGh", "content": "Successfully replaced text at exactly one location." } ] } ] ) ``` ```typescript TypeScript const response = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, tools: [ { type: "text_editor_20250728", name: "str_replace_based_edit_tool" } ], messages: [ // Previous messages... { role: "assistant", content: [ { type: "text", text: "I found the syntax error in your primes.py file. In the `get_primes` function, there is a missing colon (:) at the end of the for loop line. Let me fix that for you." }, { type: "tool_use", id: "toolu_01PqRsTuVwXyZAbCdEfGh", name: "str_replace_based_edit_tool", input: { command: "str_replace", path: "primes.py", old_str: " for num in range(2, limit + 1)", new_str: " for num in range(2, limit + 1):" } } ] }, { role: "user", content: [ { type: "tool_result", tool_use_id: "toolu_01PqRsTuVwXyZAbCdEfGh", content: "Successfully replaced text at exactly one location." } ] } ] }); ``` ```java Java import java.util.List; import java.util.Map; import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.core.JsonValue; import com.anthropic.models.messages.ContentBlockParam; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.MessageParam; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.TextBlockParam; import com.anthropic.models.messages.ToolResultBlockParam; import com.anthropic.models.messages.ToolStrReplaceBasedEditTool20250728; import com.anthropic.models.messages.ToolUseBlockParam; public class TextEditorConversationExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_SONNET_4_0) .maxTokens(1024) .addTool(ToolStrReplaceBasedEditTool20250728.builder() .build()) // Previous messages would go here .addAssistantMessageOfBlockParams( List.of( ContentBlockParam.ofText(TextBlockParam.builder() .text("I found the syntax error in your primes.py file. In the `get_primes` function, there is a missing colon (:) at the end of the for loop line. Let me fix that for you.") .build()), ContentBlockParam.ofToolUse(ToolUseBlockParam.builder() .id("toolu_01PqRsTuVwXyZAbCdEfGh") .name("str_replace_based_edit_tool") .input(JsonValue.from(Map.of( "command", "str_replace", "path", "primes.py", "old_str", " for num in range(2, limit + 1)", "new_str", " for num in range(2, limit + 1):" ))) .build() ) ) ) .addUserMessageOfBlockParams(List.of( ContentBlockParam.ofToolResult(ToolResultBlockParam.builder() .toolUseId("toolu_01PqRsTuVwXyZAbCdEfGh") .content("Successfully replaced text at exactly one location.") .build() ) )) .build(); Message message = client.messages().create(params); System.out.println(message); } } ``` Finally, Claude will provide a complete explanation of the fix: ```json { "id": "msg_01IjKlMnOpQrStUvWxYzAb", "model": "claude-sonnet-4-5", "stop_reason": "end_turn", "role": "assistant", "content": [ { "type": "text", "text": "I've fixed the syntax error in your primes.py file. The issue was in the `get_primes` function at line 19. There was a missing colon (:) at the end of the for loop line.\n\nHere's what I changed:\n\nFrom:\n```python\nfor num in range(2, limit + 1)\n```\n\nTo:\n```python\nfor num in range(2, limit + 1):\n```\n\nIn Python, a colon is required at the end of statements that start a new code block, such as for loops, if statements, function definitions, etc. Without the colon, Python raises a syntax error because it doesn't know where the code block begins.\n\nYour code should now run correctly. The function will properly generate a list of prime numbers up to the specified limit." } ] } ``` This example demonstrates how Claude Sonnet 3.7 uses the text editor tool to fix a syntax error in a Python file. First, your application provides Claude with the text editor tool and a prompt to fix a syntax error: ```python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-3-7-sonnet-20250219", max_tokens=1024, tools=[ { "type": "text_editor_20250124", "name": "str_replace_editor" } ], messages=[ { "role": "user", "content": "There's a syntax error in my primes.py file. Can you help me fix it?" } ] ) print(response) ``` The Claude Sonnet 3.7 examples follow the same format as the Claude 4 examples above, using the same tool calls and responses but with the `text_editor_20250124` tool type and `str_replace_editor` name. *** ## Implement the text editor tool The text editor tool is implemented as a schema-less tool. When using this tool, you don't need to provide an input schema as with other tools; the schema is built into Claude's model and can't be modified. The tool type depends on the model version: - **Claude 4**: `type: "text_editor_20250728"` - **Claude Sonnet 3.7**: `type: "text_editor_20250124"` Create helper functions to handle file operations like reading, writing, and modifying files. Consider implementing backup functionality to recover from mistakes. Create a function that processes tool calls from Claude based on the command type: ```python def handle_editor_tool(tool_call, model_version): input_params = tool_call.input command = input_params.get('command', '') file_path = input_params.get('path', '') if command == 'view': # Read and return file contents pass elif command == 'str_replace': # Replace text in file pass elif command == 'create': # Create new file pass elif command == 'insert': # Insert text at location pass elif command == 'undo_edit': # Check if it's a Claude 4 model if 'str_replace_based_edit_tool' in model_version: return {"error": "undo_edit command is not supported in Claude 4"} # Restore from backup for Claude 3.7 pass ``` Add validation and security checks: - Validate file paths to prevent directory traversal - Create backups before making changes - Handle errors gracefully - Implement permissions checks Extract and handle tool calls from Claude's responses: ```python # Process tool use in Claude's response for content in response.content: if content.type == "tool_use": # Execute the tool based on command result = handle_editor_tool(content) # Return result to Claude tool_result = { "type": "tool_result", "tool_use_id": content.id, "content": result } ``` When implementing the text editor tool, keep in mind: 1. **Security**: The tool has access to your local filesystem, so implement proper security measures. 2. **Backup**: Always create backups before allowing edits to important files. 3. **Validation**: Validate all inputs to prevent unintended changes. 4. **Unique matching**: Make sure replacements match exactly one location to avoid unintended edits. ### Handle errors When using the text editor tool, various errors may occur. Here is guidance on how to handle them:
If Claude tries to view or modify a file that doesn't exist, return an appropriate error message in the `tool_result`: ```json { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "Error: File not found", "is_error": true } ] } ```
If Claude's `str_replace` command matches multiple locations in the file, return an appropriate error message: ```json { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "Error: Found 3 matches for replacement text. Please provide more context to make a unique match.", "is_error": true } ] } ```
If Claude's `str_replace` command doesn't match any text in the file, return an appropriate error message: ```json { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "Error: No match found for replacement. Please check your text and try again.", "is_error": true } ] } ```
If there are permission issues with creating, reading, or modifying files, return an appropriate error message: ```json { "role": "user", "content": [ { "type": "tool_result", "tool_use_id": "toolu_01A09q90qw90lq917835lq9", "content": "Error: Permission denied. Cannot write to file.", "is_error": true } ] } ```
### Follow implementation best practices
When asking Claude to fix or modify code, be specific about what files need to be examined or what issues need to be addressed. Clear context helps Claude identify the right files and make appropriate changes. **Less helpful prompt**: "Can you fix my code?" **Better prompt**: "There's a syntax error in my primes.py file that prevents it from running. Can you fix it?"
Specify file paths clearly when needed, especially if you're working with multiple files or files in different directories. **Less helpful prompt**: "Review my helper file" **Better prompt**: "Can you check my utils/helpers.py file for any performance issues?"
Implement a backup system in your application that creates copies of files before allowing Claude to edit them, especially for important or production code. ```python def backup_file(file_path): """Create a backup of a file before editing.""" backup_path = f"{file_path}.backup" if os.path.exists(file_path): with open(file_path, 'r') as src, open(backup_path, 'w') as dst: dst.write(src.read()) ```
The `str_replace` command requires an exact match for the text to be replaced. Your application should ensure that there is exactly one match for the old text or provide appropriate error messages. ```python def safe_replace(file_path, old_text, new_text): """Replace text only if there's exactly one match.""" with open(file_path, 'r') as f: content = f.read() count = content.count(old_text) if count == 0: return "Error: No match found" elif count > 1: return f"Error: Found {count} matches" else: new_content = content.replace(old_text, new_text) with open(file_path, 'w') as f: f.write(new_content) return "Successfully replaced text" ```
After Claude makes changes to a file, verify the changes by running tests or checking that the code still works as expected. ```python def verify_changes(file_path): """Run tests or checks after making changes.""" try: # For Python files, check for syntax errors if file_path.endswith('.py'): import ast with open(file_path, 'r') as f: ast.parse(f.read()) return "Syntax check passed" except Exception as e: return f"Verification failed: {str(e)}" ```
--- ## Pricing and token usage The text editor tool uses the same pricing structure as other tools used with Claude. It follows the standard input and output token pricing based on the Claude model you're using. In addition to the base tokens, the following additional input tokens are needed for the text editor tool: | Tool | Additional input tokens | | ----------------------------------------- | --------------------------------------- | | `text_editor_20250429` (Claude 4.x) | 700 tokens | | `text_editor_20250124` (Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations))) | 700 tokens | For more detailed information about tool pricing, see [Tool use pricing](/docs/en/agents-and-tools/tool-use/overview#pricing). ## Integrate the text editor tool with other tools The text editor tool can be used alongside other Claude tools. When combining tools, ensure you: - Match the tool version with the model you're using - Account for the additional token usage for all tools included in your request ## Change log | Date | Version | Changes | | ---- | ------- | ------- | | July 28, 2025 | `text_editor_20250728` | Release of an updated text editor Tool that fixes some issues and adds an optional `max_characters` parameter. It is otherwise identical to `text_editor_20250429`. | | April 29, 2025 | `text_editor_20250429` | Release of the text editor Tool for Claude 4. This version removes the `undo_edit` command but maintains all other capabilities. The tool name has been updated to reflect its str_replace-based architecture. | | March 13, 2025 | `text_editor_20250124` | Introduction of standalone text editor Tool documentation. This version is optimized for Claude Sonnet 3.7 but has identical capabilities to the previous version. | | October 22, 2024 | `text_editor_20241022` | Initial release of the text editor Tool with Claude Sonnet 3.5 ([retired](/docs/en/about-claude/model-deprecations)). Provides capabilities for viewing, creating, and editing files through the `view`, `create`, `str_replace`, `insert`, and `undo_edit` commands. | ## Next steps Here are some ideas for how to use the text editor tool in more convenient and powerful ways: - **Integrate with your development workflow**: Build the text editor tool into your development tools or IDE - **Create a code review system**: Have Claude review your code and make improvements - **Build a debugging assistant**: Create a system where Claude can help you diagnose and fix issues in your code - **Implement file format conversion**: Let Claude help you convert files from one format to another - **Automate documentation**: Set up workflows for Claude to automatically document your code As you build applications with the text editor tool, we're excited to see how you leverage Claude's capabilities to enhance your development workflow and productivity. Learn how to implement tool workflows for use with Claude. Execute shell commands with Claude. --- # Tool search tool URL: https://platform.claude.com/docs/en/agents-and-tools/tool-use/tool-search-tool # Tool search tool --- The tool search tool enables Claude to work with hundreds or thousands of tools by dynamically discovering and loading them on-demand. Instead of loading all tool definitions into the context window upfront, Claude searches your tool catalog—including tool names, descriptions, argument names, and argument descriptions—and loads only the tools it needs. This approach solves two critical challenges as tool libraries scale: - **Context efficiency**: Tool definitions can consume massive portions of your context window (50 tools ≈ 10-20K tokens), leaving less room for actual work - **Tool selection accuracy**: Claude's ability to correctly select tools degrades significantly with more than 30-50 conventionally-available tools Although this is provided as a server-side tool, you can also implement your own client-side tool search functionality. See [Custom tool search implementation](#custom-tool-search-implementation) for details. The tool search tool is currently in public beta. Include the appropriate [beta header](/docs/en/api/beta-headers) for your provider: | Provider | Beta header | Supported models | | ------------------------ | ------------------------------ | -------------------------------------- | | Claude API
Microsoft Foundry | `advanced-tool-use-2025-11-20` | Claude Opus 4.5
Claude Sonnet 4.5 | | Google Cloud's Vertex AI | `tool-search-tool-2025-10-19` | Claude Opus 4.5
Claude Sonnet 4.5 | | Amazon Bedrock | `tool-search-tool-2025-10-19` | Claude Opus 4.5
Claude Sonnet 4.5 | Please reach out through our [feedback form](https://forms.gle/MhcGFFwLxuwnWTkYA) to share your feedback on this feature.
On Amazon Bedrock, server-side tool search is available only via the [invoke API](https://docs.aws.amazon.com/bedrock/latest/userguide/bedrock-runtime_example_bedrock-runtime_InvokeModel_AnthropicClaude_section.html), not the converse API. You can also implement [client-side tool search](#custom-tool-search-implementation) by returning `tool_reference` blocks from your own search implementation. ## How tool search works There are two tool search variants: - **Regex** (`tool_search_tool_regex_20251119`): Claude constructs regex patterns to search for tools - **BM25** (`tool_search_tool_bm25_20251119`): Claude uses natural language queries to search for tools When you enable the tool search tool: 1. You include a tool search tool (e.g., `tool_search_tool_regex_20251119` or `tool_search_tool_bm25_20251119`) in your tools list 2. You provide all tool definitions with `defer_loading: true` for tools that shouldn't be loaded immediately 3. Claude sees only the tool search tool and any non-deferred tools initially 4. When Claude needs additional tools, it searches using a tool search tool 5. The API returns 3-5 most relevant `tool_reference` blocks 6. These references are automatically expanded into full tool definitions 7. Claude selects from the discovered tools and invokes them This keeps your context window efficient while maintaining high tool selection accuracy. ## Quick start Here's a simple example with deferred tools: ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: advanced-tool-use-2025-11-20" \ --header "content-type: application/json" \ --data '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 2048, "messages": [ { "role": "user", "content": "What is the weather in San Francisco?" } ], "tools": [ { "type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex" }, { "name": "get_weather", "description": "Get the weather at a specific location", "input_schema": { "type": "object", "properties": { "location": {"type": "string"}, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"] } }, "required": ["location"] }, "defer_loading": true }, { "name": "search_files", "description": "Search through files in the workspace", "input_schema": { "type": "object", "properties": { "query": {"type": "string"}, "file_types": { "type": "array", "items": {"type": "string"} } }, "required": ["query"] }, "defer_loading": true } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", betas=["advanced-tool-use-2025-11-20"], max_tokens=2048, messages=[ { "role": "user", "content": "What is the weather in San Francisco?" } ], tools=[ { "type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex" }, { "name": "get_weather", "description": "Get the weather at a specific location", "input_schema": { "type": "object", "properties": { "location": {"type": "string"}, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"] } }, "required": ["location"] }, "defer_loading": True }, { "name": "search_files", "description": "Search through files in the workspace", "input_schema": { "type": "object", "properties": { "query": {"type": "string"}, "file_types": { "type": "array", "items": {"type": "string"} } }, "required": ["query"] }, "defer_loading": True } ] ) print(response) ``` ```typescript TypeScript import Anthropic from "@anthropic-ai/sdk"; const client = new Anthropic(); async function main() { const response = await client.beta.messages.create({ model: "claude-sonnet-4-5-20250929", betas: ["advanced-tool-use-2025-11-20"], max_tokens: 2048, messages: [ { role: "user", content: "What is the weather in San Francisco?", }, ], tools: [ { type: "tool_search_tool_regex_20251119", name: "tool_search_tool_regex", }, { name: "get_weather", description: "Get the weather at a specific location", input_schema: { type: "object", properties: { location: { type: "string" }, unit: { type: "string", enum: ["celsius", "fahrenheit"], }, }, required: ["location"], }, defer_loading: true, }, { name: "search_files", description: "Search through files in the workspace", input_schema: { type: "object", properties: { query: { type: "string" }, file_types: { type: "array", items: { type: "string" }, }, }, required: ["query"], }, defer_loading: true, }, ], }); console.log(JSON.stringify(response, null, 2)); } main(); ``` ## Tool definition The tool search tool has two variants: ```json JSON { "type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex" } ``` ```json JSON { "type": "tool_search_tool_bm25_20251119", "name": "tool_search_tool_bm25" } ``` **Regex variant query format: Python regex, NOT natural language** When using `tool_search_tool_regex_20251119`, Claude constructs regex patterns using Python's `re.search()` syntax, not natural language queries. Common patterns: - `"weather"` - matches tool names/descriptions containing "weather" - `"get_.*_data"` - matches tools like `get_user_data`, `get_weather_data` - `"database.*query|query.*database"` - OR patterns for flexibility - `"(?i)slack"` - case-insensitive search Maximum query length: 200 characters **BM25 variant query format: Natural language** When using `tool_search_tool_bm25_20251119`, Claude uses natural language queries to search for tools. ### Deferred tool loading Mark tools for on-demand loading by adding `defer_loading: true`: ```json JSON { "name": "get_weather", "description": "Get current weather for a location", "input_schema": { "type": "object", "properties": { "location": { "type": "string" }, "unit": { "type": "string", "enum": ["celsius", "fahrenheit"] } }, "required": ["location"] }, "defer_loading": true } ``` **Key points:** - Tools without `defer_loading` are loaded into context immediately - Tools with `defer_loading: true` are only loaded when Claude discovers them via search - The tool search tool itself should **never** have `defer_loading: true` - Keep your 3-5 most frequently used tools as non-deferred for optimal performance Both tool search variants (`regex` and `bm25`) search tool names, descriptions, argument names, and argument descriptions. ## Response format When Claude uses the tool search tool, the response includes new block types: ```json JSON { "role": "assistant", "content": [ { "type": "text", "text": "I'll search for tools to help with the weather information." }, { "type": "server_tool_use", "id": "srvtoolu_01ABC123", "name": "tool_search_tool_regex", "input": { "query": "weather" } }, { "type": "tool_search_tool_result", "tool_use_id": "srvtoolu_01ABC123", "content": { "type": "tool_search_tool_search_result", "tool_references": [{ "type": "tool_reference", "tool_name": "get_weather" }] } }, { "type": "text", "text": "I found a weather tool. Let me get the weather for San Francisco." }, { "type": "tool_use", "id": "toolu_01XYZ789", "name": "get_weather", "input": { "location": "San Francisco", "unit": "fahrenheit" } } ], "stop_reason": "tool_use" } ``` ### Understanding the response - **`server_tool_use`**: Indicates Claude is invoking the tool search tool - **`tool_search_tool_result`**: Contains the search results with a nested `tool_search_tool_search_result` object - **`tool_references`**: Array of `tool_reference` objects pointing to discovered tools - **`tool_use`**: Claude invoking the discovered tool The `tool_reference` blocks are automatically expanded into full tool definitions before being shown to Claude. You don't need to handle this expansion yourself. It happens automatically in the API as long as you provide all matching tool definitions in the `tools` parameter. ## MCP integration The tool search tool works with [MCP servers](/docs/en/agents-and-tools/mcp-connector). Add the `"mcp-client-2025-11-20"` [beta header](/docs/en/api/beta-headers) to your API request, and then use `mcp_toolset` with `default_config` to defer loading MCP tools: ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: advanced-tool-use-2025-11-20,mcp-client-2025-11-20" \ --header "content-type: application/json" \ --data '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 2048, "mcp_servers": [ { "type": "url", "name": "database-server", "url": "https://mcp-db.example.com" } ], "tools": [ { "type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex" }, { "type": "mcp_toolset", "mcp_server_name": "database-server", "default_config": { "defer_loading": true }, "configs": { "search_events": { "defer_loading": false } } } ], "messages": [ { "role": "user", "content": "What events are in my database?" } ] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", betas=["advanced-tool-use-2025-11-20", "mcp-client-2025-11-20"], max_tokens=2048, mcp_servers=[ { "type": "url", "name": "database-server", "url": "https://mcp-db.example.com" } ], tools=[ { "type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex" }, { "type": "mcp_toolset", "mcp_server_name": "database-server", "default_config": { "defer_loading": True }, "configs": { "search_events": { "defer_loading": False } } } ], messages=[ { "role": "user", "content": "What events are in my database?" } ] ) print(response) ``` ```typescript TypeScript import Anthropic from "@anthropic-ai/sdk"; const client = new Anthropic(); async function main() { const response = await client.beta.messages.create({ model: "claude-sonnet-4-5-20250929", betas: ["advanced-tool-use-2025-11-20", "mcp-client-2025-11-20"], max_tokens: 2048, mcp_servers: [ { type: "url", name: "database-server", url: "https://mcp-db.example.com", }, ], tools: [ { type: "tool_search_tool_regex_20251119", name: "tool_search_tool_regex", }, { type: "mcp_toolset", mcp_server_name: "database-server", default_config: { defer_loading: true, }, configs: { search_events: { defer_loading: false, }, }, }, ], messages: [ { role: "user", content: "What events are in my database?", }, ], }); console.log(JSON.stringify(response, null, 2)); } main(); ``` **MCP configuration options:** - `default_config.defer_loading`: Set default for all tools from the MCP server - `configs`: Override defaults for specific tools by name - Combine multiple MCP servers with tool search for massive tool libraries ## Custom tool search implementation You can implement your own tool search logic (e.g., using embeddings or semantic search) by returning `tool_reference` blocks from a custom tool. When Claude calls your custom search tool, return a standard `tool_result` with `tool_reference` blocks in the content array: ```json JSON { "type": "tool_result", "tool_use_id": "toolu_your_tool_id", "content": [ { "type": "tool_reference", "tool_name": "discovered_tool_name" } ] } ``` Every tool referenced must have a corresponding tool definition in the top-level `tools` parameter with `defer_loading: true`. This approach lets you use more sophisticated search algorithms while maintaining compatibility with the tool search system. The `tool_search_tool_result` format shown in the [Response format](#response-format) section is the server-side format used internally by Anthropic's built-in tool search. For custom client-side implementations, always use the standard `tool_result` format with `tool_reference` content blocks as shown above. For a complete example using embeddings, see our [tool search with embeddings cookbook](https://platform.claude.com/cookbooks). ## Error handling The tool search tool is not compatible with [tool use examples](/docs/en/agents-and-tools/tool-use/implement-tool-use#providing-tool-use-examples). If you need to provide examples of tool usage, use standard tool calling without tool search. ### HTTP errors (400 status) These errors prevent the request from being processed: **All tools deferred:** ```json { "type": "error", "error": { "type": "invalid_request_error", "message": "All tools have defer_loading set. At least one tool must be non-deferred." } } ``` **Missing tool definition:** ```json { "type": "error", "error": { "type": "invalid_request_error", "message": "Tool reference 'unknown_tool' has no corresponding tool definition" } } ``` ### Tool result errors (200 status) Errors during tool execution return a 200 response with error information in the body: ```json JSON { "type": "tool_result", "tool_use_id": "srvtoolu_01ABC123", "content": { "type": "tool_search_tool_result_error", "error_code": "invalid_pattern" } } ``` **Error codes:** - `too_many_requests`: Rate limit exceeded for tool search operations - `invalid_pattern`: Malformed regex pattern - `pattern_too_long`: Pattern exceeds 200 character limit - `unavailable`: Tool search service temporarily unavailable ### Common mistakes
**Cause**: You set `defer_loading: true` on ALL tools including the search tool **Fix**: Remove `defer_loading` from the tool search tool: ```json { "type": "tool_search_tool_regex_20251119", // No defer_loading here "name": "tool_search_tool_regex" } ```
**Cause**: A `tool_reference` points to a tool not in your `tools` array **Fix**: Ensure every tool that could be discovered has a complete definition: ```json { "name": "my_tool", "description": "Full description here", "input_schema": { /* complete schema */ }, "defer_loading": true } ```
**Cause**: Tool names or descriptions don't match the regex pattern **Debugging steps:** 1. Check tool name and description—Claude searches BOTH fields 2. Test your pattern: `import re; re.search(r"your_pattern", "tool_name")` 3. Remember searches are case-sensitive by default (use `(?i)` for case-insensitive) 4. Claude uses broad patterns like `".*weather.*"` not exact matches **Tip**: Add common keywords to tool descriptions to improve discoverability
## Prompt caching Tool search works with [prompt caching](/docs/en/build-with-claude/prompt-caching). Add `cache_control` breakpoints to optimize multi-turn conversations: ```python Python import anthropic client = anthropic.Anthropic() # First request with tool search messages = [ { "role": "user", "content": "What's the weather in Seattle?" } ] response1 = client.beta.messages.create( model="claude-sonnet-4-5-20250929", betas=["advanced-tool-use-2025-11-20"], max_tokens=2048, messages=messages, tools=[ { "type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex" }, { "name": "get_weather", "description": "Get weather for a location", "input_schema": { "type": "object", "properties": { "location": {"type": "string"} }, "required": ["location"] }, "defer_loading": True } ] ) # Add Claude's response to conversation messages.append({ "role": "assistant", "content": response1.content }) # Second request with cache breakpoint messages.append({ "role": "user", "content": "What about New York?", "cache_control": {"type": "ephemeral"} }) response2 = client.beta.messages.create( model="claude-sonnet-4-5-20250929", betas=["advanced-tool-use-2025-11-20"], max_tokens=2048, messages=messages, tools=[ { "type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex" }, { "name": "get_weather", "description": "Get weather for a location", "input_schema": { "type": "object", "properties": { "location": {"type": "string"} }, "required": ["location"] }, "defer_loading": True } ] ) print(f"Cache read tokens: {response2.usage.get('cache_read_input_tokens', 0)}") ``` The system automatically expands tool_reference blocks throughout the entire conversation history, so Claude can reuse discovered tools in subsequent turns without re-searching. ## Streaming With streaming enabled, you'll receive tool search events as part of the stream: ```javascript event: content_block_start data: {"type": "content_block_start", "index": 1, "content_block": {"type": "server_tool_use", "id": "srvtoolu_xyz789", "name": "tool_search_tool_regex"}} // Search query streamed event: content_block_delta data: {"type": "content_block_delta", "index": 1, "delta": {"type": "input_json_delta", "partial_json": "{\"query\":\"weather\"}"}} // Pause while search executes // Search results streamed event: content_block_start data: {"type": "content_block_start", "index": 2, "content_block": {"type": "tool_search_tool_result", "tool_use_id": "srvtoolu_xyz789", "content": {"type": "tool_search_tool_search_result", "tool_references": [{"type": "tool_reference", "tool_name": "get_weather"}]}}} // Claude continues with discovered tools ``` ## Batch requests You can include the tool search tool in the [Messages Batches API](/docs/en/build-with-claude/batch-processing). Tool search operations through the Messages Batches API are priced the same as those in regular Messages API requests. ## Limits and best practices ### Limits - **Maximum tools**: 10,000 tools in your catalog - **Search results**: Returns 3-5 most relevant tools per search - **Pattern length**: Maximum 200 characters for regex patterns - **Model support**: Sonnet 4.0+, Opus 4.0+ only (no Haiku) ### When to use tool search **Good use cases:** - 10+ tools available in your system - Tool definitions consuming >10K tokens - Experiencing tool selection accuracy issues with large tool sets - Building MCP-powered systems with multiple servers (200+ tools) - Tool library growing over time **When traditional tool calling might be better:** - Less than 10 tools total - All tools are frequently used in every request - Very small tool definitions (\<100 tokens total) ### Optimization tips - Keep 3-5 most frequently used tools as non-deferred - Write clear, descriptive tool names and descriptions - Use semantic keywords in descriptions that match how users describe tasks - Add a system prompt section describing available tool categories: "You can search for tools to interact with Slack, GitHub, and Jira" - Monitor which tools Claude discovers to refine descriptions ## Usage Tool search tool usage is tracked in the response usage object: ```json JSON { "usage": { "input_tokens": 1024, "output_tokens": 256, "server_tool_use": { "tool_search_requests": 2 } } } ``` --- # Web fetch tool URL: https://platform.claude.com/docs/en/agents-and-tools/tool-use/web-fetch-tool # Web fetch tool --- The web fetch tool allows Claude to retrieve full content from specified web pages and PDF documents. The web fetch tool is currently in beta. To enable it, use the beta header `web-fetch-2025-09-10` in your API requests. Please use [this form](https://forms.gle/NhWcgmkcvPCMmPE86) to provide feedback on the quality of the model responses, the API itself, or the quality of the documentation. Enabling the web fetch tool in environments where Claude processes untrusted input alongside sensitive data poses data exfiltration risks. We recommend only using this tool in trusted environments or when handling non-sensitive data. To minimize exfiltration risks, Claude is not allowed to dynamically construct URLs. Claude can only fetch URLs that have been explicitly provided by the user or that come from previous web search or web fetch results. However, there is still residual risk that should be carefully considered when using this tool. If data exfiltration is a concern, consider: - Disabling the web fetch tool entirely - Using the `max_uses` parameter to limit the number of requests - Using the `allowed_domains` parameter to restrict to known safe domains ## Supported models Web fetch is available on: - Claude Sonnet 4.5 (`claude-sonnet-4-5-20250929`) - Claude Sonnet 4 (`claude-sonnet-4-20250514`) - Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) (`claude-3-7-sonnet-20250219`) - Claude Haiku 4.5 (`claude-haiku-4-5-20251001`) - Claude Haiku 3.5 ([deprecated](/docs/en/about-claude/model-deprecations)) (`claude-3-5-haiku-latest`) - Claude Opus 4.5 (`claude-opus-4-5-20251101`) - Claude Opus 4.1 (`claude-opus-4-1-20250805`) - Claude Opus 4 (`claude-opus-4-20250514`) ## How web fetch works When you add the web fetch tool to your API request: 1. Claude decides when to fetch content based on the prompt and available URLs. 2. The API retrieves the full text content from the specified URL. 3. For PDFs, automatic text extraction is performed. 4. Claude analyzes the fetched content and provides a response with optional citations. The web fetch tool currently does not support web sites dynamically rendered via Javascript. ## How to use web fetch Provide the web fetch tool in your API request: ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "anthropic-beta: web-fetch-2025-09-10" \ --header "content-type: application/json" \ --data '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ { "role": "user", "content": "Please analyze the content at https://example.com/article" } ], "tools": [{ "type": "web_fetch_20250910", "name": "web_fetch", "max_uses": 5 }] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": "Please analyze the content at https://example.com/article" } ], tools=[{ "type": "web_fetch_20250910", "name": "web_fetch", "max_uses": 5 }], extra_headers={ "anthropic-beta": "web-fetch-2025-09-10" } ) print(response) ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); async function main() { const response = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, messages: [ { role: "user", content: "Please analyze the content at https://example.com/article" } ], tools: [{ type: "web_fetch_20250910", name: "web_fetch", max_uses: 5 }], headers: { "anthropic-beta": "web-fetch-2025-09-10" } }); console.log(response); } main().catch(console.error); ``` ### Tool definition The web fetch tool supports the following parameters: ```json JSON { "type": "web_fetch_20250910", "name": "web_fetch", // Optional: Limit the number of fetches per request "max_uses": 10, // Optional: Only fetch from these domains "allowed_domains": ["example.com", "docs.example.com"], // Optional: Never fetch from these domains "blocked_domains": ["private.example.com"], // Optional: Enable citations for fetched content "citations": { "enabled": true }, // Optional: Maximum content length in tokens "max_content_tokens": 100000 } ``` #### Max uses The `max_uses` parameter limits the number of web fetches performed. If Claude attempts more fetches than allowed, the `web_fetch_tool_result` will be an error with the `max_uses_exceeded` error code. There is currently no default limit. #### Domain filtering When using domain filters: - Domains should not include the HTTP/HTTPS scheme (use `example.com` instead of `https://example.com`) - Subdomains are automatically included (`example.com` covers `docs.example.com`) - Subpaths are supported (`example.com/blog`) - You can use either `allowed_domains` or `blocked_domains`, but not both in the same request. Be aware that Unicode characters in domain names can create security vulnerabilities through homograph attacks, where visually similar characters from different scripts can bypass domain filters. For example, `аmazon.com` (using Cyrillic 'а') may appear identical to `amazon.com` but represents a different domain. When configuring domain allow/block lists: - Use ASCII-only domain names when possible - Consider that URL parsers may handle Unicode normalization differently - Test your domain filters with potential homograph variations - Regularly audit your domain configurations for suspicious Unicode characters #### Content limits The `max_content_tokens` parameter limits the amount of content that will be included in the context. If the fetched content exceeds this limit, it will be truncated. This helps control token usage when fetching large documents. The `max_content_tokens` parameter limit is approximate. The actual number of input tokens used can vary by a small amount. #### Citations Unlike web search where citations are always enabled, citations are optional for web fetch. Set `"citations": {"enabled": true}` to enable Claude to cite specific passages from fetched documents. When displaying API outputs directly to end users, citations must be included to the original source. If you are making modifications to API outputs, including by reprocessing and/or combining them with your own material before displaying them to end users, display citations as appropriate based on consultation with your legal team. ### Response Here's an example response structure: ```json { "role": "assistant", "content": [ // 1. Claude's decision to fetch { "type": "text", "text": "I'll fetch the content from the article to analyze it." }, // 2. The fetch request { "type": "server_tool_use", "id": "srvtoolu_01234567890abcdef", "name": "web_fetch", "input": { "url": "https://example.com/article" } }, // 3. Fetch results { "type": "web_fetch_tool_result", "tool_use_id": "srvtoolu_01234567890abcdef", "content": { "type": "web_fetch_result", "url": "https://example.com/article", "content": { "type": "document", "source": { "type": "text", "media_type": "text/plain", "data": "Full text content of the article..." }, "title": "Article Title", "citations": {"enabled": true} }, "retrieved_at": "2025-08-25T10:30:00Z" } }, // 4. Claude's analysis with citations (if enabled) { "text": "Based on the article, ", "type": "text" }, { "text": "the main argument presented is that artificial intelligence will transform healthcare", "type": "text", "citations": [ { "type": "char_location", "document_index": 0, "document_title": "Article Title", "start_char_index": 1234, "end_char_index": 1456, "cited_text": "Artificial intelligence is poised to revolutionize healthcare delivery..." } ] } ], "id": "msg_a930390d3a", "usage": { "input_tokens": 25039, "output_tokens": 931, "server_tool_use": { "web_fetch_requests": 1 } }, "stop_reason": "end_turn" } ``` #### Fetch results Fetch results include: - `url`: The URL that was fetched - `content`: A document block containing the fetched content - `retrieved_at`: Timestamp when the content was retrieved The web fetch tool caches results to improve performance and reduce redundant requests. This means the content returned may not always be the latest version available at the URL. The cache behavior is managed automatically and may change over time to optimize for different content types and usage patterns. For PDF documents, the content will be returned as base64-encoded data: ```json { "type": "web_fetch_tool_result", "tool_use_id": "srvtoolu_02", "content": { "type": "web_fetch_result", "url": "https://example.com/paper.pdf", "content": { "type": "document", "source": { "type": "base64", "media_type": "application/pdf", "data": "JVBERi0xLjQKJcOkw7zDtsOfCjIgMCBvYmo..." }, "citations": {"enabled": true} }, "retrieved_at": "2025-08-25T10:30:02Z" } } ``` #### Errors When the web fetch tool encounters an error, the Claude API returns a 200 (success) response with the error represented in the response body: ```json { "type": "web_fetch_tool_result", "tool_use_id": "srvtoolu_a93jad", "content": { "type": "web_fetch_tool_error", "error_code": "url_not_accessible" } } ``` These are the possible error codes: - `invalid_input`: Invalid URL format - `url_too_long`: URL exceeds maximum length (250 characters) - `url_not_allowed`: URL blocked by domain filtering rules and model restrictions - `url_not_accessible`: Failed to fetch content (HTTP error) - `too_many_requests`: Rate limit exceeded - `unsupported_content_type`: Content type not supported (only text and PDF) - `max_uses_exceeded`: Maximum web fetch tool uses exceeded - `unavailable`: An internal error occurred ## URL validation For security reasons, the web fetch tool can only fetch URLs that have previously appeared in the conversation context. This includes: - URLs in user messages - URLs in client-side tool results - URLs from previous web search or web fetch results The tool cannot fetch arbitrary URLs that Claude generates or URLs from container-based server tools (Code Execution, Bash, etc.). ## Combined search and fetch Web fetch works seamlessly with web search for comprehensive information gathering: ```python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=4096, messages=[ { "role": "user", "content": "Find recent articles about quantum computing and analyze the most relevant one in detail" } ], tools=[ { "type": "web_search_20250305", "name": "web_search", "max_uses": 3 }, { "type": "web_fetch_20250910", "name": "web_fetch", "max_uses": 5, "citations": {"enabled": True} } ], extra_headers={ "anthropic-beta": "web-fetch-2025-09-10" } ) ``` In this workflow, Claude will: 1. Use web search to find relevant articles 2. Select the most promising results 3. Use web fetch to retrieve full content 4. Provide detailed analysis with citations ## Prompt caching Web fetch works with [prompt caching](/docs/en/build-with-claude/prompt-caching). To enable prompt caching, add `cache_control` breakpoints in your request. Cached fetch results can be reused across conversation turns. ```python import anthropic client = anthropic.Anthropic() # First request with web fetch messages = [ { "role": "user", "content": "Analyze this research paper: https://arxiv.org/abs/2024.12345" } ] response1 = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=messages, tools=[{ "type": "web_fetch_20250910", "name": "web_fetch" }], extra_headers={ "anthropic-beta": "web-fetch-2025-09-10" } ) # Add Claude's response to conversation messages.append({ "role": "assistant", "content": response1.content }) # Second request with cache breakpoint messages.append({ "role": "user", "content": "What methodology does the paper use?", "cache_control": {"type": "ephemeral"} }) response2 = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=messages, tools=[{ "type": "web_fetch_20250910", "name": "web_fetch" }], extra_headers={ "anthropic-beta": "web-fetch-2025-09-10" } ) # The second response benefits from cached fetch results print(f"Cache read tokens: {response2.usage.get('cache_read_input_tokens', 0)}") ``` ## Streaming With streaming enabled, fetch events are part of the stream with a pause during content retrieval: ```javascript event: message_start data: {"type": "message_start", "message": {"id": "msg_abc123", "type": "message"}} event: content_block_start data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}} // Claude's decision to fetch event: content_block_start data: {"type": "content_block_start", "index": 1, "content_block": {"type": "server_tool_use", "id": "srvtoolu_xyz789", "name": "web_fetch"}} // Fetch URL streamed event: content_block_delta data: {"type": "content_block_delta", "index": 1, "delta": {"type": "input_json_delta", "partial_json": "{\"url\":\"https://example.com/article\"}"}} // Pause while fetch executes // Fetch results streamed event: content_block_start data: {"type": "content_block_start", "index": 2, "content_block": {"type": "web_fetch_tool_result", "tool_use_id": "srvtoolu_xyz789", "content": {"type": "web_fetch_result", "url": "https://example.com/article", "content": {"type": "document", "source": {"type": "text", "media_type": "text/plain", "data": "Article content..."}}}}} // Claude's response continues... ``` ## Batch requests You can include the web fetch tool in the [Messages Batches API](/docs/en/build-with-claude/batch-processing). Web fetch tool calls through the Messages Batches API are priced the same as those in regular Messages API requests. ## Usage and pricing Web fetch usage has **no additional charges** beyond standard token costs: ```json "usage": { "input_tokens": 25039, "output_tokens": 931, "cache_read_input_tokens": 0, "cache_creation_input_tokens": 0, "server_tool_use": { "web_fetch_requests": 1 } } ``` The web fetch tool is available on the Claude API at **no additional cost**. You only pay standard token costs for the fetched content that becomes part of your conversation context. To protect against inadvertently fetching large content that would consume excessive tokens, use the `max_content_tokens` parameter to set appropriate limits based on your use case and budget considerations. Example token usage for typical content: - Average web page (10KB): ~2,500 tokens - Large documentation page (100KB): ~25,000 tokens - Research paper PDF (500KB): ~125,000 tokens --- # Web search tool URL: https://platform.claude.com/docs/en/agents-and-tools/tool-use/web-search-tool # Web search tool --- The web search tool gives Claude direct access to real-time web content, allowing it to answer questions with up-to-date information beyond its knowledge cutoff. Claude automatically cites sources from search results as part of its answer. Please reach out through our [feedback form](https://forms.gle/sWjBtsrNEY2oKGuE8) to share your experience with the web search tool. ## Supported models Web search is available on: - Claude Sonnet 4.5 (`claude-sonnet-4-5-20250929`) - Claude Sonnet 4 (`claude-sonnet-4-20250514`) - Claude Sonnet 3.7 ([deprecated](/docs/en/about-claude/model-deprecations)) (`claude-3-7-sonnet-20250219`) - Claude Haiku 4.5 (`claude-haiku-4-5-20251001`) - Claude Haiku 3.5 ([deprecated](/docs/en/about-claude/model-deprecations)) (`claude-3-5-haiku-latest`) - Claude Opus 4.5 (`claude-opus-4-5-20251101`) - Claude Opus 4.1 (`claude-opus-4-1-20250805`) - Claude Opus 4 (`claude-opus-4-20250514`) ## How web search works When you add the web search tool to your API request: 1. Claude decides when to search based on the prompt. 2. The API executes the searches and provides Claude with the results. This process may repeat multiple times throughout a single request. 3. At the end of its turn, Claude provides a final response with cited sources. ## How to use web search Your organization's administrator must enable web search in [Console](/settings/privacy). Provide the web search tool in your API request: ```bash Shell curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data '{ "model": "claude-sonnet-4-5", "max_tokens": 1024, "messages": [ { "role": "user", "content": "What is the weather in NYC?" } ], "tools": [{ "type": "web_search_20250305", "name": "web_search", "max_uses": 5 }] }' ``` ```python Python import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=[ { "role": "user", "content": "What's the weather in NYC?" } ], tools=[{ "type": "web_search_20250305", "name": "web_search", "max_uses": 5 }] ) print(response) ``` ```typescript TypeScript import { Anthropic } from '@anthropic-ai/sdk'; const anthropic = new Anthropic(); async function main() { const response = await anthropic.messages.create({ model: "claude-sonnet-4-5", max_tokens: 1024, messages: [ { role: "user", content: "What's the weather in NYC?" } ], tools: [{ type: "web_search_20250305", name: "web_search", max_uses: 5 }] }); console.log(response); } main().catch(console.error); ``` ### Tool definition The web search tool supports the following parameters: ```json JSON { "type": "web_search_20250305", "name": "web_search", // Optional: Limit the number of searches per request "max_uses": 5, // Optional: Only include results from these domains "allowed_domains": ["example.com", "trusteddomain.org"], // Optional: Never include results from these domains "blocked_domains": ["untrustedsource.com"], // Optional: Localize search results "user_location": { "type": "approximate", "city": "San Francisco", "region": "California", "country": "US", "timezone": "America/Los_Angeles" } } ``` #### Max uses The `max_uses` parameter limits the number of searches performed. If Claude attempts more searches than allowed, the `web_search_tool_result` will be an error with the `max_uses_exceeded` error code. #### Domain filtering When using domain filters: - Domains should not include the HTTP/HTTPS scheme (use `example.com` instead of `https://example.com`) - Subdomains are automatically included (`example.com` covers `docs.example.com`) - Specific subdomains restrict results to only that subdomain (`docs.example.com` returns only results from that subdomain, not from `example.com` or `api.example.com`) - Subpaths are supported and match anything after the path (`example.com/blog` matches `example.com/blog/post-1`) - You can use either `allowed_domains` or `blocked_domains`, but not both in the same request. **Wildcard support:** - Only one wildcard (`*`) is allowed per domain entry, and it must appear after the domain part (in the path) - Valid: `example.com/*`, `example.com/*/articles` - Invalid: `*.example.com`, `ex*.com`, `example.com/*/news/*` Invalid domain formats will return an `invalid_tool_input` tool error. Request-level domain restrictions must be compatible with organization-level domain restrictions configured in the Console. Request-level domains can only further restrict domains, not override or expand beyond the organization-level list. If your request includes domains that conflict with organization settings, the API will return a validation error. #### Localization The `user_location` parameter allows you to localize search results based on a user's location. - `type`: The type of location (must be `approximate`) - `city`: The city name - `region`: The region or state - `country`: The country - `timezone`: The [IANA timezone ID](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones). ### Response Here's an example response structure: ```json { "role": "assistant", "content": [ // 1. Claude's decision to search { "type": "text", "text": "I'll search for when Claude Shannon was born." }, // 2. The search query used { "type": "server_tool_use", "id": "srvtoolu_01WYG3ziw53XMcoyKL4XcZmE", "name": "web_search", "input": { "query": "claude shannon birth date" } }, // 3. Search results { "type": "web_search_tool_result", "tool_use_id": "srvtoolu_01WYG3ziw53XMcoyKL4XcZmE", "content": [ { "type": "web_search_result", "url": "https://en.wikipedia.org/wiki/Claude_Shannon", "title": "Claude Shannon - Wikipedia", "encrypted_content": "EqgfCioIARgBIiQ3YTAwMjY1Mi1mZjM5LTQ1NGUtODgxNC1kNjNjNTk1ZWI3Y...", "page_age": "April 30, 2025" } ] }, { "text": "Based on the search results, ", "type": "text" }, // 4. Claude's response with citations { "text": "Claude Shannon was born on April 30, 1916, in Petoskey, Michigan", "type": "text", "citations": [ { "type": "web_search_result_location", "url": "https://en.wikipedia.org/wiki/Claude_Shannon", "title": "Claude Shannon - Wikipedia", "encrypted_index": "Eo8BCioIAhgBIiQyYjQ0OWJmZi1lNm..", "cited_text": "Claude Elwood Shannon (April 30, 1916 – February 24, 2001) was an American mathematician, electrical engineer, computer scientist, cryptographer and i..." } ] } ], "id": "msg_a930390d3a", "usage": { "input_tokens": 6039, "output_tokens": 931, "server_tool_use": { "web_search_requests": 1 } }, "stop_reason": "end_turn" } ``` #### Search results Search results include: - `url`: The URL of the source page - `title`: The title of the source page - `page_age`: When the site was last updated - `encrypted_content`: Encrypted content that must be passed back in multi-turn conversations for citations #### Citations Citations are always enabled for web search, and each `web_search_result_location` includes: - `url`: The URL of the cited source - `title`: The title of the cited source - `encrypted_index`: A reference that must be passed back for multi-turn conversations. - `cited_text`: Up to 150 characters of the cited content The web search citation fields `cited_text`, `title`, and `url` do not count towards input or output token usage. When displaying API outputs directly to end users, citations must be included to the original source. If you are making modifications to API outputs, including by reprocessing and/or combining them with your own material before displaying them to end users, display citations as appropriate based on consultation with your legal team. #### Errors When the web search tool encounters an error (such as hitting rate limits), the Claude API still returns a 200 (success) response. The error is represented within the response body using the following structure: ```json { "type": "web_search_tool_result", "tool_use_id": "servertoolu_a93jad", "content": { "type": "web_search_tool_result_error", "error_code": "max_uses_exceeded" } } ``` These are the possible error codes: - `too_many_requests`: Rate limit exceeded - `invalid_input`: Invalid search query parameter - `max_uses_exceeded`: Maximum web search tool uses exceeded - `query_too_long`: Query exceeds maximum length - `unavailable`: An internal error occurred #### `pause_turn` stop reason The response may include a `pause_turn` stop reason, which indicates that the API paused a long-running turn. You may provide the response back as-is in a subsequent request to let Claude continue its turn, or modify the content if you wish to interrupt the conversation. ## Prompt caching Web search works with [prompt caching](/docs/en/build-with-claude/prompt-caching). To enable prompt caching, add at least one `cache_control` breakpoint in your request. The system will automatically cache up until the last `web_search_tool_result` block when executing the tool. For multi-turn conversations, set a `cache_control` breakpoint on or after the last `web_search_tool_result` block to reuse cached content. For example, to use prompt caching with web search for a multi-turn conversation: ```python import anthropic client = anthropic.Anthropic() # First request with web search and cache breakpoint messages = [ { "role": "user", "content": "What's the current weather in San Francisco today?" } ] response1 = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=messages, tools=[{ "type": "web_search_20250305", "name": "web_search", "user_location": { "type": "approximate", "city": "San Francisco", "region": "California", "country": "US", "timezone": "America/Los_Angeles" } }] ) # Add Claude's response to the conversation messages.append({ "role": "assistant", "content": response1.content }) # Second request with cache breakpoint after the search results messages.append({ "role": "user", "content": "Should I expect rain later this week?", "cache_control": {"type": "ephemeral"} # Cache up to this point }) response2 = client.messages.create( model="claude-sonnet-4-5", max_tokens=1024, messages=messages, tools=[{ "type": "web_search_20250305", "name": "web_search", "user_location": { "type": "approximate", "city": "San Francisco", "region": "California", "country": "US", "timezone": "America/Los_Angeles" } }] ) # The second response will benefit from cached search results # while still being able to perform new searches if needed print(f"Cache read tokens: {response2.usage.get('cache_read_input_tokens', 0)}") ``` ## Streaming With streaming enabled, you'll receive search events as part of the stream. There will be a pause while the search executes: ```javascript event: message_start data: {"type": "message_start", "message": {"id": "msg_abc123", "type": "message"}} event: content_block_start data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}} // Claude's decision to search event: content_block_start data: {"type": "content_block_start", "index": 1, "content_block": {"type": "server_tool_use", "id": "srvtoolu_xyz789", "name": "web_search"}} // Search query streamed event: content_block_delta data: {"type": "content_block_delta", "index": 1, "delta": {"type": "input_json_delta", "partial_json": "{\"query\":\"latest quantum computing breakthroughs 2025\"}"}} // Pause while search executes // Search results streamed event: content_block_start data: {"type": "content_block_start", "index": 2, "content_block": {"type": "web_search_tool_result", "tool_use_id": "srvtoolu_xyz789", "content": [{"type": "web_search_result", "title": "Quantum Computing Breakthroughs in 2025", "url": "https://example.com"}]}} // Claude's response with citations (omitted in this example) ``` ## Batch requests You can include the web search tool in the [Messages Batches API](/docs/en/build-with-claude/batch-processing). Web search tool calls through the Messages Batches API are priced the same as those in regular Messages API requests. ## Usage and pricing Web search usage is charged in addition to token usage: ```json "usage": { "input_tokens": 105, "output_tokens": 6039, "cache_read_input_tokens": 7123, "cache_creation_input_tokens": 7345, "server_tool_use": { "web_search_requests": 1 } } ``` Web search is available on the Claude API for **$10 per 1,000 searches**, plus standard token costs for search-generated content. Web search results retrieved throughout a conversation are counted as input tokens, in search iterations executed during a single turn and in subsequent conversation turns. Each web search counts as one use, regardless of the number of results returned. If an error occurs during web search, the web search will not be billed. ### Agent Skills --- # Agent Skills URL: https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview # Agent Skills Agent Skills are modular capabilities that extend Claude's functionality. Each Skill packages instructions, metadata, and optional resources (scripts, templates) that Claude uses automatically when relevant. --- ## Why use Skills Skills are reusable, filesystem-based resources that provide Claude with domain-specific expertise: workflows, context, and best practices that transform general-purpose agents into specialists. Unlike prompts (conversation-level instructions for one-off tasks), Skills load on-demand and eliminate the need to repeatedly provide the same guidance across multiple conversations. **Key benefits**: - **Specialize Claude**: Tailor capabilities for domain-specific tasks - **Reduce repetition**: Create once, use automatically - **Compose capabilities**: Combine Skills to build complex workflows For a deep dive into the architecture and real-world applications of Agent Skills, read our engineering blog: [Equipping agents for the real world with Agent Skills](https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills). ## Using Skills Anthropic provides pre-built Agent Skills for common document tasks (PowerPoint, Excel, Word, PDF), and you can create your own custom Skills. Both work the same way. Claude automatically uses them when relevant to your request. **Pre-built Agent Skills** are available to all users on claude.ai and via the Claude API. See the [Available Skills](#available-skills) section below for the complete list. **Custom Skills** let you package domain expertise and organizational knowledge. They're available across Claude's products: create them in Claude Code, upload them via the API, or add them in claude.ai settings. **Get started:** - For pre-built Agent Skills: See the [quickstart tutorial](/docs/en/agents-and-tools/agent-skills/quickstart) to start using PowerPoint, Excel, Word, and PDF skills in the API - For custom Skills: See the [Agent Skills Cookbook](https://platform.claude.com/cookbook/skills-notebooks-01-skills-introduction) to learn how to create your own Skills ## How Skills work Skills leverage Claude's VM environment to provide capabilities beyond what's possible with prompts alone. Claude operates in a virtual machine with filesystem access, allowing Skills to exist as directories containing instructions, executable code, and reference materials, organized like an onboarding guide you'd create for a new team member. This filesystem-based architecture enables **progressive disclosure**: Claude loads information in stages as needed, rather than consuming context upfront. ### Three types of Skill content, three levels of loading Skills can contain three types of content, each loaded at different times: ### Level 1: Metadata (always loaded) **Content type: Instructions**. The Skill's YAML frontmatter provides discovery information: ```yaml --- name: pdf-processing description: Extract text and tables from PDF files, fill forms, merge documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction. --- ``` Claude loads this metadata at startup and includes it in the system prompt. This lightweight approach means you can install many Skills without context penalty; Claude only knows each Skill exists and when to use it. ### Level 2: Instructions (loaded when triggered) **Content type: Instructions**. The main body of SKILL.md contains procedural knowledge: workflows, best practices, and guidance: ````markdown # PDF Processing ## Quick start Use pdfplumber to extract text from PDFs: ```python import pdfplumber with pdfplumber.open("document.pdf") as pdf: text = pdf.pages[0].extract_text() ``` For advanced form filling, see [FORMS.md](FORMS.md). ```` When you request something that matches a Skill's description, Claude reads SKILL.md from the filesystem via bash. Only then does this content enter the context window. ### Level 3: Resources and code (loaded as needed) **Content types: Instructions, code, and resources**. Skills can bundle additional materials: ``` pdf-skill/ ├── SKILL.md (main instructions) ├── FORMS.md (form-filling guide) ├── REFERENCE.md (detailed API reference) └── scripts/ └── fill_form.py (utility script) ``` **Instructions**: Additional markdown files (FORMS.md, REFERENCE.md) containing specialized guidance and workflows **Code**: Executable scripts (fill_form.py, validate.py) that Claude runs via bash; scripts provide deterministic operations without consuming context **Resources**: Reference materials like database schemas, API documentation, templates, or examples Claude accesses these files only when referenced. The filesystem model means each content type has different strengths: instructions for flexible guidance, code for reliability, resources for factual lookup. | Level | When Loaded | Token Cost | Content | |-------|------------|------------|---------| | **Level 1: Metadata** | Always (at startup) | ~100 tokens per Skill | `name` and `description` from YAML frontmatter | | **Level 2: Instructions** | When Skill is triggered | Under 5k tokens | SKILL.md body with instructions and guidance | | **Level 3+: Resources** | As needed | Effectively unlimited | Bundled files executed via bash without loading contents into context | Progressive disclosure ensures only relevant content occupies the context window at any given time. ### The Skills architecture Skills run in a code execution environment where Claude has filesystem access, bash commands, and code execution capabilities. Think of it like this: Skills exist as directories on a virtual machine, and Claude interacts with them using the same bash commands you'd use to navigate files on your computer. ![Agent Skills Architecture - showing how Skills integrate with the agent's configuration and virtual machine](/docs/images/agent-skills-architecture.png) **How Claude accesses Skill content:** When a Skill is triggered, Claude uses bash to read SKILL.md from the filesystem, bringing its instructions into the context window. If those instructions reference other files (like FORMS.md or a database schema), Claude reads those files too using additional bash commands. When instructions mention executable scripts, Claude runs them via bash and receives only the output (the script code itself never enters context). **What this architecture enables:** **On-demand file access**: Claude reads only the files needed for each specific task. A Skill can include dozens of reference files, but if your task only needs the sales schema, Claude loads just that one file. The rest remain on the filesystem consuming zero tokens. **Efficient script execution**: When Claude runs `validate_form.py`, the script's code never loads into the context window. Only the script's output (like "Validation passed" or specific error messages) consumes tokens. This makes scripts far more efficient than having Claude generate equivalent code on the fly. **No practical limit on bundled content**: Because files don't consume context until accessed, Skills can include comprehensive API documentation, large datasets, extensive examples, or any reference materials you need. There's no context penalty for bundled content that isn't used. This filesystem-based model is what makes progressive disclosure work. Claude navigates your Skill like you'd reference specific sections of an onboarding guide, accessing exactly what each task requires. ### Example: Loading a PDF processing skill Here's how Claude loads and uses a PDF processing skill: 1. **Startup**: System prompt includes: `PDF Processing - Extract text and tables from PDF files, fill forms, merge documents` 2. **User request**: "Extract the text from this PDF and summarize it" 3. **Claude invokes**: `bash: read pdf-skill/SKILL.md` → Instructions loaded into context 4. **Claude determines**: Form filling is not needed, so FORMS.md is not read 5. **Claude executes**: Uses instructions from SKILL.md to complete the task ![Skills loading into context window - showing the progressive loading of skill metadata and content](/docs/images/agent-skills-context-window.png) The diagram shows: 1. Default state with system prompt and skill metadata pre-loaded 2. Claude triggers the skill by reading SKILL.md via bash 3. Claude optionally reads additional bundled files like FORMS.md as needed 4. Claude proceeds with the task This dynamic loading ensures only relevant skill content occupies the context window. ## Where Skills work Skills are available across Claude's agent products: ### Claude API The Claude API supports both pre-built Agent Skills and custom Skills. Both work identically: specify the relevant `skill_id` in the `container` parameter along with the code execution tool. **Prerequisites**: Using Skills via the API requires three beta headers: - `code-execution-2025-08-25` - Skills run in the code execution container - `skills-2025-10-02` - Enables Skills functionality - `files-api-2025-04-14` - Required for uploading/downloading files to/from the container Use pre-built Agent Skills by referencing their `skill_id` (e.g., `pptx`, `xlsx`), or create and upload your own via the Skills API (`/v1/skills` endpoints). Custom Skills are shared organization-wide. To learn more, see [Use Skills with the Claude API](/docs/en/build-with-claude/skills-guide). ### Claude Code [Claude Code](https://code.claude.com/docs/en/overview) supports only Custom Skills. **Custom Skills**: Create Skills as directories with SKILL.md files. Claude discovers and uses them automatically. Custom Skills in Claude Code are filesystem-based and don't require API uploads. To learn more, see [Use Skills in Claude Code](https://code.claude.com/docs/en/skills). ### Claude Agent SDK The [Claude Agent SDK](/docs/en/agent-sdk/overview) supports custom Skills through filesystem-based configuration. **Custom Skills**: Create Skills as directories with SKILL.md files in `.claude/skills/`. Enable Skills by including `"Skill"` in your `allowed_tools` configuration. Skills in the Agent SDK are then automatically discovered when the SDK runs. To learn more, see [Agent Skills in the SDK](/docs/en/agent-sdk/skills). ### Claude.ai [Claude.ai](https://claude.ai) supports both pre-built Agent Skills and custom Skills. **Pre-built Agent Skills**: These Skills are already working behind the scenes when you create documents. Claude uses them without requiring any setup. **Custom Skills**: Upload your own Skills as zip files through Settings > Features. Available on Pro, Max, Team, and Enterprise plans with code execution enabled. Custom Skills are individual to each user; they are not shared organization-wide and cannot be centrally managed by admins. To learn more about using Skills in Claude.ai, see the following resources in the Claude Help Center: - [What are Skills?](https://support.claude.com/en/articles/12512176-what-are-skills) - [Using Skills in Claude](https://support.claude.com/en/articles/12512180-using-skills-in-claude) - [How to create custom Skills](https://support.claude.com/en/articles/12512198-creating-custom-skills) - [Teach Claude your way of working using Skills](https://support.claude.com/en/articles/12580051-teach-claude-your-way-of-working-using-skills) ## Skill structure Every Skill requires a `SKILL.md` file with YAML frontmatter: ```yaml --- name: your-skill-name description: Brief description of what this Skill does and when to use it --- # Your Skill Name ## Instructions [Clear, step-by-step guidance for Claude to follow] ## Examples [Concrete examples of using this Skill] ``` **Required fields**: `name` and `description` **Field requirements**: `name`: - Maximum 64 characters - Must contain only lowercase letters, numbers, and hyphens - Cannot contain XML tags - Cannot contain reserved words: "anthropic", "claude" `description`: - Must be non-empty - Maximum 1024 characters - Cannot contain XML tags The `description` should include both what the Skill does and when Claude should use it. For complete authoring guidance, see the [best practices guide](/docs/en/agents-and-tools/agent-skills/best-practices). ## Security considerations We strongly recommend using Skills only from trusted sources: those you created yourself or obtained from Anthropic. Skills provide Claude with new capabilities through instructions and code, and while this makes them powerful, it also means a malicious Skill can direct Claude to invoke tools or execute code in ways that don't match the Skill's stated purpose. If you must use a Skill from an untrusted or unknown source, exercise extreme caution and thoroughly audit it before use. Depending on what access Claude has when executing the Skill, malicious Skills could lead to data exfiltration, unauthorized system access, or other security risks. **Key security considerations**: - **Audit thoroughly**: Review all files bundled in the Skill: SKILL.md, scripts, images, and other resources. Look for unusual patterns like unexpected network calls, file access patterns, or operations that don't match the Skill's stated purpose - **External sources are risky**: Skills that fetch data from external URLs pose particular risk, as fetched content may contain malicious instructions. Even trustworthy Skills can be compromised if their external dependencies change over time - **Tool misuse**: Malicious Skills can invoke tools (file operations, bash commands, code execution) in harmful ways - **Data exposure**: Skills with access to sensitive data could be designed to leak information to external systems - **Treat like installing software**: Only use Skills from trusted sources. Be especially careful when integrating Skills into production systems with access to sensitive data or critical operations ## Available Skills ### Pre-built Agent Skills The following pre-built Agent Skills are available for immediate use: - **PowerPoint (pptx)**: Create presentations, edit slides, analyze presentation content - **Excel (xlsx)**: Create spreadsheets, analyze data, generate reports with charts - **Word (docx)**: Create documents, edit content, format text - **PDF (pdf)**: Generate formatted PDF documents and reports These Skills are available on the Claude API and claude.ai. See the [quickstart tutorial](/docs/en/agents-and-tools/agent-skills/quickstart) to start using them in the API. ### Custom Skills examples For complete examples of custom Skills, see the [Skills cookbook](https://platform.claude.com/cookbook/skills-notebooks-01-skills-introduction). ## Limitations and constraints Understanding these limitations helps you plan your Skills deployment effectively. ### Cross-surface availability **Custom Skills do not sync across surfaces**. Skills uploaded to one surface are not automatically available on others: - Skills uploaded to Claude.ai must be separately uploaded to the API - Skills uploaded via the API are not available on Claude.ai - Claude Code Skills are filesystem-based and separate from both Claude.ai and API You'll need to manage and upload Skills separately for each surface where you want to use them. ### Sharing scope Skills have different sharing models depending on where you use them: - **Claude.ai**: Individual user only; each team member must upload separately - **Claude API**: Workspace-wide; all workspace members can access uploaded Skills - **Claude Code**: Personal (`~/.claude/skills/`) or project-based (`.claude/skills/`); can also be shared via Claude Code Plugins Claude.ai does not currently support centralized admin management or org-wide distribution of custom Skills. ### Runtime environment constraints The exact runtime environment available to your skill depends on the product surface where you use it. - **Claude.ai**: - **Varying network access**: Depending on user/admin settings, Skills may have full, partial, or no network access. For more details, see the [Create and Edit Files](https://support.claude.com/en/articles/12111783-create-and-edit-files-with-claude#h_6b7e833898) support article. - **Claude API**: - **No network access**: Skills cannot make external API calls or access the internet - **No runtime package installation**: Only pre-installed packages are available. You cannot install new packages during execution. - **Pre-configured dependencies only**: Check the [code execution tool documentation](/docs/en/agents-and-tools/tool-use/code-execution-tool) for the list of available packages - **Claude Code**: - **Full network access**: Skills have the same network access as any other program on the user's computer - **Global package installation discouraged**: Skills should only install packages locally in order to avoid interfering with the user's computer Plan your Skills to work within these constraints. ## Next steps Create your first Skill Use Skills with the Claude API Create and manage custom Skills in Claude Code Use Skills programmatically in TypeScript and Python Write Skills that Claude can use effectively --- # Get started with Agent Skills in the API URL: https://platform.claude.com/docs/en/agents-and-tools/agent-skills/quickstart # Get started with Agent Skills in the API Learn how to use Agent Skills to create documents with the Claude API in under 10 minutes. --- This tutorial shows you how to use Agent Skills to create a PowerPoint presentation. You'll learn how to enable Skills, make a simple request, and access the generated file. ## Prerequisites - [Anthropic API key](/settings/keys) - Python 3.7+ or curl installed - Basic familiarity with making API requests ## What are Agent Skills? Pre-built Agent Skills extend Claude's capabilities with specialized expertise for tasks like creating documents, analyzing data, and processing files. Anthropic provides the following pre-built Agent Skills in the API: - **PowerPoint (pptx)**: Create and edit presentations - **Excel (xlsx)**: Create and analyze spreadsheets - **Word (docx)**: Create and edit documents - **PDF (pdf)**: Generate PDF documents **Want to create custom Skills?** See the [Agent Skills Cookbook](https://platform.claude.com/cookbook/skills-notebooks-01-skills-introduction) for examples of building your own Skills with domain-specific expertise. ## Step 1: List available Skills First, let's see what Skills are available. We'll use the Skills API to list all Anthropic-managed Skills: ```python Python import anthropic client = anthropic.Anthropic() # List Anthropic-managed Skills skills = client.beta.skills.list( source="anthropic", betas=["skills-2025-10-02"] ) for skill in skills.data: print(f"{skill.id}: {skill.display_title}") ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); // List Anthropic-managed Skills const skills = await client.beta.skills.list({ source: 'anthropic', betas: ['skills-2025-10-02'] }); for (const skill of skills.data) { console.log(`${skill.id}: ${skill.display_title}`); } ``` ```bash Shell curl "https://api.anthropic.com/v1/skills?source=anthropic" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: skills-2025-10-02" ``` You see the following Skills: `pptx`, `xlsx`, `docx`, and `pdf`. This API returns each Skill's metadata: its name and description. Claude loads this metadata at startup to know what Skills are available. This is the first level of **progressive disclosure**, where Claude discovers Skills without loading their full instructions yet. ## Step 2: Create a presentation Now we'll use the PowerPoint Skill to create a presentation about renewable energy. We specify Skills using the `container` parameter in the Messages API: ```python Python import anthropic client = anthropic.Anthropic() # Create a message with the PowerPoint Skill response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "skills": [ { "type": "anthropic", "skill_id": "pptx", "version": "latest" } ] }, messages=[{ "role": "user", "content": "Create a presentation about renewable energy with 5 slides" }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) print(response.content) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); // Create a message with the PowerPoint Skill const response = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { skills: [ { type: 'anthropic', skill_id: 'pptx', version: 'latest' } ] }, messages: [{ role: 'user', content: 'Create a presentation about renewable energy with 5 slides' }], tools: [{ type: 'code_execution_20250825', name: 'code_execution' }] }); console.log(response.content); ``` ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 4096, "container": { "skills": [ { "type": "anthropic", "skill_id": "pptx", "version": "latest" } ] }, "messages": [{ "role": "user", "content": "Create a presentation about renewable energy with 5 slides" }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }' ``` Let's break down what each part does: - **`container.skills`**: Specifies which Skills Claude can use - **`type: "anthropic"`**: Indicates this is an Anthropic-managed Skill - **`skill_id: "pptx"`**: The PowerPoint Skill identifier - **`version: "latest"`**: The Skill version set to the most recently published - **`tools`**: Enables code execution (required for Skills) - **Beta headers**: `code-execution-2025-08-25` and `skills-2025-10-02` When you make this request, Claude automatically matches your task to the relevant Skill. Since you asked for a presentation, Claude determines the PowerPoint Skill is relevant and loads its full instructions: the second level of progressive disclosure. Then Claude executes the Skill's code to create your presentation. ## Step 3: Download the created file The presentation was created in the code execution container and saved as a file. The response includes a file reference with a file ID. Extract the file ID and download it using the Files API: ```python Python # Extract file ID from response file_id = None for block in response.content: if block.type == 'tool_use' and block.name == 'code_execution': # File ID is in the tool result for result_block in block.content: if hasattr(result_block, 'file_id'): file_id = result_block.file_id break if file_id: # Download the file file_content = client.beta.files.download( file_id=file_id, betas=["files-api-2025-04-14"] ) # Save to disk with open("renewable_energy.pptx", "wb") as f: file_content.write_to_file(f.name) print(f"Presentation saved to renewable_energy.pptx") ``` ```typescript TypeScript // Extract file ID from response let fileId: string | null = null; for (const block of response.content) { if (block.type === 'tool_use' && block.name === 'code_execution') { // File ID is in the tool result for (const resultBlock of block.content) { if ('file_id' in resultBlock) { fileId = resultBlock.file_id; break; } } } } if (fileId) { // Download the file const fileContent = await client.beta.files.download(fileId, { betas: ['files-api-2025-04-14'] }); // Save to disk const fs = require('fs'); fs.writeFileSync('renewable_energy.pptx', Buffer.from(await fileContent.arrayBuffer())); console.log('Presentation saved to renewable_energy.pptx'); } ``` ```bash Shell # Extract file_id from response (using jq) FILE_ID=$(echo "$RESPONSE" | jq -r '.content[] | select(.type=="tool_use" and .name=="code_execution") | .content[] | select(.file_id) | .file_id') # Download the file curl "https://api.anthropic.com/v1/files/$FILE_ID/content" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" \ --output renewable_energy.pptx echo "Presentation saved to renewable_energy.pptx" ``` For complete details on working with generated files, see the [code execution tool documentation](/docs/en/agents-and-tools/tool-use/code-execution-tool#retrieve-generated-files). ## Try more examples Now that you've created your first document with Skills, try these variations: ### Create a spreadsheet ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "skills": [ { "type": "anthropic", "skill_id": "xlsx", "version": "latest" } ] }, messages=[{ "role": "user", "content": "Create a quarterly sales tracking spreadsheet with sample data" }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) ``` ```typescript TypeScript const response = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { skills: [ { type: 'anthropic', skill_id: 'xlsx', version: 'latest' } ] }, messages: [{ role: 'user', content: 'Create a quarterly sales tracking spreadsheet with sample data' }], tools: [{ type: 'code_execution_20250825', name: 'code_execution' }] }); ``` ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 4096, "container": { "skills": [ { "type": "anthropic", "skill_id": "xlsx", "version": "latest" } ] }, "messages": [{ "role": "user", "content": "Create a quarterly sales tracking spreadsheet with sample data" }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }' ``` ### Create a Word document ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "skills": [ { "type": "anthropic", "skill_id": "docx", "version": "latest" } ] }, messages=[{ "role": "user", "content": "Write a 2-page report on the benefits of renewable energy" }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) ``` ```typescript TypeScript const response = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { skills: [ { type: 'anthropic', skill_id: 'docx', version: 'latest' } ] }, messages: [{ role: 'user', content: 'Write a 2-page report on the benefits of renewable energy' }], tools: [{ type: 'code_execution_20250825', name: 'code_execution' }] }); ``` ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 4096, "container": { "skills": [ { "type": "anthropic", "skill_id": "docx", "version": "latest" } ] }, "messages": [{ "role": "user", "content": "Write a 2-page report on the benefits of renewable energy" }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }' ``` ### Generate a PDF ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "skills": [ { "type": "anthropic", "skill_id": "pdf", "version": "latest" } ] }, messages=[{ "role": "user", "content": "Generate a PDF invoice template" }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) ``` ```typescript TypeScript const response = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { skills: [ { type: 'anthropic', skill_id: 'pdf', version: 'latest' } ] }, messages: [{ role: 'user', content: 'Generate a PDF invoice template' }], tools: [{ type: 'code_execution_20250825', name: 'code_execution' }] }); ``` ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 4096, "container": { "skills": [ { "type": "anthropic", "skill_id": "pdf", "version": "latest" } ] }, "messages": [{ "role": "user", "content": "Generate a PDF invoice template" }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }' ``` ## Next steps Now that you've used pre-built Agent Skills, you can: Use Skills with the Claude API Upload your own Skills for specialized tasks Learn best practices for writing effective Skills Learn about Skills in Claude Code Use Skills programmatically in TypeScript and Python Explore example Skills and implementation patterns --- # Skill authoring best practices URL: https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices # Skill authoring best practices Learn how to write effective Skills that Claude can discover and use successfully. --- Good Skills are concise, well-structured, and tested with real usage. This guide provides practical authoring decisions to help you write Skills that Claude can discover and use effectively. For conceptual background on how Skills work, see the [Skills overview](/docs/en/agents-and-tools/agent-skills/overview). ## Core principles ### Concise is key The [context window](/docs/en/build-with-claude/context-windows) is a public good. Your Skill shares the context window with everything else Claude needs to know, including: - The system prompt - Conversation history - Other Skills' metadata - Your actual request Not every token in your Skill has an immediate cost. At startup, only the metadata (name and description) from all Skills is pre-loaded. Claude reads SKILL.md only when the Skill becomes relevant, and reads additional files only as needed. However, being concise in SKILL.md still matters: once Claude loads it, every token competes with conversation history and other context. **Default assumption**: Claude is already very smart Only add context Claude doesn't already have. Challenge each piece of information: - "Does Claude really need this explanation?" - "Can I assume Claude knows this?" - "Does this paragraph justify its token cost?" **Good example: Concise** (approximately 50 tokens): ````markdown ## Extract PDF text Use pdfplumber for text extraction: ```python import pdfplumber with pdfplumber.open("file.pdf") as pdf: text = pdf.pages[0].extract_text() ``` ```` **Bad example: Too verbose** (approximately 150 tokens): ```markdown ## Extract PDF text PDF (Portable Document Format) files are a common file format that contains text, images, and other content. To extract text from a PDF, you'll need to use a library. There are many libraries available for PDF processing, but we recommend pdfplumber because it's easy to use and handles most cases well. First, you'll need to install it using pip. Then you can use the code below... ``` The concise version assumes Claude knows what PDFs are and how libraries work. ### Set appropriate degrees of freedom Match the level of specificity to the task's fragility and variability. **High freedom** (text-based instructions): Use when: - Multiple approaches are valid - Decisions depend on context - Heuristics guide the approach Example: ```markdown ## Code review process 1. Analyze the code structure and organization 2. Check for potential bugs or edge cases 3. Suggest improvements for readability and maintainability 4. Verify adherence to project conventions ``` **Medium freedom** (pseudocode or scripts with parameters): Use when: - A preferred pattern exists - Some variation is acceptable - Configuration affects behavior Example: ````markdown ## Generate report Use this template and customize as needed: ```python def generate_report(data, format="markdown", include_charts=True): # Process data # Generate output in specified format # Optionally include visualizations ``` ```` **Low freedom** (specific scripts, few or no parameters): Use when: - Operations are fragile and error-prone - Consistency is critical - A specific sequence must be followed Example: ````markdown ## Database migration Run exactly this script: ```bash python scripts/migrate.py --verify --backup ``` Do not modify the command or add additional flags. ```` **Analogy**: Think of Claude as a robot exploring a path: - **Narrow bridge with cliffs on both sides**: There's only one safe way forward. Provide specific guardrails and exact instructions (low freedom). Example: database migrations that must run in exact sequence. - **Open field with no hazards**: Many paths lead to success. Give general direction and trust Claude to find the best route (high freedom). Example: code reviews where context determines the best approach. ### Test with all models you plan to use Skills act as additions to models, so effectiveness depends on the underlying model. Test your Skill with all the models you plan to use it with. **Testing considerations by model**: - **Claude Haiku** (fast, economical): Does the Skill provide enough guidance? - **Claude Sonnet** (balanced): Is the Skill clear and efficient? - **Claude Opus** (powerful reasoning): Does the Skill avoid over-explaining? What works perfectly for Opus might need more detail for Haiku. If you plan to use your Skill across multiple models, aim for instructions that work well with all of them. ## Skill structure **YAML Frontmatter**: The SKILL.md frontmatter requires two fields: `name`: - Maximum 64 characters - Must contain only lowercase letters, numbers, and hyphens - Cannot contain XML tags - Cannot contain reserved words: "anthropic", "claude" `description`: - Must be non-empty - Maximum 1024 characters - Cannot contain XML tags - Should describe what the Skill does and when to use it For complete Skill structure details, see the [Skills overview](/docs/en/agents-and-tools/agent-skills/overview#skill-structure). ### Naming conventions Use consistent naming patterns to make Skills easier to reference and discuss. We recommend using **gerund form** (verb + -ing) for Skill names, as this clearly describes the activity or capability the Skill provides. Remember that the `name` field must use lowercase letters, numbers, and hyphens only. **Good naming examples (gerund form)**: - `processing-pdfs` - `analyzing-spreadsheets` - `managing-databases` - `testing-code` - `writing-documentation` **Acceptable alternatives**: - Noun phrases: `pdf-processing`, `spreadsheet-analysis` - Action-oriented: `process-pdfs`, `analyze-spreadsheets` **Avoid**: - Vague names: `helper`, `utils`, `tools` - Overly generic: `documents`, `data`, `files` - Reserved words: `anthropic-helper`, `claude-tools` - Inconsistent patterns within your skill collection Consistent naming makes it easier to: - Reference Skills in documentation and conversations - Understand what a Skill does at a glance - Organize and search through multiple Skills - Maintain a professional, cohesive skill library ### Writing effective descriptions The `description` field enables Skill discovery and should include both what the Skill does and when to use it. **Always write in third person**. The description is injected into the system prompt, and inconsistent point-of-view can cause discovery problems. - **Good:** "Processes Excel files and generates reports" - **Avoid:** "I can help you process Excel files" - **Avoid:** "You can use this to process Excel files" **Be specific and include key terms**. Include both what the Skill does and specific triggers/contexts for when to use it. Each Skill has exactly one description field. The description is critical for skill selection: Claude uses it to choose the right Skill from potentially 100+ available Skills. Your description must provide enough detail for Claude to know when to select this Skill, while the rest of SKILL.md provides the implementation details. Effective examples: **PDF Processing skill:** ```yaml description: Extract text and tables from PDF files, fill forms, merge documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction. ``` **Excel Analysis skill:** ```yaml description: Analyze Excel spreadsheets, create pivot tables, generate charts. Use when analyzing Excel files, spreadsheets, tabular data, or .xlsx files. ``` **Git Commit Helper skill:** ```yaml description: Generate descriptive commit messages by analyzing git diffs. Use when the user asks for help writing commit messages or reviewing staged changes. ``` Avoid vague descriptions like these: ```yaml description: Helps with documents ``` ```yaml description: Processes data ``` ```yaml description: Does stuff with files ``` ### Progressive disclosure patterns SKILL.md serves as an overview that points Claude to detailed materials as needed, like a table of contents in an onboarding guide. For an explanation of how progressive disclosure works, see [How Skills work](/docs/en/agents-and-tools/agent-skills/overview#how-skills-work) in the overview. **Practical guidance:** - Keep SKILL.md body under 500 lines for optimal performance - Split content into separate files when approaching this limit - Use the patterns below to organize instructions, code, and resources effectively #### Visual overview: From simple to complex A basic Skill starts with just a SKILL.md file containing metadata and instructions: ![Simple SKILL.md file showing YAML frontmatter and markdown body](/docs/images/agent-skills-simple-file.png) As your Skill grows, you can bundle additional content that Claude loads only when needed: ![Bundling additional reference files like reference.md and forms.md.](/docs/images/agent-skills-bundling-content.png) The complete Skill directory structure might look like this: ``` pdf/ ├── SKILL.md # Main instructions (loaded when triggered) ├── FORMS.md # Form-filling guide (loaded as needed) ├── reference.md # API reference (loaded as needed) ├── examples.md # Usage examples (loaded as needed) └── scripts/ ├── analyze_form.py # Utility script (executed, not loaded) ├── fill_form.py # Form filling script └── validate.py # Validation script ``` #### Pattern 1: High-level guide with references ````markdown --- name: pdf-processing description: Extracts text and tables from PDF files, fills forms, and merges documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction. --- # PDF Processing ## Quick start Extract text with pdfplumber: ```python import pdfplumber with pdfplumber.open("file.pdf") as pdf: text = pdf.pages[0].extract_text() ``` ## Advanced features **Form filling**: See [FORMS.md](FORMS.md) for complete guide **API reference**: See [REFERENCE.md](REFERENCE.md) for all methods **Examples**: See [EXAMPLES.md](EXAMPLES.md) for common patterns ```` Claude loads FORMS.md, REFERENCE.md, or EXAMPLES.md only when needed. #### Pattern 2: Domain-specific organization For Skills with multiple domains, organize content by domain to avoid loading irrelevant context. When a user asks about sales metrics, Claude only needs to read sales-related schemas, not finance or marketing data. This keeps token usage low and context focused. ``` bigquery-skill/ ├── SKILL.md (overview and navigation) └── reference/ ├── finance.md (revenue, billing metrics) ├── sales.md (opportunities, pipeline) ├── product.md (API usage, features) └── marketing.md (campaigns, attribution) ``` ````markdown SKILL.md # BigQuery Data Analysis ## Available datasets **Finance**: Revenue, ARR, billing → See [reference/finance.md](reference/finance.md) **Sales**: Opportunities, pipeline, accounts → See [reference/sales.md](reference/sales.md) **Product**: API usage, features, adoption → See [reference/product.md](reference/product.md) **Marketing**: Campaigns, attribution, email → See [reference/marketing.md](reference/marketing.md) ## Quick search Find specific metrics using grep: ```bash grep -i "revenue" reference/finance.md grep -i "pipeline" reference/sales.md grep -i "api usage" reference/product.md ``` ```` #### Pattern 3: Conditional details Show basic content, link to advanced content: ```markdown # DOCX Processing ## Creating documents Use docx-js for new documents. See [DOCX-JS.md](DOCX-JS.md). ## Editing documents For simple edits, modify the XML directly. **For tracked changes**: See [REDLINING.md](REDLINING.md) **For OOXML details**: See [OOXML.md](OOXML.md) ``` Claude reads REDLINING.md or OOXML.md only when the user needs those features. ### Avoid deeply nested references Claude may partially read files when they're referenced from other referenced files. When encountering nested references, Claude might use commands like `head -100` to preview content rather than reading entire files, resulting in incomplete information. **Keep references one level deep from SKILL.md**. All reference files should link directly from SKILL.md to ensure Claude reads complete files when needed. **Bad example: Too deep**: ```markdown # SKILL.md See [advanced.md](advanced.md)... # advanced.md See [details.md](details.md)... # details.md Here's the actual information... ``` **Good example: One level deep**: ```markdown # SKILL.md **Basic usage**: [instructions in SKILL.md] **Advanced features**: See [advanced.md](advanced.md) **API reference**: See [reference.md](reference.md) **Examples**: See [examples.md](examples.md) ``` ### Structure longer reference files with table of contents For reference files longer than 100 lines, include a table of contents at the top. This ensures Claude can see the full scope of available information even when previewing with partial reads. **Example**: ```markdown # API Reference ## Contents - Authentication and setup - Core methods (create, read, update, delete) - Advanced features (batch operations, webhooks) - Error handling patterns - Code examples ## Authentication and setup ... ## Core methods ... ``` Claude can then read the complete file or jump to specific sections as needed. For details on how this filesystem-based architecture enables progressive disclosure, see the [Runtime environment](#runtime-environment) section in the Advanced section below. ## Workflows and feedback loops ### Use workflows for complex tasks Break complex operations into clear, sequential steps. For particularly complex workflows, provide a checklist that Claude can copy into its response and check off as it progresses. **Example 1: Research synthesis workflow** (for Skills without code): ````markdown ## Research synthesis workflow Copy this checklist and track your progress: ``` Research Progress: - [ ] Step 1: Read all source documents - [ ] Step 2: Identify key themes - [ ] Step 3: Cross-reference claims - [ ] Step 4: Create structured summary - [ ] Step 5: Verify citations ``` **Step 1: Read all source documents** Review each document in the `sources/` directory. Note the main arguments and supporting evidence. **Step 2: Identify key themes** Look for patterns across sources. What themes appear repeatedly? Where do sources agree or disagree? **Step 3: Cross-reference claims** For each major claim, verify it appears in the source material. Note which source supports each point. **Step 4: Create structured summary** Organize findings by theme. Include: - Main claim - Supporting evidence from sources - Conflicting viewpoints (if any) **Step 5: Verify citations** Check that every claim references the correct source document. If citations are incomplete, return to Step 3. ```` This example shows how workflows apply to analysis tasks that don't require code. The checklist pattern works for any complex, multi-step process. **Example 2: PDF form filling workflow** (for Skills with code): ````markdown ## PDF form filling workflow Copy this checklist and check off items as you complete them: ``` Task Progress: - [ ] Step 1: Analyze the form (run analyze_form.py) - [ ] Step 2: Create field mapping (edit fields.json) - [ ] Step 3: Validate mapping (run validate_fields.py) - [ ] Step 4: Fill the form (run fill_form.py) - [ ] Step 5: Verify output (run verify_output.py) ``` **Step 1: Analyze the form** Run: `python scripts/analyze_form.py input.pdf` This extracts form fields and their locations, saving to `fields.json`. **Step 2: Create field mapping** Edit `fields.json` to add values for each field. **Step 3: Validate mapping** Run: `python scripts/validate_fields.py fields.json` Fix any validation errors before continuing. **Step 4: Fill the form** Run: `python scripts/fill_form.py input.pdf fields.json output.pdf` **Step 5: Verify output** Run: `python scripts/verify_output.py output.pdf` If verification fails, return to Step 2. ```` Clear steps prevent Claude from skipping critical validation. The checklist helps both Claude and you track progress through multi-step workflows. ### Implement feedback loops **Common pattern**: Run validator → fix errors → repeat This pattern greatly improves output quality. **Example 1: Style guide compliance** (for Skills without code): ```markdown ## Content review process 1. Draft your content following the guidelines in STYLE_GUIDE.md 2. Review against the checklist: - Check terminology consistency - Verify examples follow the standard format - Confirm all required sections are present 3. If issues found: - Note each issue with specific section reference - Revise the content - Review the checklist again 4. Only proceed when all requirements are met 5. Finalize and save the document ``` This shows the validation loop pattern using reference documents instead of scripts. The "validator" is STYLE_GUIDE.md, and Claude performs the check by reading and comparing. **Example 2: Document editing process** (for Skills with code): ```markdown ## Document editing process 1. Make your edits to `word/document.xml` 2. **Validate immediately**: `python ooxml/scripts/validate.py unpacked_dir/` 3. If validation fails: - Review the error message carefully - Fix the issues in the XML - Run validation again 4. **Only proceed when validation passes** 5. Rebuild: `python ooxml/scripts/pack.py unpacked_dir/ output.docx` 6. Test the output document ``` The validation loop catches errors early. ## Content guidelines ### Avoid time-sensitive information Don't include information that will become outdated: **Bad example: Time-sensitive** (will become wrong): ```markdown If you're doing this before August 2025, use the old API. After August 2025, use the new API. ``` **Good example** (use "old patterns" section): ```markdown ## Current method Use the v2 API endpoint: `api.example.com/v2/messages` ## Old patterns
Legacy v1 API (deprecated 2025-08) The v1 API used: `api.example.com/v1/messages` This endpoint is no longer supported.
``` The old patterns section provides historical context without cluttering the main content. ### Use consistent terminology Choose one term and use it throughout the Skill: **Good - Consistent**: - Always "API endpoint" - Always "field" - Always "extract" **Bad - Inconsistent**: - Mix "API endpoint", "URL", "API route", "path" - Mix "field", "box", "element", "control" - Mix "extract", "pull", "get", "retrieve" Consistency helps Claude understand and follow instructions. ## Common patterns ### Template pattern Provide templates for output format. Match the level of strictness to your needs. **For strict requirements** (like API responses or data formats): ````markdown ## Report structure ALWAYS use this exact template structure: ```markdown # [Analysis Title] ## Executive summary [One-paragraph overview of key findings] ## Key findings - Finding 1 with supporting data - Finding 2 with supporting data - Finding 3 with supporting data ## Recommendations 1. Specific actionable recommendation 2. Specific actionable recommendation ``` ```` **For flexible guidance** (when adaptation is useful): ````markdown ## Report structure Here is a sensible default format, but use your best judgment based on the analysis: ```markdown # [Analysis Title] ## Executive summary [Overview] ## Key findings [Adapt sections based on what you discover] ## Recommendations [Tailor to the specific context] ``` Adjust sections as needed for the specific analysis type. ```` ### Examples pattern For Skills where output quality depends on seeing examples, provide input/output pairs just like in regular prompting: ````markdown ## Commit message format Generate commit messages following these examples: **Example 1:** Input: Added user authentication with JWT tokens Output: ``` feat(auth): implement JWT-based authentication Add login endpoint and token validation middleware ``` **Example 2:** Input: Fixed bug where dates displayed incorrectly in reports Output: ``` fix(reports): correct date formatting in timezone conversion Use UTC timestamps consistently across report generation ``` **Example 3:** Input: Updated dependencies and refactored error handling Output: ``` chore: update dependencies and refactor error handling - Upgrade lodash to 4.17.21 - Standardize error response format across endpoints ``` Follow this style: type(scope): brief description, then detailed explanation. ```` Examples help Claude understand the desired style and level of detail more clearly than descriptions alone. ### Conditional workflow pattern Guide Claude through decision points: ```markdown ## Document modification workflow 1. Determine the modification type: **Creating new content?** → Follow "Creation workflow" below **Editing existing content?** → Follow "Editing workflow" below 2. Creation workflow: - Use docx-js library - Build document from scratch - Export to .docx format 3. Editing workflow: - Unpack existing document - Modify XML directly - Validate after each change - Repack when complete ``` If workflows become large or complicated with many steps, consider pushing them into separate files and tell Claude to read the appropriate file based on the task at hand. ## Evaluation and iteration ### Build evaluations first **Create evaluations BEFORE writing extensive documentation.** This ensures your Skill solves real problems rather than documenting imagined ones. **Evaluation-driven development:** 1. **Identify gaps**: Run Claude on representative tasks without a Skill. Document specific failures or missing context 2. **Create evaluations**: Build three scenarios that test these gaps 3. **Establish baseline**: Measure Claude's performance without the Skill 4. **Write minimal instructions**: Create just enough content to address the gaps and pass evaluations 5. **Iterate**: Execute evaluations, compare against baseline, and refine This approach ensures you're solving actual problems rather than anticipating requirements that may never materialize. **Evaluation structure**: ```json { "skills": ["pdf-processing"], "query": "Extract all text from this PDF file and save it to output.txt", "files": ["test-files/document.pdf"], "expected_behavior": [ "Successfully reads the PDF file using an appropriate PDF processing library or command-line tool", "Extracts text content from all pages in the document without missing any pages", "Saves the extracted text to a file named output.txt in a clear, readable format" ] } ``` This example demonstrates a data-driven evaluation with a simple testing rubric. We do not currently provide a built-in way to run these evaluations. Users can create their own evaluation system. Evaluations are your source of truth for measuring Skill effectiveness. ### Develop Skills iteratively with Claude The most effective Skill development process involves Claude itself. Work with one instance of Claude ("Claude A") to create a Skill that will be used by other instances ("Claude B"). Claude A helps you design and refine instructions, while Claude B tests them in real tasks. This works because Claude models understand both how to write effective agent instructions and what information agents need. **Creating a new Skill:** 1. **Complete a task without a Skill**: Work through a problem with Claude A using normal prompting. As you work, you'll naturally provide context, explain preferences, and share procedural knowledge. Notice what information you repeatedly provide. 2. **Identify the reusable pattern**: After completing the task, identify what context you provided that would be useful for similar future tasks. **Example**: If you worked through a BigQuery analysis, you might have provided table names, field definitions, filtering rules (like "always exclude test accounts"), and common query patterns. 3. **Ask Claude A to create a Skill**: "Create a Skill that captures this BigQuery analysis pattern we just used. Include the table schemas, naming conventions, and the rule about filtering test accounts." Claude models understand the Skill format and structure natively. You don't need special system prompts or a "writing skills" skill to get Claude to help create Skills. Simply ask Claude to create a Skill and it will generate properly structured SKILL.md content with appropriate frontmatter and body content. 4. **Review for conciseness**: Check that Claude A hasn't added unnecessary explanations. Ask: "Remove the explanation about what win rate means - Claude already knows that." 5. **Improve information architecture**: Ask Claude A to organize the content more effectively. For example: "Organize this so the table schema is in a separate reference file. We might add more tables later." 6. **Test on similar tasks**: Use the Skill with Claude B (a fresh instance with the Skill loaded) on related use cases. Observe whether Claude B finds the right information, applies rules correctly, and handles the task successfully. 7. **Iterate based on observation**: If Claude B struggles or misses something, return to Claude A with specifics: "When Claude used this Skill, it forgot to filter by date for Q4. Should we add a section about date filtering patterns?" **Iterating on existing Skills:** The same hierarchical pattern continues when improving Skills. You alternate between: - **Working with Claude A** (the expert who helps refine the Skill) - **Testing with Claude B** (the agent using the Skill to perform real work) - **Observing Claude B's behavior** and bringing insights back to Claude A 1. **Use the Skill in real workflows**: Give Claude B (with the Skill loaded) actual tasks, not test scenarios 2. **Observe Claude B's behavior**: Note where it struggles, succeeds, or makes unexpected choices **Example observation**: "When I asked Claude B for a regional sales report, it wrote the query but forgot to filter out test accounts, even though the Skill mentions this rule." 3. **Return to Claude A for improvements**: Share the current SKILL.md and describe what you observed. Ask: "I noticed Claude B forgot to filter test accounts when I asked for a regional report. The Skill mentions filtering, but maybe it's not prominent enough?" 4. **Review Claude A's suggestions**: Claude A might suggest reorganizing to make rules more prominent, using stronger language like "MUST filter" instead of "always filter", or restructuring the workflow section. 5. **Apply and test changes**: Update the Skill with Claude A's refinements, then test again with Claude B on similar requests 6. **Repeat based on usage**: Continue this observe-refine-test cycle as you encounter new scenarios. Each iteration improves the Skill based on real agent behavior, not assumptions. **Gathering team feedback:** 1. Share Skills with teammates and observe their usage 2. Ask: Does the Skill activate when expected? Are instructions clear? What's missing? 3. Incorporate feedback to address blind spots in your own usage patterns **Why this approach works**: Claude A understands agent needs, you provide domain expertise, Claude B reveals gaps through real usage, and iterative refinement improves Skills based on observed behavior rather than assumptions. ### Observe how Claude navigates Skills As you iterate on Skills, pay attention to how Claude actually uses them in practice. Watch for: - **Unexpected exploration paths**: Does Claude read files in an order you didn't anticipate? This might indicate your structure isn't as intuitive as you thought - **Missed connections**: Does Claude fail to follow references to important files? Your links might need to be more explicit or prominent - **Overreliance on certain sections**: If Claude repeatedly reads the same file, consider whether that content should be in the main SKILL.md instead - **Ignored content**: If Claude never accesses a bundled file, it might be unnecessary or poorly signaled in the main instructions Iterate based on these observations rather than assumptions. The 'name' and 'description' in your Skill's metadata are particularly critical. Claude uses these when deciding whether to trigger the Skill in response to the current task. Make sure they clearly describe what the Skill does and when it should be used. ## Anti-patterns to avoid ### Avoid Windows-style paths Always use forward slashes in file paths, even on Windows: - ✓ **Good**: `scripts/helper.py`, `reference/guide.md` - ✗ **Avoid**: `scripts\helper.py`, `reference\guide.md` Unix-style paths work across all platforms, while Windows-style paths cause errors on Unix systems. ### Avoid offering too many options Don't present multiple approaches unless necessary: ````markdown **Bad example: Too many choices** (confusing): "You can use pypdf, or pdfplumber, or PyMuPDF, or pdf2image, or..." **Good example: Provide a default** (with escape hatch): "Use pdfplumber for text extraction: ```python import pdfplumber ``` For scanned PDFs requiring OCR, use pdf2image with pytesseract instead." ```` ## Advanced: Skills with executable code The sections below focus on Skills that include executable scripts. If your Skill uses only markdown instructions, skip to [Checklist for effective Skills](#checklist-for-effective-skills). ### Solve, don't punt When writing scripts for Skills, handle error conditions rather than punting to Claude. **Good example: Handle errors explicitly**: ```python def process_file(path): """Process a file, creating it if it doesn't exist.""" try: with open(path) as f: return f.read() except FileNotFoundError: # Create file with default content instead of failing print(f"File {path} not found, creating default") with open(path, 'w') as f: f.write('') return '' except PermissionError: # Provide alternative instead of failing print(f"Cannot access {path}, using default") return '' ``` **Bad example: Punt to Claude**: ```python def process_file(path): # Just fail and let Claude figure it out return open(path).read() ``` Configuration parameters should also be justified and documented to avoid "voodoo constants" (Ousterhout's law). If you don't know the right value, how will Claude determine it? **Good example: Self-documenting**: ```python # HTTP requests typically complete within 30 seconds # Longer timeout accounts for slow connections REQUEST_TIMEOUT = 30 # Three retries balances reliability vs speed # Most intermittent failures resolve by the second retry MAX_RETRIES = 3 ``` **Bad example: Magic numbers**: ```python TIMEOUT = 47 # Why 47? RETRIES = 5 # Why 5? ``` ### Provide utility scripts Even if Claude could write a script, pre-made scripts offer advantages: **Benefits of utility scripts**: - More reliable than generated code - Save tokens (no need to include code in context) - Save time (no code generation required) - Ensure consistency across uses ![Bundling executable scripts alongside instruction files](/docs/images/agent-skills-executable-scripts.png) The diagram above shows how executable scripts work alongside instruction files. The instruction file (forms.md) references the script, and Claude can execute it without loading its contents into context. **Important distinction**: Make clear in your instructions whether Claude should: - **Execute the script** (most common): "Run `analyze_form.py` to extract fields" - **Read it as reference** (for complex logic): "See `analyze_form.py` for the field extraction algorithm" For most utility scripts, execution is preferred because it's more reliable and efficient. See the [Runtime environment](#runtime-environment) section below for details on how script execution works. **Example**: ````markdown ## Utility scripts **analyze_form.py**: Extract all form fields from PDF ```bash python scripts/analyze_form.py input.pdf > fields.json ``` Output format: ```json { "field_name": {"type": "text", "x": 100, "y": 200}, "signature": {"type": "sig", "x": 150, "y": 500} } ``` **validate_boxes.py**: Check for overlapping bounding boxes ```bash python scripts/validate_boxes.py fields.json # Returns: "OK" or lists conflicts ``` **fill_form.py**: Apply field values to PDF ```bash python scripts/fill_form.py input.pdf fields.json output.pdf ``` ```` ### Use visual analysis When inputs can be rendered as images, have Claude analyze them: ````markdown ## Form layout analysis 1. Convert PDF to images: ```bash python scripts/pdf_to_images.py form.pdf ``` 2. Analyze each page image to identify form fields 3. Claude can see field locations and types visually ```` In this example, you'd need to write the `pdf_to_images.py` script. Claude's vision capabilities help understand layouts and structures. ### Create verifiable intermediate outputs When Claude performs complex, open-ended tasks, it can make mistakes. The "plan-validate-execute" pattern catches errors early by having Claude first create a plan in a structured format, then validate that plan with a script before executing it. **Example**: Imagine asking Claude to update 50 form fields in a PDF based on a spreadsheet. Without validation, Claude might reference non-existent fields, create conflicting values, miss required fields, or apply updates incorrectly. **Solution**: Use the workflow pattern shown above (PDF form filling), but add an intermediate `changes.json` file that gets validated before applying changes. The workflow becomes: analyze → **create plan file** → **validate plan** → execute → verify. **Why this pattern works:** - **Catches errors early**: Validation finds problems before changes are applied - **Machine-verifiable**: Scripts provide objective verification - **Reversible planning**: Claude can iterate on the plan without touching originals - **Clear debugging**: Error messages point to specific problems **When to use**: Batch operations, destructive changes, complex validation rules, high-stakes operations. **Implementation tip**: Make validation scripts verbose with specific error messages like "Field 'signature_date' not found. Available fields: customer_name, order_total, signature_date_signed" to help Claude fix issues. ### Package dependencies Skills run in the code execution environment with platform-specific limitations: - **claude.ai**: Can install packages from npm and PyPI and pull from GitHub repositories - **Anthropic API**: Has no network access and no runtime package installation List required packages in your SKILL.md and verify they're available in the [code execution tool documentation](/docs/en/agents-and-tools/tool-use/code-execution-tool). ### Runtime environment Skills run in a code execution environment with filesystem access, bash commands, and code execution capabilities. For the conceptual explanation of this architecture, see [The Skills architecture](/docs/en/agents-and-tools/agent-skills/overview#the-skills-architecture) in the overview. **How this affects your authoring:** **How Claude accesses Skills:** 1. **Metadata pre-loaded**: At startup, the name and description from all Skills' YAML frontmatter are loaded into the system prompt 2. **Files read on-demand**: Claude uses bash Read tools to access SKILL.md and other files from the filesystem when needed 3. **Scripts executed efficiently**: Utility scripts can be executed via bash without loading their full contents into context. Only the script's output consumes tokens 4. **No context penalty for large files**: Reference files, data, or documentation don't consume context tokens until actually read - **File paths matter**: Claude navigates your skill directory like a filesystem. Use forward slashes (`reference/guide.md`), not backslashes - **Name files descriptively**: Use names that indicate content: `form_validation_rules.md`, not `doc2.md` - **Organize for discovery**: Structure directories by domain or feature - Good: `reference/finance.md`, `reference/sales.md` - Bad: `docs/file1.md`, `docs/file2.md` - **Bundle comprehensive resources**: Include complete API docs, extensive examples, large datasets; no context penalty until accessed - **Prefer scripts for deterministic operations**: Write `validate_form.py` rather than asking Claude to generate validation code - **Make execution intent clear**: - "Run `analyze_form.py` to extract fields" (execute) - "See `analyze_form.py` for the extraction algorithm" (read as reference) - **Test file access patterns**: Verify Claude can navigate your directory structure by testing with real requests **Example:** ``` bigquery-skill/ ├── SKILL.md (overview, points to reference files) └── reference/ ├── finance.md (revenue metrics) ├── sales.md (pipeline data) └── product.md (usage analytics) ``` When the user asks about revenue, Claude reads SKILL.md, sees the reference to `reference/finance.md`, and invokes bash to read just that file. The sales.md and product.md files remain on the filesystem, consuming zero context tokens until needed. This filesystem-based model is what enables progressive disclosure. Claude can navigate and selectively load exactly what each task requires. For complete details on the technical architecture, see [How Skills work](/docs/en/agents-and-tools/agent-skills/overview#how-skills-work) in the Skills overview. ### MCP tool references If your Skill uses MCP (Model Context Protocol) tools, always use fully qualified tool names to avoid "tool not found" errors. **Format**: `ServerName:tool_name` **Example**: ```markdown Use the BigQuery:bigquery_schema tool to retrieve table schemas. Use the GitHub:create_issue tool to create issues. ``` Where: - `BigQuery` and `GitHub` are MCP server names - `bigquery_schema` and `create_issue` are the tool names within those servers Without the server prefix, Claude may fail to locate the tool, especially when multiple MCP servers are available. ### Avoid assuming tools are installed Don't assume packages are available: ````markdown **Bad example: Assumes installation**: "Use the pdf library to process the file." **Good example: Explicit about dependencies**: "Install required package: `pip install pypdf` Then use it: ```python from pypdf import PdfReader reader = PdfReader("file.pdf") ```" ```` ## Technical notes ### YAML frontmatter requirements The SKILL.md frontmatter requires `name` and `description` fields with specific validation rules: - `name`: Maximum 64 characters, lowercase letters/numbers/hyphens only, no XML tags, no reserved words - `description`: Maximum 1024 characters, non-empty, no XML tags See the [Skills overview](/docs/en/agents-and-tools/agent-skills/overview#skill-structure) for complete structure details. ### Token budgets Keep SKILL.md body under 500 lines for optimal performance. If your content exceeds this, split it into separate files using the progressive disclosure patterns described earlier. For architectural details, see the [Skills overview](/docs/en/agents-and-tools/agent-skills/overview#how-skills-work). ## Checklist for effective Skills Before sharing a Skill, verify: ### Core quality - [ ] Description is specific and includes key terms - [ ] Description includes both what the Skill does and when to use it - [ ] SKILL.md body is under 500 lines - [ ] Additional details are in separate files (if needed) - [ ] No time-sensitive information (or in "old patterns" section) - [ ] Consistent terminology throughout - [ ] Examples are concrete, not abstract - [ ] File references are one level deep - [ ] Progressive disclosure used appropriately - [ ] Workflows have clear steps ### Code and scripts - [ ] Scripts solve problems rather than punt to Claude - [ ] Error handling is explicit and helpful - [ ] No "voodoo constants" (all values justified) - [ ] Required packages listed in instructions and verified as available - [ ] Scripts have clear documentation - [ ] No Windows-style paths (all forward slashes) - [ ] Validation/verification steps for critical operations - [ ] Feedback loops included for quality-critical tasks ### Testing - [ ] At least three evaluations created - [ ] Tested with Haiku, Sonnet, and Opus - [ ] Tested with real usage scenarios - [ ] Team feedback incorporated (if applicable) ## Next steps Create your first Skill Create and manage Skills in Claude Code Use Skills programmatically in TypeScript and Python Upload and use Skills programmatically --- # Using Agent Skills with the API URL: https://platform.claude.com/docs/en/build-with-claude/skills-guide # Using Agent Skills with the API Learn how to use Agent Skills to extend Claude's capabilities through the API. --- Agent Skills extend Claude's capabilities through organized folders of instructions, scripts, and resources. This guide shows you how to use both pre-built and custom Skills with the Claude API. For complete API reference including request/response schemas and all parameters, see: - [Skill Management API Reference](/docs/en/api/skills/list-skills) - CRUD operations for Skills - [Skill Versions API Reference](/docs/en/api/skills/list-skill-versions) - Version management ## Quick Links Create your first Skill Best practices for authoring Skills ## Overview For a deep dive into the architecture and real-world applications of Agent Skills, read our engineering blog: [Equipping agents for the real world with Agent Skills](https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills). Skills integrate with the Messages API through the code execution tool. Whether using pre-built Skills managed by Anthropic or custom Skills you've uploaded, the integration shape is identical—both require code execution and use the same `container` structure. ### Using Skills Skills integrate identically in the Messages API regardless of source. You specify Skills in the `container` parameter with a `skill_id`, `type`, and optional `version`, and they execute in the code execution environment. **You can use Skills from two sources:** | Aspect | Anthropic Skills | Custom Skills | |--------|------------------|---------------| | **Type value** | `anthropic` | `custom` | | **Skill IDs** | Short names: `pptx`, `xlsx`, `docx`, `pdf` | Generated: `skill_01AbCdEfGhIjKlMnOpQrStUv` | | **Version format** | Date-based: `20251013` or `latest` | Epoch timestamp: `1759178010641129` or `latest` | | **Management** | Pre-built and maintained by Anthropic | Upload and manage via [Skills API](/docs/en/api/skills/create-skill) | | **Availability** | Available to all users | Private to your workspace | Both skill sources are returned by the [List Skills endpoint](/docs/en/api/skills/list-skills) (use the `source` parameter to filter). The integration shape and execution environment are identical—the only difference is where the Skills come from and how they're managed. ### Prerequisites To use Skills, you need: 1. **Anthropic API key** from the [Console](/settings/keys) 2. **Beta headers**: - `code-execution-2025-08-25` - Enables code execution (required for Skills) - `skills-2025-10-02` - Enables Skills API - `files-api-2025-04-14` - For uploading/downloading files to/from container 3. **Code execution tool** enabled in your requests --- ## Using Skills in Messages ### Container Parameter Skills are specified using the `container` parameter in the Messages API. You can include up to 8 Skills per request. The structure is identical for both Anthropic and custom Skills—specify the required `type` and `skill_id`, and optionally include `version` to pin to a specific version: ```python Python import anthropic client = anthropic.Anthropic() response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "skills": [ { "type": "anthropic", "skill_id": "pptx", "version": "latest" } ] }, messages=[{ "role": "user", "content": "Create a presentation about renewable energy" }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); const response = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { skills: [ { type: 'anthropic', skill_id: 'pptx', version: 'latest' } ] }, messages: [{ role: 'user', content: 'Create a presentation about renewable energy' }], tools: [{ type: 'code_execution_20250825', name: 'code_execution' }] }); ``` ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 4096, "container": { "skills": [ { "type": "anthropic", "skill_id": "pptx", "version": "latest" } ] }, "messages": [{ "role": "user", "content": "Create a presentation about renewable energy" }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }' ``` ### Downloading Generated Files When Skills create documents (Excel, PowerPoint, PDF, Word), they return `file_id` attributes in the response. You must use the Files API to download these files. **How it works:** 1. Skills create files during code execution 2. Response includes `file_id` for each created file 3. Use Files API to download the actual file content 4. Save locally or process as needed **Example: Creating and downloading an Excel file** ```python Python import anthropic client = anthropic.Anthropic() # Step 1: Use a Skill to create a file response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "skills": [ {"type": "anthropic", "skill_id": "xlsx", "version": "latest"} ] }, messages=[{ "role": "user", "content": "Create an Excel file with a simple budget spreadsheet" }], tools=[{"type": "code_execution_20250825", "name": "code_execution"}] ) # Step 2: Extract file IDs from the response def extract_file_ids(response): file_ids = [] for item in response.content: if item.type == 'bash_code_execution_tool_result': content_item = item.content if content_item.type == 'bash_code_execution_result': for file in content_item.content: if hasattr(file, 'file_id'): file_ids.append(file.file_id) return file_ids # Step 3: Download the file using Files API for file_id in extract_file_ids(response): file_metadata = client.beta.files.retrieve_metadata( file_id=file_id, betas=["files-api-2025-04-14"] ) file_content = client.beta.files.download( file_id=file_id, betas=["files-api-2025-04-14"] ) # Step 4: Save to disk file_content.write_to_file(file_metadata.filename) print(f"Downloaded: {file_metadata.filename}") ``` ```typescript TypeScript import Anthropic from '@anthropic-ai/sdk'; const client = new Anthropic(); // Step 1: Use a Skill to create a file const response = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { skills: [ {type: 'anthropic', skill_id: 'xlsx', version: 'latest'} ] }, messages: [{ role: 'user', content: 'Create an Excel file with a simple budget spreadsheet' }], tools: [{type: 'code_execution_20250825', name: 'code_execution'}] }); // Step 2: Extract file IDs from the response function extractFileIds(response: any): string[] { const fileIds: string[] = []; for (const item of response.content) { if (item.type === 'bash_code_execution_tool_result') { const contentItem = item.content; if (contentItem.type === 'bash_code_execution_result') { for (const file of contentItem.content) { if ('file_id' in file) { fileIds.push(file.file_id); } } } } } return fileIds; } // Step 3: Download the file using Files API const fs = require('fs'); for (const fileId of extractFileIds(response)) { const fileMetadata = await client.beta.files.retrieve_metadata(fileId, { betas: ['files-api-2025-04-14'] }); const fileContent = await client.beta.files.download(fileId, { betas: ['files-api-2025-04-14'] }); // Step 4: Save to disk fs.writeFileSync(fileMetadata.filename, Buffer.from(await fileContent.arrayBuffer())); console.log(`Downloaded: ${fileMetadata.filename}`); } ``` ```bash Shell # Step 1: Use a Skill to create a file RESPONSE=$(curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 4096, "container": { "skills": [ {"type": "anthropic", "skill_id": "xlsx", "version": "latest"} ] }, "messages": [{ "role": "user", "content": "Create an Excel file with a simple budget spreadsheet" }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }') # Step 2: Extract file_id from response (using jq) FILE_ID=$(echo "$RESPONSE" | jq -r '.content[] | select(.type=="bash_code_execution_tool_result") | .content | select(.type=="bash_code_execution_result") | .content[] | select(.file_id) | .file_id') # Step 3: Get filename from metadata FILENAME=$(curl "https://api.anthropic.com/v1/files/$FILE_ID" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" | jq -r '.filename') # Step 4: Download the file using Files API curl "https://api.anthropic.com/v1/files/$FILE_ID/content" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" \ --output "$FILENAME" echo "Downloaded: $FILENAME" ``` **Additional Files API operations:** ```python Python # Get file metadata file_info = client.beta.files.retrieve_metadata( file_id=file_id, betas=["files-api-2025-04-14"] ) print(f"Filename: {file_info.filename}, Size: {file_info.size_bytes} bytes") # List all files files = client.beta.files.list(betas=["files-api-2025-04-14"]) for file in files.data: print(f"{file.filename} - {file.created_at}") # Delete a file client.beta.files.delete( file_id=file_id, betas=["files-api-2025-04-14"] ) ``` ```typescript TypeScript // Get file metadata const fileInfo = await client.beta.files.retrieve_metadata(fileId, { betas: ['files-api-2025-04-14'] }); console.log(`Filename: ${fileInfo.filename}, Size: ${fileInfo.size_bytes} bytes`); // List all files const files = await client.beta.files.list({ betas: ['files-api-2025-04-14'] }); for (const file of files.data) { console.log(`${file.filename} - ${file.created_at}`); } // Delete a file await client.beta.files.delete(fileId, { betas: ['files-api-2025-04-14'] }); ``` ```bash Shell # Get file metadata curl "https://api.anthropic.com/v1/files/$FILE_ID" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" # List all files curl "https://api.anthropic.com/v1/files" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" # Delete a file curl -X DELETE "https://api.anthropic.com/v1/files/$FILE_ID" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: files-api-2025-04-14" ``` For complete details on the Files API, see the [Files API documentation](/docs/en/api/files-content). ### Multi-Turn Conversations Reuse the same container across multiple messages by specifying the container ID: ```python Python # First request creates container response1 = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "skills": [ {"type": "anthropic", "skill_id": "xlsx", "version": "latest"} ] }, messages=[{"role": "user", "content": "Analyze this sales data"}], tools=[{"type": "code_execution_20250825", "name": "code_execution"}] ) # Continue conversation with same container messages = [ {"role": "user", "content": "Analyze this sales data"}, {"role": "assistant", "content": response1.content}, {"role": "user", "content": "What was the total revenue?"} ] response2 = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "id": response1.container.id, # Reuse container "skills": [ {"type": "anthropic", "skill_id": "xlsx", "version": "latest"} ] }, messages=messages, tools=[{"type": "code_execution_20250825", "name": "code_execution"}] ) ``` ```typescript TypeScript // First request creates container const response1 = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { skills: [ {type: 'anthropic', skill_id: 'xlsx', version: 'latest'} ] }, messages: [{role: 'user', content: 'Analyze this sales data'}], tools: [{type: 'code_execution_20250825', name: 'code_execution'}] }); // Continue conversation with same container const messages = [ {role: 'user', content: 'Analyze this sales data'}, {role: 'assistant', content: response1.content}, {role: 'user', content: 'What was the total revenue?'} ]; const response2 = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { id: response1.container.id, // Reuse container skills: [ {type: 'anthropic', skill_id: 'xlsx', version: 'latest'} ] }, messages, tools: [{type: 'code_execution_20250825', name: 'code_execution'}] }); ``` ### Long-Running Operations Skills may perform operations that require multiple turns. Handle `pause_turn` stop reasons: ```python Python messages = [{"role": "user", "content": "Process this large dataset"}] max_retries = 10 response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "skills": [ {"type": "custom", "skill_id": "skill_01AbCdEfGhIjKlMnOpQrStUv", "version": "latest"} ] }, messages=messages, tools=[{"type": "code_execution_20250825", "name": "code_execution"}] ) # Handle pause_turn for long operations for i in range(max_retries): if response.stop_reason != "pause_turn": break messages.append({"role": "assistant", "content": response.content}) response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "id": response.container.id, "skills": [ {"type": "custom", "skill_id": "skill_01AbCdEfGhIjKlMnOpQrStUv", "version": "latest"} ] }, messages=messages, tools=[{"type": "code_execution_20250825", "name": "code_execution"}] ) ``` ```typescript TypeScript let messages = [{role: 'user' as const, content: 'Process this large dataset'}]; const maxRetries = 10; let response = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { skills: [ {type: 'custom', skill_id: 'skill_01AbCdEfGhIjKlMnOpQrStUv', version: 'latest'} ] }, messages, tools: [{type: 'code_execution_20250825', name: 'code_execution'}] }); // Handle pause_turn for long operations for (let i = 0; i < maxRetries; i++) { if (response.stop_reason !== 'pause_turn') { break; } messages.push({role: 'assistant', content: response.content}); response = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { id: response.container.id, skills: [ {type: 'custom', skill_id: 'skill_01AbCdEfGhIjKlMnOpQrStUv', version: 'latest'} ] }, messages, tools: [{type: 'code_execution_20250825', name: 'code_execution'}] }); } ``` ```bash Shell # Initial request RESPONSE=$(curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 4096, "container": { "skills": [ { "type": "custom", "skill_id": "skill_01AbCdEfGhIjKlMnOpQrStUv", "version": "latest" } ] }, "messages": [{ "role": "user", "content": "Process this large dataset" }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }') # Check stop_reason and handle pause_turn in a loop STOP_REASON=$(echo "$RESPONSE" | jq -r '.stop_reason') CONTAINER_ID=$(echo "$RESPONSE" | jq -r '.container.id') while [ "$STOP_REASON" = "pause_turn" ]; do # Continue with same container RESPONSE=$(curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02" \ -H "content-type: application/json" \ -d "{ \"model\": \"claude-sonnet-4-5-20250929\", \"max_tokens\": 4096, \"container\": { \"id\": \"$CONTAINER_ID\", \"skills\": [{ \"type\": \"custom\", \"skill_id\": \"skill_01AbCdEfGhIjKlMnOpQrStUv\", \"version\": \"latest\" }] }, \"messages\": [/* include conversation history */], \"tools\": [{ \"type\": \"code_execution_20250825\", \"name\": \"code_execution\" }] }") STOP_REASON=$(echo "$RESPONSE" | jq -r '.stop_reason') done ``` The response may include a `pause_turn` stop reason, which indicates that the API paused a long-running Skill operation. You can provide the response back as-is in a subsequent request to let Claude continue its turn, or modify the content if you wish to interrupt the conversation and provide additional guidance. ### Using Multiple Skills Combine multiple Skills in a single request to handle complex workflows: ```python Python response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "skills": [ { "type": "anthropic", "skill_id": "xlsx", "version": "latest" }, { "type": "anthropic", "skill_id": "pptx", "version": "latest" }, { "type": "custom", "skill_id": "skill_01AbCdEfGhIjKlMnOpQrStUv", "version": "latest" } ] }, messages=[{ "role": "user", "content": "Analyze sales data and create a presentation" }], tools=[{ "type": "code_execution_20250825", "name": "code_execution" }] ) ``` ```typescript TypeScript const response = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { skills: [ { type: 'anthropic', skill_id: 'xlsx', version: 'latest' }, { type: 'anthropic', skill_id: 'pptx', version: 'latest' }, { type: 'custom', skill_id: 'skill_01AbCdEfGhIjKlMnOpQrStUv', version: 'latest' } ] }, messages: [{ role: 'user', content: 'Analyze sales data and create a presentation' }], tools: [{ type: 'code_execution_20250825', name: 'code_execution' }] }); ``` ```bash Shell curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 4096, "container": { "skills": [ { "type": "anthropic", "skill_id": "xlsx", "version": "latest" }, { "type": "anthropic", "skill_id": "pptx", "version": "latest" }, { "type": "custom", "skill_id": "skill_01AbCdEfGhIjKlMnOpQrStUv", "version": "latest" } ] }, "messages": [{ "role": "user", "content": "Analyze sales data and create a presentation" }], "tools": [{ "type": "code_execution_20250825", "name": "code_execution" }] }' ``` --- ## Managing Custom Skills ### Creating a Skill Upload your custom Skill to make it available in your workspace. You can upload using either a directory path or individual file objects. ```python Python import anthropic client = anthropic.Anthropic() # Option 1: Using files_from_dir helper (Python only, recommended) from anthropic.lib import files_from_dir skill = client.beta.skills.create( display_title="Financial Analysis", files=files_from_dir("/path/to/financial_analysis_skill"), betas=["skills-2025-10-02"] ) # Option 2: Using a zip file skill = client.beta.skills.create( display_title="Financial Analysis", files=[("skill.zip", open("financial_analysis_skill.zip", "rb"))], betas=["skills-2025-10-02"] ) # Option 3: Using file tuples (filename, file_content, mime_type) skill = client.beta.skills.create( display_title="Financial Analysis", files=[ ("financial_skill/SKILL.md", open("financial_skill/SKILL.md", "rb"), "text/markdown"), ("financial_skill/analyze.py", open("financial_skill/analyze.py", "rb"), "text/x-python"), ], betas=["skills-2025-10-02"] ) print(f"Created skill: {skill.id}") print(f"Latest version: {skill.latest_version}") ``` ```typescript TypeScript import Anthropic, { toFile } from '@anthropic-ai/sdk'; import fs from 'fs'; const client = new Anthropic(); // Option 1: Using a zip file const skill = await client.beta.skills.create({ displayTitle: 'Financial Analysis', files: [ await toFile( fs.createReadStream('financial_analysis_skill.zip'), 'skill.zip' ) ], betas: ['skills-2025-10-02'] }); // Option 2: Using individual file objects const skill = await client.beta.skills.create({ displayTitle: 'Financial Analysis', files: [ await toFile( fs.createReadStream('financial_skill/SKILL.md'), 'financial_skill/SKILL.md', { type: 'text/markdown' } ), await toFile( fs.createReadStream('financial_skill/analyze.py'), 'financial_skill/analyze.py', { type: 'text/x-python' } ), ], betas: ['skills-2025-10-02'] }); console.log(`Created skill: ${skill.id}`); console.log(`Latest version: ${skill.latest_version}`); ``` ```bash Shell curl -X POST "https://api.anthropic.com/v1/skills" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: skills-2025-10-02" \ -F "display_title=Financial Analysis" \ -F "files[]=@financial_skill/SKILL.md;filename=financial_skill/SKILL.md" \ -F "files[]=@financial_skill/analyze.py;filename=financial_skill/analyze.py" ``` **Requirements:** - Must include a SKILL.md file at the top level - All files must specify a common root directory in their paths - Total upload size must be under 8MB - YAML frontmatter requirements: - `name`: Maximum 64 characters, lowercase letters/numbers/hyphens only, no XML tags, no reserved words ("anthropic", "claude") - `description`: Maximum 1024 characters, non-empty, no XML tags For complete request/response schemas, see the [Create Skill API reference](/docs/en/api/skills/create-skill). ### Listing Skills Retrieve all Skills available to your workspace, including both Anthropic pre-built Skills and your custom Skills. Use the `source` parameter to filter by skill type: ```python Python # List all Skills skills = client.beta.skills.list( betas=["skills-2025-10-02"] ) for skill in skills.data: print(f"{skill.id}: {skill.display_title} (source: {skill.source})") # List only custom Skills custom_skills = client.beta.skills.list( source="custom", betas=["skills-2025-10-02"] ) ``` ```typescript TypeScript // List all Skills const skills = await client.beta.skills.list({ betas: ['skills-2025-10-02'] }); for (const skill of skills.data) { console.log(`${skill.id}: ${skill.display_title} (source: ${skill.source})`); } // List only custom Skills const customSkills = await client.beta.skills.list({ source: 'custom', betas: ['skills-2025-10-02'] }); ``` ```bash Shell # List all Skills curl "https://api.anthropic.com/v1/skills" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: skills-2025-10-02" # List only custom Skills curl "https://api.anthropic.com/v1/skills?source=custom" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: skills-2025-10-02" ``` See the [List Skills API reference](/docs/en/api/skills/list-skills) for pagination and filtering options. ### Retrieving a Skill Get details about a specific Skill: ```python Python skill = client.beta.skills.retrieve( skill_id="skill_01AbCdEfGhIjKlMnOpQrStUv", betas=["skills-2025-10-02"] ) print(f"Skill: {skill.display_title}") print(f"Latest version: {skill.latest_version}") print(f"Created: {skill.created_at}") ``` ```typescript TypeScript const skill = await client.beta.skills.retrieve( 'skill_01AbCdEfGhIjKlMnOpQrStUv', { betas: ['skills-2025-10-02'] } ); console.log(`Skill: ${skill.display_title}`); console.log(`Latest version: ${skill.latest_version}`); console.log(`Created: ${skill.created_at}`); ``` ```bash Shell curl "https://api.anthropic.com/v1/skills/skill_01AbCdEfGhIjKlMnOpQrStUv" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: skills-2025-10-02" ``` ### Deleting a Skill To delete a Skill, you must first delete all its versions: ```python Python # Step 1: Delete all versions versions = client.beta.skills.versions.list( skill_id="skill_01AbCdEfGhIjKlMnOpQrStUv", betas=["skills-2025-10-02"] ) for version in versions.data: client.beta.skills.versions.delete( skill_id="skill_01AbCdEfGhIjKlMnOpQrStUv", version=version.version, betas=["skills-2025-10-02"] ) # Step 2: Delete the Skill client.beta.skills.delete( skill_id="skill_01AbCdEfGhIjKlMnOpQrStUv", betas=["skills-2025-10-02"] ) ``` ```typescript TypeScript // Step 1: Delete all versions const versions = await client.beta.skills.versions.list( 'skill_01AbCdEfGhIjKlMnOpQrStUv', { betas: ['skills-2025-10-02'] } ); for (const version of versions.data) { await client.beta.skills.versions.delete( 'skill_01AbCdEfGhIjKlMnOpQrStUv', version.version, { betas: ['skills-2025-10-02'] } ); } // Step 2: Delete the Skill await client.beta.skills.delete( 'skill_01AbCdEfGhIjKlMnOpQrStUv', { betas: ['skills-2025-10-02'] } ); ``` ```bash Shell # Delete all versions first, then delete the Skill curl -X DELETE "https://api.anthropic.com/v1/skills/skill_01AbCdEfGhIjKlMnOpQrStUv" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: skills-2025-10-02" ``` Attempting to delete a Skill with existing versions will return a 400 error. ### Versioning Skills support versioning to manage updates safely: **Anthropic-Managed Skills**: - Versions use date format: `20251013` - New versions released as updates are made - Specify exact versions for stability **Custom Skills**: - Auto-generated epoch timestamps: `1759178010641129` - Use `"latest"` to always get the most recent version - Create new versions when updating Skill files ```python Python # Create a new version from anthropic.lib import files_from_dir new_version = client.beta.skills.versions.create( skill_id="skill_01AbCdEfGhIjKlMnOpQrStUv", files=files_from_dir("/path/to/updated_skill"), betas=["skills-2025-10-02"] ) # Use specific version response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "skills": [{ "type": "custom", "skill_id": "skill_01AbCdEfGhIjKlMnOpQrStUv", "version": new_version.version }] }, messages=[{"role": "user", "content": "Use updated Skill"}], tools=[{"type": "code_execution_20250825", "name": "code_execution"}] ) # Use latest version response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "skills": [{ "type": "custom", "skill_id": "skill_01AbCdEfGhIjKlMnOpQrStUv", "version": "latest" }] }, messages=[{"role": "user", "content": "Use latest Skill version"}], tools=[{"type": "code_execution_20250825", "name": "code_execution"}] ) ``` ```typescript TypeScript // Create a new version using a zip file const fs = require('fs'); const newVersion = await client.beta.skills.versions.create( 'skill_01AbCdEfGhIjKlMnOpQrStUv', { files: [ fs.createReadStream('updated_skill.zip') ], betas: ['skills-2025-10-02'] } ); // Use specific version const response = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { skills: [{ type: 'custom', skill_id: 'skill_01AbCdEfGhIjKlMnOpQrStUv', version: newVersion.version }] }, messages: [{role: 'user', content: 'Use updated Skill'}], tools: [{type: 'code_execution_20250825', name: 'code_execution'}] }); // Use latest version const response = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { skills: [{ type: 'custom', skill_id: 'skill_01AbCdEfGhIjKlMnOpQrStUv', version: 'latest' }] }, messages: [{role: 'user', content: 'Use latest Skill version'}], tools: [{type: 'code_execution_20250825', name: 'code_execution'}] }); ``` ```bash Shell # Create a new version NEW_VERSION=$(curl -X POST "https://api.anthropic.com/v1/skills/skill_01AbCdEfGhIjKlMnOpQrStUv/versions" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: skills-2025-10-02" \ -F "files[]=@updated_skill/SKILL.md;filename=updated_skill/SKILL.md") VERSION_NUMBER=$(echo "$NEW_VERSION" | jq -r '.version') # Use specific version curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02" \ -H "content-type: application/json" \ -d "{ \"model\": \"claude-sonnet-4-5-20250929\", \"max_tokens\": 4096, \"container\": { \"skills\": [{ \"type\": \"custom\", \"skill_id\": \"skill_01AbCdEfGhIjKlMnOpQrStUv\", \"version\": \"$VERSION_NUMBER\" }] }, \"messages\": [{\"role\": \"user\", \"content\": \"Use updated Skill\"}], \"tools\": [{\"type\": \"code_execution_20250825\", \"name\": \"code_execution\"}] }" # Use latest version curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 4096, "container": { "skills": [{ "type": "custom", "skill_id": "skill_01AbCdEfGhIjKlMnOpQrStUv", "version": "latest" }] }, "messages": [{"role": "user", "content": "Use latest Skill version"}], "tools": [{"type": "code_execution_20250825", "name": "code_execution"}] }' ``` See the [Create Skill Version API reference](/docs/en/api/skills/create-skill-version) for complete details. --- ## How Skills Are Loaded When you specify Skills in a container: 1. **Metadata Discovery**: Claude sees metadata for each Skill (name, description) in the system prompt 2. **File Loading**: Skill files are copied into the container at `/skills/{directory}/` 3. **Automatic Use**: Claude automatically loads and uses Skills when relevant to your request 4. **Composition**: Multiple Skills compose together for complex workflows The progressive disclosure architecture ensures efficient context usage—Claude only loads full Skill instructions when needed. --- ## Use Cases ### Organizational Skills **Brand & Communications** - Apply company-specific formatting (colors, fonts, layouts) to documents - Generate communications following organizational templates - Ensure consistent brand guidelines across all outputs **Project Management** - Structure notes with company-specific formats (OKRs, decision logs) - Generate tasks following team conventions - Create standardized meeting recaps and status updates **Business Operations** - Create company-standard reports, proposals, and analyses - Execute company-specific analytical procedures - Generate financial models following organizational templates ### Personal Skills **Content Creation** - Custom document templates - Specialized formatting and styling - Domain-specific content generation **Data Analysis** - Custom data processing pipelines - Specialized visualization templates - Industry-specific analytical methods **Development & Automation** - Code generation templates - Testing frameworks - Deployment workflows ### Example: Financial Modeling Combine Excel and custom DCF analysis Skills: ```python Python # Create custom DCF analysis Skill from anthropic.lib import files_from_dir dcf_skill = client.beta.skills.create( display_title="DCF Analysis", files=files_from_dir("/path/to/dcf_skill"), betas=["skills-2025-10-02"] ) # Use with Excel to create financial model response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "skills": [ {"type": "anthropic", "skill_id": "xlsx", "version": "latest"}, {"type": "custom", "skill_id": dcf_skill.id, "version": "latest"} ] }, messages=[{ "role": "user", "content": "Build a DCF valuation model for a SaaS company with the attached financials" }], tools=[{"type": "code_execution_20250825", "name": "code_execution"}] ) ``` ```typescript TypeScript // Create custom DCF analysis Skill import { toFile } from '@anthropic-ai/sdk'; import fs from 'fs'; const dcfSkill = await client.beta.skills.create({ displayTitle: 'DCF Analysis', files: [ await toFile(fs.createReadStream('dcf_skill.zip'), 'skill.zip') ], betas: ['skills-2025-10-02'] }); // Use with Excel to create financial model const response = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { skills: [ {type: 'anthropic', skill_id: 'xlsx', version: 'latest'}, {type: 'custom', skill_id: dcfSkill.id, version: 'latest'} ] }, messages: [{ role: 'user', content: 'Build a DCF valuation model for a SaaS company with the attached financials' }], tools: [{type: 'code_execution_20250825', name: 'code_execution'}] }); ``` ```bash Shell # Create custom DCF analysis Skill DCF_SKILL=$(curl -X POST "https://api.anthropic.com/v1/skills" \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: skills-2025-10-02" \ -F "display_title=DCF Analysis" \ -F "files[]=@dcf_skill/SKILL.md;filename=dcf_skill/SKILL.md") DCF_SKILL_ID=$(echo "$DCF_SKILL" | jq -r '.id') # Use with Excel to create financial model curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02" \ -H "content-type: application/json" \ -d "{ \"model\": \"claude-sonnet-4-5-20250929\", \"max_tokens\": 4096, \"container\": { \"skills\": [ { \"type\": \"anthropic\", \"skill_id\": \"xlsx\", \"version\": \"latest\" }, { \"type\": \"custom\", \"skill_id\": \"$DCF_SKILL_ID\", \"version\": \"latest\" } ] }, \"messages\": [{ \"role\": \"user\", \"content\": \"Build a DCF valuation model for a SaaS company with the attached financials\" }], \"tools\": [{ \"type\": \"code_execution_20250825\", \"name\": \"code_execution\" }] }" ``` --- ## Limits and Constraints ### Request Limits - **Maximum Skills per request**: 8 - **Maximum Skill upload size**: 8MB (all files combined) - **YAML frontmatter requirements**: - `name`: Maximum 64 characters, lowercase letters/numbers/hyphens only, no XML tags, no reserved words - `description`: Maximum 1024 characters, non-empty, no XML tags ### Environment Constraints Skills run in the code execution container with these limitations: - **No network access** - Cannot make external API calls - **No runtime package installation** - Only pre-installed packages available - **Isolated environment** - Each request gets a fresh container See the [code execution tool documentation](/docs/en/agents-and-tools/tool-use/code-execution-tool) for available packages. --- ## Best Practices ### When to Use Multiple Skills Combine Skills when tasks involve multiple document types or domains: **Good use cases:** - Data analysis (Excel) + presentation creation (PowerPoint) - Report generation (Word) + export to PDF - Custom domain logic + document generation **Avoid:** - Including unused Skills (impacts performance) ### Version Management Strategy **For production:** ```python # Pin to specific versions for stability container={ "skills": [{ "type": "custom", "skill_id": "skill_01AbCdEfGhIjKlMnOpQrStUv", "version": "1759178010641129" # Specific version }] } ``` **For development:** ```python # Use latest for active development container={ "skills": [{ "type": "custom", "skill_id": "skill_01AbCdEfGhIjKlMnOpQrStUv", "version": "latest" # Always get newest }] } ``` ### Prompt Caching Considerations When using prompt caching, note that changing the Skills list in your container will break the cache: ```python Python # First request creates cache response1 = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02", "prompt-caching-2024-07-31"], container={ "skills": [ {"type": "anthropic", "skill_id": "xlsx", "version": "latest"} ] }, messages=[{"role": "user", "content": "Analyze sales data"}], tools=[{"type": "code_execution_20250825", "name": "code_execution"}] ) # Adding/removing Skills breaks cache response2 = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02", "prompt-caching-2024-07-31"], container={ "skills": [ {"type": "anthropic", "skill_id": "xlsx", "version": "latest"}, {"type": "anthropic", "skill_id": "pptx", "version": "latest"} # Cache miss ] }, messages=[{"role": "user", "content": "Create a presentation"}], tools=[{"type": "code_execution_20250825", "name": "code_execution"}] ) ``` ```typescript TypeScript // First request creates cache const response1 = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02', 'prompt-caching-2024-07-31'], container: { skills: [ {type: 'anthropic', skill_id: 'xlsx', version: 'latest'} ] }, messages: [{role: 'user', content: 'Analyze sales data'}], tools: [{type: 'code_execution_20250825', name: 'code_execution'}] }); // Adding/removing Skills breaks cache const response2 = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02', 'prompt-caching-2024-07-31'], container: { skills: [ {type: 'anthropic', skill_id: 'xlsx', version: 'latest'}, {type: 'anthropic', skill_id: 'pptx', version: 'latest'} // Cache miss ] }, messages: [{role: 'user', content: 'Create a presentation'}], tools: [{type: 'code_execution_20250825', name: 'code_execution'}] }); ``` ```bash Shell # First request creates cache curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02,prompt-caching-2024-07-31" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 4096, "container": { "skills": [ {"type": "anthropic", "skill_id": "xlsx", "version": "latest"} ] }, "messages": [{"role": "user", "content": "Analyze sales data"}], "tools": [{"type": "code_execution_20250825", "name": "code_execution"}] }' # Adding/removing Skills breaks cache curl https://api.anthropic.com/v1/messages \ -H "x-api-key: $ANTHROPIC_API_KEY" \ -H "anthropic-version: 2023-06-01" \ -H "anthropic-beta: code-execution-2025-08-25,skills-2025-10-02,prompt-caching-2024-07-31" \ -H "content-type: application/json" \ -d '{ "model": "claude-sonnet-4-5-20250929", "max_tokens": 4096, "container": { "skills": [ {"type": "anthropic", "skill_id": "xlsx", "version": "latest"}, {"type": "anthropic", "skill_id": "pptx", "version": "latest"} ] }, "messages": [{"role": "user", "content": "Create a presentation"}], "tools": [{"type": "code_execution_20250825", "name": "code_execution"}] }' ``` For best caching performance, keep your Skills list consistent across requests. ### Error Handling Handle Skill-related errors gracefully: ```python Python try: response = client.beta.messages.create( model="claude-sonnet-4-5-20250929", max_tokens=4096, betas=["code-execution-2025-08-25", "skills-2025-10-02"], container={ "skills": [ {"type": "custom", "skill_id": "skill_01AbCdEfGhIjKlMnOpQrStUv", "version": "latest"} ] }, messages=[{"role": "user", "content": "Process data"}], tools=[{"type": "code_execution_20250825", "name": "code_execution"}] ) except anthropic.BadRequestError as e: if "skill" in str(e): print(f"Skill error: {e}") # Handle skill-specific errors else: raise ``` ```typescript TypeScript try { const response = await client.beta.messages.create({ model: 'claude-sonnet-4-5-20250929', max_tokens: 4096, betas: ['code-execution-2025-08-25', 'skills-2025-10-02'], container: { skills: [ {type: 'custom', skill_id: 'skill_01AbCdEfGhIjKlMnOpQrStUv', version: 'latest'} ] }, messages: [{role: 'user', content: 'Process data'}], tools: [{type: 'code_execution_20250825', name: 'code_execution'}] }); } catch (error) { if (error instanceof Anthropic.BadRequestError && error.message.includes('skill')) { console.error(`Skill error: ${error.message}`); // Handle skill-specific errors } else { throw error; } } ``` --- ## Next Steps Complete API reference with all endpoints Best practices for writing effective Skills Learn about the code execution environment ### Agent SDK --- # Agent SDK overview URL: https://platform.claude.com/docs/en/agent-sdk/overview # Agent SDK overview Build production AI agents with Claude Code as a library --- The Claude Code SDK has been renamed to the Claude Agent SDK. If you're migrating from the old SDK, see the [Migration Guide](/docs/en/agent-sdk/migration-guide). Build AI agents that autonomously read files, run commands, search the web, edit code, and more. The Agent SDK gives you the same tools, agent loop, and context management that power Claude Code, programmable in Python and TypeScript. ```python Python import asyncio from claude_agent_sdk import query, ClaudeAgentOptions async def main(): async for message in query( prompt="Find and fix the bug in auth.py", options=ClaudeAgentOptions(allowed_tools=["Read", "Edit", "Bash"]) ): print(message) # Claude reads the file, finds the bug, edits it asyncio.run(main()) ``` ```typescript TypeScript import { query } from "@anthropic-ai/claude-agent-sdk"; for await (const message of query({ prompt: "Find and fix the bug in auth.py", options: { allowedTools: ["Read", "Edit", "Bash"] } })) { console.log(message); // Claude reads the file, finds the bug, edits it } ``` The Agent SDK includes built-in tools for reading files, running commands, and editing code, so your agent can start working immediately without you implementing tool execution. Dive into the quickstart or explore real agents built with the SDK: Build a bug-fixing agent in minutes Email assistant, research agent, and more ## Capabilities Everything that makes Claude Code powerful is available in the SDK: Your agent can read files, run commands, and search codebases out of the box. Key tools include: | Tool | What it does | |------|--------------| | **Read** | Read any file in the working directory | | **Write** | Create new files | | **Edit** | Make precise edits to existing files | | **Bash** | Run terminal commands, scripts, git operations | | **Glob** | Find files by pattern (`**/*.ts`, `src/**/*.py`) | | **Grep** | Search file contents with regex | | **WebSearch** | Search the web for current information | | **WebFetch** | Fetch and parse web page content | | **[AskUserQuestion](/docs/en/agent-sdk/user-input#handle-clarifying-questions)** | Ask the user clarifying questions with multiple choice options | This example creates an agent that searches your codebase for TODO comments: ```python Python import asyncio from claude_agent_sdk import query, ClaudeAgentOptions async def main(): async for message in query( prompt="Find all TODO comments and create a summary", options=ClaudeAgentOptions(allowed_tools=["Read", "Glob", "Grep"]) ): if hasattr(message, "result"): print(message.result) asyncio.run(main()) ``` ```typescript TypeScript import { query } from "@anthropic-ai/claude-agent-sdk"; for await (const message of query({ prompt: "Find all TODO comments and create a summary", options: { allowedTools: ["Read", "Glob", "Grep"] } })) { if ("result" in message) console.log(message.result); } ``` Run custom code at key points in the agent lifecycle. SDK hooks use callback functions to validate, log, block, or transform agent behavior. **Available hooks:** `PreToolUse`, `PostToolUse`, `Stop`, `SessionStart`, `SessionEnd`, `UserPromptSubmit`, and more. This example logs all file changes to an audit file: ```python Python import asyncio from datetime import datetime from claude_agent_sdk import query, ClaudeAgentOptions, HookMatcher async def log_file_change(input_data, tool_use_id, context): file_path = input_data.get('tool_input', {}).get('file_path', 'unknown') with open('./audit.log', 'a') as f: f.write(f"{datetime.now()}: modified {file_path}\n") return {} async def main(): async for message in query( prompt="Refactor utils.py to improve readability", options=ClaudeAgentOptions( permission_mode="acceptEdits", hooks={ "PostToolUse": [HookMatcher(matcher="Edit|Write", hooks=[log_file_change])] } ) ): if hasattr(message, "result"): print(message.result) asyncio.run(main()) ``` ```typescript TypeScript import { query, HookCallback } from "@anthropic-ai/claude-agent-sdk"; import { appendFileSync } from "fs"; const logFileChange: HookCallback = async (input) => { const filePath = (input as any).tool_input?.file_path ?? "unknown"; appendFileSync("./audit.log", `${new Date().toISOString()}: modified ${filePath}\n`); return {}; }; for await (const message of query({ prompt: "Refactor utils.py to improve readability", options: { permissionMode: "acceptEdits", hooks: { PostToolUse: [{ matcher: "Edit|Write", hooks: [logFileChange] }] } } })) { if ("result" in message) console.log(message.result); } ``` [Learn more about hooks →](/docs/en/agent-sdk/hooks) Spawn specialized agents to handle focused subtasks. Your main agent delegates work, and subagents report back with results. Define custom agents with specialized instructions. Include `Task` in `allowedTools` since subagents are invoked via the Task tool: ```python Python import asyncio from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition async def main(): async for message in query( prompt="Use the code-reviewer agent to review this codebase", options=ClaudeAgentOptions( allowed_tools=["Read", "Glob", "Grep", "Task"], agents={ "code-reviewer": AgentDefinition( description="Expert code reviewer for quality and security reviews.", prompt="Analyze code quality and suggest improvements.", tools=["Read", "Glob", "Grep"] ) } ) ): if hasattr(message, "result"): print(message.result) asyncio.run(main()) ``` ```typescript TypeScript import { query } from "@anthropic-ai/claude-agent-sdk"; for await (const message of query({ prompt: "Use the code-reviewer agent to review this codebase", options: { allowedTools: ["Read", "Glob", "Grep", "Task"], agents: { "code-reviewer": { description: "Expert code reviewer for quality and security reviews.", prompt: "Analyze code quality and suggest improvements.", tools: ["Read", "Glob", "Grep"] } } } })) { if ("result" in message) console.log(message.result); } ``` Messages from within a subagent's context include a `parent_tool_use_id` field, letting you track which messages belong to which subagent execution. [Learn more about subagents →](/docs/en/agent-sdk/subagents) Connect to external systems via the Model Context Protocol: databases, browsers, APIs, and [hundreds more](https://github.com/modelcontextprotocol/servers). This example connects the [Playwright MCP server](https://github.com/microsoft/playwright-mcp) to give your agent browser automation capabilities: ```python Python import asyncio from claude_agent_sdk import query, ClaudeAgentOptions async def main(): async for message in query( prompt="Open example.com and describe what you see", options=ClaudeAgentOptions( mcp_servers={ "playwright": {"command": "npx", "args": ["@playwright/mcp@latest"]} } ) ): if hasattr(message, "result"): print(message.result) asyncio.run(main()) ``` ```typescript TypeScript import { query } from "@anthropic-ai/claude-agent-sdk"; for await (const message of query({ prompt: "Open example.com and describe what you see", options: { mcpServers: { playwright: { command: "npx", args: ["@playwright/mcp@latest"] } } } })) { if ("result" in message) console.log(message.result); } ``` [Learn more about MCP →](/docs/en/agent-sdk/mcp) Control exactly which tools your agent can use. Allow safe operations, block dangerous ones, or require approval for sensitive actions. For interactive approval prompts and the `AskUserQuestion` tool, see [Handle approvals and user input](/docs/en/agent-sdk/user-input). This example creates a read-only agent that can analyze but not modify code: ```python Python import asyncio from claude_agent_sdk import query, ClaudeAgentOptions async def main(): async for message in query( prompt="Review this code for best practices", options=ClaudeAgentOptions( allowed_tools=["Read", "Glob", "Grep"], permission_mode="bypassPermissions" ) ): if hasattr(message, "result"): print(message.result) asyncio.run(main()) ``` ```typescript TypeScript import { query } from "@anthropic-ai/claude-agent-sdk"; for await (const message of query({ prompt: "Review this code for best practices", options: { allowedTools: ["Read", "Glob", "Grep"], permissionMode: "bypassPermissions" } })) { if ("result" in message) console.log(message.result); } ``` [Learn more about permissions →](/docs/en/agent-sdk/permissions) Maintain context across multiple exchanges. Claude remembers files read, analysis done, and conversation history. Resume sessions later, or fork them to explore different approaches. This example captures the session ID from the first query, then resumes to continue with full context: ```python Python import asyncio from claude_agent_sdk import query, ClaudeAgentOptions async def main(): session_id = None # First query: capture the session ID async for message in query( prompt="Read the authentication module", options=ClaudeAgentOptions(allowed_tools=["Read", "Glob"]) ): if hasattr(message, 'subtype') and message.subtype == 'init': session_id = message.session_id # Resume with full context from the first query async for message in query( prompt="Now find all places that call it", # "it" = auth module options=ClaudeAgentOptions(resume=session_id) ): if hasattr(message, "result"): print(message.result) asyncio.run(main()) ``` ```typescript TypeScript import { query } from "@anthropic-ai/claude-agent-sdk"; let sessionId: string | undefined; // First query: capture the session ID for await (const message of query({ prompt: "Read the authentication module", options: { allowedTools: ["Read", "Glob"] } })) { if (message.type === "system" && message.subtype === "init") { sessionId = message.session_id; } } // Resume with full context from the first query for await (const message of query({ prompt: "Now find all places that call it", // "it" = auth module options: { resume: sessionId } })) { if ("result" in message) console.log(message.result); } ``` [Learn more about sessions →](/docs/en/agent-sdk/sessions) ### Claude Code features The SDK also supports Claude Code's filesystem-based configuration. To use these features, set `setting_sources=["project"]` (Python) or `settingSources: ['project']` (TypeScript) in your options. | Feature | Description | Location | |---------|-------------|----------| | [Skills](/docs/en/agent-sdk/skills) | Specialized capabilities defined in Markdown | `.claude/skills/SKILL.md` | | [Slash commands](/docs/en/agent-sdk/slash-commands) | Custom commands for common tasks | `.claude/commands/*.md` | | [Memory](/docs/en/agent-sdk/modifying-system-prompts) | Project context and instructions | `CLAUDE.md` or `.claude/CLAUDE.md` | | [Plugins](/docs/en/agent-sdk/plugins) | Extend with custom commands, agents, and MCP servers | Programmatic via `plugins` option | ## Get started The SDK uses Claude Code as its runtime: ```bash curl -fsSL https://claude.ai/install.sh | bash ``` ```bash brew install --cask claude-code ``` ```powershell winget install Anthropic.ClaudeCode ``` See [Claude Code setup](https://code.claude.com/docs/en/setup) for Windows and other options. ```bash npm install @anthropic-ai/claude-agent-sdk ``` ```bash pip install claude-agent-sdk ``` ```bash export ANTHROPIC_API_KEY=your-api-key ``` Get your key from the [Console](https://platform.claude.com/). The SDK also supports authentication via third-party API providers: - **Amazon Bedrock**: set `CLAUDE_CODE_USE_BEDROCK=1` environment variable and configure AWS credentials - **Google Vertex AI**: set `CLAUDE_CODE_USE_VERTEX=1` environment variable and configure Google Cloud credentials - **Microsoft Foundry**: set `CLAUDE_CODE_USE_FOUNDRY=1` environment variable and configure Azure credentials Unless previously approved, we do not allow third party developers to offer Claude.ai login or rate limits for their products, including agents built on the Claude Agent SDK. Please use the API key authentication methods described in this document instead. This example creates an agent that lists files in your current directory using built-in tools. ```python Python import asyncio from claude_agent_sdk import query, ClaudeAgentOptions async def main(): async for message in query( prompt="What files are in this directory?", options=ClaudeAgentOptions(allowed_tools=["Bash", "Glob"]) ): if hasattr(message, "result"): print(message.result) asyncio.run(main()) ``` ```typescript TypeScript import { query } from "@anthropic-ai/claude-agent-sdk"; for await (const message of query({ prompt: "What files are in this directory?", options: { allowedTools: ["Bash", "Glob"] }, })) { if ("result" in message) console.log(message.result); } ``` **Ready to build?** Follow the [Quickstart](/docs/en/agent-sdk/quickstart) to create an agent that finds and fixes bugs in minutes. ## Compare the Agent SDK to other Claude tools The Claude platform offers multiple ways to build with Claude. Here's how the Agent SDK fits in: The [Anthropic Client SDK](/docs/en/api/client-sdks) gives you direct API access: you send prompts and implement tool execution yourself. The **Agent SDK** gives you Claude with built-in tool execution. With the Client SDK, you implement a tool loop. With the Agent SDK, Claude handles it: ```python Python # Client SDK: You implement the tool loop response = client.messages.create(...) while response.stop_reason == "tool_use": result = your_tool_executor(response.tool_use) response = client.messages.create(tool_result=result, ...) # Agent SDK: Claude handles tools autonomously async for message in query(prompt="Fix the bug in auth.py"): print(message) ``` ```typescript TypeScript // Client SDK: You implement the tool loop let response = await client.messages.create({...}); while (response.stop_reason === "tool_use") { const result = yourToolExecutor(response.tool_use); response = await client.messages.create({ tool_result: result, ... }); } // Agent SDK: Claude handles tools autonomously for await (const message of query({ prompt: "Fix the bug in auth.py" })) { console.log(message); } ``` Same capabilities, different interface: | Use case | Best choice | |----------|-------------| | Interactive development | CLI | | CI/CD pipelines | SDK | | Custom applications | SDK | | One-off tasks | CLI | | Production automation | SDK | Many teams use both: CLI for daily development, SDK for production. Workflows translate directly between them. ## Changelog View the full changelog for SDK updates, bug fixes, and new features: - **TypeScript SDK**: [view CHANGELOG.md](https://github.com/anthropics/claude-agent-sdk-typescript/blob/main/CHANGELOG.md) - **Python SDK**: [view CHANGELOG.md](https://github.com/anthropics/claude-agent-sdk-python/blob/main/CHANGELOG.md) ## Reporting bugs If you encounter bugs or issues with the Agent SDK: - **TypeScript SDK**: [report issues on GitHub](https://github.com/anthropics/claude-agent-sdk-typescript/issues) - **Python SDK**: [report issues on GitHub](https://github.com/anthropics/claude-agent-sdk-python/issues) ## Branding guidelines For partners integrating the Claude Agent SDK, use of Claude branding is optional. When referencing Claude in your product: **Allowed:** - "Claude Agent" (preferred for dropdown menus) - "Claude" (when within a menu already labeled "Agents") - "{YourAgentName} Powered by Claude" (if you have an existing agent name) **Not permitted:** - "Claude Code" or "Claude Code Agent" - Claude Code-branded ASCII art or visual elements that mimic Claude Code Your product should maintain its own branding and not appear to be Claude Code or any Anthropic product. For questions about branding compliance, contact our [sales team](https://www.anthropic.com/contact-sales). ## License and terms Use of the Claude Agent SDK is governed by [Anthropic's Commercial Terms of Service](https://www.anthropic.com/legal/commercial-terms), including when you use it to power products and services that you make available to your own customers and end users, except to the extent a specific component or dependency is covered by a different license as indicated in that component's LICENSE file. ## Next steps Build an agent that finds and fixes bugs in minutes Email assistant, research agent, and more Full TypeScript API reference and examples Full Python API reference and examples --- # Quickstart URL: https://platform.claude.com/docs/en/agent-sdk/quickstart # Quickstart Get started with the Python or TypeScript Agent SDK to build AI agents that work autonomously --- Use the Agent SDK to build an AI agent that reads your code, finds bugs, and fixes them, all without manual intervention. **What you'll do:** 1. Set up a project with the Agent SDK 2. Create a file with some buggy code 3. Run an agent that finds and fixes the bugs automatically ## Prerequisites - **Node.js 18+** or **Python 3.10+** - An **Anthropic account** ([sign up here](https://platform.claude.com/)) ## Setup The Agent SDK uses Claude Code as its runtime. Install it for your platform: ```bash curl -fsSL https://claude.ai/install.sh | bash ``` ```bash brew install --cask claude-code ``` ```powershell winget install Anthropic.ClaudeCode ``` After installing Claude Code onto your machine, run `claude` in your terminal and follow the prompts to authenticate. The SDK will use this authentication automatically. For more information on Claude Code installation, see [Claude Code setup](https://code.claude.com/docs/en/setup). Create a new directory for this quickstart: ```bash mkdir my-agent && cd my-agent ``` For your own projects, you can run the SDK from any folder; it will have access to files in that directory and its subdirectories by default. Install the Agent SDK package for your language: ```bash npm install @anthropic-ai/claude-agent-sdk ``` [uv Python package manager](https://docs.astral.sh/uv/) is a fast Python package manager that handles virtual environments automatically: ```bash uv init && uv add claude-agent-sdk ``` Create a virtual environment first, then install: ```bash python3 -m venv .venv && source .venv/bin/activate pip3 install claude-agent-sdk ``` If you've already authenticated Claude Code (by running `claude` in your terminal), the SDK uses that authentication automatically. Otherwise, you need an API key, which you can get from the [Claude Console](https://platform.claude.com/). Create a `.env` file in your project directory and store the API key there: ```bash ANTHROPIC_API_KEY=your-api-key ``` **Using Amazon Bedrock, Google Vertex AI, or Microsoft Azure?** See the setup guides for [Bedrock](https://code.claude.com/docs/en/amazon-bedrock), [Vertex AI](https://code.claude.com/docs/en/google-vertex-ai), or [Azure AI Foundry](https://code.claude.com/docs/en/azure-ai-foundry). Unless previously approved, Anthropic does not allow third party developers to offer claude.ai login or rate limits for their products, including agents built on the Claude Agent SDK. Please use the API key authentication methods described in this document instead. ## Create a buggy file This quickstart walks you through building an agent that can find and fix bugs in code. First, you need a file with some intentional bugs for the agent to fix. Create `utils.py` in the `my-agent` directory and paste the following code: ```python def calculate_average(numbers): total = 0 for num in numbers: total += num return total / len(numbers) def get_user_name(user): return user["name"].upper() ``` This code has two bugs: 1. `calculate_average([])` crashes with division by zero 2. `get_user_name(None)` crashes with a TypeError ## Build an agent that finds and fixes bugs Create `agent.py` if you're using the Python SDK, or `agent.ts` for TypeScript: ```python Python import asyncio from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage, ResultMessage async def main(): # Agentic loop: streams messages as Claude works async for message in query( prompt="Review utils.py for bugs that would cause crashes. Fix any issues you find.", options=ClaudeAgentOptions( allowed_tools=["Read", "Edit", "Glob"], # Tools Claude can use permission_mode="acceptEdits" # Auto-approve file edits ) ): # Print human-readable output if isinstance(message, AssistantMessage): for block in message.content: if hasattr(block, "text"): print(block.text) # Claude's reasoning elif hasattr(block, "name"): print(f"Tool: {block.name}") # Tool being called elif isinstance(message, ResultMessage): print(f"Done: {message.subtype}") # Final result asyncio.run(main()) ``` ```typescript TypeScript import { query } from "@anthropic-ai/claude-agent-sdk"; // Agentic loop: streams messages as Claude works for await (const message of query({ prompt: "Review utils.py for bugs that would cause crashes. Fix any issues you find.", options: { allowedTools: ["Read", "Edit", "Glob"], // Tools Claude can use permissionMode: "acceptEdits" // Auto-approve file edits } })) { // Print human-readable output if (message.type === "assistant" && message.message?.content) { for (const block of message.message.content) { if ("text" in block) { console.log(block.text); // Claude's reasoning } else if ("name" in block) { console.log(`Tool: ${block.name}`); // Tool being called } } } else if (message.type === "result") { console.log(`Done: ${message.subtype}`); // Final result } } ``` This code has three main parts: 1. **`query`**: the main entry point that creates the agentic loop. It returns an async iterator, so you use `async for` to stream messages as Claude works. See the full API in the [Python](/docs/en/agent-sdk/python#query) or [TypeScript](/docs/en/agent-sdk/typescript#query) SDK reference. 2. **`prompt`**: what you want Claude to do. Claude figures out which tools to use based on the task. 3. **`options`**: configuration for the agent. This example uses `allowedTools` to restrict Claude to `Read`, `Edit`, and `Glob`, and `permissionMode: "acceptEdits"` to auto-approve file changes. Other options include `systemPrompt`, `mcpServers`, and more. See all options for [Python](/docs/en/agent-sdk/python#claudeagentoptions) or [TypeScript](/docs/en/agent-sdk/typescript#claudeagentoptions). The `async for` loop keeps running as Claude thinks, calls tools, observes results, and decides what to do next. Each iteration yields a message: Claude's reasoning, a tool call, a tool result, or the final outcome. The SDK handles the orchestration (tool execution, context management, retries) so you just consume the stream. The loop ends when Claude finishes the task or hits an error. The message handling inside the loop filters for human-readable output. Without filtering, you'd see raw message objects including system initialization and internal state, which is useful for debugging but noisy otherwise. This example uses streaming to show progress in real-time. If you don't need live output (e.g., for background jobs or CI pipelines), you can collect all messages at once. See [Streaming vs. single-turn mode](/docs/en/agent-sdk/streaming-vs-single-mode) for details. ### Run your agent Your agent is ready. Run it with the following command: ```bash python3 agent.py ``` ```bash npx tsx agent.ts ``` After running, check `utils.py`. You'll see defensive code handling empty lists and null users. Your agent autonomously: 1. **Read** `utils.py` to understand the code 2. **Analyzed** the logic and identified edge cases that would crash 3. **Edited** the file to add proper error handling This is what makes the Agent SDK different: Claude executes tools directly instead of asking you to implement them. If you see "Claude Code not found", [install Claude Code](#install-claude-code) and restart your terminal. For "API key not found", [set your API key](#set-your-api-key). See the [full troubleshooting guide](https://code.claude.com/docs/en/troubleshooting) for more help. ### Try other prompts Now that your agent is set up, try some different prompts: - `"Add docstrings to all functions in utils.py"` - `"Add type hints to all functions in utils.py"` - `"Create a README.md documenting the functions in utils.py"` ### Customize your agent You can modify your agent's behavior by changing the options. Here are a few examples: **Add web search capability:** ```python Python options=ClaudeAgentOptions( allowed_tools=["Read", "Edit", "Glob", "WebSearch"], permission_mode="acceptEdits" ) ``` ```typescript TypeScript options: { allowedTools: ["Read", "Edit", "Glob", "WebSearch"], permissionMode: "acceptEdits" } ``` **Give Claude a custom system prompt:** ```python Python options=ClaudeAgentOptions( allowed_tools=["Read", "Edit", "Glob"], permission_mode="acceptEdits", system_prompt="You are a senior Python developer. Always follow PEP 8 style guidelines." ) ``` ```typescript TypeScript options: { allowedTools: ["Read", "Edit", "Glob"], permissionMode: "acceptEdits", systemPrompt: "You are a senior Python developer. Always follow PEP 8 style guidelines." } ``` **Run commands in the terminal:** ```python Python options=ClaudeAgentOptions( allowed_tools=["Read", "Edit", "Glob", "Bash"], permission_mode="acceptEdits" ) ``` ```typescript TypeScript options: { allowedTools: ["Read", "Edit", "Glob", "Bash"], permissionMode: "acceptEdits" } ``` With `Bash` enabled, try: `"Write unit tests for utils.py, run them, and fix any failures"` ## Key concepts **Tools** control what your agent can do: | Tools | What the agent can do | |-------|----------------------| | `Read`, `Glob`, `Grep` | Read-only analysis | | `Read`, `Edit`, `Glob` | Analyze and modify code | | `Read`, `Edit`, `Bash`, `Glob`, `Grep` | Full automation | **Permission modes** control how much human oversight you want: | Mode | Behavior | Use case | |------|----------|----------| | `acceptEdits` | Auto-approves file edits, asks for other actions | Trusted development workflows | | `bypassPermissions` | Runs without prompts | CI/CD pipelines, automation | | `default` | Requires a `canUseTool` callback to handle approval | Custom approval flows | The example above uses `acceptEdits` mode, which auto-approves file operations so the agent can run without interactive prompts. If you want to prompt users for approval, use `default` mode and provide a [`canUseTool` callback](/docs/en/agent-sdk/user-input) that collects user input. For more control, see [Permissions](/docs/en/agent-sdk/permissions). ## Next steps Now that you've created your first agent, learn how to extend its capabilities and tailor it to your use case: - **[Permissions](/docs/en/agent-sdk/permissions)**: control what your agent can do and when it needs approval - **[Hooks](/docs/en/agent-sdk/hooks)**: run custom code before or after tool calls - **[Sessions](/docs/en/agent-sdk/sessions)**: build multi-turn agents that maintain context - **[MCP servers](/docs/en/agent-sdk/mcp)**: connect to databases, browsers, APIs, and other external systems - **[Hosting](/docs/en/agent-sdk/hosting)**: deploy agents to Docker, cloud, and CI/CD - **[Example agents](https://github.com/anthropics/claude-agent-sdk-demos)**: see complete examples: email assistant, research agent, and more --- # Agent SDK reference - Python URL: https://platform.claude.com/docs/en/agent-sdk/python # Agent SDK reference - Python Complete API reference for the Python Agent SDK, including all functions, types, and classes. --- ## Installation ```bash pip install claude-agent-sdk ``` ## Choosing Between `query()` and `ClaudeSDKClient` The Python SDK provides two ways to interact with Claude Code: ### Quick Comparison | Feature | `query()` | `ClaudeSDKClient` | | :------------------ | :---------------------------- | :--------------------------------- | | **Session** | Creates new session each time | Reuses same session | | **Conversation** | Single exchange | Multiple exchanges in same context | | **Connection** | Managed automatically | Manual control | | **Streaming Input** | ✅ Supported | ✅ Supported | | **Interrupts** | ❌ Not supported | ✅ Supported | | **Hooks** | ❌ Not supported | ✅ Supported | | **Custom Tools** | ❌ Not supported | ✅ Supported | | **Continue Chat** | ❌ New session each time | ✅ Maintains conversation | | **Use Case** | One-off tasks | Continuous conversations | ### When to Use `query()` (New Session Each Time) **Best for:** - One-off questions where you don't need conversation history - Independent tasks that don't require context from previous exchanges - Simple automation scripts - When you want a fresh start each time ### When to Use `ClaudeSDKClient` (Continuous Conversation) **Best for:** - **Continuing conversations** - When you need Claude to remember context - **Follow-up questions** - Building on previous responses - **Interactive applications** - Chat interfaces, REPLs - **Response-driven logic** - When next action depends on Claude's response - **Session control** - Managing conversation lifecycle explicitly ## Functions ### `query()` Creates a new session for each interaction with Claude Code. Returns an async iterator that yields messages as they arrive. Each call to `query()` starts fresh with no memory of previous interactions. ```python async def query( *, prompt: str | AsyncIterable[dict[str, Any]], options: ClaudeAgentOptions | None = None ) -> AsyncIterator[Message] ``` #### Parameters | Parameter | Type | Description | | :-------- | :--------------------------- | :------------------------------------------------------------------------- | | `prompt` | `str \| AsyncIterable[dict]` | The input prompt as a string or async iterable for streaming mode | | `options` | `ClaudeAgentOptions \| None` | Optional configuration object (defaults to `ClaudeAgentOptions()` if None) | #### Returns Returns an `AsyncIterator[Message]` that yields messages from the conversation. #### Example - With options ```python import asyncio from claude_agent_sdk import query, ClaudeAgentOptions async def main(): options = ClaudeAgentOptions( system_prompt="You are an expert Python developer", permission_mode='acceptEdits', cwd="/home/user/project" ) async for message in query( prompt="Create a Python web server", options=options ): print(message) asyncio.run(main()) ``` ### `tool()` Decorator for defining MCP tools with type safety. ```python def tool( name: str, description: str, input_schema: type | dict[str, Any] ) -> Callable[[Callable[[Any], Awaitable[dict[str, Any]]]], SdkMcpTool[Any]] ``` #### Parameters | Parameter | Type | Description | | :------------- | :----------------------- | :------------------------------------------------------ | | `name` | `str` | Unique identifier for the tool | | `description` | `str` | Human-readable description of what the tool does | | `input_schema` | `type \| dict[str, Any]` | Schema defining the tool's input parameters (see below) | #### Input Schema Options 1. **Simple type mapping** (recommended): ```python {"text": str, "count": int, "enabled": bool} ``` 2. **JSON Schema format** (for complex validation): ```python { "type": "object", "properties": { "text": {"type": "string"}, "count": {"type": "integer", "minimum": 0} }, "required": ["text"] } ``` #### Returns A decorator function that wraps the tool implementation and returns an `SdkMcpTool` instance. #### Example ```python from claude_agent_sdk import tool from typing import Any @tool("greet", "Greet a user", {"name": str}) async def greet(args: dict[str, Any]) -> dict[str, Any]: return { "content": [{ "type": "text", "text": f"Hello, {args['name']}!" }] } ``` ### `create_sdk_mcp_server()` Create an in-process MCP server that runs within your Python application. ```python def create_sdk_mcp_server( name: str, version: str = "1.0.0", tools: list[SdkMcpTool[Any]] | None = None ) -> McpSdkServerConfig ``` #### Parameters | Parameter | Type | Default | Description | | :-------- | :------------------------------ | :-------- | :---------------------------------------------------- | | `name` | `str` | - | Unique identifier for the server | | `version` | `str` | `"1.0.0"` | Server version string | | `tools` | `list[SdkMcpTool[Any]] \| None` | `None` | List of tool functions created with `@tool` decorator | #### Returns Returns an `McpSdkServerConfig` object that can be passed to `ClaudeAgentOptions.mcp_servers`. #### Example ```python from claude_agent_sdk import tool, create_sdk_mcp_server @tool("add", "Add two numbers", {"a": float, "b": float}) async def add(args): return { "content": [{ "type": "text", "text": f"Sum: {args['a'] + args['b']}" }] } @tool("multiply", "Multiply two numbers", {"a": float, "b": float}) async def multiply(args): return { "content": [{ "type": "text", "text": f"Product: {args['a'] * args['b']}" }] } calculator = create_sdk_mcp_server( name="calculator", version="2.0.0", tools=[add, multiply] # Pass decorated functions ) # Use with Claude options = ClaudeAgentOptions( mcp_servers={"calc": calculator}, allowed_tools=["mcp__calc__add", "mcp__calc__multiply"] ) ``` ## Classes ### `ClaudeSDKClient` **Maintains a conversation session across multiple exchanges.** This is the Python equivalent of how the TypeScript SDK's `query()` function works internally - it creates a client object that can continue conversations. #### Key Features - **Session Continuity**: Maintains conversation context across multiple `query()` calls - **Same Conversation**: Claude remembers previous messages in the session - **Interrupt Support**: Can stop Claude mid-execution - **Explicit Lifecycle**: You control when the session starts and ends - **Response-driven Flow**: Can react to responses and send follow-ups - **Custom Tools & Hooks**: Supports custom tools (created with `@tool` decorator) and hooks ```python class ClaudeSDKClient: def __init__(self, options: ClaudeAgentOptions | None = None) async def connect(self, prompt: str | AsyncIterable[dict] | None = None) -> None async def query(self, prompt: str | AsyncIterable[dict], session_id: str = "default") -> None async def receive_messages(self) -> AsyncIterator[Message] async def receive_response(self) -> AsyncIterator[Message] async def interrupt(self) -> None async def rewind_files(self, user_message_uuid: str) -> None async def disconnect(self) -> None ``` #### Methods | Method | Description | | :-------------------------- | :------------------------------------------------------------------ | | `__init__(options)` | Initialize the client with optional configuration | | `connect(prompt)` | Connect to Claude with an optional initial prompt or message stream | | `query(prompt, session_id)` | Send a new request in streaming mode | | `receive_messages()` | Receive all messages from Claude as an async iterator | | `receive_response()` | Receive messages until and including a ResultMessage | | `interrupt()` | Send interrupt signal (only works in streaming mode) | | `rewind_files(user_message_uuid)` | Restore files to their state at the specified user message. Requires `enable_file_checkpointing=True`. See [File checkpointing](/docs/en/agent-sdk/file-checkpointing) | | `disconnect()` | Disconnect from Claude | #### Context Manager Support The client can be used as an async context manager for automatic connection management: ```python async with ClaudeSDKClient() as client: await client.query("Hello Claude") async for message in client.receive_response(): print(message) ``` > **Important:** When iterating over messages, avoid using `break` to exit early as this can cause asyncio cleanup issues. Instead, let the iteration complete naturally or use flags to track when you've found what you need. #### Example - Continuing a conversation ```python import asyncio from claude_agent_sdk import ClaudeSDKClient, AssistantMessage, TextBlock, ResultMessage async def main(): async with ClaudeSDKClient() as client: # First question await client.query("What's the capital of France?") # Process response async for message in client.receive_response(): if isinstance(message, AssistantMessage): for block in message.content: if isinstance(block, TextBlock): print(f"Claude: {block.text}") # Follow-up question - Claude remembers the previous context await client.query("What's the population of that city?") async for message in client.receive_response(): if isinstance(message, AssistantMessage): for block in message.content: if isinstance(block, TextBlock): print(f"Claude: {block.text}") # Another follow-up - still in the same conversation await client.query("What are some famous landmarks there?") async for message in client.receive_response(): if isinstance(message, AssistantMessage): for block in message.content: if isinstance(block, TextBlock): print(f"Claude: {block.text}") asyncio.run(main()) ``` #### Example - Streaming input with ClaudeSDKClient ```python import asyncio from claude_agent_sdk import ClaudeSDKClient async def message_stream(): """Generate messages dynamically.""" yield {"type": "text", "text": "Analyze the following data:"} await asyncio.sleep(0.5) yield {"type": "text", "text": "Temperature: 25°C"} await asyncio.sleep(0.5) yield {"type": "text", "text": "Humidity: 60%"} await asyncio.sleep(0.5) yield {"type": "text", "text": "What patterns do you see?"} async def main(): async with ClaudeSDKClient() as client: # Stream input to Claude await client.query(message_stream()) # Process response async for message in client.receive_response(): print(message) # Follow-up in same session await client.query("Should we be concerned about these readings?") async for message in client.receive_response(): print(message) asyncio.run(main()) ``` #### Example - Using interrupts ```python import asyncio from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions async def interruptible_task(): options = ClaudeAgentOptions( allowed_tools=["Bash"], permission_mode="acceptEdits" ) async with ClaudeSDKClient(options=options) as client: # Start a long-running task await client.query("Count from 1 to 100 slowly") # Let it run for a bit await asyncio.sleep(2) # Interrupt the task await client.interrupt() print("Task interrupted!") # Send a new command await client.query("Just say hello instead") async for message in client.receive_response(): # Process the new response pass asyncio.run(interruptible_task()) ``` #### Example - Advanced permission control ```python from claude_agent_sdk import ( ClaudeSDKClient, ClaudeAgentOptions ) from claude_agent_sdk.types import PermissionResultAllow, PermissionResultDeny async def custom_permission_handler( tool_name: str, input_data: dict, context: dict ) -> PermissionResultAllow | PermissionResultDeny: """Custom logic for tool permissions.""" # Block writes to system directories if tool_name == "Write" and input_data.get("file_path", "").startswith("/system/"): return PermissionResultDeny( message="System directory write not allowed", interrupt=True ) # Redirect sensitive file operations if tool_name in ["Write", "Edit"] and "config" in input_data.get("file_path", ""): safe_path = f"./sandbox/{input_data['file_path']}" return PermissionResultAllow( updated_input={**input_data, "file_path": safe_path} ) # Allow everything else return PermissionResultAllow(updated_input=input_data) async def main(): options = ClaudeAgentOptions( can_use_tool=custom_permission_handler, allowed_tools=["Read", "Write", "Edit"] ) async with ClaudeSDKClient(options=options) as client: await client.query("Update the system config file") async for message in client.receive_response(): # Will use sandbox path instead print(message) asyncio.run(main()) ``` ## Types ### `SdkMcpTool` Definition for an SDK MCP tool created with the `@tool` decorator. ```python @dataclass class SdkMcpTool(Generic[T]): name: str description: str input_schema: type[T] | dict[str, Any] handler: Callable[[T], Awaitable[dict[str, Any]]] ``` | Property | Type | Description | | :------------- | :----------------------------------------- | :----------------------------------------- | | `name` | `str` | Unique identifier for the tool | | `description` | `str` | Human-readable description | | `input_schema` | `type[T] \| dict[str, Any]` | Schema for input validation | | `handler` | `Callable[[T], Awaitable[dict[str, Any]]]` | Async function that handles tool execution | ### `ClaudeAgentOptions` Configuration dataclass for Claude Code queries. ```python @dataclass class ClaudeAgentOptions: tools: list[str] | ToolsPreset | None = None allowed_tools: list[str] = field(default_factory=list) system_prompt: str | SystemPromptPreset | None = None mcp_servers: dict[str, McpServerConfig] | str | Path = field(default_factory=dict) permission_mode: PermissionMode | None = None continue_conversation: bool = False resume: str | None = None max_turns: int | None = None max_budget_usd: float | None = None disallowed_tools: list[str] = field(default_factory=list) model: str | None = None fallback_model: str | None = None betas: list[SdkBeta] = field(default_factory=list) output_format: OutputFormat | None = None permission_prompt_tool_name: str | None = None cwd: str | Path | None = None cli_path: str | Path | None = None settings: str | None = None add_dirs: list[str | Path] = field(default_factory=list) env: dict[str, str] = field(default_factory=dict) extra_args: dict[str, str | None] = field(default_factory=dict) max_buffer_size: int | None = None debug_stderr: Any = sys.stderr # Deprecated stderr: Callable[[str], None] | None = None can_use_tool: CanUseTool | None = None hooks: dict[HookEvent, list[HookMatcher]] | None = None user: str | None = None include_partial_messages: bool = False fork_session: bool = False agents: dict[str, AgentDefinition] | None = None setting_sources: list[SettingSource] | None = None max_thinking_tokens: int | None = None ``` | Property | Type | Default | Description | | :---------------------------- | :------------------------------------------- | :------------------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `tools` | `list[str] \| ToolsPreset \| None` | `None` | Tools configuration. Use `{"type": "preset", "preset": "claude_code"}` for Claude Code's default tools | | `allowed_tools` | `list[str]` | `[]` | List of allowed tool names | | `system_prompt` | `str \| SystemPromptPreset \| None` | `None` | System prompt configuration. Pass a string for custom prompt, or use `{"type": "preset", "preset": "claude_code"}` for Claude Code's system prompt. Add `"append"` to extend the preset | | `mcp_servers` | `dict[str, McpServerConfig] \| str \| Path` | `{}` | MCP server configurations or path to config file | | `permission_mode` | `PermissionMode \| None` | `None` | Permission mode for tool usage | | `continue_conversation` | `bool` | `False` | Continue the most recent conversation | | `resume` | `str \| None` | `None` | Session ID to resume | | `max_turns` | `int \| None` | `None` | Maximum conversation turns | | `max_budget_usd` | `float \| None` | `None` | Maximum budget in USD for the session | | `disallowed_tools` | `list[str]` | `[]` | List of disallowed tool names | | `enable_file_checkpointing` | `bool` | `False` | Enable file change tracking for rewinding. See [File checkpointing](/docs/en/agent-sdk/file-checkpointing) | | `model` | `str \| None` | `None` | Claude model to use | | `fallback_model` | `str \| None` | `None` | Fallback model to use if the primary model fails | | `betas` | `list[SdkBeta]` | `[]` | Beta features to enable. See [`SdkBeta`](#sdkbeta) for available options | | `output_format` | [`OutputFormat`](#outputformat) ` \| None` | `None` | Define output format for agent results. See [Structured outputs](/docs/en/agent-sdk/structured-outputs) for details | | `permission_prompt_tool_name` | `str \| None` | `None` | MCP tool name for permission prompts | | `cwd` | `str \| Path \| None` | `None` | Current working directory | | `cli_path` | `str \| Path \| None` | `None` | Custom path to the Claude Code CLI executable | | `settings` | `str \| None` | `None` | Path to settings file | | `add_dirs` | `list[str \| Path]` | `[]` | Additional directories Claude can access | | `env` | `dict[str, str]` | `{}` | Environment variables | | `extra_args` | `dict[str, str \| None]` | `{}` | Additional CLI arguments to pass directly to the CLI | | `max_buffer_size` | `int \| None` | `None` | Maximum bytes when buffering CLI stdout | | `debug_stderr` | `Any` | `sys.stderr` | _Deprecated_ - File-like object for debug output. Use `stderr` callback instead | | `stderr` | `Callable[[str], None] \| None` | `None` | Callback function for stderr output from CLI | | `can_use_tool` | [`CanUseTool`](#canusertool) ` \| None` | `None` | Tool permission callback function. See [Permission types](#canusertool) for details | | `hooks` | `dict[HookEvent, list[HookMatcher]] \| None` | `None` | Hook configurations for intercepting events | | `user` | `str \| None` | `None` | User identifier | | `include_partial_messages` | `bool` | `False` | Include partial message streaming events. When enabled, [`StreamEvent`](#streamevent) messages are yielded | | `fork_session` | `bool` | `False` | When resuming with `resume`, fork to a new session ID instead of continuing the original session | | `agents` | `dict[str, AgentDefinition] \| None` | `None` | Programmatically defined subagents | | `plugins` | `list[SdkPluginConfig]` | `[]` | Load custom plugins from local paths. See [Plugins](/docs/en/agent-sdk/plugins) for details | | `sandbox` | [`SandboxSettings`](#sandboxsettings) ` \| None` | `None` | Configure sandbox behavior programmatically. See [Sandbox settings](#sandboxsettings) for details | | `setting_sources` | `list[SettingSource] \| None` | `None` (no settings) | Control which filesystem settings to load. When omitted, no settings are loaded. **Note:** Must include `"project"` to load CLAUDE.md files | | `max_thinking_tokens` | `int \| None` | `None` | Maximum tokens for thinking blocks | ### `OutputFormat` Configuration for structured output validation. ```python class OutputFormat(TypedDict): type: Literal["json_schema"] schema: dict[str, Any] ``` | Field | Required | Description | | :------- | :------- | :--------------------------------------------- | | `type` | Yes | Must be `"json_schema"` for JSON Schema validation | | `schema` | Yes | JSON Schema definition for output validation | ### `SystemPromptPreset` Configuration for using Claude Code's preset system prompt with optional additions. ```python class SystemPromptPreset(TypedDict): type: Literal["preset"] preset: Literal["claude_code"] append: NotRequired[str] ``` | Field | Required | Description | | :------- | :------- | :------------------------------------------------------------ | | `type` | Yes | Must be `"preset"` to use a preset system prompt | | `preset` | Yes | Must be `"claude_code"` to use Claude Code's system prompt | | `append` | No | Additional instructions to append to the preset system prompt | ### `SettingSource` Controls which filesystem-based configuration sources the SDK loads settings from. ```python SettingSource = Literal["user", "project", "local"] ``` | Value | Description | Location | | :---------- | :------------------------------------------- | :---------------------------- | | `"user"` | Global user settings | `~/.claude/settings.json` | | `"project"` | Shared project settings (version controlled) | `.claude/settings.json` | | `"local"` | Local project settings (gitignored) | `.claude/settings.local.json` | #### Default behavior When `setting_sources` is **omitted** or **`None`**, the SDK does **not** load any filesystem settings. This provides isolation for SDK applications. #### Why use setting_sources? **Load all filesystem settings (legacy behavior):** ```python # Load all settings like SDK v0.0.x did from claude_agent_sdk import query, ClaudeAgentOptions async for message in query( prompt="Analyze this code", options=ClaudeAgentOptions( setting_sources=["user", "project", "local"] # Load all settings ) ): print(message) ``` **Load only specific setting sources:** ```python # Load only project settings, ignore user and local async for message in query( prompt="Run CI checks", options=ClaudeAgentOptions( setting_sources=["project"] # Only .claude/settings.json ) ): print(message) ``` **Testing and CI environments:** ```python # Ensure consistent behavior in CI by excluding local settings async for message in query( prompt="Run tests", options=ClaudeAgentOptions( setting_sources=["project"], # Only team-shared settings permission_mode="bypassPermissions" ) ): print(message) ``` **SDK-only applications:** ```python # Define everything programmatically (default behavior) # No filesystem dependencies - setting_sources defaults to None async for message in query( prompt="Review this PR", options=ClaudeAgentOptions( # setting_sources=None is the default, no need to specify agents={ /* ... */ }, mcp_servers={ /* ... */ }, allowed_tools=["Read", "Grep", "Glob"] ) ): print(message) ``` **Loading CLAUDE.md project instructions:** ```python # Load project settings to include CLAUDE.md files async for message in query( prompt="Add a new feature following project conventions", options=ClaudeAgentOptions( system_prompt={ "type": "preset", "preset": "claude_code" # Use Claude Code's system prompt }, setting_sources=["project"], # Required to load CLAUDE.md from project allowed_tools=["Read", "Write", "Edit"] ) ): print(message) ``` #### Settings precedence When multiple sources are loaded, settings are merged with this precedence (highest to lowest): 1. Local settings (`.claude/settings.local.json`) 2. Project settings (`.claude/settings.json`) 3. User settings (`~/.claude/settings.json`) Programmatic options (like `agents`, `allowed_tools`) always override filesystem settings. ### `AgentDefinition` Configuration for a subagent defined programmatically. ```python @dataclass class AgentDefinition: description: str prompt: str tools: list[str] | None = None model: Literal["sonnet", "opus", "haiku", "inherit"] | None = None ``` | Field | Required | Description | | :------------ | :------- | :------------------------------------------------------------- | | `description` | Yes | Natural language description of when to use this agent | | `tools` | No | Array of allowed tool names. If omitted, inherits all tools | | `prompt` | Yes | The agent's system prompt | | `model` | No | Model override for this agent. If omitted, uses the main model | ### `PermissionMode` Permission modes for controlling tool execution. ```python PermissionMode = Literal[ "default", # Standard permission behavior "acceptEdits", # Auto-accept file edits "plan", # Planning mode - no execution "bypassPermissions" # Bypass all permission checks (use with caution) ] ``` ### `CanUseTool` Type alias for tool permission callback functions. ```python CanUseTool = Callable[ [str, dict[str, Any], ToolPermissionContext], Awaitable[PermissionResult] ] ``` The callback receives: - `tool_name`: Name of the tool being called - `input_data`: The tool's input parameters - `context`: A `ToolPermissionContext` with additional information Returns a `PermissionResult` (either `PermissionResultAllow` or `PermissionResultDeny`). ### `ToolPermissionContext` Context information passed to tool permission callbacks. ```python @dataclass class ToolPermissionContext: signal: Any | None = None # Future: abort signal support suggestions: list[PermissionUpdate] = field(default_factory=list) ``` | Field | Type | Description | |:------|:-----|:------------| | `signal` | `Any \| None` | Reserved for future abort signal support | | `suggestions` | `list[PermissionUpdate]` | Permission update suggestions from the CLI | ### `PermissionResult` Union type for permission callback results. ```python PermissionResult = PermissionResultAllow | PermissionResultDeny ``` ### `PermissionResultAllow` Result indicating the tool call should be allowed. ```python @dataclass class PermissionResultAllow: behavior: Literal["allow"] = "allow" updated_input: dict[str, Any] | None = None updated_permissions: list[PermissionUpdate] | None = None ``` | Field | Type | Default | Description | |:------|:-----|:--------|:------------| | `behavior` | `Literal["allow"]` | `"allow"` | Must be "allow" | | `updated_input` | `dict[str, Any] \| None` | `None` | Modified input to use instead of original | | `updated_permissions` | `list[PermissionUpdate] \| None` | `None` | Permission updates to apply | ### `PermissionResultDeny` Result indicating the tool call should be denied. ```python @dataclass class PermissionResultDeny: behavior: Literal["deny"] = "deny" message: str = "" interrupt: bool = False ``` | Field | Type | Default | Description | |:------|:-----|:--------|:------------| | `behavior` | `Literal["deny"]` | `"deny"` | Must be "deny" | | `message` | `str` | `""` | Message explaining why the tool was denied | | `interrupt` | `bool` | `False` | Whether to interrupt the current execution | ### `PermissionUpdate` Configuration for updating permissions programmatically. ```python @dataclass class PermissionUpdate: type: Literal[ "addRules", "replaceRules", "removeRules", "setMode", "addDirectories", "removeDirectories", ] rules: list[PermissionRuleValue] | None = None behavior: Literal["allow", "deny", "ask"] | None = None mode: PermissionMode | None = None directories: list[str] | None = None destination: Literal["userSettings", "projectSettings", "localSettings", "session"] | None = None ``` | Field | Type | Description | |:------|:-----|:------------| | `type` | `Literal[...]` | The type of permission update operation | | `rules` | `list[PermissionRuleValue] \| None` | Rules for add/replace/remove operations | | `behavior` | `Literal["allow", "deny", "ask"] \| None` | Behavior for rule-based operations | | `mode` | `PermissionMode \| None` | Mode for setMode operation | | `directories` | `list[str] \| None` | Directories for add/remove directory operations | | `destination` | `Literal[...] \| None` | Where to apply the permission update | ### `SdkBeta` Literal type for SDK beta features. ```python SdkBeta = Literal["context-1m-2025-08-07"] ``` Use with the `betas` field in `ClaudeAgentOptions` to enable beta features. ### `McpSdkServerConfig` Configuration for SDK MCP servers created with `create_sdk_mcp_server()`. ```python class McpSdkServerConfig(TypedDict): type: Literal["sdk"] name: str instance: Any # MCP Server instance ``` ### `McpServerConfig` Union type for MCP server configurations. ```python McpServerConfig = McpStdioServerConfig | McpSSEServerConfig | McpHttpServerConfig | McpSdkServerConfig ``` #### `McpStdioServerConfig` ```python class McpStdioServerConfig(TypedDict): type: NotRequired[Literal["stdio"]] # Optional for backwards compatibility command: str args: NotRequired[list[str]] env: NotRequired[dict[str, str]] ``` #### `McpSSEServerConfig` ```python class McpSSEServerConfig(TypedDict): type: Literal["sse"] url: str headers: NotRequired[dict[str, str]] ``` #### `McpHttpServerConfig` ```python class McpHttpServerConfig(TypedDict): type: Literal["http"] url: str headers: NotRequired[dict[str, str]] ``` ### `SdkPluginConfig` Configuration for loading plugins in the SDK. ```python class SdkPluginConfig(TypedDict): type: Literal["local"] path: str ``` | Field | Type | Description | |:------|:-----|:------------| | `type` | `Literal["local"]` | Must be `"local"` (only local plugins currently supported) | | `path` | `str` | Absolute or relative path to the plugin directory | **Example:** ```python plugins=[ {"type": "local", "path": "./my-plugin"}, {"type": "local", "path": "/absolute/path/to/plugin"} ] ``` For complete information on creating and using plugins, see [Plugins](/docs/en/agent-sdk/plugins). ## Message Types ### `Message` Union type of all possible messages. ```python Message = UserMessage | AssistantMessage | SystemMessage | ResultMessage | StreamEvent ``` ### `UserMessage` User input message. ```python @dataclass class UserMessage: content: str | list[ContentBlock] ``` ### `AssistantMessage` Assistant response message with content blocks. ```python @dataclass class AssistantMessage: content: list[ContentBlock] model: str ``` ### `SystemMessage` System message with metadata. ```python @dataclass class SystemMessage: subtype: str data: dict[str, Any] ``` ### `ResultMessage` Final result message with cost and usage information. ```python @dataclass class ResultMessage: subtype: str duration_ms: int duration_api_ms: int is_error: bool num_turns: int session_id: str total_cost_usd: float | None = None usage: dict[str, Any] | None = None result: str | None = None structured_output: Any = None ``` ### `StreamEvent` Stream event for partial message updates during streaming. Only received when `include_partial_messages=True` in `ClaudeAgentOptions`. ```python @dataclass class StreamEvent: uuid: str session_id: str event: dict[str, Any] # The raw Anthropic API stream event parent_tool_use_id: str | None = None ``` | Field | Type | Description | |:------|:-----|:------------| | `uuid` | `str` | Unique identifier for this event | | `session_id` | `str` | Session identifier | | `event` | `dict[str, Any]` | The raw Anthropic API stream event data | | `parent_tool_use_id` | `str \| None` | Parent tool use ID if this event is from a subagent | ## Content Block Types ### `ContentBlock` Union type of all content blocks. ```python ContentBlock = TextBlock | ThinkingBlock | ToolUseBlock | ToolResultBlock ``` ### `TextBlock` Text content block. ```python @dataclass class TextBlock: text: str ``` ### `ThinkingBlock` Thinking content block (for models with thinking capability). ```python @dataclass class ThinkingBlock: thinking: str signature: str ``` ### `ToolUseBlock` Tool use request block. ```python @dataclass class ToolUseBlock: id: str name: str input: dict[str, Any] ``` ### `ToolResultBlock` Tool execution result block. ```python @dataclass class ToolResultBlock: tool_use_id: str content: str | list[dict[str, Any]] | None = None is_error: bool | None = None ``` ## Error Types ### `ClaudeSDKError` Base exception class for all SDK errors. ```python class ClaudeSDKError(Exception): """Base error for Claude SDK.""" ``` ### `CLINotFoundError` Raised when Claude Code CLI is not installed or not found. ```python class CLINotFoundError(CLIConnectionError): def __init__(self, message: str = "Claude Code not found", cli_path: str | None = None): """ Args: message: Error message (default: "Claude Code not found") cli_path: Optional path to the CLI that was not found """ ``` ### `CLIConnectionError` Raised when connection to Claude Code fails. ```python class CLIConnectionError(ClaudeSDKError): """Failed to connect to Claude Code.""" ``` ### `ProcessError` Raised when the Claude Code process fails. ```python class ProcessError(ClaudeSDKError): def __init__(self, message: str, exit_code: int | None = None, stderr: str | None = None): self.exit_code = exit_code self.stderr = stderr ``` ### `CLIJSONDecodeError` Raised when JSON parsing fails. ```python class CLIJSONDecodeError(ClaudeSDKError): def __init__(self, line: str, original_error: Exception): """ Args: line: The line that failed to parse original_error: The original JSON decode exception """ self.line = line self.original_error = original_error ``` ## Hook Types For a comprehensive guide on using hooks with examples and common patterns, see the [Hooks guide](/docs/en/agent-sdk/hooks). ### `HookEvent` Supported hook event types. Note that due to setup limitations, the Python SDK does not support SessionStart, SessionEnd, and Notification hooks. ```python HookEvent = Literal[ "PreToolUse", # Called before tool execution "PostToolUse", # Called after tool execution "UserPromptSubmit", # Called when user submits a prompt "Stop", # Called when stopping execution "SubagentStop", # Called when a subagent stops "PreCompact" # Called before message compaction ] ``` ### `HookCallback` Type definition for hook callback functions. ```python HookCallback = Callable[ [dict[str, Any], str | None, HookContext], Awaitable[dict[str, Any]] ] ``` Parameters: - `input_data`: Hook-specific input data (see [Hooks guide](/docs/en/agent-sdk/hooks#input-data)) - `tool_use_id`: Optional tool use identifier (for tool-related hooks) - `context`: Hook context with additional information Returns a dictionary that may contain: - `decision`: `"block"` to block the action - `systemMessage`: System message to add to the transcript - `hookSpecificOutput`: Hook-specific output data ### `HookContext` Context information passed to hook callbacks. ```python @dataclass class HookContext: signal: Any | None = None # Future: abort signal support ``` ### `HookMatcher` Configuration for matching hooks to specific events or tools. ```python @dataclass class HookMatcher: matcher: str | None = None # Tool name or pattern to match (e.g., "Bash", "Write|Edit") hooks: list[HookCallback] = field(default_factory=list) # List of callbacks to execute timeout: float | None = None # Timeout in seconds for all hooks in this matcher (default: 60) ``` ### `HookInput` Union type of all hook input types. The actual type depends on the `hook_event_name` field. ```python HookInput = ( PreToolUseHookInput | PostToolUseHookInput | UserPromptSubmitHookInput | StopHookInput | SubagentStopHookInput | PreCompactHookInput ) ``` ### `BaseHookInput` Base fields present in all hook input types. ```python class BaseHookInput(TypedDict): session_id: str transcript_path: str cwd: str permission_mode: NotRequired[str] ``` | Field | Type | Description | |:------|:-----|:------------| | `session_id` | `str` | Current session identifier | | `transcript_path` | `str` | Path to the session transcript file | | `cwd` | `str` | Current working directory | | `permission_mode` | `str` (optional) | Current permission mode | ### `PreToolUseHookInput` Input data for `PreToolUse` hook events. ```python class PreToolUseHookInput(BaseHookInput): hook_event_name: Literal["PreToolUse"] tool_name: str tool_input: dict[str, Any] ``` | Field | Type | Description | |:------|:-----|:------------| | `hook_event_name` | `Literal["PreToolUse"]` | Always "PreToolUse" | | `tool_name` | `str` | Name of the tool about to be executed | | `tool_input` | `dict[str, Any]` | Input parameters for the tool | ### `PostToolUseHookInput` Input data for `PostToolUse` hook events. ```python class PostToolUseHookInput(BaseHookInput): hook_event_name: Literal["PostToolUse"] tool_name: str tool_input: dict[str, Any] tool_response: Any ``` | Field | Type | Description | |:------|:-----|:------------| | `hook_event_name` | `Literal["PostToolUse"]` | Always "PostToolUse" | | `tool_name` | `str` | Name of the tool that was executed | | `tool_input` | `dict[str, Any]` | Input parameters that were used | | `tool_response` | `Any` | Response from the tool execution | ### `UserPromptSubmitHookInput` Input data for `UserPromptSubmit` hook events. ```python class UserPromptSubmitHookInput(BaseHookInput): hook_event_name: Literal["UserPromptSubmit"] prompt: str ``` | Field | Type | Description | |:------|:-----|:------------| | `hook_event_name` | `Literal["UserPromptSubmit"]` | Always "UserPromptSubmit" | | `prompt` | `str` | The user's submitted prompt | ### `StopHookInput` Input data for `Stop` hook events. ```python class StopHookInput(BaseHookInput): hook_event_name: Literal["Stop"] stop_hook_active: bool ``` | Field | Type | Description | |:------|:-----|:------------| | `hook_event_name` | `Literal["Stop"]` | Always "Stop" | | `stop_hook_active` | `bool` | Whether the stop hook is active | ### `SubagentStopHookInput` Input data for `SubagentStop` hook events. ```python class SubagentStopHookInput(BaseHookInput): hook_event_name: Literal["SubagentStop"] stop_hook_active: bool ``` | Field | Type | Description | |:------|:-----|:------------| | `hook_event_name` | `Literal["SubagentStop"]` | Always "SubagentStop" | | `stop_hook_active` | `bool` | Whether the stop hook is active | ### `PreCompactHookInput` Input data for `PreCompact` hook events. ```python class PreCompactHookInput(BaseHookInput): hook_event_name: Literal["PreCompact"] trigger: Literal["manual", "auto"] custom_instructions: str | None ``` | Field | Type | Description | |:------|:-----|:------------| | `hook_event_name` | `Literal["PreCompact"]` | Always "PreCompact" | | `trigger` | `Literal["manual", "auto"]` | What triggered the compaction | | `custom_instructions` | `str \| None` | Custom instructions for compaction | ### `HookJSONOutput` Union type for hook callback return values. ```python HookJSONOutput = AsyncHookJSONOutput | SyncHookJSONOutput ``` #### `SyncHookJSONOutput` Synchronous hook output with control and decision fields. ```python class SyncHookJSONOutput(TypedDict): # Control fields continue_: NotRequired[bool] # Whether to proceed (default: True) suppressOutput: NotRequired[bool] # Hide stdout from transcript stopReason: NotRequired[str] # Message when continue is False # Decision fields decision: NotRequired[Literal["block"]] systemMessage: NotRequired[str] # Warning message for user reason: NotRequired[str] # Feedback for Claude # Hook-specific output hookSpecificOutput: NotRequired[dict[str, Any]] ``` Use `continue_` (with underscore) in Python code. It is automatically converted to `continue` when sent to the CLI. #### `AsyncHookJSONOutput` Async hook output that defers hook execution. ```python class AsyncHookJSONOutput(TypedDict): async_: Literal[True] # Set to True to defer execution asyncTimeout: NotRequired[int] # Timeout in milliseconds ``` Use `async_` (with underscore) in Python code. It is automatically converted to `async` when sent to the CLI. ### Hook Usage Example This example registers two hooks: one that blocks dangerous bash commands like `rm -rf /`, and another that logs all tool usage for auditing. The security hook only runs on Bash commands (via the `matcher`), while the logging hook runs on all tools. ```python from claude_agent_sdk import query, ClaudeAgentOptions, HookMatcher, HookContext from typing import Any async def validate_bash_command( input_data: dict[str, Any], tool_use_id: str | None, context: HookContext ) -> dict[str, Any]: """Validate and potentially block dangerous bash commands.""" if input_data['tool_name'] == 'Bash': command = input_data['tool_input'].get('command', '') if 'rm -rf /' in command: return { 'hookSpecificOutput': { 'hookEventName': 'PreToolUse', 'permissionDecision': 'deny', 'permissionDecisionReason': 'Dangerous command blocked' } } return {} async def log_tool_use( input_data: dict[str, Any], tool_use_id: str | None, context: HookContext ) -> dict[str, Any]: """Log all tool usage for auditing.""" print(f"Tool used: {input_data.get('tool_name')}") return {} options = ClaudeAgentOptions( hooks={ 'PreToolUse': [ HookMatcher(matcher='Bash', hooks=[validate_bash_command], timeout=120), # 2 min for validation HookMatcher(hooks=[log_tool_use]) # Applies to all tools (default 60s timeout) ], 'PostToolUse': [ HookMatcher(hooks=[log_tool_use]) ] } ) async for message in query( prompt="Analyze this codebase", options=options ): print(message) ``` ## Tool Input/Output Types Documentation of input/output schemas for all built-in Claude Code tools. While the Python SDK doesn't export these as types, they represent the structure of tool inputs and outputs in messages. ### Task **Tool name:** `Task` **Input:** ```python { "description": str, # A short (3-5 word) description of the task "prompt": str, # The task for the agent to perform "subagent_type": str # The type of specialized agent to use } ``` **Output:** ```python { "result": str, # Final result from the subagent "usage": dict | None, # Token usage statistics "total_cost_usd": float | None, # Total cost in USD "duration_ms": int | None # Execution duration in milliseconds } ``` ### AskUserQuestion **Tool name:** `AskUserQuestion` Asks the user clarifying questions during execution. See [Handle approvals and user input](/docs/en/agent-sdk/user-input#handle-clarifying-questions) for usage details. **Input:** ```python { "questions": [ # Questions to ask the user (1-4 questions) { "question": str, # The complete question to ask the user "header": str, # Very short label displayed as a chip/tag (max 12 chars) "options": [ # The available choices (2-4 options) { "label": str, # Display text for this option (1-5 words) "description": str # Explanation of what this option means } ], "multiSelect": bool # Set to true to allow multiple selections } ], "answers": dict | None # User answers populated by the permission system } ``` **Output:** ```python { "questions": [ # The questions that were asked { "question": str, "header": str, "options": [{"label": str, "description": str}], "multiSelect": bool } ], "answers": dict[str, str] # Maps question text to answer string # Multi-select answers are comma-separated } ``` ### Bash **Tool name:** `Bash` **Input:** ```python { "command": str, # The command to execute "timeout": int | None, # Optional timeout in milliseconds (max 600000) "description": str | None, # Clear, concise description (5-10 words) "run_in_background": bool | None # Set to true to run in background } ``` **Output:** ```python { "output": str, # Combined stdout and stderr output "exitCode": int, # Exit code of the command "killed": bool | None, # Whether command was killed due to timeout "shellId": str | None # Shell ID for background processes } ``` ### Edit **Tool name:** `Edit` **Input:** ```python { "file_path": str, # The absolute path to the file to modify "old_string": str, # The text to replace "new_string": str, # The text to replace it with "replace_all": bool | None # Replace all occurrences (default False) } ``` **Output:** ```python { "message": str, # Confirmation message "replacements": int, # Number of replacements made "file_path": str # File path that was edited } ``` ### Read **Tool name:** `Read` **Input:** ```python { "file_path": str, # The absolute path to the file to read "offset": int | None, # The line number to start reading from "limit": int | None # The number of lines to read } ``` **Output (Text files):** ```python { "content": str, # File contents with line numbers "total_lines": int, # Total number of lines in file "lines_returned": int # Lines actually returned } ``` **Output (Images):** ```python { "image": str, # Base64 encoded image data "mime_type": str, # Image MIME type "file_size": int # File size in bytes } ``` ### Write **Tool name:** `Write` **Input:** ```python { "file_path": str, # The absolute path to the file to write "content": str # The content to write to the file } ``` **Output:** ```python { "message": str, # Success message "bytes_written": int, # Number of bytes written "file_path": str # File path that was written } ``` ### Glob **Tool name:** `Glob` **Input:** ```python { "pattern": str, # The glob pattern to match files against "path": str | None # The directory to search in (defaults to cwd) } ``` **Output:** ```python { "matches": list[str], # Array of matching file paths "count": int, # Number of matches found "search_path": str # Search directory used } ``` ### Grep **Tool name:** `Grep` **Input:** ```python { "pattern": str, # The regular expression pattern "path": str | None, # File or directory to search in "glob": str | None, # Glob pattern to filter files "type": str | None, # File type to search "output_mode": str | None, # "content", "files_with_matches", or "count" "-i": bool | None, # Case insensitive search "-n": bool | None, # Show line numbers "-B": int | None, # Lines to show before each match "-A": int | None, # Lines to show after each match "-C": int | None, # Lines to show before and after "head_limit": int | None, # Limit output to first N lines/entries "multiline": bool | None # Enable multiline mode } ``` **Output (content mode):** ```python { "matches": [ { "file": str, "line_number": int | None, "line": str, "before_context": list[str] | None, "after_context": list[str] | None } ], "total_matches": int } ``` **Output (files_with_matches mode):** ```python { "files": list[str], # Files containing matches "count": int # Number of files with matches } ``` ### NotebookEdit **Tool name:** `NotebookEdit` **Input:** ```python { "notebook_path": str, # Absolute path to the Jupyter notebook "cell_id": str | None, # The ID of the cell to edit "new_source": str, # The new source for the cell "cell_type": "code" | "markdown" | None, # The type of the cell "edit_mode": "replace" | "insert" | "delete" | None # Edit operation type } ``` **Output:** ```python { "message": str, # Success message "edit_type": "replaced" | "inserted" | "deleted", # Type of edit performed "cell_id": str | None, # Cell ID that was affected "total_cells": int # Total cells in notebook after edit } ``` ### WebFetch **Tool name:** `WebFetch` **Input:** ```python { "url": str, # The URL to fetch content from "prompt": str # The prompt to run on the fetched content } ``` **Output:** ```python { "response": str, # AI model's response to the prompt "url": str, # URL that was fetched "final_url": str | None, # Final URL after redirects "status_code": int | None # HTTP status code } ``` ### WebSearch **Tool name:** `WebSearch` **Input:** ```python { "query": str, # The search query to use "allowed_domains": list[str] | None, # Only include results from these domains "blocked_domains": list[str] | None # Never include results from these domains } ``` **Output:** ```python { "results": [ { "title": str, "url": str, "snippet": str, "metadata": dict | None } ], "total_results": int, "query": str } ``` ### TodoWrite **Tool name:** `TodoWrite` **Input:** ```python { "todos": [ { "content": str, # The task description "status": "pending" | "in_progress" | "completed", # Task status "activeForm": str # Active form of the description } ] } ``` **Output:** ```python { "message": str, # Success message "stats": { "total": int, "pending": int, "in_progress": int, "completed": int } } ``` ### BashOutput **Tool name:** `BashOutput` **Input:** ```python { "bash_id": str, # The ID of the background shell "filter": str | None # Optional regex to filter output lines } ``` **Output:** ```python { "output": str, # New output since last check "status": "running" | "completed" | "failed", # Current shell status "exitCode": int | None # Exit code when completed } ``` ### KillBash **Tool name:** `KillBash` **Input:** ```python { "shell_id": str # The ID of the background shell to kill } ``` **Output:** ```python { "message": str, # Success message "shell_id": str # ID of the killed shell } ``` ### ExitPlanMode **Tool name:** `ExitPlanMode` **Input:** ```python { "plan": str # The plan to run by the user for approval } ``` **Output:** ```python { "message": str, # Confirmation message "approved": bool | None # Whether user approved the plan } ``` ### ListMcpResources **Tool name:** `ListMcpResources` **Input:** ```python { "server": str | None # Optional server name to filter resources by } ``` **Output:** ```python { "resources": [ { "uri": str, "name": str, "description": str | None, "mimeType": str | None, "server": str } ], "total": int } ``` ### ReadMcpResource **Tool name:** `ReadMcpResource` **Input:** ```python { "server": str, # The MCP server name "uri": str # The resource URI to read } ``` **Output:** ```python { "contents": [ { "uri": str, "mimeType": str | None, "text": str | None, "blob": str | None } ], "server": str } ``` ## Advanced Features with ClaudeSDKClient ### Building a Continuous Conversation Interface ```python from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions, AssistantMessage, TextBlock import asyncio class ConversationSession: """Maintains a single conversation session with Claude.""" def __init__(self, options: ClaudeAgentOptions = None): self.client = ClaudeSDKClient(options) self.turn_count = 0 async def start(self): await self.client.connect() print("Starting conversation session. Claude will remember context.") print("Commands: 'exit' to quit, 'interrupt' to stop current task, 'new' for new session") while True: user_input = input(f"\n[Turn {self.turn_count + 1}] You: ") if user_input.lower() == 'exit': break elif user_input.lower() == 'interrupt': await self.client.interrupt() print("Task interrupted!") continue elif user_input.lower() == 'new': # Disconnect and reconnect for a fresh session await self.client.disconnect() await self.client.connect() self.turn_count = 0 print("Started new conversation session (previous context cleared)") continue # Send message - Claude remembers all previous messages in this session await self.client.query(user_input) self.turn_count += 1 # Process response print(f"[Turn {self.turn_count}] Claude: ", end="") async for message in self.client.receive_response(): if isinstance(message, AssistantMessage): for block in message.content: if isinstance(block, TextBlock): print(block.text, end="") print() # New line after response await self.client.disconnect() print(f"Conversation ended after {self.turn_count} turns.") async def main(): options = ClaudeAgentOptions( allowed_tools=["Read", "Write", "Bash"], permission_mode="acceptEdits" ) session = ConversationSession(options) await session.start() # Example conversation: # Turn 1 - You: "Create a file called hello.py" # Turn 1 - Claude: "I'll create a hello.py file for you..." # Turn 2 - You: "What's in that file?" # Turn 2 - Claude: "The hello.py file I just created contains..." (remembers!) # Turn 3 - You: "Add a main function to it" # Turn 3 - Claude: "I'll add a main function to hello.py..." (knows which file!) asyncio.run(main()) ``` ### Using Hooks for Behavior Modification ```python from claude_agent_sdk import ( ClaudeSDKClient, ClaudeAgentOptions, HookMatcher, HookContext ) import asyncio from typing import Any async def pre_tool_logger( input_data: dict[str, Any], tool_use_id: str | None, context: HookContext ) -> dict[str, Any]: """Log all tool usage before execution.""" tool_name = input_data.get('tool_name', 'unknown') print(f"[PRE-TOOL] About to use: {tool_name}") # You can modify or block the tool execution here if tool_name == "Bash" and "rm -rf" in str(input_data.get('tool_input', {})): return { 'hookSpecificOutput': { 'hookEventName': 'PreToolUse', 'permissionDecision': 'deny', 'permissionDecisionReason': 'Dangerous command blocked' } } return {} async def post_tool_logger( input_data: dict[str, Any], tool_use_id: str | None, context: HookContext ) -> dict[str, Any]: """Log results after tool execution.""" tool_name = input_data.get('tool_name', 'unknown') print(f"[POST-TOOL] Completed: {tool_name}") return {} async def user_prompt_modifier( input_data: dict[str, Any], tool_use_id: str | None, context: HookContext ) -> dict[str, Any]: """Add context to user prompts.""" original_prompt = input_data.get('prompt', '') # Add timestamp to all prompts from datetime import datetime timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S") return { 'hookSpecificOutput': { 'hookEventName': 'UserPromptSubmit', 'updatedPrompt': f"[{timestamp}] {original_prompt}" } } async def main(): options = ClaudeAgentOptions( hooks={ 'PreToolUse': [ HookMatcher(hooks=[pre_tool_logger]), HookMatcher(matcher='Bash', hooks=[pre_tool_logger]) ], 'PostToolUse': [ HookMatcher(hooks=[post_tool_logger]) ], 'UserPromptSubmit': [ HookMatcher(hooks=[user_prompt_modifier]) ] }, allowed_tools=["Read", "Write", "Bash"] ) async with ClaudeSDKClient(options=options) as client: await client.query("List files in current directory") async for message in client.receive_response(): # Hooks will automatically log tool usage pass asyncio.run(main()) ``` ### Real-time Progress Monitoring ```python from claude_agent_sdk import ( ClaudeSDKClient, ClaudeAgentOptions, AssistantMessage, ToolUseBlock, ToolResultBlock, TextBlock ) import asyncio async def monitor_progress(): options = ClaudeAgentOptions( allowed_tools=["Write", "Bash"], permission_mode="acceptEdits" ) async with ClaudeSDKClient(options=options) as client: await client.query( "Create 5 Python files with different sorting algorithms" ) # Monitor progress in real-time files_created = [] async for message in client.receive_messages(): if isinstance(message, AssistantMessage): for block in message.content: if isinstance(block, ToolUseBlock): if block.name == "Write": file_path = block.input.get("file_path", "") print(f"🔨 Creating: {file_path}") elif isinstance(block, ToolResultBlock): print(f"✅ Completed tool execution") elif isinstance(block, TextBlock): print(f"💭 Claude says: {block.text[:100]}...") # Check if we've received the final result if hasattr(message, 'subtype') and message.subtype in ['success', 'error']: print(f"\n🎯 Task completed!") break asyncio.run(monitor_progress()) ``` ## Example Usage ### Basic file operations (using query) ```python from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage, ToolUseBlock import asyncio async def create_project(): options = ClaudeAgentOptions( allowed_tools=["Read", "Write", "Bash"], permission_mode='acceptEdits', cwd="/home/user/project" ) async for message in query( prompt="Create a Python project structure with setup.py", options=options ): if isinstance(message, AssistantMessage): for block in message.content: if isinstance(block, ToolUseBlock): print(f"Using tool: {block.name}") asyncio.run(create_project()) ``` ### Error handling ```python from claude_agent_sdk import ( query, CLINotFoundError, ProcessError, CLIJSONDecodeError ) try: async for message in query(prompt="Hello"): print(message) except CLINotFoundError: print("Please install Claude Code: npm install -g @anthropic-ai/claude-code") except ProcessError as e: print(f"Process failed with exit code: {e.exit_code}") except CLIJSONDecodeError as e: print(f"Failed to parse response: {e}") ``` ### Streaming mode with client ```python from claude_agent_sdk import ClaudeSDKClient import asyncio async def interactive_session(): async with ClaudeSDKClient() as client: # Send initial message await client.query("What's the weather like?") # Process responses async for msg in client.receive_response(): print(msg) # Send follow-up await client.query("Tell me more about that") # Process follow-up response async for msg in client.receive_response(): print(msg) asyncio.run(interactive_session()) ``` ### Using custom tools with ClaudeSDKClient ```python from claude_agent_sdk import ( ClaudeSDKClient, ClaudeAgentOptions, tool, create_sdk_mcp_server, AssistantMessage, TextBlock ) import asyncio from typing import Any # Define custom tools with @tool decorator @tool("calculate", "Perform mathematical calculations", {"expression": str}) async def calculate(args: dict[str, Any]) -> dict[str, Any]: try: result = eval(args["expression"], {"__builtins__": {}}) return { "content": [{ "type": "text", "text": f"Result: {result}" }] } except Exception as e: return { "content": [{ "type": "text", "text": f"Error: {str(e)}" }], "is_error": True } @tool("get_time", "Get current time", {}) async def get_time(args: dict[str, Any]) -> dict[str, Any]: from datetime import datetime current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S") return { "content": [{ "type": "text", "text": f"Current time: {current_time}" }] } async def main(): # Create SDK MCP server with custom tools my_server = create_sdk_mcp_server( name="utilities", version="1.0.0", tools=[calculate, get_time] ) # Configure options with the server options = ClaudeAgentOptions( mcp_servers={"utils": my_server}, allowed_tools=[ "mcp__utils__calculate", "mcp__utils__get_time" ] ) # Use ClaudeSDKClient for interactive tool usage async with ClaudeSDKClient(options=options) as client: await client.query("What's 123 * 456?") # Process calculation response async for message in client.receive_response(): if isinstance(message, AssistantMessage): for block in message.content: if isinstance(block, TextBlock): print(f"Calculation: {block.text}") # Follow up with time query await client.query("What time is it now?") async for message in client.receive_response(): if isinstance(message, AssistantMessage): for block in message.content: if isinstance(block, TextBlock): print(f"Time: {block.text}") asyncio.run(main()) ``` ## Sandbox Configuration ### `SandboxSettings` Configuration for sandbox behavior. Use this to enable command sandboxing and configure network restrictions programmatically. ```python class SandboxSettings(TypedDict, total=False): enabled: bool autoAllowBashIfSandboxed: bool excludedCommands: list[str] allowUnsandboxedCommands: bool network: SandboxNetworkConfig ignoreViolations: SandboxIgnoreViolations enableWeakerNestedSandbox: bool ``` | Property | Type | Default | Description | | :------- | :--- | :------ | :---------- | | `enabled` | `bool` | `False` | Enable sandbox mode for command execution | | `autoAllowBashIfSandboxed` | `bool` | `False` | Auto-approve bash commands when sandbox is enabled | | `excludedCommands` | `list[str]` | `[]` | Commands that always bypass sandbox restrictions (e.g., `["docker"]`). These run unsandboxed automatically without model involvement | | `allowUnsandboxedCommands` | `bool` | `False` | Allow the model to request running commands outside the sandbox. When `True`, the model can set `dangerouslyDisableSandbox` in tool input, which falls back to the [permissions system](#permissions-fallback-for-unsandboxed-commands) | | `network` | [`SandboxNetworkConfig`](#sandboxnetworkconfig) | `None` | Network-specific sandbox configuration | | `ignoreViolations` | [`SandboxIgnoreViolations`](#sandboxignoreviolations) | `None` | Configure which sandbox violations to ignore | | `enableWeakerNestedSandbox` | `bool` | `False` | Enable a weaker nested sandbox for compatibility | **Filesystem and network access restrictions** are NOT configured via sandbox settings. Instead, they are derived from [permission rules](https://code.claude.com/docs/en/settings#permission-settings): - **Filesystem read restrictions**: Read deny rules - **Filesystem write restrictions**: Edit allow/deny rules - **Network restrictions**: WebFetch allow/deny rules Use sandbox settings for command execution sandboxing, and permission rules for filesystem and network access control. #### Example usage ```python from claude_agent_sdk import query, ClaudeAgentOptions, SandboxSettings sandbox_settings: SandboxSettings = { "enabled": True, "autoAllowBashIfSandboxed": True, "network": { "allowLocalBinding": True } } async for message in query( prompt="Build and test my project", options=ClaudeAgentOptions(sandbox=sandbox_settings) ): print(message) ``` **Unix socket security**: The `allowUnixSockets` option can grant access to powerful system services. For example, allowing `/var/run/docker.sock` effectively grants full host system access through the Docker API, bypassing sandbox isolation. Only allow Unix sockets that are strictly necessary and understand the security implications of each. ### `SandboxNetworkConfig` Network-specific configuration for sandbox mode. ```python class SandboxNetworkConfig(TypedDict, total=False): allowLocalBinding: bool allowUnixSockets: list[str] allowAllUnixSockets: bool httpProxyPort: int socksProxyPort: int ``` | Property | Type | Default | Description | | :------- | :--- | :------ | :---------- | | `allowLocalBinding` | `bool` | `False` | Allow processes to bind to local ports (e.g., for dev servers) | | `allowUnixSockets` | `list[str]` | `[]` | Unix socket paths that processes can access (e.g., Docker socket) | | `allowAllUnixSockets` | `bool` | `False` | Allow access to all Unix sockets | | `httpProxyPort` | `int` | `None` | HTTP proxy port for network requests | | `socksProxyPort` | `int` | `None` | SOCKS proxy port for network requests | ### `SandboxIgnoreViolations` Configuration for ignoring specific sandbox violations. ```python class SandboxIgnoreViolations(TypedDict, total=False): file: list[str] network: list[str] ``` | Property | Type | Default | Description | | :------- | :--- | :------ | :---------- | | `file` | `list[str]` | `[]` | File path patterns to ignore violations for | | `network` | `list[str]` | `[]` | Network patterns to ignore violations for | ### Permissions Fallback for Unsandboxed Commands When `allowUnsandboxedCommands` is enabled, the model can request to run commands outside the sandbox by setting `dangerouslyDisableSandbox: True` in the tool input. These requests fall back to the existing permissions system, meaning your `can_use_tool` handler will be invoked, allowing you to implement custom authorization logic. **`excludedCommands` vs `allowUnsandboxedCommands`:** - `excludedCommands`: A static list of commands that always bypass the sandbox automatically (e.g., `["docker"]`). The model has no control over this. - `allowUnsandboxedCommands`: Lets the model decide at runtime whether to request unsandboxed execution by setting `dangerouslyDisableSandbox: True` in the tool input. ```python from claude_agent_sdk import query, ClaudeAgentOptions async def can_use_tool(tool: str, input: dict) -> bool: # Check if the model is requesting to bypass the sandbox if tool == "Bash" and input.get("dangerouslyDisableSandbox"): # The model wants to run this command outside the sandbox print(f"Unsandboxed command requested: {input.get('command')}") # Return True to allow, False to deny return is_command_authorized(input.get("command")) return True async def main(): async for message in query( prompt="Deploy my application", options=ClaudeAgentOptions( sandbox={ "enabled": True, "allowUnsandboxedCommands": True # Model can request unsandboxed execution }, permission_mode="default", can_use_tool=can_use_tool ) ): print(message) ``` This pattern enables you to: - **Audit model requests**: Log when the model requests unsandboxed execution - **Implement allowlists**: Only permit specific commands to run unsandboxed - **Add approval workflows**: Require explicit authorization for privileged operations Commands running with `dangerouslyDisableSandbox: True` have full system access. Ensure your `can_use_tool` handler validates these requests carefully. If `permission_mode` is set to `bypassPermissions` and `allow_unsandboxed_commands` is enabled, the model can autonomously execute commands outside the sandbox without any approval prompts. This combination effectively allows the model to escape sandbox isolation silently. ## See also - [Python SDK guide](/docs/en/agent-sdk/python) - Tutorial and examples - [SDK overview](/docs/en/agent-sdk/overview) - General SDK concepts - [TypeScript SDK reference](/docs/en/agent-sdk/typescript) - TypeScript SDK documentation - [CLI reference](https://code.claude.com/docs/en/cli-reference) - Command-line interface - [Common workflows](https://code.claude.com/docs/en/common-workflows) - Step-by-step guides --- # Agent SDK reference - TypeScript URL: https://platform.claude.com/docs/en/agent-sdk/typescript # Agent SDK reference - TypeScript Complete API reference for the TypeScript Agent SDK, including all functions, types, and interfaces. ---