Loading...
    • Developer Guide
    • API Reference
    • MCP
    • Resources
    • Release Notes
    Search...
    ⌘K

    First steps

    Intro to ClaudeQuickstart

    Models & pricing

    Models overviewChoosing a modelWhat's new in Claude 4.5Migrating to Claude 4.5Model deprecationsPricing

    Build with Claude

    Features overviewUsing the Messages APIContext windowsPrompting best practices

    Capabilities

    Prompt cachingContext editingExtended thinkingStreaming MessagesBatch processingCitationsMultilingual supportToken countingEmbeddingsVisionPDF supportFiles APISearch resultsGoogle Sheets add-on

    Tools

    OverviewHow to implement tool useToken-efficient tool useFine-grained tool streamingBash toolCode execution toolComputer use toolText editor toolWeb fetch toolWeb search toolMemory tool

    Agent Skills

    OverviewQuickstartBest practicesUsing Skills with the API

    Agent SDK

    OverviewTypeScript SDKPython SDK

    Guides

    Streaming InputHandling PermissionsSession ManagementHosting the Agent SDKModifying system promptsMCP in the SDKCustom ToolsSubagents in the SDKSlash Commands in the SDKAgent Skills in the SDKTracking Costs and UsageTodo ListsPlugins in the SDK

    MCP in the API

    MCP connectorRemote MCP servers

    Claude on 3rd-party platforms

    Amazon BedrockVertex AI

    Prompt engineering

    OverviewPrompt generatorUse prompt templatesPrompt improverBe clear and directUse examples (multishot prompting)Let Claude think (CoT)Use XML tagsGive Claude a role (system prompts)Prefill Claude's responseChain complex promptsLong context tipsExtended thinking tips

    Test & evaluate

    Define success criteriaDevelop test casesUsing the Evaluation ToolReducing latency

    Strengthen guardrails

    Reduce hallucinationsIncrease output consistencyMitigate jailbreaksStreaming refusalsReduce prompt leakKeep Claude in character

    Administration and monitoring

    Admin API overviewUsage and Cost APIClaude Code Analytics API
    Console
    Capabilities

    Context editing

    Automatically manage conversation context as it grows with context editing.

    Context editing is currently in beta with support for tool result clearing and thinking block clearing. To enable it, use the beta header context-management-2025-06-27 in your API requests.

    Please reach out through our feedback form to share your feedback on this feature.

    Overview

    Context editing allows you to automatically manage conversation context as it grows, helping you optimize costs and stay within context window limits. The API provides different strategies for managing context:

    • Tool result clearing (clear_tool_uses_20250919): Automatically clears tool use/result pairs when conversation context exceeds your configured threshold
    • Thinking block clearing (clear_thinking_20251015): Manages thinking blocks by clearing older thinking blocks from previous turns

    Each strategy can be configured independently and applied together to optimize your specific use case.

    Context editing strategies

    Tool result clearing

    The clear_tool_uses_20250919 strategy clears tool results when conversation context grows beyond your configured threshold. When activated, the API automatically clears the oldest tool results in chronological order, replacing them with placeholder text to let Claude know the tool result was removed. By default, only tool results are cleared. You can optionally clear both tool results and tool calls (the tool use parameters) by setting clear_tool_inputs to true.

    Thinking block clearing

    The clear_thinking_20251015 strategy manages thinking blocks in conversations when extended thinking is enabled. This strategy automatically clears older thinking blocks from previous turns.

    Default behavior: When extended thinking is enabled without configuring the clear_thinking_20251015 strategy, the API automatically keeps only the thinking blocks from the last assistant turn (equivalent to keep: {type: "thinking_turns", value: 1}).

    To maximize cache hits, preserve all thinking blocks by setting keep: "all".

    An assistant conversation turn may include multiple content blocks (e.g. when using tools) and multiple thinking blocks (e.g. with interleaved thinking).

    Context editing happens server-side

    Context editing is applied server-side before the prompt reaches Claude. Your client application maintains the full, unmodified conversation history—you do not need to sync your client state with the edited version. Continue managing your full conversation history locally as you normally would.

    Context editing and prompt caching

    Context editing's interaction with prompt caching varies by strategy:

    • Tool result clearing: Invalidates cached prompt prefixes when content is cleared. To account for this, we recommend clearing enough tokens to make the cache invalidation worthwhile. Use the clear_at_least parameter to ensure a minimum number of tokens is cleared each time. You'll incur cache write costs each time content is cleared, but subsequent requests can reuse the newly cached prefix.

    • Thinking block clearing: When thinking blocks are kept in context (not cleared), the prompt cache is preserved, enabling cache hits and reducing input token costs. When thinking blocks are cleared, the cache is invalidated at the point where clearing occurs. Configure the keep parameter based on whether you want to prioritize cache performance or context window availability.

    Supported models

    Context editing is available on:

    • Claude Opus 4.1 (claude-opus-4-1-20250805)
    • Claude Opus 4 (claude-opus-4-20250514)
    • Claude Sonnet 4.5 (claude-sonnet-4-5-20250929)
    • Claude Sonnet 4 (claude-sonnet-4-20250514)
    • Claude Haiku 4.5 (claude-haiku-4-5-20251001)

    Tool result clearing usage

    The simplest way to enable tool result clearing is to specify only the strategy type, as all other configuration options will use their default values:

    cURL
    curl https://api.anthropic.com/v1/messages \
        --header "x-api-key: $ANTHROPIC_API_KEY" \
        --header "anthropic-version: 2023-06-01" \
        --header "content-type: application/json" \
        --header "anthropic-beta: context-management-2025-06-27" \
        --data '{
            "model": "claude-sonnet-4-5",
            "max_tokens": 4096,
            "messages": [
                {
                    "role": "user",
                    "content": "Search for recent developments in AI"
                }
            ],
            "tools": [
                {
                    "type": "web_search_20250305",
                    "name": "web_search"
                }
            ],
            "context_management": {
                "edits": [
                    {"type": "clear_tool_uses_20250919"}
                ]
            }
        }'
    Python
    response = client.beta.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=4096,
        messages=[
            {
                "role": "user",
                "content": "Search for recent developments in AI"
            }
        ],
        tools=[
            {
                "type": "web_search_20250305",
                "name": "web_search"
            }
        ],
        betas=["context-management-2025-06-27"],
        context_management={
            "edits": [
                {"type": "clear_tool_uses_20250919"}
            ]
        }
    )
    TypeScript
    import Anthropic from '@anthropic-ai/sdk';
    
    const anthropic = new Anthropic({
      apiKey: process.env.ANTHROPIC_API_KEY,
    });
    
    const response = await anthropic.beta.messages.create({
      model: "claude-sonnet-4-5",
      max_tokens: 4096,
      messages: [
        {
          role: "user",
          content: "Search for recent developments in AI"
        }
      ],
      tools: [
        {
          type: "web_search_20250305",
          name: "web_search"
        }
      ],
      context_management: {
        edits: [
          { type: "clear_tool_uses_20250919" }
        ]
      },
      betas: ["context-management-2025-06-27"]
    });

    Advanced configuration

    You can customize the tool result clearing behavior with additional parameters:

    curl https://api.anthropic.com/v1/messages \
        --header "x-api-key: $ANTHROPIC_API_KEY" \
        --header "anthropic-version: 2023-06-01" \
        --header "content-type: application/json" \
        --header "anthropic-beta: context-management-2025-06-27" \
        --data '{
            "model": "claude-sonnet-4-5",
            "max_tokens": 4096,
            "messages": [
                {
                    "role": "user",
                    "content": "Create a simple command line calculator app using Python"
                }
            ],
            "tools": [
                {
                    "type": "text_editor_20250728",
                    "name": "str_replace_based_edit_tool",
                    "max_characters": 10000
                },
                {
                    "type": "web_search_20250305",
                    "name": "web_search",
                    "max_uses": 3
                }
            ],
            "context_management": {
                "edits": [
                    {
                        "type": "clear_tool_uses_20250919",
                        "trigger": {
                            "type": "input_tokens",
                            "value": 30000
                        },
                        "keep": {
                            "type": "tool_uses",
                            "value": 3
                        },
                        "clear_at_least": {
                            "type": "input_tokens",
                            "value": 5000
                        },
                        "exclude_tools": ["web_search"]
                    }
                ]
            }
        }'

    Thinking block clearing usage

    Enable thinking block clearing to manage context and prompt caching effectively when extended thinking is enabled:

    curl https://api.anthropic.com/v1/messages \
        --header "x-api-key: $ANTHROPIC_API_KEY" \
        --header "anthropic-version: 2023-06-01" \
        --header "content-type: application/json" \
        --header "anthropic-beta: context-management-2025-06-27" \
        --data '{
            "model": "claude-sonnet-4-5-20250929",
            "max_tokens": 1024,
            "messages": [...],
            "thinking": {
                "type": "enabled",
                "budget_tokens": 10000
            },
            "context_management": {
                "edits": [
                    {
                        "type": "clear_thinking_20251015",
                        "keep": {
                            "type": "thinking_turns",
                            "value": 2
                        }
                    }
                ]
            }
        }'

    Configuration options for thinking block clearing

    The clear_thinking_20251015 strategy supports the following configuration:

    Configuration optionDefaultDescription
    keep{type: "thinking_turns", value: 1}Defines how many recent assistant turns with thinking blocks to preserve. Use {type: "thinking_turns", value: N} where N must be > 0 to keep the last N turns, or "all" to keep all thinking blocks.

    Example configurations:

    // Keep thinking blocks from the last 3 assistant turns
    {
      "type": "clear_thinking_20251015",
      "keep": {
        "type": "thinking_turns",
        "value": 3
      }
    }
    
    // Keep all thinking blocks (maximizes cache hits)
    {
      "type": "clear_thinking_20251015",
      "keep": "all"
    }

    Combining strategies

    You can use both thinking block clearing and tool result clearing together:

    When using multiple strategies, the clear_thinking_20251015 strategy must be listed first in the edits array.

    response = client.beta.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=1024,
        messages=[...],
        thinking={
            "type": "enabled",
            "budget_tokens": 10000
        },
        tools=[...],
        betas=["context-management-2025-06-27"],
        context_management={
            "edits": [
                {
                    "type": "clear_thinking_20251015",
                    "keep": {
                        "type": "thinking_turns",
                        "value": 2
                    }
                },
                {
                    "type": "clear_tool_uses_20250919",
                    "trigger": {
                        "type": "input_tokens",
                        "value": 50000
                    },
                    "keep": {
                        "type": "tool_uses",
                        "value": 5
                    }
                }
            ]
        }
    )

    Configuration options for tool result clearing

    Configuration optionDefaultDescription
    trigger100,000 input tokensDefines when the context editing strategy activates. Once the prompt exceeds this threshold, clearing will begin. You can specify this value in either input_tokens or tool_uses.
    keep3 tool usesDefines how many recent tool use/result pairs to keep after clearing occurs. The API removes the oldest tool interactions first, preserving the most recent ones.
    clear_at_leastNoneEnsures a minimum number of tokens is cleared each time the strategy activates. If the API can't clear at least the specified amount, the strategy will not be applied. This helps determine if context clearing is worth breaking your prompt cache.
    exclude_toolsNoneList of tool names whose tool uses and results should never be cleared. Useful for preserving important context.
    clear_tool_inputsfalseControls whether the tool call parameters are cleared along with the tool results. By default, only the tool results are cleared while keeping Claude's original tool calls visible.

    Context editing response

    You can see which context edits were applied to your request using the context_management response field, along with helpful statistics about the content and input tokens cleared.

    Response
    {
        "id": "msg_013Zva2CMHLNnXjNJJKqJ2EF",
        "type": "message",
        "role": "assistant",
        "content": [...],
        "usage": {...},
        "context_management": {
            "applied_edits": [
                // When using `clear_thinking_20251015`
                {
                    "type": "clear_thinking_20251015",
                    "cleared_thinking_turns": 3,
                    "cleared_input_tokens": 15000
                },
                // When using `clear_tool_uses_20250919`
                {
                    "type": "clear_tool_uses_20250919",
                    "cleared_tool_uses": 8,
                    "cleared_input_tokens": 50000
                }
            ]
        }
    }

    For streaming responses, the context edits will be included in the final message_delta event:

    Streaming Response
    {
        "type": "message_delta",
        "delta": {
            "stop_reason": "end_turn",
            "stop_sequence": null
        },
        "usage": {
            "output_tokens": 1024
        },
        "context_management": {
            "applied_edits": [...]
        }
    }

    Token counting

    The token counting endpoint supports context management, allowing you to preview how many tokens your prompt will use after context editing is applied.

    curl https://api.anthropic.com/v1/messages/count_tokens \
        --header "x-api-key: $ANTHROPIC_API_KEY" \
        --header "anthropic-version: 2023-06-01" \
        --header "content-type: application/json" \
        --header "anthropic-beta: context-management-2025-06-27" \
        --data '{
            "model": "claude-sonnet-4-5",
            "messages": [
                {
                    "role": "user",
                    "content": "Continue our conversation..."
                }
            ],
            "tools": [...],
            "context_management": {
                "edits": [
                    {
                        "type": "clear_tool_uses_20250919",
                        "trigger": {
                            "type": "input_tokens",
                            "value": 30000
                        },
                        "keep": {
                            "type": "tool_uses",
                            "value": 5
                        }
                    }
                ]
            }
        }'
    Response
    {
        "input_tokens": 25000,
        "context_management": {
            "original_input_tokens": 70000
        }
    }

    The response shows both the final token count after context management is applied (input_tokens) and the original token count before any clearing occurred (original_input_tokens).

    Using with the Memory Tool

    Context editing can be combined with the memory tool. When your conversation context approaches the configured clearing threshold, Claude receives an automatic warning to preserve important information. This enables Claude to save tool results or context to its memory files before they're cleared from the conversation history.

    This combination allows you to:

    • Preserve important context: Claude can write essential information from tool results to memory files before those results are cleared
    • Maintain long-running workflows: Enable agentic workflows that would otherwise exceed context limits by offloading information to persistent storage
    • Access information on demand: Claude can look up previously cleared information from memory files when needed, rather than keeping everything in the active context window

    For example, in a file editing workflow where Claude performs many operations, Claude can summarize completed changes to memory files as the context grows. When tool results are cleared, Claude retains access to that information through its memory system and can continue working effectively.

    To use both features together, enable them in your API request:

    response = client.beta.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=4096,
        messages=[...],
        tools=[
            {
                "type": "memory_20250818",
                "name": "memory"
            },
            # Your other tools
        ],
        betas=["context-management-2025-06-27"],
        context_management={
            "edits": [
                {"type": "clear_tool_uses_20250919"}
            ]
        }
    )
    • Overview
    • Context editing strategies
    • Tool result clearing
    • Thinking block clearing
    • Supported models
    • Tool result clearing usage
    • Advanced configuration
    • Thinking block clearing usage
    • Configuration options for thinking block clearing
    • Combining strategies
    • Configuration options for tool result clearing
    • Context editing response
    • Token counting
    • Using with the Memory Tool
    © 2025 ANTHROPIC PBC

    Products

    • Claude
    • Claude Code
    • Max plan
    • Team plan
    • Enterprise plan
    • Download app
    • Pricing
    • Log in

    Features

    • Claude and Slack
    • Claude in Excel

    Models

    • Opus
    • Sonnet
    • Haiku

    Solutions

    • AI agents
    • Code modernization
    • Coding
    • Customer support
    • Education
    • Financial services
    • Government
    • Life sciences

    Claude Developer Platform

    • Overview
    • Developer docs
    • Pricing
    • Amazon Bedrock
    • Google Cloud’s Vertex AI
    • Console login

    Learn

    • Blog
    • Catalog
    • Courses
    • Connectors
    • Customer stories
    • Engineering at Anthropic
    • Events
    • Powered by Claude
    • Service partners
    • Startups program

    Company

    • Anthropic
    • Careers
    • Economic Futures
    • Research
    • News
    • Responsible Scaling Policy
    • Security and compliance
    • Transparency

    Help and security

    • Availability
    • Status
    • Support center

    Terms and policies

    • Privacy policy
    • Responsible disclosure policy
    • Terms of service: Commercial
    • Terms of service: Consumer
    • Usage policy

    Products

    • Claude
    • Claude Code
    • Max plan
    • Team plan
    • Enterprise plan
    • Download app
    • Pricing
    • Log in

    Features

    • Claude and Slack
    • Claude in Excel

    Models

    • Opus
    • Sonnet
    • Haiku

    Solutions

    • AI agents
    • Code modernization
    • Coding
    • Customer support
    • Education
    • Financial services
    • Government
    • Life sciences

    Claude Developer Platform

    • Overview
    • Developer docs
    • Pricing
    • Amazon Bedrock
    • Google Cloud’s Vertex AI
    • Console login

    Learn

    • Blog
    • Catalog
    • Courses
    • Connectors
    • Customer stories
    • Engineering at Anthropic
    • Events
    • Powered by Claude
    • Service partners
    • Startups program

    Company

    • Anthropic
    • Careers
    • Economic Futures
    • Research
    • News
    • Responsible Scaling Policy
    • Security and compliance
    • Transparency

    Help and security

    • Availability
    • Status
    • Support center

    Terms and policies

    • Privacy policy
    • Responsible disclosure policy
    • Terms of service: Commercial
    • Terms of service: Consumer
    • Usage policy
    © 2025 ANTHROPIC PBC