    Tool use with prompt caching

    Cache tool definitions across turns and understand what invalidates your cache.

    This page covers prompt caching for tool definitions: where to place cache_control breakpoints, how defer_loading preserves your cache, and what invalidates it. For general prompt caching, see Prompt caching.

    cache_control on tool definitions

    Place cache_control: {"type": "ephemeral"} on the last tool in your tools array. This caches the entire tool-definitions prefix, from the first tool through the marked breakpoint:

    {
      "tools": [
        {
          "name": "get_weather",
          "description": "Get the current weather in a given location",
          "input_schema": {
            "type": "object",
            "properties": {
              "location": { "type": "string" }
            },
            "required": ["location"]
          }
        },
        {
          "name": "get_time",
          "description": "Get the current time in a given time zone",
          "input_schema": {
            "type": "object",
            "properties": {
              "timezone": { "type": "string" }
            },
            "required": ["timezone"]
          },
          "cache_control": { "type": "ephemeral" }
        }
      ]
    }
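In application code it is easy to attach the breakpoint programmatically so it always lands on the final tool, whatever the list contains. A minimal sketch using plain request dicts (no SDK assumed):

```python
def mark_last_tool_cached(tools):
    """Return a copy of the tools list with an ephemeral cache_control
    breakpoint on the final tool, caching the whole tool-definitions prefix."""
    if not tools:
        return tools
    tools = [dict(t) for t in tools]  # shallow-copy so the caller's dicts stay clean
    tools[-1]["cache_control"] = {"type": "ephemeral"}
    return tools

tools = [
    {"name": "get_weather",
     "description": "Get the current weather in a given location",
     "input_schema": {"type": "object",
                      "properties": {"location": {"type": "string"}},
                      "required": ["location"]}},
    {"name": "get_time",
     "description": "Get the current time in a given time zone",
     "input_schema": {"type": "object",
                      "properties": {"timezone": {"type": "string"}},
                      "required": ["timezone"]}},
]

cached_tools = mark_last_tool_cached(tools)
```

Because the helper copies the dicts, the original definitions can be reused in requests that should not carry a breakpoint.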

For an mcp_toolset, place the breakpoint on the mcp_toolset entry itself: you don't control tool order within an MCP toolset, so the API applies the breakpoint to the last tool in the expanded set for you.
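As a sketch of that placement, the entry below shows where cache_control goes. The field names other than cache_control (type, mcp_server_name) are illustrative assumptions, not the authoritative mcp_toolset schema:

```python
# Hypothetical mcp_toolset entry -- field names besides cache_control are
# illustrative assumptions; consult the MCP connector docs for the real shape.
tools = [
    {
        "type": "mcp_toolset",          # assumed discriminator field
        "mcp_server_name": "db-tools",  # assumed reference to a configured MCP server
        # The breakpoint sits on the toolset entry; the API applies it to the
        # last tool in the expanded set.
        "cache_control": {"type": "ephemeral"},
    }
]
```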

    defer_loading and cache preservation

    Deferred tools are not included in the system-prompt prefix. When the model discovers a deferred tool through tool search, the definition is appended inline as a tool_reference block in the conversation history. The prefix is untouched, so prompt caching is preserved.

    This means adding tools dynamically through tool search does not break your cache. You can start a conversation with a small set of always-loaded tools (cached), let the model discover additional tools as needed, and keep the same cache hit across every turn.
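That pattern can be sketched as a request toolset: a small always-loaded core carrying the cache breakpoint, plus a large deferred tail. This assumes the breakpoint belongs on the last non-deferred tool, since deferred tools are excluded from the cached prefix; the deferred tool names here are invented for illustration:

```python
# Sketch: a small cached core plus a large set of deferred tools that the
# model can discover via tool search without invalidating the prefix cache.
core_tool = {
    "name": "get_weather",
    "description": "Get the current weather in a given location",
    "input_schema": {"type": "object",
                     "properties": {"location": {"type": "string"}},
                     "required": ["location"]},
    # Assumed placement: last *non-deferred* tool, since deferred tools
    # are not part of the cached prefix.
    "cache_control": {"type": "ephemeral"},
}
deferred_tools = [
    {"name": f"report_tool_{i}",                 # invented names for illustration
     "description": f"Generate report variant {i}",
     "input_schema": {"type": "object", "properties": {}},
     "defer_loading": True}                      # excluded from the cached prefix
    for i in range(50)
]
tools = [core_tool] + deferred_tools

# Only the non-deferred tools participate in the cached prefix, so tools
# discovered later arrive as tool_reference blocks and the cache survives.
cached_prefix = [t["name"] for t in tools if not t.get("defer_loading")]
```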

defer_loading is also independent of grammar construction for strict tool use: the grammar builds from the full toolset regardless of which tools are deferred, so prompt caching and grammar caching are both preserved when tools load dynamically.

    What invalidates your cache

    The cache follows a prefix hierarchy (tools → system → messages), so a change at one level invalidates that level and everything after it:

    Change                                Invalidates
    Modifying tool definitions            Entire cache (tools, system, messages)
    Toggling web search or citations      System and messages caches
    Changing tool_choice                  Messages cache
    Changing disable_parallel_tool_use    Messages cache
    Toggling images present/absent        Messages cache
    Changing thinking parameters          Messages cache

    If you need to vary tool_choice mid-conversation, consider placing cache breakpoints before the variation point.
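One way to apply that advice is to keep breakpoints at both the tools and system levels, so a tool_choice change costs only the messages-level cache. A sketch of such a request body as plain dicts (model name and prompt text are illustrative):

```python
# Breakpoints at two levels of the prefix hierarchy (tools -> system -> messages).
request = {
    "model": "claude-sonnet-4-5",   # illustrative model name
    "max_tokens": 1024,
    "tools": [
        {"name": "get_weather",
         "description": "Get the current weather in a given location",
         "input_schema": {"type": "object",
                          "properties": {"location": {"type": "string"}},
                          "required": ["location"]},
         "cache_control": {"type": "ephemeral"}},   # level 1: tools prefix
    ],
    "system": [
        {"type": "text",
         "text": "You are a helpful assistant.",
         "cache_control": {"type": "ephemeral"}},   # level 2: system prefix
    ],
    # Varying tool_choice per turn now invalidates only what follows the
    # system-level breakpoint: the messages cache.
    "tool_choice": {"type": "auto"},
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
}
```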

    Per-tool interaction table

    Tool              Caching considerations
    Web search        Enabling or disabling invalidates the system and messages caches
    Web fetch         Enabling or disabling invalidates the system and messages caches
    Code execution    Container state is independent of the prompt cache
    Tool search       Discovered tools load as tool_reference blocks, preserving the prefix cache
    Computer use      Screenshot presence affects the messages cache
    Text editor       Standard client tool; no special caching interaction
    Bash              Standard client tool; no special caching interaction
    Memory            Standard client tool; no special caching interaction

    Next steps

    Prompt caching

    Learn the full prompt caching model, including TTLs and pricing.

    Tool search

    Load tools on demand without breaking your cache.

    Tool reference

    Browse all available tools and their parameters.
