Loading...
    • Developer Guide
    • API Reference
    • MCP
    • Resources
    • Release Notes
    Search...
    ⌘K
    First steps
    Intro to ClaudeQuickstart
    Models & pricing
    Models overviewChoosing a modelWhat's new in Claude 4.6Migration guideModel deprecationsPricing
    Build with Claude
    Features overviewUsing the Messages APIHandling stop reasonsPrompting best practices
    Context management
    Context windowsCompactionContext editing
    Capabilities
    Prompt cachingExtended thinkingAdaptive thinkingEffortFast mode (research preview)Streaming MessagesBatch processingCitationsMultilingual supportToken countingEmbeddingsVisionPDF supportFiles APISearch resultsStructured outputs
    Tools
    OverviewHow to implement tool useFine-grained tool streamingBash toolCode execution toolProgrammatic tool callingComputer use toolText editor toolWeb fetch toolWeb search toolMemory toolTool search tool
    Agent Skills
    OverviewQuickstartBest practicesSkills for enterpriseUsing Skills with the API
    Agent SDK
    OverviewQuickstartTypeScript SDKTypeScript V2 (preview)Python SDKMigration Guide
    MCP in the API
    MCP connectorRemote MCP servers
    Claude on 3rd-party platforms
    Amazon BedrockMicrosoft FoundryVertex AI
    Prompt engineering
    OverviewPrompt generatorUse prompt templatesPrompt improverBe clear and directUse examples (multishot prompting)Let Claude think (CoT)Use XML tagsGive Claude a role (system prompts)Chain complex promptsLong context tipsExtended thinking tips
    Test & evaluate
    Define success criteriaDevelop test casesUsing the Evaluation ToolReducing latency
    Strengthen guardrails
    Reduce hallucinationsIncrease output consistencyMitigate jailbreaksStreaming refusalsReduce prompt leakKeep Claude in character
    Administration and monitoring
    Admin API overviewData residencyWorkspacesUsage and Cost APIClaude Code Analytics APIZero Data Retention
    Console
    Log in
    Loading...
    Loading...
    Loading...
    Loading...
    Loading...
    Loading...
    Loading...
    Loading...
    Loading...
    Loading...
    Loading...
    Loading...
    Loading...
    Loading...
    Loading...
    Loading...

    Solutions

    • AI agents
    • Code modernization
    • Coding
    • Customer support
    • Education
    • Financial services
    • Government
    • Life sciences

    Partners

    • Amazon Bedrock
    • Google Cloud's Vertex AI

    Learn

    • Blog
    • Catalog
    • Courses
    • Use cases
    • Connectors
    • Customer stories
    • Engineering at Anthropic
    • Events
    • Powered by Claude
    • Service partners
    • Startups program

    Company

    • Anthropic
    • Careers
    • Economic Futures
    • Research
    • News
    • Responsible Scaling Policy
    • Security and compliance
    • Transparency

    Learn

    • Blog
    • Catalog
    • Courses
    • Use cases
    • Connectors
    • Customer stories
    • Engineering at Anthropic
    • Events
    • Powered by Claude
    • Service partners
    • Startups program

    Help and security

    • Availability
    • Status
    • Support
    • Discord

    Terms and policies

    • Privacy policy
    • Responsible disclosure policy
    • Terms of service: Commercial
    • Terms of service: Consumer
    • Usage policy
    Tools

    Fine-grained tool streaming

    Fine-grained tool streaming is generally available on all models and all platforms, with no beta header required. It enables streaming of tool use parameter values without buffering or JSON validation, reducing the latency to begin receiving large parameters.

    When using fine-grained tool streaming, you may potentially receive invalid or partial JSON inputs. Make sure to account for these edge cases in your code.

    How to use fine-grained tool streaming

    Fine-grained tool streaming is available on all models and all platforms (Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry). To use it, set eager_input_streaming to true on any tool where you want fine-grained streaming enabled, and enable streaming on your request.

    Here's an example of how to use fine-grained tool streaming with the API:

    curl https://api.anthropic.com/v1/messages \
      -H "content-type: application/json" \
      -H "x-api-key: $ANTHROPIC_API_KEY" \
      -H "anthropic-version: 2023-06-01" \
      -d '{
        "model": "claude-opus-4-6",
        "max_tokens": 65536,
        "tools": [
          {
            "name": "make_file",
            "description": "Write text to a file",
            "eager_input_streaming": true,
            "input_schema": {
              "type": "object",
              "properties": {
                "filename": {
                  "type": "string",
                  "description": "The filename to write text to"
                },
                "lines_of_text": {
                  "type": "array",
                  "description": "An array of lines of text to write to the file"
                }
              },
              "required": ["filename", "lines_of_text"]
            }
          }
        ],
        "messages": [
          {
            "role": "user",
            "content": "Can you write a long poem and make a file called poem.txt?"
          }
        ],
        "stream": true
      }' | jq '.usage'

    In this example, fine-grained tool streaming enables Claude to stream the lines of a long poem into the tool call make_file without buffering to validate if the lines_of_text parameter is valid JSON. This means you can see the parameter stream as it arrives, without having to wait for the entire parameter to buffer and validate.

    With fine-grained tool streaming, tool use chunks start streaming faster, and are often longer and contain fewer word breaks. This is due to differences in chunking behavior.

    Example:

    Without fine-grained streaming (15s delay):

    Chunk 1: '{"'
    Chunk 2: 'query": "Ty'
    Chunk 3: 'peScri'
    Chunk 4: 'pt 5.0 5.1 '
    Chunk 5: '5.2 5'
    Chunk 6: '.3'
    Chunk 8: ' new f'
    Chunk 9: 'eatur'
    ...

    With fine-grained streaming (3s delay):

    Chunk 1: '{"query": "TypeScript 5.0 5.1 5.2 5.3'
    Chunk 2: ' new features comparison'

    Because fine-grained streaming sends parameters without buffering or JSON validation, there is no guarantee that the resulting stream will complete in a valid JSON string. Particularly, if the stop reason max_tokens is reached, the stream may end midway through a parameter and may be incomplete. You will generally have to write specific support to handle when max_tokens is reached.

    Handling invalid JSON in tool responses

    When using fine-grained tool streaming, you may receive invalid or incomplete JSON from the model. If you need to pass this invalid JSON back to the model in an error response block, you may wrap it in a JSON object to ensure proper handling (with a reasonable key). For example:

    {
      "INVALID_JSON": "<your invalid json string>"
    }

    This approach helps the model understand that the content is invalid JSON while preserving the original malformed data for debugging purposes.

    When wrapping invalid JSON, make sure to properly escape any quotes or special characters in the invalid JSON string to maintain valid JSON structure in the wrapper object.

    Was this page helpful?

    • How to use fine-grained tool streaming
    • Handling invalid JSON in tool responses