This feature is eligible for Zero Data Retention (ZDR). When your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned.
Fine-grained tool streaming delivers a tool's input to your client as Claude generates it, without server-side buffering or JSON validation. Skipping the buffering step reduces the time to the first fragment of a large parameter, such as a document or a block of code, and the fragments arrive through the same Streaming messages events as standard tool use.
Because the API does not buffer or validate a tool's input before streaming it, you might receive partial or invalid JSON. A response that ends with the stop reason max_tokens can also cut a parameter off midway. Accumulate the fragments, guard the parse, and see Handling invalid JSON in tool responses for how to return unparseable input to Claude.
All models support fine-grained tool streaming on the Claude API, Claude Platform on AWS, Amazon Bedrock, Google Cloud, and Microsoft Foundry. To use it, set eager_input_streaming to true on any user-defined tool where you want fine-grained streaming enabled, and enable streaming on your request.
The eager_input_streaming field is optional. Setting it to true turns on fine-grained streaming for that tool, and omitting it gives you standard buffered streaming, in which the API buffers and validates each parameter value before streaming it back. The exception is a request that still sends the legacy fine-grained-tool-streaming-2025-05-14 beta header, which turns fine-grained streaming on for tools that leave the field unset. The per-tool field replaces that header, and an explicit false keeps buffered streaming for a tool even when a request still sends it. See Tool reference for the field definition.
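The precedence between the per-tool field and the legacy header can be written out as a small decision function. This is a minimal sketch, not part of any SDK: `effective_streaming_mode` is a hypothetical helper that only models the rules described in this section, and it makes no API calls.

```python
from typing import Optional

LEGACY_BETA = "fine-grained-tool-streaming-2025-05-14"


def effective_streaming_mode(
    eager_input_streaming: Optional[bool],
    beta_headers: tuple[str, ...] = (),
) -> str:
    """Return "fine-grained" or "buffered" for one tool (hypothetical helper)."""
    if eager_input_streaming is not None:
        # An explicit value always wins, including an explicit False.
        return "fine-grained" if eager_input_streaming else "buffered"
    # Field unset: only the legacy beta header turns fine-grained streaming on.
    return "fine-grained" if LEGACY_BETA in beta_headers else "buffered"
```

Passing `None` stands for a tool definition that omits the field entirely.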
The following example turns on fine-grained streaming for a make_file tool and asks Claude for a long poem, so the tool input is large enough to watch it stream in:
import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    max_tokens=65536,
    model="claude-opus-4-8",
    tools=[
        {
            "name": "make_file",
            "description": "Write text to a file",
            # Stream this tool's input without buffering or validation.
            "eager_input_streaming": True,
            "input_schema": {
                "type": "object",
                "properties": {
                    "filename": {
                        "type": "string",
                        "description": "The filename to write text to",
                    },
                    "lines_of_text": {
                        "type": "array",
                        "description": "An array of lines of text to write to the file",
                    },
                },
                "required": ["filename", "lines_of_text"],
            },
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Can you write a long poem and make a file called poem.txt?",
        }
    ],
) as stream:
    # Print each tool-input fragment as soon as it arrives.
    for event in stream:
        if event.type == "input_json":
            print(event.partial_json, end="", flush=True)
    # The SDK has accumulated the full message by the time the stream ends.
    final_message = stream.get_final_message()

print()
for block in final_message.content:
    if block.type == "tool_use":
        print(f"Complete tool input: {block.input}")

Every tab turns on fine-grained streaming for the make_file tool. The SDK tabs print each input fragment the moment it arrives, then print the complete accumulated input once the stream ends. The cURL tab shows the raw event stream, and the CLI tab uses jq to print just the fragments. Because the printed fragments join into the full tool input, the poem fills your terminal as Claude writes it:
{"filename": "poem.txt", "lines_of_text": ["The Wanderer's Journey", "", "I.", "", "Beneath the vast and star-strewn sky,", "Where silver moonbeams softly lie,", ...
Complete tool input: {"filename": "poem.txt", "lines_of_text": ["The Wanderer's Journey", ...]}

Without eager_input_streaming, the API buffers and validates each parameter value before streaming it back, so nothing prints for a large parameter until Claude has finished generating it. With it, fragments start arriving as soon as Claude begins the parameter, and they are typically longer, with fewer mid-word breaks.
The accumulation contract is the same as for standard tool-use streaming, so this section applies with and without eager_input_streaming. See Input JSON delta in Streaming messages for the event format. Fine-grained tool streaming changes what you can assume about the result: the server streams fragments without validating them, so the accumulated string might not be valid JSON.
When a tool_use content block streams, the initial content_block_start event contains input: {} (an empty object). This is a placeholder. The actual input arrives as a series of input_json_delta events, each carrying a partial_json string fragment. To assemble the full input, concatenate these fragments and parse the result when the block closes.
Where your SDK provides an accumulator helper (as the Python, TypeScript, Go, Java, and Ruby tabs in the previous example do), it handles this for you. The manual pattern is for SDKs without a helper, or when you want full control over how the input is assembled.
The accumulation contract:
1. On content_block_start with type: "tool_use", initialize an empty string: input_json = ""
2. On content_block_delta with type: "input_json_delta", append: input_json += event.delta.partial_json
3. On content_block_stop, parse the accumulated string

Guard the parse, as the following SDK examples do. A response can also stop at max_tokens midway through a parameter. Check the stop reason and decide whether to retry the request with a higher max_tokens or repair the partial input.
The type mismatch between the initial input: {} (object) and partial_json (string) is by design. The empty object marks the slot in the content array. The delta strings build the real value.
import json

import anthropic

client = anthropic.Anthropic()

tool_inputs: dict[int, str] = {}  # index -> accumulated JSON string

with client.messages.stream(
    model="claude-opus-4-8",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "eager_input_streaming": True,
            "input_schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
    messages=[{"role": "user", "content": "Weather in Paris?"}],
) as stream:
    for event in stream:
        match event.type:
            # A tool_use block opens: start an empty buffer for its index.
            case "content_block_start" if event.content_block.type == "tool_use":
                tool_inputs[event.index] = ""
            # A fragment arrives: append it to that block's buffer.
            case "content_block_delta" if event.delta.type == "input_json_delta":
                tool_inputs[event.index] += event.delta.partial_json
            # The block closes: parse the accumulated string, guarding the parse.
            case "content_block_stop" if event.index in tool_inputs:
                raw_input = tool_inputs[event.index]
                try:
                    parsed = json.loads(raw_input)
                except json.JSONDecodeError:
                    # The accumulated string is not guaranteed to be valid JSON.
                    # See "Handling invalid JSON in tool responses" on this page.
                    print(f"Invalid tool input: {raw_input}")
                else:
                    print(f"Tool input: {parsed}")

Reacting to fragments and assembling them are separate concerns. The first example reacts to each fragment as it arrives and still hands assembly to the SDK in the tabs that use an accumulator helper. Use the manual pattern when you are not using an accumulator helper or when you want full control over assembly.
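The stop-reason check from the accumulation contract can be sketched as a small decision function. This is a minimal sketch under stated assumptions: `classify_tool_input` and its three labels are hypothetical names, not SDK features, and the function expects the accumulated string together with the stop reason reported for the response.

```python
import json
from typing import Any, Optional


def classify_tool_input(
    raw_input: str, stop_reason: Optional[str]
) -> tuple[str, Any]:
    """Classify accumulated tool input (hypothetical helper).

    Returns ("ok", parsed), ("truncated", raw_input), or ("invalid", raw_input).
    """
    try:
        return "ok", json.loads(raw_input)
    except json.JSONDecodeError:
        if stop_reason == "max_tokens":
            # The response hit the token limit, so the input was likely cut
            # off: retry with a higher max_tokens or repair the partial input.
            return "truncated", raw_input
        # The input is unparseable for another reason: report it to Claude.
        return "invalid", raw_input
```

Separating the two failure labels keeps the retry decision apart from the error-reporting path covered in Handling invalid JSON in tool responses.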
With fine-grained tool streaming, the accumulated input for a tool call might be invalid or incomplete JSON. When it is, you cannot run the tool, so report the failure back to Claude instead. The content of a tool result does not have to be JSON, but wrapping the raw string in a JSON object under a single key makes it unambiguous to Claude that you received invalid JSON, and preserves the original input for debugging:
{
  "INVALID_JSON": "<the unparseable input you received>"
}

Return the wrapper, serialized to a string, as the content of a tool result content block with is_error set to true:
{
  "type": "tool_result",
  "tool_use_id": "toolu_01A09q90qw90lq917835lq9",
  "is_error": true,
  "content": "{\"INVALID_JSON\": \"<the unparseable input you received>\"}"
}

Build the wrapper with your JSON library rather than by concatenating strings, so quotes and other special characters in the invalid input are escaped correctly.
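As a minimal sketch of building the wrapper with a JSON library, the following uses Python's `json` module. `build_invalid_json_result` is a hypothetical helper name, and the block shape follows the tool result example above.

```python
import json


def build_invalid_json_result(tool_use_id: str, raw_input: str) -> dict:
    """Build a tool_result block reporting unparseable input (hypothetical helper)."""
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "is_error": True,
        # json.dumps escapes quotes, backslashes, and control characters
        # in the raw input, so the serialized wrapper is always valid JSON.
        "content": json.dumps({"INVALID_JSON": raw_input}),
    }


result = build_invalid_json_result("toolu_01A09q90qw90lq917835lq9", '{"city": "Par')
print(result["content"])
```

Because the wrapper is serialized rather than concatenated, parsing `content` again returns the raw input unchanged, which helps when debugging.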