Claude Platform Docs
Streaming Messages

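Stream Messages API responses incrementally using server-sent events, including text, tool use, and extended thinking deltas.

When creating a Message, you can set "stream": true to stream the response incrementally using server-sent events (SSE).

Streaming with SDKs

The Python and TypeScript SDKs offer several ways to stream. The PHP SDK provides streaming through createStream(). The Python SDK supports both synchronous and asynchronous streams. See each SDK's documentation for details.

import anthropic

client = anthropic.Anthropic()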

with client.messages.stream(
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
    model="claude-opus-4-8",
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

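Getting the final message without handling events

If you don't need to process text as it arrives, the SDKs provide a way to use streaming under the hood while returning the complete Message object, identical to the one .create() returns. This is especially useful for requests with large max_tokens values, where the SDKs require streaming to avoid HTTP timeouts.

import anthropic

client = anthropic.Anthropic()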

with client.messages.stream(
    max_tokens=128000,
    messages=[{"role": "user", "content": "Write a detailed analysis..."}],
    model="claude-opus-4-8",
) as stream:
    message = stream.get_final_message()

print(message.content[0].text)

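The .stream() call keeps the HTTP connection alive with server-sent events, then .get_final_message() (Python) or .finalMessage() (TypeScript) accumulates all events and returns the complete Message object. In Go, you call message.Accumulate(event) inside the stream loop to build the same complete Message. In Java, use MessageAccumulator.create() and call accumulator.accumulate(event) on each event. In C#, await the .Aggregate() extension method on the stream to get the complete Message, or pass a MessageContentAggregator to .CollectAsync() to aggregate while still processing events. In Ruby, call .accumulated_message on the stream. In the PHP SDK, you need to iterate over the stream events manually to accumulate the response.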

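Event types

Each server-sent event includes a named event type and associated JSON data. Each event uses an SSE event name (for example, event: message_stop) and includes the matching event type in its data.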
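To make the pairing of SSE event name and JSON type concrete, the following is a minimal sketch of parsing one raw event. The helper name parse_sse_event is hypothetical and not part of any SDK, and it only handles the event: and data: fields that appear on this page, not the full SSE format.

```python
import json


def parse_sse_event(raw: str):
    """Parse one raw SSE event into (event_name, data_dict).

    Hypothetical helper for illustration. Only `event:` and `data:` fields
    are handled; other SSE fields and comments are ignored.
    """
    name = ""
    data_lines = []
    for line in raw.splitlines():
        if line.startswith("event:"):
            name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    payload = json.loads("\n".join(data_lines))
    return name, payload


name, payload = parse_sse_event('event: message_stop\ndata: {"type": "message_stop"}')
# The SSE event name and the `type` field inside the data are expected to match.
print(name == payload["type"])
```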

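Each stream uses the following event flow:

  1. message_start: contains a Message object with empty content.
  2. A series of content blocks, each of which has a content_block_start, one or more content_block_delta events, and a content_block_stop event. Each content block has an index that corresponds to its index in the final Message content array. There is one exception: during a server-side fallback response, fallback content blocks arrive at each model boundary as a content_block_start and content_block_stop pair with no deltas in between.
  3. One or more message_delta events, indicating top-level changes to the final Message object.
  4. A final message_stop event.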
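The flow above can be sketched as a simple fold over already-parsed events. This is an illustrative sketch, not the SDKs' accumulator: the function name accumulate is hypothetical, events are assumed to be plain dicts shaped like the data payloads on this page, and only text_delta is applied.

```python
def accumulate(events):
    """Fold parsed stream events (dicts) into one message dict.

    Hypothetical helper for illustration; the SDKs ship their own accumulators.
    """
    message = {}
    blocks = {}  # content block index -> block dict
    for event in events:
        kind = event.get("type")
        if kind == "message_start":
            # Copy so the caller's event is not mutated.
            message = dict(event["message"])
        elif kind == "content_block_start":
            blocks[event["index"]] = dict(event["content_block"])
        elif kind == "content_block_delta":
            delta = event["delta"]
            if delta["type"] == "text_delta":
                blocks[event["index"]]["text"] += delta["text"]
            # Other delta types are left out of this sketch.
        elif kind == "message_delta":
            # Top-level changes such as stop_reason and stop_sequence.
            message.update(event["delta"])
        # content_block_stop, message_stop, ping, and unknown types need no
        # action for this simple fold.
    # The block index is the position in the final content array.
    message["content"] = [blocks[i] for i in sorted(blocks)]
    return message
```

Keying blocks by index rather than appending in arrival order keeps the result aligned with the final content array even when a block receives no deltas, as in the fallback case.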


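The token counts shown in the usage field of the message_delta event are cumulative: each value is a running total, not an increment to add to earlier values.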
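Because the counts are cumulative, a client keeps the latest value instead of summing. The sketch below illustrates this; the function name final_output_tokens is hypothetical and events are assumed to be parsed dicts.

```python
def final_output_tokens(events):
    """Return the output token count reported by the most recent usage field.

    Hypothetical helper for illustration.
    """
    output_tokens = None
    for event in events:
        if event.get("type") == "message_start":
            usage = event["message"].get("usage", {})
            output_tokens = usage.get("output_tokens", output_tokens)
        elif event.get("type") == "message_delta":
            usage = event.get("usage", {})
            # Cumulative: overwrite the running value, do not add to it.
            output_tokens = usage.get("output_tokens", output_tokens)
    return output_tokens
```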

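Ping events

Event streams may also include any number of ping events.

Error events

The API may occasionally send errors in the event stream. For example, during periods of high usage you may receive an overloaded_error, which in a non-streaming context would normally correspond to an HTTP 529: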

Example error
event: error
data: {"type": "error", "error": {"type": "overloaded_error", "message": "Overloaded"}}
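Since an error can arrive after the HTTP response has already started, a client has to check for it inside the event loop. A minimal sketch follows; StreamError and raise_on_error are hypothetical names, not SDK types.

```python
class StreamError(Exception):
    """Hypothetical exception for an error event received mid-stream."""

    def __init__(self, error_type, message):
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type


def raise_on_error(event):
    """Raise StreamError for an error event; return any other event unchanged."""
    if event.get("type") == "error":
        err = event["error"]
        raise StreamError(err["type"], err["message"])
    return event
```

Calling raise_on_error on every parsed event lets the surrounding code decide, based on error_type, whether to retry the request.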

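Other events

In accordance with the versioning policy, new event types may be added, and your code should handle unknown event types gracefully.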
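One way to tolerate unknown event types is a dispatch table with a fallback that skips anything unrecognized. The sketch below is illustrative only; make_dispatcher and the event type some_future_event are hypothetical.

```python
def make_dispatcher(handlers):
    """Build a dispatch function from a {event type: handler} mapping."""
    skipped = []  # unknown event types seen so far, kept for logging

    def dispatch(event):
        handler = handlers.get(event.get("type"))
        if handler is None:
            # Unknown type: record it and move on instead of raising.
            skipped.append(event.get("type"))
            return None
        return handler(event)

    return dispatch, skipped


received = []
dispatch, skipped = make_dispatcher({
    "content_block_delta": lambda e: received.append(e["delta"]),
    "ping": lambda e: None,  # pings carry no content
})
dispatch({"type": "ping"})
dispatch({"type": "some_future_event"})  # hypothetical type; must not crash
dispatch({"type": "content_block_delta", "index": 0,
          "delta": {"type": "text_delta", "text": "Hello"}})
```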

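Content block delta types

Each content_block_delta event contains a delta of a specific type that updates the content block at the given index.

Text delta

A text content block delta looks like this: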

Text delta
event: content_block_delta
data: {"type": "content_block_delta","index": 0,"delta": {"type": "text_delta", "text": "ello frien"}}

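Input JSON delta

The deltas for tool_use content blocks correspond to updates to the block's input field. To support maximum granularity, the deltas are partial JSON strings, whereas the final tool_use.input is always an object.

You can accumulate the string deltas and parse the JSON once you receive the content_block_stop event, either by using a library like Pydantic to do partial JSON parsing or by using the SDKs, which provide helpers to access parsed incremental values.

A tool_use content block delta looks like this: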

Input JSON delta
event: content_block_delta
data: {"type": "content_block_delta","index": 1,"delta": {"type": "input_json_delta","partial_json": "{\"location\": \"San Fra"}}

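Note: Current models only support emitting one complete key and value property from input at a time. As a result, when using tools, there may be delays between streaming events while the model is working. Once an input key and value are accumulated, they are emitted as multiple content_block_delta events with chunked partial JSON, so that the format can automatically support finer granularity in future models.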
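The accumulate-then-parse approach described above can be sketched with the standard library alone. The function name collect_tool_inputs is hypothetical, events are assumed to be parsed dicts, and treating an empty accumulated string as an empty object is an assumption of this sketch.

```python
import json


def collect_tool_inputs(events):
    """Return {block index: parsed input dict} for tool-use blocks.

    Hypothetical helper: buffers partial_json strings per block and parses
    them only when the block's content_block_stop arrives.
    """
    buffers = {}
    inputs = {}
    for event in events:
        kind = event.get("type")
        if kind == "content_block_start":
            if event["content_block"]["type"] in ("tool_use", "server_tool_use"):
                buffers[event["index"]] = []
        elif kind == "content_block_delta":
            delta = event["delta"]
            if delta["type"] == "input_json_delta":
                buffers.setdefault(event["index"], []).append(delta["partial_json"])
        elif kind == "content_block_stop" and event["index"] in buffers:
            raw = "".join(buffers.pop(event["index"]))
            # Assumption: no fragments at all means an empty input object.
            inputs[event["index"]] = json.loads(raw) if raw else {}
    return inputs
```

Parsing only at content_block_stop avoids handing incomplete JSON to json.loads, at the cost of not seeing argument values until the block closes.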

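Thinking delta

When using extended thinking with streaming enabled, you receive thinking content via thinking_delta events. These deltas correspond to the thinking field of the thinking content blocks.

For thinking content, a special signature_delta event is sent just before the content_block_stop event. This signature is used to verify the integrity of the thinking block.

When display: "omitted" is set in the thinking configuration, no thinking_delta events are sent. The thinking block opens, receives a single signature_delta, and closes. See Controlling thinking display.

A typical thinking delta looks like this: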

Thinking delta
event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "I need to find the GCD of 1071 and 462 using the Euclidean algorithm.\n\n1071 = 2 × 462 + 147"}}

A signature delta looks like this:

Signature delta
event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "signature_delta", "signature": "EqQBCgIYAhIM1gbcDa9GJwZA2b3hGgxBdjrkzLoky3dl1pkiMOYds..."}}
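The two delta types can be collected per block as in the sketch below. The function name collect_thinking is hypothetical and events are assumed to be parsed dicts. The signature is assigned rather than concatenated, following the description of a single signature_delta per block.

```python
def collect_thinking(events):
    """Return {block index: {"thinking": str, "signature": str}}.

    Hypothetical helper for illustration.
    """
    blocks = {}
    for event in events:
        if event.get("type") != "content_block_delta":
            continue
        delta = event["delta"]
        if delta["type"] not in ("thinking_delta", "signature_delta"):
            continue
        block = blocks.setdefault(event["index"], {"thinking": "", "signature": ""})
        if delta["type"] == "thinking_delta":
            block["thinking"] += delta["thinking"]
        else:
            # With display "omitted" this is the only delta the block receives,
            # so "thinking" stays empty.
            block["signature"] = delta["signature"]
    return blocks
```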

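Full HTTP stream response

When using streaming mode, use the client SDKs. However, if you are building a direct API integration, you need to handle these events yourself.

A stream response consists of:

  1. A message_start event
  2. Potentially multiple content blocks, each of which contains:
    • A content_block_start event
    • Potentially multiple content_block_delta events
    • A content_block_stop event
  3. One or more message_delta events
  4. A message_stop event

Ping events may also be interspersed throughout the response. See Event types for more details on the format.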
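When testing a direct integration, it can help to check that a captured stream follows the structure listed above. The sketch below is one hypothetical way to do that (check_event_flow is not an SDK function). It ignores ping and any unrecognized event types, and it accepts a block with zero deltas.

```python
FLOW_TYPES = {
    "message_start", "content_block_start", "content_block_delta",
    "content_block_stop", "message_delta", "message_stop",
}


def check_event_flow(types):
    """Return True if a list of event type names follows the flow above.

    Hypothetical debugging aid. Pings and unknown types are filtered out.
    """
    seq = [t for t in types if t in FLOW_TYPES]
    if len(seq) < 3 or seq[0] != "message_start" or seq[-1] != "message_stop":
        return False
    body = seq[1:-1]
    i = 0
    # Zero or more content blocks: start, any number of deltas, stop.
    while i < len(body) and body[i] == "content_block_start":
        i += 1
        while i < len(body) and body[i] == "content_block_delta":
            i += 1
        if i >= len(body) or body[i] != "content_block_stop":
            return False
        i += 1
    # One or more message_delta events must follow the blocks.
    rest = body[i:]
    return len(rest) >= 1 and all(t == "message_delta" for t in rest)
```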

Basic streaming request

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-opus-4-8",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=256,
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Response
event: message_start
data: {"type": "message_start", "message": {"id": "msg_1nZdL29xx5MUA1yADyHTEsnR8uuvGzszyY", "type": "message", "role": "assistant", "content": [], "model": "claude-opus-4-8", "stop_reason": null, "stop_sequence": null, "usage": {"input_tokens": 25, "output_tokens": 1}}}

event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}

event: ping
data: {"type": "ping"}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "!"}}

event: content_block_stop
data: {"type": "content_block_stop", "index": 0}

event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn", "stop_sequence":null}, "usage": {"output_tokens": 15}}

event: message_stop
data: {"type": "message_stop"}

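Streaming request with tool use

Tool use supports fine-grained streaming of parameter values. Enable it per tool with eager_input_streaming.

This request asks Claude to use a tool to report the weather.

import anthropic

client = anthropic.Anthropic()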

tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                }
            },
            "required": ["location"],
        },
    }
]

with client.messages.stream(
    model="claude-opus-4-8",
    max_tokens=1024,
    tools=tools,
    tool_choice={"type": "any"},
    messages=[
        {"role": "user", "content": "What is the weather like in San Francisco?"}
    ],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Response
event: message_start
data: {"type":"message_start","message":{"id":"msg_014p7gG3wDgGV9EUtLvnow3U","type":"message","role":"assistant","model":"claude-opus-4-8","stop_sequence":null,"usage":{"input_tokens":472,"output_tokens":2},"content":[],"stop_reason":null}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: ping
data: {"type": "ping"}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Okay"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":","}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" let"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"'s"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" check"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" the"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" weather"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" for"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" San"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" Francisco"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":","}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" CA"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":":"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: content_block_start
data: {"type":"content_block_start","index":1,"content_block":{"type":"tool_use","id":"toolu_01T1x1fJ34qAmk2tNTrN7Up6","name":"get_weather","input":{}}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"{\"location\":"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" \"San"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" Francisc"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"o,"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" CA\"}"}}

event: content_block_stop
data: {"type":"content_block_stop","index":1}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"tool_use","stop_sequence":null},"usage":{"output_tokens":89}}

event: message_stop
data: {"type":"message_stop"}

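Streaming request with extended thinking

This request enables extended thinking with streaming. The display: "summarized" setting streams a condensed summary of Claude's reasoning rather than the full chain of thought.

import anthropic

client = anthropic.Anthropic()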

with client.messages.stream(
    model="claude-opus-4-8",
    max_tokens=20000,
    thinking={"type": "adaptive", "display": "summarized"},
    messages=[
        {
            "role": "user",
            "content": "What is the greatest common divisor of 1071 and 462?",
        }
    ],
) as stream:
    for event in stream:
        if event.type == "content_block_delta":
            if event.delta.type == "thinking_delta":
                print(event.delta.thinking, end="", flush=True)
            elif event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)

Response
event: message_start
data: {"type": "message_start", "message": {"id": "msg_01...", "type": "message", "role": "assistant", "content": [], "model": "claude-opus-4-8", "stop_reason": null, "stop_sequence": null}}

event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "thinking", "thinking": "", "signature": ""}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "I need to find the GCD of 1071 and 462 using the Euclidean algorithm.\n\n1071 = 2 × 462 + 147"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "\n462 = 3 × 147 + 21"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "\n147 = 7 × 21 + 0"}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "\nThe remainder is 0, so GCD(1071, 462) = 21."}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "signature_delta", "signature": "EqQBCgIYAhIM1gbcDa9GJwZA2b3hGgxBdjrkzLoky3dl1pkiMOYds..."}}

event: content_block_stop
data: {"type": "content_block_stop", "index": 0}

event: content_block_start
data: {"type": "content_block_start", "index": 1, "content_block": {"type": "text", "text": ""}}

event: content_block_delta
data: {"type": "content_block_delta", "index": 1, "delta": {"type": "text_delta", "text": "The greatest common divisor of 1071 and 462 is **21**."}}

event: content_block_stop
data: {"type": "content_block_stop", "index": 1}

event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn", "stop_sequence": null}}

event: message_stop
data: {"type": "message_stop"}

Streaming request with web search tool use

This request asks Claude to search the web for current weather information.

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-opus-4-8",
    max_tokens=1024,
    tools=[{"type": "web_search_20250305", "name": "web_search", "max_uses": 5}],
    messages=[
        {"role": "user", "content": "What is the weather like in New York City today?"}
    ],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Response
event: message_start
data: {"type":"message_start","message":{"id":"msg_01G...","type":"message","role":"assistant","model":"claude-opus-4-8","content":[],"stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":2679,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"output_tokens":3}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"I'll check"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" the current weather in New York City for you"}}

event: ping
data: {"type": "ping"}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"."}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: content_block_start
data: {"type":"content_block_start","index":1,"content_block":{"type":"server_tool_use","id":"srvtoolu_014hJH82Qum7Td6UV8gDXThB","name":"web_search","input":{}}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"{\"query"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"\":"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" \"weather"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" NY"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"C to"}}

event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"day\"}"}}

event: content_block_stop
data: {"type":"content_block_stop","index":1}

event: content_block_start
data: {"type":"content_block_start","index":2,"content_block":{"type":"web_search_tool_result","tool_use_id":"srvtoolu_014hJH82Qum7Td6UV8gDXThB","content":[{"type":"web_search_result","title":"Weather in New York City in May 2025 (New York) - detailed Weather Forecast for a month","url":"https://world-weather.info/forecast/usa/new_york/may-2025/","encrypted_content":"Ev0DCioIAxgCIiQ3NmU4ZmI4OC1k...","page_age":null},...]}}

event: content_block_stop
data: {"type":"content_block_stop","index":2}

event: content_block_start
data: {"type":"content_block_start","index":3,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":3,"delta":{"type":"text_delta","text":"Here's the current weather information for New York"}}

event: content_block_delta
data: {"type":"content_block_delta","index":3,"delta":{"type":"text_delta","text":" City:\n\n# Weather"}}

event: content_block_delta
data: {"type":"content_block_delta","index":3,"delta":{"type":"text_delta","text":" in New York City"}}

event: content_block_delta
data: {"type":"content_block_delta","index":3,"delta":{"type":"text_delta","text":"\n\n"}}

...

event: content_block_stop
data: {"type":"content_block_stop","index":17}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"input_tokens":10682,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"output_tokens":510,"server_tool_use":{"web_search_requests":1}}}

event: message_stop
data: {"type":"message_stop"}

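Error recovery

Claude 4.5 and earlier

For Claude 4.5 and earlier models, you can recover a streaming request that was interrupted by a network issue, timeout, or other error by resuming from the point where the stream broke off. This approach avoids reprocessing the entire response.

The basic recovery strategy involves:

  1. Capture the partial response: Save all content that was successfully received before the error occurred
  2. Construct a continuation request: Create a new API request that includes the partial assistant response as the beginning of a new assistant message
  3. Resume streaming: Continue receiving the rest of the response from where it was interrupted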
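Step 2 of the strategy above can be sketched as a function that builds the messages list for the continuation request. The name build_prefill_continuation is hypothetical. Stripping trailing whitespace from the partial text is an assumption of this sketch, made on the premise that an assistant turn ending in whitespace may be rejected.

```python
def build_prefill_continuation(original_messages, partial_text):
    """Return messages for a continuation request that prefills the assistant turn.

    Hypothetical helper for illustration; it does not call the API.
    """
    # Assumption: trim trailing whitespace so the prefilled assistant turn
    # does not end in whitespace.
    prefix = partial_text.rstrip()
    return [*original_messages, {"role": "assistant", "content": prefix}]


messages = build_prefill_continuation(
    [{"role": "user", "content": "Write a detailed analysis..."}],
    "The first factor to consider is ",
)
```

The full response is then the prefix stored in the last message followed by the text streamed from the continuation request.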

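Claude 4.6 and later

For Claude 4.6 and later models, the same capture-and-resume strategy applies, but step 2 changes: instead of placing the partial response in an assistant message, you add a user message that instructs the model to continue from where it left off.

  1. Capture the partial response: Save all content that was successfully received before the error occurred
  2. Construct a continuation request: Create a new API request with a user message that contains the partial response and an instruction to continue, for example:
    Sample prompt
    Your previous response was interrupted and ended with [previous_response]. Continue from where you left off.
  3. Resume streaming: Continue receiving the rest of the response from where it was interrupted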
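The user-message variant of step 2 can be sketched the same way. The name build_user_continuation is hypothetical, and appending the new user message after the original conversation is an assumption about how the request is laid out. The prompt wording follows the sample prompt above.

```python
def build_user_continuation(original_messages, partial_text):
    """Return messages for a continuation request that uses a user instruction.

    Hypothetical helper for illustration; it does not call the API.
    """
    prompt = (
        f"Your previous response was interrupted and ended with {partial_text}. "
        "Continue from where you left off."
    )
    # Assumption: the instruction is appended as a new user message.
    return [*original_messages, {"role": "user", "content": prompt}]


messages = build_user_continuation(
    [{"role": "user", "content": "Write a detailed analysis..."}],
    "The first factor to consider is",
)
```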

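Error recovery best practices

  1. Use SDK features: Take advantage of the SDKs' built-in message accumulation and error handling
  2. Handle content types: Be aware that messages can contain multiple content blocks (text, tool_use, thinking). Tool use and extended thinking blocks cannot be partially recovered. You can resume streaming from the most recent text block.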
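The second practice can be sketched as a filter over the blocks received before the interruption. The name split_for_resume and the per-block "complete" flag (set by the caller when content_block_stop arrives) are hypothetical conventions for this sketch.

```python
def split_for_resume(blocks):
    """Return (kept_blocks, resume_text) for a partially received message.

    Hypothetical helper. Each block is a dict with "type" and a "complete"
    flag that the caller sets when content_block_stop is received.
    """
    kept = []
    for block in blocks:
        if block["type"] != "text" and not block["complete"]:
            # Partial tool_use and thinking blocks cannot be recovered;
            # discard this block and anything after it.
            break
        kept.append(block)
    resume_text = ""
    for block in reversed(kept):
        if block["type"] == "text":
            # Resume from the most recent text block.
            resume_text = block["text"]
            break
    return kept, resume_text
```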

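Next steps

Stop reasons and fallback

Handle each stop_reason value after the stream completes.

Fine-grained tool streaming

Stream tool input JSON without server-side buffering to reduce latency.

Extended thinking

Stream extended thinking output via thinking_delta and signature_delta events.

Client SDKs

Use the official SDKs, which handle streaming, accumulation, and reconnection for you.

Batch processing

Process large volumes of requests asynchronously when you don't need real-time responses.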
