Was this page helpful?
Al crear un Message, puedes establecer "stream": true para transmitir incrementalmente la respuesta usando server-sent events (eventos enviados por el servidor), o SSE.
Los SDKs de Python y TypeScript ofrecen múltiples formas de hacer streaming. El SDK de PHP proporciona streaming a través de createStream(). El SDK de Python permite streams tanto síncronos como asíncronos. Consulta la documentación de cada SDK para obtener más detalles.
client = anthropic.Anthropic()
with client.messages.stream(
max_tokens=1024,
messages=[{"role": "user", "content": "Hello"}],
model="claude-opus-4-8",
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)Si no necesitas procesar el texto a medida que llega, los SDKs proporcionan una forma de usar streaming internamente mientras devuelven el objeto Message completo, idéntico a lo que devuelve .create(). Esto es especialmente útil para solicitudes con valores grandes de max_tokens, donde los SDKs requieren streaming para evitar tiempos de espera de HTTP.
La llamada a .stream() mantiene viva la conexión HTTP con eventos enviados por el servidor, luego .get_final_message() (Python) o .finalMessage() (TypeScript) acumula todos los eventos y devuelve el objeto Message completo. En Go, llamas a message.Accumulate(event) dentro del bucle del stream para construir el mismo Message completo. En Java, usa MessageAccumulator.create() y llama a accumulator.accumulate(event) en cada evento. En Ruby, llama a .accumulated_message en el stream. En el SDK de PHP, iteras manualmente sobre los eventos del stream para acumular la respuesta.
Cada evento enviado por el servidor incluye un tipo de evento con nombre y datos JSON asociados. Cada evento usa un nombre de evento SSE (por ejemplo, event: message_stop), e incluye el type de evento correspondiente en sus datos.
Cada stream usa el siguiente flujo de eventos:
message_start: contiene un objeto Message con content vacío.content_block_start, uno o más eventos content_block_delta y un evento content_block_stop. Cada bloque de contenido tiene un index que corresponde a su índice en el arreglo content del Message final. Una excepción: durante las respuestas de fallback del lado del servidor, un bloque de contenido fallback llega en cada límite de modelo como un par de content_block_start y content_block_stop sin deltas entre ellos.message_delta, que indican cambios de nivel superior en el objeto Message final.Los recuentos de tokens que se muestran en el campo usage del evento message_delta son acumulativos.
Los streams de eventos también pueden incluir cualquier número de eventos ping.
La API puede enviar ocasionalmente errores en el stream de eventos. Por ejemplo, durante períodos de alto uso, puedes recibir un overloaded_error, que normalmente correspondería a un HTTP 529 en un contexto sin streaming:
event: error
data: {"type": "error", "error": {"type": "overloaded_error", "message": "Overloaded"}}De acuerdo con la política de versionado, se pueden agregar nuevos tipos de eventos, y tu código debe manejar los tipos de eventos desconocidos de manera adecuada.
Cada evento content_block_delta contiene un delta de un tipo que actualiza el bloque content en un index dado.
Un delta de bloque de contenido text se ve así:
event: content_block_delta
data: {"type": "content_block_delta","index": 0,"delta": {"type": "text_delta", "text": "ello frien"}}Los deltas para bloques de contenido tool_use corresponden a actualizaciones del campo input del bloque. Para admitir la máxima granularidad, los deltas son cadenas JSON parciales, mientras que el tool_use.input final es siempre un objeto.
Puedes acumular los deltas de cadena y analizar el JSON una vez que recibas un evento content_block_stop, usando una biblioteca como Pydantic para hacer análisis parcial de JSON, o usando los SDKs, que proporcionan utilidades para acceder a valores incrementales analizados.
Un delta de bloque de contenido tool_use se ve así:
event: content_block_delta
data: {"type": "content_block_delta","index": 1,"delta": {"type": "input_json_delta","partial_json": "{\"location\": \"San Fra"}}}Nota: Los modelos actuales solo admiten emitir una clave y propiedad de valor completa de input a la vez. Por lo tanto, al usar herramientas, puede haber retrasos entre eventos de streaming mientras el modelo está trabajando. Una vez que se acumulan una clave y un valor de input, se emiten como múltiples eventos content_block_delta con JSON parcial fragmentado para que el formato pueda admitir automáticamente una granularidad más fina en modelos futuros.
Al usar pensamiento extendido con streaming habilitado, recibirás contenido de pensamiento a través de eventos thinking_delta. Estos deltas corresponden al campo thinking de los bloques de contenido thinking.
Para el contenido de pensamiento, se envía un evento especial signature_delta justo antes del evento content_block_stop. Esta firma se usa para verificar la integridad del bloque de pensamiento.
Cuando se establece display: "omitted" en la configuración de pensamiento, no se envían eventos thinking_delta. El bloque de pensamiento se abre, recibe un único signature_delta y se cierra. Consulta Controlar la visualización del pensamiento.
Un delta de pensamiento típico se ve así:
event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "I need to find the GCD of 1071 and 462 using the Euclidean algorithm.\n\n1071 = 2 × 462 + 147"}}El delta de firma se ve así:
event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "signature_delta", "signature": "EqQBCgIYAhIM1gbcDa9GJwZA2b3hGgxBdjrkzLoky3dl1pkiMOYds..."}}Usa los SDKs de cliente cuando uses el modo de streaming. Sin embargo, si estás construyendo una integración directa con la API, necesitas manejar estos eventos tú mismo.
Una respuesta de stream consiste en:
message_startcontent_block_startcontent_block_deltacontent_block_stopmessage_deltamessage_stopTambién puede haber eventos ping dispersos a lo largo de la respuesta. Consulta Tipos de eventos para obtener más detalles sobre el formato.
event: message_start
data: {"type": "message_start", "message": {"id": "msg_1nZdL29xx5MUA1yADyHTEsnR8uuvGzszyY", "type": "message", "role": "assistant", "content": [], "model": "claude-opus-4-8", "stop_reason": null, "stop_sequence": null, "usage": {"input_tokens": 25, "output_tokens": 1}}}
event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}
event: ping
data: {"type": "ping"}
event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}
event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "!"}}
event: content_block_stop
data: {"type": "content_block_stop", "index": 0}
event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn", "stop_sequence":null}, "usage": {"output_tokens": 15}}
event: message_stop
data: {"type": "message_stop"}
El uso de herramientas admite streaming de grano fino para valores de parámetros. Habilítalo por herramienta con eager_input_streaming.
Esta solicitud le pide a Claude que use una herramienta para informar el clima.
event: message_start
data: {"type":"message_start","message":{"id":"msg_014p7gG3wDgGV9EUtLvnow3U","type":"message","role":"assistant","model":"claude-opus-4-8","stop_sequence":null,"usage":{"input_tokens":472,"output_tokens":2},"content":[],"stop_reason":null}}
event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}
event: ping
data: {"type": "ping"}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Okay"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":","}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" let"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"'s"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" check"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" the"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" weather"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" for"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" San"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" Francisco"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":","}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" CA"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":":"}}
event: content_block_stop
data: {"type":"content_block_stop","index":0}
event: content_block_start
data: {"type":"content_block_start","index":1,"content_block":{"type":"tool_use","id":"toolu_01T1x1fJ34qAmk2tNTrN7Up6","name":"get_weather","input":{}}}
event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":""}}
event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"{\"location\":"}}
event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" \"San"}}
event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" Francisc"}}
event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"o,"}}
event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" CA\"}"}}
event: content_block_stop
data: {"type":"content_block_stop","index":1}
event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"tool_use","stop_sequence":null},"usage":{"output_tokens":89}}
event: message_stop
data: {"type":"message_stop"}Esta solicitud habilita el pensamiento extendido con streaming. La configuración display: "summarized" transmite un resumen condensado del razonamiento de Claude en lugar de la cadena completa de pensamiento.
event: message_start
data: {"type": "message_start", "message": {"id": "msg_01...", "type": "message", "role": "assistant", "content": [], "model": "claude-opus-4-8", "stop_reason": null, "stop_sequence": null}}
event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "thinking", "thinking": "", "signature": ""}}
event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "I need to find the GCD of 1071 and 462 using the Euclidean algorithm.\n\n1071 = 2 × 462 + 147"}}
event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "\n462 = 3 × 147 + 21"}}
event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "\n147 = 7 × 21 + 0"}}
event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "\nThe remainder is 0, so GCD(1071, 462) = 21."}}
event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "signature_delta", "signature": "EqQBCgIYAhIM1gbcDa9GJwZA2b3hGgxBdjrkzLoky3dl1pkiMOYds..."}}
event: content_block_stop
data: {"type": "content_block_stop", "index": 0}
event: content_block_start
data: {"type": "content_block_start", "index": 1, "content_block": {"type": "text", "text": ""}}
event: content_block_delta
data: {"type": "content_block_delta", "index": 1, "delta": {"type": "text_delta", "text": "The greatest common divisor of 1071 and 462 is **21**."}}
event: content_block_stop
data: {"type": "content_block_stop", "index": 1}
event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn", "stop_sequence": null}}
event: message_stop
data: {"type": "message_stop"}Esta solicitud le pide a Claude que busque en la web información actual sobre el clima.
event: message_start
data: {"type":"message_start","message":{"id":"msg_01G...","type":"message","role":"assistant","model":"claude-opus-4-8","content":[],"stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":2679,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"output_tokens":3}}}
event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"I'll check"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" the current weather in New York City for you"}}
event: ping
data: {"type": "ping"}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"."}}
event: content_block_stop
data: {"type":"content_block_stop","index":0}
event: content_block_start
data: {"type":"content_block_start","index":1,"content_block":{"type":"server_tool_use","id":"srvtoolu_014hJH82Qum7Td6UV8gDXThB","name":"web_search","input":{}}}
event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":""}}
event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"{\"query"}}
event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"\":"}}
event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" \"weather"}}
event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":" NY"}}
event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"C to"}}
event: content_block_delta
data: {"type":"content_block_delta","index":1,"delta":{"type":"input_json_delta","partial_json":"day\"}"}}
event: content_block_stop
data: {"type":"content_block_stop","index":1 }
event: content_block_start
data: {"type":"content_block_start","index":2,"content_block":{"type":"web_search_tool_result","tool_use_id":"srvtoolu_014hJH82Qum7Td6UV8gDXThB","content":[{"type":"web_search_result","title":"Weather in New York City in May 2025 (New York) - detailed Weather Forecast for a month","url":"https://world-weather.info/forecast/usa/new_york/may-2025/","encrypted_content":"Ev0DCioIAxgCIiQ3NmU4ZmI4OC1k...","page_age":null},...]}}
event: content_block_stop
data: {"type":"content_block_stop","index":2}
event: content_block_start
data: {"type":"content_block_start","index":3,"content_block":{"type":"text","text":""}}
event: content_block_delta
data: {"type":"content_block_delta","index":3,"delta":{"type":"text_delta","text":"Here's the current weather information for New York"}}
event: content_block_delta
data: {"type":"content_block_delta","index":3,"delta":{"type":"text_delta","text":" City:\n\n# Weather"}}
event: content_block_delta
data: {"type":"content_block_delta","index":3,"delta":{"type":"text_delta","text":" in New York City"}}
event: content_block_delta
data: {"type":"content_block_delta","index":3,"delta":{"type":"text_delta","text":"\n\n"}}
...
event: content_block_stop
data: {"type":"content_block_stop","index":17}
event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"input_tokens":10682,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"output_tokens":510,"server_tool_use":{"web_search_requests":1}}}
event: message_stop
data: {"type":"message_stop"}Para los modelos Claude 4.5 y anteriores, puedes recuperar una solicitud de streaming que se interrumpió debido a problemas de red, tiempos de espera u otros errores reanudando desde donde se interrumpió el stream. Este enfoque te evita volver a procesar toda la respuesta.
La estrategia básica de recuperación implica:
Para los modelos Claude 4.6 y posteriores, se aplica la misma estrategia de capturar y reanudar, pero el paso 2 cambia: en lugar de colocar la respuesta parcial en un mensaje del asistente, agrega un mensaje de usuario que le indique al modelo que continúe desde donde se quedó.
Your previous response was interrupted and ended with [previous_response]. Continue from where you left off.text, tool_use, thinking). Los bloques de uso de herramientas y de pensamiento extendido no se pueden recuperar parcialmente. Puedes reanudar el streaming desde el bloque de texto más reciente.client = anthropic.Anthropic()
with client.messages.stream(
max_tokens=128000,
messages=[{"role": "user", "content": "Write a detailed analysis..."}],
model="claude-opus-4-8",
) as stream:
message = stream.get_final_message()
print(message.content[0].text)message_stopclient = anthropic.Anthropic()
with client.messages.stream(
model="claude-opus-4-8",
messages=[{"role": "user", "content": "Hello"}],
max_tokens=256,
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)client = anthropic.Anthropic()
tools = [
{
"name": "get_weather",
"description": "Get the current weather in a given location",
"input_schema": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA",
}
},
"required": ["location"],
},
}
]
with client.messages.stream(
model="claude-opus-4-8",
max_tokens=1024,
tools=tools,
tool_choice={"type": "any"},
messages=[
{"role": "user", "content": "What is the weather like in San Francisco?"}
],
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)client = anthropic.Anthropic()
with client.messages.stream(
model="claude-opus-4-8",
max_tokens=20000,
thinking={"type": "adaptive", "display": "summarized"},
messages=[
{
"role": "user",
"content": "What is the greatest common divisor of 1071 and 462?",
}
],
) as stream:
for event in stream:
if event.type == "content_block_delta":
if event.delta.type == "thinking_delta":
print(event.delta.thinking, end="", flush=True)
elif event.delta.type == "text_delta":
print(event.delta.text, end="", flush=True)client = anthropic.Anthropic()
with client.messages.stream(
model="claude-opus-4-8",
max_tokens=1024,
tools=[{"type": "web_search_20250305", "name": "web_search", "max_uses": 5}],
messages=[
{"role": "user", "content": "What is the weather like in New York City today?"}
],
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)