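When you make a request to the Messages API, Claude's response includes a stop_reason field that indicates why the model stopped generating. Understanding these values is essential for building robust applications that handle each type of response appropriately.
For details on stop_reason in the API response, see the Messages API reference.
The stop_reason field is part of every successful Messages API response. Unlike errors, which indicate that a request failed to process, stop_reason tells you why Claude finished generating its response.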
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
end_turn is the most common stop reason. It indicates that Claude finished its response naturally.
from anthropic import Anthropic
client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)
Sometimes Claude returns an empty response (exactly 2-3 tokens with no content) with stop_reason: "end_turn". This typically happens when Claude interprets the assistant turn as already complete, particularly after tool results.
Common causes:
How to prevent empty responses:
# Incorrect: adding text immediately after tool_result
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {
        "role": "assistant",
        "content": [
            {
                "type": "tool_use",
                "id": "toolu_123",
                "name": "calculator",
                "input": {"operation": "add", "a": 1234, "b": 5678},
            }
        ],
    },
    {
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
            {
                "type": "text",
                "text": "Here's the result",  # Don't add text after tool_result
            },
        ],
    },
]
# Correct: send the tool result directly, with no extra text
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {
        "role": "assistant",
        "content": [
            {
                "type": "tool_use",
                "id": "toolu_123",
                "name": "calculator",
                "input": {"operation": "add", "a": 1234, "b": 5678},
            }
        ],
    },
    {
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
        ],
    },  # Just the tool_result, no additional text
]
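The difference between the two message lists above can also be checked mechanically before a request is sent. The following is a minimal sketch, not part of the SDK: find_text_after_tool_result is a hypothetical helper name, and it assumes messages are plain dictionaries shaped like the examples above.

```python
def find_text_after_tool_result(messages):
    """Return the indexes of user messages where a text block follows a tool_result.

    Hypothetical helper: assumes each message is a dict with "role" and
    "content", and that list-style content holds dict blocks with a "type".
    """
    flagged = []
    for index, message in enumerate(messages):
        content = message.get("content")
        # Plain-string content and assistant turns cannot have this problem
        if message.get("role") != "user" or not isinstance(content, list):
            continue
        seen_tool_result = False
        for block in content:
            block_type = block.get("type") if isinstance(block, dict) else None
            if block_type == "tool_result":
                seen_tool_result = True
            elif block_type == "text" and seen_tool_result:
                flagged.append(index)
                break
    return flagged
```

Run against the first list above, this flags the third message (index 2); the second list produces an empty result.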
# If you still get empty responses after fixing the message structure:
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-8", max_tokens=1024, messages=messages
    )
    # Check whether the response is empty
    if response.stop_reason == "end_turn" and not response.content:
        # Incorrect: don't simply retry with the empty response.
        # That won't work because Claude has already decided the task is done.
        # Correct: add a continuation prompt in a new user message
        messages.append({"role": "user", "content": "Please continue"})
        response = client.messages.create(
            model="claude-opus-4-8", max_tokens=1024, messages=messages
        )
    return response
Best practices:
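A stop_reason of max_tokens means Claude stopped because it reached the max_tokens limit specified in your request.
# Request with a limited token count
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=10,
    messages=[{"role": "user", "content": "Explain quantum physics"}],
)
if response.stop_reason == "max_tokens":
    # The response was truncated
    print("Response was cut off at token limit")
    # Consider making another request to continue
If Claude's response is truncated because it hit the max_tokens limit, and the truncated response contains an incomplete tool use block, you need to retry the request with a higher max_tokens value to get the complete tool use.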
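# Check whether the response was truncated during tool use
if response.stop_reason == "max_tokens":
    # Check whether the last content block is an incomplete tool_use
    last_block = response.content[-1]
    if last_block.type == "tool_use":
        # Resend the request with a higher max_tokens
        # (messages and tools are the values from the original request)
        response = client.messages.create(
            model="claude-opus-4-8",
            max_tokens=4096,  # Increased limit
            messages=messages,
            tools=tools,
        )
A stop_reason of stop_sequence means Claude encountered one of your custom stop sequences.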
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    stop_sequences=["END", "STOP"],
    messages=[{"role": "user", "content": "Generate text until you say END"}],
)
if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
A stop_reason of tool_use means Claude is calling a tool and expects you to execute it.
For most tool use implementations, use the tool runner, which automatically handles tool execution, result formatting, and conversation management.
from anthropic import Anthropic
client = Anthropic()
weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather in a given location",
    "input_schema": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City and state"},
        },
        "required": ["location"],
    },
}
def execute_tool(name, tool_input):
    """Execute a tool and return the result."""
    return f"Weather in {tool_input.get('location', 'unknown')}: 72°F"
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[weather_tool],
    messages=[{"role": "user", "content": "What's the weather?"}],
)
if response.stop_reason == "tool_use":
    # Extract and execute the tools
    for content in response.content:
        if content.type == "tool_use":
            result = execute_tool(content.name, content.input)
            # Return the result to Claude to get the final response
pause_turn is returned when the server-side sampling loop reaches its iteration limit while executing server tools such as web search or web fetch. The default limit is 10 iterations per request.
When this happens, the response may contain a server_tool_use block without a corresponding server_tool_result. To let Claude finish processing, continue the conversation by sending the response back as-is.
original_query = "Search for latest AI news"
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    tools=[{"type": "web_search_20250305", "name": "web_search"}],
    messages=[{"role": "user", "content": original_query}],
)
if response.stop_reason == "pause_turn":
    # Continue the conversation by sending the response back
    messages = [
        {"role": "user", "content": original_query},
        {"role": "assistant", "content": response.content},
    ]
    continuation = client.messages.create(
        model="claude-opus-4-8",
        max_tokens=1024,
        messages=messages,
        tools=[{"type": "web_search_20250305", "name": "web_search"}],
    )
Your application should handle pause_turn in any agent loop that uses server tools. Simply append the assistant's response to your messages array and make another API request so Claude can continue.
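As a small illustration of that step, the sketch below builds the message list for the follow-up request. The helper name with_paused_turn is hypothetical, and it assumes the paused response's content can be passed back unchanged as the assistant turn, as the example above does.

```python
def with_paused_turn(messages, paused_content):
    """Return a new message list with the paused assistant turn appended.

    Hypothetical helper. The input list is left unmodified, so the caller
    can still retry from the original conversation if needed.
    """
    return [*messages, {"role": "assistant", "content": paused_content}]
```

A caller would then pass the returned list as messages in the next request, together with the same tools as before.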
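A stop_reason of refusal means Claude declined to generate a response. On Claude Fable 5, the safety classifier returns this stop reason in a normal HTTP 200 response rather than as an error.
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{"role": "user", "content": "[Unsafe request]"}],
)
if response.stop_reason == "refusal":
    # Claude declined to respond
    print("Claude was unable to process this request")
    # Consider rephrasing or modifying the request
If you encounter refusal stop reasons frequently while using Claude Sonnet 4.5 or Opus 4.1 (deprecated), you can try updating your API calls to use Haiku 4.5 (claude-haiku-4-5-20251001), which has different usage restrictions. Learn more in Understanding Sonnet 4.5's API safety filters.
When a refusal occurs, the stop_details object identifies the policy category that triggered it. The categories and the full refusal response structure are described in Refusals and fallback. For every stop reason other than refusal, stop_details is null.
Requests refused on Claude Fable 5 can often be handled by retrying on another Claude model. Refusals and fallback explains how to configure that retry server-side or in your client. Fallback credits explains how to avoid paying prompt caching costs twice when you build the retry yourself.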
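A client-side retry of the kind described above might look like the following sketch. It is an illustration only: create_with_fallback and its parameters are hypothetical names, the create argument stands in for a callable such as client.messages.create, and the sketch neither inspects stop_details nor addresses prompt caching costs.

```python
def create_with_fallback(create, request, fallback_models):
    """Send a request and retry on other models while the result is a refusal.

    create: callable that accepts the request as keyword arguments
            (assumed to behave like client.messages.create).
    request: dict of keyword arguments, including "model".
    fallback_models: model names to try, in order, after a refusal.
    """
    response = create(**request)
    for model in fallback_models:
        if response.stop_reason != "refusal":
            break
        # Same request, different model
        response = create(**{**request, "model": model})
    return response
```

If every model in the list also refuses, the last refusal is returned, so the caller still needs to check stop_reason on the result.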
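A stop_reason of model_context_window_exceeded means Claude stopped because it reached the model's context window limit. This lets you request the maximum possible number of tokens without knowing the exact input size.
# Request with the maximum token count to get as much output as possible
response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=20000,  # Python SDK requires streaming for max_tokens above ~21k (Opus 4.8 supports 128k with streaming)
    messages=[
        {"role": "user", "content": "Large input that uses most of context window..."}
    ],
)
if response.stop_reason == "model_context_window_exceeded":
    # The response hit the context window limit before reaching max_tokens
    print("Response reached model's context window limit")
    # The response is still valid, but was limited by the context window
This stop reason is available by default in Sonnet 4.5 and newer models. For earlier models, use the beta header model-context-window-exceeded-2025-08-26 to enable this behavior.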
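For earlier models, the beta header has to be sent with the request. The sketch below only assembles the keyword arguments. build_request_kwargs and needs_beta_header are hypothetical names, and the sketch assumes the header is sent as anthropic-beta through the Python SDK's extra_headers option; check the API reference for the mechanism that applies to your SDK version.

```python
BETA_HEADER_VALUE = "model-context-window-exceeded-2025-08-26"


def build_request_kwargs(model, messages, max_tokens, needs_beta_header=False):
    """Assemble keyword arguments for a Messages API request.

    needs_beta_header is decided by the caller; this sketch does not try to
    work out which models count as "earlier".
    """
    kwargs = {"model": model, "messages": messages, "max_tokens": max_tokens}
    if needs_beta_header:
        # Assumption: beta features are enabled through the anthropic-beta header
        kwargs["extra_headers"] = {"anthropic-beta": BETA_HEADER_VALUE}
    return kwargs
```

The result can be unpacked into the call, for example client.messages.create(**build_request_kwargs(...)).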
Make it a habit to check stop_reason in your response handling logic:
def handle_response(response):
    # The handle_* functions are placeholders for your own handlers
    if response.stop_reason == "tool_use":
        return handle_tool_use(response)
    elif response.stop_reason == "max_tokens":
        return handle_truncation(response)
    elif response.stop_reason == "model_context_window_exceeded":
        return handle_context_limit(response)
    elif response.stop_reason == "pause_turn":
        return handle_pause(response)
    elif response.stop_reason == "refusal":
        return handle_refusal(response)
    else:
        # Handle end_turn and other cases
        return response.content[0].text
When a response is truncated because of a token limit or the context window:
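def handle_truncated_response(
    client, original_prompt, response, continue_generation=False
):
    # client, original_prompt, and continue_generation are passed in explicitly
    # so that both options below are reachable and self-contained
    if response.stop_reason not in ["max_tokens", "model_context_window_exceeded"]:
        return response.content[0].text
    if not continue_generation:
        # Option 1: Warn the user that a specific limit was reached
        if response.stop_reason == "max_tokens":
            message = "[Response truncated due to max_tokens limit]"
        else:
            message = "[Response truncated due to context window limit]"
        return f"{response.content[0].text}\n\n{message}"
    # Option 2: Continue generating
    messages = [
        {"role": "user", "content": original_prompt},
        {"role": "assistant", "content": response.content[0].text},
    ]
    continuation = client.messages.create(
        model="claude-opus-4-8",
        max_tokens=1024,
        messages=messages + [{"role": "user", "content": "Please continue"}],
    )
    return response.content[0].text + continuation.content[0].text
When using server tools, the API may return pause_turn if the server-side sampling loop reaches its iteration limit (10 by default). Handle this by continuing the conversation: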
def handle_server_tool_conversation(client, user_query, tools, max_continuations=5):
    """
    Handle server tool conversations that may require multiple continuations.
    The server runs a sampling loop when executing server tools. If the loop
    reaches its iteration limit, the API returns pause_turn. Continue the
    conversation by sending the response back to let Claude finish.
    """
    messages = [{"role": "user", "content": user_query}]
    for _ in range(max_continuations):
        response = client.messages.create(
            model="claude-opus-4-8", max_tokens=1024, messages=messages, tools=tools
        )
        if response.stop_reason != "pause_turn":
            # Claude finished processing - return the final response
            return response
        # pause_turn: replace the whole message list to keep roles alternating
        messages = [
            {"role": "user", "content": user_query},
            {"role": "assistant", "content": response.content},
        ]
    # Maximum continuations reached - return the last response
    return response
It is important to distinguish stop_reason values from actual errors:
import anthropic
from anthropic import Anthropic
client = Anthropic()
try:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello!"}],
    )
    # Handle a successful response with its stop_reason
    if response.stop_reason == "max_tokens":
        print("Response was truncated")
except anthropic.APIStatusError as e:
    # Handle actual errors
    if e.status_code == 429:
        print("Rate limit exceeded")
    elif e.status_code == 500:
        print("Server error")
When using streaming, stop_reason is:
- null in the message_start event
- provided in the message_delta event
from anthropic import Anthropic
client = Anthropic()
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
) as stream:
    for event in stream:
        if event.type == "message_delta":
            stop_reason = event.delta.stop_reason
            if stop_reason:
                print(f"Stream ended with: {stop_reason}")
Simpler with the tool runner: The following example shows manual tool handling. For most use cases, the tool runner handles tool execution automatically with less code.
def complete_tool_workflow(client, user_query, tools):
    messages = [{"role": "user", "content": user_query}]
    while True:
        response = client.messages.create(
            model="claude-opus-4-8", max_tokens=1024, messages=messages, tools=tools
        )
        if response.stop_reason == "tool_use":
            # Execute the tools and continue
            # (execute_tools is a placeholder for your own tool execution logic)
            tool_results = execute_tools(response.content)
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})
        else:
            # Final response
            return response
def get_complete_response(client, prompt, max_attempts=3):
    messages = [{"role": "user", "content": prompt}]
    full_response = ""
    for _ in range(max_attempts):
        response = client.messages.create(
            model="claude-opus-4-8", messages=messages, max_tokens=4096
        )
        full_response += response.content[0].text
        if response.stop_reason != "max_tokens":
            break
        # Continue from where the response left off
        messages = [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": full_response},
            {"role": "user", "content": "Please continue from where you left off."},
        ]
    return full_response
With the model_context_window_exceeded stop reason, you can request the maximum possible number of tokens without calculating the input size:
def get_max_possible_tokens(client, prompt):
    """
    Get as many tokens as possible within the model's context window
    without needing to calculate input token count
    """
    response = client.messages.create(
        model="claude-opus-4-8",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=20000,  # Python SDK requires streaming for max_tokens above ~21k
    )
    if response.stop_reason == "model_context_window_exceeded":
        # Got the maximum possible number of tokens for the given input size
        print(
            f"Generated {response.usage.output_tokens} tokens (context limit reached)"
        )
    elif response.stop_reason == "max_tokens":
        # Got exactly the number of tokens requested
        print(f"Generated {response.usage.output_tokens} tokens (max_tokens reached)")
    else:
        # Natural completion
        print(f"Generated {response.usage.output_tokens} tokens (natural completion)")
    return response.content[0].text
By handling stop_reason values correctly, you can build more robust applications that gracefully handle different response scenarios and provide a better user experience.