    Handling stop reasons

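    When you send a request to the Messages API, Claude's response includes a stop_reason field that indicates why the model stopped generating. Understanding these values is essential for building robust applications that handle each response type appropriately.

    For details about stop_reason in the API response, see the Messages API reference.

    What is stop_reason?

    The stop_reason field is part of every successful Messages API response. Unlike errors, which indicate that processing the request failed, stop_reason tells you why Claude successfully finished generating its response.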

    Example response
    {
      "id": "msg_01234",
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "Here's the answer to your question..."
        }
      ],
      "stop_reason": "end_turn",
      "stop_sequence": null,
      "usage": {
        "input_tokens": 100,
        "output_tokens": 50
      }
    }
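
    As a minimal sketch of reading this field, the snippet below parses a body shaped like the example response and branches on stop_reason. The describe_stop helper and the strings it returns are illustrative assumptions, not part of the API.

```python
import json

# Response body shaped like the example above (values are illustrative).
raw_body = """
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Here's the answer to your question..."}],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 100, "output_tokens": 50}
}
"""


def describe_stop(message: dict) -> str:
    """Hypothetical helper: summarize why generation stopped."""
    reason = message.get("stop_reason")
    if reason == "end_turn":
        return "complete"
    if reason in ("max_tokens", "model_context_window_exceeded"):
        return "truncated"
    if reason == "stop_sequence":
        # stop_sequence holds the custom sequence that was matched
        return f"stopped at {message.get('stop_sequence')!r}"
    return f"needs handling: {reason}"


message = json.loads(raw_body)
print(describe_stop(message))
```

    The fallback branch deliberately reports unknown values instead of raising, so a stop reason this helper does not list still produces a usable result.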

    Stop reason values

    end_turn

    The most common stop reason. It indicates that Claude finished its response naturally.

    if response.stop_reason == "end_turn":
        # Process the complete response
        print(response.content[0].text)

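    Empty responses with end_turn

    Sometimes Claude returns an empty response (exactly 2-3 tokens with no content) with stop_reason: "end_turn". This typically happens when Claude judges that the assistant turn is complete, particularly after tool results.

    Common causes:

    • Adding text blocks immediately after tool results (Claude learns to expect that the user always inserts text after tool results, so it ends its turn to follow the pattern)
    • Sending Claude's completed response back without adding anything (Claude has already decided it is done, so it stays done)

    How to prevent empty responses: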

    # INCORRECT: Adding text immediately after tool_result
    messages = [
        {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
        {"role": "assistant", "content": [
            {
                "type": "tool_use",
                "id": "toolu_123",
                "name": "calculator",
                "input": {"operation": "add", "a": 1234, "b": 5678}
            }
        ]},
        {"role": "user", "content": [
            {
                "type": "tool_result",
                "tool_use_id": "toolu_123",
                "content": "6912"
            },
            {
                "type": "text",
                "text": "Here's the result"  # Don't add text after tool_result
            }
        ]}
    ]
    
    # CORRECT: Send tool results directly without additional text
    messages = [
        {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
        {"role": "assistant", "content": [
            {
                "type": "tool_use",
                "id": "toolu_123",
                "name": "calculator",
                "input": {"operation": "add", "a": 1234, "b": 5678}
            }
        ]},
        {"role": "user", "content": [
            {
                "type": "tool_result",
                "tool_use_id": "toolu_123",
                "content": "6912"
            }
        ]}  # Just the tool_result, no additional text
    ]
    
    # If you still get empty responses after fixing the above:
    def handle_empty_response(client, messages):
        response = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=1024,
            messages=messages
        )
    
        # Check if response is empty
        if response.stop_reason == "end_turn" and not response.content:
    
            # INCORRECT: Don't just retry with the empty response
            # This won't work because Claude already decided it's done
    
            # CORRECT: Add a continuation prompt in a NEW user message
            messages.append({"role": "user", "content": "Please continue"})
    
            response = client.messages.create(
                model="claude-opus-4-6",
                max_tokens=1024,
                messages=messages
            )
    
        return response

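    Best practices:

    1. Never add text blocks immediately after tool results: this teaches Claude to expect user input after every tool use
    2. Don't retry empty responses without modification: simply sending the empty response back won't help
    3. Use continuation prompts as a last resort: only when the fixes above don't resolve the issue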
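
    One way to enforce the first practice is to check the conversation history before sending it. The sketch below flags user messages in which a text block follows a tool_result block. The function name find_text_after_tool_result is hypothetical, and the check assumes messages are plain dicts shaped like the examples above.

```python
def find_text_after_tool_result(messages):
    """Return indices of user messages where a text block follows a tool_result block."""
    flagged = []
    for index, message in enumerate(messages):
        content = message.get("content")
        # Plain-string content cannot contain tool_result blocks, so skip it.
        if message.get("role") != "user" or not isinstance(content, list):
            continue
        seen_tool_result = False
        for block in content:
            if block.get("type") == "tool_result":
                seen_tool_result = True
            elif block.get("type") == "text" and seen_tool_result:
                flagged.append(index)
                break
    return flagged


messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator",
         "input": {"operation": "add", "a": 1234, "b": 5678}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"},
    ]},
]
print(find_text_after_tool_result(messages))
```

    Running a check like this in tests or before each request catches the pattern early, rather than after empty responses start to appear.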

    max_tokens

    Claude stopped because it reached the max_tokens limit specified in your request.

    # Request with limited tokens
    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=10,
        messages=[{"role": "user", "content": "Explain quantum physics"}]
    )
    
    if response.stop_reason == "max_tokens":
        # Response was truncated
        print("Response was cut off at token limit")
        # Consider making another request to continue

    stop_sequence

    Claude encountered one of your custom stop sequences.

    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=1024,
        stop_sequences=["END", "STOP"],
        messages=[{"role": "user", "content": "Generate text until you say END"}]
    )
    
    if response.stop_reason == "stop_sequence":
        print(f"Stopped at sequence: {response.stop_sequence}")

    tool_use

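    Claude is calling a tool and expects you to execute it.

    For most tool use implementations, we recommend the tool runner, which automatically handles tool execution, result formatting, and conversation management.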

    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=1024,
        tools=[weather_tool],
        messages=[{"role": "user", "content": "What's the weather?"}]
    )
    
    if response.stop_reason == "tool_use":
        # Extract and execute the tool
        for content in response.content:
            if content.type == "tool_use":
                result = execute_tool(content.name, content.input)
                # Return result to Claude for final response
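
    The loop above stops at the comment about returning the result. A minimal sketch of that step follows: it runs each tool_use block and packs the outputs into a single user message made only of tool_result blocks. The execute_tool dispatcher, the get_weather tool, and the SimpleNamespace stand-ins for SDK content blocks are assumptions for illustration.

```python
from types import SimpleNamespace


def execute_tool(name, tool_input):
    """Hypothetical dispatcher; replace with your real tool implementations."""
    if name == "get_weather":
        return f"Sunny in {tool_input['location']}"
    raise ValueError(f"Unknown tool: {name}")


def build_tool_result_message(content_blocks):
    """Run every tool_use block and return one user message with the results."""
    results = []
    for block in content_blocks:
        if block.type != "tool_use":
            continue  # skip text and other block types
        result = {"type": "tool_result", "tool_use_id": block.id}
        try:
            result["content"] = execute_tool(block.name, block.input)
        except Exception as exc:
            # Report the failure to Claude instead of crashing the loop
            result["content"] = str(exc)
            result["is_error"] = True
        results.append(result)
    return {"role": "user", "content": results}


# Stand-in for response.content from a tool_use response.
blocks = [
    SimpleNamespace(type="text", text="Let me check."),
    SimpleNamespace(type="tool_use", id="toolu_1", name="get_weather",
                    input={"location": "Paris"}),
]
print(build_tool_result_message(blocks))
```

    Append the assistant turn (response.content) and then this message to the conversation before calling the API again. The message contains no trailing text block, in line with the guidance on empty responses.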

    pause_turn

    Returned with server tools such as web search when Claude needs to pause a long-running operation.

    original_query = "Search for latest AI news"
    web_search_tool = {"type": "web_search_20250305", "name": "web_search"}

    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=1024,
        tools=[web_search_tool],
        messages=[{"role": "user", "content": original_query}]
    )

    if response.stop_reason == "pause_turn":
        # Continue the conversation by sending the paused assistant turn back as-is
        messages = [
            {"role": "user", "content": original_query},
            {"role": "assistant", "content": response.content}
        ]
        continuation = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=1024,  # max_tokens is required on every request
            messages=messages,
            tools=[web_search_tool]
        )

    refusal

    Claude declined to generate a response because of safety concerns.

    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": "[Unsafe request]"}]
    )
    
    if response.stop_reason == "refusal":
        # Claude declined to respond
        print("Claude was unable to process this request")
        # Consider rephrasing or modifying the request

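    If you frequently encounter the refusal stop reason while using Claude Sonnet 4.5 or Opus 4.1, you can try updating your API calls to use Sonnet 4 (claude-sonnet-4-20250514), which has different usage restrictions.

    To learn more about refusals triggered by the API safety filters for Claude Sonnet 4.5, see Understanding Sonnet 4.5's API safety filters.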

    model_context_window_exceeded

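    Claude stopped because it reached the model's context window limit. This lets you request the maximum possible number of tokens without knowing the exact input size.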

    # Request with maximum tokens to get as much as possible
    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=64000,  # Model's maximum output tokens
        messages=[{"role": "user", "content": "Large input that uses most of context window..."}]
    )
    
    if response.stop_reason == "model_context_window_exceeded":
        # Response hit context window limit before max_tokens
        print("Response reached model's context window limit")
        # The response is still valid but was limited by context window

    This stop reason is available by default in Sonnet 4.5 and newer models. For earlier models, use the beta header model-context-window-exceeded-2025-08-26 to enable this behavior.
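
    For earlier models, one way to attach that beta header is sketched below. It assumes the value is sent in the anthropic-beta request header through the Python SDK's per-request extra_headers option. The function name create_with_context_window_stop is hypothetical, and client is expected to be an anthropic.Anthropic instance created elsewhere.

```python
BETA_HEADER = "model-context-window-exceeded-2025-08-26"


def create_with_context_window_stop(client, model, messages, max_tokens):
    """Send a request with the beta header named in the text above attached."""
    return client.messages.create(
        model=model,
        max_tokens=max_tokens,
        messages=messages,
        # Assumption: beta features are opted into via the anthropic-beta header
        extra_headers={"anthropic-beta": BETA_HEADER},
    )
```

    Keeping the header in one wrapper makes it easy to remove once every model you call supports the stop reason by default.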

    Best practices for handling stop reasons

    1. Always check stop_reason

    Make a habit of checking stop_reason in your response handling logic:

    def handle_response(response):
        if response.stop_reason == "tool_use":
            return handle_tool_use(response)
        elif response.stop_reason == "max_tokens":
            return handle_truncation(response)
        elif response.stop_reason == "model_context_window_exceeded":
            return handle_context_limit(response)
        elif response.stop_reason == "pause_turn":
            return handle_pause(response)
        elif response.stop_reason == "refusal":
            return handle_refusal(response)
        else:
            # Handle end_turn and other cases
            return response.content[0].text
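
    The fallback branch above reads response.content[0].text, which raises an IndexError on an empty response and fails when the first block is not a text block. A more defensive sketch is shown below. The extract_text name is hypothetical, and the SimpleNamespace objects only stand in for SDK response objects.

```python
from types import SimpleNamespace


def extract_text(response):
    """Join the text of every text block; returns "" when there are none."""
    return "".join(
        block.text
        for block in response.content
        if getattr(block, "type", None) == "text"
    )


# Stand-ins for an empty response and a mixed-content response.
empty = SimpleNamespace(content=[])
mixed = SimpleNamespace(content=[
    SimpleNamespace(type="tool_use", id="toolu_1", name="calculator", input={}),
    SimpleNamespace(type="text", text="The sum is "),
    SimpleNamespace(type="text", text="6912."),
])
print(repr(extract_text(empty)))
print(extract_text(mixed))
```

    Returning an empty string for empty content lets the caller decide separately how to handle the empty end_turn case described earlier.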

    2. Handle truncated responses gracefully

    When a response is truncated because of the token limit or the context window:

    def handle_truncated_response(client, original_prompt, response, continue_generation=False):
        if response.stop_reason not in ["max_tokens", "model_context_window_exceeded"]:
            return response.content[0].text  # Not truncated

        if not continue_generation:
            # Option 1: Warn the user about the specific limit
            if response.stop_reason == "max_tokens":
                message = "[Response truncated due to max_tokens limit]"
            else:
                message = "[Response truncated due to context window limit]"
            return f"{response.content[0].text}\n\n{message}"

        # Option 2: Continue generation
        messages = [
            {"role": "user", "content": original_prompt},
            {"role": "assistant", "content": response.content[0].text},
            {"role": "user", "content": "Please continue"}
        ]
        continuation = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=1024,
            messages=messages
        )
        return response.content[0].text + continuation.content[0].text

    3. Implement retry logic for pause_turn

    For server tools that may pause:

    def handle_paused_conversation(client, original_query, original_tools,
                                   initial_response, max_retries=3):
        response = initial_response
        messages = [{"role": "user", "content": original_query}]

        for attempt in range(max_retries):
            if response.stop_reason != "pause_turn":
                break

            # Send the paused assistant turn back unchanged so Claude can resume
            messages.append({"role": "assistant", "content": response.content})
            response = client.messages.create(
                model="claude-opus-4-6",
                max_tokens=1024,
                messages=messages,
                tools=original_tools
            )

        return response

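    Stop reasons vs. errors

    It is important to distinguish stop_reason values from actual errors:

    Stop reasons (successful responses)

    • Part of the response body
    • Indicate why generation stopped normally
    • The response contains valid content

    Errors (failed requests)

    • HTTP status code 4xx or 5xx
    • Indicate that request processing failed
    • The response contains error details

    try: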
        response = client.messages.create(...)
        
        # Handle successful response with stop_reason
        if response.stop_reason == "max_tokens":
            print("Response was truncated")
        
    except anthropic.APIStatusError as e:
        # Handle actual errors (APIStatusError carries the HTTP status code)
        if e.status_code == 429:
            print("Rate limit exceeded")
        elif e.status_code == 500:
            print("Server error")
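
    The same distinction applies when calling the HTTP API without the SDK. The sketch below classifies a raw status code and parsed JSON body. The classify_outcome name is hypothetical, and the code assumes failed requests return a body of the form {"type": "error", "error": {"type": ..., "message": ...}}.

```python
def classify_outcome(status_code, body):
    """Return ("error", error_type) for failed requests, else ("stop", stop_reason)."""
    if status_code >= 400:
        # Failed request: no stop_reason exists, only error details
        error = body.get("error", {})
        return ("error", error.get("type", "unknown"))
    # Successful request: stop_reason is part of the message body
    return ("stop", body.get("stop_reason"))


print(classify_outcome(200, {"type": "message", "stop_reason": "max_tokens"}))
print(classify_outcome(429, {"type": "error",
                             "error": {"type": "rate_limit_error", "message": "..."}}))
```

    A truncated response (max_tokens) therefore takes the success path and still carries usable content, while a 429 never reaches stop_reason handling at all.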

    Streaming considerations

    When streaming, stop_reason is:

    • null in the initial message_start event
    • provided in the message_delta event
    • not provided in any other event

    with client.messages.stream(...) as stream:
        for event in stream:
            if event.type == "message_delta":
                stop_reason = event.delta.stop_reason
                if stop_reason:
                    print(f"Stream ended with: {stop_reason}")
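
    To make the event order concrete without a network call, the sketch below runs the same check over a list of stand-in events. The SimpleNamespace objects only imitate the attributes used here and are not SDK types, and final_stop_reason is a hypothetical helper name.

```python
from types import SimpleNamespace


def final_stop_reason(events):
    """Scan stream events and return the stop_reason reported by message_delta."""
    stop_reason = None
    for event in events:
        if event.type == "message_delta" and event.delta.stop_reason:
            stop_reason = event.delta.stop_reason
    return stop_reason


# Stand-in events in the order described above.
events = [
    SimpleNamespace(type="message_start", message=SimpleNamespace(stop_reason=None)),
    SimpleNamespace(type="content_block_delta", delta=SimpleNamespace(text="Hi")),
    SimpleNamespace(type="message_delta", delta=SimpleNamespace(stop_reason="end_turn")),
    SimpleNamespace(type="message_stop"),
]
print(final_stop_reason(events))  # end_turn
```

    If the stream is interrupted before message_delta arrives, the helper returns None, which callers can treat as an incomplete response.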

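    Common patterns

    Handling tool use workflows

    Simpler with the tool runner: the example below shows manual tool handling. For most use cases, the tool runner handles tool execution automatically with less code.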

    def complete_tool_workflow(client, user_query, tools):
        messages = [{"role": "user", "content": user_query}]
    
        while True:
            response = client.messages.create(
                model="claude-opus-4-6",
                max_tokens=1024,  # max_tokens is required on every request
                messages=messages,
                tools=tools
            )
    
            if response.stop_reason == "tool_use":
                # Execute tools and continue
                tool_results = execute_tools(response.content)
                messages.append({"role": "assistant", "content": response.content})
                messages.append({"role": "user", "content": tool_results})
            else:
                # Final response
                return response
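
    The manual loop above runs until Claude stops requesting tools. A variant with an iteration cap is sketched below, as a guard against a conversation that keeps requesting tools. The run_tool_loop name, the max_iterations parameter, and the execute_tools callback are assumptions for illustration.

```python
def run_tool_loop(client, messages, tools, execute_tools, max_iterations=5):
    """Manual tool loop that gives up after max_iterations tool rounds."""
    for _ in range(max_iterations):
        response = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=1024,
            messages=messages,
            tools=tools,
        )
        if response.stop_reason != "tool_use":
            return response  # Final response (end_turn, max_tokens, ...)

        # Record the assistant turn, then answer it with tool results only
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": execute_tools(response.content)})

    raise RuntimeError(f"Tool loop did not finish within {max_iterations} iterations")
```

    Here execute_tools is expected to return a list of tool_result blocks for the tool_use blocks it receives. Raising on the cap keeps a runaway loop visible instead of returning a partial answer silently.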

    Ensuring complete responses

    def get_complete_response(client, prompt, max_attempts=3):
        messages = [{"role": "user", "content": prompt}]
        full_response = ""
    
        for _ in range(max_attempts):
            response = client.messages.create(
                model="claude-opus-4-6",
                messages=messages,
                max_tokens=4096
            )
    
            full_response += response.content[0].text
    
            if response.stop_reason != "max_tokens":
                break
    
            # Continue from where it left off
            messages = [
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": full_response},
                {"role": "user", "content": "Please continue from where you left off."}
            ]
    
        return full_response

    Getting maximum tokens without knowing input size

    With the model_context_window_exceeded stop reason, you can request the maximum possible number of tokens without calculating the input size:

    def get_max_possible_tokens(client, prompt):
        """
        Get as many tokens as possible within the model's context window
        without needing to calculate input token count
        """
        response = client.messages.create(
            model="claude-opus-4-6",
            messages=[{"role": "user", "content": prompt}],
            max_tokens=64000  # Set to model's maximum output tokens
        )
    
        if response.stop_reason == "model_context_window_exceeded":
            # Got the maximum possible tokens given input size
            print(f"Generated {response.usage.output_tokens} tokens (context limit reached)")
        elif response.stop_reason == "max_tokens":
            # Got exactly the requested tokens
            print(f"Generated {response.usage.output_tokens} tokens (max_tokens reached)")
        else:
            # Natural completion
            print(f"Generated {response.usage.output_tokens} tokens (natural completion)")
    
        return response.content[0].text

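    By handling stop_reason values correctly, you can build more robust applications that deal gracefully with different response scenarios and provide a better user experience.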
