
Features

Adaptive thinking

Let Claude dynamically decide when and how much to think with adaptive thinking mode.


Adaptive thinking is the recommended way to use extended thinking on Claude Opus 4.6. Instead of requiring a manually configured thinking token budget, adaptive thinking lets Claude decide dynamically, per request, whether and how much to think based on the complexity of the task.

Adaptive thinking reliably delivers better performance than extended thinking with a fixed budget_tokens, and we recommend switching to it to get the smartest responses from Opus 4.6. No beta header is required.

Supported models

Adaptive thinking is supported on the following models:

• Claude Opus 4.6 (claude-opus-4-6)

thinking.type: "enabled" and budget_tokens are deprecated on Opus 4.6 and will be removed in a future model release. Use thinking.type: "adaptive" with the effort parameter instead.

Older models (Sonnet 4.5, Opus 4.5, and earlier) do not support adaptive thinking and require thinking.type: "enabled" with budget_tokens.
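As a sketch, the model-dependent configuration above can be captured in a small helper. The non-Opus model check and the default budget value here are illustrative, not prescribed by the API:

```python
# Sketch: choose a thinking configuration based on the target model.
# Only claude-opus-4-6 supports adaptive thinking; older models need
# manual mode with an explicit token budget (the default here is illustrative).

def thinking_config(model: str, budget_tokens: int = 10_000) -> dict:
    """Return the `thinking` parameter for a Messages API request."""
    if model == "claude-opus-4-6":
        return {"type": "adaptive"}
    # Older models (e.g. Sonnet 4.5, Opus 4.5): manual extended thinking.
    return {"type": "enabled", "budget_tokens": budget_tokens}
```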

How adaptive thinking works

In adaptive mode, thinking is optional for the model. Claude assesses the complexity of each request and decides whether to think and how much. At the default effort level (high), Claude almost always thinks. At lower effort levels, Claude may skip thinking for simpler questions.

Adaptive thinking also automatically enables interleaved thinking: Claude can think between tool calls, which makes it particularly effective in agentic workflows.

How to use adaptive thinking

Set thinking.type to "adaptive" in your API request.
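A minimal request body, assuming the standard Messages API shape, looks like this:

```python
# Minimal Messages API request body with adaptive thinking enabled.
# No budget_tokens is needed; Claude decides how much to think.
request_body = {
    "model": "claude-opus-4-6",
    "max_tokens": 16000,
    "thinking": {"type": "adaptive"},
    "messages": [
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even.",
        }
    ],
}
```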

Adaptive thinking with the effort parameter

You can combine adaptive thinking with the effort parameter to guide how much Claude thinks. The effort level acts as soft guidance for Claude's thinking allocation:

• max: Claude always thinks, with no limit on thinking depth. Opus 4.6 only; requests using max on other models return an error.
• high (default): Claude always thinks, providing deep reasoning on complex tasks.
• medium: Claude thinks a moderate amount and may skip thinking for very simple queries.
• low: Claude thinks minimally, skipping thinking on simple tasks where speed matters most.
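Combining the two settings, a request builder might look like the sketch below. The output_config.effort field mirrors the Python example on this page; the validation is an illustrative convenience, not API behavior:

```python
# Sketch: request body combining adaptive thinking with an effort level.
# Valid levels are "low", "medium", "high" (default), and "max" (Opus 4.6 only).

VALID_EFFORT = {"low", "medium", "high", "max"}

def adaptive_request(prompt: str, effort: str = "high") -> dict:
    """Build a Messages API request body with adaptive thinking and effort."""
    if effort not in VALID_EFFORT:
        raise ValueError(f"unknown effort level: {effort}")
    return {
        "model": "claude-opus-4-6",
        "max_tokens": 16000,
        "thinking": {"type": "adaptive"},
        "output_config": {"effort": effort},
        "messages": [{"role": "user", "content": prompt}],
    }
```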

Streaming with adaptive thinking

Adaptive thinking works seamlessly with streaming. Thinking blocks are streamed through thinking_delta events, just as in manual thinking mode.
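The delta handling can be sketched against mock events. The event shapes follow the streaming example on this page; the event contents below are fabricated for illustration:

```python
# Sketch: separate thinking deltas from text deltas in a stream of
# content_block_delta events (events here are mocked dicts, not SDK objects).

def collect_stream(events: list[dict]) -> tuple[str, str]:
    """Return (thinking_text, answer_text) accumulated from delta events."""
    thinking, text = [], []
    for event in events:
        if event.get("type") != "content_block_delta":
            continue
        delta = event["delta"]
        if delta["type"] == "thinking_delta":
            thinking.append(delta["thinking"])
        elif delta["type"] == "text_delta":
            text.append(delta["text"])
    return "".join(thinking), "".join(text)
```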

Adaptive vs. manual vs. disabled thinking

• Adaptive: thinking: {type: "adaptive"}. Available on Opus 4.6. Claude decides whether and how much to think; use effort to guide it.
• Manual: thinking: {type: "enabled", budget_tokens: N}. Available on all models, but deprecated on Opus 4.6 (use adaptive instead). Use when you need precise control over thinking token spend.
• Disabled: omit the thinking parameter. Available on all models. Use when you don't need extended thinking and want the lowest latency.

Adaptive thinking is currently available on Opus 4.6. Older models only support type: "enabled" with budget_tokens. On Opus 4.6, type: "enabled" with budget_tokens is still accepted but deprecated; we recommend switching to adaptive thinking with the effort parameter.

Important considerations

Validation changes

With adaptive thinking, previous assistant turns are not required to start with a thinking block. This is more flexible than manual mode, where the API enforces that thinking-enabled turns begin with a thinking block.

Prompt caching

Consecutive requests using adaptive thinking preserve prompt cache breakpoints. Switching between adaptive and enabled/disabled thinking modes, however, breaks message cache breakpoints. System prompts and tool definitions stay cached regardless of mode changes.

Tuning thinking behavior

Adaptive thinking's triggering behavior can be steered by prompting. If Claude thinks more or less often than you'd like, add guidance to your system prompt:

    Extended thinking adds latency and should only be used when it
    will meaningfully improve answer quality — typically for problems
    that require multi-step reasoning. When in doubt, respond directly.
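Guidance along those lines is passed as an ordinary system prompt alongside the adaptive thinking setting, for example:

```python
# Sketch: steer adaptive thinking frequency via the system prompt.
# The guidance text paraphrases the example above.
GUIDANCE = (
    "Extended thinking adds latency and should only be used when it "
    "will meaningfully improve answer quality, typically for problems "
    "that require multi-step reasoning. When in doubt, respond directly."
)

request_body = {
    "model": "claude-opus-4-6",
    "max_tokens": 16000,
    "system": GUIDANCE,
    "thinking": {"type": "adaptive"},
    "messages": [{"role": "user", "content": "What is 2 + 2?"}],
}
```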

Steering Claude to think less often can reduce quality on tasks that benefit from reasoning. Measure the impact on your specific workload before deploying prompt-based tuning to production, and consider testing with a lower effort level first.

Cost controls

Use max_tokens as a hard cap on total output (thinking plus response text). The effort parameter adds soft guidance over how much thinking Claude allocates. Together they give you effective cost control.

At the high and max effort levels, Claude may think more extensively and is more likely to exhaust the max_tokens budget. If you observe stop_reason: "max_tokens" in responses, consider raising max_tokens to give the model more room, or lowering the effort level.
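A retry adjustment along those lines might look like this sketch; the doubling factor is an illustrative choice, not a recommendation from the API:

```python
# Sketch: adjust request parameters after a truncated response.
# Doubling max_tokens is one illustrative option; lowering effort is another.

def adjust_for_truncation(params: dict, stop_reason: str) -> dict:
    """Return updated request params when output hit the max_tokens cap."""
    if stop_reason != "max_tokens":
        return params  # response completed normally; nothing to change
    adjusted = dict(params)
    adjusted["max_tokens"] = params["max_tokens"] * 2  # or lower effort instead
    return adjusted
```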

Working with thinking blocks

The concepts below apply to all models that support extended thinking, whether you use adaptive or manual mode.

Summarized thinking

    With extended thinking enabled, the Messages API for Claude 4 models returns a summary of Claude's full thinking process. Summarized thinking provides the full intelligence benefits of extended thinking, while preventing misuse.

    Here are some important considerations for summarized thinking:

    • You're charged for the full thinking tokens generated by the original request, not the summary tokens.
    • The billed output token count will not match the count of tokens you see in the response.
    • The first few lines of thinking output are more verbose, providing detailed reasoning that's particularly helpful for prompt engineering purposes.
    • As Anthropic seeks to improve the extended thinking feature, summarization behavior is subject to change.
    • Summarization preserves the key ideas of Claude's thinking process with minimal added latency, enabling a streamable user experience and easy migration from Claude Sonnet 3.7 to Claude 4 and later models.
    • Summarization is processed by a different model than the one you target in your requests. The thinking model does not see the summarized output.

    Claude Sonnet 3.7 continues to return full thinking output.

    In rare cases where you need access to full thinking output for Claude 4 models, contact our sales team.

Thinking encryption

    Full thinking content is encrypted and returned in the signature field. This field is used to verify that thinking blocks were generated by Claude when passed back to the API.

    It is only strictly necessary to send back thinking blocks when using tools with extended thinking. Otherwise you can omit thinking blocks from previous turns, or let the API strip them for you if you pass them back.

    If sending back thinking blocks, we recommend passing everything back as you received it for consistency and to avoid potential issues.
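One way to follow that recommendation is to append the assistant's content blocks untouched when building the next turn. The helper below is a sketch over the standard content-block message shape:

```python
# Sketch: build the next conversation turn, passing the assistant's content
# blocks (including any thinking / redacted_thinking blocks and their
# signatures) back to the API exactly as received.

def next_turn(messages: list[dict], assistant_content: list[dict],
              user_text: str) -> list[dict]:
    """Return the messages list for the follow-up request."""
    return messages + [
        {"role": "assistant", "content": assistant_content},  # unmodified
        {"role": "user", "content": user_text},
    ]
```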

    Here are some important considerations on thinking encryption:

    • When streaming responses, the signature is added via a signature_delta inside a content_block_delta event just before the content_block_stop event.
    • signature values are significantly longer in Claude 4 models than in previous models.
    • The signature field is opaque and should not be interpreted or parsed; it exists solely for verification purposes.
    • signature values are compatible across platforms (Claude APIs, Amazon Bedrock, and Vertex AI). Values generated on one platform will be compatible with another.

Thinking redaction

    Occasionally Claude's internal reasoning will be flagged by our safety systems. When this occurs, we encrypt some or all of the thinking block and return it to you as a redacted_thinking block. redacted_thinking blocks are decrypted when passed back to the API, allowing Claude to continue its response without losing context.

    When building customer-facing applications that use extended thinking:

    • Be aware that redacted thinking blocks contain encrypted content that isn't human-readable
    • Consider providing a simple explanation like: "Some of Claude's internal reasoning has been automatically encrypted for safety reasons. This doesn't affect the quality of responses."
    • If showing thinking blocks to users, you can filter out redacted blocks while preserving normal thinking blocks
    • Be transparent that using extended thinking features may occasionally result in some reasoning being encrypted
    • Implement appropriate error handling to gracefully manage redacted thinking without breaking your UI

    Here's an example showing both normal and redacted thinking blocks:

    {
      "content": [
        {
          "type": "thinking",
          "thinking": "Let me analyze this step by step...",
          "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
        },
        {
          "type": "redacted_thinking",
          "data": "EmwKAhgBEgy3va3pzix/LafPsn4aDFIT2Xlxh0L5L8rLVyIwxtE3rAFBa8cr3qpPkNRj2YfWXGmKDxH4mPnZ5sQ7vB9URj2pLmN3kF8/dW5hR7xJ0aP1oLs9yTcMnKVf2wRpEGjH9XZaBt4UvDcPrQ..."
        },
        {
          "type": "text",
          "text": "Based on my analysis..."
        }
      ]
    }

    Seeing redacted thinking blocks in your output is expected behavior. The model can still use this redacted reasoning to inform its responses while maintaining safety guardrails.
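Filtering redacted blocks for display, as suggested above, can be sketched as follows. The display filter operates on content blocks shaped like the example; the full, unfiltered list is what you pass back to the API:

```python
# Sketch: filter content blocks for user display. redacted_thinking blocks
# are dropped from the UI, but the original list must still be sent back
# to the API unmodified in multi-turn conversations.

def displayable_blocks(content: list[dict]) -> list[dict]:
    """Return only blocks safe to render to end users."""
    return [block for block in content if block["type"] != "redacted_thinking"]
```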

    If you need to test redacted thinking handling in your application, you can use this special test string as your prompt: ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING_46C9A13E193C177646C7398A98432ECCCE4C1253D5E2D82641AC0E52CC2876CB

    When passing thinking and redacted_thinking blocks back to the API in a multi-turn conversation, you must include the complete unmodified block back to the API for the last assistant turn. This is critical for maintaining the model's reasoning flow. We suggest always passing back all thinking blocks to the API. For more details, see the Preserving thinking blocks section.

Pricing

    For complete pricing information including base rates, cache writes, cache hits, and output tokens, see the pricing page.

    The thinking process incurs charges for:

    • Tokens used during thinking (output tokens)
    • Thinking blocks from the last assistant turn included in subsequent requests (input tokens)
    • Standard text output tokens

    When extended thinking is enabled, a specialized system prompt is automatically included to support this feature.

    When using summarized thinking:

    • Input tokens: Tokens in your original request (excludes thinking tokens from previous turns)
    • Output tokens (billed): The original thinking tokens that Claude generated internally
    • Output tokens (visible): The summarized thinking tokens you see in the response
    • No charge: Tokens used to generate the summary

    The billed output token count will not match the visible token count in the response. You are billed for the full thinking process, not the summary you see.

Other topics

The extended thinking page covers several topics in more detail, with mode-specific code examples:

    • Tool use with thinking: the same rules apply to adaptive thinking. Preserve thinking blocks between tool calls, and note the tool_choice restrictions while thinking is enabled.
    • Prompt caching: with adaptive thinking, consecutive requests using the same thinking mode preserve cache breakpoints. Switching between adaptive and enabled/disabled modes breaks message cache breakpoints (system prompts and tool definitions stay cached).
    • Context windows: how thinking tokens interact with max_tokens and context window limits.

Code examples

The examples below show a basic adaptive thinking request, adaptive thinking with the effort parameter, and streaming.
Basic request (curl):

    curl https://api.anthropic.com/v1/messages \
         --header "x-api-key: $ANTHROPIC_API_KEY" \
         --header "anthropic-version: 2023-06-01" \
         --header "content-type: application/json" \
         --data \
    '{
        "model": "claude-opus-4-6",
        "max_tokens": 16000,
        "thinking": {
            "type": "adaptive"
        },
        "messages": [
            {
                "role": "user",
                "content": "Explain why the sum of two even numbers is always even."
            }
        ]
    }'

With the effort parameter (Python):

    import anthropic
    
    client = anthropic.Anthropic()
    
    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=16000,
        thinking={
            "type": "adaptive"
        },
        output_config={
            "effort": "medium"
        },
        messages=[{
            "role": "user",
            "content": "What is the capital of France?"
        }]
    )
    
    print(response.content[0].text)

Streaming (Python):

    import anthropic
    
    client = anthropic.Anthropic()
    
    with client.messages.stream(
        model="claude-opus-4-6",
        max_tokens=16000,
        thinking={"type": "adaptive"},
        messages=[{"role": "user", "content": "What is the greatest common divisor of 1071 and 462?"}],
    ) as stream:
        for event in stream:
            if event.type == "content_block_start":
                print(f"\nStarting {event.content_block.type} block...")
            elif event.type == "content_block_delta":
                if event.delta.type == "thinking_delta":
                    print(event.delta.thinking, end="", flush=True)
                elif event.delta.type == "text_delta":
                    print(event.delta.text, end="", flush=True)
Next steps

Extended thinking

Dive deeper into extended thinking, including manual mode, tool use, and prompt caching.

The effort parameter

Use the effort parameter to control how thorough Claude's responses are.