Adaptive thinking

Let Claude dynamically decide when and how much to use extended thinking with adaptive thinking mode.

This feature is eligible for Zero Data Retention (ZDR). When your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned.

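Adaptive thinking is the recommended way to use extended thinking on Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6, and it is the default mode on Claude Mythos Preview (applied automatically when thinking is unset). Instead of requiring you to set a thinking token budget manually, adaptive thinking lets Claude decide dynamically when and how much to think based on the complexity of each request. On Claude Opus 4.7, adaptive thinking is the only supported thinking mode; manual thinking: {type: "enabled", budget_tokens: N} is no longer accepted.

For many workloads, especially bimodal tasks and long-horizon agentic workflows, adaptive thinking can deliver better performance than extended thinking with a fixed budget_tokens. No beta header is required.

If your workload needs predictable latency or precise control over thinking costs, extended thinking with budget_tokens still works on Claude Opus 4.6 and Claude Sonnet 4.6, but it is deprecated and no longer recommended. See the warning below.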

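Supported models

Adaptive thinking is supported on the following models:

  • Claude Mythos Preview (claude-mythos-preview): adaptive thinking is the default; thinking: {type: "disabled"} is not supported.
  • Claude Opus 4.7 (claude-opus-4-7): adaptive thinking is the only supported thinking mode. Thinking is off unless you explicitly set thinking: {type: "adaptive"} in the request; manual thinking: {type: "enabled"} is rejected with a 400 error.
  • Claude Opus 4.6 (claude-opus-4-6)
  • Claude Sonnet 4.6 (claude-sonnet-4-6)

thinking.type: "enabled" and budget_tokens are deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future model release. Use thinking.type: "adaptive" with the effort parameter instead. Existing budget_tokens configurations still work but are no longer recommended; plan to migrate.

Older models (Sonnet 4.5, Opus 4.5, and so on) do not support adaptive thinking and require thinking.type: "enabled" with budget_tokens.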
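
The model support rules above can be captured in a small lookup. The following is a minimal sketch, not part of the SDK: the helper name thinking_config_for and the fallback budget of 10,000 tokens are hypothetical, and any model ID not listed on this page is assumed to be an older model that needs manual mode.

```python
# Hypothetical helper: choose a `thinking` value from the rules on this page.
# Model IDs are the ones listed above; nothing here calls the API.
ADAPTIVE_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}


def thinking_config_for(model, budget_tokens=10_000):
    """Return a `thinking` dict for the given model ID.

    Models that support adaptive thinking get {"type": "adaptive"}.
    Any other model is assumed to be an older one that requires manual mode;
    the default budget is an arbitrary placeholder, not a recommendation.
    """
    if model in ADAPTIVE_MODELS:
        return {"type": "adaptive"}
    return {"type": "enabled", "budget_tokens": budget_tokens}


print(thinking_config_for("claude-opus-4-7"))
print(thinking_config_for("an-older-model-id", budget_tokens=4_000))
```

Keeping the list of adaptive-capable models in one place makes the eventual migration away from budget_tokens a one-line change.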

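How adaptive thinking works

In adaptive mode, thinking is optional for the model. Claude evaluates the complexity of each request and decides whether and how much to use extended thinking. At the default effort level (high), Claude almost always thinks. At lower effort levels, Claude may skip thinking for simple problems.

Adaptive thinking also enables interleaved thinking automatically. Claude can therefore think between tool calls, which makes adaptive mode especially effective for agentic workflows.

How to use adaptive thinking

Set thinking.type to "adaptive" in your API request: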

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even.",
        }
    ],
)

for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")

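Adaptive thinking with the effort parameter

You can combine adaptive thinking with the effort parameter to guide how much Claude thinks. The effort level acts as soft guidance for Claude's thinking allocation:

| Effort level | Thinking behavior |
| --- | --- |
| max | Claude always thinks, with no limit on thinking depth. Available on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. |
| xhigh | Claude always thinks deeply, with extended exploration. Available on Claude Opus 4.7. |
| high (default) | Claude always thinks. Provides deep reasoning on complex tasks. |
| medium | Claude uses moderate thinking. May skip thinking for very simple queries. |
| low | Claude minimizes thinking. Skips thinking for simple tasks where speed matters most. |

import anthropic

client = anthropic.Anthropic()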

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    output_config={"effort": "medium"},
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)

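# The first content block may be a thinking block, so print only text blocks.
for block in response.content:
    if block.type == "text":
        print(block.text)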
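
The effort table can also be enforced before a request is sent. The sketch below is illustrative only: adaptive_request is a hypothetical helper, and it assumes that low, medium, high, and max are accepted on every model you pass in, with xhigh limited to Claude Opus 4.7 as the table states.

```python
# Effort levels transcribed from the table above.
# Assumption: the base levels apply to any adaptive-capable model.
BASE_EFFORTS = {"low", "medium", "high", "max"}
EXTRA_EFFORTS = {"claude-opus-4-7": {"xhigh"}}


def adaptive_request(model, prompt, effort="high", max_tokens=16000):
    """Build keyword arguments for client.messages.create (hypothetical helper)."""
    allowed = BASE_EFFORTS | EXTRA_EFFORTS.get(model, set())
    if effort not in allowed:
        raise ValueError(f"effort {effort!r} is not listed for {model}")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": {"type": "adaptive"},
        "output_config": {"effort": effort},
        "messages": [{"role": "user", "content": prompt}],
    }


params = adaptive_request("claude-opus-4-7", "What is the capital of France?", effort="low")
print(params["output_config"])
```

The returned dictionary can be passed as client.messages.create(**params); failing early on an unsupported effort level avoids a round trip to the API.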

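Streaming with adaptive thinking

Adaptive thinking works seamlessly with streaming. Thinking blocks stream through thinking_delta events, just as in manual thinking mode:

import anthropic

client = anthropic.Anthropic()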

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "What is the greatest common divisor of 1071 and 462?",
        }
    ],
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            print(f"\nStarting {event.content_block.type} block...")
        elif event.type == "content_block_delta":
            if event.delta.type == "thinking_delta":
                print(event.delta.thinking, end="", flush=True)
            elif event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)
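
If you want to keep the streamed output rather than print it, the same event handling can fold deltas into buffers. This is a sketch under stated assumptions: events are modeled as plain dictionaries so the example runs offline, and StreamedMessage and collect are hypothetical names. The SDK yields typed objects with the same type and delta fields.

```python
from dataclasses import dataclass, field


@dataclass
class StreamedMessage:
    """Accumulates streamed deltas by kind (hypothetical container)."""
    thinking: list = field(default_factory=list)
    text: list = field(default_factory=list)


def collect(events):
    """Fold content_block_delta events into thinking and text buffers."""
    out = StreamedMessage()
    for event in events:
        if event.get("type") != "content_block_delta":
            continue  # block start/stop events carry no delta text
        delta = event["delta"]
        if delta["type"] == "thinking_delta":
            out.thinking.append(delta["thinking"])
        elif delta["type"] == "text_delta":
            out.text.append(delta["text"])
        # other delta types, such as signature_delta, are ignored here
    return out


# Hand-written sample events standing in for a real stream.
sample = [
    {"type": "content_block_start", "content_block": {"type": "thinking"}},
    {"type": "content_block_delta",
     "delta": {"type": "thinking_delta", "thinking": "1071 = 2*462 + 147. "}},
    {"type": "content_block_delta",
     "delta": {"type": "signature_delta", "signature": "opaque"}},
    {"type": "content_block_delta",
     "delta": {"type": "text_delta", "text": "The GCD is 21."}},
]
result = collect(sample)
print("".join(result.text))
```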

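Adaptive vs. manual vs. disabled thinking

| Mode | Configuration | Availability | When to use |
| --- | --- | --- | --- |
| Adaptive | thinking: {type: "adaptive"} | Claude Mythos Preview (default), Opus 4.7 (only mode), Opus 4.6, Sonnet 4.6 | Claude decides when and how much to use extended thinking. Use effort to guide it. |
| Manual | thinking: {type: "enabled", budget_tokens: N} | All models except Claude Opus 4.7, which rejects it. Deprecated on Opus 4.6 and Sonnet 4.6 (consider adaptive mode instead). | When you need precise control over thinking token spend. |
| Disabled | Omit the thinking parameter or pass {type: "disabled"} | All models except Claude Mythos Preview | When you don't need extended thinking and want the lowest latency. |

Adaptive thinking is available on Claude Mythos Preview, Claude Opus 4.7, Opus 4.6, and Sonnet 4.6. On Mythos Preview, adaptive thinking is the default and is applied automatically when thinking is unset. On Claude Opus 4.7, adaptive thinking is the only supported mode, and type: "enabled" with budget_tokens is rejected. Older models support only type: "enabled" with budget_tokens. On Opus 4.6 and Sonnet 4.6, type: "enabled" with budget_tokens still works but is deprecated.

Interleaved thinking availability by mode:

  • Adaptive mode: Interleaved thinking is enabled automatically on Claude Mythos Preview, Claude Opus 4.7, Opus 4.6, and Sonnet 4.6. On Mythos Preview and Opus 4.7, reasoning between tool calls always sits inside thinking blocks.
  • Manual mode on Sonnet 4.6: Interleaved thinking works through the interleaved-thinking-2025-05-14 beta header.
  • Manual mode on Opus 4.6: Interleaved thinking is not available. If your agentic workflow needs thinking between tool calls on Opus 4.6, use adaptive mode.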
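
The interleaved thinking rules in this list reduce to a short decision function. The sketch below is an illustration only: interleaved_thinking is a hypothetical helper, and it returns "unspecified" for any combination this page does not describe rather than guessing.

```python
ADAPTIVE_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}

# Beta header named in the list above for manual mode on Sonnet 4.6.
INTERLEAVED_BETA = "interleaved-thinking-2025-05-14"


def interleaved_thinking(model, thinking_type):
    """Return "automatic", "beta-header", "unavailable", or "unspecified"."""
    if thinking_type == "adaptive" and model in ADAPTIVE_MODELS:
        return "automatic"
    if thinking_type == "enabled" and model == "claude-sonnet-4-6":
        return "beta-header"  # send INTERLEAVED_BETA with the request
    if thinking_type == "enabled" and model == "claude-opus-4-6":
        return "unavailable"  # switch to adaptive mode instead
    return "unspecified"  # not covered by this page


for model, mode in [
    ("claude-opus-4-6", "adaptive"),
    ("claude-sonnet-4-6", "enabled"),
    ("claude-opus-4-6", "enabled"),
]:
    print(model, mode, interleaved_thinking(model, mode))
```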

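Important considerations

Validation changes

With adaptive thinking, previous assistant turns do not need to start with a thinking block. This is more flexible than manual mode, where the API enforces that thinking-enabled turns start with a thinking block.

Prompt caching

Consecutive requests that use adaptive thinking preserve prompt cache breakpoints. However, switching between adaptive and enabled/disabled thinking modes breaks cache breakpoints for messages. System prompts and tool definitions stay cached regardless of mode changes.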
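
To see where a conversation would lose its message cache, you can scan the sequence of thinking modes used across requests. This is a minimal sketch with a hypothetical helper name. It assumes that any change of thinking.type counts as a switch, which is stricter than the adaptive versus enabled/disabled case this page spells out.

```python
def cache_break_points(thinking_types):
    """Return the indexes of requests whose thinking mode differs from the previous one.

    Assumption: every change of thinking.type is treated as breaking the
    message cache breakpoints. System prompt and tool definition caches are
    not modeled because they survive mode changes.
    """
    return [
        i
        for i in range(1, len(thinking_types))
        if thinking_types[i] != thinking_types[i - 1]
    ]


history = ["adaptive", "adaptive", "disabled", "adaptive", "adaptive"]
print(cache_break_points(history))  # [2, 3]
```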

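Tuning thinking behavior

Adaptive thinking's triggering behavior is promptable. If Claude thinks more or less often than you want, you can add guidance to your system prompt: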

Extended thinking adds latency and should only be used when it
will meaningfully improve answer quality — typically for problems
that require multi-step reasoning. When in doubt, respond directly.

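Steering Claude to think less may reduce quality on tasks that benefit from reasoning. Before deploying prompt-based tuning to production, measure the impact on your specific workload. Consider testing lower effort levels first.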
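
One simple signal to track while tuning is how often responses contain a thinking block at all. The sketch below is illustrative: thinking_rate is a hypothetical helper, responses are modeled as lists of content-block dictionaries, and the sample data is hand-written. It does not measure answer quality, which you still need to evaluate separately.

```python
def thinking_rate(responses):
    """Fraction of responses that contain at least one thinking block.

    Each response is modeled as a list of content-block dicts. When the
    model skips thinking in adaptive mode, no thinking block is produced.
    """
    if not responses:
        return 0.0
    with_thinking = sum(
        any(block["type"] == "thinking" for block in content)
        for content in responses
    )
    return with_thinking / len(responses)


# Hand-written sample data, not real API output.
baseline = [
    [{"type": "thinking"}, {"type": "text"}],
    [{"type": "thinking"}, {"type": "text"}],
]
with_guidance = [
    [{"type": "thinking"}, {"type": "text"}],
    [{"type": "text"}],
]
print(thinking_rate(baseline), thinking_rate(with_guidance))
```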

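Cost control

Use max_tokens as a hard limit on total output (thinking plus response text). The effort parameter provides additional soft guidance on how much thinking Claude allocates. Together, the two give you effective control over cost.

At high and max effort levels, Claude may think more extensively and is more likely to exhaust the max_tokens budget. If you see stop_reason: "max_tokens" in responses, consider increasing max_tokens to give the model more room, or lower the effort level.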
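
The two remedies for a truncated response can be combined into a retry policy. The following sketch is one possible policy, not a recommendation: adjust_after_truncation is a hypothetical helper, the cap of 64,000 tokens is an arbitrary placeholder, and the choice to double max_tokens before lowering effort is an assumption.

```python
# One step down per effort level; "xhigh" (Opus 4.7 only) falls back to "high".
EFFORT_STEP_DOWN = {"max": "high", "xhigh": "high", "high": "medium", "medium": "low"}


def adjust_after_truncation(params, stop_reason, max_tokens_cap=64000):
    """Return adjusted request params after a truncated response.

    Policy (assumption): double max_tokens until the cap is reached, then
    lower the effort level by one step. Params are returned unchanged when
    the response was not cut off by max_tokens.
    """
    if stop_reason != "max_tokens":
        return params
    adjusted = dict(params)
    if max_tokens_cap > params["max_tokens"]:
        adjusted["max_tokens"] = min(params["max_tokens"] * 2, max_tokens_cap)
        return adjusted
    output_config = dict(params.get("output_config", {}))
    effort = output_config.get("effort", "high")  # high is the default effort
    output_config["effort"] = EFFORT_STEP_DOWN.get(effort, effort)
    adjusted["output_config"] = output_config
    return adjusted


params = {"max_tokens": 16000, "output_config": {"effort": "high"}}
retry = adjust_after_truncation(params, "max_tokens")
print(retry["max_tokens"])  # 32000
```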

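Working with thinking blocks

The following concepts apply to all models that support extended thinking, whether you use adaptive or manual mode.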

Summarized thinking

With extended thinking enabled, the Messages API for Claude 4 models returns a summary of Claude's full thinking process. Summarized thinking provides the full intelligence benefits of extended thinking, while preventing misuse. This is the default behavior on Claude 4 models when the display field on the thinking configuration is unset or set to "summarized". On Claude Opus 4.7 and Claude Mythos Preview, display defaults to "omitted" instead, so you must set display: "summarized" explicitly to receive summarized thinking.

Here are some important considerations for summarized thinking:

  • You're charged for the full thinking tokens generated by the original request, not the summary tokens.
  • The billed output token count will not match the count of tokens you see in the response.
  • On Claude 4 models, the first few lines of thinking output are more verbose, providing detailed reasoning that's particularly helpful for prompt engineering purposes. Claude Mythos Preview summarizes from the first token, so its thinking blocks do not show this verbose preamble.
  • As Anthropic seeks to improve the extended thinking feature, summarization behavior is subject to change.
  • Summarization preserves the key ideas of Claude's thinking process with minimal added latency, enabling a streamable user experience and easy migration from Claude Sonnet 3.7 to Claude 4 and later models.
  • Summarization is processed by a different model than the one you target in your requests. The thinking model does not see the summarized output.

Claude Sonnet 3.7 continues to return full thinking output.

In rare cases where you need access to full thinking output for Claude 4 models, contact our sales team.

Controlling thinking display

The display field on the thinking configuration controls how thinking content is returned in API responses. It accepts two values:

  • "summarized": Thinking blocks contain summarized thinking text. See Summarized thinking for details. This is the default on Claude Opus 4.6, Claude Sonnet 4.6, and earlier Claude 4 models.
  • "omitted": Thinking blocks are returned with an empty thinking field. The signature field still carries the encrypted full thinking for multi-turn continuity (see Thinking encryption). This is the default on Claude Opus 4.7 and Claude Mythos Preview.

Setting display: "omitted" is useful when your application doesn't surface thinking content to users. The primary benefit is faster time-to-first-text-token when streaming: The server skips streaming thinking tokens entirely and delivers only the signature, so the final text response begins streaming sooner.

Here are some important considerations for omitted thinking:

  • You're still charged for the full thinking tokens. Omitting reduces latency, not cost.
  • If you pass thinking blocks back in multi-turn conversations, pass them unchanged. The server decrypts the signature to reconstruct the original thinking for prompt construction (see Preserving thinking blocks). Any text you place in the thinking field of a round-tripped omitted block is ignored.
  • display is invalid with thinking.type: "disabled" (there is nothing to display).
  • When using thinking.type: "adaptive" and the model skips thinking for a simple request, no thinking block is produced regardless of display.

The signature field is identical whether display is "summarized" or "omitted". Switching display values between turns in a conversation is supported.

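On Claude Opus 4.7, thinking.display defaults to "omitted". Thinking blocks still appear in the response stream, but their thinking field is empty unless you explicitly opt in. This is a silent change from Claude Opus 4.6, where the default is "summarized". To restore summarized thinking text on Claude Opus 4.7, set thinking.display to "summarized" explicitly: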

thinking = {
    "type": "adaptive",
    "display": "summarized",
}

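For code examples and streaming behavior with display: "omitted", see Controlling thinking display on the extended thinking page. The examples there use type: "enabled"; with adaptive thinking, use: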

thinking = {"type": "adaptive", "display": "omitted"}
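
Because the default for display differs by model, it can help to resolve the effective value in one place. The sketch below is illustrative and runs offline: effective_display is a hypothetical helper that encodes only the defaults stated in this section.

```python
# Models where this page says display defaults to "omitted".
OMITTED_BY_DEFAULT = {"claude-opus-4-7", "claude-mythos-preview"}


def effective_display(model, thinking):
    """Return the display value that applies, or None when thinking is disabled."""
    if thinking.get("type") == "disabled":
        if "display" in thinking:
            # This page states display is invalid with type "disabled".
            raise ValueError('display is invalid with thinking.type "disabled"')
        return None
    if "display" in thinking:
        return thinking["display"]
    return "omitted" if model in OMITTED_BY_DEFAULT else "summarized"


print(effective_display("claude-opus-4-7", {"type": "adaptive"}))
print(effective_display("claude-opus-4-6", {"type": "adaptive"}))
print(effective_display("claude-opus-4-7", {"type": "adaptive", "display": "summarized"}))
```

Setting display explicitly in every request removes the dependence on per-model defaults when you move between Opus 4.6 and Opus 4.7.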

Thinking encryption

Full thinking content is encrypted and returned in the signature field. This field is used to verify that thinking blocks were generated by Claude when passed back to the API.

It is only strictly necessary to send back thinking blocks when using tools with extended thinking. Otherwise you can omit thinking blocks from previous turns. If you pass them back, whether the API keeps or strips them depends on the model: Opus 4.5+ and Sonnet 4.6+ keep them in context by default; earlier Opus/Sonnet models and all Haiku models strip them. See context editing to configure this.

If sending back thinking blocks, we recommend passing everything back as you received it for consistency and to avoid potential issues.

Here are some important considerations on thinking encryption:

  • When streaming responses, the signature is added via a signature_delta inside a content_block_delta event just before the content_block_stop event.
  • signature values are significantly longer in Claude 4 models than in previous models.
  • The signature field is an opaque field and should not be interpreted or parsed.
  • signature values are compatible across platforms (Claude APIs, Amazon Bedrock, and Vertex AI). Values generated on one platform will be compatible with another.
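
When tools are involved, the safest way to follow the guidance above is to copy the assistant turn back exactly as received. The sketch below shows one way to do that with plain dictionaries. The helper continue_after_tool_use, the get_weather tool, the tool_use ID, and the signature string are all hypothetical; the tool_result shape follows the Messages API tool use format.

```python
import copy


def continue_after_tool_use(messages, assistant_content, tool_outputs):
    """Append the assistant turn unchanged, then the matching tool results.

    assistant_content: content blocks as received (dicts here for illustration).
    tool_outputs: hypothetical mapping of tool_use id to a result string.
    """
    results = [
        {
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": tool_outputs[block["id"]],
        }
        for block in assistant_content
        if block["type"] == "tool_use"
    ]
    return messages + [
        # Thinking blocks, including their signature, are passed back as-is.
        {"role": "assistant", "content": copy.deepcopy(assistant_content)},
        {"role": "user", "content": results},
    ]


history = [{"role": "user", "content": "What's the weather in Paris?"}]
received = [
    {"type": "thinking", "thinking": "", "signature": "opaque-signature"},
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
     "input": {"city": "Paris"}},
]
next_messages = continue_after_tool_use(history, received, {"toolu_01": "18 C, cloudy"})
print(len(next_messages))
```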

Pricing

For complete pricing information including base rates, cache writes, cache hits, and output tokens, see the pricing page.

The thinking process incurs charges for:

  • Tokens used during thinking (output tokens)
  • Thinking blocks from prior assistant turns kept in context: only the last turn on earlier Opus/Sonnet models and all Haiku models; all turns by default on Opus 4.5+ and Sonnet 4.6+ (input tokens)
  • Standard text output tokens

When extended thinking is enabled, a specialized system prompt is automatically included to support this feature.

When using summarized thinking:

  • Input tokens: Tokens in your original request (excludes thinking tokens from previous turns)
  • Output tokens (billed): The original thinking tokens that Claude generated internally
  • Output tokens (visible): The summarized thinking tokens you see in the response
  • No charge: Tokens used to generate the summary

When using display: "omitted":

  • Input tokens: Tokens in your original request (same as summarized)
  • Output tokens (billed): The original thinking tokens that Claude generated internally (same as summarized)
  • Output tokens (visible): Zero thinking tokens (the thinking field is empty)

The billed output token count will not match the visible token count in the response. You are billed for the full thinking process, not the thinking content visible in the response.
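
A short worked example makes the gap between billed and visible tokens concrete. The numbers below are invented for illustration and output_token_view is a hypothetical helper. It assumes billed output is the full thinking tokens plus the text tokens, as described above.

```python
def output_token_view(thinking_tokens, text_tokens, summary_tokens, display="summarized"):
    """Compare billed output tokens with the tokens visible in the response.

    thinking_tokens: full thinking generated internally (billed).
    summary_tokens: length of the summarized thinking you see (not billed).
    """
    billed = thinking_tokens + text_tokens
    visible_thinking = summary_tokens if display == "summarized" else 0
    return {"billed_output": billed, "visible_output": visible_thinking + text_tokens}


# Invented figures: 5,000 thinking tokens, 300 text tokens, 400 summary tokens.
print(output_token_view(5000, 300, 400, display="summarized"))
print(output_token_view(5000, 300, 400, display="omitted"))
```

In both cases the billed output is 5,300 tokens; only the visible count changes, from 700 with a summary to 300 when thinking is omitted.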

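Additional topics

The extended thinking page covers several topics in more detail, including mode-specific code examples:

  • Tool use with thinking: The same rules apply to adaptive thinking: preserve thinking blocks between tool calls, and be aware of the tool_choice limitations when thinking is active.
  • Prompt caching: With adaptive thinking, consecutive requests that use the same thinking mode preserve cache breakpoints. Switching between adaptive and enabled/disabled modes breaks cache breakpoints for messages (system prompts and tool definitions stay cached).
  • Context window: How thinking tokens interact with max_tokens and context window limits.

Next steps

Extended thinking

Learn more about extended thinking, including manual mode, tool use, and prompt caching.

Effort parameter

Use the effort parameter to control how thorough Claude's responses are.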
