This feature is eligible for Zero Data Retention (ZDR). When your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned.
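Adaptive thinking is the recommended way to use extended thinking on Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6, and it is the default mode on Claude Mythos Preview (applied automatically when thinking is unset). Instead of requiring you to set a thinking token budget manually, adaptive thinking lets Claude decide dynamically when and how much to think, based on the complexity of each request. On Claude Opus 4.7, adaptive thinking is the only supported thinking mode; manual thinking: {type: "enabled", budget_tokens: N} is no longer accepted.
For many workloads, especially bimodal tasks and long-horizon agentic workflows, adaptive thinking can perform better than extended thinking with a fixed budget_tokens. No beta header is required.
If your workload needs predictable latency or precise control over thinking costs, extended thinking with budget_tokens still works on Claude Opus 4.6 and Claude Sonnet 4.6, but it is deprecated and no longer recommended. See the warning below.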
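Adaptive thinking is supported on the following models:

- Claude Mythos Preview (claude-mythos-preview): adaptive thinking is the default; thinking: {type: "disabled"} is not supported.
- Claude Opus 4.7 (claude-opus-4-7): adaptive thinking is the only supported thinking mode. Thinking is off unless you explicitly set thinking: {type: "adaptive"} in the request; manual thinking: {type: "enabled"} is rejected with a 400 error.
- Claude Opus 4.6 (claude-opus-4-6)
- Claude Sonnet 4.6 (claude-sonnet-4-6)

thinking.type: "enabled" and budget_tokens are deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future model release. Use thinking.type: "adaptive" with the effort parameter instead. Existing budget_tokens configurations still work but are no longer recommended; plan to migrate.

Older models (Sonnet 4.5, Opus 4.5, and so on) do not support adaptive thinking and require thinking.type: "enabled" with budget_tokens.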
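
The support rules above can be condensed into a small lookup. The following is a minimal sketch, not part of the SDK: thinking_config_for and ADAPTIVE_MODELS are hypothetical names, the model IDs are the ones listed above, and the fallback budget of 8000 tokens is an arbitrary placeholder.

```python
# Hypothetical helper (not part of the anthropic SDK): choose a `thinking`
# request parameter based on the support rules described above.

# Model IDs listed above as supporting adaptive thinking.
ADAPTIVE_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}


def thinking_config_for(model: str, fallback_budget: int = 8000) -> dict:
    """Return a `thinking` dict suitable for the given model ID."""
    if model in ADAPTIVE_MODELS:
        return {"type": "adaptive"}
    # Older models need manual mode with an explicit token budget.
    # The default budget here is an arbitrary placeholder, not a recommendation.
    return {"type": "enabled", "budget_tokens": fallback_budget}


print(thinking_config_for("claude-opus-4-7"))
print(thinking_config_for("an-older-model-id", fallback_budget=2048))
```

The returned dict can be passed as the thinking argument of client.messages.create. Keeping the rule in one place makes the eventual migration away from budget_tokens a one-line change.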
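In adaptive mode, thinking is optional for the model. Claude evaluates the complexity of each request and decides whether and how much to think. At the default effort level (high), Claude almost always thinks. At lower effort levels, Claude may skip thinking for simpler problems.
Adaptive thinking also enables interleaved thinking automatically. Claude can therefore think between tool calls, which makes it particularly effective for agentic workflows.
Set thinking.type to "adaptive" in your API request: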
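
    import anthropic

    client = anthropic.Anthropic()

    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=16000,
        thinking={"type": "adaptive"},
        messages=[
            {
                "role": "user",
                "content": "Explain why the sum of two even numbers is always even.",
            }
        ],
    )

    # Print each content block. On Claude Opus 4.7, block.thinking is empty
    # unless thinking.display is set to "summarized".
    for block in response.content:
        if block.type == "thinking":
            print(f"\nThinking: {block.thinking}")
        elif block.type == "text":
            print(f"\nResponse: {block.text}")

You can combine adaptive thinking with the effort parameter to guide how much Claude thinks. The effort level acts as soft guidance for how Claude allocates thinking: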
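| Effort level | Thinking behavior |
|---|---|
| max | Claude always thinks, with no constraint on thinking depth. Available on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. |
| xhigh | Claude always thinks deeply, with extended exploration. Available on Claude Opus 4.7. |
| high (default) | Claude always thinks. Provides deep reasoning on complex tasks. |
| medium | Claude uses moderate thinking. May skip thinking for very simple queries. |
| low | Claude minimizes thinking. Skips thinking for simple tasks where speed matters most. |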
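
    import anthropic

    client = anthropic.Anthropic()

    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=16000,
        thinking={"type": "adaptive"},
        output_config={"effort": "medium"},
        messages=[{"role": "user", "content": "What is the capital of France?"}],
    )

    # Print only text blocks; the first block may be a thinking block.
    for block in response.content:
        if block.type == "text":
            print(block.text)

Adaptive thinking works seamlessly with streaming. Thinking blocks stream through thinking_delta events, just as in manual thinking mode: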

    import anthropic

    client = anthropic.Anthropic()

    with client.messages.stream(
        model="claude-opus-4-7",
        max_tokens=16000,
        thinking={"type": "adaptive"},
        messages=[
            {
                "role": "user",
                "content": "What is the greatest common divisor of 1071 and 462?",
            }
        ],
    ) as stream:
        for event in stream:
            if event.type == "content_block_start":
                # A new thinking or text block is beginning.
                print(f"\nStarting {event.content_block.type} block...")
            elif event.type == "content_block_delta":
                if event.delta.type == "thinking_delta":
                    print(event.delta.thinking, end="", flush=True)
                elif event.delta.type == "text_delta":
                    print(event.delta.text, end="", flush=True)

| Mode | Configuration | Availability | When to use |
|---|---|---|---|
| Adaptive | thinking: {type: "adaptive"} | Claude Mythos Preview (default), Opus 4.7 (only mode), Opus 4.6, Sonnet 4.6 | Claude decides when and how much to think. Use effort to guide it. |
| Manual | thinking: {type: "enabled", budget_tokens: N} | All models except Claude Opus 4.7, which rejects it. Deprecated on Opus 4.6 and Sonnet 4.6 (consider adaptive mode instead). | When you need precise control over thinking token spend. |
| Disabled | Omit the thinking parameter or pass {type: "disabled"} | All models except Claude Mythos Preview | When you don't need extended thinking and want the lowest latency. |
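Adaptive thinking is available on Claude Mythos Preview, Claude Opus 4.7, Opus 4.6, and Sonnet 4.6. On Mythos Preview, it is the default and applies automatically when thinking is unset. On Claude Opus 4.7, it is the only supported mode, and type: "enabled" with budget_tokens is rejected. On Opus 4.6 and Sonnet 4.6, type: "enabled" with budget_tokens still works but is deprecated. Older models support only type: "enabled" with budget_tokens.
Interleaved thinking availability by mode:

- […] works with the interleaved-thinking-2025-05-14 beta header.

With adaptive thinking, previous assistant turns do not need to start with a thinking block. This is more flexible than manual mode, where the API enforces that thinking-enabled turns begin with a thinking block.
Consecutive requests that use adaptive thinking preserve prompt cache breakpoints. However, switching between adaptive and enabled/disabled thinking modes breaks cache breakpoints for messages. System prompts and tool definitions stay cached regardless of mode changes.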
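
One way to catch accidental mode switches is to audit the thinking.type values a conversation sends. The following is a minimal sketch that assumes you log those values yourself; first_cache_breaking_switch is a hypothetical helper, and it flags only the adaptive versus enabled/disabled switch described above, not other causes of cache invalidation.

```python
from typing import Optional, Sequence


def first_cache_breaking_switch(modes: Sequence[str]) -> Optional[int]:
    """Return the index of the first request that crosses the adaptive boundary.

    `modes` holds the thinking.type sent on each consecutive request
    ("adaptive", "enabled", or "disabled"). Returns None when no request
    switches between adaptive and a non-adaptive mode.
    """
    for index in range(1, len(modes)):
        previous_adaptive = modes[index - 1] == "adaptive"
        current_adaptive = modes[index] == "adaptive"
        if previous_adaptive != current_adaptive:
            return index
    return None


print(first_cache_breaking_switch(["adaptive", "adaptive", "enabled"]))  # 2
print(first_cache_breaking_switch(["adaptive", "adaptive"]))  # None
```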
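Adaptive thinking's triggering behavior is promptable. If Claude thinks more or less often than you want, add guidance to your system prompt:

    Extended thinking adds latency and should only be used when it
    will meaningfully improve answer quality, typically for problems
    that require multi-step reasoning. When in doubt, respond directly.

Steering Claude to think less may reduce quality on tasks that benefit from reasoning. Measure the impact on your specific workload before deploying prompt-based tuning to production. Consider testing lower effort levels first.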
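
To show where such guidance goes in a request, here is a minimal sketch that assembles the keyword arguments for client.messages.create. build_request and THINKING_GUIDANCE are hypothetical names introduced for illustration; the system, thinking, and output_config parameters are the ones used elsewhere on this page.

```python
# Steering text from the example above, passed via the `system` parameter.
THINKING_GUIDANCE = (
    "Extended thinking adds latency and should only be used when it "
    "will meaningfully improve answer quality, typically for problems "
    "that require multi-step reasoning. When in doubt, respond directly."
)


def build_request(user_prompt: str, effort: str = "high") -> dict:
    """Assemble keyword arguments for client.messages.create (hypothetical helper)."""
    return {
        "model": "claude-opus-4-7",
        "max_tokens": 16000,
        "thinking": {"type": "adaptive"},
        "output_config": {"effort": effort},
        "system": THINKING_GUIDANCE,
        "messages": [{"role": "user", "content": user_prompt}],
    }


request = build_request("What is the capital of France?", effort="medium")
print(request["output_config"])

# With the anthropic SDK installed and an API key configured:
#   import anthropic
#   response = anthropic.Anthropic().messages.create(**request)
```

Building the request as a plain dict makes it easy to run the same prompt with and without the guidance, or at different effort levels, and compare the results on your own workload.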
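Use max_tokens as a hard limit on total output (thinking plus response text). The effort parameter adds soft guidance on how much thinking Claude allocates. Together, they give you effective control over cost.
At the high and max effort levels, Claude may think more extensively and is more likely to exhaust the max_tokens budget. If you see stop_reason: "max_tokens" in responses, consider raising max_tokens to give the model more room, or lowering the effort level.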
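
The advice to raise max_tokens can be automated with a small retry loop. This is a sketch under stated assumptions rather than an SDK feature: create_with_headroom is a hypothetical helper, the doubling strategy and the 64000-token ceiling are arbitrary choices, and the send callable stands in for your own wrapper around client.messages.create. A fake send function is used here so the example runs without network access.

```python
from types import SimpleNamespace
from typing import Any, Callable


def create_with_headroom(
    send: Callable[[int], Any], max_tokens: int = 16000, ceiling: int = 64000
) -> Any:
    """Call send(max_tokens), doubling the limit while output is truncated.

    Stops when the response did not hit the limit or the ceiling is reached.
    Each retry is a new request, so truncated attempts still cost tokens.
    """
    while True:
        response = send(max_tokens)
        if response.stop_reason != "max_tokens" or max_tokens >= ceiling:
            return response
        max_tokens = min(max_tokens * 2, ceiling)


# Fake transport for illustration: it "truncates" below 32000 tokens.
calls = []


def fake_send(max_tokens: int) -> SimpleNamespace:
    calls.append(max_tokens)
    reason = "end_turn" if max_tokens >= 32000 else "max_tokens"
    return SimpleNamespace(stop_reason=reason)


result = create_with_headroom(fake_send, max_tokens=8000)
print(calls, result.stop_reason)  # [8000, 16000, 32000] end_turn
```

With the real client, send could be a lambda such as lambda n: client.messages.create(model="claude-opus-4-7", max_tokens=n, thinking={"type": "adaptive"}, messages=messages). Lowering the effort level, as noted above, is the alternative when retries are too costly.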
The following concepts apply to all models that support extended thinking, whether you use adaptive or manual mode.
With extended thinking enabled, the Messages API for Claude 4 models returns a summary of Claude's full thinking process. Summarized thinking provides the full intelligence benefits of extended thinking, while preventing misuse. This is the default behavior on Claude 4 models when the display field on the thinking configuration is unset or set to "summarized". On Claude Opus 4.7 and Claude Mythos Preview, display defaults to "omitted" instead, so you must set display: "summarized" explicitly to receive summarized thinking.
Here are some important considerations for summarized thinking:
Claude Sonnet 3.7 continues to return full thinking output.
In rare cases where you need access to full thinking output for Claude 4 models, contact our sales team.
The display field on the thinking configuration controls how thinking content is returned in API responses. It accepts two values:

- "summarized": Thinking blocks contain summarized thinking text. See Summarized thinking for details. This is the default on Claude Opus 4.6, Claude Sonnet 4.6, and earlier Claude 4 models.
- "omitted": Thinking blocks are returned with an empty thinking field. The signature field still carries the encrypted full thinking for multi-turn continuity (see Thinking encryption). This is the default on Claude Opus 4.7 and Claude Mythos Preview.

Setting display: "omitted" is useful when your application doesn't surface thinking content to users. The primary benefit is faster time-to-first-text-token when streaming: the server skips streaming thinking tokens entirely and delivers only the signature, so the final text response begins streaming sooner.
Here are some important considerations for omitted thinking:

- The server uses the signature to reconstruct the original thinking for prompt construction (see Preserving thinking blocks). Any text you place in the thinking field of a round-tripped omitted block is ignored.
- display is invalid with thinking.type: "disabled" (there is nothing to display).
- When using thinking.type: "adaptive" and the model skips thinking for a simple request, no thinking block is produced regardless of display.

The signature field is identical whether display is "summarized" or "omitted". Switching display values between turns in a conversation is supported.
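
Because a thinking block can be empty (display: "omitted") or absent (adaptive thinking skipped), response handling should not assume a fixed block order. The following is a minimal sketch; split_blocks is a hypothetical helper, and the SimpleNamespace objects are stand-ins for SDK content blocks, with a placeholder in place of a real signature.

```python
from types import SimpleNamespace


def split_blocks(content):
    """Return (thinking_text, response_text) from a list of content blocks."""
    thinking_parts, text_parts = [], []
    for block in content:
        if block.type == "thinking":
            # Empty when display is "omitted"; the block may also be absent
            # entirely when adaptive thinking skips a simple request.
            if block.thinking:
                thinking_parts.append(block.thinking)
        elif block.type == "text":
            text_parts.append(block.text)
    return "\n".join(thinking_parts), "".join(text_parts)


# Stand-ins for SDK content blocks, for illustration only.
omitted = [
    SimpleNamespace(type="thinking", thinking="", signature="<opaque>"),
    SimpleNamespace(type="text", text="Paris."),
]
skipped = [SimpleNamespace(type="text", text="Paris.")]

print(split_blocks(omitted))  # ('', 'Paris.')
print(split_blocks(skipped))  # ('', 'Paris.')
```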
On Claude Opus 4.7, thinking.display defaults to "omitted". Thinking blocks still appear in the response stream, but their thinking field is empty unless you explicitly opt in. This is a silent change from Claude Opus 4.6, where the default was "summarized". To restore summarized thinking text on Claude Opus 4.7, set thinking.display to "summarized" explicitly:

    thinking = {
        "type": "adaptive",
        "display": "summarized",
    }

For code examples and streaming behavior with display: "omitted", see Controlling thinking display on the extended thinking page. The examples there use type: "enabled"; with adaptive thinking, use:

    thinking = {"type": "adaptive", "display": "omitted"}

Full thinking content is encrypted and returned in the signature field. This field is used to verify that thinking blocks were generated by Claude when passed back to the API.
It is only strictly necessary to send back thinking blocks when using tools with extended thinking. Otherwise you can omit thinking blocks from previous turns. If you pass them back, whether the API keeps or strips them depends on the model: Opus 4.5+ and Sonnet 4.6+ keep them in context by default; earlier Opus/Sonnet models and all Haiku models strip them. See context editing to configure this.
If sending back thinking blocks, we recommend passing everything back as you received it for consistency and to avoid potential issues.
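
As an illustration of passing everything back as received during tool use, the sketch below appends the assistant turn unchanged and then the tool results. append_tool_turn is a hypothetical helper, and the tool name, IDs, signature placeholder, and tool output are made-up values; real ones come from the API response.

```python
def append_tool_turn(messages, assistant_content, tool_results):
    """Return a new message list that continues a tool-use conversation.

    assistant_content is passed back exactly as received, so thinking blocks
    and their signatures reach the API unmodified.
    """
    return messages + [
        {"role": "assistant", "content": assistant_content},
        {"role": "user", "content": tool_results},
    ]


# Placeholder blocks for illustration; real values come from the API response.
assistant_content = [
    {"type": "thinking", "thinking": "", "signature": "<opaque signature>"},
    {
        "type": "tool_use",
        "id": "toolu_example",
        "name": "get_weather",
        "input": {"city": "Paris"},
    },
]
tool_results = [
    {"type": "tool_result", "tool_use_id": "toolu_example", "content": "18°C, clear"},
]

history = [{"role": "user", "content": "What's the weather in Paris?"}]
history = append_tool_turn(history, assistant_content, tool_results)
print([message["role"] for message in history])  # ['user', 'assistant', 'user']
```

The helper never inspects or edits the blocks, which matches the guidance that the signature is opaque and should be returned untouched.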
Here are some important considerations on thinking encryption:

- When streaming, the signature is delivered as a signature_delta inside a content_block_delta event just before the content_block_stop event.
- signature values are significantly longer in Claude 4 models than in previous models.
- The signature field is an opaque field and should not be interpreted or parsed.
- signature values are compatible across platforms (Claude APIs, Amazon Bedrock, and Vertex AI). Values generated on one platform will be compatible with another.

For complete pricing information including base rates, cache writes, cache hits, and output tokens, see the pricing page.
The thinking process incurs charges for:
When extended thinking is enabled, a specialized system prompt is automatically included to support this feature.
When using summarized thinking:
When using display: "omitted":

- The response contains no visible thinking text (the thinking field is empty)

The billed output token count will not match the visible token count in the response. You are billed for the full thinking process, not the thinking content visible in the response.
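The extended thinking page covers several topics in more detail, including mode-specific code examples:

- Limitations on tool_choice.
- Switching between adaptive and enabled/disabled modes breaks cache breakpoints for messages (system prompts and tool definitions stay cached).
- Interaction with max_tokens and context window limits.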