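Adaptive thinking

Use adaptive thinking mode to let Claude dynamically decide when and how much to think.

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.6. Instead of manually setting a thinking token budget, adaptive thinking lets Claude dynamically decide when and how much to think based on the complexity of each request.

Adaptive thinking reliably outperforms extended thinking with a fixed budget_tokens. To get the most intelligent responses from Opus 4.6, we recommend moving to adaptive thinking. No beta header is required.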

Supported models

Adaptive thinking is supported on the following models:

Claude Opus 4.6 (claude-opus-4-6)

thinking.type: "enabled" and budget_tokens are deprecated on Opus 4.6 and will be removed in a future model release. Use thinking.type: "adaptive" with the effort parameter instead.

Older models (Sonnet 4.5, Opus 4.5, etc.) do not support adaptive thinking and require thinking.type: "enabled" with budget_tokens.
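
If one code path serves several models, the split described above can be captured in a small helper. The following is a minimal sketch, not part of the SDK: the function name, the default budget of 10000 tokens, and the `claude-sonnet-4-5` model ID used in the demo are illustrative assumptions.

```python
# Hypothetical helper: choose a `thinking` config based on the model ID.
# The set below contains only the model this page lists as supported.
ADAPTIVE_MODELS = {"claude-opus-4-6"}


def thinking_config(model: str, budget_tokens: int = 10000) -> dict:
    """Return a `thinking` request parameter suited to `model`."""
    if model in ADAPTIVE_MODELS:
        # Adaptive mode takes no budget; Claude decides how much to think.
        return {"type": "adaptive"}
    # Older models need manual mode with an explicit token budget.
    # The default budget here is an arbitrary illustrative value.
    return {"type": "enabled", "budget_tokens": budget_tokens}


print(thinking_config("claude-opus-4-6"))
print(thinking_config("claude-sonnet-4-5", budget_tokens=8000))
```

The returned dictionary can be passed as the `thinking` argument of a request.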

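How adaptive thinking works

In adaptive mode, thinking is optional for the model. Claude evaluates the complexity of each request and decides whether and how much to think. At the default effort level (high), Claude almost always thinks. At lower effort levels, Claude may skip thinking for simpler problems.

Adaptive thinking also automatically enables interleaved thinking. This lets Claude think between tool calls, which makes it especially effective for agentic workflows.

How to use adaptive thinking

Set thinking.type to "adaptive" in your API request: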

curl https://api.anthropic.com/v1/messages \
     --header "x-api-key: $ANTHROPIC_API_KEY" \
     --header "anthropic-version: 2023-06-01" \
     --header "content-type: application/json" \
     --data \
'{
    "model": "claude-opus-4-6",
    "max_tokens": 16000,
    "thinking": {
        "type": "adaptive"
    },
    "messages": [
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
}'

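Adaptive thinking with the effort parameter

You can combine adaptive thinking with the effort parameter to guide how much Claude thinks. The effort level acts as soft guidance for Claude's thinking allocation:

Effort level	Thinking behavior
`max`	Claude always thinks, with no constraint on thinking depth. Opus 4.6 only; requests that use `max` on other models return an error.
`high` (default)	Claude always thinks. Provides deep reasoning on complex tasks.
`medium`	Claude thinks moderately. May skip thinking for very simple queries.
`low`	Claude minimizes thinking. Skips thinking for simple tasks where speed matters most.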

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    thinking={
        "type": "adaptive"
    },
    output_config={
        "effort": "medium"
    },
    messages=[{
        "role": "user",
        "content": "What is the capital of France?"
    }]
)

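# With adaptive thinking, the first content block may be a thinking block,
# so print only the text blocks instead of assuming content[0] is text.
for block in response.content:
    if block.type == "text":
        print(block.text)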
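
To catch a mistyped effort level before a request is sent, the request arguments can be built by a small wrapper. This is a sketch under stated assumptions: `adaptive_request` is a hypothetical helper, not an SDK function, and it hard-codes the model this page documents.

```python
# Effort levels listed in the table above.
EFFORT_LEVELS = ("low", "medium", "high", "max")


def adaptive_request(prompt: str, effort: str = "high", max_tokens: int = 16000) -> dict:
    """Build keyword arguments for an adaptive-thinking request (hypothetical helper)."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown effort level: {effort!r}")
    return {
        "model": "claude-opus-4-6",
        "max_tokens": max_tokens,
        "thinking": {"type": "adaptive"},
        "output_config": {"effort": effort},
        "messages": [{"role": "user", "content": prompt}],
    }


request = adaptive_request("What is the capital of France?", effort="low")
print(request["output_config"])
# To send it: anthropic.Anthropic().messages.create(**request)
```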

Streaming with adaptive thinking

Adaptive thinking works seamlessly with streaming. Thinking blocks are streamed through thinking_delta events, just as in manual thinking mode:

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-opus-4-6",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[{"role": "user", "content": "What is the greatest common divisor of 1071 and 462?"}],
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            print(f"\nStarting {event.content_block.type} block...")
        elif event.type == "content_block_delta":
            if event.delta.type == "thinking_delta":
                print(event.delta.thinking, end="", flush=True)
            elif event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)

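Adaptive vs. manual vs. disabled thinking

Mode	Configuration	Availability	When to use
Adaptive	`thinking: {type: "adaptive"}`	Opus 4.6	Claude decides when and how much to think. Guide it with `effort`.
Manual	`thinking: {type: "enabled", budget_tokens: N}`	All models. Deprecated on Opus 4.6; use adaptive mode instead.	When you need precise control over thinking token usage.
Disabled	Omit the `thinking` parameter	All models	When you don't need extended thinking and want the lowest latency.

Adaptive thinking is currently available on Opus 4.6. Older models support only type: "enabled" with budget_tokens. On Opus 4.6, type: "enabled" with budget_tokens is still accepted but deprecated; we recommend using adaptive thinking with the effort parameter instead.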

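Important considerations

Validation changes

When using adaptive thinking, previous assistant turns do not need to start with a thinking block. This is more flexible than manual mode, where the API enforces that thinking-enabled turns begin with a thinking block.

Prompt caching

Consecutive requests that use adaptive thinking preserve prompt cache breakpoints. However, switching between adaptive and enabled/disabled thinking modes breaks cache breakpoints for messages. System prompts and tool definitions remain cached regardless of mode changes.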
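
The caching rule above can be turned into a quick client-side check before a mode switch is rolled out. This is a minimal sketch with hypothetical function names; it encodes only the adaptive versus enabled/disabled rule stated on this page and says nothing about other transitions.

```python
def thinking_mode(thinking):
    """Normalize a `thinking` parameter to 'adaptive', 'enabled', or 'disabled'."""
    if not thinking:
        return "disabled"  # the parameter was omitted
    return thinking["type"]


def breaks_message_cache(previous, current):
    """True if the two requests switch between adaptive and enabled/disabled."""
    modes = {thinking_mode(previous), thinking_mode(current)}
    return "adaptive" in modes and len(modes) > 1


print(breaks_message_cache({"type": "adaptive"}, {"type": "adaptive"}))
print(breaks_message_cache({"type": "adaptive"}, {"type": "enabled", "budget_tokens": 4000}))
```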

Tuning thinking behavior

The triggering behavior of adaptive thinking is promptable. If Claude thinks more or less often than you would like, you can add guidance to your system prompt:

Extended thinking adds latency and should only be used when it
will meaningfully improve answer quality — typically for problems
that require multi-step reasoning. When in doubt, respond directly.

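Steering Claude to think less often may reduce quality on tasks that benefit from reasoning. Measure the impact on your specific workloads before deploying prompt-based tuning to production. Consider testing with lower effort levels first.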
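
One way to apply such guidance is to append it to an existing system prompt through the Messages API `system` parameter. The sketch below is illustrative only: the helper name, the base prompt, and the user question are made up for the example, and the guidance text is lightly adapted from the sample above.

```python
# Guidance text adapted from the sample system prompt above.
THINKING_GUIDANCE = (
    "Extended thinking adds latency and should only be used when it "
    "will meaningfully improve answer quality, typically for problems "
    "that require multi-step reasoning. When in doubt, respond directly."
)


def system_prompt_with_guidance(base_prompt: str = "") -> str:
    """Append the thinking guidance to an existing system prompt."""
    parts = [p for p in (base_prompt.strip(), THINKING_GUIDANCE) if p]
    return "\n\n".join(parts)


request = {
    "model": "claude-opus-4-6",
    "max_tokens": 16000,
    "thinking": {"type": "adaptive"},
    "system": system_prompt_with_guidance("You are a concise support assistant."),
    "messages": [{"role": "user", "content": "What are your opening hours?"}],
}
print(request["system"])
```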

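Cost control

Use max_tokens as a hard limit on total output (thinking plus response text). The effort parameter provides additional soft guidance on how much thinking Claude allocates. Together, these give you effective control over cost.

At high and max effort levels, Claude may think more extensively and is more likely to exhaust the max_tokens budget. If you observe stop_reason: "max_tokens" in responses, consider increasing max_tokens to give the model more room, or lowering the effort level.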
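
The two remedies above can be combined into a simple retry policy. The following is a sketch of one possible policy, not a recommendation from the API: the doubling step and the `token_ceiling` of 64000 are arbitrary assumptions chosen for illustration, and `adjust_after_truncation` is a hypothetical helper.

```python
EFFORT_ORDER = ["low", "medium", "high", "max"]


def adjust_after_truncation(stop_reason, max_tokens, effort, token_ceiling=64000):
    """Suggest (max_tokens, effort) for a retry after a truncated response."""
    if stop_reason != "max_tokens":
        return max_tokens, effort  # the response was not truncated
    if max_tokens * 2 <= token_ceiling:
        return max_tokens * 2, effort  # first, give the model more room
    # Already near the ceiling: step the effort level down if possible.
    index = EFFORT_ORDER.index(effort)
    return max_tokens, EFFORT_ORDER[max(index - 1, 0)]


print(adjust_after_truncation("max_tokens", 16000, "high"))
print(adjust_after_truncation("max_tokens", 64000, "high"))
```

Raising max_tokens first preserves answer quality; lowering effort is the fallback once the budget cannot grow further.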

Working with thinking blocks

The following concepts apply to all models that support extended thinking, regardless of whether you use adaptive or manual mode.

Summarized thinking

With extended thinking enabled, the Messages API for Claude 4 models returns a summary of Claude's full thinking process. Summarized thinking provides the full intelligence benefits of extended thinking, while preventing misuse.

Here are some important considerations for summarized thinking:

You're charged for the full thinking tokens generated by the original request, not the summary tokens.
The billed output token count will not match the count of tokens you see in the response.
The first few lines of thinking output are more verbose, providing detailed reasoning that's particularly helpful for prompt engineering purposes.
As Anthropic seeks to improve the extended thinking feature, summarization behavior is subject to change.
Summarization preserves the key ideas of Claude's thinking process with minimal added latency, enabling a streamable user experience and easy migration from Claude Sonnet 3.7 to Claude 4 and later models.
Summarization is processed by a different model than the one you target in your requests. The thinking model does not see the summarized output.

Claude Sonnet 3.7 continues to return full thinking output.

In rare cases where you need access to full thinking output for Claude 4 models, contact our sales team.

Thinking encryption

Full thinking content is encrypted and returned in the signature field. This field is used to verify that thinking blocks were generated by Claude when passed back to the API.

It is only strictly necessary to send back thinking blocks when using tools with extended thinking. Otherwise you can omit thinking blocks from previous turns, or let the API strip them for you if you pass them back.

If sending back thinking blocks, we recommend passing everything back as you received it for consistency and to avoid potential issues.

Here are some important considerations on thinking encryption:

When streaming responses, the signature is added via a signature_delta inside a content_block_delta event just before the content_block_stop event.
signature values are significantly longer in Claude 4 models than in previous models.
The signature field is an opaque field and should not be interpreted or parsed - it exists solely for verification purposes.
signature values are compatible across platforms (Claude APIs, Amazon Bedrock, and Vertex AI). Values generated on one platform will be compatible with another.
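
When handling raw streaming events yourself, the thinking text and its signature have to be reassembled into a single block before it can be passed back. The sketch below assumes events are plain dictionaries shaped like the streaming JSON payloads; `assemble_thinking_block` is a hypothetical helper and the signature value in the demo is a made-up placeholder.

```python
def assemble_thinking_block(events):
    """Rebuild one thinking block from raw streaming event dictionaries."""
    block = {"type": "thinking", "thinking": "", "signature": ""}
    for event in events:
        if event["type"] != "content_block_delta":
            continue
        delta = event["delta"]
        if delta["type"] == "thinking_delta":
            block["thinking"] += delta["thinking"]
        elif delta["type"] == "signature_delta":
            # The signature is opaque: store it as received, never parse it.
            block["signature"] += delta["signature"]
    return block


demo_events = [
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "thinking_delta", "thinking": "Let me analyze "}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "thinking_delta", "thinking": "this step by step..."}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "signature_delta", "signature": "PLACEHOLDER_SIGNATURE"}},
    {"type": "content_block_stop", "index": 0},
]
print(assemble_thinking_block(demo_events))
```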

Thinking redaction

Occasionally Claude's internal reasoning will be flagged by our safety systems. When this occurs, we encrypt some or all of the thinking block and return it to you as a redacted_thinking block. redacted_thinking blocks are decrypted when passed back to the API, allowing Claude to continue its response without losing context.

When building customer-facing applications that use extended thinking:

Be aware that redacted thinking blocks contain encrypted content that isn't human-readable
Consider providing a simple explanation like: "Some of Claude's internal reasoning has been automatically encrypted for safety reasons. This doesn't affect the quality of responses."
If showing thinking blocks to users, you can filter out redacted blocks while preserving normal thinking blocks
Be transparent that using extended thinking features may occasionally result in some reasoning being encrypted
Implement appropriate error handling to gracefully manage redacted thinking without breaking your UI

Here's an example showing both normal and redacted thinking blocks:

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "redacted_thinking",
      "data": "EmwKAhgBEgy3va3pzix/LafPsn4aDFIT2Xlxh0L5L8rLVyIwxtE3rAFBa8cr3qpPkNRj2YfWXGmKDxH4mPnZ5sQ7vB9URj2pLmN3kF8/dW5hR7xJ0aP1oLs9yTcMnKVf2wRpEGjH9XZaBt4UvDcPrQ..."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}

Seeing redacted thinking blocks in your output is expected behavior. The model can still use this redacted reasoning to inform its responses while maintaining safety guardrails.
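
For display purposes, redacted blocks can be filtered out while the full content list is kept untouched for later requests. This is a minimal sketch that assumes content blocks are plain dictionaries like the JSON example above; the helper names are hypothetical and the signature and data strings are shortened placeholders.

```python
REDACTION_NOTICE = (
    "Some of Claude's internal reasoning has been automatically "
    "encrypted for safety reasons. This doesn't affect the quality of responses."
)


def displayable_thinking(content):
    """Return readable thinking text, skipping redacted blocks."""
    return [block["thinking"] for block in content if block["type"] == "thinking"]


def has_redacted_thinking(content):
    """True if any block in the response is a redacted_thinking block."""
    return any(block["type"] == "redacted_thinking" for block in content)


content = [
    {"type": "thinking", "thinking": "Let me analyze this step by step...",
     "signature": "PLACEHOLDER_SIGNATURE"},
    {"type": "redacted_thinking", "data": "PLACEHOLDER_ENCRYPTED_DATA"},
    {"type": "text", "text": "Based on my analysis..."},
]

for text in displayable_thinking(content):
    print(text)
if has_redacted_thinking(content):
    print(REDACTION_NOTICE)
```

Filtering applies only to what is shown to users; the original `content` list still contains the redacted block.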

If you need to test redacted thinking handling in your application, you can use this special test string as your prompt: ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING_46C9A13E193C177646C7398A98432ECCCE4C1253D5E2D82641AC0E52CC2876CB

When passing thinking and redacted_thinking blocks back to the API in a multi-turn conversation, you must include the complete unmodified block back to the API for the last assistant turn. This is critical for maintaining the model's reasoning flow. We suggest always passing back all thinking blocks to the API. For more details, see the Preserving thinking blocks section.
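
A simple way to satisfy this requirement is to append the assistant content exactly as received when building the next request. The sketch below assumes the response content is available as a list of plain dictionaries; `append_assistant_turn` is a hypothetical helper and the block values are placeholders.

```python
import copy


def append_assistant_turn(messages, response_content, next_user_text):
    """Return a new message list with the assistant turn passed back unmodified."""
    return messages + [
        # Deep-copy so later edits to the history cannot alter the blocks.
        {"role": "assistant", "content": copy.deepcopy(response_content)},
        {"role": "user", "content": next_user_text},
    ]


history = [{"role": "user", "content": "Is 91 prime?"}]
response_content = [
    {"type": "thinking", "thinking": "91 = 7 * 13...", "signature": "PLACEHOLDER_SIGNATURE"},
    {"type": "redacted_thinking", "data": "PLACEHOLDER_ENCRYPTED_DATA"},
    {"type": "text", "text": "No, 91 = 7 x 13."},
]
next_messages = append_assistant_turn(history, response_content, "What about 97?")
print([block["type"] for block in next_messages[1]["content"]])
```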

Pricing

For complete pricing information including base rates, cache writes, cache hits, and output tokens, see the pricing page.

The thinking process incurs charges for:

Tokens used during thinking (output tokens)
Thinking blocks from the last assistant turn included in subsequent requests (input tokens)
Standard text output tokens

When extended thinking is enabled, a specialized system prompt is automatically included to support this feature.

When using summarized thinking:

Input tokens: Tokens in your original request (excludes thinking tokens from previous turns)
Output tokens (billed): The original thinking tokens that Claude generated internally
Output tokens (visible): The summarized thinking tokens you see in the response
No charge: Tokens used to generate the summary

The billed output token count will not match the visible token count in the response. You are billed for the full thinking process, not the summary you see.

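Additional topics

The extended thinking page covers several topics in more detail, including mode-specific code examples:

Tool use with thinking: The same rules apply to adaptive thinking. Preserve thinking blocks between tool calls, and be aware of the tool_choice limitations when thinking is active.
Prompt caching: With adaptive thinking, consecutive requests that use the same thinking mode preserve cache breakpoints. Switching between adaptive and enabled/disabled modes breaks cache breakpoints for messages (system prompts and tool definitions remain cached).
Context windows: How thinking tokens interact with max_tokens and context window limits.

Next steps

Extended thinking

Learn more about extended thinking, including manual mode, tool use, and prompt caching.

Effort parameter

Control how thorough Claude's responses are with the effort parameter.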
