Claude 4.6 represents the next generation of Claude models, bringing significant new capabilities and API improvements. This page summarizes all new features available at launch.
| Model | API model ID | Description |
|---|---|---|
| Claude Opus 4.6 | claude-opus-4-6 | Our most intelligent model for building agents and coding |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | Our best combination of speed and intelligence |
Claude Opus 4.6 supports a 200K context window (with 1M token context window available in beta), 128K max output tokens, extended thinking, and all existing Claude API features.
Claude Sonnet 4.6 supports a 200K context window (with 1M token context window available in beta), 64K max output tokens, extended thinking, and adaptive thinking.
For complete pricing and specs, see the models overview.
Adaptive thinking (thinking: {type: "adaptive"}) is the recommended thinking mode for Opus 4.6 and Sonnet 4.6. Claude dynamically decides when and how much to think. At the default effort level (high), Claude will almost always think. At lower effort levels, it may skip thinking for simpler problems.
thinking: {type: "enabled"} and budget_tokens are deprecated on Opus 4.6 and Sonnet 4.6. They remain functional but will be removed in a future model release. Use adaptive thinking and the effort parameter to control thinking depth instead. Adaptive thinking also automatically enables interleaved thinking.
```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[{"role": "user", "content": "Solve this complex problem..."}],
)
```

The effort parameter is now generally available (no beta header required). A new max effort level provides the absolute highest capability on Opus 4.6. Combine effort with adaptive thinking for optimal cost-quality tradeoffs.
Sonnet 4.6 introduces the effort parameter to the Sonnet family. We recommend setting effort to medium for most Sonnet 4.6 use cases to balance speed, cost, and performance.
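As a hedged sketch of the recommendation above, the payload below sets effort to medium on Sonnet 4.6. The prompt and max_tokens values are placeholders, and effort as a top-level request field is an assumption based on how the parameter is described here.

```python
# Illustrative request payload for Sonnet 4.6 with effort set to "medium".
# Assumption: effort is a top-level request field; other values are placeholders.
request = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 8192,
    "effort": "medium",  # recommended default for most Sonnet 4.6 use cases
    "thinking": {"type": "adaptive"},
    "messages": [{"role": "user", "content": "Summarize this report..."}],
}
```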
Code execution is now free when used with web search or web fetch. When either tool is included in your API request, there are no additional charges for code execution beyond standard input and output token costs. Code execution enables dynamic filtering in web search and web fetch tools, improving accuracy while reducing token consumption. See the code execution pricing for details on standalone usage.
Web search and web fetch tools now support dynamic filtering in public beta with Opus 4.6 and Sonnet 4.6. Claude can write and execute code to filter results before they reach the context window, keeping only relevant information and improving accuracy while reducing token consumption. To enable dynamic filtering, use the web_search_20260209 or web_fetch_20260209 tool versions with the code-execution-web-tools-2026-02-09 beta header.
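A minimal sketch of opting into dynamic filtering: the tool version and beta header name come from the paragraph above, while the tool's name field, the prompt, and token limits are assumptions for illustration.

```python
# Illustrative payload enabling dynamic filtering for web search.
# The tool version and beta header are from the docs text; the "name" field
# and other values are assumed placeholders.
request = {
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "tools": [{"type": "web_search_20260209", "name": "web_search"}],
    "betas": ["code-execution-web-tools-2026-02-09"],
    "messages": [{"role": "user", "content": "Find recent coverage of..."}],
}
```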
The following tools are now generally available:
Compaction provides automatic, server-side context summarization, enabling effectively infinite conversations. When context approaches the window limit, the API automatically summarizes earlier parts of the conversation.
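The sketch below shows roughly what opting into compaction might look like. The context_management field name and its contents are hypothetical placeholders, not confirmed parameter names; consult the compaction documentation for the actual request shape.

```python
# Hypothetical sketch only: "context_management" and its "compaction" entry
# are placeholder names, not confirmed API parameters.
request = {
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "context_management": {"compaction": {"type": "enabled"}},  # placeholder
    "messages": [{"role": "user", "content": "Continue our long session..."}],
}
```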
Fast mode (speed: "fast") delivers significantly faster output token generation for Opus models. Fast mode is up to 2.5x as fast at premium pricing ($30/$150 per MTok). This is the same model running with faster inference (no change to intelligence or capabilities).
```python
response = client.beta.messages.create(
    model="claude-opus-4-6",
    max_tokens=4096,
    speed="fast",
    betas=["fast-mode-2026-02-01"],
    messages=[{"role": "user", "content": "Refactor this module..."}],
)
```

Fine-grained tool streaming is now generally available on all models and platforms. No beta header is required.
Opus 4.6 supports up to 128K output tokens, doubling the previous 64K limit. This enables longer thinking budgets and more comprehensive responses. The SDKs require streaming for requests with large max_tokens values to avoid HTTP timeouts. If you don't need to process events incrementally, use .stream() with .get_final_message() to get the complete response. See Streaming Messages for details.
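As a sketch of the streaming pattern described above, the helper below uses the Anthropic Python SDK's messages.stream context manager and get_final_message to collect a complete large-output response without processing events incrementally. The model ID and prompt are placeholders.

```python
# Sketch: stream a large-max_tokens request and return the complete message,
# avoiding HTTP timeouts on long generations. Assumes the Anthropic Python
# SDK's client.messages.stream API.
def generate_long_output(client, prompt: str):
    with client.messages.stream(
        model="claude-opus-4-6",
        max_tokens=128000,
        thinking={"type": "adaptive"},
        messages=[{"role": "user", "content": prompt}],
    ) as stream:
        # Consumes the stream and returns the assembled final message.
        return stream.get_final_message()
```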
Data residency controls allow you to specify where model inference runs using the inference_geo parameter. You can choose "global" (default) or "us" routing per request. US-only inference is priced at 1.1x on Claude Opus 4.6 and newer models.
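A brief sketch of per-request routing with inference_geo, as described above; the prompt and max_tokens values are placeholders.

```python
# Illustrative payload pinning inference to US-only routing
# (priced at 1.1x on Opus 4.6 and newer). Other values are placeholders.
request = {
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "inference_geo": "us",  # "global" is the default
    "messages": [{"role": "user", "content": "Process this record..."}],
}
```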
type: "enabled" and budget_tokensthinking: {type: "enabled", budget_tokens: N} is deprecated on Opus 4.6. It remains functional but will be removed in a future model release. Migrate to thinking: {type: "adaptive"} with the effort parameter.
The interleaved-thinking-2025-05-14 beta header is deprecated on Opus 4.6. It is safely ignored if included, but is no longer required: adaptive thinking automatically enables interleaved thinking. Remove betas=["interleaved-thinking-2025-05-14"] from your requests when using Opus 4.6.
Sonnet 4.6 continues to support the interleaved-thinking-2025-05-14 beta header for use with manual extended thinking (thinking: {type: "enabled"}). You can use either interleaved thinking with the beta header or adaptive thinking on Sonnet 4.6.
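For the Sonnet 4.6 path described above, a sketch combining manual extended thinking with the interleaved thinking beta header; the budget_tokens value, token limits, and prompt are illustrative placeholders.

```python
# Illustrative payload: manual extended thinking with interleaved thinking
# on Sonnet 4.6 via the beta header. Values are placeholders.
request = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 16000,
    "betas": ["interleaved-thinking-2025-05-14"],
    "thinking": {"type": "enabled", "budget_tokens": 8000},
    "messages": [{"role": "user", "content": "Plan and execute the task..."}],
}
```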
The output_format parameter for structured outputs has been moved to output_config.format. The old parameter remains functional but is deprecated and will be removed in a future model release.
```python
# Before
response = client.messages.create(
    output_format={"type": "json_schema", "schema": {...}},
    # ...
)

# After
response = client.messages.create(
    output_config={"format": {"type": "json_schema", "schema": {...}}},
    # ...
)
```

Prefilling assistant messages (last-assistant-turn prefills) is not supported on Opus 4.6. Requests with prefilled assistant messages return a 400 error.
Alternatives:
- output_config.format for JSON output

Opus 4.6 may produce slightly different JSON string escaping in tool call arguments (e.g., different handling of Unicode escapes or forward slash escaping). Standard JSON parsers handle these differences automatically. If you parse tool call input as a raw string rather than using json.loads() or JSON.parse(), verify your parsing logic still works.
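To illustrate why a standard parser is the safe choice here: two JSON encodings of the same values, one using a forward slash escape and a Unicode escape and one using the literal characters, decode identically with json.loads.

```python
import json

# The same values encoded two ways: with \/ and \u00e9 escapes, and without.
# A standard JSON parser normalizes both; raw string matching would not.
escaped = json.loads('{"path": "a\\/b", "name": "caf\\u00e9"}')
plain = json.loads('{"path": "a/b", "name": "café"}')
assert escaped == plain
```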
For step-by-step migration instructions, see Migrating to Claude 4.6.
- Learn how to use adaptive thinking mode.
- Compare all Claude models.
- Explore server-side context compaction.
- Learn about fast mode's faster output token generation for Opus models.
- Follow step-by-step migration instructions.