Claude Opus 4.6 is a near drop-in replacement for Claude 4.5, with a few breaking changes to be aware of. For a full list of new features, see What's new in Claude 4.6.
# Opus migration
model = "claude-opus-4-5" # Before
model = "claude-opus-4-6" # AfterPrefill removal: Prefilling assistant messages returns a 400 error on Claude 4.6 models. Use structured outputs, system prompt instructions, or output_config.format instead.
Tool parameter quoting: Claude 4.6 models may produce slightly different JSON string escaping in tool call arguments (e.g., different handling of Unicode escapes or forward slash escaping). If you parse tool call input as a raw string rather than using a JSON parser, verify your parsing logic. Standard JSON parsers (like json.loads() or JSON.parse()) handle these differences automatically.
These are not required but will improve your experience:
Migrate to adaptive thinking: thinking: {type: "enabled", budget_tokens: N} is deprecated on Claude 4.6 models and will be removed in a future model release. Switch to thinking: {type: "adaptive"} and use the effort parameter to control thinking depth. See Adaptive thinking.
response = client.beta.messages.create(
model="claude-opus-4-5",
max_tokens=16000,
thinking={"type": "enabled", "budget_tokens": 32000},
betas=["interleaved-thinking-2025-05-14"],
messages=[...],
)Note that the migration also moves from client.beta.messages.create to client.messages.create. Adaptive thinking and effort are GA features and do not require the beta SDK namespace or any beta headers.
Remove effort beta header: The effort parameter is now GA. Remove betas=["effort-2025-11-24"] from your requests.
Remove fine-grained tool streaming beta header: Fine-grained tool streaming is now GA. Remove betas=["fine-grained-tool-streaming-2025-05-14"] from your requests.
Remove interleaved thinking beta header (Opus 4.6 only): Adaptive thinking automatically enables interleaved thinking on Opus 4.6. Remove betas=["interleaved-thinking-2025-05-14"] from your Opus 4.6 requests. Note: Sonnet 4.6 continues to support this beta header with manual extended thinking.
Migrate to output_config.format: If using structured outputs, update output_format={...} to output_config={"format": {...}}. The old parameter remains functional but is deprecated and will be removed in a future model release.
If you're migrating from Opus 4.1, Sonnet 4, or earlier models directly to Claude 4.6, apply the Claude 4.6 breaking changes above plus the additional changes in this section.
# From Opus 4.1
model = "claude-opus-4-1-20250805" # Before
model = "claude-opus-4-6" # After
# From Sonnet 4
model = "claude-sonnet-4-20250514" # Before
model = "claude-opus-4-6" # After
# From Sonnet 3.7
model = "claude-3-7-sonnet-20250219" # Before
model = "claude-opus-4-6" # AfterUpdate sampling parameters
This is a breaking change when migrating from Claude 3.x models.
Use only temperature OR top_p, not both:
# Before - This will error in Claude 4+ models
response = client.messages.create(
model="claude-3-7-sonnet-20250219",
temperature=0.7,
top_p=0.9, # Cannot use both
# ...
)
# After
response = client.messages.create(
model="claude-opus-4-6",
temperature=0.7, # Use temperature OR top_p, not both
# ...
)Update tool versions
This is a breaking change when migrating from Claude 3.x models.
Update to the latest tool versions. Remove any code using the undo_edit command.
# Before
tools = [{"type": "text_editor_20250124", "name": "str_replace_editor"}]
# After
tools = [{"type": "text_editor_20250728", "name": "str_replace_based_edit_tool"}]text_editor_20250728 and str_replace_based_edit_tool. See Text editor tool documentation for details.code_execution_20250825. See Code execution tool documentation for migration instructions.Handle the refusal stop reason
Update your application to handle refusal stop reasons:
response = client.messages.create(...)
if response.stop_reason == "refusal":
# Handle refusal appropriately
passHandle the model_context_window_exceeded stop reason
Claude 4.5+ models return a model_context_window_exceeded stop reason when generation stops due to hitting the context window limit, rather than the requested max_tokens limit. Update your application to handle this new stop reason:
response = client.messages.create(...)
if response.stop_reason == "model_context_window_exceeded":
# Handle context window limit appropriately
passVerify tool parameter handling (trailing newlines)
Claude 4.5+ models preserve trailing newlines in tool call string parameters that were previously stripped. If your tools rely on exact string matching against tool call parameters, verify your logic handles trailing newlines correctly.
Update your prompts for behavioral changes
Claude 4+ models have a more concise, direct communication style and require explicit direction. Review prompting best practices for optimization guidance.
token-efficient-tools-2025-02-19 and output-128k-2025-02-19. All Claude 4+ models have built-in token-efficient tool use and these headers have no effect.claude-opus-4-6output_config.format insteadthinking: {type: "enabled", budget_tokens: N} to thinking: {type: "adaptive"} with the effort parameter (budget_tokens is deprecated and will be removed in a future release)effort-2025-11-24 beta header (effort is now GA)fine-grained-tool-streaming-2025-05-14 beta headerinterleaved-thinking-2025-05-14 beta header (Opus 4.6 only; Sonnet 4.6 still supports it)output_format to output_config.format (if applicable)temperature OR top_ptext_editor_20250728, code_execution_20250825)refusal stop reasonmodel_context_window_exceeded stop reasontoken-efficient-tools-2025-02-19, output-128k-2025-02-19)Claude Sonnet 4.6 combines strong intelligence with fast performance, featuring improved agentic search capabilities and free code execution when used with web search or web fetch. It is ideal for everyday coding, analysis, and content tasks.
For a complete overview of capabilities, see the models overview.
Sonnet 4.6 pricing is $3 per million input tokens, $15 per million output tokens. See Claude pricing for details.
Update your model name:
# From Sonnet 4.5
model = "claude-sonnet-4-5" # Before
model = "claude-sonnet-4-6" # After
# From Sonnet 4
model = "claude-sonnet-4-20250514" # Before
model = "claude-sonnet-4-6" # AfterPrefilling assistant messages is no longer supported
This is a breaking change when migrating from Sonnet 4.5 or earlier.
Prefilling assistant messages returns a 400 error on Sonnet 4.6. Use structured outputs, system prompt instructions, or output_config.format instead.
Common prefill use cases and migrations:
Controlling output formatting (forcing JSON/YAML output): Use structured outputs or tools with enum fields for classification tasks.
Eliminating preambles (removing "Here is..." phrases): Add direct instructions in the system prompt: "Respond directly without preamble. Do not start with phrases like 'Here is...', 'Based on...', etc."
Avoiding bad refusals: Claude is much better at appropriate refusals now. Clear prompting in the user message without prefill should be sufficient.
Continuations (resuming interrupted responses): Move the continuation to the user message: "Your previous response was interrupted and ended with [previous_response]. Continue from where you left off."
Context hydration / role consistency (refreshing context in long conversations): Inject what were previously prefilled-assistant reminders into the user turn instead.
Tool parameter JSON escaping may differ
This is a breaking change when migrating from Sonnet 4.5 or earlier.
JSON string escaping in tool parameters may differ from previous models. Standard JSON parsers handle this automatically, but custom string-based parsing may need updates.
Update sampling parameters
This is a breaking change when migrating from Claude 3.x models.
Use only temperature OR top_p, not both.
Update tool versions
This is a breaking change when migrating from Claude 3.x models.
Update to the latest tool versions (text_editor_20250728, code_execution_20250825). Remove any code using the undo_edit command.
Handle the refusal stop reason
Update your application to handle refusal stop reasons.
Update your prompts for behavioral changes
Claude 4 models have a more concise, direct communication style. Review prompting best practices for optimization guidance.
fine-grained-tool-streaming-2025-05-14 beta header — Fine-grained tool streaming is now GA on Sonnet 4.6 and no longer requires a beta header.output_format to output_config.format — The output_format parameter is deprecated. Use output_config.format instead.We strongly encourage migrating from Sonnet 4.5 to Sonnet 4.6, which delivers more intelligence at the same price point.
Sonnet 4.6 defaults to an effort level of high, in contrast to Sonnet 4.5 which had no effort parameter. We recommend adjusting the effort parameter as you migrate from Sonnet 4.5 to Sonnet 4.6. If not explicitly set, you may experience higher latency with the default effort level.
If you're not using extended thinking on Sonnet 4.5, you can continue without it on Sonnet 4.6. You should explicitly set effort to the level appropriate for your use case. At low effort with thinking disabled, you can expect similar or better performance relative to Sonnet 4.5 with no extended thinking.
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=8192,
output_config={"effort": "low"},
messages=[{"role": "user", "content": "Your prompt here"}],
)If you're using extended thinking on Sonnet 4.5, it continues to be supported on Sonnet 4.6 with no changes needed to your thinking configuration. We recommend keeping a thinking budget around 16k tokens. In practice, most tasks don't use that much, but it provides headroom for harder problems without risk of runaway token usage.
For agentic coding, frontend design, tool-heavy workflows, and complex enterprise workflows, we recommend starting with medium effort. If you find latency is too high, consider reducing effort to low. If you need higher intelligence, consider increasing effort to high or migrating to Opus 4.6.
response = client.beta.messages.create(
model="claude-sonnet-4-6",
max_tokens=16384,
thinking={"type": "enabled", "budget_tokens": 16384},
output_config={"effort": "medium"},
betas=["interleaved-thinking-2025-05-14"],
messages=[{"role": "user", "content": "Your prompt here"}],
)For chat, content generation, search, classification, and other non-coding tasks, we recommend starting with low effort with extended thinking. If you need more depth, increase effort to medium.
response = client.beta.messages.create(
model="claude-sonnet-4-6",
max_tokens=8192,
thinking={"type": "enabled", "budget_tokens": 16384},
output_config={"effort": "low"},
betas=["interleaved-thinking-2025-05-14"],
messages=[{"role": "user", "content": "Your prompt here"}],
)The migration paths above use extended thinking with budget_tokens for predictable token usage. If your workload fits one of the following patterns, consider trying adaptive thinking instead:
high effort. If latency or token usage is a concern, scale down to medium.When using adaptive thinking, evaluate medium and high effort on your tasks. The right level depends on your workload's tradeoff between quality, latency, and token usage.
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=64000,
thinking={"type": "adaptive"},
output_config={"effort": "medium"},
messages=[{"role": "user", "content": "Your prompt here"}],
)If you see inconsistent behavior or quality regressions with adaptive thinking, switch to extended thinking with budget_tokens. This provides more predictable results with a cap on thinking costs.
claude-sonnet-4-6output_config.format insteadtext_editor_20250728, code_execution_20250825); legacy versions are not supported (if migrating from 3.x)undo_edit command (if applicable)temperature OR top_p, not both (if migrating from 3.x)refusal stop reason in your applicationfine-grained-tool-streaming-2025-05-14 beta header (now GA)output_format to output_config.formatClaude Sonnet 4.5 combines strong intelligence with fast performance, making it ideal for everyday coding, analysis, and content tasks.
For a complete overview of capabilities, see the models overview.
Sonnet 4.5 pricing is $3 per million input tokens, $15 per million output tokens. See Claude pricing for details.
Update your model name:
# From Sonnet 4
model = "claude-sonnet-4-20250514" # Before
model = "claude-sonnet-4-5-20250929" # After
# From Sonnet 3.7
model = "claude-3-7-sonnet-20250219" # Before
model = "claude-sonnet-4-5-20250929" # AfterThese breaking changes apply when migrating from Claude 3.x Sonnet models.
Update sampling parameters
This is a breaking change when migrating from Claude 3.x models.
Use only temperature OR top_p, not both.
Update tool versions
This is a breaking change when migrating from Claude 3.x models.
Update to the latest tool versions (text_editor_20250728, code_execution_20250825). Remove any code using the undo_edit command.
Handle the refusal stop reason
Update your application to handle refusal stop reasons.
Update your prompts for behavioral changes
Claude 4 models have a more concise, direct communication style. Review prompting best practices for optimization guidance.
claude-sonnet-4-5-20250929text_editor_20250728, code_execution_20250825); legacy versions are not supported (if migrating from 3.x)undo_edit command (if applicable)temperature OR top_p, not both (if migrating from 3.x)refusal stop reason in your applicationClaude Haiku 4.5 is the fastest and most intelligent Haiku model with near-frontier performance, delivering premium model quality for interactive applications and high-volume processing.
For a complete overview of capabilities, see the models overview.
Haiku 4.5 pricing is $1 per million input tokens, $5 per million output tokens. See Claude pricing for details.
Update your model name:
# From Haiku 3.5
model = "claude-3-5-haiku-20241022" # Before
model = "claude-haiku-4-5-20251001" # AfterReview new rate limits: Haiku 4.5 has separate rate limits from Haiku 3.5. See Rate limits documentation for details.
For significant performance improvements on coding and reasoning tasks, consider enabling extended thinking with thinking: {type: "enabled", budget_tokens: N}.
Extended thinking impacts prompt caching efficiency.
Extended thinking is deprecated in Claude 4.6 or newer models. If using newer models, use adaptive thinking instead.
Explore new capabilities: See the models overview for details on context awareness, increased output capacity (64K tokens), higher intelligence, and improved speed.
These breaking changes apply when migrating from Claude 3.x Haiku models.
Update sampling parameters
This is a breaking change when migrating from Claude 3.x models.
Use only temperature OR top_p, not both.
Update tool versions
This is a breaking change when migrating from Claude 3.x models.
Update to the latest tool versions (text_editor_20250728, code_execution_20250825). Remove any code using the undo_edit command.
Handle the refusal stop reason
Update your application to handle refusal stop reasons.
Update your prompts for behavioral changes
Claude 4 models have a more concise, direct communication style. Review prompting best practices for optimization guidance.
claude-haiku-4-5-20251001text_editor_20250728, code_execution_20250825); legacy versions are not supportedundo_edit command (if applicable)temperature OR top_p, not bothrefusal stop reason in your applicationWas this page helpful?