
Migration guide

Guide for migrating to Claude Opus 4.7 and Claude 4.6 models from previous Claude versions

This guide covers migrating Messages API code. If you use Claude Managed Agents, no changes beyond updating the model name are required.

Migrating to Claude Opus 4.7

Claude Opus 4.7 is our most capable generally available model to date. It is highly autonomous and performs exceptionally well on long-horizon agentic work, knowledge work, vision tasks, and memory tasks.

Claude Opus 4.7 should have strong out-of-the-box performance on existing Claude Opus 4.6 prompts and evals at the same $5 / $25 per MTok pricing, but there are a handful of behavioral and API changes worth knowing about as you migrate. It supports the same set of features as Claude Opus 4.6, including:

1M token context window at standard API pricing with no long-context premium
128k max output tokens
Adaptive thinking
Prompt caching
Batch processing
Files API
PDF support
Vision
The full set of server-side and client-side tools (bash, code execution, computer use, text editor, web search, web fetch, MCP connector, memory)

Automate this migration with the Claude API skill. In Claude Code, run /claude-api migrate to invoke the bundled Claude API skill:

/claude-api migrate this project to claude-opus-4-7

The skill applies the model ID swap, breaking parameter changes, prefill replacement, and effort calibration described below across your codebase, then produces a checklist of items to verify manually. It asks you to confirm the migration scope (entire working directory, a subdirectory, or a specific file list) before editing any files.

Update your model name

```
# Opus migration
model = "claude-opus-4-6"  # Before
model = "claude-opus-4-7"  # After
```

Breaking changes

Extended thinking removed: thinking: {type: "enabled", budget_tokens: N} is no longer supported on Claude Opus 4.7 or later models and returns a 400 error. Switch to adaptive thinking (thinking: {type: "adaptive"}) and use the effort parameter to control thinking depth. Adaptive thinking is off by default on Claude Opus 4.7: requests with no thinking field run without thinking, matching Opus 4.6 behavior. Set thinking: {type: "adaptive"} explicitly to enable it.

Before (Claude Opus 4.6):
```
client.messages.create(
    model="claude-opus-4-6",
    max_tokens=64000,
    thinking={"type": "enabled", "budget_tokens": 32000},
    messages=[{"role": "user", "content": "..."}],
)
```
After (Claude Opus 4.7):
```
client.messages.create(
    model="claude-opus-4-7",
    max_tokens=64000,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},  # or "max", "xhigh", "medium", "low"
    messages=[{"role": "user", "content": "..."}],
)
```
Adaptive thinking is steerable through prompting. For guidance on tuning when the model over- or under-thinks, see Calibrating effort and thinking depth.
Sampling parameters removed: Setting temperature, top_p, or top_k to any non-default value on Claude Opus 4.7 returns a 400 error. The safest migration path is to omit these parameters entirely from request payloads. Prompting is the recommended way to guide model behavior on Claude Opus 4.7. If you were using temperature = 0 for determinism, note that it never guaranteed identical outputs on prior models.
Thinking content omitted by default: Thinking blocks still appear in the response stream on Claude Opus 4.7, but their thinking field is empty unless you explicitly opt in. This is a silent change from Claude Opus 4.6, where the default was to return summarized thinking text. To restore summarized thinking content on Claude Opus 4.7, set thinking.display to "summarized":
```
thinking = {
    "type": "adaptive",
    "display": "summarized",
}
```
The default is "omitted" on Claude Opus 4.7. If your product streams reasoning to users, the new default appears as a long pause before output begins; set display: "summarized" to restore visible progress during thinking. See Extended thinking for details.
Updated token counting: Claude Opus 4.7 uses a new tokenizer, contributing to its improved performance on a wide range of tasks. Compared with previous models, the new tokenizer may produce roughly 1x to 1.35x as many tokens for the same text (up to about 35% more, depending on content).

/v1/messages/count_tokens will return a different number of tokens for Claude Opus 4.7 than it did for Claude Opus 4.6. Token efficiency can vary by workload shape.

Prompting interventions, task_budget, and effort can help control costs and ensure appropriate token usage. These controls may trade off model intelligence. We suggest updating your max_tokens parameters to give additional headroom, including compaction triggers. Claude Opus 4.7 provides a 1M context window at standard API pricing with no long-context premium.
Prefill removal (carried over from Opus 4.6): Prefilling assistant messages returns a 400 error on Claude Opus 4.7. Use structured outputs, system prompt instructions, or output_config.format instead.
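
The breaking changes above can be applied mechanically to an existing request payload. The following is a minimal sketch, not an official migration tool: `migrate_request_kwargs` is a hypothetical helper, and the default of `"high"` effort is an assumption, since this guide does not give a mapping from `budget_tokens` values to effort levels. It only rewrites a plain dict and does not call the API.

```python
import copy

SAMPLING_PARAMS = ("temperature", "top_p", "top_k")


def migrate_request_kwargs(kwargs, effort="high"):
    """Return (migrated_kwargs, notes) for a Claude Opus 4.7 request.

    Hypothetical helper: the input dict is left untouched.
    """
    migrated = copy.deepcopy(kwargs)
    notes = []

    migrated["model"] = "claude-opus-4-7"

    # Manual extended thinking returns a 400 error; switch to adaptive + effort.
    thinking = migrated.get("thinking")
    if isinstance(thinking, dict) and thinking.get("type") == "enabled":
        migrated["thinking"] = {"type": "adaptive"}
        # Assumption: start at the given effort unless one is already set.
        migrated.setdefault("output_config", {}).setdefault("effort", effort)
        notes.append("thinking: enabled -> adaptive")

    # Omit sampling parameters entirely, the safest path per the guide.
    for name in SAMPLING_PARAMS:
        if name in migrated:
            del migrated[name]
            notes.append(f"removed {name}")

    # A trailing assistant message is a prefill; it needs a manual rewrite.
    messages = migrated.get("messages") or []
    if messages and messages[-1].get("role") == "assistant":
        notes.append("prefill detected: use structured outputs or instructions")

    return migrated, notes
```

The helper reports a prefill rather than deleting it, because the right replacement (structured outputs, a system prompt instruction, or output_config.format) depends on what the prefill was doing.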

Choosing an effort level

The effort parameter lets you tune Claude's intelligence against token spend, trading capability for lower latency and cost. Start with the new xhigh effort level for coding and agentic use cases, and use a minimum of high effort for most intelligence-sensitive use cases. Experiment with other effort levels to further tune token usage and intelligence:

max: Max effort can deliver performance gains in some use cases, but may show diminishing returns from increased token usage. This setting can also sometimes be prone to overthinking. We recommend testing max effort for intelligence-demanding tasks.
xhigh (new): Extra high effort is the best setting for most coding and agentic use cases.
high: This setting balances token usage and intelligence. For most intelligence-sensitive use cases, we recommend a minimum of high effort.
medium: Good for cost-sensitive use cases that need to reduce token usage while trading off intelligence.
low: Reserve for short, scoped tasks and latency-sensitive workloads that are not intelligence-sensitive.

We expect effort to be more important for this model than for any prior Opus, and recommend experimenting with it actively when you upgrade.
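
One way to keep these starting points in code is a small lookup table. This is only a sketch: the workload labels (`"coding"`, `"cost_sensitive"`, and so on) are hypothetical names chosen for illustration, and falling back to `"high"` for unknown workloads is an assumption based on the "minimum of high" recommendation above.

```python
# Starting effort per workload type, following the recommendations above.
# The workload labels are illustrative, not API values.
STARTING_EFFORT = {
    "coding": "xhigh",
    "agentic": "xhigh",
    "intelligence_sensitive": "high",
    "cost_sensitive": "medium",
    "latency_sensitive": "low",
}


def starting_output_config(workload):
    """Return an output_config dict with a starting effort for the workload."""
    # Assumption: unknown workloads start at "high".
    return {"effort": STARTING_EFFORT.get(workload, "high")}
```

Treat the result as a first setting to evaluate against your own tasks, not a final choice.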

Behavior changes

Claude Opus 4.7 has several behavioral differences from Claude Opus 4.6 that are not API breaking changes but may require prompt updates or scaffolding removal.

Response length varies by use case: Claude Opus 4.7 calibrates response length to how complex it judges the task to be, rather than defaulting to a fixed verbosity. This usually means shorter answers on simple lookups and much longer ones on open-ended analysis.

If your product depends on a certain style or verbosity of output, you may need to tune your prompts. For example, to decrease verbosity, add: "Provide concise, focused responses. Skip non-essential context, and keep examples minimal." If you see specific kinds of over-explaining, add targeted instructions in your prompt to prevent them.

Positive examples showing how Claude can communicate with the appropriate level of concision tend to be more effective than negative examples or instructions that tell the model what not to do.
More literal instruction following: Claude Opus 4.7 interprets prompts more literally and explicitly than Claude Opus 4.6, particularly at lower effort levels. It will not silently generalize an instruction from one item to another, and it will not infer requests you didn't make. The upside of this literalism is precision and less thrash. It generally performs better for API use cases with carefully tuned prompts, structured extraction, and pipelines where you want predictable behavior. A prompt and harness review may be especially helpful for migration to Claude Opus 4.7.
More direct tone: As with any new model, prose style on long-form writing may shift. Claude Opus 4.7 is more direct and opinionated, with less validation-forward phrasing and fewer emoji than Claude Opus 4.6's warmer style. If your product relies on a specific voice, re-evaluate style prompts against the new baseline.
Built-in progress updates in agentic traces: Claude Opus 4.7 provides more regular, higher-quality updates to the user throughout long agentic traces. If you've added scaffolding to force interim status messages ("After every 3 tool calls, summarize progress"), try removing it. If you find that the length or contents of Claude Opus 4.7's user-facing updates are not well-calibrated to your use case, explicitly describe what these updates should look like in the prompt and provide examples.
Fewer subagents spawned by default: Claude Opus 4.7 tends to spawn fewer subagents by default. However, this behavior is steerable through prompting; give Claude Opus 4.7 explicit guidance around when subagents are desirable.
Stricter effort calibration: In a meaningful change from Claude Opus 4.6, Claude Opus 4.7 respects effort levels strictly, especially at the low end. At low and medium, the model scopes its work to what was asked rather than going above and beyond.

This is good for latency and cost, but on moderately complex tasks running at low effort there is some risk of under-thinking. If you observe shallow reasoning on complex problems, raise effort to high or xhigh rather than prompting around it.

If you need to keep effort at low for latency, add targeted guidance: "This task involves multi-step reasoning. Think carefully through the problem before responding." See Recommended effort levels for Claude Opus 4.7.
Fewer tool calls by default: Claude Opus 4.7 has a tendency to use tools less often than Claude Opus 4.6 and to use reasoning more. This produces better results in most cases.

To increase tool usage, raise the effort setting. high or xhigh effort settings show substantially more tool usage in agentic search and coding. You can also adjust your prompt to explicitly instruct the model about when and how to properly use its tools.
Real-time cybersecurity safeguards: Claude Opus 4.7 adds real-time cybersecurity safeguards, so requests that involve prohibited or high-risk topics may be refused. For legitimate security work such as penetration testing, vulnerability research, or red-teaming, apply to the Cyber Verification Program to request reduced restrictions. See Safeguards, warnings, and appeals for background.
High-resolution image support: Claude Opus 4.7 is the first Claude model with high-resolution image support. Maximum image resolution is 2576 pixels on the long edge, up from 1568 pixels on prior models. This unlocks gains on vision-heavy workloads and is particularly valuable for computer use, screenshot understanding, and document analysis.

High-resolution support is automatic and requires no beta header or client-side opt-in. Two things to plan for:
- Full-resolution images can use up to approximately 3x more image tokens than on prior models (up to 4,784 tokens per image, compared to the previous cap of roughly 1,600 tokens per image). Re-budget max_tokens and cost expectations for image-heavy workloads, or downsample before sending if you do not need the additional fidelity.
- Pointing and bounding-box coordinates returned by the model are 1:1 with actual image pixels on Claude Opus 4.7, so no scale-factor conversion is required.
See High-resolution image support on Claude Opus 4.7 for details.
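
If you decide to downsample, the target dimensions are simple arithmetic on the long edge. The sketch below assumes you only want to cap the long edge and keep the aspect ratio. `fit_long_edge` is a hypothetical helper, it does not account for any separate total-pixel limit, and the actual resizing is left to whatever image library you already use.

```python
def fit_long_edge(width, height, max_long_edge=2576):
    """Return (width, height) scaled down so the long edge fits the limit.

    Images already within the limit are returned unchanged (never upscaled).
    """
    long_edge = max(width, height)
    if long_edge <= max_long_edge:
        return width, height
    scale = max_long_edge / long_edge
    # Round to whole pixels and keep each side at least 1 pixel.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

Passing `max_long_edge=1568` targets the long-edge limit of prior models. Because returned coordinates are 1:1 with the pixels of the image you send, map them back to the original image yourself if you downsample first.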

Recommended changes

These are not required but will improve your experience:

Re-evaluate max_tokens: Because the same text produces a higher token count on Claude Opus 4.7, we suggest updating your max_tokens parameters to give additional headroom, including compaction triggers. Prompting interventions, task_budget, and effort can help control costs and ensure appropriate token usage.
Audit token-count expectations: Any code path that estimates tokens client-side or assumes a fixed token-to-character ratio should be re-tested against Claude Opus 4.7. Use the Token counting endpoint to verify.
Adopt task budgets (beta): Claude Opus 4.7 introduces task budgets. These budgets let you inform Claude how many tokens it has for a full agentic loop, including thinking, tool calls, tool results, and final output. The model sees a running countdown and uses it to prioritize work and finish the task gracefully as the budget is consumed. To use, set the beta header task-budgets-2026-03-13 and add the following to your output config:
```
output_config = {
    "effort": "high",
    "task_budget": {"type": "tokens", "total": 128000},
}
```
You may need to experiment with different task budgets for your use case. If the model is given a task budget that is too restrictive, it may complete the task less thoroughly, referencing its budget as the constraint.

For open-ended agentic tasks where quality matters more than speed, do not set a task budget. Reserve task budgets for workloads where you need the model to scope its work to a token allowance. The minimum value for a task budget is 20k tokens.

A task budget is not a hard cap; it's a suggestion that the model is aware of. It differs from max_tokens:
- task_budget: an advisory cap across the full agentic loop. The model sees it and uses it to pace itself.
- max_tokens: a hard per-request ceiling on generated tokens. It is not passed to the model, so the model is not aware of it.
Use task_budget when you want the model to self-moderate, and max_tokens as a hard ceiling to cap usage.
Set a large max_tokens at max or xhigh effort: If you are running Claude Opus 4.7 at max or xhigh effort, set a large max output token budget so the model has room to think and act across its subagents and tool calls. We recommend starting at 64k tokens and tuning from there.
Downsample images if high resolution is unnecessary: Claude Opus 4.7 supports images up to 2576px / 3.75MP. High-res images use more tokens. If the additional image fidelity is unnecessary, downsample images before sending to Claude to avoid token-usage increases. See Images and vision.
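
The max_tokens and task budget recommendations in this list can be sketched as two small helpers. Both are hypothetical. `scaled_max_tokens` uses the upper end of the 1x to 1.35x tokenizer range as a worst-case assumption and treats "128k max output tokens" as 128,000, which is also an assumption. `build_output_config` enforces the 20k minimum task budget stated above.

```python
MIN_TASK_BUDGET = 20_000  # minimum task budget stated in this guide


def build_output_config(effort="high", task_budget_tokens=None):
    """Build an output_config dict, optionally with an advisory task budget."""
    config = {"effort": effort}
    if task_budget_tokens is not None:
        if task_budget_tokens < MIN_TASK_BUDGET:
            raise ValueError(f"task budget must be at least {MIN_TASK_BUDGET} tokens")
        config["task_budget"] = {"type": "tokens", "total": task_budget_tokens}
    return config


def scaled_max_tokens(previous_max_tokens, headroom_percent=35, ceiling=128_000):
    """Add tokenizer headroom to an existing max_tokens value, up to a ceiling.

    Integer arithmetic (rounding up) avoids floating-point surprises.
    """
    scaled = (previous_max_tokens * (100 + headroom_percent) + 99) // 100
    return min(ceiling, scaled)
```

Remember that the task budget requires the task-budgets-2026-03-13 beta header, and that max_tokens remains the hard ceiling while the task budget is advisory.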

Migration checklist

Migrating to Claude Opus 4.7 from Opus 4.5 or earlier

If you are migrating from Claude Opus 4.5, Opus 4.1, or an earlier model directly to Claude Opus 4.7, apply all of the Opus 4.7 changes above plus the cumulative changes in this section that took effect between Opus 4.5 and Opus 4.7. If you are migrating from Opus 4.6, you only need the Opus 4.7 section above.

Update your model name

```
# Opus migration
model = "claude-opus-4-5"  # Before
model = "claude-opus-4-7"  # After
```

Breaking changes

Prefill removal is covered in the Opus 4.7 breaking changes above.
Tool parameter quoting: Claude Opus 4.6 and later models may produce slightly different JSON string escaping in tool call arguments (e.g., different handling of Unicode escapes or forward slash escaping). If you parse tool call input as a raw string rather than using a JSON parser, verify your parsing logic. Standard JSON parsers (like json.loads() or JSON.parse()) handle these differences automatically.
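
To see why a standard JSON parser makes this a non-issue, compare two raw argument strings that differ only in escaping. The strings below are made-up examples, not captured model output.

```python
import json


def parse_tool_input(raw):
    """Parse raw tool-call argument text with a standard JSON parser."""
    return json.loads(raw)


# Same arguments, two encodings: escaped slash and Unicode escape
# in the first, literal characters in the second.
escaped = '{"path": "src\\/main.py", "note": "caf\\u00e9"}'
literal = '{"path": "src/main.py", "note": "caf\u00e9"}'

assert escaped != literal  # raw string comparison sees a difference
assert parse_tool_input(escaped) == parse_tool_input(literal)  # parsed values match
```

Code that compares or pattern-matches the raw text would treat these as different inputs, which is the failure mode to look for during migration.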

Recommended changes

These changes improve your experience on Opus 4.7. Items marked (required on Opus 4.7) were optional recommendations when Opus 4.6 launched but are now mandatory; the rest remain recommended.

Migrate to adaptive thinking (required on Opus 4.7): thinking: {type: "enabled", budget_tokens: N} returns a 400 error on Claude Opus 4.7. Switch to thinking: {type: "adaptive"} and use the effort parameter to control thinking depth. See Adaptive thinking.
```
response = client.beta.messages.create(
    model="claude-opus-4-5",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 32000},
    betas=["interleaved-thinking-2025-05-14"],
    messages=[...],
)
```
Note that the migration also moves from client.beta.messages.create to client.messages.create. Adaptive thinking and effort are GA features and do not require the beta SDK namespace or any beta headers.
Remove effort beta header: The effort parameter is now GA. Remove betas=["effort-2025-11-24"] from your requests.
Remove fine-grained tool streaming beta header: Fine-grained tool streaming is now GA. Remove betas=["fine-grained-tool-streaming-2025-05-14"] from your requests.
Remove interleaved thinking beta header: Adaptive thinking automatically enables interleaved thinking on Claude Opus 4.7, Opus 4.6, and Sonnet 4.6. Remove betas=["interleaved-thinking-2025-05-14"] from your requests. The header is still functional on Sonnet 4.6 with manual extended thinking, but manual mode is deprecated.
Migrate to output_config.format: If using structured outputs, update output_format={...} to output_config={"format": {...}}. The old parameter remains functional but is deprecated and will be removed in a future model release.
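
The beta header removals above amount to filtering a list. A minimal sketch, assuming your code passes headers through a `betas` list as in the examples in this guide (`prune_betas` is a hypothetical helper):

```python
# Beta headers this section says to remove when targeting Claude Opus 4.7.
REMOVABLE_BETAS = {
    "effort-2025-11-24",
    "fine-grained-tool-streaming-2025-05-14",
    "interleaved-thinking-2025-05-14",
}


def prune_betas(betas):
    """Return the betas list without headers that are no longer needed."""
    return [beta for beta in betas if beta not in REMOVABLE_BETAS]
```

Headers not in the set, such as task-budgets-2026-03-13, pass through unchanged. If you still run Sonnet 4.6 with manual extended thinking, keep the interleaved thinking header for those requests, since it remains functional there.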

Migrating from Claude 4.1 or earlier

If you're migrating from Opus 4.1, Sonnet 4 (deprecated), or earlier models directly to Claude Opus 4.7, apply the Claude Opus 4.7 changes at the top of this guide and the cumulative changes above plus the additional changes in this section.

```
# From Opus 4.1
model = "claude-opus-4-1-20250805"  # Before
model = "claude-opus-4-7"  # After

# From Sonnet 4
model = "claude-sonnet-4-20250514"  # Before
model = "claude-opus-4-7"  # After

# From Sonnet 3.7
model = "claude-3-7-sonnet-20250219"  # Before
model = "claude-opus-4-7"  # After
```

Additional breaking changes

Remove sampling parameters

This is a breaking change when migrating from Claude 3.x models.

Starting with Claude Opus 4.7, setting temperature, top_p, or top_k to any non-default value will return a 400 error. The safest migration path is to omit these parameters entirely from requests, and to use prompting to guide the model's behavior. If you were using temperature = 0 for determinism, note that it never guaranteed identical outputs.
Python
```
# Before - non-default sampling parameters return a 400 error on Claude Opus 4.7
response = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    temperature=0.7,
    top_p=0.9,  # Non-default sampling params return 400 on Opus 4.7
    # ...
)

# After
response = client.messages.create(
    model="claude-opus-4-7",
    # ...
)
```
Update tool versions

This is a breaking change when migrating from Claude 3.x models.

Update to the latest tool versions. Remove any code using the undo_edit command.
```
# Before
tools = [{"type": "text_editor_20250124", "name": "str_replace_editor"}]

# After
tools = [{"type": "text_editor_20250728", "name": "str_replace_based_edit_tool"}]
```
- Text editor: Use text_editor_20250728 and str_replace_based_edit_tool. See Text editor tool documentation for details.
- Code execution: Upgrade to code_execution_20250825. See Code execution tool documentation for migration instructions.
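
If tool definitions are built in several places, a small rewrite pass can apply the text editor upgrade consistently. This sketch covers only the one mapping shown above. `upgrade_tools` is a hypothetical helper, and other legacy tool versions would need their own entries.

```python
# Only the mapping shown in this guide; extend it for other legacy versions.
TEXT_EDITOR_UPGRADE = {
    ("text_editor_20250124", "str_replace_editor"): (
        "text_editor_20250728",
        "str_replace_based_edit_tool",
    ),
}


def upgrade_tools(tools):
    """Return a new tools list with known legacy definitions upgraded."""
    upgraded = []
    for tool in tools:
        key = (tool.get("type"), tool.get("name"))
        if key in TEXT_EDITOR_UPGRADE:
            new_type, new_name = TEXT_EDITOR_UPGRADE[key]
            tool = {**tool, "type": new_type, "name": new_name}
        upgraded.append(tool)
    return upgraded
```

This only renames the definition. Any handler code that implements the undo_edit command still has to be removed by hand.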

Handle the refusal stop reason

Update your application to handle refusal stop reasons:

Python
```
response = client.messages.create(...)

if response.stop_reason == "refusal":
    # Handle refusal appropriately
    pass
```

Handle the model_context_window_exceeded stop reason

Claude 4.5+ models return a model_context_window_exceeded stop reason when generation stops due to hitting the context window limit, rather than the requested max_tokens limit. Update your application to handle this new stop reason:
Python
```
response = client.messages.create(...)

if response.stop_reason == "model_context_window_exceeded":
    # Handle context window limit appropriately
    pass
```
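
Both stop reasons can be handled in one place. The sketch below is illustrative only: the returned labels (`"refused"`, `"context_full"`, `"truncated"`, `"other"`) are hypothetical names for your own application logic, and a stand-in object replaces a real API response so the example runs without network access.

```python
from types import SimpleNamespace


def classify_stop(response):
    """Map a response's stop_reason to an application-level label."""
    reason = response.stop_reason
    if reason == "refusal":
        return "refused"  # surface a refusal message to the user
    if reason == "model_context_window_exceeded":
        return "context_full"  # compact or trim history before retrying
    if reason == "max_tokens":
        return "truncated"  # output hit the requested max_tokens limit
    return "other"  # end_turn, tool_use, and so on


# Stand-in for a real response object, for illustration only.
fake_response = SimpleNamespace(stop_reason="model_context_window_exceeded")
label = classify_stop(fake_response)
```

Keeping the context-window case separate from max_tokens matters because the remedies differ: one calls for shrinking the input, the other for raising the output limit.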
Verify tool parameter handling (trailing newlines)

Claude 4.5+ models preserve trailing newlines in tool call string parameters that were previously stripped. If your tools rely on exact string matching against tool call parameters, verify your logic handles trailing newlines correctly.
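
A short illustration of the failure mode: an exact comparison that used to pass can now fail because of a preserved newline. `matches_expected` is a hypothetical helper, and stripping is only appropriate for parameters such as identifiers or paths where a trailing newline carries no meaning. For file contents, the newline is usually significant and should be kept.

```python
def matches_expected(param, expected):
    """Compare a tool parameter to an expected value, ignoring trailing newlines."""
    return param.rstrip("\n") == expected


param = "deploy\n"  # trailing newline preserved by the model
assert param != "deploy"  # exact match fails
assert matches_expected(param, "deploy")  # tolerant match succeeds
```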
Update your prompts for behavioral changes

Claude 4+ models have a more concise, direct communication style and require explicit direction. Review prompting best practices for optimization guidance.

Additional recommended changes

Remove legacy beta headers: Remove token-efficient-tools-2025-02-19 and output-128k-2025-02-19. All Claude 4+ models have built-in token-efficient tool use and these headers have no effect.

Migration checklist (from Opus 4.5 or earlier)

Migrating to Claude Sonnet 4.6

Claude Sonnet 4.6 combines strong intelligence with fast performance, featuring improved agentic search capabilities and free code execution when used with web search or web fetch. It is ideal for everyday coding, analysis, and content tasks.

For a complete overview of capabilities, see the models overview.

Sonnet 4.6 pricing is $3 per million input tokens, $15 per million output tokens. See Claude pricing for details.

Update your model name:

```
# From Sonnet 4.5
model = "claude-sonnet-4-5"  # Before
model = "claude-sonnet-4-6"  # After

# From Sonnet 4
model = "claude-sonnet-4-20250514"  # Before
model = "claude-sonnet-4-6"  # After
```

Breaking changes

When migrating from Sonnet 4.5

Prefilling assistant messages is no longer supported

This is a breaking change when migrating from Sonnet 4.5 or earlier.

Prefilling assistant messages returns a 400 error on Sonnet 4.6. Use structured outputs, system prompt instructions, or output_config.format instead.

Common prefill use cases and migrations:
- Controlling output formatting (forcing JSON/YAML output): Use structured outputs or tools with enum fields for classification tasks.
- Eliminating preambles (removing "Here is..." phrases): Add direct instructions in the system prompt: "Respond directly without preamble. Do not start with phrases like 'Here is...', 'Based on...', etc."
- Avoiding bad refusals: Claude is much better at appropriate refusals now. Clear prompting in the user message without prefill should be sufficient.
- Continuations (resuming interrupted responses): Move the continuation to the user message: "Your previous response was interrupted and ended with [previous_response]. Continue from where you left off."
- Context hydration / role consistency (refreshing context in long conversations): Inject what were previously prefilled-assistant reminders into the user turn instead.
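
As an example of the continuation pattern in the list above, the sketch below builds a message list that ends with a user turn instead of an assistant prefill. `build_continuation_messages` is a hypothetical helper, and quoting only the tail of the interrupted response (500 characters by default) is a design assumption to keep the prompt short.

```python
def build_continuation_messages(history, partial_response, tail_chars=500):
    """Return a new message list asking the model to resume an interrupted reply.

    The request goes in a user message, since assistant prefill returns a 400 error.
    """
    tail = partial_response[-tail_chars:]
    prompt = (
        "Your previous response was interrupted and ended with "
        f"[{tail}]. Continue from where you left off."
    )
    return [*history, {"role": "user", "content": prompt}]
```

The original history list is not modified, so the same helper can be reused if the retry is interrupted again.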
Tool parameter JSON escaping may differ

This is a breaking change when migrating from Sonnet 4.5 or earlier.

JSON string escaping in tool parameters may differ from previous models. Standard JSON parsers handle this automatically, but custom string-based parsing may need updates.

When migrating from Claude 3.x

Update sampling parameters

This is a breaking change when migrating from Claude 3.x models.

Use only temperature OR top_p, not both.
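
A minimal sketch of enforcing this rule on an existing payload. `keep_one_sampling_param` is a hypothetical helper, and preferring temperature when both are present is an assumption. Pick whichever parameter your prompts were actually tuned with.

```python
def keep_one_sampling_param(kwargs, prefer="temperature"):
    """Return a copy of kwargs with at most one of temperature / top_p set."""
    result = dict(kwargs)
    if "temperature" in result and "top_p" in result:
        # Drop the parameter that is not preferred.
        drop = "top_p" if prefer == "temperature" else "temperature"
        del result[drop]
    return result
```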
Update tool versions

This is a breaking change when migrating from Claude 3.x models.

Update to the latest tool versions (text_editor_20250728, code_execution_20250825). Remove any code using the undo_edit command.
Handle the refusal stop reason

Update your application to handle refusal stop reasons.
Update your prompts for behavioral changes

Claude 4 models have a more concise, direct communication style. Review prompting best practices for optimization guidance.

Recommended changes

Remove fine-grained-tool-streaming-2025-05-14 beta header: Fine-grained tool streaming is now GA on Sonnet 4.6 and no longer requires a beta header.
Migrate output_format to output_config.format: The output_format parameter is deprecated. Use output_config.format instead.

Migrating from Sonnet 4.5

Consider migrating from Sonnet 4.5 to Sonnet 4.6, which delivers more intelligence at the same price point.

Sonnet 4.6 defaults to an effort level of high, whereas Sonnet 4.5 had no effort parameter. Consider adjusting the effort parameter as you migrate from Sonnet 4.5 to Sonnet 4.6. If you do not set effort explicitly, you may see higher latency from the default level.

If you're not using extended thinking

If you're not using extended thinking on Sonnet 4.5, you can continue without it on Sonnet 4.6. You should explicitly set effort to the level appropriate for your use case. At low effort with thinking disabled, you can expect similar or better performance relative to Sonnet 4.5 with no extended thinking.

```
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    output_config={"effort": "low"},
    messages=[{"role": "user", "content": "Your prompt here"}],
)
```

If you're using extended thinking

If you're using extended thinking with budget_tokens on Sonnet 4.5, it is still functional on Sonnet 4.6 but is deprecated. Migrate to adaptive thinking with the effort parameter.

Migrating to adaptive thinking

Adaptive thinking is the recommended replacement for budget_tokens on Sonnet 4.6. It is particularly well suited to the following workload patterns:

Autonomous multi-step agents: coding agents that turn requirements into working software, data analysis pipelines, and bug finding where the model runs independently across many steps. Adaptive thinking lets the model calibrate its reasoning per step, staying on path over longer trajectories. For these workloads, start at high effort. If latency or token usage is a concern, scale down to medium.
Computer use agents: Sonnet 4.6 achieved best-in-class accuracy on computer use evaluations using adaptive mode.
Bimodal workloads: a mix of easy and hard tasks where adaptive skips thinking on simple queries and reasons deeply on complex ones.

When using adaptive thinking, evaluate medium and high effort on your tasks. The right level depends on your workload's tradeoff between quality, latency, and token usage.

```
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=64000,
    thinking={"type": "adaptive"},
    output_config={"effort": "medium"},
    messages=[{"role": "user", "content": "Your prompt here"}],
)
```

If you see inconsistent behavior or quality regressions with adaptive thinking, try lowering the effort setting or using max_tokens as a hard limit first. Extended thinking with budget_tokens is still functional on Sonnet 4.6 but is deprecated and no longer recommended.

Keeping budget_tokens during migration

If you need to keep budget_tokens temporarily while migrating, a budget around 16k tokens provides headroom for harder problems without risk of runaway token usage. This configuration is deprecated and will be removed in a future model release.

Coding and agentic use cases

For agentic coding, frontend design, tool-heavy workflows, and complex enterprise workflows, start with medium effort. If you find latency is too high, consider reducing effort to low. If you need higher intelligence, consider increasing effort to high or migrating to Opus 4.7.

```
response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16384,
    thinking={"type": "enabled", "budget_tokens": 16384},
    output_config={"effort": "medium"},
    betas=["interleaved-thinking-2025-05-14"],
    messages=[{"role": "user", "content": "Your prompt here"}],
)
```

Chat and non-coding use cases

For chat, content generation, search, classification, and other non-coding tasks, start with low effort with extended thinking. If you need more depth, increase effort to medium.

```
response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    thinking={"type": "enabled", "budget_tokens": 16384},
    output_config={"effort": "low"},
    betas=["interleaved-thinking-2025-05-14"],
    messages=[{"role": "user", "content": "Your prompt here"}],
)
```

Sonnet 4.6 migration checklist

Migrating to Claude Sonnet 4.5

Claude Sonnet 4.5 combines strong intelligence with fast performance, making it ideal for everyday coding, analysis, and content tasks.

For a complete overview of capabilities, see the models overview.

Sonnet 4.5 pricing is $3 per million input tokens, $15 per million output tokens. See Claude pricing for details.

Update your model name:

```
# From Sonnet 4
model = "claude-sonnet-4-20250514"  # Before
model = "claude-sonnet-4-5-20250929"  # After

# From Sonnet 3.7
model = "claude-3-7-sonnet-20250219"  # Before
model = "claude-sonnet-4-5-20250929"  # After
```

Breaking changes

These breaking changes apply when migrating from Claude 3.x Sonnet models.

Update sampling parameters

This is a breaking change when migrating from Claude 3.x models.

Use only temperature OR top_p, not both.
Update tool versions

This is a breaking change when migrating from Claude 3.x models.

Update to the latest tool versions (text_editor_20250728, code_execution_20250825). Remove any code using the undo_edit command.
Handle the refusal stop reason

Update your application to handle refusal stop reasons.
Update your prompts for behavioral changes

Claude 4 models have a more concise, direct communication style. Review prompting best practices for optimization guidance.

Sonnet 4.5 migration checklist

Update model ID to claude-sonnet-4-5-20250929
BREAKING: Update tool versions to latest (text_editor_20250728, code_execution_20250825); legacy versions are not supported (if migrating from 3.x)
BREAKING: Remove any code using the undo_edit command (if applicable)
BREAKING: Update sampling parameters to use only temperature OR top_p, not both (if migrating from 3.x)
Handle new refusal stop reason in your application
Review and update prompts following prompting best practices
Consider enabling extended thinking for complex reasoning tasks
Test in development environment before production deployment

Migrating to Claude Haiku 4.5

Claude Haiku 4.5 is the fastest and most intelligent Haiku model with near-frontier performance, delivering premium model quality for interactive applications and high-volume processing.

For a complete overview of capabilities, see the models overview.

Haiku 4.5 pricing is $1 per million input tokens, $5 per million output tokens. See Claude pricing for details.

Update your model name:

```
# From Haiku 3.5
model = "claude-3-5-haiku-20241022"  # Before
model = "claude-haiku-4-5-20251001"  # After
```

Review new rate limits: Haiku 4.5 has separate rate limits from Haiku 3.5. See Rate limits documentation for details.

For significant performance improvements on coding and reasoning tasks, consider enabling extended thinking with thinking: {type: "enabled", budget_tokens: N}.

Extended thinking impacts prompt caching efficiency.

Extended thinking is deprecated in Claude 4.6 models and removed in Claude Opus 4.7. If using newer models, use adaptive thinking instead.

Explore new capabilities: See the models overview for details on context awareness, increased output capacity (64k tokens), higher intelligence, and improved speed.

Breaking changes

These breaking changes apply when migrating from Claude 3.x Haiku models.

Update sampling parameters

This is a breaking change when migrating from Claude 3.x models.

Use only temperature OR top_p, not both.
Update tool versions

This is a breaking change when migrating from Claude 3.x models.

Update to the latest tool versions (text_editor_20250728, code_execution_20250825). Remove any code using the undo_edit command.
Handle the refusal stop reason

Update your application to handle refusal stop reasons.
Update your prompts for behavioral changes

Claude 4 models have a more concise, direct communication style. Review prompting best practices for optimization guidance.

Haiku 4.5 migration checklist

Update model ID to claude-haiku-4-5-20251001
BREAKING: Update tool versions to latest (text_editor_20250728, code_execution_20250825); legacy versions are not supported
BREAKING: Remove any code using the undo_edit command (if applicable)
BREAKING: Update sampling parameters to use only temperature OR top_p, not both
Handle new refusal stop reason in your application
Review and adjust for new rate limits (separate from Haiku 3.5)
Review and update prompts following prompting best practices
Consider enabling extended thinking for complex reasoning tasks
Test in development environment before production deployment

Get help

Check the API documentation for detailed specifications
Review model capabilities for performance comparisons
Review API release notes for API updates
Contact support if you encounter any issues during migration


Migrating to Claude Opus 4.7

Claude Opus 4.7 supports the same set of features as Claude Opus 4.6, including:

1M token context window at standard API pricing with no long-context premium
128k max output tokens
Adaptive thinking
Prompt caching
Batch processing
Files API
PDF support
Vision
The full set of server-side and client-side tools (bash, code execution, computer use, text editor, web search, web fetch, MCP connector, memory)

Automate this migration with the Claude API skill. In Claude Code, run /claude-api migrate to invoke the bundled Claude API skill:

/claude-api migrate this project to claude-opus-4-7

Update your model name

# Opus migration
model = "claude-opus-4-6"  # Before
model = "claude-opus-4-7"  # After

Breaking changes

Extended thinking removed: thinking: {type: "enabled", budget_tokens: N} is no longer supported on Claude Opus 4.7 or later models and returns a 400 error. Switch to adaptive thinking (thinking: {type: "adaptive"}) and use the effort parameter to control thinking depth. Adaptive thinking is off by default on Claude Opus 4.7: requests with no thinking field run without thinking, matching Opus 4.6 behavior. Set thinking: {type: "adaptive"} explicitly to enable it.

Before (Claude Opus 4.6):
```
client.messages.create(
    model="claude-opus-4-6",
    max_tokens=64000,
    thinking={"type": "enabled", "budget_tokens": 32000},
    messages=[{"role": "user", "content": "..."}],
)
```
After (Claude Opus 4.7):
```
client.messages.create(
    model="claude-opus-4-7",
    max_tokens=64000,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},  # or "max", "xhigh", "medium", "low"
    messages=[{"role": "user", "content": "..."}],
)
```
Adaptive thinking is steerable through prompting. For guidance on tuning when the model over- or under-thinks, see Calibrating effort and thinking depth.
Sampling parameters removed: Setting temperature, top_p, or top_k to any non-default value on Claude Opus 4.7 returns a 400 error. The safest migration path is to omit these parameters entirely from request payloads. Prompting is the recommended way to guide model behavior on Claude Opus 4.7. If you were using temperature = 0 for determinism, note that it never guaranteed identical outputs on prior models.
Thinking content omitted by default: Thinking blocks still appear in the response stream on Claude Opus 4.7, but their thinking field is empty unless you explicitly opt in. This is a silent change from Claude Opus 4.6, where the default was to return summarized thinking text. To restore summarized thinking content on Claude Opus 4.7, set thinking.display to "summarized":
```
thinking = {
    "type": "adaptive",
    "display": "summarized",
}
```
The default is "omitted" on Claude Opus 4.7. If your product streams reasoning to users, the new default appears as a long pause before output begins; set display: "summarized" to restore visible progress during thinking. See Extended thinking for details.
Updated token counting: Claude Opus 4.7 uses a new tokenizer, contributing to its improved performance on a wide range of tasks. The new tokenizer may use roughly 1x to 1.35x as many tokens when processing text compared to previous models (up to ~35% more, varying by content).

/v1/messages/count_tokens will return a different number of tokens for Claude Opus 4.7 than it did for Claude Opus 4.6. Token efficiency can vary by workload shape.

Prompting interventions, task_budget, and effort can help control costs and ensure appropriate token usage. These controls may trade off model intelligence. We suggest updating your max_tokens parameters to give additional headroom, including compaction triggers. Claude Opus 4.7 provides a 1M context window at standard API pricing with no long-context premium.
Prefill removal (carried over from Opus 4.6): Prefilling assistant messages returns a 400 error on Claude Opus 4.7. Use structured outputs, system prompt instructions, or output_config.format instead.
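
The breaking changes above can be applied mechanically to an existing request payload. The following is a minimal sketch rather than an official migration tool: `migrate_request_to_opus_4_7` is a hypothetical helper name, and it assumes your request parameters live in a plain dict of keyword arguments for client.messages.create. It only rewrites the dict and never calls the API.

```python
import copy

SAMPLING_PARAMS = ("temperature", "top_p", "top_k")


def migrate_request_to_opus_4_7(params, effort="high", show_thinking=False):
    """Return a copy of `params` adjusted for the Opus 4.7 breaking changes.

    Hypothetical helper: `params` is assumed to hold the keyword arguments
    you would pass to client.messages.create.
    """
    new = copy.deepcopy(params)
    new["model"] = "claude-opus-4-7"

    # Sampling parameters: omit them entirely rather than risk a 400.
    for name in SAMPLING_PARAMS:
        new.pop(name, None)

    # Manual extended thinking -> adaptive thinking plus an effort level.
    thinking = new.get("thinking")
    if thinking and thinking.get("type") == "enabled":
        adaptive = {"type": "adaptive"}
        if show_thinking:
            # Opt back in to summarized thinking text (default is "omitted").
            adaptive["display"] = "summarized"
        new["thinking"] = adaptive
        new.setdefault("output_config", {}).setdefault("effort", effort)

    # Prefill: a trailing assistant message is rejected, so fail early.
    messages = new.get("messages", [])
    if messages and messages[-1].get("role") == "assistant":
        raise ValueError("assistant prefill is not supported; move it into the prompt")

    return new


before = {
    "model": "claude-opus-4-6",
    "max_tokens": 64000,
    "temperature": 0.2,
    "thinking": {"type": "enabled", "budget_tokens": 32000},
    "messages": [{"role": "user", "content": "..."}],
}
after = migrate_request_to_opus_4_7(before, effort="high", show_thinking=True)
print(after)
```

The helper leaves max_tokens untouched; whether to add headroom for the new tokenizer is a separate, workload-specific decision.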

Choosing an effort level

max: Max effort can deliver performance gains in some use cases, but may show diminishing returns from increased token usage. This setting can also sometimes be prone to overthinking. We recommend testing max effort for intelligence-demanding tasks.
xhigh (new): Extra high effort is the best setting for most coding and agentic use cases.
high: This setting balances token usage and intelligence. For most intelligence-sensitive use cases, we recommend a minimum of high effort.
medium: Good for cost-sensitive use cases that need to reduce token usage while trading off intelligence.
low: Reserve for short, scoped tasks and latency-sensitive workloads that are not intelligence-sensitive.

We expect effort to be more important for this model than for any prior Opus, and recommend experimenting with it actively when you upgrade.

Behavior changes

Claude Opus 4.7 has several behavioral differences from Claude Opus 4.6 that are not API breaking changes but may require prompt updates or scaffolding removal.

Response length varies by use case: Claude Opus 4.7 calibrates response length to how complex it judges the task to be, rather than defaulting to a fixed verbosity. This usually means shorter answers on simple lookups and much longer ones on open-ended analysis.

If your product depends on a certain style or verbosity of output, you may need to tune your prompts. For example, to decrease verbosity, add: "Provide concise, focused responses. Skip non-essential context, and keep examples minimal." If you see specific kinds of over-explaining, add targeted instructions in your prompt to prevent them.

Positive examples showing how Claude can communicate with the appropriate level of concision tend to be more effective than negative examples or instructions that tell the model what not to do.
More literal instruction following: Claude Opus 4.7 interprets prompts more literally and explicitly than Claude Opus 4.6, particularly at lower effort levels. It will not silently generalize an instruction from one item to another, and it will not infer requests you didn't make. The upside of this literalism is precision and less thrash. It generally performs better for API use cases with carefully tuned prompts, structured extraction, and pipelines where you want predictable behavior. A prompt and harness review may be especially helpful for migration to Claude Opus 4.7.
More direct tone: As with any new model, prose style on long-form writing may shift. Claude Opus 4.7 is more direct and opinionated, with less validation-forward phrasing and fewer emoji than Claude Opus 4.6's warmer style. If your product relies on a specific voice, re-evaluate style prompts against the new baseline.
Built-in progress updates in agentic traces: Claude Opus 4.7 provides more regular, higher-quality updates to the user throughout long agentic traces. If you've added scaffolding to force interim status messages ("After every 3 tool calls, summarize progress"), try removing it. If you find that the length or contents of Claude Opus 4.7's user-facing updates are not well-calibrated to your use case, explicitly describe what these updates should look like in the prompt and provide examples.
Fewer subagents spawned by default: Claude Opus 4.7 tends to spawn fewer subagents by default. However, this behavior is steerable through prompting; give Claude Opus 4.7 explicit guidance around when subagents are desirable.
Stricter effort calibration: In a meaningful change from Claude Opus 4.6, Claude Opus 4.7 respects effort levels strictly, especially at the low end. At low and medium, the model scopes its work to what was asked rather than going above and beyond.

This is good for latency and cost, but on moderately complex tasks running at low effort there is some risk of under-thinking. If you observe shallow reasoning on complex problems, raise effort to high or xhigh rather than prompting around it.

If you need to keep effort at low for latency, add targeted guidance: "This task involves multi-step reasoning. Think carefully through the problem before responding." See Recommended effort levels for Claude Opus 4.7.
Fewer tool calls by default: Claude Opus 4.7 has a tendency to use tools less often than Claude Opus 4.6 and to use reasoning more. This produces better results in most cases.

To increase tool usage, raise the effort setting. high or xhigh effort settings show substantially more tool usage in agentic search and coding. You can also adjust your prompt to explicitly instruct the model about when and how to properly use its tools.
Real-time cybersecurity safeguards: Claude Opus 4.7 adds real-time cybersecurity safeguards, so requests that involve prohibited or high-risk topics may be refused. For legitimate security work such as penetration testing, vulnerability research, or red-teaming, apply to the Cyber Verification Program to request reduced restrictions. See Safeguards, warnings, and appeals for background.
High-resolution image support: Claude Opus 4.7 is the first Claude model with high-resolution image support. Maximum image resolution is 2576 pixels on the long edge, up from 1568 pixels on prior models. This unlocks gains on vision-heavy workloads and is particularly valuable for computer use, screenshot understanding, and document analysis.

High-resolution support is automatic and requires no beta header or client-side opt-in. Two things to plan for:
- Full-resolution images can use up to approximately 3x more image tokens than on prior models (up to 4,784 tokens per image, compared to the previous cap of roughly 1,600 tokens per image). Re-budget max_tokens and cost expectations for image-heavy workloads, or downsample before sending if you do not need the additional fidelity.
- Pointing and bounding-box coordinates returned by the model are 1:1 with actual image pixels on Claude Opus 4.7, so no scale-factor conversion is required.
See High-resolution image support on Claude Opus 4.7 for details.

Recommended changes

These are not required but will improve your experience:

Re-evaluate max_tokens: Because the same text produces a higher token count on Claude Opus 4.7, we suggest updating your max_tokens parameters to give additional headroom, including compaction triggers. Prompting interventions, task_budget, and effort can help control costs and ensure appropriate token usage.
Audit token-count expectations: Any code path that estimates tokens client-side or assumes a fixed token-to-character ratio should be re-tested against Claude Opus 4.7. Use the Token counting endpoint to verify.
Adopt task budgets (beta): Claude Opus 4.7 introduces task budgets. These budgets let you inform Claude how many tokens it has for a full agentic loop, including thinking, tool calls, tool results, and final output. The model sees a running countdown and uses it to prioritize work and finish the task gracefully as the budget is consumed. To use, set the beta header task-budgets-2026-03-13 and add the following to your output config:
```
output_config = {
    "effort": "high",
    "task_budget": {"type": "tokens", "total": 128000},
}
```
You may need to experiment with different task budgets for your use case. If the model is given a task budget that is too restrictive, it may complete the task less thoroughly, referencing its budget as the constraint.

For open-ended agentic tasks where quality matters more than speed, do not set a task budget. Reserve task budgets for workloads where you need the model to scope its work to a token allowance. The minimum value for a task budget is 20k tokens.

A task budget is not a hard cap; it's a suggestion that the model is aware of. It differs from max_tokens:
- task_budget: an advisory cap across the full agentic loop. The model sees it and uses it to pace itself.
- max_tokens: a hard per-request ceiling on generated tokens. It is not passed to the model, so the model is not aware of it.
Use task_budget when you want the model to self-moderate, and max_tokens as a hard ceiling to cap usage.
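
As a rough illustration of how the two limits sit side by side, the sketch below builds request arguments that carry both an advisory task budget and a hard max_tokens ceiling. `build_task_budget_request` is a hypothetical helper, and passing the beta header through a `betas` list is an assumption based on the other examples in this guide; only the header name and the 20k minimum come from the text above.

```python
TASK_BUDGET_BETA = "task-budgets-2026-03-13"
MIN_TASK_BUDGET = 20_000  # minimum task budget stated in this guide


def build_task_budget_request(total_tokens, max_tokens, effort="high"):
    """Build request kwargs with an advisory task budget and a hard ceiling.

    Hypothetical helper: the returned dict is meant to be merged into the
    arguments of a beta messages call.
    """
    if total_tokens < MIN_TASK_BUDGET:
        raise ValueError(f"task budget must be at least {MIN_TASK_BUDGET} tokens")
    return {
        "betas": [TASK_BUDGET_BETA],
        # Hard per-request ceiling; the model is not aware of this value.
        "max_tokens": max_tokens,
        "output_config": {
            "effort": effort,
            # Advisory budget for the whole agentic loop; the model sees it.
            "task_budget": {"type": "tokens", "total": total_tokens},
        },
    }


request_kwargs = build_task_budget_request(total_tokens=128_000, max_tokens=64_000)
print(request_kwargs["output_config"])
```

Rejecting budgets below the minimum on the client side is a design choice in this sketch, made so that a misconfiguration surfaces before a request is sent.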
Set a large max_tokens at max or xhigh effort: If you are running Claude Opus 4.7 at max or xhigh effort, set a large max output token budget so the model has room to think and act across its subagents and tool calls. We recommend starting at 64k tokens and tuning from there.
Downsample images if high resolution is unnecessary: Claude Opus 4.7 supports images up to 2576px / 3.75MP. High-res images use more tokens. If the additional image fidelity is unnecessary, downsample images before sending to Claude to avoid token-usage increases. See Images and vision.
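
If you do downsample, the target dimensions are simple arithmetic. The sketch below computes a size whose long edge fits a chosen limit and maps coordinates from the downsampled image back to the original. Both function names are hypothetical, the sketch constrains only the long edge (not the megapixel count), and the actual resize is left to whichever image library you already use.

```python
def fit_long_edge(width, height, max_long_edge=2576):
    """Return (new_width, new_height, scale) so the long edge fits the limit."""
    long_edge = max(width, height)
    if long_edge <= max_long_edge:
        return width, height, 1.0  # already small enough; leave untouched
    scale = max_long_edge / long_edge
    return max(1, round(width * scale)), max(1, round(height * scale)), scale


def to_original_pixels(x, y, scale):
    """Map a coordinate on the downsampled image back to the original image."""
    return round(x / scale), round(y / scale)


# Shrink a 4000x3000 screenshot to the 1568px long edge used by prior models.
new_w, new_h, scale = fit_long_edge(4000, 3000, max_long_edge=1568)
print(new_w, new_h)

# A point returned for the resized image, expressed in original pixels.
print(to_original_pixels(784, 588, scale))
```

Because the model's coordinates are 1:1 with the pixels of the image it receives, any client-side resize means you own the conversion back to the original resolution, which is what `to_original_pixels` sketches.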

Migration checklist

Migrating to Claude Opus 4.7 from Opus 4.5 or earlier

Update your model name

# Opus migration
model = "claude-opus-4-5"  # Before
model = "claude-opus-4-7"  # After

Breaking changes

Prefill removal is covered in the Opus 4.7 breaking changes above.
Tool parameter quoting: Claude Opus 4.6 and later models may produce slightly different JSON string escaping in tool call arguments (e.g., different handling of Unicode escapes or forward slash escaping). If you parse tool call input as a raw string rather than using a JSON parser, verify your parsing logic. Standard JSON parsers (like json.loads() or JSON.parse()) handle these differences automatically.
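
To see why a JSON parser makes the escaping difference irrelevant, consider two encodings of the same tool input. The raw strings below are invented for illustration and are not actual model output; the point is only that they differ as text but parse to the same value.

```python
import json

# Two hypothetical encodings of the same tool input.
raw_escaped = '{"path": "src\\/main.py", "note": "caf\\u00e9"}'
raw_plain = '{"path": "src/main.py", "note": "café"}'

# Comparing the raw text is brittle: the strings are not equal.
strings_match = raw_escaped == raw_plain

# Comparing parsed values is robust: both decode to the same dict.
parsed_escaped = json.loads(raw_escaped)
parsed_plain = json.loads(raw_plain)
values_match = parsed_escaped == parsed_plain

print(strings_match, values_match)
print(parsed_escaped["path"])
```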

Recommended changes

These changes improve your experience on Opus 4.7. Items marked (required on Opus 4.7) were optional recommendations when Opus 4.6 launched but are now mandatory; the rest remain recommended.

Migrate to adaptive thinking (required on Opus 4.7): thinking: {type: "enabled", budget_tokens: N} returns a 400 error on Claude Opus 4.7. Switch to thinking: {type: "adaptive"} and use the effort parameter to control thinking depth. See Adaptive thinking.
```
# Before (Claude Opus 4.5): manual extended thinking through the beta namespace
response = client.beta.messages.create(
    model="claude-opus-4-5",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 32000},
    betas=["interleaved-thinking-2025-05-14"],
    messages=[...],
)
```
Note that the migration also moves from client.beta.messages.create to client.messages.create. Adaptive thinking and effort are GA features and do not require the beta SDK namespace or any beta headers.
Remove effort beta header: The effort parameter is now GA. Remove betas=["effort-2025-11-24"] from your requests.
Remove fine-grained tool streaming beta header: Fine-grained tool streaming is now GA. Remove betas=["fine-grained-tool-streaming-2025-05-14"] from your requests.
Remove interleaved thinking beta header: Adaptive thinking automatically enables interleaved thinking on Claude Opus 4.7, Opus 4.6, and Sonnet 4.6. Remove betas=["interleaved-thinking-2025-05-14"] from your requests. The header is still functional on Sonnet 4.6 with manual extended thinking, but manual mode is deprecated.
Migrate to output_config.format: If using structured outputs, update output_format={...} to output_config={"format": {...}}. The old parameter remains functional but is deprecated and will be removed in a future model release.
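
The header and parameter clean-up above can be sketched as one small rewrite of the request arguments. `clean_up_request` is a hypothetical helper that assumes your arguments are held in a dict; the `output_format` value in the example is a placeholder, not a complete schema.

```python
# Beta headers this guide says can be removed on Opus 4.7.
GA_BETAS = {
    "effort-2025-11-24",
    "fine-grained-tool-streaming-2025-05-14",
    "interleaved-thinking-2025-05-14",
}


def clean_up_request(params):
    """Drop GA beta headers and move output_format under output_config."""
    new = dict(params)

    # Keep only beta headers that are not listed as GA above.
    remaining = [b for b in new.pop("betas", []) if b not in GA_BETAS]
    if remaining:
        new["betas"] = remaining

    # output_format={...} becomes output_config={"format": {...}}.
    if "output_format" in new:
        output_config = dict(new.get("output_config", {}))
        output_config["format"] = new.pop("output_format")
        new["output_config"] = output_config

    return new


old_kwargs = {
    "model": "claude-opus-4-7",
    "betas": ["effort-2025-11-24", "interleaved-thinking-2025-05-14"],
    "output_config": {"effort": "high"},
    "output_format": {"type": "json_schema", "schema": {"type": "object"}},
}
new_kwargs = clean_up_request(old_kwargs)
print(new_kwargs)
```

When no beta headers remain, the `betas` key is dropped entirely, which matches the move from client.beta.messages.create to client.messages.create described above.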

Migrating from Claude 4.1 or earlier

# From Opus 4.1
model = "claude-opus-4-1-20250805"  # Before
model = "claude-opus-4-7"  # After

# From Sonnet 4
model = "claude-sonnet-4-20250514"  # Before
model = "claude-opus-4-7"  # After

# From Sonnet 3.7
model = "claude-3-7-sonnet-20250219"  # Before
model = "claude-opus-4-7"  # After

Additional breaking changes

Remove sampling parameters

This is a breaking change when migrating from Claude 3.x models.

Starting with Claude Opus 4.7, setting temperature, top_p, or top_k to any non-default value will return a 400 error. The safest migration path is to omit these parameters entirely from requests, and to use prompting to guide the model's behavior. If you were using temperature = 0 for determinism, note that it never guaranteed identical outputs.
Python
```
# Before - This will error in Claude 4+ models
response = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    temperature=0.7,
    top_p=0.9,  # Non-default sampling params return 400 on Opus 4.7
    # ...
)

# After
response = client.messages.create(
    model="claude-opus-4-7",
    # ...
)
```
Update tool versions

This is a breaking change when migrating from Claude 3.x models.

Update to the latest tool versions. Remove any code using the undo_edit command.
```
# Before
tools = [{"type": "text_editor_20250124", "name": "str_replace_editor"}]

# After
tools = [{"type": "text_editor_20250728", "name": "str_replace_based_edit_tool"}]
```
- Text editor: Use text_editor_20250728 and str_replace_based_edit_tool. See Text editor tool documentation for details.
- Code execution: Upgrade to code_execution_20250825. See Code execution tool documentation for migration instructions.

Handle the refusal stop reason

Update your application to handle refusal stop reasons:

Python

response = client.messages.create(...)

if response.stop_reason == "refusal":
    # Handle refusal appropriately
    pass

Handle the model_context_window_exceeded stop reason

Claude 4.5+ models return a model_context_window_exceeded stop reason when generation stops due to hitting the context window limit, rather than the requested max_tokens limit. Update your application to handle this new stop reason:
Python
```
response = client.messages.create(...)

if response.stop_reason == "model_context_window_exceeded":
    # Handle context window limit appropriately
    pass
```
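
In practice the two stop reasons usually end up in the same dispatch. The sketch below shows one possible shape: `next_action` and the action names it returns are hypothetical placeholders for whatever your application does in each case, and only the stop reason strings come from the API.

```python
def next_action(stop_reason):
    """Map a stop_reason to a hypothetical application-level action."""
    if stop_reason == "refusal":
        # The model declined; surface a message instead of retrying blindly.
        return "show_refusal_message"
    if stop_reason == "model_context_window_exceeded":
        # The context window filled up; raising max_tokens will not help.
        return "compact_history_and_retry"
    if stop_reason == "max_tokens":
        # The requested output limit was hit, not the context window.
        return "raise_max_tokens_or_continue"
    return "use_response"


for reason in ("end_turn", "refusal", "max_tokens", "model_context_window_exceeded"):
    print(reason, "->", next_action(reason))
```

Keeping max_tokens and model_context_window_exceeded as separate branches matters because the remedies differ: one calls for a larger output limit, the other for a shorter conversation.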
Verify tool parameter handling (trailing newlines)

Claude 4.5+ models preserve trailing newlines in tool call string parameters that were previously stripped. If your tools rely on exact string matching against tool call parameters, verify your logic handles trailing newlines correctly.
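
As an illustration of the kind of check that can break, the sketch below compares a tool parameter against an allowlist with and without normalizing trailing newlines. The allowlist, the parameter value, and `is_allowed` are all hypothetical.

```python
ALLOWED_COMMANDS = {"ls -la", "git status"}


def is_allowed(command):
    """Compare against the allowlist after dropping trailing newlines."""
    return command.rstrip("\n") in ALLOWED_COMMANDS


# Hypothetical tool call parameter with a preserved trailing newline.
command_param = "git status\n"

exact_match = command_param in ALLOWED_COMMANDS  # brittle exact comparison
normalized_match = is_allowed(command_param)     # tolerant comparison

print(exact_match, normalized_match)
```

Only normalize where the trailing newline carries no meaning. For parameters such as file contents, stripping it would change what gets written.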
Update your prompts for behavioral changes

Claude 4+ models have a more concise, direct communication style and require explicit direction. Review prompting best practices for optimization guidance.

Additional recommended changes

Remove legacy beta headers: Remove token-efficient-tools-2025-02-19 and output-128k-2025-02-19. All Claude 4+ models have built-in token-efficient tool use and these headers have no effect.

Migration checklist (from Opus 4.5 or earlier)

Migrating to Claude Sonnet 4.6

For a complete overview of capabilities, see the models overview.

Sonnet 4.6 pricing is $3 per million input tokens, $15 per million output tokens. See Claude pricing for details.

Update your model name:

# From Sonnet 4.5
model = "claude-sonnet-4-5"  # Before
model = "claude-sonnet-4-6"  # After

# From Sonnet 4
model = "claude-sonnet-4-20250514"  # Before
model = "claude-sonnet-4-6"  # After

Breaking changes

When migrating from Sonnet 4.5

Prefilling assistant messages is no longer supported

This is a breaking change when migrating from Sonnet 4.5 or earlier.

Prefilling assistant messages returns a 400 error on Sonnet 4.6. Use structured outputs, system prompt instructions, or output_config.format instead.

Common prefill use cases and migrations:
- Controlling output formatting (forcing JSON/YAML output): Use structured outputs or tools with enum fields for classification tasks.
- Eliminating preambles (removing "Here is..." phrases): Add direct instructions in the system prompt: "Respond directly without preamble. Do not start with phrases like 'Here is...', 'Based on...', etc."
- Avoiding bad refusals: Claude is much better at appropriate refusals now. Clear prompting in the user message without prefill should be sufficient.
- Continuations (resuming interrupted responses): Move the continuation to the user message: "Your previous response was interrupted and ended with [previous_response]. Continue from where you left off."
- Context hydration / role consistency (refreshing context in long conversations): Inject what were previously prefilled-assistant reminders into the user turn instead.
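
The continuation case is the most mechanical of these migrations, so here is a minimal sketch of it. `move_prefill_to_user_turn` is a hypothetical helper that assumes message content is a plain string; it folds a trailing assistant prefill into the preceding user turn, using wording adapted from the instruction suggested above.

```python
def move_prefill_to_user_turn(messages):
    """Replace a trailing assistant prefill with a user-turn instruction."""
    if not messages or messages[-1]["role"] != "assistant":
        return list(messages)  # nothing to migrate

    *history, prefill = messages
    instruction = (
        "Your previous response was interrupted and ended with:\n"
        f"{prefill['content']}\n\n"
        "Continue from where you left off."
    )

    # Fold the instruction into the last user turn when there is one.
    if history and history[-1]["role"] == "user" and isinstance(history[-1]["content"], str):
        merged = dict(history[-1])
        merged["content"] = f"{merged['content']}\n\n{instruction}"
        return history[:-1] + [merged]
    return history + [{"role": "user", "content": instruction}]


old_messages = [
    {"role": "user", "content": "Write a haiku about autumn."},
    {"role": "assistant", "content": "Autumn moonlight,"},
]
new_messages = move_prefill_to_user_turn(old_messages)
print(new_messages[-1]["content"])
```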
Tool parameter JSON escaping may differ

This is a breaking change when migrating from Sonnet 4.5 or earlier.

JSON string escaping in tool parameters may differ from previous models. Standard JSON parsers handle this automatically, but custom string-based parsing may need updates.

When migrating from Claude 3.x

Update sampling parameters

This is a breaking change when migrating from Claude 3.x models.

Use only temperature OR top_p, not both.
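
A minimal sketch of that rule, assuming request arguments are kept in a dict: `keep_one_sampling_param` is a hypothetical helper, and which parameter to keep is your choice, not something this guide prescribes.

```python
def keep_one_sampling_param(params, prefer="temperature"):
    """Drop top_p or temperature when a request sets both."""
    if prefer not in ("temperature", "top_p"):
        raise ValueError("prefer must be 'temperature' or 'top_p'")
    new = dict(params)
    if "temperature" in new and "top_p" in new:
        drop = "top_p" if prefer == "temperature" else "temperature"
        del new[drop]
    return new


old_kwargs = {"model": "claude-sonnet-4-6", "temperature": 0.7, "top_p": 0.9}
new_kwargs = keep_one_sampling_param(old_kwargs)
print(new_kwargs)
```

This applies when the target is Sonnet 4.6. On Claude Opus 4.7, omit the sampling parameters entirely, as described in the Opus 4.7 breaking changes.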
Update tool versions

This is a breaking change when migrating from Claude 3.x models.

Update to the latest tool versions (text_editor_20250728, code_execution_20250825). Remove any code using the undo_edit command.
Handle the refusal stop reason

Update your application to handle refusal stop reasons.
Update your prompts for behavioral changes

Claude 4 models have a more concise, direct communication style. Review prompting best practices for optimization guidance.

Recommended changes

Remove fine-grained-tool-streaming-2025-05-14 beta header: Fine-grained tool streaming is now GA on Sonnet 4.6 and no longer requires a beta header.
Migrate output_format to output_config.format: The output_format parameter is deprecated. Use output_config.format instead.

Migrating from Sonnet 4.5

Consider migrating from Sonnet 4.5 to Sonnet 4.6, which delivers more intelligence at the same price point.

If you're not using extended thinking

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    output_config={"effort": "low"},
    messages=[{"role": "user", "content": "Your prompt here"}],
)

If you're using extended thinking

If you're using extended thinking with budget_tokens on Sonnet 4.5, it is still functional on Sonnet 4.6 but is deprecated. Migrate to adaptive thinking with the effort parameter.

Migrating to adaptive thinking

Adaptive thinking is the recommended replacement for budget_tokens on Sonnet 4.6. It is particularly well suited to the following workload patterns:

Autonomous multi-step agents: coding agents that turn requirements into working software, data analysis pipelines, and bug finding where the model runs independently across many steps. Adaptive thinking lets the model calibrate its reasoning per step, staying on path over longer trajectories. For these workloads, start at high effort. If latency or token usage is a concern, scale down to medium.
Computer use agents: Sonnet 4.6 achieved best-in-class accuracy on computer use evaluations using adaptive mode.
Bimodal workloads: a mix of easy and hard tasks where adaptive skips thinking on simple queries and reasons deeply on complex ones.

When using adaptive thinking, evaluate medium and high effort on your tasks. The right level depends on your workload's tradeoff between quality, latency, and token usage.

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=64000,
    thinking={"type": "adaptive"},
    output_config={"effort": "medium"},
    messages=[{"role": "user", "content": "Your prompt here"}],
)

Keeping budget_tokens during migration

Coding and agentic use cases

response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16384,
    thinking={"type": "enabled", "budget_tokens": 16384},
    output_config={"effort": "medium"},
    betas=["interleaved-thinking-2025-05-14"],
    messages=[{"role": "user", "content": "Your prompt here"}],
)

Chat and non-coding use cases

For chat, content generation, search, classification, and other non-coding tasks, start with low effort with extended thinking. If you need more depth, increase effort to medium.

response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    thinking={"type": "enabled", "budget_tokens": 16384},
    output_config={"effort": "low"},
    betas=["interleaved-thinking-2025-05-14"],
    messages=[{"role": "user", "content": "Your prompt here"}],
)

Sonnet 4.6 migration checklist

Migrating to Claude Sonnet 4.5

Claude Sonnet 4.5 combines strong intelligence with fast performance, making it ideal for everyday coding, analysis, and content tasks.

For a complete overview of capabilities, see the models overview.

Sonnet 4.5 pricing is $3 per million input tokens, $15 per million output tokens. See Claude pricing for details.

Update your model name:

# From Sonnet 4
model = "claude-sonnet-4-20250514"  # Before
model = "claude-sonnet-4-5-20250929"  # After

# From Sonnet 3.7
model = "claude-3-7-sonnet-20250219"  # Before
model = "claude-sonnet-4-5-20250929"  # After

Breaking changes

These breaking changes apply when migrating from Claude 3.x Sonnet models.

Update sampling parameters

This is a breaking change when migrating from Claude 3.x models.

Use only temperature OR top_p, not both.
Update tool versions

This is a breaking change when migrating from Claude 3.x models.

Update to the latest tool versions (text_editor_20250728, code_execution_20250825). Remove any code using the undo_edit command.
Handle the refusal stop reason

Update your application to handle refusal stop reasons.
Update your prompts for behavioral changes

Claude 4 models have a more concise, direct communication style. Review prompting best practices for optimization guidance.

Sonnet 4.5 migration checklist

Update model ID to claude-sonnet-4-5-20250929
BREAKING: Update tool versions to latest (text_editor_20250728, code_execution_20250825); legacy versions are not supported (if migrating from 3.x)
BREAKING: Remove any code using the undo_edit command (if applicable)
BREAKING: Update sampling parameters to use only temperature OR top_p, not both (if migrating from 3.x)
Handle new refusal stop reason in your application
Review and update prompts following prompting best practices
Consider enabling extended thinking for complex reasoning tasks
Test in development environment before production deployment

Migrating to Claude Haiku 4.5

Claude Haiku 4.5 is the fastest and most intelligent Haiku model with near-frontier performance, delivering premium model quality for interactive applications and high-volume processing.

For a complete overview of capabilities, see the models overview.

Haiku 4.5 pricing is $1 per million input tokens, $5 per million output tokens. See Claude pricing for details.

Update your model name:

# From Haiku 3.5
model = "claude-3-5-haiku-20241022"  # Before
model = "claude-haiku-4-5-20251001"  # After

Review new rate limits: Haiku 4.5 has separate rate limits from Haiku 3.5. See Rate limits documentation for details.

For significant performance improvements on coding and reasoning tasks, consider enabling extended thinking with thinking: {type: "enabled", budget_tokens: N}.

Extended thinking impacts prompt caching efficiency.

Extended thinking is deprecated in Claude 4.6 models and removed in Claude Opus 4.7. If using newer models, use adaptive thinking instead.

Explore new capabilities: See the models overview for details on context awareness, increased output capacity (64k tokens), higher intelligence, and improved speed.

Breaking changes

These breaking changes apply when migrating from Claude 3.x Haiku models.

Update sampling parameters

This is a breaking change when migrating from Claude 3.x models.

Use only temperature OR top_p, not both.
Update tool versions

This is a breaking change when migrating from Claude 3.x models.

Update to the latest tool versions (text_editor_20250728, code_execution_20250825). Remove any code using the undo_edit command.
Handle the refusal stop reason

Update your application to handle refusal stop reasons.
Update your prompts for behavioral changes

Claude 4 models have a more concise, direct communication style. Review prompting best practices for optimization guidance.

Haiku 4.5 migration checklist

Update model ID to claude-haiku-4-5-20251001
BREAKING: Update tool versions to latest (text_editor_20250728, code_execution_20250825); legacy versions are not supported
BREAKING: Remove any code using the undo_edit command (if applicable)
BREAKING: Update sampling parameters to use only temperature OR top_p, not both
Handle new refusal stop reason in your application
Review and adjust for new rate limits (separate from Haiku 3.5)
Review and update prompts following prompting best practices
Consider enabling extended thinking for complex reasoning tasks
Test in development environment before production deployment

Get help

Check the API documentation for detailed specifications
Review model capabilities for performance comparisons
Review API release notes for API updates
Contact support if you encounter any issues during migration
