Claude Sonnet 5 is the next generation of Anthropic's Sonnet model family. It is a drop-in upgrade for Claude Sonnet 4.6 with three behavior changes: adaptive thinking is on by default, manual extended thinking now returns a 400 error (it was deprecated on Claude Sonnet 4.6), and setting sampling parameters (temperature, top_p, top_k) to non-default values returns a 400 error. This page summarizes everything new at launch, including a new tokenizer.
| Model | API model ID | Description |
|---|---|---|
| Claude Sonnet 5 | claude-sonnet-5 | The best combination of speed and intelligence |
Claude Sonnet 5 supports the 1M token context window by default (1M tokens is both the default and the maximum; there is no smaller context variant), 128k max output tokens, adaptive thinking, and the same set of tools and platform features as Claude Sonnet 4.6, except Priority Tier, which is not available on Claude Sonnet 5.
For complete pricing and specs, see the models overview.
On Claude Sonnet 4.6, requests without a thinking field run without thinking. On Claude Sonnet 5, the same requests run with adaptive thinking. To turn thinking off, pass thinking: {type: "disabled"}. Because max_tokens is a hard limit on total output (thinking plus response text), revisit it for workloads that ran without thinking on Claude Sonnet 4.6.
Setting temperature, top_p, or top_k to a non-default value returns a 400 error. Remove these parameters when migrating; the default value (or omitting the parameter) is accepted. Use system-prompt instructions to guide model behavior. This is new for Sonnet-class models; the same constraint was previously introduced on Claude Opus 4.7.
Manual extended thinking (thinking: {type: "enabled", budget_tokens: N}) was deprecated on Claude Sonnet 4.6; on Claude Sonnet 5 it is removed and returns a 400 error, the same as on Claude Opus 4.8 and Claude Opus 4.7. Use adaptive thinking with the effort parameter instead.
# Not supported on Claude Sonnet 5 (returns 400)
thinking = {"type": "enabled", "budget_tokens": 32000}
# Use this instead
thinking = {"type": "adaptive"}Claude Sonnet 5 uses a new tokenizer. The same input text produces approximately 30% more tokens than on Claude Sonnet 4.6. This is not an API change: requests, responses, and streaming events keep the same shape, and no code changes are required.
The change affects anything you measure or budget in tokens:
usage fields and token counting results for the same text are higher than on Claude Sonnet 4.6. Don't reuse counts measured against earlier models; recount against Claude Sonnet 5.max_tokens budgets: an output limit tuned for Claude Sonnet 4.6 may truncate equivalent output on Claude Sonnet 5. Revisit limits sized close to your expected output length.This constraint is unchanged from Claude Sonnet 4.6. Aside from the three behavior changes (see Migration guide), code that already runs on Claude Sonnet 4.6 needs no other changes.
Prefilling the assistant message returns a 400 error, unchanged from Claude Sonnet 4.6. Use structured outputs, system prompt instructions, or output_config.format instead.
Claude Sonnet 5 is a capability upgrade over Claude Sonnet 4.6 at the same price. It is also an option for workloads that need more capability than Claude Sonnet 4.6 provides without moving to an Opus-class model.
The largest gains over Claude Sonnet 4.6 are in coding and agentic tasks. For benchmark results, see Anthropic's Transparency Hub.
Claude Sonnet 5 is the first Sonnet-tier model with real-time cybersecurity safeguards. Requests that involve prohibited or high-risk cybersecurity topics may be refused. Refusals return as a successful HTTP 200 response with stop_reason: "refusal", not an error. See Safeguards, warnings, and appeals for background.
Claude Sonnet 5 is priced at $3 per million input tokens and $15 per million output tokens, unchanged from Claude Sonnet 4.6. Because the new tokenizer produces approximately 30% more tokens for the same text, the cost of an equivalent request can differ from Claude Sonnet 4.6 even though per-token pricing is unchanged.
Introductory pricing of $2/$10 per million input/output tokens is in effect through August 31, 2026, after which the standard pricing of $3/$15 per million input/output tokens will take effect.
See Pricing for complete pricing, including batch processing and prompt caching rates.
At launch, Claude Sonnet 5 is available on:
InvokeModel and Converse APIs).Claude Sonnet 5 supports zero data retention for organizations with ZDR agreements.
Claude Sonnet 5 is a drop-in replacement for Claude Sonnet 4.6. Update your model ID:
model = "claude-sonnet-4-6" # Before
model = "claude-sonnet-5" # AfterThen review the following:
max_tokens limits sized close to your expected output length.budget_tokens, migrate to adaptive thinking. Manual extended thinking (thinking: {type: "enabled"}) is not supported and returns a 400 error.temperature, top_p, top_k) to a non-default value return a 400 error; remove them when migrating. Tool definitions and response shapes are unchanged, and assistant message prefilling was already unsupported on Claude Sonnet 4.6.See the Claude Sonnet 5 section of the migration guide for details.
Complete specs and pricing for all current Claude models.
Measure your prompts under the new tokenizer before you migrate.
The recommended thinking-on mode on Claude Sonnet 5.
How the 1M token context window works.
Complete pricing, including batch processing and prompt caching rates.
Was this page helpful?