The effort parameter allows you to control how eager Claude is about spending tokens when responding to requests. This gives you the ability to trade off between response thoroughness and token efficiency, all with a single model.
The effort parameter is currently in beta and only supported by Claude Opus 4.5.
You must include the beta header effort-2025-11-24 when using this feature.
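
For requests made without the SDK, the beta flag is sent in the `anthropic-beta` HTTP header. The sketch below assembles a header set under two assumptions: the standard Messages API headers (`x-api-key`, `anthropic-version`) apply, and the API key is stored in the `ANTHROPIC_API_KEY` environment variable. The helper name `effort_beta_headers` is hypothetical and not part of any SDK.

```python
import os

EFFORT_BETA = "effort-2025-11-24"  # beta flag named in this document


def effort_beta_headers(api_key=None):
    """Build HTTP headers for a raw Messages API call with the effort beta enabled.

    Hypothetical helper for illustration: it only assembles a dict and sends nothing.
    """
    # Assumption: fall back to the ANTHROPIC_API_KEY environment variable.
    key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
    return {
        "x-api-key": key,
        "anthropic-version": "2023-06-01",
        "anthropic-beta": EFFORT_BETA,
        "content-type": "application/json",
    }
```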
By default, Claude uses maximum effort—spending as many tokens as needed for the best possible outcome. By lowering the effort level, you can instruct Claude to be more conservative with token usage, optimizing for speed and cost while accepting some reduction in capability.
Setting effort to "high" produces exactly the same behavior as omitting the effort parameter entirely.
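
Because omitting the parameter and requesting "high" behave the same, a wrapper can treat effort as optional. A minimal sketch, assuming the `betas` and `output_config` arguments of `client.beta.messages.create`; the helper `effort_kwargs` is hypothetical:

```python
VALID_EFFORT_LEVELS = ("low", "medium", "high")


def effort_kwargs(effort=None):
    """Return extra keyword arguments for client.beta.messages.create.

    Hypothetical helper. With effort=None nothing is added, which per the
    text above behaves the same as requesting "high".
    """
    if effort is None:
        return {}
    if effort not in VALID_EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {VALID_EFFORT_LEVELS}, got {effort!r}")
    return {"betas": ["effort-2025-11-24"], "output_config": {"effort": effort}}
```

The result can be unpacked into a request, for example `client.beta.messages.create(model=..., max_tokens=..., messages=..., **effort_kwargs("low"))`.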
The effort parameter affects all tokens in the response, including:
This approach has two major advantages:
| Level | Description | Typical use case |
|---|---|---|
| high | Maximum capability. Claude uses as many tokens as needed for the best possible outcome. Equivalent to not setting the parameter. | Complex reasoning, difficult coding problems, agentic tasks |
| medium | Balanced approach with moderate token savings. | Agentic tasks that require a balance of speed, cost, and performance |
| low | Most efficient. Significant token savings with some capability reduction. | Simpler tasks that need the best speed and lowest costs, such as subagents |
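
One way to apply the table in an application is a small lookup from task category to effort level. The sketch below is illustrative only: the task labels are hypothetical names invented for this example, and the mapping simply restates the "Typical use case" column.

```python
# Hypothetical task labels mapped to the levels in the table above.
EFFORT_BY_TASK = {
    "complex_reasoning": "high",
    "difficult_coding": "high",
    "balanced_agent": "medium",
    "subagent": "low",
    "simple_task": "low",
}


def choose_effort(task_kind):
    """Pick an effort level for a task label.

    Unknown labels fall back to "high", which matches the default behavior.
    """
    return EFFORT_BY_TASK.get(task_kind, "high")
```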
```python
import anthropic

# The client reads the API key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

response = client.beta.messages.create(
    model="claude-opus-4-5-20251101",
    betas=["effort-2025-11-24"],  # beta flag required for the effort parameter
    max_tokens=4096,
    messages=[{
        "role": "user",
        "content": "Analyze the trade-offs between microservices and monolithic architectures"
    }],
    output_config={
        "effort": "medium"  # one of "low", "medium", "high"
    }
)

print(response.content[0].text)
```

When using tools, the effort parameter affects both the explanations around tool calls and the tool calls themselves. Lower effort levels tend to:
Higher effort levels may:
The effort parameter works alongside the thinking token budget when extended thinking is enabled. These two controls serve different purposes:
The effort parameter can be used with or without extended thinking enabled. When both are configured:
For best performance on complex reasoning tasks, use high effort (the default) with a high thinking token budget. This allows Claude to think thoroughly and provide comprehensive responses.
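
The recommendation above can be sketched as a set of request arguments. This assumes the `thinking` parameter of the Messages API (`{"type": "enabled", "budget_tokens": ...}`) and its requirements that the budget is at least 1024 tokens and that `max_tokens` exceeds it. The helper name and the default token counts are hypothetical, chosen only for illustration.

```python
def high_effort_thinking_kwargs(budget_tokens=16000, answer_tokens=4096):
    """Build request arguments combining high effort with a large thinking budget.

    Hypothetical helper: the default numbers are illustrative, not recommendations.
    """
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be at least 1024")
    return {
        "betas": ["effort-2025-11-24"],
        # max_tokens must be larger than the thinking budget, so reserve
        # room for the visible answer on top of it.
        "max_tokens": budget_tokens + answer_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        # "high" is the default; it is set explicitly here for readability.
        "output_config": {"effort": "high"},
    }
```

These arguments can be unpacked into `client.beta.messages.create(model=..., messages=..., **high_effort_thinking_kwargs())`.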