Literature: Anthropic Thinking Capabilities (Batch 2, Sub-batch B)
Synthesis of three Anthropic documentation pages covering extended thinking, adaptive thinking, and the effort parameter. Crawled 2026-05-02 to 2026-05-03 as part of Batch 2 ingestion.
The Thinking Mode Landscape
Anthropic's thinking capabilities let Claude reason step-by-step before producing a final response. There are three modes:
| Mode | Config | Model support |
|---|---|---|
| Adaptive | thinking: {type: "adaptive"} |
Mythos Preview (default/only), Opus 4.7 (only), Opus 4.6, Sonnet 4.6 |
| Manual | thinking: {type: "enabled", budget_tokens: N} |
All models *except* Opus 4.7 (rejected with 400). Deprecated on Opus 4.6, Sonnet 4.6. |
| Disabled | Omit thinking or {type: "disabled"} |
All models except Mythos Preview |
Invariant (critical): Manual thinking (type: "enabled", budget_tokens) on Claude Opus 4.7 is not deprecated — it is removed. Sending it returns a 400 error. This is not the same as a deprecation with a grace period.
Extended Thinking (Manual Mode)
When extended thinking is on, Claude produces thinking content blocks containing its internal reasoning, followed by text content blocks with the final answer. Applications see both.
budget_tokens sets the maximum thinking token budget. On Claude 4 models, this applies to the full thinking tokens (not the summarized output). Claude may use less than the budget. Diminishing returns above 32k tokens; use batch processing for thinking budgets that push above 32k to avoid timeout exposure.
budget_tokens must be less than max_tokens. This makes extended thinking incompatible with max_tokens: 0 (cache pre-warming).
Adaptive Thinking
Adaptive thinking removes manual budget management. Claude evaluates request complexity and determines when and how much to think. Set thinking.type: "adaptive":
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={"type": "adaptive"},
messages=[{"role": "user", "content": "..."}],
)
No beta header is required for adaptive thinking on any model.
Adaptive mode automatically enables interleaved thinking — Claude can reason between tool calls, not just before the first one. On Opus 4.7 and Mythos Preview, inter-tool reasoning always moves into thinking blocks. On Opus 4.6 and Sonnet 4.6 with adaptive mode, interleaved thinking is also automatic.
The Effort Parameter
effort is the replacement for budget_tokens as a thinking-depth control. It passes as output_config: {effort: "<level>"} alongside adaptive thinking. The effort level is soft guidance — it affects all token spend (text, tool calls, and thinking), not just thinking depth.
| Level | Behavior | Availability |
|---|---|---|
max |
Always thinks, no depth constraint | Mythos Preview, Opus 4.7, Opus 4.6, Sonnet 4.6 |
xhigh |
Extended, long-horizon work | Opus 4.7 only |
high |
Always thinks deeply (API default) | All supported models |
medium |
Moderate thinking; may skip simple queries | All supported models |
low |
Minimizes thinking; fastest | All supported models |
At low effort, Claude also makes fewer tool calls and proceeds more directly to action.
Guidance by model:
- Opus 4.7: Start at
xhighfor coding/agentic;highminimum for intelligence-sensitive tasks;maxfor frontier problems. - Sonnet 4.6:
mediumfor most;highwhen maximum intelligence required;lowfor high-volume/latency-sensitive.
Summarized vs. Omitted Thinking Display
The display field controls what appears in API responses — it does not affect cost.
"summarized"(default on Opus 4.6, Sonnet 4.6): Thinking blocks contain a summary of Claude's reasoning. Billed for full thinking tokens; only summary visible."omitted"(default on Opus 4.7, Mythos Preview): Thinking blocks have an emptythinkingfield. Reduces time-to-first-text-token during streaming because thinking tokens are not streamed. Same cost as summarized.
The signature field is always present and identical regardless of display. It carries encrypted full thinking content for multi-turn continuity.
Warning: On Opus 4.7, display silently changed default from "summarized" (Opus 4.6 behavior) to "omitted". Code that processed thinking blocks from Opus 4.6 may need display: "summarized" set explicitly when migrating to Opus 4.7.
Thinking Encryption and Round-Tripping
Full thinking content is encrypted and stored in the opaque signature field. Rules:
- Must pass thinking blocks back unchanged when using tools with thinking (tool-use continuity).
- On Opus 4.5+ and Sonnet 4.6+, thinking blocks from previous turns are preserved by default. On earlier Opus/Sonnet and all Haiku, only the last turn's thinking is preserved by default.
redacted_thinkingblocks are a distinct type returned when portions of thinking are safety-redacted. They must also be passed back unchanged in multi-turn tool-use conversations. Filter code usingblock.type == "thinking"alone will silently dropredacted_thinkingblocks — include"redacted_thinking"in the filter.signaturevalues are cross-platform (Claude API, Bedrock, Vertex AI).
Interleaved Thinking Model Support
| Model | Interleaved thinking |
|---|---|
| Mythos Preview | Automatic (no header needed) |
| Opus 4.7 (adaptive only) | Automatic (no header needed) |
| Opus 4.6 (adaptive) | Automatic (no header needed); interleaved-thinking-2025-05-14 deprecated, safely ignored |
| Sonnet 4.6 (adaptive) | Automatic |
| Sonnet 4.6 (manual) | Via interleaved-thinking-2025-05-14 beta header (deprecated) |
| Opus 4.5, Sonnet 4.5, Opus 4.1 | Via interleaved-thinking-2025-05-14 beta header |
Thinking Block Preservation by Model
| Model class | Default thinking block preservation |
|---|---|
| Opus 4.5+, Sonnet 4.6+, Mythos Preview | All prior thinking blocks kept |
| Earlier Opus/Sonnet | Last turn only |
| All Haiku through Haiku 4.5 | Last turn only |
Override with the clear_thinking_20251015 context-editing strategy.
Prompt Caching Interactions
- Consecutive requests using the same thinking mode (
adaptiveorenabled) preserve cache breakpoints. - Switching between
adaptive,enabled, anddisabledbreaks message cache breakpoints. - System prompts and tool definitions remain cached regardless of thinking mode changes.
- Extended thinking tasks may run > 5 minutes; use the 1-hour cache TTL (
extended-cache-ttl-2025-04-11). - Thinking block preservation on Opus 4.5+ and Sonnet 4.6+ enables cache hits when thinking blocks are passed back with tool results.
Cost Model
You are charged for the full thinking tokens generated, not the summarized tokens visible in the response. The billed output token count will not match the visible token count when display: "omitted" or "summarized". Use max_tokens as a hard ceiling on total output (thinking + text). At high or max effort, monitor for stop_reason: "max_tokens".
Context Window Behavior
With extended thinking, max_tokens is enforced as a strict limit. Claude 3.7+ and Claude 4 models return a validation error if prompt_tokens + max_tokens exceeds the context window. (Earlier models silently adjusted max_tokens.)
Stable Patterns vs. Operational Details
Stable (architecture):
- Three-mode landscape (adaptive, manual, disabled) with hard model constraints
- Adaptive thinking is the trajectory; manual is deprecated on current flagship models
signaturefield as opaque round-trip token for thinking continuity- Interleaved thinking as the key capability advantage for agentic workflows
displayfield as latency/visibility control, not cost control
Operational (likely to drift):
- Per-model default for
display(watch for future model changes) - Specific effort level availability by model
redacted_thinkingoccurrence patternsinterleaved-thinking-2025-05-14beta header deprecation timeline
Notes for Synthesis
This sub-batch is the primary source for anthropic-adaptive-thinking. The existing permanent note from the advanced-capabilities batch (2026-05-02) is high quality and covers this material well. The batch-2 crawl provides more granular provenance (page IDs) but no significant new content over what was already synthesized.