# 自适应思考让 Claude 通过自适应思考模式动态决定何时使用以及使用多少扩展思考。 --- 此功能符合 [零数据保留 (ZDR)](/docs/en/build-with-claude/api-and-data-retention) 条件。当您的组织拥有 ZDR 协议时，通过此功能发送的数据在 API 响应返回后不会被存储。自适应思考是在 Claude Opus 4.7、Claude Opus 4.6 和 Claude Sonnet 4.6 上使用[扩展思考](/docs/en/build-with-claude/extended-thinking)的推荐方式，也是 [Claude Mythos Preview](https://anthropic.com/glasswing) 上的默认模式（当未设置 `thinking` 时自动应用）。自适应思考无需手动设置思考 token 预算，而是让 Claude 根据每个请求的复杂度动态决定何时使用以及使用多少扩展思考。在 Claude Opus 4.7 上，自适应思考是**唯一**支持的思考模式；手动设置 `thinking: {type: "enabled", budget_tokens: N}` 不再被接受。对于许多工作负载，自适应思考可以比使用固定 `budget_tokens` 的扩展思考带来更好的性能，特别是双峰任务和长时间代理工作流。无需 beta 请求头。如果您的工作负载需要可预测的延迟或精确控制思考成本，使用 `budget_tokens` 的扩展思考在 Claude Opus 4.6 和 Claude Sonnet 4.6 上仍然可用，但已被弃用且不再推荐。请参阅下方的警告。 ## 支持的模型以下模型支持自适应思考： - Claude Mythos Preview (`claude-mythos-preview`)，自适应思考是默认模式；不支持 `thinking: {type: "disabled"}` - Claude Opus 4.7 (`claude-opus-4-7`)，自适应思考是唯一支持的思考模式。除非您在请求中明确设置 `thinking: {type: "adaptive"}`，否则思考功能处于关闭状态；手动设置 `thinking: {type: "enabled"}` 将返回 400 错误。 - Claude Opus 4.6 (`claude-opus-4-6`) - Claude Sonnet 4.6 (`claude-sonnet-4-6`) `thinking.type: "enabled"` 和 `budget_tokens` 在 Opus 4.6 和 Sonnet 4.6 上已被[**弃用**](/docs/en/build-with-claude/overview#feature-availability)，将在未来的模型版本中移除。请改用 `thinking.type: "adaptive"` 配合 `effort` 参数。现有的 `budget_tokens` 配置仍然可用但不再推荐；请计划迁移。较旧的模型（Sonnet 4.5、Opus 4.5 等）不支持自适应思考，需要使用 `thinking.type: "enabled"` 配合 `budget_tokens`。 ## 自适应思考的工作原理在自适应模式下，思考对模型来说是可选的。Claude 会评估每个请求的复杂度，并决定是否使用以及使用多少扩展思考。在默认的 effort 级别（`high`）下，Claude 几乎总是会进行思考。在较低的 effort 级别下，Claude 可能会跳过对较简单问题的思考。自适应思考还会自动启用[交错思考](/docs/en/build-with-claude/extended-thinking#interleaved-thinking)。这意味着 Claude 可以在工具调用之间进行思考，使其对代理工作流特别有效。 ## 如何使用自适应思考在 API 请求中将 `thinking.type` 设置为 `"adaptive"`： ```bash cURL curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-opus-4-7", "max_tokens": 16000, "thinking": { "type": "adaptive" }, "messages": [ { "role": "user", "content": "Explain why the sum of two even numbers is always even." } ] }' ``` ```bash CLI nocheck ant messages create \ --model claude-opus-4-7 \ --max-tokens 16000 \ --thinking '{type: adaptive}' \ --message '{role: user, content: Explain why the sum of two even numbers is always even.}' \ --transform content --format jsonl | jq -r ' if .type == "thinking" then "\nThinking: \(.thinking)" elif .type == "text" then "\nResponse: \(.text)" else empty end' ``` ```python Python hidelines={1..2} import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-opus-4-7", max_tokens=16000, thinking={"type": "adaptive"}, messages=[ { "role": "user", "content": "Explain why the sum of two even numbers is always even.", } ], ) for block in response.content: if block.type == "thinking": print(f"\nThinking: {block.thinking}") elif block.type == "text": print(f"\nResponse: {block.text}") ``` ```typescript TypeScript hidelines={1..2} import Anthropic from "@anthropic-ai/sdk"; const client = new Anthropic(); const response = await client.messages.create({ model: "claude-opus-4-7", max_tokens: 16000, thinking: { type: "adaptive" }, messages: [ { role: "user", content: "Explain why the sum of two even numbers is always even." } ] }); for (const block of response.content) { if (block.type === "thinking") { console.log(`\nThinking: ${block.thinking}`); } else if (block.type === "text") { console.log(`\nResponse: ${block.text}`); } } ``` ```csharp C# using System; using System.Threading.Tasks; using Anthropic; using Anthropic.Models.Messages; class Program { static async Task Main(string[] args) { AnthropicClient client = new(); var parameters = new MessageCreateParams { Model = Model.ClaudeOpus4_7, MaxTokens = 16000, Thinking = new ThinkingConfigAdaptive(), Messages = [ new() { Role = Role.User, Content = "Explain why the sum of two even numbers is always even." } ] }; var message = await client.Messages.Create(parameters); foreach (var block in message.Content) { if (block.TryPickThinking(out ThinkingBlock? thinking)) { Console.WriteLine($"\nThinking: {thinking.Thinking}"); } else if (block.TryPickText(out TextBlock? text)) { Console.WriteLine($"\nResponse: {text.Text}"); } } } } ``` ```go Go hidelines={1..11,-1} package main import ( "context" "fmt" "log" "github.com/anthropics/anthropic-sdk-go" ) func main() { client := anthropic.NewClient() response, err := client.Messages.New(context.TODO(), anthropic.MessageNewParams{ Model: anthropic.ModelClaudeOpus4_7, MaxTokens: 16000, Thinking: anthropic.ThinkingConfigParamUnion{ OfAdaptive: &anthropic.ThinkingConfigAdaptiveParam{}, }, Messages: []anthropic.MessageParam{ anthropic.NewUserMessage(anthropic.NewTextBlock("Explain why the sum of two even numbers is always even.")), }, }) if err != nil { log.Fatal(err) } for _, block := range response.Content { switch v := block.AsAny().(type) { case anthropic.ThinkingBlock: fmt.Printf("\nThinking: %s", v.Thinking) case anthropic.TextBlock: fmt.Printf("\nResponse: %s", v.Text) } } } ``` ```java Java hidelines={1..5,7..9,-2..} import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.ThinkingConfigAdaptive; public class ExtendedThinkingExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_7) .maxTokens(16000L) .thinking(ThinkingConfigAdaptive.builder().build()) .addUserMessage("Explain why the sum of two even numbers is always even.") .build(); Message response = client.messages().create(params); response.content().forEach(block -> { block.thinking().ifPresent(thinkingBlock -> System.out.println("\nThinking: " + thinkingBlock.thinking()) ); block.text().ifPresent(textBlock -> System.out.println("\nResponse: " + textBlock.text()) ); }); } } ``` ```php PHP hidelines={1..4} messages->create( maxTokens: 16000, messages: [ [ 'role' => 'user', 'content' => 'Explain why the sum of two even numbers is always even.' ] ], model: 'claude-opus-4-7', thinking: ['type' => 'adaptive'], ); foreach ($message->content as $block) { if ($block->type === 'thinking') { echo "\nThinking: " . $block->thinking; } elseif ($block->type === 'text') { echo "\nResponse: " . $block->text; } } ``` ```ruby Ruby hidelines={1..2} require "anthropic" client = Anthropic::Client.new message = client.messages.create( model: "claude-opus-4-7", max_tokens: 16000, thinking: { type: "adaptive" }, messages: [ { role: "user", content: "Explain why the sum of two even numbers is always even." } ] ) message.content.each do |block| case block.type when :thinking puts "\nThinking: #{block.thinking}" when :text puts "\nResponse: #{block.text}" end end ``` ## 使用 effort 参数的自适应思考您可以将自适应思考与 [effort 参数](/docs/en/build-with-claude/effort)结合使用，以引导 Claude 的思考量。effort 级别作为 Claude 思考分配的软指导： | Effort 级别 | 思考行为 | |:-------------|:------------------| | `max` | Claude 始终进行思考，对思考深度没有限制。可在 Claude Mythos Preview、Claude Opus 4.7、Claude Opus 4.6 和 Claude Sonnet 4.6 上使用。 | | `xhigh` | Claude 始终进行深度思考并进行扩展探索。可在 Claude Opus 4.7 上使用。 | | `high`（默认） | Claude 始终进行思考。对复杂任务提供深度推理。 | | `medium` | Claude 使用适度思考。对于非常简单的查询可能会跳过思考。 | | `low` | Claude 最小化思考。对于速度优先的简单任务跳过思考。 | ```bash cURL curl https://api.anthropic.com/v1/messages \ --header "x-api-key: $ANTHROPIC_API_KEY" \ --header "anthropic-version: 2023-06-01" \ --header "content-type: application/json" \ --data \ '{ "model": "claude-opus-4-7", "max_tokens": 16000, "thinking": { "type": "adaptive" }, "output_config": { "effort": "medium" }, "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }' ``` ```bash CLI ant messages create \ --transform 'content.0.text' --raw-output <<'YAML' model: claude-opus-4-7 max_tokens: 16000 thinking: type: adaptive output_config: effort: medium messages: - role: user content: What is the capital of France? YAML ``` ```python Python hidelines={1..2} import anthropic client = anthropic.Anthropic() response = client.messages.create( model="claude-opus-4-7", max_tokens=16000, thinking={"type": "adaptive"}, output_config={"effort": "medium"}, messages=[{"role": "user", "content": "What is the capital of France?"}], ) print(response.content[0].text) ``` ```typescript TypeScript hidelines={1..2} import Anthropic from "@anthropic-ai/sdk"; const client = new Anthropic(); const response = await client.messages.create({ model: "claude-opus-4-7", max_tokens: 16000, thinking: { type: "adaptive" }, output_config: { effort: "medium" }, messages: [ { role: "user", content: "What is the capital of France?" } ] }); for (const block of response.content) { if (block.type === "text") { console.log(block.text); } } ``` ```csharp C# using System; using System.Threading.Tasks; using Anthropic; using Anthropic.Models.Messages; class Program { static async Task Main(string[] args) { AnthropicClient client = new(); var parameters = new MessageCreateParams { Model = Model.ClaudeOpus4_7, MaxTokens = 16000, Thinking = new ThinkingConfigAdaptive(), OutputConfig = new OutputConfig { Effort = Effort.Medium }, Messages = [new() { Role = Role.User, Content = "What is the capital of France?" }] }; var message = await client.Messages.Create(parameters); Console.WriteLine(message); } } ``` ```go Go hidelines={1..11,-1} package main import ( "context" "fmt" "log" "github.com/anthropics/anthropic-sdk-go" ) func main() { client := anthropic.NewClient() response, err := client.Messages.New(context.TODO(), anthropic.MessageNewParams{ Model: anthropic.ModelClaudeOpus4_7, MaxTokens: 16000, Thinking: anthropic.ThinkingConfigParamUnion{ OfAdaptive: &anthropic.ThinkingConfigAdaptiveParam{}, }, OutputConfig: anthropic.OutputConfigParam{ Effort: anthropic.OutputConfigEffortMedium, }, Messages: []anthropic.MessageParam{ anthropic.NewUserMessage(anthropic.NewTextBlock("What is the capital of France?")), }, }) if err != nil { log.Fatal(err) } fmt.Println(response.Content[0].Text) } ``` ```java Java hidelines={1..5,8..10,-2..} import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Message; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.OutputConfig; import com.anthropic.models.messages.ThinkingConfigAdaptive; public class Main { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_7) .maxTokens(16000L) .thinking(ThinkingConfigAdaptive.builder().build()) .outputConfig(OutputConfig.builder() .effort(OutputConfig.Effort.MEDIUM) .build()) .addUserMessage("What is the capital of France?") .build(); Message response = client.messages().create(params); response.content().stream() .flatMap(block -> block.text().stream()) .forEach(textBlock -> System.out.println(textBlock.text())); } } ``` ```php PHP hidelines={1..4} messages->create( maxTokens: 16000, messages: [ ['role' => 'user', 'content' => 'What is the capital of France?'] ], model: 'claude-opus-4-7', thinking: ['type' => 'adaptive'], outputConfig: ['effort' => 'medium'], ); echo $message->content[0]->text; ``` ```ruby Ruby hidelines={1..2} require "anthropic" client = Anthropic::Client.new message = client.messages.create( model: "claude-opus-4-7", max_tokens: 16000, thinking: { type: "adaptive" }, output_config: { effort: "medium" }, messages: [ { role: "user", content: "What is the capital of France?" } ] ) puts message.content.first.text ``` ## 使用自适应思考进行流式传输自适应思考可以与[流式传输](/docs/en/build-with-claude/streaming)无缝配合。思考块通过 `thinking_delta` 事件进行流式传输，与手动思考模式相同： ```bash CLI ant messages create --stream --format jsonl \ --model claude-opus-4-7 \ --max-tokens 16000 \ --thinking '{type: adaptive}' \ --message '{role: user, content: What is the greatest common divisor of 1071 and 462?}' ``` ```python Python hidelines={1..2} import anthropic client = anthropic.Anthropic() with client.messages.stream( model="claude-opus-4-7", max_tokens=16000, thinking={"type": "adaptive"}, messages=[ { "role": "user", "content": "What is the greatest common divisor of 1071 and 462?", } ], ) as stream: for event in stream: if event.type == "content_block_start": print(f"\nStarting {event.content_block.type} block...") elif event.type == "content_block_delta": if event.delta.type == "thinking_delta": print(event.delta.thinking, end="", flush=True) elif event.delta.type == "text_delta": print(event.delta.text, end="", flush=True) ``` ```typescript TypeScript hidelines={1..2} import Anthropic from "@anthropic-ai/sdk"; const client = new Anthropic(); const stream = await client.messages.stream({ model: "claude-opus-4-7", max_tokens: 16000, thinking: { type: "adaptive" }, messages: [{ role: "user", content: "What is the greatest common divisor of 1071 and 462?" }] }); for await (const event of stream) { if (event.type === "content_block_start") { console.log(`\nStarting ${event.content_block.type} block...`); } else if (event.type === "content_block_delta") { if (event.delta.type === "thinking_delta") { process.stdout.write(event.delta.thinking); } else if (event.delta.type === "text_delta") { process.stdout.write(event.delta.text); } } } ``` ```csharp C# using System; using System.Threading.Tasks; using Anthropic; using Anthropic.Models.Messages; class Program { static async Task Main(string[] args) { AnthropicClient client = new(); var parameters = new MessageCreateParams { Model = Model.ClaudeOpus4_7, MaxTokens = 16000, Thinking = new ThinkingConfigAdaptive(), Messages = [new() { Role = Role.User, Content = "What is the greatest common divisor of 1071 and 462?" }] }; await foreach (var msg in client.Messages.CreateStreaming(parameters)) { Console.Write(msg); } } } ``` ```go Go hidelines={1..11,-1} package main import ( "context" "fmt" "log" "github.com/anthropics/anthropic-sdk-go" ) func main() { client := anthropic.NewClient() stream := client.Messages.NewStreaming(context.TODO(), anthropic.MessageNewParams{ Model: anthropic.ModelClaudeOpus4_7, MaxTokens: 16000, Thinking: anthropic.ThinkingConfigParamUnion{ OfAdaptive: &anthropic.ThinkingConfigAdaptiveParam{}, }, Messages: []anthropic.MessageParam{ anthropic.NewUserMessage(anthropic.NewTextBlock("What is the greatest common divisor of 1071 and 462?")), }, }) for stream.Next() { event := stream.Current() switch eventVariant := event.AsAny().(type) { case anthropic.ContentBlockStartEvent: fmt.Printf("\nStarting %s block...\n", eventVariant.ContentBlock.Type) case anthropic.ContentBlockDeltaEvent: switch deltaVariant := eventVariant.Delta.AsAny().(type) { case anthropic.ThinkingDelta: fmt.Print(deltaVariant.Thinking) case anthropic.TextDelta: fmt.Print(deltaVariant.Text) } } } if err := stream.Err(); err != nil { log.Fatal(err) } } ``` ```java Java hidelines={1..4,6..8,-2..} import com.anthropic.client.AnthropicClient; import com.anthropic.client.okhttp.AnthropicOkHttpClient; import com.anthropic.models.messages.MessageCreateParams; import com.anthropic.models.messages.Model; import com.anthropic.models.messages.ThinkingConfigAdaptive; public class StreamingThinkingExample { public static void main(String[] args) { AnthropicClient client = AnthropicOkHttpClient.fromEnv(); MessageCreateParams params = MessageCreateParams.builder() .model(Model.CLAUDE_OPUS_4_7) .maxTokens(16000L) .thinking(ThinkingConfigAdaptive.builder().build()) .addUserMessage("What is the greatest common divisor of 1071 and 462?") .build(); try (var streamResponse = client.messages().createStreaming(params)) { streamResponse.stream().forEach(event -> { if (event.contentBlockStart().isPresent()) { var startEvent = event.contentBlockStart().get(); var block = startEvent.contentBlock(); if (block.isThinking()) { System.out.println("\nStarting thinking block..."); } else if (block.isText()) { System.out.println("\nStarting text block..."); } } else if (event.contentBlockDelta().isPresent()) { var deltaEvent = event.contentBlockDelta().get(); deltaEvent.delta().thinking().ifPresent(td -> System.out.print(td.thinking()) ); deltaEvent.delta().text().ifPresent(td -> System.out.print(td.text()) ); } }); } } } ``` ```php PHP hidelines={1..4} messages->createStream( maxTokens: 16000, messages: [ ['role' => 'user', 'content' => 'What is the greatest common divisor of 1071 and 462?'] ], model: 'claude-opus-4-7', thinking: ['type' => 'adaptive'], ); foreach ($stream as $event) { if ($event->type === 'content_block_start') { echo "\nStarting {$event->contentBlock->type} block...\n"; } elseif ($event->type === 'content_block_delta') { if ($event->delta->type === 'thinking_delta') { echo $event->delta->thinking; } elseif ($event->delta->type === 'text_delta') { echo $event->delta->text; } } } ``` ```ruby Ruby hidelines={1..2} require "anthropic" client = Anthropic::Client.new stream = client.messages.stream( model: "claude-opus-4-7", max_tokens: 16000, thinking: { type: "adaptive" }, messages: [ { role: "user", content: "What is the greatest common divisor of 1071 and 462?" } ] ) stream.each do |event| case event when Anthropic::Streaming::ThinkingEvent print event.thinking when Anthropic::Streaming::TextEvent print event.text end end ``` ## 自适应、手动和禁用思考模式的比较 | 模式 | 配置 | 可用性 | 使用场景 | |:-----|:-------|:-------------|:------------| | **自适应** | `thinking: {type: "adaptive"}` | Claude Mythos Preview（默认）、Opus 4.7（唯一模式）、Opus 4.6、Sonnet 4.6 | Claude 决定何时使用以及使用多少扩展思考。使用 `effort` 进行引导。 | | **手动** | `thinking: {type: "enabled", budget_tokens: N}` | 除 Claude Opus 4.7 外的所有模型（会被拒绝）。在 Opus 4.6 和 Sonnet 4.6 上已弃用（建议改用自适应模式）。 | 需要精确控制思考 token 消耗时使用。 | | **禁用** | 省略 `thinking` 参数或传递 `{type: "disabled"}` | 除 Claude Mythos Preview 外的所有模型 | 不需要扩展思考且需要最低延迟时使用。 | 自适应思考在 Claude Mythos Preview、Claude Opus 4.7、Opus 4.6 和 Sonnet 4.6 上可用。在 Mythos Preview 上，自适应思考是默认模式，当未设置 `thinking` 时自动应用。在 Claude Opus 4.7 上，自适应思考是唯一支持的模式，使用 `budget_tokens` 的 `type: "enabled"` 会被拒绝。较旧的模型仅支持使用 `budget_tokens` 的 `type: "enabled"`。在 Opus 4.6 和 Sonnet 4.6 上，使用 `budget_tokens` 的 `type: "enabled"` 仍然可用但已被弃用。 **各模式的交错思考可用性：** - **自适应模式：** 在 Claude Mythos Preview、Claude Opus 4.7、Opus 4.6 和 Sonnet 4.6 上自动启用交错思考。在 Mythos Preview 和 Opus 4.7 上，工具间推理始终位于思考块内。 - **Sonnet 4.6 上的手动模式：** 通过 `interleaved-thinking-2025-05-14` beta 请求头支持交错思考。 - **Opus 4.6 上的手动模式：** 不支持交错思考。如果您的代理工作流需要在 Opus 4.6 上进行工具调用间的思考，请使用自适应模式。 ## 重要注意事项 ### 验证变更使用自适应思考时，之前的助手轮次不需要以思考块开头。这比手动模式更灵活，手动模式的 API 会强制要求启用思考的轮次以思考块开始。 ### Prompt caching 使用 `adaptive` 思考的连续请求会保留 [prompt cache](/docs/en/build-with-claude/prompt-caching) 断点。但是，在 `adaptive` 和 `enabled`/`disabled` 思考模式之间切换会破坏消息的缓存断点。无论模式如何更改，系统提示和工具定义都会保持缓存。 ### 调整思考行为自适应思考的触发行为是可提示的。如果 Claude 的思考频率不符合您的期望，您可以在系统提示中添加指导： ```text Extended thinking adds latency and should only be used when it will meaningfully improve answer quality — typically for problems that require multi-step reasoning. When in doubt, respond directly. ``` 引导 Claude 减少思考频率可能会降低对推理有益的任务的质量。在将基于提示的调优部署到生产环境之前，请衡量对您特定工作负载的影响。建议先使用较低的 [effort 级别](/docs/en/build-with-claude/effort)进行测试。 ### 成本控制使用 `max_tokens` 作为总输出（思考 + 响应文本）的硬性限制。`effort` 参数为 Claude 分配的思考量提供额外的软性指导。两者结合可以让您有效控制成本。在 `high` 和 `max` effort 级别下，Claude 可能会进行更广泛的思考，并且更有可能耗尽 `max_tokens` 预算。如果您在响应中观察到 `stop_reason: "max_tokens"`，请考虑增加 `max_tokens` 以给模型更多空间，或降低 effort 级别。 ## 使用思考块以下概念适用于所有支持扩展思考的模型，无论您使用的是自适应模式还是手动模式。 ### 摘要思考启用扩展思考后，Claude 4 模型的 Messages API 会返回 Claude 完整思考过程的摘要。摘要思考提供了扩展思考的完整智能优势，同时防止滥用。这是 Claude 4 模型在思考配置的 `display` 字段未设置或设置为 `"summarized"` 时的默认行为。在 Claude Opus 4.7 和 [Claude Mythos Preview](https://anthropic.com/glasswing) 上，`display` 默认为 `"omitted"`，因此您必须明确设置 `display: "summarized"` 才能接收摘要思考。以下是摘要思考的一些重要注意事项： - 您需要为原始请求生成的完整思考 token 付费，而不是摘要 token。 - 计费的输出 token 数量将**不匹配**您在响应中看到的 token 数量。 - 在 Claude 4 模型上，思考输出的前几行更详细，提供了对提示工程特别有用的详细推理。[Claude Mythos Preview](https://anthropic.com/glasswing) 从第一个 token 开始摘要，因此其思考块不会显示这种详细的前导内容。 - 随着 Anthropic 不断改进扩展思考功能，摘要行为可能会发生变化。 - 摘要保留了 Claude 思考过程的关键思想，同时添加最少的延迟，实现了可流式传输的用户体验。 - 摘要由与您请求目标不同的模型处理。思考模型看不到摘要输出。在极少数情况下，如果您需要访问 Claude 4 模型的完整思考输出，请[联系 Anthropic 销售团队](mailto:sales@anthropic.com)。 ### 控制思考显示思考配置上的 `display` 字段控制 API 响应中思考内容的返回方式。它接受两个值： - `"summarized"`：思考块包含摘要思考文本。详情请参阅[摘要思考](#summarized-thinking)。这是 Claude Opus 4.6、Claude Sonnet 4.6 及更早 Claude 4 模型的默认值。 - `"omitted"`：思考块返回时 `thinking` 字段为空。`signature` 字段仍然携带加密的完整思考，用于多轮连续性（参见[思考加密](#thinking-encryption)）。这是 Claude Opus 4.7 和 [Claude Mythos Preview](https://anthropic.com/glasswing) 的默认值。当您的应用不向用户展示思考内容时，设置 `display: "omitted"` 非常有用。主要好处是**流式传输时更快地获得第一个文本 token：** 服务器完全跳过流式传输思考 token，仅传输签名，因此最终文本响应会更快开始流式传输。以下是省略思考的一些重要注意事项： - 您仍然需要为完整的思考 token 付费。省略可以减少延迟，但不能减少成本。 - 如果您在多轮对话中传回思考块，请原样传递。服务器解密 `signature` 以重建原始思考用于提示构造（参见[保留思考块](/docs/en/build-with-claude/extended-thinking#preserving-thinking-blocks)）。您在往返的省略块的 `thinking` 字段中放置的任何文本都会被忽略。 - `display` 与 `thinking.type: "disabled"` 不兼容（没有可显示的内容）。 - 使用 `thinking.type: "adaptive"` 且模型跳过简单请求的思考时，无论 `display` 如何设置，都不会产生思考块。无论 `display` 是 `"summarized"` 还是 `"omitted"`，`signature` 字段都是相同的。支持在对话的轮次之间切换 `display` 值。在 Claude Opus 4.7 上，`thinking.display` 默认为 `"omitted"`。思考块仍然会出现在响应流中，但除非您明确选择加入，否则其 `thinking` 字段为空。这与 Claude Opus 4.6 的默认值 `"summarized"` 相比是一个静默变更。要在 Claude Opus 4.7 上恢复摘要思考文本，请明确将 `thinking.display` 设置为 `"summarized"`： ```python thinking = { "type": "adaptive", "display": "summarized", } ``` 有关使用 `display: "omitted"` 的代码示例和流式传输行为，请参阅扩展思考页面上的[控制思考显示](/docs/en/build-with-claude/extended-thinking#controlling-thinking-display)。那里的示例使用 `type: "enabled"`；使用自适应思考时，请使用： ```python thinking = {"type": "adaptive", "display": "omitted"} ``` ### 思考加密完整的思考内容会被加密并返回在 `signature` 字段中。此字段用于在将思考块传回 API 时验证它们是由 Claude 生成的。只有在使用[带有扩展思考的工具](/docs/en/build-with-claude/extended-thinking#extended-thinking-with-tool-use)时才需要传回思考块。否则，您可以省略之前轮次的思考块。如果您传回它们，API 是否保留或剥离它们取决于模型：Opus 4.5+ 和 Sonnet 4.6+ 默认将它们保留在上下文中；较早的 Opus/Sonnet 模型和所有 Haiku 模型会剥离它们。请参阅[上下文编辑](/docs/en/build-with-claude/context-editing)进行配置。如果传回思考块，请将所有内容原样传回，以确保一致性并避免潜在问题。以下是关于思考加密的一些重要注意事项： - 在[流式传输响应](/docs/en/build-with-claude/extended-thinking#streaming-thinking)时，签名通过 `content_block_delta` 事件中的 `signature_delta` 在 `content_block_stop` 事件之前添加。 - Claude 4 模型中的 `signature` 值比之前的模型长得多。 - `signature` 字段是一个不透明字段，不应被解释或解析。 - `signature` 值在各平台之间兼容（Claude API、[Amazon Bedrock](/docs/en/build-with-claude/claude-in-amazon-bedrock) 和 [Vertex AI](/docs/en/build-with-claude/claude-on-vertex-ai)）。在一个平台上生成的值将与另一个平台兼容。 ### 定价有关包括基础费率、缓存写入、缓存命中和输出 token 在内的完整定价信息，请参阅[定价页面](/docs/en/about-claude/pricing)。思考过程会产生以下费用： - 思考期间使用的 token（输出 token） - 保留在上下文中的之前助手轮次的思考块：在较早的 Opus/Sonnet 模型和所有 Haiku 模型上仅保留最后一轮；在 Opus 4.5+ 和 Sonnet 4.6+ 上默认保留所有轮次（输入 token） - 标准文本输出 token 启用扩展思考时，会自动包含一个专门的系统提示来支持此功能。使用摘要思考时： - **输入 token：** 原始请求中的 token（不包括之前轮次的思考 token） - **输出 token（计费）：** Claude 内部生成的原始思考 token - **输出 token（可见）：** 您在响应中看到的摘要思考 token - **不收费：** 用于生成摘要的 token 使用 `display: "omitted"` 时： - **输入 token：** 原始请求中的 token（与摘要相同） - **输出 token（计费）：** Claude 内部生成的原始思考 token（与摘要相同） - **输出 token（可见）：** 零思考 token（`thinking` 字段为空）计费的输出 token 数量将**不匹配**响应中可见的 token 数量。您需要为完整的思考过程付费，而不是响应中可见的思考内容。 ### 其他主题扩展思考页面更详细地介绍了几个主题，并提供了特定模式的代码示例： - **[带思考的工具使用](/docs/en/build-with-claude/extended-thinking#extended-thinking-with-tool-use)**：相同的规则适用于自适应思考：在工具调用之间保留思考块，并注意思考激活时 `tool_choice` 的限制。 - **[Prompt caching](/docs/en/build-with-claude/extended-thinking#extended-thinking-with-prompt-caching)**：使用自适应思考时，使用相同思考模式的连续请求会保留缓存断点。在 `adaptive` 和 `enabled`/`disabled` 模式之间切换会破坏消息的缓存断点（系统提示和工具定义保持缓存）。 - **[上下文窗口](/docs/en/build-with-claude/extended-thinking#max-tokens-and-context-window-size-with-extended-thinking)**：思考 token 如何与 `max_tokens` 和上下文窗口限制交互。 ## 后续步骤了解有关扩展思考的更多信息，包括手动模式、工具使用和 prompt caching。使用 effort 参数控制 Claude 响应的详细程度。