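Advisor tool

Pair a faster executor model with a higher-intelligence advisor model that provides strategic guidance mid-generation.

The advisor tool lets a faster, lower-cost executor model consult a higher-intelligence advisor model mid-generation for strategic guidance. The advisor reads the full conversation, produces a plan or a course correction, and the executor then continues with the task.

This pattern fits long-horizon agentic workloads (coding agents, computer use, multi-step research pipelines) where most turns are mechanical but an excellent plan is crucial. You get close to advisor-only quality while most token generation happens at executor-model rates.

The advisor tool is currently in beta. Include the beta header advisor-tool-2026-03-01 in your requests.

This feature is eligible for Zero Data Retention (ZDR). When your organization has a ZDR arrangement, data sent through this feature is not stored after the API response has been returned.

When to use it

The advisor fits the following setups:

You currently use Sonnet on complex tasks: Add Opus as the advisor to raise quality at similar or lower total cost.
You currently use Haiku and want more intelligence: Add Opus as the advisor. Expect cost to be higher than Haiku alone but lower than switching the executor to a larger model.

Results are task-dependent. Evaluate on your own workloads.

The advisor is a weaker fit for single-turn Q&A (there is nothing to plan), for pass-through model pickers where users already choose their own cost-quality tradeoff, and for workloads where every turn genuinely needs the advisor model's full capability.

Model compatibility

The executor model (the top-level model field) and the advisor model (the model field inside the tool definition) must form a valid pair. The advisor must be Claude Sonnet 4.6 or a more capable model, and it must be at least as capable as the executor. Models of comparable capability (for example, Claude Opus 4.7 and Claude Opus 4.8) can advise each other.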

Executor model	Advisor model
Claude Haiku 4.5 (claude-haiku-4-5-20251001)	Claude Fable 5 (claude-fable-5) Claude Mythos 5 (claude-mythos-5) Claude Opus 4.8 (claude-opus-4-8) Claude Opus 4.7 (claude-opus-4-7) Claude Opus 4.6 (claude-opus-4-6) Claude Sonnet 4.6 (claude-sonnet-4-6)
Claude Sonnet 4.6 (claude-sonnet-4-6)	Claude Fable 5 (claude-fable-5) Claude Mythos 5 (claude-mythos-5) Claude Opus 4.8 (claude-opus-4-8) Claude Opus 4.7 (claude-opus-4-7) Claude Opus 4.6 (claude-opus-4-6) Claude Sonnet 4.6 (claude-sonnet-4-6)
Claude Sonnet 5 (claude-sonnet-5)	Claude Fable 5 (claude-fable-5) Claude Mythos 5 (claude-mythos-5) Claude Opus 4.8 (claude-opus-4-8) Claude Opus 4.7 (claude-opus-4-7)
Claude Opus 4.6 (claude-opus-4-6)	Claude Fable 5 (claude-fable-5) Claude Mythos 5 (claude-mythos-5) Claude Opus 4.8 (claude-opus-4-8) Claude Opus 4.7 (claude-opus-4-7) Claude Opus 4.6 (claude-opus-4-6)
Claude Opus 4.7 (claude-opus-4-7)	Claude Fable 5 (claude-fable-5) Claude Mythos 5 (claude-mythos-5) Claude Opus 4.8 (claude-opus-4-8) Claude Opus 4.7 (claude-opus-4-7)
Claude Opus 4.8 (claude-opus-4-8)	Claude Fable 5 (claude-fable-5) Claude Mythos 5 (claude-mythos-5) Claude Opus 4.8 (claude-opus-4-8) Claude Opus 4.7 (claude-opus-4-7)
Claude Fable 5 (claude-fable-5)	Claude Fable 5 (claude-fable-5)
Claude Mythos 5 (claude-mythos-5)	Claude Mythos 5 (claude-mythos-5)

If you request an invalid pair, the API returns a 400 invalid_request_error that names the unsupported combination.
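A client-side pre-check can catch an invalid pair before the request is sent. The following is a minimal sketch, assuming you are happy to hard-code the compatibility table; `VALID_ADVISORS` and `is_valid_pair` are hypothetical names, not part of any SDK, and the mapping only knows the model IDs exactly as the table writes them (aliases would need to be added by hand).

```python
# Advisor model IDs accepted for each executor, transcribed from the compatibility table.
_LATEST = {"claude-fable-5", "claude-mythos-5", "claude-opus-4-8", "claude-opus-4-7"}

VALID_ADVISORS = {
    "claude-haiku-4-5-20251001": _LATEST | {"claude-opus-4-6", "claude-sonnet-4-6"},
    "claude-sonnet-4-6": _LATEST | {"claude-opus-4-6", "claude-sonnet-4-6"},
    "claude-sonnet-5": _LATEST,
    "claude-opus-4-6": _LATEST | {"claude-opus-4-6"},
    "claude-opus-4-7": _LATEST,
    "claude-opus-4-8": _LATEST,
    "claude-fable-5": {"claude-fable-5"},
    "claude-mythos-5": {"claude-mythos-5"},
}


def is_valid_pair(executor: str, advisor: str) -> bool:
    """Return True if the table lists `advisor` for `executor`."""
    return advisor in VALID_ADVISORS.get(executor, set())
```

The API remains the source of truth; a check like this only saves a round trip for pairs that the table already rules out.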

Platform availability

The advisor tool is available in beta on the Claude API and on Claude Platform on AWS. It is not currently available on Amazon Bedrock, Google Cloud, or Microsoft Foundry.

Quickstart

import anthropic

client = anthropic.Anthropic()

response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    betas=["advisor-tool-2026-03-01"],
    tools=[
        {
            "type": "advisor_20260301",
            "name": "advisor",
            "model": "claude-opus-4-8",
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Build a concurrent worker pool in Go with graceful shutdown.",
        }
    ],
)

print(response)

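How it works

When you add the advisor tool to the tools array, the executor model decides when to call it, just as with any other tool. When the executor calls the advisor:

The executor emits a server_tool_use block with name: "advisor" and an empty input. The executor decides the timing; the server supplies the context.
Anthropic runs a separate inference pass on the advisor model, server-side. The advisor runs under its own Anthropic-provided system prompt and receives the executor's full transcript as quoted context in its input. That transcript contains your system prompt, the tool definitions, prior turns and tool results, and the text the executor has produced so far in the current turn.
The advisor's response is returned to the executor as an advisor_tool_result block.
The executor continues generating, informed by the advice.

All of this happens within a single /v1/messages request, with no extra round trips on your side. The exception is a turn that pauses mid-call, which you resume with a follow-up request (see Resuming a paused turn).

The advisor itself runs without tools and without context management. Its thinking blocks are discarded before the result is returned. Only the advice text reaches the executor.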

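Tool parameters

Parameter	Type	Default	Description
`type`	string	required	Must be `"advisor_20260301"`.
`name`	string	required	Must be `"advisor"`.
`model`	string	required	The advisor model ID, for example claude-opus-4-8. The sub-inference is billed at this model's rates.
`max_uses`	integer	unlimited	Maximum number of advisor calls allowed in a single request. Once the executor reaches this cap, further advisor calls return an `advisor_tool_result_error` with `error_code: "max_uses_exceeded"`, and the executor continues without further advice. This is a per-request cap, not a per-conversation cap. For conversation-level limits, see Cost control.
`max_tokens`	integer	advisor model's output limit	Caps the advisor's total output per call (thinking plus text). The minimum is 1024. See Capping advisor output.
`caching`	object | null	`null` (off)	Enables prompt caching for the advisor's own transcript across calls within the same conversation. See Advisor prompt caching.

The caching object has the form {"type": "ephemeral", "ttl": "5m" | "1h"}. Unlike cache_control on content blocks, this is not a breakpoint marker but an on/off switch. The server decides where the cache boundaries go.

The advisor tool also accepts the common properties available on any tool definition: cache_control, allowed_callers, defer_loading, and strict (described in Structured outputs). For their semantics, see the Tool reference.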
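The parameters above can be assembled with a small helper that enforces the documented constraints locally. This is a sketch only: `build_advisor_tool` and its `cache_ttl` argument are hypothetical names introduced for illustration, and the checks cover just the two rules stated in the table and the caching paragraph (minimum `max_tokens` of 1024, `ttl` of "5m" or "1h").

```python
def build_advisor_tool(model, max_uses=None, max_tokens=None, cache_ttl=None):
    """Build an advisor tool definition dict; optional fields are omitted when None."""
    tool = {"type": "advisor_20260301", "name": "advisor", "model": model}
    if max_uses is not None:
        tool["max_uses"] = max_uses  # per-request cap, not per-conversation
    if max_tokens is not None:
        if max_tokens < 1024:
            raise ValueError("max_tokens must be at least 1024")
        tool["max_tokens"] = max_tokens
    if cache_ttl is not None:
        if cache_ttl not in ("5m", "1h"):
            raise ValueError('cache_ttl must be "5m" or "1h"')
        # caching is an on/off switch; the server picks the cache boundaries
        tool["caching"] = {"type": "ephemeral", "ttl": cache_ttl}
    return tool
```

For example, `build_advisor_tool("claude-opus-4-8", max_tokens=2048, cache_ttl="5m")` yields a definition that can be placed directly in the tools array.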

Response structure

Successful advisor call

When the advisor is called, the assistant content contains a server_tool_use block followed by an advisor_tool_result block:

{
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Let me consult the advisor on this."
    },
    {
      "type": "server_tool_use",
      "id": "srvtoolu_abc123",
      "name": "advisor",
      "input": {}
    },
    {
      "type": "advisor_tool_result",
      "tool_use_id": "srvtoolu_abc123",
      "content": {
        "type": "advisor_result",
        "text": "Use a channel-based coordination pattern. The tricky part is draining in-flight work during shutdown: close the input channel first, then wait on a WaitGroup..."
      }
    },
    {
      "type": "text",
      "text": "Here's the implementation. I'm using a channel-based coordination pattern to avoid writer starvation..."
    }
  ]
}

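server_tool_use.input is always empty. The server constructs the advisor's view automatically from the full transcript. Nothing the executor puts in input reaches the advisor.

Result variants

The advisor_tool_result.content field is a discriminated union. For successful calls, the variant depends on the advisor model:

Variant	Fields	Returned when
`advisor_result`	`text`, `stop_reason`	The advisor model returns plaintext (for example, Claude Opus 4.8).
`advisor_redacted_result`	`encrypted_content`, `stop_reason`	The advisor model returns encrypted output.

Claude Fable 5 and Claude Mythos 5 advisors return advisor_redacted_result. The other advisor models in the compatibility table return advisor_result.

Both result variants carry a stop_reason field when you set max_tokens on the tool definition; it is omitted otherwise. It holds the stop reason of the advisor sub-call, usually "end_turn", or "max_tokens" when the cap was hit. The values match the top-level Messages API stop_reason.

With advisor_result, the text field contains human-readable advice. With advisor_redacted_result, the encrypted_content field contains opaque data that you cannot read. On the next turn, the server decrypts it and renders the plaintext into the executor's prompt.

In both cases, pass the content back unmodified on subsequent turns. If you switch advisor models mid-conversation, branch on content.type to handle both forms.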
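Branching on content.type might look like the sketch below. It assumes the result block has already been parsed into a plain dict (for example from the raw JSON body); `describe_advisor_result` is a hypothetical helper for logging or display, and the block itself should still be sent back unmodified.

```python
def describe_advisor_result(block: dict) -> str:
    """Return a printable summary of an advisor_tool_result block."""
    content = block["content"]
    kind = content["type"]
    if kind == "advisor_result":
        return content["text"]  # human-readable advice
    if kind == "advisor_redacted_result":
        return "[encrypted advice, not readable client-side]"
    if kind == "advisor_tool_result_error":
        return f"[advisor error: {content['error_code']}]"
    return f"[unrecognized advisor content type: {kind}]"
```

The final fallback is there so that a variant this sketch does not know about is surfaced rather than silently dropped.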

Error results

If an advisor call fails, the result carries an error:

{
  "type": "advisor_tool_result",
  "tool_use_id": "srvtoolu_abc123",
  "content": {
    "type": "advisor_tool_result_error",
    "error_code": "overloaded"
  }
}

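The executor sees the error and continues without further advice. The request itself does not fail.

`error_code`	Meaning
`max_uses_exceeded`	The request reached the `max_uses` cap set on the tool definition. Further advisor calls in the same request return this error.
`too_many_requests`	The advisor sub-inference was rate-limited.
`overloaded`	The advisor sub-inference hit capacity limits.
`prompt_too_long`	The transcript exceeded the advisor model's context window.
`execution_time_exceeded`	The advisor sub-inference timed out.
`unavailable`	Any other advisor failure.

Advisor rate limits draw from the same per-model quota as direct calls to the advisor model. An advisor rate limit surfaces as too_many_requests inside the tool result. An executor rate limit fails the whole request with HTTP 429.

Multi-turn conversations

On subsequent turns, pass the full assistant content, including advisor_tool_result blocks, back to the API: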

import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "type": "advisor_20260301",
        "name": "advisor",
        "model": "claude-opus-4-8",
    }
]

messages = [
    {
        "role": "user",
        "content": "Build a concurrent worker pool in Go with graceful shutdown.",
    }
]

response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    betas=["advisor-tool-2026-03-01"],
    tools=tools,
    messages=messages,
)

# Append the full response content, including any advisor_tool_result blocks
messages.append({"role": "assistant", "content": response.content})

# Continue the conversation
messages.append({"role": "user", "content": "Now add a max-in-flight limit of 10."})

response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    betas=["advisor-tool-2026-03-01"],
    tools=tools,
    messages=messages,
)

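If you omit the advisor tool from tools on a later turn while the message history still contains advisor_tool_result blocks, the API returns a 400 invalid_request_error.

The advisor tool has no built-in conversation-level cap. To limit advisor calls across a whole conversation, count them client-side. When you reach your cap, remove the advisor tool from the tools array and also strip all advisor_tool_result blocks from the message history to avoid a 400 invalid_request_error.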
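Client-side counting and stripping could be sketched as follows, assuming the history is stored as plain dicts. `count_advisor_calls` and `strip_advisor` are hypothetical helpers. Two behaviors here are assumptions that go beyond what the text states: the sketch also removes the paired advisor server_tool_use blocks, and it drops any message left with no blocks. Verify both against your own requests.

```python
ADVISOR_TOOL_TYPE = "advisor_20260301"


def count_advisor_calls(messages: list) -> int:
    """Count advisor_tool_result blocks (including error results) in the history."""
    return sum(
        1
        for msg in messages
        if isinstance(msg["content"], list)
        for block in msg["content"]
        if block.get("type") == "advisor_tool_result"
    )


def _is_advisor_block(block: dict) -> bool:
    if block.get("type") == "advisor_tool_result":
        return True
    # Assumption (not stated in the text): also drop the paired advisor call.
    return block.get("type") == "server_tool_use" and block.get("name") == "advisor"


def strip_advisor(tools: list, messages: list) -> tuple:
    """Return (tools, messages) with the advisor tool and its blocks removed."""
    kept_tools = [t for t in tools if t.get("type") != ADVISOR_TOOL_TYPE]
    cleaned = []
    for msg in messages:
        content = msg["content"]
        if isinstance(content, str):
            cleaned.append(msg)
            continue
        blocks = [b for b in content if not _is_advisor_block(b)]
        if blocks:  # Assumption: a message left with no blocks is dropped.
            cleaned.append({**msg, "content": blocks})
    return kept_tools, cleaned
```

A loop would call `count_advisor_calls(messages)` before each request and switch to the stripped tools and history once the count reaches its own budget.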

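Resuming a paused turn

A response can end with stop_reason: "pause_turn" while an advisor call is still pending. When that happens, the response contains the advisor's server_tool_use block but no matching advisor_tool_result. To resume, append that assistant message to messages with its content unchanged, keeping the server_tool_use block, and send the request again with the same advisor tool and beta header. You do not need to add a user message or a tool_result block. The API runs the pending advisor call and continues the executor's turn in the new response.

A resumed turn can pause again. If it does, repeat the same steps. Omitting the advisor tool from the resume request returns a 400 invalid_request_error.

If the executor also called one of your tools in the same turn, the response ends with stop_reason: "tool_use" while the advisor call is still pending. Send the tool_result blocks as usual; the pending advisor call runs at the start of the next request. See Mixing server tools and client tools in the same turn.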
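The resume procedure can be wrapped in a small loop. In this sketch, `send` is a hypothetical callable that you supply: it takes the messages list and returns a response object with stop_reason and content attributes, for instance by calling client.beta.messages.create with the same model, tools, and beta header every time. `complete_turn` and `max_resumes` are likewise illustrative names.

```python
def complete_turn(send, messages: list, max_resumes: int = 5):
    """Send a request and keep resuming while the turn is paused.

    Mutates `messages` by appending each paused assistant message unchanged.
    Returns the first response that is not paused (or the last one received
    if max_resumes is exhausted).
    """
    response = send(messages)
    resumes = 0
    while response.stop_reason == "pause_turn" and resumes < max_resumes:
        # Keep the server_tool_use block; add no user message or tool_result.
        messages.append({"role": "assistant", "content": response.content})
        response = send(messages)
        resumes += 1
    return response
```

The resume limit is a safety net against an endless loop, not something the API requires; the caller still has to append the final response to the history.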

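Mid-conversation reminders for under-calling executors

If a Haiku executor has not called the advisor in its first assistant turn, append a short reminder as an extra user message before the second assistant turn. In Anthropic's internal behavioral evaluations, this raised task pass rates for Haiku executors by about 7 percentage points. On Sonnet executors, a text-only reminder had no measurable effect in Anthropic's testing. The call-timing considerations that follow matter most for Sonnet. Do not apply this reminder to Opus executors: on Opus it slightly lowers pass rates.

With the default NUDGE_TURN value of 2, the reminder usually arrives after the model has oriented itself on the task but before it has committed to an approach.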

import anthropic

client = anthropic.Anthropic()

NUDGE_TURN = 2  # inject before this assistant turn if no advisor call yet
NUDGE_TEXT = (
    "You have not consulted the advisor yet. If the task has a non-obvious "
    "design decision or a failure mode you haven't ruled out, call advisor "
    "now before committing to an approach."
)
MAX_TURNS = 10  # agent loop cap


def run_your_tools(content):
    # Replace with your tool dispatch logic. Returns one tool_result block per tool_use block.
    return [
        {
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": "Replace with your tool output.",
        }
        for block in content
        if block.type == "tool_use"
    ]


tools = [
    {"type": "advisor_20260301", "name": "advisor", "model": "claude-opus-4-8"},
    # ... your other tools
]
task = "Build a concurrent worker pool in Go with graceful shutdown."
messages = [{"role": "user", "content": task}]
advisor_called = False

for turn in range(1, MAX_TURNS + 1):
    response = client.beta.messages.create(
        model="claude-haiku-4-5",
        max_tokens=4096,
        betas=["advisor-tool-2026-03-01"],
        tools=tools,
        messages=messages,
    )
    messages.append({"role": "assistant", "content": response.content})
    advisor_called = advisor_called or any(
        b.type == "server_tool_use" and b.name == "advisor" for b in response.content
    )
    if response.stop_reason == "end_turn":
        break
    if response.stop_reason == "pause_turn":
        continue  # server tool pending; re-send to let the API complete it

    results = run_your_tools(response.content)  # list of tool_result blocks
    if results:
        messages.append({"role": "user", "content": results})
    # Skip this step if your system prompt already tells the model to call sparingly.
    if turn == NUDGE_TURN - 1 and not advisor_called:
        messages.append({"role": "user", "content": NUDGE_TEXT})

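Append the reminder as a separate user message after the tool results rather than as a sibling block in the same message. Consecutive user messages are valid. In Anthropic's testing on Haiku and Sonnet executors, they behaved identically to sibling blocks. The separate-message form also keeps the reminder clearly distinct from tool output.

Tradeoff: The reminder raises call rates, which can cause unnecessary consultations on trivially simple tasks. If your workload mixes simple and complex tasks, consider raising NUDGE_TURN to 3 so that two-turn tasks finish before the reminder fires, or gate the reminder on a task-complexity signal you already compute. If your system prompt already contains restraining language ("reserve the advisor for genuinely uncertain situations"), skip the reminder entirely, because the two instructions conflict.

Text-only reminders are highly salient to Haiku and Sonnet executors: in Anthropic's testing, 74% (Sonnet) to 98% (Haiku) of reminded attempts called the advisor immediately on turn 2. If that happens before your executor has read the problem or gathered context, the resulting advisor call lacks context and may displace a better-timed later call. Before adding a reminder, measure your executor's baseline first-call turn. If the executor already calls the advisor reliably and its first call typically lands on turn N, set NUDGE_TURN greater than N. In Anthropic's testing, on workloads where the baseline first call was turn 7 or later, a turn-2 reminder was associated with a 3 to 4 percentage point drop in task performance. On a browsing workload with an 86% baseline call rate, the same reminder raised engagement with no task-performance cost.

To force a consultation on a specific request rather than reminding, set tool_choice to {"type": "tool", "name": "advisor"}, subject to the limitations described in Forcing tool use. Forced tool use cannot be combined with extended thinking: if you enable both, the API returns a 400 invalid_request_error.

Streaming

The advisor sub-inference does not stream. While the advisor runs, the executor's stream pauses, and the complete result then arrives in a single event.

A server_tool_use block with name: "advisor" signals that an advisor call is starting. The pause begins when that block closes (content_block_stop). During the pause the stream is silent except for the standard SSE ping keepalives, emitted roughly every 30 seconds. Short advisor calls may show no pings at all.

When the advisor finishes, the advisor_tool_result arrives whole in a single content_block_start event (no deltas). The executor's output then resumes streaming.

A message_delta event follows, with an updated usage.iterations array that reflects the advisor's token counts.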
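A client that wants to show an "advisor is thinking" indicator could watch for the event sequence described above. The sketch below assumes the SSE payloads have already been decoded into plain dicts with the usual type, index, and content_block keys; `advisor_stream_markers` is a hypothetical generator, not an SDK function.

```python
def advisor_stream_markers(events):
    """Yield "advisor_waiting" when an advisor call block closes and
    "advisor_done" when its result block arrives."""
    pending = set()  # indexes of advisor server_tool_use blocks still open
    for event in events:
        kind = event.get("type")
        if kind == "content_block_start":
            block = event["content_block"]
            if block.get("type") == "server_tool_use" and block.get("name") == "advisor":
                pending.add(event["index"])
            elif block.get("type") == "advisor_tool_result":
                yield "advisor_done"  # the result arrives whole, with no deltas
        elif kind == "content_block_stop" and event.get("index") in pending:
            pending.discard(event["index"])
            yield "advisor_waiting"  # the pause starts here
```

Ping events fall through both branches, so they neither start nor end the waiting state.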

Usage and billing

Advisor calls run as separate sub-inferences billed at the advisor model's rates. Usage is reported in the usage.iterations[] array:

{
  "usage": {
    "input_tokens": 412,
    "cache_read_input_tokens": 0,
    "cache_creation_input_tokens": 0,
    "output_tokens": 531,
    "iterations": [
      {
        "type": "message",
        "input_tokens": 412,
        "cache_read_input_tokens": 0,
        "cache_creation_input_tokens": 0,
        "output_tokens": 89
      },
      {
        "type": "advisor_message",
        "model": "claude-opus-4-8",
        "input_tokens": 823,
        "cache_read_input_tokens": 0,
        "cache_creation_input_tokens": 0,
        "output_tokens": 1612
      },
      {
        "type": "message",
        "input_tokens": 1348,
        "cache_read_input_tokens": 412,
        "cache_creation_input_tokens": 0,
        "output_tokens": 442
      }
    ]
  }
}

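The top-level usage fields reflect executor tokens only. Advisor tokens are not rolled into the top-level totals because they are billed at a different rate. Iterations with type: "advisor_message" are billed at the advisor model's rates, and iterations with type: "message" are billed at the executor model's rates.

Aggregation rules differ by field. Top-level output_tokens is the sum over all executor iterations (in the example above, 89 + 442 = 531). Top-level input_tokens and cache_read_input_tokens reflect the first executor iteration only. Inputs of later executor iterations are not re-summed, because they include earlier output tokens. When building cost-tracking logic, use usage.iterations for the full per-iteration breakdown.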
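For cost tracking, the per-iteration breakdown can be split into executor and advisor totals. This is a minimal sketch assuming the usage object is available as a plain dict shaped like the example above; `split_usage` is a hypothetical helper, and it only adds up token counts. Pricing is deliberately left out, since no rates are given here.

```python
FIELDS = (
    "input_tokens",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
    "output_tokens",
)


def split_usage(usage: dict) -> dict:
    """Sum each token field per side ("executor" or "advisor") over usage.iterations."""
    totals = {side: dict.fromkeys(FIELDS, 0) for side in ("executor", "advisor")}
    for iteration in usage.get("iterations", []):
        side = "advisor" if iteration["type"] == "advisor_message" else "executor"
        for field in FIELDS:
            totals[side][field] += iteration.get(field, 0)
    return totals
```

On the example above, the executor's summed output_tokens equals the top-level value (531), while the advisor's 1,612 output tokens appear only in the advisor totals.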

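Advisor output is typically 400 to 700 text tokens, or 1,400 to 1,800 tokens in total including thinking. The cost savings come from the advisor not generating your full final output. The executor does that work at its lower rates.

The top-level max_tokens applies to executor output only. It does not limit the advisor sub-inference's tokens. To cap advisor output directly, set max_tokens on the tool definition. Advisor tokens are also not deducted from any task budget applied to the executor.

Priority Tier applies to each model independently. A Priority Tier commitment on the executor model does not extend to the advisor. Advisor calls run at Priority Tier only if your organization also holds a commitment for the advisor model.

Advisor prompt caching

There are two independent caching layers.

Executor-side caching

advisor_tool_result blocks are cacheable like any other content block. A cache_control breakpoint placed after one on a later turn will hit. Whether your client received text or encrypted_content, the executor's prompt always contains the plaintext advice, so caching behaves the same for both result variants.

Advisor-side caching

Set caching on the tool definition to enable prompt caching for the advisor's own transcript across calls within the same conversation: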

tools = [
    {
        "type": "advisor_20260301",
        "name": "advisor",
        "model": "claude-opus-4-8",
        "caching": {"type": "ephemeral", "ttl": "5m"},
    }
]

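The advisor's prompt on call N is its prompt on call N-1 plus one appended segment, so the prefix is stable across calls. With caching enabled, each advisor call writes a cache entry; the next call reads up to that point and pays only for the delta. You will see cache_read_input_tokens become non-zero on the second and later advisor_message iterations.

When to enable it: When the advisor is called two or fewer times per conversation, the cost of the cache writes exceeds the savings from reads. Caching breaks even at roughly three advisor calls and keeps improving from there. Enable it for long agent loops and leave it off for short tasks.

Keep it consistent: Set caching once and leave it unchanged for the whole conversation. Toggling it mid-conversation causes cache misses.

A clear_thinking keep value other than "all" shifts the advisor's quoted transcript on every turn and causes advisor-side cache misses. This is only a cost degradation; advice quality is unaffected. When extended thinking is enabled without an explicit clear_thinking configuration, the API defaults to keep: {type: "thinking_turns", value: 1}, which triggers this behavior (that is the default on earlier Opus/Sonnet models and on all Haiku models, whereas on Opus 4.5+ and Sonnet 4.6+ the default is to keep all turns). Set keep: "all" to keep the advisor cache stable.

Combining with other tools

The advisor tool composes with other server-side and client-side tools. Add them all to the same tools array: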

tools = [
    {
        "type": "web_search_20250305",
        "name": "web_search",
        "max_uses": 5,
    },
    {
        "type": "advisor_20260301",
        "name": "advisor",
        "model": "claude-opus-4-8",
    },
    {
        "name": "run_bash",
        "description": "Run a bash command",
        "input_schema": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
        },
    },
]

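The executor can search the web, call the advisor, and use your custom tools in the same turn. The advisor's plan can guide which tools the executor uses next.

Feature	Interaction
Batch processing	Supported. `usage.iterations` is reported per item.
Token counting	Returns only the executor's first-iteration input tokens. For a rough estimate of advisor usage, call `count_tokens` with `model` set to the advisor model and the same messages.
Context editing	`clear_tool_uses` is not fully compatible with advisor tool blocks. For `clear_thinking`, see the caching warning earlier.
`pause_turn`	When no client-side `tool_use` block in the same turn is waiting for your result, a dangling advisor call ends the response with `stop_reason: "pause_turn"` and a `server_tool_use` block with no result. The advisor runs on resume. If the executor also called one of your tools in that turn, the response ends with `stop_reason: "tool_use"` instead, and the pending advisor call runs at the start of the next request, after you send the `tool_result` blocks. See Resuming a paused turn, Mixing server tools and client tools in the same turn, and Server tools.

Best practices

Prompting for coding and agentic tasks

The advisor tool's built-in description steers the executor to call it at the start of complex tasks and when it gets stuck. For research tasks, extra prompting is usually unnecessary.

On coding and agentic tasks, the advisor yields higher intelligence at similar cost when it reduces total tool calls and conversation length. Two call timings drive that improvement:

An early first advisor call, after a few exploratory reads are in the transcript.
For hard tasks, a final advisor call after file writes and test output are in the transcript.

If your agent exposes other planner-like tools (for example, a todo-list tool), prompt the model to call the advisor before those tools so that the advisor's plan can feed into them. The recommended system prompt reinforces the early-call pattern. Add your own hand-off sentence pointing at whatever planner tools your agent exposes.

Recommended system prompt for coding tasks

Without system-prompt guidance, executors tend to under-call the advisor in some domains, coding tasks in particular. For coding tasks where you want consistent advisor timing and roughly two to three calls per task, place the following blocks in your executor's system prompt ahead of any other sentence that mentions the advisor.

Timing guidance: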

You have access to an `advisor` tool backed by a stronger reviewer model. It takes NO parameters — when you call advisor(), your entire conversation history is automatically forwarded. They see the task, every tool call you've made, every result you've seen.

Call advisor BEFORE substantive work — before writing, before committing to an interpretation, before building on an assumption. If the task requires orientation first (finding files, fetching a source, seeing what's there), do that, then call advisor. Orientation is not substantive work. Writing, editing, and declaring an answer are.

Also call advisor:
- When you believe the task is complete. BEFORE this call, make your deliverable durable: write the file, save the result, commit the change. The advisor call takes time; if the session ends during it, a durable result persists and an unwritten one doesn't.
- When stuck — errors recurring, approach not converging, results that don't fit.
- When considering a change of approach.

On tasks longer than a few steps, call advisor at least once before committing to an approach and once before declaring done. On short reactive tasks where the next action is dictated by tool output you just read, you don't need to keep calling — the advisor adds most of its value on the first call, before the approach crystallizes.

How the executor should treat the advice (place directly after the timing block):

Give the advice serious weight. If you follow a step and it fails empirically, or you have primary-source evidence that contradicts a specific claim (the file says X, the paper states Y), adapt. A passing self-test is not evidence the advice is wrong — it's evidence your test doesn't check what the advice is checking.

If you've already retrieved data pointing one way and the advisor points another: don't silently switch. Surface the conflict in one more advisor call — "I found X, you suggest Y, which constraint breaks the tie?" The advisor saw your evidence but may have underweighted it; a reconcile call is cheaper than committing to the wrong branch.

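Alternative system prompt for Haiku on coding workloads

Claude Haiku 4.5 applies the default advisor guidance conservatively. That keeps its call rate appropriately low on research and lookup workloads but sacrifices quality on coding workloads, where consulting the advisor early reliably pays off. On an internal coding benchmark, a close variant of the following block (the read-only exception in the Hard rule was added after measurement) raised Haiku's pass rate by about 7.5 percentage points over the built-in default.

When your Haiku executor mainly runs coding or write-task workloads, use this block in place of the timing and advice blocks above: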

Consult a stronger reviewer who sees your full conversation transcript.

No parameters. When you call advisor(), your entire history -- task, every tool call and result, your reasoning -- is automatically forwarded. The advisor sees exactly what you've done.

Call advisor BEFORE substantive work -- before writing, before committing to an interpretation, before building on an assumption. If the task requires orientation first (finding files, fetching a source, seeing what's there), do that, then call advisor. Orientation is not substantive work. Writing, editing, and declaring an answer are.

Also call advisor:
- When you believe the task is complete. BEFORE this call, make your deliverable durable: write the file, save the result, commit the change. The advisor call takes time; if the session ends during it, a durable result persists and an unwritten one doesn't.
- When stuck -- errors recurring, approach not converging, results that don't fit.
- When considering a change of approach.

On tasks longer than a few steps, call advisor at least once before committing to an approach and once before declaring done. On short reactive tasks where the next action is dictated by tool output you just read, you don't need to keep calling -- the advisor adds most of its value on the first call, before the approach crystallizes.

Give the advice serious weight. If you follow a step and it fails empirically, or you have primary-source evidence that contradicts a specific claim (the file says X, the paper states Y), adapt. A passing self-test is not evidence the advice is wrong -- it's evidence your test doesn't check what the advice is checking.

If you've already retrieved data pointing one way and the advisor points another: don't silently switch. Surface the conflict in one more advisor call -- "I found X, you suggest Y, which constraint breaks the tie?" The advisor saw your evidence but may have underweighted it; a reconcile call is cheaper than committing to the wrong branch.

Call advisor for design, architecture, and risk questions where you won't touch a file. If your response would be analysis or a recommendation with no other tool calls, call advisor first -- that judgment call is exactly where a second opinion is highest-value.

Hard rule: your first write_file, edit_file, or state-changing bash call on a task must be preceded by an advisor call in the same or an earlier turn. Read-only orientation commands (ls, cat, grep, find) are not state-changing. This is a checkpoint, not a difficulty judgment. It applies to one-line edits too.

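Caveat: On an internal browsing-comprehension benchmark (n = 1,266), a close variant of this block lost about 4 percentage points of accuracy relative to the built-in default. If your workload mixes coding with substantial lookup or retrieval, stay with the recommended blocks, or gate the switch on a workload-type signal you already compute.

Increasing advisor calls for Opus executors

Opus executors usually call the advisor at an appropriate frequency without extra prompting. If your Opus executor under-calls on your workload, add the following checkpoint to your system prompt: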

Call advisor for design, architecture, and risk questions where you won't touch a file. If your response would be analysis or a recommendation with no other tool calls, call advisor first. That judgment call is exactly where a second opinion is highest-value. (This does not apply to simple factual lookups or arithmetic; those you answer directly.)

Hard rule: your first write_file, edit_file, or state-changing bash call on a task must be preceded by an advisor call in the same or an earlier turn. Read-only orientation commands (ls, cat, grep, find) are not state-changing. This is a checkpoint, not a difficulty judgment. It applies to one-line edits too.

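Caveat: In Anthropic's testing, a close variant of this block (the read-only exception in the Hard rule was added after measurement) raised pass rates on under-called tasks by about 7 to 10 percentage points, but caused Opus to over-call on tasks whose first action needs no planning. The net effect on mixed workloads was roughly flat. Add it only if you observe Opus skipping the advisor on tasks where a consultation would help. Do not add it as a default.

Reducing advisor output length

Advisor output is the advisor's largest cost driver, and the top-level max_tokens does not limit it. The advisor treats your system prompt and user messages as quoted context about the executor's task, so instructions addressed directly to the advisor are followed more reliably than third-person descriptions. The most effective placement Anthropic tested is a single line in the user message: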

(Advisor: please keep your guidance under 80 words — I need a focused starting point, not a comprehensive plan.)

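Your agent harness can prepend this line programmatically before sending the request. The limit is a soft constraint. The advisor occasionally exceeds it, so ask for roughly 80% of your true ceiling.

In Anthropic's testing, this line also increased how often the executor consulted the advisor, but the net effect was still lower total cost (more consultations, each shorter).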
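Prepending the line from a harness might look like this sketch. `advisor_length_hint` and `with_advisor_hint` are hypothetical helpers; the requested word count is set to 80% of the ceiling you actually care about, following the guidance above. The wording mirrors the tested line, with "--" standing in for its dash, so treat the exact phrasing as an approximation of what was measured.

```python
def advisor_length_hint(true_ceiling_words: int) -> str:
    """Build the advisor-directed line, asking for 80% of the real ceiling."""
    target = true_ceiling_words * 4 // 5  # integer 80%, rounded down
    return (
        f"(Advisor: please keep your guidance under {target} words -- "
        "I need a focused starting point, not a comprehensive plan.)"
    )


def with_advisor_hint(user_text: str, true_ceiling_words: int) -> str:
    """Prepend the hint to a user message, separated by a blank line."""
    return f"{advisor_length_hint(true_ceiling_words)}\n\n{user_text}"
```

With a true ceiling of 100 words, the generated line asks for guidance under 80 words.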

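Pair this approach with the timing guidance in the recommended system prompt for coding tasks (or with the alternative Haiku block, if you have switched to it) for the strongest cost-quality tradeoff. For a hard cap rather than a soft request, see Capping advisor output.

Capping advisor output

Set max_tokens on the tool definition to cap the advisor's total output per call (thinking plus text):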

tools = [
    {
        "type": "advisor_20260301",
        "name": "advisor",
        "model": "claude-opus-4-8",
        "max_tokens": 2048,
    }
]

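The minimum is 1024. Setting max_tokens above the advisor model's own output limit returns a 400 error. The cap applies to each advisor call independently and is not shared across calls in the same request.

This is more than a hard truncation. The server also passes the remaining token budget to the advisor, so the advisor shapes its response to fit the limit.

Recommended starting point: max_tokens: 2048. In Anthropic's testing on a hard reasoning benchmark (n = 40 per configuration), this cut the advisor's mean output by about 7x compared with no cap, with a near-zero truncation rate and no detectable quality degradation. The minimum of 1024 cut output by about 10x but truncated about 10% of calls. At this sample size, accuracy differences between all configurations were within noise. Validate on your own workload.

`max_tokens`	Mean advisor output tokens	Calls truncated
Unset	about 4,200 to 5,900	n/a
2048	about 630 to 840	about 0%
1024	about 370 to 480	about 10%

Hard reasoning tasks elicit advisor output far longer than the typical 1,400 to 1,800 tokens cited earlier for lighter workloads. Use this table to estimate the savings ratio, not as a general baseline for advisor output.

When the advisor does hit the cap, the result block carries stop_reason: "max_tokens". The API also appends [Advisor output truncated at max_tokens=2048.] (naming your cap) to the advice text so that the executor sees the truncation in its own context. Use stop_reason to detect truncated advice and decide whether to raise the cap or let the executor proceed with partial guidance. Both signals appear only when you set max_tokens on the tool definition.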

{
  "type": "advisor_tool_result",
  "tool_use_id": "srvtoolu_abc123",
  "content": {
    "type": "advisor_result",
    "text": "Use a channel-based coordination pattern. The tricky part is\n\n[Advisor output truncated at max_tokens=2048.]",
    "stop_reason": "max_tokens"
  }
}

Check output_tokens on the corresponding advisor_message entry in usage.iterations to see how close each call came to its cap.
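Both checks can be sketched in a few lines, assuming the usage object and result blocks are plain dicts. `advisor_cap_usage` and `was_truncated` are hypothetical helpers; `cap` is whatever value you set as max_tokens on the tool definition.

```python
def advisor_cap_usage(usage: dict, cap: int) -> list:
    """Return, for each advisor call, its output tokens as a fraction of the cap."""
    return [
        iteration["output_tokens"] / cap
        for iteration in usage.get("iterations", [])
        if iteration["type"] == "advisor_message"
    ]


def was_truncated(result_block: dict) -> bool:
    """True if an advisor_tool_result block reports that the cap was hit."""
    return result_block["content"].get("stop_reason") == "max_tokens"
```

Fractions that sit consistently near 1.0 suggest the cap is tight for the workload; stop_reason is absent when no cap is set, in which case `was_truncated` returns False.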

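Compared with the prompt-based approach, max_tokens is a hard cap rather than a soft request. Use max_tokens when you need a guaranteed bound on cost or latency. Use the prompt-based approach when you want to bias toward brevity without risking truncation partway through the advice (or use both together).

Pairing with effort settings

For coding tasks, pairing a medium-effort Sonnet executor with an Opus advisor can reach intelligence comparable to default-effort Sonnet at lower cost. For maximum intelligence, keep the executor at default effort.

Cost control

For conversation-level budgets, count advisor calls client-side. When you reach your cap, remove the advisor tool from tools and also strip all advisor_tool_result blocks from the message history to avoid a 400 invalid_request_error (see the note in Multi-turn conversations).
Enable caching only on conversations where you expect three or more advisor calls.

Next steps

Memory tool

Store and retrieve information across conversations using a client-side memory directory.

Server tools

Work with tools that Anthropic runs: server_tool_use blocks, pause_turn continuation, and domain filtering.

Tool reference

Catalog of Anthropic-provided tools and reference for optional tool-definition properties.

Effort

Control how many tokens Claude uses when responding with the effort parameter, balancing response thoroughness against token efficiency.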

Was this page helpful?

訊息工具

顧問工具

將較快速的執行者模型與提供生成過程中策略性指導的高智慧顧問模型配對。

顧問工具目前處於測試版。請在您的請求中包含測試版標頭 advisor-tool-2026-03-01。

此功能符合「Zero Data Retention」（零資料保留），即 ZDR 的資格。當您的組織具有 ZDR 安排時，透過此功能傳送的資料在 API 回應返回後不會被儲存。

何時使用

顧問適合以下配置：

您目前在複雜任務上使用 Sonnet： 加入 Opus 作為顧問，以相近或更低的總成本提升品質。
您目前使用 Haiku 並希望提升智慧程度： 加入 Opus 作為顧問。預期成本會高於單獨使用 Haiku，但低於將執行者切換為更大的模型。

結果取決於任務。請在您自己的工作負載上進行評估。

模型相容性

執行者模型	顧問模型
Claude Haiku 4.5 (claude-haiku-4-5-20251001)	Claude Fable 5 (claude-fable-5) Claude Mythos 5 (claude-mythos-5) Claude Opus 4.8 (claude-opus-4-8) Claude Opus 4.7 (claude-opus-4-7) Claude Opus 4.6 (claude-opus-4-6) Claude Sonnet 4.6 (claude-sonnet-4-6)
Claude Sonnet 4.6 (claude-sonnet-4-6)	Claude Fable 5 (claude-fable-5) Claude Mythos 5 (claude-mythos-5) Claude Opus 4.8 (claude-opus-4-8) Claude Opus 4.7 (claude-opus-4-7) Claude Opus 4.6 (claude-opus-4-6) Claude Sonnet 4.6 (claude-sonnet-4-6)
Claude Sonnet 5 (claude-sonnet-5)	Claude Fable 5 (claude-fable-5) Claude Mythos 5 (claude-mythos-5) Claude Opus 4.8 (claude-opus-4-8) Claude Opus 4.7 (claude-opus-4-7)
Claude Opus 4.6 (claude-opus-4-6)	Claude Fable 5 (claude-fable-5) Claude Mythos 5 (claude-mythos-5) Claude Opus 4.8 (claude-opus-4-8) Claude Opus 4.7 (claude-opus-4-7) Claude Opus 4.6 (claude-opus-4-6)
Claude Opus 4.7 (claude-opus-4-7)	Claude Fable 5 (claude-fable-5) Claude Mythos 5 (claude-mythos-5) Claude Opus 4.8 (claude-opus-4-8) Claude Opus 4.7 (claude-opus-4-7)
Claude Opus 4.8 (claude-opus-4-8)	Claude Fable 5 (claude-fable-5) Claude Mythos 5 (claude-mythos-5) Claude Opus 4.8 (claude-opus-4-8) Claude Opus 4.7 (claude-opus-4-7)
Claude Fable 5 (claude-fable-5)	Claude Fable 5 (claude-fable-5)
Claude Mythos 5 (claude-mythos-5)	Claude Mythos 5 (claude-mythos-5)

如果您請求了無效的配對，API 會回傳 400 invalid_request_error，並指出不支援的組合。

平台可用性

顧問工具在 Claude API 和 Claude Platform on AWS 上以測試版提供。目前在 Amazon Bedrock、Google Cloud 或 Microsoft Foundry 上尚未提供。

快速開始

client = anthropic.Anthropic()

response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    betas=["advisor-tool-2026-03-01"],
    tools=[
        {
            "type": "advisor_20260301",
            "name": "advisor",
            "model": "claude-opus-4-8",
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Build a concurrent worker pool in Go with graceful shutdown.",
        }
    ],
)

print(response)

運作方式

當您將顧問工具加入 tools 陣列時，執行者模型會像使用任何其他工具一樣決定何時呼叫它。當執行者呼叫顧問時：

執行者發出一個 server_tool_use 區塊，其中 name: "advisor" 且 input 為空。執行者決定時機，伺服器提供上下文。
Anthropic 在伺服器端對顧問模型執行一次獨立的推論。顧問在其自己的、由 Anthropic 提供的系統提示下執行，並在其輸入中以引用上下文的形式接收執行者的完整對話記錄。該對話記錄包含您的系統提示、工具定義、先前的回合與工具結果，以及執行者在本回合中到目前為止產生的文字。
顧問的回應以 advisor_tool_result 區塊的形式回傳給執行者。
執行者在建議的指引下繼續生成。

顧問本身在沒有工具且沒有上下文管理的情況下執行。其思考區塊會在結果回傳前被捨棄。只有建議文字會傳達給執行者。

工具參數

參數	類型	預設值	說明
`type`	string	必填	必須為 `"advisor_20260301"`。
`name`	string	必填	必須為 `"advisor"`。
`model`	string	必填	顧問模型 ID，例如 claude-opus-4-8。子推論以此模型的費率計費。
`max_uses`	integer	無限制	單一請求中允許的顧問呼叫次數上限。一旦執行者達到此上限，後續的顧問呼叫會回傳帶有 `error_code: "max_uses_exceeded"` 的 `advisor_tool_result_error`，執行者會在沒有進一步建議的情況下繼續。這是每個請求的上限，而非每個對話的上限。關於對話層級的限制，請參閱成本控制。
`max_tokens`	integer	顧問模型的輸出上限	限制顧問每次呼叫的總輸出（思考加文字）。最小值為 1024。請參閱限制顧問輸出。
`caching`	object \| null	`null`（關閉）	為顧問自身的對話記錄在同一對話的多次呼叫間啟用提示快取。請參閱顧問提示快取。

回應結構

成功的顧問呼叫

當顧問被呼叫時，助理的內容中會有一個 server_tool_use 區塊，後面接著一個 advisor_tool_result 區塊：

{
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Let me consult the advisor on this."
    },
    {
      "type": "server_tool_use",
      "id": "srvtoolu_abc123",
      "name": "advisor",
      "input": {}
    },
    {
      "type": "advisor_tool_result",
      "tool_use_id": "srvtoolu_abc123",
      "content": {
        "type": "advisor_result",
        "text": "Use a channel-based coordination pattern. The tricky part is draining in-flight work during shutdown: close the input channel first, then wait on a WaitGroup..."
      }
    },
    {
      "type": "text",
      "text": "Here's the implementation. I'm using a channel-based coordination pattern to avoid writer starvation..."
    }
  ]
}

server_tool_use.input 永遠是空的。伺服器會自動從完整的對話記錄建構顧問的視角。執行者放入 input 的任何內容都不會傳達給顧問。

結果變體

advisor_tool_result.content 欄位是一個可辨識聯集（discriminated union）。對於成功的呼叫，變體取決於顧問模型：

變體	欄位	回傳時機
`advisor_result`	`text`、`stop_reason`	顧問模型回傳純文字（例如 Claude Opus 4.8）。
`advisor_redacted_result`	`encrypted_content`、`stop_reason`	顧問模型回傳加密輸出。

Claude Fable 5 和 Claude Mythos 5 顧問會回傳 advisor_redacted_result。相容性表格中的其他顧問模型會回傳 advisor_result。

在這兩種情況下，請在後續回合中原封不動地回傳內容。如果您在對話中途切換顧問模型，請根據 content.type 分支處理兩種形式。

錯誤結果

如果顧問呼叫失敗，結果會帶有錯誤：

{
  "type": "advisor_tool_result",
  "tool_use_id": "srvtoolu_abc123",
  "content": {
    "type": "advisor_tool_result_error",
    "error_code": "overloaded"
  }
}

執行者會看到錯誤並在沒有進一步建議的情況下繼續。請求本身不會失敗。

`error_code`	意義
`max_uses_exceeded`	請求達到了工具定義上設定的 `max_uses` 上限。同一請求中後續的顧問呼叫會回傳此錯誤。
`too_many_requests`	顧問子推論受到速率限制。
`overloaded`	顧問子推論達到容量限制。
`prompt_too_long`	對話記錄超過了顧問模型的上下文視窗。
`execution_time_exceeded`	顧問子推論逾時。
`unavailable`	任何其他顧問失敗。

多回合對話

在後續回合中，將完整的助理內容（包括 advisor_tool_result 區塊）傳回 API：

client = anthropic.Anthropic()

tools = [
    {
        "type": "advisor_20260301",
        "name": "advisor",
        "model": "claude-opus-4-8",
    }
]

messages = [
    {
        "role": "user",
        "content": "Build a concurrent worker pool in Go with graceful shutdown.",
    }
]

response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    betas=["advisor-tool-2026-03-01"],
    tools=tools,
    messages=messages,
)

# 附加完整的回應內容，包括任何 advisor_tool_result 區塊
messages.append({"role": "assistant", "content": response.content})

# 繼續對話
messages.append({"role": "user", "content": "Now add a max-in-flight limit of 10."})

response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    betas=["advisor-tool-2026-03-01"],
    tools=tools,
    messages=messages,
)

如果您在後續回合中從 tools 省略顧問工具，而訊息歷史中仍包含 advisor_tool_result 區塊，API 會回傳 400 invalid_request_error。

恢復暫停的回合

針對呼叫不足的執行者進行對話中途提醒

使用預設的 NUDGE_TURN 值 2，提醒通常會在模型已掌握任務方向但尚未決定採用某種方法之前到達。

client = anthropic.Anthropic()

NUDGE_TURN = 2  # inject before this assistant turn if no advisor call yet
NUDGE_TEXT = (
    "You have not consulted the advisor yet. If the task has a non-obvious "
    "design decision or a failure mode you haven't ruled out, call advisor "
    "now before committing to an approach."
)
MAX_TURNS = 10  # agent loop cap


def run_your_tools(content):
    # 請替換為您的工具分派邏輯。每個 tool_use 區塊會回傳一個 tool_result 區塊。
    return [
        {
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": "Replace with your tool output.",
        }
        for block in content
        if block.type == "tool_use"
    ]


tools = [
    {"type": "advisor_20260301", "name": "advisor", "model": "claude-opus-4-8"},
    # ... 您的其他工具
]
task = "Build a concurrent worker pool in Go with graceful shutdown."
messages = [{"role": "user", "content": task}]
advisor_called = False

for turn in range(1, MAX_TURNS + 1):
    response = client.beta.messages.create(
        model="claude-haiku-4-5",
        max_tokens=4096,
        betas=["advisor-tool-2026-03-01"],
        tools=tools,
        messages=messages,
    )
    messages.append({"role": "assistant", "content": response.content})
    advisor_called = advisor_called or any(
        b.type == "server_tool_use" and b.name == "advisor" for b in response.content
    )
    if response.stop_reason == "end_turn":
        break
    if response.stop_reason == "pause_turn":
        continue  # server tool pending; re-send to let the API complete it

    results = run_your_tools(response.content)  # list of tool_result blocks
    if results:
        messages.append({"role": "user", "content": results})
    # 如果您的系統提示已告知模型謹慎呼叫，可略過此步驟。
    if turn == NUDGE_TURN - 1 and not advisor_called:
        messages.append({"role": "user", "content": NUDGE_TEXT})

串流

顧問子推論不會串流。顧問執行時，執行者的串流會暫停，然後完整結果會在單一事件中到達。

當顧問完成時，advisor_tool_result 會在單一 content_block_start 事件中完整到達（沒有增量）。執行者的輸出隨後恢復串流。

接著會有一個 message_delta 事件，其中更新的 usage.iterations 陣列反映顧問的 token 計數。

用量與計費

顧問呼叫作為獨立的子推論執行，以顧問模型的費率計費。用量在 usage.iterations[] 陣列中回報：

{
  "usage": {
    "input_tokens": 412,
    "cache_read_input_tokens": 0,
    "cache_creation_input_tokens": 0,
    "output_tokens": 531,
    "iterations": [
      {
        "type": "message",
        "input_tokens": 412,
        "cache_read_input_tokens": 0,
        "cache_creation_input_tokens": 0,
        "output_tokens": 89
      },
      {
        "type": "advisor_message",
        "model": "claude-opus-4-8",
        "input_tokens": 823,
        "cache_read_input_tokens": 0,
        "cache_creation_input_tokens": 0,
        "output_tokens": 1612
      },
      {
        "type": "message",
        "input_tokens": 1348,
        "cache_read_input_tokens": 412,
        "cache_creation_input_tokens": 0,
        "output_tokens": 442
      }
    ]
  }
}
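
Since each model is billed at its own rate, it can help to group the iterations by model before applying prices. The sketch below does that for the sample above. The helper name tokens_by_model is hypothetical, and it assumes that "message" iterations belong to the executor model you pass in, because only advisor_message entries carry a model field in the sample.

```python
from collections import defaultdict
from typing import Any


def tokens_by_model(usage: dict[str, Any], executor_model: str) -> dict[str, dict[str, int]]:
    """Sum token counts per model across usage["iterations"]."""
    totals: dict[str, dict[str, int]] = defaultdict(
        lambda: {"input_tokens": 0, "cache_read_input_tokens": 0, "output_tokens": 0}
    )
    for iteration in usage.get("iterations", []):
        # advisor_message entries name their model; message entries are the executor's.
        model = iteration.get("model", executor_model)
        for key in ("input_tokens", "cache_read_input_tokens", "output_tokens"):
            totals[model][key] += iteration.get(key, 0)
    return dict(totals)


# The iterations array from the sample response above.
sample_usage = {
    "input_tokens": 412,
    "output_tokens": 531,
    "iterations": [
        {"type": "message", "input_tokens": 412,
         "cache_read_input_tokens": 0, "output_tokens": 89},
        {"type": "advisor_message", "model": "claude-opus-4-8", "input_tokens": 823,
         "cache_read_input_tokens": 0, "output_tokens": 1612},
        {"type": "message", "input_tokens": 1348,
         "cache_read_input_tokens": 412, "output_tokens": 442},
    ],
}

totals = tokens_by_model(sample_usage, executor_model="claude-haiku-4-5")
```

In this sample, the two message iterations' output tokens (89 + 442) add up to the top-level output_tokens of 531, so the advisor's 1,612 output tokens appear only in its own iteration entry.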

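Advisor prompt caching

There are two independent caching layers.

Executor-side caching

Advisor-side caching

Set caching on the tool definition to enable prompt caching for the advisor's own transcript across calls within the same conversation: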

tools = [
    {
        "type": "advisor_20260301",
        "name": "advisor",
        "model": "claude-opus-4-8",
        "caching": {"type": "ephemeral", "ttl": "5m"},
    }
]

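Keep it consistent: set caching once and leave it unchanged for the whole conversation. Toggling it mid-conversation causes cache misses.

Combining with other tools

The advisor tool composes with other server-side and client-side tools. Add them all to the same tools array: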

tools = [
    {
        "type": "web_search_20250305",
        "name": "web_search",
        "max_uses": 5,
    },
    {
        "type": "advisor_20260301",
        "name": "advisor",
        "model": "claude-opus-4-8",
    },
    {
        "name": "run_bash",
        "description": "Run a bash command",
        "input_schema": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
        },
    },
]

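The executor can search the web, call the advisor, and use your custom tools in the same turn. The advisor's plan can guide which tools the executor uses next.

Feature	Interaction
Batch processing	Supported. `usage.iterations` is reported per item.
Token counting	Returns only the input tokens for the executor's first iteration. For a rough estimate of advisor usage, call `count_tokens` with `model` set to the advisor model and the same messages.
Context editing	`clear_tool_uses` is not fully compatible with advisor tool blocks. For `clear_thinking`, see the caching warning earlier on this page.
`pause_turn`	When no client-side `tool_use` block in the same turn is waiting on your result, a pending advisor call ends the response with `stop_reason: "pause_turn"` and a `server_tool_use` block that has no result. The advisor runs on resume. If the executor also called one of your tools in that turn, the response ends with `stop_reason: "tool_use"` instead, and the pending advisor call runs at the start of the next request, after you send the `tool_result` blocks. See Resuming a paused turn, Mixing server and client tools in the same turn, and Server tools.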
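
The token-counting row above suggests a rough estimate: count the same messages as if they were sent to the advisor model. A minimal sketch follows, assuming `client` is an `anthropic.Anthropic()` instance from the Python SDK. The helper name estimate_advisor_input_tokens is hypothetical, and the number is only an approximation of what the advisor sub-inference actually reads.

```python
from typing import Any


def estimate_advisor_input_tokens(
    client: Any, advisor_model: str, messages: list[dict[str, Any]]
) -> int:
    """Roughly estimate advisor input by counting the same messages
    against the advisor model instead of the executor model."""
    count = client.messages.count_tokens(model=advisor_model, messages=messages)
    return count.input_tokens


# Usage (requires an API key, so it is left commented out):
# import anthropic
# client = anthropic.Anthropic()
# messages = [{"role": "user", "content": "Build a concurrent worker pool in Go."}]
# print(estimate_advisor_input_tokens(client, "claude-opus-4-8", messages))
```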

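Best practices

Prompting for coding and agentic tasks

The advisor tool's built-in description steers the executor to call it at the start of complex tasks and when it gets stuck. For research tasks, additional prompting is usually unnecessary.

On coding and agentic tasks, the advisor delivers higher intelligence at similar cost when it reduces the total number of tool calls and the conversation length. Two timing points drive that improvement:

Make the first advisor call early, after a few exploratory reads are in the transcript.
For difficult tasks, make a final advisor call after the file writes and test output are in the transcript.

Suggested system prompt for coding tasks

Timing guidance: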

You have access to an `advisor` tool backed by a stronger reviewer model. It takes NO parameters — when you call advisor(), your entire conversation history is automatically forwarded. They see the task, every tool call you've made, every result you've seen.

Call advisor BEFORE substantive work — before writing, before committing to an interpretation, before building on an assumption. If the task requires orientation first (finding files, fetching a source, seeing what's there), do that, then call advisor. Orientation is not substantive work. Writing, editing, and declaring an answer are.

Also call advisor:
- When you believe the task is complete. BEFORE this call, make your deliverable durable: write the file, save the result, commit the change. The advisor call takes time; if the session ends during it, a durable result persists and an unwritten one doesn't.
- When stuck — errors recurring, approach not converging, results that don't fit.
- When considering a change of approach.

On tasks longer than a few steps, call advisor at least once before committing to an approach and once before declaring done. On short reactive tasks where the next action is dictated by tool output you just read, you don't need to keep calling — the advisor adds most of its value on the first call, before the approach crystallizes.

How the executor should treat the advice (place directly after the timing block):

Give the advice serious weight. If you follow a step and it fails empirically, or you have primary-source evidence that contradicts a specific claim (the file says X, the paper states Y), adapt. A passing self-test is not evidence the advice is wrong — it's evidence your test doesn't check what the advice is checking.

If you've already retrieved data pointing one way and the advisor points another: don't silently switch. Surface the conflict in one more advisor call — "I found X, you suggest Y, which constraint breaks the tie?" The advisor saw your evidence but may have underweighted it; a reconcile call is cheaper than committing to the wrong branch.

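Alternative system prompt for Haiku on coding workloads

When your Haiku executor mainly runs coding or write-task workloads, use this block in place of the preceding timing and advice blocks: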

Consult a stronger reviewer who sees your full conversation transcript.

No parameters. When you call advisor(), your entire history -- task, every tool call and result, your reasoning -- is automatically forwarded. The advisor sees exactly what you've done.

Call advisor BEFORE substantive work -- before writing, before committing to an interpretation, before building on an assumption. If the task requires orientation first (finding files, fetching a source, seeing what's there), do that, then call advisor. Orientation is not substantive work. Writing, editing, and declaring an answer are.

Also call advisor:
- When you believe the task is complete. BEFORE this call, make your deliverable durable: write the file, save the result, commit the change. The advisor call takes time; if the session ends during it, a durable result persists and an unwritten one doesn't.
- When stuck -- errors recurring, approach not converging, results that don't fit.
- When considering a change of approach.

On tasks longer than a few steps, call advisor at least once before committing to an approach and once before declaring done. On short reactive tasks where the next action is dictated by tool output you just read, you don't need to keep calling -- the advisor adds most of its value on the first call, before the approach crystallizes.

Give the advice serious weight. If you follow a step and it fails empirically, or you have primary-source evidence that contradicts a specific claim (the file says X, the paper states Y), adapt. A passing self-test is not evidence the advice is wrong -- it's evidence your test doesn't check what the advice is checking.

If you've already retrieved data pointing one way and the advisor points another: don't silently switch. Surface the conflict in one more advisor call -- "I found X, you suggest Y, which constraint breaks the tie?" The advisor saw your evidence but may have underweighted it; a reconcile call is cheaper than committing to the wrong branch.

Call advisor for design, architecture, and risk questions where you won't touch a file. If your response would be analysis or a recommendation with no other tool calls, call advisor first -- that judgment call is exactly where a second opinion is highest-value.

Hard rule: your first write_file, edit_file, or state-changing bash call on a task must be preceded by an advisor call in the same or an earlier turn. Read-only orientation commands (ls, cat, grep, find) are not state-changing. This is a checkpoint, not a difficulty judgment. It applies to one-line edits too.

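Increasing advisor calls for Opus executors

Opus executors usually call the advisor at an appropriate rate without extra prompting. If your Opus executor under-calls on your workload, add the following checkpoints to your system prompt: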

Call advisor for design, architecture, and risk questions where you won't touch a file. If your response would be analysis or a recommendation with no other tool calls, call advisor first. That judgment call is exactly where a second opinion is highest-value. (This does not apply to simple factual lookups or arithmetic; those you answer directly.)

Hard rule: your first write_file, edit_file, or state-changing bash call on a task must be preceded by an advisor call in the same or an earlier turn. Read-only orientation commands (ls, cat, grep, find) are not state-changing. This is a checkpoint, not a difficulty judgment. It applies to one-line edits too.

Trimming advisor output length

(Advisor: please keep your guidance under 80 words — I need a focused starting point, not a comprehensive plan.)

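Your agent harness can prepend this line programmatically before sending the request. The limit is a soft constraint. The advisor occasionally exceeds it, so ask for roughly 80% of your true ceiling.

In Anthropic's testing, this line also increased how often the executor consulted the advisor, but the net effect was still lower total cost (more consultations, each one shorter).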
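
One way a harness might prepend that line is sketched below. It assumes the hint goes at the front of the initial user task text, which this section does not spell out. The names HINT_TEMPLATE, advisor_length_hint, and with_length_hint are hypothetical. The word count is derived from your real ceiling using the 80% guideline above.

```python
HINT_TEMPLATE = (
    "(Advisor: please keep your guidance under {words} words -- "
    "I need a focused starting point, not a comprehensive plan.)"
)


def advisor_length_hint(true_ceiling_words: int) -> str:
    """Build the hint, asking for about 80% of the real ceiling
    because the advisor occasionally overshoots a soft limit."""
    requested = true_ceiling_words * 8 // 10  # integer math avoids float rounding
    return HINT_TEMPLATE.format(words=requested)


def with_length_hint(task: str, true_ceiling_words: int) -> str:
    """Prepend the hint to the task text (assumed placement)."""
    return f"{advisor_length_hint(true_ceiling_words)}\n\n{task}"


task = "Build a concurrent worker pool in Go with graceful shutdown."
messages = [{"role": "user", "content": with_length_hint(task, true_ceiling_words=100)}]
```

With a true ceiling of 100 words, the generated hint asks for 80, matching the figure in the line above.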

Capping advisor output

Set max_tokens on the tool definition to cap the advisor's total output per call (thinking plus text):

tools = [
    {
        "type": "advisor_20260301",
        "name": "advisor",
        "model": "claude-opus-4-8",
        "max_tokens": 2048,
    }
]

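This is more than a hard truncation. The server also passes the remaining token budget to the advisor, so the advisor sizes its response to fit the limit.

`max_tokens`	Average advisor output tokens	Truncated calls
Not set	about 4,200 to 5,900	not applicable
2048	about 630 to 840	about 0%
1024	about 370 to 480	about 10%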

{
  "type": "advisor_tool_result",
  "tool_use_id": "srvtoolu_abc123",
  "content": {
    "type": "advisor_result",
    "text": "Use a channel-based coordination pattern. The tricky part is\n\n[Advisor output truncated at max_tokens=2048.]",
    "stop_reason": "max_tokens"
  }
}

Check output_tokens on the corresponding advisor_message entry in usage.iterations to see how close each call came to its cap.
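
The check described above can be sketched as two small helpers. They assume usage and content blocks are available as plain dictionaries shaped like the JSON samples on this page. The helper names advisor_cap_utilization and truncated_advisor_calls are hypothetical.

```python
from typing import Any


def advisor_cap_utilization(usage: dict[str, Any], max_tokens: int) -> list[float]:
    """Return output_tokens / max_tokens for each advisor_message iteration."""
    return [
        iteration["output_tokens"] / max_tokens
        for iteration in usage.get("iterations", [])
        if iteration.get("type") == "advisor_message"
    ]


def truncated_advisor_calls(content_blocks: list[dict[str, Any]]) -> list[str]:
    """Return the tool_use_id of every advisor result that stopped at max_tokens."""
    return [
        block["tool_use_id"]
        for block in content_blocks
        if block.get("type") == "advisor_tool_result"
        and block.get("content", {}).get("stop_reason") == "max_tokens"
    ]


# Data shaped like the samples on this page.
usage = {
    "iterations": [
        {"type": "message", "input_tokens": 412, "output_tokens": 89},
        {"type": "advisor_message", "model": "claude-opus-4-8",
         "input_tokens": 823, "output_tokens": 1612},
    ]
}
blocks = [
    {
        "type": "advisor_tool_result",
        "tool_use_id": "srvtoolu_abc123",
        "content": {"type": "advisor_result", "text": "...", "stop_reason": "max_tokens"},
    }
]

ratios = advisor_cap_utilization(usage, max_tokens=2048)
truncated = truncated_advisor_calls(blocks)
```

Ratios that sit near 1.0 across many calls suggest the cap is tight enough to risk truncation. Consistently low ratios suggest there is room to lower it.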

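Pairing with effort settings

Cost control

For conversation-level budgets, count advisor calls client-side. When you reach your cap, remove the advisor tool from tools and strip all advisor_tool_result blocks from the message history to avoid a 400 invalid_request_error (see the caveats under Multi-turn conversations).
Enable caching only in conversations where you expect three or more advisor calls.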
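
The two cost controls above can be sketched as follows. This assumes message content is stored as lists of plain dictionaries. The helper names and the max_calls parameter are hypothetical. The sketch strips only advisor_tool_result blocks, as the text describes. Whether the paired server_tool_use blocks also need removing is not covered here, so verify that against the Multi-turn conversations section.

```python
from typing import Any

ADVISOR_TOOL_TYPE = "advisor_20260301"


def build_advisor_tool(model: str, expected_advisor_calls: int) -> dict[str, Any]:
    """Build the advisor tool definition, enabling caching only when
    three or more advisor calls are expected in the conversation."""
    tool: dict[str, Any] = {"type": ADVISOR_TOOL_TYPE, "name": "advisor", "model": model}
    if expected_advisor_calls >= 3:
        tool["caching"] = {"type": "ephemeral", "ttl": "5m"}
    return tool


def count_advisor_calls(messages: list[dict[str, Any]]) -> int:
    """Count advisor server_tool_use blocks across the message history."""
    return sum(
        1
        for message in messages
        if isinstance(message["content"], list)
        for block in message["content"]
        if block.get("type") == "server_tool_use" and block.get("name") == "advisor"
    )


def apply_advisor_budget(
    tools: list[dict[str, Any]], messages: list[dict[str, Any]], max_calls: int
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Once the cap is reached, drop the advisor tool and strip
    advisor_tool_result blocks from the history. Inputs are not mutated."""
    if count_advisor_calls(messages) < max_calls:
        return tools, messages
    trimmed_tools = [t for t in tools if t.get("type") != ADVISOR_TOOL_TYPE]
    trimmed_messages = []
    for message in messages:
        content = message["content"]
        if isinstance(content, list):
            content = [b for b in content if b.get("type") != "advisor_tool_result"]
        trimmed_messages.append({**message, "content": content})
    return trimmed_tools, trimmed_messages
```

Call apply_advisor_budget before each request and send the returned tools and messages. Below the cap it returns its inputs unchanged.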

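Next steps

Memory tool

Store and retrieve information across conversations using a client-side memory directory.

Server tools

Use tools run by Anthropic: server_tool_use blocks, pause_turn continuation, and domain filtering.

Tool reference

A catalog of Anthropic-provided tools and a reference for optional tool definition properties.

Effort

Use the effort parameter to control how many tokens Claude uses when responding, balancing response thoroughness against token efficiency.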
