工具

网页获取工具

网页获取工具允许 Claude 从指定的网页和 PDF 文档中检索完整内容。

网页获取工具目前处于 Beta 阶段。要启用它，请在 API 请求中使用 Beta 头 web-fetch-2025-09-10。

请使用此表单提供关于模型响应质量、API 本身或文档质量的反馈。

在 Claude 处理不受信任的输入和敏感数据的环境中启用网页获取工具会带来数据泄露风险。我们建议仅在受信任的环境中或处理非敏感数据时使用此工具。

为了最大限度地降低数据泄露风险，Claude 不允许动态构建 URL。Claude 只能获取用户明确提供的 URL 或来自先前网页搜索或网页获取结果的 URL。但是，在使用此工具时仍然存在需要仔细考虑的残余风险。

如果数据泄露是一个问题，请考虑：

完全禁用网页获取工具
使用 max_uses 参数限制请求次数
使用 allowed_domains 参数限制为已知安全域名

支持的模型

网页获取可在以下模型上使用：

Claude Opus 4.6 (claude-opus-4-6)
Claude Opus 4.5 (claude-opus-4-5-20251101)
Claude Opus 4.1 (claude-opus-4-1-20250805)
Claude Opus 4 (claude-opus-4-20250514)
Claude Sonnet 4.5 (claude-sonnet-4-5-20250929)
Claude Sonnet 4 (claude-sonnet-4-20250514)
Claude Sonnet 3.7（已弃用）(claude-3-7-sonnet-20250219)
Claude Haiku 4.5 (claude-haiku-4-5-20251001)
Claude Haiku 3.5（已弃用）(claude-3-5-haiku-latest)

网页获取的工作原理

当您在 API 请求中添加网页获取工具时：

Claude 根据提示和可用的 URL 决定何时获取内容。
API 从指定的 URL 检索完整的文本内容。
对于 PDF，会自动执行文本提取。
Claude 分析获取的内容并提供带有可选引用的响应。

网页获取工具目前不支持通过 Javascript 动态渲染的网站。

如何使用网页获取

在您的 API 请求中提供网页获取工具：

curl https://api.anthropic.com/v1/messages \
    --header "x-api-key: $ANTHROPIC_API_KEY" \
    --header "anthropic-version: 2023-06-01" \
    --header "anthropic-beta: web-fetch-2025-09-10" \
    --header "content-type: application/json" \
    --data '{
        "model": "claude-opus-4-6",
        "max_tokens": 1024,
        "messages": [
            {
                "role": "user",
                "content": "Please analyze the content at https://example.com/article"
            }
        ],
        "tools": [{
            "type": "web_fetch_20250910",
            "name": "web_fetch",
            "max_uses": 5
        }]
    }'

工具定义

网页获取工具支持以下参数：

JSON

{
  "type": "web_fetch_20250910",
  "name": "web_fetch",

  // 可选：限制每个请求的获取次数
  "max_uses": 10,

  // 可选：仅从这些域名获取
  "allowed_domains": ["example.com", "docs.example.com"],

  // 可选：永不从这些域名获取
  "blocked_domains": ["private.example.com"],

  // 可选：为获取的内容启用引用
  "citations": {
    "enabled": true
  },

  // 可选：最大内容长度（以 token 为单位）
  "max_content_tokens": 100000
}

最大使用次数

max_uses 参数限制执行的网页获取次数。如果 Claude 尝试的获取次数超过允许的次数，web_fetch_tool_result 将返回一个带有 max_uses_exceeded 错误代码的错误。目前没有默认限制。

域名过滤

使用域名过滤时：

域名不应包含 HTTP/HTTPS 协议（使用 example.com 而不是 https://example.com）
子域名会自动包含（example.com 涵盖 docs.example.com）
支持子路径（example.com/blog）
您可以使用 allowed_domains 或 blocked_domains，但不能在同一请求中同时使用两者。

请注意，域名中的 Unicode 字符可能通过同形异义词攻击造成安全漏洞，其中来自不同文字系统的视觉相似字符可以绕过域名过滤器。例如，аmazon.com（使用西里尔字母 'а'）可能看起来与 amazon.com 相同，但代表的是不同的域名。

配置域名允许/阻止列表时：

尽可能使用纯 ASCII 域名
考虑到 URL 解析器可能以不同方式处理 Unicode 规范化
使用潜在的同形异义词变体测试您的域名过滤器
定期审核您的域名配置以检查可疑的 Unicode 字符

内容限制

max_content_tokens 参数限制将包含在上下文中的内容量。如果获取的内容超过此限制，将被截断。这有助于在获取大型文档时控制 token 使用量。

max_content_tokens 参数限制是近似值。实际使用的输入 token 数量可能会有少量差异。

引用

与网页搜索始终启用引用不同，网页获取的引用是可选的。设置 "citations": {"enabled": true} 以启用 Claude 引用获取文档中的特定段落。

当直接向最终用户显示 API 输出时，必须包含指向原始来源的引用。如果您在向最终用户显示之前对 API 输出进行了修改，包括重新处理和/或将其与您自己的材料结合，请根据您法律团队的建议适当显示引用。

响应

以下是一个示例响应结构：

{
  "role": "assistant",
  "content": [
    // 1. Claude 决定获取
    {
      "type": "text",
      "text": "I'll fetch the content from the article to analyze it."
    },
    // 2. 获取请求
    {
      "type": "server_tool_use",
      "id": "srvtoolu_01234567890abcdef",
      "name": "web_fetch",
      "input": {
        "url": "https://example.com/article"
      }
    },
    // 3. 获取结果
    {
      "type": "web_fetch_tool_result",
      "tool_use_id": "srvtoolu_01234567890abcdef",
      "content": {
        "type": "web_fetch_result",
        "url": "https://example.com/article",
        "content": {
          "type": "document",
          "source": {
            "type": "text",
            "media_type": "text/plain",
            "data": "Full text content of the article..."
          },
          "title": "Article Title",
          "citations": {"enabled": true}
        },
        "retrieved_at": "2025-08-25T10:30:00Z"
      }
    },
    // 4. Claude 的分析及引用（如果已启用）
    {
      "text": "Based on the article, ",
      "type": "text"
    },
    {
      "text": "the main argument presented is that artificial intelligence will transform healthcare",
      "type": "text",
      "citations": [
        {
          "type": "char_location",
          "document_index": 0,
          "document_title": "Article Title",
          "start_char_index": 1234,
          "end_char_index": 1456,
          "cited_text": "Artificial intelligence is poised to revolutionize healthcare delivery..."
        }
      ]
    }
  ],
  "id": "msg_a930390d3a",
  "usage": {
    "input_tokens": 25039,
    "output_tokens": 931,
    "server_tool_use": {
      "web_fetch_requests": 1
    }
  },
  "stop_reason": "end_turn"
}

获取结果

获取结果包括：

url：被获取的 URL
content：包含获取内容的文档块
retrieved_at：内容被检索时的时间戳

网页获取工具会缓存结果以提高性能并减少冗余请求。这意味着返回的内容可能并不总是该 URL 上可用的最新版本。缓存行为是自动管理的，可能会随时间变化以针对不同的内容类型和使用模式进行优化。

对于 PDF 文档，内容将以 base64 编码数据的形式返回：

{
  "type": "web_fetch_tool_result",
  "tool_use_id": "srvtoolu_02",
  "content": {
    "type": "web_fetch_result",
    "url": "https://example.com/paper.pdf",
    "content": {
      "type": "document",
      "source": {
        "type": "base64",
        "media_type": "application/pdf",
        "data": "JVBERi0xLjQKJcOkw7zDtsOfCjIgMCBvYmo..."
      },
      "citations": {"enabled": true}
    },
    "retrieved_at": "2025-08-25T10:30:02Z"
  }
}

错误

当网页获取工具遇到错误时，Claude API 返回 200（成功）响应，错误在响应体中表示：

{
  "type": "web_fetch_tool_result",
  "tool_use_id": "srvtoolu_a93jad",
  "content": {
    "type": "web_fetch_tool_error",
    "error_code": "url_not_accessible"
  }
}

以下是可能的错误代码：

invalid_input：无效的 URL 格式
url_too_long：URL 超过最大长度（250 个字符）
url_not_allowed：URL 被域名过滤规则和模型限制阻止
url_not_accessible：无法获取内容（HTTP 错误）
too_many_requests：超过速率限制
unsupported_content_type：不支持的内容类型（仅支持文本和 PDF）
max_uses_exceeded：超过网页获取工具的最大使用次数
unavailable：发生内部错误

URL 验证

出于安全原因，网页获取工具只能获取之前在对话上下文中出现过的 URL。这包括：

用户消息中的 URL
客户端工具结果中的 URL
来自先前网页搜索或网页获取结果的 URL

该工具无法获取 Claude 生成的任意 URL 或来自基于容器的服务器工具（代码执行、Bash 等）的 URL。

搜索与获取组合使用

网页获取与网页搜索无缝配合，实现全面的信息收集：

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": "Find recent articles about quantum computing and analyze the most relevant one in detail"
        }
    ],
    tools=[
        {
            "type": "web_search_20250305",
            "name": "web_search",
            "max_uses": 3
        },
        {
            "type": "web_fetch_20250910",
            "name": "web_fetch",
            "max_uses": 5,
            "citations": {"enabled": True}
        }
    ],
    extra_headers={
        "anthropic-beta": "web-fetch-2025-09-10"
    }
)

在此工作流程中，Claude 将：

使用网页搜索查找相关文章
选择最有前景的结果
使用网页获取检索完整内容
提供带有引用的详细分析

提示缓存

网页获取与提示缓存配合使用。要启用提示缓存，请在请求中添加 cache_control 断点。缓存的获取结果可以在对话轮次之间重复使用。

import anthropic

client = anthropic.Anthropic()

# 第一个带有网页获取的请求
messages = [
    {
        "role": "user",
        "content": "Analyze this research paper: https://arxiv.org/abs/2024.12345"
    }
]

response1 = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=messages,
    tools=[{
        "type": "web_fetch_20250910",
        "name": "web_fetch"
    }],
    extra_headers={
        "anthropic-beta": "web-fetch-2025-09-10"
    }
)

# 将 Claude 的响应添加到对话中
messages.append({
    "role": "assistant",
    "content": response1.content
})

# 带有缓存断点的第二个请求
messages.append({
    "role": "user",
    "content": "What methodology does the paper use?",
    "cache_control": {"type": "ephemeral"}
})

response2 = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=messages,
    tools=[{
        "type": "web_fetch_20250910",
        "name": "web_fetch"
    }],
    extra_headers={
        "anthropic-beta": "web-fetch-2025-09-10"
    }
)

# 第二个响应受益于缓存的获取结果
print(f"Cache read tokens: {response2.usage.get('cache_read_input_tokens', 0)}")

流式传输

启用流式传输后，获取事件是流的一部分，在内容检索期间会有暂停：

event: message_start
data: {"type": "message_start", "message": {"id": "msg_abc123", "type": "message"}}

event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}

// Claude 决定获取

event: content_block_start
data: {"type": "content_block_start", "index": 1, "content_block": {"type": "server_tool_use", "id": "srvtoolu_xyz789", "name": "web_fetch"}}

// 获取 URL 流式传输
event: content_block_delta
data: {"type": "content_block_delta", "index": 1, "delta": {"type": "input_json_delta", "partial_json": "{\"url\":\"https://example.com/article\"}"}}

// 获取执行期间暂停

// 获取结果流式传输
event: content_block_start
data: {"type": "content_block_start", "index": 2, "content_block": {"type": "web_fetch_tool_result", "tool_use_id": "srvtoolu_xyz789", "content": {"type": "web_fetch_result", "url": "https://example.com/article", "content": {"type": "document", "source": {"type": "text", "media_type": "text/plain", "data": "Article content..."}}}}}

// Claude 的响应继续...

批量请求

您可以在 Messages Batches API 中包含网页获取工具。通过 Messages Batches API 进行的网页获取工具调用与常规 Messages API 请求中的定价相同。

使用量和定价

Web fetch usage has no additional charges beyond standard token costs:

"usage": {
  "input_tokens": 25039,
  "output_tokens": 931,
  "cache_read_input_tokens": 0,
  "cache_creation_input_tokens": 0,
  "server_tool_use": {
    "web_fetch_requests": 1
  }
}

The web fetch tool is available on the Claude API at no additional cost. You only pay standard token costs for the fetched content that becomes part of your conversation context.

To protect against inadvertently fetching large content that would consume excessive tokens, use the max_content_tokens parameter to set appropriate limits based on your use case and budget considerations.

Example token usage for typical content:

Average web page (10KB): ~2,500 tokens
Large documentation page (100KB): ~25,000 tokens
Research paper PDF (500KB): ~125,000 tokens

Was this page helpful?

工具

网页获取工具

网页获取工具允许 Claude 从指定的网页和 PDF 文档中检索完整内容。

网页获取工具目前处于 Beta 阶段。要启用它，请在 API 请求中使用 Beta 头 web-fetch-2025-09-10。

请使用此表单提供关于模型响应质量、API 本身或文档质量的反馈。

如果数据泄露是一个问题，请考虑：

完全禁用网页获取工具
使用 max_uses 参数限制请求次数
使用 allowed_domains 参数限制为已知安全域名

支持的模型

网页获取可在以下模型上使用：

Claude Opus 4.6 (claude-opus-4-6)
Claude Opus 4.5 (claude-opus-4-5-20251101)
Claude Opus 4.1 (claude-opus-4-1-20250805)
Claude Opus 4 (claude-opus-4-20250514)
Claude Sonnet 4.5 (claude-sonnet-4-5-20250929)
Claude Sonnet 4 (claude-sonnet-4-20250514)
Claude Sonnet 3.7（已弃用）(claude-3-7-sonnet-20250219)
Claude Haiku 4.5 (claude-haiku-4-5-20251001)
Claude Haiku 3.5（已弃用）(claude-3-5-haiku-latest)

网页获取的工作原理

当您在 API 请求中添加网页获取工具时：

Claude 根据提示和可用的 URL 决定何时获取内容。
API 从指定的 URL 检索完整的文本内容。
对于 PDF，会自动执行文本提取。
Claude 分析获取的内容并提供带有可选引用的响应。

网页获取工具目前不支持通过 Javascript 动态渲染的网站。

如何使用网页获取

在您的 API 请求中提供网页获取工具：

curl https://api.anthropic.com/v1/messages \
    --header "x-api-key: $ANTHROPIC_API_KEY" \
    --header "anthropic-version: 2023-06-01" \
    --header "anthropic-beta: web-fetch-2025-09-10" \
    --header "content-type: application/json" \
    --data '{
        "model": "claude-opus-4-6",
        "max_tokens": 1024,
        "messages": [
            {
                "role": "user",
                "content": "Please analyze the content at https://example.com/article"
            }
        ],
        "tools": [{
            "type": "web_fetch_20250910",
            "name": "web_fetch",
            "max_uses": 5
        }]
    }'

工具定义

网页获取工具支持以下参数：

JSON

{
  "type": "web_fetch_20250910",
  "name": "web_fetch",

  // 可选：限制每个请求的获取次数
  "max_uses": 10,

  // 可选：仅从这些域名获取
  "allowed_domains": ["example.com", "docs.example.com"],

  // 可选：永不从这些域名获取
  "blocked_domains": ["private.example.com"],

  // 可选：为获取的内容启用引用
  "citations": {
    "enabled": true
  },

  // 可选：最大内容长度（以 token 为单位）
  "max_content_tokens": 100000
}

最大使用次数

域名过滤

使用域名过滤时：

域名不应包含 HTTP/HTTPS 协议（使用 example.com 而不是 https://example.com）
子域名会自动包含（example.com 涵盖 docs.example.com）
支持子路径（example.com/blog）
您可以使用 allowed_domains 或 blocked_domains，但不能在同一请求中同时使用两者。

配置域名允许/阻止列表时：

尽可能使用纯 ASCII 域名
考虑到 URL 解析器可能以不同方式处理 Unicode 规范化
使用潜在的同形异义词变体测试您的域名过滤器
定期审核您的域名配置以检查可疑的 Unicode 字符

内容限制

max_content_tokens 参数限制将包含在上下文中的内容量。如果获取的内容超过此限制，将被截断。这有助于在获取大型文档时控制 token 使用量。

max_content_tokens 参数限制是近似值。实际使用的输入 token 数量可能会有少量差异。

引用

与网页搜索始终启用引用不同，网页获取的引用是可选的。设置 "citations": {"enabled": true} 以启用 Claude 引用获取文档中的特定段落。

响应

以下是一个示例响应结构：

{
  "role": "assistant",
  "content": [
    // 1. Claude 决定获取
    {
      "type": "text",
      "text": "I'll fetch the content from the article to analyze it."
    },
    // 2. 获取请求
    {
      "type": "server_tool_use",
      "id": "srvtoolu_01234567890abcdef",
      "name": "web_fetch",
      "input": {
        "url": "https://example.com/article"
      }
    },
    // 3. 获取结果
    {
      "type": "web_fetch_tool_result",
      "tool_use_id": "srvtoolu_01234567890abcdef",
      "content": {
        "type": "web_fetch_result",
        "url": "https://example.com/article",
        "content": {
          "type": "document",
          "source": {
            "type": "text",
            "media_type": "text/plain",
            "data": "Full text content of the article..."
          },
          "title": "Article Title",
          "citations": {"enabled": true}
        },
        "retrieved_at": "2025-08-25T10:30:00Z"
      }
    },
    // 4. Claude 的分析及引用（如果已启用）
    {
      "text": "Based on the article, ",
      "type": "text"
    },
    {
      "text": "the main argument presented is that artificial intelligence will transform healthcare",
      "type": "text",
      "citations": [
        {
          "type": "char_location",
          "document_index": 0,
          "document_title": "Article Title",
          "start_char_index": 1234,
          "end_char_index": 1456,
          "cited_text": "Artificial intelligence is poised to revolutionize healthcare delivery..."
        }
      ]
    }
  ],
  "id": "msg_a930390d3a",
  "usage": {
    "input_tokens": 25039,
    "output_tokens": 931,
    "server_tool_use": {
      "web_fetch_requests": 1
    }
  },
  "stop_reason": "end_turn"
}

获取结果

获取结果包括：

url：被获取的 URL
content：包含获取内容的文档块
retrieved_at：内容被检索时的时间戳

对于 PDF 文档，内容将以 base64 编码数据的形式返回：

{
  "type": "web_fetch_tool_result",
  "tool_use_id": "srvtoolu_02",
  "content": {
    "type": "web_fetch_result",
    "url": "https://example.com/paper.pdf",
    "content": {
      "type": "document",
      "source": {
        "type": "base64",
        "media_type": "application/pdf",
        "data": "JVBERi0xLjQKJcOkw7zDtsOfCjIgMCBvYmo..."
      },
      "citations": {"enabled": true}
    },
    "retrieved_at": "2025-08-25T10:30:02Z"
  }
}

错误

当网页获取工具遇到错误时，Claude API 返回 200（成功）响应，错误在响应体中表示：

{
  "type": "web_fetch_tool_result",
  "tool_use_id": "srvtoolu_a93jad",
  "content": {
    "type": "web_fetch_tool_error",
    "error_code": "url_not_accessible"
  }
}

以下是可能的错误代码：

invalid_input：无效的 URL 格式
url_too_long：URL 超过最大长度（250 个字符）
url_not_allowed：URL 被域名过滤规则和模型限制阻止
url_not_accessible：无法获取内容（HTTP 错误）
too_many_requests：超过速率限制
unsupported_content_type：不支持的内容类型（仅支持文本和 PDF）
max_uses_exceeded：超过网页获取工具的最大使用次数
unavailable：发生内部错误

URL 验证

出于安全原因，网页获取工具只能获取之前在对话上下文中出现过的 URL。这包括：

用户消息中的 URL
客户端工具结果中的 URL
来自先前网页搜索或网页获取结果的 URL

该工具无法获取 Claude 生成的任意 URL 或来自基于容器的服务器工具（代码执行、Bash 等）的 URL。

搜索与获取组合使用

网页获取与网页搜索无缝配合，实现全面的信息收集：

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": "Find recent articles about quantum computing and analyze the most relevant one in detail"
        }
    ],
    tools=[
        {
            "type": "web_search_20250305",
            "name": "web_search",
            "max_uses": 3
        },
        {
            "type": "web_fetch_20250910",
            "name": "web_fetch",
            "max_uses": 5,
            "citations": {"enabled": True}
        }
    ],
    extra_headers={
        "anthropic-beta": "web-fetch-2025-09-10"
    }
)

在此工作流程中，Claude 将：

使用网页搜索查找相关文章
选择最有前景的结果
使用网页获取检索完整内容
提供带有引用的详细分析

提示缓存

网页获取与提示缓存配合使用。要启用提示缓存，请在请求中添加 cache_control 断点。缓存的获取结果可以在对话轮次之间重复使用。

import anthropic

client = anthropic.Anthropic()

# 第一个带有网页获取的请求
messages = [
    {
        "role": "user",
        "content": "Analyze this research paper: https://arxiv.org/abs/2024.12345"
    }
]

response1 = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=messages,
    tools=[{
        "type": "web_fetch_20250910",
        "name": "web_fetch"
    }],
    extra_headers={
        "anthropic-beta": "web-fetch-2025-09-10"
    }
)

# 将 Claude 的响应添加到对话中
messages.append({
    "role": "assistant",
    "content": response1.content
})

# 带有缓存断点的第二个请求
messages.append({
    "role": "user",
    "content": "What methodology does the paper use?",
    "cache_control": {"type": "ephemeral"}
})

response2 = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=messages,
    tools=[{
        "type": "web_fetch_20250910",
        "name": "web_fetch"
    }],
    extra_headers={
        "anthropic-beta": "web-fetch-2025-09-10"
    }
)

# 第二个响应受益于缓存的获取结果
print(f"Cache read tokens: {response2.usage.get('cache_read_input_tokens', 0)}")

流式传输

启用流式传输后，获取事件是流的一部分，在内容检索期间会有暂停：

event: message_start
data: {"type": "message_start", "message": {"id": "msg_abc123", "type": "message"}}

event: content_block_start
data: {"type": "content_block_start", "index": 0, "content_block": {"type": "text", "text": ""}}

// Claude 决定获取

event: content_block_start
data: {"type": "content_block_start", "index": 1, "content_block": {"type": "server_tool_use", "id": "srvtoolu_xyz789", "name": "web_fetch"}}

// 获取 URL 流式传输
event: content_block_delta
data: {"type": "content_block_delta", "index": 1, "delta": {"type": "input_json_delta", "partial_json": "{\"url\":\"https://example.com/article\"}"}}

// 获取执行期间暂停

// 获取结果流式传输
event: content_block_start
data: {"type": "content_block_start", "index": 2, "content_block": {"type": "web_fetch_tool_result", "tool_use_id": "srvtoolu_xyz789", "content": {"type": "web_fetch_result", "url": "https://example.com/article", "content": {"type": "document", "source": {"type": "text", "media_type": "text/plain", "data": "Article content..."}}}}}

// Claude 的响应继续...

批量请求

您可以在 Messages Batches API 中包含网页获取工具。通过 Messages Batches API 进行的网页获取工具调用与常规 Messages API 请求中的定价相同。

使用量和定价

Web fetch usage has no additional charges beyond standard token costs:

"usage": {
  "input_tokens": 25039,
  "output_tokens": 931,
  "cache_read_input_tokens": 0,
  "cache_creation_input_tokens": 0,
  "server_tool_use": {
    "web_fetch_requests": 1
  }
}

The web fetch tool is available on the Claude API at no additional cost. You only pay standard token costs for the fetched content that becomes part of your conversation context.

Example token usage for typical content:

Average web page (10KB): ~2,500 tokens
Large documentation page (100KB): ~25,000 tokens
Research paper PDF (500KB): ~125,000 tokens

Was this page helpful?