Claude Platform Docs

Claude on Google Cloud

Anthropic's Claude models are available through Google Cloud's Agent Platform.

The API for accessing Claude on Google Cloud's Agent Platform is nearly identical to the Messages API, with two key differences in request format:

  • On Agent Platform, model is not passed in the request body. Instead, it is specified in the Google Cloud endpoint URL.
  • On Agent Platform, anthropic_version is passed in the request body (rather than as a header), and must be set to the value vertex-2023-10-16.
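
To make these two differences concrete, here is a minimal sketch that assembles a raw request without sending it. The build_raw_request helper, the base_url argument, and the path layout ending in :rawPredict are assumptions for illustration and do not come from this guide, so confirm the exact endpoint URL in Google Cloud's documentation. Only the anthropic_version value is taken from the text above.

```python
import json

# Value this guide requires in the request body.
ANTHROPIC_VERSION = "vertex-2023-10-16"


def build_raw_request(base_url, project_id, location, model, messages, max_tokens):
    """Return (url, json_body) for a raw request. Nothing is sent."""
    # ASSUMPTION: this path layout is illustrative; confirm it in Google Cloud's docs.
    url = (
        f"{base_url}/v1/projects/{project_id}/locations/{location}"
        f"/publishers/anthropic/models/{model}:rawPredict"
    )
    # The model is named in the URL, so the body carries no "model" key.
    body = {
        "anthropic_version": ANTHROPIC_VERSION,
        "max_tokens": max_tokens,
        "messages": messages,
    }
    return url, json.dumps(body)


url, body = build_raw_request(
    base_url="https://example.invalid",  # hypothetical placeholder host
    project_id="MY_PROJECT_ID",
    location="global",
    model="claude-opus-4-8",
    messages=[{"role": "user", "content": "Hey Claude!"}],
    max_tokens=100,
)
print(url)
print(body)
```

When you use one of the client SDKs, this translation is handled for you: you pass model as usual and the SDK places it in the URL.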

Agent Platform is also supported by Anthropic's official client SDKs. This guide walks you through making a request to Claude on Agent Platform using one of Anthropic's client SDKs.

Note that this guide assumes you already have a Google Cloud project that is able to use Agent Platform. See Anthropic Claude models on Agent Platform for more information on the setup required and a full walkthrough.

Install an SDK for accessing Agent Platform

First, install Anthropic's client SDK for your language of choice.
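
For Python, the AnthropicVertex client ships with the anthropic package, and its Google authentication dependency is typically pulled in with the vertex extra: pip install -U "anthropic[vertex]". The check below is a small sketch rather than part of the SDK. vertex_sdk_ready is a hypothetical helper that reports whether the anthropic package and google-auth are importable in the current environment.

```python
import importlib.util


def vertex_sdk_ready():
    """Return True if both `anthropic` and `google.auth` can be found."""
    for name in ("anthropic", "google.auth"):
        try:
            if importlib.util.find_spec(name) is None:
                return False
        except ModuleNotFoundError:
            # Raised when a parent package (here, `google`) is missing.
            return False
    return True


if not vertex_sdk_ready():
    print('Missing dependencies. Try: pip install -U "anthropic[vertex]"')
```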

Accessing Agent Platform

Model availability

Note that Anthropic model availability varies by region. Search for "Claude" in the Model Garden or go to Anthropic Claude models for the latest information.

API model IDs

Lifecycle terms (Deprecated, Retired) are defined in Model deprecations. Lifecycle dates on partner-operated platforms are set by the partner and can differ from the Claude API schedule. For the current retirement date of any model on Agent Platform, see Google Cloud's documentation for Claude models on Agent Platform.

Model                           Agent Platform API model ID
Claude Fable 5                  claude-fable-5
Claude Opus 4.8                 claude-opus-4-8
Claude Opus 4.7                 claude-opus-4-7
Claude Opus 4.6                 claude-opus-4-6
Claude Sonnet 4.6               claude-sonnet-4-6
Claude Sonnet 4.5               claude-sonnet-4-5@20250929
Claude Sonnet 4 (deprecated)    claude-sonnet-4@20250514
Claude Sonnet 3.7 (retired)     claude-3-7-sonnet@20250219
Claude Opus 4.5                 claude-opus-4-5@20251101
Claude Opus 4.1 (deprecated)    claude-opus-4-1@20250805
Claude Opus 4 (deprecated)      claude-opus-4@20250514
Claude Haiku 4.5                claude-haiku-4-5@20251001
Claude Haiku 3.5 (deprecated)   claude-3-5-haiku@20241022
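
Some of these IDs carry an @date suffix and others do not, so it can help to keep them in one place. The following is a minimal sketch with IDs copied from the table above. MODEL_IDS, DEPRECATED, RETIRED, and model_id are illustrative names, not part of any SDK, and rejecting retired models outright is an assumption about how you may want to handle them.

```python
import warnings

# IDs copied from the table above.
MODEL_IDS = {
    "Claude Fable 5": "claude-fable-5",
    "Claude Opus 4.8": "claude-opus-4-8",
    "Claude Opus 4.7": "claude-opus-4-7",
    "Claude Opus 4.6": "claude-opus-4-6",
    "Claude Sonnet 4.6": "claude-sonnet-4-6",
    "Claude Sonnet 4.5": "claude-sonnet-4-5@20250929",
    "Claude Sonnet 4": "claude-sonnet-4@20250514",
    "Claude Sonnet 3.7": "claude-3-7-sonnet@20250219",
    "Claude Opus 4.5": "claude-opus-4-5@20251101",
    "Claude Opus 4.1": "claude-opus-4-1@20250805",
    "Claude Opus 4": "claude-opus-4@20250514",
    "Claude Haiku 4.5": "claude-haiku-4-5@20251001",
    "Claude Haiku 3.5": "claude-3-5-haiku@20241022",
}
DEPRECATED = {"Claude Sonnet 4", "Claude Opus 4.1", "Claude Opus 4", "Claude Haiku 3.5"}
RETIRED = {"Claude Sonnet 3.7"}


def model_id(name):
    """Look up a model ID, warning on deprecated and rejecting retired models."""
    if name in RETIRED:
        # ASSUMPTION: this sketch refuses retired models rather than trying them.
        raise ValueError(f"{name} is retired")
    if name in DEPRECATED:
        warnings.warn(f"{name} is deprecated; plan a migration", stacklevel=2)
    return MODEL_IDS[name]  # raises KeyError for names not in the table


print(model_id("Claude Haiku 4.5"))
```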


Upgrading to a newer Claude model? In Claude Code, run /claude-api migrate to apply model ID swaps and breaking parameter changes across your codebase. The skill detects which cloud platform your code targets and adjusts model ID formats and feature changes for that platform. See Migrating to a newer Claude model.

Making requests

Before running requests, you may need to run gcloud auth application-default login to authenticate with Google Cloud.

The following examples show how to generate text from Claude on Agent Platform:

from anthropic import AnthropicVertex

project_id = "MY_PROJECT_ID"
region = "global"

client = AnthropicVertex(project_id=project_id, region=region)

message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=100,
    messages=[
        {
            "role": "user",
            "content": "Hey Claude!",
        }
    ],
)
print(message)
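
The example above prints the whole response object. In the Messages API, the response's content field is a list of blocks, and text blocks have type "text" and a text attribute. The sketch below pulls out only the text. extract_text is a hypothetical helper, and the SimpleNamespace object stands in for message.content so the snippet runs without credentials.

```python
from types import SimpleNamespace


def extract_text(content_blocks):
    """Join the text of every block whose type is "text"."""
    return "".join(
        block.text
        for block in content_blocks
        if getattr(block, "type", None) == "text"
    )


# Stand-in for message.content from a real response.
fake_content = [SimpleNamespace(type="text", text="Hello! How can I help?")]
print(extract_text(fake_content))
```

With a real response, the call would be extract_text(message.content).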

See the client SDKs and the official Agent Platform docs for more details.

Claude is also available through Amazon Bedrock, Claude Platform on AWS, and Microsoft Foundry.

Data retention

Data handling for this offering is governed by Google Cloud. For details, see Agent Platform and zero data retention.

Activity logging

Agent Platform provides a request-response logging service that lets you log the prompts and completions associated with your usage.

Anthropic recommends logging your activity on at least a 30-day rolling basis so that you can understand your activity and investigate any potential misuse.

Turning on this service does not give Google or Anthropic any access to your content.

Feature support

For the full feature list with Google Cloud availability, see Features overview.

Supported feature highlights

  • Messages API
  • Prompt caching
  • Extended thinking
  • Tool use, including the Bash tool, Computer use tool, Memory tool, and Text editor tool
  • Web search tool
  • Citations
  • Structured outputs

Features not supported

  • Input sources (URL sources for images and documents, Files API)
  • Server-side tools (code execution, web fetch, advisor)
  • Agent infrastructure (Agent Skills, MCP connector, programmatic tool calling)
  • API endpoints (Message Batches, Models, Admin, Compliance, Usage and Cost)
  • Claude Managed Agents
  • Server-side fallback (the fallbacks parameter; use the client-side fallback pattern instead)
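
Because the fallbacks parameter is not available, any fallback has to happen in your own code. The following is one possible shape for that, offered as a sketch and not necessarily the pattern described on the linked page. create_with_fallback and fake_send are hypothetical names, and fake_send simulates a failing primary model so the snippet runs offline.

```python
def create_with_fallback(send, models, **request):
    """Try each model ID in order; return (model, response) for the first success."""
    last_error = None
    for model in models:
        try:
            return model, send(model=model, **request)
        except Exception as error:  # in real code, narrow this to the SDK's error types
            last_error = error
    raise RuntimeError("all fallback models failed") from last_error


def fake_send(model, **request):
    """Stand-in for client.messages.create that fails for one model."""
    if model == "claude-opus-4-8":
        raise TimeoutError("simulated outage")
    return {"model": model, "messages": request["messages"]}


used, response = create_with_fallback(
    fake_send,
    ["claude-opus-4-8", "claude-sonnet-4-6"],
    max_tokens=100,
    messages=[{"role": "user", "content": "Hey Claude!"}],
)
print(used)
```

With the SDK, you would pass client.messages.create as the send argument. Which errors should trigger a fallback is a design choice; retrying on every exception, as this sketch does, is usually too broad for production use.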

Context window

Claude Fable 5, Claude Opus 4.8, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6 have a 1M-token context window on Agent Platform. Other Claude models, including Sonnet 4.5 and Sonnet 4 (deprecated), have a 200k-token context window.

Agent Platform limits request payloads to 30 MB. When sending large documents or many images, you may reach this limit before the token limit.
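
A rough pre-flight check can catch oversized requests before they are sent. This is a sketch under stated assumptions: it reads "30 MB" as 30,000,000 bytes (the stricter interpretation) and measures the body as serialized by json.dumps, which may differ slightly from what an SDK actually sends. The helper names are hypothetical.

```python
import json

# ASSUMPTION: 30 MB is treated as 30 * 1000 * 1000 bytes, the stricter reading.
MAX_PAYLOAD_BYTES = 30 * 1000 * 1000


def payload_size_bytes(body):
    """Approximate request size as the UTF-8 length of the JSON-encoded body."""
    return len(json.dumps(body).encode("utf-8"))


def fits_payload_limit(body, limit=MAX_PAYLOAD_BYTES):
    """Return True if the encoded body is within the limit."""
    return payload_size_bytes(body) <= limit


small = {"max_tokens": 100, "messages": [{"role": "user", "content": "Hey Claude!"}]}
print(payload_size_bytes(small), fits_payload_limit(small))
```

Base64-encoded images and documents are larger than their source files, so measure the encoded body, not the files on disk.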

Global, multi-region, and regional endpoints

Agent Platform offers three endpoint types:

  • Global endpoints: Dynamic routing for maximum availability
  • Multi-region endpoints: Dynamic routing within a geographic area (for example, the United States or the European Union) for data residency with high availability
  • Regional endpoints: Guaranteed data routing through specific geographic regions

Regional and multi-region endpoints include a 10% pricing premium over global endpoints.

The premium applies to Claude Sonnet 4.5 and later models only. Older models (Claude Sonnet 4 (deprecated), Opus 4 (deprecated), and earlier) keep their existing pricing structures.
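
As a worked example of the 10% premium, the sketch below applies it to a price. The figure of 10.00 is a hypothetical placeholder and not a real price, and endpoint_price is an illustrative helper. Decimal is used to avoid floating-point rounding in currency arithmetic.

```python
from decimal import Decimal

# 10% premium for regional and multi-region endpoints, per the text above.
REGIONAL_PREMIUM = Decimal("0.10")


def endpoint_price(global_price, endpoint_type):
    """Return the price for an endpoint type, given the global-endpoint price."""
    if endpoint_type == "global":
        return global_price
    if endpoint_type in ("multi-region", "regional"):
        return global_price * (1 + REGIONAL_PREMIUM)
    raise ValueError(f"unknown endpoint type: {endpoint_type}")


hypothetical_price = Decimal("10.00")  # placeholder value, not a published price
print(endpoint_price(hypothetical_price, "global"))
print(endpoint_price(hypothetical_price, "regional"))
```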

When to use each option

Global endpoints (recommended):

  • Provide maximum availability and uptime
  • Dynamically route requests to regions with available capacity
  • No pricing premium
  • Best for applications where data residency is flexible
  • Support only pay-as-you-go traffic (provisioned throughput requires regional endpoints)

Multi-region endpoints:

  • Dynamically route requests across regions within a geographic area (currently us and eu)
  • Useful when you need data residency within a broad geography but want higher availability than a single region
  • 10% pricing premium over global endpoints
  • Support only pay-as-you-go traffic (provisioned throughput requires regional endpoints)

Regional endpoints:

  • Route traffic through specific geographic regions
  • Required for single-region data residency, strict compliance mandates, or provisioned throughput
  • Support both pay-as-you-go and provisioned throughput
  • 10% pricing premium reflects infrastructure costs for dedicated regional capacity
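
The guidance above can be condensed into a small decision function. This is a sketch of one reading of the lists, not an official rule set. choose_endpoint_type and its residency labels ("flexible", "geography", "single-region") are hypothetical names introduced for illustration.

```python
def choose_endpoint_type(needs_provisioned_throughput, residency):
    """Pick an endpoint type.

    residency is one of:
      "flexible"      - no data residency requirement
      "geography"     - data must stay within a broad geography (for example, us or eu)
      "single-region" - data must stay within one specific region
    """
    # Provisioned throughput and single-region residency both require regional endpoints.
    if needs_provisioned_throughput or residency == "single-region":
        return "regional"
    if residency == "geography":
        return "multi-region"
    if residency == "flexible":
        return "global"
    raise ValueError(f"unknown residency requirement: {residency}")


print(choose_endpoint_type(False, "flexible"))
print(choose_endpoint_type(False, "geography"))
print(choose_endpoint_type(True, "flexible"))
```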

Implementation

Using global endpoints (recommended):

Set the region parameter to "global" when initializing the client:

from anthropic import AnthropicVertex

project_id = "MY_PROJECT_ID"
region = "global"

client = AnthropicVertex(project_id=project_id, region=region)

message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=100,
    messages=[
        {
            "role": "user",
            "content": "Hey Claude!",
        }
    ],
)
print(message)

Using multi-region endpoints:

Set the region parameter to a multi-region identifier: "us" for the United States or "eu" for the European Union. The SDK routes requests to the corresponding multi-region endpoint (https://aiplatform.us.rep.googleapis.com or https://aiplatform.eu.rep.googleapis.com), which dynamically balances traffic across regions within that geography.

from anthropic import AnthropicVertex

project_id = "MY_PROJECT_ID"
region = "us"  # Multi-region identifier: "us" or "eu"

client = AnthropicVertex(project_id=project_id, region=region)

message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=100,
    messages=[
        {
            "role": "user",
            "content": "Hey Claude!",
        }
    ],
)
print(message)

Using regional endpoints:

Set the region parameter to a single Google Cloud region, such as "us-east1" or "europe-west1":

from anthropic import AnthropicVertex

project_id = "MY_PROJECT_ID"
region = "us-east1"  # A single Google Cloud region

client = AnthropicVertex(project_id=project_id, region=region)

message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=100,
    messages=[
        {
            "role": "user",
            "content": "Hey Claude!",
        }
    ],
)
print(message)
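
The three examples differ only in the region value. The sketch below classifies a region string into the endpoint type it selects, following the rules in this section: "global" is the global endpoint, "us" and "eu" are multi-region, and anything else is treated as a single region. endpoint_type_for is a hypothetical helper, and treating every other string as a regional endpoint is an assumption, because this sketch does not validate region names.

```python
# Multi-region hosts as listed in this section.
MULTI_REGION_HOSTS = {
    "us": "https://aiplatform.us.rep.googleapis.com",
    "eu": "https://aiplatform.eu.rep.googleapis.com",
}


def endpoint_type_for(region):
    """Classify a region string as "global", "multi-region", or "regional"."""
    if region == "global":
        return "global"
    if region in MULTI_REGION_HOSTS:
        return "multi-region"
    # ASSUMPTION: any other value is a single region; names are not validated here.
    return "regional"


for region in ("global", "us", "us-east1"):
    print(region, endpoint_type_for(region))
```

A check like this can be useful in configuration validation, for example to flag a deployment that expects provisioned throughput but is configured with "global" or "us".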


Claude Mythos Preview is a research preview available to invited customers on Agent Platform. For more information, see Project Glasswing.

Additional resources

  • Agent Platform pricing: Generative AI pricing on cloud.google.com
  • Claude models documentation: Claude on Agent Platform
  • Google blog post: Global endpoint for Claude models
  • Anthropic pricing details: Cloud platform pricing
