The Vertex API for accessing Claude is nearly identical to the Messages API and supports all of the same options, with two key differences:

- `model` is not passed in the request body. Instead, it is specified in the Google Cloud endpoint URL.
- `anthropic_version` is passed in the request body (rather than as a header), and must be set to the value `vertex-2023-10-16`.

Vertex is also supported by Anthropic's official client SDKs. This guide walks you through making a request to Claude on Vertex AI using one of Anthropic's client SDKs.
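To make the URL-versus-body split concrete, here is a minimal sketch in Python of how a raw Vertex request is assembled. The project ID and region values are placeholders; the URL pattern and body fields mirror the curl example later in this guide:

```python
# Sketch: assembling a raw Vertex request. PROJECT_ID is a placeholder.
MODEL_ID = "claude-opus-4-6"
LOCATION = "global"
PROJECT_ID = "MY_PROJECT_ID"

# The model is part of the endpoint URL, not the request body.
url = (
    f"https://{LOCATION}-aiplatform.googleapis.com/v1"
    f"/projects/{PROJECT_ID}/locations/{LOCATION}"
    f"/publishers/anthropic/models/{MODEL_ID}:streamRawPredict"
)

# anthropic_version goes in the body (not a header) and must be this value.
body = {
    "anthropic_version": "vertex-2023-10-16",
    "messages": [{"role": "user", "content": "Hey Claude!"}],
    "max_tokens": 100,
}

assert "model" not in body  # on Vertex, the model lives in the URL
print(url)
```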
Note that this guide assumes you already have a GCP project with Vertex AI enabled. See using the Claude 3 models from Anthropic for more information on the required setup, as well as a full walkthrough.
First, install Anthropic's client SDK for your language of choice.
Note that Anthropic model availability varies by region. Search for "Claude" in the Vertex AI Model Garden or go to Use Claude 3 for the latest information.
| Model | Vertex AI API model ID |
|---|---|
| Claude Opus 4.6 | claude-opus-4-6 |
| Claude Opus 4.5 | claude-opus-4-5@20251101 |
| Claude Opus 4.1 | claude-opus-4-1@20250805 |
| Claude Opus 4 | claude-opus-4@20250514 |
| Claude Sonnet 4.6 | claude-sonnet-4-6 |
| Claude Sonnet 4.5 | claude-sonnet-4-5@20250929 |
| Claude Sonnet 4 | claude-sonnet-4@20250514 |
| Claude Sonnet 3.7 ⚠️ | claude-3-7-sonnet@20250219 |
| Claude Haiku 4.5 | claude-haiku-4-5@20251001 |
| Claude Haiku 3.5 ⚠️ | claude-3-5-haiku@20241022 |
| Claude Haiku 3 ⚠️ | claude-3-haiku@20240307 |
Before running requests, you may need to run `gcloud auth application-default login` to authenticate with GCP.
The following examples show how to generate text from Claude on Vertex AI:
See the client SDKs and the official Vertex AI docs for more details.
Claude is also available through Amazon Bedrock and Microsoft Foundry.
Vertex provides a request-response logging service that allows you to log the prompts and completions associated with your usage.
Anthropic recommends that you log your activity on at least a 30-day rolling basis in order to understand your activity and investigate any potential misuse.
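A 30-day rolling window comes down to a simple timestamp cutoff. This sketch is illustrative only; the record shape and how you store logs are up to you:

```python
from datetime import datetime, timedelta, timezone

# Retention window recommended above.
ROLLING_WINDOW = timedelta(days=30)

def within_window(logged_at: datetime, now: datetime) -> bool:
    """True if a log entry falls inside the 30-day rolling window."""
    return now - logged_at <= ROLLING_WINDOW

now = datetime(2025, 6, 30, tzinfo=timezone.utc)
recent = datetime(2025, 6, 15, tzinfo=timezone.utc)   # 15 days old -> keep
stale = datetime(2025, 5, 1, tzinfo=timezone.utc)     # 60 days old -> expired
print(within_window(recent, now), within_window(stale, now))
```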
Turning on this service does not give Google or Anthropic any access to your content.
For all currently supported features on Vertex AI, see API features overview.
Claude Opus 4.6, Sonnet 4.6, Sonnet 4.5, and Sonnet 4 have a 1M-token context window on Vertex AI.
For Claude Sonnet 4.5 and Sonnet 4, the 1M-token context window is in beta. To use it, include the `context-1m-2025-08-07` beta header in your API requests.
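As a sketch, with Anthropic's Python SDK the beta header can be attached per request via the standard `extra_headers` option (sending the request itself requires GCP credentials, so only the header is built here):

```python
# Opting into the 1M-token context window beta for Sonnet 4.5 / Sonnet 4.
BETA_1M_CONTEXT = "context-1m-2025-08-07"

extra_headers = {"anthropic-beta": BETA_1M_CONTEXT}

# Passed along with a request like:
#   client.messages.create(..., extra_headers=extra_headers)
print(extra_headers)
```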
Vertex AI limits request payloads to 30 MB. When sending large documents or many images, you may reach this limit before the token limit.
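Because documents and images are typically base64-encoded into the JSON body, a rough pre-flight size check can help you stay under the 30 MB cap. This sketch uses an illustrative body field, not the real request schema; the key point is that base64 inflates binary data by about 4/3:

```python
import base64
import json

VERTEX_MAX_REQUEST_BYTES = 30 * 1024 * 1024  # the 30 MB cap noted above

def estimated_request_size(raw_files: list[bytes], text: str) -> int:
    """Rough size of a JSON body carrying base64-encoded files plus text."""
    encoded = [base64.b64encode(f).decode("ascii") for f in raw_files]
    body = {
        "anthropic_version": "vertex-2023-10-16",
        "messages": [{"role": "user", "content": text}],
        "attachments_b64": encoded,  # illustrative field, not the real schema
        "max_tokens": 1024,
    }
    return len(json.dumps(body).encode("utf-8"))

# ~22 MB of raw bytes already approaches the cap once base64-encoded.
size = estimated_request_size([b"\x00" * (22 * 1024 * 1024)], "Describe this file")
print(size, size < VERTEX_MAX_REQUEST_BYTES)
```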
Starting with Claude Sonnet 4.5, Google Vertex AI offers two endpoint types:

- Global endpoints (recommended)
- Regional endpoints

Regional endpoints include a 10% pricing premium over global endpoints. This applies to Claude Sonnet 4.5 and later models only; older models (Claude Sonnet 4, Opus 4, and earlier) maintain their existing pricing structures.
Using global endpoints (recommended): set the `region` parameter to `"global"` when initializing the client:

Using regional endpoints: pass a specific region, such as `"us-east1"` or `"europe-west1"`:
```shell
MODEL_ID=claude-opus-4-6
LOCATION=global
PROJECT_ID=MY_PROJECT_ID

curl \
  -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://$LOCATION-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/anthropic/models/${MODEL_ID}:streamRawPredict" -d \
  '{
    "anthropic_version": "vertex-2023-10-16",
    "messages": [{
      "role": "user",
      "content": "Hey Claude!"
    }],
    "max_tokens": 100
  }'
```

```python
from anthropic import AnthropicVertex

project_id = "MY_PROJECT_ID"
region = "global"  # Use the global endpoint

client = AnthropicVertex(project_id=project_id, region=region)

message = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=100,
    messages=[
        {
            "role": "user",
            "content": "Hey Claude!",
        }
    ],
)
print(message)
```

```python
from anthropic import AnthropicVertex

project_id = "MY_PROJECT_ID"
region = "us-east1"  # Specify a specific region

client = AnthropicVertex(project_id=project_id, region=region)

message = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=100,
    messages=[
        {
            "role": "user",
            "content": "Hey Claude!",
        }
    ],
)
print(message)
```