This feature is eligible for Zero Data Retention (ZDR). When your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned.
You can ask Claude about any text, pictures, charts, and tables in PDFs you provide. Some sample use cases:
Claude works with any standard PDF. Ensure your request size meets these requirements:
| Requirement | Limit |
|---|---|
| Maximum request size | 32 MB (varies by platform) |
| Maximum pages per request | 600 (100 for models with a 200k-token context window) |
| Format | Standard PDF (no passwords/encryption) |
Both limits apply to the entire request payload, including any other content sent alongside PDFs. For large PDFs, consider uploading with the Files API and referencing the file by file_id to keep request payloads small.
Dense PDFs (many small-font pages, complex tables, or heavy graphics) can fill the context window before reaching the page limit. Requests with large PDFs can also fail before reaching the page limit, even when using the Files API. If this happens, try splitting the document into sections. Because each page is processed as an image, downsampling embedded images in large files can also help.
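As a rough pre-flight check against these limits, the sketch below estimates the base64-encoded size of a PDF and plans page ranges for splitting. The 32 MB and 600-page figures come from the table; the helper names (`base64_size`, `fits_request_limit`, `plan_page_ranges`) are hypothetical, and reading "32 MB" as 32 × 1024 × 1024 bytes is an assumption. The check ignores the rest of the request payload, and actually splitting the file requires a PDF library that is not shown here.

```python
import math

# Assumption: "32 MB" is read as 32 * 1024 * 1024 bytes; the limit varies by platform.
MAX_REQUEST_BYTES = 32 * 1024 * 1024
# 600 pages per request (100 for models with a 200k-token context window).
MAX_PAGES = 600


def base64_size(raw_bytes: int) -> int:
    """Size in bytes of raw_bytes after standard base64 encoding (with padding)."""
    return 4 * math.ceil(raw_bytes / 3)


def fits_request_limit(raw_bytes: int, limit: int = MAX_REQUEST_BYTES) -> bool:
    """True if the base64-encoded PDF alone stays within the request size limit."""
    return base64_size(raw_bytes) <= limit


def plan_page_ranges(total_pages: int, max_pages: int = MAX_PAGES) -> list[tuple[int, int]]:
    """Split 1-indexed pages into consecutive (first, last) ranges of at most max_pages."""
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]
```

For example, `plan_page_ranges(250, 100)` yields three ranges that could each be sent as a separate request. Base64 encoding inflates a file by about a third, so a PDF well under 32 MB on disk can still exceed the limit once encoded.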
Since PDF support relies on Claude's vision capabilities, it is subject to the same limitations and considerations as other vision tasks.
PDF support is available on the Claude API, Claude Platform on AWS, Amazon Bedrock (see Amazon Bedrock PDF support), Vertex AI, and Microsoft Foundry. All active models support PDF processing.
When using PDF support through Bedrock's Converse API, there are two distinct document processing modes:
Important: To access Claude's full visual PDF understanding capabilities in the Converse API, you must enable citations. Without citations enabled, the API falls back to basic text extraction only. Learn more about working with citations.
Converse Document Chat (Original mode - Text extraction only)
Claude PDF Chat (New mode - Full visual understanding)
If Claude isn't seeing images or charts in your PDFs when using the Converse API, you likely need to enable the citations flag. Without it, Converse falls back to basic text extraction only.
This is a known constraint with the Converse API. For applications that require visual PDF analysis without citations, consider using the InvokeModel API instead.
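To make the citations requirement concrete, here is a minimal sketch of a Converse user message with citations enabled on the document block. The helper name `build_converse_pdf_message`, the file name, and the model ID placeholder are assumptions for illustration. The field layout is an assumption based on the Converse document block format, so verify it against the Bedrock API reference before relying on it.

```python
def build_converse_pdf_message(pdf_bytes: bytes, question: str, name: str = "document") -> dict:
    """Build a Converse user message with a PDF document block and citations enabled."""
    return {
        "role": "user",
        "content": [
            {
                "document": {
                    "format": "pdf",
                    "name": name,
                    "source": {"bytes": pdf_bytes},
                    # Without this flag, Converse falls back to basic text extraction.
                    "citations": {"enabled": True},
                }
            },
            {"text": question},
        ],
    }


# Usage sketch (requires boto3 and AWS credentials; the model ID is a placeholder):
# import boto3
# client = boto3.client("bedrock-runtime")
# with open("document.pdf", "rb") as f:
#     message = build_converse_pdf_message(f.read(), "What does the chart show?")
# response = client.converse(modelId="<your-claude-model-id>", messages=[message])
```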
For non-PDF files like .csv, .xlsx, .docx, .md, or .txt files, see Working with other file formats.
Let's start with a simple example using the Messages API. You can provide PDFs to Claude in three ways:
- As a URL reference to a PDF hosted online
- As a base64-encoded PDF in document content blocks
- By a file_id from the Files API

The simplest approach is to reference a PDF directly from a URL:
```python
import anthropic

client = anthropic.Anthropic()

# Reference the PDF by URL in a document content block
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "url",
                        "url": "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf",
                    },
                },
                {"type": "text", "text": "What are the key findings in this document?"},
            ],
        }
    ],
)
print(message.content)
```

If you need to send PDFs from your local system, or when a URL isn't available, encode the PDF as base64:
```python
import base64

import anthropic
import httpx

# First, load and encode the PDF
pdf_url = "https://assets.anthropic.com/m/1cd9d098ac3e6467/original/Claude-3-Model-Card-October-Addendum.pdf"
pdf_data = base64.standard_b64encode(httpx.get(pdf_url).content).decode("utf-8")

# Alternative: Load from a local file
# with open("document.pdf", "rb") as f:
#     pdf_data = base64.standard_b64encode(f.read()).decode("utf-8")

# Send to Claude using base64 encoding
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": pdf_data,
                    },
                },
                {"type": "text", "text": "What are the key findings in this document?"},
            ],
        }
    ],
)
print(message.content)
```

For PDFs you'll use repeatedly, or when you want to avoid encoding overhead, use the Files API:
```python
import anthropic

client = anthropic.Anthropic()

# Upload the PDF file
with open("document.pdf", "rb") as f:
    file_upload = client.beta.files.upload(file=("document.pdf", f, "application/pdf"))

# Use the uploaded file in a message
message = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    betas=["files-api-2025-04-14"],
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {"type": "file", "file_id": file_upload.id},
                },
                {"type": "text", "text": "What are the key findings in this document?"},
            ],
        }
    ],
)
print(message.content)
```

When you send a PDF to Claude, the following steps occur:
1. The system extracts the contents of the document.
2. Claude analyzes both the text and images to better understand the document.
3. Claude responds, referencing the PDF's contents if relevant.
Claude can reference both textual and visual content when it responds. You can further improve performance by integrating PDF support with:
The token count of a PDF file depends on the total text extracted from the document as well as the number of pages:
You can use token counting to estimate costs for your specific PDFs.
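As an illustration of estimating cost before sending a request, the following sketch counts input tokens for a PDF plus a prompt. It assumes the Anthropic Python SDK's `client.messages.count_tokens` method accepts the same document block used elsewhere on this page. The helper names `build_count_tokens_request` and `count_pdf_tokens` and the file path are hypothetical.

```python
import base64


def build_count_tokens_request(pdf_base64: str, prompt: str, model: str = "claude-opus-4-7") -> dict:
    """Build keyword arguments for a token-counting call with a base64 PDF document block."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "document",
                        "source": {
                            "type": "base64",
                            "media_type": "application/pdf",
                            "data": pdf_base64,
                        },
                    },
                    {"type": "text", "text": prompt},
                ],
            }
        ],
    }


def count_pdf_tokens(pdf_path: str, prompt: str) -> int:
    """Return the input token count the API reports for this PDF and prompt."""
    import anthropic  # imported here so the payload helper works without the SDK installed

    with open(pdf_path, "rb") as f:
        pdf_base64 = base64.standard_b64encode(f.read()).decode("utf-8")
    client = anthropic.Anthropic()
    response = client.messages.count_tokens(**build_count_tokens_request(pdf_base64, prompt))
    return response.input_tokens
```

Calling `count_pdf_tokens("document.pdf", "Summarize this document.")` requires an API key in the environment. The returned count covers the whole request, including the prompt text.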
Follow these best practices for optimal results:
For high-volume processing, consider these approaches:
Cache PDFs to improve performance on repeated queries:
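```python
import base64

import anthropic

# Load and base64-encode the PDF
with open("document.pdf", "rb") as f:
    pdf_data = base64.standard_b64encode(f.read()).decode("utf-8")

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": pdf_data,
                    },
                    # Mark the document block as cacheable for repeated queries
                    "cache_control": {"type": "ephemeral"},
                },
                {"type": "text", "text": "Analyze this document."},
            ],
        }
    ],
)
```

Use the Message Batches API for high-volume workflows: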
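```python
import base64

import anthropic

# Load and base64-encode the PDF
with open("document.pdf", "rb") as f:
    pdf_data = base64.standard_b64encode(f.read()).decode("utf-8")

client = anthropic.Anthropic()
message_batch = client.messages.batches.create(
    requests=[
        {
            # custom_id identifies this request within the batch
            "custom_id": "doc1",
            "params": {
                "model": "claude-opus-4-7",
                "max_tokens": 1024,
                "messages": [
                    {
                        "role": "user",
                        "content": [
                            {
                                "type": "document",
                                "source": {
                                    "type": "base64",
                                    "media_type": "application/pdf",
                                    "data": pdf_data,
                                },
                            },
                            {"type": "text", "text": "Summarize this document."},
                        ],
                    }
                ],
            },
        }
    ]
)
```

Explore practical examples of PDF processing in the cookbook recipe.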
See complete API documentation for PDF support.