Batches
Create a Message Batch
Retrieve a Message Batch
List Message Batches
Cancel a Message Batch
Delete a Message Batch
Retrieve Message Batch results
Models
class DeletedMessageBatch:
String id
ID of the Message Batch.
JsonValue type: "message_batch_deleted" (constant)
Deleted object type.
For Message Batches, this is always "message_batch_deleted".
class MessageBatch:
String id
Unique object identifier.
The format and length of IDs may change over time.
Optional<LocalDateTime> archivedAt
RFC 3339 datetime string representing the time at which the Message Batch was archived and its results became unavailable.
Optional<LocalDateTime> cancelInitiatedAt
RFC 3339 datetime string representing the time at which cancellation was initiated for the Message Batch. Specified only if cancellation was initiated.
LocalDateTime createdAt
RFC 3339 datetime string representing the time at which the Message Batch was created.
Optional<LocalDateTime> endedAt
RFC 3339 datetime string representing the time at which processing for the Message Batch ended. Specified only once processing ends.
Processing ends when every request in a Message Batch has succeeded, errored, been canceled, or expired.
LocalDateTime expiresAt
RFC 3339 datetime string representing the time at which the Message Batch will expire and end processing, which is 24 hours after creation.
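The 24-hour rule can be checked with a short sketch. It assumes the raw RFC 3339 string is parsed with `java.time.OffsetDateTime`; the class `BatchExpiry`, its methods, and the sample timestamp are hypothetical and not part of any SDK.

```java
import java.time.Duration;
import java.time.OffsetDateTime;

// Hypothetical helper for illustration only; not an SDK type.
class BatchExpiry {
    // Taken from the field description: a batch expires 24 hours after creation.
    static final Duration BATCH_LIFETIME = Duration.ofHours(24);

    // Parse an RFC 3339 timestamp such as "2024-08-20T18:37:24Z" and add the lifetime.
    static OffsetDateTime expiresAt(String createdAtRfc3339) {
        return OffsetDateTime.parse(createdAtRfc3339).plus(BATCH_LIFETIME);
    }

    // True once "now" has reached or passed the computed expiry instant.
    static boolean hasExpired(String createdAtRfc3339, OffsetDateTime now) {
        return !now.isBefore(expiresAt(createdAtRfc3339));
    }

    public static void main(String[] args) {
        String createdAt = "2024-08-20T18:37:24Z"; // made-up sample value
        System.out.println("expires at " + expiresAt(createdAt));
    }
}
```

In practice the expiresAt field on the batch is the authoritative value; the arithmetic here only illustrates how it relates to createdAt.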
ProcessingStatus processingStatus
Processing status of the Message Batch.
MessageBatchRequestCounts requestCounts
Tallies requests within the Message Batch, categorized by their status.
Requests start as processing and move to one of the other statuses only once processing of the entire batch ends. The sum of all values always matches the total number of requests in the batch.
long canceled
Number of requests in the Message Batch that have been canceled.
This is zero until processing of the entire Message Batch has ended.
long errored
Number of requests in the Message Batch that encountered an error.
This is zero until processing of the entire Message Batch has ended.
long expired
Number of requests in the Message Batch that have expired.
This is zero until processing of the entire Message Batch has ended.
long processing
Number of requests in the Message Batch that are processing.
long succeeded
Number of requests in the Message Batch that have completed successfully.
This is zero until processing of the entire Message Batch has ended.
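The relationship between the five counters can be sketched as follows. The record `RequestCounts` is a hypothetical stand-in that mirrors the fields above, not the SDK class, and the sample numbers are invented.

```java
// Hypothetical stand-in mirroring the five counters described above; not an SDK type.
record RequestCounts(long canceled, long errored, long expired, long processing, long succeeded) {

    // The sum of all counters equals the number of requests in the batch.
    long total() {
        return canceled + errored + expired + processing + succeeded;
    }

    // Requests leave the processing state only when the whole batch ends,
    // so a zero processing count is read here as "nothing left in flight".
    boolean allFinished() {
        return processing == 0;
    }
}

class RequestCountsDemo {
    public static void main(String[] args) {
        RequestCounts inFlight = new RequestCounts(0, 0, 0, 100, 0);
        RequestCounts done = new RequestCounts(2, 3, 1, 0, 94);
        System.out.println(inFlight.total() + " " + inFlight.allFinished()); // prints "100 false"
        System.out.println(done.total() + " " + done.allFinished());         // prints "100 true"
    }
}
```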
Optional<String> resultsUrl
URL to a .jsonl file containing the results of the Message Batch requests. Specified only once processing ends.
Results in the file are not guaranteed to be in the same order as requests. Use the custom_id field to match results to requests.
JsonValue type: "message_batch" (constant)
Object type.
For Message Batches, this is always "message_batch".
class MessageBatchCanceledResult:
JsonValue type: "canceled" (constant)
class MessageBatchErroredResult:
ErrorResponse error
ErrorObject error
class InvalidRequestError:
JsonValue type: "invalid_request_error" (constant)
class AuthenticationError:
JsonValue type: "authentication_error" (constant)
class BillingError:
JsonValue type: "billing_error" (constant)
class PermissionError:
JsonValue type: "permission_error" (constant)
class NotFoundError:
JsonValue type: "not_found_error" (constant)
class RateLimitError:
JsonValue type: "rate_limit_error" (constant)
class GatewayTimeoutError:
JsonValue type: "timeout_error" (constant)
class ApiErrorObject:
JsonValue type: "api_error" (constant)
class OverloadedError:
JsonValue type: "overloaded_error" (constant)
JsonValue type: "error" (constant)
JsonValue type: "errored" (constant)
class MessageBatchExpiredResult:
JsonValue type: "expired" (constant)
class MessageBatchIndividualResponse:
This is a single line in the response .jsonl file and does not represent the response as a whole.
String customId
Developer-provided ID created for each request in a Message Batch. Useful for matching results to requests, as results may be given out of request order.
Must be unique for each request within the Message Batch.
MessageBatchResult result
Processing result for this request.
Contains a Message output if processing was successful, an error response if processing failed, or the reason why processing was not attempted, such as cancellation or expiration.
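Because results may arrive out of request order, one way to realign them is to index each line by its custom ID. The sketch below assumes the .jsonl lines have already been parsed; `ResultLine` and `ResultMatcher` are hypothetical, simplified stand-ins rather than SDK types.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for one parsed line of the results file.
record ResultLine(String customId, String resultType) {}

class ResultMatcher {
    // Index results by custom ID so lookups do not depend on file order.
    static Map<String, ResultLine> indexByCustomId(List<ResultLine> lines) {
        Map<String, ResultLine> byId = new HashMap<>();
        for (ResultLine line : lines) {
            // Custom IDs must be unique within a batch, so a repeat signals bad input.
            if (byId.put(line.customId(), line) != null) {
                throw new IllegalStateException("duplicate custom_id: " + line.customId());
            }
        }
        return byId;
    }

    // Return result types in the order the requests were submitted; null marks a missing result.
    static List<String> inRequestOrder(List<String> requestIds, List<ResultLine> lines) {
        Map<String, ResultLine> byId = indexByCustomId(lines);
        List<String> ordered = new ArrayList<>();
        for (String id : requestIds) {
            ResultLine line = byId.get(id);
            ordered.add(line == null ? null : line.resultType());
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<ResultLine> lines = List.of(
                new ResultLine("c", "expired"),
                new ResultLine("a", "succeeded"),
                new ResultLine("b", "errored"));
        System.out.println(inRequestOrder(List.of("a", "b", "c"), lines));
    }
}
```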
class MessageBatchSucceededResult:
Message message
String id
Unique object identifier.
The format and length of IDs may change over time.
List<ContentBlock> content
Content generated by the model.
This is an array of content blocks, each of which has a type that determines its shape.
Example:
[{"type": "text", "text": "Hi, I'm Claude."}]
If the request input messages ended with an assistant turn, then the response content will continue directly from that last turn. You can use this to constrain the model's output.
For example, if the input messages were:
[
{"role": "user", "content": "What's the Greek name for Sun? (A) Sol (B) Helios (C) Sun"},
{"role": "assistant", "content": "The best answer is ("}
]
Then the response content might be:
[{"type": "text", "text": "B)"}]
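To make the continuation behavior concrete, the sketch below joins the prefilled assistant turn from the example with the returned text. `PrefillJoin` and its methods are hypothetical helpers, and reading the first character as the chosen option is an assumption that only holds for this multiple-choice prompt.

```java
// Hypothetical helper for illustration only; not an SDK type.
class PrefillJoin {
    // The response continues directly from the prefilled turn, so the full
    // assistant text is the prefill followed by the returned continuation.
    static String fullAssistantText(String prefill, String continuation) {
        return prefill + continuation;
    }

    // For the multiple-choice prompt above, the first returned character is the chosen option.
    static char answerLetter(String continuation) {
        return continuation.charAt(0);
    }

    public static void main(String[] args) {
        String prefill = "The best answer is (";
        String continuation = "B)";
        System.out.println(fullAssistantText(prefill, continuation)); // prints "The best answer is (B)"
        System.out.println(answerLetter(continuation));               // prints "B"
    }
}
```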
class TextBlock:
Citations supporting the text block.
The type of citation returned will depend on the type of document being cited. Citing a PDF results in page_location, plain text results in char_location, and content document results in content_block_location.
class CitationCharLocation:
JsonValue type: "char_location" (constant)
class CitationPageLocation:
JsonValue type: "page_location" (constant)
class CitationContentBlockLocation:
JsonValue type: "content_block_location" (constant)
class CitationsWebSearchResultLocation:
JsonValue type: "web_search_result_location" (constant)
class CitationsSearchResultLocation:
JsonValue type: "search_result_location" (constant)
JsonValue type: "text" (constant)
class ThinkingBlock:
JsonValue type: "thinking" (constant)
class RedactedThinkingBlock:
JsonValue type: "redacted_thinking" (constant)
class ToolUseBlock:
JsonValue type: "tool_use" (constant)
class ServerToolUseBlock:
JsonValue name: "web_search" (constant)
JsonValue type: "server_tool_use" (constant)
class WebSearchToolResultBlock:
WebSearchToolResultBlockContent content
class WebSearchToolResultError:
ErrorCode errorCode
JsonValue type: "web_search_tool_result_error" (constant)
List<WebSearchResultBlock>
JsonValue type: "web_search_result" (constant)
JsonValue type: "web_search_tool_result" (constant)
Model model
The model that will complete your prompt.
See models for additional details and options.
CLAUDE_OPUS_4_5_20251101("claude-opus-4-5-20251101")
Premium model combining maximum intelligence with practical performance
CLAUDE_OPUS_4_5("claude-opus-4-5")
Premium model combining maximum intelligence with practical performance
CLAUDE_3_7_SONNET_LATEST("claude-3-7-sonnet-latest")
High-performance model with early extended thinking
CLAUDE_3_7_SONNET_20250219("claude-3-7-sonnet-20250219")
High-performance model with early extended thinking
CLAUDE_3_5_HAIKU_LATEST("claude-3-5-haiku-latest")
Fastest and most compact model for near-instant responsiveness
CLAUDE_3_5_HAIKU_20241022("claude-3-5-haiku-20241022")
Our fastest model
CLAUDE_HAIKU_4_5("claude-haiku-4-5")
Hybrid model, capable of near-instant responses and extended thinking
CLAUDE_HAIKU_4_5_20251001("claude-haiku-4-5-20251001")
Hybrid model, capable of near-instant responses and extended thinking
CLAUDE_SONNET_4_20250514("claude-sonnet-4-20250514")
High-performance model with extended thinking
CLAUDE_SONNET_4_0("claude-sonnet-4-0")
High-performance model with extended thinking
CLAUDE_4_SONNET_20250514("claude-4-sonnet-20250514")
High-performance model with extended thinking
CLAUDE_SONNET_4_5("claude-sonnet-4-5")
Our best model for real-world agents and coding
CLAUDE_SONNET_4_5_20250929("claude-sonnet-4-5-20250929")
Our best model for real-world agents and coding
CLAUDE_OPUS_4_0("claude-opus-4-0")
Our most capable model
CLAUDE_OPUS_4_20250514("claude-opus-4-20250514")
Our most capable model
CLAUDE_4_OPUS_20250514("claude-4-opus-20250514")
Our most capable model
CLAUDE_OPUS_4_1_20250805("claude-opus-4-1-20250805")
Our most capable model
CLAUDE_3_OPUS_LATEST("claude-3-opus-latest")
Excels at writing and complex tasks
CLAUDE_3_OPUS_20240229("claude-3-opus-20240229")
Excels at writing and complex tasks
CLAUDE_3_HAIKU_20240307("claude-3-haiku-20240307")
Our previous fastest and most cost-effective model
JsonValue role: "assistant" (constant)
Conversational role of the generated message.
This will always be "assistant".
The reason that we stopped.
This may be one of the following values:
"end_turn": the model reached a natural stopping point
"max_tokens": we exceeded the requested max_tokens or the model's maximum
"stop_sequence": one of your provided custom stop_sequences was generated
"tool_use": the model invoked one or more tools
"pause_turn": we paused a long-running turn. You may provide the response back as-is in a subsequent request to let the model continue.
"refusal": when streaming classifiers intervene to handle potential policy violations
In non-streaming mode this value is always non-null. In streaming mode, it is null in the message_start event and non-null otherwise.
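A caller typically branches on the stop reason. The sketch below maps each documented value to a follow-up action. The `NextStep` enum and the chosen actions are assumptions made for illustration, not behavior prescribed by the API.

```java
// Hypothetical helper for illustration only; not an SDK type.
class StopReasons {
    // Invented follow-up actions; a real application would define its own.
    enum NextStep { DONE, RAISE_LIMIT_OR_CONTINUE, RUN_TOOLS, RESEND_TO_CONTINUE, HANDLE_REFUSAL }

    static NextStep nextStep(String stopReason) {
        switch (stopReason) {
            case "end_turn":
            case "stop_sequence":
                return NextStep.DONE;                    // output is complete as requested
            case "max_tokens":
                return NextStep.RAISE_LIMIT_OR_CONTINUE; // output was cut off by a token limit
            case "tool_use":
                return NextStep.RUN_TOOLS;               // the model asked for one or more tools
            case "pause_turn":
                return NextStep.RESEND_TO_CONTINUE;      // send the response back as-is to continue
            case "refusal":
                return NextStep.HANDLE_REFUSAL;
            default:
                // New values may be added over time, so fail loudly instead of guessing.
                throw new IllegalArgumentException("unknown stop reason: " + stopReason);
        }
    }

    public static void main(String[] args) {
        System.out.println(nextStep("pause_turn")); // prints "RESEND_TO_CONTINUE"
    }
}
```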
Optional<String> stopSequence
Which custom stop sequence was generated, if any.
This value will be a non-null string if one of your custom stop sequences was generated.
JsonValue type: "message" (constant)
Object type.
For Messages, this is always "message".
Usage usage
Billing and rate-limit usage.
Anthropic's API bills and rate-limits by token counts, as tokens represent the underlying cost to our systems.
Under the hood, the API transforms requests into a format suitable for the model. The model's output then goes through a parsing stage before becoming an API response. As a result, the token counts in usage will not match one-to-one with the exact visible content of an API request or response.
For example, output_tokens will be non-zero, even for an empty string response from Claude.
The total number of input tokens in a request is the sum of input_tokens, cache_creation_input_tokens, and cache_read_input_tokens.
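The total can be computed as below. This is a minimal sketch: `InputTokenTotal` is a hypothetical helper, the sample values are invented, and an absent optional count is assumed to contribute zero.

```java
import java.util.Optional;

// Hypothetical helper for illustration only; not an SDK type.
class InputTokenTotal {
    // Sum the three input-token counters; an absent cache counter is treated as zero (an assumption).
    static long totalInputTokens(long inputTokens,
                                 Optional<Long> cacheCreationInputTokens,
                                 Optional<Long> cacheReadInputTokens) {
        return inputTokens
                + cacheCreationInputTokens.orElse(0L)
                + cacheReadInputTokens.orElse(0L);
    }

    public static void main(String[] args) {
        // Made-up values: 100 uncached tokens, 2000 written to the cache, none read from it.
        System.out.println(totalInputTokens(100, Optional.of(2000L), Optional.empty())); // prints "2100"
    }
}
```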
Breakdown of cached tokens by TTL
long ephemeral1hInputTokens
The number of input tokens used to create the 1 hour cache entry.
long ephemeral5mInputTokens
The number of input tokens used to create the 5 minute cache entry.
Optional<Long> cacheCreationInputTokens
The number of input tokens used to create the cache entry.
Optional<Long> cacheReadInputTokens
The number of input tokens read from the cache.
long inputTokens
The number of input tokens which were used.
long outputTokens
The number of output tokens which were used.
The number of server tool requests.
long webSearchRequests
The number of web search tool requests.
Optional<ServiceTier> serviceTier
Which service tier the request used: priority, standard, or batch.
JsonValue type: "succeeded" (constant)
class MessageBatchErroredResult:
ErrorResponse error
ErrorObject error
class InvalidRequestError:
JsonValue type: "invalid_request_error" (constant)
class AuthenticationError:
JsonValue type: "authentication_error" (constant)
class BillingError:
JsonValue type: "billing_error" (constant)
class PermissionError:
JsonValue type: "permission_error" (constant)
class NotFoundError:
JsonValue type: "not_found_error" (constant)
class RateLimitError:
JsonValue type: "rate_limit_error" (constant)
class GatewayTimeoutError:
JsonValue type: "timeout_error" (constant)
class ApiErrorObject:
JsonValue type: "api_error" (constant)
class OverloadedError:
JsonValue type: "overloaded_error" (constant)
JsonValue type: "error" (constant)
JsonValue type: "errored" (constant)
class MessageBatchCanceledResult:
JsonValue type: "canceled" (constant)
class MessageBatchExpiredResult:
JsonValue type: "expired" (constant)
class MessageBatchRequestCounts:
long canceled
Number of requests in the Message Batch that have been canceled.
This is zero until processing of the entire Message Batch has ended.
long errored
Number of requests in the Message Batch that encountered an error.
This is zero until processing of the entire Message Batch has ended.
long expired
Number of requests in the Message Batch that have expired.
This is zero until processing of the entire Message Batch has ended.
long processing
Number of requests in the Message Batch that are processing.
long succeeded
Number of requests in the Message Batch that have completed successfully.
This is zero until processing of the entire Message Batch has ended.
class MessageBatchResult: A union class that can be one of several variants.
Processing result for this request.
Contains a Message output if processing was successful, an error response if processing failed, or the reason why processing was not attempted, such as cancellation or expiration.
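Handling such a union usually means branching on the variant. The sketch below models the four variants with hypothetical records (`Succeeded`, `Errored`, `Canceled`, `Expired`) that are simplified stand-ins for the SDK classes listed here; the payload fields are assumptions.

```java
// Hypothetical, simplified stand-ins for the four result variants; not SDK types.
interface BatchResult {}
record Succeeded(String text) implements BatchResult {}
record Errored(String errorType) implements BatchResult {}
record Canceled() implements BatchResult {}
record Expired() implements BatchResult {}

class ResultDispatch {
    // Branch on the concrete variant and summarize what happened to the request.
    static String describe(BatchResult result) {
        if (result instanceof Succeeded s) {
            return "succeeded: " + s.text();
        }
        if (result instanceof Errored e) {
            return "errored: " + e.errorType();
        }
        if (result instanceof Canceled) {
            return "canceled before processing";
        }
        if (result instanceof Expired) {
            return "expired before processing";
        }
        throw new IllegalArgumentException("unknown result variant: " + result);
    }

    public static void main(String[] args) {
        System.out.println(describe(new Succeeded("Hi, I'm Claude.")));
        System.out.println(describe(new Errored("invalid_request_error")));
        System.out.println(describe(new Expired()));
    }
}
```

Only the succeeded variant carries a Message; the other three explain why no output was produced.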
class MessageBatchSucceededResult:
Message message
String id
Unique object identifier.
The format and length of IDs may change over time.
List<ContentBlock> content
Content generated by the model.
This is an array of content blocks, each of which has a type that determines its shape.
Example:
[{"type": "text", "text": "Hi, I'm Claude."}]
If the request input messages ended with an assistant turn, then the response content will continue directly from that last turn. You can use this to constrain the model's output.
For example, if the input messages were:
[
{"role": "user", "content": "What's the Greek name for Sun? (A) Sol (B) Helios (C) Sun"},
{"role": "assistant", "content": "The best answer is ("}
]
Then the response content might be:
[{"type": "text", "text": "B)"}]
class TextBlock:
Citations supporting the text block.
The type of citation returned will depend on the type of document being cited. Citing a PDF results in page_location, plain text results in char_location, and content document results in content_block_location.
class CitationCharLocation:
JsonValue type: "char_location" (constant)
class CitationPageLocation:
JsonValue type: "page_location" (constant)
class CitationContentBlockLocation:
JsonValue type: "content_block_location" (constant)
class CitationsWebSearchResultLocation:
JsonValue type: "web_search_result_location" (constant)
class CitationsSearchResultLocation:
JsonValue type: "search_result_location" (constant)
JsonValue type: "text" (constant)
class ThinkingBlock:
JsonValue type: "thinking" (constant)
class RedactedThinkingBlock:
JsonValue type: "redacted_thinking" (constant)
class ToolUseBlock:
JsonValue type: "tool_use" (constant)
class ServerToolUseBlock:
JsonValue name: "web_search" (constant)
JsonValue type: "server_tool_use" (constant)
class WebSearchToolResultBlock:
WebSearchToolResultBlockContent content
class WebSearchToolResultError:
ErrorCode errorCode
JsonValue type: "web_search_tool_result_error" (constant)
List<WebSearchResultBlock>
JsonValue type: "web_search_result" (constant)
JsonValue type: "web_search_tool_result" (constant)
Model model
The model that will complete your prompt.
See models for additional details and options.
CLAUDE_OPUS_4_5_20251101("claude-opus-4-5-20251101")
Premium model combining maximum intelligence with practical performance
CLAUDE_OPUS_4_5("claude-opus-4-5")
Premium model combining maximum intelligence with practical performance
CLAUDE_3_7_SONNET_LATEST("claude-3-7-sonnet-latest")
High-performance model with early extended thinking
CLAUDE_3_7_SONNET_20250219("claude-3-7-sonnet-20250219")
High-performance model with early extended thinking
CLAUDE_3_5_HAIKU_LATEST("claude-3-5-haiku-latest")
Fastest and most compact model for near-instant responsiveness
CLAUDE_3_5_HAIKU_20241022("claude-3-5-haiku-20241022")
Our fastest model
CLAUDE_HAIKU_4_5("claude-haiku-4-5")
Hybrid model, capable of near-instant responses and extended thinking
CLAUDE_HAIKU_4_5_20251001("claude-haiku-4-5-20251001")
Hybrid model, capable of near-instant responses and extended thinking
CLAUDE_SONNET_4_20250514("claude-sonnet-4-20250514")
High-performance model with extended thinking
CLAUDE_SONNET_4_0("claude-sonnet-4-0")
High-performance model with extended thinking
CLAUDE_4_SONNET_20250514("claude-4-sonnet-20250514")
High-performance model with extended thinking
CLAUDE_SONNET_4_5("claude-sonnet-4-5")
Our best model for real-world agents and coding
CLAUDE_SONNET_4_5_20250929("claude-sonnet-4-5-20250929")
Our best model for real-world agents and coding
CLAUDE_OPUS_4_0("claude-opus-4-0")
Our most capable model
CLAUDE_OPUS_4_20250514("claude-opus-4-20250514")
Our most capable model
CLAUDE_4_OPUS_20250514("claude-4-opus-20250514")
Our most capable model
CLAUDE_OPUS_4_1_20250805("claude-opus-4-1-20250805")
Our most capable model
CLAUDE_3_OPUS_LATEST("claude-3-opus-latest")
Excels at writing and complex tasks
CLAUDE_3_OPUS_20240229("claude-3-opus-20240229")
Excels at writing and complex tasks
CLAUDE_3_HAIKU_20240307("claude-3-haiku-20240307")
Our previous fastest and most cost-effective model
JsonValue role: "assistant" (constant)
Conversational role of the generated message.
This will always be "assistant".
The reason that we stopped.
This may be one of the following values:
"end_turn": the model reached a natural stopping point
"max_tokens": we exceeded the requested max_tokens or the model's maximum
"stop_sequence": one of your provided custom stop_sequences was generated
"tool_use": the model invoked one or more tools
"pause_turn": we paused a long-running turn. You may provide the response back as-is in a subsequent request to let the model continue.
"refusal": when streaming classifiers intervene to handle potential policy violations
In non-streaming mode this value is always non-null. In streaming mode, it is null in the message_start event and non-null otherwise.
Optional<String> stopSequence
Which custom stop sequence was generated, if any.
This value will be a non-null string if one of your custom stop sequences was generated.
JsonValue type: "message" (constant)
Object type.
For Messages, this is always "message".
Usage usage
Billing and rate-limit usage.
Anthropic's API bills and rate-limits by token counts, as tokens represent the underlying cost to our systems.
Under the hood, the API transforms requests into a format suitable for the model. The model's output then goes through a parsing stage before becoming an API response. As a result, the token counts in usage will not match one-to-one with the exact visible content of an API request or response.
For example, output_tokens will be non-zero, even for an empty string response from Claude.
The total number of input tokens in a request is the sum of input_tokens, cache_creation_input_tokens, and cache_read_input_tokens.
Breakdown of cached tokens by TTL
long ephemeral1hInputTokens
The number of input tokens used to create the 1 hour cache entry.
long ephemeral5mInputTokens
The number of input tokens used to create the 5 minute cache entry.
Optional<Long> cacheCreationInputTokens
The number of input tokens used to create the cache entry.
Optional<Long> cacheReadInputTokens
The number of input tokens read from the cache.
long inputTokens
The number of input tokens which were used.
long outputTokens
The number of output tokens which were used.
The number of server tool requests.
long webSearchRequests
The number of web search tool requests.
Optional<ServiceTier> serviceTier
Which service tier the request used: priority, standard, or batch.
JsonValue type: "succeeded" (constant)
class MessageBatchErroredResult:
ErrorResponse error
ErrorObject error
class InvalidRequestError:
JsonValue type: "invalid_request_error" (constant)
class AuthenticationError:
JsonValue type: "authentication_error" (constant)
class BillingError:
JsonValue type: "billing_error" (constant)
class PermissionError:
JsonValue type: "permission_error" (constant)
class NotFoundError:
JsonValue type: "not_found_error" (constant)
class RateLimitError:
JsonValue type: "rate_limit_error" (constant)
class GatewayTimeoutError:
JsonValue type: "timeout_error" (constant)
class ApiErrorObject:
JsonValue type: "api_error" (constant)
class OverloadedError:
JsonValue type: "overloaded_error" (constant)
JsonValue type: "error" (constant)
JsonValue type: "errored" (constant)
class MessageBatchCanceledResult:
JsonValue type: "canceled" (constant)
class MessageBatchExpiredResult:
JsonValue type: "expired" (constant)
class MessageBatchSucceededResult:
Message message
String id
Unique object identifier.
The format and length of IDs may change over time.
List<ContentBlock> content
Content generated by the model.
This is an array of content blocks, each of which has a type that determines its shape.
Example:
[{"type": "text", "text": "Hi, I'm Claude."}]
If the request input messages ended with an assistant turn, then the response content will continue directly from that last turn. You can use this to constrain the model's output.
For example, if the input messages were:
[
{"role": "user", "content": "What's the Greek name for Sun? (A) Sol (B) Helios (C) Sun"},
{"role": "assistant", "content": "The best answer is ("}
]
Then the response content might be:
[{"type": "text", "text": "B)"}]
class TextBlock:
Citations supporting the text block.
The type of citation returned will depend on the type of document being cited. Citing a PDF results in page_location, plain text results in char_location, and content document results in content_block_location.
class CitationCharLocation:
JsonValue type: "char_location" (constant)
class CitationPageLocation:
JsonValue type: "page_location" (constant)
class CitationContentBlockLocation:
JsonValue type: "content_block_location" (constant)
class CitationsWebSearchResultLocation:
JsonValue type: "web_search_result_location" (constant)
class CitationsSearchResultLocation:
JsonValue type: "search_result_location" (constant)
JsonValue type: "text" (constant)
class ThinkingBlock:
JsonValue type: "thinking" (constant)
class RedactedThinkingBlock:
JsonValue type: "redacted_thinking" (constant)
class ToolUseBlock:
JsonValue type: "tool_use" (constant)
class ServerToolUseBlock:
JsonValue name: "web_search" (constant)
JsonValue type: "server_tool_use" (constant)
class WebSearchToolResultBlock:
WebSearchToolResultBlockContent content
class WebSearchToolResultError:
ErrorCode errorCode
JsonValue type: "web_search_tool_result_error" (constant)
List<WebSearchResultBlock>
JsonValue type: "web_search_result" (constant)
JsonValue type: "web_search_tool_result" (constant)
Model model
The model that will complete your prompt.
See models for additional details and options.
CLAUDE_OPUS_4_5_20251101("claude-opus-4-5-20251101")
Premium model combining maximum intelligence with practical performance
CLAUDE_OPUS_4_5("claude-opus-4-5")
Premium model combining maximum intelligence with practical performance
CLAUDE_3_7_SONNET_LATEST("claude-3-7-sonnet-latest")
High-performance model with early extended thinking
CLAUDE_3_7_SONNET_20250219("claude-3-7-sonnet-20250219")
High-performance model with early extended thinking
CLAUDE_3_5_HAIKU_LATEST("claude-3-5-haiku-latest")
Fastest and most compact model for near-instant responsiveness
CLAUDE_3_5_HAIKU_20241022("claude-3-5-haiku-20241022")
Our fastest model
CLAUDE_HAIKU_4_5("claude-haiku-4-5")
Hybrid model, capable of near-instant responses and extended thinking
CLAUDE_HAIKU_4_5_20251001("claude-haiku-4-5-20251001")
Hybrid model, capable of near-instant responses and extended thinking
CLAUDE_SONNET_4_20250514("claude-sonnet-4-20250514")
High-performance model with extended thinking
CLAUDE_SONNET_4_0("claude-sonnet-4-0")
High-performance model with extended thinking
CLAUDE_4_SONNET_20250514("claude-4-sonnet-20250514")
High-performance model with extended thinking
CLAUDE_SONNET_4_5("claude-sonnet-4-5")
Our best model for real-world agents and coding
CLAUDE_SONNET_4_5_20250929("claude-sonnet-4-5-20250929")
Our best model for real-world agents and coding
CLAUDE_OPUS_4_0("claude-opus-4-0")
Our most capable model
CLAUDE_OPUS_4_20250514("claude-opus-4-20250514")
Our most capable model
CLAUDE_4_OPUS_20250514("claude-4-opus-20250514")
Our most capable model
CLAUDE_OPUS_4_1_20250805("claude-opus-4-1-20250805")
Our most capable model
CLAUDE_3_OPUS_LATEST("claude-3-opus-latest")
Excels at writing and complex tasks
CLAUDE_3_OPUS_20240229("claude-3-opus-20240229")
Excels at writing and complex tasks
CLAUDE_3_HAIKU_20240307("claude-3-haiku-20240307")
Our previous fastest and most cost-effective model
JsonValue role: "assistant" (constant)
Conversational role of the generated message.
This will always be "assistant".
The reason that we stopped.
This may be one of the following values:
"end_turn": the model reached a natural stopping point
"max_tokens": we exceeded the requested max_tokens or the model's maximum
"stop_sequence": one of your provided custom stop_sequences was generated
"tool_use": the model invoked one or more tools
"pause_turn": we paused a long-running turn. You may provide the response back as-is in a subsequent request to let the model continue.
"refusal": when streaming classifiers intervene to handle potential policy violations
In non-streaming mode this value is always non-null. In streaming mode, it is null in the message_start event and non-null otherwise.
Optional<String> stopSequence
Which custom stop sequence was generated, if any.
This value will be a non-null string if one of your custom stop sequences was generated.
JsonValue type: "message" (constant)
Object type.
For Messages, this is always "message".
Usage usage
Billing and rate-limit usage.
Anthropic's API bills and rate-limits by token counts, as tokens represent the underlying cost to our systems.
Under the hood, the API transforms requests into a format suitable for the model. The model's output then goes through a parsing stage before becoming an API response. As a result, the token counts in usage will not match one-to-one with the exact visible content of an API request or response.
For example, output_tokens will be non-zero, even for an empty string response from Claude.
The total number of input tokens in a request is the sum of input_tokens, cache_creation_input_tokens, and cache_read_input_tokens.
Breakdown of cached tokens by TTL
long ephemeral1hInputTokens
The number of input tokens used to create the 1 hour cache entry.
long ephemeral5mInputTokens
The number of input tokens used to create the 5 minute cache entry.
Optional<Long> cacheCreationInputTokens
The number of input tokens used to create the cache entry.
Optional<Long> cacheReadInputTokens
The number of input tokens read from the cache.
long inputTokens
The number of input tokens which were used.
long outputTokens
The number of output tokens which were used.
The number of server tool requests.
long webSearchRequests
The number of web search tool requests.
Optional<ServiceTier> serviceTier
Which service tier the request used: priority, standard, or batch.