Anthropic Message Batches
The Message Batches API is Anthropic's asynchronous execution layer for high-volume workflows. It processes requests independently, returns results when the batch completes, and prices everything at 50% of synchronous API rates.
When to Use Batches
Batches are the right choice when:
- You need to process large volumes of data (evaluations, content analysis, bulk generation)
- Immediate responses are not required
- Cost matters more than latency
- Thinking budgets exceed 32k tokens (long synchronous requests risk timing out)
Not suitable for: interactive use cases, streaming output, or workloads that require immediate responses.
Batch Limits
| Limit | Value |
|---|---|
| Requests per batch | 100,000 |
| Batch size | 256 MB |
| Processing window | Most complete < 1 hour; max 24 hours |
| Result availability | 29 days after batch creation (not after processing end) |
| max_tokens: 0 | Not supported (cache pre-warming is incompatible with batches) |
Batches expire if processing does not complete within 24 hours. Results expire 29 days after the batch's created_at timestamp — not after ended_at.
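The request-count and size caps mean a large job may need to be split across several batches. A minimal client-side chunking sketch (chunk_requests is a hypothetical helper; request size is estimated from the JSON-serialized payload):

```python
import json

MAX_REQUESTS = 100_000          # requests per batch
MAX_BYTES = 256 * 1024 * 1024   # 256 MB per batch

def chunk_requests(requests):
    """Split request dicts into sublists that respect both batch limits."""
    chunk, chunk_bytes = [], 0
    for req in requests:
        size = len(json.dumps(req).encode("utf-8"))
        if chunk and (len(chunk) >= MAX_REQUESTS or chunk_bytes + size > MAX_BYTES):
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append(req)
        chunk_bytes += size
    if chunk:
        yield chunk
```

Each sublist can then be submitted as its own batch. The byte estimate is approximate, so leave headroom rather than packing to the exact limit.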
Request Shape
Each batched request has a custom_id and a params object with standard Messages API parameters:
```python
import anthropic
from anthropic.types.message_create_params import MessageCreateParamsNonStreaming
from anthropic.types.messages.batch_create_params import Request

client = anthropic.Anthropic()

batch = client.messages.batches.create(
    requests=[
        Request(
            custom_id="req-001",
            params=MessageCreateParamsNonStreaming(
                model="claude-opus-4-7",
                max_tokens=1024,
                messages=[{"role": "user", "content": "Hello"}],
            ),
        ),
    ]
)
```
Validation of params is asynchronous — errors appear in individual result records, not at submission time. Dry-run single requests against the Messages API before batching.
Result Types
| Type | Description | Billing |
|---|---|---|
| succeeded | Message completed, result included | Billed |
| errored | Invalid request or server error | Not billed |
| canceled | Canceled before processing | Not billed |
| expired | Not processed within 24-hour window | Not billed |
Results are not ordered. Always match results to requests using custom_id.
```python
for result in client.messages.batches.results(batch_id):
    match result.result.type:
        case "succeeded":
            message = result.result.message  # the full Message object
        case "errored":
            if result.result.error.error.type == "invalid_request_error":
                pass  # fix the request and resubmit
            else:
                pass  # server error: safe to retry unchanged
```
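The branching above can be packaged as a small triage helper. A sketch under stated assumptions: partition_results is hypothetical, and the records only need .custom_id and .result attributes shaped like the SDK's result objects:

```python
def partition_results(results):
    """Bucket batch results by what to do next, keyed by custom_id."""
    buckets = {"succeeded": [], "fix": [], "retry": []}
    for r in results:
        if r.result.type == "succeeded":
            buckets["succeeded"].append(r.custom_id)
        elif (r.result.type == "errored"
              and r.result.error.error.type == "invalid_request_error"):
            buckets["fix"].append(r.custom_id)    # needs a corrected request
        else:
            buckets["retry"].append(r.custom_id)  # server error / canceled / expired: resubmit as-is
    return buckets
```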
Polling
```python
import time

while True:
    batch = client.messages.batches.retrieve(batch_id)
    if batch.processing_status == "ended":
        break
    time.sleep(60)
```
processing_status transitions: in_progress → ended (or canceling → ended if canceled).
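The loop above waits indefinitely if a batch never reaches ended. A deadline-aware variant (wait_for_batch is a hypothetical helper; retrieve is injected, e.g. client.messages.batches.retrieve, so the loop can be exercised without a live API):

```python
import time

def wait_for_batch(retrieve, batch_id, poll_seconds=60,
                   timeout_seconds=24 * 3600,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until processing_status == "ended" or the deadline passes."""
    deadline = clock() + timeout_seconds
    while True:
        batch = retrieve(batch_id)
        if batch.processing_status == "ended":
            return batch
        if clock() >= deadline:
            raise TimeoutError(f"batch {batch_id} still {batch.processing_status}")
        sleep(poll_seconds)
```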
Extended Output (Beta)
The output-300k-2026-03-24 beta header raises max_tokens to 300,000 for batches using Opus 4.7, Opus 4.6, or Sonnet 4.6. This is batch-only — unavailable on the synchronous Messages API.
```python
batch = client.beta.messages.batches.create(
    betas=["output-300k-2026-03-24"],
    requests=[
        Request(
            custom_id="long-form",
            params=MessageCreateParamsNonStreaming(
                model="claude-opus-4-7",
                max_tokens=300_000,
                messages=[{"role": "user", "content": "..."}],
            ),
        )
    ],
)
```
A single 300k-token generation can take over an hour — plan within the 24-hour processing window. Standard batch pricing applies (50% discount).
Prompt Caching + Batches
Batch and cache discounts stack. Caching within batches is best-effort (concurrent async processing means requests may not share cache state). Expected cache hit rates: 30–98% depending on traffic pattern.
To maximize cache hits:
- Include identical cache_control blocks in every request in the batch.
- Use the 1-hour cache TTL; the default 5-minute TTL expires before most batches complete.
- Structure requests to share as much prefix as possible.
See anthropic-prompt-caching for cache control mechanics.
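To illustrate the first two points, a sketch that stamps an identical 1-hour cache_control block on every request in a batch. make_request, SYSTEM_PROMPT, and the document strings are placeholders; the cache_control shape follows the prompt caching API, though the 1-hour TTL may require its own beta header:

```python
SYSTEM_PROMPT = "You are a contract analyst. [long shared instructions]"

def make_request(custom_id, document_text):
    return {
        "custom_id": custom_id,
        "params": {
            "model": "claude-opus-4-7",
            "max_tokens": 1024,
            # Identical system block in every request, so concurrent
            # batch workers can hit the same cache entry.
            "system": [{
                "type": "text",
                "text": SYSTEM_PROMPT,
                "cache_control": {"type": "ephemeral", "ttl": "1h"},
            }],
            "messages": [{"role": "user", "content": document_text}],
        },
    }

requests = [make_request(f"doc-{i}", f"Document {i} text ...") for i in range(3)]
```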
Data Retention
Batch request and response data is retained for 29 days. Not ZDR-eligible. Delete batches explicitly via DELETE /v1/messages/batches/{batch_id} (cancel first if in-progress).