Anthropic Tool Runner SDK
The Tool Runner is an SDK-provided abstraction over the manual tool call loop. It handles request/response cycling, tool execution, conversation state management, and error wrapping. Available in Python, TypeScript, and Ruby SDKs (beta).
When to Use
Use Tool Runner when: you want Claude to call tools and receive results automatically — the common case.
Use the manual loop when: you need human-in-the-loop approval between tool calls, custom logging of intermediate states, or conditional execution based on tool results before continuing.
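For reference, the manual loop the runner automates has roughly this shape. The sketch below stubs out the API call (`fake_create`, `run_tool`, and the canned response shapes are illustrative, not SDK code) to show where a human-in-the-loop approval hook would go:

```python
# Simplified sketch of the manual tool call loop the Tool Runner automates.
# fake_create stands in for client.messages.create; response dicts mirror
# the Messages API shape but are hand-built here for illustration.

def run_tool(name, args):
    # Hypothetical tool dispatch.
    return '{"temperature": "20°C"}'

def fake_create(messages):
    # First turn: Claude requests a tool; second turn: final answer.
    if not any(isinstance(m["content"], list) and m["role"] == "user" for m in messages):
        return {"stop_reason": "tool_use",
                "content": [{"type": "tool_use", "id": "t1",
                             "name": "get_weather", "input": {"location": "Paris"}}]}
    return {"stop_reason": "end_turn",
            "content": [{"type": "text", "text": "It is 20°C in Paris."}]}

def manual_loop(messages, approve=lambda call: True):
    while True:
        response = fake_create(messages)
        if response["stop_reason"] != "tool_use":
            return response
        messages.append({"role": "assistant", "content": response["content"]})
        results = []
        for block in response["content"]:
            if block["type"] == "tool_use":
                if not approve(block):  # human-in-the-loop hook
                    raise RuntimeError("tool call rejected")
                results.append({"type": "tool_result",
                                "tool_use_id": block["id"],
                                "content": run_tool(block["name"], block["input"])})
        messages.append({"role": "user", "content": results})

final = manual_loop([{"role": "user", "content": "What's the weather in Paris?"}])
```

The `approve` callback is the kind of seam the runner intentionally hides; if you need it, keep the manual loop.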
Basic Pattern (Python)
import json

from anthropic import Anthropic, beta_tool

client = Anthropic()

@beta_tool
def get_weather(location: str, unit: str = "fahrenheit") -> str:
    """Get the current weather in a given location.

    Args:
        location: The city and state, e.g. San Francisco, CA
        unit: Temperature unit, either 'celsius' or 'fahrenheit'
    """
    return json.dumps({"temperature": "20°C", "condition": "Sunny"})
runner = client.beta.messages.tool_runner(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=[get_weather],
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
)

# Iterate to get intermediate messages, or:
final = runner.until_done()
The @beta_tool decorator introspects the function signature and docstring to generate a JSON schema. Tool functions must return a string, content block, or content block array — non-string primitives must be stringified.
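Conceptually, the decorator derives the schema from the signature along these lines (a simplified sketch of the idea, not the SDK's actual implementation; `build_schema` is a hypothetical helper):

```python
import inspect

def build_schema(fn):
    """Derive a minimal JSON schema from a function signature (sketch)."""
    type_map = {str: "string", int: "integer", float: "number", bool: "boolean"}
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {"type": type_map.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default => required parameter
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "input_schema": {"type": "object", "properties": properties,
                         "required": required},
    }

def get_weather(location: str, unit: str = "fahrenheit") -> str:
    """Get the current weather in a given location."""
    return "20°C"

schema = build_schema(get_weather)
```

Note how `unit` is optional (it has a default) while `location` lands in `required`; this is why accurate type hints and docstrings matter for decorated tools.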
The Tool Call Loop
runner is an iterable that yields messages from Claude. Each iteration:
- If the previous message contained stop_reason: "tool_use", the runner executes the requested tools and sends results back automatically.
- The next Claude message is yielded.
- The loop ends when Claude returns a message without tool use.
runner.until_done() skips intermediate yields and returns only the final message.
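In spirit, until_done() is just draining the iterator and keeping the last message (a sketch of the concept; the real method also handles streaming and error cases):

```python
def until_done(runner):
    # Conceptual equivalent of runner.until_done(): consume every
    # intermediate message and return the final one.
    final = None
    for message in runner:
        final = message
    return final

# Any iterable stands in for a runner here.
last = until_done(iter(["tool turn", "final answer"]))
```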
Compaction for Long-Running Agents
Tool Runner supports automatic context compaction: when token usage exceeds a threshold, the runner generates a summary of the conversation and continues, allowing agentic tasks to run beyond the context window limit. This is critical for long-horizon workflows.
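The mechanism can be pictured like this (purely illustrative: the SDK's actual compaction strategy, thresholds, and token counting are internal, and `summarize` here is a stub):

```python
def estimate_tokens(messages):
    # Crude stand-in for real token counting: roughly 4 characters per token.
    return sum(len(str(m["content"])) for m in messages) // 4

def summarize(messages):
    # Stub: a real implementation would ask the model for a summary.
    return {"role": "user", "content": f"[summary of {len(messages)} earlier messages]"}

def maybe_compact(messages, threshold):
    """Fold older messages into a summary once the estimate exceeds threshold."""
    if estimate_tokens(messages) <= threshold:
        return messages
    # Keep the most recent message verbatim; summarize the rest.
    return [summarize(messages[:-1]), messages[-1]]

history = [
    {"role": "user", "content": "x" * 400},
    {"role": "assistant", "content": "y" * 400},
    {"role": "user", "content": "latest question"},
]
compacted = maybe_compact(history, threshold=100)
```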
Advanced: Intercepting and Modifying Results
for message in runner:
    tool_response = runner.generate_tool_call_response()
    if tool_response is not None:
        for block in tool_response["content"]:
            # Check for errors before Claude sees them
            if block.get("is_error"):
                raise RuntimeError(f"Tool failed: {block['content']}")
            # Add cache_control to large tool results
            if block["type"] == "tool_result":
                block["cache_control"] = {"type": "ephemeral"}
        # Append modified response (prevents auto-append of original)
        runner.append_messages(message, tool_response)
generate_tool_call_response() returns the tool result dict that would be sent back. Appending it manually (with modifications) prevents the runner from auto-appending the original. See anthropic-prompt-caching for the caching benefit of this pattern with large tool results.
Error Behavior
By default, exceptions thrown by tools are caught and returned to Claude as is_error: true tool results — Claude can then decide how to respond. The full stack trace is logged (not sent to the model) when ANTHROPIC_LOG=debug is set.
To stop the loop on error rather than letting Claude handle it, intercept via generate_tool_call_response() and raise before the runner proceeds.
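The default wrapping behaves roughly as follows (a sketch of the concept, not the SDK internals; `execute_tool` is a hypothetical name):

```python
def execute_tool(fn, tool_use_id, **kwargs):
    """Run a tool and wrap any exception as an is_error tool result (sketch)."""
    try:
        return {"type": "tool_result", "tool_use_id": tool_use_id,
                "content": fn(**kwargs)}
    except Exception as exc:
        # Claude sees only the error text, never the stack trace.
        return {"type": "tool_result", "tool_use_id": tool_use_id,
                "content": f"{type(exc).__name__}: {exc}", "is_error": True}

def flaky_tool(location: str) -> str:
    raise ValueError(f"unknown location: {location}")

result = execute_tool(flaky_tool, "t1", location="Atlantis")
```

Because the error is returned as a normal tool result, Claude can retry with different arguments or explain the failure to the user instead of the loop crashing.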
Streaming
runner = client.beta.messages.tool_runner(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=[get_weather],
    messages=[{"role": "user", "content": "What's the weather?"}],
    stream=True,
)

for message_stream in runner:
    for event in message_stream:
        print("event:", event)
    print("final:", message_stream.get_final_message())
With stream=True, each iteration yields a stream object rather than a complete message.
Architectural Note
Tool Runner is a convenience layer, not a protocol change. It uses the same Messages API with the same tool call loop described in anthropic-tool-use — it just eliminates the boilerplate. The same constraints apply: tool results must follow tool use turns, tool_choice limitations still exist, and thinking block preservation rules still apply when using adaptive thinking.