OpenAI Agents SDK
The OpenAI Agents SDK is the production-grade Python framework for building, running, and scaling AI agents. It supersedes the experimental openai-swarm with a focus on durability, safety, and multi-agent orchestration. The developer retains explicit ownership of orchestration, tool execution, and state management.
Running Agents
The Runner class is the primary entry point for agent execution. Three run modes are available:
- Runner.run() — async, returns a RunResult
- Runner.run_sync() — synchronous wrapper
- Runner.run_streamed() — returns a RunResultStreaming for incremental output
Input can be a plain string (treated as a user message), a list of OpenAI Responses API items, or a RunState object to resume an interrupted run.
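The relationship between the async entry point and its synchronous wrapper can be pictured as a thin shim that drives the coroutine to completion. A minimal sketch with stand-in names (not the SDK's actual internals):

```python
import asyncio

async def run(agent_name: str, user_input: str) -> str:
    # Stand-in for the async Runner.run(); here it just echoes.
    return f"[{agent_name}] replied to: {user_input}"

def run_sync(agent_name: str, user_input: str) -> str:
    # A sync wrapper simply runs the coroutine on an event loop.
    return asyncio.run(run(agent_name, user_input))

print(run_sync("triage", "hello"))  # [triage] replied to: hello
```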
The Agent Loop
When Runner.run() is called, it executes a loop:
- Call the current agent's LLM with the accumulated conversation
- Execute any tool calls returned by the LLM
- If a handoff tool is invoked, switch the active agent
- Repeat until the LLM returns a final output (no more tool calls)
Agent Definition
An Agent is defined with:
- instructions — system prompt, either a static string or a dynamic function (RunContextWrapper, Agent) -> str that allows per-request context injection
- tools — list of callable functions the LLM can invoke
- handoffs — list of Agent instances or Handoff objects the agent can delegate to
- model — the underlying LLM; supports multiple providers
Dynamic instructions pattern:
```python
def dynamic_instructions(context: RunContextWrapper[UserContext], agent: Agent) -> str:
    return f"The user's name is {context.context.name}."
```
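What makes this pattern work is that the runner resolves instructions per request: a static string is used as-is, while a callable is invoked with the run context. A self-contained sketch of that resolution logic (hypothetical helper; the SDK does this internally):

```python
from dataclasses import dataclass

@dataclass
class Ctx:
    # Stand-in for the user-defined context carried by RunContextWrapper.
    name: str

def resolve_instructions(instructions, context, agent=None):
    # Accept either a static string or a callable, resolved per request.
    if callable(instructions):
        return instructions(context, agent)
    return instructions

static = resolve_instructions("You are helpful.", Ctx("Ada"))
dynamic = resolve_instructions(lambda c, a: f"The user's name is {c.name}.", Ctx("Ada"))
print(static)   # You are helpful.
print(dynamic)  # The user's name is Ada.
```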
Handoffs
Handoffs allow an agent to delegate the conversation to a specialist sub-agent. When a handoff is invoked, the delegated agent receives the full conversation history and becomes the active agent. Handoffs are represented as tools on the agent — the LLM invokes a handoff exactly as it invokes any other tool.
Two Multi-Agent Patterns
Manager (agents as tools): A central orchestrator invokes specialist agents as tools and retains control of the conversation at all times. The manager synthesizes results.
Peer handoffs: Agents pass control laterally to a specialist that takes over the reply. The new agent drives the conversation until it completes or hands off again.
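The control-flow difference between the two patterns can be illustrated with plain-function stubs (hypothetical names, not SDK code): in the manager pattern the orchestrator gets the last word, while in the handoff pattern the specialist's reply is the final output.

```python
def specialist(task: str) -> str:
    # Stand-in for a specialist agent's run.
    return f"specialist result for {task!r}"

def manager_pattern(task: str) -> str:
    # Manager invokes the specialist as a tool, then synthesizes the reply.
    result = specialist(task)
    return f"manager summary: {result}"

def handoff_pattern(task: str) -> str:
    # Control passes laterally; the specialist's output IS the reply.
    return specialist(task)

print(manager_pattern("refund"))  # manager summary: specialist result for 'refund'
print(handoff_pattern("refund"))  # specialist result for 'refund'
```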
Customizing Handoffs
The handoff() function provides fine-grained control:
- agent — target agent
- tool_name_override — custom tool name (default: transfer_to_<agent_name>)
- on_handoff — callback invoked when the handoff fires
Prompt discipline matters: include the RECOMMENDED_PROMPT_PREFIX from agents.extensions.handoff_prompt to ensure the LLM understands handoff semantics.
Guardrails
Guardrails attach to agents and run validations at defined workflow boundaries:
| Type | When it runs |
|---|---|
| Input guardrails | First agent in the chain only |
| Output guardrails | Final agent producing the terminal output |
| Tool guardrails | Every invocation of a custom function-tool |
Input guardrails run in three steps: receive the same input as the agent → execute the guardrail function → return a GuardrailFunctionOutput. A tripwire flag signals immediate abort. This design lets you use a cheap/fast model as a safety filter before the expensive primary model runs.
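The three-step flow can be sketched with a cheap keyword filter standing in for the fast safety model (all names here are hypothetical stand-ins for the SDK's guardrail types):

```python
from dataclasses import dataclass

@dataclass
class GuardrailOutput:
    # Stand-in for GuardrailFunctionOutput.
    tripwire_triggered: bool
    info: str = ""

class TripwireError(Exception):
    """Raised when a guardrail tripwire aborts the run."""

def cheap_filter(user_input: str) -> GuardrailOutput:
    # Step 2: a fast, cheap check runs before the expensive model.
    banned = {"password", "ssn"}
    hit = any(word in user_input.lower() for word in banned)
    return GuardrailOutput(tripwire_triggered=hit, info="blocked term" if hit else "")

def run_with_input_guardrail(user_input: str) -> str:
    verdict = cheap_filter(user_input)        # steps 1-3: same input, run, output
    if verdict.tripwire_triggered:
        raise TripwireError(verdict.info)     # abort before the main model runs
    return f"main model handled: {user_input}"

print(run_with_input_guardrail("hello"))  # main model handled: hello
```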
Lifecycle Hooks
Two hook scopes give observability into agent execution:
- RunHooks — run-level, fires for any agent in the run
- AgentHooks — agent-level, fires only for that specific agent
Hooks support pre/post-tool, pre/post-handoff, and agent-start/end events. Primary use cases: logging, pre-fetching, usage recording.
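The two scopes can be sketched with recording hooks (hypothetical shapes; the SDK's hook classes define more events and different signatures):

```python
class RunHooks:
    # Run-level: sees events from every agent in the run.
    def __init__(self):
        self.events = []
    def on_agent_start(self, agent_name):
        self.events.append(("run", agent_name))

class AgentHooks:
    # Agent-level: attached to one agent, sees only its events.
    def __init__(self):
        self.events = []
    def on_start(self, agent_name):
        self.events.append(("agent", agent_name))

def fire_start(agent_name, run_hooks, agent_hooks_by_agent):
    run_hooks.on_agent_start(agent_name)          # run-level: fires for every agent
    hooks = agent_hooks_by_agent.get(agent_name)  # agent-level: only if attached
    if hooks:
        hooks.on_start(agent_name)

run_hooks = RunHooks()
triage_hooks = AgentHooks()
fire_start("triage", run_hooks, {"triage": triage_hooks})
fire_start("billing", run_hooks, {"triage": triage_hooks})
print(run_hooks.events)     # [('run', 'triage'), ('run', 'billing')]
print(triage_hooks.events)  # [('agent', 'triage')]
```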
MCP Integration
The SDK provides native support for the Model Context Protocol. MCP standardizes how applications expose tools and context to LLMs — "MCP is like USB-C for AI." Agents connect to MCPServer instances to access any MCP-compliant tool or resource, enabling a wide ecosystem of pre-built integrations. See mcp-moc.
Tracing
Built-in tracing records agent runs as structured traces. Multiple Runner.run() calls can be grouped into a single trace using the trace() context manager:
```python
with trace("Workflow name"):
    result1 = await Runner.run(agent, "...")
    result2 = await Runner.run(agent, "...")
```
Traces can be visualized for debugging agentic flows.
Resumable State
Interrupted runs (e.g., human-in-the-loop pauses) are resumed by passing a RunState object back to Runner.run(). This preserves the full conversation and tool-execution history across the interruption boundary.
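The resume mechanism can be sketched as a run function that, on interruption, returns a serializable state instead of a result, and accepts that state back to continue where it left off (hypothetical shapes; the SDK's RunState carries much more than a message list):

```python
import json

def run(agent_input, state=None):
    # Resuming: a prior state restores the conversation; otherwise start fresh.
    history = state["history"] if state else []
    history.append({"role": "user", "content": agent_input})
    if "approve" not in history[-1]["content"]:
        # Human-in-the-loop pause: hand back a serializable state, not a result.
        return {"interrupted": True, "history": history}
    return {"interrupted": False, "history": history, "output": "done"}

first = run("please refund order 42")
assert first["interrupted"]
saved = json.dumps(first)                        # persist across the interruption
resumed = run("approve", state=json.loads(saved))
print(resumed["output"])        # done
print(len(resumed["history"]))  # 2 -- full history survives the boundary
```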
Comparison: Swarm vs. Agents SDK
| Feature | [[openai-swarm\|Swarm]] | Agents SDK |
|---|---|---|
| Status | Experimental / educational | Production-ready |
| State | Stateless (caller manages) | Resumable via RunState |
| Orchestration | Lightweight function-based | Runner loop + typed hooks |
| Safety | Minimal | Input/output/tool guardrails |
| Observability | None | Built-in tracing |
| MCP | No | Native |
*Source: lit-openai-agents-sdk*