OpenAI Agents SDK
The OpenAI Agents SDK is the production-grade Python framework for building, running, and scaling AI agents. It supersedes the experimental openai-swarm with a focus on durability, safety, and multi-agent orchestration. The developer retains explicit ownership of orchestration, tool execution, and state management.
Running Agents
The Runner class is the primary entry point for agent execution. Three run modes are available:
- Runner.run() — async, returns a RunResult
- Runner.run_sync() — synchronous wrapper
- Runner.run_streamed() — returns a RunResultStreaming for incremental output
Input can be a plain string (treated as a user message), a list of OpenAI Responses API items, or a RunState object to resume an interrupted run.
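The relationship between the async entry point and its synchronous wrapper can be pictured as a thin shim that drives the coroutine to completion. A minimal sketch with stand-in names (not the SDK's actual internals):

```python
import asyncio

async def run(agent_name: str, user_input: str) -> str:
    # Stand-in for the async Runner.run(); here it just echoes.
    return f"[{agent_name}] replied to: {user_input}"

def run_sync(agent_name: str, user_input: str) -> str:
    # A sync wrapper simply runs the coroutine on an event loop.
    return asyncio.run(run(agent_name, user_input))

print(run_sync("triage", "hello"))  # [triage] replied to: hello
```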
The Agent Loop
When Runner.run() is called, it executes a loop:
- Call the current agent's LLM with the accumulated conversation
- Execute any tool calls returned by the LLM
- If a handoff tool is invoked, switch the active agent
- Repeat until the LLM returns a final output (no more tool calls)
Agent Definition
An Agent is defined with:
- instructions — system prompt, either a static string or a dynamic function (RunContextWrapper, Agent) -> str that allows per-request context injection
- tools — list of callable functions the LLM can invoke
- handoffs — list of Agent instances or Handoff objects the agent can delegate to
- model — the underlying LLM; supports multiple providers
Dynamic instructions pattern:
```python
def dynamic_instructions(context: RunContextWrapper[UserContext], agent: Agent) -> str:
    return f"The user's name is {context.context.name}."
```
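What makes this pattern work is that the runner resolves instructions per request: a static string is used as-is, while a callable is invoked with the run context. A self-contained sketch of that resolution logic (hypothetical helper; the SDK does this internally):

```python
from dataclasses import dataclass

@dataclass
class Ctx:
    # Stand-in for the user-defined context carried by RunContextWrapper.
    name: str

def resolve_instructions(instructions, context, agent=None):
    # Accept either a static string or a callable, resolved per request.
    if callable(instructions):
        return instructions(context, agent)
    return instructions

static = resolve_instructions("You are helpful.", Ctx("Ada"))
dynamic = resolve_instructions(lambda c, a: f"The user's name is {c.name}.", Ctx("Ada"))
print(static)   # You are helpful.
print(dynamic)  # The user's name is Ada.
```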
Handoffs
Handoffs allow an agent to delegate the conversation to a specialist sub-agent. When a handoff is invoked, the delegated agent receives the full conversation history and becomes the active agent. Handoffs are represented as tools on the agent — the LLM invokes a handoff exactly as it invokes any other tool.
Two Multi-Agent Patterns
Manager (agents as tools): A central orchestrator invokes specialist agents as tools and retains control of the conversation at all times. The manager synthesizes results.
Peer handoffs: Agents pass control laterally to a specialist that takes over the reply. The new agent drives the conversation until it completes or hands off again.
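The control-flow difference between the two patterns can be illustrated with plain-function stubs (hypothetical names, not SDK code): in the manager pattern the orchestrator gets the last word, while in the handoff pattern the specialist's reply is the final output.

```python
def specialist(task: str) -> str:
    # Stand-in for a specialist agent's run.
    return f"specialist result for {task!r}"

def manager_pattern(task: str) -> str:
    # Manager invokes the specialist as a tool, then synthesizes the reply.
    result = specialist(task)
    return f"manager summary: {result}"

def handoff_pattern(task: str) -> str:
    # Control passes laterally; the specialist's output IS the reply.
    return specialist(task)

print(manager_pattern("refund"))  # manager summary: specialist result for 'refund'
print(handoff_pattern("refund"))  # specialist result for 'refund'
```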
Customizing Handoffs
The handoff() function provides fine-grained control:
- agent — target agent
- tool_name_override — custom tool name (default: transfer_to_<agent_name>)
- on_handoff — callback invoked when the handoff fires
Prompt discipline matters: include the RECOMMENDED_PROMPT_PREFIX from agents.extensions.handoff_prompt to ensure the LLM understands handoff semantics.
Guardrails
Guardrails attach to agents and run validations at defined workflow boundaries:
| Type | When it runs |
|---|---|
| Input guardrails | First agent in the chain only |
| Output guardrails | Final agent producing the terminal output |
| Tool guardrails | Every invocation of a custom function-tool |
Input guardrails run in three steps: receive the same input as the agent → execute the guardrail function → return a GuardrailFunctionOutput. A tripwire flag signals immediate abort. This design lets you use a cheap/fast model as a safety filter before the expensive primary model runs.
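The three-step flow can be sketched with a cheap keyword filter standing in for the fast safety model (all names here are hypothetical stand-ins for the SDK's guardrail types):

```python
from dataclasses import dataclass

@dataclass
class GuardrailOutput:
    # Stand-in for GuardrailFunctionOutput.
    tripwire_triggered: bool
    info: str = ""

class TripwireError(Exception):
    """Raised when a guardrail tripwire aborts the run."""

def cheap_filter(user_input: str) -> GuardrailOutput:
    # Step 2: a fast, cheap check runs before the expensive model.
    banned = {"password", "ssn"}
    hit = any(word in user_input.lower() for word in banned)
    return GuardrailOutput(tripwire_triggered=hit, info="blocked term" if hit else "")

def run_with_input_guardrail(user_input: str) -> str:
    verdict = cheap_filter(user_input)        # steps 1-3: same input, run, output
    if verdict.tripwire_triggered:
        raise TripwireError(verdict.info)     # abort before the main model runs
    return f"main model handled: {user_input}"

print(run_with_input_guardrail("hello"))  # main model handled: hello
```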
Lifecycle Hooks
Two hook scopes give observability into agent execution:
- RunHooks — run-level, fires for any agent in the run
- AgentHooks — agent-level, fires only for that specific agent
Hooks support pre/post-tool, pre/post-handoff, and agent-start/end events. Primary use cases: logging, pre-fetching, usage recording.
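The two scopes can be sketched with recording hooks (hypothetical shapes; the SDK's hook classes define more events and different signatures):

```python
class RunHooks:
    # Run-level: sees events from every agent in the run.
    def __init__(self):
        self.events = []
    def on_agent_start(self, agent_name):
        self.events.append(("run", agent_name))

class AgentHooks:
    # Agent-level: attached to one agent, sees only its events.
    def __init__(self):
        self.events = []
    def on_start(self, agent_name):
        self.events.append(("agent", agent_name))

def fire_start(agent_name, run_hooks, agent_hooks_by_agent):
    run_hooks.on_agent_start(agent_name)          # run-level: fires for every agent
    hooks = agent_hooks_by_agent.get(agent_name)  # agent-level: only if attached
    if hooks:
        hooks.on_start(agent_name)

run_hooks = RunHooks()
triage_hooks = AgentHooks()
fire_start("triage", run_hooks, {"triage": triage_hooks})
fire_start("billing", run_hooks, {"triage": triage_hooks})
print(run_hooks.events)     # [('run', 'triage'), ('run', 'billing')]
print(triage_hooks.events)  # [('agent', 'triage')]
```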
MCP Integration
The SDK provides native support for the Model Context Protocol. MCP standardizes how applications expose tools and context to LLMs — "MCP is like USB-C for AI." Agents connect to MCPServer instances to access any MCP-compliant tool or resource, enabling a wide ecosystem of pre-built integrations. See mcp-moc.
Tracing
Built-in tracing records agent runs as structured traces. Multiple Runner.run() calls can be grouped into a single trace using the trace() context manager:
```python
with trace("Workflow name"):
    result1 = await Runner.run(agent, "...")
    result2 = await Runner.run(agent, "...")
```
Traces can be visualized for debugging agentic flows.
Resumable State
Interrupted runs (e.g., human-in-the-loop pauses) are resumed by passing a RunState object back to Runner.run(). This preserves the full conversation and tool-execution history across the interruption boundary.
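The resume mechanism can be sketched as a run function that, on interruption, returns a serializable state instead of a result, and accepts that state back to continue where it left off (hypothetical shapes; the SDK's RunState carries much more than a message list):

```python
import json

def run(agent_input, state=None):
    # Resuming: a prior state restores the conversation; otherwise start fresh.
    history = state["history"] if state else []
    history.append({"role": "user", "content": agent_input})
    if "approve" not in history[-1]["content"]:
        # Human-in-the-loop pause: hand back a serializable state, not a result.
        return {"interrupted": True, "history": history}
    return {"interrupted": False, "history": history, "output": "done"}

first = run("please refund order 42")
assert first["interrupted"]
saved = json.dumps(first)                        # persist across the interruption
resumed = run("approve", state=json.loads(saved))
print(resumed["output"])        # done
print(len(resumed["history"]))  # 2 -- full history survives the boundary
```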
Comparison: Swarm vs. Agents SDK
| Feature | [[openai-swarm\|Swarm]] | Agents SDK |
|---|---|---|
| Status | Experimental / educational | Production-ready |
| State | Stateless (caller manages) | Resumable via RunState |
| Orchestration | Lightweight function-based | Runner loop + typed hooks |
| Safety | Minimal | Input/output/tool guardrails |
| Observability | None | Built-in tracing |
| MCP | No | Native |
*Source: lit-openai-agents-sdk*