Agent Thought Cycle
The Thought-Action-Observation cycle is the fundamental workflow of an autonomous agent, enabling it to reason about a goal and refine its approach based on environment feedback.
Core Opinion
This loop is the smallest useful model of an agent runtime. It matters in the Nest because most higher-level frameworks differ less in whether they use this cycle than in how much of it they expose, constrain, or automate.
The practical reading is:
- "thought" is the model selecting a next move from the current state
- "action" is the runtime boundary where the model leaves pure text and invokes a capability
- "observation" is the new state returned to the loop, which either grounds the next step or reveals failure
Once that is clear, notes like react-pattern, agent-tools, graph-orchestration, and workflow-agents become easier to place.
The Loop
- Thought: The LLM reasons about the current state and decides which step to take next.
- Action: The agent executes a command or calls a tool based on the reasoning (e.g.,
search_web). - Observation: The system returns the result of the action (e.g., "The weather is 72°F") as a new message to the LLM.
Evolution: ReAct
This cycle is often implemented using the react-pattern, where the reasoning and acting are interleaved in a single context window to maintain a logical "chain of thought."
Decision Rule
Start from agent-thought-cycle when your question is about the minimal logic of agency:
- "What actually makes a tool-using system an agent instead of just a completion?"
- "Where does failure usually enter the loop?"
- "How do reasoning, tool use, and returned state fit together?"
If the question is about a framework packaging this loop, route to agent-development-kit, openai-agents-sdk, or smolagents. If the question is about deterministic control over multiple loops, route to graph-orchestration or workflow-agents.
Failure Modes
Most agent bugs are distortions of one phase of this cycle:
- bad Thought: the model plans against the wrong objective or hallucinates what it already knows
- bad Action: the tool contract is vague, the chosen tool is wrong, or the runtime boundary is unsafe
- bad Observation: the returned result is incomplete, noisy, or not written back into state in a usable form
That is why agent-tools and agent-actions matter operationally more than generic "reasoning" descriptions.
Relationship to the Rest of the Vault
- react-pattern is the canonical prompt/runtime expression of this loop.
- agent-tools explains the action surface in more detail.
- graph-orchestration explains what happens when many such loops are wired together explicitly.
- workflow-agents shows the deterministic counterpart where flow control is moved out of the model.
References
- Source:
00_Raw/hf-agents-course-unit1.md - agentic-frameworks-moc
- react-pattern
- ps-vulture-search (Context Packet generation for 'Thought' phase)
- agent-actions