

Why the Best AI Agents Are Powered by a 3-Step Loop from 2022

Part 1 of the Data Governance & AI series

I spent a month building an agent framework the “right” way. There were prescriptive JSON-based workflows with staged pipelines, and a persistence layer for coordinating human-in-the-loop feedback. A gate-review system enabled agents to validate each other’s work and loop back on failures. A debate-pattern engine coordinated multi-agent arguments to work toward consensus. These were real engineering solutions for real problems, and the coordination systems worked.

Then over a hackathon weekend, I decided to try something different. I replaced all of the complex orchestration with a single ReAct loop, some useful tools, and a well-written skill file. The simple version did everything my framework did, only better. It was more adaptable, easier to debug, and I built it in two days instead of four weeks. That experience changed how I think about agent architecture. And it turns out the entire industry is converging on the same conclusion.

Tools like Claude Code, Codex, and Cursor’s agent mode feel almost magical when you first use them, because they don’t just generate code. They plan an approach, read files, write code, run tests, notice failures, debug, and iterate on their own. But the core pattern behind all of them isn’t some novel orchestration architecture. It’s a loop from a 2022 research paper called ReAct, and the whole idea fits in four words: reason, act, observe, repeat. I think this might be the most under-appreciated concept in AI right now, and it’s worth understanding why.

What ReAct Actually Is

The ReAct pattern is a reasoning-and-acting loop: an AI agent thinks step by step, uses tools, and adapts based on results. The agent:

  • Reasons about the current situation and what to do next
  • Acts by using a tool (read a file, run a query, call an API, search the web)
  • Observes the result
  • Repeats, reasoning about the new information and deciding the next action

That’s it. No complex graph. No choreography. One agent, thinking step by step, using tools as needed, adapting based on what it finds. The paper (Yao et al., 2022) showed this worked; it now has over 5,200 citations. What’s changed since then isn’t the pattern, it’s the engine behind it.

The ReAct Loop: A single agent reasoning, acting, and observing in a continuous cycle
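The whole loop fits in a few lines of code. Here is a minimal sketch; the model interface, the scripted model, and the `lookup` tool are hypothetical stand-ins, not any specific vendor API:

```python
# A minimal ReAct loop. The model decides the next action; tools are
# plain functions; observations are appended back into the history.

def react_loop(task, model, tools, max_steps=10):
    """Reason -> act -> observe -> repeat until the model answers."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        # Reason: the model sees the full history and decides what to do next.
        decision = model(history)
        if decision["action"] == "finish":
            return decision["answer"]
        # Act: run the chosen tool with the model's arguments.
        result = tools[decision["action"]](**decision["args"])
        # Observe: append the result so the next reasoning step can use it.
        history.append(f"{decision['action']} -> {result}")
    raise RuntimeError("step budget exhausted")


# Toy stand-ins so the loop runs end to end without a real LLM.
def scripted_model(history):
    if not any(line.startswith("lookup") for line in history):
        return {"action": "lookup", "args": {"key": "status"}}
    return {"action": "finish", "answer": history[-1].split(" -> ")[1]}

tools = {"lookup": lambda key: {"status": "ok"}[key]}
print(react_loop("check status", scripted_model, tools))  # prints: ok
```

In a real system, `scripted_model` becomes an LLM call and `tools` becomes file access, queries, and APIs, but the control flow doesn’t change.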

A common reaction is that a simple step-by-step loop sounds too reactive. What about tasks that need a plan? Planning isn’t separate from the loop. It’s part of the reasoning. A capable agent facing a complex task will reason about what needs to happen, write a plan (often as an actual artifact like a markdown checklist), then execute against that plan step by step. As it works and observes results, it revises: updating the document, reprioritizing steps, adding tasks it didn’t anticipate. The plan is a living artifact inside the loop, not a static blueprint defined before execution begins.

This is how experienced engineers work too: sketch a plan, start executing, learn things that change the plan, adapt. The ReAct loop makes that natural workflow explicit.
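The “plan as a living artifact” idea above can be made concrete. A hedged sketch, where the checklist format and the task names are illustrative, not any particular agent’s format:

```python
# A plan as a markdown checklist the agent rewrites as it works.

plan = ["[ ] read failing test", "[ ] patch parser", "[ ] rerun tests"]

def complete(plan, item):
    """Mark the matching checklist item done."""
    return [line.replace("[ ]", "[x]", 1) if item in line else line
            for line in plan]

def add_task(plan, task, after):
    """Insert a newly discovered task right after an existing one."""
    i = next(i for i, line in enumerate(plan) if after in line)
    return plan[:i + 1] + [f"[ ] {task}"] + plan[i + 1:]

plan = complete(plan, "read failing test")
# Mid-loop observation: the bug is actually in the tokenizer, so the
# agent revises the plan before continuing to execute it.
plan = add_task(plan, "fix tokenizer edge case", after="read failing test")
```

The plan is just text the agent reads and rewrites between loop iterations, which is why no separate planning architecture is needed.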

And this isn’t a fringe take. Anthropic’s “Building Effective Agents” (December 2024), arguably the most influential document on agent design in two years, makes the same argument: the most successful implementations use simple, composable patterns rather than complex frameworks. Claude Code’s creator has cited the Bitter Lesson as the design philosophy. OpenAI’s agent-building guide recommends maximizing single-agent capability before adding complexity. Sketch.dev published the entire agent core in 9 lines of Python. Amp’s Thorsten Ball built a working code agent in under 400 lines. The consensus is broad and practitioner-driven.

The LLM Capability Multiplier

ReAct in 2022 with GPT-3.5 was promising but limited. ReAct in 2026 with the latest models is a different beast. The same pattern produces wildly different results because of four improvements in the underlying models:

  • Better Reasoning. The model can hold a complex plan in context, adjust as new information arrives, and recover from dead ends.
  • Better Tool Use. The model understands tool interfaces, constructs correct arguments, and interprets results accurately.
  • Longer Context. The model maintains coherent reasoning across dozens of tool calls without losing the thread.
  • Better Judgment. The model knows when to stop, when to try a different approach, and when to ask for help.

Same architecture in 2022 and 2026, with four capability improvements driving the difference in results

The architecture didn’t change. The intelligence inside it did. This is what most people miss. Working with Claude Sonnet 4.6 in a ReAct loop is a qualitatively different experience from the same pattern with earlier models. That’s when tools like Claude Code and Cowork went from interesting to genuinely practical. The gap isn’t incremental. It’s a step change in what the agent can handle autonomously. Claude Code’s system prompt has actually shrunk over time as models improved, the Bitter Lesson playing out in real time.

The Orchestration Complexity Trap

There’s a widespread assumption that more sophisticated orchestration equals better results. Build a more complex graph, add more specialized agents, create handoff protocols. But for most real-world use cases, the bottleneck was never orchestration complexity. It was LLM capability.

Comparison of a 14-node orchestration graph versus a ReAct loop with good tools

I experienced this firsthand. The orchestration framework I’d spent a month building had gate systems, debate patterns, persistence layers. Real engineering. But the problems it was solving were problems the model could now handle on its own. When I replaced it with a ReAct loop, the model just figured out the coordination. LangChain’s own leadership has acknowledged that frameworks making it harder to control what reaches the LLM create more problems than they solve. Their founding engineer documented repeatedly building complex structured workflows only to have model improvements make the structure unnecessary. Multiple production teams, including Octomind, found that removing framework complexity improved performance.

This has practical implications for anyone building or buying AI systems.

If your agent can reason well and has the right tools, a ReAct loop handles the task. You probably don’t need the framework complexity you think you do.

The real leverage is in giving your agent the right tools (database access, file search, API calls), the right context (clear instructions, relevant documents), and the right access. Often it’s not in building multi-agent graphs.

As LLMs improve, a ReAct-based system improves for free: same code, same tools, better results. A complex orchestration graph may see some benefit too, but its hard-coded structure was designed around the limitations of an older model, so it can’t automatically hand the new model the decisions it is now capable of making on its own.

A simple architecture also means debugging can be simpler. When something goes wrong with a ReAct agent, you read the trace: it reasoned X, tried Y, observed Z, decided W. And when errors occur, the agent can sometimes course-correct on its own. Instead of blindly retrying the same call three times, it can reason from the API error that the issue may be intermittent, or find an alternate path to complete the task.
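The contrast is easy to see in code. A hedged sketch, where the error types and the fallback path are hypothetical examples:

```python
# Orchestration-layer retry vs. loop-level recovery.

import time

def blind_retry(call, attempts=3):
    """Classic framework retry: the same call, three times, no reasoning."""
    last_error = None
    for _ in range(attempts):
        try:
            return call()
        except Exception as e:
            last_error = e
            time.sleep(0)  # placeholder for backoff
    raise RuntimeError("all retries failed") from last_error

def loop_recovery(primary, fallback):
    """What a ReAct agent can do instead: observe the error and reason
    about it. A timeout suggests an intermittent issue worth one retry;
    a permission error means retrying is pointless, so take another
    route to the same information."""
    try:
        return primary()
    except TimeoutError:
        return primary()   # plausibly intermittent: retry once
    except PermissionError:
        return fallback()  # retrying won't help: use an alternate path
```

In a real agent the branching logic lives in the model’s reasoning rather than in `except` clauses, which is exactly why it generalizes to errors the author never anticipated.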

It’s ReAct All the Way Down

The natural objection is: “But what about tasks that need multiple agents?”

When Claude Code spawns a sub-agent to handle an isolated task, what’s actually happening? From the main agent’s perspective, it’s just another tool call: reason, act (spawn sub-agent), observe (get result back), repeat. The sub-agent itself is also running a ReAct loop. It’s ReAct loops all the way down.

Multi-agent architecture showing a main agent delegating to sub-agents, each running their own ReAct loop
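The delegation described above can be sketched in a few lines. This is a hypothetical illustration, with `run_agent` standing in for any single-agent loop:

```python
# "ReAct all the way down": a whole sub-agent wrapped behind a plain
# function, so the parent's reason/act/observe cycle treats delegation
# as just another tool call.

def make_subagent_tool(skill_prompt, run_agent):
    def spawn(task):
        # The sub-agent runs its own full loop and returns only the
        # result; its intermediate reasoning never enters the parent's
        # context window.
        return run_agent(prompt=skill_prompt, task=task)
    return spawn

# From the main agent's side this is indistinguishable from read_file
# or run_query. The stub run_agent here just echoes the task.
tools = {
    "research": make_subagent_tool(
        "You are a research sub-agent.",
        run_agent=lambda prompt, task: f"findings for {task}",
    ),
}
result = tools["research"]("survey retry strategies")
```

Note what this buys you: context isolation comes for free, because only the sub-agent’s final answer flows back into the parent’s loop.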

Even Anthropic’s multi-agent research system, which demonstrated a 90% improvement over a single agent on complex research tasks, follows this pattern. A lead agent runs a ReAct loop, delegating to sub-agents that each run their own loops. The “multi-agent” part isn’t a different pattern. It’s the same pattern composed: ReAct agents as tools within other ReAct agents. The coordination between them is simple. The complexity lives inside each agent’s reasoning, not in the orchestration graph.

When I got a multi-agent team working from a chat interface (project lead, backend dev, frontend dev, running in parallel with a shared filesystem), the coordination wasn’t a complex message bus. It was each agent running its own ReAct loop, with a simple skill file defining responsibilities. The interaction pattern was straightforward. The intelligence was in the agents.

When Coordination Complexity Is Genuinely Warranted

The simple loop isn’t the answer to everything. The thesis here isn’t “ReAct always wins.” It’s that ReAct with modern LLMs is a qualitatively different thing than ReAct with 2022-era models. The production failures people cite usually come from one of three places: less capable models that couldn’t reliably hold a plan or recover from errors, missing infrastructure like state persistence and observability, or hard coordination problems where the interaction pattern between agents is the primary challenge.

Think of it like team topologies in software organizations. A single empowered team can ship remarkable things with a clear mission and the right tools. But when you need multiple teams working on the same system, crossing security boundaries, managing shared data models, coordinating across compliance domains, the interaction patterns between teams become a first-class design problem.

The same applies to agents. When tasks require parallelism across independent domains, when context windows fill up from sequential work, when different phases cross security or compliance boundaries, that’s when multi-agent coordination becomes a real design problem. Not because the individual agents need to be more complex, but because how they work together matters. The right response isn’t “add more orchestration.” It’s “use a capable model, give it good tools, and add coordination complexity only when the problem itself demands it.”

Simple Architecture, Skilled Implementation

So if the architecture is this simple, does that mean building effective AI systems is easy? Not remotely. The simplicity of the ReAct loop shifts the challenge, but it doesn’t eliminate it.

The architecture of these agentic systems is the easy part, and it’s getting easier every month. The hard part is everything around the loop that makes a specific agent effective and safe. You need to provide the right context so the agent can reason well, scope your system with the right permissions and integrations so it can act safely, and build the right harnesses so you get reliable behavior and clear observability when something goes wrong.

This is where the bottleneck is actually moving as the tooling matures. A year or two ago the conversation was about how to chain agents together and build complex orchestration layers, but that problem is largely solved now. The real challenge has shifted to the ecosystem around the agent. How mature is your security posture? Do you have strong controls around your MCP servers and tool integrations? Do you understand what an AI system should and shouldn’t be able to access in your environment? These are fundamentally data governance and security questions, not AI architecture questions, and most organizations aren’t as far along on them as they think.

The teams getting the best results aren’t the ones who built the most sophisticated agent workflows. They’re the ones who deeply understand their domain and have invested in their data and security ecosystem, so that when they do hand an agent the right tools, it operates within boundaries they trust. At SEP, this is exactly the kind of work we do: helping organizations design the context, the security boundaries, and the agent architecture that make AI systems effective and safe for their specific domain.

What This Means for Your Organization

If you’re evaluating AI tools, it’s worth understanding that most of them have converged on this same basic pattern. The differentiation isn’t in the architecture anymore, it’s in what surrounds the loop. The tools the agent can access, the context it reasons over, the guardrails that keep it operating safely, and how well the whole thing integrates with your existing systems and security posture. That’s where you should be asking hard questions, because two products built on the same agentic system can produce wildly different results depending on how well those pieces are designed.

Up Next

This is the first post in a series covering recent advancements in AI and how AI impacts your enterprise’s approach to data governance. In Post 2, we’ll look at what steps you should be taking to prepare for AI tooling and why your data ecosystem is more critical than ever.

AI agents are ready.
Is your foundation ready for them?

SEP builds data platforms and AI systems for organizations where getting this right matters. Start a conversation about what agent architecture looks like for your domain.
