In software engineering, “Context” is usually something we developers manage. We query the database, format the JSON, and pass it to the function.
For the first half of this year, we built our AI products with this same mindset. We treated LLMs as powerful functions within a rigid Workflow.
But by November, we realized this was a dead end. We completely refactored our system to follow the Deep Agents pattern (inspired by Manus and the ReAct paper). This post details why we made that shift.
Phase 1: The “Workflow” Era (May - July)
Our early architecture (built on LangGraph) was technically a “Graph”, but conceptually a Fixed Workflow.
To generate a report, we hard-coded a sequence of nodes:
- Intent Node: Classify user request.
- Tool Selection Node: Hard-coded logic to pick data sources.
- Generator Node: Loop through chapters and generate text.
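To make the rigidity concrete, here is a minimal sketch of that fixed pipeline. This is illustrative toy code, not our production LangGraph graph; the node names match the list above, but the logic is a stand-in for the real LLM calls:

```python
# A toy sketch of the Phase-1 fixed pipeline. Every decision below is made
# by engineer-written code, not by the model.

def intent_node(state: dict) -> dict:
    # Classify the request with a fixed keyword rule (stand-in for an LLM call).
    state["intent"] = "report" if "report" in state["request"].lower() else "chat"
    return state

def tool_selection_node(state: dict) -> dict:
    # Hard-coded routing: the engineer, not the model, picks the data sources.
    state["sources"] = ["sales_db"] if state["intent"] == "report" else []
    return state

def generator_node(state: dict) -> dict:
    # Loop through a fixed list and "generate" a chapter per source.
    state["chapters"] = [f"Chapter on {s}" for s in state["sources"]]
    return state

def run_pipeline(request: str) -> dict:
    state = {"request": request}
    for node in (intent_node, tool_selection_node, generator_node):
        state = node(state)  # rigid sequence: no node can loop back or retry
    return state
```

Note the shape of the problem: if `tool_selection_node` picks the wrong source, nothing downstream can recover, because the sequence itself is frozen.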
The Problems: No Agency & State Bloat
This rigid structure created three major engineering headaches:
- No Agency (Manual Tool Calling): We weren’t exposing tools to the model. We were writing Python code to invoke tools for the model. This stripped the LLM of its reasoning capability: it couldn’t decide to “search again” if the first result was bad.
- State Bloat: To share context between these disparate nodes, we had to maintain a massive, monolithic `GraphState`. Every variable needed by any node had to be carried globally, making the state object enormous and debugging a nightmare.
- The Subgraph Streaming Trap: When we tried to break the monolith into subgraphs, we hit a wall: streaming events from a subgraph couldn’t bubble up to the parent graph in real time. The frontend UI was “blind” while a sub-agent was working, destroying the user experience.
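The state-bloat problem is easiest to see in code. Below is a hypothetical shape of such a monolithic state object (the field names are invented for illustration, not our actual schema): every node's inputs and outputs live in one global structure, so a change to any node forces a schema change for all of them.

```python
from typing import TypedDict

# Hypothetical monolithic graph state (field names are illustrative).
# Each field is private to one or two nodes, yet all nodes carry all of it.
class GraphState(TypedDict, total=False):
    request: str            # read by the intent node
    intent: str             # produced by intent, read by tool selection
    selected_sources: list  # produced by tool selection
    raw_rows: list          # fetched data, still carried after it's consumed
    chapter_outline: list   # generator bookkeeping
    draft_chapters: dict    # partial outputs
    final_report: str       # terminal output
    debug_trace: list       # grows unbounded as every node appends to it
```

With a dozen nodes this object balloons, and debugging means inspecting the whole blob to find which node last touched which field.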
Essentially, we were trying to pre-compute context for a dynamic problem. We were “pushing” context to the AI, hoping we guessed right.
Phase 2: The Interpretation Gap (August - October)
In late July, Manus released a blog post that deeply inspired us. They argued that Context Engineering is about building an environment, not just prompt tuning.
However, in our initial attempt to apply this, we focused too much on the “Management” aspect and missed the “Agency”. We spent months implementing rigid “Todo Lists” and “Explicit Memories” to force the Agent to stay on track. We were micromanaging the AI, trying to manually curate what it should “remember” at every step.
During this period, I was also pulled away to support customer POCs, using Claude Code to process complex reports. We also experimented with Multi-Agent patterns (Red-Blue Debate, Delphi Consensus). While these experiments were valuable, they felt fragmented. We were building sophisticated scaffolds around the model, but we weren’t letting the model drive the scaffold.
We were still trapped in the mindset: “We need to organize the information for the AI.”
Phase 3: The Deep Agent Revolution (November)
In late October, LangChain’s release of the Deep Research (DeepAgents) template was a turning point. It didn’t just give us code; it validated the philosophy Manus had hinted at: Trust the Loop.
We realized that true Context Engineering isn’t about us feeding data to the Agent. It’s about giving the Agent the tools (File System, Sandbox) to construct its own context.
In November, we pivoted. We deleted the complex workflow. We replaced it with a simple Agent Loop (Model -> Tool -> Model).
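The entire loop fits in a few lines. This is a minimal sketch of the Model -> Tool -> Model control flow, not our production code: `model` here is any callable that returns either a tool call or a final answer, standing in for a real LLM client.

```python
# Minimal agent loop sketch: the model, not the engineer, decides each step.

def agent_loop(model, tools: dict, task: str, max_steps: int = 10):
    history = [("task", task)]
    for _ in range(max_steps):
        action = model(history)               # model chooses the next action
        if action["type"] == "final":
            return action["answer"]
        tool = tools[action["tool"]]          # model-selected tool, not hard-coded
        result = tool(*action.get("args", []))
        history.append((action["tool"], result))  # observation re-enters context
    raise RuntimeError("step budget exhausted")
```

Everything we used to encode as graph routing (retry, search again, pick a different source) now emerges from the model emitting another tool call inside this loop.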
The Paradigm Shift: “Context Pull”
Instead of us preparing the context, we empowered the Agent to pull it.
- Old Way: Engineer writes code to list files -> Engineer feeds content to LLM.
- New Way (AI Native): Give the Agent `ls`, `grep`, and `read_file`. The Agent thinks “I need to see the data” -> calls `ls`. The Agent thinks “File A is relevant” -> calls `read_file`.
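These three tools are thin wrappers over the file system. Here is a hedged sketch of what we expose (a real deployment adds sandboxing, path allow-lists, and stricter output limits; the truncation limit below is an illustrative value):

```python
import os
import re

def ls(path: str = ".") -> list:
    # Let the agent discover what exists before deciding what to read.
    return sorted(os.listdir(path))

def grep(pattern: str, path: str) -> list:
    # Return (line_number, line) pairs matching a regex, so the agent can
    # locate relevant content without loading whole files into context.
    rx = re.compile(pattern)
    with open(path, encoding="utf-8") as f:
        return [(i, line.rstrip("\n"))
                for i, line in enumerate(f, start=1) if rx.search(line)]

def read_file(path: str, max_chars: int = 4000) -> str:
    # Truncate so a single large file can't flood the context window.
    with open(path, encoding="utf-8") as f:
        return f.read()[:max_chars]
```

The key design choice is that outputs are small and composable: `ls` and `grep` return pointers, and the agent decides which pointers are worth a `read_file`.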
Why “Deep Agents” Work Better
- Self-Correction: If the Agent reads a CSV and finds the format is messy, it can autonomously decide to write a Python script to clean it. In our old workflow, this would have required a code change by us.
- Precision: The Agent builds its context iteratively. It searches, reads, filters, and searches again. This dynamic context construction is far superior to any static RAG retrieval we could engineer.
- Simplicity: Our `graph.py` shrank from hundreds of lines of routing logic to a clean set of Tool definitions and a standard ReAct node.
Conclusion
We learned the hard way that Context Engineering is not about data preparation; it is about Agent Agency.
The job of the AI Engineer is not to feed the model, but to build a robust Environment (Sandbox, File System, Search Tools) where the model can feed itself.