My Journey with MCP: From RPC Wrapper to Native Agent Integration

We actually started exploring MCP (Model Context Protocol) way back in May, long before our major refactor in November. Looking back, our understanding of it has evolved drastically.

Phase 1: The “RPC” Misunderstanding (May)

In the beginning, we treated MCP just like a fancy RPC (Remote Procedure Call). We weren’t even using frameworks like LangGraph yet.

Our workflow was crude:

  1. Ask LLM to generate parameters based on a schema.
  2. Manual Code: Catch the output, parse JSON.
  3. Manual Code: Invoke the MCP tool (using a simple client wrapper).
  4. Manual Code: Get the result and feed it back to the LLM.

It worked, but it defeated the purpose. We were writing glue code for every single tool. It felt like we were just adding an extra layer of complexity over a standard HTTP API.

# The "Old Way" (May) - Manual Glue
response = llm.invoke(prompt)
if "call_tool" in response:
    # We were manually parsing and invoking...
    tool_name = parse_tool_name(response)
    args = parse_args(response)

    # Treating MCP client just like a requests.post() wrapper
    result = mcp_client.call_tool(tool_name, args)

    # Manually constructing the next prompt
    next_prompt = f"Tool result: {result}. Now continue."

Phase 2: Understanding the Transport (Stdio vs. SSE)

As we dug deeper (I highly recommend reading the technical deep-dive “This is MCP”), we started to understand the underlying mechanics.

Initially, I was confused by the npx and python commands in the MCP configuration. “Why do I need to run a command locally?”

It turns out, the default mode uses Stdio (Standard Input/Output). The host process (Agent) spawns the tool server as a subprocess and talks to it via stdin/stdout.
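
To make that concrete, here is a minimal client-side sketch using the official MCP Python SDK; my_server.py is a hypothetical local tool server, and the host spawns it as a subprocess and talks to it over stdin/stdout.

# Stdio transport sketch (official "mcp" Python SDK; my_server.py is hypothetical)
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # The host spawns the tool server as a subprocess...
    server = StdioServerParameters(command="python", args=["my_server.py"])

    # ...and talks to it over the child's stdin/stdout pipes.
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])

asyncio.run(main())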

The Stdio Limitation

This explains why we couldn’t scale initially. Stdio is great for local development (like running an Agent on your laptop that talks to a local SQLite DB), but it fails in a server environment:

  1. No Concurrency: One process, one connection.
  2. Resource Heavy: Spawning a new Python process for every user request is a disaster.

Switching to HTTP SSE

We eventually standardized on HTTP SSE (Server-Sent Events) for all our internal MCP servers. This lets a single long-running server handle thousands of concurrent Agent connections in a stateless, efficient way.
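
On the server side, the change was mostly a transport flag. Here is a rough sketch using the SDK's FastMCP helper; the tool body is a placeholder, not our real crawler logic.

# One tool server, two transports (sketch using the MCP SDK's FastMCP helper)
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("internal-tools")

@mcp.tool()
def search_posts(keyword: str) -> str:
    """Placeholder tool: pretend this queries our data store."""
    return f"results for {keyword}"

if __name__ == "__main__":
    # Local development: spawned by the host, speaks over stdin/stdout.
    # mcp.run(transport="stdio")

    # Server deployment: one long-running process serving many Agents over HTTP SSE.
    mcp.run(transport="sse")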

Phase 3: The Long-Running Task Challenge (November)

By November, we had migrated to LangGraph. We wanted to fully leverage create_react_agent and let the model drive the interaction natively.
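
Compared with the May glue code, the tool-calling loop now disappears from our code entirely. A rough sketch, assuming the langchain-mcp-adapters package (the exact client API varies by version) and a hypothetical internal SSE endpoint:

# The "New Way" (November) - native tool calling via LangGraph (sketch)
import asyncio

from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

async def main():
    # The SSE URL below is a hypothetical internal endpoint.
    client = MultiServerMCPClient({
        "collector": {"url": "http://mcp-collector.internal:8000/sse", "transport": "sse"},
    })
    tools = await client.get_tools()

    # The model drives tool selection, invocation, and result handling itself;
    # no hand-written parsing or prompt stitching.
    agent = create_react_agent(ChatOpenAI(model="gpt-4o"), tools)
    await agent.ainvoke({"messages": [("user", "Collect 100 posts about AI")]})

asyncio.run(main())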

But we hit a UX wall: Progress Reporting.

When an Agent calls a tool like start_collection_job (e.g., crawling 100 posts from a social media platform), it might take several minutes.

The Solution: Background Jobs + Frontend Polling

We realized that for long tasks—especially data collection and scraping—the MCP tool shouldn’t “do” the work synchronously. It should “dispatch” the work to a specialized crawler service.

  1. Agent: Calls start_collection_job(platform="xhs", keyword="AI").
  2. MCP Tool: Immediately returns {"job_id": "job_99", "status": "processing"}.
  3. Agent: Receives this instantly and tells the user: “I’ve started the collection task. You can see the progress in the panel.”
  4. Frontend: Sees the job_id in the tool output and starts polling our background worker API to show a real-time progress bar (e.g., “45/100 items collected”).
(Sequence diagram: the Agent calls start_collection on the MCP Server, which calls enqueue_collection_task() on the Crawler Worker and returns {job_id: 99}; the Agent tells the user “Collection started!”, while the Frontend polls the Crawler Worker every 2 seconds (“Check Status (job_99)”, “45/100 items...”) until the task completes and the data is ready in the DB.)
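
On the MCP server side, the dispatching tool stays deliberately thin. A minimal sketch (FastMCP again; the queue helper here is a stub standing in for our real crawler service):

# Dispatch-only tool: enqueue the job and return a handle immediately (sketch)
import uuid

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("collector")

def enqueue_collection_task(platform: str, keyword: str) -> str:
    """Stub: in the real setup this pushes a job onto the crawler service's queue."""
    return f"job_{uuid.uuid4().hex[:8]}"

@mcp.tool()
def start_collection_job(platform: str, keyword: str) -> dict:
    """Kick off a crawl job and return a job_id the frontend can poll."""
    job_id = enqueue_collection_task(platform=platform, keyword=keyword)

    # No blocking here: the Agent gets its result within milliseconds,
    # while the worker does the heavy lifting in the background.
    return {"job_id": job_id, "status": "processing"}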

This split architecture allows us to keep the Agent loop responsive while handling heavy lifting asynchronously.

Conclusion

The journey with MCP was ultimately a journey of understanding protocol mechanics: once we stopped treating it as a fancy RPC wrapper and started paying attention to the transport and the agent loop, the pieces fell into place.
