The Evolution of Agent Sandbox: From Docker to Cloud Services

In April, Manus made waves in the industry by demonstrating how an Agent could autonomously write and execute code to solve complex tasks. Since then, a secure Sandbox (Code Interpreter) has become a standard configuration for any serious Agent system.

Industry giants followed suit: Alibaba launched Agent Bay (Computer/Mobile/Browser use), and specialized vendors like E2B and PPIO emerged.

This post documents our journey in selecting and implementing a sandbox architecture, from self-hosting exploration to cloud service adoption.

Phase 1: The Self-Hosting Exploration (Docker + NFS)

When we first started, the natural instinct was to build it ourselves using open-source frameworks like agent-infra/aio.

Our proposed architecture was standard: an all-in-one sandbox image run as a Docker container per session, with an NFS share mounted into each container so the Agent server and the sandbox could exchange files.
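We never built this, but roughly it would have looked like the sketch below, written with the Docker SDK for Python. The image name, NFS path, and resource limits are placeholders rather than a design we actually shipped.

```python
# Sketch only: one disposable container per session, with that session's
# NFS-backed workspace mounted in. Image, paths, and limits are illustrative.
import docker

client = docker.from_env()

def run_in_sandbox(thread_id: str, command: str) -> str:
    # Directory shared via NFS between the Agent server and the sandbox host.
    workspace = f"/mnt/nfs/agent-workspaces/{thread_id}"
    container = client.containers.run(
        image="agent-sandbox:latest",          # hypothetical all-in-one image
        command=["bash", "-lc", command],
        volumes={workspace: {"bind": "/workspace", "mode": "rw"}},
        working_dir="/workspace",
        network_disabled=False,                # pip install needs the network
        mem_limit="1g",
        detach=True,
    )
    container.wait()                           # block until the command finishes
    logs = container.logs().decode()
    container.remove(force=True)
    return logs
```

Even this toy version hints at the real work: image updates, warm pools, cleanup, and quota enforcement all live on your side of the fence.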

Why We Moved On

We didn’t actually implement this in production. During the exploration phase, we realized the Ops Burden would be massive:

  1. Complexity: Synchronizing files via NFS across a distributed system while maintaining strict permissions is a maintenance nightmare.
  2. Overhead: Managing Docker images (installing new libraries on the fly), handling auto-scaling, and maintaining session state (warm-up/cleanup) was too much work for a small team. We wanted to focus on Agent reasoning, not Kubernetes plumbing.

Phase 2: Evaluating Cloud Sandboxes

By November, we realized: Why reinvent the wheel? We started evaluating managed Sandbox-as-a-Service providers.

The Contenders

  1. E2B:
    • Pros: Industry standard, high community visibility, generous free tier ($100).
    • Cons: Network latency for users in China was unacceptable.
  2. Novita AI:
    • Pros: Stable.
    • Cons: Requires upfront payment (no free trial).
  3. PPIO Sandbox:
    • Pros: Optimized for China, supports persistent file systems, very similar API to Novita.
    • Decision: We chose PPIO for its domestic network performance and compatibility with our file system needs.

Phase 3: The Storage Breakthrough (OSS Integration)

Using a cloud sandbox introduced a new challenge: File Sharing. If the Agent generates a chart inside the cloud sandbox, how do we get it out? And how does the Agent edit a file stored in our cloud?

We solved this by bridging the Agent and Sandbox file systems via OSS (Object Storage Service).

1. The “OSS Filesystem Backend” (Server-side)

We rewrote the backend for our Filesystem Middleware. Instead of operating on the local server disk, we implemented a custom backend that translates standard file operations into OSS API calls.

This gives the Agent “Built-in Tools” to manipulate files in the cloud as if they were local.
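Our middleware's interface is internal, but the idea is easy to sketch with Alibaba Cloud's oss2 SDK. The OSSFilesystemBackend class, its method names, and the prefix layout below are illustrative, not our exact code:

```python
# Sketch of an OSS-backed filesystem backend: standard file operations are
# translated into OSS API calls. Class and prefix layout are illustrative.
import oss2

class OSSFilesystemBackend:
    def __init__(self, access_key_id: str, access_key_secret: str,
                 endpoint: str, bucket_name: str, prefix: str = "workspaces/"):
        auth = oss2.Auth(access_key_id, access_key_secret)
        self.bucket = oss2.Bucket(auth, endpoint, bucket_name)
        self.prefix = prefix

    def _key(self, path: str) -> str:
        # "/thread-123/report.md" -> "workspaces/thread-123/report.md"
        return self.prefix + path.lstrip("/")

    def write_file(self, path: str, data: bytes) -> None:
        self.bucket.put_object(self._key(path), data)

    def read_file(self, path: str) -> bytes:
        return self.bucket.get_object(self._key(path)).read()

    def list_dir(self, path: str) -> list[str]:
        prefix = self._key(path).rstrip("/") + "/"
        return [obj.key[len(prefix):]
                for obj in oss2.ObjectIterator(self.bucket, prefix=prefix, delimiter="/")]

    def delete_file(self, path: str) -> None:
        self.bucket.delete_object(self._key(path))
```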

2. OSSFS Mounting (Sandbox-side)

Inside the PPIO sandbox, we use ossfs (FUSE) to mount the same bucket to a local directory.
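Concretely, the sandbox's startup step runs something like the following. The bucket, endpoint, and mount point are placeholders; the credential file follows ossfs's standard bucket:key_id:key_secret format.

```python
# Sketch of the sandbox-side mount step. Bucket, endpoint, and mount point are
# placeholders; allow_other requires user_allow_other in /etc/fuse.conf.
import os
import subprocess

def mount_workspace(bucket: str, key_id: str, key_secret: str,
                    endpoint: str, mount_point: str = "/mnt/workspace") -> None:
    # ossfs reads credentials from a passwd file: "<bucket>:<key_id>:<key_secret>"
    passwd_path = "/etc/passwd-ossfs"
    with open(passwd_path, "w") as f:
        f.write(f"{bucket}:{key_id}:{key_secret}\n")
    os.chmod(passwd_path, 0o600)

    os.makedirs(mount_point, exist_ok=True)
    subprocess.run(
        ["ossfs", bucket, mount_point,
         f"-ourl={endpoint}",
         f"-opasswd_file={passwd_path}",
         "-oallow_other"],
        check=True,
    )
```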

3. Path Consistency & Security

The trickiest part was ensuring the Agent “knows” where it is: a path the Agent touches through the built-in file tools and the same path inside the sandbox’s ossfs mount must resolve to the same OSS object, and each session must stay inside its own prefix.
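One way to keep those two views consistent is to derive every location from a single session-scoped prefix. The helper below is purely illustrative; the prefix layout, mount point, and class name are assumptions, not our exact scheme.

```python
# Hypothetical path mapping: one session prefix, two views of the same object.
from dataclasses import dataclass

@dataclass
class WorkspacePaths:
    thread_id: str
    oss_prefix: str = "workspaces/"        # prefix used by the OSS filesystem backend
    sandbox_mount: str = "/mnt/workspace"  # where ossfs mounts the bucket in the sandbox

    def oss_key(self, rel_path: str) -> str:
        # Object key as seen by the server-side OSS backend.
        return f"{self.oss_prefix}{self.thread_id}/{rel_path.lstrip('/')}"

    def sandbox_path(self, rel_path: str) -> str:
        # Absolute path the Agent's code should use inside the sandbox.
        return f"{self.sandbox_mount}/{self.oss_key(rel_path)}"

paths = WorkspacePaths(thread_id="thread-123")
print(paths.oss_key("charts/revenue.png"))       # workspaces/thread-123/charts/revenue.png
print(paths.sandbox_path("charts/revenue.png"))  # /mnt/workspace/workspaces/thread-123/charts/revenue.png
```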

This architecture allows the Agent to use its “own computer” to write code, generate documents, and create images, with all artifacts persistently stored in OSS.

Phase 4: The Concurrency Trap (Redis Locking)

As our system scaled, we encountered a strange bug: State Drift.

An Agent would issue two parallel tool calls:

  1. pip install numpy
  2. python analysis.py

Logic dictated that these should run in the same sandbox. However, due to async concurrency, both requests checked agent_state.sandbox_id, found it empty (or the previous sandbox expired), and simultaneously created two different sandboxes.

The Fix: Redis Distributed Lock

We couldn’t rely on the Agent’s memory state alone. We introduced a Redis Lock keyed by the thread_id.

The sequence is as follows: the Agent asks the Middleware for a sandbox for thread-123, and the Middleware issues SETNX lock:sandbox:thread-123 against Redis. If the lock is acquired, the Middleware checks for (or creates) a sandbox via the PPIO API, saves the sandbox ID, releases the lock, and returns the instance. If the lock is not acquired, the Middleware waits and polls until the existing sandbox ID appears, then returns that same instance.

This ensures that even if an Agent “thinks” in parallel, its “body” (the sandbox) remains a singleton resource for that session.
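In code, the lock is just a SET with NX and an expiry. Below is a minimal sketch using redis-py; the key names, TTLs, and the create_sandbox() placeholder are illustrative, and a production version would release the lock with a check-and-delete Lua script rather than the naive compare shown here.

```python
# Minimal sketch of the per-thread sandbox lock. Key names, TTLs, and
# create_sandbox() are illustrative placeholders.
import time
import uuid
import redis

r = redis.Redis()

def acquire_sandbox(thread_id: str, timeout: float = 30.0) -> str:
    lock_key = f"lock:sandbox:{thread_id}"
    state_key = f"sandbox:{thread_id}"
    token = str(uuid.uuid4())

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Fast path: a parallel tool call already created and registered the sandbox.
        sandbox_id = r.get(state_key)
        if sandbox_id:
            return sandbox_id.decode()

        # Try to take the lock: SET NX with an expiry so a crash cannot deadlock us.
        if r.set(lock_key, token, nx=True, ex=30):
            try:
                sandbox_id = create_sandbox()          # call the PPIO API (not shown)
                r.set(state_key, sandbox_id, ex=3600)  # remember the binding for this session
                return sandbox_id
            finally:
                # Release only our own lock (a Lua check-and-delete is safer).
                if r.get(lock_key) == token.encode():
                    r.delete(lock_key)

        time.sleep(0.2)  # lock held by the other call: wait and poll

    raise TimeoutError(f"could not acquire sandbox for {thread_id}")

def create_sandbox() -> str:
    raise NotImplementedError  # placeholder for the PPIO sandbox-creation call
```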

Phase 5: The “State Persistence” Trap

The “Pause” Strategy (Failed)

We initially wanted the Agent to “remember” everything indefinitely. We used the sandbox’s Pause/Resume feature to snapshot the container state after every run.

The “Keep-Alive” Strategy (Current)

We analyzed our logs and found that 99% of tasks (Data Analysis, Plotting) finish within minutes. Long-running state was rarely needed.

We switched to a simpler strategy: keep the sandbox alive only while the session is active and let it expire afterwards instead of pausing it. If a later task needs a sandbox again, we create a fresh one; the artifacts it needs are already persisted in OSS.
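The exact timers are not the interesting part, but the shape of the policy is easy to sketch by reusing the Redis binding from Phase 4. The TTL value, key names, and create_sandbox() below are illustrative only.

```python
# Sketch of a TTL-based keep-alive policy: the sandbox binding expires on its
# own after a period of inactivity. Values and key names are illustrative.
import redis

r = redis.Redis()
IDLE_TTL = 10 * 60  # keep a sandbox warm for 10 minutes of inactivity

def get_sandbox(thread_id: str) -> str:
    state_key = f"sandbox:{thread_id}"
    sandbox_id = r.get(state_key)
    if sandbox_id:
        r.expire(state_key, IDLE_TTL)        # activity: push the expiry out again
        return sandbox_id.decode()
    sandbox_id = create_sandbox()            # binding expired: start a fresh sandbox
    r.set(state_key, sandbox_id, ex=IDLE_TTL)
    return sandbox_id

def create_sandbox() -> str:
    raise NotImplementedError  # placeholder for the PPIO sandbox-creation call
```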

Conclusion

Building a Sandbox is not just about docker run. It’s about lifecycle management, file synchronization, and choosing the right provider for your target audience.

Moving to a managed service like PPIO saved us hundreds of engineering hours, allowing us to focus on the Agent’s reasoning logic rather than Kubernetes maintenance.
