The Problem

For LLM researchers, setting up an LLM training or reinforcement learning environment for real-world tool use is complex and painful:
  • Managing different environments and test accounts
  • Implementing MCP servers and handling various authentication issues
  • Initializing realistic data
  • Resetting state between runs
  • Ensuring isolation across concurrent sessions

The Solution

Klavis MCP Sandbox as a Service solves these challenges. In addition to letting your model interact with our comprehensive MCP server ecosystem, you can use our sandbox infrastructure to dump and reset the data of any run, even when many runs execute concurrently.
Our sandbox infrastructure is horizontally scalable, so it can handle as many concurrent sessions as you need.

Lifecycle

1. Create

Request a sandbox for the external services you need (Snowflake, Gmail, CRM, ERP, etc.) and get an MCP server URL for that isolated instance.

2. Initialize (seed)

Load a deterministic “world state” in JSON format. We handle the rest: creating databases, setting up CRM data, ERP systems, and more.

3. Interact (MCP)

Let your LLM / AI agent use MCP tools against the sandbox as if it were the real app. You can use multiple MCP servers with many tools simultaneously.

4. Dump (verify)

Snapshot the full sandbox state and programmatically compare it against your ground truth to determine whether your LLM completed the task correctly.

5. Reset / Delete

Wipe the sandbox back to a clean slate and kick off the next run.

Resources