The Claude Agent SDK differs from traditional stateless LLM APIs in that it maintains conversational state and executes commands in a persistent environment. This guide covers the architecture, hosting considerations, and best practices for deploying SDK-based agents in production.
For security hardening beyond basic sandboxing—including network controls, credential management, and isolation options—see Secure Deployment.
For security and isolation, the SDK should run inside a sandboxed container environment. This provides process isolation, resource limits, network control, and ephemeral filesystems.
The SDK also supports programmatic sandbox configuration for command execution.
Each SDK instance requires:
Runtime dependencies
npm install -g @anthropic-ai/claude-codeResource allocation
Network access
api.anthropic.comUnlike stateless API calls, the Claude Agent SDK operates as a long-running process that:
Several providers specialize in secure container environments for AI code execution:
For self-hosted options (Docker, gVisor, Firecracker) and detailed isolation configuration, see Isolation Technologies.
Create a new container for each user task, then destroy it when complete.
Best for one-off tasks, the user may still interact with the AI while the task is completing, but once completed the container is destroyed.
Examples:
Maintain persistent container instances for long running tasks. Often times running multiple Claude Agent processes inside of the container based on demand.
Best for proactive agents that take action without the users input, agents that serve content or agents that process high amounts of messages.
Examples:
Ephemeral containers that are hydrated with history and state, possibly from a database or from the SDK's session resumption features.
Best for containers with intermittent interaction from the user that kicks off work and spins down when the work is completed but can be continued.
Examples:
Run multiple Claude Agent SDK processes in one global container.
Best for agents that must collaborate closely together. This is likely the least popular pattern because you will have to prevent agents from overwriting each other.
Examples:
When hosting in containers, expose ports to communicate with your SDK instances. Your application can expose HTTP/WebSocket endpoints for external clients while the SDK runs internally within the container.
We have found that the dominant cost of serving agents is the tokens, containers vary based on what you provision but a minimum cost is roughly 5 cents per hour running.
This is likely provider dependent, different sandbox providers will let you set different criteria for idle timeouts after which a sandbox might spin down. You will want to tune this timeout based on how frequent you think user response might be.
The Claude Code CLI is versioned with semver, so any breaking changes will be versioned.
Since containers are just servers the same logging infrastructure you use for the backend will work for containers.
An agent session will not timeout, but we recommend setting a 'maxTurns' property to prevent Claude from getting stuck in a loop.