Why Durable Execution?
Traditional agents run in memory. If your process crashes during any of the following, all in-progress work is lost:
- A long-running LLM call
- An external API request in a tool
- A multi-step conversation with many tool calls
Durable execution addresses this with:
- Checkpoint Every Step: Each LLM call and tool execution is automatically saved
- Automatic Recovery: After a crash, the agent resumes from the last checkpoint
- Exactly-Once Guarantees: Completed tool executions are recorded and are not repeated, even after retries
- Long-Running Workflows: Agents can run for hours or days without losing progress to a crash
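The checkpoint-and-resume behavior listed above can be illustrated with a small conceptual sketch. This is not Restate's implementation or this library's API: the `Journal` type and its `Run` method are hypothetical names, and the journal lives in memory here, whereas a durable runtime would persist it outside the process. It only shows why a recorded step is not executed a second time on retry.

```go
package main

import "fmt"

// Journal is a hypothetical, in-memory stand-in for the durable log kept by a
// durable execution runtime. A real runtime stores it outside the process.
type Journal struct {
	results map[string]string
}

// NewJournal returns an empty journal.
func NewJournal() *Journal {
	return &Journal{results: make(map[string]string)}
}

// Run executes step only if no result is recorded under key.
// On a retry, the recorded result is returned and step is skipped.
func (j *Journal) Run(key string, step func() (string, error)) (string, error) {
	if res, ok := j.results[key]; ok {
		return res, nil // replay: reuse the checkpointed result
	}
	res, err := step()
	if err != nil {
		return "", err // failed steps are not recorded, so they can be retried
	}
	j.results[key] = res // checkpoint the completed step
	return res, nil
}

func main() {
	j := NewJournal()
	calls := 0
	tool := func() (string, error) {
		calls++
		return "weather: sunny", nil
	}

	// First attempt executes the tool and records its result.
	first, _ := j.Run("tool-call-1", tool)
	// Simulated retry after a crash: same journal, same step key.
	second, _ := j.Run("tool-call-1", tool)

	fmt.Println(first, second, calls)
}
```

The step key plays the role of the step's position in the workflow: as long as a retry presents the same key, the tool's side effect happens once.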
Prerequisites
- Restate Server: You need a running Restate server. Follow the Restate installation guide to set one up.
Creating a Durable Agent
To create a durable agent, use NewRestateAgent() instead of NewAgent().
Deployment Options
There are two ways to deploy durable agents:
Option 1: Single Process (Development/Testing)
Run both the Restate service handler and the application code in the same process. This is simpler but less resilient since both components share the same process lifecycle.
- The Restate service runs on http://localhost:9080
- Your application server runs on http://localhost:8070
- Both are in the same process
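The single-process layout can be sketched with the standard library alone. This is a minimal illustration under assumptions: `newServers` is a hypothetical helper, and both handlers are placeholders standing in for the real Restate service handler and application routes, whose APIs are not shown here. Only the two ports come from the list above.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// newServers builds two HTTP servers that will share one process.
// The handlers are placeholders, not the real Restate or application handlers.
func newServers() (restateSrv, appSrv *http.Server) {
	restateMux := http.NewServeMux()
	restateMux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "placeholder for the Restate service handler")
	})

	appMux := http.NewServeMux()
	appMux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "placeholder for the application server")
	})

	restateSrv = &http.Server{Addr: ":9080", Handler: restateMux}
	appSrv = &http.Server{Addr: ":8070", Handler: appMux}
	return restateSrv, appSrv
}

func main() {
	restateSrv, appSrv := newServers()

	// The Restate-facing listener runs in a goroutine of the same process.
	go func() {
		if err := restateSrv.ListenAndServe(); err != nil {
			log.Fatalf("restate service listener: %v", err)
		}
	}()

	// The application server blocks the main goroutine.
	log.Fatal(appSrv.ListenAndServe())
}
```

Because both listeners live in one process, a crash takes both down together, which is the resilience trade-off described above.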
Option 2: Separate Processes (Production)
For production deployments, run the Restate service and application in separate processes. This provides true fault isolation: if your application crashes, the Restate service continues running and can recover the workflow.
Application Process
Restate Service Process
Running the Services
Registering the Deployment with Restate Server
After starting your Restate service (on port 9081), you must register it with the Restate server so it can discover and invoke your agent workflows.
Using the Restate CLI
Using the Admin API
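As a sketch of registration over HTTP, the program below posts the service URI to the Restate server. It assumes the Admin API listens on port 9070 (the same port as the UI) and accepts a `POST /deployments` request with a JSON body containing a `uri` field; check both against your Restate version. `registrationBody` and `registerDeployment` are hypothetical helper names.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// registerRequest is the assumed JSON payload for deployment registration.
type registerRequest struct {
	URI string `json:"uri"`
}

// registrationBody encodes the service URI as the JSON request body.
func registrationBody(serviceURI string) ([]byte, error) {
	return json.Marshal(registerRequest{URI: serviceURI})
}

// registerDeployment posts the service URI to the admin endpoint.
// adminURL is assumed to be the Restate server's admin address.
func registerDeployment(adminURL, serviceURI string) error {
	body, err := registrationBody(serviceURI)
	if err != nil {
		return err
	}
	resp, err := http.Post(adminURL+"/deployments", "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("registration failed: %s", resp.Status)
	}
	return nil
}

func main() {
	// 9081 is the Restate service port from this guide; 9070 is an assumption.
	if err := registerDeployment("http://localhost:9070", "http://localhost:9081"); err != nil {
		log.Printf("could not register deployment: %v", err)
		return
	}
	log.Println("deployment registered")
}
```

A helper like this can also run at service startup or as a pipeline step, so that registration is not a manual task.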
Using the Restate UI
Navigate to your Restate server’s UI (typically http://localhost:9070) and register the deployment through the web interface.
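Note: If Restate runs in Docker, replace localhost with host.docker.internal in the deployment URI you register, because localhost inside the container refers to the container itself rather than to your host machine.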
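The host substitution can be expressed as a small helper that rewrites the deployment URI before it is registered. This is an illustrative sketch: `dockerReachableURI` is a hypothetical name, and it only handles the `localhost` and `127.0.0.1` cases.

```go
package main

import (
	"fmt"
	"net/url"
)

// dockerReachableURI rewrites a localhost URI so that a Restate server
// running in Docker can reach a service on the host machine.
// URIs with any other host are returned unchanged.
func dockerReachableURI(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if h := u.Hostname(); h == "localhost" || h == "127.0.0.1" {
		host := "host.docker.internal"
		if port := u.Port(); port != "" {
			host += ":" + port // keep the original port
		}
		u.Host = host
	}
	return u.String(), nil
}

func main() {
	uri, err := dockerReachableURI("http://localhost:9081")
	if err != nil {
		fmt.Println("invalid URI:", err)
		return
	}
	fmt.Println(uri)
}
```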
Automatic Registration
For production environments, consider automating deployment registration:
- Kubernetes: Use the Restate Kubernetes Operator for automatic registration and lifecycle management
- CI/CD Pipeline: Add deployment registration as a step in your CI/CD pipeline
- FaaS Platforms: AWS Lambda, Vercel, and other FaaS platforms automatically handle versioning through version-specific ARNs/URLs
Example: Complete Durable Agent
See the complete working example in the repository: examples/13_durable_agent/main.go