Architecture
Request Flow
Clients access the platform through KrakenD, an API gateway that maps friendly REST paths to internal Restate services. Every call is a POST to /api/v1/<domain>/<service>/<method> with a JSON (protojson) request body.
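As a minimal sketch of that path convention, the snippet below builds (but does not send) such a request using only the Python standard library. The base URL, the path segments, and the `pageSize` field are placeholders chosen for illustration; none of them come from the platform itself.

```python
import json
import urllib.request

# Hypothetical base URL; the real gateway host is not given in this document.
BASE_URL = "https://api.example.com"


def build_request(domain: str, service: str, method: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/<domain>/<service>/<method>."""
    url = f"{BASE_URL}/api/v1/{domain}/{service}/{method}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Placeholder path segments and body field, used only to show the shape of a call.
req = build_request("llm", "ConversationManager", "ListConversations", {"pageSize": 10})
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. The sketch stops short of that because the real host and any required credentials are not described here.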
Gateway Routing
Component Types
Gateways
Stateless Restate services that extract auth context from request headers (x-user-id, x-user-email) and route to downstream actors, services, or workflows via the Restate SDK.
| Gateway | Domain | Downstream |
|---|---|---|
| LLM Gateway | Conversations | ConversationManagerActor, MCPService, MemoryService |
| Socayo Gateway | Health Coaching | SocayoUserActor |
| Storage Gateway | Storage | StorageManagerActor |
| Integrations Gateway | Integrations | PipedreamService |
| Webhook Gateway | Webhooks | WebhookService |
| Notification Gateway | Notifications | NotificationService → Novu REST API |
| Auth Gateway | Auth | ApiKeyService |
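The header extraction that every gateway performs can be sketched roughly as follows. This is an illustration in plain Python, not the platform's gateway code: the `AuthContext` type and the `extract_auth_context` helper are hypothetical names, and rejecting a request that lacks `x-user-id` is an assumption.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AuthContext:
    """Hypothetical container for the identity a gateway forwards downstream."""
    user_id: str
    user_email: str


def extract_auth_context(headers: Mapping[str, str]) -> AuthContext:
    """Read x-user-id and x-user-email, matching header names case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    user_id = lowered.get("x-user-id", "").strip()
    if not user_id:
        # Assumption: a missing user ID is rejected; the source does not specify this.
        raise PermissionError("missing x-user-id header")
    user_email = lowered.get("x-user-email", "").strip()
    return AuthContext(user_id=user_id, user_email=user_email)


ctx = extract_auth_context({"X-User-Id": "user-123", "X-User-Email": "ada@example.com"})
print(ctx)
```

A real gateway would then pass the resulting context to the downstream actor, service, or workflow through the Restate SDK, which this sketch deliberately leaves out.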
Actors
Stateful components keyed by ID. Each actor instance persists its own state using Restate's durable state.
| Actor | Key | Purpose |
|---|---|---|
| ConversationActor | conversation_key | Individual conversation state and message history |
| ConversationManagerActor | user_id | Per-user conversation management (list, create, delete) |
| SocayoUserActor | user_id | User profile and onboarding data |
| StorageManagerActor | user_id | Per-user file storage operations and quota |
| FirebaseBridgeActor | user_id | Firebase Auth token bridge |
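To illustrate what "keyed by ID" means in practice, here is a toy in-memory model in which each actor key owns an isolated state dictionary. It is only a stand-in: nothing is persisted, no Restate API is involved, and the `InMemoryActorState` class and `add_conversation` handler are hypothetical.

```python
from collections import defaultdict


class InMemoryActorState:
    """Toy stand-in for per-key actor state; nothing here is durable."""

    def __init__(self):
        # Outer key: the actor key (for example a user_id).
        # Inner dict: the state belonging to that one actor instance.
        self._state = defaultdict(dict)

    def get(self, actor_key, name, default=None):
        return self._state[actor_key].get(name, default)

    def set(self, actor_key, name, value):
        self._state[actor_key][name] = value


def add_conversation(state, user_id, conversation_key):
    """Hypothetical handler: append a conversation key to one user's list."""
    conversations = list(state.get(user_id, "conversations", []))
    conversations.append(conversation_key)
    state.set(user_id, "conversations", conversations)
    return conversations


state = InMemoryActorState()
add_conversation(state, "user-a", "conv-1")
add_conversation(state, "user-a", "conv-2")
add_conversation(state, "user-b", "conv-9")
print(state.get("user-a", "conversations"))
```

Writes made under one key never touch another key's state, which is the property the actor table relies on when it lists `user_id` or `conversation_key` as the key.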
Services
Stateless processors that handle specific integrations or computations.
- OpenRouter — LLM inference via OpenRouter API
- MCP — Model Context Protocol tool server management
- Pinecone — Vector database operations
- Memory — Mem0 memory search and management
- Storage — Object storage backend (S3-compatible)
- OpenFGA — Fine-grained authorization checks
- Pipedream — Third-party app integrations
- NotificationService — Multi-channel notification delivery via Novu
- Pipecat / Daily / Cerebrium — Voice AI pipeline
Workflows
Long-running orchestrations that coordinate multiple services.
| Workflow | Purpose |
|---|---|
| SocayoPlanGeneratorWorkflow | Generate coaching plans from user profiles |
| GenerationWorkflow | LLM-based content generation |
| DeploymentWorkflow | Service deployment orchestration |
Error Handling
All services use a standardized RpcError structure for error responses and event payloads. Each error is classified as terminal (non-retryable, returned to the client) or retryable (Restate retries it automatically). A terminal error looks like this:
{
  "error": {
    "code": "MODEL_INVALID",
    "message": "The requested model is not available",
    "isTerminal": true,
    "details": {
      "upstreamError": { "statusCode": 404, "body": "..." },
      "helpLink": { "url": "https://docs.yocaso.dev/guides/model-routing" }
    }
  }
}
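A client could consume the envelope shown above along the following lines. This is a sketch under stated assumptions: the Python `RpcError` dataclass and the `parse_rpc_error` and `should_retry` helpers are hypothetical, and treating a missing `isTerminal` as retryable is an assumption (protojson typically omits fields that hold their default value, so `false` may not appear on the wire).

```python
import json
from dataclasses import dataclass, field


@dataclass
class RpcError:
    """Client-side mirror of the error envelope; not the platform's own type."""
    code: str
    message: str
    is_terminal: bool
    details: dict = field(default_factory=dict)


def parse_rpc_error(raw: str) -> RpcError:
    """Parse the JSON error envelope into an RpcError."""
    err = json.loads(raw)["error"]
    return RpcError(
        code=err["code"],
        message=err["message"],
        # Assumption: an absent isTerminal field means the error is retryable.
        is_terminal=bool(err.get("isTerminal", False)),
        details=err.get("details", {}),
    )


def should_retry(error: RpcError) -> bool:
    """Terminal errors are surfaced to the caller; everything else may be retried."""
    return not error.is_terminal


raw = json.dumps({
    "error": {
        "code": "MODEL_INVALID",
        "message": "The requested model is not available",
        "isTerminal": True,
        "details": {"upstreamError": {"statusCode": 404, "body": "..."}},
    }
})
parsed = parse_rpc_error(raw)
print(parsed.code, should_retry(parsed))
```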
The ErrorCode enum defines ~25 standardized codes (e.g., INVALID_ARGUMENT, UNAUTHENTICATED, RATE_LIMITED, MODEL_INVALID, TOOL_EXECUTION_FAILED) that map to HTTP status codes at the gateway boundary. Structured ErrorDetails can include upstream error context, model errors, retry info, field violations, and help links.
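The translation at the gateway boundary might look something like the lookup below. The code names come from the list above, but every status assignment is an assumption made for illustration, since the actual mapping is not given here. The `http_status_for` helper and the fallback to 500 for unknown codes are likewise hypothetical.

```python
from http import HTTPStatus

# Hypothetical mapping: the statuses are illustrative guesses, not the platform's table.
HTTP_STATUS_BY_CODE = {
    "INVALID_ARGUMENT": HTTPStatus.BAD_REQUEST,
    "UNAUTHENTICATED": HTTPStatus.UNAUTHORIZED,
    "RATE_LIMITED": HTTPStatus.TOO_MANY_REQUESTS,
    "MODEL_INVALID": HTTPStatus.BAD_REQUEST,
    "TOOL_EXECUTION_FAILED": HTTPStatus.BAD_GATEWAY,
}


def http_status_for(code: str) -> int:
    """Return the HTTP status for an error code, falling back to 500 for unknown codes."""
    return int(HTTP_STATUS_BY_CODE.get(code, HTTPStatus.INTERNAL_SERVER_ERROR))


print(http_status_for("RATE_LIMITED"))
```

Keeping the mapping in one table means a new error code only needs one added entry, and anything unmapped degrades to a generic server error rather than leaking an internal code path.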
Infrastructure
Deployment
Services are built with ko (Go) or Docker, pushed to GitHub Container Registry (GHCR), and deployed automatically via Flux image automation policies.