Error Handling
This guide explains the platform's standardized RPC error model and how to handle errors safely in clients.
All gateway failures now use common.v1.RpcError from apis/common/v1/errors.proto.
1) Error Response Shape
For full request failures, the gateway returns a non-2xx HTTP status and a top-level RpcError JSON body.
{
  "code": "ERROR_CODE_MODEL_INVALID",
  "message": "all candidate models were filtered out: [invalid/model-xyz: not_in_catalog]",
  "is_terminal": true,
  "details": {
    "error_info": {
      "reason": "ALL_MODELS_FILTERED",
      "domain": "openrouter",
      "metadata": {
        "conversation_key": "research-001"
      }
    },
    "model_error": {
      "model_id": "invalid/model-xyz",
      "reason": "invalid"
    }
  }
}
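Before branching on a failure body, a client may want to confirm that it really has the shape shown above, since a proxy in front of the gateway could return something else entirely. The following is a minimal sketch under that assumption; `RpcErrorBody`, `isRpcErrorBody`, and `parseRpcErrorBody` are hypothetical names used for illustration, not part of any published SDK.

```typescript
// Hypothetical shape: only the three top-level fields shown in the example
// are checked; `details` is left loosely typed.
interface RpcErrorBody {
  code: string;
  message: string;
  is_terminal: boolean;
  details?: Record<string, unknown>;
}

// Type guard: true only when code, message, and is_terminal have the expected types.
function isRpcErrorBody(value: unknown): value is RpcErrorBody {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.code === "string" &&
    typeof v.message === "string" &&
    typeof v.is_terminal === "boolean"
  );
}

// Returns the parsed error, or null when the text is not JSON or not an RpcError.
function parseRpcErrorBody(text: string): RpcErrorBody | null {
  try {
    const parsed: unknown = JSON.parse(text);
    return isRpcErrorBody(parsed) ? parsed : null;
  } catch {
    return null;
  }
}
```

A caller could read the response with `res.text()` and fall back to a generic error path when `parseRpcErrorBody` returns `null`.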
2) Error Codes Reference
Source of truth: apis/common/v1/errors.proto. If this table and the proto ever differ, the proto wins.
| ErrorCode | HTTP | Default terminal? | Meaning |
|---|---|---|---|
| ERROR_CODE_UNSPECIFIED | 500 | Depends | Unspecified fallback code |
| ERROR_CODE_CANCELLED | 499 | Yes | Operation cancelled by caller |
| ERROR_CODE_UNKNOWN | 500 | No | Unknown error |
| ERROR_CODE_INVALID_ARGUMENT | 400 | Yes | Request payload/fields invalid |
| ERROR_CODE_DEADLINE_EXCEEDED | 504 | No | Request timed out |
| ERROR_CODE_NOT_FOUND | 404 | Yes | Resource does not exist |
| ERROR_CODE_ALREADY_EXISTS | 409 | Yes | Resource already exists |
| ERROR_CODE_PERMISSION_DENIED | 403 | Yes | Caller lacks permission |
| ERROR_CODE_RESOURCE_EXHAUSTED | 429 | Depends | Quota/rate limit exhausted |
| ERROR_CODE_FAILED_PRECONDITION | 400 | Yes | State not ready for operation |
| ERROR_CODE_ABORTED | 409 | No | Concurrency conflict, retry may succeed |
| ERROR_CODE_OUT_OF_RANGE | 400 | Yes | Value outside allowed range |
| ERROR_CODE_UNIMPLEMENTED | 501 | Yes | Not implemented |
| ERROR_CODE_INTERNAL | 500 | No | Internal server failure |
| ERROR_CODE_UNAVAILABLE | 503 | No | Temporary service unavailable |
| ERROR_CODE_DATA_LOSS | 500 | Yes | Unrecoverable data loss |
| ERROR_CODE_UNAUTHENTICATED | 401 | Yes | Missing/invalid auth |
| ERROR_CODE_MODEL_INVALID | 400 | Yes | Invalid or unsupported model ID |
| ERROR_CODE_MODEL_UNAVAILABLE | 503 | No | Model exists but providers unavailable |
| ERROR_CODE_MODERATION_FLAGGED | 403 | Yes | Content blocked by moderation |
| ERROR_CODE_GENERATION_FAILED | 500 | Depends | Generation failed after retries |
| ERROR_CODE_TOOL_EXECUTION_FAILED | 500 | Depends | Tool execution failed |
| ERROR_CODE_UPSTREAM_PROVIDER | 503 | No | Upstream API/provider failed |
| ERROR_CODE_VALIDATION_EXHAUSTED | 500 | Yes | Structured validation failed repeatedly |
| ERROR_CODE_PAYMENT_REQUIRED | 402 | Yes | Credits/billing required |
3) Retryability (is_terminal)
- is_terminal=true: retrying the same request is not expected to help.
- is_terminal=false: transient condition; retry with backoff.
- Always trust is_terminal from the payload over assumptions from HTTP status alone.
- When present, details.retry_info.retry_delay_ms gives a minimum delay hint.
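The retry rules above can be sketched as a small delay calculator. This is an illustration only: `nextRetryDelayMs`, the 500 ms base, and the 30 s cap are assumed values chosen for the example, not gateway defaults. Jitter is left out to keep the result deterministic; a production policy would normally add it.

```typescript
// Only the fields this helper reads; the full RpcError has more.
interface RetryableError {
  is_terminal: boolean;
  details?: { retry_info?: { retry_delay_ms?: number } };
}

// Hypothetical helper: returns the wait in milliseconds before the next
// attempt, or null when the error is terminal and must not be retried.
function nextRetryDelayMs(
  err: RetryableError,
  attempt: number, // 0-based count of retries already made
  baseMs = 500, // assumed base delay
  capMs = 30000, // assumed upper bound for the exponential part
): number | null {
  if (err.is_terminal) return null;
  // Exponential backoff: base, 2x base, 4x base, ... up to the cap.
  const backoff = Math.min(capMs, baseMs * 2 ** attempt);
  // The server hint is a minimum, so never wait less than it.
  const hint = err.details?.retry_info?.retry_delay_ms ?? 0;
  return Math.max(backoff, hint);
}
```

Because the hint is treated as a floor, a server-provided `retry_delay_ms` larger than the cap still wins over the local backoff.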
4) Error Details Fields
RpcError.details can include:
- error_info: machine-readable classification (reason, domain, optional metadata)
- retry_info: retry hint (retry_delay_ms)
- field_violations: field-level validation failures
- upstream_error: provider context (provider, status_code, raw_body)
- model_error: model-specific context (model_id, reason, alternatives_tried)
- help_links: URLs for recovery docs
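As a rough illustration of how a client might consume these fields, the sketch below types `details` and turns whatever is present into short log lines. The field names come from the list above; the value types (for example `status_code` as a number and `alternatives_tried` as a string array) are assumptions, and the element shapes of `field_violations` and `help_links` are left as `unknown` because this guide does not specify them. `summarizeDetails` is a hypothetical helper.

```typescript
// Field names follow the list above; value types are assumed, not confirmed
// against apis/common/v1/errors.proto.
interface RpcErrorDetails {
  error_info?: { reason?: string; domain?: string; metadata?: Record<string, string> };
  retry_info?: { retry_delay_ms?: number };
  field_violations?: unknown[]; // element shape not specified in this guide
  upstream_error?: { provider?: string; status_code?: number; raw_body?: string };
  model_error?: { model_id?: string; reason?: string; alternatives_tried?: string[] };
  help_links?: unknown[]; // element shape not specified in this guide
}

// Hypothetical helper: one log line per detail field that is present.
function summarizeDetails(d: RpcErrorDetails | undefined): string[] {
  const out: string[] = [];
  if (!d) return out;
  if (d.error_info) {
    out.push(`error_info: ${d.error_info.domain ?? "?"}/${d.error_info.reason ?? "?"}`);
  }
  if (d.retry_info?.retry_delay_ms !== undefined) {
    out.push(`retry_info: wait >= ${d.retry_info.retry_delay_ms} ms`);
  }
  if (d.field_violations?.length) {
    out.push(`field_violations: ${d.field_violations.length}`);
  }
  if (d.upstream_error) {
    out.push(`upstream_error: ${d.upstream_error.provider ?? "?"} status ${d.upstream_error.status_code ?? "?"}`);
  }
  if (d.model_error) {
    out.push(`model_error: ${d.model_error.model_id ?? "?"} (${d.model_error.reason ?? "?"})`);
  }
  if (d.help_links?.length) {
    out.push(`help_links: ${d.help_links.length}`);
  }
  return out;
}
```

Every field is optional, so the helper checks for presence before reading and never assumes that a given error carries a particular detail.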
5) HTTP Mapping
HTTP status is derived from ErrorCode by server-side mapping (pkg/restatex/errors.go):
- 400: ERROR_CODE_INVALID_ARGUMENT, ERROR_CODE_FAILED_PRECONDITION, ERROR_CODE_OUT_OF_RANGE, ERROR_CODE_MODEL_INVALID
- 401: ERROR_CODE_UNAUTHENTICATED
- 402: ERROR_CODE_PAYMENT_REQUIRED
- 403: ERROR_CODE_PERMISSION_DENIED, ERROR_CODE_MODERATION_FLAGGED
- 404: ERROR_CODE_NOT_FOUND
- 409: ERROR_CODE_ALREADY_EXISTS, ERROR_CODE_ABORTED
- 429: ERROR_CODE_RESOURCE_EXHAUSTED
- 499: ERROR_CODE_CANCELLED
- 500: ERROR_CODE_UNKNOWN, ERROR_CODE_INTERNAL, ERROR_CODE_DATA_LOSS, ERROR_CODE_GENERATION_FAILED, ERROR_CODE_TOOL_EXECUTION_FAILED, ERROR_CODE_VALIDATION_EXHAUSTED, unspecified/unknown
- 501: ERROR_CODE_UNIMPLEMENTED
- 503: ERROR_CODE_UNAVAILABLE, ERROR_CODE_MODEL_UNAVAILABLE, ERROR_CODE_UPSTREAM_PROVIDER
- 504: ERROR_CODE_DEADLINE_EXCEEDED
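For client-side tests or mock servers, it can be handy to mirror this mapping locally. The sketch below is a TypeScript transcription of the list above, not the server implementation (which lives in pkg/restatex/errors.go); `HTTP_STATUS_BY_CODE` and `httpStatusFor` are hypothetical names, and the copy can drift if the server mapping changes.

```typescript
// Client-side mirror of the mapping listed above. Codes that are missing
// here (including ERROR_CODE_UNSPECIFIED) fall back to 500.
const HTTP_STATUS_BY_CODE = new Map<string, number>([
  ["ERROR_CODE_INVALID_ARGUMENT", 400],
  ["ERROR_CODE_FAILED_PRECONDITION", 400],
  ["ERROR_CODE_OUT_OF_RANGE", 400],
  ["ERROR_CODE_MODEL_INVALID", 400],
  ["ERROR_CODE_UNAUTHENTICATED", 401],
  ["ERROR_CODE_PAYMENT_REQUIRED", 402],
  ["ERROR_CODE_PERMISSION_DENIED", 403],
  ["ERROR_CODE_MODERATION_FLAGGED", 403],
  ["ERROR_CODE_NOT_FOUND", 404],
  ["ERROR_CODE_ALREADY_EXISTS", 409],
  ["ERROR_CODE_ABORTED", 409],
  ["ERROR_CODE_RESOURCE_EXHAUSTED", 429],
  ["ERROR_CODE_CANCELLED", 499],
  ["ERROR_CODE_UNKNOWN", 500],
  ["ERROR_CODE_INTERNAL", 500],
  ["ERROR_CODE_DATA_LOSS", 500],
  ["ERROR_CODE_GENERATION_FAILED", 500],
  ["ERROR_CODE_TOOL_EXECUTION_FAILED", 500],
  ["ERROR_CODE_VALIDATION_EXHAUSTED", 500],
  ["ERROR_CODE_UNIMPLEMENTED", 501],
  ["ERROR_CODE_UNAVAILABLE", 503],
  ["ERROR_CODE_MODEL_UNAVAILABLE", 503],
  ["ERROR_CODE_UPSTREAM_PROVIDER", 503],
  ["ERROR_CODE_DEADLINE_EXCEEDED", 504],
]);

// Hypothetical lookup: unspecified or unrecognized codes map to 500.
function httpStatusFor(code: string): number {
  return HTTP_STATUS_BY_CODE.get(code) ?? 500;
}
```

Clients handling real responses should still branch on `code` and `is_terminal` from the payload; this table is only useful when a status has to be produced, not interpreted.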
6) Client Handling Pattern (TypeScript)
type RpcError = {
  code: string;
  message: string;
  is_terminal: boolean;
  details?: {
    error_info?: { reason?: string; domain?: string; metadata?: Record<string, string> };
    retry_info?: { retry_delay_ms?: number };
  };
};

async function callGateway(path: string, body: unknown) {
  const res = await fetch(path, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  if (res.ok) return res.json();

  const err = (await res.json()) as RpcError;
  const retryDelay = err.details?.retry_info?.retry_delay_ms ?? 1000;
  if (!err.is_terminal) {
    // Use your retry policy (exponential backoff, jitter, max attempts)
    throw new Error(`retryable:${retryDelay}:${err.code}`);
  }
  // Terminal: surface a stable, code-driven UX path
  throw new Error(`terminal:${err.code}:${err.message}`);
}
7) Streaming Errors
Streaming failures are emitted in LLMStreamChunkEvent when finish_reason is "error":
{
  "run_id": "run_123",
  "chunk_index": 42,
  "is_final": true,
  "finish_reason": "error",
  "error": {
    "code": "ERROR_CODE_UPSTREAM_PROVIDER",
    "message": "OpenRouter request failed",
    "is_terminal": false
  }
}
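For streaming consumers, this final chunk ends the stream: no further chunks will arrive for that run. Whether the request is worth retrying is a separate question, so branch on error.code and error.is_terminal to decide.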
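One way a consumer might apply this is to classify each chunk as it arrives. The sketch below assumes only the chunk fields shown in the example above; `StreamChunk`, `StreamOutcome`, and `classifyChunk` are hypothetical names, the content payload of a chunk is omitted because this guide does not show it, and treating an error chunk with no `error` object as a failure is an assumed policy.

```typescript
// Only the fields visible in the example chunk; content fields are omitted.
interface StreamChunk {
  run_id: string;
  chunk_index: number;
  is_final: boolean;
  finish_reason?: string;
  error?: { code: string; message: string; is_terminal: boolean };
}

type StreamOutcome =
  | { kind: "continue" } // more chunks expected
  | { kind: "done" } // stream finished normally
  | { kind: "retry"; code: string } // stream ended, request may be retried
  | { kind: "fail"; code: string; message: string }; // stream ended, do not retry

// Hypothetical classifier: decides what to do after each chunk.
function classifyChunk(chunk: StreamChunk): StreamOutcome {
  if (chunk.finish_reason === "error") {
    if (!chunk.error) {
      // Assumed policy: an error chunk without a payload is not retried.
      return { kind: "fail", code: "ERROR_CODE_UNSPECIFIED", message: "error chunk without error payload" };
    }
    return chunk.error.is_terminal
      ? { kind: "fail", code: chunk.error.code, message: chunk.error.message }
      : { kind: "retry", code: chunk.error.code };
  }
  return chunk.is_final ? { kind: "done" } : { kind: "continue" };
}
```

With the example chunk above, the classifier yields a `retry` outcome, because the stream has ended but `error.is_terminal` is `false`.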
8) Common Scenarios
- Invalid model ID: ERROR_CODE_MODEL_INVALID, usually terminal.
- Rate limited: ERROR_CODE_RESOURCE_EXHAUSTED, often retryable; check retry_info.
- Provider outage: ERROR_CODE_UPSTREAM_PROVIDER, usually retryable.
- Auth issues: ERROR_CODE_UNAUTHENTICATED or ERROR_CODE_PERMISSION_DENIED, terminal until credentials/permissions change.
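These scenarios could be turned into a small decision function for the UI layer. The sketch below is one possible arrangement, not a prescribed policy: `ClientAction` and `actionFor` are hypothetical, and the action names are invented for the example. It checks `is_terminal` first so that the payload flag always takes precedence over expectations attached to a code.

```typescript
// Hypothetical set of UI-level reactions; names are illustrative only.
type ClientAction = "retry_later" | "reauthenticate" | "fix_request" | "show_error";

function actionFor(err: { code: string; is_terminal: boolean }): ClientAction {
  // The payload flag wins: anything non-terminal goes to the retry path.
  if (!err.is_terminal) return "retry_later";
  switch (err.code) {
    case "ERROR_CODE_UNAUTHENTICATED":
    case "ERROR_CODE_PERMISSION_DENIED":
      // Terminal until credentials or permissions change.
      return "reauthenticate";
    case "ERROR_CODE_MODEL_INVALID":
      // The request itself has to change (pick a different model ID).
      return "fix_request";
    default:
      return "show_error";
  }
}
```

A rate-limit error with `is_terminal=false` lands on `retry_later`, while the same code marked terminal (for example an exhausted quota) falls through to `show_error`.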
9) Error Domain Map
details.error_info.domain identifies the subsystem that originated the error. Common values:
| Domain | Typical source |
|---|---|
| gateway | LLM gateway validation/request context |
| openrouter | Model routing and provider API integration |
| conversation | Conversation actor/service operations |
| generation | Generation workflow orchestration |
| mcp | MCP service/tool execution |
| notification / novu | Notification gateway/service |
| storage | Storage gateway/service and storage actors |
| apikey | API key service and gateway |
| webhook / convoy | Webhook gateway/service |
For full API field-level details, see LLM Gateway API Reference.