
Tenant & User Lifecycle Roadmap

End-to-end view of how tenants and users are provisioned, authenticated, managed, and decommissioned across the platform. This page consolidates the design decisions from all enterprise PRDs into a single reference.

Source PRDs

Each section below links to its authoritative PRD(s). The consolidated flows are derived from:

Service Inventory

The platform separates services into three categories: new services introduced by the enterprise PRDs, existing services that are kept (with updates where needed), and external dependencies.

New Services

Service | Purpose | PRD
Zitadel | Platform IdP for tenant members (replaces Firebase Auth for authentication) | Zitadel PRD
TenantProfileActor | Single source of truth for tenant state and configuration | Tenant Onboarding PRD FR-1
TenantOnboardingWorkflow | Orchestrates tenant provisioning across all integrations | Tenant Onboarding PRD FR-2
TenantAdminGateway | Tenant management and self-service RPCs | Tenant Onboarding PRD FR-3
EndUserActor | End-user identity (from JWT) and profile (from client) | EndUserActor PRD FR-1
EndUser Gateway | Profile management RPCs for end users | EndUserActor PRD FR-6
Identity Event Bridge | Zitadel events → Kafka fan-out to downstream systems | Zitadel PRD FR-5
Firestore projections | Queryable projection layer for tenant/user/member data | Firestore ADR

Existing Services (Kept)

Service | Updated? | Notes
KrakenD | No | Subdomain routing + same POST /verify auth flow
Auth Service | Yes | OIDCValidator reused for Zitadel; claim extraction expanded
LLM Gateway | Yes | Identity stamping on messages, EndUserActor upsert
Storage Gateway | No |
Webhook Gateway | No |
Admin Portal (GYBC Console) | No | Already built on Refine + Mantine
Conversation Actor | Yes | Participant tracking added
Generation Workflow | Yes | User context injection (name, timezone, weather)
MCP Service, Memory, Pinecone | No |

External Dependencies

Service | Used For
Unkey | API key management, rate limiting
Lago | Billing and usage metering
Novu | Multi-channel notifications
Convoy | Outbound webhooks
OpenFGA | Fine-grained authorization (ReBAC)
Firestore | Queryable projections
Kafka / NATS | Event bus
Chatwoot | Omnichannel chat (optional)

Removed / Superseded

Service | Replaced By
Firebase Auth (for tenant member auth) | Zitadel
SocayoUserActor | EndUserActor (platform-level) + Socayo domain rebuilding
UserProfileActor | Merged into EndUserActor
scripts/onboard-org | TenantOnboardingWorkflow
Socayo Gateway | Socayo team's own domain services
Socayo Plan Workflow | Socayo team's own domain

Overall Platform Architecture

Flow 1: Tenant Onboarding

PRDs: Tenant Onboarding FR-2 · Zitadel FR-2 · API Key Management · Firestore ADR

Service | Role in this flow
TenantAdminGateway | Entry point: receives CreateTenant RPC
TenantOnboardingWorkflow | Orchestrates all 9 provisioning steps (Restate exactly-once)
TenantProfileActor | Stores tenant state, tracks integration registry
OpenFGA | Writes initial authorization tuples (tenant admin, API key scopes)
Novu | Creates notification org + tenant branding
Lago | Creates billing customer with business fields
Unkey | Creates initial pk_live + sk_live API key pair
Zitadel | Creates Organization + admin user + role grant
Firestore | Writes queryable tenant projection

Platform admin creates a tenant. TenantOnboardingWorkflow orchestrates all provisioning with exactly-once semantics via Restate. Each step is idempotent and resumable.

Integration Registry State Machine

Each step tracks its own state on TenantProfileActor.integrations_registry, enabling resume-on-failure.

Example registry state after successful onboarding:

{
"openfga": { "state": "PROVISIONED", "provisioned_at": "..." },
"novu": { "state": "PROVISIONED", "config_refs": { "org_id": "org_abc" } },
"lago": { "state": "PROVISIONED", "config_refs": { "customer_id": "cust_xyz" } },
"unkey": { "state": "PROVISIONED", "config_refs": { "pk_id": "pk_...", "sk_id": "sk_..." } },
"zitadel": { "state": "PROVISIONED", "config_refs": { "org_id": "123456789" } },
"firestore": { "state": "PROVISIONED" }
}
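The resume-on-failure behaviour that the registry enables can be sketched as follows. This is a minimal illustration in plain Python, not the Restate workflow itself: the `run_onboarding` function, the step callables, and the `FAILED` state are assumptions made for the example; only the `PROVISIONED` state and the `config_refs` field come from the registry shown above.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical step type: (registry key, callable returning config_refs).
Step = Tuple[str, Callable[[], dict]]


def run_onboarding(registry: Dict[str, dict], steps: List[Step]) -> Dict[str, dict]:
    """Run steps in order, skipping integrations already marked PROVISIONED."""
    for name, provision in steps:
        if registry.get(name, {}).get("state") == "PROVISIONED":
            continue  # finished on an earlier attempt, so do not repeat it
        try:
            config_refs = provision()
        except Exception as exc:
            # "FAILED" is an assumed state name, used here only for illustration.
            registry[name] = {"state": "FAILED", "error": str(exc)}
            raise
        registry[name] = {"state": "PROVISIONED", "config_refs": config_refs}
    return registry
```

Calling `run_onboarding` a second time with the same registry re-runs only the steps that have not reached `PROVISIONED`, which is the property the registry is meant to provide.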

Initial API Key Pair

PRD: API Key Management

Step 5 creates two keys with different scope templates:

Key | Template | Scopes | Usage
sk_live_xxx | Backend (Full) | conversations:*, storage:*, memory:*, tools:*, voice:*, webhooks:* | Server-to-server (tenant's backend)
pk_live_xxx | Mobile App | conversations:send_message, conversations:view, voice:start_session, memory:search | Mobile/web app, safe to embed client-side

Initial OpenFGA Tuples

PRD: Authorization

Step 2 writes the minimum set of tuples for the tenant to be authorizable:

# Tenant admin
tenant:acme-a1b2c3#admin@user:alice_admin

# API keys belong to tenant
api_key:pk_acme_xxx#tenant@tenant:acme-a1b2c3
api_key:sk_acme_xxx#tenant@tenant:acme-a1b2c3

# Scopes granted to each key (one tuple per scope in template)
api_key:sk_acme_xxx#scope@scope:conversations:send_message
api_key:sk_acme_xxx#scope@scope:conversations:view
api_key:sk_acme_xxx#scope@scope:storage:read
# ... (all scopes from template)

Conversation/file/thread tuples are written on-demand when end users create resources.
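As an illustration of how the initial tuple set above could be derived from a scope template, here is a small sketch. The `initial_tuples` helper and its arguments are hypothetical; it assumes the template has already been expanded into concrete scopes (how wildcards such as `conversations:*` are expanded is not specified here) and it only builds the tuple strings in the notation used above, without calling OpenFGA.

```python
from typing import Dict, List


def initial_tuples(tenant_id: str, admin_user: str, keys: Dict[str, List[str]]) -> List[str]:
    """Build object#relation@subject strings for a new tenant.

    `keys` maps an API key id to its already-expanded list of scopes.
    """
    tuples = [f"tenant:{tenant_id}#admin@user:{admin_user}"]
    for key_id, scopes in keys.items():
        # Each key belongs to the tenant...
        tuples.append(f"api_key:{key_id}#tenant@tenant:{tenant_id}")
        # ...and gets one tuple per scope in its template.
        tuples.extend(f"api_key:{key_id}#scope@scope:{scope}" for scope in scopes)
    return tuples
```

For example, `initial_tuples("acme-a1b2c3", "alice_admin", {"sk_acme_xxx": ["storage:read"]})` yields the admin tuple, the key-to-tenant tuple, and one scope tuple.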

Flow 2: Data Residency — Per-Tenant Region

PRD: Data Residency

Service | Role in this flow
TenantProfileActor | Stores data_region field (immutable after creation)
TenantOnboardingWorkflow | Provisions resources in the tenant's assigned region
GCP Global Load Balancer | Routes requests to the correct regional stack
KrakenD (regional) | Terminates requests in the correct region
OPA | Enforces data_residency policy (e.g., block US LLMs for EU tenants)

Each tenant is assigned a region at creation (immutable). All PII-bearing infrastructure deploys per region; auth and tenant config remain global.

Global vs Regional split

Layer | Deployment | Examples
Global | Single instance, shared | Zitadel, Unkey, TenantProfileActor (non-PII config), DNS/CDN
Regional | Per US/EU/AP stack | Restate, Kafka, Redis, Pinecone, Mem0, Firestore, GCS, Novu, Convoy, KrakenD, all PII-bearing actors

Request routing

Impact on onboarding

TenantProfileState.data_region is set during creation (field 72, immutable). The onboarding workflow provisions all regional resources in the tenant's assigned region:

TenantOnboardingWorkflow.Run(tenant_id, data_region="eu")
├── Step 1: Initialize TenantProfileActor (global)
├── Step 2: OpenFGA tuples (global)
├── Step 3: Novu org (regional — EU Novu instance)
├── Step 4: Lago customer (global — billing is global)
├── Step 5: Unkey keys (global — key management is global)
├── Step 6: Zitadel org (global — identity is global)
├── Step 7: Novu tenant (regional)
├── Step 8: Firestore projection (regional — EU Firestore)
└── Step 9: Activate

LLM provider routing for EU tenants

EU tenants' requests are routed only to EU-hosted or GDPR-compliant LLM providers:

Provider | EU endpoint? | Allowed for EU tenants?
Mistral (La Plateforme) | Yes (Paris) | Yes
Azure OpenAI (West Europe) | Yes | Yes
Vertex AI (europe-west4) | Yes | Yes
OpenAI (US) | No | No (blocked by OPA data_residency policy)
Anthropic (US) | No | No

This is enforced by the OPA data_residency policy domain — not hardcoded.
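The decision the data_residency policy makes for the provider table above can be illustrated in Python. The real enforcement lives in OPA policies, so this is only a sketch of the rule's logic: the `Provider` type, the provider identifiers, and the choice to leave non-EU tenants unrestricted are assumptions for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Provider:
    name: str    # hypothetical identifier
    region: str  # where the endpoint is hosted


PROVIDERS = (
    Provider("mistral", "eu"),
    Provider("azure-openai-west-europe", "eu"),
    Provider("vertex-ai-europe-west4", "eu"),
    Provider("openai", "us"),
    Provider("anthropic", "us"),
)


def allowed_providers(data_region: str, providers: Sequence[Provider] = PROVIDERS) -> List[str]:
    """Return the provider names a tenant in `data_region` may be routed to."""
    if data_region != "eu":
        # Assumption: only the EU restriction is modelled in this sketch.
        return [p.name for p in providers]
    return [p.name for p in providers if p.region == "eu"]
```

Keeping the provider list as data rather than branching on provider names mirrors the point made above: the rule is policy input, not hardcoded routing.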


Flow 3: End User Authentication (BYOA)

PRDs: EndUserActor FR-5 · API Key Management · User-Aware Generation

Service | Role in this flow
Tenant's IdP | Authenticates the end user, issues JWT
KrakenD | Receives request, calls Auth Service
Auth Service | Validates pk_* key via Unkey, reads OIDC config from TenantProfileActor, validates JWT via OIDCValidator
Unkey | Verifies publishable key → returns tenant_id
TenantProfileActor | Provides end_user_oidc_config (issuer, JWKS URL, audience)
LLM Gateway | Stamps user_id + user_type on message, triggers identity upsert
EndUserActor | Synchronous identity upsert from JWT claims; write-through to Firestore
Conversation Actor | Appends message, updates participants, triggers generation
OpenFGA | Checks can_send_message on conversation; writes owner tuple on first conversation
Firestore | Receives end-user projection from EndUserActor write-through

End users authenticate via the tenant's own IdP (BYOA — bring your own auth). The platform validates their JWTs using OIDC config stored on the TenantProfileActor. End users never interact with Zitadel.

End User Cold Start

PRD: EndUserActor FR-2

The first time an end user hits the platform, the EndUserActor does not exist yet. The first upsert initializes it with identity claims from the JWT.

First request → EndUserActor doesn't exist
├─ Initialize state:
│ first_seen_at: now
│ last_seen_at: now
│ sign_in_provider: "firebase.google.com"
│ name, email, picture, email_verified, locale, zoneinfo (from JWT)
│ profile: null (set later via EndUser Gateway)
├─ Write-through to Firestore: tenants/acme/users/alice_123
└─ Return success

First conversation creation → write OpenFGA tuples:
conversation:conv_new#owner@user:alice_123
conversation:conv_new#tenant@tenant:acme-a1b2c3
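The cold-start initialization and the later change-detected upserts can be sketched together. This is an illustrative model only: `upsert_end_user` and its return shape are hypothetical, and treating a `last_seen_at` bump as "no change" (so it would not trigger a Firestore write-through) is an assumption, not something the PRD states.

```python
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple

# Identity fields taken from the JWT, as listed in the cold-start state above.
IDENTITY_CLAIMS = ("name", "email", "picture", "email_verified", "locale", "zoneinfo")


def upsert_end_user(
    state: Optional[dict],
    claims: Dict[str, object],
    provider: str,
    now: Optional[datetime] = None,
) -> Tuple[dict, bool]:
    """Return (new_state, changed); `changed` means a projection write is needed."""
    now = now or datetime.now(timezone.utc)
    identity = {k: claims.get(k) for k in IDENTITY_CLAIMS}
    if state is None:
        # Cold start: the actor does not exist yet.
        new_state = {
            "first_seen_at": now,
            "last_seen_at": now,
            "sign_in_provider": provider,
            **identity,
            "profile": None,  # set later by the client, not from the JWT
        }
        return new_state, True
    changed = state.get("sign_in_provider") != provider or any(
        state.get(k) != v for k, v in identity.items()
    )
    new_state = {**state, **identity, "sign_in_provider": provider, "last_seen_at": now}
    return new_state, changed
```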

Flow 4: Tenant Member Authentication (Zitadel)

PRDs: Zitadel FR-3 · Admin Portal

Service | Role in this flow
Admin Portal (Refine) | Reads subdomain, redirects to Zitadel, renders UI
Zitadel | Platform IdP: org-scoped login, SAML/OIDC federation, JIT provisioning, issues OIDC token
Corporate IdP (Okta/Azure AD) | Authenticates employee via SAML/OIDC (if SSO configured)
KrakenD | Validates Zitadel OIDC token, routes to /admin/v1/*
Auth Service | Validates token via OIDCValidator against Zitadel JWKS, extracts org_id + roles
TenantAdminGateway | Entry point for invite flow (InviteTeamMember RPC)
API Key Gateway | Example downstream service: lists the tenant's keys from Unkey
Unkey | Returns keys filtered by externalId = tenant_id

Tenant members (admins, developers, CS agents) authenticate via Zitadel — the platform IdP. SSO is delegated to the tenant's corporate IdP if configured.

Tenant Member Onboarding — Two Paths

PRD: Tenant Onboarding FR-6 · Zitadel FR-4

Path A: JIT Provisioning (SSO tenants)

Tenant has configured their corporate IdP with Zitadel. Employees log in with work credentials. Zitadel auto-creates accounts on first login.

New Employee visits acme.console.platform.com
→ Redirect to Zitadel → Okta SAML
→ Okta authenticates → SAML assertion with groups
→ Zitadel JIT creates user in acme-a1b2c3 organization
→ Maps IdP groups → platform role
→ Emits user.human.added + user.grant.added events
→ Kafka → Firestore projection + OpenFGA tuples + Chatwoot + Novu
→ User has full access on first login

Path B: Invite Flow (non-SSO tenants)

This path serves smaller tenants without a corporate IdP. An admin manually invites team members by email.

Admin clicks "Invite Team Member" (email, role)
→ TenantAdminGateway.InviteTeamMember()
→ Zitadel API: CreateUserInvite(org, email, role)
→ Zitadel sends invite email
→ Invitee clicks link → signup page (org-branded)
→ Sets password + MFA → Zitadel creates user + grants role
→ Same downstream sync as JIT

Flow 5: Event-Driven Identity Sync

PRD: Zitadel FR-5 · Tenant Onboarding FR-6.5

Service | Role in this flow
Corporate IdP (Okta) | Source of truth for employee lifecycle; pushes SCIM events
Zitadel | Receives SCIM events, creates/updates/removes users, emits lifecycle events
Identity Event Bridge | Subscribes to Zitadel Actions webhooks, publishes to Kafka
Kafka | Fan-out bus for identity events (identity.user.created, identity.role.changed, etc.)
Firestore | Receives tenant member projection (tenants/{id}/members/{id})
OpenFGA | Receives authorization tuples (tenant:acme#admin@user:carol)
Chatwoot | Receives user creation/role assignment
Novu | Receives subscriber creation for notifications

When an IT admin adds a new tenant member in Okta, changes propagate automatically to all downstream systems via Zitadel events and Kafka fan-out.

Why event-driven? Deprovisioning is the critical case. When Okta disables an employee, SCIM notifies Zitadel → events flow → Chatwoot access is revoked in seconds, not on next sign-in. With Firebase Auth, terminated employees could keep access via refresh tokens indefinitely.

Flow 6: Per-Tenant Policy Enforcement (OPA)

PRD: Policy Engine

Service | Role in this flow
OPA (sidecar) | Evaluates Rego policies locally (<1ms), returns allow/deny per request
GCS | Stores policy bundles, synced to OPA sidecars every 10-30s
TenantProfileActor | Provides PolicyOverrides as input to OPA evaluation
OpenFGA | Complements OPA: checks resource-level relationships after OPA allows the action
Loki / Kafka | Receives decision logs for audit trail

A centralized OPA policy engine replaces 11 scattered enforcement points across 8 files. Policies are per-tenant, dynamic (no code deploy), and auditable.

Policy domains

Domain | What it controls | Example rule
model_access | Which LLM models a tenant can use | "EU tenants can only use Mistral, Azure OpenAI EU, Vertex AI EU"
feature_access | Feature gating by plan tier | "Free tier cannot use RAG, voice, or memory"
data_residency | Data stays in tenant's region | "EU tenant requests cannot route to US LLM endpoints"
tool_execution | Tool approval rules, MCP restrictions | "HIPAA tenants require mandatory approval for all external tools"
memory | Memory extraction/search consent gates | "Only extract memories if user has consented to memory_extraction"
rbac | Role-based action authorization | "developer role can create API keys but not delete"

Three-layer policy merge

Platform base policies (safety, rate limits, model blocklist)
↓ merged with
Plan tier overlay (free, pro, enterprise — additive restrictions)
↓ merged with
Tenant overrides (custom restrictions only — cannot weaken platform base)
=
Final policy bundle for this tenant
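One way to read the merge above is that every layer can only add restrictions. The sketch below models that reading with a deliberately simple representation: each layer is a mapping from a policy domain to a set of denied items, and the merge is a union. The representation and the `merge_policies` name are assumptions; the actual bundles are Rego policies and may be structured differently.

```python
from typing import Dict, Set

# Hypothetical shape: policy domain -> set of denied items (models, features, ...).
Policy = Dict[str, Set[str]]


def merge_policies(base: Policy, plan: Policy, tenant: Policy) -> Policy:
    """Union the deny sets of all three layers; later layers cannot remove entries."""
    merged: Policy = {}
    for layer in (base, plan, tenant):
        for domain, denied in layer.items():
            merged.setdefault(domain, set()).update(denied)
    return merged
```

Because the merge is a union, a tenant override that lists nothing for a domain leaves the platform base and plan restrictions for that domain intact.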

OPA complements OpenFGA

Concern | Engine | Example
Policy (can role R do action A?) | OPA | "Can a developer create API keys?"
Relationship (is user U in org O?) | OpenFGA | "Is Alice a member of tenant Acme?"
Combined (can user U do action A on resource R?) | Both | OPA checks policy → OpenFGA checks relationship

How it fits in the request path

Request → KrakenD → Auth Service (who are you?)
→ Gateway: OPA check (is this action allowed by policy?)
→ Gateway: OpenFGA check (do you have access to this resource?)
→ Actor (proceed)
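The ordering in the request path above (policy first, relationship second) can be sketched as a small gateway helper. The `authorize` function, its callable parameters, and the result strings are hypothetical stand-ins; they do not represent the OPA or OpenFGA client APIs.

```python
from typing import Callable


def authorize(
    principal: dict,
    action: str,
    resource: str,
    opa_allows: Callable[[dict, str], bool],
    fga_check: Callable[[str, str, str], bool],
) -> str:
    """Run the policy check first, then the relationship check."""
    if not opa_allows(principal, action):
        return "denied_by_policy"  # OpenFGA is never consulted in this case
    if not fga_check(principal["user_id"], action, resource):
        return "denied_by_relationship"
    return "allowed"
```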

Policies are stored as Rego files in GCS bundles and synced to OPA sidecars every 10-30s, so policy changes need no code deploy.


Flow 7: Tenant Self-Service

PRD: Tenant Onboarding FR-3 + FR-9

Service | Role in this flow
Admin Portal (Refine) | UI for tenant admins to manage config
TenantAdminGateway | Exposes self-service RPCs (UpdateBusinessInfo, UpdateBranding, etc.)
TenantProfileActor | Stores updated config, triggers downstream sync
Lago | Receives business field updates (legal name, address, tax ID)
Novu | Receives branding updates (logo, colors)
Unkey | Receives metadata updates (feature flags, plan tier)
Zitadel | Receives SSO config changes (per-org IdP setup)
Firestore | Receives full state projection on every change

After onboarding, tenant admins manage their own config via TenantAdminGateway. Changes propagate to downstream systems asynchronously.

RPC | Updates | Downstream Sync
UpdateBusinessInfo | Legal name, address, tax ID | Lago, Firestore
UpdateBranding | Logo, colors, display name | Novu, Firestore
UpdateDefaultGenerationConfig | Default LLM settings | Firestore only
UpdateFeatureFlags | Feature toggles | Unkey metadata, Firestore
ConfigureSSO | IdP metadata, group mapping | Zitadel per-org config
InviteTeamMember | New member | Zitadel invite API

Sync is best-effort and non-blocking. Actor state is authoritative — if a downstream system is temporarily unavailable, the update succeeds locally and a background reconciliation catches up.

Lago immutable fields: currency and external_id cannot be changed after the first invoice. The sync handler skips these on updates.
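A sync handler that skips the immutable fields could look roughly like this. The function name and the example field names other than `currency` and `external_id` are hypothetical, and the sketch only builds the update payload; it does not call the Lago API.

```python
# Fields the sync handler must not send on updates (from the note above).
LAGO_IMMUTABLE_FIELDS = frozenset({"currency", "external_id"})


def lago_update_payload(changes: dict) -> dict:
    """Drop immutable fields so an update never attempts to change them."""
    return {k: v for k, v in changes.items() if k not in LAGO_IMMUTABLE_FIELDS}
```

For example, a change set containing a new legal name and a new currency produces a payload with only the legal name.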

Flow 8: Display Name Resolution

PRDs: User-Aware Generation · EndUserActor

Service | Role in this flow
EndUserActor | Returns identity + profile for USER_TYPE_END_USER lookups
Firestore | Stores tenant member projection (tenants/{id}/members/{id}) for USER_TYPE_TENANT_MEMBER lookups
Zitadel (fallback) | Live API lookup for tenant members if Firestore projection is stale
Generation Workflow | Resolves display names server-side for LLM prompt injection

When rendering a conversation with messages from multiple user types, the client resolves display names based on user_type stamped on each message.

  • End users → EndUserActor.GetEndUser(tenant:user_id) (platform owns the data, <50ms)
  • Tenant members → Firestore projection at tenants/{id}/members/{id} (populated by Zitadel event pipeline)

Server-side resolution (for LLM prompt injection) happens in the generation workflow. Display resolution is a client concern.
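The dispatch on user_type described in this flow can be sketched as below. The lookup callables stand in for the EndUserActor call and the Firestore member projection read; their signatures, the `name` field, and the fallback label are assumptions for the example.

```python
from typing import Callable, Optional


def resolve_display_name(
    message: dict,
    get_end_user: Callable[[str], Optional[dict]],
    get_member_projection: Callable[[str], Optional[dict]],
    fallback: str = "Unknown user",
) -> str:
    """Pick the lookup source from the user_type stamped on the message."""
    user_type = message["user_type"]
    user_id = message["user_id"]
    if user_type == "USER_TYPE_END_USER":
        record = get_end_user(user_id)
    elif user_type == "USER_TYPE_TENANT_MEMBER":
        record = get_member_projection(user_id)
    else:
        record = None  # unrecognized user type
    return (record or {}).get("name") or fallback
```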

Flow 9: Chatwoot SSO Chain

PRD: Zitadel FR-4

Service | Role in this flow
Zitadel | SAML SP (receives assertion from corporate IdP) + SAML IdP (produces assertion for Chatwoot)
Corporate IdP (Okta) | Authenticates the user, sends SAML assertion to Zitadel
Chatwoot (Enterprise) | SAML SP: receives assertion from Zitadel, creates/updates user session
Admin Portal | Entry point: the "Open Chatwoot" button triggers the SSO chain

Zitadel acts as both SAML SP (to Okta) and SAML IdP (to Chatwoot), enabling single sign-on across the platform and Chatwoot with one login.

Flow 10: Lifecycle Management

PRD: Tenant Onboarding FR-5

Service | Role in this flow
TenantAdminGateway | Receives Suspend/Reactivate/Decommission RPCs
TenantProfileActor | Updates tenant status, coordinates downstream changes
Unkey | Disables/re-enables all API keys for the tenant
Lago | Pauses/resumes subscription; issues final invoice on decommission
Zitadel | Deactivates/reactivates organization; deletes org on decommission
OpenFGA | Deletes all authorization tuples on decommission
Firestore | Updates status field; deletes tenant doc + subcollections on decommission
EndUserActor (all instances) | Deleted on decommission (all end users for this tenant)
Conversation/Storage actors | Deleted on decommission

Suspend — Sets status to SUSPENDED, disables all Unkey keys, pauses Lago subscription, updates Firestore. Data is preserved. Gateway enforcement returns 403 for all requests. Admin Portal enters read-only mode.

Reactivate — Restores status to ACTIVE, re-enables keys, resumes subscription. Full access restored.

Decommission — Final and irreversible. After a grace period (e.g., 30 days for data export), a background workflow deletes conversations, files, memories, end users, Zitadel organization, Lago subscription, Unkey keys, OpenFGA tuples, and Firestore documents. tenant_id is tombstoned and never reused.
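The three operations above imply a small state machine on the tenant status. In the sketch below, `ACTIVE` and `SUSPENDED` are the statuses named above; the `DECOMMISSIONED` name and the rule that a tenant can be decommissioned directly from `ACTIVE` are assumptions made for illustration.

```python
from enum import Enum


class TenantStatus(Enum):
    ACTIVE = "ACTIVE"
    SUSPENDED = "SUSPENDED"
    DECOMMISSIONED = "DECOMMISSIONED"  # assumed name for the terminal state


# Allowed transitions; the terminal state has no outgoing edges.
ALLOWED_TRANSITIONS = {
    TenantStatus.ACTIVE: {TenantStatus.SUSPENDED, TenantStatus.DECOMMISSIONED},
    TenantStatus.SUSPENDED: {TenantStatus.ACTIVE, TenantStatus.DECOMMISSIONED},
    TenantStatus.DECOMMISSIONED: set(),
}


def transition(current: TenantStatus, target: TenantStatus) -> TenantStatus:
    """Return the new status, or raise if the transition is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move tenant from {current.value} to {target.value}")
    return target
```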

Where tenant_id Lives

PRDs: Tenant Onboarding · Multi-Tenancy

System | Field / Key | Purpose
TenantProfileActor | Bare tenant_id as actor key | Canonical source of truth
Zitadel | Organization name (via zitadel_org_id in integration registry) | Tenant member scope
Unkey | externalId on every API key | Returned on verify → X-Tenant-ID
Lago | external_id on customer (immutable after first invoice) | Billing attribution
Novu | identifier on tenant org | Notification scoping
Convoy | owner_id on webhook endpoints | Webhook isolation
OpenFGA | tenant:{tenant_id} object | Authorization scope
Firestore | Document ID at tenants/{tenant_id} | Queryable projection
Zitadel OIDC token | urn:zitadel:iam:org:id claim | Tenant member request identification
HTTP headers | X-Tenant-ID | Propagated to all downstream services
Composite actor keys | tenant:user_id, tenant:conversation_id, etc. | Per-platform convention

Format: {slugified_name}-{6-alphanumeric} (e.g., acme-a1b2c3). Generated at tenant creation, immutable. subdomain is a separate user-facing identifier that can differ from the slug.
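A generator for identifiers in this format might look like the sketch below. The slugification rule (lowercase, runs of non-alphanumerics collapsed to a single hyphen) and the lowercase suffix alphabet are assumptions inferred from the `acme-a1b2c3` example, not a documented specification.

```python
import re
import secrets
import string

# Assumed alphabet for the 6-character suffix (lowercase letters and digits).
_SUFFIX_ALPHABET = string.ascii_lowercase + string.digits


def slugify(name: str) -> str:
    """Lowercase the name and collapse non-alphanumeric runs into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def new_tenant_id(name: str) -> str:
    """Build {slugified_name}-{6-alphanumeric}; call once at tenant creation."""
    suffix = "".join(secrets.choice(_SUFFIX_ALPHABET) for _ in range(6))
    return f"{slugify(name)}-{suffix}"
```

Since the suffix is random, uniqueness would still have to be checked against existing and tombstoned tenant_ids before the value is committed.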

Subdomain Lookup (When It Happens)

PRD: Tenant Onboarding FR-8

The subdomain → tenant_id lookup is not done on every request. For authenticated API calls, tenant_id comes directly from the Zitadel token claim (urn:zitadel:iam:org:id).

Pre-auth only — when a browser loads the login page:

Browser hits bigbank.platform.com
→ Admin Portal extracts subdomain
→ Fetches /api/auth/resolve-subdomain?name=bigbank
→ Cached at 4 layers:
1. Browser localStorage (persistent per-device)
2. CDN edge cache (1h TTL)
3. Redis cache (24h TTL, invalidated on subdomain change)
4. Firestore subdomain_index (authoritative)
→ Returns { tenant_id, zitadel_org_id, branding }
→ Renders tenant-branded login → Zitadel org-scoped login

Authenticated API calls (the normal path) — zero lookups, tenant_id is in the token.
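The server-side part of the pre-auth lookup (a TTL cache in front of the authoritative index) can be sketched as follows. `TTLCache` is an in-memory stand-in for Redis and `authoritative_lookup` stands in for the Firestore subdomain_index read; both are hypothetical helpers, and the browser and CDN layers are out of scope for this sketch.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (stand-in for Redis)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        hit = self._items.get(key)
        if hit is None or hit[0] <= self._clock():
            self._items.pop(key, None)  # drop expired entries lazily
            return None
        return hit[1]

    def put(self, key: str, value: dict) -> None:
        self._items[key] = (self._clock() + self._ttl, value)

    def invalidate(self, key: str) -> None:
        """Call when a tenant changes its subdomain."""
        self._items.pop(key, None)


def resolve_subdomain(
    name: str,
    cache: TTLCache,
    authoritative_lookup: Callable[[str], Optional[dict]],
) -> Optional[dict]:
    """Serve from cache when possible, else read the authoritative index and cache it."""
    cached = cache.get(name)
    if cached is not None:
        return cached
    record = authoritative_lookup(name)
    if record is not None:
        cache.put(name, record)
    return record
```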

Key Architectural Patterns

Actors as Source of Truth, Firestore as Projection

PRD: Firestore ADR

All authoritative state lives in Restate actors. Firestore is a write-through projection for queryable access (dashboards, analytics, batch jobs). Request-path reads always go to actors. Eventual consistency between actor and projection is acceptable.

Two Auth Paths (End User vs Tenant Member)

PRDs: EndUserActor · Zitadel

  • End users authenticate via tenant's own IdP (BYOA) + publishable key (pk_*). Validated by OIDCValidator against tenant's OIDC config.
  • Tenant members authenticate via Zitadel (platform IdP). Validated by OIDCValidator against Zitadel JWKS.
  • Both converge at the unified Auth Service and use the same POST /verify endpoint.

Event-Driven Downstream Sync

PRD: Zitadel FR-5

Zitadel events drive tenant member lifecycle. Kafka fan-out updates Firestore projections, OpenFGA tuples, Chatwoot, Novu. No manual tuple management for tenant members. Exactly-once semantics via Zitadel event store sequence numbers.

Unified Identity Capture (EndUserActor)

PRD: EndUserActor

Every authenticated end-user request triggers a synchronous upsert to EndUserActor (<10ms, change-detected). Identity claims captured from JWT. Profile set explicitly by client app. Single actor for both concerns.

Orchestrated Provisioning (Workflow)

PRD: Tenant Onboarding FR-2

TenantOnboardingWorkflow uses Restate exactly-once semantics to provision OpenFGA, Novu, Lago, Unkey, Zitadel, and Firestore in order. Each step is idempotent. On failure, the workflow resumes from the last incomplete step without duplicating work.

Write-Through Pattern

PRD: Firestore ADR

When an actor's state changes, it writes through to Firestore (best-effort). If the Firestore write fails, actor state is still correct and catch-up happens via events later. Never read from Firestore in the request path.
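A minimal sketch of this pattern, under stated assumptions: `ActorWithProjection` is a hypothetical class, the projection writer is an injected callable rather than a Firestore client, and the `pending` list is only a stand-in for the event-based catch-up mentioned above.

```python
from typing import Callable, List


class ActorWithProjection:
    """Actor state is authoritative; the projection write is best-effort."""

    def __init__(self, write_projection: Callable[[dict], None]):
        self.state: dict = {}
        self.pending: List[dict] = []  # snapshots awaiting catch-up (illustrative)
        self._write_projection = write_projection

    def update(self, **changes) -> dict:
        # 1. Commit to actor state first: this is the source of truth.
        self.state = {**self.state, **changes}
        snapshot = dict(self.state)
        # 2. Attempt the projection write; a failure must not fail the update.
        try:
            self._write_projection(snapshot)
        except Exception:
            self.pending.append(snapshot)
        return self.state
```

Readers in the request path would call the actor and use `state` directly; only dashboards and batch jobs would read what `write_projection` produced.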

Lifecycle Coverage Matrix

Phase | PRD | Section Above
Tenant creation | Tenant Onboarding FR-2 | Flow 1
Region assignment | Data Residency | Flow 2
Initial OpenFGA tuples | Authorization | Flow 1
Initial API keys | API Key Management | Flow 1
End user first request | EndUserActor FR-2 | Flow 3
Tenant member first login (JIT) | Zitadel FR-4 | Flow 4
Tenant member invite (non-SSO) | Tenant Onboarding FR-6 | Flow 4
Tenant member lifecycle events | Zitadel FR-5 | Flow 5
Per-tenant policy enforcement | Policy Engine (OPA) | Flow 6
End user / tenant member ongoing requests | API Key Management · Zitadel | Flows 3, 4
Config updates | Tenant Onboarding FR-3 | Flow 7
Display name resolution | User-Aware Generation | Flow 8
Cross-system SSO (Chatwoot) | Zitadel FR-4 | Flow 9
Per-tenant data residency | Data Residency | Flow 2
Suspend / reactivate / decommission | Tenant Onboarding FR-5 | Flow 10