cloudon-one/openclaw-serverless

OpenClaw Serverless — Multi-Tenant GCP Deployment

Serverless, per-tenant isolated OpenClaw deployment on GCP Cloud Run with GCSFuse workspace persistence.

Architecture

flowchart LR
    subgraph Channels["Messaging Channels"]
        TG(("Telegram"))
        SL(("Slack"))
    end

    subgraph CloudRun["GCP Cloud Run"]
        Router["**Router**\nPublic ingress\nSignature validation\nTenant routing"]

        subgraph TenantA["Tenant A"]
            direction TB
            CRA["Cloud Run\nInternal ingress"]
            SAA["SA: openclaw-sl-a"]
            GCSA[("GCS Bucket\n/data mount")]
            CRA --- SAA
            CRA -- "GCSFuse" --> GCSA
        end

        subgraph TenantB["Tenant B"]
            direction TB
            CRB["Cloud Run\nInternal ingress"]
            SAB["SA: openclaw-sl-b"]
            GCSB[("GCS Bucket\n/data mount")]
            CRB --- SAB
            CRB -- "GCSFuse" --> GCSB
        end
    end

    subgraph Secrets["Secret Manager"]
        direction TB
        Anthropic["Anthropic API Key\n(shared)"]
        TenantSecrets["Bot Tokens &\nWebhook Secrets\n(per-tenant)"]
    end

    TG -- "webhook POST" --> Router
    SL -- "webhook POST" --> Router
    Router -- "ID token\nauth" --> CRA
    Router -- "ID token\nauth" --> CRB
    SAA -. "secretAccessor" .-> Anthropic
    SAB -. "secretAccessor" .-> Anthropic
    SAA -. "secretAccessor" .-> TenantSecrets
    SAB -. "secretAccessor" .-> TenantSecrets
    Router -. "secretAccessor" .-> TenantSecrets

Request Flow

sequenceDiagram
    actor User
    participant Bot as Telegram / Slack
    participant Router as Router<br/>Cloud Run
    participant SM as Secret Manager
    participant Tenant as Tenant Service<br/>Cloud Run
    participant GCS as GCS Bucket

    User ->> Bot: Send message
    Bot ->> Router: POST /webhook/{channel}
    Router ->> SM: Get webhook secret (5min cache)
    Router ->> Router: Validate signature
    Router ->> Router: Match tenant by user/channel ID
    Router ->> Router: Fetch ID token for tenant URL (metadata server)
    Router ->> Tenant: Forward with Authorization header

    activate Tenant
    Tenant ->> GCS: Read context (GCSFuse /data)
    Tenant ->> Bot: Stream response via Bot API
    Tenant ->> GCS: Persist state (GCSFuse /data)
    deactivate Tenant

    Bot ->> User: Deliver response
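
The first three router steps in the flow above (cached secret lookup, then signature validation) can be sketched as follows. This is a minimal illustration and not the repository's actual router/index.js: the function names and the injected `loadSecret` callback are hypothetical. The header names and the Slack `v0:<timestamp>:<body>` base string follow the public Telegram Bot API and Slack request-signing documentation.

```javascript
const crypto = require("node:crypto");

const CACHE_TTL_MS = 5 * 60 * 1000; // the "5min cache" from the diagram

// Wraps an async secret loader with a per-name TTL cache.
// `loadSecret` and `now` are injected so the sketch has no GCP dependency.
function cachedSecrets(loadSecret, now = Date.now) {
  const cache = new Map(); // name -> { value, expiresAt }
  return async function getSecret(name) {
    const hit = cache.get(name);
    if (hit && hit.expiresAt > now()) return hit.value;
    // Trim so a trailing newline stored with the secret cannot break comparison.
    const value = (await loadSecret(name)).trim();
    cache.set(name, { value, expiresAt: now() + CACHE_TTL_MS });
    return value;
  };
}

// Constant-time string comparison; hashing first yields equal-length buffers.
function safeEqual(a, b) {
  const ha = crypto.createHash("sha256").update(String(a)).digest();
  const hb = crypto.createHash("sha256").update(String(b)).digest();
  return crypto.timingSafeEqual(ha, hb);
}

// Telegram echoes the setWebhook secret_token in this request header.
function isValidTelegram(headers, webhookSecret) {
  const got = headers["x-telegram-bot-api-secret-token"];
  return typeof got === "string" && safeEqual(got, webhookSecret);
}

// Slack signs "v0:<timestamp>:<raw body>" with HMAC-SHA256 of the signing secret.
function isValidSlack(headers, rawBody, signingSecret,
                      nowSec = Math.floor(Date.now() / 1000)) {
  const tsRaw = headers["x-slack-request-timestamp"];
  const got = headers["x-slack-signature"];
  if (typeof tsRaw !== "string" || typeof got !== "string") return false;
  const ts = Number(tsRaw);
  if (!Number.isFinite(ts) || Math.abs(nowSec - ts) > 300) return false; // replay window
  const mac = crypto
    .createHmac("sha256", signingSecret)
    .update(`v0:${tsRaw}:${rawBody}`)
    .digest("hex");
  return safeEqual(got, `v0=${mac}`);
}
```

Both checks need lowercase header names (as Node's HTTP server provides them), and the Slack check needs the raw request body, so any JSON parsing has to happen after validation.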

Isolation Model

Each tenant gets strict resource isolation — zero cross-tenant data access:

| Resource | Scope | Isolation |
| --- | --- | --- |
| Cloud Run Service | Per-tenant | Separate container, SA, env vars |
| GCS Bucket | Per-tenant | SA has objectAdmin on own bucket only |
| Service Account | Per-tenant | No IAM bindings to other tenant resources |
| Secrets | Per-tenant | Tenant SA can only access own secrets + shared Anthropic key |
| Network | Per-tenant | Internal ingress; only router can reach tenant services |
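
The Network row depends on the router authenticating itself to each tenant service. A minimal sketch of that hop follows, assuming the router obtains a Google-signed ID token from the Cloud Run metadata server with the tenant URL as audience (the standard mechanism for Cloud Run service-to-service calls). The function names and the forwarded path are hypothetical, and `fetchImpl` is injectable so the sketch runs without GCP.

```javascript
const METADATA_IDENTITY_URL =
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/identity";

// Builds the metadata-server request for an ID token whose audience is the tenant URL.
function identityTokenRequest(tenantUrl) {
  const url = new URL(METADATA_IDENTITY_URL);
  url.searchParams.set("audience", tenantUrl);
  return { url: url.toString(), headers: { "Metadata-Flavor": "Google" } };
}

// Forwards an already-validated webhook body to the tenant service with a bearer ID token.
// `fetchImpl` defaults to the global fetch (Node 18+).
async function forwardToTenant(tenantUrl, path, rawBody, fetchImpl = fetch) {
  const req = identityTokenRequest(tenantUrl);
  const tokenRes = await fetchImpl(req.url, { headers: req.headers });
  if (!tokenRes.ok) throw new Error(`metadata server returned ${tokenRes.status}`);
  const idToken = (await tokenRes.text()).trim();
  return fetchImpl(new URL(path, tenantUrl).toString(), {
    method: "POST",
    headers: {
      Authorization: `Bearer ${idToken}`,
      "Content-Type": "application/json",
    },
    body: rawBody,
  });
}
```

The token is only accepted if the router's service account also holds the invoker role on that tenant service, so a request without a valid token is rejected before it reaches the tenant container.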

Directory Structure

openclaw-serverless/
├── tenants.yaml              # Tenant definitions (single source of truth)
├── agent/
│   ├── Dockerfile            # Extends openclaw-gateway base image
│   └── entrypoint.sh         # Config generation + skills sync + gateway start
├── router/
│   ├── Dockerfile            # Node.js webhook router
│   ├── index.js              # Telegram/Slack signature validation + tenant routing
│   └── package.json
├── infrastructure/
│   ├── main.tf               # Provider, backend, shared resources (AR, Anthropic secret)
│   ├── tenant.tf             # Per-tenant: SA, GCS, secrets, Cloud Run service
│   ├── router.tf             # Router: SA, Cloud Run service, public IAM
│   ├── outputs.tf            # Webhook URLs, tenant URLs, bucket names
│   ├── variables.tf          # Input variables
│   └── terraform.tfvars      # Your project-specific values
└── scripts/
    ├── build.sh              # Build + push both container images
    └── deploy-tenant.sh      # Terraform apply + optional webhook setup

Quick Start

Prerequisites

  • GCP project with billing enabled
  • gcloud CLI authenticated
  • terraform or opentofu installed
  • Docker with linux/amd64 build support

1. Clone and configure

git clone https://github.com/YOUR_ORG/openclaw-serverless.git
cd openclaw-serverless

export PROJECT_ID=your-gcp-project-id
export REGION=us-central1
export REGISTRY="${REGION}-docker.pkg.dev/${PROJECT_ID}/openclaw"

gcloud config set project $PROJECT_ID

2. Enable APIs and create Artifact Registry

gcloud services enable run.googleapis.com \
  secretmanager.googleapis.com \
  artifactregistry.googleapis.com

gcloud artifacts repositories create openclaw \
  --repository-format=docker \
  --location=$REGION

3. Build and push container images

./scripts/build.sh

4. Store your Anthropic API key

echo -n "sk-ant-..." | gcloud secrets create openclaw-sl-anthropic-api-key \
  --data-file=- --replication-policy=automatic

5. Define your tenant

Edit tenants.yaml:

tenants:
  alice:
    display_name: "Alice Smith"
    telegram_user_id: "YOUR_TELEGRAM_USER_ID"
    telegram_enabled: true
    slack_enabled: false
    min_instances: 0
    max_instances: 1
    cpu: "2"
    memory: "2Gi"

Find your Telegram user ID by messaging @userinfobot.
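
To illustrate how a `telegram_user_id` value could drive the router's "match tenant by user ID" step, here is a minimal sketch. It assumes the YAML has already been parsed into a plain object; the function names and the sample ID are hypothetical, and only ordinary `message` updates are handled.

```javascript
// Mirrors the tenants.yaml entry above as a plain object (sample ID is a placeholder).
const tenants = {
  alice: {
    display_name: "Alice Smith",
    telegram_user_id: "123456789",
    telegram_enabled: true,
    slack_enabled: false,
  },
};

// Builds a Telegram user ID -> tenant name index and rejects bad or duplicate IDs.
function buildTelegramIndex(tenantMap) {
  const index = new Map();
  for (const [name, t] of Object.entries(tenantMap)) {
    if (!t.telegram_enabled) continue;
    const id = String(t.telegram_user_id);
    if (!/^\d+$/.test(id)) {
      throw new Error(`tenant ${name}: telegram_user_id must be numeric`);
    }
    if (index.has(id)) {
      throw new Error(`telegram_user_id ${id} is claimed by ${index.get(id)} and ${name}`);
    }
    index.set(id, name);
  }
  return index;
}

// Resolves the tenant for a Telegram update; returns null for unknown senders.
function tenantForUpdate(index, update) {
  const from = update?.message?.from?.id;
  return from === undefined ? null : index.get(String(from)) ?? null;
}

const index = buildTelegramIndex(tenants);
console.log(tenantForUpdate(index, { message: { from: { id: 123456789 } } }));
```

Keeping the ID as a quoted string in tenants.yaml and normalising the incoming numeric `from.id` with `String()` avoids type mismatches between the two sources.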

6. Deploy infrastructure

cd infrastructure
# Edit terraform.tfvars with your project ID, region, and image URIs
terraform init
terraform plan -out=tfplan
terraform apply tfplan

7. Create a Telegram bot and store secrets

Message @BotFather on Telegram, send /newbot, and copy the token it returns.

# Bot token
echo -n "YOUR_BOT_TOKEN" | gcloud secrets versions add \
  openclaw-sl-alice-telegram-token --data-file=-

# Webhook secret (random); tr strips the trailing newline that openssl appends
openssl rand -hex 32 | tr -d '\n' | gcloud secrets versions add \
  openclaw-sl-alice-telegram-webhook-secret --data-file=-

8. Register the Telegram webhook

# Run from the repository root
ROUTER_URL=$(cd infrastructure && terraform output -raw router_url)
BOT_TOKEN=$(gcloud secrets versions access latest \
  --secret=openclaw-sl-alice-telegram-token)
WEBHOOK_SECRET=$(gcloud secrets versions access latest \
  --secret=openclaw-sl-alice-telegram-webhook-secret)

curl "https://api.telegram.org/bot${BOT_TOKEN}/setWebhook" \
  -d "url=${ROUTER_URL}/webhook/telegram" \
  -d "secret_token=${WEBHOOK_SECRET}"
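
For scripting this step across many tenants, the same request can be assembled and checked before it is sent. The sketch below is an assumption-laden illustration: `buildSetWebhookRequest` is a hypothetical helper, and the only Telegram facts it relies on are the `setWebhook` parameters `url` and `secret_token`, the HTTPS requirement, and the documented `secret_token` character set (1-256 characters from A-Z, a-z, 0-9, `_` and `-`).

```javascript
// Telegram accepts 1-256 characters from A-Z, a-z, 0-9, "_" and "-" for secret_token.
const SECRET_TOKEN_RE = /^[A-Za-z0-9_-]{1,256}$/;

// Builds (but does not send) the setWebhook call for one tenant's bot.
function buildSetWebhookRequest(botToken, routerUrl, webhookSecret) {
  const secret = webhookSecret.trim(); // tolerate a trailing newline from the secret store
  if (!SECRET_TOKEN_RE.test(secret)) {
    throw new Error("secret_token contains characters Telegram rejects");
  }
  const url = new URL("/webhook/telegram", routerUrl);
  if (url.protocol !== "https:") {
    throw new Error("Telegram webhooks require an HTTPS URL");
  }
  const body = new URLSearchParams({ url: url.toString(), secret_token: secret });
  return {
    endpoint: `https://api.telegram.org/bot${botToken}/setWebhook`,
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: body.toString(),
  };
}
```

A caller would pass the result to `fetch(req.endpoint, req)` and check that the JSON response has `ok: true`.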

9. Update the agent config

Edit agent/entrypoint.sh and add your Telegram user ID to the allowFrom array:

"allowFrom": ["YOUR_TELEGRAM_USER_ID"],

Rebuild and push the agent image, then deploy a new Cloud Run revision:

./scripts/build.sh
gcloud run services update openclaw-alice --region=$REGION \
  --image=${REGISTRY}/agent:latest

Send a message to your bot. First response takes ~15-20 seconds (cold start); subsequent messages are fast.

Adding More Tenants

  1. Add an entry to tenants.yaml
  2. Run terraform apply
  3. Add the tenant's secret versions (bot token + webhook secret), as in Quick Start step 7
  4. Set the Telegram webhook
  5. Add the user ID to allowFrom in entrypoint.sh, rebuild, and redeploy

Tenants share only the router, the agent image, and the Anthropic API key; each tenant's Cloud Run service, bucket, service account, and secrets are its own.
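
As a preflight check before onboarding a tenant, the Secret Manager entries it needs can be derived from its name. The sketch below assumes the `openclaw-sl-<tenant>-...` naming seen in the Quick Start commands carries over to every tenant; the helper names and the tenant-name rule are hypothetical.

```javascript
// Shared secret name taken from Quick Start step 4.
const SHARED_SECRETS = ["openclaw-sl-anthropic-api-key"];

// Hypothetical tenant-name rule: lowercase letters, digits and hyphens, starting with a letter.
const TENANT_NAME_RE = /^[a-z][a-z0-9-]*$/;

// Telegram secrets for one tenant, following the naming used in Quick Start step 7.
function telegramSecretsFor(tenant) {
  if (!TENANT_NAME_RE.test(tenant)) {
    throw new Error(`invalid tenant name: ${tenant}`);
  }
  return [
    `openclaw-sl-${tenant}-telegram-token`,
    `openclaw-sl-${tenant}-telegram-webhook-secret`,
  ];
}

// Returns the required secret names that are absent from `existing`.
function missingSecrets(tenant, existing) {
  const have = new Set(existing);
  return [...SHARED_SECRETS, ...telegramSecretsFor(tenant)].filter((s) => !have.has(s));
}

console.log(missingSecrets("bob", ["openclaw-sl-anthropic-api-key"]));
```

The `existing` list could come from the names printed by `gcloud secrets list`; an empty result means the tenant's Telegram secrets are all in place.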

Key Design Decisions

| Decision | Rationale |
| --- | --- |
| GCSFuse over sync scripts | Cloud Run v2 native volume mounts; no sidecar or sync daemon needed |
| Shared router, isolated tenants | Single public endpoint validates signatures; tenant services use internal ingress only |
| Config as JSON at $OPENCLAW_STATE_DIR/openclaw.json | OpenClaw reads config from this path, not XDG_CONFIG_HOME |
| Always-overwrite config on startup | GCSFuse persists writes across container restarts; ensures latest config on every deploy |
| dmPolicy: allowlist with allowFrom | Cloud Run has no shell access for openclaw pairing approve; allowlist bypasses device pairing |
| Gen2 execution environment | Required for GCSFuse support in Cloud Run |
| cpu_idle: false | Keeps CPU allocated during idle; required for WebSocket/long-running agent sessions |

Cost Notes

  • min_instances: 0 is free at idle, with a ~15-20 second cold start
  • min_instances: 1 stays always warm and costs ~$50-70/month per tenant (2 vCPU / 2 GiB)
  • No NAT gateway is needed: Cloud Run has direct internet egress, unlike the AWS AgentCore pattern
  • GCS storage cost is negligible for agent workspace data (pennies/month)

License

MIT
