This platform is designed to deploy and operate a scalable, queue-based browser agent (e.g., browser-use, Skyvern) execution environment on Kubernetes.
By processing self-hosted browser-use requests through an asynchronous job + queue + worker architecture, it handles concurrent request spikes, long-running executions, failure tracking, and horizontal scaling with minimal configuration.
- Queue-based async execution: The API returns a `job_id` immediately, while the actual execution is handled asynchronously by the Worker.
- Horizontal scalability: Automatically scales Worker replicas via KEDA based on Redis Streams consumer group lag.
- Job tracking: Persists job status (QUEUED/RUNNING/SUCCEEDED/FAILED) along with results/errors (including tracebacks) in PostgreSQL.
- Queue consumer-group: Ensures safe job distribution across multiple Worker instances.
- Kubernetes deployment: Supports standard K8s operational workflows including rollouts, restarts, and scaling.
- API / Orchestrator: FastAPI
- Queue: Redis Streams
- Worker: Python Worker + browser-use + Playwright (Chromium, headless)
- Database: PostgreSQL (Stores job status/result/error)
Goal: "Queue agent requests → Process safely via Workers → Provide results/status in a searchable format."
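The goal above can be sketched end-to-end with an in-memory stand-in for Redis Streams and PostgreSQL. This is a toy model of the flow, not the project's actual code:

```python
import queue
import uuid

# Toy stand-ins: a dict for the PostgreSQL job store, a Queue for Redis Streams.
jobs: dict[str, dict] = {}
stream: queue.Queue = queue.Queue()

def submit(task: str) -> str:
    """Orchestrator side: persist a QUEUED job, enqueue it, return job_id immediately."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"status": "QUEUED", "result": None, "error": None}
    stream.put((job_id, task))
    return job_id

def work_once(run_agent) -> None:
    """Worker side: pull one job, execute it, and persist the outcome."""
    job_id, task = stream.get()
    jobs[job_id]["status"] = "RUNNING"
    try:
        jobs[job_id]["result"] = {"raw": run_agent(task)}
        jobs[job_id]["status"] = "SUCCEEDED"
    except Exception as exc:  # the real worker also stores the traceback
        jobs[job_id]["error"] = str(exc)
        jobs[job_id]["status"] = "FAILED"

job_id = submit("Go to https://news.ycombinator.com/ and return top 5 stories.")
work_once(lambda task: "story list ...")  # stand-in for the browser agent
print(jobs[job_id]["status"])  # SUCCEEDED
```

Because the caller only ever holds a `job_id`, slow browser runs never block the API; status is queried separately.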
- `agents/`: Contains browser agent scripts. This project currently uses browser-use. To use another library-based agent (e.g., Skyvern), replace the script in this directory.
- `services/`: Source code for the system's core components.
  - `orchestrator/`: API server that receives requests and manages the job queue.
  - `worker/`: Asynchronous worker that pulls jobs from Redis and executes them.
- `deploy/`: Local development and deployment configurations (e.g., Docker Compose).
- `k8s/base/`: Kubernetes manifests for the Orchestrator, Worker, Postgres, Redis, and the observability stack.
- A Kubernetes cluster (k3s, kind, EKS, GKE, etc.)
- `kubectl` CLI installed and configured
- `helm` CLI (v3+) for installing dependencies
Install the following components before deploying the application:
KEDA (Kubernetes Event-driven Autoscaling)
KEDA is required for automatic worker scaling based on Redis Streams queue lag.
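KEDA drives worker scaling through a `ScaledObject` that uses its `redis-streams` scaler. The actual manifest ships under `k8s/base`; the sketch below is only illustrative, and the Deployment name, Redis address, stream key, and consumer group shown here are assumptions:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-scaler          # hypothetical name
  namespace: sba
spec:
  scaleTargetRef:
    name: worker               # assumed Worker Deployment name
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: redis-streams
      metadata:
        address: redis:6379    # assumed in-cluster Redis Service
        stream: jobs           # assumed stream key
        consumerGroup: workers # assumed consumer group
        pendingEntriesCount: "5"  # target pending entries per replica
```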
```bash
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
```

These components enhance observability but are not required for basic operation:
Prometheus + Grafana (Metrics & Observability)
```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install kube-prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace \
  -f k8s/base/monitoring/monitoring-values.yaml
```

Loki + Alloy (Centralized Logging)
```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm install loki grafana/loki \
  --namespace logging --create-namespace \
  -f k8s/base/logging/loki-values.yaml
helm install alloy grafana/alloy \
  --namespace logging \
  -f k8s/base/logging/alloy-values.yaml
```

- LLM provider API key (e.g., `GOOGLE_API_KEY` for Google Gemini)
- PostgreSQL password
- (Recommended) Inject via Kubernetes Secrets
Note: The Worker runs Playwright Chromium in headless mode. The Worker image must include the necessary dependencies for running a browser within the cluster.
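How those browser dependencies get into the image depends on your base image; a minimal sketch for a Debian-based Python image (the base image and tag are assumptions, not the project's actual Dockerfile):

```dockerfile
FROM python:3.12-slim
# Install Playwright plus the OS packages Chromium needs (--with-deps).
RUN pip install --no-cache-dir playwright \
    && playwright install --with-deps chromium
```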
This repository is designed to be installed and operated using the deployment manifests located in `k8s/`.
```bash
git clone https://github.com/squatboy/scalable-browser-agent.git
cd scalable-browser-agent
```

```bash
kubectl create ns sba
```

Check the `env`/`secretRef` configuration in the deployment manifests for the required secret keys. Generally, the following values are required:
```bash
kubectl -n sba create secret generic sba-secrets \
  --from-literal=GOOGLE_API_KEY="YOUR_GOOGLE_API_KEY" \
  --from-literal=POSTGRES_PASSWORD="YOUR_POSTGRES_PASSWORD"
```

Note: Actual key and secret names must match the manifests in `k8s/`.
```bash
kubectl apply -k k8s/base
```

```bash
kubectl -n sba get pods -o wide
kubectl -n sba get svc
```

- NodePort (Current MVP Stage)
- Service type: If exposed as a NodePort, you can access the API via `<NODE_IP>:<nodePort>` of a cluster node.
- For external access, you can expose the Orchestrator via NodePort and reach it at `http://<NODE_PUBLIC_IP>:<NODEPORT>` (e.g., `/docs`, `/healthz`). For production, consider Ingress + TLS + auth.
```bash
kubectl -n sba get svc orchestrator
# PORT(S): 8000:<nodePort>/TCP
```

`POST /v1/run-agent`
- Request
```bash
curl -s -X POST http://127.0.0.1:8000/v1/run-agent \
  -H "Content-Type: application/json" \
  -d '{
    "task":"Go to https://news.ycombinator.com/ and return top 5 stories."
  }'
```

- Response

```json
{ "job_id": "..." }
```

`GET /v1/jobs/{job_id}`
- Request
```bash
curl -s http://127.0.0.1:8000/v1/jobs/<JOB_ID>
```

- Response (example)
```json
{
  "job_id": "...",
  "agent_id": "browser-use-generic",
  "status": "SUCCEEDED",
  "result": { "raw": "..." },
  "error": null
}
```

Tip: In the MVP, `browser-use-generic` stores the result directly in `result.raw` without parsing. If you need structured (JSON) results, request a format like "Return ONLY JSON ..." in your task.
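Since `result.raw` is a plain string even when the agent was asked for JSON, clients still have to decode and validate it themselves. A small hedged helper, assuming only the response shape shown above:

```python
import json
from typing import Any

def parse_raw_result(job: dict) -> Any:
    """Decode result.raw as JSON if possible, else return the raw string.

    Assumes the GET /v1/jobs/{job_id} response shape shown above.
    """
    if job.get("status") != "SUCCEEDED":
        raise RuntimeError(f"job not finished: {job.get('status')} ({job.get('error')})")
    raw = (job.get("result") or {}).get("raw", "")
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw  # agent ignored the "Return ONLY JSON" instruction

job = {"job_id": "x", "agent_id": "browser-use-generic",
       "status": "SUCCEEDED",
       "result": {"raw": '{"stories": ["a", "b"]}'}, "error": None}
print(parse_raw_result(job))  # {'stories': ['a', 'b']}
```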
- Symptom: Orchestrator returns a 500 error indicating the `jobs` table does not exist.
- Cause: The PostgreSQL schema (tables) has not been created yet.
- Resolution: Check whether the DB migration job has completed.
```bash
kubectl -n sba get job
kubectl -n sba logs job/db-migrate
```

- Symptom: `GOOGLE_API_KEY is not set` appears in Worker logs.
- Resolution: Check the Kubernetes Secret/env settings and redeploy.
- Cause: Site response delay / Browser execution issues / No external network access / Worker hang.
- Action: Check Worker logs and GC CronJob logs.
A GC CronJob periodically enforces job timeouts (e.g., `JOB_TIMEOUT_SECONDS`), expires stale QUEUED jobs, and cleans up old finished jobs based on the retention policy.
- KEDA-based autoscaling (Automatic worker scaling based on Redis backlog)
- Helm chart support
- Observability: metrics / tracing / structured logs
- Retry/backoff, timeout, cancellation, DLQ
- Ingress + TLS + Auth (production-ready access)
- Multi-tenant / per-tenant quota / budget-aware scheduling