Scalable Browser Agent


This platform deploys and operates a scalable, queue-based browser agent (e.g., browser-use, Skyvern) execution environment on Kubernetes.

By processing self-hosted browser-use requests through a job + queue + worker architecture, it handles the usual operational concerns (concurrent request spikes, long-running executions, failure tracking, and horizontal scaling) with minimal configuration.

Features

  • Queue-based async execution: The API returns a job_id immediately, while the actual execution is handled asynchronously by the Worker.
  • Horizontal scalability: Automatically scales Worker replicas via KEDA based on Redis Streams consumer group lag.
  • Job tracking: Persists job status (QUEUED/RUNNING/SUCCEEDED/FAILED) along with results and errors (including tracebacks) in PostgreSQL.
  • Consumer groups: Redis Streams consumer groups ensure safe job distribution across multiple Worker instances.
  • Kubernetes deployment: Supports standard K8s operational workflows including rollouts, restarts, and scaling.

Architecture

(Architecture diagram)

Core Stack

  • API / Orchestrator: FastAPI
  • Queue: Redis Streams
  • Worker: Python Worker + browser-use + Playwright (Chromium, headless)
  • Database: PostgreSQL (Stores job status/result/error)

Goal: "Queue agent requests → Process safely via Workers → Provide results/status in a searchable format."
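The queue side of this flow can be sketched in Python. The stream and group names below (`jobs`, `workers`) and the payload layout are illustrative assumptions; the real names live in the `services/` source.

```python
import json

# Fields stored in a Redis Streams entry for one job (assumed layout).
def encode_job(job_id: str, task: str) -> dict:
    return {"job_id": job_id, "payload": json.dumps({"task": task})}

# Inverse of encode_job; note a real Redis client returns bytes, so the
# worker would decode to str first.
def decode_job(fields: dict) -> tuple[str, dict]:
    return fields["job_id"], json.loads(fields["payload"])

# With redis-py, the orchestrator and worker sides would look roughly like:
#   r.xadd("jobs", encode_job(job_id, task))                      # orchestrator: enqueue
#   r.xreadgroup("workers", consumer, {"jobs": ">"}, count=1)     # worker: claim one entry
#   r.xack("jobs", "workers", msg_id)                             # worker: ack after success
```

Because consumption goes through a consumer group, an entry claimed by one worker stays in that worker's pending list until acknowledged, which is what makes distribution across replicas safe.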

Project Structure

  • agents/: Contains browser agent scripts. This project currently uses browser-use. If you want to use other library-based agents (e.g., Skyvern), you can replace the script in this directory.
  • services/: Source code for the system's core components.
    • orchestrator/: API server that receives requests and manages the job queue.
    • worker/: Asynchronous worker that pulls jobs from Redis and executes them.
  • deploy/: Local development and deployment configurations (e.g., Docker Compose).
  • k8s/base/: Kubernetes manifests for various components including the Orchestrator, Worker, Postgres, Redis, and observability stack.

Prerequisites

Kubernetes

  • A Kubernetes cluster (any distribution: k3s, kind, EKS, GKE, etc.)
  • kubectl CLI installed and configured
  • helm CLI (v3+) for installing dependencies

Required Dependencies

Install the following components before deploying the application:

KEDA (Kubernetes Event-driven Autoscaling)

KEDA is required for automatic worker scaling based on Redis Streams queue lag.

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
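Once KEDA is installed, worker autoscaling is driven by a ScaledObject using the redis-streams trigger. The fragment below is an illustrative sketch; the Deployment name, Redis address, stream, and group names are assumptions, and the actual manifest lives in k8s/base/.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-scaler
  namespace: sba
spec:
  scaleTargetRef:
    name: worker                  # hypothetical Worker Deployment name
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: redis-streams
      metadata:
        address: redis:6379       # hypothetical Redis Service address
        stream: jobs              # hypothetical stream name
        consumerGroup: workers    # hypothetical consumer group name
        pendingEntriesCount: "5"  # target ~5 pending entries per replica
```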

Optional Dependencies

These components enhance observability but are not required for basic operation:

Prometheus + Grafana (Metrics & Observability)

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install kube-prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace \
  -f k8s/base/monitoring/monitoring-values.yaml

Loki + Alloy (Centralized Logging)

helm repo add grafana https://grafana.github.io/helm-charts
helm install loki grafana/loki \
  --namespace logging --create-namespace \
  -f k8s/base/logging/loki-values.yaml
helm install alloy grafana/alloy \
  --namespace logging \
  -f k8s/base/logging/alloy-values.yaml

Credentials / Secrets

  • LLM Provider API Key (e.g., GOOGLE_API_KEY for Google Gemini)
  • PostgreSQL password
  • (Recommended) Inject via Kubernetes Secrets

Note: The Worker runs Playwright Chromium in headless mode. The Worker image must include the necessary dependencies for running a browser within the cluster.

Quickstart

This repository is designed to be installed and operated using the deployment manifests located in k8s/.

1) Install

git clone https://github.com/squatboy/scalable-browser-agent.git
cd scalable-browser-agent

2) Create namespace

kubectl create ns sba

3) Create secrets

Check the env/secretRef configuration in the deployment manifests for the required secret keys. Typically, the following values are needed:

kubectl -n sba create secret generic sba-secrets \
  --from-literal=GOOGLE_API_KEY="YOUR_GOOGLE_API_KEY" \
  --from-literal=POSTGRES_PASSWORD="YOUR_POSTGRES_PASSWORD"

Note: Actual key names and secret names must match the manifests in k8s/.
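For reference, a Deployment consumes such a secret via a container spec fragment like the following (illustrative; the real manifest is in k8s/, and may reference individual keys instead):

```yaml
envFrom:
  - secretRef:
      name: sba-secrets
```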

4) Apply manifests

kubectl apply -k k8s/base

5) Verify

kubectl -n sba get pods -o wide
kubectl -n sba get svc

Accessing the API

  • At the current MVP stage, the Orchestrator is exposed via a NodePort Service.
  • With NodePort exposure, the API is reachable at http://<NODE_IP>:<nodePort> (e.g., /docs, /healthz); for external access, use the node's public IP.
  • For production, consider Ingress + TLS + Auth instead of a raw NodePort.
kubectl -n sba get svc orchestrator
# PORT(S): 8000:<nodePort>/TCP

API Reference

1) Run an agent job

POST /v1/run-agent

  • Request
curl -s -X POST http://127.0.0.1:8000/v1/run-agent \
  -H "Content-Type: application/json" \
  -d '{
  "task":"Go to https://news.ycombinator.com/ and return top 5 stories."
  }'
  • Response
{ "job_id": "..." }

2) Get job status/result

GET /v1/jobs/{job_id}

  • Request
curl -s http://127.0.0.1:8000/v1/jobs/<JOB_ID>
  • Response (example)
{
  "job_id": "...",
  "agent_id": "browser-use-generic",
  "status": "SUCCEEDED",
  "result": { "raw": "..." },
  "error": null
}

Tip: In the MVP, browser-use-generic stores the result directly in result.raw without parsing. If you need structured (JSON) results, specify a format like "Return ONLY JSON ..." in your task.
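Since execution is asynchronous, a client submits a job and then polls GET /v1/jobs/{job_id} until it reaches a terminal status. A minimal polling helper might look like this (a sketch; the status-fetching callable is injected so any HTTP client works):

```python
import time

TERMINAL = {"SUCCEEDED", "FAILED"}

def wait_for_job(get_status, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll until the job reaches a terminal status or the timeout elapses.

    get_status: callable returning the parsed GET /v1/jobs/{job_id} response.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = get_status()
        if job["status"] in TERMINAL:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job['status']} after {timeout}s")
        sleep(interval)

# Example wiring (assumes the `requests` package and a reachable orchestrator):
#   import requests
#   base = "http://127.0.0.1:8000"
#   job_id = requests.post(f"{base}/v1/run-agent",
#                          json={"task": "..."}).json()["job_id"]
#   job = wait_for_job(lambda: requests.get(f"{base}/v1/jobs/{job_id}").json())
```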

Troubleshooting

1) relation "jobs" does not exist

  • Symptom: Orchestrator returns a 500 error indicating the jobs table does not exist.
  • Cause: The PostgreSQL schema (tables) has not been created yet.
  • Resolution: Check if the DB migration job has completed.
kubectl -n sba get job
kubectl -n sba logs job/db-migrate

2) API Key not set

  • Symptom: GOOGLE_API_KEY is not set appears in Worker logs.
  • Resolution: Check Kubernetes Secret/Env settings and redeploy.

3) Job stuck in RUNNING

  • Cause: Site response delay / Browser execution issues / No external network access / Worker hang.
  • Action: Check Worker logs and GC CronJob logs.
    A GC CronJob periodically enforces job timeouts (e.g., JOB_TIMEOUT_SECONDS), expires stale QUEUED jobs, and cleans up old finished jobs based on retention policy.
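The timeout check the GC performs can be sketched as a pure function (illustrative only; the real logic lives in the GC CronJob's script, and the field names and default timeout here are assumptions):

```python
from datetime import datetime, timedelta, timezone

JOB_TIMEOUT_SECONDS = 1800  # example value; configured via JOB_TIMEOUT_SECONDS env

def select_timed_out(jobs, now=None, timeout=JOB_TIMEOUT_SECONDS):
    """Return jobs stuck in RUNNING longer than the timeout.

    The GC would mark these FAILED with a timeout error so they stop
    appearing as live work.
    """
    now = now or datetime.now(timezone.utc)
    limit = timedelta(seconds=timeout)
    return [
        j for j in jobs
        if j["status"] == "RUNNING" and now - j["started_at"] > limit
    ]
```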

Roadmap

  • KEDA-based autoscaling (Automatic worker scaling based on Redis backlog)
  • Helm chart support
  • Observability: metrics / tracing / structured logs
  • Retry/backoff, timeout, cancellation, DLQ
  • Ingress + TLS + Auth (production-ready access)
  • Multi-tenant / per-tenant quota / budget-aware scheduling
