using-system · Apr 17, 2026
diff --git a/README.md b/README.md
@@ -191,17 +191,23 @@ Feedback reactions produce: `feedback.receive_reaction` → `feedback.analyze`
 | `reddit_digest.feedback.reactions` | Counter | | Reactions received (`reaction_type`: `like`/`dislike`) |
 | `reddit_digest.feedback.preference_updates` | Counter | | Preference updates from feedback |
 
-### Local observability stack
-
-A full Docker Compose stack (OTel Collector, Tempo, Prometheus, Grafana, Phoenix) is available for local development. See [`docker-compose/observability-stack/`](docker-compose/observability-stack/) for setup instructions.
-
-## Deploy with Docker
+## Deploy agent with Docker
 
 ```bash
 docker build -t reddit-digest-agent .
 docker run -d --env-file .env --name reddit-digest reddit-digest-agent
 ```
 
+## Deploy agent with Docker Compose
+
+### LocalAI stack
+
+A Docker Compose stack runs the agent against a self-hosted [LocalAI](https://localai.io/) server that preloads `gemma-3-4b-it`, so the digest can be produced without any external LLM provider. See [`docker-compose/localai-stack/`](docker-compose/localai-stack/) for setup instructions.
+
+### Local observability stack
+
+A full Docker Compose stack (OTel Collector, Tempo, Prometheus, Grafana, Phoenix) is available for local development. See [`docker-compose/observability-stack/`](docker-compose/observability-stack/) for setup instructions.
+
 ## Development
 
 ```bash

diff --git a/docker-compose/localai-stack/README.md b/docker-compose/localai-stack/README.md
@@ -0,0 +1,62 @@
+# LocalAI Stack
+
+Local Docker Compose stack that runs the reddit-digest-agent against a self-hosted [LocalAI](https://localai.io/) inference server serving `gemma-3-4b-it`.
+
+## Architecture
+
+```
+Agent --OpenAI API--> LocalAI (gemma-3-4b-it)
+```
+
+The agent talks to LocalAI over its OpenAI-compatible endpoint, so no external LLM provider is required.
+
+## Prerequisites
+
+- Docker and Docker Compose
+- A configured `.env` file at the repository root (see `.env.example`). This stack overrides `OPENAI_API_KEY`, `OPENAI_BASE_URL`, and `LLM_MODEL` from that file.
+- About 3 GB of free disk space for the `gemma-3-4b-it` model weights (downloaded on first start and cached in the `localai-models` volume).
+
+## Quick Start
+
+```bash
+# From this directory
+docker compose up --build
+
+# Or from the repository root
+docker compose -f docker-compose/localai-stack/docker-compose.yml up --build
+```
+
+On the first run, LocalAI downloads the `gemma-3-4b-it` weights from its gallery, so expect a few minutes before the healthcheck turns green. The agent waits on the `service_healthy` condition for `localai` and starts only once the model is ready.
+
+The agent runs a single digest iteration (`--once`) and exits. LocalAI keeps running so you can query it directly or trigger another digest.
+
+## Services
+
+| Service | URL | Description |
+|---------|-----|-------------|
+| LocalAI | http://localhost:8080 | OpenAI-compatible inference API (`/v1/chat/completions`, `/v1/models`, …) |
+
+## Re-run the Agent
+
+```bash
+docker compose run --rm agent
+```
+
+## Probe LocalAI Directly
+
+```bash
+curl http://localhost:8080/v1/models
+curl http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{"model":"gemma-3-4b-it","messages":[{"role":"user","content":"Hello"}]}'
+```
+
+## Tear Down
+
+```bash
+# Stop the stack (keeps the model cache)
+docker compose down
+
+# Stop and also remove the model cache volume
+docker compose down -v
+```
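
Note on the README above: the `curl` probe can also be expressed in Python, which is closer to how the agent reaches LocalAI. The following is a minimal sketch, not the agent's actual client code. The helper names `build_chat_request` and `chat` are hypothetical, the base URL assumes the host port published by this stack, and the response shape is assumed to follow the OpenAI chat completions format that LocalAI aims to be compatible with.

```python
import json
import urllib.request

# Assumption: LocalAI is reachable on the host port published by this stack.
BASE_URL = "http://localhost:8080/v1"
MODEL = "gemma-3-4b-it"


def build_chat_request(prompt, model=MODEL, base_url=BASE_URL):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, timeout=120):
    """Send the request and return the first choice's message text.

    Assumes an OpenAI-style response body: choices[0].message.content.
    """
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the stack running, `print(chat("Hello"))` should behave like the second `curl` command in the README. Inside the Compose network the base URL would be `http://localai:8080/v1` instead, matching the `OPENAI_BASE_URL` override in the compose file.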
diff --git a/docker-compose/localai-stack/docker-compose.yml b/docker-compose/localai-stack/docker-compose.yml
@@ -0,0 +1,34 @@
+services:
+  localai:
+    image: localai/localai:latest
+    ports:
+      - "8080:8080"
+    environment:
+      - MODELS=gemma-3-4b-it
+      - THREADS=4
+    volumes:
+      - localai-models:/build/models
+    healthcheck:
+      test: ["CMD-SHELL", "curl -sf http://localhost:8080/readyz || exit 1"]
+      interval: 30s
+      timeout: 10s
+      retries: 40
+      start_period: 300s
+
+  agent:
+    build:
+      context: ../..
+      dockerfile: Dockerfile
+    env_file:
+      - ../../.env
+    environment:
+      - OPENAI_API_KEY=localai
+      - OPENAI_BASE_URL=http://localai:8080/v1
+      - LLM_MODEL=gemma-3-4b-it
+    command: ["python", "-m", "reddit_digest.main", "--once"]
+    depends_on:
+      localai:
+        condition: service_healthy
+
+volumes:
+  localai-models:
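
Note on the healthcheck settings above: because the agent waits on `service_healthy`, the healthcheck values bound how long a first-run model download may take before Compose gives up. The sketch below is a rough estimate, not Docker's exact scheduling. It assumes that failures during `start_period` are not counted and that, afterwards, `retries` consecutive failed probes mark the container unhealthy, with each probe starting `interval` seconds after the previous one finishes. The function name `healthcheck_budget_seconds` is hypothetical.

```python
def healthcheck_budget_seconds(interval, retries, start_period, timeout=0):
    """Approximate time a container may keep failing probes before it is
    marked unhealthy.

    start_period: grace window in which failed probes are not counted.
    retries:      consecutive failures tolerated after the grace window.
    interval:     wait between the end of one probe and the next.
    timeout:      worst-case duration of a single hanging probe.
    """
    return start_period + retries * (interval + timeout)


# Values from the localai service in this compose file.
fast_fail = healthcheck_budget_seconds(interval=30, retries=40, start_period=300)
slow_fail = healthcheck_budget_seconds(interval=30, retries=40, start_period=300, timeout=10)

print(fast_fail)  # 1500 seconds, about 25 minutes, if each probe fails immediately
print(slow_fail)  # 1900 seconds, about 32 minutes, if each probe hangs for the full timeout
```

Under these assumptions the download has roughly 25 to 32 minutes to finish. On a slow connection, raising `start_period` or `retries` would be the natural adjustment.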