feat(docker): add localai-stack compose for self-hosted gemma-3-4b-it #18
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,62 @@ | ||
| # LocalAI Stack | ||
| | ||
| Local Docker Compose stack that runs the reddit-digest-agent against a self-hosted [LocalAI](https://localai.io/) inference server serving `google/gemma-3-4b-it`. | ||
Comment on lines +1 to +3
| | ||
| ## Architecture | ||
| | ||
| ``` | ||
| Agent --OpenAI API--> LocalAI (gemma-3-4b-it) | ||
| ``` | ||
| | ||
| The agent talks to LocalAI over its OpenAI-compatible endpoint, so no external LLM provider is required. | ||
| | ||
| ## Prerequisites | ||
| | ||
| - Docker and Docker Compose | ||
| - A configured `.env` file at the repository root (see `.env.example`). `OPENAI_BASE_URL` and `LLM_MODEL` from the file are overridden by this stack. | ||
| - ~3 GB of free disk space for the gemma-3-4b-it model weights (downloaded on first start and cached in the `localai-models` volume). | ||
| | ||
| ## Quick Start | ||
| | ||
| ```bash | ||
| # From this directory | ||
| docker compose up --build | ||
| | ||
| # Or from the repository root | ||
| docker compose -f docker-compose/localai-stack/docker-compose.yml up --build | ||
| ``` | ||
| | ||
| On the first run LocalAI downloads the `gemma-3-4b-it` weights from its gallery — expect a few minutes before the healthcheck turns green. The agent is blocked on `localai: service_healthy` and will only start once the model is ready. | ||
| | ||
| The agent runs a single digest iteration (`--once`) and exits. LocalAI keeps running so you can query it directly or trigger another digest. | ||
| | ||
| ## Services | ||
| | ||
| | Service | URL | Description | | ||
| |---------|-----|-------------| | ||
| | LocalAI | http://localhost:8080 | OpenAI-compatible inference API (`/v1/chat/completions`, `/v1/models`, …) | | ||
Comment on lines +35 to +37
| | ||
| ## Re-run the Agent | ||
| | ||
| ```bash | ||
| docker compose run --rm agent | ||
| ``` | ||
| | ||
| ## Probe LocalAI directly | ||
| | ||
| ```bash | ||
| curl http://localhost:8080/v1/models | ||
| curl http://localhost:8080/v1/chat/completions \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{"model":"gemma-3-4b-it","messages":[{"role":"user","content":"Hello"}]}' | ||
| ``` | ||
| | ||
| ## Tear Down | ||
| | ||
| ```bash | ||
| # Stop the stack (keeps the model cache) | ||
| docker compose down | ||
| | ||
| # Stop and also remove the model cache volume | ||
| docker compose down -v | ||
| ``` | ||
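The curl probes in the README can also be scripted. The following is a minimal Python sketch, not part of this PR, assuming LocalAI is reachable on the published port 8080 and returns the OpenAI-style response shape (`choices[0].message.content`). The helper names `build_chat_request`, `extract_reply`, and `chat` are hypothetical.

```python
import json
import urllib.request

# Assumption: the port published by the compose file in this PR.
BASE_URL = "http://localhost:8080/v1"


def build_chat_request(model: str, prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for /chat/completions."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Pull the assistant text out of an OpenAI-style chat completion response."""
    return response_body["choices"][0]["message"]["content"]


def chat(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to LocalAI and return the reply text (needs the stack running)."""
    request = build_chat_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(json.load(response))


# Example call, with the stack up:
#   print(chat("gemma-3-4b-it", "Hello"))
```

Keeping request construction and response parsing separate from the network call makes both easy to check without a running LocalAI container.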
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| services: | ||
|   localai: | ||
|     image: localai/localai:latest | ||
|     ports: | ||
|       - "8080:8080" | ||
|     environment: | ||
|       - MODELS=gemma-3-4b-it | ||
|       - THREADS=4 | ||
|     volumes: | ||
|       - localai-models:/build/models | ||
|     healthcheck: | ||
|       test: ["CMD-SHELL", "curl -sf http://localhost:8080/readyz || exit 1"] | ||
|       interval: 30s | ||
|       timeout: 10s | ||
|       retries: 40 | ||
|       start_period: 300s | ||
| | ||
|   agent: | ||
|     build: | ||
|       context: ../.. | ||
|       dockerfile: Dockerfile | ||
|     env_file: | ||
|       - ../../.env | ||
|     environment: | ||
|       - OPENAI_API_KEY=localai | ||
|       - OPENAI_BASE_URL=http://localai:8080/v1 | ||
|       - LLM_MODEL=gemma-3-4b-it | ||
|     command: ["python", "-m", "reddit_digest.main", "--once"] | ||
|     depends_on: | ||
|       localai: | ||
|         condition: service_healthy | ||
| | ||
| volumes: | ||
|   localai-models: |
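A rough worked estimate of how long the `agent` service may wait on `service_healthy` with the healthcheck settings in this compose file. It assumes Docker's documented behaviour that failed probes during `start_period` do not count toward `retries`, and that each failing probe takes somewhere between no time and the full `timeout`. The function name is hypothetical and the result is an approximation, not an exact bound.

```python
def healthcheck_budget_seconds(
    interval: int, timeout: int, retries: int, start_period: int
) -> tuple[int, int]:
    """Return (optimistic, pessimistic) seconds before the container is marked unhealthy.

    optimistic:  every failing probe returns immediately (e.g. connection refused)
    pessimistic: every failing probe runs for the full timeout
    """
    optimistic = start_period + retries * interval
    pessimistic = start_period + retries * (interval + timeout)
    return optimistic, pessimistic


# Values taken from the localai healthcheck in this compose file.
low, high = healthcheck_budget_seconds(interval=30, timeout=10, retries=40, start_period=300)
print(f"unhealthy after roughly {low // 60} to {high // 60} minutes")
# prints "unhealthy after roughly 25 to 31 minutes"
```

Under these assumptions the README's "a few minutes" is the expected case, while the settings tolerate a much slower first download before Compose gives up on starting the agent.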
README mentions serving `google/gemma-3-4b-it`, but the rest of this stack (compose `MODELS`/`LLM_MODEL` and the example `/v1/chat/completions` payload) uses `gemma-3-4b-it`. Please align the model identifier in the docs with the actual model name that LocalAI exposes (either update this line to `gemma-3-4b-it` or update the compose/examples accordingly).
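One way to confirm which identifier LocalAI actually exposes is to compare the configured name against the `/v1/models` listing. The sketch below is illustrative only: it assumes the response follows the OpenAI list shape (`{"data": [{"id": ...}, ...]}`), and the helper names `fetch_models`, `served_model_ids`, and `is_served` are hypothetical.

```python
import json
import urllib.request


def fetch_models(base_url: str = "http://localhost:8080/v1", timeout: float = 10.0) -> dict:
    """GET /models from a running LocalAI instance (needs the stack up)."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as response:
        return json.load(response)


def served_model_ids(models_response: dict) -> list[str]:
    """Extract model ids, assuming the OpenAI-style {"data": [{"id": ...}]} shape."""
    return [entry["id"] for entry in models_response.get("data", [])]


def is_served(model_id: str, models_response: dict) -> bool:
    """True if model_id exactly matches one of the ids in the listing."""
    return model_id in served_model_ids(models_response)


# Example, with the stack up:
#   listing = fetch_models()
#   for candidate in ("gemma-3-4b-it", "google/gemma-3-4b-it"):
#       print(candidate, is_served(candidate, listing))
```

The match is an exact string comparison, so a prefixed name such as `google/gemma-3-4b-it` would not match an unprefixed entry in the listing.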