feat(backend): add multi-provider LLM support via LLM_PROFILE #3
Merged
Adds five new profiles alongside `deepinfra` (`gemini`, `nim`, `together`, `local`, `local_gemma`) and threads `base_url` through to `LitellmModel`, so OpenAI-compatible local servers (llama.cpp, vLLM, LM Studio) work via `LLM_PROFILE=local` or `local_gemma`. Switching providers is a single env-var flip; no other config changes required.

- `config.py`: six profiles in `_LLM_PROFILES`; `agent_api_base` setting; `resolved_api_base` property; startup log line for the resolved base URL.
- `agent.py`: `_create_model()` passes `base_url=resolved_api_base` when set.
- `docker-compose.local.yml`: `extra_hosts` maps `host.docker.internal` so the container can reach a local LLM on the host (Linux parity; Docker Desktop already maps this).
- README + `.env.example`: provider matrix and one-flip workflow.

Verified end-to-end with both `deepinfra` (cloud Gemma 4 31B) and `local_gemma` (AWQ Gemma via vLLM on :8002). All 56 backend tests pass.
Summary
Adds five new LLM provider profiles alongside `deepinfra`, so the agent backend can run on any of:

| Profile | Model (LiteLLM string) | Required env var(s) |
| --- | --- | --- |
| `gemini` | `gemini/gemini-3.1-flash-lite-preview` | `GEMINI_API_KEY` |
| `deepinfra` | `deepinfra/google/gemma-4-31B-it` | `DEEPINFRA_API_KEY` |
| `nim` | `nvidia_nim/google/gemma-4-31b-it` | `NVIDIA_NIM_API_KEY` |
| `together` | `together_ai/google/gemma-4-31B-it` | `TOGETHER_API_KEY` |
| `local` | `openai/qwen` (llama.cpp / vLLM / LM Studio on :8003) | `LOCAL_API_KEY` + `LOCAL_API_BASE` |
| `local_gemma` | `openai/cyankiwi/gemma-4-31B-it-AWQ-4bit` (Gemma 4 31B AWQ on :8002) | `LOCAL_API_KEY` + `LOCAL_API_BASE_GEMMA` |

Switching providers is a single env-var flip (`LLM_PROFILE=...`); no other config changes required. A hedged sketch of how these profiles might be encoded is shown below.
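For orientation, here is a minimal sketch of how such a profile table could be encoded in `config.py`. Only the names `_LLM_PROFILES` and `api_base_env` come from this PR; the other keys and the plain-dict shape are illustrative assumptions, not the actual implementation.

```python
# Illustrative sketch only: _LLM_PROFILES and api_base_env are named in this PR;
# the rest of the shape (field names, use of a plain dict) is assumed.
_LLM_PROFILES = {
    "gemini": {
        "model": "gemini/gemini-3.1-flash-lite-preview",
        "api_key_env": "GEMINI_API_KEY",
    },
    "deepinfra": {
        "model": "deepinfra/google/gemma-4-31B-it",
        "api_key_env": "DEEPINFRA_API_KEY",
    },
    "local_gemma": {
        "model": "openai/cyankiwi/gemma-4-31B-it-AWQ-4bit",
        "api_key_env": "LOCAL_API_KEY",
        # Only the local profiles define api_base_env; its value points the
        # model at an OpenAI-compatible server instead of a hosted provider.
        "api_base_env": "LOCAL_API_BASE_GEMMA",
    },
    # nim, together and local follow the same pattern.
}
```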
What's in the diff
- `backend/API/config.py`: six profiles in `_LLM_PROFILES`; new `agent_api_base` setting plus a `resolved_api_base` property; the resolved base URL is logged on startup.
- `backend/API/agent.py`: `_create_model()` passes `base_url=resolved_api_base` to `LitellmModel` when the active profile defines an `api_base_env`. Cloud profiles behave exactly as before (`base_url` is omitted). See the sketch after this list.
- `docker-compose.local.yml`: declares `host.docker.internal:host-gateway` on the backend service so containerised runs can reach a local LLM running on the host. Docker Desktop already maps this; the explicit entry gives Linux parity.
- `backend/README.md` + `backend/.env.example`: provider matrix and one-flip workflow documented.
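A rough sketch of the `_create_model()` wiring described above. The import path for `LitellmModel` and the `settings` attribute names other than `resolved_api_base` are assumptions and may not match the actual code.

```python
# Sketch, not the actual agent.py. Assumes a settings object exposing the field
# this PR names (resolved_api_base) plus assumed ones (model, api_key).
from agents.extensions.models.litellm_model import LitellmModel  # path may differ

def _create_model(settings) -> LitellmModel:
    kwargs = {
        "model": settings.model,      # e.g. "deepinfra/google/gemma-4-31B-it"
        "api_key": settings.api_key,  # resolved from the profile's api_key_env
    }
    # Cloud profiles define no api_base_env, so base_url is simply omitted and
    # behaviour is unchanged; local profiles point at the OpenAI-compatible server.
    if settings.resolved_api_base:
        kwargs["base_url"] = settings.resolved_api_base
    return LitellmModel(**kwargs)
```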
Verification

- `pytest`: all 56 backend tests pass.
- `LLM_PROFILE=deepinfra` end-to-end via Docker Compose (cloud Gemma 4 31B).
- `LLM_PROFILE=local_gemma` end-to-end via Docker Compose (AWQ Gemma served by vLLM on :8002, reached through `host.docker.internal`).
Test plan

1. Copy `backend/.env.example` to `backend/.env`, fill in `DEEPINFRA_API_KEY`, and run `docker-compose -f docker-compose.local.yml up --build`. Hit `/api/agent-status`; expect `"llm_profile": "deepinfra"` and `"litellm_model_initialized": true` (a scripted version of this check is sketched below).
2. Switch to `LLM_PROFILE=local` (or `local_gemma`) with the matching `LOCAL_API_BASE*`, restart the backend with `docker compose up -d` (no rebuild), and repeat; expect the same result over the local endpoint.
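For convenience, a minimal scripted version of the status check from step 1. It assumes the backend is exposed on `localhost:8000` (adjust host/port to your compose setup) and that `requests` is installed.

```python
# Smoke-check the agent status endpoint after switching LLM_PROFILE.
# The port (8000) and the expected profile name are assumptions; adjust as needed.
import requests

resp = requests.get("http://localhost:8000/api/agent-status", timeout=10)
resp.raise_for_status()
status = resp.json()

assert status["llm_profile"] == "deepinfra"          # or "local", "local_gemma", ...
assert status["litellm_model_initialized"] is True
print("agent-status OK:", status)
```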