
feat(backend): add multi-provider LLM support via LLM_PROFILE #3

Merged

dimknaf merged 1 commit into master from feat/local-llm-profile on May 9, 2026

Conversation


@dimknaf dimknaf commented May 9, 2026

Summary

Adds five new LLM provider profiles alongside deepinfra, so the agent backend can run on any of:

| Profile | Model | Required env |
| --- | --- | --- |
| gemini | gemini/gemini-3.1-flash-lite-preview | GEMINI_API_KEY |
| deepinfra | deepinfra/google/gemma-4-31B-it | DEEPINFRA_API_KEY |
| nim | nvidia_nim/google/gemma-4-31b-it | NVIDIA_NIM_API_KEY |
| together | together_ai/google/gemma-4-31B-it | TOGETHER_API_KEY |
| local | openai/qwen (llama.cpp / vLLM / LM Studio on :8003) | LOCAL_API_KEY + LOCAL_API_BASE |
| local_gemma | openai/cyankiwi/gemma-4-31B-it-AWQ-4bit (Gemma 4 31B AWQ on :8002) | LOCAL_API_KEY + LOCAL_API_BASE_GEMMA |

Switching providers is a single env-var flip (LLM_PROFILE=...) — no other config changes required.
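
For orientation, here is a minimal sketch of what the profile registry could look like in backend/API/config.py. The field names (model, api_key_env, api_base_env) are assumptions for illustration, not a copy of the actual implementation:

```python
# Hypothetical sketch of the profile registry described above; the real
# _LLM_PROFILES in backend/API/config.py may use different field names.
_LLM_PROFILES = {
    "gemini": {
        "model": "gemini/gemini-3.1-flash-lite-preview",
        "api_key_env": "GEMINI_API_KEY",
    },
    "deepinfra": {
        "model": "deepinfra/google/gemma-4-31B-it",
        "api_key_env": "DEEPINFRA_API_KEY",
    },
    "nim": {
        "model": "nvidia_nim/google/gemma-4-31b-it",
        "api_key_env": "NVIDIA_NIM_API_KEY",
    },
    "together": {
        "model": "together_ai/google/gemma-4-31B-it",
        "api_key_env": "TOGETHER_API_KEY",
    },
    # Only the local profiles carry an api_base_env, so the backend knows
    # which OpenAI-compatible endpoint to target.
    "local": {
        "model": "openai/qwen",
        "api_key_env": "LOCAL_API_KEY",
        "api_base_env": "LOCAL_API_BASE",
    },
    "local_gemma": {
        "model": "openai/cyankiwi/gemma-4-31B-it-AWQ-4bit",
        "api_key_env": "LOCAL_API_KEY",
        "api_base_env": "LOCAL_API_BASE_GEMMA",
    },
}
```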

What's in the diff

  • backend/API/config.py — six profiles in _LLM_PROFILES; new agent_api_base setting + resolved_api_base property; resolved base URL logged on startup.
  • backend/API/agent.py — _create_model() passes base_url=resolved_api_base to LitellmModel when the active profile defines an api_base_env. Cloud profiles behave exactly as before (base_url is omitted). See the sketch after this list.
  • docker-compose.local.yml — adds an extra_hosts entry (host.docker.internal:host-gateway) on the backend service so containerised runs can reach a local LLM running on the host. Docker Desktop already maps this; the explicit entry gives Linux parity.
  • backend/README.md + backend/.env.example — provider matrix and one-flip workflow documented.
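
A rough sketch of how the config.py and agent.py pieces could fit together, assuming the OpenAI Agents SDK's LitellmModel and the hypothetical _LLM_PROFILES shape shown earlier (the settings attribute names here are illustrative, not the real ones):

```python
# Hypothetical wiring for _create_model(); the real backend/API/agent.py
# may differ in structure and naming.
import os

from agents.extensions.models.litellm_model import LitellmModel


def _create_model(settings) -> LitellmModel:
    profile = _LLM_PROFILES[settings.llm_profile]  # e.g. "local_gemma"
    kwargs = {
        "model": profile["model"],
        "api_key": os.environ[profile["api_key_env"]],
    }
    # Only the local profiles define api_base_env; cloud profiles never set
    # base_url, so their behaviour is unchanged.
    if "api_base_env" in profile:
        kwargs["base_url"] = settings.resolved_api_base  # e.g. the URL from LOCAL_API_BASE_GEMMA
    return LitellmModel(**kwargs)
```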

Verification

  • pytest — all 56 backend tests pass.
  • LLM_PROFILE=deepinfra end-to-end via Docker Compose (cloud Gemma 4 31B).
  • LLM_PROFILE=local_gemma end-to-end via Docker Compose (AWQ Gemma served by vLLM on :8002, reached through host.docker.internal).

Test plan

  • Pull the branch, copy backend/.env.example to backend/.env, fill in DEEPINFRA_API_KEY, run docker-compose -f docker-compose.local.yml up --build. Hit /api/agent-status, expect "llm_profile": "deepinfra" and "litellm_model_initialized": true.
  • Run a single CSV row from the UI; result populates as before — no regression for the default cloud path.
  • (Optional) Stand up any OpenAI-compatible server on the host, set LLM_PROFILE=local (or local_gemma) with the matching LOCAL_API_BASE*, restart the backend with docker compose up -d (no rebuild), repeat — same result over the local endpoint.
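
For the first step, a small check like the following confirms the expected agent-status payload. It assumes the backend is published on localhost:8000; adjust the port to whatever docker-compose.local.yml exposes:

```python
# Sanity check for the /api/agent-status step above.
# Assumes the backend is reachable on localhost:8000; adjust as needed.
import requests

resp = requests.get("http://localhost:8000/api/agent-status", timeout=10)
resp.raise_for_status()
status = resp.json()

assert status["llm_profile"] == "deepinfra"
assert status["litellm_model_initialized"] is True
print("agent-status looks good:", status)
```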

Adds five new profiles alongside deepinfra (gemini, nim, together,
local, local_gemma) and threads base_url through to LitellmModel so
OpenAI-compatible local servers (llama.cpp, vLLM, LM Studio) work via
LLM_PROFILE=local or local_gemma. Switching providers is a single env
var flip — no other config changes required.

- config.py: six profiles in _LLM_PROFILES; agent_api_base setting;
  resolved_api_base property; startup log line for the resolved base URL.
- agent.py: _create_model() passes base_url=resolved_api_base when set.
- docker-compose.local.yml: extra_hosts maps host.docker.internal so the
  container can reach a local LLM on the host (Linux parity; Docker
  Desktop already maps this).
- README + .env.example: provider matrix and one-flip workflow.

Verified end-to-end with both deepinfra (cloud Gemma 4 31B) and
local_gemma (AWQ Gemma via vLLM on :8002). All 56 backend tests pass.
@dimknaf dimknaf merged commit 510e544 into master May 9, 2026
2 checks passed
@dimknaf dimknaf deleted the feat/local-llm-profile branch May 9, 2026 13:10
