
feat: configurable embedding provider, model, and dimensions via environment variables #77

Open
Ashwin-3cS wants to merge 3 commits into MystenLabs:dev from Ashwin-3cS:feat/server-configurable-embedding-llm

Conversation

@Ashwin-3cS
Contributor

Description

The server currently hardcodes openai/text-embedding-3-small for embeddings and openai/gpt-4o-mini for LLM calls. This limits flexibility for self-hosters who may want to:

  • Use a dedicated embedding provider (Jina, Cohere, etc.) with a separate key or budget
  • Use OpenRouter free models instead of OpenAI directly
  • Customize embedding dimensions (e.g. 1024 for Jina v3)

At the moment, achieving this requires modifying the source code.


Changes

Configuration (types.rs)

Adds five new Config fields, all read from environment variables:

| Variable | Default | Purpose |
| --- | --- | --- |
| `EMBEDDING_API_KEY` | falls back to `OPENAI_API_KEY` | API key for the embedding provider |
| `EMBEDDING_API_BASE` | falls back to `OPENAI_API_BASE` | Base URL for the embedding provider |
| `EMBEDDING_MODEL` | `openai/text-embedding-3-small` | Embedding model identifier |
| `EMBEDDING_DIMENSIONS` | omitted | Optional output dimension override (e.g. 1024) |
| `LLM_MODEL` | `openai/gpt-4o-mini` | LLM used for `/api/analyze` and `/api/ask` |

This improves flexibility for developers in the ecosystem by allowing different providers, models, and configurations without modifying source code.
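A rough sketch of how these fields might be read in types.rs. The struct and function names here are illustrative (the real struct is the existing Config), but the variable names, fallbacks, and defaults match the table above.

```rust
use std::env;

// Illustrative struct; in the PR these are five new fields on the
// existing Config in types.rs.
struct EmbeddingConfig {
    embedding_api_key: Option<String>,
    embedding_api_base: Option<String>,
    embedding_model: String,
    embedding_dimensions: Option<u32>,
    llm_model: String,
}

impl EmbeddingConfig {
    fn from_env() -> Self {
        Self {
            // EMBEDDING_* falls back to the matching OPENAI_* variable.
            embedding_api_key: env::var("EMBEDDING_API_KEY")
                .or_else(|_| env::var("OPENAI_API_KEY"))
                .ok(),
            embedding_api_base: env::var("EMBEDDING_API_BASE")
                .or_else(|_| env::var("OPENAI_API_BASE"))
                .ok(),
            // Defaults preserve the previously hardcoded models.
            embedding_model: env::var("EMBEDDING_MODEL")
                .unwrap_or_else(|_| "openai/text-embedding-3-small".to_string()),
            // Left as None (and omitted from requests) unless set.
            embedding_dimensions: env::var("EMBEDDING_DIMENSIONS")
                .ok()
                .and_then(|v| v.parse().ok()),
            llm_model: env::var("LLM_MODEL")
                .unwrap_or_else(|_| "openai/gpt-4o-mini".to_string()),
        }
    }
}
```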


Embedding + LLM usage (routes.rs)

  • generate_embedding() now uses a fallback chain:
    • EMBEDDING_API_KEY → OPENAI_API_KEY
    • EMBEDDING_API_BASE → OPENAI_API_BASE
  • Adds optional dimensions field to embedding API request
    • Uses skip_serializing_if = "Option::is_none" for provider compatibility
  • Replaces hardcoded LLM model with config.llm_model in all call sites
  • Mock embedding path respects EMBEDDING_DIMENSIONS for consistency in dev/test
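To illustrate why the `skip_serializing_if` annotation matters: the `dimensions` key must be absent from the request body entirely (not `null`) when unset, since some providers reject unknown or null fields. A minimal serde-free sketch of that behavior (the actual code uses a serde request struct with `#[serde(skip_serializing_if = "Option::is_none")]`):

```rust
// Builds an embedding request body where `dimensions` is only present
// when explicitly configured; mirrors the skip_serializing_if behavior.
fn embedding_request_body(model: &str, input: &str, dimensions: Option<u32>) -> String {
    let mut body = format!(r#"{{"model":"{}","input":"{}""#, model, input);
    if let Some(d) = dimensions {
        // Appended only when EMBEDDING_DIMENSIONS is set, so providers
        // that do not support the field never see it.
        body.push_str(&format!(r#","dimensions":{}"#, d));
    }
    body.push('}');
    body
}
```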

Server startup (main.rs)

  • Logs active configuration at startup:
    • embedding model
    • base URL
    • optional dimensions
    • LLM model
  • Emits a WARN if EMBEDDING_DIMENSIONS does not match the schema column dimension
    • Prevents silent breakage in cosine similarity queries
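The mismatch check can be sketched as follows. The constant and function names are hypothetical (the real schema dimension comes from the deployed database, not a constant), but the logic matches the described startup WARN:

```rust
// Hypothetical: dimension of the embedding column in the deployed schema.
const SCHEMA_EMBEDDING_DIM: u32 = 1536;

// Returns the WARN line to log at startup when EMBEDDING_DIMENSIONS
// disagrees with the schema, or None when they are consistent.
fn dimension_mismatch_warning(configured: Option<u32>) -> Option<String> {
    match configured {
        Some(d) if d != SCHEMA_EMBEDDING_DIM => Some(format!(
            "WARN: EMBEDDING_DIMENSIONS={d} does not match schema column \
             dimension {SCHEMA_EMBEDDING_DIM}; cosine similarity queries \
             over existing rows will degrade silently"
        )),
        _ => None,
    }
}
```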

Environment config (.env.example)

  • Documents all new environment variables
  • Includes commented example values (e.g. Jina setup)
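For reference, a hypothetical .env excerpt along the lines of the documented Jina setup (the base URL shown is an assumption; check the provider's docs for the exact endpoint):

```shell
# Embedding provider (falls back to OPENAI_API_KEY / OPENAI_API_BASE if unset)
# EMBEDDING_API_KEY=your-jina-key
# EMBEDDING_API_BASE=https://api.jina.ai/v1
# EMBEDDING_MODEL=jina-embeddings-v3
# EMBEDDING_DIMENSIONS=1024

# LLM for /api/analyze and /api/ask
# LLM_MODEL=openai/gpt-4o-mini
```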

Documentation (docs/)

  • Updated:
    • environment-variables.md
    • self-hosting.md
  • Includes:
    • Explanation of fallback chains
    • Guidance on using custom providers
    • Warning against mixing embedding dimensions mid-deployment

Documentation has been updated to reflect these changes. If any adjustments or clarifications are needed, feel free to point them out and I'll follow up in this PR.


Backwards Compatibility

All new variables have defaults that preserve the current behavior.
Existing deployments with no .env changes remain unaffected.


Testing

  • Verified locally with:
    • Jina embeddings (EMBEDDING_MODEL=jina-embeddings-v3, EMBEDDING_DIMENSIONS=1024)
    • OpenRouter free LLM
  • Server startup logs correctly reflect active configuration
  • Dimension mismatch warning triggers when schema and config differ

Commits

  • Add five env vars so self-hosters can swap in any OpenAI-compatible embedding provider (Jina, Cohere, OpenRouter) and any LLM without touching code.
  • Fallback chain: EMBEDDING_API_KEY → OPENAI_API_KEY, EMBEDDING_API_BASE → OPENAI_API_BASE. Mock embedding path respects EMBEDDING_DIMENSIONS so dev/test dimensions stay consistent. Server logs a WARN at boot if the schema column dimension doesn't match EMBEDDING_DIMENSIONS.
  • Add EMBEDDING_API_KEY, EMBEDDING_API_BASE, EMBEDDING_MODEL, EMBEDDING_DIMENSIONS, and LLM_MODEL to environment-variables.md and self-hosting.md. Includes Jina example values in .env.example and a warning about not mixing embedding dimensions across memories.
