
feat: Vector Search microservice + parallel RAG engine #637

Merged
ShaerWare merged 1 commit into main from feat/vector-search-integration
Mar 23, 2026

Conversation

@ShaerWare
Owner

Summary

  • Add standalone Vector Search microservice (services/vector-search/) — FastAPI + ChromaDB + paraphrase-multilingual-mpnet-base-v2 (768 dims), Bearer token auth, persistent storage
  • Add async HTTP client (app/services/vector_search_client.py) — httpx with retry + graceful degradation
  • Integrate as parallel search engine in WikiRAGService: new search_async(), retrieve_async(), retrieve_multi_async() run BM25 + embeddings + vector search via asyncio.gather, merge/deduplicate results
  • Switch all 4 RAG injection call sites in chat router to async (_inject_rag_context_async)
  • Add background sync task (vector-search-sync) + incremental sync via DatasetSynced events
  • Add admin endpoints: GET /admin/wiki-rag/vector-search/status, POST .../sync
  • Add vector search status to /health endpoint
  • Docker: vector-search service (profile: vector-search, port 8003), vector_search_data volume
  • Env vars: VECTOR_SEARCH_URL, VECTOR_SEARCH_TOKEN
  • Update CLAUDE.md with new architecture docs

Relates to #489

NEWS

🔍 Vector Search: a new semantic search engine

Added a Vector Search microservice built on ChromaDB and the multilingual model paraphrase-multilingual-mpnet-base-v2. It runs in parallel with BM25 and embeddings: all three engines search concurrently, and the results are merged and deduplicated. Start it with Docker: docker compose --profile vector-search up -d.

Test plan

  • docker compose --profile vector-search up -d — microservice starts
  • curl http://localhost:8003/health — returns ok + model info
  • Restart orchestrator with VECTOR_SEARCH_URL=http://localhost:8003
  • GET /admin/wiki-rag/vector-search/status — shows connected
  • POST /admin/wiki-rag/vector-search/sync — upserts all sections
  • POST /admin/wiki-rag/search — results include engine: "vector_search"
  • Chat with RAG enabled — vector search results included in context
  • ruff check . + cd admin && npm run lint:check pass
  • All 106 unit tests pass
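
The curl and admin checks above can also be scripted. Below is a minimal sketch of building such a request with only the standard library. The helper name `build_request` is hypothetical, and sending `VECTOR_SEARCH_TOKEN` as an `Authorization: Bearer` header is an assumption based on the "Bearer token auth" note in the summary; the PR does not say which endpoints require it.

```python
import urllib.request
from typing import Optional


def build_request(base_url: str, path: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for the service; attach a Bearer token only if one is given."""
    # Assumption: the token from VECTOR_SEARCH_TOKEN travels in the Authorization header.
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    # Strip a trailing slash so base_url + path never produces a double slash.
    return urllib.request.Request(base_url.rstrip("/") + path, headers=headers)


# Example: the health check from the test plan (the request is built, not sent).
health = build_request("http://localhost:8003", "/health")
```

Passing the result to `urllib.request.urlopen(health, timeout=5)` would perform the actual call once the container is running.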

🤖 Generated with Claude Code

Add standalone Vector Search microservice (ChromaDB + paraphrase-multilingual-mpnet-base-v2,
768 dims) as a parallel search engine alongside BM25 and embeddings.

New files:
- services/vector-search/ — FastAPI microservice (Dockerfile, endpoints: upsert, search,
  compare, count, delete, health), Bearer token auth, persistent ChromaDB storage
- app/services/vector_search_client.py — async httpx client with retry + graceful degradation
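
As an illustration of "retry + graceful degradation", here is a minimal stdlib-only sketch. It is not the code in `vector_search_client.py`: the real client uses httpx, while this sketch takes a hypothetical injected coroutine (`request`) in place of the HTTP call. The names `call_with_retry` and `FlakyBackend`, the retry count, and the exponential backoff are all assumptions.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def call_with_retry(
    request: Callable[[], Awaitable[list]],
    retries: int = 3,
    base_delay: float = 0.1,
) -> Optional[list]:
    """Await `request` up to `retries` times; return None instead of raising if all fail."""
    for attempt in range(retries):
        try:
            return await request()
        except (ConnectionError, TimeoutError):
            if attempt < retries - 1:
                # Exponential backoff between attempts: base_delay, 2x, 4x, ...
                await asyncio.sleep(base_delay * 2 ** attempt)
    # Graceful degradation: signal "no results" so the caller can rely on other engines.
    return None


class FlakyBackend:
    """Hypothetical stand-in for the remote service: fails a fixed number of times."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    async def search(self) -> list:
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("vector search unavailable")
        return [{"title": "Intro"}]
```

Returning `None` rather than raising lets the caller treat an unreachable service the same as a disabled one, which is one plausible reading of "graceful degradation" here.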

Integration:
- WikiRAGService: new async methods (search_async, retrieve_async, retrieve_multi_async)
  run all engines in parallel via asyncio.gather, merge/deduplicate by (source_file, title)
- Chat router: all 4 RAG injection call sites switched to async _inject_rag_context_async
- Knowledge startup: init client from VECTOR_SEARCH_URL env, register vector-search-sync task,
  hook into DatasetSynced events for incremental sync
- Admin endpoints: GET /admin/wiki-rag/vector-search/status, POST .../sync
- Health check: vector_search section in /health response
- Docker: vector-search service (profile: vector-search, port 8003)

## NEWS

🔍 **Vector Search: a new semantic search engine**

Added a Vector Search microservice built on ChromaDB and the multilingual model
paraphrase-multilingual-mpnet-base-v2. It runs in parallel with BM25 and embeddings:
all three engines search concurrently, and the results are merged and deduplicated.
Start it with Docker: `docker compose --profile vector-search up -d`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ShaerWare ShaerWare merged commit d762198 into main Mar 23, 2026
3 checks passed
@ShaerWare ShaerWare deleted the feat/vector-search-integration branch March 23, 2026 17:28