Context search engine for AI agents and humans.
Your files are the truth, seekx is just the index.
No GPU. Hybrid Search. Realtime Index.
Index once, find anything. seekx brings SOTA hybrid search to your local documents — just your files and a single command, reducing token costs by 90%.
```sh
seekx add ~/notes
seekx search "how do agents use tool calling"
```
That's it. Your notes are indexed, and you're searching.
If you find seekx useful, please consider giving it a ⭐ on GitHub — it really helps!
You have hundreds of Markdown files, notes, docs — scattered across folders. Spotlight finds filenames. Grep finds exact strings. Neither understands what you're looking for.
seekx does.
| What you get | How |
|---|---|
| Find by meaning, not just keywords | Hybrid search fuses BM25 keyword matching with vector semantic search via RRF — you get results whether you remember the exact words or not. |
| Up and running in 2 minutes | No GPU, no model downloads, no Docker. Point it at any OpenAI-compatible API and go. |
| Always in sync | Edit a file, search it instantly. The index updates as you work — no manual rebuilds. |
| Works with Chinese, Japanese, Korean | Jieba-based tokenization built in. CJK full-text search just works. |
- Hybrid search — BM25 + vector + Reciprocal Rank Fusion, out of the box
- Cross-encoder reranking — optional rerank API for higher-precision results
- Query expansion — automatic query rewriting via LLM for better recall
- HyDE — Hypothetical Document Embeddings for improved semantic retrieval
- Content-aware chunking — Markdown heading-based splitting; plain-text paragraph splitting
- Incremental indexing — SHA-1 content hashing skips unchanged files; only re-embeds what changed
- CJK tokenization — Jieba-based segmentation for Chinese, Japanese, and Korean text
- MCP server — expose your knowledge base to AI agents via Model Context Protocol
- OpenClaw plugin — drop-in memory backend that replaces `memory-core` with seekx's hybrid pipeline
- JSON output — every command supports `--json` for scripting and piping
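To make the incremental-indexing idea concrete, here is a minimal sketch of SHA-1 change detection — illustrative only, not seekx's actual schema (seekx persists hashes in SQLite; the in-memory `Map` here is a stand-in):

```typescript
import { createHash } from "node:crypto";

// Stand-in for the persisted index: path -> SHA-1 of last indexed content.
const storedHashes = new Map<string, string>();

function sha1(content: string): string {
  return createHash("sha1").update(content).digest("hex");
}

// A file is re-embedded only when its content hash differs from the stored one.
function needsReindex(path: string, content: string): boolean {
  const hash = sha1(content);
  if (storedHashes.get(path) === hash) return false; // unchanged: skip
  storedHashes.set(path, hash); // new or edited: record hash and re-embed
  return true;
}

console.log(needsReindex("notes/a.md", "hello"));  // true  (new file)
console.log(needsReindex("notes/a.md", "hello"));  // false (unchanged)
console.log(needsReindex("notes/a.md", "hello!")); // true  (edited)
```

Because the check hashes content rather than comparing timestamps, a `touch` or re-save without changes costs one hash, not one embedding API call.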
- Bun ≥ 1.1.0
- An OpenAI-compatible embedding API (SiliconFlow, Jina, Ollama, OpenAI, etc.)
From npm (recommended) — install the CLI globally; npm pulls seekx-core automatically.
```sh
npm install -g seekx
# or: bun add -g seekx
```

You still need Bun on your PATH at runtime — the published CLI runs via Bun, not Node.
From source — for development or unreleased commits:
```sh
git clone https://github.com/oceanbase/seekx.git
cd seekx
bun install
bun link --cwd packages/cli   # makes 'seekx' available globally
```

```sh
seekx onboard   # interactive — configure API, check environment
```

`onboard` walks you through API key setup, embedding model selection, and macOS SQLite configuration for vector search.
```sh
# Add directories to the index
seekx add ~/notes
seekx add ~/Documents/obsidian --name obsidian

# Hybrid search (BM25 + vector + RRF)
seekx search "vector database embedding"

# Search with automatic query expansion
seekx query "how does RRF fusion work"

# Pure semantic search
seekx vsearch "semantic similarity"
```

```sh
seekx watch   # watches all indexed collections
```

```
Query
  │
  ├─── [Query Expansion] ──► expanded queries
  │                                │
  ▼                                ▼
Original query               Expanded queries
  │                                │
  ├─► BM25 (weight 2×)             ├─► BM25 (weight 1×)
  ├─► Vector (weight 2×)           ├─► Vector (weight 1×)
  │                                │
  │   [HyDE] ──► Vector (1×)       │
  │                                │
  └────────── all lists ───────────┘
                  │
             RRF Fusion
                  │
              [Rerank]
                  │
                Final
```
- Query expansion (optional): an LLM rewrites the query into multiple variants for better recall.
- The original query and all expanded variants are run against BM25 and vector indexes in parallel. Original results carry 2× weight in fusion; expanded results carry 1×.
- HyDE (optional): a hypothetical answer is generated and embedded as an additional vector search pass.
- All result lists are merged via Reciprocal Rank Fusion (RRF).
- Reranking (optional): a cross-encoder re-scores the fused candidates with position-aware blending.
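The fusion step fits in a few lines. Below is a generic Reciprocal Rank Fusion sketch using the weights described above and the common `k = 60` constant — an illustration of the technique, not seekx's actual code:

```typescript
// Each input list is ranked best-first. A document's fused score is the sum of
// weight / (k + rank) over every list it appears in, so documents that rank
// well across several retrievers rise to the top.
type RankedList = { ids: string[]; weight: number };

function rrfFuse(lists: RankedList[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const { ids, weight } of lists) {
    ids.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// BM25 and vector results for the original query (2× weight) plus one
// expanded-query list (1×); a HyDE pass would simply be another 1× list.
const fused = rrfFuse([
  { ids: ["doc-a", "doc-b", "doc-c"], weight: 2 }, // BM25, original query
  { ids: ["doc-b", "doc-a", "doc-d"], weight: 2 }, // vector, original query
  { ids: ["doc-d", "doc-b"], weight: 1 },          // BM25, expanded query
]);
console.log(fused[0]); // → "doc-b" (high in all three lists)
```

Note that RRF only looks at ranks, never raw scores, which is why BM25 and cosine-similarity lists can be fused without any score normalization.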
| Command | Description |
|---|---|
| `seekx onboard` | Interactive setup wizard |
| `seekx add <path>` | Index a directory (creates a collection) |
| `seekx collections` | List all indexed collections |
| `seekx remove <name>` | Remove a collection |
| `seekx reindex [name]` | Rebuild the index for a collection |
| `seekx search <query>` | Hybrid search (BM25 + vector + RRF) |
| `seekx query <query>` | Hybrid search with query expansion |
| `seekx vsearch <query>` | Pure vector search |
| `seekx get <id>` | Retrieve a document by ID |
| `seekx watch` | Start the realtime file watcher |
| `seekx status` | Show index stats and health |
| `seekx config` | View or update configuration |
| `seekx mcp` | Start MCP server (stdio) for AI agents |
All commands support `--json` for machine-readable output.
Config file: `~/.seekx/config.yml`
```yaml
# Provider defaults (shared across services)
provider:
  base_url: https://api.siliconflow.cn/v1
  api_key: sk-...

# Embedding — required for vector search
embed:
  model: BAAI/bge-m3

# Cross-encoder reranking — optional
rerank:
  model: BAAI/bge-reranker-v2-m3

# Query expansion — optional
expand:
  model: Qwen/Qwen3-8B

# Search defaults
search:
  default_limit: 10
  rerank: true
  min_score: 0.3

# File watcher
watch:
  debounce_ms: 500
  ignore:
    - node_modules
    - .git
```

Each service (`embed`, `rerank`, `expand`) can override `base_url`, `api_key`, and `model` independently if you use different providers.
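For example, a split-provider setup might embed through a local Ollama endpoint while reranking through the shared provider — the URLs and model names below are purely illustrative:

```yaml
embed:
  base_url: http://localhost:11434/v1   # illustrative: local Ollama endpoint
  api_key: ollama
  model: bge-m3
rerank:
  model: BAAI/bge-reranker-v2-m3        # inherits provider.base_url and api_key
```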
| Variable | Description |
|---|---|
| `SEEKX_API_KEY` | API key (overrides config) |
| `SEEKX_BASE_URL` | Base URL (overrides config) |
| `SEEKX_DB_PATH` | SQLite database path (default: `~/.seekx/index.sqlite`) |
| `SEEKX_CONFIG_PATH` | Config file path (default: `~/.seekx/config.yml`) |
| `SEEKX_SQLITE_PATH` | Path to `libsqlite3.dylib` (macOS, for extension loading) |
The system SQLite on macOS disables extension loading. For vector search (sqlite-vec):
```sh
brew install sqlite
```

seekx auto-detects standard Homebrew paths (Apple Silicon and Intel). If auto-detection fails:

```sh
export SEEKX_SQLITE_PATH="$(brew --prefix sqlite)/lib/libsqlite3.dylib"
```

`seekx onboard` will check this and guide you.
Expose your indexed knowledge base to AI agents (Claude Desktop, Cursor, etc.) via the Model Context Protocol:
```sh
seekx mcp   # starts an MCP server over stdio
```

The server exposes four tools: `search`, `get`, `list`, `status`.
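For clients that launch MCP servers from a JSON config (Claude Desktop and Cursor both use this shape), an entry might look like the following — the `"seekx"` key name is arbitrary, and the `seekx` binary must be on the client's PATH:

```json
{
  "mcpServers": {
    "seekx": {
      "command": "seekx",
      "args": ["mcp"]
    }
  }
}
```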
seekx-openclaw is a drop-in plugin that replaces OpenClaw's built-in memory-core backend with seekx's full hybrid search pipeline. Once installed, every memory_search and memory_get call is transparently routed through BM25 + vector + rerank — no changes to your agent or prompts are needed.
```sh
openclaw plugins install seekx-openclaw
```

Configure the plugin in `~/.openclaw/openclaw.json`:
```json
{
  "plugins": {
    "slots": { "memory": "seekx" },
    "entries": { "seekx": { "enabled": true } }
  }
}
```

What you get beyond the built-in backend:
- Hybrid BM25 + vector search with cross-encoder reranking
- CJK-aware full-text search via Jieba
- Proactive auto-recall: injects relevant memory into prompts before the agent even searches
- `Source: path#line` citation footers on search results (QMD-compatible)
- Search timeout protection (default 8 s)
- Graceful degradation — BM25-only works with no API key
The plugin inherits API credentials from `~/.seekx/config.yml` when that file exists, so no duplication is needed if you already use the seekx CLI. See the plugin README for provider configuration, extra directories, and troubleshooting.
```
packages/
  core/              seekx-core — engine library (SQLite, search, indexer, watcher)
  cli/               seekx — CLI + MCP server
  openclaw-plugin/   seekx-openclaw — OpenClaw memory backend plugin
  bench/             Retrieval benchmarks (SciFact, MIRACL-zh)
```
| Package | Version | Description |
|---|---|---|
| `seekx` | | CLI — 13 commands, MCP server, realtime watcher |
| `seekx-core` | | Search engine library (Node / Bun compatible) |
| `seekx-openclaw` | | OpenClaw memory backend plugin |
```sh
bun test --recursive packages/   # run all tests
bun run typecheck                # tsc -b
bun run lint                     # biome check
bun run format                   # biome format --write
```

- MCP server — expose your knowledge base to AI agents (Claude Desktop, Cursor, etc.)
- OpenClaw memory backend plugin (`seekx-openclaw`)
- Session transcript indexing for the OpenClaw plugin
- PDF and DOCX support
- Multi-tenancy (isolated indexes per user/workspace)
- Web UI for search and collection management
- Plugin system for custom file parsers
MIT
