An OpenClaw skill that caches LLM responses by meaning using Redis vector search. Similar questions return cached answers in ~100ms instead of making expensive API calls.
77x faster. 60-80% cost reduction. One command to install.
```
clawhub install semantic-cache
```

Every OpenClaw agent makes LLM API calls. Many of these are semantically identical: "How do I reset my password?" and "I forgot my password, how do I change it?" are the same question worded differently. Without caching, each one costs tokens and takes seconds.
Exact-match caching doesn't help because users never ask the same question the same way twice.
Semantic Cache embeds every query into a vector and stores it in Redis. When a new query comes in, it finds the most semantically similar cached query using cosine similarity. If the match is strong enough, it returns the cached response instantly.
First ask: "What are the benefits of drinking water?"
→ Cache MISS → LLM call (9.2s, 398 tokens) → cached
Second ask: "What are the health benefits of drinking water regularly?"
→ Cache HIT (0.832 similarity, 119ms) → instant response, zero tokens
```
Query → Embed (text-embedding-3-small) → Redis HNSW Vector Search
                                                    ↓
                                          similarity > 0.80?
                                          ↓                ↓
                                         YES               NO
                                          ↓                ↓
                                  Return cached     Call LLM → Cache → Return
                                    (~100ms)               (2-10s)
```
- Incoming query is embedded into a 1536-dimension vector
- Redis vector search (HNSW algorithm) finds the nearest cached query
- If cosine similarity exceeds the threshold (default 0.80), return the cached response
- If not, pass through to the LLM, cache the response for future similar queries
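The hit/miss decision at the heart of these steps can be sketched in plain Node. This is a minimal, self-contained sketch: `cosineSimilarity` and `decide` are illustrative names, not the skill's actual API, and the pure-JS similarity computation stands in for what Redis computes inside the HNSW index.

```javascript
// Cosine similarity between two embedding vectors (stand-in for the
// computation Redis performs inside the HNSW index).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Default threshold matches the skill's documented default of 0.80.
const THRESHOLD = parseFloat(process.env.SEMANTIC_CACHE_THRESHOLD ?? '0.80');

// Given the query's embedding and the nearest cached entry, decide
// whether to serve the cached response or fall through to the LLM.
function decide(queryVec, cached) {
  const similarity = cosineSimilarity(queryVec, cached.vector);
  return similarity >= THRESHOLD
    ? { hit: true, similarity, response: cached.response }
    : { hit: false, similarity };
}
```

Note that in the real lookup path, Redis's COSINE metric returns a *distance*, so the skill would derive similarity as `1 - distance` before comparing against the threshold.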
| Metric | Value |
|---|---|
| Cache hit lookup | ~100-120ms |
| Full LLM call | ~2-10 seconds |
| Speedup | 77x on cache hits |
| Embedding cost | ~$0.00002 per query |
| Storage per entry | ~6KB |
| Concurrent lookups | 5 in 491ms (98ms avg) |
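The headline 77x follows directly from the example timings above (a 9.2 s LLM call versus a 119 ms cache hit):

```javascript
// Speedup on a cache hit = full LLM latency / cached lookup latency
const llmMs = 9200; // first ask: 9.2s LLM call
const hitMs = 119;  // second ask: 119ms cache hit
console.log((llmMs / hitMs).toFixed(1) + 'x'); // ≈ 77.3x
```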
```
clawhub install semantic-cache
```

| Variable | Required | Description |
|---|---|---|
| `REDIS_URL` | Yes | Redis connection string (Redis Cloud or Redis Stack with vector search) |
| `OPENAI_API_KEY` | Yes | For generating embeddings |
| `SEMANTIC_CACHE_THRESHOLD` | No | Similarity threshold 0-1 (default: 0.80) |
| `SEMANTIC_CACHE_TTL` | No | Cache TTL in seconds (default: 86400 = 24 hours) |
Free tier on Redis Cloud works. Requires vector search support (included in Redis Cloud and Redis Stack).
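A typical environment setup might look like this; all values are placeholders, and only the variable names come from the table above:

```shell
# Required
export REDIS_URL="redis://default:<password>@<host>:<port>"   # Redis Cloud / Redis Stack
export OPENAI_API_KEY="sk-..."                                # used for embeddings only

# Optional tuning
export SEMANTIC_CACHE_THRESHOLD="0.80"  # similarity cutoff, 0-1
export SEMANTIC_CACHE_TTL="86400"       # entry lifetime in seconds (24 hours)
```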
```
node scripts/cache.js query "How do I reset my password?"
node scripts/cache.js store "What is your return policy?" "We offer 30-day returns on all items."
node scripts/cache.js lookup "How can I return a product?"
node scripts/cache.js stats
node scripts/cache.js clear
node scripts/cache.js test
```

```
=== SEMANTIC CACHE STRESS TEST ===

Test 1: Store and exact recall
PASS: Exact match is a hit
PASS: Similarity is ~1.0
PASS: Lookup under 500ms

Test 2: Semantic paraphrase
PASS: Paraphrase detected as similar

Test 3: Completely different query
PASS: Different query is a miss
PASS: Low similarity

Test 4: Multiple entries
PASS: Cancel query matches cancel entry
PASS: Payment query matches payment entry
PASS: Docs query matches docs entry

Test 5: Edge cases
PASS: Empty string doesn't crash
PASS: Single char doesn't crash
PASS: Very long query doesn't crash

Test 6: Concurrent lookups
PASS: 5 concurrent lookups complete
PASS: Total time under 3s (491ms for 5 queries)

Test 7: Hit count tracking
PASS: Entries exist

=== RESULTS ===
Passed: 15/15
Failed: 0/15
```
The `SEMANTIC_CACHE_THRESHOLD` variable controls the tradeoff between cache-hit rate and accuracy:
| Threshold | Behavior | Best For |
|---|---|---|
| 0.70 | Aggressive — more hits, higher false-positive risk | FAQ bots, support flows with limited question variety |
| 0.80 | Balanced (default) — good hit rate, low false positives | General purpose |
| 0.90 | Conservative — fewer hits, very precise matching | Code generation, technical queries where precision matters |
| 0.95 | Strict — near-exact matches only | Safety-critical applications |
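Concretely, the 0.832 similarity from the walkthrough above lands differently depending on the row you pick (`isHit` is an illustrative helper, not part of the skill's API):

```javascript
// A match counts as a hit when its similarity clears the threshold
const isHit = (similarity, threshold) => similarity >= threshold;

const similarity = 0.832; // paraphrase match from the example walkthrough
console.log(isHit(similarity, 0.70)); // true  (aggressive)
console.log(isHit(similarity, 0.80)); // true  (default)
console.log(isHit(similarity, 0.90)); // false (conservative)
```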
```
┌─────────────────────────────────────────────┐
│               OpenClaw Agent                │
├─────────────────────────────────────────────┤
│            Semantic Cache Skill             │
│                                             │
│  ┌───────────┐   ┌────────────────────────┐ │
│  │  OpenAI   │   │      Redis Cloud       │ │
│  │ Embeddings│   │  ┌──────────────────┐  │ │
│  │           │──>│  │   HNSW Vector    │  │ │
│  │   text-   │   │  │   Search Index   │  │ │
│  │ embedding-│   │  │                  │  │ │
│  │  3-small  │   │  │  1536-dim float  │  │ │
│  └───────────┘   │  │ cosine distance  │  │ │
│                  │  └──────────────────┘  │ │
│                  └────────────────────────┘ │
└─────────────────────────────────────────────┘
```
- Redis (Gold Sponsor) — HNSW vector search for semantic similarity matching
- OpenAI — text-embedding-3-small for query vectorization
- OpenClaw — Skill framework and ClawHub publishing
MIT-0 (ClawHub standard)
