TL;DR
Production semantic search at `https://cms.jesusfilm.org/api/search` is silently degraded: the OpenRouter query-embedding call is failing (or returning non-overlapping results), leaving keyword search as the only contributing retrieval. The `try/catch` in `apps/cms/src/api/search/services/search.ts:154-166` swallows the failure, so the API still returns 200s with results. The hybrid search promise is not being delivered in production.
Discovered while validating PR #777 (feat-086). Roadmap ticket: `docs/roadmap/content-discovery/feat-097-investigate-prod-query-embedding.md`.
Evidence
Tested 6 queries against production on 2026-04-15:
| Query | Top score | Has scene-level data? |
| --- | --- | --- |
| Easter | 0.500 | No (`startSeconds: null`, `playbackId: null`) |
| forgiveness | 0.500 | No |
| Jesus heals | 0.500 | No |
| resurrection | 0.500 | No |
| centurion at the cross | empty | N/A |
| feeling alone in suffering | empty | N/A |
The score `0.500` is exactly what a rank-1 keyword hit receives when two lists are passed to RRF (reciprocal rank fusion) and the semantic list is empty.
If semantic were contributing AND ranked the same items at rank-1, scores would be 1.000. They are not. The thematic-only queries that should have zero keyword matches return empty — strong evidence that semantic isn't producing usable results.
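The arithmetic behind these two figures can be sketched as follows. This is an illustration, not the project's fusion code: it assumes the score is a standard RRF sum normalized by the best possible score (rank 1 in every list), and `normalizedRrf` and the constant `k = 60` are hypothetical. Under that assumption the keyword-only value comes out to 0.5 regardless of `k`.

```typescript
// Sketch of normalized reciprocal rank fusion (RRF).
// ASSUMPTION: score = sum(1 / (k + rank)) / best possible sum. Not taken from search.ts.
const K = 60 // common RRF default; the production value is unknown

function normalizedRrf(ranks: Array<number | null>, k: number = K): number {
  // ranks[i] is the item's 1-based rank in list i, or null if the list does not contain it.
  const raw = ranks.reduce<number>(
    (sum, rank) => (rank === null ? sum : sum + 1 / (k + rank)),
    0,
  )
  const best = ranks.length / (k + 1) // rank 1 in every list
  return raw / best
}

const keywordOnly = normalizedRrf([1, null]) // 0.5: semantic list contributes nothing
const bothRankOne = normalizedRrf([1, 1]) // 1: rank 1 in both lists
const oneAndTwo = normalizedRrf([1, 2]) // about 0.992 with k = 60
```

A top score pinned at exactly 0.5 across unrelated queries is therefore what this model predicts when only one of the two lists ever has entries.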
For comparison, the same code paths run against a local DB return rich semantic results: themes (new life, awe, meaning), bible verses, demographics, scene-level snippets. Production returns none of this. The code is identical — only the runtime environment differs.
Hypotheses (Ranked)
1. `OPENROUTER_API_KEY` env var missing or invalid in Railway. Most likely. The `try/catch` logs `strapi.log.warn(...)`, which may be filtered out of production retention. (`apps/cms/src/api/search/services/search.ts:154-166` swallows the failure; `apps/cms/src/lib/openrouter.ts` requires the env var.)
2. OpenRouter API outage or throttling on the production IP.
3. Model deprecation: `text-embedding-3-small` renamed/removed.
4. Network egress blocked Railway → OpenRouter.
5. Semantic returning data, but for different videos that never make top-5 (unlikely given empty thematic results).
Investigation Plan
1. Confirm from logs
```bash
railway logs --service forge-cms | grep -E '(embedding failed|OPENROUTER|\[search\])'
```
If `[search] Query embedding failed, falling back to keyword-only: ...` appears on every query, hypothesis 1 is confirmed.
2. Verify the env var
```bash
railway variables --service forge-cms | grep OPENROUTER
```
If missing, set from Doppler. If present, validate it works:
```bash
curl -X POST https://openrouter.ai/api/v1/embeddings \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": "test", "model": "text-embedding-3-small"}'
```
3. Make the failure visible going forward
The current `try/catch` is too quiet. It should still degrade gracefully (don't break the API) but should also:
- Log at `error` level, not `warn`, so it surfaces in default log retention
- Increment a metric (e.g., `search_query_embedding_failures_total`) so degraded operation triggers alerts
- Optionally surface a non-blocking signal in the API response (`degraded: true` flag, or `X-Search-Mode: keyword-only` header) so consumers can render a banner
4. Add a synthetic health probe
Run a single test embedding at boot or periodically; report failure to monitoring. Catches regressions before users notice.
```ts
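// apps/cms/src/bootstrap/probe-openrouter.ts
import type { Core } from "@strapi/strapi"
// Assumed import path: the issue only states that lib/openrouter.ts requires the env var.
import { embedQuery } from "../lib/openrouter"

async function probeOpenRouter(strapi: Core.Strapi): Promise<void> {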
try {
await embedQuery("health-check probe")
strapi.log.info("[probe] OpenRouter embedding healthy")
} catch (err) {
strapi.log.error(`[probe] OpenRouter embedding FAILED: ${err}`)
}
}
```
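Step 3 can be sketched in the same spirit. The wrapper below is a minimal illustration of a louder fallback, not the code in `search.ts`: `searchWithVisibleFallback`, the injected `embed`/`keywordSearch`/`hybridSearch` callbacks, the `Logger` type, and the in-memory counter are all hypothetical stand-ins. Only the log message, the metric name, and the `degraded` flag come from the issue text.

```typescript
// Sketch only: names here are hypothetical unless they appear in the issue text.
type Logger = { error: (msg: string) => void }

interface SearchOutcome<T> {
  results: T[]
  degraded: boolean // true when semantic retrieval was skipped
}

// Stand-in for a real metrics client; counts failures in memory.
const counters = new Map<string, number>()
function incrementCounter(name: string): void {
  counters.set(name, (counters.get(name) ?? 0) + 1)
}

async function searchWithVisibleFallback<T>(
  query: string,
  embed: (q: string) => Promise<number[]>,
  keywordSearch: (q: string) => Promise<T[]>,
  hybridSearch: (q: string, vector: number[]) => Promise<T[]>,
  log: Logger,
): Promise<SearchOutcome<T>> {
  let vector: number[]
  try {
    vector = await embed(query)
  } catch (err) {
    // Still degrade gracefully, but log at error level and count the failure.
    log.error(`[search] Query embedding failed, falling back to keyword-only: ${String(err)}`)
    incrementCounter("search_query_embedding_failures_total")
    return { results: await keywordSearch(query), degraded: true }
  }
  // Errors from the hybrid search itself are not caught here, so they are
  // never misreported as embedding failures.
  return { results: await hybridSearch(query, vector), degraded: false }
}
```

Wrapping only the embedding call in the `try` block keeps the fallback narrow, and a caller could map `degraded: true` to the `X-Search-Mode: keyword-only` header if that option is chosen.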
Verification (Once Fixed)
```bash
# Thematic-only query should return non-empty (semantic kicks in)
curl 'https://cms.jesusfilm.org/api/search?q=feeling%20alone%20in%20suffering&locale=en' | jq '.results | length'
# Expect: > 0

# Top result should have scene-level data
curl 'https://cms.jesusfilm.org/api/search?q=Easter&locale=en' | jq '.results[0]'
# Expect: startSeconds and playbackId non-null
# Expect: snippet contains scene-level themes/bible-verses prose

# Top score should NOT be exactly 0.500
curl 'https://cms.jesusfilm.org/api/search?q=Easter&locale=en' | jq '.results[0].score'
# Expect: ~1.0 (rank-1 in both lists) or ~0.95+ (rank-1 in one, rank-2 in the other)
```
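The same checks could be folded into a small classifier for use in a script or monitor. This is a sketch under assumptions: the field names `score`, `startSeconds`, and `playbackId` are taken from the `jq` expressions above, while `SearchHit`, `SearchMode`, and `classifySearchResponse` are hypothetical names and the rest of the response shape is not known.

```typescript
// Sketch: flags the keyword-only signature described in the Evidence section.
// ASSUMPTION: each hit exposes score, startSeconds and playbackId as shown in the jq checks.
interface SearchHit {
  score: number
  startSeconds: number | null
  playbackId: string | null
}

type SearchMode = "empty" | "keyword-only-suspected" | "semantic-likely"

function classifySearchResponse(results: SearchHit[]): SearchMode {
  const top = results[0]
  if (top === undefined) return "empty"
  // A top score of exactly 0.500 with no scene-level data matches the degraded pattern.
  const scoreIsHalf = Math.abs(top.score - 0.5) < 1e-9
  const noSceneData = top.startSeconds === null && top.playbackId === null
  return scoreIsHalf && noSceneData ? "keyword-only-suspected" : "semantic-likely"
}
```

An `"empty"` result for a thematic-only query such as "feeling alone in suffering" would count as a failure of the first check, since that query is expected to return results once semantic retrieval works.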
Constraints
Related
`embedQuery` dependency means experiences will exhibit the same degraded behavior in production until this is fixed.