Add strict OpenRAG mode, runtime verifier, health endpoint, and recommender behavior#908
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughThis PR implements strict OpenRAG mode where retrieval failures trigger HTTP 503 errors instead of falling back to catalog-only recommendations. It adds a CLI verification tool, a health check API endpoint with authentication, optional startup preflight, and comprehensive tests and documentation. ChangesStrict OpenRAG Mode with Runtime Verification and Health Checks
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Not up to standards ⛔🔴 Issues
|
| Category | Results |
|---|---|
| Documentation | 1 minor |
🟢 Metrics 0 complexity
Metric Results Complexity 0
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 68e35c1f72
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| payload = recommend_models_for_query( | ||
| query=query, | ||
| top_k=top_k, | ||
| strict_openrag=True, |
There was a problem hiding this comment.
Honor RAG_STRICT_OPENRAG in recommendation endpoint
The endpoint now always passes strict_openrag=True, which bypasses the new environment-based resolver and makes RAG_STRICT_OPENRAG=0 ineffective for /api/rag/recommend/. In environments where OpenRAG is temporarily unavailable (e.g., local/dev fallback scenarios), requests will still return 503 instead of using the intended non-strict catalog fallback, despite configuration explicitly disabling strict mode.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Code Review
This pull request implements a strict mode for OpenRAG retrieval, ensuring the system returns an HTTP 503 error if the retrieval runtime is unavailable rather than silently falling back to the catalog. Key additions include an integration guide, a verification script, and a new authenticated health check endpoint. Review feedback suggests removing a hardcoded 'True' value for strict mode in the recommendation view to allow environment variable overrides as documented, and recommends using centralized health check logic in the verification script to avoid duplication.
| payload = recommend_models_for_query( | ||
| query=query, | ||
| top_k=top_k, | ||
| strict_openrag=True, | ||
| ) |
There was a problem hiding this comment.
The strict_openrag parameter is hardcoded to True in RagRecommendationView, which overrides the RAG_STRICT_OPENRAG environment variable. This prevents the use of the "dev fallback" mode (RAG_STRICT_OPENRAG=0) described in OPENRAG_INTEGRATION.md. Removing this explicit argument allows the recommender to resolve the value from the environment variable as intended.
payload = recommend_models_for_query(
query=query,
top_k=top_k,
)| self._check("omop_modality_concept_ids" in payload["query_profile"]) | ||
| mock_recommend.assert_called_once() | ||
| call_kwargs = mock_recommend.call_args.kwargs | ||
| self._check_equal(call_kwargs["strict_openrag"], True) |
There was a problem hiding this comment.
This test currently asserts that strict_openrag is explicitly set to True. If the view is updated to respect the environment variable (by omitting the argument), this assertion should be updated to verify that the parameter is not passed or is None.
| self._check_equal(call_kwargs["strict_openrag"], True) | |
| self._check(call_kwargs.get("strict_openrag") is None) |
| for key in sorted(status): | ||
| mark = "OK" if status[key] else "MISSING" | ||
| lines.append(f"- {key}: {mark}") | ||
| all_ready = all(status[name] for name in REQUIRED_KEYS) |
There was a problem hiding this comment.
The readiness check logic is duplicated here. Using the centralized is_rag_runtime_ready() function from RAG.healthcheck ensures consistency across the application and simplifies maintenance if the readiness criteria change in the future.
| all_ready = all(status[name] for name in REQUIRED_KEYS) | |
| all_ready = is_rag_runtime_ready() |
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
AIMER-ROOT/website/views.py (1)
233-235:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid returning raw exception text in the 503 payload.
str(exc)can expose internal runtime details to external clients. Return a stable error message instead, and keep details in server logs.Suggested fix
- except OpenRAGRuntimeUnavailableError as exc: - return JsonResponse({"error": str(exc)}, status=503) + except OpenRAGRuntimeUnavailableError: + return JsonResponse({"error": "OpenRAG runtime unavailable"}, status=503)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@AIMER-ROOT/website/views.py` around lines 233 - 235, Don't return raw exception text for OpenRAGRuntimeUnavailableError; instead return a stable, non-sensitive message in the JsonResponse (e.g., {"error":"runtime unavailable"} with status 503) and log the exception details on the server side using the module logger (use logger.exception or logger.error(..., exc_info=True)) before returning. Update the except block that catches OpenRAGRuntimeUnavailableError to log exc and return the stable message; keep the existing successful return of payload.model_dump() unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@AIMER-ROOT/RAG/verify_openrag.py`:
- Line 25: The current check all_ready = all(status[name] for name in
REQUIRED_KEYS) will raise KeyError if rag_runtime_health() returns a dict
missing some REQUIRED_KEYS; change it to defensively access the map (e.g., use
status.get(name, False) or check name in status and status[name]) so all_ready
becomes all(status.get(name, False) for name in REQUIRED_KEYS), ensuring missing
keys are treated as unhealthy; update any related logging in verify_openrag.py
to surface which REQUIRED_KEYS were missing using the same symbols (status and
REQUIRED_KEYS).
In `@AIMER-ROOT/website/tests.py`:
- Around line 595-598: The test
test_recommendation_api_returns_503_when_openrag_unavailable references
OpenRAGRuntimeUnavailableError but never imports it; add an import for
OpenRAGRuntimeUnavailableError at the top of the tests.py file (import it from
the module/package that defines that exception) so the test can set
mock_recommend.side_effect = OpenRAGRuntimeUnavailableError(...) without raising
NameError.
---
Duplicate comments:
In `@AIMER-ROOT/website/views.py`:
- Around line 233-235: Don't return raw exception text for
OpenRAGRuntimeUnavailableError; instead return a stable, non-sensitive message
in the JsonResponse (e.g., {"error":"runtime unavailable"} with status 503) and
log the exception details on the server side using the module logger (use
logger.exception or logger.error(..., exc_info=True)) before returning. Update
the except block that catches OpenRAGRuntimeUnavailableError to log exc and
return the stable message; keep the existing successful return of
payload.model_dump() unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a6d7db26-e60c-4933-95b3-4f331772447a
📒 Files selected for processing (10)
AIMER-ROOT/RAG/OPENRAG_INTEGRATION.mdAIMER-ROOT/RAG/recommender.pyAIMER-ROOT/RAG/tests/test_healthcheck.pyAIMER-ROOT/RAG/tests/test_recommender_strict.pyAIMER-ROOT/RAG/verify_openrag.pyAIMER-ROOT/entrypoint.shAIMER-ROOT/pyproject.tomlAIMER-ROOT/website/tests.pyAIMER-ROOT/website/urls.pyAIMER-ROOT/website/views.py
| for key in sorted(status): | ||
| mark = "OK" if status[key] else "MISSING" | ||
| lines.append(f"- {key}: {mark}") | ||
| all_ready = all(status[name] for name in REQUIRED_KEYS) |
There was a problem hiding this comment.
Add defensive check for missing keys in status.
Line 25 assumes all names in REQUIRED_KEYS exist in the status dict returned by rag_runtime_health(). If the healthcheck module changes and omits a key, this will raise KeyError.
Consider adding a defensive check or using .get() with a default:
- all_ready = all(status[name] for name in REQUIRED_KEYS)
+ all_ready = all(status.get(name, False) for name in REQUIRED_KEYS)This makes the verifier more resilient to partial healthcheck responses.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| all_ready = all(status[name] for name in REQUIRED_KEYS) | |
| all_ready = all(status.get(name, False) for name in REQUIRED_KEYS) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@AIMER-ROOT/RAG/verify_openrag.py` at line 25, The current check all_ready =
all(status[name] for name in REQUIRED_KEYS) will raise KeyError if
rag_runtime_health() returns a dict missing some REQUIRED_KEYS; change it to
defensively access the map (e.g., use status.get(name, False) or check name in
status and status[name]) so all_ready becomes all(status.get(name, False) for
name in REQUIRED_KEYS), ensuring missing keys are treated as unhealthy; update
any related logging in verify_openrag.py to surface which REQUIRED_KEYS were
missing using the same symbols (status and REQUIRED_KEYS).
| def test_recommendation_api_returns_503_when_openrag_unavailable(self, mock_recommend) -> None: | ||
| """Ensure runtime retrieval errors are surfaced as HTTP 503.""" | ||
| mock_recommend.side_effect = OpenRAGRuntimeUnavailableError("OpenRAG retrieval is required") | ||
|
|
There was a problem hiding this comment.
Import OpenRAGRuntimeUnavailableError in this test module.
This test references OpenRAGRuntimeUnavailableError without importing it, which will raise NameError before the endpoint behavior is exercised.
Suggested fix
-from RAG.recommender import recommend_models_for_query
+from RAG.recommender import OpenRAGRuntimeUnavailableError, recommend_models_for_query🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@AIMER-ROOT/website/tests.py` around lines 595 - 598, The test
test_recommendation_api_returns_503_when_openrag_unavailable references
OpenRAGRuntimeUnavailableError but never imports it; add an import for
OpenRAGRuntimeUnavailableError at the top of the tests.py file (import it from
the module/package that defines that exception) so the test can set
mock_recommend.side_effect = OpenRAGRuntimeUnavailableError(...) without raising
NameError.
Motivation
Description
OpenRAGRuntimeUnavailableError,_safe_retrieve_strict,_resolve_strict_openrag, and astrict_openragargument torecommend_models_for_querythat defaults toRAG_STRICT_OPENRAGenvironment flag.RAG.verify_openragand wire it intopyproject.tomlasverify-openrag, plus documentationRAG/OPENRAG_INTEGRATION.mddescribing env vars and behavior.OpenRAGRuntimeUnavailableErrorby returning HTTP503, and add an authenticatedapi/rag/health/endpoint exposingrag_runtime_health()andis_rag_runtime_ready().entrypoint.shcontrolled byRAG_VERIFY_ON_STARTand make the entrypoint executable.recommend_models_for_query(..., strict_openrag=True)and that 503 behavior is surfaced.Testing
RAG/tests/test_recommender_strict.py, extendedRAG/tests/test_healthcheck.py, and updatedwebsite/tests.pyto validate strict-mode raising, verifier reporting, and API surface changes.pytestagainst the modified codebase and all automated tests passed.Codex Task
Summary by CodeRabbit
Release Notes
New Features
/api/rag/health/endpoint for monitoring OpenRAG runtime status (staff-only access)verify-openragCLI command to verify OpenRAG integration readinessRAG_VERIFY_ON_STARTDocumentation
Tests