Enforce strict OpenRAG retrieval with verification CLI and API 503 handling#907
Conversation
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
WalkthroughThis pull request introduces OpenRAG strict retrieval mode with runtime verification and error handling. The ChangesOpenRAG Strict Mode with Verification and Error Handling
Sequence DiagramssequenceDiagram
participant Client
participant APIEndpoint as RagRecommendationView
participant Recommender as recommend_models_for_query
participant ResolveStrict as _resolve_strict_openrag
participant SafeRetrieve as _safe_retrieve_strict
Client->>APIEndpoint: GET /api/rag/recommend/?q=test
APIEndpoint->>Recommender: (query, strict_openrag=True)
Recommender->>ResolveStrict: (True, env RAG_STRICT_OPENRAG)
ResolveStrict-->>Recommender: strict=True
Recommender->>SafeRetrieve: retrieve documents
alt OpenRAG unavailable
SafeRetrieve-->>Recommender: OpenRAGRuntimeUnavailableError
Recommender-->>APIEndpoint: exception
APIEndpoint-->>Client: HTTP 503 {error: message}
else OpenRAG available
SafeRetrieve-->>Recommender: documents
Recommender-->>APIEndpoint: RecommendationResponse
APIEndpoint-->>Client: HTTP 200 {recommendation}
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request implements a strict OpenRAG retrieval mode to prevent silent fallbacks when the runtime is unavailable, adding a verification script, environment variable support, and 503 error handling. Feedback indicates that hardcoding the strict flag in the API view prevents the environment variable from acting as a development fallback as documented, and suggests delegating this resolution to the recommender logic.
| payload = recommend_models_for_query( | ||
| query=query, | ||
| top_k=top_k, | ||
| strict_openrag=True, |
There was a problem hiding this comment.
Hardcoding strict_openrag=True here overrides the RAG_STRICT_OPENRAG environment variable, which contradicts the documentation in OPENRAG_INTEGRATION.md describing it as an optional dev fallback. By passing None (or omitting the argument), the API will remain strict by default (as defined in recommender.py) while still respecting the environment variable for local development scenarios where a fallback is desired.
| strict_openrag=True, | |
| strict_openrag=None, |
| self._check("omop_modality_concept_ids" in payload["query_profile"]) | ||
| mock_recommend.assert_called_once() | ||
| call_kwargs = mock_recommend.call_args.kwargs | ||
| self._check_equal(call_kwargs["strict_openrag"], True) |
There was a problem hiding this comment.
This assertion should be updated to expect None if the view is modified to delegate strictness resolution to the recommender, ensuring the environment variable configuration is correctly respected.
| self._check_equal(call_kwargs["strict_openrag"], True) | |
| self._check_equal(call_kwargs["strict_openrag"], None) |
Not up to standards ⛔🔴 Issues
|
| Category | Results |
|---|---|
| Documentation | 7 minor |
| ErrorProne | 1 high |
| Security | 8 high |
🟢 Metrics 13 complexity
Metric Results Complexity 13
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cbf5d97565
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| payload = recommend_models_for_query( | ||
| query=query, | ||
| top_k=top_k, | ||
| strict_openrag=True, |
There was a problem hiding this comment.
Respect RAG_STRICT_OPENRAG in API recommendations
The API now hard-codes strict_openrag=True when calling recommend_models_for_query, which bypasses _resolve_strict_openrag() and makes the documented RAG_STRICT_OPENRAG=0 fallback ineffective for /api/rag/recommend/. In environments where OpenRAG is intentionally unavailable (e.g., local/dev), requests will still fail with 503 even when the operator has explicitly disabled strict mode via env var, so this change removes a configuration escape hatch that the same commit advertises.
Useful? React with 👍 / 👎.
| from RAG.verify_openrag import format_report | ||
|
|
||
| report = format_report() | ||
| assert "OpenRAG runtime verification:" in report |
|
|
||
| report = format_report() | ||
| assert "OpenRAG runtime verification:" in report | ||
| assert "runtime_ready:" in report |
|
|
||
| monkeypatch.setenv("OPENRAG_ENDPOINT", "http://localhost:8000") | ||
| status = main() | ||
| assert status in (0, 1) |
| from RAG.verify_openrag import main | ||
|
|
||
| monkeypatch.delenv("OPENRAG_ENDPOINT", raising=False) | ||
| assert main() == 1 |
| strict_openrag=True, | ||
| ) | ||
| except OpenRAGRuntimeUnavailableError as exc: | ||
| assert "OpenRAG retrieval is required" in str(exc) |
| from RAG.recommender import _resolve_strict_openrag | ||
|
|
||
| monkeypatch.delenv("RAG_STRICT_OPENRAG", raising=False) | ||
| assert _resolve_strict_openrag(None) is True |
| from RAG.recommender import _resolve_strict_openrag | ||
|
|
||
| monkeypatch.setenv("RAG_STRICT_OPENRAG", "false") | ||
| assert _resolve_strict_openrag(None) is False |
| from RAG.recommender import _resolve_strict_openrag | ||
|
|
||
| monkeypatch.setenv("RAG_STRICT_OPENRAG", "0") | ||
| assert _resolve_strict_openrag(True) is True |
Motivation
503so clients can distinguish runtime outages from normal responses.Description
OpenRAGRuntimeUnavailableError,_safe_retrieve_strict, and_resolve_strict_openragtoRAG/recommender.py, and add astrict_openragparameter torecommend_models_for_queryso retrieval can be enforced or fallback to catalog-only behavior.RAG/verify_openrag.pyas a CLI readiness checker withformat_report()andmain()and register it asverify-openraginpyproject.toml.website/views.pyRagRecommendationViewto callrecommend_models_for_query(..., strict_openrag=True)and return503with an error payload whenOpenRAGRuntimeUnavailableErroris raised.AIMER-ROOT/RAG/OPENRAG_INTEGRATION.mddocumentation, changeentrypoint.shto optionally runverify-openragon startup whenRAG_VERIFY_ON_START=1, and mark the script executable.RAG/tests/test_recommender_strict.pyfor strict mode behavior and env resolution, extendRAG/tests/test_healthcheck.pywithverify_openragreport tests, and updatewebsite/tests.pyto assert the strict flag is passed and that 503 is returned when runtime is unavailable.Testing
pytestcoveringRAGandwebsitetest modules includingtest_recommender_strict.py,test_healthcheck.py, and the updatedRagRecommendationApiTests, and they completed successfully.verify-openragscript locally viapython -m RAG.verify_openragto confirm it returns0whenOPENRAG_ENDPOINTis set and1when it is missing.websitetests that the strict flag is forwarded and that a retrieval failure maps to a503response.Codex Task
Summary by CodeRabbit
New Features
/api/rag/recommend/endpoint now enforces strict OpenRAG mode by defaultDocumentation