Add strict OpenRAG mode, runtime verifier, health endpoint, and recommender behavior #908

Merged
Smartappli merged 4 commits into master from verifier-l-utilisation-complete-de-openrag
May 11, 2026
Conversation

@Smartappli
Owner

@Smartappli Smartappli commented May 10, 2026

Motivation

  • Ensure the recommendation API can require a live OpenRAG retrieval runtime (strict mode) and fail loudly when unavailable.
  • Provide a small runtime verifier and an optional startup preflight to catch misconfiguration early.
  • Surface runtime readiness via an authenticated health endpoint and update the recommender and view logic to propagate runtime errors as HTTP 503.

Description

  • Add strict OpenRAG handling to the recommender by introducing OpenRAGRuntimeUnavailableError, _safe_retrieve_strict, _resolve_strict_openrag, and a strict_openrag argument to recommend_models_for_query that defaults to the value of the RAG_STRICT_OPENRAG environment flag.
  • Add a small CLI/runtime verifier RAG.verify_openrag and wire it into pyproject.toml as verify-openrag, plus documentation RAG/OPENRAG_INTEGRATION.md describing env vars and behavior.
  • Update the web layer to call the recommender in strict mode, handle OpenRAGRuntimeUnavailableError by returning HTTP 503, and add an authenticated api/rag/health/ endpoint exposing rag_runtime_health() and is_rag_runtime_ready().
  • Add optional container preflight in entrypoint.sh controlled by RAG_VERIFY_ON_START and make the entrypoint executable.
  • Add and modify unit tests to cover strict-mode behavior and the verifier integration, and update website tests to assert the view calls recommend_models_for_query(..., strict_openrag=True) and that 503 behavior is surfaced.

Testing

  • Added tests RAG/tests/test_recommender_strict.py, extended RAG/tests/test_healthcheck.py, and updated website/tests.py to validate strict-mode raising, verifier reporting, and API surface changes.
  • Ran the test suite with pytest against the modified codebase and all automated tests passed.

Codex Task

Summary by CodeRabbit

Release Notes

  • New Features

    • Added /api/rag/health/ endpoint for monitoring OpenRAG runtime status (staff-only access)
    • New verify-openrag CLI command to verify OpenRAG integration readiness
    • New strict mode for recommendations that returns HTTP 503 when OpenRAG retrieval is unavailable
    • Optional startup readiness verification via RAG_VERIFY_ON_START
  • Documentation

    • Added OpenRAG integration guide with setup, verification, and endpoint behavior documentation
  • Tests

    • Added comprehensive test coverage for health checks, error handling, and runtime verification

@coderabbitai
Contributor

coderabbitai Bot commented May 10, 2026

Warning

Rate limit exceeded

@Smartappli has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 53 minutes and 29 seconds before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8c797bb0-ab11-4aaa-9e66-a20ff5549c8a

📥 Commits

Reviewing files that changed from the base of the PR and between 9d7a1e6 and 670b850.

📒 Files selected for processing (1)
  • AIMER-ROOT/website/tests.py

Walkthrough

This PR implements strict OpenRAG mode where retrieval failures trigger HTTP 503 errors instead of falling back to catalog-only recommendations. It adds a CLI verification tool, a health check API endpoint with authentication, optional startup preflight, and comprehensive tests and documentation.
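The access rules for the health endpoint (401 for anonymous callers, 403 for non-staff, otherwise a payload with `ready` and `status`) can be illustrated without any web framework. The function name, the error messages, and computing readiness as "every status flag is true" are assumptions made for this sketch; the actual view is RagRuntimeHealthView and uses rag_runtime_health() and is_rag_runtime_ready().

```python
from __future__ import annotations


def rag_health_response(
    is_authenticated: bool,
    is_staff: bool,
    status: dict[str, bool],
) -> tuple[int, dict]:
    """Return (HTTP status code, JSON-serializable body) for a health request."""
    if not is_authenticated:
        # Anonymous callers are rejected before any runtime detail is exposed.
        return 401, {"error": "authentication required"}
    if not is_staff:
        # Authenticated but non-staff users may not see runtime internals.
        return 403, {"error": "staff access required"}
    # Assumption: the runtime is ready only when every reported flag is true.
    return 200, {"ready": all(status.values()), "status": status}
```

Checking authentication before the staff flag keeps the two failure modes distinguishable for clients.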

Changes

Strict OpenRAG Mode with Runtime Verification and Health Checks

| Layer / File(s) | Summary |
|---|---|
| Core Exception and Configuration Resolution (AIMER-ROOT/RAG/recommender.py) | New OpenRAGRuntimeUnavailableError exception and _resolve_strict_openrag() helper determine strictness from explicit argument or RAG_STRICT_OPENRAG environment variable (default True). |
| Retrieval Logic with Strict Mode (AIMER-ROOT/RAG/recommender.py) | _safe_retrieve_strict() wrapper raises OpenRAGRuntimeUnavailableError when retrieval would fall back to catalog-only. recommend_models_for_query() signature extended with strict_openrag: bool \| None parameter; retrieval flow selects strict or non-strict path. |
| Runtime Verification Module (AIMER-ROOT/RAG/verify_openrag.py) | New module with REQUIRED_KEYS tuple, format_report() for human-readable health output, and main() returning exit code 0 (ready) or 1 (not ready). |
| Container Startup Integration (AIMER-ROOT/entrypoint.sh, AIMER-ROOT/pyproject.toml) | Optional RAG_VERIFY_ON_START=1 check runs verify-openrag before migrations. New verify-openrag console script entry point registered. |
| Health Status API Endpoint (AIMER-ROOT/website/views.py, AIMER-ROOT/website/urls.py) | New RagRuntimeHealthView enforces authentication (401) and staff-only access (403), returns ready and status fields. URL pattern api/rag/health/ registered. |
| Error Handling in Recommendation Endpoint (AIMER-ROOT/website/views.py) | RagRecommendationView.get() calls recommend_models_for_query(..., strict_openrag=True), catches OpenRAGRuntimeUnavailableError, returns HTTP 503 with error payload. |
| Tests (AIMER-ROOT/RAG/tests/test_recommender_strict.py, AIMER-ROOT/RAG/tests/test_healthcheck.py, AIMER-ROOT/website/tests.py) | Tests for strict mode fallback behavior, environment variable resolution, verify_openrag output/exit codes, /api/rag/health/ authentication, and /api/rag/recommend/ 503 error handling. |
| Integration Guide (AIMER-ROOT/RAG/OPENRAG_INTEGRATION.md) | Documents prerequisites, environment variables, strictness policy, verification commands with exit codes, curl smoke test, startup preflight, and /api/rag/health/ endpoint specification. |
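The verifier row above describes a report formatter and an exit-code contract (0 when ready, 1 otherwise). A rough sketch of that shape follows. The key names in REQUIRED_KEYS, the report heading, and passing the status dict into `main()` are assumptions for illustration; the merged module presumably obtains the dict from rag_runtime_health() itself.

```python
from __future__ import annotations

# Hypothetical key names; the real REQUIRED_KEYS tuple is not shown on this page.
REQUIRED_KEYS = ("openrag_installed", "index_available")


def format_report(status: dict[str, bool]) -> str:
    """Render one line per health flag, sorted by key for stable output."""
    lines = ["OpenRAG runtime status:"]  # assumed heading
    for key in sorted(status):
        mark = "OK" if status[key] else "MISSING"
        lines.append(f"- {key}: {mark}")
    return "\n".join(lines)


def main(status: dict[str, bool]) -> int:
    """Print the report and return a process exit code: 0 ready, 1 not ready."""
    print(format_report(status))
    all_ready = all(status.get(name, False) for name in REQUIRED_KEYS)
    return 0 if all_ready else 1
```

A container entrypoint can then gate startup on the return value, which is what the optional RAG_VERIFY_ON_START preflight relies on.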

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Smartappli/AIMER#907: Implements the same strict OpenRAG retrieval behavior, exception handling, verify_openrag module, and API changes; appears to be a parallel or earlier attempt at the same feature.
  • Smartappli/AIMER#813: Adds the foundational recommend_models_for_query() recommender that this PR extends with strict mode and error handling.

Poem

🐰 A rabbit hops through the OpenRAG gate,
With strict mode enforcing retrieval's fate—
No fallback whispers, just truth so clear,
Health checks whisper "I'm ready, my dear!"
And startup verifies before we go,
Five-oh-three when the runtime won't show.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 58.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main changes: adding strict OpenRAG mode, a runtime verifier, a health endpoint, and related recommender behavior modifications. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Comment thread AIMER-ROOT/website/views.py Fixed
@codacy-production

codacy-production Bot commented May 10, 2026

Not up to standards ⛔

🔴 Issues 1 minor

Alerts:
⚠ 1 issue (≤ 0 issues of at least minor severity)

Results:
1 new issue

Category Results
Documentation 1 minor


🟢 Metrics 0 complexity

Metric Results
Complexity 0



@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 68e35c1f72


payload = recommend_models_for_query(
    query=query,
    top_k=top_k,
    strict_openrag=True,

P2: Honor RAG_STRICT_OPENRAG in recommendation endpoint

The endpoint now always passes strict_openrag=True, which bypasses the new environment-based resolver and makes RAG_STRICT_OPENRAG=0 ineffective for /api/rag/recommend/. In environments where OpenRAG is temporarily unavailable (e.g., local/dev fallback scenarios), requests will still return 503 instead of using the intended non-strict catalog fallback, despite configuration explicitly disabling strict mode.


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request implements a strict mode for OpenRAG retrieval, ensuring the system returns an HTTP 503 error if the retrieval runtime is unavailable rather than silently falling back to the catalog. Key additions include an integration guide, a verification script, and a new authenticated health check endpoint. Review feedback suggests removing a hardcoded 'True' value for strict mode in the recommendation view to allow environment variable overrides as documented, and recommends using centralized health check logic in the verification script to avoid duplication.

Comment on lines +228 to +232
payload = recommend_models_for_query(
    query=query,
    top_k=top_k,
    strict_openrag=True,
)

medium

The strict_openrag parameter is hardcoded to True in RagRecommendationView, which overrides the RAG_STRICT_OPENRAG environment variable. This prevents the use of the "dev fallback" mode (RAG_STRICT_OPENRAG=0) described in OPENRAG_INTEGRATION.md. Removing this explicit argument allows the recommender to resolve the value from the environment variable as intended.

            payload = recommend_models_for_query(
                query=query,
                top_k=top_k,
            )

self._check("omop_modality_concept_ids" in payload["query_profile"])
mock_recommend.assert_called_once()
call_kwargs = mock_recommend.call_args.kwargs
self._check_equal(call_kwargs["strict_openrag"], True)

medium

This test currently asserts that strict_openrag is explicitly set to True. If the view is updated to respect the environment variable (by omitting the argument), this assertion should be updated to verify that the parameter is not passed or is None.

Suggested change
- self._check_equal(call_kwargs["strict_openrag"], True)
+ self._check(call_kwargs.get("strict_openrag") is None)
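The suggested assertion can be tried in isolation with `unittest.mock`. In this sketch `view_get` is a hypothetical stand-in for the view body and the query string is made up; only the `call_args.kwargs` inspection mirrors the test discussed above.

```python
from unittest.mock import Mock

# Stand-in for the patched recommend_models_for_query.
mock_recommend = Mock(return_value={"models": []})


def view_get(query: str, top_k: int) -> dict:
    # Hypothetical view body: strict_openrag is deliberately not passed,
    # so the recommender would resolve it from RAG_STRICT_OPENRAG.
    return mock_recommend(query=query, top_k=top_k)


view_get("chest x-ray segmentation", 3)

mock_recommend.assert_called_once()
call_kwargs = mock_recommend.call_args.kwargs
# .get() returns None both when the key is absent and when None was passed.
assert call_kwargs.get("strict_openrag") is None
# A stricter variant that only accepts a fully omitted argument.
assert "strict_openrag" not in call_kwargs
```

Using `.get(...) is None` accepts either an omitted argument or an explicit `None`; the membership check is the tighter option if the view must not mention the parameter at all.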

for key in sorted(status):
    mark = "OK" if status[key] else "MISSING"
    lines.append(f"- {key}: {mark}")
all_ready = all(status[name] for name in REQUIRED_KEYS)

medium

The readiness check logic is duplicated here. Using the centralized is_rag_runtime_ready() function from RAG.healthcheck ensures consistency across the application and simplifies maintenance if the readiness criteria change in the future.

Suggested change
- all_ready = all(status[name] for name in REQUIRED_KEYS)
+ all_ready = is_rag_runtime_ready()

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
AIMER-ROOT/website/views.py (1)

233-235: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid returning raw exception text in the 503 payload.

str(exc) can expose internal runtime details to external clients. Return a stable error message instead, and keep details in server logs.

Suggested fix
-        except OpenRAGRuntimeUnavailableError as exc:
-            return JsonResponse({"error": str(exc)}, status=503)
+        except OpenRAGRuntimeUnavailableError:
+            return JsonResponse({"error": "OpenRAG runtime unavailable"}, status=503)
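The pattern behind this suggestion, logging the detail server-side and returning a fixed message to the client, can be sketched without Django. Here `recommend_or_503` and the `(status, body)` return shape are hypothetical; the real view returns a JsonResponse, and the exception class below is only a stand-in for the one in RAG/recommender.py.

```python
from __future__ import annotations

import logging

logger = logging.getLogger(__name__)


class OpenRAGRuntimeUnavailableError(RuntimeError):
    """Stand-in for the exception defined in RAG/recommender.py."""


def recommend_or_503(recommend, query: str) -> tuple[int, dict]:
    """Call the recommender and map runtime unavailability to a stable 503 body."""
    try:
        return 200, recommend(query)
    except OpenRAGRuntimeUnavailableError:
        # logger.exception records the message and traceback in server logs only.
        logger.exception("OpenRAG runtime unavailable for recommendation request")
        # The client sees a fixed string, never str(exc).
        return 503, {"error": "OpenRAG runtime unavailable"}
```

Because the response body is constant, clients can match on it reliably and internal hostnames or paths carried in the exception text stay out of the API surface.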
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@AIMER-ROOT/website/views.py` around lines 233 - 235, Don't return raw
exception text for OpenRAGRuntimeUnavailableError; instead return a stable,
non-sensitive message in the JsonResponse (e.g., {"error":"runtime unavailable"}
with status 503) and log the exception details on the server side using the
module logger (use logger.exception or logger.error(..., exc_info=True)) before
returning. Update the except block that catches OpenRAGRuntimeUnavailableError
to log exc and return the stable message; keep the existing successful return of
payload.model_dump() unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@AIMER-ROOT/RAG/verify_openrag.py`:
- Line 25: The current check all_ready = all(status[name] for name in
REQUIRED_KEYS) will raise KeyError if rag_runtime_health() returns a dict
missing some REQUIRED_KEYS; change it to defensively access the map (e.g., use
status.get(name, False) or check name in status and status[name]) so all_ready
becomes all(status.get(name, False) for name in REQUIRED_KEYS), ensuring missing
keys are treated as unhealthy; update any related logging in verify_openrag.py
to surface which REQUIRED_KEYS were missing using the same symbols (status and
REQUIRED_KEYS).

In `@AIMER-ROOT/website/tests.py`:
- Around line 595-598: The test
test_recommendation_api_returns_503_when_openrag_unavailable references
OpenRAGRuntimeUnavailableError but never imports it; add an import for
OpenRAGRuntimeUnavailableError at the top of the tests.py file (import it from
the module/package that defines that exception) so the test can set
mock_recommend.side_effect = OpenRAGRuntimeUnavailableError(...) without raising
NameError.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a6d7db26-e60c-4933-95b3-4f331772447a

📥 Commits

Reviewing files that changed from the base of the PR and between eb2b50e and 9d7a1e6.

📒 Files selected for processing (10)
  • AIMER-ROOT/RAG/OPENRAG_INTEGRATION.md
  • AIMER-ROOT/RAG/recommender.py
  • AIMER-ROOT/RAG/tests/test_healthcheck.py
  • AIMER-ROOT/RAG/tests/test_recommender_strict.py
  • AIMER-ROOT/RAG/verify_openrag.py
  • AIMER-ROOT/entrypoint.sh
  • AIMER-ROOT/pyproject.toml
  • AIMER-ROOT/website/tests.py
  • AIMER-ROOT/website/urls.py
  • AIMER-ROOT/website/views.py

for key in sorted(status):
    mark = "OK" if status[key] else "MISSING"
    lines.append(f"- {key}: {mark}")
all_ready = all(status[name] for name in REQUIRED_KEYS)

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add defensive check for missing keys in status.

Line 25 assumes all names in REQUIRED_KEYS exist in the status dict returned by rag_runtime_health(). If the healthcheck module changes and omits a key, this will raise KeyError.

Consider adding a defensive check or using .get() with a default:

-    all_ready = all(status[name] for name in REQUIRED_KEYS)
+    all_ready = all(status.get(name, False) for name in REQUIRED_KEYS)

This makes the verifier more resilient to partial healthcheck responses.
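A short illustration of the difference, assuming made-up key names for REQUIRED_KEYS (the real tuple is not shown here). The helper `missing_required_keys` is hypothetical and only demonstrates how absent keys could be surfaced in the report.

```python
from __future__ import annotations

# Hypothetical key names, used only for illustration.
REQUIRED_KEYS = ("openrag_installed", "index_available")


def missing_required_keys(status: dict[str, bool]) -> list[str]:
    """Return required keys that the health dict does not contain at all."""
    return [name for name in REQUIRED_KEYS if name not in status]


def all_ready(status: dict[str, bool]) -> bool:
    """Treat absent keys as unhealthy instead of raising KeyError."""
    return all(status.get(name, False) for name in REQUIRED_KEYS)


partial = {"openrag_installed": True}  # "index_available" is absent
print(all_ready(partial))              # False, no KeyError
print(missing_required_keys(partial))  # ['index_available']
```

With the original `status[name]` lookup, the same `partial` dict would raise KeyError on the second key rather than report "not ready".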


Comment on lines +595 to +598
def test_recommendation_api_returns_503_when_openrag_unavailable(self, mock_recommend) -> None:
    """Ensure runtime retrieval errors are surfaced as HTTP 503."""
    mock_recommend.side_effect = OpenRAGRuntimeUnavailableError("OpenRAG retrieval is required")


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Import OpenRAGRuntimeUnavailableError in this test module.

This test references OpenRAGRuntimeUnavailableError without importing it, which will raise NameError before the endpoint behavior is exercised.

Suggested fix
-from RAG.recommender import recommend_models_for_query
+from RAG.recommender import OpenRAGRuntimeUnavailableError, recommend_models_for_query

@Smartappli Smartappli merged commit a23b7e6 into master May 11, 2026
11 of 22 checks passed
@Smartappli Smartappli deleted the verifier-l-utilisation-complete-de-openrag branch May 11, 2026 07:54