Skip to content

Enforce strict OpenRAG retrieval with verification CLI and API 503 handling#907

Merged
Smartappli merged 1 commit into
masterfrom
verifier-l-utilisation-complete-de-openrag
May 10, 2026
Merged

Enforce strict OpenRAG retrieval with verification CLI and API 503 handling#907
Smartappli merged 1 commit into
masterfrom
verifier-l-utilisation-complete-de-openrag

Conversation

@Smartappli
Copy link
Copy Markdown
Owner

@Smartappli Smartappli commented May 10, 2026

Motivation

  • Ensure the recommendation engine can optionally require a live OpenRAG runtime and fail fast when retrieval is unavailable.
  • Provide a lightweight runtime verification tool and optional container preflight so deployments can assert OpenRAG readiness before serving traffic.
  • Surface runtime retrieval failures through the HTTP API as 503 so clients can distinguish runtime outages from normal responses.

Description

  • Add OpenRAGRuntimeUnavailableError, _safe_retrieve_strict, and _resolve_strict_openrag to RAG/recommender.py, and add a strict_openrag parameter to recommend_models_for_query so retrieval can be enforced or fallback to catalog-only behavior.
  • Add RAG/verify_openrag.py as a CLI readiness checker with format_report() and main() and register it as verify-openrag in pyproject.toml.
  • Update website/views.py RagRecommendationView to call recommend_models_for_query(..., strict_openrag=True) and return 503 with an error payload when OpenRAGRuntimeUnavailableError is raised.
  • Add AIMER-ROOT/RAG/OPENRAG_INTEGRATION.md documentation, change entrypoint.sh to optionally run verify-openrag on startup when RAG_VERIFY_ON_START=1, and mark the script executable.
  • Add unit tests: RAG/tests/test_recommender_strict.py for strict mode behavior and env resolution, extend RAG/tests/test_healthcheck.py with verify_openrag report tests, and update website/tests.py to assert the strict flag is passed and that 503 is returned when runtime is unavailable.

Testing

  • Ran unit tests with pytest covering RAG and website test modules including test_recommender_strict.py, test_healthcheck.py, and the updated RagRecommendationApiTests, and they completed successfully.
  • Executed the verify-openrag script locally via python -m RAG.verify_openrag to confirm it returns 0 when OPENRAG_ENDPOINT is set and 1 when it is missing.
  • Confirmed the API behavior via the updated website tests that the strict flag is forwarded and that a retrieval failure maps to a 503 response.

Codex Task

Summary by CodeRabbit

  • New Features

    • /api/rag/recommend/ endpoint now enforces strict OpenRAG mode by default
    • Returns HTTP 503 when OpenRAG runtime is unavailable
    • New command-line verification tool for checking OpenRAG readiness
    • Optional automatic readiness verification at container startup
  • Documentation

    • Added OpenRAG integration guide with configuration, verification steps, and testing examples

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 10, 2026

Review Change Stack

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 702664f4-aa62-4ff5-b660-ee10b8b8266d

📥 Commits

Reviewing files that changed from the base of the PR and between a1d17ef and cbf5d97.

📒 Files selected for processing (9)
  • AIMER-ROOT/RAG/OPENRAG_INTEGRATION.md
  • AIMER-ROOT/RAG/recommender.py
  • AIMER-ROOT/RAG/tests/test_healthcheck.py
  • AIMER-ROOT/RAG/tests/test_recommender_strict.py
  • AIMER-ROOT/RAG/verify_openrag.py
  • AIMER-ROOT/entrypoint.sh
  • AIMER-ROOT/pyproject.toml
  • AIMER-ROOT/website/tests.py
  • AIMER-ROOT/website/views.py

Walkthrough

This pull request introduces OpenRAG strict retrieval mode with runtime verification and error handling. The /api/rag/recommend/ endpoint now enforces strict mode by default, raising OpenRAGRuntimeUnavailableError and returning HTTP 503 when OpenRAG retrieval is unavailable. A new verification script provides health checks for deployment readiness.

Changes

OpenRAG Strict Mode with Verification and Error Handling

Layer / File(s) Summary
OpenRAG Integration Guide
AIMER-ROOT/RAG/OPENRAG_INTEGRATION.md
Documents environment variables (OPENRAG_ENDPOINT, OPENRAG_API_KEY, RAG_COLLECTION_NAME, RAG_STRICT_OPENRAG), strict mode behavior with HTTP 503 response, verification commands with exit code interpretation, curl-based smoke test, and optional startup preflight via RAG_VERIFY_ON_START.
Error Type & Runtime Verification
AIMER-ROOT/RAG/recommender.py, AIMER-ROOT/RAG/verify_openrag.py, AIMER-ROOT/RAG/tests/test_healthcheck.py
New OpenRAGRuntimeUnavailableError exception. New verify_openrag.py module with format_report() and main() functions that call rag_runtime_health(), report required keys as OK or MISSING, and return exit code 0 or 1. Tests validate report format includes runtime markers and main() returns correct exit codes when endpoint is present or missing.
Strict Mode Recommender Logic
AIMER-ROOT/RAG/recommender.py, AIMER-ROOT/RAG/tests/test_recommender_strict.py
recommend_models_for_query() accepts keyword-only parameter strict_openrag: bool | None. New _resolve_strict_openrag() resolves strictness from parameter or RAG_STRICT_OPENRAG environment variable (default True). New _safe_retrieve_strict() converts catalog-only fallback into OpenRAGRuntimeUnavailableError when strict mode enabled. Tests verify strict mode raises error on fallback, environment variable defaults to True, respects "false" value, and explicit argument overrides environment.
API Endpoint Error Handling
AIMER-ROOT/website/views.py, AIMER-ROOT/website/tests.py
RagRecommendationView.get() calls recommend_models_for_query() with strict_openrag=True and wraps in try/except to catch OpenRAGRuntimeUnavailableError, returning HTTP 503 JSON response with error field. Tests validate strict_openrag=True is passed and that endpoint returns 503 with error field when exception is raised.
Packaging & Startup Integration
AIMER-ROOT/pyproject.toml, AIMER-ROOT/entrypoint.sh
pyproject.toml adds verify-openrag console script entry point mapped to RAG.verify_openrag:main. entrypoint.sh adds conditional startup verification: when RAG_VERIFY_ON_START=1, runs uv run verify-openrag before migrations.

Sequence Diagrams

sequenceDiagram
  participant Client
  participant APIEndpoint as RagRecommendationView
  participant Recommender as recommend_models_for_query
  participant ResolveStrict as _resolve_strict_openrag
  participant SafeRetrieve as _safe_retrieve_strict
  
  Client->>APIEndpoint: GET /api/rag/recommend/?q=test
  APIEndpoint->>Recommender: (query, strict_openrag=True)
  Recommender->>ResolveStrict: (True, env RAG_STRICT_OPENRAG)
  ResolveStrict-->>Recommender: strict=True
  Recommender->>SafeRetrieve: retrieve documents
  alt OpenRAG unavailable
    SafeRetrieve-->>Recommender: OpenRAGRuntimeUnavailableError
    Recommender-->>APIEndpoint: exception
    APIEndpoint-->>Client: HTTP 503 {error: message}
  else OpenRAG available
    SafeRetrieve-->>Recommender: documents
    Recommender-->>APIEndpoint: RecommendationResponse
    APIEndpoint-->>Client: HTTP 200 {recommendation}
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Smartappli/AIMER#813: Introduces the initial recommender recommendation flow that this PR extends with strict OpenRAG mode enforcement.

Poem

🐰 Strict retrieval hops with care,
No fallbacks in the midnight air,
Verify the paths before we run,
503 when work's undone—
OpenRAG now stands its ground! ✨

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch verifier-l-utilisation-complete-de-openrag

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Smartappli Smartappli merged commit eb2b50e into master May 10, 2026
8 of 21 checks passed
@Smartappli Smartappli deleted the verifier-l-utilisation-complete-de-openrag branch May 10, 2026 22:33
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a strict OpenRAG retrieval mode to prevent silent fallbacks when the runtime is unavailable, adding a verification script, environment variable support, and 503 error handling. Feedback indicates that hardcoding the strict flag in the API view prevents the environment variable from acting as a development fallback as documented, and suggests delegating this resolution to the recommender logic.

payload = recommend_models_for_query(
query=query,
top_k=top_k,
strict_openrag=True,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Hardcoding strict_openrag=True here overrides the RAG_STRICT_OPENRAG environment variable, which contradicts the documentation in OPENRAG_INTEGRATION.md describing it as an optional dev fallback. By passing None (or omitting the argument), the API will remain strict by default (as defined in recommender.py) while still respecting the environment variable for local development scenarios where a fallback is desired.

Suggested change
strict_openrag=True,
strict_openrag=None,

self._check("omop_modality_concept_ids" in payload["query_profile"])
mock_recommend.assert_called_once()
call_kwargs = mock_recommend.call_args.kwargs
self._check_equal(call_kwargs["strict_openrag"], True)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This assertion should be updated to expect None if the view is modified to delegate strictness resolution to the recommender, ensuring the environment variable configuration is correctly respected.

Suggested change
self._check_equal(call_kwargs["strict_openrag"], True)
self._check_equal(call_kwargs["strict_openrag"], None)

@codacy-production
Copy link
Copy Markdown

Not up to standards ⛔

🔴 Issues 9 high · 7 minor

Alerts:
⚠ 16 issues (≤ 0 issues of at least minor severity)

Results:
16 new issues

Category Results
Documentation 7 minor
ErrorProne 1 high
Security 8 high

View in Codacy

🟢 Metrics 13 complexity

Metric Results
Complexity 13

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cbf5d97565

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

payload = recommend_models_for_query(
query=query,
top_k=top_k,
strict_openrag=True,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Respect RAG_STRICT_OPENRAG in API recommendations

The API now hard-codes strict_openrag=True when calling recommend_models_for_query, which bypasses _resolve_strict_openrag() and makes the documented RAG_STRICT_OPENRAG=0 fallback ineffective for /api/rag/recommend/. In environments where OpenRAG is intentionally unavailable (e.g., local/dev), requests will still fail with 503 even when the operator has explicitly disabled strict mode via env var, so this change removes a configuration escape hatch that the same commit advertises.

Useful? React with 👍 / 👎.

from RAG.verify_openrag import format_report

report = format_report()
assert "OpenRAG runtime verification:" in report

report = format_report()
assert "OpenRAG runtime verification:" in report
assert "runtime_ready:" in report

monkeypatch.setenv("OPENRAG_ENDPOINT", "http://localhost:8000")
status = main()
assert status in (0, 1)
from RAG.verify_openrag import main

monkeypatch.delenv("OPENRAG_ENDPOINT", raising=False)
assert main() == 1
strict_openrag=True,
)
except OpenRAGRuntimeUnavailableError as exc:
assert "OpenRAG retrieval is required" in str(exc)
from RAG.recommender import _resolve_strict_openrag

monkeypatch.delenv("RAG_STRICT_OPENRAG", raising=False)
assert _resolve_strict_openrag(None) is True
from RAG.recommender import _resolve_strict_openrag

monkeypatch.setenv("RAG_STRICT_OPENRAG", "false")
assert _resolve_strict_openrag(None) is False
from RAG.recommender import _resolve_strict_openrag

monkeypatch.setenv("RAG_STRICT_OPENRAG", "0")
assert _resolve_strict_openrag(True) is True
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants