Skip to content

RAG Poisioning#17

Merged
preetkaran20 merged 12 commits into
mainfrom
rag_poision
May 16, 2026
Merged

RAG Poisioning#17
preetkaran20 merged 12 commits into
mainfrom
rag_poision

Conversation

@preetkaran20
Copy link
Copy Markdown
Member

@preetkaran20 preetkaran20 commented May 10, 2026

Summary by CodeRabbit

  • New Features

    • Interactive RAG Context Poisoning lab (3 difficulty levels) with Generate Code flow, retrieval trace, raw/HTML view toggle, and a “Identify the Poisoned Document” challenge with validation and feedback.
  • Content

    • Added multi-level retrieval documents and multi-level payload templates for code/telemetry/data-processing scenarios.
  • Localization

    • Added US‑English strings describing RAG Context Poisoning and related attack vectors.
  • Documentation

    • Substantially rewritten README with updated quick start and architecture.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 10, 2026

Warning

Rate limit exceeded

@preetkaran20 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 30 minutes and 13 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: db276904-257b-4c51-9adf-03ee2ee09a6d

📥 Commits

Reviewing files that changed from the base of the PR and between 612f32b and 504911b.

📒 Files selected for processing (4)
  • src/controllers/bola_chatbot_controller.py
  • src/controllers/indirect_prompt_injection_controller.py
  • src/controllers/prompt_injection_controller.py
  • src/controllers/rag_poisoning_controller.py
📝 Walkthrough

Walkthrough

Implements a RAG Context Poisoning lab: service retrieval/evaluation and validate flow, three controller endpoints, app wiring, Ollama client JSON-mode support, frontend facade (HTML/CSS/JS), per-level static documents, README and locale string additions, and a BOLA lab refactor to a planner→responder JSON flow.

Changes

RAG Context Poisoning Feature

Layer / File(s) Summary
Framework enum & app wiring
src/framework/decorators.py, src/app.py
Register VulnerabilityType.RAG_CONTEXT_POISONING and import RagContextPoisoningController.
RAG service: core types & retrieval
src/service/vulnerabilities/rag_poisoning_lab.py
Constants and frozen RagPoisoningLevel, filesystem doc loading, token-overlap + similarity scoring, trigger-word boost, and LEVELS mapping.
RAG service: evaluate & validate
src/service/vulnerabilities/rag_poisoning_lab.py
evaluate_level() builds retrieved_context, stores poisoned doc ids in CHALLENGE_ANSWERS, calls chat_completion(), returns generated_code and sanitized retrieved_docs; validate_challenge() checks selections and returns feedback.
Service package exports
src/service/vulnerabilities/__init__.py
Expose RAG_LEVELS, evaluate_rag_level, and validate_rag_challenge.
API Controller & Endpoints
src/controllers/rag_poisoning_controller.py
RagContextPoisoningController with _handle_level routing generate/validate, and level1/level2/level3 endpoints mapping errors to JSON.
Ollama client JSON-mode
src/ollama_client.py
Add json_mode: bool = False to chat helpers and set payload["format"]="json" when enabled.
BOLA lab refactor
src/service/vulnerabilities/bola_chatbot_lab.py
Switch to two-call planner→responder flow using chat_completion_with_messages(..., json_mode=True), add plan parsing/extraction and context builders, and adjust per-level behavior and error handling.
Per-level documents (static JSON)
src/service/vulnerabilities/docs/LEVEL1/documents.json, .../LEVEL2/documents.json, .../LEVEL3/documents.json
Add document arrays with doc_id, title, chunk_id, content, scoring metadata, and is_poisoned flags.
Frontend CSS
src/static/facade/rag_poisoning_template.css
Styles for facade root, panels, retrieved-docs, analysis/checkbox UI, and challenge feedback states.
Frontend HTML template
src/static/facade/rag_poisoning_template.html
Facade layout: readonly Prompt, Generate button, Generated Code and Retrieved Documents panels, Toggle Raw, hidden Challenge UI, and output area.
Frontend JS: entry & state
src/static/facade/rag_poisoning_template.js
IIFE binds elements, computes active level from global state, and updates meta endpoint path.
Frontend JS: response/render helpers
src/static/facade/rag_poisoning_template.js
Response parsing, message extraction, escapeHtml, markdown→safe HTML with fenced code handling, and retrieval trace rendering.
Frontend JS: challenge & generate flow
src/static/facade/rag_poisoning_template.js
Generate action posts {action:"generate", user_input}, stores challenge_session_id, renders retrieved docs/generated code, populates challenge options, and validates selection via {action:"validate", ...}.
Locale strings & README
locale/messages_us.properties, README.md
Add vuln.rag_context_poisoning, three attack.* entries, three payload.l{1,2,3}_code_request keys, and substantial README rewrite.

Sequence Diagram(s)

sequenceDiagram
  participant Browser as Browser UI
  participant Controller as RagContextPoisoningController
  participant Service as rag_poisoning_lab.evaluate_level
  participant Model as chat_completion (LLM)

  Browser->>Controller: POST /api/.../rag-context-poisoning/level{n} {action:"generate", user_input, model?}
  Controller->>Service: evaluate_rag_level(level, user_input, model)
  Service->>Service: load docs for LEVEL{n}, compute retrieved_context (scoring + boosts)
  Service->>Model: chat_completion(system_prompt + retrieved_context + user_input)
  Model-->>Service: assistant_output / generated_code
  Service-->>Controller: { generated_code, retrieved_docs (sanitized), challenge_session_id }
  Controller-->>Browser: JSON response with generated_code & retrieved_docs
  Browser->>Controller: POST {action:"validate", challenge_session_id, selected_doc_ids}
  Controller->>Service: validate_rag_challenge(level, challenge_session_id, selected_doc_ids)
  Service-->>Controller: { correct, feedback }
  Controller-->>Browser: { correct, feedback }
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I nibble through docs and hidden cues,
prompts assemble, code to peruse,
three levels whisper, panels glow,
choose the poisoned bit I know —
a rabbit cheers: detect, decode, amuse.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title "RAG Poisioning" contains a typo ("Poisioning" instead of "Poisoning") and is vague about the scope of changes. The PR introduces a complete RAG Context Poisoning vulnerability lab with multiple controllers, services, UI components, and documentation, but the title does not clearly convey these substantial additions. Consider revising the title to be more descriptive and accurate, such as "Add RAG Context Poisoning vulnerability lab" or "Implement RAG Context Poisoning challenge with UI and validation". Also fix the spelling of "Poisoning".
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch rag_poision

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (2)
src/controllers/rag_poisoning_controller.py (2)

46-51: ⚡ Quick win

Consider returning proper HTTP status codes for error responses.

The error handlers return error dictionaries but don't set explicit HTTP status codes. FastAPI will return 200 OK by default, which may not accurately represent error conditions. Consider returning appropriate status codes (400 for ValueError, 503 for service unavailable).

🔧 Proposed fix to add proper status codes

Update the error handling to use FastAPI's JSONResponse:

from fastapi import Request
from fastapi.responses import JSONResponse
import httpx

# In the helper method:
async def _handle_level_request(self, request: Request, level: int) -> dict | JSONResponse:
    """Common handler for RAG poisoning level requests."""
    data = await request.json()
    user_input = data.get("user_input", "")
    model = data.get("model")
    temperature = data.get("temperature")
    
    try:
        return await evaluate_rag_level(level, user_input, temperature=temperature, model=model)
    except ValueError as exc:
        return JSONResponse({"error": str(exc)}, status_code=400)
    except httpx.RequestError as exc:
        return JSONResponse({"error": "Model service unavailable"}, status_code=503)

Also applies to: 72-77, 98-103

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/controllers/rag_poisoning_controller.py` around lines 46 - 51, The
current error handlers in the RAG poisoning endpoints return plain dicts (e.g.,
in the try/except around evaluate_rag_level) which yields 200 OK; change these
to return FastAPI JSONResponse objects with proper status codes: return
JSONResponse({"error": str(exc)}, status_code=400) for ValueError and
JSONResponse({"error":"Model service unavailable"}, status_code=503) for
httpx.RequestError, and add the required import for JSONResponse; apply the same
change to the other identical handlers noted (the blocks around
evaluate_rag_level at the other locations).

39-51: ⚡ Quick win

Consider extracting common request handling logic.

All three endpoint handlers (level1, level2, level3) follow an identical pattern: extract request data, call evaluate_rag_level with the level number, and handle the same exceptions. This presents an opportunity to reduce duplication.

♻️ Proposed refactor to reduce duplication

Add a helper method to the controller class:

async def _handle_level_request(self, request: Request, level: int) -> dict:
    """Common handler for RAG poisoning level requests."""
    data = await request.json()
    user_input = data.get("user_input", "")
    model = data.get("model")
    temperature = data.get("temperature")
    
    try:
        return await evaluate_rag_level(level, user_input, temperature=temperature, model=model)
    except ValueError as exc:
        return {"error": str(exc)}
    except httpx.RequestError as exc:
        return {"error": "Model service unavailable"}

Then simplify each endpoint:

async def level1(self, request: Request) -> dict:
    """Level 1: Obvious Poisoned Guidance"""
    return await self._handle_level_request(request, 1)

async def level2(self, request: Request) -> dict:
    """Level 2: Hidden Instruction Injection"""
    return await self._handle_level_request(request, 2)

async def level3(self, request: Request) -> dict:
    """Level 3: Multi-Document Context Poisoning"""
    return await self._handle_level_request(request, 3)

Also applies to: 65-77, 91-103

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/controllers/rag_poisoning_controller.py` around lines 39 - 51, The three
endpoint methods level1, level2, and level3 duplicate request parsing and error
handling; add a private helper method (e.g., _handle_level_request(self,
request: Request, level: int)) that extracts data (user_input, model,
temperature), calls evaluate_rag_level(level, ...), and centralizes the
ValueError and httpx.RequestError handling, then refactor level1/level2/level3
to simply return await self._handle_level_request(request, 1/2/3) to remove
duplication.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/service/vulnerabilities/__init__.py`:
- Around line 21-24: Add a verify_rag_level_secret function to
rag_poisoning_lab.py that checks incoming secret tokens against the module's
LEVELS (RAG_LEVELS) similarly to other labs (e.g., prompt_injection), export it
from the package init alongside LEVELS and evaluate_rag_level, and register the
new verify_rag_level_secret in the central dispatcher in registry.py so the
framework can validate RAG poisoning secrets; use the same signature and
behavior pattern as existing verify_* functions to ensure consistency with
evaluate_rag_level and RAG_LEVELS.

In `@src/static/facade/rag_poisoning_template.css`:
- Line 76: Remove the deprecated CSS property "word-break: break-word" from the
stylesheet (it's redundant because "overflow-wrap: anywhere" is already set);
locate the rule in rag_poisoning_template.css that contains both "overflow-wrap:
anywhere" and "word-break: break-word" and delete the "word-break: break-word"
declaration so only "overflow-wrap: anywhere" remains.

In `@src/static/facade/rag_poisoning_template.js`:
- Line 84: The template currently renders `(doc.metadata?.trust_score *
100).toFixed(1)` which yields "NaN%" when metadata or trust_score is undefined;
change the expression to guard with nullish coalescing or conditional rendering
so you compute trust = (doc.metadata?.trust_score ?? null) and then render
either a formatted percent `(trust !== null ? (trust * 100).toFixed(1) + '%' :
'N/A')` (or default 0%) inside the same div with the `.doc-score` class; keep
`doc.similarity_score` logic as-is but apply a similar guard if needed.
- Around line 111-116: The telemetry rendering builds raw HTML using
telemetryData.map into the html string (variables: html, telemetryData,
entry.timestamp, entry.data, telemetry-entry) which risks XSS; instead construct
DOM nodes and set text content or explicitly escape values before inserting into
the page. Replace the template literal approach with creating elements (e.g.,
document.createElement for the wrapper and child nodes), assign entry.timestamp
and JSON.stringify(entry.data) to textContent (or use a trusted sanitizer) and
append them to the container rather than using innerHTML so no user-controlled
text is interpreted as HTML.
- Around line 81-87: The HTML string assembly in the retrievedDocs.map (assigned
to variable html) injects unsanitized fields doc.doc_id, doc.chunk_id,
doc.content and doc.metadata?.trust_score directly into markup, creating an XSS
vector; replace this string-concatenation approach with safe DOM construction:
for each item in retrievedDocs create elements via document.createElement,
assign text values using textContent/innerText (or setAttribute for non-text
attributes), truncate doc.content via substring on the text value (not raw HTML)
and format numeric trust_score before inserting, and then append these nodes to
the container instead of setting innerHTML from the html string. Ensure all
places referencing retrievedDocs.map and the html variable are updated to use
this safe DOM path.

---

Nitpick comments:
In `@src/controllers/rag_poisoning_controller.py`:
- Around line 46-51: The current error handlers in the RAG poisoning endpoints
return plain dicts (e.g., in the try/except around evaluate_rag_level) which
yields 200 OK; change these to return FastAPI JSONResponse objects with proper
status codes: return JSONResponse({"error": str(exc)}, status_code=400) for
ValueError and JSONResponse({"error":"Model service unavailable"},
status_code=503) for httpx.RequestError, and add the required import for
JSONResponse; apply the same change to the other identical handlers noted (the
blocks around evaluate_rag_level at the other locations).
- Around line 39-51: The three endpoint methods level1, level2, and level3
duplicate request parsing and error handling; add a private helper method (e.g.,
_handle_level_request(self, request: Request, level: int)) that extracts data
(user_input, model, temperature), calls evaluate_rag_level(level, ...), and
centralizes the ValueError and httpx.RequestError handling, then refactor
level1/level2/level3 to simply return await self._handle_level_request(request,
1/2/3) to remove duplication.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 10073d4c-09b1-43bc-8740-ee2d5c2681ec

📥 Commits

Reviewing files that changed from the base of the PR and between af068a5 and 1ad5987.

📒 Files selected for processing (9)
  • locale/messages_us.properties
  • src/app.py
  • src/controllers/rag_poisoning_controller.py
  • src/framework/decorators.py
  • src/service/vulnerabilities/__init__.py
  • src/service/vulnerabilities/rag_poisoning_lab.py
  • src/static/facade/rag_poisoning_template.css
  • src/static/facade/rag_poisoning_template.html
  • src/static/facade/rag_poisoning_template.js

Comment thread src/service/vulnerabilities/__init__.py
Comment thread src/static/facade/rag_poisoning_template.css
Comment thread src/static/facade/rag_poisoning_template.js Outdated
Comment thread src/static/facade/rag_poisoning_template.js Outdated
Comment thread src/static/facade/rag_poisoning_template.js Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
src/static/facade/rag_poisoning_template.css (1)

95-97: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove deprecated word-break: break-word and use overflow-wrap consistently.

Line 96 and Line 201 use deprecated word-break: break-word; this is likely to keep failing stylelint and is already covered in prior feedback.

Suggested fix
 .llmforge-facade-output {
@@
   white-space: pre-wrap;
   overflow-wrap: anywhere;
-  word-break: break-word;
   min-width: 0;
   font-size: 12px;
 }
@@
 .doc-content {
   font-size: 11px;
   color: `#24292e`;
   white-space: pre-wrap;
-  word-break: break-word;
+  overflow-wrap: anywhere;
   max-height: 150px;
   overflow: hidden;
 }

Also applies to: 200-201

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/static/facade/rag_poisoning_template.css` around lines 95 - 97, Remove
the deprecated declaration "word-break: break-word" wherever it appears in this
CSS file and rely on the existing "overflow-wrap: anywhere;" for the same
selectors; specifically locate occurrences of the property "word-break:
break-word" and delete them, ensuring each selector still contains
"overflow-wrap: anywhere;" (and keep "min-width: 0;" where present) so stylelint
no longer flags the rule and overflow-wrap is used consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/service/vulnerabilities/rag_poisoning_lab.py`:
- Around line 149-168: The sample document entries used by get_poisoned_docs
have mismatched poisoning metadata: get_poisoned_docs marks certain level-3 docs
as poisoned but the corresponding document dicts (e.g., the entry with doc_id
"observability_principles" and the other doc dicts in the same blocks) have
"is_poisoned": False; update those document dicts to set "is_poisoned": True so
the document records align with the poisoning map returned by get_poisoned_docs
(apply the same fix to the other blocks mentioned around the ranges 170-193,
195-216, 237-239).
- Line 323: The default literals in generate_malicious_logs (e.g.,
auth_token="Bearer sk-123abc456" and user_agent="Mozilla/5.0") and similar
hardcoded strings elsewhere in this file should be replaced with explicit
synthetic placeholders to avoid triggering secret/PII scanners; update
generate_malicious_logs to use values like auth_token="Bearer
<SYNTHETIC_API_KEY>" and user_agent="<SYNTHETIC_USER_AGENT>" and replace any
other token-like or sensitive-looking literals referenced at the other spots
(the ones around lines noted in the review) with clear markers such as
"<SYNTHETIC_TOKEN>", "<SYNTHETIC_CARD_NUMBER>", or "<SYNTHETIC_SECRET>" so it is
obvious these are harmless placeholders.
- Line 456: The code treats 0.0 as falsy by doing "selected_temperature =
temperature or 0.7"; change this to preserve an explicit 0.0 by checking for
None instead (e.g., set selected_temperature = temperature if temperature is not
None else 0.7), so the temperature parameter's explicit value is respected;
update the assignment that uses the selected_temperature and temperature symbols
accordingly.

---

Duplicate comments:
In `@src/static/facade/rag_poisoning_template.css`:
- Around line 95-97: Remove the deprecated declaration "word-break: break-word"
wherever it appears in this CSS file and rely on the existing "overflow-wrap:
anywhere;" for the same selectors; specifically locate occurrences of the
property "word-break: break-word" and delete them, ensuring each selector still
contains "overflow-wrap: anywhere;" (and keep "min-width: 0;" where present) so
stylelint no longer flags the rule and overflow-wrap is used consistently.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f843a910-2437-4254-9184-9b23c50eecd1

📥 Commits

Reviewing files that changed from the base of the PR and between 1ad5987 and 9bd4263.

📒 Files selected for processing (3)
  • src/service/vulnerabilities/rag_poisoning_lab.py
  • src/static/facade/rag_poisoning_template.css
  • src/static/facade/rag_poisoning_template.html

Comment thread src/service/vulnerabilities/rag_poisoning_lab.py Outdated
Comment thread src/service/vulnerabilities/rag_poisoning_lab.py Outdated
Comment thread src/service/vulnerabilities/rag_poisoning_lab.py Outdated
Karan Preet Singh Sasan and others added 4 commits May 10, 2026 16:41
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (1)
src/static/facade/rag_poisoning_template.css (1)

209-209: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Deprecated word-break: break-word.

Same deprecation already flagged at line 104. overflow-wrap: break-word (or anywhere) is the modern replacement; drop the deprecated declaration here as well.

🧹 Proposed fix
   white-space: pre-wrap;
-  word-break: break-word;
   max-height: 150px;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/static/facade/rag_poisoning_template.css` at line 209, Replace the
deprecated CSS declaration 'word-break: break-word;' in this rule with the
modern equivalent by removing that line and adding 'overflow-wrap: break-word;'
(or 'overflow-wrap: anywhere;' if you prefer the newer behavior); also ensure
the other occurrence flagged earlier (the one at the previous instance of
'word-break: break-word;') is handled consistently so both use overflow-wrap
instead of the deprecated property.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/controllers/rag_poisoning_controller.py`:
- Around line 32-40: The poll_logs branch currently passes raw cursor and
batch_size from the request into poll_rag_log_session which unconditionally does
int(...) and can raise on bad input; coerce and validate both values in the
controller before calling poll_rag_log_session (when action == "poll_logs"):
attempt to convert cursor and batch_size to int with safe try/except, enforce
sensible defaults (e.g., cursor=0, batch_size=1 or reject with a clear error
response) and reject non-numeric or out-of-range values rather than letting a
ValueError propagate; keep the log_session_id trimming and do not change
clear_logs behavior.

In `@src/service/vulnerabilities/rag_poisoning_lab.py`:
- Line 26: LOG_SESSIONS is a module-level dict that grows unbounded because
evaluate_level creates entries and clear_log_session only clears contents
instead of removing the key, and it is not visible across multiple workers
(generate vs poll_logs), causing memory leaks and intermittent "Unknown log
session." errors; fix by turning LOG_SESSIONS into a bounded/TTL store (attach
created_at to each entry and evict oldest or expired entries when inserting, or
limit total size) and change clear_log_session to pop the log_session_id entry
instead of mutating it, and address multi-worker safety by either documenting
this lab as single-worker-only or replacing the in-process store with a shared
backend (e.g., Redis or sqlite) used by evaluate_level, generate, and poll_logs
so sessions are visible across workers.

In `@src/static/facade/rag_poisoning_template.css`:
- Around line 284-291: The .rag-analysis-section CSS uses both padding-top and a
later padding shorthand which overrides it; remove the redundant padding-top:
16px or place it after the padding shorthand so the intended top padding
persists—update the .rag-analysis-section rule to keep only the needed
declarations (either a single padding shorthand or explicit per-side paddings)
so declaration-block-no-shorthand-property-overrides is not violated.

In `@src/static/facade/rag_poisoning_template.js`:
- Around line 220-225: The code references an undefined helper responseMessage
in multiple error branches causing ReferenceErrors; add a small file-scoped
function named responseMessage(data, fallback) that returns a safe string
(preferably data.error || data.message || fallback) and use it in the three
failure branches (the blocks that call stopPollingLogs() and set
output.textContent) so output.textContent gets a reliable string; ensure the
helper does not assume a global DOM element and only formats the message,
leaving stopPollingLogs(), output.className, and other UI updates as-is.
- Around line 76-137: The markdownToSafeHtml function is vulnerable because
heading text (heading[2]), list items (listItem[1]) and paragraph lines (line)
are inserted unescaped into html; fix by running these captured strings through
the existing escapeHtml helper before concatenation (use escapeHtml(heading[2]),
escapeHtml(listItem[1]) and escapeHtml(line.trim()) when building <hN>, <li> and
<p> respectively), keep the fenced-code branch unchanged, and ensure the comment
stash/restore logic still reinserts the original <!-- ... --> tokens after
escaping so the raw-view toggle continues to show comments.

---

Duplicate comments:
In `@src/static/facade/rag_poisoning_template.css`:
- Line 209: Replace the deprecated CSS declaration 'word-break: break-word;' in
this rule with the modern equivalent by removing that line and adding
'overflow-wrap: break-word;' (or 'overflow-wrap: anywhere;' if you prefer the
newer behavior); also ensure the other occurrence flagged earlier (the one at
the previous instance of 'word-break: break-word;') is handled consistently so
both use overflow-wrap instead of the deprecated property.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: dbe4d6a4-b63d-4610-95e6-1374254386f0

📥 Commits

Reviewing files that changed from the base of the PR and between 9bd4263 and c73dcf0.

📒 Files selected for processing (9)
  • src/controllers/rag_poisoning_controller.py
  • src/service/vulnerabilities/__init__.py
  • src/service/vulnerabilities/docs/LEVEL1/documents.json
  • src/service/vulnerabilities/docs/LEVEL2/documents.json
  • src/service/vulnerabilities/docs/LEVEL3/documents.json
  • src/service/vulnerabilities/rag_poisoning_lab.py
  • src/static/facade/rag_poisoning_template.css
  • src/static/facade/rag_poisoning_template.html
  • src/static/facade/rag_poisoning_template.js
✅ Files skipped from review due to trivial changes (1)
  • src/service/vulnerabilities/docs/LEVEL2/documents.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/service/vulnerabilities/init.py

Comment thread src/controllers/rag_poisoning_controller.py Outdated
Comment thread src/service/vulnerabilities/rag_poisoning_lab.py Outdated
Comment on lines +284 to +291
.rag-analysis-section {
margin-top: 20px;
padding-top: 16px;
border-top: 1px solid #e1e4e8;
background: #f6f8fa;
padding: 16px;
border-radius: 6px;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

padding shorthand overrides padding-top.

padding: 16px on line 289 resets all four sides, silently undoing padding-top: 16px on line 288. Either drop the redundant padding-top or move it after the shorthand. Stylelint also flags this (declaration-block-no-shorthand-property-overrides).

🧹 Proposed fix
 .rag-analysis-section {
   margin-top: 20px;
-  padding-top: 16px;
   border-top: 1px solid `#e1e4e8`;
   background: `#f6f8fa`;
   padding: 16px;
   border-radius: 6px;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
.rag-analysis-section {
margin-top: 20px;
padding-top: 16px;
border-top: 1px solid #e1e4e8;
background: #f6f8fa;
padding: 16px;
border-radius: 6px;
}
.rag-analysis-section {
margin-top: 20px;
border-top: 1px solid `#e1e4e8`;
background: `#f6f8fa`;
padding: 16px;
border-radius: 6px;
}
🧰 Tools
🪛 Stylelint (17.10.0)

[error] 289-289: Overridden property "padding-top" by shorthand "padding" (declaration-block-no-shorthand-property-overrides)

(declaration-block-no-shorthand-property-overrides)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/static/facade/rag_poisoning_template.css` around lines 284 - 291, The
.rag-analysis-section CSS uses both padding-top and a later padding shorthand
which overrides it; remove the redundant padding-top: 16px or place it after the
padding shorthand so the intended top padding persists—update the
.rag-analysis-section rule to keep only the needed declarations (either a single
padding shorthand or explicit per-side paddings) so
declaration-block-no-shorthand-property-overrides is not violated.

Comment on lines +76 to +137
function markdownToSafeHtml(markdownText) {
const comments = [];
let text = String(markdownText || "").replace(/<!--[\s\S]*?-->/g, function (match) {
const token = "__HTML_COMMENT_" + comments.length + "__";
comments.push(match);
return token;
});

text = text.replace(/```([\s\S]*?)```/g, function (_, code) {
return "\n<pre><code>" + escapeHtml(code.trim()) + "</code></pre>\n";
});

const lines = text.split("\n");
let html = "";
let inList = false;

for (let i = 0; i < lines.length; i += 1) {
const line = lines[i];
const heading = /^(#{1,4})\s+(.*)$/.exec(line);
const listItem = /^\s*-\s+(.*)$/.exec(line);

if (heading) {
if (inList) {
html += "</ul>";
inList = false;
}
const level = Math.min(4, heading[1].length + 2);
html += "<h" + level + ">" + heading[2] + "</h" + level + ">";
continue;
}

if (listItem) {
if (!inList) {
html += "<ul>";
inList = true;
}
html += "<li>" + listItem[1] + "</li>";
continue;
}

if (inList) {
html += "</ul>";
inList = false;
}

if (line.trim() === "") {
html += "<br>";
} else {
html += "<p>" + line + "</p>";
}
}

if (inList) {
html += "</ul>";
}

html = html.replace(/__HTML_COMMENT_(\d+)__/g, function (_, idx) {
return comments[Number(idx)] || "";
});

return html;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

markdownToSafeHtml is not actually safe — headings / list items / paragraphs are not escaped.

Only the fenced-code branch on line 85 escapes content. Heading text (line 103: heading[2]), list items (line 112: listItem[1]), and paragraph lines (line 124: line) are concatenated directly into the HTML string and ultimately assigned via retrievalTrace.innerHTML = html (line 166). A doc whose content contains e.g. # Hello <img src=x onerror=alert(1)> will execute script in the user's page.

Even though today's content comes from in-repo JSON, this function lives in a context-poisoning lab whose entire premise is that retrieved content is attacker-influenced. The function name is doing the wrong thing here.

Escape each captured piece before inserting:

🛡️ Proposed fix
       if (heading) {
         ...
-        const level = Math.min(4, heading[1].length + 2);
-        html += "<h" + level + ">" + heading[2] + "</h" + level + ">";
+        const level = Math.min(4, heading[1].length + 2);
+        html += "<h" + level + ">" + escapeHtml(heading[2]) + "</h" + level + ">";
         continue;
       }

       if (listItem) {
         ...
-        html += "<li>" + listItem[1] + "</li>";
+        html += "<li>" + escapeHtml(listItem[1]) + "</li>";
         continue;
       }
       ...
       if (line.trim() === "") {
         html += "<br>";
       } else {
-        html += "<p>" + line + "</p>";
+        html += "<p>" + escapeHtml(line) + "</p>";
       }

Note: the HTML-comment stash/restore at lines 78-82 / 132-134 is intentional (the comments are reinserted as <!-- ... --> so the raw-view toggle can reveal them). Keep that behavior but make sure the surrounding text is escaped before the restore runs.

🧰 Tools
🪛 OpenGrep (1.20.0)

[ERROR] 94-94: Dynamic command passed to child_process.exec/execSync. Use child_process.execFile or spawn with an argument array instead.

(coderabbit.command-injection.exec-js)


[ERROR] 95-95: Dynamic command passed to child_process.exec/execSync. Use child_process.execFile or spawn with an argument array instead.

(coderabbit.command-injection.exec-js)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/static/facade/rag_poisoning_template.js` around lines 76 - 137, The
markdownToSafeHtml function is vulnerable because heading text (heading[2]),
list items (listItem[1]) and paragraph lines (line) are inserted unescaped into
html; fix by running these captured strings through the existing escapeHtml
helper before concatenation (use escapeHtml(heading[2]), escapeHtml(listItem[1])
and escapeHtml(line.trim()) when building <hN>, <li> and <p> respectively), keep
the fenced-code branch unchanged, and ensure the comment stash/restore logic
still reinserts the original <!-- ... --> tokens after escaping so the raw-view
toggle continues to show comments.

Comment thread src/static/facade/rag_poisoning_template.js Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

♻️ Duplicate comments (1)
src/service/vulnerabilities/rag_poisoning_lab.py (1)

24-25: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Session answer storage is unbounded and process-local.

CHALLENGE_ANSWERS only grows, never expires/pops, and is local to a single worker process. In multi-worker deployments this can intermittently fail validation, and in long-running processes it creates memory growth.

Suggested direction
+# Example direction (in-process mitigation):
+# - add TTL + size cap eviction on insert
+# - pop session on successful/terminal validation
+# Long-term: move to shared store (Redis/DB) for multi-worker consistency.

Also applies to: 142-146, 193-199

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/service/vulnerabilities/rag_poisoning_lab.py` around lines 24 - 25,
CHALLENGE_ANSWERS is an unbounded, process-local dict which causes memory growth
and inconsistent validation across workers; replace it with a bounded, expiring
store (either an in-process LRU/TTL cache like cachetools.TTLCache with a
reasonable maxsize and ttl and wrapped with thread-safety, or an external shared
store like Redis keyed by challenge/session id) and update all reads/writes to
go through small helper functions (e.g., get_challenge_answer(id),
set_challenge_answer(id, value), delete_challenge_answer(id)) so eviction and
multi-worker consistency are handled centrally; ensure existing references to
CHALLENGE_ANSWERS are replaced to use these helpers.
🧹 Nitpick comments (1)
src/static/facade/rag_poisoning_template.css (1)

244-244: ⚡ Quick win

Replace deprecated word-break: break-word with overflow-wrap.

The word-break: break-word value is deprecated. Use overflow-wrap: break-word or overflow-wrap: anywhere instead for modern browser compatibility.

♻️ Proposed fix
   white-space: pre-wrap;
-  word-break: break-word;
+  overflow-wrap: break-word;
   max-height: 220px;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/static/facade/rag_poisoning_template.css` at line 244, Replace the
deprecated CSS declaration "word-break: break-word" in
src/static/facade/rag_poisoning_template.css with the modern equivalent by
changing the property to "overflow-wrap: break-word" (or "overflow-wrap:
anywhere" if you want more aggressive breaking); locate the rule containing the
"word-break: break-word" declaration and update that single property to use
"overflow-wrap" so browsers handle long words correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@README.md`:
- Line 59: Fix the inconsistent indentation in the README.md ASCII architecture
diagram by replacing the tab character on the line containing the vertical bar
'│' with the same number of spaces used by the surrounding diagram lines so the
vertical bars align; locate the ASCII diagram block in README.md (the line with
'│') and update that line's leading whitespace to match adjacent lines' spacing
for consistent visual alignment.
- Around line 122-126: The README lists an unused environment variable
MODEL_PROFILE; remove the `MODEL_PROFILE` row from the environment variables
table so only `OLLAMA_MODEL` and `OLLAMA_EMBED_MODEL` remain; locate the table
in README.md (identify the row containing the literal `MODEL_PROFILE`) and
delete that row and its separator so the table stays properly formatted.
- Around line 88-98: Update the Quick Start examples in README.md so the health
check uses the real route /api/v1/health (replace the existing /health example)
and remove or replace the non-existent /generate example with a real API example
such as the vulnerabilities prompt-injection endpoint (e.g. POST to
/api/v1/vulnerabilities/prompt-injection/level1 with a JSON body like
{"user_input":"Say hello"})—or simply remove the placeholder /generate example
if you prefer not to document that call.

In `@src/controllers/rag_poisoning_controller.py`:
- Around line 27-40: In _handle_level, validate the parsed JSON and each
expected field before dispatching: wrap await request.json() in a try/except to
catch malformed JSON and return a 400 error; ensure "action" is a string,
"challenge_session_id" is a string, "selected_doc_ids" is a list (and optionally
contains strings/ints), "user_input" is a string, and "model" is of expected
type—if any type checks fail return a 400 with a clear message rather than
proceeding; update the validate branch (validate_rag_challenge) and the evaluate
branch (evaluate_rag_level) call sites in _handle_level to only pass
sanitized/typed values.

In `@src/service/vulnerabilities/bola_chatbot_lab.py`:
- Around line 197-200: Remove the raw print debug statements that log sensitive
request-derived content (e.g., the two prints around planning_response and plan)
and any similar prints at the other noted locations (around lines 268-271 and
339-342); instead, either delete them or replace them with non-sensitive logging
that uses a sanitizer/redactor before logging or logs only an invariant summary
(e.g., plan status or step count) and is gated behind a debug flag. Locate the
prints near planning_response and _parse_plan and update logging to call a
redaction helper or emit minimal metadata via the module logger (do not log
patient IDs, request text, or full planner responses).
- Around line 304-310: The except blocks in
src/service/vulnerabilities/bola_chatbot_lab.py currently return raw exception
text (str(e)) to the client; change both exception handlers that build the
response dict (the one around the "L2 error" print and the later similar block)
to log the full exception server-side using logger.exception(...) or print with
traceback, and return a generic sanitized error string (e.g. "An internal error
occurred") in the "error" field and keep "response" empty; do this for the
handler that prints "L2 error: {e}" and the other handler (the one around lines
375-381) so no raw exception details are echoed to clients.
- Around line 125-134: The _parse_plan function currently returns whatever
json.loads produces which can be non-object types and later break
_extract_action() and _extract_patient_id(); change _parse_plan to load the JSON
into a variable, verify it's a dict and contains required keys (at minimum
"action", and "patient_id" if your pipeline expects it), and raise a clear
ValueError if validation fails so callers get an explicit error instead of
downstream crashes.

In `@src/static/facade/rag_poisoning_template.css`:
- Around line 249-257: Remove the duplicate .doc-html CSS block so there is a
single authoritative definition: locate the two consecutive .doc-html selectors
and delete the redundant one (keep the desired font-size/color values from the
later block or merge if values differ) to eliminate dead code and avoid
confusion.

In `@src/static/facade/rag_poisoning_template.js`:
- Around line 229-252: The validateAnswerWithBackend function lacks
network/error handling: wrap the fetch + parseResponseBody call in try/catch,
check the response.ok before using data, and set
challengeFeedback.className/textContent to a clear error state when fetch throws
or res.ok is false (use a fallback message if data.feedback is missing);
reference levelFromGlobalState(), apiPrefix, challengeSessionId,
parseResponseBody(), and challengeFeedback to locate where to add the try/catch,
an if (!res.ok) branch that reads/uses parseResponseBody() safely, and a catch
block that logs the error and updates challengeFeedback with an appropriate
error message.

---

Duplicate comments:
In `@src/service/vulnerabilities/rag_poisoning_lab.py`:
- Around line 24-25: CHALLENGE_ANSWERS is an unbounded, process-local dict which
causes memory growth and inconsistent validation across workers; replace it with
a bounded, expiring store (either an in-process LRU/TTL cache like
cachetools.TTLCache with a reasonable maxsize and ttl and wrapped with
thread-safety, or an external shared store like Redis keyed by challenge/session
id) and update all reads/writes to go through small helper functions (e.g.,
get_challenge_answer(id), set_challenge_answer(id, value),
delete_challenge_answer(id)) so eviction and multi-worker consistency are
handled centrally; ensure existing references to CHALLENGE_ANSWERS are replaced
to use these helpers.

---

Nitpick comments:
In `@src/static/facade/rag_poisoning_template.css`:
- Line 244: Replace the deprecated CSS declaration "word-break: break-word" in
src/static/facade/rag_poisoning_template.css with the modern equivalent by
changing the property to "overflow-wrap: break-word" (or "overflow-wrap:
anywhere" if you want more aggressive breaking); locate the rule containing the
"word-break: break-word" declaration and update that single property to use
"overflow-wrap" so browsers handle long words correctly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b833c544-29bb-4c3e-90a0-b5345917e9fb

📥 Commits

Reviewing files that changed from the base of the PR and between c73dcf0 and 612f32b.

📒 Files selected for processing (12)
  • README.md
  • src/controllers/rag_poisoning_controller.py
  • src/ollama_client.py
  • src/service/vulnerabilities/__init__.py
  • src/service/vulnerabilities/bola_chatbot_lab.py
  • src/service/vulnerabilities/docs/LEVEL1/documents.json
  • src/service/vulnerabilities/docs/LEVEL2/documents.json
  • src/service/vulnerabilities/docs/LEVEL3/documents.json
  • src/service/vulnerabilities/rag_poisoning_lab.py
  • src/static/facade/rag_poisoning_template.css
  • src/static/facade/rag_poisoning_template.html
  • src/static/facade/rag_poisoning_template.js
✅ Files skipped from review due to trivial changes (1)
  • src/service/vulnerabilities/docs/LEVEL3/documents.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/service/vulnerabilities/docs/LEVEL2/documents.json
  • src/service/vulnerabilities/docs/LEVEL1/documents.json

Comment thread README.md
│ │ Memory │ │ Evaluation │ │
│ └─────────────┘ └─────────────────┘ │
└──────────────────────┬──────────────────────┘
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix inconsistent indentation in ASCII diagram.

Line 59 uses tabs instead of spaces, breaking the visual alignment of the architecture diagram. The vertical bars should align with the rest of the diagram.

🎨 Proposed fix for consistent spacing
-
+
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` at line 59, Fix the inconsistent indentation in the README.md
ASCII architecture diagram by replacing the tab character on the line containing
the vertical bar '│' with the same number of spaces used by the surrounding
diagram lines so the vertical bars align; locate the ASCII diagram block in
README.md (the line with '│') and update that line's leading whitespace to match
adjacent lines' spacing for consistent visual alignment.

Comment thread README.md
Comment on lines 88 to 98
```bash
curl http://localhost:8000/health
```

**Generate example**
**Send a prompt:**

```bash
curl -X POST http://localhost:8000/generate \
-H "Content-Type: application/json" \
-d '{"prompt":"Say hello"}'
-H "Content-Type: application/json" \
-d '{"prompt": "Say hello"}'
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Search for /health and /generate endpoint definitions in the codebase

# Search for health endpoint
echo "=== Searching for /health endpoint ==="
rg -n --type=py -C3 '(/health|@.*route.*health|path.*health)' src/

echo ""
echo "=== Searching for /generate endpoint ==="
rg -n --type=py -C3 '(/generate|@.*route.*generate|path.*generate)' src/

Repository: SasanLabs/LLMForge

Length of output: 575


🏁 Script executed:

# Search more broadly for the /generate endpoint
echo "=== Broader search for /generate endpoint ==="
rg -n -i 'generate' src/ --type=py | grep -E '(@|route|path|def).*generate'

echo ""
echo "=== Check all Python controller files ==="
find src/ -name '*controller*.py' -o -name 'main.py' -o -name 'app.py' | head -20

echo ""
echo "=== Search for any POST routes ==="
rg -n '@.*post' src/ --type=py -A2 | head -40

echo ""
echo "=== List all files in src/controllers/ ==="
ls -la src/controllers/ 2>/dev/null || echo "Directory not found"

Repository: SasanLabs/LLMForge

Length of output: 1098


🏁 Script executed:

# Check meta_controller.py first (we know it has /health)
echo "=== meta_controller.py ==="
cat src/controllers/meta_controller.py

echo ""
echo "=== prompt_injection_controller.py (likely candidate for generate) ==="
head -50 src/controllers/prompt_injection_controller.py

echo ""
echo "=== bola_chatbot_controller.py ==="
head -50 src/controllers/bola_chatbot_controller.py

echo ""
echo "=== Check app.py for route registration ==="
head -80 src/app.py

Repository: SasanLabs/LLMForge

Length of output: 5587


🏁 Script executed:

# Search for the register_controllers implementation to understand generated routes
echo "=== Check framework register_controllers ==="
find src/ -name '*.py' -exec grep -l 'def register_controllers' {} \;

echo ""
echo "=== Check framework/__init__.py or similar ==="
find src/framework -name '*.py' | head -10

echo ""
echo "=== Search for vulnerability endpoint patterns ==="
rg -n 'vulnerabilities' src/ --type=py | grep -E '(route|path|prefix)' | head -20

Repository: SasanLabs/LLMForge

Length of output: 562


🏁 Script executed:

# Check registry.py for controller slug generation
echo "=== registry.py content (focusing on route generation) ==="
cat src/framework/registry.py | head -200

echo ""
echo "=== Check what endpoints are actually created ==="
rg -n 'controller_slug' src/framework/registry.py -A2 -B2

Repository: SasanLabs/LLMForge

Length of output: 8505


Fix the Quick Start examples—both endpoints are incorrect.

The endpoints shown in the Quick Start section will not work:

  • The /health endpoint is missing the /api/v1 prefix. The correct endpoint is /api/v1/health.
  • The /generate endpoint does not exist in the codebase.

If the intent is to demonstrate the API, consider showing an actual vulnerability level endpoint instead, such as:

curl -X POST http://localhost:8000/api/v1/vulnerabilities/prompt-injection/level1 \
  -H "Content-Type: application/json" \
  -d '{"user_input": "Say hello"}'

Alternatively, remove the /generate example if it was a placeholder and ensure the /health example uses the correct full path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` around lines 88 - 98, Update the Quick Start examples in README.md
so the health check uses the real route /api/v1/health (replace the existing
/health example) and remove or replace the non-existent /generate example with a
real API example such as the vulnerabilities prompt-injection endpoint (e.g.
POST to /api/v1/vulnerabilities/prompt-injection/level1 with a JSON body like
{"user_input":"Say hello"})—or simply remove the placeholder /generate example
if you prefer not to document that call.

Comment thread README.md
Comment on lines +122 to +126
| Environment Variable | Purpose | Default |
|-----------------------|--------------------------------------------------|-------------------|
| `MODEL_PROFILE` | LLM model pulled and used by the gateway | `phi3:mini` |
| `OLLAMA_MODEL` | Model name forwarded on inference requests | `phi3:mini` |
| `OLLAMA_EMBED_MODEL` | Embedding model used for vector search | `nomic-embed-text`|
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Search for environment variable references

echo "=== Searching for MODEL_PROFILE ==="
rg -n 'MODEL_PROFILE' --glob '!README.md'

echo ""
echo "=== Searching for OLLAMA_MODEL ==="
rg -n 'OLLAMA_MODEL' --glob '!README.md'

echo ""
echo "=== Searching for OLLAMA_EMBED_MODEL ==="
rg -n 'OLLAMA_EMBED_MODEL' --glob '!README.md'

Repository: SasanLabs/LLMForge

Length of output: 2932


Remove MODEL_PROFILE from the environment variables table; it is not used in the codebase.

The table documents MODEL_PROFILE as an environment variable, but it is not referenced anywhere in the implementation, services, or Docker configuration. Only OLLAMA_MODEL and OLLAMA_EMBED_MODEL are actively used.

Verification results
Variable Status Found Locations
OLLAMA_MODEL ✓ Implemented src/ollama_client.py, multiple services, docker-compose.yml
OLLAMA_EMBED_MODEL ✓ Implemented src/ollama_client.py, src/vector_search.py, docker-compose.yml
MODEL_PROFILE ✗ Not found No matches in codebase
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` around lines 122 - 126, The README lists an unused environment
variable MODEL_PROFILE; remove the `MODEL_PROFILE` row from the environment
variables table so only `OLLAMA_MODEL` and `OLLAMA_EMBED_MODEL` remain; locate
the table in README.md (identify the row containing the literal `MODEL_PROFILE`)
and delete that row and its separator so the table stays properly formatted.

Comment on lines +27 to +40
async def _handle_level(self, level: int, request: Request) -> dict:
data = await request.json()
action = str(data.get("action", "generate")).strip().lower()

if action == "validate":
session_id = str(data.get("challenge_session_id", "")).strip()
selected_doc_ids = data.get("selected_doc_ids", [])
return validate_rag_challenge(level, session_id, selected_doc_ids)

user_input = data.get("user_input", "")
model = data.get("model")

try:
return await evaluate_rag_level(level, user_input, model=model)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate JSON body and payload types before dispatching actions.

Malformed JSON and wrong field types currently escape as 500s (e.g., non-string user_input, non-list selected_doc_ids).

Suggested fix
     async def _handle_level(self, level: int, request: Request) -> dict:
-        data = await request.json()
+        try:
+            data = await request.json()
+        except ValueError:
+            return {"error": "Invalid JSON body"}
+
+        if not isinstance(data, dict):
+            return {"error": "Request body must be a JSON object"}
+
         action = str(data.get("action", "generate")).strip().lower()
 
         if action == "validate":
             session_id = str(data.get("challenge_session_id", "")).strip()
             selected_doc_ids = data.get("selected_doc_ids", [])
+            if not isinstance(selected_doc_ids, list) or not all(isinstance(x, str) for x in selected_doc_ids):
+                return {"error": "selected_doc_ids must be an array of strings"}
             return validate_rag_challenge(level, session_id, selected_doc_ids)
 
         user_input = data.get("user_input", "")
         model = data.get("model")
+        if not isinstance(user_input, str):
+            return {"error": "user_input must be a string"}
+        if model is not None and not isinstance(model, str):
+            return {"error": "model must be a string"}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async def _handle_level(self, level: int, request: Request) -> dict:
data = await request.json()
action = str(data.get("action", "generate")).strip().lower()
if action == "validate":
session_id = str(data.get("challenge_session_id", "")).strip()
selected_doc_ids = data.get("selected_doc_ids", [])
return validate_rag_challenge(level, session_id, selected_doc_ids)
user_input = data.get("user_input", "")
model = data.get("model")
try:
return await evaluate_rag_level(level, user_input, model=model)
async def _handle_level(self, level: int, request: Request) -> dict:
try:
data = await request.json()
except ValueError:
return {"error": "Invalid JSON body"}
if not isinstance(data, dict):
return {"error": "Request body must be a JSON object"}
action = str(data.get("action", "generate")).strip().lower()
if action == "validate":
session_id = str(data.get("challenge_session_id", "")).strip()
selected_doc_ids = data.get("selected_doc_ids", [])
if not isinstance(selected_doc_ids, list) or not all(isinstance(x, str) for x in selected_doc_ids):
return {"error": "selected_doc_ids must be an array of strings"}
return validate_rag_challenge(level, session_id, selected_doc_ids)
user_input = data.get("user_input", "")
model = data.get("model")
if not isinstance(user_input, str):
return {"error": "user_input must be a string"}
if model is not None and not isinstance(model, str):
return {"error": "model must be a string"}
try:
return await evaluate_rag_level(level, user_input, model=model)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/controllers/rag_poisoning_controller.py` around lines 27 - 40, In
_handle_level, validate the parsed JSON and each expected field before
dispatching: wrap await request.json() in a try/except to catch malformed JSON
and return a 400 error; ensure "action" is a string, "challenge_session_id" is a
string, "selected_doc_ids" is a list (and optionally contains strings/ints),
"user_input" is a string, and "model" is of expected type—if any type checks
fail return a 400 with a clear message rather than proceeding; update the
validate branch (validate_rag_challenge) and the evaluate branch
(evaluate_rag_level) call sites in _handle_level to only pass sanitized/typed
values.

Comment on lines +125 to +134
def _parse_plan(planning_response: str) -> dict:
"""
Parse the planner's JSON response.

def _extract_patient_id_from_planning(planning_response: str) -> str | None:
"""Extract patient ID from the planner's JSON response (only relevant for Level 1 & 2)."""
plan = _extract_json_object(planning_response)
if plan is not None:
patient = plan.get("patient")
print("Extracted patient from planner:", patient)
if patient in ["patient_001", "patient_002", "patient_003"]:
return patient
return None
With json_mode=True the model is constrained to emit valid JSON, so
json.loads() is sufficient. A fallback strip of markdown fences is kept
as a safety net for models that wrap output despite the format flag.
"""
text = planning_response.strip().removeprefix("```json").removeprefix("```").removesuffix("```").strip()
return json.loads(text)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate the parsed planner payload before using it.

json.loads() succeeding here does not mean the planner returned the object shape the rest of the pipeline expects. A valid JSON array/string/null will make _extract_action() and _extract_patient_id() blow up and turn the whole request into a failed lab run.

💡 Suggested hardening
 def _parse_plan(planning_response: str) -> dict:
@@
     """
     text = planning_response.strip().removeprefix("```json").removeprefix("```").removesuffix("```").strip()
-    return json.loads(text)
+    plan = json.loads(text)
+    if not isinstance(plan, dict):
+        raise ValueError("planner response must be a JSON object")
+    if "action" not in plan:
+        raise ValueError("planner response must include an action")
+    return plan
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/service/vulnerabilities/bola_chatbot_lab.py` around lines 125 - 134, The
_parse_plan function currently returns whatever json.loads produces which can be
non-object types and later break _extract_action() and _extract_patient_id();
change _parse_plan to load the JSON into a variable, verify it's a dict and
contains required keys (at minimum "action", and "patient_id" if your pipeline
expects it), and raise a clear ValueError if validation fails so callers get an
explicit error instead of downstream crashes.

Comment on lines +197 to +200
print("L1 planner response:", planning_response)

plan = _parse_plan(planning_response)
print("L1 parsed plan:", plan)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove or redact these planner debug logs.

These prints can persist patient IDs and request-derived medical details in application logs. In this flow, that's avoidable sensitive-data leakage.

Also applies to: 268-271, 339-342

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/service/vulnerabilities/bola_chatbot_lab.py` around lines 197 - 200,
Remove the raw print debug statements that log sensitive request-derived content
(e.g., the two prints around planning_response and plan) and any similar prints
at the other noted locations (around lines 268-271 and 339-342); instead, either
delete them or replace them with non-sensitive logging that uses a
sanitizer/redactor before logging or logs only an invariant summary (e.g., plan
status or step count) and is gated behind a debug flag. Locate the prints near
planning_response and _parse_plan and update logging to call a redaction helper
or emit minimal metadata via the module logger (do not log patient IDs, request
text, or full planner responses).

Comment on lines 304 to +310
except Exception as e:
print(f"L2 error: {e}")
return {
"level": 2,
"success": False,
"error": str(e),
"response": ""
"response": "",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return a sanitized error message instead of str(e).

Echoing raw exception text back to the client can leak model, transport, or parsing details. Keep the response generic and reserve the detailed error for server-side logging.

💡 Suggested fix
         return {
             "level": 2,
             "success": False,
-            "error": str(e),
+            "error": "can you please try again?",
             "response": "",
         }
@@
         return {
             "level": 3,
             "success": False,
-            "error": str(e),
+            "error": "can you please try again?",
             "response": "",
         }

Also applies to: 375-381

🧰 Tools
🪛 Ruff (0.15.12)

[warning] 304-304: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/service/vulnerabilities/bola_chatbot_lab.py` around lines 304 - 310, The
except blocks in src/service/vulnerabilities/bola_chatbot_lab.py currently
return raw exception text (str(e)) to the client; change both exception handlers
that build the response dict (the one around the "L2 error" print and the later
similar block) to log the full exception server-side using logger.exception(...)
or print with traceback, and return a generic sanitized error string (e.g. "An
internal error occurred") in the "error" field and keep "response" empty; do
this for the handler that prints "L2 error: {e}" and the other handler (the one
around lines 375-381) so no raw exception details are echoed to clients.

Comment on lines +171 to +174
except Exception as exc:
generated_code = ""
assistant_output = f"LLM generation failed: {exc}"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not return raw exception text to the client.

Line 173 surfaces str(exc) in assistant_output. That can leak internal model/service details. Return a generic message and log the exception server-side.

Patch sketch
+import logging
+
+logger = logging.getLogger(__name__)
...
     except Exception as exc:
         generated_code = ""
-        assistant_output = f"LLM generation failed: {exc}"
+        logger.exception("RAG code generation failed")
+        assistant_output = "LLM generation failed. Please try again."

Also applies to: 189-190

🧰 Tools
🪛 Ruff (0.15.12)

[warning] 171-171: Do not catch blind exception: Exception

(BLE001)

Comment on lines +249 to +257
.doc-html {
font-size: 12px;
color: #333;
}

.doc-html {
font-size: 13px;
color: #24292e;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Remove duplicate .doc-html selector definition.

The .doc-html selector is defined twice in immediate succession. The second definition (lines 254-257) completely overrides the first (lines 249-252), making the first definition dead code and confusing for maintainers.

🧹 Proposed fix
-.doc-html {
-  font-size: 12px;
-  color: `#333`;
-}
-
 .doc-html {
   font-size: 13px;
   color: `#24292e`;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
.doc-html {
font-size: 12px;
color: #333;
}
.doc-html {
font-size: 13px;
color: #24292e;
}
.doc-html {
font-size: 13px;
color: `#24292e`;
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/static/facade/rag_poisoning_template.css` around lines 249 - 257, Remove
the duplicate .doc-html CSS block so there is a single authoritative definition:
locate the two consecutive .doc-html selectors and delete the redundant one
(keep the desired font-size/color values from the later block or merge if values
differ) to eliminate dead code and avoid confusion.

Comment on lines +229 to +252
async function validateAnswerWithBackend(selectedDocIds) {
const level = levelFromGlobalState();
const endpoint = apiPrefix + "/api/v1/vulnerabilities/rag-context-poisoning/level" + level;

const res = await fetch(endpoint, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
action: "validate",
challenge_session_id: challengeSessionId,
selected_doc_ids: selectedDocIds,
}),
});

const data = await parseResponseBody(res);

if (data.correct) {
challengeFeedback.className = "challenge-feedback correct";
challengeFeedback.textContent = data.feedback;
} else {
challengeFeedback.className = "challenge-feedback incorrect";
challengeFeedback.textContent = data.feedback;
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Missing error handling for network failures and non-OK responses.

If the fetch throws (network error) or the server returns a non-2xx status, no error feedback is shown to the user. Additionally, data.feedback may be undefined on error responses, resulting in empty or "undefined" feedback text.

🛠️ Proposed fix
   async function validateAnswerWithBackend(selectedDocIds) {
     const level = levelFromGlobalState();
     const endpoint = apiPrefix + "/api/v1/vulnerabilities/rag-context-poisoning/level" + level;

-    const res = await fetch(endpoint, {
-      method: "POST",
-      headers: { "Content-Type": "application/json" },
-      body: JSON.stringify({
-        action: "validate",
-        challenge_session_id: challengeSessionId,
-        selected_doc_ids: selectedDocIds,
-      }),
-    });
-
-    const data = await parseResponseBody(res);
-    
-    if (data.correct) {
-      challengeFeedback.className = "challenge-feedback correct";
-      challengeFeedback.textContent = data.feedback;
-    } else {
-      challengeFeedback.className = "challenge-feedback incorrect";
-      challengeFeedback.textContent = data.feedback;
+    try {
+      const res = await fetch(endpoint, {
+        method: "POST",
+        headers: { "Content-Type": "application/json" },
+        body: JSON.stringify({
+          action: "validate",
+          challenge_session_id: challengeSessionId,
+          selected_doc_ids: selectedDocIds,
+        }),
+      });
+
+      const data = await parseResponseBody(res);
+
+      if (!res.ok) {
+        challengeFeedback.className = "challenge-feedback incorrect";
+        challengeFeedback.textContent = responseMessage(data, "Validation request failed.");
+        return;
+      }
+
+      if (data.correct) {
+        challengeFeedback.className = "challenge-feedback correct";
+        challengeFeedback.textContent = data.feedback || "Correct!";
+      } else {
+        challengeFeedback.className = "challenge-feedback incorrect";
+        challengeFeedback.textContent = data.feedback || "Incorrect. Try again.";
+      }
+    } catch (err) {
+      challengeFeedback.className = "challenge-feedback incorrect";
+      challengeFeedback.textContent = "Network error: " + String(err.message || err);
     }
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/static/facade/rag_poisoning_template.js` around lines 229 - 252, The
validateAnswerWithBackend function lacks network/error handling: wrap the fetch
+ parseResponseBody call in try/catch, check the response.ok before using data,
and set challengeFeedback.className/textContent to a clear error state when
fetch throws or res.ok is false (use a fallback message if data.feedback is
missing); reference levelFromGlobalState(), apiPrefix, challengeSessionId,
parseResponseBody(), and challengeFeedback to locate where to add the try/catch,
an if (!res.ok) branch that reads/uses parseResponseBody() safely, and a catch
block that logs the error and updates challengeFeedback with an appropriate
error message.

Karan Preet Singh Sasan added 2 commits May 16, 2026 14:20
@preetkaran20 preetkaran20 merged commit 5364273 into main May 16, 2026
1 check passed
@github-actions
Copy link
Copy Markdown

🎉 Thanks for contributing @preetkaran20!
We’d love to stay in touch and grow SasanLabs 🚀
👉 Please fill this (takes 30 sec):
https://docs.google.com/forms/d/e/1FAIpQLSfwWVdnULUhtfruA-DN328NwKnBGebaWg9U5y0xivLLxxoMog/viewform?usp=pp_url&entry.1414771947=preetkaran20
Also consider ⭐ starring the repo if you like it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant