
Introduce output validation for LLM-generated responses to ensure data consistency #657

Open
piyush06singhal wants to merge 2 commits into AOSSIE-Org:main from piyush06singhal:feature/output-validation-llm

Conversation

@piyush06singhal

@piyush06singhal piyush06singhal commented Apr 2, 2026

Addressed Issues:

Fixes #656


Summary

This PR introduces a lightweight output validation layer for LLM-generated question responses to prevent malformed or incomplete data from being returned by the API.

Currently, LLM-based endpoints return parsed outputs without validating their structure or content. This can lead to silent failures where responses contain missing fields, empty values, or inconsistent formats.


What was implemented

  • Added a new validation module: backend/Generator/output_validator.py
  • Implemented validation functions:
    • validate_mcq
    • validate_shortq
    • validate_boolq
  • Integrated validation into llm_generator.py:
    • Applied validation after parsing LLM responses
    • Filtered out invalid questions before returning results
  • Maintained existing API response structure (no breaking changes)
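
The validator source is not shown in this conversation, so the following is only a minimal sketch of what list-filtering validators of this kind might look like. The field names `question`, `options`, and `correct_answer` come from the docstring quoted in the review below; the short-answer field name `answer` and the helper `_non_empty_str` are assumptions made for illustration.

```python
from typing import Any, Dict, List

Question = Dict[str, Any]


def _non_empty_str(value: Any) -> bool:
    """Hypothetical helper: True only for strings with non-whitespace text."""
    return isinstance(value, str) and bool(value.strip())


def validate_mcq(questions: List[Question]) -> List[Question]:
    """Keep MCQ items that have a question, at least two options, and an answer."""
    valid = []
    for q in questions:
        if not isinstance(q, dict):
            continue
        if not _non_empty_str(q.get("question")):
            continue
        options = q.get("options")
        if not isinstance(options, list) or len(options) < 2:
            continue
        if not _non_empty_str(q.get("correct_answer")):
            continue
        valid.append(q)
    return valid


def validate_shortq(questions: List[Question]) -> List[Question]:
    """Keep short-answer items with non-empty question and answer text.

    The "answer" key is an assumption; the real module may use another name.
    """
    return [
        q
        for q in questions
        if isinstance(q, dict)
        and _non_empty_str(q.get("question"))
        and _non_empty_str(q.get("answer"))
    ]
```

Invalid items are skipped rather than raising, which matches the PR's stated goal of filtering out bad questions while leaving the response structure unchanged.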

How this improves the system

  • Prevents malformed or incomplete questions from reaching the frontend
  • Replaces silent failures with explicit filtering: questions with missing fields, empty values, or inconsistent formats are dropped
  • Keeps API responses structurally consistent
  • Limits the change to a small, backward-compatible validation step

Scope

In Scope

  • Validation of LLM-generated outputs only
  • Integration after parsing and before returning response

Out of Scope

  • Retry logic
  • Architectural changes
  • Frontend updates
  • Advanced schema systems

Screenshots/Recordings:

Not applicable (backend validation improvement)


Additional Notes:

Validation is implemented at the generator level to ensure correctness before responses are returned. The implementation is intentionally minimal to align with the existing system design and maintain backward compatibility.


AI Usage Disclosure:

We encourage contributors to use AI tools responsibly when creating Pull Requests. While AI can be a valuable aid, it is essential to ensure that your contributions meet the task requirements, build successfully, include relevant tests, and pass all linters. Submissions that do not meet these standards may be closed without warning to maintain the quality and integrity of the project. Please take the time to understand the changes you are proposing and their impact. AI slop is strongly discouraged and may lead to banning and blocking. Do not spam our repos with AI slop.

Check one of the checkboxes below:

  • This PR does not contain AI-generated code at all.
  • This PR contains AI-generated code. I have read the AI Usage Policy and this PR complies with this policy. I have tested the code locally and I am responsible for it.

I have used the following AI models and tools: ChatGPT (for guidance and review)


Checklist

  • My PR addresses a single issue, fixes a single bug or makes a single improvement.
  • My code follows the project's code style and conventions
  • If applicable, I have made corresponding changes or additions to the documentation
  • If applicable, I have made corresponding changes or additions to tests
  • My changes generate no new warnings or errors
  • I have joined the Discord server and I will share a link to this PR with the project maintainers there
  • I have read the Contribution Guidelines
  • Once I submit my PR, CodeRabbit AI will automatically review it and I will address CodeRabbit's comments.
  • I have filled this PR template completely and carefully, and I understand that my PR may be closed without review otherwise.

Summary by CodeRabbit

  • Bug Fixes
    • AI-generated questions are now validated before being shown: malformed or incomplete items are dropped.
    • Multiple-choice items must have at least two non-empty options and a non-empty correct answer.
    • Short-answer items must include non-empty question and answer text.
    • True/false items now require a bona fide boolean answer (true/false), preventing string/integer values.
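
The boolean rule is worth a concrete illustration, because in Python `bool` is a subclass of `int`: a check written as `isinstance(answer, int)` would accept `True`, but it would also accept `1`. The sketch below is an assumption about how such a validator could be written, not the PR's actual code; the `answer` key in particular is a guess at the field name.

```python
from typing import Any, Dict, List

Question = Dict[str, Any]


def validate_boolq(questions: List[Question]) -> List[Question]:
    """Keep true/false items whose answer is a real bool, not "true" or 1."""
    valid = []
    for q in questions:
        if not isinstance(q, dict):
            continue
        question = q.get("question")
        if not isinstance(question, str) or not question.strip():
            continue
        # isinstance(1, bool) is False and isinstance("true", bool) is False,
        # so only genuine True/False values pass this check.
        if not isinstance(q.get("answer"), bool):
            continue
        valid.append(q)
    return valid


candidates = [
    {"question": "Is water wet?", "answer": True},
    {"question": "Is water wet?", "answer": "true"},
    {"question": "Is water wet?", "answer": 1},
]
kept = validate_boolq(candidates)
```

Here `kept` contains only the first candidate; the string and integer answers are dropped.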

@coderabbitai

coderabbitai Bot commented Apr 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e7e7eed3-c11c-4c7b-806a-f61710e3ccd0

📥 Commits

Reviewing files that changed from the base of the PR and between 90721b9 and baf8d71.

📒 Files selected for processing (1)
  • backend/Generator/output_validator.py
✅ Files skipped from review due to trivial changes (1)
  • backend/Generator/output_validator.py

📝 Walkthrough

Walkthrough

A lightweight output validation layer was added: a new output_validator.py module implements type-specific validators, and llm_generator.py now validates parsed LLM outputs for MCQ, short-answer, and boolean questions before returning them.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Output Validation Module: backend/Generator/output_validator.py | New module with validate_mcq, validate_shortq, and validate_boolq. Each function filters a list of question objects, enforcing non-empty question strings, appropriate answer types, and MCQ options requirements. |
| Generator Integration: backend/Generator/llm_generator.py | Imports the three validators and updates generate_short_questions, generate_mcq_questions, and generate_boolean_questions to run parsed outputs through validation before returning validated lists. |
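
The integration described above (parse first, then validate, then return) can be sketched roughly as follows. This is an illustration under stated assumptions: `parse_questions` is a hypothetical JSON parser, the generator takes the raw LLM text as an argument instead of calling a model, and the `answer` key and the simplified `validate_shortq` are stand-ins for whatever `llm_generator.py` and `output_validator.py` actually contain.

```python
import json
from typing import Any, Dict, List

Question = Dict[str, Any]


def parse_questions(raw_text: str) -> List[Question]:
    """Hypothetical parser: expects a JSON array of objects, else returns []."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    return [item for item in data if isinstance(item, dict)]


def validate_shortq(questions: List[Question]) -> List[Question]:
    """Simplified stand-in validator: non-empty question and answer strings."""
    def ok(value: Any) -> bool:
        return isinstance(value, str) and bool(value.strip())

    return [q for q in questions if ok(q.get("question")) and ok(q.get("answer"))]


def generate_short_questions(raw_text: str) -> List[Question]:
    """Stand-in generator: validation runs after parsing, before returning."""
    parsed = parse_questions(raw_text)
    return validate_shortq(parsed)


raw = '[{"question": "What is X?", "answer": "Y"}, {"question": "", "answer": "Z"}]'
result = generate_short_questions(raw)
```

Because the validator returns a plain list of the same question objects, callers that already consume a list see no change in shape, only fewer items when some are invalid.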

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant LLMGenerator as LLM Generator
    participant Validator as Output Validator
    participant API as API Response

    Client->>LLMGenerator: request (get_*_llm)
    LLMGenerator->>LLM: call LLM, receive raw text
    LLMGenerator->>LLMGenerator: parse raw text into question objects
    LLMGenerator->>Validator: validate parsed questions
    Validator-->>LLMGenerator: validated question list
    LLMGenerator->>API: return validated questions
    API-->>Client: HTTP 200 with validated payload

Estimated Code Review Effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 I hopped through code, both day and night,
Filtering questions to make them right.
MCQs, booleans, short and neat—
Only tidy answers make my treat.
Hop on, validators, take your flight!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and specifically describes the main change: introducing output validation for LLM-generated responses to ensure data consistency, which matches the core objective of the PR. |
| Linked Issues check | ✅ Passed | The PR fully satisfies all coding requirements from issue #656: validation functions validate MCQ/short/boolean questions with correct rules, integration occurs after parsing in llm_generator.py, and existing API behavior is preserved. |
| Out of Scope Changes check | ✅ Passed | All changes are directly in scope: new output_validator.py module, validation function integration into llm_generator.py, and filtering of invalid questions. No retry logic, architecture changes, or frontend modifications are present. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |



@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
backend/Generator/output_validator.py (2)

36-39: Consider validating individual option elements.

The validator checks that options is a non-empty list with at least 2 elements, but doesn't verify that each element is a valid string. This could allow malformed options like [None, 123] to pass through.

🛡️ Optional: Stricter options validation
         # Check options field
         options = q.get("options")
         if not options or not isinstance(options, list) or len(options) < 2:
             continue
+            
+        # Verify all options are non-empty strings
+        if not all(isinstance(opt, str) and opt.strip() for opt in options):
+            continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/Generator/output_validator.py` around lines 36 - 39, The options list
validation in output_validator.py (where options = q.get("options")) must also
ensure each element is a non-empty string: replace the current check with logic
that verifies options is a list, filters/validates elements with isinstance(opt,
str) and opt.strip() != "", and treat the question as invalid (continue) if
fewer than two valid string options remain or any element is malformed;
reference the variable names options and q.get("options") when making the
change.
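
To make the gap concrete, the sketch below compares the length-only check with the stricter per-element check from the suggestion above. The function name `has_valid_options` and its `strict` flag are hypothetical, introduced only for this comparison; they do not appear in the PR.

```python
from typing import Any


def has_valid_options(options: Any, strict: bool = True) -> bool:
    """Return True if options is a list of at least two entries.

    With strict=True, every entry must also be a non-empty string,
    mirroring the all(isinstance(opt, str) and opt.strip() ...) suggestion.
    """
    if not isinstance(options, list) or len(options) < 2:
        return False
    if strict:
        return all(isinstance(opt, str) and opt.strip() for opt in options)
    return True


malformed = [None, 123]
loose_result = has_valid_options(malformed, strict=False)   # length check only
strict_result = has_valid_options(malformed, strict=True)   # per-element check
```

With the length-only check `loose_result` is `True`, so the malformed list slips through; with the per-element check `strict_result` is `False` and the question would be dropped.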

17-20: Minor: Docstring doesn't reflect the actual minimum option count.

The code at line 38 requires len(options) >= 2, but the docstring only mentions "non-empty list". Consider updating the docstring to document the actual constraint.

📝 Suggested docstring update
     Validation Rules:
         - question field must exist and be non-empty string
-        - options field must exist and be non-empty list
+        - options field must exist and be a list with at least 2 elements
         - correct_answer field must exist and be non-empty string
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/Generator/output_validator.py` around lines 17 - 20, The docstring
block titled "Validation Rules" in output_validator.py currently says "options
field must exist and be non-empty list" but the validation enforces len(options)
>= 2 (see the check using len(options) >= 2); update that docstring to state
that options must be a list with at least two entries (e.g., "options field must
exist and be a list with minimum length 2") so the documented constraint matches
the implemented check.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e8bb1564-41d6-4f3d-849b-6260d6255715

📥 Commits

Reviewing files that changed from the base of the PR and between 2038116 and 90721b9.

📒 Files selected for processing (2)
  • backend/Generator/llm_generator.py
  • backend/Generator/output_validator.py



Development

Successfully merging this pull request may close these issues.

[ENHANCEMENT]: Add Output Validation for LLM-generated Question Responses
