Skip to content

Feat/clear input 668#670

Open
zohaib-7035 wants to merge 19 commits into
AOSSIE-Org:mainfrom
zohaib-7035:feat/clear-input-668
Open

Feat/clear input 668#670
zohaib-7035 wants to merge 19 commits into
AOSSIE-Org:mainfrom
zohaib-7035:feat/clear-input-668

Conversation

@zohaib-7035
Copy link
Copy Markdown

@zohaib-7035 zohaib-7035 commented Apr 20, 2026

Resolves #668

Description

This PR addresses the missing "Clear Input" issue on the input page for a better user experience. A "Clear Input" button has been added near the text box that instantly resets all form fields back to their default states.

Changes Made:

  • Added a handleClear function in Text_Input.jsx to clear the text state and reset difficulty, numQuestions, docUrl, and isToggleOn back to default.
  • Injected a "Clear Input" button in the top right corner of the textarea section for quick access.
  • The user is no longer required to manually select all of the text.

Testing:

  • Verified that all states default correctly: (Easy Difficulty, 10 questions, toggles disabled).
  • Standard build processes executed locally.

Summary by CodeRabbit

  • New Features

    • Clear Input button to reset form fields
    • Form generation now returns both responder and edit URLs on success
    • Centralized logging with consistent JSON error responses
  • Bug Fixes

    • Stricter input validation with structured 400/500 responses
    • Hardened file uploads (sanitization, allowed extensions, size limits, 413 handling)
    • Improved transcript retrieval (timeouts, scoped temp files, better errors)
    • Google Docs/content endpoint reports service-unavailable when unavailable
    • Removed unwanted browser side-effect during form generation
  • Tests

    • Large backend test suite with extensive unit/integration coverage and test configs
    • CI workflow added to run backend tests on Python 3.10–3.12

zohaib-7035 and others added 17 commits February 10, 2026 20:56
…ening to server.py

Problem:
- No input validation on any API endpoint (empty text, huge payloads, invalid types)
- Missing error handling causes 500 errors with stack traces leaked to clients
- /getTranscript endpoint vulnerable to command injection via unsanitized video_id
- /getTranscript has race condition when concurrent requests use glob on shared folder
- File uploads have no size limits or extension validation
- No structured logging (bare print statements)
- Hardcoded credential paths with no env variable fallback

Solution:
- Add _validate_input_text() helper: checks for None, empty, type, and max length (50k chars)
- Add _validate_max_questions() helper: clamps to safe range [1, 20]
- Add _allowed_file() helper: whitelist txt/pdf/docx extensions
- Validate video_id against strict regex pattern (alphanumeric, 11 chars) before subprocess
- Add 30s timeout to yt-dlp subprocess call
- Scope subtitle files to video_id to prevent race conditions between concurrent requests
- Add Flask MAX_CONTENT_LENGTH (10 MB) with 413 error handler
- Wrap all endpoints in try/except with user-friendly error messages (no stack traces)
- Replace print() with Python logging module throughout
- Use os.environ.get() for Google service account file path
- Add input validation to generate_gform (qa_pairs and question_type)
- Use request.get_json(silent=True) to handle malformed JSON gracefully

Co-authored-by: Cursor <cursoragent@cursor.com>
- Fix potential NameError in /getTranscript by wrapping clean_transcript
  in try/except and removing unused expected_vtt variable
- Fix input_questions (list) incorrectly passed as num_questions (int)
  in get_shortq_hard, get_mcq_hard, get_boolq_hard; fix type annotation
  in QuestionGenerator.generate() from bool to int
- Add try/except to generate_gform, remove webbrowser.open_new_tab()
  server-side anti-pattern, return structured JSON response
- Remove unused exception variable 'e' across all except blocks (Ruff F841)
- Tighten glob pattern in getTranscript to use dot separator
- Add strict=True to zip() in get_mcq_answer for defensive validation

Co-authored-by: Cursor <cursoragent@cursor.com>
- Fix critical bug: predict_boolean_answer receives a string instead of
  a list, causing character-level iteration; now wraps question in [question]
- Fix critical bug: answer_style="true_false" is not a valid option in
  QuestionGenerator; changed to "sentences" in get_boolq_hard
- Fix critical bug: make_question_harder in get_boolq_hard receives full
  dict instead of question string; aligned with other hard-question endpoints
- Fix major issue: num_questions silently ignored because use_evaluator
  defaults to False; added use_evaluator=True to all 3 hard-question endpoints
- Fix minor issue: subtitles directory creation moved from __main__ guard
  to module level so it works under Gunicorn/WSGI deployments
- Fix nitpick: num_questions type annotation updated to int | None per PEP 484

Co-authored-by: Cursor <cursoragent@cursor.com>
The length equality check above the loop already guarantees both lists
are the same size, making strict=True redundant. Removing it avoids a
runtime error on Python versions below 3.10.

Co-authored-by: Cursor <cursoragent@cursor.com>
Replace the minimal test_server.py (which required a live server and
used raw requests) with a proper pytest-based test suite that runs
against Flask's built-in test client — no running server or GPU needed.

172 tests covering:
- All 15 API endpoints with valid input
- Invalid input (empty text, missing fields, wrong types)
- Boundary-value validation for max_questions (clamping 1–20)
- Edge cases (unicode, emoji, special characters, very long text)
- File upload with accepted/rejected file types
- HTTP method enforcement (405 on wrong methods)
- Error handling (generator/pipeline exceptions → 500)

Also adds:
- backend/conftest.py: mocks all heavy ML dependencies (transformers,
  spacy, Generator, Google APIs) at the sys.modules level so the full
  suite runs in <1 second
- .github/workflows/tests.yml: CI workflow that runs tests on every
  push/PR to main across Python 3.9, 3.10, and 3.11

Co-authored-by: Cursor <cursoragent@cursor.com>
Replace `int | None` with `Optional[int]` in Generator/main.py to
ensure compatibility with Python 3.9 and satisfy CodeRabbit review.

Co-authored-by: Cursor <cursoragent@cursor.com>
…e check

Add one-line docstrings to all 16 route handlers and error handler in
server.py, and all ~80 test methods in test_server.py to satisfy the
80% docstring coverage threshold flagged by CodeRabbit pre-merge checks.

Co-authored-by: Cursor <cursoragent@cursor.com>
- Fix path traversal vulnerability in upload_file() by sanitizing
  filename with werkzeug.utils.secure_filename before processing
- Fix credentials loading bug in generate_gform() — add store.get()
  so cached credentials are reused instead of re-running OAuth flow
- Use logger.exception instead of logger.error in subprocess handlers
  for better diagnostics with tracebacks
- Add safety comment explaining VIDEO_ID_PATTERN guards subtitle_path
- Remove unused mock_mcq_gen fixture arg from test_max_questions_float
- Add ClassVar annotations to mutable class attributes (Ruff compliance)
- Add 6 new tests: transcript happy-path, timeout (504), subprocess
  error (500), no subtitles (404), get_content happy-path, and
  service-unavailable (503) — total now 178 tests

Co-authored-by: Cursor <cursoragent@cursor.com>
Merged origin/main into test/backend-comprehensive-tests, resolving
conflicts in backend/server.py and backend/test_server.py:

- server.py: kept PR's import organization, input validation, and
  consistent logger usage; restored LLMQuestionGenerator import from
  upstream
- test_server.py: kept comprehensive pytest-based test suite over
  old-style test functions
- CRITICAL: /get_boolq_hard now uses BoolQGen.generate_boolq() instead
  of qg.generate(answer_style='sentences') which was generating
  sentence-based questions instead of boolean questions
- Add input validation (_validate_input_text, _validate_max_questions)
  to all LLM endpoints (get_shortq_llm, get_mcq_llm, get_boolq_llm,
  get_problems_llm) for consistency with other endpoints
- Replace app.logger with configured logger in LLM endpoints
- Add strict=True to zip() in get_mcq_answer for defense-in-depth
- Fix en-dash to hyphen-minus in _validate_input_text docstring
…onGenerator parsing logic

- Add LLM generator mock infrastructure to conftest.py (defaults, fixture, reset logic)
- Add ~40 endpoint tests for /get_shortq_llm, /get_mcq_llm, /get_boolq_llm, /get_problems_llm
- Create test_llm_generator.py with ~35 unit tests for JSON parsing, fallback parsing, boolean coercion
- Add pytest.ini with filterwarnings to suppress third-party deprecation warnings
- Include LLM endpoints in TestInputValidation and TestHTTPMethods parametrized suites
- Fix pre-existing boolq_gen mock data structure (strings -> dicts)
- Fix pre-existing test_generator_exception_returns_500[/get_boolq_hard] assertion

All 316 tests pass with 0 warnings.
- Add reset_mock() calls in conftest.py to clear call history between tests
- Fix test_with_input_questions to assert correct mock for /get_boolq_hard

All 316 tests pass with 0 warnings.
- Set exact assertions for max_questions tests instead of loose thresholds
- Fix /get_content mock to correctly return string instead of dict
- Add missing tear_down keys for mocked third party instances
- Align mock _DEFAULTS mcq_gen and boolq_gen structure to exact application structure
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 20, 2026

📝 Walkthrough

Walkthrough

Adds backend test infrastructure and pytest fixtures, hardens and refactors Flask server endpoints (validation, logging, error handling, uploads, transcript handling), fixes a generator type annotation, adds extensive backend tests, a CI workflow, and a frontend "Clear Input" button.

Changes

Cohort / File(s) Summary
CI & Test Config
​.github/workflows/tests.yml, backend/pytest.ini, backend/requirements-test.txt
New GitHub Actions workflow for backend tests (matrix py3.10–3.12), pytest warning filters, and pinned test-only dependencies.
Pytest Fixtures & Mocks
backend/conftest.py
New centralized pytest conftest that injects/patches heavy deps into sys.modules, provides MagicMock defaults, autouse reset fixture, and per-mock fixtures for server tests.
Server Refactor & Hardening
backend/server.py
Centralized logging, global error handlers, input validation helpers, tightened endpoints (generation, answers, Google Docs/Forms), secure upload handling (filename sanitization, allowed extensions, size limits), transcript download/timeouts, and consistent JSON error responses.
Generator Type Fix
backend/Generator/main.py
Typing fix: num_questions annotated as Optional[int] and Optional imported.
LLM Parsing Tests
backend/test_llm_generator.py
New unit tests for LLM generator parsing/coercion and fallback parsers (no model invocation).
Server Tests
backend/test_server.py
Reworked to Flask test-client suite covering many endpoints, extensive edge-case and error-path tests with mocked dependencies and subprocess behavior.
Frontend UI
eduaid_web/src/pages/Text_Input.jsx
Added handleClear and a "Clear Input" button to reset form state and clear hidden file input.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Server as Server (Flask)
    participant FileProc as FileProcessor
    participant Generator as QuestionGenerator
    participant Subproc as yt-dlp / subprocess

    Client->>Server: POST /upload (file, metadata)
    alt request invalid or too large
        Server-->>Client: 4xx / 413 JSON error
    else
        Server->>FileProc: secure save & process_file(path)
        FileProc-->>Server: processing result
        alt needs transcript download
            Server->>Subproc: run yt-dlp (with timeout)
            Subproc-->>Server: vtt or error/timeout
            alt timeout/error
                Server-->>Client: 504 / 500 JSON error
            end
        end
        Server->>Generator: generate (questions/answers)
        Generator-->>Server: generated items
        Server-->>Client: 200 JSON { result, responder_url?, edit_url? }
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰
I hopped in with mocks and a careful fix,
Cleared forms with a button—no more messy mix.
Tests line up tidy, logs warm and bright,
Uploads locked down, transcripts polite.
A little rabbit cheers: code, tests—delight!

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning The PR includes substantial out-of-scope changes to backend infrastructure (testing configuration, workflows, server improvements) unrelated to the frontend Clear Input feature requested in issue #668. Separate backend testing and infrastructure changes into independent PRs to maintain focus on issue #668's frontend requirement of adding a Clear Input button.
Title check ❓ Inconclusive The title 'Feat/clear input 668' is partially related to the changeset. It references issue #668 and the feature (clear input), but lacks clarity and specificity about what 'Feat/clear input 668' entails for someone unfamiliar with the issue number. Consider revising the title to be more descriptive, such as 'Add Clear Input button to reset form fields' to improve readability and clarity in commit history.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed All coding requirements from issue #668 are fully met: the Clear Input button clears the text field, resets form fields (difficulty, numQuestions, docUrl, isToggleOn, question type) to defaults, and avoids navigation.
Docstring Coverage ✅ Passed Docstring coverage is 99.21% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/server.py (1)

466-478: ⚠️ Potential issue | 🟠 Major

Move Google Forms setup and payload construction inside error handling.

Credential loading, OAuth flow, service creation, and qapair["question"]/qapair["answer"] access all happen before the try at Line 615. Missing credentials or malformed qa_pairs can still escape as Flask HTML 500s instead of the JSON errors this route now aims to provide.

🛠️ Suggested structure
-    store = file.Storage("token.json")
-    creds = store.get()
-    if not creds or creds.invalid:
-        flow = client.flow_from_clientsecrets("credentials.json", SCOPES)
-        creds = tools.run_flow(flow, store)
-
-    form_service = discovery.build(
-        "forms",
-        "v1",
-        http=creds.authorize(Http()),
-        discoveryServiceUrl=DISCOVERY_DOC,
-        static_discovery=False,
-    )
+    try:
+        store = file.Storage("token.json")
+        creds = store.get()
+        if not creds or creds.invalid:
+            flow = client.flow_from_clientsecrets("credentials.json", SCOPES)
+            creds = tools.run_flow(flow, store)
+
+        form_service = discovery.build(
+            "forms",
+            "v1",
+            http=creds.authorize(Http()),
+            discoveryServiceUrl=DISCOVERY_DOC,
+            static_discovery=False,
+        )
+
+        # Build requests_list here too, after validating each qa_pair shape.
+        ...
-    try:
         result = form_service.forms().create(body=NEW_FORM).execute()

Also applies to: 486-615

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 466 - 478, The Google Forms credential
loading and service creation (store, creds, client.flow_from_clientsecrets,
tools.run_flow, discovery.build and form_service) plus any access to qa_pairs
fields (qapair["question"], qapair["answer"]) must be moved inside the existing
try/except that begins around the route handler so credential errors or
malformed qa_pairs are caught and turned into the JSON error responses; refactor
so the OAuth flow and form payload construction happen after entering the try
block and any exceptions from those steps are raised/caught and handled by the
route's JSON error handling logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/server.py`:
- Around line 636-647: The code uses len(input_questions) directly to set
num_questions for qg.generate in hard endpoints, which can exceed
MAX_QUESTIONS_LIMIT or mis-handle non-list inputs; change to compute a bounded
count via the existing _validate_max_questions helper: set num_questions =
_validate_max_questions(len(input_questions)) when isinstance(input_questions,
list) else use a safe default (e.g., 5), then pass num_questions into
qg.generate (replace direct len(input_questions)); apply the same pattern in the
other hard endpoints referenced (the blocks around qg.generate at the other
listed ranges).
- Around line 330-344: The payload validation is too permissive: input_questions
and input_options may be truthy non-lists (causing TypeError later). Add
explicit shape checks before the current empty/mismatch check in the handler
that calls _validate_input_text: ensure input_questions is a list of strings (or
expected item type) and input_options is a list of lists (or expected
option-item shape) and that len(input_questions) == len(input_options); if
validation fails return a 400 JSON response (use jsonify({"error": "...",
"output": []}) or similar) rather than letting the model call throw. Implement
or reuse small helper(s) like validate_questions_shape(input_questions) and
validate_options_shape(input_options) and invoke them immediately after
_validate_input_text and before the existing truthy/length check; apply the same
helpers to the other similar blocks that handle input_questions/input_options
(the other handler blocks referenced in the review).
- Around line 791-837: The current implementation uses a video_id-scoped path
(subtitle_path) which causes races when two requests for the same video run
concurrently; change to create a per-request temporary directory (e.g., via
tempfile.TemporaryDirectory or mkdtemp) and use that dir for the yt-dlp output
(update the subprocess.run "-o" to write into the temp dir), then glob only
inside that temp dir for VTT files, pass the found file(s) to clean_transcript,
and ensure the temp directory is removed in a finally block (or rely on
TemporaryDirectory context) so each request has isolated files and no
cross-request deletion occurs; keep existing exception handling for
subprocess.TimeoutExpired and subprocess.CalledProcessError and adapt log
messages to reference the temp dir as needed.

In `@backend/test_llm_generator.py`:
- Around line 21-30: The test currently only stubs llama_cpp if it's absent, but
since llama-cpp-python may be installed the real Llama.from_pretrained gets
invoked; unconditionally replace sys.modules["llama_cpp"] with a lightweight
stub before loading the real llm_generator module so the test never loads
models: always assign sys.modules["llama_cpp"] = type(sys)("llama_cpp") and set
its Llama = type("Llama", (), {"from_pretrained": staticmethod(lambda **kw:
None)}), then proceed to load the real module via
_llm_gen_path/_spec/_real_module as before so the generator fixture and
Llama.from_pretrained are deterministically stubbed.

In `@eduaid_web/src/pages/Text_Input.jsx`:
- Around line 181-186: The Clear Input button removed the default keyboard
outline, so restore an accessible visible focus state for keyboard users by
adding a focus/focus-visible indicator to the button styling (e.g., using
Tailwind classes like focus:outline-none should be replaced or complemented with
focus-visible:ring, focus:ring-2 and focus:ring-offset-2 or focus:ring-red-400)
so when the element with onClick={handleClear} receives keyboard focus it shows
a clear visual ring/outline; update the button element's className to include
those focus classes (or add a focus style in the component's CSS) to ensure
keyboard navigation is usable.
- Around line 26-32: handleClear currently resets visible fields but leaves the
persisted question type and the hidden file input uncleared; update handleClear
to also remove the persisted question type from localStorage (e.g.,
localStorage.removeItem('selectedQuestionType') or set it to a default like
"get_shortq") and reset any state via setSelectedQuestionType(...) if present,
and clear the hidden file input value (e.g., fileInputRef.current.value = '') so
selecting the same file again will fire onChange.

---

Outside diff comments:
In `@backend/server.py`:
- Around line 466-478: The Google Forms credential loading and service creation
(store, creds, client.flow_from_clientsecrets, tools.run_flow, discovery.build
and form_service) plus any access to qa_pairs fields (qapair["question"],
qapair["answer"]) must be moved inside the existing try/except that begins
around the route handler so credential errors or malformed qa_pairs are caught
and turned into the JSON error responses; refactor so the OAuth flow and form
payload construction happen after entering the try block and any exceptions from
those steps are raised/caught and handled by the route's JSON error handling
logic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d23a28cc-a0cc-414d-8be6-4bc14f75e161

📥 Commits

Reviewing files that changed from the base of the PR and between 2038116 and 113c30e.

📒 Files selected for processing (9)
  • .github/workflows/tests.yml
  • backend/Generator/main.py
  • backend/conftest.py
  • backend/pytest.ini
  • backend/requirements-test.txt
  • backend/server.py
  • backend/test_llm_generator.py
  • backend/test_server.py
  • eduaid_web/src/pages/Text_Input.jsx

Comment thread backend/server.py
Comment on lines +330 to +344
data = request.get_json(silent=True) or {}
input_text = data.get("input_text", "")
input_questions = data.get("input_question", [])
input_options = data.get("input_options", [])
outputs = []

if not input_questions or not input_options or len(input_questions) != len(input_options):
return jsonify({"output": outputs})
input_text, err = _validate_input_text(input_text)
if err:
return err

for question, options in zip(input_questions, input_options):
# Generate answer using the QA model
qa_response = qa_model(question=question, context=input_text)
generated_answer = qa_response["answer"]
if (
not input_questions
or not input_options
or len(input_questions) != len(input_options)
):
return jsonify({"output": []})
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate question and option payload shapes before model calls.

These endpoints accept any truthy input_question/input_options values. For example, a string input_options can reach Line 355 and raise a TypeError, returning 500 for a client-side payload error. Validate list types and item shapes up front.

🛡️ Suggested direction
+def _validate_string_list(value, field_name):
+    if value is None:
+        return [], None
+    if not isinstance(value, list) or any(
+        not isinstance(item, str) or not item.strip() for item in value
+    ):
+        return None, (
+            jsonify({"error": f"{field_name} must be a list of non-empty strings"}),
+            400,
+        )
+    return [item.strip() for item in value], None
+
+def _validate_options_list(value):
+    if value is None:
+        return [], None
+    if not isinstance(value, list) or any(
+        not isinstance(options, list)
+        or any(not isinstance(option, str) for option in options)
+        for options in value
+    ):
+        return None, (
+            jsonify({"error": "input_options must be a list of string lists"}),
+            400,
+        )
+    return value, None

Then call these helpers before the current empty/mismatch checks.

Also applies to: 377-386, 403-412

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 330 - 344, The payload validation is too
permissive: input_questions and input_options may be truthy non-lists (causing
TypeError later). Add explicit shape checks before the current empty/mismatch
check in the handler that calls _validate_input_text: ensure input_questions is
a list of strings (or expected item type) and input_options is a list of lists
(or expected option-item shape) and that len(input_questions) ==
len(input_options); if validation fails return a 400 JSON response (use
jsonify({"error": "...", "output": []}) or similar) rather than letting the
model call throw. Implement or reuse small helper(s) like
validate_questions_shape(input_questions) and
validate_options_shape(input_options) and invoke them immediately after
_validate_input_text and before the existing truthy/length check; apply the same
helpers to the other similar blocks that handle input_questions/input_options
(the other handler blocks referenced in the review).

Comment thread backend/server.py
Comment thread backend/server.py Outdated
Comment thread backend/test_llm_generator.py Outdated
Comment thread eduaid_web/src/pages/Text_Input.jsx
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/server.py (1)

507-519: ⚠️ Potential issue | 🟠 Major

Don’t run OAuth bootstrapping inside the request path.

If token.json is missing/invalid, tools.run_flow(...) can block the Flask worker with an interactive OAuth flow, and failures before Line 656 bypass the JSON error handling. Prefer initializing the Forms service at startup/config time and returning 503 when it is unavailable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 507 - 519, The current request-path code is
running interactive OAuth bootstrapping (file.Storage / store.get /
client.flow_from_clientsecrets / tools.run_flow) and building the Forms client
(discovery.build) inside a request, which can block or fail; move all
OAuth/token loading and discovery.build into application startup or a dedicated
init function (e.g., init_forms_service) and ensure tools.run_flow is never
called during request handling; instead initialize creds ahead of time (or fail
fast) and set form_service as a global/attached app resource, and update the
request handler(s) to check that form_service is available and return HTTP 503
if it's not initialized or credentials are invalid.
🧹 Nitpick comments (2)
backend/test_llm_generator.py (2)

193-197: Assertion too permissive for non-JSON path.

assert isinstance(result, list) will pass even if _parse_response silently returned a non-empty garbage list, defeating the purpose of the test. Given that _parse_response falls back to _fallback_parse for text with no Q:/A: markers, the deterministic expected value here is []. Consider assert result == [] to lock in the contract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/test_llm_generator.py` around lines 193 - 197, The test
test_non_json_input is too permissive: change the assertion so it verifies
generator._parse_response(raw, 5) returns the deterministic fallback value
instead of any list; specifically assert result == [] to confirm the method
falls back to _fallback_parse for non-JSON, non-Q:/A: input and returns an empty
list.

409-425: Tighten weak assertions in fallback-bool tests.

assert len(result) >= 1 (lines 413, 419) and assert len(result) <= 1 (line 425) won't catch regressions where the parser returns extra spurious items or returns 0 when it should return 1. Since _fallback_bool_parse is deterministic for these fixed inputs, the exact expected count (and ideally the parsed question/answer values) is knowable—prefer == 1 and assert on the content, mirroring the style used in TestFallbackParse.test_qa_pattern.

Example tightening
-    def test_qa_bool_pattern(self, generator):
-        """Question with true/false answer should be parsed."""
-        text = "1. Is AI intelligent?\nAnswer: true"
-        result = generator._fallback_bool_parse(text, 5)
-        assert len(result) >= 1
+    def test_qa_bool_pattern(self, generator):
+        """Question with true/false answer should be parsed."""
+        text = "1. Is AI intelligent?\nAnswer: true"
+        result = generator._fallback_bool_parse(text, 5)
+        assert len(result) == 1
+        assert result[0]["answer"] is True
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/test_llm_generator.py` around lines 409 - 425, The tests for
_fallback_bool_parse use weak length assertions; change them to assert the exact
expected count and validate parsed content: in test_qa_bool_pattern,
test_inline_answer, and test_max_questions_limit replace len(result) >= 1 / <= 1
with len(result) == 1 (or == expected count for max limit) and add assertions on
result[0]['question'] and result[0]['answer'] (e.g. question text matches "Is AI
intelligent?" or "Q1?" and answer is True/False as expected) to mirror
TestFallbackParse.test_qa_pattern's strict checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/server.py`:
- Around line 368-372: The inline one-line if-return statements cause Ruff E701;
update the checks after calling _validate_string_list and _validate_options_list
so each if is a multi-line block: assign input_questions, err_q =
_validate_string_list(...) then use if err_q: on its own line followed by a new
indented return err_q line; do the same for input_options, err_o and apply
identical changes to the corresponding err_q/err_o checks in the
get_shortq_answer and get_boolean_answer call sites so no single-line if
statements remain.
- Around line 499-503: Currently only qa_pairs and question_type types are
checked; validate each item in qa_pairs by iterating and ensuring each qapair is
a dict and contains a non-empty string "question", and when the downstream logic
requires it also contains a string "answer" (check based on question_type or the
code paths around qapair["answer"]). On failure return a 400 with a clear
message. Update the validation near qa_pairs/question_type checks and reuse it
before any access to qapair["question"] or qapair["answer"] (references:
qa_pairs, qapair["question"], qapair["answer"], question_type) so malformed
payloads produce structured 400 errors instead of 500s.
- Around line 792-796: The current flow sanitizes uploaded_file.filename with
secure_filename but still uses that filename directly, causing race/overwrite
when two users upload the same name; generate a per-upload unique sanitized
filename (e.g., uuid4 or timestamp-based token) while preserving the original
extension and assign it to uploaded_file.filename before calling
file_processor.process_file so each upload writes to a unique path; optionally
store the original filename in upload metadata if you need to show it to users.

In `@eduaid_web/src/pages/Text_Input.jsx`:
- Around line 26-36: handleClear currently calls
localStorage.removeItem("selectedQuestionType") which leaves
selectedQuestionType null and later causes getEndpoint() to return "/null";
instead, restore the app's default question type when clearing. Replace the
removeItem call with setting the default value into localStorage (the same
default used on component mount) and also call the state setter (e.g.,
setSelectedQuestionType(<DEFAULT_QUESTION_TYPE>)) inside handleClear so both
persisted and in-memory state are valid; update the value stored under
"selectedQuestionType" rather than removing it.

---

Outside diff comments:
In `@backend/server.py`:
- Around line 507-519: The current request-path code is running interactive
OAuth bootstrapping (file.Storage / store.get / client.flow_from_clientsecrets /
tools.run_flow) and building the Forms client (discovery.build) inside a
request, which can block or fail; move all OAuth/token loading and
discovery.build into application startup or a dedicated init function (e.g.,
init_forms_service) and ensure tools.run_flow is never called during request
handling; instead initialize creds ahead of time (or fail fast) and set
form_service as a global/attached app resource, and update the request
handler(s) to check that form_service is available and return HTTP 503 if it's
not initialized or credentials are invalid.

---

Nitpick comments:
In `@backend/test_llm_generator.py`:
- Around line 193-197: The test test_non_json_input is too permissive: change
the assertion so it verifies generator._parse_response(raw, 5) returns the
deterministic fallback value instead of any list; specifically assert result ==
[] to confirm the method falls back to _fallback_parse for non-JSON, non-Q:/A:
input and returns an empty list.
- Around line 409-425: The tests for _fallback_bool_parse use weak length
assertions; change them to assert the exact expected count and validate parsed
content: in test_qa_bool_pattern, test_inline_answer, and
test_max_questions_limit replace len(result) >= 1 / <= 1 with len(result) == 1
(or == expected count for max limit) and add assertions on result[0]['question']
and result[0]['answer'] (e.g. question text matches "Is AI intelligent?" or
"Q1?" and answer is True/False as expected) to mirror
TestFallbackParse.test_qa_pattern's strict checks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4de76f2e-722d-436c-8d3d-9450a6fb2bab

📥 Commits

Reviewing files that changed from the base of the PR and between 113c30e and b53dc28.

📒 Files selected for processing (3)
  • backend/server.py
  • backend/test_llm_generator.py
  • eduaid_web/src/pages/Text_Input.jsx

Comment thread backend/server.py Outdated
Comment thread backend/server.py
Comment thread backend/server.py Outdated
Comment on lines +26 to +36
const handleClear = () => {
setText("");
setDifficulty("Easy Difficulty");
setNumQuestions(10);
setDocUrl("");
setIsToggleOn(0);
localStorage.removeItem("selectedQuestionType");
if (fileInputRef.current) {
fileInputRef.current.value = "";
}
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reset to a valid default question type instead of removing it.

Line 32 leaves the user on this page with no selectedQuestionType. If they enter new text and click Next, line 86 reads null, getEndpoint() returns null, and the request goes to /null. Since the reset should keep users on the page, write the app’s default question type instead of removing it.

🐛 Proposed fix
+const DEFAULT_QUESTION_TYPE = "get_shortq"; // keep in sync with the question-type default
+
 const Text_Input = () => {
   const navigate = useNavigate();
   const handleClear = () => {
     setText("");
     setDifficulty("Easy Difficulty");
     setNumQuestions(10);
     setDocUrl("");
     setIsToggleOn(0);
-    localStorage.removeItem("selectedQuestionType");
+    localStorage.setItem("selectedQuestionType", DEFAULT_QUESTION_TYPE);
     if (fileInputRef.current) {
       fileInputRef.current.value = "";
     }
   };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@eduaid_web/src/pages/Text_Input.jsx` around lines 26 - 36, handleClear
currently calls localStorage.removeItem("selectedQuestionType") which leaves
selectedQuestionType null and later causes getEndpoint() to return "/null";
instead, restore the app's default question type when clearing. Replace the
removeItem call with setting the default value into localStorage (the same
default used on component mount) and also call the state setter (e.g.,
setSelectedQuestionType(<DEFAULT_QUESTION_TYPE>)) inside handleClear so both
persisted and in-memory state are valid; update the value stored under
"selectedQuestionType" rather than removing it.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/server.py (1)

529-541: ⚠️ Potential issue | 🟠 Major

Wrap Google Forms credential/bootstrap in the same guarded error path.

Credential loading and discovery.build(...) (Line 529–Line 541) execute before the try that starts at Line 678, so setup failures can bypass your structured JSON error handling and return inconsistent 500 behavior. Please move these lines inside the guarded block (or add a dedicated guard) so failures consistently return JSON.

🛠️ Proposed fix
-    store = file.Storage("token.json")
-    creds = store.get()
-    if not creds or creds.invalid:
-        flow = client.flow_from_clientsecrets("credentials.json", SCOPES)
-        creds = tools.run_flow(flow, store)
-
-    form_service = discovery.build(
-        "forms",
-        "v1",
-        http=creds.authorize(Http()),
-        discoveryServiceUrl=DISCOVERY_DOC,
-        static_discovery=False,
-    )
+    try:
+        store = file.Storage("token.json")
+        creds = store.get()
+        if not creds or creds.invalid:
+            flow = client.flow_from_clientsecrets("credentials.json", SCOPES)
+            creds = tools.run_flow(flow, store)
+
+        form_service = discovery.build(
+            "forms",
+            "v1",
+            http=creds.authorize(Http()),
+            discoveryServiceUrl=DISCOVERY_DOC,
+            static_discovery=False,
+        )
+
+        result = form_service.forms().create(body=NEW_FORM).execute()
+        form_service.forms().batchUpdate(
+            formId=result["formId"], body=NEW_QUESTION
+        ).execute()
 
-    try:
-        result = form_service.forms().create(body=NEW_FORM).execute()
-        form_service.forms().batchUpdate(
-            formId=result["formId"], body=NEW_QUESTION
-        ).execute()
         edit_url = "https://docs.google.com/forms/d/" + result["formId"] + "/edit"
         responder_url = result["responderUri"]
         logger.info("Created Google Form: %s", result["formId"])
         return jsonify({"responder_url": responder_url, "edit_url": edit_url})
     except Exception:

Also applies to: 678-690

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 529 - 541, The Google Forms bootstrap
(creating file.Storage("token.json"), retrieving/refreshing creds via
store.get(), client.flow_from_clientsecrets and tools.run_flow, and building
form_service via discovery.build with DISCOVERY_DOC and SCOPES) must be executed
inside the existing guarded error path that produces structured JSON errors;
move these lines into the same try/except that begins at the later block (or add
a dedicated try/except around these statements) so any failures in
store/creds/flow/authorize/discovery.build are caught and returned via the same
JSON error handler instead of bypassing it.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@backend/server.py`:
- Around line 529-541: The Google Forms bootstrap (creating
file.Storage("token.json"), retrieving/refreshing creds via store.get(),
client.flow_from_clientsecrets and tools.run_flow, and building form_service via
discovery.build with DISCOVERY_DOC and SCOPES) must be executed inside the
existing guarded error path that produces structured JSON errors; move these
lines into the same try/except that begins at the later block (or add a
dedicated try/except around these statements) so any failures in
store/creds/flow/authorize/discovery.build are caught and returned via the same
JSON error handler instead of bypassing it.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4a3336eb-0da3-4839-91a8-194f19c69300

📥 Commits

Reviewing files that changed from the base of the PR and between b53dc28 and 2e30d14.

📒 Files selected for processing (2)
  • backend/server.py
  • eduaid_web/src/pages/Text_Input.jsx

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[GOOD FIRST ISSUE]: Add a "Clear Input" button to reset the text input form

2 participants