ReportX is an educational health explanations app. MVP stores no data and logs metadata only (no PHI or request bodies). This monorepo contains a Next.js frontend and a FastAPI backend with Docker Compose for local prod-like runs.
- Copy envs and update if needed:
- Edit `.env` and set `OPENAI_API_KEY=sk-...` (do not source `.env.example`)
- Build and run: `docker compose up --build`
- Visit:
- Frontend: http://localhost:3000
- Health page: http://localhost:3000/health (calls backend)
- Parser: http://localhost:3000/parse (PDF/Image upload or paste text)
- PDF/Text/Image parsing (in-memory) → structured rows with heuristics for ranges/units and flagging.
- LLM interpretation to JSON with strict schema, one repair attempt, and robust fallback.
- Frontend flow: upload/paste → Parse → edit table → Explain → see summary, per_test, flags, next_steps, disclaimer → Translate.
- FRONTEND_URL: `http://localhost:3000` (CORS origin)
- NEXT_PUBLIC_BACKEND_URL: `http://localhost:8000`
- OPENAI_API_KEY: Optional. If unset or the network is blocked, the backend uses deterministic fallback JSON. Set it only in `.env`.
- Upload limits: up to 5 files per request, 500MB per file (subject to infra limits).
- ALLOWED_HOSTS: Comma-separated allowed hosts for the backend (default: `localhost,127.0.0.1`; tests allow `testserver`).
- Do not run with `--env-file .env.example` and do not `source` it. `.env.example` intentionally does not set `OPENAI_API_KEY` to avoid wiping your environment.
- After editing `.env`, recreate the backend so the container picks up changes: `docker compose up -d --force-recreate backend`.
- Run services: `docker compose up --build`
- OCR: Scanned PDFs and images (PNG/JPEG) are supported via Tesseract OCR when available. Docker and CI include Tesseract. OCR accuracy depends on image quality.
- Network restrictions: if the backend cannot reach the LLM, it falls back to a safe, deterministic JSON interpretation.
- Stateless: no DB; all parsing is ephemeral; do not upload PHI to shared environments.
- Request ID: every response includes `X-Request-ID` (propagates the incoming value or generates one). The same ID is logged alongside method, path, status, and duration for easier correlation across services and clients.
- Backend: fast, deterministic unit tests for parser and interpretation, plus an OCR smoke test that runs when Tesseract is available.
- Frontend: basic unit tests and an integration test that exercises the parse → interpret flow with mocked fetch.
- No persistence: backend writes nothing to disk; no volumes for uploads.
- Logging: backend logs method, path, status, and duration only (no bodies/files).
- Env: never commit secrets. `.env` is ignored; see `.env.example` for required variables.
- OCR: set `ENABLE_OCR=1` (default) and optionally `TESSERACT_CONFIG` and language packs; backend tries text layer first, then falls back to OCR.
- Frontend: `npm run lint`, `npm run typecheck`, `npm test` (inside `frontend/`).
- Backend: `make run`, `make test`, `ruff`, `black` (inside `backend/`).
This document specifies the functional and non-functional requirements of ReportX, a web-based tool that allows users to upload lab or pathology reports and receive plain-language explanations of the results using AI.
Many patients receive lab results that are difficult to understand due to medical jargon and language barriers.
Enable patients and caregivers to understand lab/pathology results and have better conversations with their clinicians. The system converts reports into plain-language, localized explanations, highlights out-of-range values, and suggests safe, non-diagnostic questions to ask a doctor.
ReportX provides the following key capabilities:
- Accepts PDF, image, or text input of medical lab reports
- Parses and extracts structured test data from the uploaded report
- Uses a large language model (LLM) to explain the test report to the user
- Maintains user safety through strong disclaimers
Persistence and history, chat follow‑ups, clinician validation portal, device‑level medical classification, and regulatory approvals.
No diagnosis, triage, or personalized medical advice.
No user accounts or data retention in MVP.
- MedlinePlus.org
- LabTestsOnline.org
- OpenAI API Docs
ReportX is a standalone web-based application that processes medical lab reports using AI and natural language explanations. It integrates LLM APIs, parsing logic, and a simple UI to support patients and caregivers.
User Type
- Patient: Wants to understand their report while waiting for a doctor
- Caregiver: Assists others in interpreting health data
- Clinician (optional): Reviews reports shared by patients and responds to their questions
- Ingestion: A lab report is provided as a PDF, image (photo/scan), or plain text.
- Parsing: The system extracts structured fields into a table containing: test name, measured value, unit, and reference range.
- Review & correction: The parsed table is displayed for verification; fields may be edited to correct OCR or parsing errors.
- Explanation generation: The system produces a plain-language explanation comprising a summary, per-test notes, out-of-range flags, suggested next steps, and a safety disclaimer.
- Localization: The explanation is rendered in a selected target language (e.g., Arabic, Vietnamese, Mandarin), preserving numerals and units and supporting right-to-left scripts where applicable.
- Follow-up Q&A (optional): A grounded Q&A interface is provided to ask brief follow-up questions about the results. Answers are derived strictly from the parsed table and the generated explanation, include a safety reminder, and exclude diagnosis or treatment advice.
- Output & export: Results are presented with safety banners and highlighted flags. Content can be copied or exported client-side for record-keeping or sharing.
- No personal health data will be stored
- Not intended for emergency interpretation or diagnosis
- OpenAI or Claude API access will be available
- Sample reports will be semi-structured and readable
- Users will have basic internet access and digital literacy
FR1: Input: The system shall accept lab reports as an uploaded PDF or pasted text.
Acceptance Criteria
AC1.1 Given a valid PDF ≤ 25 MB, when uploaded, then the system accepts it and moves to parsing.
AC1.2 Given pasted text 1–100,000 chars, when submitted, then the system accepts it and moves to parsing.
AC1.3 Given an unsupported file type or encrypted PDF, when submitted, then the system rejects it with a clear error and no data is retained.
FR2: Parsing: The system shall extract the test names, values, units, and reference ranges.
Acceptance Criteria
AC2.1 Output schema (ParsedRow): { test_name, value, unit, ref_range, flag }
AC2.2 Given a known sample report, when parsed, then each detected test appears as a ParsedRow with fields typed and schema-validated.
AC2.3 Given values outside reference ranges, when parsed, then flags reflect low or high.
AC2.4 Given missing ranges/units in a test row, when parsed, then the system sets ref_range or unit to null and flag to unknown (no fabricated values).
AC2.5 Given report text that cannot be parsed into any { test_name, value, unit, ref_range, flag } row, when parsed, then those lines are output verbatim under an "Unparsed lines" section.
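The ParsedRow schema and the flagging rules in AC2.1-AC2.4 reduce to a small heuristic; this sketch is illustrative (the `flag_value` helper and its `low-high` range regex are assumptions, not the app's actual parser):

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedRow:
    test_name: str
    value: Optional[float]
    unit: Optional[str]
    ref_range: Optional[str]
    flag: str  # "low" | "high" | "normal" | "unknown"


def flag_value(value: Optional[float], ref_range: Optional[str]) -> str:
    """Compare a value against a 'low-high' reference range (heuristic sketch)."""
    if value is None or not ref_range:
        return "unknown"  # missing data stays unknown -- never fabricate (AC2.4)
    m = re.match(r"\s*([\d.]+)\s*[-–]\s*([\d.]+)\s*$", ref_range)
    if not m:
        return "unknown"
    low, high = float(m.group(1)), float(m.group(2))
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"
```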
FR3: Interpretation: The system shall send parsed results to an LLM and receive interpretation.
Acceptance Criteria
AC3.1 The AI reply always includes: a short summary, per-test notes, any flags, suggested next steps, and a disclaimer.
AC3.2 If the AI is unavailable, a simple “fallback” explanation is shown so users still get something helpful.
AC3.3 If the AI answer is incomplete, the system fixes small gaps or falls back to the simple version.
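Taken together, AC3.1-AC3.3 describe a strict-parse, one-repair-attempt, then-fallback pipeline. A minimal sketch, assuming a hypothetical `call_llm(rows, repair_of=...)` client wrapper and illustrative fallback wording:

```python
import json

REQUIRED_KEYS = {"summary", "per_test", "flags", "next_steps", "disclaimer"}

# Deterministic fallback shown when the LLM is unavailable or unrepairable.
FALLBACK = {
    "summary": "We could not generate a detailed explanation right now.",
    "per_test": [],
    "flags": [],
    "next_steps": ["Discuss these results with your clinician."],
    "disclaimer": "This is educational information, not medical advice.",
}


def interpret(rows, call_llm) -> dict:
    """Strict parse, exactly one repair attempt, then the fallback."""
    raw = call_llm(rows)
    for attempt in range(2):
        try:
            data = json.loads(raw)
            if isinstance(data, dict) and REQUIRED_KEYS <= data.keys():
                return data
        except (json.JSONDecodeError, TypeError):
            pass
        if attempt == 0:
            # Ask the model once to repair its own malformed output.
            raw = call_llm(rows, repair_of=raw)
    return FALLBACK
```

Capping repair at a single attempt keeps latency bounded while still recovering from common truncated-JSON failures.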
FR4: Translation: The system shall allow the user to access the same explanation in different languages.
Acceptance Criteria
AC4.1 A language switch on the results screen offers English + at least two other languages.
AC4.2 Switching language updates the whole explanation (summary, per-test notes, flags, next steps, disclaimer) without re-uploading the report.
AC4.3 Numbers and units do not change when switching language.
AC4.4 Right-to-left languages display correctly, and the disclaimer is translated.
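AC4.3 is mechanically checkable by comparing the numerals and units on both sides of a translation. A sketch of such a check; `numbers_preserved` and its small unit list are illustrative, not the app's real inventory:

```python
import re

# Illustrative pattern: decimal numbers plus a few common lab units.
NUMBER_OR_UNIT = re.compile(r"\d+(?:\.\d+)?|g/dL|mmol/L|mg/dL|%")


def numbers_preserved(source: str, translated: str) -> bool:
    """True when every numeral and unit in the source survives translation."""
    return (sorted(NUMBER_OR_UNIT.findall(source))
            == sorted(NUMBER_OR_UNIT.findall(translated)))
```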
FR5: Suggestions: The system shall present follow-up suggestions and educational context.
Acceptance Criteria
AC5.1 At least three plain-language next steps are shown.
AC5.2 At least three “questions to ask your doctor” are shown.
FR6: Disclaimer: The system shall include clear disclaimers that outputs are not diagnostic.
FR7: Follow-Up Q&A: The system should allow users to ask follow-up questions about the result in context.
Acceptance Criteria
AC7.1 Answers stick to the current report and explanation (no guessing, no outside claims).
AC7.2 If a question asks for a diagnosis, prescription, or urgent triage, the app declines and points to safer next steps.
AC7.3 Answers use the current language setting and keep numbers and units as they are.
FR8: User Accounts and Role-Based Access
The system shall support secure user authentication and role-based access for three user types: patient, caregiver, and clinician.
Acceptance Criteria
AC8.1 Given a new visitor, when they register, then they must select a role (patient, caregiver, or clinician) and the system assigns the corresponding permission set upon account creation.
AC8.2 Given a patient account, when logged in, then the user can upload reports, view their own report history, and manage sharing preferences.
AC8.3 Given a caregiver account, when granted permission by a patient, then the caregiver can view and act on that patient's reports; without permission, no patient data is accessible.
AC8.4 Given a clinician account, when logged in, then the clinician can only view reports that have been explicitly shared with them by a patient; no other patient data is visible.
AC8.5 Given any authenticated user, when they attempt to access a resource outside their role's permission scope, then the system denies access and returns a clear error message.
AC8.6 Given an unauthenticated visitor, when they attempt to access any report or account data, then the system redirects them to the login page.
AC8.7 Given a user session, when the session token expires or the user logs out, then all in-memory data is cleared and re-authentication is required.
FR9: Consent-Driven Data Sharing
The system shall provide patients with explicit, granular control over who can access their reports, for how long, and what they can see.
Acceptance Criteria
AC9.1 Given a patient viewing one of their uploaded reports, when they initiate sharing, then they must specify: the recipient clinician (by verified identity), the scope of access (full report or summary only), and an expiry date before the share is confirmed.
AC9.2 Given an active share, when the patient revokes access, then the clinician's access is removed immediately and a revocation event is written to the audit log.
AC9.3 Given any share, view, or revocation event, when the event occurs, then a timestamped record is written to the patient's audit log within 5 seconds.
AC9.4 Given a patient reviewing their audit log, when they open it, then all share, view, and revocation events for their reports are displayed in chronological order.
AC9.5 Given a share with an expiry date, when the expiry date is reached, then the clinician's access is automatically revoked and an event is recorded in the audit log without requiring patient action.
AC9.6 Given a clinician attempting to access a report after expiry or revocation, then the system denies access and displays a message indicating the access is no longer valid.
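The access rules in AC9.1-AC9.6 reduce to a small piece of pure logic over identity, revocation, and expiry. This `Share` sketch is illustrative only; real storage, verified clinician identity, and notifications are outside its scope:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Share:
    patient_id: str
    clinician_id: str
    scope: str  # "full" or "summary" (AC9.1)
    expires_at: datetime
    revoked: bool = False
    audit_log: List[dict] = field(default_factory=list)

    def _log(self, event: str) -> None:
        # Timestamped record for every access-control event (AC9.3).
        self.audit_log.append({"event": event, "at": datetime.now(timezone.utc)})

    def revoke(self) -> None:
        self.revoked = True
        self._log("revoked")

    def can_view(self, clinician_id: str, now: datetime) -> bool:
        # Expiry denies access with no patient action needed (AC9.5).
        ok = (clinician_id == self.clinician_id
              and not self.revoked
              and now < self.expires_at)
        self._log("viewed" if ok else "denied")
        return ok
```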
FR10: Conversation Threads Tied to a Specific Report
The system shall provide a threaded messaging channel scoped to each uploaded report, allowing patients and clinicians to communicate asynchronously with context anchored to specific findings.
Acceptance Criteria
AC10.1 Given a patient viewing their report results, when they tap or select a specific lab row or flagged finding, then a new conversation thread can be opened anchored to that row.
AC10.2 Given an open thread, when either the patient or the linked clinician sends a message, then the message appears in the thread within the report view for both parties, with a timestamp and sender role label.
AC10.3 Given a clinician viewing a shared report, when they open a thread, then the same contextual anchor (the specific lab row or finding) is visible alongside the message history.
AC10.4 Given a report with active threads, when a new message is posted, then the other party receives a notification (in-app at minimum).
AC10.5 Given a clinician account that has not been granted access to a report, when they attempt to view or post to any thread on that report, then the system denies access.
AC10.6 Given a thread, when a patient or clinician views it, then all prior messages in that thread are displayed in chronological order with no messages omitted.
FR11: Structured "Questions for My Clinician" and Clinician Response Templates
The system shall use AI-generated findings to produce structured, pre-filled question prompts for patients and guide clinicians to respond using a structured template.
Acceptance Criteria
AC11.1 Given a completed AI explanation with flagged findings, when the patient navigates to the questions section, then the system presents at least three structured, pre-filled question prompts derived from those specific findings, covering symptoms, concern level, and timeline.
AC11.2 Given the pre-filled prompts, when a patient reviews them, then they can edit any prompt before sending, and the edited version is what is delivered to the clinician.
AC11.3 Given a patient sending questions through the conversation thread, when the clinician receives them, then a structured response template is presented to the clinician covering: what the result means, an urgency rating (routine, soon, or urgent), and a recommended follow-up action.
AC11.4 Given a clinician submitting a structured response, when the patient receives it, then the response is rendered as a readable card (not raw clinical text), using plain language consistent with the patient's selected language setting.
AC11.5 Given a flagged finding where no question prompt was auto-generated, when the patient views the questions section, then a free-text option is available to submit a custom question through the thread.
FR12: Trend and Longitudinal Health Analysis
The system shall allow users to upload multiple reports over time and receive AI-generated trend analysis comparing the same biomarkers across reports.
Acceptance Criteria
AC12.1 Given an authenticated patient, when they upload more than one report, then each report is stored as a dated record linked to their account, and all reports are accessible from a report history view.
AC12.2 Given two or more reports containing the same biomarker (matched by test name), when the patient views the trend section, then the system generates a plain-language trend note for that biomarker indicating whether the value is improving, stable, or worsening.
AC12.3 Given a biomarker with trend data, when the patient views that test's section, then a sparkline chart is displayed alongside the trend note showing the value over time.
AC12.4 Given a biomarker that appears in only one uploaded report, when the patient views the trend section, then no trend note or chart is shown for that biomarker, and the system does not fabricate a trend.
AC12.5 Given trend notes, when displayed, then they adhere to the same plain-language standard, disclaimer requirements, and language settings as single-report explanations (per FR3 and FR4).
AC12.6 Given a clinician viewing a shared report, when the patient has granted access to the full report, then trend notes and charts for that patient are visible to the clinician in the shared view.
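The match-by-test-name and no-single-point-trend rules (AC12.2, AC12.4) could be sketched as below. `trend_notes` and its 5% stability tolerance are assumptions; the sketch deliberately emits neutral "rising"/"falling" words, since mapping direction to "improving" or "worsening" requires per-test clinical knowledge:

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Tuple


def trend_notes(reports: List[Tuple[date, str, float]],
                tolerance: float = 0.05) -> Dict[str, str]:
    """Classify the latest movement of each biomarker across dated reports.

    `reports` holds (report_date, test_name, value) triples. A biomarker
    seen in only one report gets no note at all (AC12.4).
    """
    series: Dict[str, List[Tuple[date, float]]] = defaultdict(list)
    for d, name, value in reports:
        series[name].append((d, value))
    notes: Dict[str, str] = {}
    for name, points in series.items():
        if len(points) < 2:
            continue  # never fabricate a trend from a single value
        points.sort()  # chronological order by date
        prev, last = points[-2][1], points[-1][1]
        if prev and abs(last - prev) / abs(prev) <= tolerance:
            notes[name] = "stable"
        elif last > prev:
            notes[name] = "rising"
        else:
            notes[name] = "falling"
    return notes
```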
FR13: Doctor-Ready Summary Report
The system shall generate a structured, one-page PDF summary designed for clinical use, drawing on the AI explanation, flagged values, trend notes, and patient-submitted questions.
Acceptance Criteria
AC13.1 Given a completed AI explanation for a report, when the patient selects "Export Doctor Summary," then the system generates a one-page PDF within 10 seconds.
AC13.2 Given the generated PDF, when reviewed, then it includes: flagged values, AI explanation summary, trend notes (if available), and any questions the patient has submitted through the conversation thread.
AC13.3 Given the generated PDF, when reviewed, then the language used in the clinical sections targets a clinician audience (uses clinical terminology rather than the plain-language version shown to patients).
AC13.4 Given the consent workflow (FR9), when a patient shares a report with a clinician, then the option to include the doctor-ready summary in the shared view is presented to the patient.
AC13.5 Given a generated PDF, when exported, then no data is stored server-side as part of the export action; the file is generated and delivered client-side only (consistent with NFR2, unless user accounts and data retention have been explicitly enabled per FR8).
3.2 Non-Functional Requirements
NFR1: The UI shall be optimized for accessibility and older users.
NFR2: The system shall not store any user-submitted data.
NFR3: Interpretation shall complete within 5 seconds per input.
NFR4: The system shall maintain 99% uptime during the project phase.
NFR5: The app architecture shall allow scaling for multiple users.
5. Technical Requirements

5.1 Technology Stack

| Layer | Technology/Tool |
| --- | --- |
| Frontend | Next.js |
| Backend | Python (FastAPI or Flask) |
| AI Model API | OpenAI GPT-5 |
| Parsing/OCR | PyMuPDF |

5.2 Programming Languages

| Language | Purpose |
| --- | --- |
| Python | Parsing, API calls, AI logic, app backend |
| JavaScript/TypeScript (optional) | For advanced UI (Next.js) |

5.3 AI Model Usage

| Model | Description |
| --- | --- |
| GPT-5 | Generates safe, contextual interpretations of lab values and summaries |

5.4 AI Code Assistant Tools

| Tool | Purpose |
| --- | --- |
| Bolt | Assist in coding, testing, and debugging backend logic |
| CodeX | Assist in debugging |
| Lovable | Suggest accessible and clean UI components and layouts (yet to explore) |
6. Roadmap

| Phase | Key outcomes / deliverables | Status |
| --- | --- | --- |
| Foundations | Finalise scope, disclaimers; approve SRS. | Completed |
| MVP: Understand My Results | Upload/paste; parse key fields; plain-language English explanation; flags; next steps; clear disclaimers; simple UI with copy/export. | Completed |
| Multi-language explanations | Language switch (English + at least 5 other languages); RTL support (pending) | Completed |
| Follow-up Q&A (grounded chat) | Short questions about results; answers grounded in current parse + explanation; safety guardrails | Not Started |
| Patient-friendly polish | Accessibility upgrades (contrast, larger text, keyboard support); plain language/layout tweaks; 1-page “doctor summary.” | Not Started |
| Quality & readiness | Performance and resilience testing (incl. AI fallback); privacy review (no PHI storage); basic usage/error analytics (no content). | Not Started |