Skip to content

csv-xmlConduct full UX architecture review#7

Open
daler91 wants to merge 196 commits into
rdale-dev:masterfrom
daler91:claude/ux-architecture-review-0IYjT
Open

csv-xmlConduct full UX architecture review#7
daler91 wants to merge 196 commits into
rdale-dev:masterfrom
daler91:claude/ux-architecture-review-0IYjT

Conversation

@daler91
Copy link
Copy Markdown

@daler91 daler91 commented Apr 11, 2026

No description provided.

google-labs-jules Bot and others added 30 commits March 19, 2026 16:04
Added `pytest` to `requirements.txt` to include test dependencies and allow running `pytest tests/` successfully.

Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
🎯 What: Replaced lxml.etree.parse with defusedxml.lxml.parse in xml-validator.py to prevent XML External Entity (XXE) vulnerabilities. Added defusedxml and lxml to requirements.txt.
⚠️ Risk: If left unfixed, the application could be vulnerable to XXE attacks when parsing malicious XML or XSD files, potentially leading to unauthorized data disclosure or denial of service.
🛡️ Solution: defusedxml acts as a drop-in replacement that strictly disables external entity resolution by default, successfully mitigating the XXE attack vector while maintaining compatibility with lxml.etree.XMLSchema and validation.

Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
…2842002

🔒 Fix XXE vulnerability in xml-validator
…matting-2048886441696389176

🧪 Add Error Path Tests for Date Formatting in src/data_cleaning.py
…99421084779733224

🧪 Add unit tests for Data Validation module
…2181123944

🧪 Add tests for BaseConverter to verify ABC behavior
Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
…13268053583850

🧪 Add test for untestable exception block in xml-validator.py
…41166032

chore: Add pytest to requirements.txt
🎯 What:
Fixed an XML External Entity (XXE) vulnerability in `src/xml-validator.py` caused by using `lxml.etree.parse` without disabling external entity resolution.

⚠️ Risk:
When parsing user-provided XML files, `lxml` default configuration resolves external entities. This allows attackers to define malicious external entities (e.g., local files via `file://`) that get included in the parsed XML, leading to arbitrary file disclosure (Local File Inclusion), Server-Side Request Forgery (SSRF), or Denial of Service (Billion Laughs attack).

🛡️ Solution:
Created a secure `etree.XMLParser` with `resolve_entities=False` and passed it to the `etree.parse` calls for both the XML document and the XSD schema document. This prevents the parser from resolving external entities, neutralizing the XXE threat.

Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
…ers-1942838300665811679

Refactor Address and Phone logic in CounselingConverter
…cs-4508032247991148993

🧪 Add test for _calculate_demographics in TrainingConverter
Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
…s-12243694426949791705

🧪 Add comprehensive tests for clean_percentage
Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
…-4221162114748354920

🧪 Add tests for clean_numeric in data_cleaning.py
Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
Co-authored-by: rdale-dev <203160809+rdale-dev@users.noreply.github.com>
🔒 Fix XXE vulnerability in xml-validator.py
…930982119168753

🧹 [code health improvement] Remove unused sys import in xml-validator.py
daler91 and others added 27 commits April 7, 2026 12:15
…qkkS

Fix XSD validation falsely rejecting temp file paths during conversion
- Hispanic counter: "Prefer not to say" was incorrectly counted as non-Hispanic
  because the logic counted any non-empty value not containing "hispanic/latino".
  Added explicit exclusion for "prefer not" values.
- Disability counter: Used word-boundary regex (\b) to prevent "y" from matching
  inside words like "say" in "Prefer not to say".
- Updated test to cover "Prefer not to say" for both disability and ethnicity.

https://claude.ai/code/session_01Db8ZhdRsiM8CAFnm8GNodB
…rs-fN3F6

Fix 888 demographic counters to exclude "Prefer not to say" responses
…t badges

- Restore saved columnMapping from job record on page load so selections
  persist when navigating away and back (was only loading suggestions)
- Show both auto-matched and missing fields in the mapping table (was only
  showing missing fields)
- Add field requirement badges (Required/Conditional/Optional) with color coding
- Expand dropdown options to all CSV headers instead of just unmatched columns
- Add "Auto-matched" indicator for fields that matched CSV columns directly

https://claude.ai/code/session_01HYibRRktm21EJ54YhSLT6a
…aMQV

Fix mapping page: persist selections, show all fields, add requiremen…
Comprehensive severity-ranked audit of the web app's user-facing
surfaces (apps/web), covering landing, auth, dashboard, audit trail,
and the full 5-step conversion flow.

Structured like TECHNICAL_DEBT.md so items can be marked [RESOLVED]
as they are fixed. Includes findings by theme (IA, onboarding, flow,
errors, feedback, a11y, responsive, visual, content, trust), an
appendix of findings by page, a prioritized top-10 punch list, and
open questions for the product owner.
Sequences the 45+ findings in UX_REVIEW.md into six executable
phases, ordered by impact and dependency:

  1. Stop the Bleed — P0 bug, silent data loss, worst a11y gaps
  2. Mobile Usable — responsive grids, tables, nav, focus rings
  3. Recoverable Errors & Live Feedback — toasts, cancel, retry
  4. Onboarding & Clarity — landing, converter types, vocabulary
  5. Mapping Page Overhaul — field descriptions, diff drilldown
  6. Design System & Shell Polish — component library, breadcrumbs

Each phase lists findings addressed (referenced by UX_REVIEW.md
section number), specific file changes, dependencies, verification
steps, and explicit out-of-scope items. Includes a cross-phase
dependency graph, tracking convention, and open questions that
block specific phases.
Ships the P0 fixes from UX_IMPLEMENTATION_PLAN.md Phase 1:

- §3.2 (P0 bug): Fix re-upload page hardcoded label that rendered
  "training-client" as "Training (Form 888)". New shared
  converter-types.ts module is the single source of truth; later
  phases will extend it with descriptions and sample links.
- §4.3 (P1 silent data loss): Mapping-save failure no longer swallows
  the error. Now checks res.ok, surfaces the message via a new error
  state, and prevents the redirect on failure.
- §6.1 (P0 a11y): Add role=progressbar, aria-valuenow/min/max, and
  aria-label to the conversion progress bar.
- §5.4 (P1 a11y): Wrap live counters in aria-live=polite so screen
  readers announce per-update progress.
- §6.2 (P0 a11y): Add role=alert to every error container in
  convert/login/signup/mapping/reupload/preview and to the error
  boundary fallback.
- §6.6 (P3 a11y): Add scope=col to every <th> across dashboard,
  audit, preview, mapping, results (IssueTable + CleaningDiffView).

Verified with `npx tsc --noEmit` in apps/web — no errors.
Ships the mobile/responsive fixes from UX_IMPLEMENTATION_PLAN.md
Phase 2. Every page should now survive a 375px viewport without
horizontal overflow or illegible content.

- §1.1 (P1): Collapse nav into a hamburger sheet on <md screens.
  Desktop layout unchanged; mobile gets a burger button with
  aria-expanded/aria-controls, a dropdown listing the three links,
  the user's email, and Sign Out. The sheet auto-closes on route
  change via pathname effect.
- §3.7/§7.1 (P1): results summary cards →
  grid-cols-2 md:grid-cols-4.
- §7.2 (P1): results comparison cards →
  grid-cols-1 sm:grid-cols-3.
- §7.3 (P1): preview column-status cards →
  grid-cols-1 sm:grid-cols-3.
- §7.4 (P1): wrap dashboard table in overflow-x-auto.
- §7.5 (P1): wrap audit table in overflow-x-auto.
- §7.6 (P2): mapping page select + suggestion pill stacks
  vertically on <sm (flex-col sm:flex-row).
- §7.7 (P2): preview table thead is now sticky top-0 so column
  headers stay visible when scrolling a wide CSV.
- §6.4 (P1): globally visible :focus-visible ring via globals.css
  @layer base — keyboard users get a blue outline, mouse users
  don't.
- §8.4 (P2): upload limit text bumped from text-gray-400 to
  text-gray-600 for WCAG AA contrast.

Also marks Phase 1 and Phase 2 findings as [RESOLVED] in
UX_REVIEW.md per the tracking convention.

Verified with `npx tsc --noEmit` in apps/web — no errors.
Previously the progress page had no way to cancel an in-flight
conversion — users could only close the tab, and the worker kept
running regardless. This commit ships a working cancel flow through
both the worker and the web app, resolving the cancel portion of
UX_REVIEW.md §3.6.

Design: cooperative cancellation. The database is authoritative —
clicking Cancel flips job.status to "cancelled" immediately so the
UI, the dashboard, and any future reads see the cancellation even
if the worker is still crunching. The worker is told in parallel
and bails at the next checkpoint in run_conversion. A conditional
updateMany in the start route's background handler prevents any
late-arriving "complete" or "error" from clobbering the cancelled
state.

Worker side:
- New apps/worker/app/services/cancellation.py with a thread-safe
  CancellationRegistry and ConversionCancelledError. Module-level
  singleton; polled from asyncio.to_thread while the event loop
  writes to it.
- run_conversion takes an optional is_cancelled callable and polls
  it at five checkpoints (entry, after mapping, after cleaning diff,
  after convert, after XSD validation). Cancellation takes effect
  at the next boundary — bounded by the duration of a single phase.
- POST /convert/{job_id}/cancel adds the job id to the registry.
- The /convert route catches ConversionCancelledError, returns 409,
  and always clears the flag in finally so job_ids can be reused
  after re-upload.

Web side:
- New POST /api/jobs/[jobId]/cancel: ownership check, flips DB
  status to "cancelled", writes a conversion_cancelled audit entry,
  fire-and-forget pokes the worker's cancel endpoint.
- Existing /api/jobs/[jobId]/start background handler now uses
  prisma.job.updateMany with where { status: "converting" } so a
  late completion or error can't overwrite a cancellation that
  landed in the race window.
- Progress page renders a "Cancel conversion" button while
  status === "converting", with a cancelling state and inline
  error alert. The poll loop now also redirects on
  status === "cancelled" (short delay for the user to register
  the state flip), and the dead-end timeout gets two recovery
  buttons: "Check status again" and "Go to dashboard".
- Results page redirects cancelled jobs back to the dashboard
  (they have nothing to show).
- Dashboard StatusBadge gains a "cancelled" color entry.

Verified:
- npx tsc --noEmit clean in apps/web
- Full worker import + functional tests green
- pytest tests/ — 133 passed
Resolves UX_REVIEW.md §3.9 / §5.1. Every successful client action
previously ended in a silent router.push; on slow connections this
made the app feel broken even when it succeeded. Adds a minimal
toast primitive and fires success toasts from the 5 client pages
that complete user actions.

- New apps/web/src/components/toast.tsx: Context + useToast hook,
  singleton viewport. Variants success/error/info with auto-dismiss
  (4s, 5s for errors). Errors use role=alert; non-errors use
  role=status inside an aria-live=polite list so screen readers
  announce them. Keyboard dismissable via Escape.
- Mounted in providers.tsx inside SessionProvider, so it's
  available on every client page.
- Fired on: sign-in ("Signed in"), sign-up ("Account created"),
  upload ("File uploaded — loading preview"), mapping save
  ("Mapping saved"), re-upload ("Re-upload received — loading
  preview").

Progress-page conversion-complete toast deferred because results
is a server component and would need a client wrapper + query
param to read the "just-completed" signal. Not essential — users
already see the results page.

Verified with `npx tsc --noEmit` in apps/web — no errors.
Resolves UX_REVIEW.md §4.5 and completes the resolved portions of
§3.6.

- Elapsed-time counter ticks every second in its own effect so it
  keeps counting during slow/failing polls. Replaces the stale
  "Processing row 0 of 0" subtitle that used to be shown while the
  worker was running (the worker does not actually update
  Job.processedRows, so the row-count display was misleading). True
  row-based ETA is deferred; marked in UX_REVIEW.md with a note.
- Poll-failure banner appears after 3 consecutive poll failures and
  offers Retry (resets the failure counter and nonces the polling
  effect) and Go to dashboard. Single transient failures are still
  silently retried so the UI doesn't flap.
- The poll loop's catch now increments a counter; successful polls
  reset it to zero.

Verified with `npx tsc --noEmit` in apps/web — no errors.
Resolves UX_REVIEW.md §4.1, §4.2, §4.4.

Also fixes a drive-by bug in the upload route: the converter-type
allowlist only accepted "counseling" and "training", even though
the upload page offers "training-client" as a third option. Users
who picked it got a 400 error back with no warning from the UI.

Specific upload error messages (§4.1):
- New apps/web/src/lib/upload-errors.ts maps the HTTP status codes
  the upload route actually returns (413, 429, 401/403, 400, 5xx)
  to plain-language messages with next-action hints.
- convert/page.tsx and reupload/page.tsx both call the helper and
  fall back to "Couldn't reach the server…" on network failures.
- Upload route's allowlist now includes "training-client".

Preview + mapping error retry cards (§4.2):
- preview/page.tsx: extracted loadPreview into a useCallback so
  the Try again button can re-run it. Renamed error state to
  loadError; convert-start failures now fire a toast instead of
  nuking the whole page. Error fallback is now a proper card with
  Try again and Back to upload buttons.
- mapping/page.tsx: same pattern — load() is a useCallback with
  a dedicated loadError state and a card with Try again and Back
  to preview buttons. Previously load failures were silently
  swallowed, leaving the page blank.

Error boundary (§4.4):
- Now stores error + errorInfo in state so we can display them.
- Fallback offers both Try again and Go to dashboard.
- In development, a collapsible <details> surfaces the error
  message, stack, and component stack so contributors don't need
  to open devtools.
- TODO comment for Sentry wiring once the Sentry MCP integration
  is enabled (Phase 6 follow-up).

Verified with `npx tsc --noEmit` in apps/web — no errors.
Resolves UX_REVIEW.md §5.2 and §5.3.

Previously loading buttons relied on text changes plus a dimmed
background, and data pages showed a centered "Loading…" string
that caused layout jumps when the content arrived. This commit
adds two small primitives and wires them through every loading
surface in the app.

New components:
- apps/web/src/components/spinner.tsx — 16px SVG with animate-spin,
  no dependency.
- apps/web/src/components/skeleton.tsx — base Skeleton, SkeletonText,
  and SkeletonTable for the common shapes.

Spinners added to (with aria-busy on the parent button):
- login/page.tsx Sign In button
- signup/page.tsx Create Account button
- convert/page.tsx Upload & Preview button
- preview/page.tsx Confirm & Convert button
- mapping/page.tsx Save Mapping & Continue button
- reupload/page.tsx Upload & Compare button
- progress/page.tsx Cancel conversion button

Skeletons replace "Loading…" on:
- preview/page.tsx (heading + summary cards + 5-row table)
- mapping/page.tsx (heading + 8-row 2-column table)
- audit/page.tsx (5 skeleton rows inside the real <tbody>)
- reupload/page.tsx (heading + read-only type + file input)
- convert/page.tsx Suspense fallback

Verified with `npx tsc --noEmit` in apps/web — no errors.
Resolves UX_REVIEW.md §3.1 / §9.4. The convert page's converter
picker was three unadorned radio buttons — partners couldn't tell
whether their CSV belonged under "Counseling", "Training", or
"Training Client" without reading source code. Picking wrong meant
the upload succeeded, preview reported "Missing" for every column,
and the user had to loop back.

- Extended apps/web/src/lib/converter-types.ts from a simple label
  helper into a full metadata module:
    CONVERTER_TYPES    — readonly array of {value, label, description, sample}
    converterTypeLabel — existing, now derived from the array
    converterTypeMeta  — new lookup for the rest of the metadata
  The sample paths point at /samples/<type>-sample.csv which will be
  shipped in the onboarding commit (Phase 4 commit 4/4).
- Rebuilt convert/page.tsx radio group as three stacked cards (one
  column on mobile, three on sm+). Each card shows label, plain-
  language description, and a "See sample" link. The underlying
  <input type="radio"> is sr-only, so screen readers still hear a
  real radio group and keyboard navigation works via the label.
- Selected cards get a blue border + bg and render a filled radio
  dot from state; unselected cards use a subtle hover border.

Samples are linked but not yet present — follow-up commit adds them
under apps/web/public/samples/.

Verified with `npx tsc --noEmit` in apps/web — no errors.
Resolves UX_REVIEW.md §2.4, §3.3, §6.5.

Signup password rules checklist (§2.4):
- Four rules rendered under the password input: 8+ chars, uppercase,
  digit, special char. Each mirrors the server check in
  /api/auth/signup/route.ts validatePasswordComplexity().
- Empty rules render as gray with hollow check circles; met rules
  turn green with filled circles as the user types.
- The <input> points at the rules via aria-describedby and each
  item has an sr-only "Met:/Not met:" prefix so screen readers
  announce the transitions clearly.

Dropzone refactor to a real <label> (§6.5):
- Replaced the <div role="button" tabIndex={0}> hack with a proper
  <label htmlFor="file-input">. Labels natively activate their
  target input on click and Enter/Space, so the manual keyboard
  handler is gone.
- Added aria-labelledby + aria-describedby pointing at the label
  and help text so screen readers announce both.
- File input now uses sr-only (still keyboard accessible) instead
  of className="hidden" (which would take it out of the tab order).

Client-side file validation (§3.3):
- New acceptFile() helper rejects anything that isn't a .csv
  (case-insensitive — the old ".endsWith('.csv')" check silently
  dropped .CSV files on Drop) and anything over 50MB.
- Rejection fires a toast explaining why, so the user gets
  immediate feedback instead of waiting for a server round-trip.
- accept attribute now includes "text/csv" alongside ".csv" so the
  OS file picker pre-filters properly.

Verified with `npx tsc --noEmit` in apps/web — no errors.
Resolves UX_REVIEW.md §6.3, §9.1, §9.5, §3.4, §10.1, §10.2.

Shared status-icon primitive (§6.3):
- New apps/web/src/components/status-icon.tsx: 16px SVG icons for
  success / warning / error / info / neutral, all using
  currentColor so they inherit the surrounding text color. aria-
  hidden because adjacent text already carries the meaning.
- Closes a WCAG 1.4.1 gap: before this commit, green/yellow/red
  were the only signal on the dashboard, results summary cards,
  preview column-status cards, and cleaning diff. ~8% of men have
  color vision deficiency and couldn't tell these apart.

Status vocabulary + StatusBadge redesign (§9.1):
- dashboard/page.tsx: replaced the six lowercase raw status strings
  with a STATUS_META table that maps each DB status to a plain
  label ("Ready to convert", "Needs mapping", "Failed") and a
  StatusIcon kind. Badges now show icon + label, with data-status
  preserved for tests. Terminal colours bumped from 700 to 800 for
  contrast.

Results page (§6.3):
- Summary cards gained a StatusKind prop; Successful / Errors /
  Warnings each render an inline icon next to the label.
- XSD validation block uses icon+text too, with darker variants
  (green-700, red-700) for contrast.
- Cleaning diff rows prefix Original with "−" and Cleaned with
  "+" plus sr-only "Original value:" / "Cleaned value:" so the
  diff reads properly in screen readers and makes sense without
  color.

Preview page (§3.4, §6.3):
- Matched / Missing / Extra cards now use the status-icon
  primitive.
- Extra is now neutral gray (not warning yellow) with a one-line
  explanation "Extra columns are ignored during conversion. This
  is fine." so partners stop treating it as a problem.

Mapping page (§9.5):
- Renamed "Skip" to "Cancel" and added a dirty-state check:
  mappingKey() stringifies the current mapping state, and if it
  differs from the initial key we window.confirm() before
  discarding changes. Tracks initial key on load.

Data-handling disclosure on upload (§10.1):
- One-paragraph disclosure under the Upload button: "Your CSV is
  stored in your account and is visible only to you. Files are
  retained per SBA policy. Data is processed on SBA infrastructure;
  nothing is sent to third-party services." Marked in UX_REVIEW
  as needing the concrete retention number once product confirms.

Re-upload previous-kept cue (§10.2):
- Extra line under the reupload subtitle: "Your previous
  conversion is kept as a separate job. The two are compared
  side-by-side on the next screen." Resolves the mental-model
  mismatch where users thought re-upload replaced the original.

Verified with `npx tsc --noEmit` in apps/web — no errors.
Resolves UX_REVIEW.md §2.1, §2.2, §2.3.

Landing page rebuild (§2.1):
- Hero section explains what the tool does in plain language.
- "What this converts" section renders three converter-type cards
  (pulled from the shared converter-types.ts metadata) with plain-
  language descriptions and "Download sample CSV" links.
- "How it works" section shows the 4-step flow (Upload → Preview
  & map → Convert & validate → Download) as numbered cards.
- Footer note reiterates the data-handling disclosure so partners
  can evaluate trust before creating an account.
- CTAs (Sign In / Create Account) bumped to text-base py-3 for
  better visual balance against the headline — resolves §8.3 too.

Dashboard empty state (§2.2):
- Replaced "No conversions yet. Upload a CSV file to get started."
  with a card containing a primary "Start a new conversion"
  button and three SampleLink cards so first-time users can
  round-trip the tool without supplying their own data.

Sample CSVs (new):
- apps/web/public/samples/{counseling,training,training-client}-
  sample.csv — minimal 3-row files demonstrating the expected
  column names and value format. No real client data.
- .gitignore updated to un-ignore this directory (the repo has a
  blanket *.csv exclude).

README web-app section (§2.3):
- Added a "Web app (recommended)" section at the top pointing at
  docker compose up, the samples directory, and the UX
  documentation (UX_REVIEW, UX_IMPLEMENTATION_PLAN,
  TECHNICAL_DEBT). The existing CLI section is preserved below.

Drive-by: §8.3 (homepage typography unbalanced) is now resolved
as a side-effect of the landing page rebuild.

Verified with `npx tsc --noEmit` in apps/web — no errors.
Resolves UX_REVIEW.md §3.5. The mapping page previously exposed raw
XML field names like "BranchOfService" and "FIPS_Code" with only
Required/Conditional/Optional badges — partners with no SBA schema
background had no way to tell what each field meant, and
"Conditional" with no explanation of the rule was worse than
useless.

Worker side:
- New COUNSELING_FIELD_METADATA, TRAINING_FIELD_METADATA, and
  TRAINING_CLIENT_FIELD_METADATA dicts in preview_service.py map
  expected field names to {description, conditional_rule}. Kept
  in the worker layer instead of src/config.py so the CLI surface
  isn't touched with web-specific metadata.
- Covers all required + conditional fields:
    counseling      32 fields (all 12 required + all 9 conditional + common optionals)
    training         1 field  (the only required field)
    training-client 10 fields (both required + common optionals)
- read_csv_preview() now emits a field_descriptions dict alongside
  field_requirements. Only fields with metadata are included; the
  UI falls back to raw names for anything missing.

Web side:
- Extended PreviewResponse type in types/index.ts with optional
  field_descriptions: Record<string, FieldDescription>.
- mapping/page.tsx renders the description under the monospace
  field name in text-xs text-gray-600. Conditional badges gain a
  title attribute with the rule text, and the rule is also
  rendered inline in amber-700 ("When required: …") so the user
  sees it without needing to hover.
- Updated the page subtitle to explain Required vs Conditional
  and point at the inline rule text.

Verified:
- python3 -m pytest tests/ — 133 passed
- npx tsc --noEmit in apps/web — no errors
- Metadata completeness: every COUNSELING_CONDITIONAL field has a
  conditional_rule entry.
Resolves UX_REVIEW.md §3.8 and §9.2.

Re-upload comparison drilldown (§3.8):
- Previously, after a re-upload the results page showed three count
  cards (Resolved / New / Persistent) and nothing else. To see
  *which* issues had resolved, users had to compare the combined
  issues table against memory of the previous upload. The whole
  point of re-upload is verification, so that was a bad gap.
- Replaced the three cards with a ComparisonDrilldown tablist.
  Each tab renders as a card with icon + label + count and links
  to ?compare=resolved|new|persistent. The active tab's filtered
  issue list is rendered below the tabs using the existing
  IssueTable component.
- Default tab is "new" — those are the issues the user most likely
  cares about after a re-upload. Empty states per tab: "No issues
  were resolved by this re-upload.", "No new issues were
  introduced — nice.", "No issues carried over from the previous
  upload."
- Uses Link navigation + search params (no client state) so the
  results page stays a server component.

Audit details human-readable renderer (§9.2):
- New formatAuditMetadata(action, metadata) helper maps the known
  action types to plain-language summaries:
    upload                -> "<fileName> (<size> KB)"
    conversion_started    -> "Conversion started"
    conversion_complete   -> "<successful>/<total> successful, <errors> errors"
    conversion_failed     -> "Failed: <error>"
    conversion_cancelled  -> "Cancelled by user"
    download              -> "XML downloaded"
- New formatActionLabel() capitalizes the action column consistently.
- Details cell renders the summary plus a <details>/<summary>
  "View raw" toggle that shows pretty-printed JSON inline — no
  extra JS required.
- Row cells now use align-top so long details don't vertically
  center and crowd the neighbors.
- Added the "conversion_cancelled" option to the filter dropdown
  (it was missing from Phase 3 when cancel landed).
- Consistent m-dashes ("—") for empty cells instead of hyphens.

Verified with `npx tsc --noEmit` in apps/web — no errors.
Resolves UX_REVIEW.md §1.2. The five-step conversion flow
(Upload → Preview → Map → Convert → Results) previously had no
visible progress cue. Users in the middle of a multi-step
conversion lost their place and had to lean on the URL bar or
the Nav's "Convert" link (which doesn't distinguish between
steps and lit up on every /convert/* sub-route).

- New client component apps/web/src/components/ui/step-indicator.tsx
  derives the active step from usePathname() (with regex matches for
  each sub-route) and renders a horizontal strip of 5 steps.
- Circles: current step blue, past steps green with a check
  StatusIcon, future steps gray with their number. Labels are
  color-coded to match.
- Past steps are linked (back-navigation only); the current and
  future steps are non-interactive spans. /convert/[id]/reupload
  lands back on the Upload step.
- Mobile: horizontal overflow with gap-1 at text-xs; sm+ uses
  gap-2 at text-sm.
- ARIA: nav with aria-label="Conversion steps", each li carries
  aria-current="step" when active.
- Mounted via a new apps/web/src/app/convert/layout.tsx that
  wraps both the plain /convert upload page and the
  /convert/[jobId]/* sub-pages.

Verified with `npx tsc --noEmit` in apps/web — no errors.
Resolves UX_REVIEW.md §1.3. The app had no in-product help
surface — partners with questions had nowhere to go inside the
web UI, and the repo README only documents the Python CLI.

- New apps/web/src/app/help/page.tsx — self-contained static
  help content with 8 sections, each anchored for deep-linking:
    #overview         What this tool does
    #converter-types  Pick the right converter (pulls from
                      CONVERTER_TYPES metadata so it auto-updates)
    #upload           Step-by-step upload flow
    #mapping          Badge meanings and mapping walkthrough
    #results          How to read the results page
    #cancel           Cancelling a running conversion
    #errors           Common error messages with fixes
    #support          Where to file issues / contact
- Includes a local table of contents at the top and per-section
  heading anchors (scroll-mt-16 so the sticky nav doesn't cover
  the headline).
- Links to the UX review, technical debt register, and GitHub
  issues for known problems and bug reports.
- Nav gains a "Help" link after Audit Trail. Since the mobile
  hamburger sheet already iterates the links array, Help shows
  up automatically in both desktop and mobile layouts.

Verified with `npx tsc --noEmit` in apps/web — no errors.
Resolves UX_REVIEW.md §8.1 (partial), §8.2, §8.5.

The app had ~8 hand-rolled button variants, ~6 hand-rolled alert
variants, and a dashboard-local StatusBadge that no other page
could use. Every new page drifted further from any baseline. This
commit extracts the four most duplicated primitives and refactors
the heavy consumers to use them.

New components under apps/web/src/components/ui/:
- button.tsx — Button with variant (primary/secondary/destructive),
  size (sm/md), isLoading (built-in Spinner + aria-busy), and
  fullWidth props. Forwards refs so it composes with Link-as-child
  later if needed.
- alert.tsx — Alert with variant (error/warning/info/success) and
  optional title. Errors get role=alert; others get role=status.
  StatusIcon baked in, matches Phase 4's icon+text contract.
- card.tsx — thin "white bg, gray border, rounded, padding" wrapper.
- status-badge.tsx — extracted from dashboard/page.tsx. Same
  STATUS_META map driving label + icon + color per status, with
  data-status preserved for tests. Resolves §8.2.

Refactored pages to consume the primitives:
- login, signup — Button + Alert
- convert (upload) — Button + Alert (kept Skeleton + ConverterTypes)
- preview — Button + Alert (error card now uses Alert title pattern)
- mapping — Button + Alert + dirty-state Cancel uses secondary
  variant
- reupload — Button + Alert
- progress — Button + Alert (cancel button, timeout recovery,
  poll-failure banner, cancel error)
- dashboard — imports StatusBadge from ui/, local copy deleted
- error-boundary — Button for Try again (Link kept as-is for
  back-navigation)

Net diff: -251 lines across the refactored pages, +267 in the
new primitives. Every page that previously had `bg-blue-600
text-white rounded py-2 text-sm font-medium hover:bg-blue-700
disabled:opacity-50` now has `<Button>`.

§8.1 is marked partially resolved because some utility-class
repetition still exists in less common sites (dense tables,
bespoke landing-page layouts) and a lint rule to prevent
regression is deferred.

Verified with `npx tsc --noEmit` in apps/web — no errors.
Phase 6 (3/3) extracted the component library but noted that
"§8.1 is partial — a lint rule to prevent regression is deferred."
This commit closes that gap and marks §8.1 fully resolved.

- New apps/web/scripts/check-ui-classes.mjs: zero-dependency
  Node script that walks apps/web/src, skips node_modules/.next,
  skips components/ui/ (the intentional home of the primitives),
  and matches two regressions:
    * raw primary button:  bg-blue-600 + text-white on the same
      element. Hint points at <Button>.
    * raw error alert:     bg-red-50 + border-red-* or text-red-*
      on the same line. Hint points at <Alert variant="error">.
  Exits 1 with file:line + rule + hint on any violation.
- File-level exemption for src/components/toast.tsx (the toast
  variant map is the intentional source of truth for runtime
  notification styling).
- Line-level exemptions for three false positives: the preview
  column-status card for "Missing" (not an alert), the cleaning
  diff table's red "before" cell (not an alert), and the landing
  page's numbered step circle in "How it works" (not a button).
  Each exemption has an inline comment explaining why.
- Wired into `npm run lint` (which CI runs via
  .github/workflows/ci.yml) so regressions fail the build.
- Added an `lg` size and a `buttonClasses()` helper to
  components/ui/button.tsx so <Link>-as-button sites can share
  the same variant/size source of truth without hand-rolling
  classes. Refactored the three real violations caught by the
  initial scan:
    * apps/web/src/app/page.tsx        — Sign In and Create
      Account link buttons
    * apps/web/src/app/dashboard/page.tsx — "New Conversion" and
      "Start a new conversion" CTAs

Verified:
- `npm run lint` passes end-to-end (tsc + check-ui-classes)
- Regression smoke test: dropped a bad <button className="bg-blue-600
  text-white"> in a scratch file, ran the check, got the expected
  violation report; scratch file deleted.
Closes the row-level ETA gap that was deferred during Phase 3.
Previously the progress page showed a dead 0% bar for the entire
conversion because the web layer had no way to see what the worker
was doing mid-run; the elapsed-time counter was the only live
feedback. This commit threads a progress signal end-to-end from
the converter row loops to the progress page, and computes a
rate-based ETA in the browser.

Converter layer (src/converters/):
- BaseConverter gains an optional `progress_callback: (processed,
  total) -> None`. Default is None so the CLI path is unchanged.
- `_report_progress(processed, total)` is a no-op when the callback
  is not set; it swallows callback exceptions so progress errors
  can't break a conversion.
- `_maybe_report_progress(processed, total, every=25)` debounces
  reporting to roughly once every 25 rows so we don't hammer the
  registry on tiny files, and always reports the terminal count.
- CounselingConverter.convert() emits an initial 0-tick, calls
  _maybe_report_progress at the end of every row iteration, and
  a final total-tick. TrainingClientConverter inherits for free.
- TrainingConverter.convert() reports progress in event groups
  (every=5) since training aggregates per event rather than per
  row.

Worker layer (apps/worker/):
- New apps/worker/app/services/progress.py: thread-safe
  ProgressRegistry (ProgressSnapshot dataclass with
  processed/total/updated_at). Same design pattern as
  cancellation.py — single-process singleton, caller clears on
  job completion.
- conversion_service.run_conversion accepts an on_progress
  callable, installs it as converter.progress_callback, and
  threads it through the asyncio.to_thread boundary.
- The /convert route builds a closure that writes into
  progress_registry keyed by job_id, passes it to run_conversion,
  and clears the snapshot in the existing finally block alongside
  the cancellation flag.
- New GET /convert/{job_id}/progress route returns the current
  snapshot (processed/total/updated_at) or 404 if no snapshot is
  recorded (job finished or not started).

Web layer (apps/web/):
- /api/jobs/[jobId] GET now fetches the worker progress snapshot
  for jobs in status=converting and merges it into the response as
  processedRows / totalRows / progressUpdatedAt. Worker 404s and
  network errors are swallowed — the UI falls back to its
  elapsed-time counter.
- Progress page formats rows as "X of Y records · Running for Nm
  Ns" and adds a rate-based ETA once >= 5% of the work is done
  ("about 3 minutes remaining"). The 5% floor suppresses the
  noisy early estimates. ETA format buckets into "a few seconds",
  "about 30 seconds", "about a minute", "about N minutes".

Verified:
- python3 -m pytest tests/ — 133 passed
- npm run lint (tsc --noEmit + check-ui-classes) — clean
The previous commit put buttonClasses() and the VARIANT/SIZE tokens
in components/ui/button.tsx, which is marked "use client" because
Button uses forwardRef/event handlers. Server components (the
landing page and the dashboard) imported buttonClasses from there
and called it during SSR — Next.js 16 rightly rejected that at
build time:

  Error: Attempted to call buttonClasses() from the server but
  buttonClasses is on the client. It's not possible to invoke a
  client function from the server, it can only be rendered as a
  Component or passed to props of a Client Component.

Fix: split the pure class-string helpers into a non-client module
that both halves can import from.

- New apps/web/src/components/ui/button-classes.ts (no "use
  client"): exports buttonClasses(), ButtonVariant, ButtonSize,
  and the BUTTON_VARIANT_CLASSES / BUTTON_SIZE_CLASSES constants.
  This is the source of truth for the tokens.
- button.tsx keeps its "use client" directive, imports from
  ./button-classes, and uses buttonClasses() internally so the
  Button component and Link-as-button sites share one definition.
- apps/web/src/app/page.tsx and dashboard/page.tsx (both server
  components) now import buttonClasses from
  @/components/ui/button-classes instead of @/components/ui/button.

Verified: `npm run build` finishes cleanly (13/13 static pages,
23 routes compiled). The Turbopack NFT warning about next.config.ts
in the download route trace is pre-existing — it comes from the
dynamic readFile(job.outputFilePath) in download/route.ts, not
from this change. The ioredis ECONNREFUSED noise during
prerender is also pre-existing and harmless (no Redis available
at build time).

The regression guard in scripts/check-ui-classes.mjs already
exempts components/ui/ as a directory, so the raw utility-class
strings in button-classes.ts don't trip the check. `npm run lint`
passes end-to-end.
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly expands the project by adding a full-stack web application and a FastAPI-based worker service, enabling browser-based file uploads, column mapping, and conversion history. The changes include Docker configuration, comprehensive UX documentation, and enhanced Python converters. Feedback highlights critical areas for improvement, specifically addressing security risks associated with client-side status updates and potential race conditions during job state transitions. Recommendations also include adopting a more robust background task management system and refining the frontend ETA logic to handle page refreshes correctly.

}

// Whitelist fields that clients are allowed to update
const allowedFields = ["columnMapping", "status"] as const;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

Allowing the client to update the status field directly via a PATCH request is a security risk. A malicious user could manually set a job's status to complete or error without the conversion actually taking place. Status transitions should be strictly controlled by the server-side logic (e.g., in the /start or /cancel routes). If the client needs to signal a state change (like mapping), it should be validated against a set of allowed transitions.

Comment on lines +45 to +48
await prisma.job.update({
where: { id: jobId },
data: { status: "cancelled", completedAt: new Date() },
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using prisma.job.update here introduces a race condition. If the conversion worker finishes and updates the status to complete between the check at line 38 and the update at line 45, the complete status will be overwritten by cancelled. Use updateMany with a status check in the where clause to ensure the update only occurs if the job is still in a cancellable state.

    await prisma.job.updateMany({
      where: {
        id: jobId,
        status: { in: ["converting", "uploaded", "previewed", "mapping"] }
      },
      data: { status: "cancelled", completedAt: new Date() },
    });

Comment on lines +28 to +30
await prisma.job.update({
where: { id: jobId },
data: { status: "converting" },
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the cancellation logic, using an unconditional update to set the status to converting can lead to race conditions or unintended re-runs of already completed jobs. It is safer to use updateMany and verify that the current status is one that allows starting a conversion (e.g., uploaded, previewed, or error).

    const updated = await prisma.job.updateMany({
      where: {
        id: jobId,
        status: { in: ["uploaded", "previewed", "mapping", "error"] }
      },
      data: { status: "converting" },
    });

    if (updated.count === 0) {
      return NextResponse.json({ error: "Job cannot be started in its current state" }, { status: 400 });
    }

file_content: fileContent,
}),
})
.then(async (result) => {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The background task started here via .then() is not robust. In a standalone Node.js server, this may continue to run, but if the process crashes or restarts (e.g., during a deployment), the task is lost and the job will remain stuck in the converting state indefinitely. For production reliability, consider using a persistent job queue (like BullMQ or a database-backed worker) to handle long-running conversions.

// work is done, so the first noisy ticks don't produce wild
// estimates. The rate is rows/ms; remaining is total-processed.
let etaText = "";
if (isConverting && total > 0 && processed > 0 && elapsedMs > 2000) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The ETA calculation processed / elapsedMs assumes the conversion started exactly when the user loaded the page. If the user refreshes the page while a conversion is at 90%, elapsedMs resets to 0, causing the rate to be artificially high and the ETA to be wildly inaccurate. The calculation should ideally use a persistent start timestamp from the server (e.g., the job's updatedAt when it first transitioned to converting).

claude added 2 commits April 11, 2026 20:50
Two pre-existing issues made `next build` output noisy:

1. Redis was eagerly instantiated at module-load time, so every
   build-time route trace that transitively imported rate-limit.ts
   tried to open a TCP connection to redis://localhost:6379 and
   dumped `ECONNREFUSED` across the build output. The connection
   failures didn't break the build but made it hard to spot real
   errors.

2. Turbopack's NFT walker traced download/route.ts, saw two
   `realpath()` calls with dynamic arguments (job.outputFilePath
   from the DB), gave up statically resolving the paths, and
   flagged `next.config.ts` as potentially needed by the route.
   The result was a prominent "Encountered unexpected file in NFT
   list" warning on every build.

Fix (1) — lazy Redis:
- redis.ts now exposes `getRedis()` instead of a module-level
  singleton. The ioredis client is created on first call with
  `lazyConnect: true`, so merely importing the module (as Next.js
  does during prerender tracing) no longer opens a socket. Added
  a retry strategy that gives up after 3 attempts and a retry
  cap so a dead Redis can't hang a request for minutes.
- rate-limit.ts now wraps the Redis calls in try/catch and fails
  **open** if Redis is unreachable. The alternative — failing
  closed on any Redis error — would 429 every request during a
  Redis outage, which is worse than a brief rate-limit gap.
  Build-time tracing, dev without a container, and transient
  outages all fall through cleanly.

Fix (2) — turbopackIgnore on download route:
- download/route.ts gains `/* turbopackIgnore: true */` comments
  on the three dynamic filesystem calls (`realpath(outputFilePath)`,
  `realpath(DATA_DIR)`, `readFile(resolvedPath)`). The comments
  tell the NFT walker not to try to statically resolve these
  arguments — they're always runtime values scoped to DATA_DIR
  per the existing path-traversal check. No functional change.

Verified: `npm run build` now produces a clean output — no
Redis noise, no NFT warning, all 13 static pages prerendered,
all 23 routes compiled.
… terminal

Per PR review on cancel/route.ts:38-39. The previous commit let
the cancel endpoint mark uploaded/previewed/mapping jobs as
cancelled, but the /start and /preview routes wrote their own
statuses unconditionally, so a stale preview tab or a later start
call could revive a "cancelled" job and produce contradictory
status history plus duplicate audit entries.

Two-part fix: restrict cancellation to actively converting jobs
AND make the other write paths refuse to revive terminal states.

cancel/route.ts:
- Now rejects (idempotent 200 with current status) any job that
  isn't in status "converting". The Cancel button in the UI only
  exists while status === "converting" (see progress/page.tsx)
  so this matches the existing UX contract.
- The status flip uses `updateMany` with a `status: "converting"`
  where clause so a late-arriving worker completion can't lose
  to the cancel — if it does, the route reads back the current
  state and reports that instead of forcing "cancelled".

start/route.ts:
- Added a STARTABLE_STATUSES allowlist ("uploaded", "previewed",
  "mapping"). A job in any other state gets a 409 with a message
  telling the user to re-upload. This blocks stale tabs from
  starting a cancelled/complete/error/converting job.
- The initial status transition is now a conditional updateMany
  (where status IN STARTABLE_STATUSES) so the pre-flight
  read-then-write race is closed: if the user cancels between
  findFirst and the update, the update affects 0 rows and we
  return 409 instead of reviving the cancellation.

preview/route.ts:
- The status update to "previewed" is now conditional with a
  `notIn: ["cancelled", "complete", "error", "converting"]` guard.
  totalRows is still persisted unconditionally so the dashboard
  row count stays accurate on historical jobs, but the status
  transition only applies to pre-conversion states. A preview
  fetch from a stale tab of a cancelled job no longer revives it.

route.ts PATCH:
- New TERMINAL_STATUSES set ("cancelled", "complete", "error").
  The PATCH now refuses updates to jobs in any of those states
  with a 409 and an explanatory message.
- The update uses `updateMany` with a `status notIn TERMINAL_STATUSES`
  guard to close the read-then-write race (cancel landing between
  findFirst and the update). On race-loss, the route reads back
  the current status and reports it.
- Reads back the fresh row via findUnique so the 200 response
  still matches the previous shape (was: return value of
  prisma.job.update).

Verified:
- `npm run lint` (tsc --noEmit + check-ui-classes): clean
- `npm run build`: clean, 13/13 static pages, 23 routes compiled
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants