Skip to content

Security audit: hardening, perf, dep pinning, tests, CI, and refactors#1

Open
MsShawnP wants to merge 8 commits into
mainfrom
claude/security-code-audit-6k0Fg
Open

Security audit: hardening, perf, dep pinning, tests, CI, and refactors#1
MsShawnP wants to merge 8 commits into
mainfrom
claude/security-code-audit-6k0Fg

Conversation

@MsShawnP

Copy link
Copy Markdown
Owner

Comprehensive sweep across security, dependencies, performance, code quality, tests, docs, and CI — followed by the two larger refactors flagged in the audit. 70 pytest tests pass; the app boots and renders end-to-end via streamlit.testing.v1.AppTest.

Summary by area

Project hygiene (74eb71d)

  • Add MIT LICENSE file (README claimed MIT but no file existed)
  • Pin upper bounds on all runtime deps; raise streamlit floor to 1.54.0 to close CVE-2026-33682 (SSRF/NTLM on Windows) and the 2024 path-traversal advisory; cap pandas <3.0 to avoid the breaking CoW/dtype change
  • Declare pytest in requirements-dev.txt (was undeclared)
  • Replace YOUR_USERNAME / YOUR_EMAIL placeholders in README; add Python 3.9+ note and a "Run tests" section

Security hardening (c8dba7d)

  • csv_report.py: prefix any cell starting with =, +, -, @, \t, or \r with a leading apostrophe so spreadsheet apps don't evaluate it as a formula if the CSV is shared
  • pdf_report.py: escape user-controlled strings (company_name, raw_input, issue.message, issue.recommendation, gtin_type.value) before passing them to ReportLab Paragraph — ReportLab parses inline XML markup, so unescaped < / > / & would corrupt rendering
  • app.py: 50,000-row hard cap on pasted/uploaded batches; narrow CSV-read except chain to known parser/encoding errors before falling back to a generic handler

Performance (2ca4a17)

  • validate_batch: build a {cleaned: [row_numbers]} map once instead of re-scanning the full result list per duplicate (was O(n²))
  • generate_retailer_checklists: hoist batch-level slices (invalid, dups, prefix mismatches, has_case) out of the per-retailer loop so they're computed once across all six profiles
  • check_data_completeness: vectorize the "non-empty after strip" count via str.strip().astype(bool).sum() instead of a per-row lambda
  • app.py: cache CSV/PDF in session_state and re-run PDF generation only when the company name changes — both reports were regenerating on every Streamlit rerun

Robustness & cleanup (85ca2fb)

  • validate_single_gtin accepts None / pandas NaN / numeric inputs without crashing on .strip()
  • Drop the unused gtin14_format key from all six retailer profiles (was set in 6 places but never read)
  • Update the PREFIX_MISMATCH message to flag that the prefix slice is heuristic
  • Replace stale "dark mode toggle" comments with accurate language

Bug fix + tests (c6bc7f3)

  • Prefix slice bug: a properly matched unit/case pair (614141000012 / 10614141000019) reported different prefixes and incorrectly tripped PREFIX_MISMATCH because GTIN-12 took cleaned[:7] while GTIN-14 took cleaned[1:8]. Zero-pad GTIN-12 onto the GTIN-13 frame before slicing
  • 30 new tests covering previously-untested public API: check_data_completeness, generate_retailer_checklists, calculate_readiness_score, estimate_cost_of_inaction, CSV/PDF report generators (including formula-injection neutralization and ReportLab markup escaping), GTIN-8 happy path, INDICATOR_ZERO, None/NaN input handling, and a sample-data regression guard
  • Two existing tests updated to reflect the corrected prefix slice

CI (7d0df37)

  • .github/workflows/ci.yml runs pytest on Python 3.9 / 3.10 / 3.11 / 3.12 plus a pip-audit job against requirements.txt

Refactor: pdf_report.pyPDFReportBuilder class (af16733)

  • Replace ~550-line generate_pdf_report god function with PDFReportBuilder whose _render_* methods correspond to report sections
  • Lift the 8 ParagraphStyle definitions into a module-level _build_styles() factory
  • Promote layout heuristics to named constants (PAGE_CONTENT_HEIGHT, ITEM_HEADER_PT, ISSUE_BLOCK_PT, GROUP_HEADER_PT, MULTI_GROUP_HEADER_PT, FAILING_ROW_PT, FAILING_HEADER_PT)
  • Convert the 5 nested closures into instance methods
  • Hoist CRITICAL / WARNING / INFO label dicts to module-level constants
  • Public API unchanged: generate_pdf_report(validation_data, company_name="") -> BytesIO

Refactor: app.py monolith → ui/ package (a2da009)

app.py shrinks 947 → ~190 lines. The substantive UI now lives in cohesive modules under ui/:

  • ui/styles.css + ui/styles.py: the ~190 lines of inline CSS now live in a static .css file loaded by inject_css()
  • ui/state.py: session-state keys are constants instead of magic strings; reset_session() only clears keys this UI owns rather than blindly wiping every entry; invalidate_report_caches() centralizes CSV/PDF cache eviction
  • ui/input_section.py: paste / CSV upload / sample-data picker, returns typed (gtins, dataframe) tuple
  • ui/results.py: score card, summary stats, download buttons, four validation-result tabs
  • ui/deep_analysis.py: six deep-analysis tabs

app.py keeps responsibility only for: page config, CSS injection, header, sidebar, input persistence across reruns, validate / reset buttons, the no-data explainer, and the share/security footer.

Test plan

  • pytest tests.py -v — 70 passed, 0 failed
  • python -c "import app" — succeeds with no errors
  • streamlit run app.py — boots, serves HTTP 200
  • streamlit.testing.v1.AppTest — exercises sample-data → Validate flow end-to-end with no exception; all 10 tabs (4 results + 6 deep analysis) render
  • Manual visual QA in browser — recommend before merge to confirm CSS extraction did not regress styling
  • CI workflow runs on push (added in this PR; first run will be against this PR)

Commits

74eb71d  Add LICENSE and harden project hygiene
c8dba7d  Harden CSV/PDF generation against injection from user input
2ca4a17  Eliminate O(n^2) hotspots and cache derived reports
85ca2fb  Robustness and cleanup pass
c6bc7f3  Normalize prefix slice across GTIN types and add 30 tests
7d0df37  Add CI workflow: pytest matrix + pip-audit
af16733  Refactor pdf_report into PDFReportBuilder class
a2da009  Decompose app.py monolith into ui/ package

Generated by Claude Code

claude added 8 commits May 14, 2026 04:27
- Add MIT LICENSE file (README claimed MIT but the file was missing)
- Pin upper bounds on all runtime deps; raise streamlit floor to 1.54.0
  to close CVE-2026-33682 (SSRF/NTLM on Windows) and the 2024 path-
  traversal advisory; cap pandas <3.0 to avoid the CoW/dtype break
- Declare pytest in requirements-dev.txt (previously undeclared)
- Replace YOUR_USERNAME / YOUR_EMAIL placeholders in README with real
  values; add Python 3.9+ requirement and a "Run tests" section
- csv_report: prefix any cell that begins with =, +, -, @, tab, or CR
  with a leading apostrophe so spreadsheet apps don't evaluate it as a
  formula if the resulting CSV is shared with a third party
- pdf_report: escape user-controlled strings (company_name, raw_input,
  issue.message, issue.recommendation, gtin_type.value) before passing
  them to ReportLab Paragraph. ReportLab parses inline XML markup, so
  unescaped < / > / & would corrupt rendering or inject unintended
  styling; the prior try/except in app.py masked these failures
- app.py: cap pasted/uploaded batches at 50k GTINs and surface a clear
  error so a runaway input doesn't block the Streamlit main thread
- app.py: narrow the CSV-read except chain to known parser/encoding
  errors before falling back to a generic handler
- validate_batch: build a {cleaned: [row_numbers]} map once instead of
  re-scanning the full result list per duplicate (was O(n^2) when many
  duplicates existed)
- generate_retailer_checklists: hoist the batch-level slices (invalid,
  duplicates, prefix mismatches, has_case) out of the per-retailer loop
  so they're computed once across all six profiles instead of 6+ times
- check_data_completeness: vectorize the "non-empty after strip" count
  via str accessor + astype(bool).sum() instead of a per-row lambda
- app.py: cache the generated CSV and PDF in session_state, invalidate
  on a new validate run, and re-run PDF generation only when the
  company name changes — previously both reports were regenerated on
  every Streamlit rerun (e.g. toggling a selectbox)
- validate_single_gtin: accept non-string inputs (None, pandas NaN,
  numeric types) by coercing them to "" rather than crashing on
  .strip(). The CSV upload path drops NaN today, but defensive
  handling keeps the function safe for direct callers and future
  callsites that pass raw cells.
- Drop the unused gtin14_format key from every RETAILER_PROFILES entry
  — it was set in six places but never read anywhere.
- Update the PREFIX_MISMATCH message to explicitly flag that the
  prefix slice is heuristic (first 7 digits) since actual GS1 company
  prefix lengths range 7-10 digits — previously users saw a hard
  comparison without that caveat.
- Replace stale "dark mode toggle" comments in app.py with accurate
  language describing the rerun-survival pattern.
Prefix fix: the heuristic company-prefix slice took the first 7 digits
of a GTIN-12 directly but skipped the indicator digit on a GTIN-14, so
a correctly paired unit/case pair (614141000012 / 10614141000019)
reported different prefixes and incorrectly tripped PREFIX_MISMATCH.
Zero-pad GTIN-12 onto the GTIN-13 frame before slicing so unit and case
agree. Two existing tests updated to reflect the corrected slice.

New tests (30 total) covering previously-untested public API:
- TestSingleValidationEdgeCases: GTIN-8 happy path, INDICATOR_ZERO
  branch, None / NaN / numeric input coercion, full sibling-row list
  in the DUPLICATE message
- TestReadinessScore: empty input, F-grade on all-critical, hierarchy
  bonus monotonicity, mixed-warning grade landscape
- TestCostEstimate: empty input, low- vs high-SKU growth note,
  rework_cost = rework_hours * $50, range monotonicity
- TestRetailerChecklists: profile coverage, per-retailer check-digit
  propagation, clean-data passes Walmart end to end, hierarchy check
  appears iff the profile requires it
- TestDataCompleteness: empty DataFrame, column pattern matching,
  completeness percentage math, retailer gap flagging
- TestCSVReport: header + row count, formula-injection neutralization
  via leading apostrophe, clean-input rendering
- TestPDFReport: typical batch builds a real %PDF-, missing company
  name path, markup-injection escape on company_name
- TestSampleData: regression guard so SAMPLE_DATA stays exercised
- test job runs pytest on the supported Python versions (3.9-3.12) so
  regressions surface on every PR and push to main
- audit job runs pip-audit against requirements.txt so CVEs in our
  declared dependency floors are caught at PR time (we just raised the
  streamlit floor to 1.54.0 for CVE-2026-33682; this keeps us honest)
Replaces the ~550-line generate_pdf_report god function with a small
class whose _render_* methods correspond to the report sections.
Behaviour is preserved end-to-end (existing PDF smoke + injection
tests pass unchanged); the public entrypoint generate_pdf_report still
returns a BytesIO and remains the only thing app.py imports.

Specific changes:
- Lift the eight ParagraphStyle definitions out of the per-call body
  into a module-level _build_styles() factory keyed by short names
  (title / subtitle / heading / body / small / score / grade /
  company_name) so the styles are constructed once per report.
- Promote layout heuristics to named module constants
  (PAGE_CONTENT_HEIGHT, ITEM_HEADER_PT, ISSUE_BLOCK_PT,
  GROUP_HEADER_PT, MULTI_GROUP_HEADER_PT, FAILING_ROW_PT,
  FAILING_HEADER_PT) instead of inline magic numbers — the
  KeepTogether-vs-chunk decision logic is now self-documenting.
- Convert the five nested closures (render_item_flowables,
  render_item_block, estimate_item_height, render_group_with_continuation,
  render_multi_issue_group) into instance methods so they share state
  and are individually testable.
- Hoist the CRITICAL/WARNING/INFO label dicts to module-level constants
  rather than redefining them inside the function on every call.
- Split the per-section rendering into _render_title, _render_score,
  _render_summary, _render_cost, _render_retailer_checklists (with a
  _render_failing_gtins_for_retailer extraction), _render_item_detail
  (which delegates to _render_critical_section,
  _render_warning_section, _render_info_section, _render_clean_summary),
  and _render_footer.
app.py shrinks from 947 lines to ~190 (most of which is the no-data
explainer copy). The substantive UI now lives in cohesive modules
under ui/, each one responsible for a single concern.

Verification:
- All 70 pytest tests still pass.
- python -c 'import app' succeeds with no errors.
- streamlit run app.py boots and serves HTTP 200.
- streamlit AppTest renders the page and exercises the full sample-
  data + Validate flow without raising any exception; all 10 tabs
  (4 results + 6 deep analysis) render.

New modules:
- ui/styles.css + ui/styles.py: the ~190 lines of inline CSS now live
  in a static .css file loaded by inject_css(); editing the look no
  longer requires touching Python.
- ui/state.py: session_state keys are constants instead of magic
  strings. reset_session() only clears keys this UI owns rather than
  blindly wiping every session_state entry (the prior behaviour could
  drop unrelated Streamlit-internal state). invalidate_report_caches()
  centralizes the CSV/PDF cache eviction that used to be inlined.
- ui/input_section.py: renders the paste / CSV upload / sample-data
  picker and returns a typed (gtins, dataframe) tuple. Unchanged
  behaviour: 50k batch cap, GTIN column auto-detection, narrow CSV
  parser exception handling.
- ui/results.py: renders the score card, summary stats, download
  buttons (with their CSV/PDF caching), and the four validation-result
  tabs (Issues by Severity / Full Item Detail / Check Digit
  Corrections / Packaging Hierarchy).
- ui/deep_analysis.py: renders the six deep-analysis tabs (Executive
  Summary / Fix Plan / Retailer Readiness / Cost of Inaction /
  Case GTIN Generator / Data Completeness).

app.py keeps responsibility only for: page config, CSS injection,
header, sidebar, input persistence across reruns, the validate /
reset buttons, the no-data explainer, and the share/security footer.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants