Security audit: hardening, perf, dep pinning, tests, CI, and refactors#1
Open
MsShawnP wants to merge 8 commits into
Open
Security audit: hardening, perf, dep pinning, tests, CI, and refactors#1MsShawnP wants to merge 8 commits into
MsShawnP wants to merge 8 commits into
Conversation
- Add MIT LICENSE file (README claimed MIT but the file was missing) - Pin upper bounds on all runtime deps; raise streamlit floor to 1.54.0 to close CVE-2026-33682 (SSRF/NTLM on Windows) and the 2024 path- traversal advisory; cap pandas <3.0 to avoid the CoW/dtype break - Declare pytest in requirements-dev.txt (previously undeclared) - Replace YOUR_USERNAME / YOUR_EMAIL placeholders in README with real values; add Python 3.9+ requirement and a "Run tests" section
- csv_report: prefix any cell that begins with =, +, -, @, tab, or CR with a leading apostrophe so spreadsheet apps don't evaluate it as a formula if the resulting CSV is shared with a third party - pdf_report: escape user-controlled strings (company_name, raw_input, issue.message, issue.recommendation, gtin_type.value) before passing them to ReportLab Paragraph. ReportLab parses inline XML markup, so unescaped < / > / & would corrupt rendering or inject unintended styling; the prior try/except in app.py masked these failures - app.py: cap pasted/uploaded batches at 50k GTINs and surface a clear error so a runaway input doesn't block the Streamlit main thread - app.py: narrow the CSV-read except chain to known parser/encoding errors before falling back to a generic handler
- validate_batch: build a {cleaned: [row_numbers]} map once instead of
re-scanning the full result list per duplicate (was O(n^2) when many
duplicates existed)
- generate_retailer_checklists: hoist the batch-level slices (invalid,
duplicates, prefix mismatches, has_case) out of the per-retailer loop
so they're computed once across all six profiles instead of 6+ times
- check_data_completeness: vectorize the "non-empty after strip" count
via str accessor + astype(bool).sum() instead of a per-row lambda
- app.py: cache the generated CSV and PDF in session_state, invalidate
on a new validate run, and re-run PDF generation only when the
company name changes — previously both reports were regenerated on
every Streamlit rerun (e.g. toggling a selectbox)
- validate_single_gtin: accept non-string inputs (None, pandas NaN, numeric types) by coercing them to "" rather than crashing on .strip(). The CSV upload path drops NaN today, but defensive handling keeps the function safe for direct callers and future callsites that pass raw cells. - Drop the unused gtin14_format key from every RETAILER_PROFILES entry — it was set in six places but never read anywhere. - Update the PREFIX_MISMATCH message to explicitly flag that the prefix slice is heuristic (first 7 digits) since actual GS1 company prefix lengths range 7-10 digits — previously users saw a hard comparison without that caveat. - Replace stale "dark mode toggle" comments in app.py with accurate language describing the rerun-survival pattern.
Prefix fix: the heuristic company-prefix slice took the first 7 digits of a GTIN-12 directly but skipped the indicator digit on a GTIN-14, so a correctly paired unit/case pair (614141000012 / 10614141000019) reported different prefixes and incorrectly tripped PREFIX_MISMATCH. Zero-pad GTIN-12 onto the GTIN-13 frame before slicing so unit and case agree. Two existing tests updated to reflect the corrected slice. New tests (30 total) covering previously-untested public API: - TestSingleValidationEdgeCases: GTIN-8 happy path, INDICATOR_ZERO branch, None / NaN / numeric input coercion, full sibling-row list in the DUPLICATE message - TestReadinessScore: empty input, F-grade on all-critical, hierarchy bonus monotonicity, mixed-warning grade landscape - TestCostEstimate: empty input, low- vs high-SKU growth note, rework_cost = rework_hours * $50, range monotonicity - TestRetailerChecklists: profile coverage, per-retailer check-digit propagation, clean-data passes Walmart end to end, hierarchy check appears iff the profile requires it - TestDataCompleteness: empty DataFrame, column pattern matching, completeness percentage math, retailer gap flagging - TestCSVReport: header + row count, formula-injection neutralization via leading apostrophe, clean-input rendering - TestPDFReport: typical batch builds a real %PDF-, missing company name path, markup-injection escape on company_name - TestSampleData: regression guard so SAMPLE_DATA stays exercised
- test job runs pytest on the supported Python versions (3.9-3.12) so regressions surface on every PR and push to main - audit job runs pip-audit against requirements.txt so CVEs in our declared dependency floors are caught at PR time (we just raised the streamlit floor to 1.54.0 for CVE-2026-33682; this keeps us honest)
Replaces the ~550-line generate_pdf_report god function with a small class whose _render_* methods correspond to the report sections. Behaviour is preserved end-to-end (existing PDF smoke + injection tests pass unchanged); the public entrypoint generate_pdf_report still returns a BytesIO and remains the only thing app.py imports. Specific changes: - Lift the eight ParagraphStyle definitions out of the per-call body into a module-level _build_styles() factory keyed by short names (title / subtitle / heading / body / small / score / grade / company_name) so the styles are constructed once per report. - Promote layout heuristics to named module constants (PAGE_CONTENT_HEIGHT, ITEM_HEADER_PT, ISSUE_BLOCK_PT, GROUP_HEADER_PT, MULTI_GROUP_HEADER_PT, FAILING_ROW_PT, FAILING_HEADER_PT) instead of inline magic numbers — the KeepTogether-vs-chunk decision logic is now self-documenting. - Convert the five nested closures (render_item_flowables, render_item_block, estimate_item_height, render_group_with_continuation, render_multi_issue_group) into instance methods so they share state and are individually testable. - Hoist the CRITICAL/WARNING/INFO label dicts to module-level constants rather than redefining them inside the function on every call. - Split the per-section rendering into _render_title, _render_score, _render_summary, _render_cost, _render_retailer_checklists (with a _render_failing_gtins_for_retailer extraction), _render_item_detail (which delegates to _render_critical_section, _render_warning_section, _render_info_section, _render_clean_summary), and _render_footer.
app.py shrinks from 947 lines to ~190 (most of which is the no-data explainer copy). The substantive UI now lives in cohesive modules under ui/, each one responsible for a single concern. Verification: - All 70 pytest tests still pass. - python -c 'import app' succeeds with no errors. - streamlit run app.py boots and serves HTTP 200. - streamlit AppTest renders the page and exercises the full sample- data + Validate flow without raising any exception; all 10 tabs (4 results + 6 deep analysis) render. New modules: - ui/styles.css + ui/styles.py: the ~190 lines of inline CSS now live in a static .css file loaded by inject_css(); editing the look no longer requires touching Python. - ui/state.py: session_state keys are constants instead of magic strings. reset_session() only clears keys this UI owns rather than blindly wiping every session_state entry (the prior behaviour could drop unrelated Streamlit-internal state). invalidate_report_caches() centralizes the CSV/PDF cache eviction that used to be inlined. - ui/input_section.py: renders the paste / CSV upload / sample-data picker and returns a typed (gtins, dataframe) tuple. Unchanged behaviour: 50k batch cap, GTIN column auto-detection, narrow CSV parser exception handling. - ui/results.py: renders the score card, summary stats, download buttons (with their CSV/PDF caching), and the four validation-result tabs (Issues by Severity / Full Item Detail / Check Digit Corrections / Packaging Hierarchy). - ui/deep_analysis.py: renders the six deep-analysis tabs (Executive Summary / Fix Plan / Retailer Readiness / Cost of Inaction / Case GTIN Generator / Data Completeness). app.py keeps responsibility only for: page config, CSS injection, header, sidebar, input persistence across reruns, the validate / reset buttons, the no-data explainer, and the share/security footer.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Comprehensive sweep across security, dependencies, performance, code quality, tests, docs, and CI — followed by the two larger refactors flagged in the audit. 70 pytest tests pass; the app boots and renders end-to-end via
streamlit.testing.v1.AppTest.Summary by area
Project hygiene (
74eb71d)LICENSEfile (README claimed MIT but no file existed)streamlitfloor to1.54.0to close CVE-2026-33682 (SSRF/NTLM on Windows) and the 2024 path-traversal advisory; cappandas <3.0to avoid the breaking CoW/dtype changepytestinrequirements-dev.txt(was undeclared)YOUR_USERNAME/YOUR_EMAILplaceholders in README; add Python 3.9+ note and a "Run tests" sectionSecurity hardening (
c8dba7d)csv_report.py: prefix any cell starting with=,+,-,@,\t, or\rwith a leading apostrophe so spreadsheet apps don't evaluate it as a formula if the CSV is sharedpdf_report.py: escape user-controlled strings (company_name,raw_input,issue.message,issue.recommendation,gtin_type.value) before passing them to ReportLabParagraph— ReportLab parses inline XML markup, so unescaped</>/&would corrupt renderingapp.py: 50,000-row hard cap on pasted/uploaded batches; narrow CSV-read except chain to known parser/encoding errors before falling back to a generic handlerPerformance (
2ca4a17)validate_batch: build a{cleaned: [row_numbers]}map once instead of re-scanning the full result list per duplicate (was O(n²))generate_retailer_checklists: hoist batch-level slices (invalid, dups, prefix mismatches,has_case) out of the per-retailer loop so they're computed once across all six profilescheck_data_completeness: vectorize the "non-empty after strip" count viastr.strip().astype(bool).sum()instead of a per-row lambdaapp.py: cache CSV/PDF insession_stateand re-run PDF generation only when the company name changes — both reports were regenerating on every Streamlit rerunRobustness & cleanup (
85ca2fb)validate_single_gtinacceptsNone/ pandasNaN/ numeric inputs without crashing on.strip()gtin14_formatkey from all six retailer profiles (was set in 6 places but never read)PREFIX_MISMATCHmessage to flag that the prefix slice is heuristicBug fix + tests (
c6bc7f3)614141000012/10614141000019) reported different prefixes and incorrectly trippedPREFIX_MISMATCHbecause GTIN-12 tookcleaned[:7]while GTIN-14 tookcleaned[1:8]. Zero-pad GTIN-12 onto the GTIN-13 frame before slicingcheck_data_completeness,generate_retailer_checklists,calculate_readiness_score,estimate_cost_of_inaction, CSV/PDF report generators (including formula-injection neutralization and ReportLab markup escaping), GTIN-8 happy path,INDICATOR_ZERO,None/NaNinput handling, and a sample-data regression guardCI (
7d0df37).github/workflows/ci.ymlruns pytest on Python 3.9 / 3.10 / 3.11 / 3.12 plus apip-auditjob againstrequirements.txtRefactor:
pdf_report.py→PDFReportBuilderclass (af16733)generate_pdf_reportgod function withPDFReportBuilderwhose_render_*methods correspond to report sectionsParagraphStyledefinitions into a module-level_build_styles()factoryPAGE_CONTENT_HEIGHT,ITEM_HEADER_PT,ISSUE_BLOCK_PT,GROUP_HEADER_PT,MULTI_GROUP_HEADER_PT,FAILING_ROW_PT,FAILING_HEADER_PT)CRITICAL/WARNING/INFOlabel dicts to module-level constantsgenerate_pdf_report(validation_data, company_name="") -> BytesIORefactor:
app.pymonolith →ui/package (a2da009)app.pyshrinks 947 → ~190 lines. The substantive UI now lives in cohesive modules underui/:ui/styles.css+ui/styles.py: the ~190 lines of inline CSS now live in a static.cssfile loaded byinject_css()ui/state.py: session-state keys are constants instead of magic strings;reset_session()only clears keys this UI owns rather than blindly wiping every entry;invalidate_report_caches()centralizes CSV/PDF cache evictionui/input_section.py: paste / CSV upload / sample-data picker, returns typed(gtins, dataframe)tupleui/results.py: score card, summary stats, download buttons, four validation-result tabsui/deep_analysis.py: six deep-analysis tabsapp.pykeeps responsibility only for: page config, CSS injection, header, sidebar, input persistence across reruns, validate / reset buttons, the no-data explainer, and the share/security footer.Test plan
pytest tests.py -v— 70 passed, 0 failedpython -c "import app"— succeeds with no errorsstreamlit run app.py— boots, serves HTTP 200streamlit.testing.v1.AppTest— exercises sample-data → Validate flow end-to-end with no exception; all 10 tabs (4 results + 6 deep analysis) renderCommits
Generated by Claude Code