196 commits
0d61c12
chore: Add pytest to requirements.txt
google-labs-jules[bot] Mar 19, 2026
bcd4cf1
🧪 Add test for untestable exception block in xml-validator.py
google-labs-jules[bot] Mar 19, 2026
f56a001
🧪 Add tests for BaseConverter to verify ABC behavior
google-labs-jules[bot] Mar 19, 2026
6b4c987
🧪 Add unit tests for Data Validation module
google-labs-jules[bot] Mar 19, 2026
1e1b587
🧪 Add Error Path Tests for Date Formatting in src/data_cleaning.py
google-labs-jules[bot] Mar 19, 2026
32b2068
🔒 Fix XXE vulnerability in xml-validator
google-labs-jules[bot] Mar 19, 2026
083304d
Merge pull request #6 from daler91/fix-xxe-vulnerability-367821129734…
daler91 Mar 19, 2026
0b52d81
Merge pull request #5 from daler91/jules-testing-improvement-date-for…
daler91 Mar 19, 2026
1af67b4
Merge pull request #4 from daler91/jules-add-data-validation-tests-12…
daler91 Mar 19, 2026
383417e
Merge pull request #3 from daler91/fix-base-converter-tests-318024770…
daler91 Mar 19, 2026
71239a6
Refactor Address and Phone logic into helper methods
google-labs-jules[bot] Mar 19, 2026
325d649
Merge pull request #2 from daler91/test-xml-validator-exception-78858…
daler91 Mar 19, 2026
d7926bb
Merge branch 'master' into jules-fix-requirements-2966955879941166032
daler91 Mar 19, 2026
fffccba
Merge pull request #1 from daler91/jules-fix-requirements-29669558799…
daler91 Mar 19, 2026
05f1f00
🔒 [fix XXE vulnerability in xml-validator.py]
google-labs-jules[bot] Mar 19, 2026
3f5d45f
Add test for _calculate_demographics in TrainingConverter
google-labs-jules[bot] Mar 19, 2026
8d407f0
Merge pull request #7 from daler91/refactor/counseling-converter-help…
daler91 Mar 19, 2026
610b940
Merge pull request #9 from daler91/test-training-converter-demographi…
daler91 Mar 19, 2026
fbd820f
Add tests for clean_numeric in data_cleaning.py
google-labs-jules[bot] Mar 19, 2026
35ddd7d
🧪 Add clean_percentage tests and fix trailing % handling
google-labs-jules[bot] Mar 19, 2026
5d35d3e
Merge pull request #11 from daler91/test/clean-percentage-improvement…
daler91 Mar 19, 2026
df4dcd2
🧹 [code health] Remove unused sys import in xml-validator.py
google-labs-jules[bot] Mar 19, 2026
926dfb5
🧹 [code health improvement] Remove unused import in counseling converter
google-labs-jules[bot] Mar 19, 2026
b3963a7
Merge pull request #10 from daler91/testing-improvement-clean-numeric…
daler91 Mar 19, 2026
80fbbb9
Remove unused import `clean_percentage` in `src/data_validation.py`
google-labs-jules[bot] Mar 19, 2026
c105d93
🧹 [remove unused import truncate_counselor_notes]
google-labs-jules[bot] Mar 19, 2026
796c952
Merge branch 'master' into jules-17669605457106686770-af2673f6
daler91 Mar 19, 2026
7301378
🧹 [code health improvement] Remove unused import clean_whitespace
google-labs-jules[bot] Mar 19, 2026
46bd111
Merge pull request #8 from daler91/jules-17669605457106686770-af2673f6
daler91 Mar 19, 2026
67eec91
Merge pull request #13 from daler91/fix/remove-unused-sys-import-6964…
daler91 Mar 19, 2026
aac7793
Merge pull request #16 from daler91/fix-unused-import-training-client…
daler91 Mar 19, 2026
b769d20
Merge pull request #17 from daler91/code-health-remove-clean-whitespa…
daler91 Mar 19, 2026
f614fd0
🧹 [code health improvement] Remove unused ElementTree import in fix-s…
google-labs-jules[bot] Mar 19, 2026
ecdcf25
🧹 [code health improvement] Remove unused import clean_percentage
google-labs-jules[bot] Mar 19, 2026
4f73024
🧹 [remove unused datetime import]
google-labs-jules[bot] Mar 19, 2026
efe5a14
🧹 [code health improvement] Remove unused import standardize_state_name
google-labs-jules[bot] Mar 19, 2026
65f1218
Merge pull request #14 from daler91/fix/remove-unused-os-import-57494…
daler91 Mar 19, 2026
b0b781c
Merge pull request #20 from daler91/jules/code-health-fix-sba-xml-imp…
daler91 Mar 19, 2026
afe73a3
Merge branch 'master' into code-health-remove-unused-import-155642829…
daler91 Mar 19, 2026
844a146
Merge pull request #23 from daler91/code-health-fix-unused-imports-94…
daler91 Mar 19, 2026
d5bbb1c
Merge pull request #21 from daler91/fix/remove-unused-datetime-import…
daler91 Mar 19, 2026
d38b9ac
Merge branch 'master' into fix/remove-unused-import-clean_percentage-…
daler91 Mar 19, 2026
7d1e18a
Merge pull request #22 from daler91/code-health-remove-unused-import-…
daler91 Mar 19, 2026
48cb360
Merge pull request #15 from daler91/fix/remove-unused-import-clean_pe…
daler91 Mar 19, 2026
979356a
🧹 [code health improvement] Remove unused import validate_against_xsd
google-labs-jules[bot] Mar 19, 2026
bc59ea3
Merge pull request #26 from daler91/jules-code-health-fix-84916377361…
daler91 Mar 19, 2026
4b0f58a
Refactor main in fix-sba-xml.py
google-labs-jules[bot] Mar 19, 2026
27b0449
Merge pull request #27 from daler91/fix/refactor-main-complexity-1089…
daler91 Mar 19, 2026
be5d22d
🧹 [code health improvement] Refactor build_training_counselor_record_…
google-labs-jules[bot] Mar 19, 2026
0a40f6a
Merge pull request #28 from daler91/chore/refactor-training-counselor…
daler91 Mar 19, 2026
7d82242
SBA provided XSD and sample XMLs
daler91 Mar 19, 2026
6e9dd79
Fix element ordering and generation to strictly match SBA XML schemas
google-labs-jules[bot] Mar 19, 2026
7b54057
Merge pull request #29 from daler91/jules-fix-sba-xml-generation-1352…
daler91 Mar 19, 2026
0206d14
🧹 Refactor `generate_html_report` to reduce complexity
google-labs-jules[bot] Mar 19, 2026
c4df578
Merge pull request #30 from daler91/code-health-refactor-validation-r…
daler91 Mar 19, 2026
8dd29bf
🧹 [code health improvement] Refactor main function in xml-validator.py
google-labs-jules[bot] Mar 19, 2026
db96fbc
Merge pull request #31 from daler91/code-health/refactor-main-xml-val…
daler91 Mar 19, 2026
2158536
🧹 [Code Health] Refactor build_client_intake_section
google-labs-jules[bot] Mar 19, 2026
d6fb8d8
Merge pull request #32 from daler91/refactor-client-intake-9560018499…
daler91 Mar 19, 2026
0d3b91b
🔒 [security fix] Fix XXE vulnerabilities in XML validator
google-labs-jules[bot] Mar 19, 2026
de52845
Merge pull request #33 from daler91/jules-fix-xxe-dependency-25001621…
daler91 Mar 19, 2026
efe3bf7
🧹 [code health improvement] Standardize import logic for logging_util
google-labs-jules[bot] Mar 19, 2026
854e438
Merge pull request #34 from daler91/jules-16856079994023275020-9cc04398
daler91 Mar 19, 2026
5b98a06
Fix counter reset bug in set_current_record_id()
claude Mar 19, 2026
6971fe7
Fix HTML report rendering literal template text instead of values
claude Mar 19, 2026
5ecf8a3
Merge pull request #35 from daler91/claude/fix-counter-reset-bug-jw2O3
daler91 Mar 19, 2026
f3fb14e
Fix count_matches() using cell value instead of column name
claude Mar 19, 2026
0a164f4
Remove escape_xml() calls to fix double-escaping with ElementTree
claude Mar 19, 2026
38edebf
Fix broken bare imports in training_client_xml.py
claude Mar 19, 2026
135ed85
Return default "0" from clean_percentage() instead of raising ValueError
claude Mar 19, 2026
3a0994e
Merge pull request #36 from daler91/claude/fix-counter-reset-bug-jw2O3
daler91 Mar 19, 2026
190301a
Rename hyphenated Python files to use underscores for valid module names
claude Mar 19, 2026
3a53313
Move module docstring before imports in training_converter.py
claude Mar 19, 2026
2711047
Remove unused global ValidationTracker instance from validation_repor…
claude Mar 19, 2026
35e2927
Move validate_counseling_date() from data_cleaning to data_validation
claude Mar 19, 2026
30d8e92
Merge pull request #37 from daler91/claude/fix-python-module-names-JK0V1
daler91 Mar 19, 2026
6edb429
Add files via upload
daler91 Mar 19, 2026
c1f8cb4
Fix counseling XML output to validate against XSD schema
claude Mar 19, 2026
eee8269
Add interactive run.py for non-technical users
claude Mar 19, 2026
3bb2c35
Merge pull request #38 from daler91/claude/verify-csv-xml-conversion-…
daler91 Mar 19, 2026
0c0da2f
Add setup.bat, run.bat, and update README with Quick Start
claude Mar 19, 2026
34e1ec7
Merge pull request #39 from daler91/claude/verify-csv-xml-conversion-…
daler91 Mar 19, 2026
ea6fa07
Delete report1773953890460.csv
daler91 Mar 19, 2026
fb068ae
🧪 [testing improvement] Add tests for process_directory in xml_validator
google-labs-jules[bot] Mar 20, 2026
3e024b8
Add tests for fix_sba_xml utility.
google-labs-jules[bot] Mar 20, 2026
87c0735
🧪 [testing improvement] Add unit tests for main module
google-labs-jules[bot] Mar 20, 2026
ed31a49
Merge pull request #40 from daler91/feature/testing-improvement-proce…
daler91 Mar 20, 2026
ab7f3ea
Merge pull request #41 from daler91/fix-sba-xml-tests-114679549953397…
daler91 Mar 20, 2026
37f3299
Merge pull request #42 from daler91/test-main-coverage-improvement-11…
daler91 Mar 20, 2026
eb3f7be
🧪 [add test for validation_report.py]
google-labs-jules[bot] Mar 20, 2026
df01c4f
Merge pull request #43 from daler91/test-generate-html-report-9310266…
daler91 Mar 20, 2026
247408d
Add SBA end-system validation rules to counseling converter
claude Mar 23, 2026
0ff891c
Fix ReportableImpact validation to auto-correct VerifiedToBeInBusines…
claude Mar 23, 2026
80529b6
Merge pull request #44 from daler91/claude/fix-part3-validation-W9Ber
daler91 Mar 23, 2026
d3a3566
🧪 [testing improvement] Add tests for clean_phone_number in data_clea…
google-labs-jules[bot] Mar 25, 2026
9870044
🧪 Add testing improvement for fix_client_intake_element_order
google-labs-jules[bot] Mar 25, 2026
81d3223
Merge pull request #45 from daler91/jules-649247083745066190-2f0c23e2
daler91 Mar 25, 2026
efaed02
Merge pull request #48 from daler91/test-xml-validator-fix-element-or…
daler91 Mar 25, 2026
047f44a
Add Next.js + FastAPI web app for Railway deployment
claude Mar 30, 2026
a58b6bf
Fix CodeQL security findings: path traversal and info exposure
claude Mar 30, 2026
1162656
Fix remaining CodeQL findings: tainted paths and info exposure
claude Mar 30, 2026
0c849c0
Add realpath containment check in safe_output_path
claude Mar 30, 2026
4c51b90
Eliminate user-provided file paths from worker API
claude Mar 30, 2026
69fbf77
Add realpath containment check to get_output_path, return dict from v…
claude Mar 30, 2026
ca316d5
Inline realpath+startswith guards at every file operation site
claude Mar 30, 2026
2ec531c
Merge pull request #50 from daler91/claude/deploy-to-railway-JnJEz
daler91 Mar 30, 2026
aee056e
Bump effect and prisma in /apps/web
dependabot[bot] Mar 30, 2026
ce6c538
Add railway.toml configs for multi-service deployment
claude Mar 30, 2026
4afa214
Fix web Dockerfile: use npm install and auto-run prisma db push
claude Mar 30, 2026
196b6cd
Merge pull request #52 from daler91/claude/deploy-to-railway-JnJEz
daler91 Mar 30, 2026
21e6b48
Fix web Dockerfile paths for repo-root build context
claude Mar 30, 2026
3845184
Merge pull request #53 from daler91/claude/deploy-to-railway-JnJEz
daler91 Mar 30, 2026
754b543
Skip postinstall during deps stage to avoid prisma generate failure
claude Mar 30, 2026
2876b5b
Merge pull request #54 from daler91/claude/deploy-to-railway-JnJEz
daler91 Mar 30, 2026
f65670f
Add empty public/ directory for Next.js Docker build
claude Mar 30, 2026
f890135
Merge pull request #55 from daler91/claude/deploy-to-railway-JnJEz
daler91 Mar 30, 2026
e8d8267
Fix prisma crash loop: use bundled prisma v6 CLI, not npx
claude Mar 31, 2026
8eac00b
Merge pull request #56 from daler91/claude/deploy-to-railway-JnJEz
daler91 Mar 31, 2026
cfccc58
Fix prisma db push: copy full node_modules for CLI dependencies
claude Mar 31, 2026
6236918
Merge pull request #57 from daler91/claude/deploy-to-railway-JnJEz
daler91 Mar 31, 2026
fe08c17
Fix DB migration and NextAuth UntrustedHost errors
claude Mar 31, 2026
3a7e0e2
Merge pull request #58 from daler91/claude/deploy-to-railway-JnJEz
daler91 Mar 31, 2026
2e41256
Fix migration: run each SQL statement separately
claude Mar 31, 2026
e263912
Merge pull request #59 from daler91/claude/deploy-to-railway-JnJEz
daler91 Mar 31, 2026
7787b5b
Implement cleaning diff, side-by-side comparison, async conversion, a…
claude Mar 31, 2026
537a8a2
Merge pull request #60 from daler91/claude/implement-missing-features…
daler91 Mar 31, 2026
5a0ec39
Bump worker version to force Railway rebuild
claude Mar 31, 2026
839cb3b
Fix preview->progress redirect, add previousJobId to convert page, po…
claude Mar 31, 2026
d52ef29
Merge pull request #61 from daler91/claude/implement-missing-features…
daler91 Mar 31, 2026
06a83ba
Add diagnostic route logging and catch-all to debug Railway 404s
claude Mar 31, 2026
a9564d3
Revert "Add diagnostic route logging and catch-all to debug Railway 4…
claude Mar 31, 2026
465a0a7
Add diagnostic route logging and catch-all to debug Railway 404s
claude Mar 31, 2026
2179055
Merge pull request #62 from daler91/claude/debug-worker-404
daler91 Mar 31, 2026
41df305
Merge pull request #51 from daler91/dependabot/npm_and_yarn/apps/web/…
daler91 Mar 31, 2026
9202a71
Stream CSV content to worker via API instead of shared filesystem
claude Mar 31, 2026
1e28afe
Merge pull request #63 from daler91/claude/stream-files-to-worker
daler91 Mar 31, 2026
ceac6c0
Pin Prisma to v6.19.2 to fix Railway web app build
claude Mar 31, 2026
0391d6f
Merge pull request #64 from daler91/claude/pin-prisma-v6
daler91 Mar 31, 2026
fdc5563
Silently auto-correct VerifiedToBeInBusiness and replace Other in Ser…
claude Mar 31, 2026
36bf1dc
Merge pull request #65 from daler91/claude/counseling-converter-fixes
daler91 Mar 31, 2026
57bca3b
Fix code smells: use Number.parseInt, node: imports, remove nested te…
claude Apr 2, 2026
f9f81f5
Fix Python code smells: async file I/O, documented responses, reduced…
claude Apr 2, 2026
dbcbcd3
Potential fix for pull request finding 'CodeQL / DOM text reinterpret…
daler91 Apr 2, 2026
d9b0e14
Merge pull request #66 from daler91/claude/use-number-parseint-CTVvt
daler91 Apr 2, 2026
bd79546
Fix code smells: reduce cognitive complexity, improve accessibility, …
claude Apr 3, 2026
b4df03e
Merge pull request #67 from daler91/claude/refactor-xml-validator-RQWMe
daler91 Apr 3, 2026
0586cc9
Fix security vulnerabilities, bugs, and improve UX across the app
claude Apr 3, 2026
4e2f523
Merge pull request #68 from daler91/claude/review-app-improvements-1Ec9d
daler91 Apr 3, 2026
f6ebee4
Fix critical bugs, security issues, and remove dead code
claude Apr 3, 2026
59d7feb
Add type hints, rate limiting, CI pipeline, integration tests, and qu…
claude Apr 3, 2026
7e71f8b
Merge pull request #69 from daler91/claude/codebase-review-YokxD
daler91 Apr 3, 2026
431f719
Add technical debt register documenting 13 areas of improvement
claude Apr 4, 2026
e41df29
Expand technical debt register with security, CI, and performance fin…
claude Apr 4, 2026
ab69f93
Resolve 15 of 19 technical debt items across the codebase
claude Apr 4, 2026
af456c3
Fix minor code quality issues from static analysis
claude Apr 4, 2026
beb7701
Fix CI failures: flaky os.path.exists mock and broken next lint
claude Apr 4, 2026
cc16283
Add tsbuildinfo to gitignore
claude Apr 4, 2026
a010ef4
Merge pull request #71 from daler91/claude/identify-technical-debt-1m9QB
daler91 Apr 4, 2026
ce75aee
Fix stack-trace exposure in validate_against_xsd
claude Apr 5, 2026
1bf9833
Merge pull request #72 from daler91/claude/fix-exception-info-exposur…
daler91 Apr 5, 2026
ff995c9
Add one-click suggestion badges to column mapping page
claude Apr 6, 2026
76035df
Merge pull request #73 from daler91/claude/quick-column-mapping-fZAVR
daler91 Apr 6, 2026
9b86aad
Add training-client 641 converter for training client form data
claude Apr 7, 2026
1c692df
Merge pull request #74 from daler91/claude/training-client-form-data-…
daler91 Apr 7, 2026
d9139cc
Wire training-client converter into web UI and worker backend
claude Apr 7, 2026
cc3b284
Fix XSD validation falsely rejecting temp file paths during conversion
claude Apr 7, 2026
3f51766
Merge pull request #75 from daler91/claude/training-client-form-data-…
daler91 Apr 7, 2026
e4694bd
Merge pull request #76 from daler91/claude/fix-xsd-validation-error-D…
daler91 Apr 7, 2026
8cab9a9
Fix 888 demographic counters to exclude "Prefer not to say" responses
claude Apr 7, 2026
4ced8bf
Merge pull request #77 from daler91/claude/fix-888-demographic-counte…
daler91 Apr 7, 2026
2980202
Fix mapping page: persist selections, show all fields, add requiremen…
claude Apr 7, 2026
39cbafb
Merge pull request #78 from daler91/claude/fix-mapping-save-display-k…
daler91 Apr 7, 2026
8294961
Add UX architecture review document
claude Apr 11, 2026
c6cec59
Add UX implementation plan
claude Apr 11, 2026
9d26534
UX Phase 1: Stop the Bleed
claude Apr 11, 2026
2b7ab37
UX Phase 2: Mobile Usable
claude Apr 11, 2026
5714610
UX Phase 3 (partial): Make the cancel button work end-to-end
claude Apr 11, 2026
28a65b2
UX Phase 3 (toasts): Success notifications across client flows
claude Apr 11, 2026
e1af480
UX Phase 3 (progress page): Elapsed time + poll-failure banner
claude Apr 11, 2026
cec7e71
UX Phase 3 (errors): Actionable error messages + retry cards
claude Apr 11, 2026
d61a757
UX Phase 3 (polish): Button spinners + skeleton loaders
claude Apr 11, 2026
bc16fe4
UX Phase 4 (1/4): Converter type cards with descriptions
claude Apr 11, 2026
89d9355
UX Phase 4 (2/4): Auth + upload polish
claude Apr 11, 2026
7171cda
UX Phase 4 (3/4): Content clarity — icons, vocabulary, disclosures
claude Apr 11, 2026
98e12dc
UX Phase 4 (4/4): Onboarding — landing, empty state, samples, README
claude Apr 11, 2026
418066c
Mark UX_REVIEW.md §8.3 as resolved by Phase 4 homepage rebuild
claude Apr 11, 2026
0a8d363
UX Phase 5 (1/3): Field descriptions on the mapping page
claude Apr 11, 2026
7d8565a
UX Phase 5 (2+3/3): Re-upload diff drilldown + audit details
claude Apr 11, 2026
2693ec7
UX Phase 6 (1/3): Step indicator for the conversion flow
claude Apr 11, 2026
f8ea568
UX Phase 6 (2/3): In-app Help page and nav link
claude Apr 11, 2026
f397cac
UX Phase 6 (3/3): Extract component library from utility-class drift
claude Apr 11, 2026
7d05aad
UX §8.1 regression guard: fail lint on raw button/alert classes
claude Apr 11, 2026
d796c93
UX §3.6 row-level progress + rate-based ETA
claude Apr 11, 2026
5d90767
Fix next build: split buttonClasses out of "use client" file
claude Apr 11, 2026
03d74dc
Clean up build noise: lazy Redis + turbopackIgnore on download route
claude Apr 11, 2026
b153095
Fix: cancel should only apply to converting jobs; terminal states are…
claude Apr 11, 2026
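Several commits above (`0c849c0`, `69fbf77`, `ca316d5`) describe a realpath containment guard on output paths. The diff for those files is not shown here, so the following is only a minimal sketch of the general technique. The helper name `safe_output_path` comes from the commit title; its signature and body are assumptions. The sketch uses `os.path.commonpath` rather than the `startswith` comparison the commit titles mention, because a plain prefix check would accept a sibling directory such as `/data/out_evil` when the base is `/data/out`.

```python
import os


def safe_output_path(base_dir: str, filename: str) -> str:
    """Resolve filename under base_dir, or raise ValueError if it escapes.

    Hypothetical signature; the real helper in this PR may differ.
    """
    # Resolve symlinks and ".." segments on both sides before comparing.
    base_real = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base_real, filename))

    # commonpath compares whole path components, so "/data/out_evil"
    # is not treated as being inside "/data/out".
    if os.path.commonpath([base_real, candidate]) != base_real:
        raise ValueError("path escapes the output directory")
    return candidate
```

A caller would pass the fixed output directory and the untrusted name, and treat `ValueError` as a rejected request rather than echoing the offending path back to the client.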
11 changes: 11 additions & 0 deletions .env.example
@@ -0,0 +1,11 @@
# Database
DATABASE_URL=postgresql://user:password@db:5432/csvtoxml
POSTGRES_USER=user
POSTGRES_PASSWORD=password
POSTGRES_DB=csvtoxml

# Redis
REDIS_URL=redis://redis:6379

# Authentication
NEXTAUTH_SECRET=generate-a-random-secret-here
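The placeholder `generate-a-random-secret-here` has to be replaced before deployment. One way to produce a suitable value, sketched here with Python's standard `secrets` module; the helper name `generate_secret` and the 32-byte length are assumptions, not something this PR prescribes.

```python
import secrets


def generate_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe random string suitable for an env-file secret."""
    # token_urlsafe draws from the OS CSPRNG and avoids characters
    # that would need quoting in a .env file.
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed line into .env in place of the placeholder.
    print(f"NEXTAUTH_SECRET={generate_secret()}")
```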
44 changes: 44 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,44 @@
name: CI

on:
  push:
    branches: ["**"]
  pull_request:
    branches: [master]

jobs:
  python-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Run tests
        run: python -m pytest tests/ -v

  web-lint:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: apps/web
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - name: Install dependencies
        run: npm ci

      - name: Lint
        run: npm run lint

      - name: Build
        run: npm run build
35 changes: 27 additions & 8 deletions .gitignore
@@ -1,8 +1,27 @@
-venv/
-__pycache__/
-*.pyc
-logs/
-reports/
-*.log
-*.csv
-*.xml
+venv/
+__pycache__/
+*.pyc
+logs/
+reports/
+*.log
+*.csv
+# Keep sample CSVs that the web app serves to users
+!apps/web/public/samples/*.csv
+
+# Keep XSD files in schemas/ but ignore sample XMLs
+*.xml
+!schemas/*.xsd
+
+# Environment
+.env
+
+# Next.js
+apps/web/.next/
+apps/web/node_modules/
+apps/web/.env.local
+*.tsbuildinfo
+
+# Data
+/data/
+uploads/
+instance/
82 changes: 75 additions & 7 deletions README.md
@@ -1,8 +1,73 @@
# SBA Counseling and Training Data Conversion Tool

This utility is designed to process, clean, validate, and convert SBA (Small Business Administration) counseling and training data from CSV format into compliant XML files. It ensures the final XML adheres to the strict sequence and data requirements of the SBA NEXUS schemas for **Form 641 (Counseling)** and **Management Training Reports**.
Converts SBA counseling and training CSV data into XSD-compliant XML files.

The tool includes robust data cleaning, validation reporting, and a fixer utility for pre-existing XML files.
The tool ships in two forms:

- A **web application** (`apps/web`, `apps/worker`) — recommended for
most users. Handles authentication, uploads, preview/mapping,
validation reports, job history, and downloads via a browser.
- A **Python CLI** (`run.py`, `src/`) — the original interactive
launcher, useful for power users and for scripting.

-----

## Web App (recommended)

The web app is a Next.js frontend backed by a FastAPI worker, Postgres,
and Redis — all wired up in `docker-compose.yml`.

### Run it locally

```bash
cp .env.example .env
# Edit .env to set DATABASE_URL, NEXTAUTH_SECRET, etc.
docker compose up
```

Then open <http://localhost:3000>, create an account, and upload a CSV.

### Download sample CSVs

Sample CSVs for each converter type live under
`apps/web/public/samples/` and are also linked from the landing page
and the dashboard empty state inside the app:

- `counseling-sample.csv` — individual counseling sessions (Form 641)
- `training-sample.csv` — aggregated training events (Form 888)
- `training-client-sample.csv` — per-attendee rows (Form 641)

### UX documentation

- [`UX_REVIEW.md`](./UX_REVIEW.md) — severity-ranked audit of the
web app's user-facing surfaces.
- [`UX_IMPLEMENTATION_PLAN.md`](./UX_IMPLEMENTATION_PLAN.md) — the
phased roadmap that sequences the UX review findings into
executable slices.
- [`TECHNICAL_DEBT.md`](./TECHNICAL_DEBT.md) — code/security debt
register, separate from UX concerns.

-----

## Python CLI

-----

## Quick Start (3 steps)

1. **Download** — On the GitHub page, click the green **Code** button → **Download ZIP**. Unzip the folder anywhere on your computer.

2. **Setup** (one time only) — Requires [Python](https://www.python.org/downloads/) (check **"Add Python to PATH"** during install).
- **Windows:** Double-click `setup.bat`
- **Mac/Linux:** Open a terminal in the folder and run: `pip install -r requirements.txt`

3. **Run** — Put your CSV file in the folder, then:
- **Windows:** Double-click `run.bat`
- **Mac/Linux:** Open a terminal in the folder and run: `python run.py`

The tool will walk you through selecting your CSV file, conversion type, and optional XSD validation — no typing commands needed.

Your output XML and validation reports will be saved in the `output/` and `reports/` folders.

-----

@@ -24,14 +89,17 @@ The tool includes robust data cleaning, validation reporting, and a fixer utilit
* **Validation & Reporting**:
* During conversion, it generates comprehensive validation reports in both CSV and HTML formats, detailing any issues found in the source data.
* **XML Fixer Utility**:
-* Includes a standalone script (`fix-sba-xml.py`) to correct element ordering issues in existing XML files that do not conform to the schema.
+* Includes a standalone script (`fix_sba_xml.py`) to correct element ordering issues in existing XML files that do not conform to the schema.

-----

## Project Structure

```
.
+├── run.py # Interactive launcher (start here!)
+├── run.bat # Windows double-click shortcut
+├── setup.bat # Windows one-time setup
├── src/
│ ├── converters/
│ │ ├── base_converter.py # Base class for all converters
@@ -44,8 +112,8 @@ The tool includes robust data cleaning, validation reporting, and a fixer utilit
│ ├── config.py # Central configuration for field mappings, defaults, and validation rules
│ ├── validation_report.py # Module for tracking and reporting validation issues
│ ├── logging_util.py # Configures application-wide logging
-│ ├── fix-sba-xml.py # Utility to fix element order in existing SBA XML files
-│ └── xml-validator.py # Utility to validate XML files against an XSD
+│ ├── fix_sba_xml.py # Utility to fix element order in existing SBA XML files
+│ └── xml_validator.py # Utility to validate XML files against an XSD
├── tests/
│ ├── test_counseling_converter.py
│ ├── test_data_cleaning.py
@@ -85,10 +153,10 @@ The primary entry point for the conversion is `src/main.py`.

### Fixing an Existing XML File

-If you have an XML file that fails validation due to incorrect element order, use the `fix-sba-xml.py` script:
+If you have an XML file that fails validation due to incorrect element order, use the `fix_sba_xml.py` script:

```bash
-python -m src.fix-sba-xml --file /path/to/your/invalid.xml --output /path/to/output/fixed.xml
+python -m src.fix_sba_xml --file /path/to/your/invalid.xml --output /path/to/output/fixed.xml
```

This will re-order the elements to match the schema requirements.
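The source of `fix_sba_xml.py` is not part of this excerpt, so the following is only a rough sketch of what re-ordering elements to a schema sequence can look like with the standard library. The function name `reorder_children` and the two-tag order list are hypothetical. The order is taken from the sample record in this PR, where `Email` precedes `SurveyAgreement` inside `ClientRequest`, not from the XSD itself.

```python
import xml.etree.ElementTree as ET


def reorder_children(parent: ET.Element, expected_order: list[str]) -> None:
    """Sort parent's direct children in place to follow expected_order."""
    rank = {tag: index for index, tag in enumerate(expected_order)}
    children = list(parent)
    # list.sort is stable: repeated tags keep their relative order, and
    # tags missing from expected_order are moved to the end unchanged.
    children.sort(key=lambda child: rank.get(child.tag, len(rank)))
    parent[:] = children


# Hypothetical input with the two children in the wrong order.
request = ET.fromstring(
    "<ClientRequest>"
    "<SurveyAgreement>Yes</SurveyAgreement>"
    "<Email>a@example.com</Email>"
    "</ClientRequest>"
)
reorder_children(request, ["Email", "SurveyAgreement"])
print([child.tag for child in request])
```

A real fixer would need one order list per parent element, ideally derived from the `xs:sequence` definitions in the XSD rather than hard-coded.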
180 changes: 180 additions & 0 deletions Sample641CouselingRecord-2-14.xml
@@ -0,0 +1,180 @@
<?xml version="1.0" encoding="UTF-8"?>
<CounselingInformation>


<CounselingRecord>
<Operation></Operation>
<PartnerClientNumber>234347</PartnerClientNumber>
<Location>
<LocationCode>786</LocationCode>
</Location>
<ClientRequest>
<ClientNamePart1>
<Last>Hanks</Last>
<First>Jerry</First>
<Middle></Middle>
</ClientNamePart1>
<Email>jomh@gmail.com</Email>
<PhonePart1><Primary>2365894123</Primary><Secondary></Secondary></PhonePart1>
<AddressPart1>
<Street1></Street1>
<Street2></Street2>
<City>Austin</City>
<State>Alabama</State>
<ZipCode>33189</ZipCode>
<Zip4Code>2344</Zip4Code>
<Country><Code>United States</Code></Country>
<PostalCode></PostalCode>
<StateOrProvince></StateOrProvince>
</AddressPart1>

<SurveyAgreement>Yes</SurveyAgreement>

<ClientSignature>
<Date>1987-12-12</Date>
<OnFile>Yes</OnFile>
</ClientSignature>
</ClientRequest>
<ClientIntake>
<Race><Code>White</Code><Code>Asian</Code><SelfDescribedRace></SelfDescribedRace></Race>
<Ethnicity>Non Hispanic or Latino</Ethnicity>
<Sex>Female</Sex>
<Disability>No</Disability>
<MilitaryStatus>Active Duty</MilitaryStatus>
<BranchOfService>Prefer not to say</BranchOfService>
<Media>
<Code>Magazine/Newspaper</Code>

<Code>Other</Code>
<Other>testmedia</Other>
</Media>
<Internet></Internet>
<CurrentlyInBusiness>Yes</CurrentlyInBusiness>
<CurrentlyExporting>Yes</CurrentlyExporting>
<CompanyName>ABC company</CompanyName>
<BusinessType>Real Estate and Rental and Leasing</BusinessType>
<BusinessOwnership>
<Female>100</Female>
</BusinessOwnership>
<ConductingBusinessOnline>No</ConductingBusinessOnline>

<ClientIntake_Certified8a>No</ClientIntake_Certified8a>
<Employee_Owned>Yes</Employee_Owned>
<TotalNumberOfEmployees>14</TotalNumberOfEmployees>
<NumberOfEmployeesInExportingBusiness>0</NumberOfEmployeesInExportingBusiness>
<ClientAnnualIncomePart2>
<GrossRevenues>0.00</GrossRevenues>
<ProfitLoss>0.00</ProfitLoss>
<ExportGrossRevenuesOrSales>5660.00</ExportGrossRevenuesOrSales>
</ClientAnnualIncomePart2>
<LegalEntity>
<Code>Other</Code>
<Other>Other Counseling</Other>
</LegalEntity>
<Rural_vs_Urban>Urban</Rural_vs_Urban>
<FIPS_Code>54346</FIPS_Code>

<CounselingSeeking>
<Code>Business Plan</Code>
<Other></Other>
</CounselingSeeking>
<ExportCountries><Code>Belgium</Code><Other></Other></ExportCountries>

</ClientIntake>
<CounselorRecord>
<PartnerSessionNumber>371786_T50267</PartnerSessionNumber>

<FundingSource>Resiliency and Recovery Demonstration Grant – CARESRRD</FundingSource>

<ClientNamePart3>
<Last>hanks</Last>
<First>tom</First>
<Middle></Middle>
</ClientNamePart3>
<Email>tomh@gmail.com</Email>
<PhonePart3><Primary>2365894123</Primary><Secondary></Secondary></PhonePart3>
<AddressPart3>
<Street1></Street1>
<Street2></Street2>
<City>Alpharetta</City>
<State>Alabama</State>
<ZipCode>95928</ZipCode>
<Zip4Code>2344</Zip4Code>
<Country><Code>United States</Code></Country>
<PostalCode></PostalCode>
<StateOrProvince></StateOrProvince>
</AddressPart3>
<VerifiedToBeInBusiness>Yes</VerifiedToBeInBusiness>
<ReportableImpact>No</ReportableImpact>
<DateOfReportableImpact></DateOfReportableImpact>
<CurrentlyExporting>No</CurrentlyExporting>
<BusinessStartDatePart3>2021-12-31</BusinessStartDatePart3>
<TotalNumberOfEmployees>18</TotalNumberOfEmployees>
<NumberOfEmployeesInExportingBusiness></NumberOfEmployeesInExportingBusiness>
<ClientAnnualIncomePart3>
<GrossRevenues>0.00</GrossRevenues>
<ProfitLoss>0.00</ProfitLoss>
<ExportGrossRevenuesOrSales>788.00</ExportGrossRevenuesOrSales>
<GrowthIndicator>Yes</GrowthIndicator>
</ClientAnnualIncomePart3>
<ResourcePartnerServiceContributed>
<SBALoanAmount>10</SBALoanAmount>
<NonSBALoanAmount>10
</NonSBALoanAmount>
<EquityCapitalReceived>10
</EquityCapitalReceived>
<SBALoanAmountTxnNmb>80
</SBALoanAmountTxnNmb>
<NonSBALoanAmountTxnNmb>80
</NonSBALoanAmountTxnNmb>
<EquityCapitalReceivedTxnNmb>80
</EquityCapitalReceivedTxnNmb>
<NumberOfContractsReceived>80
</NumberOfContractsReceived>

</ResourcePartnerServiceContributed>
<Certifications>
<Code>Service-Disabled Veteran-Owned Small Business</Code>
<Other></Other>
</Certifications>
<SBAFinancialAssistance>
<Code>Community Advantage</Code>
<Code>Other(SBIR, SBIC, 7(a) 504, etc)</Code>
<Other>Other SBA Disaster Loan for COVID-19</Other>
</SBAFinancialAssistance>
<CounselingProvided>
<Code>Tax Planning</Code>
<Other></Other>
</CounselingProvided>
<ReferredClient>
<Code>SBA Office of International Trade (OIT)</Code>
<Other></Other>
</ReferredClient>
<SessionType>Online</SessionType>
<Language>
<Code>English</Code>
<Other></Other>
</Language>
<DateCounseled>2024-01-01</DateCounseled>
<CounselorName>Paul Bozzo (FC)</CounselorName>
<CounselingHours>
<Contact>7</Contact>
<Prepare>6</Prepare>
<Travel>6</Travel>
</CounselingHours>
<CounselorNotes>Test</CounselorNotes>
<ExportCountries>
<Code>United States</Code>

<Other></Other>
</ExportCountries>
<TrainingSession>
<DateTrainingStarted>2024-09-22</DateTrainingStarted>
<PartnerTrainingNumber>PartnerTrainingNum</PartnerTrainingNumber>
<EmployeesTrained>2</EmployeesTrained>
<HoursTrained>3</HoursTrained>
</TrainingSession>
</CounselorRecord>
</CounselingRecord>

</CounselingInformation>
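To show how a record like the sample above can be read back, here is a small sketch using `xml.etree.ElementTree`. The `summarize` helper and the trimmed `SAMPLE` string are illustrative assumptions. The element names and values are copied from the sample file, but nothing here reflects how the converter or validator in this repository actually parses XML.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the sample record; only the fields read below are kept.
SAMPLE = """<CounselingInformation>
  <CounselingRecord>
    <PartnerClientNumber>234347</PartnerClientNumber>
    <CounselorRecord>
      <DateCounseled>2024-01-01</DateCounseled>
      <CounselingHours>
        <Contact>7</Contact>
        <Prepare>6</Prepare>
        <Travel>6</Travel>
      </CounselingHours>
    </CounselorRecord>
  </CounselingRecord>
</CounselingInformation>"""


def summarize(xml_text: str) -> list[dict]:
    """Return client number, session date, and total hours per record."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter("CounselingRecord"):
        hours = record.find("CounselorRecord/CounselingHours")
        total = 0.0
        if hours is not None:
            # Skip empty elements such as <Travel></Travel>.
            total = sum(float(h.text) for h in hours if h.text and h.text.strip())
        rows.append({
            "client": record.findtext("PartnerClientNumber", default=""),
            "date": record.findtext("CounselorRecord/DateCounseled", default=""),
            "hours": total,
        })
    return rows


print(summarize(SAMPLE))
```

For untrusted uploads, a hardened parser configuration would be the safer choice, in line with the XXE fixes listed among the commits.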