CS & Math @ NYU | Building data systems and backend infrastructure
Backend system that ingests SEC EDGAR filings and exposes structured financial data through a REST API. Repo
- Built async ingestion pipeline processing 13K+ companies from SEC EDGAR, parsing XBRL into normalized PostgreSQL tables
- Designed EAV data model and queryable REST API (FastAPI) supporting cross-company screening across 50+ metrics
- Handles 10-K/10-Q filings end-to-end: fetch, parse, store, serve
Stack: Python, FastAPI, PostgreSQL, async SQLAlchemy, Docker
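
A minimal sketch of how cross-company screening over an EAV (entity-attribute-value) table can work, not the repo's actual code. It uses an in-memory SQLite database as a dependency-free stand-in for PostgreSQL; the `facts` table, its columns, the tickers, and the numbers are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory EAV table: one row per (company, period, metric)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE facts (company TEXT, period TEXT, metric TEXT, value REAL)"
    )
    rows = [  # made-up illustration data, not real filings
        ("AAA", "FY2023", "revenue", 500.0),
        ("AAA", "FY2023", "net_income", 50.0),
        ("BBB", "FY2023", "revenue", 200.0),
        ("BBB", "FY2023", "net_income", 60.0),
        ("CCC", "FY2023", "revenue", 900.0),
        ("CCC", "FY2023", "net_income", -10.0),
    ]
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?)", rows)
    return conn


def screen(conn, period, minimums):
    """Return companies whose metrics all meet the given floors in one period.

    `minimums` maps metric name -> minimum value. Each filter matches rows of
    one metric; a company passes only if every filtered metric matched.
    """
    if not minimums:
        raise ValueError("at least one metric filter is required")
    clauses = " OR ".join("(metric = ? AND value >= ?)" for _ in minimums)
    params = [period]
    for metric, floor in minimums.items():
        params += [metric, floor]
    params.append(len(minimums))
    sql = (
        "SELECT company FROM facts "
        f"WHERE period = ? AND ({clauses}) "
        "GROUP BY company HAVING COUNT(DISTINCT metric) = ? "
        "ORDER BY company"
    )
    return [row[0] for row in conn.execute(sql, params)]


conn = build_demo_db()
print(screen(conn, "FY2023", {"revenue": 150, "net_income": 40}))  # ['AAA', 'BBB']
```

The `GROUP BY ... HAVING COUNT(DISTINCT metric)` pattern is one common way to express "all filters must hold" against an EAV layout without pivoting metrics into columns first.
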
Automated research pipeline and scenario modeling engine for a crypto-native PE firm's diligence process. Repo
- Built Monte Carlo simulation engine with configurable scenario parameters for estimating settlement cost savings
- Built structured data extraction pipeline over SEC 10-K filings to quantify $560M/year in payments friction
- Designed falsifiability-ranked diligence framework with programmatic sensitivity analysis
Stack: Python, pandas, NumPy, SEC EDGAR API
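
A minimal sketch of a Monte Carlo savings estimate with configurable scenario parameters, assuming a simple model where savings = volume × adoption × (legacy fee − new fee). The `Scenario` fields, the uniform distributions, and every number are hypothetical, and the standard library is used here instead of NumPy to keep the example dependency-free.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical scenario parameters; each range is (low, high)."""
    annual_volume: float                 # settled payment volume, in dollars
    legacy_fee_range: tuple              # current cost as a fraction of volume
    new_fee_range: tuple                 # alternative rail cost as a fraction
    adoption_range: tuple                # share of volume that migrates


def simulate_savings(scenario, n_draws=10_000, seed=0):
    """Draw each uncertain input independently and return one savings value per draw."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    draws = []
    for _ in range(n_draws):
        legacy_fee = rng.uniform(*scenario.legacy_fee_range)
        new_fee = rng.uniform(*scenario.new_fee_range)
        adoption = rng.uniform(*scenario.adoption_range)
        draws.append(scenario.annual_volume * adoption * (legacy_fee - new_fee))
    return draws


def summarize(draws):
    """Report the 5th, 50th, and 95th percentiles of the simulated savings."""
    cuts = statistics.quantiles(draws, n=20)  # 19 cut points at 5% steps
    return {"p5": cuts[0], "p50": statistics.median(draws), "p95": cuts[-1]}


base = Scenario(
    annual_volume=1_000_000_000,
    legacy_fee_range=(0.02, 0.03),
    new_fee_range=(0.001, 0.005),
    adoption_range=(0.2, 0.6),
)
print(summarize(simulate_savings(base)))
```

Rerunning `simulate_savings` with one range widened or shifted while the others stay fixed gives a simple one-at-a-time sensitivity check.
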
End-to-end ML pipeline from feature engineering to production-ready scoring outputs across 1.3M+ records. Repo
- Engineered 50+ features and built classification pipeline (Logistic Regression, GBM, XGBoost) with class weighting
- Validated on out-of-time holdout (0.72 ROC-AUC) and mapped predicted probabilities to policy decision tiers
- Operationalized scoring outputs for downstream consumption
Stack: Python, scikit-learn, XGBoost, pandas, NumPy
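
A minimal sketch of two steps named above: an out-of-time split (train on records before a cutoff date, hold out everything on or after it) and mapping a predicted probability to a decision tier. The record layout, the cutoff date, the tier cutoffs, and the tier labels are all hypothetical; the project's actual thresholds are not shown here.

```python
from bisect import bisect_right
from datetime import date


def out_of_time_split(records, cutoff):
    """Split by time rather than at random, so the holdout is strictly later data."""
    train = [r for r in records if r["as_of"] < cutoff]
    holdout = [r for r in records if r["as_of"] >= cutoff]
    return train, holdout


# Hypothetical policy: probabilities below 0.2 are "low", 0.2 up to 0.5 are
# "medium", and 0.5 or above are "high".
TIER_CUTOFFS = [0.2, 0.5]
TIER_LABELS = ["low", "medium", "high"]


def to_tier(probability):
    """Map a predicted probability in [0, 1] to its policy tier label."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return TIER_LABELS[bisect_right(TIER_CUTOFFS, probability)]


records = [
    {"id": 1, "as_of": date(2022, 3, 1)},
    {"id": 2, "as_of": date(2022, 11, 15)},
    {"id": 3, "as_of": date(2023, 2, 1)},
]
train, holdout = out_of_time_split(records, cutoff=date(2023, 1, 1))
print(len(train), len(holdout))                   # 2 1
print(to_tier(0.1), to_tier(0.2), to_tier(0.9))   # low medium high
```

Splitting on time instead of at random keeps later records out of training, which is what makes the holdout score a fairer estimate of performance on future data.
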
Independent research quantifying neural scaling behavior in the sub-1B parameter regime. Repo
- Identified efficiency threshold at 350M parameters using power-law fitting (R² = 0.99)
- Documented statistical limitations transparently (n=4 sample size)
Stack: PyTorch, HuggingFace, MobileLLM
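
A minimal sketch of power-law fitting, assuming the common approach of ordinary least squares on log-transformed data (fitting y = a·x^b as a line in log-log space). Whether the research used this or a nonlinear fit is not stated here, and the four data points below are synthetic, generated from an exact power law purely for illustration.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (log x, log y).

    Returns (a, b, r2), where r2 is measured in log space.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                      # slope = power-law exponent
    log_a = mean_y - b * mean_x        # intercept = log of the prefactor
    ss_res = sum((v - (log_a + b * u)) ** 2 for u, v in zip(lx, ly))
    ss_tot = sum((v - mean_y) ** 2 for v in ly)
    r2 = 1.0 - ss_res / ss_tot
    return math.exp(log_a), b, r2


# Synthetic example: four model sizes and losses following loss = 3 * N**-0.1.
params = [1e8, 2e8, 4e8, 8e8]
losses = [3.0 * p ** -0.1 for p in params]
a, b, r2 = fit_power_law(params, losses)
print(a, b, r2)
```

With only four points and two fitted parameters, the fit has two residual degrees of freedom, so a high R² on its own is weak evidence; that is the limitation the n=4 note refers to.
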
| Category | Tools |
|---|---|
| Languages | Python, SQL, Java, Go, R |
| Backend & APIs | FastAPI, REST API design, async programming, Linux, Docker, Git |
| Data Infrastructure | PostgreSQL, dbt, Prefect, AWS (S3, Redshift), data pipeline design |
| ML & Analytics | scikit-learn, XGBoost, PyTorch, pandas, NumPy, hypothesis testing |
| Cloud & DevOps | AWS, Docker, CI/CD, security fundamentals |
I build backend systems, data pipelines, and infrastructure that turn messy data into something useful.



