Raja Hussain (Hussain0327)

CS & Math @ NYU | Building data systems and backend infrastructure

Projects

Backend system that ingests SEC EDGAR filings and exposes structured financial data through a REST API. Repo

Built async ingestion pipeline processing 13K+ companies from SEC EDGAR, parsing XBRL into normalized PostgreSQL tables
Designed EAV data model and queryable REST API (FastAPI) supporting cross-company screening across 50+ metrics
Handled 10-K/10-Q filings end-to-end: fetch, parse, store, and serve

Stack: Python, FastAPI, PostgreSQL, async SQLAlchemy, Docker
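
A minimal sketch of how an EAV (entity-attribute-value) layout can support cross-company screening. The table names, column names, tickers, and values here are hypothetical, and SQLite stands in for PostgreSQL so the example is self-contained; the project's actual schema may differ.

```python
import sqlite3

# Hypothetical EAV layout: one row per (company, metric, period) fact,
# so adding a new metric needs no schema change.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE company (
        id     INTEGER PRIMARY KEY,
        ticker TEXT UNIQUE NOT NULL
    );
    CREATE TABLE fact (
        company_id INTEGER NOT NULL REFERENCES company(id),
        metric     TEXT NOT NULL,
        period     TEXT NOT NULL,
        value      REAL NOT NULL,
        PRIMARY KEY (company_id, metric, period)
    );
    """
)

# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO company (id, ticker) VALUES (?, ?)",
    [(1, "AAA"), (2, "BBB"), (3, "CCC")],
)
conn.executemany(
    "INSERT INTO fact (company_id, metric, period, value) VALUES (?, ?, ?, ?)",
    [
        (1, "Revenues", "FY2023", 120.0),
        (2, "Revenues", "FY2023", 80.0),
        (3, "Revenues", "FY2023", 300.0),
        (1, "NetIncomeLoss", "FY2023", 10.0),
    ],
)


def screen(db, metric, period, min_value):
    """Return (ticker, value) pairs at or above min_value, largest first."""
    query = """
        SELECT c.ticker, f.value
        FROM fact AS f
        JOIN company AS c ON c.id = f.company_id
        WHERE f.metric = ? AND f.period = ? AND f.value >= ?
        ORDER BY f.value DESC
    """
    return db.execute(query, (metric, period, min_value)).fetchall()


result = screen(conn, "Revenues", "FY2023", 100.0)
print(result)
```

The trade-off with this layout is that every screen filters on `metric`, so an index covering `(metric, period, value)` would likely matter at scale.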

Automated research pipeline and scenario modeling engine for a crypto-native PE firm's diligence process. Repo

Built Monte Carlo simulation engine with configurable scenario parameters for estimating settlement cost savings
Built structured data extraction pipeline over SEC 10-K filings to quantify $560M/year in payments friction
Designed falsifiability-ranked diligence framework with programmatic sensitivity analysis

Stack: Python, pandas, NumPy, SEC EDGAR API
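
A minimal sketch of a Monte Carlo run with configurable scenario parameters. The `Scenario` fields, the uniform distributions, and all numbers are assumptions chosen for illustration; they are not the project's inputs or results. The standard library is used instead of NumPy to keep the example dependency-free.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical scenario parameters; every field is illustrative."""
    annual_volume: float    # payment volume per year
    cost_rate_low: float    # settlement cost as a fraction of volume (lower bound)
    cost_rate_high: float   # settlement cost as a fraction of volume (upper bound)
    reduction_low: float    # fraction of cost removed (lower bound)
    reduction_high: float   # fraction of cost removed (upper bound)


def simulate_savings(scenario, n_trials=10_000, seed=0):
    """Draw uncertain inputs per trial and summarize the savings distribution."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    draws = []
    for _ in range(n_trials):
        cost_rate = rng.uniform(scenario.cost_rate_low, scenario.cost_rate_high)
        reduction = rng.uniform(scenario.reduction_low, scenario.reduction_high)
        draws.append(scenario.annual_volume * cost_rate * reduction)
    draws.sort()
    return {
        "mean": statistics.fmean(draws),
        "p5": draws[int(0.05 * n_trials)],
        "p95": draws[int(0.95 * n_trials)],
    }


base_case = Scenario(
    annual_volume=1_000_000.0,
    cost_rate_low=0.01,
    cost_rate_high=0.03,
    reduction_low=0.2,
    reduction_high=0.6,
)
summary = simulate_savings(base_case)
print(summary)
```

Reporting a percentile band alongside the mean keeps the spread of outcomes visible, which is usually the point of simulating rather than multiplying point estimates.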

End-to-end ML pipeline from feature engineering to production-ready scoring outputs across 1.3M+ records. Repo

Engineered 50+ features and built classification pipeline (Logistic Regression, GBM, XGBoost) with class weighting
Validated on out-of-time holdout (0.72 ROC-AUC), mapped predicted probabilities to policy decision tiers
Operationalized scoring outputs for downstream consumption

Stack: Python, scikit-learn, XGBoost, pandas, NumPy
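
A minimal sketch of two steps named above: an out-of-time split and the mapping from predicted probabilities to decision tiers. The cutoff values, tier labels, and record layout are hypothetical and not taken from the project.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical ascending probability cutoffs and the tier for each interval.
TIER_CUTOFFS = (0.10, 0.30, 0.60)
TIER_LABELS = ("tier_1", "tier_2", "tier_3", "tier_4")


def out_of_time_split(records, cutoff):
    """Split (as_of_date, payload) records so the holdout is strictly later data.

    Unlike a random split, nothing dated on or after the cutoff is seen in training.
    """
    train = [r for r in records if r[0] < cutoff]
    holdout = [r for r in records if r[0] >= cutoff]
    return train, holdout


def to_tier(probability):
    """Map a predicted probability to a tier; a value equal to a cutoff goes to the higher tier."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability out of range: {probability}")
    return TIER_LABELS[bisect_right(TIER_CUTOFFS, probability)]


# Made-up records for illustration only.
records = [
    (date(2022, 3, 1), "a"),
    (date(2022, 9, 1), "b"),
    (date(2023, 2, 1), "c"),
]
train, holdout = out_of_time_split(records, cutoff=date(2023, 1, 1))
tiers = [to_tier(p) for p in (0.05, 0.10, 0.45, 0.95)]
print(len(train), len(holdout), tiers)
```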

Independent research quantifying neural scaling behavior in the sub-1B parameter regime. Repo

Identified efficiency threshold at 350M parameters using power-law fitting (R² = 0.99)
Documented statistical limitations transparently (n=4 sample size)

Stack: PyTorch, HuggingFace, MobileLLM
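
A minimal sketch of power-law fitting by least squares in log-log space, assuming the form y = a * x^b. Whether the project fit in log space is an assumption, and the four data points below are synthetic, generated from a known power law rather than taken from the research.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares on (log x, log y).

    Returns (a, b, r_squared), with r_squared computed in log space.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    ss_res = sum((v - (log_a + b * u)) ** 2 for u, v in zip(log_x, log_y))
    ss_tot = sum((v - mean_y) ** 2 for v in log_y)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot


# Synthetic points from loss = 5.0 * params**-0.1 (illustrative only).
params = [1e8, 2e8, 4e8, 8e8]
losses = [5.0 * p ** -0.1 for p in params]
a, b, r2 = fit_power_law(params, losses)
print(a, b, r2)
```

With only four points, a two-parameter fit leaves two degrees of freedom, so a high R² on its own is weak evidence; this is consistent with the sample-size limitation noted above.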

Category	Tools
Languages	Python, SQL, Java, Go, R
Backend & APIs	FastAPI, REST API design, async programming, Linux, Docker, Git
Data Infrastructure	PostgreSQL, dbt, Prefect, AWS (S3, Redshift), data pipeline design
ML & Analytics	scikit-learn, XGBoost, PyTorch, pandas, NumPy, hypothesis testing
Cloud & DevOps	AWS, Docker, CI/CD, security fundamentals

I build backend systems, data pipelines, and infrastructure that turn messy data into something useful.