Hussain0327/README.md

CS & Math @ NYU | Building data systems and backend infrastructure

Email LinkedIn


Projects

Atlas Intel — Financial Data Pipeline & API

Backend system that ingests SEC EDGAR filings and exposes structured financial data through a REST API. Repo

  • Built async ingestion pipeline processing 13K+ companies from SEC EDGAR, parsing XBRL into normalized PostgreSQL tables
  • Designed EAV data model and queryable REST API (FastAPI) supporting cross-company screening across 50+ metrics
  • Handles 10-K/10-Q filings end-to-end: fetch, parse, store, serve

Stack: Python, FastAPI, PostgreSQL, async SQLAlchemy, Docker
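
The EAV (entity-attribute-value) model and cross-company screening described above can be illustrated with a minimal sketch. It uses the standard-library `sqlite3` module in place of PostgreSQL so that it runs anywhere. The `facts` table, its column names, the tickers, and the values are hypothetical placeholders, not the project's actual schema or data.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory EAV table: one row per (company, metric, period)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE facts (
            company TEXT NOT NULL,   -- entity
            metric  TEXT NOT NULL,   -- attribute
            period  TEXT NOT NULL,
            value   REAL NOT NULL,   -- value
            PRIMARY KEY (company, metric, period)
        )
        """
    )
    # Made-up rows, for illustration only.
    rows = [
        ("AAA", "revenue", "FY2023", 1000.0),
        ("AAA", "net_income", "FY2023", 200.0),
        ("BBB", "revenue", "FY2023", 500.0),
        ("BBB", "net_income", "FY2023", 10.0),
    ]
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?)", rows)
    return conn


def screen(conn: sqlite3.Connection, period: str,
           min_revenue: float, min_margin: float) -> list[str]:
    """Return companies meeting a revenue floor and a net-margin floor.

    The inner query pivots EAV rows into one row per company using
    conditional aggregation; the outer query applies the screen.
    """
    query = """
        SELECT company
        FROM (
            SELECT company,
                   MAX(CASE WHEN metric = 'revenue' THEN value END) AS revenue,
                   MAX(CASE WHEN metric = 'net_income' THEN value END) AS net_income
            FROM facts
            WHERE period = ?
            GROUP BY company
        )
        WHERE revenue >= ? AND net_income / revenue >= ?
        ORDER BY company
    """
    return [row[0] for row in conn.execute(query, (period, min_revenue, min_margin))]


if __name__ == "__main__":
    print(screen(build_demo_db(), "FY2023", 400.0, 0.10))
```

The appeal of EAV here is that adding a new metric means adding rows rather than columns; the cost is that every screen needs a pivot like the one above.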


Freight Settlement Infrastructure — Investment Research System

Automated research pipeline and scenario modeling engine for a crypto-native PE firm's diligence process. Repo

  • Built Monte Carlo simulation engine with configurable scenario parameters for estimating settlement cost savings
  • Built structured data extraction pipeline over SEC 10-K filings to quantify $560M/year in payments friction
  • Designed falsifiability-ranked diligence framework with programmatic sensitivity analysis

Stack: Python, pandas, NumPy, SEC EDGAR API
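
A Monte Carlo engine with configurable scenario parameters, as described above, might look roughly like the following. This is a sketch under stated assumptions: the `Scenario` fields, the choice of triangular and uniform distributions, and all numeric inputs are hypothetical, and it uses only the standard library rather than the NumPy/pandas stack listed above.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical scenario parameters (not taken from the project)."""
    annual_volume: float   # payment volume per year
    friction_low: float    # friction cost as a fraction of volume
    friction_mode: float
    friction_high: float
    removed_low: float     # share of friction a new settlement rail removes
    removed_high: float


def simulate_savings(s: Scenario, n_draws: int = 10_000, seed: int = 0) -> list[float]:
    """Draw annual savings = volume * friction rate * share of friction removed."""
    rng = random.Random(seed)  # seeded for reproducible runs
    draws = []
    for _ in range(n_draws):
        # random.triangular takes (low, high, mode) in that order.
        friction = rng.triangular(s.friction_low, s.friction_high, s.friction_mode)
        removed = rng.uniform(s.removed_low, s.removed_high)
        draws.append(s.annual_volume * friction * removed)
    return draws


def summarize(draws: list[float]) -> dict[str, float]:
    """Report the mean and simple empirical percentiles of the draws."""
    ordered = sorted(draws)

    def pct(q: float) -> float:
        return ordered[min(len(ordered) - 1, int(q * len(ordered)))]

    return {
        "p5": pct(0.05),
        "p50": pct(0.50),
        "p95": pct(0.95),
        "mean": statistics.fmean(ordered),
    }


if __name__ == "__main__":
    base = Scenario(1_000_000.0, 0.01, 0.02, 0.04, 0.2, 0.6)
    print(summarize(simulate_savings(base)))
```

A basic sensitivity check follows the same pattern: rerun `simulate_savings` with one field of `Scenario` changed and compare the summaries.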


Credit Risk Scoring Platform

End-to-end ML pipeline from feature engineering to production-ready scoring outputs across 1.3M+ records. Repo

  • Engineered 50+ features and built classification pipeline (Logistic Regression, GBM, XGBoost) with class weighting
  • Validated on out-of-time holdout (0.72 ROC-AUC), mapped predicted probabilities to policy decision tiers
  • Operationalized scoring outputs for downstream consumption

Stack: Python, scikit-learn, XGBoost, pandas, NumPy
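
Mapping predicted default probabilities to policy decision tiers, as mentioned above, can be sketched with a sorted list of cutoffs. The cutoff values and tier names below are hypothetical, chosen only for illustration; the project's actual policy thresholds are not stated here.

```python
from bisect import bisect_right

# Hypothetical cutoffs on predicted probability of default, in ascending order.
TIER_CUTOFFS = (0.05, 0.15, 0.30)
# One more label than cutoffs: each cutoff is the inclusive lower edge of the next tier.
TIER_LABELS = ("approve", "approve_with_review", "manual_review", "decline")


def assign_tier(p_default: float) -> str:
    """Map a predicted default probability in [0, 1] to a decision tier."""
    if not 0.0 <= p_default <= 1.0:
        raise ValueError(f"probability out of range: {p_default}")
    # bisect_right counts how many cutoffs are <= p_default.
    return TIER_LABELS[bisect_right(TIER_CUTOFFS, p_default)]


if __name__ == "__main__":
    for p in (0.02, 0.05, 0.20, 0.90):
        print(p, assign_tier(p))
```

Keeping the cutoffs in one place makes it easy to revisit them when the score distribution shifts between the training window and an out-of-time holdout.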


Scaling Laws for Small Language Models

Independent research quantifying neural scaling behavior in the sub-1B parameter regime. Repo

  • Identified efficiency threshold at 350M parameters using power-law fitting (R² = 0.99)
  • Documented statistical limitations transparently (n=4 sample size)

Stack: PyTorch, HuggingFace, MobileLLM
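
Power-law fitting of the kind mentioned above is commonly done as a least-squares line in log-log space. The sketch below assumes that approach and is not necessarily the method used in the repository; the function name and the synthetic data in the demo are hypothetical.

```python
import math


def fit_power_law(xs: list[float], ys: list[float]) -> tuple[float, float, float]:
    """Fit y = a * x**b by ordinary least squares on (log x, log y).

    Returns (a, b, r2), where r2 is computed in log space.
    All inputs must be positive and xs must contain at least two distinct values.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    b = sxy / sxx                     # slope = power-law exponent
    log_a = mean_y - b * mean_x       # intercept = log of the prefactor
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(lx, ly))
    ss_tot = sum((y - mean_y) ** 2 for y in ly)
    r2 = 1.0 - ss_res / ss_tot
    return math.exp(log_a), b, r2


if __name__ == "__main__":
    # Synthetic, noise-free data generated from y = 5 * x**-0.1 (illustration only).
    sizes = [1e7, 1e8, 3.5e8, 1e9]
    values = [5.0 * s ** -0.1 for s in sizes]
    print(fit_power_law(sizes, values))
```

With four points and two fitted parameters, only two residual degrees of freedom remain, so a high R² on its own is weak evidence, which is consistent with the n=4 caveat noted above.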


Technical Skills

| Category | Tools |
| --- | --- |
| Languages | Python, SQL, Java, Go, R |
| Backend & APIs | FastAPI, REST API design, async programming, Linux, Docker, Git |
| Data Infrastructure | PostgreSQL, dbt, Prefect, AWS (S3, Redshift), data pipeline design |
| ML & Analytics | scikit-learn, XGBoost, PyTorch, pandas, NumPy, hypothesis testing |
| Cloud & DevOps | AWS, Docker, CI/CD, security fundamentals |

I build backend systems, data pipelines, and infrastructure that turn messy data into something useful.

Pinned

  1. Langgraph-bi-agent-orchestrator (Public)

    Multi-agent business intelligence orchestration system powered by GPT-5/Deepseek v3.2exp, LangGraph, and research-augmented generation.

    Python 1

  2. risk_modeling (Public)

    A machine learning pipeline for predicting loan defaults using LendingClub data. This project explores how financial institutions assess credit risk and which borrower characteristics drive default…

    Jupyter Notebook

  3. ValtricAI/EvidentAI (Public)

    TypeScript

  4. atlas-intel (Public)

    Python

  5. quant-backtesting-validation (Public)

    A research-grade backtesting engine for statistically validating trading strategies against historical market data.

    TypeScript

  6. ValtricAI/Scaling-Laws-for-Small-Language-Models (Public)

    Python