Physician · Clinical Data Scientist · Senior Statistical Programmer
Bridging healthcare and code through R, CDISC, OMOP CDM, and AI-driven pipelines for Real-World Evidence.
I'm an MD who programs. I work at the seam between clinical trials, regulatory data standards, and applied AI, turning messy clinical source documents into submission-grade, reproducible R pipelines.
- 🧬 Clinical R / pharmaverse: moving teams from SAS to admiral, cards, Tplyr, metacore, xportr.
- 📐 CDISC standards: SDTM, ADaM, and the Analysis Results Standard (ARS), from annotated TLF shells → ARS JSON → ARD.
- 🏥 OHDSI / OMOP CDM: phenotype curation, MIMIC-IV ETL, LLM-assisted concept-set benchmarking.
- 🤖 Clinical AI: LLM pipelines for spec generation, metadata enrichment, and phenotype discovery.
- 💻 SAS / Python: production-grade SAS for clinical trial programming and regulatory submissions; Python for automation, data wrangling, and LLM pipeline development.
| Repo | What it is |
|---|---|
| pharmaverse-tutorials | 46 interactive learnr tutorials (712 live exercises) for the SAS→pharmaverse transition, built on real CDISCPILOT01 data. |
| arsbridge | R package: parse/validate/execute CDISC ARS specs into tidy ARD via {cards}, with multi-LLM metadata enrichment. |
| ars-learnr-tutorial | 7-chapter hands-on course: annotated TLF shells → ARM-TS JSON on pharmaverse datasets. |
| cards-tutorial | 10-chapter {cards}/{cardx} course covering ARD, model tidying, and ARS JSON mapping. |
| omop-phenotype-pipeline | Benchmarking LLM-assisted OMOP phenotype curation with concept- and patient-level F1 on MIMIC-IV. |
| precise-X 🔒 (private) | Lead statistical programmer: built and validated the Cox PH + LASSO survival model predicting first severe COPD exacerbation within 5 years from UK primary-care records. Published in Thorax (2025). 📄 Read the paper |
R · pharmaverse (admiral, cards, Tplyr, metacore, xportr, teal) · SAS · SQL / PostgreSQL · Python · OMOP CDM · MIMIC-IV · CDISC SDTM / ADaM / ARS · Docker · LLM pipelines (Claude, Gemini)
Open to clinical-R, RWE, and health-AI collaboration. Reach out via any repo discussion or my site.

