Jump to: About • Zindi • Focus • Competencies • Tech • Competitions • Projects • GitHub • Connect
- Building and sharing ML, Data Science, and Data Engineering work at africdsa.com
- Zindi competitor, consistently placing in top rankings across African AI challenges
- Currently deepening expertise in NLP, Speech Recognition, and LLM fine-tuning
- Building full-stack data products: FastAPI + Next.js + Supabase pipelines
- Ask me about Python, ML workflows, data pipelines, geospatial ML, or real estate analytics
- Proudly coding from Nairobi, Kenya, and solving African problems with AI
- Chess: daily tactics, endgames, and openings
- Building: Nairobi Property Pricing, a full-stack real estate intelligence platform (Next.js + FastAPI + Supabase)
- Competing: Active Zindi challenges in NLP, computer vision, and time-series forecasting
- Exploring: LLM fine-tuning and speech recognition for low-resource African languages
- Sharing: ML tutorials and open-source African data science at africdsa.com
I build practical, production-minded ML and data systems, from ingestion and transformation to modeling, evaluation, and deployment.
| Machine Learning | Data Science | Data Engineering |
|---|---|---|
| Competition | Platform | Position | Solution |
|---|---|---|---|
| Barbados Lands & Surveys Plot Automation | Zindi | Top Finish | Repo |
Full-stack real estate intelligence platform for Nairobi: an automated data pipeline from scraping to an interactive affordability dashboard.
What makes it strong
- Automated scraping pipeline: daily GitHub Actions workflow scrapes live listings and pushes to Supabase
- Intelligent parsing: extracts bedroom counts from messy titles/URLs ("2br", "two bed", etc.)
- Affordability analytics: price-per-bedroom metrics, tier segmentation, location summaries
- Interactive frontend: Next.js + TypeScript dashboard with geographic affordability map
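
The bedroom parsing and price-per-bedroom metric described above could look roughly like the sketch below. This is an illustration only, not the code from the repository: the regular expression, the `_WORDS` map, and the function names `parse_bedrooms` and `price_per_bedroom` are all hypothetical, and real listings will likely need more patterns (studios, "bedsitter", misspellings).

```python
import re
from typing import Optional

# Hypothetical word-to-number map; extend for larger properties.
_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6}

# Matches forms such as "2br", "2 BR", "3-bedroom", "2_beds", "two bed".
# The lookarounds stop matches inside longer words ("12 bright rooms").
_BEDROOM_RE = re.compile(
    r"(?<![a-z0-9])(\d{1,2}|" + "|".join(_WORDS) + r")"
    r"[\s\-_]*(?:br|bed(?:room)?s?)(?![a-z])",
    re.IGNORECASE,
)


def parse_bedrooms(text: str) -> Optional[int]:
    """Return the first bedroom count found in a title or URL, else None."""
    match = _BEDROOM_RE.search(text)
    if match is None:
        return None
    token = match.group(1).lower()
    return int(token) if token.isdigit() else _WORDS[token]


def price_per_bedroom(price: float, title: str) -> Optional[float]:
    """Divide the listing price by the parsed bedroom count.

    Returns None when no count is found, so callers can skip the listing
    instead of treating it as a one-bedroom unit.
    """
    beds = parse_bedrooms(title)
    if not beds:
        return None
    return price / beds
```

For example, `parse_bedrooms("/listings/3-bedroom-house-karen")` yields `3`, and a listing with no recognizable count yields `None` rather than a guess.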
Backend: josephgitau/nairobi_property_pricing • Frontend: josephgitau/nairobi-property-pricing-frontend • Live: josephgitau.me
End-to-end geospatial digitization pipeline for cadastral survey maps: parcel boundary detection + polygon extraction + OCR to produce structured, searchable outputs ready for indexing and downstream GIS use.
What makes it strong
- Boundary segmentation: UNet++ with EfficientNet-B7 encoder for precise parcel detection from raster maps
- Production-grade post-processing: polygon cleaning (hole removal, simplification, smoothing) for valid GIS geometries
- Robust OCR strategy: Qwen3-VL-30B (vision-language) for noisy map text, relying on zero/few-shot generalization instead of risky fine-tuning
- Reproducible deliverables: training + inference notebooks, checkpoints, and final merged outputs (polygons + OCR text)
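
As a rough illustration of the polygon-cleaning step, the sketch below drops small holes and simplifies rings in pure Python. It is not the repository's implementation: the Douglas-Peucker simplifier is a stand-in chosen for this example, the names `ring_area`, `simplify_line`, and `clean_polygon` and both thresholds are hypothetical, smoothing is left out, and rings are assumed to be closed (first point repeated at the end). A GIS library would normally handle this.

```python
from math import hypot
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def ring_area(ring: Sequence[Point]) -> float:
    """Unsigned ring area via the shoelace formula."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def _point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from p to segment ab (or to a when a == b)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dx == 0 and dy == 0:
        return hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def simplify_line(points: Sequence[Point], tolerance: float) -> List[Point]:
    """Douglas-Peucker: drop vertices closer than `tolerance` to the chord."""
    if len(points) < 3:
        return list(points)
    worst_i, worst_d = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > worst_d:
            worst_i, worst_d = i, d
    if worst_d <= tolerance:
        return [points[0], points[-1]]
    left = simplify_line(points[: worst_i + 1], tolerance)
    right = simplify_line(points[worst_i:], tolerance)
    return left[:-1] + right  # avoid duplicating the split vertex


def clean_polygon(
    exterior: Sequence[Point],
    holes: Sequence[Sequence[Point]],
    min_hole_area: float,
    tolerance: float,
) -> Tuple[List[Point], List[List[Point]]]:
    """Remove holes below `min_hole_area`, then simplify every ring."""
    kept = []
    for hole in holes:
        if ring_area(hole) < min_hole_area:
            continue  # treat tiny holes as segmentation noise
        simplified = simplify_line(hole, tolerance)
        if len(simplified) >= 4:  # a closed ring needs at least 4 points
            kept.append(simplified)
    return simplify_line(exterior, tolerance), kept
```

Both `min_hole_area` and `tolerance` are in the same units as the coordinates (pixels for raster-derived polygons), so suitable values depend on map resolution.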
Results: Public 0.965006861 • Private 0.970242006
Repo: josephgitau/Barbados-Lands-and-Surveys-Plot-Automation-Challenge • Data prep: Open in Colab
NLP-powered bot that generates contextual stories based on detected sentiment.
Repo: josephgitau/Sentiment_Story_Generation_Bot
- Chess is both relaxation and mental training: tactics, endgames, and openings every day.
- Passionate about applying AI to solve African problems in housing, agriculture, and language.
- Love clean, elegant solutions to messy real-world data.
- Always building something new. The best model is the next one.
"Build from data. Compete with purpose. Ship it."



