soneeee22000/README.md

Pyae Sone Kyaw · Seon

AI Specialist · Data Scientist · Full-Stack AI Engineer & Architect

Founder & AI Engineer @ Ekkhara — an AI Ventures Studio · Station F, Paris 🇫🇷

I take products from an empty repo to live in production — owning architecture, backend, AI, and front-end end-to-end.

Portfolio · Ekkhara · LinkedIn · Kaggle · Email


👋 Who I Am

I'm a Full-Stack AI Engineer who ships products to real users, not demos. Five-plus years across AI, data, and software engineering — from NLP research labs in Bangkok and Paris to zero-to-one startups at Station F, to founding my own studio.

Today I'm building Ekkhara, a self-funded AI ventures studio, while engineering production systems at the intersection of health tech, regulatory compliance, real-world evidence, and telecom data infrastructure.

  • 🏗️ I architect first, then build. Clean / Hexagonal architecture, API-first design, real tests, CI that stays green.
  • 🤖 My specialty: RAG systems, production AI agents with observability & failure detection baked in, cloud data pipelines, and LLM fine-tuning.
  • 🎓 Dual Master's in Data Science — Télécom SudParis (Institut Polytechnique de Paris) 🇫🇷 & Asian Institute of Technology 🇹🇭.
  • 🌏 Yangon → Bangkok → Paris. Social scientist turned engineer — communication and cross-cultural instincts are part of the toolkit.

🚀 What I'm Building Now — Ekkhara Ventures

Real products, real users, real moats. Each one shipped end-to-end.

🗣️ SpeakProof — TOEFL Speaking Coach, live inside Telegram

The one that's genuinely shipping to real Myanmar learners. A TOEFL speaking & English-practice bot that runs entirely inside Telegram — so learners can train for the computer-based TOEFL despite the country's internet restrictions, no VPN needed. I built the full stack: Python/FastAPI services, LLM-driven speaking feedback & calibrated scoring, and the conversational UX.

Python · FastAPI · LLM · Telegram Bot API

▶ Live → @SpeakProofTOEFLBot

🩺 VitaLens — AI Blood-Test Interpretation

Built & live on GCP Cloud Run. Mistral OCR over French lab reports + deterministic LOINC biomarker classification + FHIR R5 audit trail for personalized supplement guidance. The moat is the data + validation pipeline, not the LLM.

Python · Next.js · FastAPI · PostgreSQL · Mistral OCR · FHIR R5

73 tests · 86% coverage · Haleon @ VivaTech 2026

🌱 VitalAge — Smart-Aging Daily Vitality Companion

Built & live on GCP Cloud Run. A 60-second daily check-in habit loop with Mistral-Vision meal analysis and a longitudinal Vitality Score that compounds over 30 days. Retention is the moat.

Python · Mistral Vision · GCP Cloud Run

64 tests · Nestlé @ VivaTech 2026


🏗️ Featured Engineering

VaxEvidence — Real-World Evidence Platform

Production-grade platform for vaccine researchers: PICO protocol builder, PRISMA screening pipeline, RoB 2 / ROBINS-I assessment, meta-analysis forest plots, real-time CRDT collaboration, and FDA / EMA / CDISC regulatory exports.

Next.js 16 · React 19 · TypeScript · Supabase

76 API routes · 27 DB tables · 1,400+ tests · ▶ Live

GridFlex — Real-Time European Grid Lakehouse

Probabilistic forecasting and stochastic optimisation for battery-flexibility decisions on a real-time AWS lakehouse. Streaming ingestion → feature store → ML serving, fully orchestrated.

AWS · Apache Iceberg · Kafka · dbt · Airflow · MLflow

CDR Pipeline — Telecom Billing Backbone

Event-driven Call Detail Record ingestion, rating, and reconciliation pipeline — the kind of system that bills real mobile traffic. Idempotent, replayable, observable.

Java 21 · Spring Boot 3.5 · Kafka · MySQL · MongoDB · Docker
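
As a rough illustration of what "idempotent, replayable" means for a pipeline like this, the sketch below deduplicates records on a stable key so that re-delivering the same batch never produces a second charge. It is written in Python for brevity (the project itself lists Java and Spring Boot), and every name in it (`Cdr`, `RatingLedger`, `rate_per_second`, the flat per-second tariff) is a hypothetical stand-in, not code from the repository.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Cdr:
    """Hypothetical call detail record; field names are illustrative."""
    record_id: str    # stable, unique key assigned upstream
    subscriber: str   # opaque subscriber identifier
    duration_s: int   # call duration in seconds


class RatingLedger:
    """In-memory stand-in for a rated-records store keyed by record_id."""

    def __init__(self, rate_per_second: Decimal) -> None:
        self.rate_per_second = rate_per_second
        self.charges: dict[str, Decimal] = {}

    def ingest(self, cdr: Cdr) -> bool:
        """Rate a record once; return False if it was already processed."""
        if cdr.record_id in self.charges:
            return False  # replayed record: no second charge
        self.charges[cdr.record_id] = self.rate_per_second * cdr.duration_s
        return True

    def total(self) -> Decimal:
        """Sum of all charges recorded so far."""
        return sum(self.charges.values(), Decimal("0"))


ledger = RatingLedger(rate_per_second=Decimal("0.01"))
batch = [Cdr("a1", "sub-1", 60), Cdr("a2", "sub-2", 30)]

first_pass = [ledger.ingest(c) for c in batch]
replay = [ledger.ingest(c) for c in batch]  # same batch delivered again
```

Because the charge is keyed by `record_id`, the replayed batch is rejected and the ledger total is unchanged. A production system would typically enforce the same property with a unique constraint in the database rather than an in-memory dictionary.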

AgentProbe — AI Agent Failure Taxonomy & Eval Harness

A ReAct agent built observability-first — a failure taxonomy and evaluation harness that catches where agents break, with live SSE streaming of reasoning traces. This is how I think production agents should be built.

Python · FastAPI · Next.js 16 · PostgreSQL · ReAct · Groq · SSE
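
To make the idea of a failure taxonomy concrete, here is a minimal sketch of classifying a ReAct-style trace into failure categories. The categories, the `Step` shape, and the `classify` function are assumptions made up for this example; they are not AgentProbe's actual taxonomy or API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Failure(Enum):
    """Hypothetical failure categories, for illustration only."""
    REPEATED_ACTION = "repeated_action"  # same tool call twice in a row
    STEP_LIMIT = "step_limit"            # hit the step budget without answering
    NO_FINAL_ANSWER = "no_final_answer"  # trace ended early without an answer


@dataclass(frozen=True)
class Step:
    """One thought/action step of an agent trace (illustrative shape)."""
    thought: str
    action: Optional[str] = None         # tool call, e.g. "search('x')"
    final_answer: Optional[str] = None


def classify(trace: list[Step], max_steps: int = 5) -> Optional[Failure]:
    """Return the first failure found in the trace, or None if it looks fine."""
    # A repeated identical action usually signals the agent is stuck in a loop.
    for prev, cur in zip(trace, trace[1:]):
        if cur.action is not None and cur.action == prev.action:
            return Failure.REPEATED_ACTION
    if trace and trace[-1].final_answer is not None:
        return None
    if len(trace) >= max_steps:
        return Failure.STEP_LIMIT
    return Failure.NO_FINAL_ANSWER


looping = [Step("look it up", action="search('a')"),
           Step("try again", action="search('a')")]
healthy = [Step("look it up", action="search('a')"),
           Step("done", final_answer="42")]
```

Here `classify(looping)` flags the repeated tool call, while `classify(healthy)` returns `None`. An evaluation harness could run a function like this over every recorded trace and count failures per category.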

CSRD Lake — ESG / CSRD Data Pipeline

End-to-end CSRD / ESRS sustainability-reporting reference implementation — Snowflake in the cloud, DuckDB locally, dbt transformations, Airflow orchestration, LLM-assisted disclosure mapping.

Snowflake · DuckDB · dbt · Airflow · Claude · Mistral

wikiHow-MT-MY — English↔Myanmar MT Research

Human post-edited English→Myanmar instructional MT corpus, an NLLB-200 fine-tune benchmark, and a novel Instruction Faithfulness Score for evaluating low-resource translation.

Python · NLLB-200 · HuggingFace · PyTorch

More on pseonkyaw.dev — Diameter Credit-Control (Gy/RFC 4006), SMPP Gateway, Mobility Pulse (TimescaleDB + PostGIS + H3), BCBS 239 Lakehouse, and more.


🛠️ Languages & Tools

Languages

Python · Java · TypeScript · JavaScript · SQL

AI & ML

LangChain · LangGraph · OpenAI · Anthropic · Mistral AI · HuggingFace · PyTorch · scikit-learn

Backend & Frameworks

FastAPI · Spring Boot · Node.js · Django · Next.js · React · Tailwind CSS

Data Engineering

Kafka · Spark · Airflow · dbt · Snowflake · Databricks

Cloud & DevOps

AWS · Azure · GCP · Docker · Kubernetes · Terraform · GitHub Actions

Databases

PostgreSQL · MongoDB · Redis · Supabase · Neo4j


Building at the frontier of AI, data, and product — from Station F to the rest of the world.

Open to mid-to-senior roles & collaboration — AI Engineer · ML Engineer · Data Scientist · Data Engineer.

📫 Always happy to talk AI, data, or building something ambitious.

pseonkyaw.dev · ekkhara.com · LinkedIn · Kaggle

Pinned

  1. VaxEvidence-Dev (Public)

    Production-grade Real-World Evidence platform for vaccine researchers. Next.js 16 · React 19 · Supabase · TypeScript. Features PICO protocol builder, PRISMA screening pipeline, RoB 2/ROBINS-I asses…

    TypeScript 1

  2. SafeGen.dev (Public)

    Responsible AI compliance middleware for LLMs — serverless pipeline with PII detection, bias checking, safety filtering, RAG-powered policy rules, real-time dashboard, and full audit trail. Built o…

    Python

  3. csrd-lake (Public)

    End-to-end CSRD/ESRS data pipeline reference implementation — Snowflake (validated) + DuckDB local + dbt + Airflow + Claude/Mistral GenAI extraction with page-level audit lineage.

    Python

  4. entre-deux (Public)

    FHIR-native AI companion for chronic condition patients — consent-first, audit-logged, built with Mistral AI

    Python

  5. storybridge (Public)

    AI-powered bilingual family storytelling companion — bridges heritage languages through interactive bedtime stories with watercolor illustrations and bilingual narration. Built with Google ADK + Ge…

    TypeScript

  6. Seon.dev (Public)

    Production-grade personal portfolio built with Next.js 15, React 19, TypeScript & Tailwind CSS 4. Features particle canvas, 3D tilt cards, glitch text, SVG orbitals, and scroll reveal animations. S…

    TypeScript