galafis/README.md

Gabriel Demetrios Lafis

Data Scientist | MLOps | Generative AI (LLMs)

LinkedIn GitHub Credly Gmail


About Me

Data Scientist with hands-on experience across the full ML lifecycle: from ETL/ELT and exploratory analysis to predictive modeling, deployment, and production monitoring (MLOps). I specialize in building end-to-end AI solutions that bridge technical depth with business strategy.

My work spans real-time fraud detection, LLM refinement with RLHF, clinical NLP pipelines, and high-frequency trading analytics, with a consistent focus on delivering measurable impact.

  • Currently pursuing a postgraduate program in AI & Data Science in Healthcare at Hospital Sirio-Libanes and an MBA in Data Science, AI & Analytics at USP/Esalq
  • Certified by Google, IBM, Johns Hopkins, and Wharton/UPenn
  • Open to opportunities in Data Science, MLOps, and Generative AI

Impact Highlights

| Metric | Result |
|---|---|
| False positives reduced in fraud detection system | -28% (30K+ transactions/day) |
| MLOps pipeline uptime (monitoring 4 models in production) | 99.2% |
| LLM alignment score improvement via RLHF (5K+ evaluation pairs) | +18% |
| Analysis time reduction through ML/AI solutions | -35% |
| SQL query performance optimization | +40% |

Tech Stack

Languages

Python, SQL, R, JavaScript, TypeScript, Java, Scala, Go

ML & AI

TensorFlow, PyTorch, scikit-learn, XGBoost, LightGBM, Hugging Face, LangChain

Data & MLOps

Apache Spark, Kafka, Airflow, MLflow, Docker, Kubernetes, GitHub Actions

Cloud & Databases

GCP, AWS, Azure, BigQuery, PostgreSQL, MongoDB, Redis

BI & Visualization

Power BI, Looker Studio, Plotly, Tableau


Featured Projects

AI Financial Fraud Detection

Real-time fraud detection with ensemble ML and end-to-end MLOps

Python, TensorFlow, XGBoost, MLflow, Kafka, Spark Streaming

  • Ensemble of 4 models (Random Forest, XGBoost, Neural Networks, Autoencoders)
  • AUC of 0.94 with latency under 200 ms
  • Full MLOps pipeline with drift detection and model explainability

Repo

High-Frequency Trading Analytics

Real-time analytics platform for HFT data processing

Python, PySpark, Kafka, PostgreSQL

  • Processes 10K+ events/second in real time
  • Market microstructure signal detection and monitoring
  • Ultra-low latency data ingestion pipeline

Repo

Clinical NLP Pipeline (PT-BR)

Medical entity extraction from Brazilian clinical texts

Python, Hugging Face, spaCy, BERTimbau, FastAPI

  • NER for ICD-10 diagnoses, medications, symptoms, and procedures
  • Fine-tuned BERTimbau with optimized F1-score
  • REST API for real-time inference in healthcare settings

Repo

Genomic Data Analysis Pipeline

Automated NGS data processing and multi-omics analysis

Python, R, Nextflow, Docker, TensorFlow

  • Supports DNA-seq, RNA-seq, single-cell, and ChIP-seq workflows
  • Multi-omics analysis with ML-based genomic insights
  • Scalable for HPC and cloud environments (AWS, GCP, Azure)

Repo


Certifications

| Provider | Certification | Credential |
|---|---|---|
| Google | Advanced Data Analytics, Data Analytics | View |
| IBM | AI Engineering, Generative AI Engineering, Deep Learning, Data Science, Data Engineering, Machine Learning | View |
| Johns Hopkins | Data Science Specialization | View |
| Wharton | Business Analytics | View |

All credentials: credly.com/users/gabriel-lafis


Professional Experience

| Role | Company | Period | Highlights |
|---|---|---|---|
| Data Scientist / Data Analyst | trade2go | 2025–2026 | ML lifecycle, predictive models, BI dashboards |
| Data Science Researcher | Manus AI | 2025–present | R&D in AI/ML, architecture experimentation, feature engineering |
| Data Scientist / GenAI Consultant | Mindrift | 2024–2025 | LLM refinement with RLHF, code validation, benchmarking |
| Cybersecurity Analyst | Sicredi | 2023–2025 | Real-time fraud detection, MLOps pipelines, anomaly detection |
| Fullstack Dev Intern | EBANX | 2022–2023 | Scalable web apps, query optimization, agile team |

Education

  • Postgraduate in AI & Data Science in Healthcare — Hospital Sirio-Libanes (2026–2027, in progress)
  • MBA in Data Science, AI & Analytics — USP/Esalq (2026–2027, in progress)
  • B.Tech in Systems Analysis and Development — UniDomBosco (2022–2025)
  • Data Science Professional Certificate — EBAC (2024–2025)

Let's Connect

I'm always open to discussing Data Science, MLOps, Generative AI, or collaboration opportunities.

LinkedIn Gmail Credly


Pinned

  1. Advanced-ML-Pipeline

    ML classification pipeline with automated EDA, comparison of 4 models (RF, GB, LR, SVM), GridSearchCV, and persistence. Educational project.

    Python

  2. ai-financial-fraud-detection

    AI-powered fraud detection system for financial transactions. Uses ensemble models, anomaly detection, and real-time scoring to identify fraudulent patterns.

    Python

  3. genomic-data-analysis-pipeline

    End-to-end genomic data analysis pipeline: DNA-seq, RNA-seq, single-cell, and ChIP-seq workflows with ML-based insights on HPC and cloud (AWS, GCP, Azure).

    Python

  4. high-frequency-trading-analytics

    Real-time analytics platform for high-frequency trading data. Processes tick-level data with ultra-low latency for market microstructure insights and trading performance analysis.

    Python

  5. clinical-nlp-pipeline-ptbr

    Clinical NLP pipeline for Brazilian Portuguese: medical entity extraction (NER) from clinical texts using Transformers (BERTimbau/BioBERTpt).

    Python