VincentG1234/README.md

Vincent Gimenes

Machine Learning Engineer - LLM Systems, Inference & Applied AI

I design, deploy, and optimize production AI systems with a focus on LLM inference, performance engineering, and real-world deployment.

My work sits at the intersection of:

  • Large Language Models & agents
  • inference performance & GPU efficiency
  • ML systems & cloud infrastructure
  • applied AI for document intelligence & decision workflows

🧠 Core Expertise

LLM Systems & Inference

  • vLLM deployment & optimization
  • speculative decoding & batching strategies
  • KV-cache monitoring & memory pressure mitigation
  • latency & throughput optimization
  • GPU scheduling & orchestration

Applied AI & Retrieval Systems

  • RAG agents & document intelligence
  • structured information extraction
  • evaluation pipelines & schema design
  • prompt & system design

ML Engineering & Infrastructure

  • Kubernetes, Docker, ArgoCD
  • GPU workloads & CUDA environments
  • monitoring & performance profiling
  • scalable API deployment

🛠 Technical Stack

Languages: Python • C++ • Bash • JS

AI & ML: PyTorch • Transformers • vLLM • Triton • scikit-learn

Infrastructure & MLOps: Docker • Kubernetes • ArgoCD • Linux • Git

✍️ Writing Blog posts for Quickscale AI

  • Fine-tuning small Vision-Language models for structured extraction
  • Context window scaling & memory implications
  • Deploying GPT-OSS-20B with vLLM
  • Training & workshops on LLM agents and deployment

🎓 Education

ENSAE Paris - Institut Polytechnique de Paris, Engineering Program

Focus areas: Advanced & Bayesian Statistics • Machine Learning • Optimization • Econometrics • Parallel Computing • Deep Learning • NLP

⭐ Indie Projects

🖥 FloatPilot - Desktop LLM Client

Lightweight always-on-top AI assistant

  • Instant screenshot capture injected into LLM context
  • Global shortcut & frictionless workflow
  • Public distribution with landing page

➡️ https://floatpilot.app

🏆 Hackathon Projects

| Hackathon Name | Description | Technologies Used | Link |
| --- | --- | --- | --- |
| Hackathon Banque de France (WINNER) | Solution that automatically identifies legal topics of interest currently handled by the business from its documentation, then generates legal monitoring content on those topics (articles, news, codes of conduct, European legislation) for distribution via a newsletter. | Python, Azure, React, RSS feeds, GPT API, TF-IDF | Not available (sharing the solution is forbidden) |
| H-Gen AI 2025 (WINNER) | Document analysis tool developed for Gide, a leading international law firm. The application streamlines the audit process by automatically analyzing PDF documents with Large Language Models (LLMs) and generating structured audit reports in Word format from predefined templates. | Python, AWS, RAG | GitHub Repo |
| H!Paris | Model trained to predict water levels in water tables over time. | Python, XGBoost, LSTM | None |

📂 Major Projects

Here are some of the key projects I've worked on:

| Project Name | Description | Technologies Used | Link |
| --- | --- | --- | --- |
| Investment opportunities identification - Ardian | Search engine leveraging BERT and additional data to identify firms with high acquisition potential. | Python, BERT, Azure | GitHub Repo |
| Document Chat Application (RAG) | Web application that lets users upload documents and hold conversations about their content using Large Language Models. Built with FastAPI, Firebase authentication, and OpenAI's GPT models. | OpenAI, FastAPI, Docker, K8s | GitHub Repo |
| Analysis of LLM at small scale - INRIA | Implementation and training of small-scale language models (<100M parameters) with the Transformers library on AWS GPU instances, trained on the full English Wikipedia. | Python, Transformers, PyTorch, wandb.ai | GitHub Repo |

🛠️ Other Projects

| Project Name | Description | Technologies Used | Link |
| --- | --- | --- | --- |
| Double Descent | Explores the double descent phenomenon, where test error decreases again once a model becomes overparameterized, using linear regression, random Fourier features (RFF), and neural networks. Experiments confirm that implicit biases enable overparameterized models to generalize effectively, challenging traditional views of overfitting. | Python, PyTorch, Git | GitHub Repo |
| Bayesian Statistics: Optimal Bayesian Estimation of t-Student Mixtures with a Growing Number of Components | Extends Bayesian estimation for Gaussian mixture models to Student's t mixtures, which are better suited to heavy-tailed data. The heavy tails raise theoretical challenges, but empirical simulations show that Bayesian methods perform robustly, particularly on complex or heavy-tailed distributions. | Python, Git | GitHub Repo |
| Time Series Analysis of the French Industrial Production Index for Electricity Production | Data cleaning, transformation to stationarity, model selection, and validation using ARMA and ARIMA models. | R | GitHub Repo |
| Sentiment Analysis | Web scraping to collect data, followed by sentiment analysis of the top 100 box office films. | Python, Selenium, NLTK, spaCy, scikit-learn, pandas | GitHub Repo |

🌍 Languages

French (native) • English (B2–C1) • Russian (B1)

🚀 Get In Touch

Feel free to explore the repository and reach out if you'd like to collaborate or discuss exciting ideas!


Application is the alchemy that transforms your acquired knowledge into gold

Popular repositories

  1. ARDIAN_CAPSTONE (Public)

    Python 1

  2. MLops_ENSAE (Public)

    Python 1

  3. VincentG1234 (Public)

  4. gide (Public)

    Forked from H-GenIAL/gide

    Python

  5. application (Public)

    Forked from ensae-reproductibilite/application

    Jupyter Notebook

  6. MLops_ENSAE_gitops (Public)

    Repository for the CD part of the application from the MLops_ENSAE repository