Rehan Malik rehan243

about me

i'm an AI/ML engineer based in the US. right now i'm building production AI systems at Reallytics.ai and Verticiti, mostly getting large language models to do useful things in the real world. not demos, actual systems with real users and real traffic.

before this i was at Afiniti and Cloud Kinetics for a few years. fraud detection, voice analytics, enterprise search. the kind of stuff that pages you at 3am when something breaks.

honestly what keeps me going is when an agent you built solves something you never explicitly told it to do. that feeling never gets old.

what i'm working on right now:

multi-agent systems that don't fall apart when you chain them
RAG pipelines that actually return relevant results
writing about what i learn every day, check it out here

featured projects

Agentic AI Workflows	8 specialized AI agents with LangChain + OpenAI function calling. multi-agent orchestration with planning loops and guardrails. the project i'm most excited about right now.
RAG Enterprise Search	production retrieval pipeline over 2TB+ data. hybrid dense+sparse search with FAISS and BM25, cross-encoder re-ranking. deployed on AWS SageMaker.
Voice AI Platform	real-time voice infrastructure handling 500+ concurrent calls. WebSockets, Kafka, VAD, streaming STT. built the sentiment analysis piece from scratch.
LLM Fine-Tuning LoRA	fine-tuning LLaMA and Mistral with LoRA/QLoRA/PEFT. 40% cheaper than hosted APIs. includes the full training loop, data pipeline, merge + quantize scripts.
RLHF LLM Optimization	full RLHF pipeline: reward model with Bradley-Terry loss, PPO trainer with KL scheduling, DPO as an alternative. 68% win rate on eval, 96% safety compliance.
Sentinel Fraud Detection	ensemble XGBoost + neural net with 650+ engineered features. Redis-backed real-time velocity scoring, SHAP explainability, Kafka alert routing.

tech stack

not going to pretend i use everything equally. here's what i actually reach for:

the full picture


daily drivers	Python, PyTorch, FastAPI, Docker, Git, VS Code
LLM and GenAI	LangChain, LlamaIndex, HuggingFace Transformers, vLLM, PEFT/LoRA/QLoRA
data and vector	FAISS, ChromaDB, Pinecone, PostgreSQL, MongoDB, Redis, Kafka, Elasticsearch
cloud and MLOps	AWS (SageMaker, Bedrock, Lambda, ECS), GCP Vertex AI, Azure OpenAI
ML frameworks	TensorFlow, scikit-learn, XGBoost, LightGBM, ONNX
infrastructure	Kubernetes, Terraform, GitHub Actions, MLflow, Weights & Biases

github stats

trophies

contribution graph

my github contributions eating themselves

recent writeups

i write about what i'm building and learning. nothing polished, more like notes to my future self that happen to be public.

Retrieval Augmented Generation (RAG) in Production (2026-06-04)
AI Safety and Alignment Engineering (2026-05-31)
Real-Time Model Serving with GPUs (2026-05-30)
Multi-Agent AI Orchestration Patterns (2026-05-29)

📚 View all articles →

recent activity

💬 Commented on issue: 0.9.5 -> 0.9.6 Model-attached skills are injected int in open-webui/open-webui (2026-06-04)

💬 Commented on YOLOE Visual Prompt based Classification in ultralytics/ultralytics (2026-06-04)

💬 Commented on Chinese characters display garbled in sweepai/sweep (2026-06-04)

💬 Commented on Feature Request: Implement Adaptive PFlash (Self-Tuning Pref in ggml-org/llama.cpp (2026-06-04)

💬 Commented on Transformer Engine plugin fails to check weight exists for L in Lightning-AI/pytorch-lightning (2026-06-04)

💬 Commented on Performance/caching issue: tokenizer fails to reset has_spec in explosion/spaCy (2026-06-04)

💬 Commented on PXI: dumps raw resource IDs instead of actionable links in r in Arize-ai/phoenix (2026-06-04)

💬 Commented on Your project is now listed on CodeGuilds in modelcontextprotocol/servers (2026-06-04)

what i'm reading lately

stuff i've been digging into recently. mostly papers, blog posts, and rabbit holes that kept me up too late.

🔬 Fine-Tuning and Customization of Open-Source LLMs for Domain-Specific Tasks

🔬 Retrieval-Augmented Generation (RAG) in Production LLM Systems

🔬 Graph RAG and Knowledge Graphs for LLMs

🔬 LLM Fine-Tuning at Scale with LoRA

🔬 AI Safety and Alignment Engineering

🔬 Edge AI and TinyML

code snippets

📌 Prompt Template Engine with Variable Injection - Production Pattern (Python) (2026-06-04)

📌 Agent Tool Registry with Dynamic Discovery - Production Pattern (Python) (2026-06-04)

📌 Agent Tool Registry with Dynamic Discovery - Production Pattern (Python) (2026-06-02)

🤖 Profile auto-updated on 2026-06-04 20:13 UTC

if you made it this far, you should probably just say hi
