I build intelligent systems that actually ship to production.
I'm an AI Engineer focused on Generative AI and LLM systems, with hands-on experience designing RAG pipelines, optimizing inference, and building scalable backend systems. I enjoy turning messy real-world problems into clean, efficient solutions, especially when LLMs are involved.
Outside of work, you'll probably find me solving LeetCode problems, experimenting with new model architectures, or squeezing more performance out of GPUs than they probably expected.
- Production-grade RAG systems
- Hybrid search (Dense + BM25)
- Scalable LLM inference optimization
- Document intelligence & OCR pipelines
- Backend systems with Python & Django
- Vector databases (FAISS, Milvus)
- Cloud deployments on Azure
I care deeply about latency, cost, and real-world usability, not just demos.
- Long-context LLM optimization
- Advanced retrieval strategies
- Semantic caching patterns
- LLM system design at scale
500+ problems solved and counting. I treat DSA as a mental gym.
AI / ML
- LLMs, Transformers, RAG
- PyTorch, HuggingFace
- Prompt Engineering & Fine-tuning
Backend
- Python, Django, DRF
- Celery, Redis
- REST API design
Search & Data
- FAISS, Milvus
- Hybrid Retrieval
- Embeddings evaluation
Infra
- Docker & Docker Compose
- Azure VMs
- Azure AI Foundry
- LinkedIn: https://www.linkedin.com/in/balnarendrasapa/
- HuggingFace: https://huggingface.co/bnsapa
- Portfolio: https://balnarendrasapa.github.io/portfolio/
- Email: bnsapa2000@gmail.com