hkv-31/README.md

Portfolio LinkedIn Email Research


About me

I'm a third-year B.Tech (CS, AI & ML) student at Atlas SkillTech University, Mumbai, with a 9.5 CGPA and a co-authored research paper published at ICICC 2025.

I build things that sit at the intersection of real-world problems and AI: RAG pipelines on AWS, LLM agents, OCR-driven healthcare systems, and ML models trained on scientific datasets. I care less about following tutorials and more about shipping systems that actually work end-to-end.

Currently focused on: LLM-powered automation, AI agents, and accessible AI products for India.

"Build for the community."


Tech stack

AI / LLM

Python OpenAI Claude Gemini Groq LangChain

ML / Data

Scikit-learn XGBoost Pandas NumPy Jupyter

Backend / Infrastructure

FastAPI AWS Docker MongoDB Redis

Other Languages

Java SQL JavaScript


Featured projects

🩺 AI Digital Health Twin

OCR + Gemini + Random Forest - WHO-aligned health risk estimation for India

Built a system that reads uploaded blood reports via OCR, extracts structured medical values using Gemini 1.5 Flash, runs them through a trained Random Forest classifier, and returns explainable risk scores for diabetes, hypertension, and cardiovascular disease, all aligned to WHO South Asia thresholds.

Stack: Python · FastAPI · Gemini API · Tesseract OCR · Scikit-learn · Vite
Live: digital-twin-health.vercel.app | Repo: AI-Digital-Health-Twin


⚖️ Know Your Rights AI

RAG-based legal AI assistant - fully deployed on AWS

Semantic retrieval pipeline over legal documents using AWS Lambda, DynamoDB, S3, and Amplify. Includes structured LLM outputs, caching layers, and fallback handling for edge cases. Built for users who can't access or afford legal counsel.

Stack: AWS Lambda · DynamoDB · S3 · Amplify · Claude API · RAG
Repo: Know-Your-Rights-AI
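
The caching and fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: `CachedAssistant`, `FALLBACK_ANSWER`, and the `retrieve` / `generate` callables are hypothetical stand-ins for the retrieval step and the LLM call, and an in-memory dict stands in for whatever cache store the deployed system uses.

```python
from typing import Callable, Dict, List

# Hypothetical fallback message; the real wording is not given in this README.
FALLBACK_ANSWER = "No relevant legal section was found. Please rephrase the question."


class CachedAssistant:
    """Answer questions via retrieve -> generate, with a cache and a fallback."""

    def __init__(
        self,
        retrieve: Callable[[str], List[str]],
        generate: Callable[[str, List[str]], str],
    ) -> None:
        self._retrieve = retrieve      # assumed: returns relevant passages
        self._generate = generate      # assumed: wraps the LLM call
        self._cache: Dict[str, str] = {}
        self.llm_calls = 0             # counts attempted generation calls

    @staticmethod
    def _key(question: str) -> str:
        # Normalise case and whitespace so trivially different phrasings share a key.
        return " ".join(question.lower().split())

    def answer(self, question: str) -> str:
        key = self._key(question)
        if key in self._cache:
            return self._cache[key]

        passages = self._retrieve(question)
        if not passages:
            # Edge case: nothing retrieved, so skip the LLM call entirely.
            return FALLBACK_ANSWER

        self.llm_calls += 1
        try:
            result = self._generate(question, passages)
        except Exception:
            # Edge case: the generation step failed; fallbacks are not cached.
            return FALLBACK_ANSWER

        self._cache[key] = result
        return result
```

In this sketch only successful answers are cached, so a transient failure does not get stored and served again on the next request.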


🎯 ICP Signal Scorer

B2B lead qualification agent - LLM scoring + Notion CRM integration

AI agent that evaluates LinkedIn leads for ICP fit using contextual inference, signal extraction, and a weighted scoring system. Automates lead classification and personalized outreach generation for RevOps teams at SaaS companies.

Stack: OpenAI API · Prompt Engineering · Notion CRM · AI Automation
Repo: ICP-Signal-Scorer
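
As a rough illustration of what a weighted scoring system can look like, here is a minimal sketch. The signal names, weights, and tier cut-offs below are all hypothetical and are not taken from the repository; the sketch also assumes each signal has already been extracted upstream (for example by the LLM) as a value between 0 and 1.

```python
from typing import Dict

# Hypothetical signals and weights, chosen only for illustration.
WEIGHTS: Dict[str, float] = {
    "title_seniority": 40,
    "company_size_fit": 35,
    "recent_hiring": 25,
}


def icp_score(signals: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Return a 0-100 fit score as a weighted average of signal strengths."""
    total_weight = sum(weights.values())
    weighted = 0.0
    for name, weight in weights.items():
        # Missing signals count as 0; values are clamped to the range [0, 1].
        strength = min(max(signals.get(name, 0.0), 0.0), 1.0)
        weighted += weight * strength
    return round(100 * weighted / total_weight, 1)


def classify(score: float, hot: float = 70, warm: float = 40) -> str:
    """Map a score to a tier; the cut-offs are assumed, not from the project."""
    if score >= hot:
        return "hot"
    if score >= warm:
        return "warm"
    return "cold"


lead = {"title_seniority": 1.0, "company_size_fit": 0.5, "recent_hiring": 0.0}
print(icp_score(lead), classify(icp_score(lead)))
```

Keeping the weights in a plain dictionary makes the scoring easy to audit and retune without touching the extraction prompts.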


🔭 Exodetect - Exoplanet Classification

Multi-model ML on combined NASA Kepler + TESS datasets

Trained and benchmarked Random Forest, XGBoost, and LightGBM on combined NASA Kepler + TESS transit photometry data for multi-class exoplanet signal classification. Includes feature importance analysis for interpretability.

Stack: Python · Scikit-learn · XGBoost · LightGBM · Jupyter
Repo: exodetect-project


⚛️ Particle Track Reconstruction

Graph-based ML on the TrackML high-energy physics dataset

Reconstructed charged-particle trajectories from detector hit data using the TrackML benchmark dataset, a problem drawn from CERN-adjacent research. Demonstrates ML applied to scientific computing at a level rare for undergraduates.

Repo: Particle-Track-Reconstruction


Research

📄 Harnessing Machine Learning in Fraud Detection: Techniques, Challenges, and Opportunities
Co-Author & Presenter · ICICC 2025
Topics: ensemble methods, anomaly detection, class imbalance, real-world deployment challenges


Let's connect

If you're building something at the intersection of AI and real-world access, let's talk over coffee.

LinkedIn Portfolio Email

Pinned

  1. AI-Digital-Health-Twin (Public)

    DigitalTwinAI is a WHO-aligned AI Health Digital Twin that helps users interpret blood reports and understand long-term health risks in a clear, explainable, and non-diagnostic manner.

    Python

  2. Particle-Track-Reconstruction (Public)

    Particle track reconstruction with the TrackML dataset

    Jupyter Notebook

  3. exodetect-project (Public)

    A machine learning system for detecting and classifying exoplanets using combined data from NASA's Kepler and TESS space missions.

    TypeScript

  4. tanishkashukla/PYTHon-AI-for-Bharat (Public)

    AI-powered legal assistant that retrieves relevant Indian legal sections and explains them in simple language using a serverless AWS architecture.

    TypeScript 1

  5. ai-mock-interview-coach (Public)

    HTML

  6. Code-Explainer-Git-Agent (Public)

    AI agent that explains code & generates tests on @gitagent PR comments

    JavaScript