Ndhakeph/README.md

Nishad Dhakephalkar

AI/GenAI engineer building production-oriented AI systems - RAG, agents, evaluation, and safety. Computer Engineering @ I²IT Pune · Grad 2027 · Pune, India

I care more about the parts of an AI system that decide whether it actually works - retrieval quality, evaluation, safety, and orchestration - than about model hype. Most of what I build is an attempt to make LLM behaviour reliable and measurable, not just demo-able.

Python TypeScript FastAPI Next.js LangChain PostgreSQL Supabase Docker Hugging Face


Selected work

AI Evaluation Platform · live demo ↗
An LLM-as-judge that scores outputs against a rubric with per-criterion written reasoning, plus pairwise A/B comparison that runs both orderings (A-B and B-A) to neutralise the judge's own position bias and flags when the two disagree.
Next.js · TypeScript · FastRouter · Supabase
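As a rough illustration of the both-orderings idea (not the platform's actual code), the sketch below runs a judge twice with the responses swapped and flags disagreement. The `Judge` signature, the verdict labels, and the choice to return no winner on disagreement are all assumptions made for this example.

```python
from typing import Callable, Literal

Verdict = Literal["first", "second", "tie"]
# Hypothetical judge signature (an assumption for this sketch):
# (prompt, response_shown_first, response_shown_second) -> verdict by position
Judge = Callable[[str, str, str], Verdict]


def judge_pairwise(judge: Judge, prompt: str, a: str, b: str) -> dict:
    """Run the judge in both orders and flag when the two runs disagree."""
    verdict_ab = judge(prompt, a, b)  # A shown in first position
    verdict_ba = judge(prompt, b, a)  # B shown in first position

    # Translate position-based verdicts back to response labels.
    winner_ab = {"first": "A", "second": "B", "tie": "tie"}[verdict_ab]
    winner_ba = {"first": "B", "second": "A", "tie": "tie"}[verdict_ba]

    agree = winner_ab == winner_ba
    return {
        # Assumption: on disagreement no winner is declared.
        "winner": winner_ab if agree else None,
        "position_bias_suspected": not agree,
        "orderings": {"A-B": winner_ab, "B-A": winner_ba},
    }
```

A judge that always prefers whichever response it sees first produces different winners across the two orderings, so the result comes back flagged rather than silently favouring A.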

ServiceBench · live on 🤗 Spaces ↗
An OpenEnv-compatible environment for training LLM agents to orchestrate calls across three interconnected backend services. The agent has to traverse user → order → inventory foreign keys in the right order, not just call one tool in isolation, with dense reward shaping over milestones and completion. Built for the Meta × Hugging Face × PyTorch hackathon.
Python · FastAPI · Docker · HF Spaces
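One way dense reward shaping over ordered milestones could look is sketched below. This is not ServiceBench's reward function: the milestone names, reward values, and the out-of-order penalty are hypothetical, and the OpenEnv interface is deliberately left out.

```python
from dataclasses import dataclass

# Hypothetical milestone order, mirroring the user -> order -> inventory chain.
MILESTONES = ("user", "order", "inventory")


@dataclass
class EpisodeState:
    reached: int = 0    # how many milestones have been completed, in order
    done: bool = False  # set once the completion bonus has been paid


def step_reward(
    state: EpisodeState,
    service_called: str,
    task_completed: bool,
    milestone_reward: float = 0.2,      # assumed value
    completion_reward: float = 1.0,     # assumed value
    wrong_order_penalty: float = -0.1,  # assumed value
) -> float:
    """Reward for one agent step; mutates `state` to track progress."""
    if state.done:
        return 0.0
    reward = 0.0
    if state.reached < len(MILESTONES) and service_called == MILESTONES[state.reached]:
        # The next expected service was called: pay the milestone once.
        state.reached += 1
        reward += milestone_reward
    elif service_called in MILESTONES[state.reached + 1:]:
        # The agent skipped ahead in the foreign-key chain.
        reward += wrong_order_penalty
    if task_completed and state.reached == len(MILESTONES):
        state.done = True
        reward += completion_reward
    return reward
```

Paying each milestone only once, and only in sequence, gives the agent a learning signal before the final completion reward without rewarding repeated calls to the same service.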

RAG Knowledge Assistant
Local-first document Q&A that runs fully offline - Ollama/Gemma for inference, Supabase pgvector (HNSW) for retrieval, and a hybrid reranker that blends vector similarity with keyword, position, and recency signals. No API keys, nothing leaves the machine.
Next.js · LangChain.js · Ollama · pgvector
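A minimal sketch of a hybrid reranker that blends the four signals named above, written in Python for illustration. The weights, the half-life decay, the position formula, and the `Chunk` fields are assumptions, not the project's actual scoring.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    vector_similarity: float  # cosine similarity from the vector search, assumed in [0, 1]
    position: int             # chunk index within its document (0 = start)
    age_days: float           # days since the source document was last updated


def keyword_overlap(query: str, text: str) -> float:
    """Fraction of query terms that also appear in the chunk (whitespace tokens)."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & set(text.lower().split())) / len(query_terms)


def hybrid_score(
    query: str,
    chunk: Chunk,
    weights: tuple[float, float, float, float] = (0.6, 0.2, 0.1, 0.1),  # assumed weights
    half_life_days: float = 30.0,                                       # assumed decay
) -> float:
    w_vec, w_kw, w_pos, w_rec = weights
    position_score = 1.0 / (1.0 + chunk.position)             # earlier chunks score higher
    recency_score = 0.5 ** (chunk.age_days / half_life_days)  # halves every half-life
    return (
        w_vec * chunk.vector_similarity
        + w_kw * keyword_overlap(query, chunk.text)
        + w_pos * position_score
        + w_rec * recency_score
    )


def rerank(query: str, chunks: list[Chunk], top_k: int = 5) -> list[Chunk]:
    """Sort candidates by blended score, best first."""
    return sorted(chunks, key=lambda c: hybrid_score(query, c), reverse=True)[:top_k]
```

With these weights, a chunk with slightly lower vector similarity can still outrank a closer embedding match when it contains the query terms and comes from a recent document.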

AI Safety Harness
A red-team platform that runs adversarial prompts through a five-layer guardrail pipeline - jailbreaks, prompt injection, harmful content, role manipulation, encoding tricks - and scores which layer caught or missed each attack, with incident logging.
Python · FastAPI · Docker · PostgreSQL
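The layered catch-or-miss scoring can be sketched as below, assuming one layer per attack category. The detectors here are toy keyword and regex checks invented for the example and bear no relation to the harness's real guardrails; only the control flow (first layer to fire gets credit, otherwise the attack is recorded as missed) is the point.

```python
import base64
import re
from collections import Counter
from typing import Callable, Optional


def has_base64_payload(prompt: str) -> bool:
    """Toy check: any long token that decodes cleanly as base64."""
    for token in prompt.split():
        if len(token) >= 16 and re.fullmatch(r"[A-Za-z0-9+/]+={0,2}", token):
            try:
                base64.b64decode(token, validate=True)
                return True
            except ValueError:  # binascii.Error is a ValueError subclass
                continue
    return False


# Hypothetical layers in pipeline order: (name, detector). Placeholders only.
LAYERS: list[tuple[str, Callable[[str], bool]]] = [
    ("jailbreak", lambda p: "ignore all previous instructions" in p.lower()),
    ("prompt_injection", lambda p: "system prompt" in p.lower()),
    ("harmful_content", lambda p: "write ransomware" in p.lower()),
    ("role_manipulation", lambda p: re.search(r"\byou are now\b", p.lower()) is not None),
    ("encoding_tricks", has_base64_payload),
]


def first_catching_layer(prompt: str) -> Optional[str]:
    """Return the name of the first layer that flags the prompt, or None."""
    for name, detector in LAYERS:
        if detector(prompt):
            return name
    return None


def score_attacks(attacks: list[str]) -> dict:
    """Tally which layer caught each attack and list the ones that slipped through."""
    caught: Counter = Counter()
    missed: list[str] = []
    for prompt in attacks:
        layer = first_catching_layer(prompt)
        if layer is None:
            missed.append(prompt)
        else:
            caught[layer] += 1
    return {"caught_by_layer": dict(caught), "missed": missed}
```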

Multi-Agent Content Pipeline
Four coordinated agents - researcher → writer → fact-checker → polisher - with quality gates and a revision loop where the fact-checker can send a draft back to the writer under a bounded retry budget.
Next.js · LangChain · Gemini · Tavily
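The bounded revision loop might look roughly like this. The four agents are passed in as plain callables with hypothetical signatures, and returning the unpolished draft with `passed=False` once the budget runs out is an assumption; the real pipeline may handle exhaustion differently.

```python
from typing import Callable


def run_content_pipeline(
    topic: str,
    research: Callable[[str], str],             # topic -> research notes
    write: Callable[[str, list[str]], str],     # (notes, issues to fix) -> draft
    fact_check: Callable[[str, str], list[str]],  # (draft, notes) -> list of issues
    polish: Callable[[str], str],               # draft -> final text
    max_revisions: int = 2,                     # assumed retry budget
) -> dict:
    notes = research(topic)
    draft = write(notes, [])
    revisions = 0

    issues = fact_check(draft, notes)
    # Quality gate: send the draft back to the writer until it passes
    # or the retry budget is spent.
    while issues and revisions < max_revisions:
        draft = write(notes, issues)
        revisions += 1
        issues = fact_check(draft, notes)

    passed = not issues
    return {
        # Assumption: only drafts that clear the fact-check gate get polished.
        "text": polish(draft) if passed else draft,
        "passed": passed,
        "revisions": revisions,
        "open_issues": issues,
    }
```

Capping the loop keeps a writer and fact-checker that never converge from cycling indefinitely, and the returned `open_issues` make a failed run inspectable.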

GST Shield · live demo ↗
Scans receipts with Claude Vision to extract GSTINs, then validates them deterministically - format, state code, and mod-36 checksum - instead of trusting the model's raw output. Built at a hackathon.
Next.js · Claude Vision · Supabase
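For illustration, the three deterministic checks can be sketched as follows. The regex, the accepted state-code set, and the error labels are assumptions for this example and should be verified against the official GST specification; the check character uses the commonly described Luhn-style mod-36 scheme over the first 14 characters.

```python
import re

CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Assumed layout: 2-digit state code, 10-character PAN, entity code, literal "Z", check character.
GSTIN_PATTERN = re.compile(r"[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]")

# Assumption: codes 01-38 plus 97; confirm against the current official list.
STATE_CODES = {f"{n:02d}" for n in range(1, 39)} | {"97"}


def check_char(first14: str) -> str:
    """Mod-36 check character for the first 14 characters of a GSTIN."""
    total = 0
    for i, ch in enumerate(first14):
        # Alternate weights 1, 2, 1, 2, ... from the left.
        product = CHARSET.index(ch) * (2 if i % 2 else 1)
        # Add the base-36 "digits" of the product (quotient plus remainder).
        total += product // 36 + product % 36
    return CHARSET[(36 - total % 36) % 36]


def validate_gstin(raw: str) -> list[str]:
    """Return a list of failed checks; an empty list means the GSTIN passed."""
    gstin = raw.strip().upper()  # tolerate OCR casing and stray whitespace
    if not GSTIN_PATTERN.fullmatch(gstin):
        return ["format"]
    errors = []
    if gstin[:2] not in STATE_CODES:
        errors.append("state_code")
    if check_char(gstin[:14]) != gstin[14]:
        errors.append("checksum")
    return errors
```

For example, `validate_gstin("27AAPFU0939F1ZV")` (a commonly cited sample GSTIN) returns an empty list, while changing the last character to `W` yields `["checksum"]`, which is the kind of single-character OCR slip a vision model can make.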


Currently

Deepening the evaluation and agent-eval work - single-output scoring, pairwise judging, position-bias mitigation - and digging into retrieval-quality metrics for RAG.

Reach me

ndhakeph@gmail.com · LinkedIn · 🤗 Hugging Face

Pinned

  1. ai-eval-platform Public

    LLM evaluation platform with rubric scoring, A/B judging, and position-bias auditing

    TypeScript

  2. khatabook-GSTshield Public

    GST validation and compliance shield for small-business ledgers

    TypeScript 1 1

  3. ai-rag-knowledge-assistant Public

    Production-style RAG assistant: pgvector HNSW retrieval, reranking, grounded answers

    TypeScript

  4. ai-safety-harness Public

    Adversarial safety-testing harness for LLMs: layered jailbreak detection and scoring

    TypeScript

  5. ai-content-pipeline Public

    Multi-agent content pipeline orchestrating LLM stages with review and revision

    TypeScript 1

  6. Agentic-Honeypot Public

    Agentic Honeypot is an intelligent, automated counter-scam system designed to intercept, engage, and extract intelligence from cybercriminals.

    Python