This is my personal home rig for serious LLM experimentation. I built it to test models head-to-head, create custom evaluation rubrics, automatically improve prompts based on the previous run’s results, and generate high-quality synthetic training data. Everything runs locally first (Ollama by default), with optional cloud support, and is logged locally.
Analyze Claude Code session logs and generate efficiency reports, cost diagnostics, and actionable recommendations. This project reads local JSONL session logs, computes deterministic efficiency signals, and can optionally add local LLM recommendations using Ollama.
Research repository for the Visual Nudges study, examining how lightweight interface interventions structure analytic judgment in visualization-based peer review.
A multi-turn agentic dialogue system built on hybrid dense-sparse retrieval, hierarchical agent coordination, and rubric-based evaluation. Designed for real-world deployment with FastAPI serving, streaming responses, and full observability.