Community-driven behavioral reliability benchmark for LLMs. 231 probes across 19 modules, deterministic scoring, perplexity correlation, layer sensitivity mapping, quant method capture, hardware-stratified community rankings. Every test contributes to the community dataset.
Universal installer, hardware benchmarker, and Claude model recommender for Claude Code. Auto-detects your system and sets up a fully configured Claude Code installation.
A cross-platform, terminal-based hardware benchmark tool written in Python. Measures CPU, Memory, Disk, and GPU performance — with real-time system monitoring and structured JSON export.