LLM-as-a-Judge framework for profiling local model capabilities across domains — toward automated model team composition
benchmarking benchmarking-suite slm model-evaluation evaluation-framework benchmarking-framework llm local-llm llm-evaluation deepeval slm-testing local-llm-agent
Updated Apr 28, 2026 - Python