Measurement-first audit repo for hidden-state verifiers in structured reasoning: outcome readout vs process verification via counterfactual local validity.
Topics: auditing, evaluation, pytorch, reasoning, hidden-states, interpretability, probing, mechanistic-interpretability, structured-reasoning, process-verification
Updated Mar 31, 2026 - Python