A mechanistic interpretability research tool for automated detection, visualization, and suppression of hallucination-associated neurons in open-source LLMs
ai-safety pacmap model-editing mechanistic-interpretability transformerlens llm-hallucination llm-alignment h-neurons sparse-probing interpretability-research
Updated Mar 19, 2026