Incidental Polysemanticity: A New Obstacle for Mechanistic Interpretability
Victor Lecomte, Kushal Thaman, Rylan Schaeffer, Naomi Bashkansky, Trevor Chow, Sanmi Koyejo
arXiv preprint, Under Review, December 2024
Mechanistic Interpretability, Polysemanticity, Neural Networks, AI Safety
Summary: Incidental polysemanticity poses challenges for mechanistic interpretability.