Rylan Schaeffer



Incidental Polysemanticity: A New Obstacle for Mechanistic Interpretability

Victor Lecomte, Kushal Thaman, Rylan Schaeffer, Naomi Bashkansky, Trevor Chow, Sanmi Koyejo

arXiv preprint (under review)

December 2024

Summary

Incidental polysemanticity poses challenges for mechanistic interpretability.