14 July 2025
ICML 2025 Day 1 Morning Orals
by Rylan Schaeffer
Oral 1E Theory and Phenomenology
An analytic theory of creativity in convolutional diffusion models
Layer by Layer: Uncovering Hidden Representations in Language Models
- Claims (two myths the paper challenges):
- Myth 1: Final layers always give the best embeddings
- Myth 2: Middle layers are useless for downstream tasks
- MTEB Benchmark
- Used 32 diverse tasks spanning 5 different domains
- Probed every model layer
- Goal: Find which layers create the best embeddings
- Middle layers win (by a large margin)
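
To make "probed every model layer" concrete, here is a minimal sketch of a layer-wise probe. It is not the paper's evaluation pipeline: MTEB scoring is replaced by a toy nearest-centroid classifier, mean pooling over tokens is assumed as the embedding method, and all function names (`mean_pool`, `nearest_centroid_accuracy`, `score_layers`, `best_layer`) are hypothetical. It assumes each example's hidden states are already available as one list of token vectors per layer.

```python
from statistics import mean


def mean_pool(vectors):
    """Average a list of equal-length vectors into a single vector."""
    dim = len(vectors[0])
    return [mean(vec[d] for vec in vectors) for d in range(dim)]


def nearest_centroid_accuracy(train, test):
    """Toy stand-in for a downstream task (assumption, not MTEB).

    train and test are lists of (embedding, label) pairs. Class centroids
    are fit on train, and accuracy is measured on test.
    """
    by_label = {}
    for emb, label in train:
        by_label.setdefault(label, []).append(emb)
    centroids = {label: mean_pool(embs) for label, embs in by_label.items()}

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    correct = 0
    for emb, label in test:
        pred = min(centroids, key=lambda c: sqdist(emb, centroids[c]))
        correct += pred == label
    return correct / len(test)


def score_layers(train_states, train_labels, test_states, test_labels):
    """Score every layer separately.

    states[i][layer] holds the token vectors of example i at that layer.
    Returns one accuracy per layer, in layer order.
    """
    num_layers = len(train_states[0])
    scores = []
    for layer in range(num_layers):
        train = [(mean_pool(ex[layer]), y)
                 for ex, y in zip(train_states, train_labels)]
        test = [(mean_pool(ex[layer]), y)
                for ex, y in zip(test_states, test_labels)]
        scores.append(nearest_centroid_accuracy(train, test))
    return scores


def best_layer(scores):
    """Index of the highest-scoring layer (first one wins on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)
```

With Hugging Face `transformers`, per-layer states of this kind are typically obtained by calling the model with `output_hidden_states=True`. A real probe would swap the toy classifier for the benchmark's own task metrics.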

tags: machine-learning - icml - icml-2025