Emergence of Sparse Representations from Noise

Trenton Bricken, Rylan Schaeffer, Bruno Olshausen, Gabriel Kreiman

Accepted at the International Conference on Machine Learning (ICML)

July 2023

Abstract

Neural activations in the brain are estimated to be much sparser than those in artificial neural networks. We present a complementary explanation: adding noise to network inputs causes activations to become sparse. Our finding holds across many tasks, datasets, architectures, and nonlinearities.

Summary

Adding noise to network inputs causes activations to become sparse, a discovery with implications for both neuroscience and deep learning.

The Sparsity Puzzle

Neural activations in the brain are estimated to be much sparser than those in artificial neural networks (ANNs). The typical explanation is that action potentials in the brain are metabolically expensive, but this doesn’t apply to ANNs.

We present a complementary answer.

[Figure: Emergence of sparse representations]

Key Discovery

Adding noise to network inputs causes activations to become sparse. Our finding holds across many tasks, datasets, architectures, and nonlinearities. This suggests that sparse representations are a useful inductive bias for separating signal from noise.
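The setup can be sketched in a few lines: train a small denoising network on noise-corrupted inputs and track how sparse the hidden ReLU activations become. This is an illustrative toy (synthetic sparse-dictionary data, a one-hidden-layer autoencoder, and manual backprop), not the paper's code; the data model, sizes, and noise level are all assumptions, and how much sparsity emerges depends on them.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def sparsity(h):
    """Fraction of hidden activations that are exactly zero."""
    return float(np.mean(h == 0.0))

# Synthetic data from a sparse-dictionary model (an illustrative choice,
# not the paper's datasets): each sample mixes a few dictionary atoms.
n_in, n_hid, n_samples = 32, 64, 1024
D = rng.normal(size=(n_in, n_hid)) / np.sqrt(n_in)      # ground-truth dictionary
codes = rng.random((n_samples, n_hid)) * (rng.random((n_samples, n_hid)) < 0.1)
X = codes @ D.T

# One-hidden-layer ReLU autoencoder trained to reconstruct the *clean*
# input from a noise-corrupted input (denoising objective).
W1 = rng.normal(size=(n_hid, n_in)) * 0.1
W2 = rng.normal(size=(n_in, n_hid)) * 0.1
lr, noise_std = 0.05, 0.3

for step in range(500):
    noisy = X + rng.normal(0.0, noise_std, size=X.shape)  # noise injection
    pre = noisy @ W1.T
    h = relu(pre)
    X_hat = h @ W2.T
    err = X_hat - X
    # Manual backprop for the mean squared reconstruction error.
    g_out = 2.0 * err / len(X)
    g_W2 = g_out.T @ h
    g_h = (g_out @ W2) * (pre > 0)
    g_W1 = g_h.T @ noisy
    W2 -= lr * g_W2
    W1 -= lr * g_W1

h_final = relu(X @ W1.T)
print(f"hidden sparsity on clean inputs after training: {sparsity(h_final):.3f}")
```

Varying `noise_std` (including setting it to zero) and re-measuring the final sparsity is the natural ablation this sketch supports.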

[Figure: Sparsity across architectures]

Biological Receptive Fields

When trained on images, the network's neurons also develop receptive fields resembling those of biological neurons, and the network self-organizes into something vaguely resembling an inhibitory interneuron circuit.
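For a network trained on image inputs, each hidden unit's receptive field is simply its row of input weights reshaped back to the image dimensions; a minimal sketch (the weight matrix here is random, standing in for a trained one, and the 8x8 patch size is an assumption):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical trained encoder weights for flattened 8x8 image patches:
# one row of W1 per hidden unit, one column per pixel.
n_hidden, height, width = 64, 8, 8
W1 = rng.normal(size=(n_hidden, height * width))

# Reshape each row back into image space to inspect it as a filter.
receptive_fields = W1.reshape(n_hidden, height, width)
print(receptive_fields.shape)  # one 8x8 receptive field per hidden unit
```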

[Figure: Biological receptive fields]

Theoretical Analysis

We analytically show that noise injection induces three implicit loss terms which, in the ReLU activation setting, cause the model to:

  1. Sparsify its activations
  2. Maximize its margin around the 0 threshold (pre-activations become either strongly negative or strongly positive)
  3. Specialize its weights to high-variance regions of the input, minimizing overlap between weight vectors
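The three effects above correspond to measurable quantities on a trained layer. A sketch of one way to measure each (these metric choices are ours, not the paper's exact definitions, and the random weights stand in for a trained network):

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(x):
    return np.maximum(x, 0.0)

# Illustrative layer: random weights and inputs stand in for a trained network.
n_in, n_hid, n_samples = 32, 64, 256
W = rng.normal(size=(n_hid, n_in)) / np.sqrt(n_in)
X = rng.normal(size=(n_samples, n_in))
pre = X @ W.T          # pre-activations
h = relu(pre)

# 1. Sparsity: fraction of hidden activations that are inactive (exactly zero).
sparsity = np.mean(h == 0.0)

# 2. Margin around the 0 threshold: mean distance of pre-activations from
#    zero; large values mean units are decisively "off" or "on".
margin = np.mean(np.abs(pre))

# 3. Weight overlap: mean absolute cosine similarity between distinct
#    hidden units' weight vectors; specialization drives this down.
Wn = W / np.linalg.norm(W, axis=1, keepdims=True)
cos = Wn @ Wn.T
overlap = np.mean(np.abs(cos[~np.eye(n_hid, dtype=bool)]))

print(f"sparsity: {sparsity:.3f}, margin: {margin:.3f}, overlap: {overlap:.3f}")
```

Tracking these three numbers over training, with and without input noise, is a direct way to check the predicted effects.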

[Figure: Theoretical analysis]

[Figure: Weight specialization]


See the full research page for more details.