Towards an Improved Understanding and Utilization of Maximum Manifold Capacity Representations

Rylan Schaeffer, Victor Lecomte, Dhruv Bhandarkar Pai, Andres Carranza, Berivan Isik, Alyssa Unell, Mikail Khona, Thomas Yerxa, Yann LeCun, SueYeon Chung, Sanmi Koyejo

arXiv preprint (under review)

June 2024


Abstract

Maximum Manifold Capacity Representations (MMCR) is a new, high-performing self-supervised learning method that originates from the statistical mechanical characterization of the linear separability of manifolds. We provide new perspectives connecting MMCR to information theory and analyze its behavior through the lenses of double descent, neural scaling laws, and multimodality.

Summary

Understanding Maximum Manifold Capacity Representations from information theory, double descent, and scaling law perspectives.

Background: MMCR

MMCR is a new, high-performing self-supervised learning method (NeurIPS 2023). The algorithm proceeds as follows:

Data -> K transforms per datum -> Embed -> Average over K transforms -> Minimize negative nuclear norm

Figure: MMCR algorithm
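The pipeline above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `mmcr_loss`, the `(N, K, D)` input layout, and the projection of each embedding onto the unit hypersphere before averaging are assumptions made for this example. The embedding network itself is omitted, so the input is taken to be already-computed embeddings.

```python
import numpy as np

def mmcr_loss(embeddings):
    """Negative nuclear norm of the per-datum mean embeddings.

    embeddings: array of shape (N, K, D), i.e. N data points,
    K transformed views per datum, D embedding dimensions.
    (Hypothetical layout chosen for this sketch.)
    """
    # Assumption: normalize each embedding to unit length before averaging.
    norms = np.linalg.norm(embeddings, axis=-1, keepdims=True)
    z = embeddings / np.maximum(norms, 1e-12)
    # Average over the K transforms of each datum -> centroids of shape (N, D).
    centroids = z.mean(axis=1)
    # Nuclear norm = sum of singular values of the centroid matrix.
    return -np.linalg.norm(centroids, ord="nuc")

# Collapsed case: every view of every datum maps to the same unit vector.
collapsed = np.tile(np.array([1.0, 0.0, 0.0, 0.0]), (4, 2, 1))
# Spread case: each datum's views share one of four orthogonal unit vectors.
spread = np.repeat(np.eye(4)[:, None, :], 2, axis=1)

print(mmcr_loss(collapsed))  # approximately -2.0 (rank-1 centroids, sqrt(4))
print(mmcr_loss(spread))     # approximately -4.0 (four unit singular values)
```

In this toy comparison the loss is lower when views of the same datum agree and the centroids of different data points spread across orthogonal directions, which is the configuration that minimizing the negative nuclear norm favors.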

Origins

MMCR originates from the statistical mechanical characterization of the linear separability of manifolds.

However, most self-supervised learning (SSL) algorithms originate in information theory. Can MMCR be understood from this perspective as well?

See the full research page for more details.