Rylan Schaeffer



29 August 2022

Cheatsheet - Self-Supervised Learning for Vision

by Rylan Schaeffer

| Method | Latent Dim | Batch Size | Optimizer | Learning Rate | Weight Decay | Scheduler | Epochs |
|--------|-----------|------------|-----------|---------------|--------------|-----------|--------|
| SimCLR | 128 | 4096 | LARS | 4.8 | 1e-6 | Linear Warmup, Cosine Decay | 100 |
| TiCo | 256 | 4096 | LARS | 3.2 | 1.5e-6 | Linear Warmup, Cosine Decay | 1000 |
| VICReg | 8192 | 2048 | LARS | 1.6 | 1e-6 | Linear Warmup, Cosine Decay | 1000 |
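All three methods in the table pair LARS with linear warmup followed by cosine decay. A minimal sketch of that schedule (the function name and warmup length are my own, not from any of the papers):

```python
import math

def lr_at_step(step, total_steps, base_lr, warmup_steps):
    """Linear warmup to base_lr, then cosine decay toward 0.

    Illustrative sketch of the "Linear Warmup, Cosine Decay" schedule
    shared by the methods in the table above.
    """
    if step < warmup_steps:
        # Linear ramp: lr grows from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

For example, with SimCLR's base learning rate of 4.8 over 100 steps and a (hypothetical) 10-step warmup, the schedule peaks at 4.8 at the end of warmup and decays to near zero by the final step.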

Figure from VICReg (ICLR 2022):

CPC (Arxiv 2018)

Deep InfoMax (DIM) (ICLR 2019)

AMDIM (NeurIPS 2019)

MoCo (CVPR 2020)

SimCLR (ICML 2020)

\[\ell_{i, j} = -\log \frac{\exp (sim(z_i, z_j) / \tau)}{\sum_{k=1}^{2N} \mathbb{1}_{[k \neq i]} \exp (sim(z_i, z_k) / \tau)}\]
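The NT-Xent loss above can be written compactly in numpy. This is an illustrative sketch (the function name and batch layout convention are mine): rows \(i\) and \(i+N\) are assumed to be the two augmented views of the same image, and embeddings are assumed L2-normalized so the dot product is the cosine similarity.

```python
import numpy as np

def nt_xent(z, tau=0.5):
    """NT-Xent (SimCLR) loss for 2N L2-normalized embeddings.

    Assumes rows i and i+N are the two views of the same image.
    Illustrative numpy sketch, not the official implementation.
    """
    two_n = z.shape[0]
    n = two_n // 2
    sim = z @ z.T / tau                     # cosine similarities / temperature
    np.fill_diagonal(sim, -np.inf)          # indicator 1[k != i]: drop self-pairs
    # Row-wise log-softmax over the remaining 2N - 1 candidates.
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # Each row's positive: row i pairs with i + N (and vice versa).
    pos = np.concatenate([np.arange(n, two_n), np.arange(0, n)])
    return -log_prob[np.arange(two_n), pos].mean()
```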

SwAV (NeurIPS 2020)

BYOL (NeurIPS 2020)

SimSiam (CVPR 2021)

W-MSE (ICML 2021)

TiCo (Rejected at NeurIPS 2021)

\[\ell = -\frac{1}{N} \sum_n (z_n')^T z_n'' + \frac{\rho}{N} \sum_{n} (z_n')^T C_t z_n'\]

where \(C_t\) is the second moment matrix of the representations.

Equivalently:

\[\ell = -\frac{1}{N} \sum_n z_n' \cdot z_n'' + \frac{\rho}{N^2} \sum_{i,j} (z_i' \cdot z_j'')^2\]
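The equivalence between the two forms (when \(C_t\) is taken as the second-moment matrix of the current batch, with no momentum) can be checked numerically. A numpy sketch, with function and variable names of my own choosing:

```python
import numpy as np

def tico_two_forms(z1, z2, rho=1.0):
    """Compute the TiCo loss both ways and return the pair.

    Illustrative sketch: C is the batch second-moment matrix of z'',
    i.e. the no-momentum special case of C_t.
    """
    n = z1.shape[0]
    # Form 1: invariance term plus quadratic penalty through C.
    C = (z2.T @ z2) / n                               # (1/N) sum_j z''_j z''_j^T
    invariance = -(z1 * z2).sum(axis=1).mean()        # -(1/N) sum_n z'_n . z''_n
    form1 = invariance + rho * np.einsum('ni,ij,nj->', z1, C, z1) / n
    # Form 2: sum of squared cross-view dot products.
    gram = z1 @ z2.T                                  # gram[i, j] = z'_i . z''_j
    form2 = invariance + rho * (gram ** 2).sum() / n**2
    return form1, form2
```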

VICReg (ICLR 2022)

The invariance term:

\[\ell_{inv} = \frac{1}{N} \sum_n ||z_n' - z_n''||^2\]

The variance term:

\[\ell_{var} = \frac{1}{D} \sum_d \max(0, \gamma - S(z_d, \epsilon))\]

where \(S(x, \epsilon) = \sqrt{\mathbb{V}[x] + \epsilon}\) is the regularized standard deviation.

The covariance term penalizes the off-diagonal entries of the covariance matrix:

\[C(Z) = \frac{1}{N - 1} \sum_n (z_n - \bar{z}) (z_n - \bar{z})^T \qquad \ell_{cov} = \frac{1}{D} \sum_{i \neq j} [C(Z)]_{i,j}^2\]
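The three VICReg terms combine into one weighted loss; the 25/25/1 weights below follow the paper's defaults, but the function name and structure are my own sketch:

```python
import numpy as np

def vicreg_loss(z1, z2, lam=25.0, mu=25.0, nu=1.0, gamma=1.0, eps=1e-4):
    """Invariance + variance + covariance terms of VICReg.

    Illustrative numpy sketch; lam/mu/nu = 25/25/1 are the paper's defaults.
    """
    n, d = z1.shape
    # Invariance: mean squared distance between the two views.
    inv = ((z1 - z2) ** 2).sum(axis=1).mean()

    def var_term(z):
        # Hinge on the per-dimension regularized standard deviation.
        std = np.sqrt(z.var(axis=0) + eps)
        return np.maximum(0.0, gamma - std).mean()

    def cov_term(z):
        # Squared off-diagonal entries of the covariance matrix, scaled by 1/D.
        zc = z - z.mean(axis=0)
        C = (zc.T @ zc) / (n - 1)
        return ((C - np.diag(np.diag(C))) ** 2).sum() / d

    return (lam * inv
            + mu * (var_term(z1) + var_term(z2))
            + nu * (cov_term(z1) + cov_term(z2)))
```

Feeding the same batch as both views zeroes the invariance term, so the loss reduces to the variance and covariance regularizers alone.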

Barlow Twins (ICML 2021)

tags: machine-learning - self-supervised-learning - vision