year: 2022
paper: https://arxiv.org/pdf/2105.04906.pdf
website:
code:


Builds upon Barlow Twins - Self-Supervised Learning via Redundancy Reduction

Invariance

Agreement between positive examples should be high. Cosine similarity could be used for this; VICReg actually uses the (mean squared) Euclidean distance between the two embeddings of each positive pair.
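A minimal sketch of the invariance term, assuming embeddings are given as plain Python lists of floats (one list per sample) and that row i of both batches comes from the same image. The function name `invariance_loss` is made up for these notes. It follows the paper's form (squared distance averaged over the batch); implementations that use a mean-squared-error call also average over the embedding dimensions, which only changes the scale.

```python
def invariance_loss(z_a, z_b):
    """Squared Euclidean distance between paired embeddings, averaged over the batch."""
    assert len(z_a) == len(z_b), "both views must have the same batch size"
    total = 0.0
    for a, b in zip(z_a, z_b):
        # Squared Euclidean distance for one positive pair.
        total += sum((x - y) ** 2 for x, y in zip(a, b))
    return total / len(z_a)
```

Identical views give a loss of 0, so on its own this term could be minimized by a collapsed (constant) embedding, which is why the other two terms are needed.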

Variance

Keep variance over a certain threshold: This term forces the embedding vectors of samples within a batch to be different.

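They don’t try to maximize the standard deviation; they only stop it from becoming very low, using a hinge loss applied to each embedding dimension:

$$v(Z) = \frac{1}{d} \sum_{j=1}^{d} \max\left(0,\ \gamma - S(z^j, \epsilon)\right)$$

where $z^j$ is the vector of values of dimension $j$ across the batch, $\gamma$ is the target standard deviation (1 in the paper), and $S$ is the regularized standard deviation defined by:

$$S(x, \epsilon) = \sqrt{\mathrm{Var}(x) + \epsilon}$$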
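The hinge on the standard deviation can be sketched in a few lines. This is only an illustration in plain Python, assuming a batch given as a list of embedding lists; `variance_loss`, `gamma` and `eps` are names chosen for these notes, with defaults (1 and 1e-4) matching the values I recall from the paper's pseudocode. The variance uses the sample (n - 1) estimator.

```python
import math
import statistics


def variance_loss(z, gamma=1.0, eps=1e-4):
    """Hinge loss on the regularized std of each embedding dimension."""
    columns = list(zip(*z))  # one tuple per embedding dimension
    hinge_terms = []
    for column in columns:
        # Regularized standard deviation: sqrt(Var + eps).
        std = math.sqrt(statistics.variance(column) + eps)
        # Penalize only when the std falls below the threshold gamma.
        hinge_terms.append(max(0.0, gamma - std))
    return sum(hinge_terms) / len(hinge_terms)
```

A fully collapsed batch (every sample identical) gets a loss close to `gamma`, while a batch whose dimensions already have a standard deviation above `gamma` gets exactly 0: nothing is gained by spreading further.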

Covariance

We don’t want dimensions in the embedding to be correlated. We want the output embeddings to hold different information → no (linear) dependence between dimensions, i.e. the covariance matrix of the embeddings should be close to diagonal.

Highly correlated dimensions in the output embedding matrix have a large positive covariance, uncorrelated dimensions have a covariance of 0, and negatively correlated ones have a negative covariance. Hence, we want to push the off-diagonal entries of the covariance matrix toward zero (VICReg minimizes the sum of their squares).
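A small sketch of the covariance term, again in plain Python and assuming the batch is a list of embedding lists; `covariance_loss` is a name chosen for these notes. It computes the sample covariance matrix entry by entry and sums the squared off-diagonal entries, divided by the embedding dimension d. A real implementation would do this with matrix operations in a tensor library.

```python
def covariance_loss(z):
    """Sum of squared off-diagonal covariance entries, divided by the dimension d."""
    n, d = len(z), len(z[0])
    # Center each dimension over the batch.
    means = [sum(column) / n for column in zip(*z)]
    centered = [[x - m for x, m in zip(row, means)] for row in z]
    loss = 0.0
    for i in range(d):
        for j in range(d):
            if i != j:  # the diagonal (variances) is handled by the variance term
                c_ij = sum(row[i] * row[j] for row in centered) / (n - 1)
                loss += c_ij ** 2
    return loss / d
```

Squaring means positive and negative correlations are penalized alike; only a covariance of 0 between two dimensions contributes nothing.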

SimCLR → Same things same embeddings (SimCLR specifically maximizes difference between negative samples)
Simple Framework for Contrastive Lear → Different things different embeddings (Black and white)
VICReg → Explicitly try to regularize
Duality between contrastive and non-contrastive methods →
VICReg Code and tutorial: TODO READ!
https://imbue.com/open-source/2022-04-21-vicreg/

self-supervised learning