year: 2022
paper: https://arxiv.org/pdf/2105.04906.pdf
website:
code:
Builds upon Barlow Twins - Self-Supervised Learning via Redundancy Reduction

Invariance
Agreement between positive examples should be high. Cosine similarity could be used; VICReg actually uses the mean squared Euclidean distance between the two embeddings of each positive pair.
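A minimal NumPy sketch of the invariance term, assuming two arrays `z_a` and `z_b` of shape `(batch, dim)` that hold the embeddings of the two views (the function and argument names are made up for illustration). It follows the paper's definition, a sum over dimensions averaged over the batch; implementations that use a plain MSE also average over dimensions, which only changes the scale.

```python
import numpy as np

def invariance_loss(z_a, z_b):
    """Mean over the batch of the squared Euclidean distance between paired embeddings."""
    z_a = np.asarray(z_a, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    # Squared distance per pair (sum over dims), then average over the batch.
    return float(np.mean(np.sum((z_a - z_b) ** 2, axis=1)))

# Toy usage: the first pair is identical, the second differs by 2 in one dimension.
z_a = [[0.0, 0.0], [1.0, 1.0]]
z_b = [[0.0, 0.0], [1.0, 3.0]]
loss = invariance_loss(z_a, z_b)  # (0 + 4) / 2
```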
Variance
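Keep the variance of each embedding dimension over a certain threshold: This term forces the embedding vectors of samples within a batch to be different, which prevents collapse to a constant embedding.
They don’t try to maximize the standard deviation but just try to stop it from falling below a threshold $\gamma$ with the help of this hinge loss:
$$v(Z) = \frac{1}{d} \sum_{j=1}^{d} \max\left(0, \gamma - S(z^j, \epsilon)\right)$$
where $z^j$ is the vector of values of dimension $j$ across the batch and $S$ is the regularized standard deviation defined by:
$$S(x, \epsilon) = \sqrt{\mathrm{Var}(x) + \epsilon}$$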
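The hinge on the regularized standard deviation can be sketched in NumPy as follows. The defaults `gamma=1.0` and `eps=1e-4` and the unbiased variance (`ddof=1`) are assumptions made for this sketch, not values taken from these notes.

```python
import numpy as np

def variance_loss(z, gamma=1.0, eps=1e-4):
    """Hinge loss on the per-dimension standard deviation of a batch of embeddings."""
    z = np.asarray(z, dtype=float)
    # Regularized std of every embedding dimension, computed across the batch.
    std = np.sqrt(z.var(axis=0, ddof=1) + eps)
    # Only dimensions whose std is below gamma contribute to the loss.
    return float(np.mean(np.maximum(0.0, gamma - std)))

# Collapsed batch: every sample has the same embedding, so the loss is close to gamma.
collapsed = np.ones((4, 3))
# Spread-out batch: the std of each dimension is above gamma, so the loss is zero.
spread = np.array([[-2.0, -3.0], [2.0, 3.0]])
```

With these inputs, `variance_loss(collapsed)` is about 0.99 (the std is only `sqrt(eps)`), while `variance_loss(spread)` is 0: once a dimension is above the threshold, nothing pushes its variance any higher.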
Covariance
We don’t want dimensions in the embedding to be correlated. We want the output embeddings to hold different information → Like removing linear dependence between dimensions (pushing the off-diagonal entries of the covariance matrix towards 0).
Highly correlated dimensions in the output embedding matrix have a high covariance, uncorrelated dims have a covariance of 0, and negatively correlated ones have a negative covariance. Hence, we want to minimize the squared off-diagonal entries of the covariance matrix (squared, so that negative covariances are penalized as well).
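As a rough NumPy sketch of this term: compute the covariance matrix of the batch, zero its diagonal, and penalize the squared remainder. The normalization by `n - 1` and by the embedding dimension `d` is an assumption of this sketch, and the function name is hypothetical.

```python
import numpy as np

def covariance_loss(z):
    """Sum of squared off-diagonal covariance entries, divided by the embedding dim."""
    z = np.asarray(z, dtype=float)
    n, d = z.shape
    z = z - z.mean(axis=0)             # center each dimension over the batch
    cov = (z.T @ z) / (n - 1)          # (d, d) covariance matrix
    off_diag = cov - np.diag(np.diag(cov))  # keep only the off-diagonal entries
    return float(np.sum(off_diag ** 2) / d)

correlated = np.array([[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]])    # dim 1 == dim 0
anti_correlated = np.array([[-1.0, 1.0], [0.0, 0.0], [1.0, -1.0]])  # dim 1 == -dim 0
uncorrelated = np.array([[-1.0, 1.0], [0.0, -2.0], [1.0, 1.0]])
```

Here `correlated` and `anti_correlated` get the same penalty because the covariance is squared, and `uncorrelated` gets zero.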
SimCLR → Same things same embeddings (SimCLR specifically also pushes apart the embeddings of negative samples)
Simple Framework for Contrastive Lear → Different things different embeddings (Black and white)
VICReg → Explicitly regularizes the embeddings (variance and covariance terms) instead of relying on negative samples
Duality between contr and non-contr →
VICReg Code and tutorial: TODO READ!
https://imbue.com/open-source/2022-04-21-vicreg/