year: 2021
paper: arxiv
website:
code:
connections: representation, synthetic data, transfer learning, Zipf’s law


TLDR

Vision models trained on synthetic data from simple algorithmic processes (fractals, dead leaves, wavelet noise) learn representations that transfer well to real images. What matters isn’t natural data but naturalistic data: data that reproduces key statistical properties of the natural world, like approximate scale-invariance and Zipfian frequency distributions.
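
To make "naturalistic" concrete, here is a minimal sketch of a dead-leaves generator in NumPy. It is an illustration under assumptions, not the paper's actual pipeline: the function names, the grayscale disks, the radius range, and the exponent `alpha=3` (a common choice for approximately scale-invariant dead-leaves images) are all picked here for the example. Disks are drawn with radii from a truncated power law and pasted in sequence, so later disks occlude earlier ones.

```python
import numpy as np


def sample_power_law_radii(rng, n, r_min, r_max, alpha=3.0):
    """Draw n radii with density p(r) proportional to r**(-alpha) on [r_min, r_max].

    Uses inverse-CDF sampling; alpha must not equal 1.
    """
    u = rng.random(n)  # uniform samples in [0, 1)
    k = 1.0 - alpha    # exponent obtained by integrating r**(-alpha)
    return (r_min**k + u * (r_max**k - r_min**k)) ** (1.0 / k)


def dead_leaves(size=128, n_disks=2000, r_min=2.0, r_max=64.0, alpha=3.0, seed=0):
    """Render a grayscale dead-leaves image (illustrative parameters, all assumed)."""
    rng = np.random.default_rng(seed)
    img = np.zeros((size, size), dtype=np.float64)
    yy, xx = np.mgrid[0:size, 0:size]  # pixel row and column coordinates

    radii = sample_power_law_radii(rng, n_disks, r_min, r_max, alpha)
    centers = rng.uniform(0, size, size=(n_disks, 2))  # (cx, cy) per disk
    grays = rng.random(n_disks)                        # one gray level per disk

    for (cx, cy), r, g in zip(centers, radii, grays):
        mask = (xx - cx) ** 2 + (yy - cy) ** 2 <= r**2
        img[mask] = g  # later disks overwrite (occlude) earlier ones
    return img


if __name__ == "__main__":
    image = dead_leaves()
    print(image.shape, float(image.min()), float(image.max()))
```

The heavy-tailed radius distribution is the point of the sketch: most disks are small and a few are large, so structure appears at many scales rather than at one characteristic size.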