The Lottery Ticket Hypothesis

… states that large ANNs are trainable because a large network contains a combinatorially large number of subnetworks (a network with n prunable weights contains 2^n of them), at least one of which is likely to be easily trainable for the task at hand.

The hypothesis is motivated by the finding that after a large network has been trained, a large fraction of its parameters can usually be pruned without a significant loss in performance.
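
One common pruning criterion is weight magnitude. Below is a minimal PyTorch sketch that zeroes out the smallest-magnitude weights of a trained model; the model architecture and pruning fraction are illustrative, not taken from any particular paper.

```python
import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, fraction: float = 0.9) -> None:
    """Zero out the `fraction` of weights with the smallest magnitudes, globally."""
    # Collect magnitudes of all weight matrices (skip biases / 1-D parameters).
    magnitudes = torch.cat([p.detach().abs().flatten()
                            for p in model.parameters() if p.dim() > 1])
    threshold = torch.quantile(magnitudes, fraction)
    with torch.no_grad():
        for p in model.parameters():
            if p.dim() > 1:
                p.mul_((p.abs() > threshold).float())  # keep only large weights

# Illustrative usage: after training, ~90% of weights are set to zero.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
magnitude_prune(model, fraction=0.9)
```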

An even stronger take on the Lottery Ticket Hypothesis (sometimes called the strong Lottery Ticket Hypothesis) states that, due to the sheer number of subnetworks present within a large network, the weights need not be trained at all: it suffices to learn a binary mask over the weight matrices of a randomly initialized network to obtain a network that solves the task at hand.
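
A minimal sketch of what such mask learning could look like: the random weights are frozen, and a learnable score per weight is binarized in the forward pass, with a straight-through estimator passing gradients to the scores. The layer name, initialization, and score scale below are assumptions for illustration.

```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    """Linear layer with frozen random weights and a learned binary mask."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        w = torch.empty(out_features, in_features)
        nn.init.kaiming_uniform_(w)
        self.weight = nn.Parameter(w, requires_grad=False)      # frozen random weights
        self.scores = nn.Parameter(0.01 * torch.randn_like(w))  # learnable mask logits

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        hard = (self.scores > 0).float()                 # hard 0/1 mask in the forward pass
        mask = hard + self.scores - self.scores.detach() # straight-through gradient to scores
        return nn.functional.linear(x, self.weight * mask)
```

Only `scores` receives gradients during training; the random weights never change.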

This has been shown to be possible even at the level of neurons: given a large enough network at initialization, the network can be optimized simply by learning which portion of its neurons to mask out.
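
The same idea restricted to whole neurons might look like the sketch below: one learnable logit per output unit, masking entire activations rather than individual weights. Whether the mask is optimized by gradients (as here) or by evolution is an implementation choice, and the details are assumptions for illustration.

```python
import torch
import torch.nn as nn

class NeuronMaskedLinear(nn.Module):
    """Linear layer with frozen random weights and one learned mask bit per neuron."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        w = torch.empty(out_features, in_features)
        nn.init.kaiming_uniform_(w)
        self.weight = nn.Parameter(w, requires_grad=False)   # frozen random weights
        self.scores = nn.Parameter(torch.zeros(out_features))  # one logit per neuron

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        hard = (self.scores > 0).float()                 # 0/1 mask over whole neurons
        mask = hard + self.scores - self.scores.detach() # straight-through estimator
        return nn.functional.linear(x, self.weight) * mask  # broadcasts over the batch
```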

Learning to Act through Evolution of Neural Diversity in Random Neural Networks