- Instead of learning single point estimates for weights, BNNs learn a probability distribution over each weight.
- To make a prediction, you sample multiple sets of weights from these distributions, run the input through the network for each sample, and get a distribution of outputs.
- The variance of this output distribution reflects epistemic uncertainty (and often aleatoric uncertainty too, depending on how the output layer is modeled). BNNs are computationally more expensive than standard networks, since each prediction requires multiple forward passes, but they provide a more principled way to capture uncertainty.
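The prediction loop above can be sketched as follows. This is a minimal, hypothetical illustration: it assumes a mean-field Gaussian posterior (each weight has an independent learned mean and standard deviation) for a tiny one-hidden-layer network, with made-up posterior parameters standing in for values a real training procedure (e.g. variational inference) would learn.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learned posterior: each weight/bias has a mean and a
# standard deviation (mean-field Gaussian). In a real BNN these would
# come from training, e.g. via variational inference.
W1_mu, W1_sigma = rng.normal(size=(3, 4)), 0.1 * np.ones((3, 4))
b1_mu, b1_sigma = np.zeros(4), 0.1 * np.ones(4)
W2_mu, W2_sigma = rng.normal(size=(4, 1)), 0.1 * np.ones((4, 1))
b2_mu, b2_sigma = np.zeros(1), 0.1 * np.ones(1)

def forward(x, W1, b1, W2, b2):
    # Ordinary deterministic forward pass for one sampled weight set.
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

def predict(x, n_samples=100):
    # Sample weight sets from the posterior, run the network for each,
    # and summarize the resulting distribution of outputs.
    outputs = []
    for _ in range(n_samples):
        W1 = rng.normal(W1_mu, W1_sigma)
        b1 = rng.normal(b1_mu, b1_sigma)
        W2 = rng.normal(W2_mu, W2_sigma)
        b2 = rng.normal(b2_mu, b2_sigma)
        outputs.append(forward(x, W1, b1, W2, b2))
    outputs = np.stack(outputs)
    # The mean is the prediction; the variance across samples reflects
    # epistemic uncertainty (disagreement among plausible weight sets).
    return outputs.mean(axis=0), outputs.var(axis=0)

x = np.array([[0.5, -1.0, 2.0]])
mean, var = predict(x)
```

Widening the posterior standard deviations (the `*_sigma` arrays) increases the spread of sampled weights and therefore the predictive variance, which is exactly the epistemic-uncertainty signal described above.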