• Instead of learning a single point estimate for each weight, BNNs learn a probability distribution over each weight.
  • To make a prediction, you sample multiple sets of weights from these distributions, run the input through the network for each sample, and get a distribution of outputs.
  • The variance of this output distribution reflects epistemic uncertainty (and often aleatoric uncertainty too, depending on how the output layer is modeled). BNNs are computationally more expensive than standard networks but provide a more principled way to capture uncertainty.
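
The sampling procedure above can be sketched in a few lines of NumPy. This is a minimal illustration, not a trained model: it assumes a single linear layer whose posterior is a mean-field Gaussian with hypothetical, hand-set means and standard deviations (`w_mu`, `w_sigma`, `b_mu`, `b_sigma`), standing in for what a real BNN would learn via variational inference or MCMC.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learned posterior for one linear layer (3 inputs -> 1 output):
# each weight and bias has a mean and a standard deviation.
w_mu, w_sigma = rng.normal(size=(3, 1)), 0.1 * np.ones((3, 1))
b_mu, b_sigma = np.zeros(1), 0.1 * np.ones(1)

def sample_forward(x):
    """Draw one set of weights from the posterior and run the input through."""
    w = rng.normal(w_mu, w_sigma)
    b = rng.normal(b_mu, b_sigma)
    return x @ w + b

x = np.array([[0.5, -1.0, 2.0]])

# Monte Carlo prediction: many forward passes, each with freshly sampled weights.
samples = np.stack([sample_forward(x) for _ in range(200)])

pred_mean = samples.mean(axis=0)  # point prediction
pred_var = samples.var(axis=0)    # spread reflecting epistemic uncertainty
```

The variance shrinks as the posterior standard deviations shrink, which matches the intuition in the bullets: a network that is certain about its weights produces nearly identical outputs across samples.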