self-information (surprise) … information content of a single event (code length of a single event)
entropy … average surprise of a random variable (expected code length)
cross-entropy … average surprise of the true data when scored under a model (expected code length when using the model's code to encode the true data)
KL-divergence … extra average surprise incurred by assuming the model instead of the true distribution (inefficiency of using the model's code to encode the true data)
mutual information … average reduction in uncertainty about one random variable given knowledge of another (shared information between them)
Fisher information … average local curvature of the log-likelihood, per observation (expected squared sensitivity of the log-likelihood to parameter changes)
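
A minimal numeric sketch of the quantities above, assuming toy discrete distributions and a Bernoulli model for the Fisher-information check; all distributions, parameter values, and sample sizes below are hypothetical illustrations, not taken from the text:

```python
import numpy as np

# Toy "true" distribution p and model distribution q over 4 outcomes (hypothetical values).
p = np.array([0.5, 0.25, 0.125, 0.125])
q = np.array([0.25, 0.25, 0.25, 0.25])

# self-information: -log2 p(x), code length of each single outcome, in bits
surprise = -np.log2(p)

# entropy H(p): expected self-information under p (here 1.75 bits)
entropy = np.sum(p * -np.log2(p))

# cross-entropy H(p, q): expected code length when encoding p-distributed data with a code built for q
cross_entropy = np.sum(p * -np.log2(q))

# KL divergence D_KL(p || q) = H(p, q) - H(p): the extra code length from using q instead of p
kl = cross_entropy - entropy

# mutual information I(X; Y) = H(X) + H(Y) - H(X, Y): shared information in a joint distribution
p_xy = np.array([[0.3, 0.1],
                 [0.1, 0.5]])                     # hypothetical joint over two binary variables
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
h = lambda dist: -np.sum(dist * np.log2(dist))
mutual_info = h(p_x) + h(p_y) - h(p_xy.ravel())

# Fisher information for one Bernoulli(theta) observation: expected squared score,
# E[(d/dtheta log p(x | theta))^2] = 1 / (theta * (1 - theta)); checked here by simulation
theta = 0.3
samples = np.random.default_rng(0).binomial(1, theta, size=200_000)
score = samples / theta - (1 - samples) / (1 - theta)
fisher_mc = np.mean(score ** 2)                   # ~ 4.76 for theta = 0.3

print(entropy, cross_entropy, kl, mutual_info, fisher_mc)
```

The KL value falls out as cross-entropy minus entropy (0.25 bits here), which mirrors the "extra average surprise" reading above; the Fisher estimate approaches 1 / (theta * (1 - theta)) as the sample count grows.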