Marginal density
The marginal density of a variable is obtained by integrating the joint density over the other variable:
$$p_X(x) = \int_{-\infty}^{\infty} p_{X,Y}(x, y)\, dy$$
Here we're integrating out (averaging over all possible values of) $Y$, to get the marginal density of $X$, i.e. the density of $X$ regardless of $Y$.
This also works with discrete distributions, where the integral becomes a sum:
$$P(X = x) = \sum_{y} P(X = x, Y = y)$$
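A minimal sketch of the discrete case, assuming NumPy and a made-up joint probability table (the numbers are purely illustrative): marginalizing is just summing the table along the axis of the variable we want to get rid of.

```python
import numpy as np

# Hypothetical joint pmf P(X = x, Y = y), values chosen for illustration only.
# Rows index x in {0, 1}, columns index y in {0, 1, 2}.
joint = np.array([
    [0.10, 0.20, 0.10],
    [0.30, 0.05, 0.25],
])

# Marginalize by summing over the axis of the variable being removed.
p_x = joint.sum(axis=1)  # sum over y, leaves P(X = x)
p_y = joint.sum(axis=0)  # sum over x, leaves P(Y = y)

print("P(X):", p_x)
print("P(Y):", p_y)
```

Both marginals still sum to 1, since they are sums over the same total probability mass.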
Marginalization and conditioning
Gaussian distributions are closed under marginalization and conditioning, i.e. the result of either operation is again a Gaussian distribution (with modified parameters).
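E.g. for the following bivariate distribution:
$$P_{X,Y} = \begin{bmatrix} X \\ Y \end{bmatrix} \sim \mathcal{N}(\mu, \Sigma) = \mathcal{N}\left( \begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix}, \begin{bmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{bmatrix} \right)$$
Marginalization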
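Marginalization lets us extract partial information from multivariate probability distributions. Given a normal probability distribution $P_{X,Y}$ over vectors of random variables $X$ and $Y$, with mean blocks $\mu_X, \mu_Y$ and covariance blocks $\Sigma_{XX}, \Sigma_{YY}$, we can determine their marginalized probability distributions like this:
$$X \sim \mathcal{N}(\mu_X, \Sigma_{XX}), \qquad Y \sim \mathcal{N}(\mu_Y, \Sigma_{YY})$$
This means that each partition $X$ and $Y$ only depends on its corresponding entries in $\mu$ and $\Sigma$.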
Another way to express this mathematically is that we consider each possible value of $X$ under all possible values of $Y$, averaging out $Y$'s contribution:
$$p_X(x) = \int_y p_{X,Y}(x, y)\, dy = \int_y p_{X \mid Y}(x \mid y)\, p_Y(y)\, dy$$
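As a sanity check, the "just read off the matching entries" rule can be compared against brute-force integration. The sketch below assumes NumPy and uses hypothetical parameter values; it integrates $y$ out of a bivariate Gaussian density with a simple Riemann sum and compares the result with the closed-form marginal $\mathcal{N}(\mu_X, \Sigma_{XX})$ at one point.

```python
import numpy as np

# Hypothetical bivariate Gaussian parameters (illustrative values only).
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

Sigma_inv = np.linalg.inv(Sigma)
norm_const = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(Sigma)))

def joint_pdf(x, y):
    """Density of N(mu, Sigma) at (x, y); y may be a NumPy array."""
    dx = x - mu[0]
    dy = y - mu[1]
    quad = (Sigma_inv[0, 0] * dx**2
            + 2.0 * Sigma_inv[0, 1] * dx * dy
            + Sigma_inv[1, 1] * dy**2)
    return norm_const * np.exp(-0.5 * quad)

def normal_pdf(x, mean, var):
    """Density of a univariate normal with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

# Integrate y out numerically on a wide grid (plain Riemann sum).
ys = np.linspace(mu[1] - 10.0, mu[1] + 10.0, 20001)
step = ys[1] - ys[0]

x0 = 0.3  # arbitrary point at which to evaluate the marginal of X
numeric_marginal = np.sum(joint_pdf(x0, ys)) * step
closed_form = normal_pdf(x0, mu[0], Sigma[0, 0])

print(numeric_marginal, closed_form)
```

The two numbers should agree to many decimal places, which is exactly the statement that the marginal of $X$ only depends on $\mu_X$ and $\Sigma_{XX}$.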
Conditioning
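Conditioning is used to determine the probability distribution of one variable depending on another variable, i.e. how one variable behaves when another one is known.
$$X \mid Y \sim \mathcal{N}\left( \mu_X + \Sigma_{XY} \Sigma_{YY}^{-1} (Y - \mu_Y),\; \Sigma_{XX} - \Sigma_{XY} \Sigma_{YY}^{-1} \Sigma_{YX} \right)$$
The mean gets shifted by how much the known variable differs from its expected value, $(Y - \mu_Y)$, which is normalized by $\Sigma_{YY}^{-1}$ (think $1 / \sigma_Y^2$ in the scalar case), and scaled by the covariance between the two variables, $\Sigma_{XY}$. This product can be thought of as translating the normalized deviation in $Y$ to corresponding changes in $X$'s scale.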
→ $\Sigma_{XY} \Sigma_{YY}^{-1} \Sigma_{YX}$ represents the amount of variance in $X$ that can be explained by $Y$.
Conditioning is like taking a slice of the distribution at the known/given value of a variable.
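The conditioning rule can be sketched in a few lines, assuming NumPy. The helper name `condition_gaussian` and the example numbers are hypothetical, chosen only to illustrate the standard formulas for the conditional mean $\mu_X + \Sigma_{XY} \Sigma_{YY}^{-1} (y - \mu_Y)$ and the conditional covariance $\Sigma_{XX} - \Sigma_{XY} \Sigma_{YY}^{-1} \Sigma_{YX}$.

```python
import numpy as np

def condition_gaussian(mu_x, mu_y, S_xx, S_xy, S_yy, y):
    """Mean and covariance of X | Y = y for jointly Gaussian (X, Y).

    Hypothetical helper for illustration; S_xy is the cross-covariance
    block, and S_yx is taken to be its transpose.
    """
    # gain = S_xy @ inv(S_yy), computed with a linear solve instead of an explicit inverse.
    gain = np.linalg.solve(S_yy.T, S_xy.T).T
    cond_mean = mu_x + gain @ (y - mu_y)   # shift by the normalized deviation of y
    cond_cov = S_xx - gain @ S_xy.T        # remove the variance explained by Y
    return cond_mean, cond_cov

# Illustrative one-dimensional blocks.
mu_x, mu_y = np.array([1.0]), np.array([-2.0])
S_xx = np.array([[2.0]])
S_xy = np.array([[0.8]])
S_yy = np.array([[1.0]])

cond_mean, cond_cov = condition_gaussian(mu_x, mu_y, S_xx, S_xy, S_yy, y=np.array([-1.0]))
print(cond_mean, cond_cov)
```

Here the observed $y = -1$ lies one unit above $\mu_Y = -2$, so the mean of $X$ moves up by $0.8 \cdot 1$, and the variance drops from $2$ by the explained part $0.8^2 / 1 = 0.64$.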
Visualization of marginalization and conditioning
Note: There is an interactive version at distill.
The blue curve shows the entire underlying distribution of the random variable $X$ (its marginal), and the red curve shows a slice of the joint distribution of $X$ and $Y$ at a specific value of $Y$ (the conditional).
For normal distributions, uncorrelated variables are independent if and only if they are jointly normally distributed.
A pair $(X, Y)$ is jointly normal exactly when every linear combination $aX + bY$ is normally distributed, i.e. $(X, Y)$ follows a multivariate normal distribution.
Consider $X \sim \mathcal{N}(0, 1)$ and $Y = WX$, where $W$ is $\pm 1$ with equal probability (independently of $X$).
Visual intuition:
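Individually, $X$ and $Y$ each look normal, but their joint distribution is unusual: when $X$ is positive, $Y$ is either $X$ or $-X$ with 50/50 chance; when $X$ is negative, same thing. All of the probability mass sits on the two diagonals $y = x$ and $y = -x$.
→ $X$ and $Y$ cannot possibly be independent, even though they're uncorrelated: knowing $X$ pins down $|Y| = |X|$ exactly.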
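A quick simulation of this kind of counterexample, assuming NumPy, a standard normal variable, and an independent random sign (the names `x`, `w`, `y` and the sample size are choices made for this sketch). It checks three things: the sample correlation is close to zero, the magnitudes always coincide, and the linear combination $X + Y$ is exactly zero about half the time, which a normal distribution could never do.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

x = rng.standard_normal(n)            # X ~ N(0, 1)
w = rng.choice([-1.0, 1.0], size=n)   # random sign, drawn independently of x
y = w * x                             # Y is marginally N(0, 1) as well

corr = np.corrcoef(x, y)[0, 1]                    # sample correlation of X and Y
same_magnitude = np.all(np.abs(x) == np.abs(y))   # |Y| is fully determined by X
frac_zero_sum = np.mean(x + y == 0.0)             # X + Y has a point mass at 0

print("correlation:", corr)
print("|X| == |Y| everywhere:", same_magnitude)
print("fraction with X + Y == 0:", frac_zero_sum)
```

Since $X + Y$ is not normally distributed, the pair is not jointly normal, so zero correlation does not buy independence here.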

