Chain rule of probability

Rearranging the formula for conditional probability, $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, gives:

$$P(A \cap B) = P(A \mid B)\, P(B)$$

That is, for $A$ and $B$ to happen, $B$ has to happen, and then $A$ has to happen given that $B$ has happened.
This always holds.
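
As a quick sanity check, the two-event chain rule can be verified by counting outcomes. The sketch below assumes two fair dice and two illustrative events chosen for this example (the sum is 8, the first die is even); the helper names `prob` and `cond_prob` are hypothetical and not part of the original notes.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def cond_prob(event, given):
    """P(event | given), computed by counting inside the conditioning event."""
    restricted = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in restricted if event(o)), len(restricted))

A = lambda o: o[0] + o[1] == 8   # illustrative event: the sum is 8
B = lambda o: o[0] % 2 == 0      # illustrative event: the first die is even

p_joint = prob(lambda o: A(o) and B(o))   # P(A and B) by direct counting
p_chain = cond_prob(A, B) * prob(B)       # P(A | B) * P(B)

print(p_joint, p_chain)  # prints "1/12 1/12"
```

Exact fractions are used instead of floats so the two sides can be compared with `==`.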

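For $n$ events:

$$P(A_1 \cap A_2 \cap \dots \cap A_n) = P(A_1)\, P(A_2 \mid A_1)\, P(A_3 \mid A_1 \cap A_2) \cdots P(A_n \mid A_1 \cap \dots \cap A_{n-1})$$

It looks a bit nicer with random variables:

$$P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid X_1, \dots, X_{i-1})$$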
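
A minimal sketch of the chain rule with more than two factors, assuming a standard 52-card deck and the illustrative question "what is the probability of drawing three aces in a row without replacement?". The deck encoding is an assumption made for this example. The chain-rule product is compared against a brute-force enumeration of every ordered three-card draw.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical deck: 4 aces ("A") and 48 other cards ("x").
deck = ["A"] * 4 + ["x"] * 48

# Chain rule: P(A1, A2, A3) = P(A1) * P(A2 | A1) * P(A3 | A1, A2)
p_chain = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)

# Brute-force check: enumerate every ordered draw of three distinct cards.
draws = list(permutations(range(len(deck)), 3))
hits = sum(1 for d in draws if all(deck[i] == "A" for i in d))
p_enum = Fraction(hits, len(draws))

print(p_chain, p_enum)  # prints "1/5525 1/5525"
```

Each factor conditions on everything drawn so far, which is why the numerators and denominators shrink by one at every step.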

Independent events

Two events $A$ and $B$ are independent if knowing $B$ gives no extra information about $A$, and vice versa:

$$P(A \mid B) = P(A), \qquad P(B \mid A) = P(B)$$

Equivalently, using the formula for conditional probability, for independent events we can say:

$$P(A \cap B) = P(A)\, P(B)$$

So the chain rule of probability simplifies to multiplication for independent events.
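
The product test for independence can be checked by enumeration. This is a sketch under the assumption of two fair dice; the three events and the helper `independent` are illustrative names introduced here, not part of the original notes.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def independent(e1, e2):
    """True if P(e1 and e2) == P(e1) * P(e2)."""
    return prob(lambda o: e1(o) and e2(o)) == prob(e1) * prob(e2)

first_even = lambda o: o[0] % 2 == 0     # depends only on the first die
second_high = lambda o: o[1] > 4         # depends only on the second die
sum_is_8 = lambda o: o[0] + o[1] == 8    # depends on both dice

print(independent(first_even, second_high))  # prints "True"
print(independent(first_even, sum_is_8))     # prints "False"
```

In the second case the joint probability is 1/12 while the product of the marginals is 5/72, so plain multiplication would give the wrong answer and the conditional form of the chain rule is needed.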

This works for multivariate distributions too:

Conditional probability

The conditional probability of $Y$ given $X$ is given by:

$$p(y \mid x) = \frac{p(x, y)}{p(x)}$$

This is almost identical to the standard conditional probability (the probability of $A$ AND $B$ divided by the probability of $B$), but more general, as it is a conditional density function over all values of $x$ and $y$.
This is the same formula in density notation, where $f_{Y \mid X}(y \mid x)$ denotes the conditional density function of $Y$ given $X = x$:

$$f_{Y \mid X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}$$

And the chain rule of probability follows directly from this:

$$f_{X,Y}(x, y) = f_{Y \mid X}(y \mid x)\, f_X(x)$$
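
To make the density version concrete, here is a rough numerical sketch. It assumes a hypothetical joint density $f(x, y) = x + y$ on the unit square (chosen only because its marginal, $x + 1/2$, is easy to check by hand) and uses a simple midpoint rule for the integrals; none of these names come from the original notes.

```python
def joint(x, y):
    """Hypothetical joint density f(x, y) = x + y on the unit square."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def integrate(f, a=0.0, b=1.0, n=1000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def marginal_x(x):
    """f_X(x): integrate the joint density over y."""
    return integrate(lambda y: joint(x, y))

def conditional(y, x):
    """f_{Y|X}(y | x) = f(x, y) / f_X(x), defined where f_X(x) > 0."""
    return joint(x, y) / marginal_x(x)

x0, y0 = 0.3, 0.7
total = integrate(lambda y: conditional(y, x0))   # a density in y: close to 1
rebuilt = conditional(y0, x0) * marginal_x(x0)    # chain rule: f(y | x) * f(x)

print(round(total, 6), round(rebuilt, 6), joint(x0, y0))
```

The conditional density integrates to approximately 1 over $y$ for a fixed $x$, and multiplying it back by the marginal recovers the joint density at the chosen point, up to floating-point error.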
