Steve for a 🐐ed intro.


Probability

“Laplace’s Rule”: $P(A) = \frac{|A|}{|\Omega|}$

$|A|$ … number of outcomes where $A$ occurs / number of ways $A$ can happen.
$|\Omega|$ … number of possible outcomes.

So probability is just combinatorics (counting combinations).

Roll two dice. What is the probability that at least one die is a 5?

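By tedious counting: of the $6 \cdot 6 = 36$ possible outcomes, 11 contain at least one 5:

$$P(\text{at least one 5}) = \frac{11}{36}$$

… or we take the problem apart by looking at the event “the first die is a 5” and its complement: either the first die is a 5, or it’s not and the second die is a 5:

$$\frac{1}{6} + \frac{5}{6} \cdot \frac{1}{6} = \frac{11}{36}$$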
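Both routes can be checked by brute force. A minimal sketch that enumerates the sample space of two dice; the variable names are only illustrative:

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))

# Event: at least one of the two dice shows a 5.
at_least_one_five = [o for o in outcomes if 5 in o]

# Counting: favourable outcomes / possible outcomes.
p_counting = Fraction(len(at_least_one_five), len(outcomes))

# Taking the problem apart: first die is a 5, or it is not and the second is.
p_split = Fraction(1, 6) + Fraction(5, 6) * Fraction(1, 6)

print(p_counting, p_split)  # 11/36 11/36
```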

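Visually, it’s easy to see the relation to binomial coefficients (→ the probability of getting a 5 on at least one die is the probability of getting a 5 on exactly one die, $\binom{2}{1} \cdot \frac{1}{6} \cdot \frac{5}{6} = \frac{10}{36}$, plus the probability of getting a 5 on both dice, $\binom{2}{2} \cdot \frac{1}{6} \cdot \frac{1}{6} = \frac{1}{36}$) and how probabilities stack up.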
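How the binomial coefficients enter can be sketched in a few lines. The helper `p_exactly` is a hypothetical name, and the sketch assumes the rolls are independent with $p = \frac{1}{6}$ for a 5:

```python
from fractions import Fraction
from math import comb

def p_exactly(k, n=2, p=Fraction(1, 6)):
    """Probability of exactly k fives in n independent rolls."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_one = p_exactly(1)            # exactly one 5
p_two = p_exactly(2)            # both dice show a 5
p_at_least_one = p_one + p_two  # the two cases are disjoint, so they add

print(p_one, p_two, p_at_least_one)  # 5/18 1/36 11/36
```

Note that "exactly one 5" (10 of 36 outcomes) and "at least one 5" (11 of 36 outcomes) differ by the single outcome where both dice show a 5.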

Probabilistic models are not just useful for truly random things, but also for things that are too complex to model exactly.

Properties of probability

The probability function $P$ is a map from subsets of the sample space $\Omega$ (the set of all possible outcomes of a random experiment) to the real numbers:

$$P: \mathcal{P}(\Omega) \to \mathbb{R}$$

$P(\Omega) = 1$ … the probability of something happening is 1.
$P(\emptyset) = 0$ … the probability of nothing happening is 0.
$P(A) \geq 0$ … the probability of an event is always non-negative.
$A \subseteq B \Rightarrow P(A) \leq P(B)$ … a subset of an event has at most the probability of the set it’s a subset of.
$P(A^c) = 1 - P(A)$ … the probability of $A$ not happening. The complement is also sometimes denoted as $\bar{A}$ (surprise, but without the log).
If $A$ and $B$ are disjoint (can’t occur at the same time), the probability of either happening is the sum of the probability of each happening:

$$P(A \cup B) = P(A) + P(B)$$

For non-disjoint events, we need to subtract the intersection, so we don’t count it twice (inclusion-exclusion principle):

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
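Inclusion-exclusion can be checked on the two-dice example with plain Python sets. A sketch, with `A` = “first die is a 5” and `B` = “second die is a 5” chosen here purely for illustration:

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))

A = {o for o in omega if o[0] == 5}  # first die is a 5
B = {o for o in omega if o[1] == 5}  # second die is a 5

def P(event):
    """Relative size of an event within the sample space."""
    return Fraction(len(event), len(omega))

p_union = P(A | B)                         # counted directly
p_incl_excl = P(A) + P(B) - P(A & B)       # overlap (5, 5) subtracted once

print(p_union, p_incl_excl)  # 11/36 11/36
```

Without the subtraction, the outcome (5, 5) would be counted twice and the sum would come out as 12/36.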

The “counting” from earlier is just asking about the relative sizes of sets; proportions:

“$A$ and $B$ happened”: $P(A \cap B) = \frac{|A \cap B|}{|\Omega|}$

“$A$ or $B$ happened”: $P(A \cup B) = \frac{|A \cup B|}{|\Omega|}$

Now it’s also clearer what happens when we multiply or add probabilities: multiplying corresponds to “and” (intersection), adding to “or” (union).
With two caveats: for the union, adding overcounts non-disjoint events; for the intersection, if the events are not independent, we need to use the chain rule of probability (see below).

Conditional probability

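$A, B$ are two events (sets of outcomes) of a random experiment. The probability of $A$ given that $B$ has occurred is (for $P(B) > 0$):

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

If we know that $B$ has occurred, $B$ becomes the new $\Omega$: only the part of $A$ that lies inside $B$ still counts.

When $A \cap B$ occupies a larger fraction of $B$ than $A$ did of the original sample space, $A$ becomes a lot more likely.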
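A small sketch of “$B$ becomes the new $\Omega$” on two dice. The events are picked here only for illustration: `A` = “the sum is at least 10”, `B` = “the first die is a 5”:

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))

A = {o for o in omega if sum(o) >= 10}  # sum of both dice is at least 10
B = {o for o in omega if o[0] == 5}     # first die is a 5

def P(event, space):
    """Fraction of `space` that is covered by `event`."""
    return Fraction(len(event & space), len(space))

p_A = P(A, omega)          # probability of A in the full sample space
p_A_given_B = P(A, B)      # same count, but with B as the new sample space
p_formula = P(A & B, omega) / P(B, omega)

print(p_A, p_A_given_B, p_formula)  # 1/6 1/3 1/3
```

Here knowing $B$ doubles the probability of $A$, because $A$ covers 2 of the 6 outcomes in $B$ but only 6 of the 36 outcomes in the full sample space.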

Independent events

Two events $A$ and $B$ are independent if knowing $B$ gives no extra information about $A$, and vice versa:

$$P(A \mid B) = P(A), \quad P(B \mid A) = P(B)$$

Equivalently, using the formula of conditional probability, we can say:

$$P(A \cap B) = P(A) \cdot P(B)$$
EXAMPLE

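$\Omega$ = deck of 52 cards
$A$ = card is spade → $P(A) = \frac{13}{52} = \frac{1}{4}$
$B$ = card is queen → $P(B) = \frac{4}{52} = \frac{1}{13}$

The probability of getting a spade (or of getting a queen) doesn’t change if we restrict ourselves to the set $B$ (or $A$, respectively):

$$P(A \mid B) = \frac{1}{4} = P(A), \quad P(B \mid A) = \frac{1}{13} = P(B)$$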
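The card example can be verified by building the deck explicitly. A minimal sketch; the rank and suit labels are just one possible encoding:

```python
from fractions import Fraction
from itertools import product

ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["spades", "hearts", "diamonds", "clubs"]
omega = set(product(ranks, suits))  # 52 cards

A = {card for card in omega if card[1] == "spades"}  # card is a spade
B = {card for card in omega if card[0] == "Q"}       # card is a queen

def P(event, space):
    """Fraction of `space` that is covered by `event`."""
    return Fraction(len(event & space), len(space))

# Restricting to B (or A) leaves the probability unchanged.
print(P(A, omega), P(A, B))  # 1/4 1/4
print(P(B, omega), P(B, A))  # 1/13 1/13

# Product form of independence.
independent = P(A & B, omega) == P(A, omega) * P(B, omega)
print(independent)  # True
```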

Chain rule of probability

By simply rearranging the formula for conditional probability, we get:

$$P(A \cap B) = P(A \mid B) \cdot P(B)$$

… for $A$ and $B$ to happen, $B$ has to happen, and then $A$ has to happen given that $B$ has happened.
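The chain rule matters most when events are not independent. A sketch with an example that is not from the notes above: drawing two cards without replacement, with `B` = “first card is an ace” and `A` = “second card is an ace”:

```python
from fractions import Fraction
from itertools import permutations

# Deck as (rank, suit) pairs; rank 1 stands for the ace.
deck = [(rank, suit) for rank in range(1, 14) for suit in "SHDC"]

# All ordered draws of two different cards.
draws = list(permutations(deck, 2))

# Direct counting of "both cards are aces".
both_aces = [d for d in draws if d[0][0] == 1 and d[1][0] == 1]
p_direct = Fraction(len(both_aces), len(draws))

# Chain rule: P(A and B) = P(A | B) * P(B).
p_B = Fraction(4, 52)          # first card is an ace
p_A_given_B = Fraction(3, 51)  # second is an ace, given one ace is gone
p_chain = p_A_given_B * p_B

print(p_direct, p_chain)  # 1/221 1/221
```

Multiplying the unconditional probabilities, $\frac{4}{52} \cdot \frac{4}{52}$, would overestimate the result, since the first draw changes what is left in the deck.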

Law of total probability

Given a partition of the sample space into disjoint events $B_i$ ($\bigcup_i B_i = \Omega$), the probability of an event $A$ is the sum of the probabilities of $A$ given each of the $B_i$, weighted by the probability of each $B_i$:

$$P(A) = \sum_i P(A \mid B_i) \cdot P(B_i)$$
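The law of total probability can be sketched on the two-dice example from the start. The partition chosen here, splitting by the value of the first die, is only one of many possible partitions:

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))

A = {o for o in omega if 5 in o}  # at least one die shows a 5

# Partition of the sample space: B_i = "first die shows i".
partition = [{o for o in omega if o[0] == i} for i in range(1, 7)]

def P(event, space):
    """Fraction of `space` that is covered by `event`."""
    return Fraction(len(event & space), len(space))

# Sum of P(A | B_i) * P(B_i) over all parts of the partition.
p_total = sum(P(A, B_i) * P(B_i, omega) for B_i in partition)

print(p_total, P(A, omega))  # 11/36 11/36
```

For $i = 5$ the conditional probability is 1, for every other $i$ it is $\frac{1}{6}$, which gives $\frac{1}{6} \cdot 1 + 5 \cdot \frac{1}{6} \cdot \frac{1}{6} = \frac{11}{36}$.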

Introduction (old and bad)

Let $d_n$ denote the $n$th digit of $\pi$.
Consider the quantity (number) of .
Is it even or odd? → even.
What about ?
There are two approaches to probability:

  1. “d is even with probability $\frac{1}{2}$”
    1. Probability: Representing uncertainty about certain values.
    2. Bayesian approach. Common in Machine Learning.
  2. “d is even with probability 0 or 1 but I don’t know which”
    1. Probability: Mathematically definable thing about frequencies.
    2. More common, esp. in rigorous mathematical theory.

Important points:

  1. The total probability is always 1.
  2. We care about independence: $P(A \cap B) = P(A) \cdot P(B)$.
  3. We care about expectation: the probability times some other function, summed over all outcomes (e.g. a darts player’s points as a function of dart positions).

Wandb YT
