Bayes’ Theorem: What’s $P(H \mid E)$?
What are the chances of me having cancer? If I don’t know anything else, the chances are the same as for anyone else in my demographic → That’s the prior $P(H)$, the best guess without any further info.
If I take a test or have a symptom, I get extra information (evidence $E$) about $H$ and can update my belief.
$P(E \mid H)$ is something we can usually calculate pretty easily, as opposed to estimating $P(H \mid E)$ directly (we would need to collect more data, …).
The basic idea of Bayesian statistics is sequentially updating your beliefs given new information.
New evidence should not determine beliefs in a vacuum; it should update prior beliefs.
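Written out, the rule that performs this update (posterior = likelihood × prior, divided by the evidence):

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$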
Visual explanation
Interactive visualization: https://srulix.com/projects/bayesground?standalone=true
$H$ … hypothesis “is a librarian” vs. “is a farmer”
$E$ … evidence “is shy”
$P(H)$ … how likely is it that somebody is a librarian (prior; if we have no clue, it could be a guess, or a uniform distribution, 50/50 in this case).
$P(H \mid E)$ … how likely is it that somebody is a librarian, given that they are shy (the posterior we’d like to know).
$P(E \mid H)$ … how likely is it that a librarian is shy.
$P(\neg H)$ … how likely is it that somebody is a farmer.
$P(E)$ … how likely is it that somebody is shy (out of all librarians and farmers).
$P(E \mid \neg H)$ … how likely is it that a farmer is shy. Questions about the personalities and stereotypes shift around the relevant likelihoods $P(E \mid H)$ and $P(E \mid \neg H)$.
Questions about the context shift around the prior $P(H)$.
Among cases where the evidence is true, how often is the hypothesis true?
The likelihood $P(E \mid H)$ and the prior $P(H)$ multiply to form an area (the joint probability), see below.
The numerator in Bayes’ rule is the probability of both the hypothesis and the evidence occurring, $P(E \mid H)\,P(H) = P(H \cap E)$ (chain rule of probability). 1
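To make the update concrete, here is a small sketch of the librarian/farmer example; the prior and the two likelihoods below are assumed numbers for illustration, not taken from the visualization:

```python
# Assumed numbers: 1 librarian per 20 farmers, 40% of librarians are shy, 10% of farmers are shy.
p_librarian = 1 / 21            # P(H), the prior
p_farmer = 1 - p_librarian      # P(not H)
p_shy_given_librarian = 0.4     # P(E | H)
p_shy_given_farmer = 0.1        # P(E | not H)

# Marginal probability of the evidence, P(E), via the law of total probability
p_shy = p_shy_given_librarian * p_librarian + p_shy_given_farmer * p_farmer

# Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)
p_librarian_given_shy = p_shy_given_librarian * p_librarian / p_shy
print(f"P(librarian | shy) = {p_librarian_given_shy:.2f}")  # ≈ 0.17 with these numbers
```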
Spam filter
$P(\text{spam})$ … how likely is it that an email is spam (prior)
$P(\text{word} \mid \text{spam})$ … how likely is it that the word appears in spam emails (likelihood)
$P(\text{word})$ … how likely is it that the word appears in any email (evidence, aka marginal likelihood)
$P(\text{spam} \mid \text{word})$ … how likely is it that an email is spam, given that it contains the word (posterior)
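A minimal single-word spam-filter sketch; the corpus counts below are made up for illustration:

```python
# Made-up corpus counts (illustrative assumptions, not real data)
n_spam, n_ham = 400, 600                       # emails of each class in the corpus
n_spam_with_word, n_ham_with_word = 200, 30    # emails containing the word

p_spam = n_spam / (n_spam + n_ham)             # prior P(spam)
p_word_given_spam = n_spam_with_word / n_spam  # likelihood P(word | spam)
p_word_given_ham = n_ham_with_word / n_ham     # P(word | not spam)

# Marginal P(word) via the law of total probability
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Posterior P(spam | word) via Bayes' rule
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(f"P(spam | word) = {p_spam_given_word:.2f}")  # ≈ 0.87 with these counts
```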
Sensitivity and specificity
Assume a drug test which has 90% sensitivity (true positive rate or “recall”), i.e. the probability of a positive result given that someone is a drug user: $P(+ \mid \text{user}) = 0.9$.
And 80% specificity (true negative rate or “selectivity”), i.e. $P(- \mid \text{non-user}) = 0.8$. This test is quite bad: 20% of non-users get a false positive. Let’s also say 10% of the population use the drug, $P(\text{user}) = 0.1$.
The probability of someone being a drug user after testing positive is:

$$P(\text{user} \mid +) = \frac{P(+ \mid \text{user})\,P(\text{user})}{P(+)} = \frac{0.9 \cdot 0.1}{0.9 \cdot 0.1 + 0.2 \cdot 0.9} = \frac{0.09}{0.27} \approx 0.33$$

So the probability that you did actually use the drug is only 33%. Even if the sensitivity were 1, we would only reach ~36%.
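The same calculation, as a quick sanity check in code (using the numbers stated above):

```python
sensitivity = 0.9        # P(positive | user)
specificity = 0.8        # P(negative | non-user)
p_user = 0.1             # prior P(user)

false_positive_rate = 1 - specificity  # P(positive | non-user) = 0.2
# Marginal P(positive) via the law of total probability
p_positive = sensitivity * p_user + false_positive_rate * (1 - p_user)

# Posterior P(user | positive) via Bayes' rule
p_user_given_positive = sensitivity * p_user / p_positive
print(f"P(user | positive) = {p_user_given_positive:.2f}")  # ≈ 0.33
```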
This error gets smaller for more common events. This is also relevant for machine learning: we can get close to 100% accuracy just by ignoring a very rare class (high specificity, but low sensitivity for that class).
If we don’t know $P(E)$, we can express it in terms of conditional probabilities we do know, with the help of the law of total probability:

$$P(E) = P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)$$

Or generally, for multiple hypotheses $H_i$ that partition the sample space:

$$P(E) = \sum_i P(E \mid H_i)\,P(H_i)$$

So to get $P(E)$, we consider all the ways $E$ can happen, weighted by how likely each way is.
If we don’t have “all hypotheses”, we can add an “other” hypothesis with $P(H_{\text{other}}) = 1 - \sum_i P(H_i)$, so that the hypotheses still partition the sample space.
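A tiny numeric sketch of this weighted sum (all the probabilities below are hypothetical placeholders):

```python
# Hypothetical partition of the sample space (placeholder numbers); priors sum to 1
priors = {"H1": 0.5, "H2": 0.3, "other": 0.2}        # P(H_i)
likelihoods = {"H1": 0.9, "H2": 0.4, "other": 0.1}   # P(E | H_i)

# Law of total probability: P(E) = sum_i P(E | H_i) * P(H_i)
p_evidence = sum(likelihoods[h] * priors[h] for h in priors)
print(f"P(E) = {p_evidence:.2f}")  # 0.59 with these placeholders
```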
The marginal likelihood normalizes the posterior
The marginal likelihood represents the total probability of observing our data, accounting for all possible parameter settings according to our prior beliefs. It serves as the normalizing constant in Bayes’ theorem:

$$p(\theta \mid \mathcal{D}) = \frac{p(\mathcal{D} \mid \theta)\,p(\theta)}{p(\mathcal{D})}, \qquad p(\mathcal{D}) = \int p(\mathcal{D} \mid \theta)\,p(\theta)\,d\theta$$
Unlike the likelihood, which evaluates parameter settings individually, the marginal likelihood evaluates the model as a whole, integrating over the entire parameter space.
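As a rough sketch of what that integration looks like in practice, here is a grid approximation for a toy coin-flip model (the data and the uniform prior are assumptions for illustration):

```python
import numpy as np

# Assumed toy setup: coin with unknown bias theta, uniform prior on [0, 1],
# observed data D = 7 heads and 3 tails.
heads, tails = 7, 3
theta = np.linspace(1e-6, 1 - 1e-6, 10_000)      # grid over the parameter space
prior = np.ones_like(theta)                      # uniform prior density p(theta) = 1
likelihood = theta**heads * (1 - theta)**tails   # p(D | theta), up to the binomial coefficient

# Marginal likelihood p(D) = ∫ p(D | theta) p(theta) dtheta, approximated as a Riemann sum
dtheta = theta[1] - theta[0]
marginal = np.sum(likelihood * prior) * dtheta

# Dividing by the marginal turns likelihood * prior into a proper (normalized) posterior
posterior = likelihood * prior / marginal
print(f"p(D) ≈ {marginal:.6f}")                                     # analytic value: 1/1320 ≈ 0.000758
print(f"posterior integrates to {np.sum(posterior) * dtheta:.3f}")  # ≈ 1.0
```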
A good understanding of Bayes’ theorem implies that experimentation is essential.
Bayesian Inference
Treats all parameters $\theta$ as random variables with a prior distribution $p(\theta)$.
Update via Bayes’ rule:

$$p(\theta \mid \mathcal{D}) = \frac{p(\mathcal{D} \mid \theta)\,p(\theta)}{p(\mathcal{D})}$$

The posterior $p(\theta \mid \mathcal{D})$ represents our updated beliefs about the parameters after seeing data.
Computing it often requires intractable integrals (for the marginal likelihood $p(\mathcal{D})$), leading to approximation methods:
- Variational inference - approximate posterior with simpler distribution
- Markov chain Monte Carlo - sample from posterior distribution
- Laplace approximation - Gaussian approximation around posterior mode
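As one illustration of the MCMC idea, here is a minimal random-walk Metropolis sampler for the same toy coin-flip posterior (the model and tuning choices are assumptions for this sketch, not taken from the note):

```python
import numpy as np

# Assumed toy model: Bernoulli data with unknown bias theta, uniform prior on [0, 1]
rng = np.random.default_rng(0)
heads, tails = 7, 3

def log_unnormalized_posterior(theta):
    """log p(D | theta) + log p(theta); the intractable marginal p(D) is never needed."""
    if not 0.0 < theta < 1.0:
        return -np.inf
    return heads * np.log(theta) + tails * np.log(1 - theta)

samples, theta = [], 0.5
for _ in range(20_000):
    proposal = theta + rng.normal(scale=0.1)               # symmetric random-walk proposal
    log_ratio = log_unnormalized_posterior(proposal) - log_unnormalized_posterior(theta)
    if np.log(rng.uniform()) < log_ratio:                  # accept with probability min(1, ratio)
        theta = proposal
    samples.append(theta)

samples = np.array(samples[2_000:])                        # discard burn-in
print(f"posterior mean ≈ {samples.mean():.3f}")            # analytic answer: 8/12 ≈ 0.667
```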
Visualization of how different priors and likelihoods affect the posterior.
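In lieu of the visualization, a small conjugate Beta-Binomial sketch (priors and data assumed for illustration) of how the prior choice moves the posterior for the same evidence:

```python
# Assumed setup: same data under three different Beta priors over the bias theta
data_heads, data_tails = 7, 3
priors = {
    "uniform   Beta(1, 1)": (1, 1),
    "skeptical Beta(2, 8)": (2, 8),
    "confident Beta(20, 20)": (20, 20),
}

for name, (a, b) in priors.items():
    # Conjugate update: Beta(a, b) prior + binomial data -> Beta(a + heads, b + tails) posterior
    post_a, post_b = a + data_heads, b + data_tails
    print(f"{name}: posterior mean = {post_a / (post_a + post_b):.2f}")
```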
Todo
Bayes' factor
References
probability
bayesian statistics
Footnotes
1. I find it much easier to reason over the rule / understand the visualization after knowing this identity… why don’t any of the popular explanations mention this?