The essence of statistical inference:
Given observed data, we want to learn about the unknown parameters that generated it.
Two fundamental quantities:
- Likelihood - probability of observing data given parameters
- Posterior - probability of parameters given observed data
The posterior is what we’re after - it tells us what parameter values are plausible given our observations. We compute it using Bayes' theorem: posterior ∝ likelihood × prior.
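As a minimal sketch of "posterior ∝ likelihood × prior", here is a grid approximation for a coin's heads-probability θ; the data (7 heads in 10 flips) and the flat prior are illustrative assumptions, not from the notes:

```python
import numpy as np

theta = np.linspace(0, 1, 101)      # candidate parameter values (grid)
prior = np.ones_like(theta)         # flat prior: all values equally plausible a priori
prior /= prior.sum()

heads, flips = 7, 10                # assumed observed data: 7 heads in 10 flips
likelihood = theta**heads * (1 - theta)**(flips - heads)

unnormalized = likelihood * prior   # Bayes' theorem, up to a constant
posterior = unnormalized / unnormalized.sum()

print(theta[np.argmax(posterior)])  # most plausible theta, near 0.7
```

With a flat prior the posterior peaks where the likelihood does; a stronger prior would pull the peak toward the prior's mass.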
Statistics is all about building models and testing them, separating signal from noise.
Descriptive/deductive: summarize the data you have.
Inferential/inductive: generalize from a sample to the population.
Scale levels
- Nominal: Identity/Categorical
- Ordinal: Rank (1-5 stars - the difference between 1 and 2 stars is not necessarily the same as between 4 and 5)
- Metric: Distance (meaningful distances - counts, …)
  - Interval: No true zero (10°C is not twice as hot as 5°C)
  - Ratio: True zero, meaningful ratios (10 K is twice as hot as 5 K)
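The interval-vs-ratio distinction above can be checked with a little arithmetic; the temperatures are the ones from the examples, converted to Kelvin:

```python
# Celsius is an interval scale (arbitrary zero), Kelvin a ratio scale (true zero).
c1, c2 = 5.0, 10.0
k1, k2 = c1 + 273.15, c2 + 273.15   # convert to Kelvin

print(c2 / c1)  # 2.0, but "twice as hot" is not physically meaningful in Celsius
print(k2 / k1)  # about 1.018: the actual ratio of thermodynamic temperatures
```

The same pair of temperatures yields a different "ratio" depending on where the zero sits, which is exactly why ratios are only meaningful on a ratio scale.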
Humans are face detectors with very high sensitivity and a bias toward false positives.