A Bernoulli experiment/event is a random experiment that satisfies the following conditions:

Two Possible Outcomes: Each trial has exactly two possible outcomes: “success” or “failure”.
Independence: The trials are independent; the outcome of one trial does not affect the outcome of another.
Constant Probability: The probability of success, p, remains constant across all trials.

Examples of Bernoulli experiments include flipping a coin (heads or tails), checking if a light bulb works (works or doesn’t work), or rolling a die to see if it shows a 6 (shows 6 or doesn’t). A sequence of Bernoulli experiments is often called a “Bernoulli process” and forms the basis for the binomial distribution, where we are interested in the number of successes over a fixed number of trials.
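A Bernoulli process like the coin-flip example can be simulated in a few lines. This is a minimal sketch; the function names are illustrative, not from any particular library:

```python
import random

def bernoulli_trial(p: float) -> int:
    """One Bernoulli trial: return 1 ("success") with probability p, else 0."""
    return 1 if random.random() < p else 0

def count_successes(n: int, p: float) -> int:
    """Run n independent Bernoulli trials and count the successes.

    The resulting count is a draw from the Binomial(n, p) distribution.
    """
    return sum(bernoulli_trial(p) for _ in range(n))
```

For example, `count_successes(10, 0.5)` simulates ten fair coin flips and returns the number of heads.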

Bernoulli random variable

A random variable X follows a Bernoulli distribution with parameter p (where 0 ≤ p ≤ 1) if it takes only two possible values, typically denoted as 0 and 1, with probabilities:

P(X = 1) = p and P(X = 0) = 1 − p.

The PMF can be written compactly as:

P(X = k) = p^k (1 − p)^(1 − k), for k ∈ {0, 1}.
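The PMF above translates directly into code. A minimal sketch (the function name is illustrative):

```python
def bernoulli_pmf(k: int, p: float) -> float:
    """P(X = k) = p**k * (1 - p)**(1 - k) for k in {0, 1}."""
    if k not in (0, 1):
        raise ValueError("k must be 0 or 1")
    return p**k * (1 - p)**(1 - k)
```

For p = 0.3, `bernoulli_pmf(1, 0.3)` gives 0.3 and `bernoulli_pmf(0, 0.3)` gives 0.7, as expected.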

Key properties:

  • Mean: E[X] = p
  • Variance: Var(X) = p(1 − p)
  • Standard deviation: σ = √(p(1 − p))
  • Entropy: H(X) = −p log p − (1 − p) log(1 − p)

When p = 1/2, the distribution is symmetric with maximum variance (1/4).

This special case corresponds to a fair coin flip. The variance is maximized at p = 1/2 and approaches 0 as p approaches 0 or 1.
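The key properties follow directly from the formulas above. A small sketch that computes them (the helper name is illustrative; entropy is taken in bits, with 0·log 0 = 0 by convention):

```python
import math

def bernoulli_moments(p: float):
    """Return (mean, variance, std, entropy in bits) of a Bernoulli(p) variable."""
    mean = p                      # E[X] = p
    var = p * (1 - p)             # Var(X) = p(1 - p)
    std = math.sqrt(var)          # sigma = sqrt(p(1 - p))
    # Entropy H(X) = -p log2 p - (1 - p) log2 (1 - p); 0 log 0 is taken as 0.
    if p in (0.0, 1.0):
        entropy = 0.0
    else:
        entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return mean, var, std, entropy
```

At p = 1/2 this yields mean 0.5, the maximum variance 0.25, standard deviation 0.5, and the maximum entropy of 1 bit.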

Relation to other distributions

  • A Bernoulli distribution is a special case of the binomial distribution with n = 1.
  • The geometric distribution models the number of Bernoulli trials needed to obtain the first success.
  • The Bernoulli distribution is a special case of the categorical distribution with only two categories.
  • In machine learning, Bernoulli distributions are often used in the output layer of binary classification models, where the model predicts the probability of the positive class.
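As a sketch of the machine-learning connection: a binary classifier typically produces a real-valued score that is mapped through the logistic sigmoid to a probability in (0, 1), which then serves as the Bernoulli parameter for the predicted label. The names below are illustrative:

```python
import math

def sigmoid(z: float) -> float:
    """Logistic sigmoid: maps a real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# A classifier's raw score z is squashed to p = sigmoid(z); the predicted
# label can then be viewed as a Bernoulli(p) random variable.
p = sigmoid(1.2)
```

A score of 0 corresponds to p = 0.5, i.e. maximal uncertainty between the two classes.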