Understanding Probability Distributions: Normal, Binomial, Poisson, and Bernoulli

In data science and statistics, a probability distribution describes how likely each possible value of a random quantity is. Different kinds of data (yes/no outcomes, counts, continuous measurements) follow different patterns, and each pattern has a distribution that fits it.

In this post, we’ll look at four key distributions:

  1. Bernoulli Distribution
  2. Binomial Distribution
  3. Poisson Distribution
  4. Normal Distribution

Let’s explore what each one means, when to use it, and a few examples.


1. Bernoulli Distribution

What is it?

The Bernoulli distribution models a single trial with only two possible outcomes: success (1) or failure (0).

$$P(X = x) = \begin{cases} p & \text{if } x = 1 \\ 1 - p & \text{if } x = 0 \end{cases}$$

Where:

  • $p$ is the probability of success

  • $1 - p$ is the probability of failure
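
As a minimal sketch of the formula above, the probability function and a single simulated trial might look like this in Python. The names `bernoulli_pmf` and `bernoulli_trial`, the seed, and the value `p_buy = 0.25` are illustrative assumptions, not part of any library.

```python
import random


def bernoulli_pmf(x: int, p: float) -> float:
    """Return P(X = x) for a Bernoulli(p) variable; x must be 0 or 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if x == 1:
        return p
    if x == 0:
        return 1.0 - p
    raise ValueError("a Bernoulli outcome is either 0 or 1")


def bernoulli_trial(p: float, rng: random.Random) -> int:
    """Simulate one trial: 1 (success) with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0


rng = random.Random(42)  # arbitrary fixed seed, only for reproducibility
p_buy = 0.25             # hypothetical probability that a customer buys

print(bernoulli_pmf(1, p_buy), bernoulli_pmf(0, p_buy))  # 0.25 0.75
print(bernoulli_trial(p_buy, rng))                       # a single 0 or 1
```

The two probabilities always add up to 1, which is a quick sanity check for any value of $p$.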

Example:

  • Tossing a coin once (Heads = 1, Tails = 0)

  • A customer buys (1) or doesn’t buy (0) a product

When to Use:

When you're modeling a single yes/no event.


2. Binomial Distribution

What is it?

The Binomial distribution models the number of successes in $n$ independent Bernoulli trials, each with the same success probability $p$.

$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$$

Where:

  • $n$ = number of trials

  • $k$ = number of successes

  • $p$ = probability of success
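
To make the formula concrete, here is a small sketch that evaluates it with Python's standard library. The function name `binomial_pmf` is an illustrative choice; `math.comb` supplies the binomial coefficient.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k): the probability of exactly k successes in n trials."""
    if not 0 <= k <= n:
        return 0.0  # impossible counts have probability zero
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


# Probability of exactly 5 heads in 10 tosses of a fair coin.
print(binomial_pmf(5, 10, 0.5))  # 0.24609375
```

So even with a fair coin, an exact 5/5 split happens only about a quarter of the time.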


Example:

  • Number of heads in 10 coin tosses

  • Number of customers who buy out of 100 approached


When to Use:

Use when:

  • There are n independent trials

  • Each trial has only two outcomes

  • The probability $p$ is constant
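
These three conditions translate almost directly into a simulation: repeat the same yes/no trial n times and count the successes. The sketch below assumes a fair coin, 10 tosses per experiment, and an arbitrary seed; `simulate_binomial` is a hypothetical helper name.

```python
import random


def simulate_binomial(n: int, p: float, rng: random.Random) -> int:
    """Count successes across n independent trials with constant probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)


rng = random.Random(0)  # arbitrary fixed seed, only for reproducibility
n, p = 10, 0.5          # assumed setup: 10 tosses of a fair coin

# Repeat the 10-toss experiment many times and average the head counts.
draws = [simulate_binomial(n, p, rng) for _ in range(10_000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean)  # should land close to n * p = 5
```

The average settles near $n \cdot p$, the mean of a Binomial distribution, while individual experiments vary between 0 and 10 heads.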


3. Poisson Distribution

What is it?

The Poisson distribution models the number of events happening in a fixed interval (time, area, space) when these events happen independently and at a constant average rate $\lambda$.

$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

Where:

  • $\lambda$ = average rate of occurrence

  • $k$ = number of events
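
A short sketch of the formula in Python, using only the standard library. The function name `poisson_pmf` and the rate of 3 calls per hour are assumptions made for illustration.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for a Poisson variable with average rate lam."""
    if k < 0:
        return 0.0  # negative counts are impossible
    return lam**k * exp(-lam) / factorial(k)


lam = 3.0  # hypothetical average: 3 calls per hour

# Probability of receiving 0, 1, ..., 5 calls in one hour.
for k in range(6):
    print(k, round(poisson_pmf(k, lam), 4))
```

Summing the probabilities over all values of $k$ gives 1, and the most likely counts sit around $\lambda$.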


Example:

  • Number of calls received at a call center per hour

  • Number of accidents at a junction in a day


When to Use:

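Use when you're counting how many times an event occurs in a fixed interval of time or space, and the events happen independently at a constant average rate. It is a particularly good fit for rare events, where there are many opportunities for the event but each one is unlikely.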


4. Normal Distribution

What is it?

The Normal distribution is the most common continuous probability distribution. It’s also called the bell curve due to its shape.

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

Where:

  • $\mu$ = mean (center)

  • $\sigma$ = standard deviation (spread)
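
The density formula can be sketched in a few lines of Python and cross-checked against `statistics.NormalDist` from the standard library. The function name `normal_pdf` and the height figures (mean 170 cm, standard deviation 10 cm) are illustrative assumptions, not real data.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Return the Normal density f(x) for mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return exp(-((x - mu) ** 2) / (2.0 * sigma**2)) / sqrt(2.0 * pi * sigma**2)


mu, sigma = 170.0, 10.0  # hypothetical heights in cm

print(normal_pdf(mu, mu, sigma))          # the peak of the bell curve
print(normal_pdf(mu - 10.0, mu, sigma))   # one sigma below the mean
print(normal_pdf(mu + 10.0, mu, sigma))   # one sigma above: same height
print(NormalDist(mu, sigma).pdf(180.0))   # standard-library cross-check
```

Note that $f(x)$ is a density, not a probability: for a continuous variable, probabilities come from the area under the curve over a range of values.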


Example:

  • Heights of people

  • Test scores

  • Errors in measurements


When to Use:

When the data is continuous, symmetric, and naturally clusters around a central value.


My Final Thoughts

Each distribution tells us a unique story.

  1. Bernoulli is about one trial with two outcomes.
  2. Binomial extends that to the number of successes in multiple independent trials.
  3. Poisson counts events that occur independently at a constant average rate over time or space.
  4. Normal describes continuous data that clusters symmetrically around a mean.

Understanding when to use which distribution is essential for modeling real-world problems and making informed predictions.
