Posts

Understanding Probability Distributions: Normal, Binomial, Poisson, and Bernoulli

In data science and statistics, probability distributions help us describe how data behaves. Different types of data follow different patterns, and that's where distributions come in. In this post, we’ll look at four key distributions:

- Normal Distribution
- Binomial Distribution
- Poisson Distribution
- Bernoulli Distribution

Let’s explore what they mean, when to use them, and see a few examples.

1. Bernoulli Distribution

What is it?
The Bernoulli distribution models a single trial with only two possible outcomes: success (1) or failure (0).

$$P(X = x) = \begin{cases} p & \text{if } x = 1 \\ 1 - p & \text{if } x = 0 \end{cases}$$

Where:
- $p$ is the probability of success
- $1 - p$ is the probability of failure

Example:
- Tossing a coin (Head = 1, Tail = 0)
- A customer buys (1) or doesn’t buy (0) a product

When to Use:
When you're modeling a single yes/no event.

2. Binomial Distribution

What is...
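The Bernoulli formula from section 1 above can be tried out in a few lines of Python. This is only a minimal sketch using the standard library: the function names `bernoulli_pmf` and `bernoulli_sample`, the value p = 0.5 (a fair coin), and the fixed random seed are illustrative assumptions, not part of the original post.

```python
import random


def bernoulli_pmf(x: int, p: float) -> float:
    """Return P(X = x) for a Bernoulli(p) variable; x must be 0 or 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if x == 1:
        return p          # probability of success
    if x == 0:
        return 1.0 - p    # probability of failure
    raise ValueError("x must be 0 or 1")


def bernoulli_sample(p: float, rng: random.Random) -> int:
    """Draw one trial: 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if rng.random() < p else 0


# Assumed example: a fair coin (Head = 1, Tail = 0), so p = 0.5.
rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [bernoulli_sample(0.5, rng) for _ in range(10_000)]

print(bernoulli_pmf(1, 0.5), bernoulli_pmf(0, 0.5))
print(sum(draws) / len(draws))  # fraction of successes, should be close to 0.5
```

With many simulated trials, the fraction of successes should settle near p, which is one way to sanity-check the formula.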

Introduction to Probability Theory: Independence, Conditional Probability, and Bayes’ Theorem

Probability theory forms the backbone of modern data science, artificial intelligence, and statistics. In this post, we’ll explore three fundamental concepts: Independence, Conditional Probability, and Bayes’ Theorem, with simple explanations and examples.

What is Probability?
Probability is a measure of how likely an event is to occur. It’s a number between 0 and 1:

- 0 means the event cannot happen.
- 1 means the event will definitely happen.
- All numbers in between (like 0.3 or 0.75) represent different levels of likelihood.

1. Independence of Events

Two events are said to be independent if the occurrence of one event does not affect the probability of the other.

Definition: Events A and B are independent if:

$$P(A \text{ and } B) = P(A) \times P(B)$$

Example: Tossing a coin and rolling a die. Getting a Head (H) and rolling a 3 are independent because the outcome of the coin toss does not affect the die roll.

Suppose:

$P(\text{Head}) = 0.5$ ...
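The coin-and-die example above can be checked by listing every outcome and counting. The sketch below assumes a fair coin and a fair six-sided die, so that all 12 combined outcomes are equally likely; the helper name `prob` is hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumption: fair coin and fair six-sided die, so every pair is equally likely.
coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]
sample_space = list(product(coin, die))  # 12 (coin, die) outcomes


def prob(event) -> Fraction:
    """Probability of an event as favorable outcomes / total outcomes."""
    favorable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favorable, len(sample_space))


p_head = prob(lambda o: o[0] == "H")                 # P(Head)
p_three = prob(lambda o: o[1] == 3)                  # P(rolling a 3)
p_both = prob(lambda o: o[0] == "H" and o[1] == 3)   # P(Head and 3)

print(p_head, p_three, p_both)     # 1/2 1/6 1/12
print(p_both == p_head * p_three)  # True: the product rule holds
```

Exact fractions are used instead of floats so the equality check is not affected by rounding.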

Mean, Median, Mode, Variance, and Standard Deviation

When working with data, it's important to summarize and understand its overall behavior. Some of the most common statistical measures for this are Mean, Median, Mode, Variance, and Standard Deviation. Let's break them down one by one, in a simple and intuitive way.

Mean (Average): The mean is what most of us commonly call the "average." It tells us the central value of the data set.

$$\text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}}$$

Example: Suppose we have the data: 5, 7, 9, 10, 12

Then, the mean is:

$$\text{Mean} = \frac{5 + 7 + 9 + 10 + 12}{5} = \frac{43}{5} = 8.6$$

When to use: When the data is relatively evenly distributed without extreme outliers.

Median (Middle Value): The median is the middle number when the data is arranged in order. If there is an odd number of values, the median is the cen...
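The mean calculation above, and the median of the same data, can be reproduced with Python's built-in `statistics` module. This is a small sketch on the example data 5, 7, 9, 10, 12; the variable names are illustrative.

```python
import statistics

data = [5, 7, 9, 10, 12]

# Mean by the formula: sum of all values divided by the number of values.
manual_mean = sum(data) / len(data)

# The same quantities from the standard library.
library_mean = statistics.mean(data)
library_median = statistics.median(data)  # middle value of the sorted data

print(manual_mean)     # 8.6
print(library_mean)    # 8.6
print(library_median)  # 9
```

Because the data has five values (an odd count), the median is simply the third value once the data is sorted.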