Introduction to Discrete Random Variables
Welcome to the world of Discrete Random Variables! While the name sounds like a mouthful, the concept is something you use every day. If you’ve ever wondered about the "average" number of goals a team scores or the probability of winning a game on your third try, you’re already thinking about random variables.
In this chapter, we’ll move beyond simple probability and look at how we can model, calculate, and predict outcomes for events that have countable results. Don't worry if Statistics has felt a bit dry before—we’re going to break this down into bite-sized pieces with plenty of real-world context!
1. General Discrete Random Variables
A Discrete Random Variable (DRV) is a variable whose value depends on the outcome of a random event, and whose possible values are discrete (meaning you can list them one by one, like 0, 1, 2, ..., with gaps in between rather than a continuous range). We usually use a capital letter like \( X \) to represent the variable itself, and a lowercase \( x \) for the specific values it can take.
Probability Distributions
A probability distribution is just a way of showing every possible value of \( X \) and the probability of that value happening. We often show this in a table.
Example: Let \( X \) be the number of heads when flipping a coin twice.
\( X = 0, 1, 2 \)
\( P(X=0) = 0.25 \)
\( P(X=1) = 0.50 \)
\( P(X=2) = 0.25 \)
Important Rule: The sum of all probabilities in a distribution must always equal 1. Symbolically: \( \sum P(X=x) = 1 \). If your table doesn't add up to 1, something is wrong!
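If you'd like to see where the numbers in the coin example come from, here is a small Python sketch. It simply lists the four equally likely outcomes of two fair flips and counts the heads. The names `outcomes`, `distribution` and `total` are just illustrative choices, and the coin is assumed to be fair.

```python
from fractions import Fraction
from itertools import product

# List the four equally likely outcomes of two fair coin flips:
# ('H','H'), ('H','T'), ('T','H'), ('T','T').
outcomes = list(product("HT", repeat=2))

# Build the distribution of X = number of heads.
distribution = {}
for outcome in outcomes:
    heads = outcome.count("H")
    distribution[heads] = distribution.get(heads, Fraction(0)) + Fraction(1, len(outcomes))

for x in sorted(distribution):
    print(f"P(X={x}) = {distribution[x]}")

# Important Rule check: the probabilities must sum to exactly 1.
total = sum(distribution.values())
print("Total:", total)
```

Using `Fraction` keeps the probabilities exact (1/4, 1/2, 1/4), so the "sums to 1" check is not affected by rounding.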
Expectation and Variance
These tell us the "center" and the "spread" of the distribution.
- Expectation \( E(X) \): This is the long-term average value. We often call it the mean, \( \mu \).
Formula: \( E(X) = \mu = \sum x P(X=x) \)
- Variance \( Var(X) \): This measures how far the values tend to spread out from the mean.
Formula: \( Var(X) = \sigma^2 = \sum x^2 P(X=x) - \mu^2 \)
Quick Review Box:
1. Expectation = Sum of (Value \(\times\) its Probability).
2. Variance = Sum of (Value² \(\times\) Probability) minus (Mean)².
Common Mistake: When calculating Variance, students often forget to subtract the \( \mu^2 \) at the end. Always double-check your final step!
Key Takeaway: A probability distribution tells you the "what" and the "how likely." \( E(X) \) tells you the average, and \( Var(X) \) tells you the consistency.
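To make the two formulas concrete, here is a minimal Python sketch applied to the two-coin distribution from earlier. The function names `expectation` and `variance` are our own (not from any library), and the distribution is stored as a plain dictionary of value: probability pairs.

```python
def expectation(dist):
    """E(X) = sum of (value x its probability)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = sum of (value^2 x probability) minus (mean)^2."""
    mu = expectation(dist)
    return sum(x**2 * p for x, p in dist.items()) - mu**2

# X = number of heads in two fair coin flips.
coin = {0: 0.25, 1: 0.50, 2: 0.25}

mean = expectation(coin)  # 0(0.25) + 1(0.50) + 2(0.25)
var = variance(coin)      # [0(0.25) + 1(0.50) + 4(0.25)] - mean^2

print("E(X)   =", mean)   # 1.0
print("Var(X) =", var)    # 0.5
```

Notice that `variance` subtracts `mu**2` as its very last step, which is exactly the step the Common Mistake note warns about.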
2. Linear Coding
Sometimes we want to change our data—maybe by doubling every score or adding 5 points to everyone. This is called linear coding, usually written as \( Y = aX + b \).
How the Mean and Variance Change:
- The Mean: It is affected by everything. If you add 5 to every score, the average goes up by 5. If you double every score, the average doubles.
\( E(aX + b) = aE(X) + b \)
- The Variance: It is only affected by multiplication. Adding a constant doesn't change the "spread" (the distance between points stays the same). However, because variance is measured in squared units, the multiplier \( a \) becomes \( a^2 \).
\( Var(aX + b) = a^2 Var(X) \)
Analogy: Imagine a group of friends standing in a line. If they all take two steps forward (adding \( b \)), the average position moves, but the distance between them (variance) stays the same. If they all double their distance from the start (multiplying by \( a \)), they spread out much further!
Key Takeaway: Adding/subtracting moves the mean but leaves the variance unchanged. Multiplying/dividing scales the mean by \( a \) and the variance by \( a^2 \).
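You can check the coding rules numerically. The sketch below takes the two-coin distribution, applies \( Y = 2X + 5 \) (the values \( a = 2 \) and \( b = 5 \) are just an illustrative choice), and compares the directly calculated mean and variance of \( Y \) with what the rules predict. The helper names are our own.

```python
def expectation(dist):
    return sum(x * p for x, p in dist.items())

def variance(dist):
    mu = expectation(dist)
    return sum(x**2 * p for x, p in dist.items()) - mu**2

# X = number of heads in two fair coin flips.
X = {0: 0.25, 1: 0.50, 2: 0.25}

a, b = 2, 5  # illustrative coding: double every value, then add 5

# Distribution of Y = aX + b: same probabilities, recoded values.
Y = {a * x + b: p for x, p in X.items()}

mean_Y_direct = expectation(Y)
var_Y_direct = variance(Y)

mean_Y_rule = a * expectation(X) + b   # E(aX + b) = aE(X) + b
var_Y_rule = a**2 * variance(X)        # Var(aX + b) = a^2 Var(X)

print("E(Y):  ", mean_Y_direct, "vs rule", mean_Y_rule)
print("Var(Y):", var_Y_direct, "vs rule", var_Y_rule)
```

The added 5 shows up in the mean but not in the variance, while the doubling multiplies the variance by 4.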
3. The Discrete Uniform Distribution
In a discrete uniform distribution, every outcome is equally likely. Think of a fair 6-sided die: every number has a \( 1/6 \) chance.
We use the notation \( X \sim U(n) \) when \( X \) is equally likely to take each of the integer values \( 1, 2, \dots, n \).
- Probability: \( P(X=x) = \frac{1}{n} \)
- Mean: \( E(X) = \frac{n+1}{2} \)
- Variance: \( Var(X) = \frac{n^2 - 1}{12} \)
Key Takeaway: Use this model when everything has the same chance of happening over a fixed range of integers.
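As a quick sanity check, here is a short Python sketch that builds the distribution for a fair 6-sided die and compares the mean and variance worked out "the long way" with the shortcut formulas above. The variable names are illustrative, and `Fraction` is used so the comparison is exact.

```python
from fractions import Fraction

n = 6  # a fair 6-sided die, values 1..n

# Every value 1, 2, ..., n has probability 1/n.
die = {x: Fraction(1, n) for x in range(1, n + 1)}

# The long way: straight from the definitions.
mean_direct = sum(x * p for x, p in die.items())
var_direct = sum(x**2 * p for x, p in die.items()) - mean_direct**2

# The shortcut formulas.
mean_formula = Fraction(n + 1, 2)
var_formula = Fraction(n**2 - 1, 12)

print("Mean:    ", mean_direct, "and", mean_formula)  # 7/2 and 7/2
print("Variance:", var_direct, "and", var_formula)    # 35/12 and 35/12
```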
4. The Binomial Distribution
You’ve likely seen this in A Level Maths! We use \( X \sim B(n, p) \) where \( n \) is the number of trials and \( p \) is the probability of success.
New Formulas for Further Maths:
Instead of just using tables, you need to know these quick formulas for the mean and variance:
\( \mu = np \)
\( \sigma^2 = np(1 - p) \)
Example: If you flip a coin 100 times, the expected number of heads is \( 100 \times 0.5 = 50 \).
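If you want to convince yourself that the quick formulas agree with the full distribution, the sketch below builds \( B(100, 0.5) \) from the usual binomial probability \( \binom{n}{x} p^x (1-p)^{n-x} \) and computes the mean and variance directly. The function name `binomial_distribution` is our own, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction
from math import comb

def binomial_distribution(n, p):
    """Return {x: P(X = x)} for X ~ B(n, p)."""
    return {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}

n, p = 100, Fraction(1, 2)  # 100 flips of a fair coin
dist = binomial_distribution(n, p)

# Straight from the definitions of E(X) and Var(X).
mean_direct = sum(x * prob for x, prob in dist.items())
var_direct = sum(x**2 * prob for x, prob in dist.items()) - mean_direct**2

print("Mean:    ", mean_direct, "and np =", n * p)                 # 50 and 50
print("Variance:", var_direct, "and np(1-p) =", n * p * (1 - p))   # 25 and 25
```

So for 100 flips the variance is 25, which gives a standard deviation of 5 heads.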
5. The Geometric Distribution
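This is a new one! The Geometric Distribution models the number of trials needed to get the first success, counting the successful trial itself. Each trial is assumed to be independent, with the same probability of success \( p \). Imagine you are trying to throw a ball into a hoop. You keep throwing until you get one in, then you stop.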
We use the notation \( X \sim Geo(p) \).
Formulas to Know:
- Probability that the first success is on the \( x \)-th try: \( P(X=x) = (1-p)^{x-1}p \)
(This means you had \( x-1 \) failures followed by 1 success.)
- Probability it takes more than \( x \) tries: \( P(X > x) = (1-p)^x \)
(This is a handy shortcut! It just means you failed the first \( x \) times.)
- Mean: \( E(X) = \frac{1}{p} \)
- Variance: \( Var(X) = \frac{1-p}{p^2} \)
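Did you know? The "mode" (the most likely outcome) of any geometric distribution is always 1. Succeeding on the very first try is more likely than first succeeding on any other particular try, even if \( p \) is low! (That doesn't mean a first-try success is likely overall: its probability is still just \( p \).)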
Key Takeaway: Use the Geometric distribution when you are "waiting for the first success."
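Here is a rough Python sketch of the geometric formulas in action. The value \( p = 0.2 \) is just a made-up example (say, a 20% chance of scoring each throw), and the function names are our own. Because \( X \) has no upper limit, the mean and variance are approximated by adding up the first 2000 terms, which is more than enough for the leftover probability to be negligible here.

```python
import math

def geometric_pmf(x, p):
    """P(X = x): x - 1 failures followed by one success."""
    return (1 - p) ** (x - 1) * p

def geometric_tail(x, p):
    """P(X > x): the first x tries all fail."""
    return (1 - p) ** x

p = 0.2  # illustrative probability of success on each try

# Shortcut check: P(X > 3) should equal 1 - [P(X=1) + P(X=2) + P(X=3)].
tail_shortcut = geometric_tail(3, p)
tail_long_way = 1 - sum(geometric_pmf(k, p) for k in range(1, 4))

# Approximate E(X) and Var(X) by summing the first 2000 terms.
xs = range(1, 2001)
approx_mean = sum(x * geometric_pmf(x, p) for x in xs)
approx_var = sum(x**2 * geometric_pmf(x, p) for x in xs) - approx_mean**2

# The most likely single value of X (the mode).
mode = max(range(1, 50), key=lambda x: geometric_pmf(x, p))

print("P(X > 3):", tail_shortcut, "vs", tail_long_way)
print("Mean:    ", approx_mean, "vs 1/p =", 1 / p)
print("Variance:", approx_var, "vs (1-p)/p^2 =", (1 - p) / p**2)
print("Mode:    ", mode)
```

The printed pairs may differ in the last decimal place or two because of floating-point rounding.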
6. The Poisson Distribution
The Poisson Distribution models the number of times an event occurs in a fixed interval of time or space. Examples include the number of emails you get in an hour or the number of chocolate chips in a cookie.
We use the notation \( X \sim Po(\lambda) \), where \( \lambda \) (lambda) is the average number of occurrences in the interval.
Conditions for a Poisson Model:
For a Poisson model to work, events must occur:
- Independently: One event doesn't affect the next.
- Singly: Two events can't happen at exactly the same time.
- At a constant average rate.
The Formula:
\( P(X=x) = \frac{e^{-\lambda} \lambda^x}{x!} \)
Special Properties:
- The Mean and Variance: In a Poisson distribution, the mean and variance are equal!
\( E(X) = Var(X) = \lambda \)
- Adding Poisson Variables: If you have two independent Poisson variables, \( X \sim Po(\lambda) \) and \( Y \sim Po(\mu) \), then their sum is also Poisson:
\( X + Y \sim Po(\lambda + \mu) \)
Don't worry if the formula looks scary. Most of the time, you will use your calculator's built-in Poisson functions to find these probabilities!
Key Takeaway: Look for "average rate" or "occurrences per interval" to spot a Poisson problem. Remember that Mean = Variance.
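If you're curious what your calculator is doing behind the scenes, here is a small Python sketch of the Poisson formula and its two special properties. The rates 3, 2 and 5 are made-up illustrative values and the function name `poisson_pmf` is our own. Since \( X \) has no upper limit, the sums stop after 60 terms, where the remaining probability is negligible for these rates.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x!"""
    return math.exp(-lam) * lam**x / math.factorial(x)

lam = 3.0          # illustrative rate, e.g. 3 emails per hour
xs = range(60)     # enough terms for the tail to be negligible

total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
var = sum(x**2 * poisson_pmf(x, lam) for x in xs) - mean**2

print("Total probability:", total)   # approximately 1
print("Mean:             ", mean)    # approximately lam
print("Variance:         ", var)     # approximately lam

# Adding Poisson variables: with X ~ Po(2) and Y ~ Po(3) independent,
# P(X + Y = 4) is found by adding up every way of splitting 4 between them.
k = 4
prob_sum_long_way = sum(
    poisson_pmf(i, 2.0) * poisson_pmf(k - i, 3.0) for i in range(k + 1)
)
prob_sum_shortcut = poisson_pmf(k, 5.0)  # X + Y ~ Po(2 + 3)

print("P(X + Y = 4):", prob_sum_long_way, "vs", prob_sum_shortcut)
```

The two values on the last line agree (up to rounding), which is the "sum is also Poisson" property at work for one particular value.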
Final Summary Review
Discrete Uniform: Everything is equally likely.
Binomial: Fixed number of tries, looking for number of successes.
Geometric: Repeating tries until the first success happens.
Poisson: Counting how many events happen in a certain time/space.
Congratulations! You've just covered the core of Discrete Random Variables. Keep practicing those formulas, and always check if your probabilities sum to 1!