Welcome to Geometric and Negative Binomial Distributions!

In your previous studies, you’ve likely encountered the Binomial Distribution, which counts the number of successes in a fixed number of trials. In this chapter, we are turning that on its head! Instead of fixing the number of trials, we are waiting for successes to happen. This is why these are often called "waiting time" distributions.

Don't worry if this seems a bit abstract at first. Whether you are waiting for a "6" on a fair die or waiting for a specific number of defective items to roll off an assembly line, the logic remains the same. Let's dive in!

1. The Geometric Distribution

The Geometric Distribution is used when we are interested in the number of trials required to get the very first success.

When do we use it?

We use this model when:
• Each trial has only two outcomes: Success or Failure.
• The trials are independent (one result doesn't affect the next).
• The probability of success, \( p \), is constant for every trial.
• We stop as soon as we achieve the first success.

The Probability Mass Function (PMF)

If \( X \) is the number of trials until the first success, we write \( X \sim \text{Geo}(p) \). The formula for the probability of the first success occurring on trial \( x \) is:
\( P(X = x) = p(1 - p)^{x-1} \) for \( x = 1, 2, 3, ... \)

Breaking down the formula:
Imagine you succeed on the 5th attempt. This means you must have failed 4 times and then succeeded once.
• \( (1-p)^{x-1} \) represents the \( x-1 \) failures.
• \( p \) represents that final success.

Quick Review Box:
• \( p \): Probability of success.
• \( (1-p) \): Probability of failure (often written as \( q \)).
• \( x \): The trial number where you get your first win.

Real-World Example

Example: Suppose the probability of hitting a target with a bow and arrow is 0.2. What is the probability that your first hit is on your 3rd attempt?
Here, \( p = 0.2 \) and \( x = 3 \).
Using the formula: \( P(X = 3) = 0.2 \times (1 - 0.2)^{3-1} = 0.2 \times 0.8^2 = 0.128 \).
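The calculation above can be checked with a few lines of Python. This is just an illustrative helper (the function name `geometric_pmf` is our own, not from any library), implementing the formula exactly as written:

```python
# Geometric PMF: P(X = x) = p * (1 - p)**(x - 1)
# Reproduces the archery example: p = 0.2, first hit on attempt 3.

def geometric_pmf(x: int, p: float) -> float:
    """Probability that the first success occurs on trial x."""
    return p * (1 - p) ** (x - 1)

print(geometric_pmf(3, 0.2))  # approximately 0.128
```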

A Very Useful Trick: Cumulative Probabilities

Sometimes you need to find the probability that it takes more than \( k \) trials to get a success.
\( P(X > k) = (1 - p)^k \)
Why it works: If you haven't succeeded by trial \( k \), it simply means you have failed \( k \) times in a row! This is much faster than summing up individual probabilities.
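To convince yourself the shortcut agrees with summing the PMF, here is a small numerical check (the value \( p = 0.2 \) is just an illustrative choice):

```python
# Check P(X > k) = (1 - p)**k against a direct sum of the geometric PMF.
p, k = 0.2, 5

shortcut = (1 - p) ** k                                        # (1-p)^k
direct = 1 - sum(p * (1 - p) ** (x - 1) for x in range(1, k + 1))  # 1 - P(X <= k)

print(shortcut, direct)  # both approximately 0.32768
```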

Key Takeaway: The Geometric distribution is all about "How many tries until I win once?"

2. Mean and Variance of Geometric Distribution

How many trials do we expect to need? And how much do we expect the results to vary?

The Formulae

For \( X \sim \text{Geo}(p) \):
Mean (Expected Value): \( E(X) = \mu = \frac{1}{p} \)
Variance: \( \text{Var}(X) = \sigma^2 = \frac{1 - p}{p^2} \)

Did you know?
If you have a 1 in 6 chance of rolling a "6" on a die (\( p = 1/6 \)), the mean is \( 1 \div (1/6) = 6 \). This makes perfect sense—you'd expect to roll the die 6 times to see one "6"!

Common Mistake to Avoid:
Students often swap the numerator and denominator. Just remember: if the probability of success is tiny, the expected number of trials should be huge!

3. The Negative Binomial Distribution

Think of the Negative Binomial Distribution as the "big brother" of the Geometric distribution. Instead of waiting for the first success, we are waiting for the \( r \)-th success.

The Probability Mass Function (PMF)

If \( X \) is the number of trials until the \( r \)-th success, we write \( X \sim \text{NB}(r, p) \). The formula is:
\( P(X = x) = \binom{x-1}{r-1} p^r (1 - p)^{x-r} \) for \( x = r, r+1, r+2, ... \)

Step-by-Step Explanation:
This formula looks scary, but let's break it down using an example where we want the 3rd success (\( r=3 \)) on the 10th trial (\( x=10 \)):
1. The End: The 10th trial must be a success. That’s one \( p \).
2. The Before: In the first 9 trials (\( x-1 \)), you must have had exactly 2 successes (\( r-1 \)). This is why we use the combination \( \binom{9}{2} \).
3. The Rest: You have \( r \) total successes (so \( p^r \)) and \( x-r \) total failures (so \( (1-p)^{x-r} \)).

Real-World Example

Example: A basketball player has a 0.7 chance of making a free throw. What is the probability that they score their 5th basket on their 8th attempt?
Here, \( p = 0.7 \), \( r = 5 \), and \( x = 8 \).
\( P(X = 8) = \binom{8-1}{5-1} (0.7)^5 (1 - 0.7)^{8-5} = \binom{7}{4} (0.7)^5 (0.3)^3 = 35 \times 0.16807 \times 0.027 \approx 0.159 \).
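The combination and powers above are tedious by hand, so here is a short script that evaluates the Negative Binomial PMF directly (the helper name `neg_binomial_pmf` is ours; `math.comb` computes \( \binom{n}{k} \)):

```python
from math import comb

# Negative binomial PMF: P(X = x) = C(x-1, r-1) * p**r * (1-p)**(x-r)
# Reproduces the free-throw example: p = 0.7, r = 5, x = 8.

def neg_binomial_pmf(x: int, r: int, p: float) -> float:
    """Probability that the r-th success occurs on trial x."""
    return comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)

print(round(neg_binomial_pmf(8, 5, 0.7), 3))  # approximately 0.159
```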

Key Takeaway: The Negative Binomial distribution is for when you need more than one success and want to know how long it will take.

4. Mean and Variance of Negative Binomial Distribution

Since a Negative Binomial variable is the sum of \( r \) independent Geometric waiting times (wait for the 1st success, then the 2nd, and so on), the formulas are very similar!

The Formulae

For \( X \sim \text{NB}(r, p) \):
Mean: \( E(X) = \mu = \frac{r}{p} \)
Variance: \( \text{Var}(X) = \sigma^2 = \frac{r(1 - p)}{p^2} \)

Memory Aid:
Just take the Geometric formulas and multiply them by \( r \). It's that simple!
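The mean formula can be verified numerically by summing \( x \, P(X = x) \) over enough terms that the tail is negligible. The parameter values below (\( r = 3 \), \( p = 0.5 \)) are arbitrary illustrative choices:

```python
from math import comb

# Numerical check that E(X) = r/p for the negative binomial.
r, p = 3, 0.5

def pmf(x: int) -> float:
    return comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)

# Sum x * P(X = x) from x = r upward; terms decay fast, so 200 is plenty.
mean = sum(x * pmf(x) for x in range(r, 200))
print(mean)  # approximately r / p = 6.0
```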

5. Hypothesis Testing with the Geometric Distribution

In Further Statistics 1, you need to be able to test if a claimed probability \( p \) is accurate based on how long it took for a success to occur.

The Process

1. State Hypotheses:
• \( H_0: p = \text{claimed value} \)
• \( H_1: p < \text{value} \) or \( p > \text{value} \) or \( p \neq \text{value} \).
2. Calculate the Probability: Use the trial result \( x \) from the sample.
• If testing if \( p \) is lower than claimed, you are looking for a "surprisingly long" wait: Calculate \( P(X \ge x) \).
• If testing if \( p \) is higher than claimed, you are looking for a "surprisingly short" wait: Calculate \( P(X \le x) \).
3. Compare: Compare your p-value to the significance level \( \alpha \).
4. Conclude: If your p-value \( < \alpha \), reject \( H_0 \).
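The four steps above can be sketched in code. The numbers here are made up for illustration: a claimed \( p = 0.3 \), a suspicion that \( p \) is lower, and a first success observed on trial \( x = 10 \), tested at the 5% level:

```python
# One-tailed geometric test: H0: p = 0.3 vs H1: p < 0.3, alpha = 0.05.
p0, x, alpha = 0.3, 10, 0.05

# A long wait suggests p is lower than claimed, so the p-value is
# P(X >= x) = P(X > x - 1) = (1 - p0)**(x - 1).
p_value = (1 - p0) ** (x - 1)

print(round(p_value, 4))
if p_value < alpha:
    print("Reject H0: significant evidence that p < 0.3")
else:
    print("Do not reject H0")
```

Here \( P(X \ge 10) = 0.7^9 \approx 0.0404 < 0.05 \), so we would reject \( H_0 \) at the 5% level.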

Good news: Hypothesis testing here follows the same logic as the Binomial testing you did in A-Level Maths—only the distribution formula has changed!

6. Choosing the Right Distribution

If you're stuck on which one to use, ask yourself these questions:

Is the number of trials fixed?

Yes: Use Binomial.
No: (You are waiting for a success) Use Geometric or Negative Binomial.

How many successes are you waiting for?

Exactly One: Use Geometric.
More than One (\( r \)): Use Negative Binomial.

Summary of Chapter 3:
Geometric: Waiting for the first success. \( E(X) = 1/p \).
Negative Binomial: Waiting for the \( r \)-th success. \( E(X) = r/p \).
Independence: Always check that trials don't depend on each other before using these models!
Hypothesis Testing: Always state your hypotheses in terms of the parameter \( p \).