Welcome to the World of Random Events!
In this chapter, we are going to explore how to count things that happen randomly. Whether it is the number of typos on a page, the number of emails you get in an hour, or how many chocolate chips end up in a cookie, statistics helps us predict the "unpredictable." We will be looking at the Poisson Distribution and how it links back to the Binomial Distribution you have seen before. Don't worry if it feels like a lot of symbols at first—we will break it down step-by-step!
1. The Poisson Distribution: The Basics
The Poisson Distribution is used to model the number of times an event occurs within a fixed interval of time or space.
When do we use it?
For a situation to be modeled by a Poisson distribution, four conditions must be met. You can remember these with the acronym RISH:
• Randomly: Events occur randomly.
• Independently: One event happening doesn't change the chance of another happening.
• Singly: Events cannot happen at exactly the same time.
• Highly uniform: Events occur at a constant average rate (\(\lambda\)).
Example: If you are counting cars passing a gate, they should pass at a constant average rate (e.g., 2 cars per minute), but the exact timing of each car is random and independent of the others.
The Formula:
If a random variable \(X\) follows a Poisson distribution with a mean rate \(\lambda\), we write it as:
\(X \sim \text{Po}(\lambda)\)
The probability of seeing exactly \(x\) events is:
\(P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}\)
Calculator Tip: On your calculator, you don't usually need to type out this whole formula! Look for the Poisson PD function (for an exact number, like "exactly 3") or the Poisson CD function (for a cumulative range: it gives \(P(X \leq x)\), so "less than 3" means \(P(X \leq 2)\)).
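If you want to see what the calculator is doing behind the scenes, here is a minimal Python sketch of the formula above. The function names `poisson_pmf` and `poisson_cdf` are made up for this illustration (they are not calculator or library names), and the rate of 2 cars per minute is borrowed from the gate example.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Po(lam): what "Poisson PD" returns."""
    return math.exp(-lam) * lam**x / math.factorial(x)

def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x) for X ~ Po(lam): what "Poisson CD" returns."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

# Cars pass a gate at an average rate of 2 per minute (illustrative value).
p_exactly_3 = poisson_pmf(3, 2)    # P(X = 3)
p_less_than_3 = poisson_cdf(2, 2)  # P(X < 3) is the same as P(X <= 2)

print(round(p_exactly_3, 4), round(p_less_than_3, 4))  # prints 0.1804 0.6767
```

Notice that "less than 3" is entered as 2 in the cumulative function, because the cumulative probability always includes the value you type in.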
Key Takeaway: The Poisson distribution is all about the rate (\(\lambda\)) at which things happen in a specific "window" of time or space.
2. The Additive Property: Scaling and Combining
One of the coolest things about the Poisson distribution is how flexible the rate \(\lambda\) is. Since it is a rate, you can scale it up or down easily.
Scaling the Rate
If \(X\) is the number of events in one minute and \(X \sim \text{Po}(\lambda)\), then the number of events in a period of \(t\) minutes follows:
\(X_{new} \sim \text{Po}(\lambda t)\)
Everyday Analogy: If you usually receive 2 text messages per hour (\(\lambda = 2\)), then in a 5-hour period, you would expect to receive 10 messages (\(2 \times 5 = 10\)). Your new distribution for the 5-hour block is \(\text{Po}(10)\).
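The text-message analogy can be checked with a short Python sketch. The helper `poisson_pmf` is a hypothetical name used only for this illustration; the numbers come from the analogy above.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**x / math.factorial(x)

rate_per_hour = 2   # messages per hour, from the analogy
hours = 5           # length of the new time window
lam_block = rate_per_hour * hours  # scaled rate for the 5-hour block

# Probability of exactly 10 messages in the 5-hour block, using Po(10).
p_exactly_10 = poisson_pmf(10, lam_block)
print(lam_block, round(p_exactly_10, 4))  # prints 10 0.1251
```

Even though 10 is the expected number of messages, the chance of getting exactly 10 is only about 1 in 8, because the probability is shared out over many possible counts.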
Adding Independent Variables
If you have two independent Poisson variables, \(X\) and \(Y\), you can add them together:
If \(X \sim \text{Po}(\lambda)\) and \(Y \sim \text{Po}(\mu)\), then:
\(X + Y \sim \text{Po}(\lambda + \mu)\)
Example: If a shop has two entrances, and the number of people entering Door A is \(\text{Po}(3)\) per hour and Door B is \(\text{Po}(4)\) per hour, the total number of people entering the shop is \(\text{Po}(3+4) = \text{Po}(7)\) per hour.
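One way to convince yourself that the two doors really combine into \(\text{Po}(7)\) is to work out the total "the long way" and compare. The sketch below does this in Python; `poisson_pmf` and `pmf_of_sum` are names invented for this illustration, and the method assumes the two doors are independent.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**x / math.factorial(x)

def pmf_of_sum(total: int, lam: float, mu: float) -> float:
    """P(X + Y = total) for independent X ~ Po(lam) and Y ~ Po(mu).

    Adds up every way of splitting the total between the two variables.
    """
    return sum(
        poisson_pmf(k, lam) * poisson_pmf(total - k, mu)
        for k in range(total + 1)
    )

# Door A ~ Po(3) and Door B ~ Po(4), both per hour.
for total in range(6):
    long_way = pmf_of_sum(total, 3, 4)
    shortcut = poisson_pmf(total, 7)
    print(total, round(long_way, 6), round(shortcut, 6))
```

The two columns should agree for every total, which is exactly what the additive property says.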
Key Takeaway: You can add Poisson rates as long as the events are independent and the time/space intervals match!
3. Mean and Variance: The "Magic" of Poisson
In statistics, we often look at the Expectation (Mean) and the Variance (how spread out the data is). Here is how they compare for our two favorite distributions:
For the Binomial Distribution \(B(n, p)\):
• Mean: \(E(X) = np\)
• Variance: \(Var(X) = np(1-p)\)
For the Poisson Distribution \(\text{Po}(\lambda)\):
• Mean: \(E(X) = \lambda\)
• Variance: \(Var(X) = \lambda\)
Did you know?
The Poisson distribution has a distinctive property: its mean and variance are both equal to \(\lambda\)! This comes up in exam questions very often. If a question gives you data where the sample mean and variance are similar, it’s a strong hint that a Poisson model may be appropriate.
Quick Review:
If \(Var(X) \approx E(X)\), then Poisson is a good model.
If \(Var(X) < E(X)\), a Binomial model might be better, because \(np(1-p)\) is always smaller than \(np\) when \(0 < p < 1\).
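Here is a small Python sketch of that check. The data set `typos_per_page` is invented purely for illustration, and the sketch uses the sample variance (dividing by \(n - 1\)); your course may use the version that divides by \(n\), which gives a slightly smaller number.

```python
import statistics

# Hypothetical data: number of typos found on each of 20 pages.
typos_per_page = [2, 0, 3, 1, 4, 2, 1, 0, 2, 3, 5, 1, 2, 2, 0, 3, 1, 4, 2, 2]

mean = statistics.fmean(typos_per_page)
variance = statistics.variance(typos_per_page)  # sample variance (n - 1)

# A ratio close to 1 suggests a Poisson model is plausible;
# a ratio well below 1 points towards a Binomial model instead.
ratio = variance / mean
print(f"mean = {mean:.2f}, variance = {variance:.2f}, ratio = {ratio:.2f}")
```

How close to 1 counts as "close enough" is a judgment call, so in an exam, state the two values and comment on whether they are similar.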
4. The Poisson Approximation to the Binomial
Sometimes, calculating probabilities for the Binomial distribution \(B(n, p)\) is a nightmare—especially if \(n\) is huge (like 1,000) and \(p\) is tiny (like 0.001). In these cases, we can use the Poisson distribution as a shortcut!
When can we use this shortcut?
You can approximate \(B(n, p)\) with \(\text{Po}(\lambda)\) when:
1. \(n\) is large (usually \(n > 50\))
2. \(p\) is small (usually \(p < 0.1\))
The Process:
Step 1: Check if \(n\) is large and \(p\) is small.
Step 2: Calculate the rate \(\lambda\) using the formula \(\lambda = np\).
Step 3: Use the Poisson distribution \(\text{Po}(np)\) to find your probabilities.
Example: A factory produces 1,000 lightbulbs, and the probability of a bulb being faulty is 0.005. Instead of using \(B(1000, 0.005)\), we use \(\text{Po}(1000 \times 0.005) = \text{Po}(5)\). It’s much faster to calculate!
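To see how good the shortcut is for the lightbulb example, the sketch below puts the exact Binomial probabilities next to the Poisson ones. The function names are invented for this illustration; `math.comb` needs Python 3.8 or later.

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**x / math.factorial(x)

n, p = 1000, 0.005   # lightbulb example
lam = n * p          # Poisson rate, lambda = np = 5

# Compare the two models for a few numbers of faulty bulbs.
for x in range(0, 11, 2):
    exact = binomial_pmf(x, n, p)
    approx = poisson_pmf(x, lam)
    print(x, round(exact, 4), round(approx, 4))
```

The two columns should be very close to each other, which is why the approximation is considered safe when \(n\) is large and \(p\) is small.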
Key Takeaway: The Poisson distribution is the "limit" of the Binomial distribution as the number of trials \(n\) increases and the probability of success \(p\) decreases, with the mean \(np = \lambda\) staying fixed.
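If you are curious where this comes from, here is a sketch of the reasoning (you are unlikely to need to reproduce it). Hold the mean fixed by writing \(p = \lambda / n\), then rearrange the Binomial probability:

\[
P(X = x) = \binom{n}{x}\left(\frac{\lambda}{n}\right)^{x}\left(1 - \frac{\lambda}{n}\right)^{n - x}
= \frac{\lambda^{x}}{x!} \cdot \frac{n(n-1)\cdots(n-x+1)}{n^{x}} \cdot \left(1 - \frac{\lambda}{n}\right)^{n} \cdot \left(1 - \frac{\lambda}{n}\right)^{-x}
\]

As \(n\) gets very large (with \(x\) fixed), the second factor tends to 1, the third tends to \(e^{-\lambda}\), and the fourth tends to 1, leaving \(\frac{e^{-\lambda} \lambda^x}{x!}\), which is the Poisson formula.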
5. Common Pitfalls to Avoid
Don't worry if this seems tricky at first, but keep these common mistakes in mind:
• Forgetting to scale \(\lambda\): Always check if the time interval in the question matches the time interval of your \(\lambda\). If \(\lambda\) is "per day" but the question asks about "per week," you must multiply by 7 first.
• Independence: You can only add Poisson variables if they are independent. If one event triggers another (like a contagious disease), Poisson is usually not a good model.
• Binomial vs. Poisson: Remember that a Binomial variable has a fixed upper limit (\(n\)), while a Poisson variable has no upper limit: any whole-number count is possible, however large, although very large counts have probabilities extremely close to zero.
• Calculator Modes: Double-check if you need PD (exactly \(x\)) or CD (up to \(x\)). If the question says "more than 5," you need to calculate \(1 - P(X \leq 5)\) using the CD mode.
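The "more than" trap in the last bullet is easy to see in a short Python sketch. The rate \(\lambda = 5\) is an assumed value chosen only for illustration, and `poisson_cdf` is a made-up name standing in for the calculator's CD mode.

```python
import math

def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x) for X ~ Po(lam), like the calculator's CD mode."""
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(x + 1))

lam = 5  # assumed rate for this illustration

# "More than 5" excludes 5 itself, so subtract P(X <= 5).
p_more_than_5 = 1 - poisson_cdf(5, lam)

# "At least 5" includes 5, so subtract P(X <= 4) instead.
p_at_least_5 = 1 - poisson_cdf(4, lam)

print(f"{p_more_than_5:.4f} {p_at_least_5:.4f}")  # prints 0.3840 0.5595
```

The two answers are quite different, so always decide whether the boundary value is included before you reach for the CD mode.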
Summary Checklist
• Can I list the conditions for a Poisson distribution? (RISH)
• Do I know that for Poisson, Mean = Variance = \(\lambda\)?
• Can I scale \(\lambda\) for different time intervals?
• Do I know the conditions to approximate Binomial with Poisson (\(n\) large, \(p\) small)?
• Can I use my calculator to find \(P(X = x)\) and \(P(X \leq x)\)?