Welcome to Hypothesis Testing for the Mean!
Ever wondered if the "average battery life" promised on a phone box is actually true? Or if a new diet really changes the average weight of a group? In this chapter, we learn how to use the Normal Distribution to decide whether a claim about a population mean is likely to be true or if it’s time to call it out! Don't worry if it feels like a lot of steps at first—we'll break it down into bite-sized pieces.
1. The Logic: Why Sample Means?
If we want to test a claim about a whole population (like "the average height of adults is 170cm"), we usually can't measure everyone. Instead, we take a sample.
The key thing to remember is that the sample mean (\( \bar{x} \)) is more reliable than a single data point. Think of it like this: if you measure one person, they might be very tall purely by chance. But if you measure 100 people, their average height is much more likely to be close to the true population average.
The Distribution of the Sample Mean
According to your syllabus (Ref: MH7), if we take a sample of size \( n \) from a population \( X \sim N(\mu, \sigma^2) \), the sample mean \( \bar{X} \) follows its own Normal distribution:
\( \bar{X} \sim N(\mu, \frac{\sigma^2}{n}) \)
Notice that the variance is divided by \( n \). This means as your sample size gets bigger, the "spread" of the sample means gets smaller. Your estimate becomes more precise!
Quick Review Box:
Population Mean = \( \mu \)
Population Variance = \( \sigma^2 \)
Sample Size = \( n \)
Standard Error (the standard deviation of the sample mean) = \( \frac{\sigma}{\sqrt{n}} \)
Key Takeaway: When testing a mean, always use the variance \( \frac{\sigma^2}{n} \), not just \( \sigma^2 \)!
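To see how quickly the spread of the sample mean shrinks, here is a minimal Python sketch of the standard error formula from the review box. The function name `standard_error` and the value \( \sigma = 10 \) are made up for illustration; they are not from the syllabus.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Return sigma / sqrt(n), the standard deviation of the sample mean."""
    if n <= 0:
        raise ValueError("sample size n must be a positive integer")
    return sigma / math.sqrt(n)


# Hypothetical population standard deviation (say, heights in cm).
sigma = 10.0
for n in (1, 4, 25, 100):
    # The spread of the sample mean shrinks as n grows.
    print(f"n = {n:>3}: standard error = {standard_error(sigma, n):.2f}")
```

Notice that quadrupling the sample size only halves the standard error, because \( n \) sits under a square root.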
2. Setting Up Your Hypotheses
Every test starts with two competing statements (Ref: H4):
1. The Null Hypothesis (\( H_0 \)): This is the "status quo." We assume nothing has changed. It always looks like: \( H_0 : \mu = \text{some value} \).
2. The Alternative Hypothesis (\( H_1 \)): This is the claim you are looking for evidence to support. It can be:
- 1-tailed (Greater than): \( H_1 : \mu > \text{value} \)
- 1-tailed (Less than): \( H_1 : \mu < \text{value} \)
- 2-tailed (Different from): \( H_1 : \mu \neq \text{value} \)
Example: A factory claims their cereal boxes weigh 500g. You think they are under-filling them.
\( H_0 : \mu = 500 \)
\( H_1 : \mu < 500 \) (This is a 1-tailed test).
Key Takeaway: \( H_0 \) is always "equals." \( H_1 \) depends on the wording of the question ("increased," "decreased," or "changed").
3. Conducting the Test: The Two Methods
Your syllabus (Ref: H8) requires you to be able to use either the p-value or critical regions.
Method A: The p-value approach
The p-value is the probability of getting a result at least as extreme as yours if \( H_0 \) is actually true.
- If the p-value < Significance Level: Reject \( H_0 \). (The result is too unlikely to be a fluke).
- If the p-value > Significance Level: Do not reject \( H_0 \). (There is not enough evidence against it. This is not the same as proving \( H_0 \) is true.)
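The p-value rule can be sketched in Python using the distribution \( \bar{X} \sim N(\mu, \frac{\sigma^2}{n}) \) from Section 1. This is an illustration under stated assumptions, not a required method: the function name `p_value`, the `tail` labels, and the battery-life figures (claimed mean 20 hours, \( \sigma = 1.5 \), \( n = 36 \), \( \bar{x} = 19.6 \)) are all invented for the example.

```python
from statistics import NormalDist


def p_value(x_bar: float, mu0: float, sigma: float, n: int, tail: str) -> float:
    """p-value for a test of H0: mu = mu0 when sigma is known.

    tail is "less", "greater" or "two-sided", matching the form of H1.
    """
    # Distribution of the sample mean if H0 is true: N(mu0, sigma^2 / n).
    sampling_dist = NormalDist(mu0, sigma / n ** 0.5)
    lower = sampling_dist.cdf(x_bar)  # P(X-bar <= x_bar)
    upper = 1 - lower                 # P(X-bar >= x_bar)
    if tail == "less":
        return lower
    if tail == "greater":
        return upper
    if tail == "two-sided":
        # Double the smaller tail so both directions count as "extreme".
        return 2 * min(lower, upper)
    raise ValueError("tail must be 'less', 'greater' or 'two-sided'")


# Hypothetical figures: H0: mu = 20 hours, H1: mu < 20 hours.
alpha = 0.05
p = p_value(x_bar=19.6, mu0=20, sigma=1.5, n=36, tail="less")
print(f"p-value = {p:.4f}")
print("Reject H0" if p < alpha else "Do not reject H0")
```

With these made-up figures the p-value comes out a little above 0.05, so \( H_0 \) is not rejected at the 5% level even though the sample mean is below the claimed value.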
Method B: The Critical Region approach
The Critical Region (or Rejection Region) is the range of values that are so unlikely that we reject \( H_0 \) if our test statistic falls inside it. The boundary of this region is the Critical Value.
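As a rough sketch, critical values on the standard Normal scale can be found with an inverse Normal function, the same job your calculator's inverse Normal mode does. The helper names `critical_values` and `in_critical_region` and the `tail` labels are assumptions made for this example.

```python
from statistics import NormalDist


def critical_values(alpha: float, tail: str) -> tuple:
    """Critical value(s) of z for significance level alpha."""
    std_normal = NormalDist()  # N(0, 1)
    if tail == "less":
        return (std_normal.inv_cdf(alpha),)
    if tail == "greater":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two-sided":
        # Split alpha equally between the two tails.
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'less', 'greater' or 'two-sided'")


def in_critical_region(z: float, alpha: float, tail: str) -> bool:
    """True if the test statistic z falls in the rejection region."""
    cv = critical_values(alpha, tail)
    if tail == "less":
        return z < cv[0]
    if tail == "greater":
        return z > cv[0]
    return z < cv[0] or z > cv[1]


for tail in ("less", "greater", "two-sided"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)])
```

For instance, a test statistic of \( z = -1.8 \) lies inside the 5% critical region for a 1-tailed "less than" test, but outside the 5% critical region for a 2-tailed test, because the 2-tailed boundaries sit further out.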
Did you know?
The Significance Level (usually 5% or 1%) is actually the probability of "finding a result" when nothing is actually happening. In other words, it is the probability of rejecting \( H_0 \) when \( H_0 \) is in fact true: the risk of a false alarm that we agree to accept before running the test.
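You can check this idea with a small simulation: generate many samples from a population where \( H_0 \) really is true, test each one at the 5% level, and count how often \( H_0 \) gets rejected anyway. The sketch below is only an illustration; the function name `false_alarm_rate` and all the numbers (mean 500, \( \sigma = 10 \), \( n = 25 \), 2000 trials) are assumptions.

```python
import random
from statistics import NormalDist, fmean


def false_alarm_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Fraction of lower-tail tests that reject H0 when H0 is true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    se = sigma / n ** 0.5
    critical = NormalDist().inv_cdf(alpha)  # lower-tail critical value of z
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population whose true mean really is mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / se
        if z < critical:
            rejections += 1
    return rejections / trials


rate = false_alarm_rate(mu0=500, sigma=10, n=25, alpha=0.05, trials=2000)
print(f"Proportion of false alarms: {rate:.3f}")
```

The proportion should land close to 0.05, though it will not match exactly because it comes from random samples.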
Key Takeaway: Small p-values mean strong evidence against the Null Hypothesis. Think: "If the p is low, the Null must go!"
4. Step-by-Step Calculation Guide
Follow these steps to solve any exam problem on this topic:
Step 1: Write the Hypotheses. State \( H_0 \) and \( H_1 \) clearly using the symbol \( \mu \).
Step 2: State the Distribution. Write down \( \bar{X} \sim N(\mu, \frac{\sigma^2}{n}) \).
Step 3: Calculate the Test Statistic. Find the \( z \)-score (the number of standard errors your sample mean is from the claimed mean, i.e. the value of \( \mu \) stated in \( H_0 \)):
\( z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}} \)
Step 4: Find the p-value or Critical Value. Use your calculator's Normal distribution functions.
Step 5: Compare and Decide. See if your value falls in the rejection zone.
Step 6: Conclusion in Context. (Ref: MH10) Never just say "Reject \( H_0 \)." You must say: "There is sufficient evidence at the 5% level to suggest that the mean weight of cereal boxes is less than 500g."
Common Mistake to Avoid: Many students forget to divide by \( \sqrt{n} \). If you use just \( \sigma \) instead of \( \frac{\sigma}{\sqrt{n}} \), your \( z \)-score will be much closer to zero than it should be, and you'll likely miss a significant result!
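Steps 3 to 5 can be put together in a short Python sketch. It uses the cereal box hypotheses (\( H_0 : \mu = 500 \), \( H_1 : \mu < 500 \)), but the data are invented for illustration: \( \sigma = 10 \), \( n = 25 \) and \( \bar{x} = 496 \) do not come from any real question, and the function name `one_sample_z_test` is just a label chosen here.

```python
from statistics import NormalDist


def one_sample_z_test(x_bar, mu0, sigma, n, alpha, tail):
    """Steps 3 to 5: test statistic, p-value and decision."""
    std_normal = NormalDist()  # N(0, 1)
    # Step 3: number of standard errors between x_bar and the claimed mean.
    z = (x_bar - mu0) / (sigma / n ** 0.5)
    # Step 4: p-value in the direction given by H1.
    if tail == "less":
        p = std_normal.cdf(z)
    elif tail == "greater":
        p = 1 - std_normal.cdf(z)
    elif tail == "two-sided":
        p = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'less', 'greater' or 'two-sided'")
    # Step 5: compare with the significance level.
    return {"z": z, "p_value": p, "reject_h0": p < alpha}


# Hypothetical cereal data: sigma = 10 g, n = 25 boxes, sample mean 496 g.
result = one_sample_z_test(x_bar=496, mu0=500, sigma=10, n=25,
                           alpha=0.05, tail="less")
print(result)

# The common mistake: dividing by sigma instead of sigma / sqrt(n).
wrong_z = (496 - 500) / 10
print(f"correct z = {result['z']:.2f}, mistaken z = {wrong_z:.2f}")
```

With these figures the correct statistic is \( z = -2 \), which is significant at the 5% level, while the mistaken version gives \( z = -0.4 \), which is nowhere near the critical region. Step 6 still has to be written by you, in words, in the context of the question.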
5. When the Variance is Unknown
In the real world, we rarely know the population variance (\( \sigma^2 \)). Your syllabus (Ref: H8b) says that if the sample size \( n \) is large, we can use the sample variance (\( s^2 \)) as a substitute for \( \sigma^2 \).
Don't worry if this seems tricky! Just look at the question: if it gives you the variance of the sample and says the sample is large (usually \( n > 30 \)), just plug that value in where \( \sigma^2 \) would normally go.
Key Takeaway: Large samples allow us to be flexible with the variance.
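Here is a small sketch of the "plug it in" idea, assuming the question hands you the sample variance directly. The figures (\( n = 50 \), \( \bar{x} = 496.5 \), \( s^2 = 81 \)) and the function name `large_sample_z` are invented for illustration. Whether a question's sample variance uses divisor \( n \) or \( n - 1 \) depends on your syllabus conventions; the code simply takes whatever value is supplied.

```python
from statistics import NormalDist


def large_sample_z(x_bar: float, mu0: float, s_squared: float, n: int) -> float:
    """z statistic with the sample variance s^2 standing in for sigma^2.

    Only a reasonable approximation when n is large (usually n > 30).
    """
    return (x_bar - mu0) / (s_squared / n) ** 0.5


# Hypothetical large sample of cereal boxes: H0: mu = 500, H1: mu < 500.
z = large_sample_z(x_bar=496.5, mu0=500, s_squared=81, n=50)
p = NormalDist().cdf(z)  # lower-tail p-value
print(f"z = {z:.3f}, p-value = {p:.4f}")
```

Apart from swapping \( s^2 \) in for \( \sigma^2 \), every other step of the test stays exactly the same.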
Final Summary Checklist
- Did I use the sample mean distribution \( \bar{X} \sim N(\mu, \frac{\sigma^2}{n}) \)?
- Is my \( H_1 \) 1-tailed or 2-tailed?
- Is my conclusion written in plain English, referring back to the original story in the question?
- If it's a 2-tailed test, did I remember to halve the significance level when finding the critical value?
Success Tip: Always draw a quick sketch of the Normal curve. Shade the "tail" or "tails" representing the significance level. It makes it much harder to get the direction of your inequalities wrong!