Introduction: The Detective Work of Maths
Welcome to the world of Statistical Hypothesis Testing! Don't worry if that sounds a bit intimidating: at its heart, hypothesis testing is just like being a detective or a juror in a court case. We start with an assumption (the "status quo"), look at some evidence (the data), and decide if there is enough proof to change our minds.
In this chapter, we are going to learn the language used in these tests. Understanding these terms is the "secret key" to scoring high marks in the Statistics section of your OCR AS Level Maths exam.
1. The Starting Point: Null and Alternative Hypotheses
Before we look at any data, we have to state what we are testing. We always have two competing ideas:
The Null Hypothesis (\( H_0 \)): This is the "boring" version. It assumes that nothing has changed, or that an existing claim is true. In your syllabus, we usually write this as \( H_0: p = \text{something} \), where \( p \) is a probability or proportion.
Analogy: In a court of law, the Null Hypothesis is "The defendant is innocent."
The Alternative Hypothesis (\( H_1 \)): This is the "exciting" version. It’s what we suspect might actually be happening. We write this as \( H_1: p < \dots \), \( H_1: p > \dots \), or \( H_1: p \neq \dots \).
Analogy: The Alternative Hypothesis is "The defendant is guilty."
1-Tail vs. 2-Tail Tests
How do we know which symbol to use for \( H_1 \)? It depends on what the question asks:
- 1-Tail Test: Used when we are looking for a change in a specific direction. For "Has the probability decreased?" use \( < \); for "Has it increased?" use \( > \).
- 2-Tail Test: Used when we just want to know if the probability has changed at all, regardless of direction. We use the "not equal to" symbol (\( \neq \)).
Quick Review:
Always state your hypotheses using the parameter \( p \).
Correct: \( H_0: p = 0.5 \)
Incorrect: \( H_0 = 0.5 \) (You must include the \( p \)!)
Key Takeaway: \( H_0 \) is the "no change" rule. \( H_1 \) is the "something is different" suspicion.
2. The Evidence: Test Statistics and Significance Levels
Once we have our hypotheses, we need a way to measure the evidence.
The Test Statistic: This is the actual result we get from our sample. For example, if we flip a coin 20 times to see if it's biased and it lands on heads 15 times, the Test Statistic is 15.
Memory Aid: The Test Statistic is just "The number we counted."
The Significance Level (\( \alpha \)): This is the "bar" we set for how much evidence we need. It’s usually 5% (\( 0.05 \)) or 1% (\( 0.01 \)). It represents the probability of accidentally rejecting \( H_0 \) when it was actually true.
Analogy: Think of this as the "Strictness Level." A 1% level is much stricter than a 5% level because it requires stronger evidence to change our minds.
Did you know? A significance level of 5% means that, if \( H_0 \) really is true, there is still up to a 5% chance that our test will wrongly reject it. That is the risk we agree to accept before we look at the data!
Key Takeaway: The Test Statistic is our data; the Significance Level is our "threshold" for proof.
3. The Decision Zones: Critical Regions and Values
Imagine a number line of all possible outcomes. We divide this line into two areas:
The Critical Region (Rejection Region): If our test statistic falls in this area, the result is so unlikely to happen by "just luck" that we reject the Null Hypothesis. We conclude there is enough evidence to support \( H_1 \).
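The Acceptance Region: If our test statistic falls here, it means our result is fairly normal and could have happened by chance. We do not reject the Null Hypothesis. (Despite the name, landing here does not prove that \( H_0 \) is true; see Section 5.)
The Critical Value: This is the "border guard." It is the first value that falls into the critical region, i.e. the value on its boundary. A 2-tail test has two critical values, one guarding each tail.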
Step-by-Step Process:
1. Find the Critical Region based on the significance level.
2. See where your Test Statistic (your data) lands.
3. If it’s in the Critical Region, you’ve found "significant evidence" of a change!
Key Takeaway: If the result is in the Critical Region, it’s "weird" enough to make us reject the status quo.
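To see how a critical region is actually found, here is a minimal Python sketch using the coin example from Section 2 (20 flips). It assumes \( H_0: p = 0.5 \) and a 1-tail alternative \( H_1: p > 0.5 \) at the 5% level; the alternative hypothesis and the function names are our own choices for illustration, not something fixed by the syllabus.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(k, n, p):
    """P(X >= k) for X ~ B(n, p)."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))

def upper_critical_value(n, p, alpha):
    """Smallest c with P(X >= c) <= alpha, or None if there is no such c."""
    for c in range(n + 1):
        if upper_tail(c, n, p) <= alpha:
            return c
    return None

# Assumed setup: 20 flips, H0: p = 0.5, H1: p > 0.5, 5% significance level
n, p0, alpha = 20, 0.5, 0.05
c = upper_critical_value(n, p0, alpha)

print(c)                                # 15
print(round(upper_tail(14, n, p0), 4))  # 0.0577 (too big, so 14 is not in the region)
print(round(upper_tail(15, n, p0), 4))  # 0.0207 (small enough, so 15 is in the region)
```

Under these assumptions the critical region is \( X \geq 15 \) and the critical value is 15, so the 15 heads observed in Section 2 lands just inside it.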
4. The P-Value Method
Sometimes, instead of using regions, we use a p-value. This is a very common way to report results in real-world science.
The p-value is the probability of getting a result as extreme as, or more extreme than, the one we observed, assuming \( H_0 \) is true.
How to decide:
- If p-value \( \leq \) Significance Level: The result is significant. Reject \( H_0 \).
- If p-value \( > \) Significance Level: The result is not significant. Do not reject \( H_0 \).
Memory Trick: "If the P is low, the Null must go! If the P is high, the Null can fly (stay)."
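As a rough illustration of the p-value method, the sketch below reuses the coin example from Section 2 (15 heads in 20 flips). It assumes \( H_0: p = 0.5 \) and a 1-tail alternative \( H_1: p > 0.5 \), so "as extreme or more extreme" means 15 or more heads; the function names are hypothetical helpers written for this example.

```python
from math import comb

def p_value_upper(x, n, p):
    """P(X >= x) for X ~ B(n, p): the p-value for an upper 1-tail test."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

def decide(p_value, alpha):
    """Apply the decision rule: reject H0 only if the p-value is <= alpha."""
    return "reject H0" if p_value <= alpha else "do not reject H0"

# Assumed setup: 15 heads observed in 20 flips, H0: p = 0.5, H1: p > 0.5
p_val = p_value_upper(15, 20, 0.5)

print(round(p_val, 4))      # 0.0207
print(decide(p_val, 0.05))  # reject H0
print(decide(p_val, 0.01))  # do not reject H0
```

Notice that the same data is significant at the 5% level but not at the stricter 1% level, which is exactly the "Strictness Level" idea from Section 2.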
5. Writing the Conclusion (Don't Lose Marks Here!)
The OCR examiners are very strict about how you write your final answer. You must reflect the fact that statistics is about evidence, not 100% absolute certainty.
Common Mistake to Avoid: Never say "I accept \( H_0 \)" or "This proves \( H_0 \) is true." We only ever "fail to reject" it because we haven't found enough evidence to move away from it.
The "Golden Formula" for Conclusions:
"There is [sufficient / insufficient] evidence at the [X]% significance level to suggest that [the context of the question—e.g., the coin is biased]."
Examples from the Syllabus:
- Correct: "There is evidence at the 5% level to reject \( H_0 \). It is likely that the mean mass is less than 500g."
- Correct: "There is no evidence at the 2% level to reject \( H_0 \). There is no reason to suppose the journey time has changed."
- Incorrect: "\( H_0 \) is rejected. Waiting times have increased." (This sounds too certain! Use "evidence to suggest" instead.)
Key Takeaway: Always be polite and slightly uncertain in your conclusion. Use words like "evidence," "suggest," and "likely."
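If it helps to see the "Golden Formula" as a fill-in-the-blanks template, here is a small Python sketch. The function name `write_conclusion` and its parameters are made up for this illustration; the wording it produces simply follows the template above.

```python
def write_conclusion(significant, level_percent, claim):
    """Fill in the conclusion template with cautious, contextual wording."""
    strength = "sufficient" if significant else "insufficient"
    return (
        f"There is {strength} evidence at the {level_percent}% "
        f"significance level to suggest that {claim}."
    )

# Example contexts (assumed for illustration)
print(write_conclusion(True, 5, "the coin is biased towards heads"))
print(write_conclusion(False, 1, "the coin is biased towards heads"))
```

The first call gives a "sufficient evidence" sentence and the second an "insufficient evidence" one; in both cases the claim stays in the context of the question and never says that anything has been proved.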
Chapter Summary
1. Hypotheses: \( H_0 \) (no change) vs \( H_1 \) (change). Use \( p \).
2. Significance Level: The "threshold" for evidence (e.g., 5%).
3. Test Statistic: The number observed in the sample.
4. Critical Region: The range of values that leads to rejecting \( H_0 \).
5. Conclusion: Always relate it back to the real-world context and use cautious language.
Don't worry if this seems tricky at first! Hypothesis testing is a logical process. Once you get the "hang" of the vocabulary, the actual maths (which we will do in the next chapter using the Binomial Distribution) becomes much easier to follow.