Statistical hypothesis testing - Mathematics 7356 - AQA AS Level

Welcome to Statistical Hypothesis Testing!

Ever made a claim like "I bet I can flip this coin and get heads more than half the time," or "I think this new medicine works better than the old one"? In Statistics, we don't just guess; we use Hypothesis Testing to see if the evidence actually supports our claims. Don't worry if this seems a bit abstract at first. Think of it as being a Maths Detective: you are looking at the evidence (data) to see if a crime (a change) has actually happened!

1. The Language of Hypothesis Testing

Before we start calculating, we need to learn the "lingo." Statistical testing has its own set of terms that you must use correctly to get full marks.

The Two Hypotheses

Every test starts with two competing statements:

1. The Null Hypothesis (\(H_0\)): This is the "status quo." It assumes that nothing has changed and everything is normal. For the Binomial distribution, we write this as \(H_0: p = \text{something}\).
2. The Alternative Hypothesis (\(H_1\)): This is the claim we are investigating. It's the "Wait, I think something IS different!" statement. We write this as \(H_1: p < \dots\), \(H_1: p > \dots\), or \(H_1: p \neq \dots\).

Analogy: The Courtroom
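Think of a trial. The Null Hypothesis is that the defendant is "Innocent." We keep assuming they are innocent unless the evidence is strong enough ("beyond a reasonable doubt") to reject that assumption in favour of the Alternative Hypothesis (they are "Guilty"). Failing to reject \(H_0\) does not prove innocence; it only means the evidence was not strong enough.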

Important Terms to Know

Test Statistic: This is the actual result you get from your experiment (e.g., "I flipped the coin 10 times and got 8 heads").
Significance Level (\(\alpha\)): This is the "hurdle" the evidence must clear. Common levels are 5% (0.05) or 1% (0.01). If the probability of getting our result, or something more extreme, assuming \(H_0\) is true, is smaller than this hurdle, we reject \(H_0\).
p-value: The actual probability of getting your result (or something more extreme) if the Null Hypothesis is true.
Critical Value: The boundary of the Critical Region, i.e. the first value you reach (moving outwards from the expected value) that falls inside the Critical Region.
Critical Region (Rejection Region): The range of values where the evidence is so strong that we decide to reject the Null Hypothesis.
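
To make the p-value concrete, here is a minimal Python sketch for the coin example above (8 heads in 10 flips). It assumes we are testing \(H_0: p = 0.5\) against \(H_1: p > 0.5\), which the example does not state explicitly. The helper names `binomial_pmf` and `upper_tail` are made up for this illustration; in the exam you would use your calculator's binomial functions instead.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(observed, n, p):
    """P(X >= observed) for X ~ B(n, p): the result or something more extreme."""
    return sum(binomial_pmf(k, n, p) for k in range(observed, n + 1))

# Assumed setup: 10 flips, a fair coin under H0, 8 heads observed.
p_value = upper_tail(8, 10, 0.5)
print(round(p_value, 4))  # 0.0547
```

Since 0.0547 is just above 0.05, 8 heads out of 10 would not be quite enough to reject \(H_0\) at the 5% level in this one-tail setup.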

Quick Review:
- \(H_0\) is always "=".
- \(H_1\) is "<", ">", or "\(\neq\)".
- The Test Statistic is just the number of successes you observed.

2. 1-Tail vs. 2-Tail Tests

How do we know which way to look? It depends on the wording of the question!

1-Tail Test

We use a 1-tail test when we are looking for a change in one specific direction.
Example: "Test whether the proportion of broken eggs has decreased."
Hypotheses, taking the previous proportion of broken eggs to be 0.05: \(H_0: p = 0.05\), \(H_1: p < 0.05\)

2-Tail Test

We use a 2-tail test when we are looking for any change, regardless of direction.
Example: "Test whether the proportion of people voting for Party A has changed."
Hypotheses, taking the previous proportion voting for Party A to be 0.4: \(H_0: p = 0.4\), \(H_1: p \neq 0.4\)

Memory Trick:
If the question says "increased" or "decreased," use 1 tail. If it says "changed" or "different," use 2 tails!

3. Conducting the Test: Step-by-Step

Let's look at how to actually perform a test using the Binomial Distribution \(X \sim B(n, p)\).

Step 1: State your Hypotheses
Always define \(p\) first (e.g., "Let \(p\) be the probability of a seed germinating"). Then write your \(H_0\) and \(H_1\).

Step 2: State the Significance Level
Usually given in the question (e.g., 5%).

Step 3: State the Distribution
Under the Null Hypothesis, what is the distribution? E.g., \(X \sim B(20, 0.4)\).

Step 4: Calculate the p-value
Use your calculator to find the probability of your result or more extreme.
- If \(H_1: p > \dots\), calculate \(P(X \geq \text{observed value})\).
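- If \(H_1: p < \dots\), calculate \(P(X \leq \text{observed value})\).
- If \(H_1: p \neq \dots\), calculate the tail on the side where your result lies: \(P(X \leq \text{observed value})\) if the result is below the expected value \(np\), or \(P(X \geq \text{observed value})\) if it is above.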

Step 5: Compare and Conclude
If your p-value is less than the significance level, you reject \(H_0\). There is "sufficient evidence."
If your p-value is more than the significance level, you fail to reject \(H_0\). There is "insufficient evidence."

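Important Note: In a 2-tail test, the significance level is split equally between the two tails. Compare the tail probability you calculated (on the side where your result lies) with half of the significance level (e.g., if the level is 5%, compare with 2.5%). Equivalently, double the tail probability and compare it with the full significance level.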
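
The five steps above can be sketched in Python as a rough check on calculator work. This is only an illustration under stated assumptions: the function names `binomial_cdf` and `hypothesis_test`, and the labels `"less"`, `"greater"` and `"two-sided"`, are invented here. The 2-tail branch follows the "compare with half the significance level" convention described in the note above.

```python
from math import comb

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p); returns 0 when x is negative."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(0, x + 1))

def hypothesis_test(observed, n, p0, alternative, alpha):
    """Return (tail probability, threshold, reject_H0) for a binomial test.

    alternative: "less", "greater" or "two-sided" (labels chosen for this sketch).
    """
    lower = binomial_cdf(observed, n, p0)          # P(X <= observed)
    upper = 1 - binomial_cdf(observed - 1, n, p0)  # P(X >= observed)
    if alternative == "less":
        tail, threshold = lower, alpha
    elif alternative == "greater":
        tail, threshold = upper, alpha
    elif alternative == "two-sided":
        # Use the tail on the side where the result lies; compare with alpha / 2.
        tail = lower if observed < n * p0 else upper
        threshold = alpha / 2
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return tail, threshold, tail < threshold

# Hypothetical data: X ~ B(20, 0.4) under H0, 13 successes observed, H1: p > 0.4, 5% level.
tail, threshold, reject = hypothesis_test(13, 20, 0.4, "greater", 0.05)
print(round(tail, 4), threshold, reject)
```

With these made-up figures the tail probability comes out at roughly 0.021, which is below 0.05, so \(H_0\) would be rejected. The final sentence in context still has to be written by you.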

Key Takeaway:
"If the p is low, the null must go!" (If p-value < Significance level, reject \(H_0\)).

4. Critical Regions

Sometimes, instead of a p-value, an examiner might ask for the Critical Region. This is the set of all possible outcomes that would lead you to reject the Null Hypothesis.

Example: If you flip a coin 10 times and run a 2-tail test of \(H_0: p = 0.5\) at the 5% level, the critical region is \(X \leq 1\) or \(X \geq 9\) (that is, 0, 1, 9 or 10 heads). If your actual result falls in these "zones," you reject \(H_0\).

Did you know?
The Actual Significance Level is the true probability of falling into the critical region when \(H_0\) is true. Because the Binomial distribution is discrete (you can't have 4.5 successes), the tail probabilities jump in steps, so the actual significance level is usually slightly less than the level requested (e.g., 4.2% instead of 5%).
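
As a rough illustration, the sketch below finds a 2-tail critical region and its actual significance level for 10 coin flips, assuming \(H_0: p = 0.5\) and a 5% level split as 2.5% per tail. The function names `pmf` and `two_tail_critical_region` are hypothetical, and the sketch assumes the significance level is small enough that the two tails do not overlap.

```python
from math import comb

def pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_tail_critical_region(n, p0, alpha):
    """Return (lower values, upper values, actual significance level)."""
    half = alpha / 2

    # Lower tail: keep adding values from 0 upwards while P(X <= k) stays within alpha / 2.
    lower, cumulative = [], 0.0
    for k in range(0, n + 1):
        cumulative += pmf(k, n, p0)
        if cumulative > half:
            break
        lower.append(k)

    # Upper tail: keep adding values from n downwards while P(X >= k) stays within alpha / 2.
    upper, cumulative = [], 0.0
    for k in range(n, -1, -1):
        cumulative += pmf(k, n, p0)
        if cumulative > half:
            break
        upper.append(k)
    upper.reverse()

    actual = sum(pmf(k, n, p0) for k in lower + upper)
    return lower, upper, actual

lower, upper, actual = two_tail_critical_region(10, 0.5, 0.05)
print(lower, upper)      # [0, 1] [9, 10]
print(round(actual, 4))  # 0.0215
```

Here the actual significance level is about 2.15%, well below the requested 5%, because including 2 or 8 heads would push either tail above 2.5%.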

5. Common Mistakes to Avoid

Don't fall into these traps!
1. Mixing up \(H_0\) and \(H_1\): Remember, \(H_0\) is always the "equals" one.
2. Forgetting context: At the end of your test, you must write a sentence relating back to the story. Instead of just saying "Reject \(H_0\)," say "There is sufficient evidence to suggest that the proportion of broken eggs has decreased."
3. Using the wrong inequality: If you are testing for an increase, you need to calculate \(P(X \geq x)\). Students often accidentally calculate \(P(X \leq x)\).
4. Mistaking the Significance Level: Understand that the significance level is the probability of incorrectly rejecting the Null Hypothesis. In other words, it's the chance we say "Something changed!" when actually, it was just luck.
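
One way to see what point 4 means is a small simulation, sketched here under assumed values: a genuinely fair coin is flipped 10 times, and we "reject \(H_0\)" whenever the number of heads lands in the 1-tail critical region \(X \geq 9\) (the region for \(H_1: p > 0.5\) at the 5% level). Every rejection is a false alarm, because \(H_0\) is true by construction. The variable names and the number of repetitions are arbitrary choices for this illustration.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

n, trials = 10, 20_000
critical_region = {9, 10}  # assumed 1-tail region for H1: p > 0.5 at the 5% level

false_alarms = 0
for _ in range(trials):
    # Simulate 10 flips of a fair coin, so H0: p = 0.5 really is true.
    heads = sum(random.random() < 0.5 for _ in range(n))
    if heads in critical_region:
        false_alarms += 1  # H0 rejected even though nothing has changed

rate = false_alarms / trials
print(rate)
```

The printed rate should sit close to \(P(X \geq 9) = 11/1024 \approx 0.011\), the actual significance level of this region, and in any case below the 5% hurdle.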

Summary Checklist

- Define your parameter \(p\) in words.
- Write \(H_0\) and \(H_1\).
- Identify the Test Statistic from the data.
- Use your calculator to find the p-value or Critical Region.
- Compare to the significance level.
- Write a final conclusion in the context of the question.

Keep practicing! Hypothesis testing is a logical process. Once you get the steps down, you'll find it's one of the most predictable parts of Paper 2!
