Welcome to Statistical Inference!
In this chapter, we move from simply describing data to making big decisions based on it. Think of this as the "detective work" of statistics. We use samples to make a "best guess" about the whole population, and we calculate exactly how much we should trust those guesses. Whether you're testing a new medicine or checking if a machine is filling cereal boxes correctly, these tools are your best friend. Don't worry if it seems a bit heavy on the theory at first—we'll break it down step-by-step!
1. Confidence Intervals: The "Safety Net"
A Confidence Interval (CI) is a range of values that we are fairly sure contains the true population mean. Instead of giving just one number (a point estimate), we give a range.
The Analogy: Imagine you are trying to catch a fish in a dark pond. Throwing a single spear is like a point estimate—you might miss. Throwing a wide net is like a Confidence Interval—you are much more likely to catch the fish inside that range!
Choosing between \(z\) and \(t\)
To build our "net," we need to choose the right distribution:
- Use the \(z\)-distribution if the population standard deviation \(\sigma\) is known OR if your sample size is large (\(n \geq 30\)).
- Use the \(t\)-distribution if the sample size is small (\(n < 30\)) AND the population standard deviation \(\sigma\) is unknown. (For small samples this also assumes the underlying population is at least approximately normal.)
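The decision rule above can be captured in a tiny helper. This is just a sketch of the chapter's convention, not a standard library function:

```python
def choose_distribution(n, sigma_known):
    """Pick z or t for a confidence interval, using this chapter's convention:
    z when sigma is known or the sample is large (n >= 30), otherwise t."""
    if sigma_known or n >= 30:
        return "z"
    return "t"

print(choose_distribution(10, sigma_known=False))  # small n, sigma unknown -> "t"
print(choose_distribution(50, sigma_known=False))  # large n -> "z"
print(choose_distribution(10, sigma_known=True))   # sigma known -> "z"
```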
The Formula
The general formula for a confidence interval for the mean is:
\(\bar{x} \pm (z \text{ or } t) \times (\text{standard error})\)
Where the Standard Error is \(\frac{s}{\sqrt{n}}\) (or \(\frac{\sigma}{\sqrt{n}}\) if \(\sigma\) is known). Remember, when calculating \(s^2\) (the sample variance), you must use the \((n-1)\) divisor on your calculator!
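Here is a short sketch of the whole calculation in Python, using a made-up sample of ten cereal-box weights (so \(n < 30\) and \(\sigma\) unknown, hence the \(t\)-distribution). SciPy's `stats.t.ppf` supplies the critical value:

```python
import math
from scipy import stats

# Hypothetical sample: weights (grams) of 10 cereal boxes
sample = [498.2, 501.1, 499.5, 502.3, 497.8, 500.6, 499.0, 501.7, 498.9, 500.4]

n = len(sample)
x_bar = sum(sample) / n
# Sample standard deviation uses the (n - 1) divisor
s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
se = s / math.sqrt(n)

# n < 30 and sigma unknown, so use t with n - 1 degrees of freedom;
# a two-tailed 95% interval needs the 0.975 quantile
t_crit = stats.t.ppf(0.975, df=n - 1)
ci = (x_bar - t_crit * se, x_bar + t_crit * se)
print(f"95% CI for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
```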
Quick Review: The Width of the Interval
The width of your "net" depends on two things:
- Confidence Level: Higher confidence (e.g., 99% vs 95%) makes the interval wider.
- Sample Size (\(n\)): A larger sample makes the interval narrower and more precise.
Common Mistake to Avoid: Students often think a 99% confidence interval is "better" because it's more certain. However, it is also wider and less precise. It's a trade-off!
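Both effects on the width are easy to verify numerically. The sketch below uses an assumed sample standard deviation of 2.5 purely for illustration:

```python
import math
from scipy import stats

s, n = 2.5, 25                     # assumed sample sd and size
se = s / math.sqrt(n)

# Higher confidence level -> wider interval
widths_by_conf = {}
for conf in (0.90, 0.95, 0.99):
    t_crit = stats.t.ppf((1 + conf) / 2, df=n - 1)
    widths_by_conf[conf] = 2 * t_crit * se
    print(f"{conf:.0%} CI width: {widths_by_conf[conf]:.3f}")

# Larger sample -> narrower interval (same 95% confidence)
widths_by_n = {}
for n2 in (25, 100, 400):
    widths_by_n[n2] = 2 * stats.t.ppf(0.975, df=n2 - 1) * s / math.sqrt(n2)
    print(f"n={n2:<4} 95% CI width: {widths_by_n[n2]:.3f}")
```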
Key Takeaway: Confidence intervals give us a range for the population mean. Use \(t\) for small samples where you don't know the population's true spread.
2. Type I and Type II Errors: When We Get It Wrong
Even with perfect statistics, we can make the wrong call. There are two specific ways to be wrong in hypothesis testing.
Type I Error: The "False Positive"
This happens when the Null Hypothesis (\(H_0\)) is actually true, but we accidentally reject it.
Example: A fire alarm going off when there is no fire. It "claimed" there was a change when there wasn't.
Did you know? When \(H_0\) is true, the probability of a Type I error is exactly the significance level (\(\alpha\)) of your test, commonly 0.05 (i.e. 5%).
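You can check this fact by simulation: run many \(t\)-tests on samples where \(H_0\) really is true, and count how often the test (wrongly) rejects. The setup below is a toy example with an arbitrary seed:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, n, trials = 0.05, 30, 20_000

# H0 is TRUE here: every sample really comes from a Normal(mean=0) population
rejections = 0
for _ in range(trials):
    sample = rng.normal(loc=0, scale=1, size=n)
    t_stat, p_value = stats.ttest_1samp(sample, popmean=0)
    if p_value <= alpha:
        rejections += 1          # a Type I error: rejecting a true H0

rate = rejections / trials
print(f"Observed Type I error rate: {rate:.3f}")  # should be close to alpha
```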
Type II Error: The "False Negative"
This happens when the Null Hypothesis (\(H_0\)) is false, but we fail to reject it (we "accept" it).
Example: A fire is burning, but the fire alarm stays silent. It failed to detect the change.
Memory Aid: The "Truth" Trick
- Type I: Rejected the Truth (The Null was True).
- Type II: Accepted the Lie (The Null was False/a Lie).
Key Takeaway: Type I is crying wolf when there isn't one. Type II is missing the wolf when it's standing right there!
3. The Power of a Test
The Power of a hypothesis test is its ability to correctly reject a false null hypothesis. In simple terms, it's the probability that the test will detect an effect if there actually is one.
The Formula
Power = \(1 - P(\text{Type II error})\)
If the risk of a Type II error is high, the power is low. We want high power!
How to increase Power:
- Increase sample size (\(n\)): This is the most common way. More data makes the test more sensitive.
- Increase the significance level (\(\alpha\)): If you move from 1% to 5%, you are more likely to reject \(H_0\), which increases power (but also increases the risk of a Type I error!).
- Pick a larger effect size: It’s easier to detect a giant change than a tiny one.
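All three levers can be seen in one short calculation. The sketch below approximates the power of a two-sided one-sample \(z\)-test (a standard textbook approximation, with the effect size measured in standard deviations):

```python
import math
from scipy.stats import norm

def power_one_sample_z(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test.
    effect_size = (mu_true - mu_0) / sigma, i.e. the shift in sd units."""
    z_crit = norm.ppf(1 - alpha / 2)
    shift = effect_size * math.sqrt(n)
    # Probability the test statistic lands in either rejection region
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

baseline = power_one_sample_z(0.5, n=30)
print(f"baseline:        {baseline:.3f}")
print(f"larger n:        {power_one_sample_z(0.5, n=60):.3f}")
print(f"larger alpha:    {power_one_sample_z(0.5, n=30, alpha=0.10):.3f}")
print(f"larger effect:   {power_one_sample_z(0.8, n=30):.3f}")
```

Each of the last three values comes out higher than the baseline, matching the list above.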
Quick Review: Power is like the "strength" of a microscope. A powerful test can see small details (effects) that a weak test would miss.
4. Significance Testing: Critical Regions vs. p-values
When you perform a test, you have two ways to decide whether to reject \(H_0\). Both lead to the same conclusion!
The Critical Region Method
You find a "cutoff" value (the Critical Value). If your Test Statistic falls into the "Critical Region" (the tail of the distribution), you reject \(H_0\).
The p-value Method
The p-value is the probability of getting your results (or more extreme results) if \(H_0\) is true.
- If p-value \(\leq\) Significance Level (\(\alpha\)): Reject \(H_0\) (Significant result).
- If p-value \(>\) Significance Level (\(\alpha\)): Do not reject \(H_0\) (Not significant).
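To see that the two methods agree, here is a sketch comparing them on a hypothetical two-tailed \(t\)-test (the test statistic 2.30 and \(n = 16\) are invented for illustration):

```python
from scipy import stats

alpha = 0.05
n = 16
df = n - 1
t_stat = 2.30                    # hypothetical test statistic from a sample

# Method 1: critical region (two-tailed)
t_crit = stats.t.ppf(1 - alpha / 2, df)
reject_by_region = abs(t_stat) > t_crit

# Method 2: p-value (two-tailed)
p_value = 2 * (1 - stats.t.cdf(abs(t_stat), df))
reject_by_p = p_value <= alpha

print(f"critical value: {t_crit:.3f}, p-value: {p_value:.3f}")
print(f"reject by region: {bool(reject_by_region)}, reject by p: {bool(reject_by_p)}")
```

The two booleans always match, because falling beyond the critical value and having \(p \leq \alpha\) are the same condition expressed two ways.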
Encouraging Note: Don't worry if \(p\)-values feel confusing! Just remember: "If the p is low, the Null must go!"
Important Note for Exams: In hypothesis tests on population correlation coefficients, you will usually use critical values from tables rather than \(p\)-values.
Key Takeaway: Whether you use a critical region or a \(p\)-value, you are just checking if your sample result is "weird" enough to count as strong evidence against the Null Hypothesis.
5. Practical Importance and Sample Size
In the real world, you can't just look at the numbers; you have to look at the context.
- Sample Size Matters: If you have a massive sample size, even a tiny, meaningless difference might show up as "statistically significant."
- Strength of Evidence: Always evaluate how strong your conclusion is. If your \(p\)-value is 0.049 and your cutoff is 0.05, it's significant, but only just!
- Changing \(n\): If a test is inconclusive, a statistician might increase the sample size to gather stronger evidence and improve the power of the test.
Common Mistake: Thinking "Statistically Significant" means "Important." If a new drug lowers blood pressure by only 0.1%, it might be statistically significant (not due to chance), but it’s not practically useful for a doctor!
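The "significant but unimportant" trap is easy to reproduce. In the toy simulation below, the true mean differs from the hypothesised value by a trivial 0.1 (against a standard deviation of 5), yet with a million observations the test flags it as highly significant:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Tiny true effect: population mean 100.1 vs hypothesised 100, sd 5
tiny_effect = rng.normal(loc=100.1, scale=5, size=1_000_000)

t_stat, p_value = stats.ttest_1samp(tiny_effect, popmean=100)
print(f"p-value: {p_value:.2e}")                             # "significant"...
print(f"observed shift: {tiny_effect.mean() - 100:.3f}")     # ...but trivially small
```

The \(p\)-value is minuscule, but the effect itself (about 0.1 on a scale of hundreds) may be practically worthless. Significance tells you the effect is probably real, not that it matters.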
Key Takeaway: Always interpret your results in the context of the problem. A large sample makes it easier to find evidence, but make sure that evidence actually matters in real life.