Welcome to Confidence Intervals!
In your previous statistics work, you have probably spent a lot of time calculating a single value to describe a population, such as the average height of a group of students. In the world of Further Mathematics, we take this a step further. Instead of just giving one "best guess" (which we call a point estimate), we provide a range of values where we are fairly sure the true answer lies. This range is called a Confidence Interval.
Think of it like this: If you are trying to catch a fish, are you more likely to catch it with a single spear (a point estimate) or a wide net (a confidence interval)? The net is much more reliable! In this chapter, you will learn how to build that "net" mathematically.
1. The Core Concept: What are we Estimating?
Before we dive into the formulas, let’s get our terminology straight. Don't worry if these symbols look a bit intimidating at first; you've likely seen them before!
• Population Mean \(\mu\): The "true" average of the entire group we are studying (e.g., every lightbulb in a factory). This is usually unknown.
• Sample Mean \(\bar{x}\): The average we calculate from a small group we actually measured. This is our starting point.
• Confidence Level: How sure we want to be. Usually, we use 95%, but we might use 90% or 99%.
Did you know? A 95% confidence interval doesn't mean there is a 95% chance the mean is in *this specific* interval. It actually means that if we took 100 different samples and made 100 intervals, we would expect about 95 of them to contain the true population mean.
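If you would like to see this "repeated samples" idea in action, here is a rough simulation sketch. It assumes a made-up normal population (mean 50, standard deviation 4), builds an interval of the form sample mean \(\pm\) margin from each sample, and counts how often the interval captures the true mean. All the numbers and the function name `coverage_rate` are illustrative assumptions, not part of the syllabus.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage_rate(mu=50.0, sigma=4.0, n=25, level=0.95, trials=1000, seed=1):
    """Fraction of simulated intervals that contain the true mean mu.

    mu, sigma, n, trials and seed are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)                    # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.5 + level / 2)    # two-sided critical value
    margin = z * sigma / math.sqrt(n)            # same margin for every sample
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1                            # this interval caught the true mean
    return hits / trials

rate = coverage_rate()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The printed share should land close to 0.95, but not exactly on it, because each run is itself a random experiment.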
2. Constructing Intervals with Known Variance (SH1)
The first scenario in your syllabus is when we know the variance (\(\sigma^2\)) or standard deviation (\(\sigma\)) of the whole population. This is common in manufacturing where machines have a known "wobble" or spread.
To build a symmetric confidence interval for the mean, we use this general structure:
Sample Mean \(\pm\) (Critical Value \(\times\) Standard Error)
The formula looks like this:
\(\bar{x} \pm z \frac{\sigma}{\sqrt{n}}\)
Where:
• \(\bar{x}\) is the sample mean.
• \(z\) is the critical value (how many standard deviations you need to go out to cover your chosen percentage).
• \(\sigma\) is the population standard deviation.
• \(n\) is the sample size.
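As a quick check on the formula, here is a minimal sketch that evaluates it directly. The lightbulb figures (sample mean 1200 hours, \(\sigma = 50\), \(n = 25\)) and the function name `z_interval` are invented for illustration.

```python
import math

def z_interval(x_bar, sigma, n, z):
    """Return (lower, upper) for x_bar ± z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)   # critical value times standard error
    return x_bar - margin, x_bar + margin

# Hypothetical data: 25 lightbulbs, mean lifetime 1200 hours, known sigma of 50 hours.
lower, upper = z_interval(x_bar=1200.0, sigma=50.0, n=25, z=1.960)
print(f"({lower:.1f}, {upper:.1f})")    # prints (1180.4, 1219.6)
```

Here the standard error is \(50 / \sqrt{25} = 10\), so the margin is \(1.960 \times 10 = 19.6\) hours either side of the sample mean.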
How to find the Critical Value (\(z\))
You can find these in your formula booklet or using your calculator. Here are the "famous" ones you'll see most often:
• For a 90% interval: \(z = 1.645\)
• For a 95% interval: \(z = 1.960\)
• For a 99% interval: \(z = 2.576\)
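If you are curious where these numbers come from, the sketch below recovers them from the standard normal distribution using Python's built-in `statistics.NormalDist`. The helper name `critical_value` is our own. The key idea is that a 95% interval leaves 2.5% in each tail, so we look up the value with 97.5% of the distribution below it.

```python
from statistics import NormalDist

def critical_value(level):
    """Two-sided z critical value for a confidence level such as 0.95."""
    # Half of the leftover probability sits in each tail,
    # so the cumulative probability we need is 0.5 + level / 2.
    return NormalDist().inv_cdf(0.5 + level / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {critical_value(level):.3f}")
# prints:
# 90%: z = 1.645
# 95%: z = 1.960
# 99%: z = 2.576
```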
Quick Review: To make the interval narrower (more precise), you can either increase the sample size (\(n\)) or decrease your confidence level. It's a trade-off between being precise and being sure!
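To see the trade-off in numbers, here is a small sketch comparing interval widths. The values (\(\sigma = 10\), \(n = 25\) or \(100\)) are made up for illustration, and `interval_width` is a hypothetical helper. The full width of the interval is twice the margin, \(2z\frac{\sigma}{\sqrt{n}}\).

```python
import math

def interval_width(sigma, n, z):
    """Full width of the interval: upper bound minus lower bound."""
    return 2 * z * sigma / math.sqrt(n)

base = interval_width(sigma=10, n=25, z=1.960)               # 95%, n = 25
bigger_sample = interval_width(sigma=10, n=100, z=1.960)     # 95%, n = 100
lower_confidence = interval_width(sigma=10, n=25, z=1.645)   # 90%, n = 25

print(f"{base:.2f} {bigger_sample:.2f} {lower_confidence:.2f}")  # prints 7.84 3.92 6.58
```

Notice that making the sample four times larger only halves the width, because \(n\) sits under a square root.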
3. Large Samples with Unknown Variance (SH2)
What happens if we don't know the population standard deviation (\(\sigma\))? In the real world, this is actually the most common situation!
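If your sample size is large (usually \(n > 30\)), two things work in our favour. First, the Central Limit Theorem tells us that the sample mean is approximately normally distributed, whatever the shape of the population. Second, with this much data the sample standard deviation (\(s\)) is a good enough estimate to use as a "stand-in" for the population standard deviation (\(\sigma\)).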
The formula stays almost exactly the same:
\(\bar{x} \pm z \frac{s}{\sqrt{n}}\)
Step-by-Step Process:
1. Calculate the mean (\(\bar{x}\)) of your sample.
2. Calculate the sample standard deviation (\(s\)).
3. Check that \(n\) is large (usually \(n > 30\)).
4. Pick your \(z\) value based on the confidence level requested.
5. Plug the values into the formula to get your lower and upper bounds.
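The five steps above can be sketched in code as follows. This is only an illustration: the data set is randomly generated rather than real, the function name `large_sample_interval` is our own, and we assume \(s\) is calculated with the \(n - 1\) divisor (which is what `statistics.stdev` does). Check which version of \(s\) your exam board expects.

```python
import math
import random
from statistics import fmean, stdev

def large_sample_interval(data, z):
    """Return (lower, upper) for x_bar ± z * s / sqrt(n), large samples only."""
    n = len(data)
    if n <= 30:                         # Step 3: check that n is large
        raise ValueError("this method assumes a large sample (n > 30)")
    x_bar = fmean(data)                 # Step 1: sample mean
    s = stdev(data)                     # Step 2: sample standard deviation (n - 1 divisor)
    margin = z * s / math.sqrt(n)       # Step 5: z times the standard error
    return x_bar - margin, x_bar + margin

# Hypothetical sample of 40 measurements, generated purely for illustration.
rng = random.Random(0)
data = [rng.gauss(100, 15) for _ in range(40)]

lower, upper = large_sample_interval(data, z=1.960)   # Step 4: z for 95%
print(f"95% interval: ({lower:.2f}, {upper:.2f})")
```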
Common Mistake to Avoid: A very common error is forgetting to take the square root of \(n\). Remember, as the sample size gets bigger, the "Standard Error" (\(\frac{s}{\sqrt{n}}\)) gets smaller. This makes sense: more data means more certainty!
4. Making Inferences (SH3)
Constructing the interval is only half the battle. The real power of statistics is making inferences (drawing conclusions). Usually, a question will give you a claim and ask if your interval supports it.
Example: A company claims their cereal boxes contain 500g of cereal. You take a sample and find a 95% confidence interval for the mean weight is \([492g, 498g]\).
What can we infer?
Since the claimed value (500g) is not inside our interval, we have evidence to suggest the company's claim is incorrect. Because the whole interval lies below 500g, the mean weight is likely lower than they say.
Key Takeaway for Inferences:
• If the value is inside the interval: The claim is plausible/consistent with the data.
• If the value is outside the interval: The claim is unlikely to be true based on this sample.
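The inside/outside rule is simple enough to write as a one-line check. Here is a minimal sketch using the cereal example; the function name `claim_is_plausible` is our own, and we treat a value exactly on the boundary as inside the interval.

```python
def claim_is_plausible(claimed_value, lower, upper):
    """True if the claimed mean lies inside the confidence interval."""
    return lower <= claimed_value <= upper

# Cereal example: 95% interval of [492, 498] grams, claimed mean of 500 grams.
print(claim_is_plausible(500, 492, 498))   # prints False: evidence against the claim
print(claim_is_plausible(495, 492, 498))   # prints True: a claim of 495g would be plausible
```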
5. Summary and Tips for Success
Memory Aid: The "Plus-Minus" Rule
Always remember that a confidence interval is just the mean plus or minus a "margin of error."
Lower Bound = \(\bar{x} - z \frac{\sigma}{\sqrt{n}}\)
Upper Bound = \(\bar{x} + z \frac{\sigma}{\sqrt{n}}\)
Quick Summary:
• Symmetric: The interval is always centered exactly on the sample mean (\(\bar{x}\)).
• Known Variance: Use \(\sigma\) directly.
• Unknown Variance (Large \(n\)): Use \(s\) as an estimate for \(\sigma\).
• Narrower Intervals: Use a larger \(n\) or lower the confidence level (e.g., from 95% to 90%).
• Wider Intervals: Use a smaller \(n\) or raise the confidence level (e.g., from 95% to 99%).
Don't worry if this seems tricky at first! Just remember that you are building a range to capture an unknown value. Practice finding the \(z\) values on your calculator, and the rest is just plugging numbers into the formula!