Introduction to Statistical Hypothesis Testing
Welcome! If you’ve ever wondered how scientists decide if a new medicine works, or how a manufacturer proves their lightbulbs last as long as they claim, you’re looking at Statistical Hypothesis Testing. In this chapter, we learn the formal "rules of the game" for making decisions based on data. Don’t worry if this seems a bit abstract at first; once you master the vocabulary, it’s just like following a recipe!
1. The Language of Hypothesis Testing
Before we do any calculations, we need to speak the language. Think of a hypothesis test like a courtroom trial.
Key Terms
- Null Hypothesis \(H_0\): This is the "innocent until proven guilty" stance. We assume everything is normal or as previously claimed. For example, "The coin is fair" (\(p = 0.5\)).
- Alternative Hypothesis \(H_1\): This is what you are trying to prove. For example, "The coin is biased" (\(p \neq 0.5\)).
- Test Statistic: The actual piece of evidence we collect (e.g., how many heads we got in 20 flips).
- Significance Level (\(\alpha\)): The "threshold" for proof. Usually 5% (0.05) or 1% (0.01). It is the probability of incorrectly rejecting the null hypothesis when it is actually true.
- Critical Region: The range of values for the test statistic that would make us "guilty" (reject \(H_0\)).
- Critical Value: The "borderline" number that starts the critical region.
- p-value: The probability of getting your result (or something more extreme) if \(H_0\) is actually true.
1-tail vs 2-tail Tests
How do you know which one to use? Look at the wording in the question!
- 1-tail test: Used when the claim is about a specific direction.
Example: "Is the new fertilizer better?" (\(H_1: p > 0.5\)) or "Is the machine underfilling?" (\(H_1: p < 0.5\)).
- 2-tail test: Used when the claim is that something has simply changed or is different.
Example: "Is the coin biased?" (\(H_1: p \neq 0.5\)).
Quick Review: In a 2-tail test at the 5% level, you split the significance level into 2.5% at the bottom end and 2.5% at the top end.
Takeaway: Always start by clearly defining your hypotheses in terms of the population parameter (like \(p\) or \(\mu\)).
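To see the vocabulary in action, here is a quick sketch in Python. The scenario (flipping a coin 20 times and getting 15 heads) is an invented example, not one from the chapter, and it uses scipy rather than the chapter's calculator workflow:

```python
from scipy.stats import binom

# H0: p = 0.5 (coin is fair), H1: p != 0.5 (coin is biased) -- a 2-tail test
n, observed, alpha = 20, 15, 0.05

# Probability of a result at least this extreme in the upper tail:
# P(X >= 15) = 1 - P(X <= 14)
upper_tail = 1 - binom.cdf(observed - 1, n, 0.5)

# For a symmetric 2-tail test, double the tail probability
p_value = 2 * upper_tail

print(f"p-value = {p_value:.4f}")  # about 0.0414
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

Since 0.0414 < 0.05, 15 heads falls in the critical region at the 5% level, so we would reject \(H_0\) and conclude the coin appears biased.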
2. Testing the Proportion (Binomial Distribution)
We use this when we are counting "successes" out of a fixed number of trials.
Step-by-Step Process
- State your hypotheses: Use \(p\) for the population proportion. \(H_0: p = \dots\) and \(H_1: p \dots\)
- Identify the distribution: Assume \(H_0\) is true, so \(X \sim B(n, p)\).
- Calculate the p-value: Use your calculator to find the probability of your result or more extreme.
- If testing greater than: \(P(X \geq \text{observed})\)
- If testing less than: \(P(X \leq \text{observed})\)
- Compare: If your p-value is less than the significance level, reject \(H_0\); otherwise, do not reject \(H_0\) (note that we never "accept" \(H_0\), we just fail to find evidence against it).
- Conclude in context: Always write a sentence like "There is sufficient evidence at the 5% level to suggest that..."
Example: A baker claims 90% of his cakes rise. You buy 20 and 15 rise. Is he exaggerating?
\(H_0: p = 0.9\)
\(H_1: p < 0.9\)
We test \(P(X \leq 15)\) using \(B(20, 0.9)\). If that probability is very small (less than our significance level), we call him out!
Common Mistake: Students often forget to include the "equal to" part of the probability (e.g., calculating \(P(X < 15)\) instead of \(P(X \leq 15)\)). Always include the observed value!
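The baker example can be checked numerically. Here is a minimal sketch using scipy (an alternative to the calculator-based workflow the chapter assumes):

```python
from scipy.stats import binom

# H0: p = 0.9, H1: p < 0.9 -- a 1-tail (lower) test on B(20, 0.9)
n, p0, observed, alpha = 20, 0.9, 15, 0.05

# P(X <= 15): the cdf naturally includes the observed value,
# which avoids the common P(X < 15) mistake
p_value = binom.cdf(observed, n, p0)

print(f"p-value = {p_value:.4f}")  # about 0.0432
if p_value < alpha:
    print("Sufficient evidence at the 5% level that fewer than 90% of cakes rise.")
else:
    print("Insufficient evidence to doubt the baker's claim.")
```

Since \(P(X \leq 15) \approx 0.0432 < 0.05\), we reject \(H_0\) at the 5% level (though note we would not reject at the 1% level).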
3. Testing Correlation (PMCC)
This tests if there is a linear relationship between two variables in a population.
The Parameters
- \(r\): The sample correlation (what you calculate from your data).
- \(\rho\) (rho): The population correlation (the "true" relationship we are guessing at).
Hypotheses for Correlation
- \(H_0: \rho = 0\) (There is no linear correlation).
- \(H_1: \rho > 0\) (Positive correlation), \(\rho < 0\) (Negative correlation), or \(\rho \neq 0\) (Some correlation).
Did you know? You don't need to calculate \(\rho\). You calculate \(r\) using your calculator, and then compare it to a Table of Critical Values (provided in the formula booklet) based on your sample size (\(n\)) and significance level.
Takeaway: If your calculated \(|r|\) is greater than the critical value, it is "strong enough" to reject the idea that there is no correlation.
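The table-lookup procedure can be sketched in code. The paired data below is made up for illustration, and 0.5494 is quoted here as the 5% 1-tail critical value for \(n = 10\) — treat that figure as an assumption and check it against your own formula booklet:

```python
from scipy.stats import pearsonr

# Made-up paired sample data, n = 10
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.3, 8.8, 10.2, 10.9]

r, _ = pearsonr(x, y)   # sample PMCC
critical = 0.5494       # assumed 5% 1-tail critical value for n = 10

print(f"r = {r:.4f}")
# H0: rho = 0 vs H1: rho > 0, so compare r against the critical value
print("Reject H0" if r > critical else "Do not reject H0")
```

Here \(r\) is very close to 1, far above the critical value, so we would reject \(H_0\) and conclude there is evidence of positive linear correlation.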
4. Testing the Mean (Normal Distribution)
We use this when we know the variance of a population and want to test if the mean (\(\mu\)) has changed based on a sample.
Prerequisite: The Sample Mean Distribution
If the population is \(X \sim N(\mu, \sigma^2)\), then the mean of a sample of size \(n\) follows: \( \bar{X} \sim N(\mu, \frac{\sigma^2}{n}) \)
Memory Aid: As the sample size (\(n\)) gets bigger, the "spread" (standard error) gets smaller. This makes sense—large samples are more reliable!
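The memory aid can be verified by simulation. This sketch assumes a \(N(50, 4^2)\) population (the numbers are invented for illustration) and compares the observed spread of sample means against the theoretical standard error \(\sigma / \sqrt{n}\):

```python
import numpy as np

rng = np.random.default_rng(42)
mu, sigma = 50, 4

for n in (4, 25, 100):
    # Draw 100,000 samples of size n and record each sample mean
    means = rng.normal(mu, sigma, size=(100_000, n)).mean(axis=1)
    # Observed spread of the sample means vs the theory: sigma / sqrt(n)
    print(f"n={n:4d}: observed SE = {means.std():.3f}, "
          f"theoretical = {sigma / n**0.5:.3f}")
```

As \(n\) grows from 4 to 100, the observed standard error shrinks from about 2 to about 0.4, matching \(\sigma / \sqrt{n}\) each time.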
The Test Statistic (Standardizing)
To find how many standard deviations our sample mean is from the claimed mean, we use: \( Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \)
This \(Z\) value is then compared to the standard Normal distribution \(N(0, 1^2)\).
Step-by-Step for Normal Testing
- State \(H_0: \mu = \dots\) and \(H_1: \mu \dots\)
- Write down the distribution of the sample mean: \(\bar{X} \sim N(\mu, \frac{\sigma^2}{n})\).
- Find the p-value: \(P(\bar{X} > \text{observed mean})\) or \(P(\bar{X} < \text{observed mean})\).
- Compare to the significance level and conclude.
Quick Review Box:
For a 1-tail test at 5%, the critical Z-value is 1.6449.
For a 2-tail test at 5%, the critical Z-values are \(\pm 1.96\).
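The full procedure can be sketched end-to-end. The scenario here (a machine claims to fill bags with mean 500 g and known \(\sigma = 6\) g; a sample of 25 bags averages 497.2 g) is invented for illustration:

```python
from math import sqrt
from scipy.stats import norm

# Invented scenario: H0: mu = 500, H1: mu < 500, sigma known
mu0, sigma, n, xbar, alpha = 500, 6, 25, 497.2, 0.05

# Distribution of the sample mean under H0: N(500, 6^2 / 25)
se = sigma / sqrt(n)      # standard error = 1.2

# Standardise to get the test statistic
z = (xbar - mu0) / se     # about -2.33

# 1-tail (lower) p-value
p_value = norm.cdf(z)

print(f"z = {z:.4f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

Equivalently, compare \(z \approx -2.33\) with the 1-tail critical value from the Quick Review Box: since \(|z| > 1.6449\), the sample mean lies in the critical region, and both routes give the same conclusion.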
Final Encouragement
Hypothesis testing can feel like a lot of steps, but it is very logical. Just remember: Hypothesize \(\rightarrow\) Test \(\rightarrow\) Compare \(\rightarrow\) Conclude. Keep your notation tidy, and always bring your final answer back to the real-world story in the question. You've got this!