Welcome to Effect Size!
In your journey through Statistical Inference, you have spent a lot of time looking at p-values and deciding whether to reject a null hypothesis. But have you ever stopped to ask: "Even if this result is statistically significant, does it actually matter in the real world?"
That is exactly what Effect Size tells us! While significance testing (p-values) tells us if a result is likely due to chance, effect size tells us how large and meaningful that result actually is. Don’t worry if this sounds a bit abstract right now—we are going to break it down step-by-step.
1. What is Effect Size?
Effect size is a way of quantifying the size of the difference between two groups. It complements standard significance testing rather than replacing it, so you should use both together to get the full story from your data.
An Everyday Analogy:
Imagine you find a "miracle" plant food. A hypothesis test might give strong evidence that it makes sunflowers grow taller (statistical significance). However, the effect size tells you how much taller. If the plants grow only 1 millimeter taller, the effect size is tiny, and the plant food probably isn't worth the money, even though the result was "significant"!
Key Differences to Remember:
- Significance Testing (p-values): Asks "Is there an effect at all?"
- Effect Size: Asks "How big is the effect?"
2. The Relationship Between p-values, Sample Size, and Effect Size
This is a crucial point for your exam. A common mistake is thinking that a very small p-value (like 0.0001) means a huge real-world effect. That isn't always true!
The p-value in a hypothesis test depends on two main things:
1. The Effect Size (how big the real difference is).
2. The Sample Size (how much data you collected).
The "Sample Size Trap":
If you have a huge sample size (thousands of people), even a tiny, unimportant difference can become "statistically significant." Conversely, if your sample is very small, you might miss a large, important effect because the test didn't have enough power to "see" it.
Quick Review: Effect size does not inflate with sample size the way a test statistic does. It estimates the "true" strength of the difference regardless of how many people you surveyed (although a larger sample does pin it down more precisely).
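To see the trap in action, here is a minimal Python simulation sketch (numpy and scipy.stats are standard libraries; the means, standard deviation, and the sample size of 50,000 per group are invented illustration values). The standardised mean difference it prints is Cohen's \( d \), which is defined formally in the next section:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Two populations whose true means differ by only 0.5 (a tiny effect)
# relative to a standard deviation of 10.
n = 50_000                                    # a huge sample per group
group_a = rng.normal(loc=100.0, scale=10.0, size=n)
group_b = rng.normal(loc=100.5, scale=10.0, size=n)

# Significance test: with n this large, even the tiny difference
# shows up as "statistically significant".
t_stat, p_value = stats.ttest_ind(group_a, group_b)

# Standardised mean difference (Cohen's d). With equal group sizes,
# the pooled SD is the root mean square of the two sample SDs.
pooled_sd = np.sqrt((group_a.var(ddof=1) + group_b.var(ddof=1)) / 2)
d = (group_b.mean() - group_a.mean()) / pooled_sd

print(f"p-value   = {p_value:.1e}")  # tiny, far below 0.05
print(f"Cohen's d = {d:.3f}")        # around 0.05: a trivial effect
```

The p-value screams "significant!" while the effect size quietly reports that the difference is trivial. That is exactly why you need both numbers.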
3. Measuring Effect Size: Cohen’s \( d \)
In the Pearson Edexcel syllabus, the main measure you need to know is Cohen’s \( d \). This is used when we are comparing the means (averages) of two groups.
Essentially, Cohen's \( d \) measures how many standard deviations apart the two means are. The formula for a simple situation is:
\( d = \frac{\bar{x}_1 - \bar{x}_2}{s} \)
Where:
- \( \bar{x}_1 \) and \( \bar{x}_2 \) are the means of the two groups.
- \( s \) is the standard deviation, usually the pooled standard deviation of the two groups, commonly calculated as \( s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \).
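To make the formula concrete, here is a short Python sketch (the function name, the exam marks, and the group sizes are all invented for illustration; the pooled formula is the common two-sample version shown above):

```python
import math

def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Cohen's d for two independent groups, using the pooled SD."""
    # Pooled SD: the group variances averaged, weighted by their
    # degrees of freedom (n - 1), then square-rooted.
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                          / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd

# Invented example: group 1 averages 68 marks, group 2 averages 62,
# both with an SD of 10, from 30 students each.
d = cohens_d(mean1=68, mean2=62, sd1=10, sd2=10, n1=30, n2=30)
print(f"d = {d:.2f}")  # d = 0.60 -> a medium effect by Cohen's boundaries
```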
Interpretation Boundaries
Jacob Cohen, the statistician who introduced this measure, suggested some "rules of thumb" to help us interpret the number. You should memorize these boundaries for your exam:
- \( 0.2 \le d < 0.5 \): Small effect size (The difference is there, but hard to see with the naked eye).
- \( 0.5 \le d < 0.8 \): Medium effect size (The difference is large enough to be visible to an observer).
- \( 0.8 \le d \): Large effect size (The difference is very clear and substantial).
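If it helps to see the boundaries encoded, here is a tiny Python helper (the function name and labels are my own; values of \( d \) below 0.2 are often described as "negligible", which sits outside the boundaries listed above):

```python
def interpret_cohens_d(d: float) -> str:
    """Label |d| using Cohen's rule-of-thumb boundaries."""
    size = abs(d)  # the sign only tells you which group scored higher
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"

print(interpret_cohens_d(0.60))  # medium
```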
Did you know? Assuming both groups are roughly normally distributed, a "large" effect size of \( d = 0.8 \) means that the average person in the experimental group scores higher than about 79% of the people in the control group!
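That 79% figure comes straight from the normal distribution: if both groups are normal with equal spread, the experimental group's mean sits \( d \) standard deviations above the control mean, so the proportion of the control group below it is \( \Phi(d) \), the standard normal CDF. A quick check using scipy:

```python
from scipy.stats import norm

# Proportion of the control group falling below the experimental
# group's mean, for a shift of d standard deviations (assuming both
# groups are normal with equal spread).
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: above {norm.cdf(d):.0%} of the control group")

# d = 0.2 -> 58%, d = 0.5 -> 69%, d = 0.8 -> 79%
```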
4. Context is Everything!
While the boundaries above are helpful, the syllabus reminds us that the interpretation of effect size depends on the context.
Example:
If a new heart medication has an effect size of \( d = 0.1 \), it is "small" by Cohen's standards. However, if that medication saves 1,000 lives a year, that "small" effect is extremely important in a medical context! Always look at what is being measured before dismissing a small number.
5. Common Mistakes to Avoid
- Mistake: Thinking a significant p-value means a large effect.
  Correction: Always check the effect size; significance only tells you the result is unlikely to be a fluke.
- Mistake: Using Cohen's \( d \) boundaries as "hard laws."
  Correction: Use them as guidelines, but always mention the context of the problem (e.g., medicine, education, or sports).
- Mistake: Forgetting that \( d \) can be calculated even if the result isn't significant.
  Correction: Effect size is a description of the data you have, regardless of the outcome of a hypothesis test.
Summary Checklist
- Can you explain why effect size is "complementary" to hypothesis testing? (It adds the "how much" to the "is there an effect?").
- Do you know the boundaries for Cohen’s \( d \)? (0.2 = Small, 0.5 = Medium, 0.8 = Large).
- Do you understand that p-values are affected by sample size, but effect size is not?
- Can you interpret an effect size in a real-world context?
Don't worry if this seems tricky at first! Just remember: Significance testing is the "Yes/No" switch, and Effect Size is the "Dimmer" switch that tells you exactly how bright the light is.