Welcome to the World of Sampling!
Ever wondered how news channels predict election results before all the votes are counted? Or how scientists know the average length of cod in the North Sea without catching every single fish? The secret lies in Sampling. In this chapter, we’ll learn how to take a small "snapshot" of a group to understand the "big picture." Don't worry if statistics feels a bit abstract at first—we'll use plenty of real-world examples to make it click!
1. Populations and Samples: The Big Picture vs. The Snapshot
Before we can collect data, we need to define who or what we are talking about.
Key Terms
Population: The entire group of individuals or items that we are interested in for a particular investigation. Example: Every student currently enrolled in your sixth-form college.
Sample: A smaller group of items specifically selected from the population to be studied. Example: 20 students picked at random from your college.
The Soup Analogy
Imagine you are cooking a massive pot of vegetable soup. The Population is every single drop of soup in that pot. To see if it needs more salt, you don’t drink the whole pot (that would be a census!). Instead, you stir the pot and take one spoonful. That spoonful is your Sample. If the spoonful tastes salty, you infer that the whole pot is salty.
Did you know? Populations aren't always a fixed size. A population can be infinite, like all the possible outcomes of tossing a coin forever!
Quick Review: Population vs. Sample
• Population: The whole group (can be finite or infinite).
• Sample: The part of the group we actually look at.
• Inference: Using the sample to make a "best guess" about the whole population.
2. Making Informal Inferences
We use samples to estimate things about the population. For example, we might calculate the sample mean \( ( \bar{x} ) \) to estimate the true population mean \( ( \mu ) \).
Important Point: Different samples can lead to different conclusions! If you take three different samples of 10 students and measure their heights, you will probably get three slightly different averages. This is called sampling variability. It’s perfectly normal, but it’s why we have to be careful about how we choose our samples.
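To see sampling variability in action, here is a minimal sketch in Python. The population of 500 heights is made up purely for illustration (it is not real data), and the name sample_mean is just a helper chosen for this example.

```python
import random
import statistics

# Hypothetical population: 500 student heights in cm, simulated for illustration only.
rng = random.Random(1)
population = [rng.gauss(170, 8) for _ in range(500)]

def sample_mean(pop, n, rng):
    """Return the mean of one simple random sample of size n."""
    return statistics.mean(rng.sample(pop, n))

# Take three different samples of 10 students and compare their means.
sample_means = [sample_mean(population, 10, rng) for _ in range(3)]
population_mean = statistics.mean(population)

print("Population mean:", round(population_mean, 2))
for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i} mean:", round(m, 2))
```

Each run of three samples should give three slightly different means, all scattered around the population mean.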
3. Random Sampling: The Fair Way
To give our sample the best chance of being representative (a fair reflection of the population), we often want it to be random.
Simple Random Sampling (SRS)
In a Simple Random Sample, every possible sample of the required size has the same probability of being selected. It's like putting everyone's name in a giant hat and pulling them out while blindfolded.
How to do it:
1. Assign a unique number to every member of the population (this list is called a sampling frame).
2. Use a random number generator (on your calculator or computer) to pick the numbers.
3. The people/items corresponding to those numbers are your sample.
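The three steps above can be sketched in Python as follows. The sampling frame of 200 numbered students and the function name simple_random_sample are assumptions made for this example; the seed is only there so the result can be reproduced.

```python
import random

# Step 1: a hypothetical sampling frame, with students numbered 1 to 200.
sampling_frame = {number: f"Student {number}" for number in range(1, 201)}

def simple_random_sample(frame, n, seed=None):
    """Pick n members of the frame so that every possible sample of size n is equally likely."""
    rng = random.Random(seed)
    # Step 2: choose n distinct numbers at random (no repeats).
    chosen_numbers = rng.sample(sorted(frame), n)
    # Step 3: look up the people those numbers belong to.
    return [frame[number] for number in chosen_numbers]

srs = simple_random_sample(sampling_frame, 20, seed=42)
print(srs)
```

Note that random.sample chooses without replacement, so nobody can be picked twice.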
Key Takeaway: Random sampling helps avoid bias—which is when certain groups are accidentally favored over others.
4. Other Sampling Techniques
Sometimes a simple random sample isn't practical. Here are the other methods you need to know for the MEI syllabus:
Systematic Sampling
Items are chosen at regular intervals from a list.
Example: You have a list of 100 people and you want a sample of 10. You pick a random starting point between 1 and 10, then pick every \( 10^{th} \) person after that. If the starting point is 4, your sample is persons 4, 14, 24, and so on up to 94.
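Here is a rough Python sketch of the same idea. The function name systematic_sample is made up for this example, and it assumes the list length is an exact multiple of the sample size.

```python
import random

def systematic_sample(items, n, seed=None):
    """Pick n items at regular intervals after a random start.

    Assumes len(items) is an exact multiple of n.
    """
    rng = random.Random(seed)
    k = len(items) // n        # the interval: every k-th item is chosen
    start = rng.randrange(k)   # random start among the first k items (index 0 to k-1)
    return [items[start + i * k] for i in range(n)]

people = list(range(1, 101))   # a list of 100 people, numbered 1 to 100
chosen = systematic_sample(people, 10, seed=3)
print(chosen)
```

Only the starting point is random. Once it is fixed, the rest of the sample is completely determined.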
Stratified Sampling
The population is divided into groups (called strata) based on a characteristic (like age or gender). You then take a random sample from each group, making sure the sample sizes are proportional to the group sizes in the population.
Example: If a school is 60% girls and 40% boys, a stratified sample of 100 students would randomly pick 60 girls and 40 boys.
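The calculation behind this is: sample size for a stratum = (stratum size ÷ population size) × total sample size. A minimal Python sketch, assuming a made-up school of 600 girls and 400 boys (the names stratified_sample and school are invented for this example):

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Take a random sample from each stratum, in proportion to its size.

    Rounding means the pieces may not add up to exactly total_n in general.
    """
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        n = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, n)   # random selection within the stratum
    return sample

# Hypothetical school: 60% girls, 40% boys.
school = {
    "girls": [f"G{i}" for i in range(1, 601)],
    "boys": [f"B{i}" for i in range(1, 401)],
}
chosen = stratified_sample(school, 100, seed=7)
print(len(chosen["girls"]), len(chosen["boys"]))
```

If the proportions do not give whole numbers, you would need to round and then check that the stratum sizes still add up to the total you want.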
Quota Sampling
Similar to stratified sampling, but not random. An interviewer is told to find a certain number of people from specific groups (e.g., "Find 20 men over the age of 50"). Once the "quota" is full, they stop.
Memory Aid: Quota = "Quantity." You just need a specific quantity from each group.
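To see why quota sampling is not random, here is a small Python sketch. The passers-by, the group labels, and the function name quota_sample are all invented for illustration. Notice that there is no random number generator anywhere: the interviewer simply takes people in the order they turn up.

```python
def quota_sample(passers_by, quotas):
    """Fill each group's quota with the first suitable people encountered."""
    remaining = dict(quotas)          # how many are still needed from each group
    sample = []
    for person in passers_by:
        group = person["group"]
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break                     # every quota is full, so the interviewer stops
    return sample

# Hypothetical stream of 100 passers-by: every third person is a man over 50.
passers_by = [
    {"id": i, "group": "men over 50" if i % 3 == 0 else "other"}
    for i in range(1, 101)
]
chosen = quota_sample(passers_by, {"men over 50": 20, "other": 10})
print(len(chosen))
```

Anyone who arrives after the quotas are full has no chance of being selected, which is one way bias can creep in.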
Opportunity Sampling (Convenience)
You simply pick people who are available at the time and fit your criteria.
Example: Standing outside a supermarket and asking the first 50 people who walk past.
Cluster Sampling
The population is divided into "clusters" (usually based on location). You then pick a few clusters at random and sample everyone inside those clusters.
Example: If you want to sample doctors in the UK, you might randomly pick 5 specific hospitals and interview every doctor in them.
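A minimal Python sketch of this example, assuming a made-up list of 30 hospitals with 8 doctors each (the data and the function name cluster_sample are hypothetical):

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random, then include everyone inside them."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen_names}

# Hypothetical population: 30 hospitals, each with 8 doctors.
hospitals = {
    f"Hospital {h}": [f"Dr {h}-{d}" for d in range(1, 9)]
    for h in range(1, 31)
}
chosen = cluster_sample(hospitals, 5, seed=11)
for name, doctors in chosen.items():
    print(name, len(doctors))
```

The randomness here is in which hospitals get picked, not in which doctors are chosen within each hospital.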
Self-Selected (Volunteer) Sampling
People choose to take part in the study themselves.
Example: An online poll on a news website.
5. Evaluating Sampling Methods and Bias
In your exam, you might be asked to critique a sampling method. Here is what to look for:
The Problem of Bias
Bias happens when a sample doesn't accurately reflect the population. Common causes include:
• Non-response: Some people refuse to answer, and those people might have different opinions than those who do.
• Sampling Frame errors: The list you are using might be out of date or missing people.
• Method Bias: Opportunity sampling is often biased because you only talk to people in one place at one specific time.
Common Mistakes to Avoid
• Mixing up Stratified and Quota: Remember, Stratified uses random selection within groups; Quota is non-random (like an interviewer picking people).
• Forgetting the Sampling Frame: You can't do a Simple Random Sample if you don't have a full list of the population!
• Underestimating the bias in Opportunity Sampling: It's easy to carry out, but it is usually one of the most biased methods.
Quick Review Box:
• Simple Random: Every possible sample of the chosen size is equally likely.
• Systematic: Every \( n^{th} \) item.
• Stratified: Proportional random groups.
• Quota: Non-random groups.
• Opportunity: Easiest, but often heavily biased.
• Cluster: Everyone in a few randomly chosen groups.
• Self-selected: Volunteers only.
Chapter Summary
Understanding the difference between a population and a sample is the foundation of all statistics. To make a sample useful, we try to make it random to reduce bias. While Simple Random Sampling is the "gold standard," other methods like Stratified or Systematic sampling are often more practical in the real world. Always keep an eye out for potential bias: it's one of the most common reasons why statistical predictions go wrong!