Welcome to Statistics: Sampling!
Ever wondered how news channels predict election results before all the votes are counted? Or how scientists know a new medicine works without testing it on every single person on Earth? The secret is Sampling. In this chapter, we’ll learn how to take a small "snapshot" of a group and use it to understand the big picture. Don’t worry if statistics feels a bit different from pure algebra—it’s all about logic and making smart guesses!
1. Population vs. Sample: The Big Picture
Before we start calculating, we need to know exactly who or what we are talking about.
The Population
In statistics, the population is the entire group of items or people that you are interested in. It doesn't have to be people! It could be every cod in the North Sea, every lightbulb made in a factory, or even every possible toss of a coin.
• Finite Population: A group you can count, like all students in your college.
• Infinite Population: A group that never ends, like "all possible outcomes of rolling a die."
The Sample
A sample is a smaller group selected from the population. We study the sample to save time and money. Think of it like this: You don’t need to eat a whole pot of soup to know if it’s too salty; you just need one sample spoonful!
Sampling With or Without Replacement
When we pick a sample, we usually do it without replacement. This means once we pick a person to interview, we don't put them back in the "pot" to be picked again. This avoids getting data from the same individual twice. However, if a population is infinite (like coin tosses), removing one item does not change what is left to pick from, so it makes no difference whether we replace or not!
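A minimal sketch of the difference, using Python's standard `random` module. The population of ten numbered people, the sample size of 5, and the seed are all assumptions made up for illustration.

```python
import random

rng = random.Random(1)  # fixed seed (arbitrary) so the run is repeatable
population = list(range(1, 11))  # hypothetical population: people numbered 1 to 10

# Without replacement: a person who is picked cannot be picked again.
without_replacement = rng.sample(population, k=5)

# With replacement: each pick is made from the full population,
# so the same person can appear more than once.
with_replacement = rng.choices(population, k=5)

print("Without replacement:", without_replacement)
print("With replacement:   ", with_replacement)
```

The first list never contains a repeated number; the second list may.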
Quick Review:
• Population: The whole group.
• Sample: A part of the group.
• Census: When you actually measure the whole population (rare and expensive!).
Key Takeaway: We use a sample to make inferences (smart guesses) about a population.
2. Making Inferences (Best Guesses)
Once we have our sample, we calculate things like the sample mean \( (\bar{x}) \) or the sample variance \( (s^2) \). We use these values as estimates for the whole population.
Example: If you find the average height of 50 students in your year is \( 165 \text{ cm} \), you might infer that the average height of all students of that age in the country is roughly \( 165 \text{ cm} \), as long as your year group is typical of students that age.
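As a rough illustration of how \( \bar{x} \) and \( s^2 \) are calculated, here is a short sketch using Python's standard `statistics` module. The five heights are invented for the example. Note that textbooks differ on whether \( s^2 \) divides by \( n \) or by \( n - 1 \), so both versions are shown; check which one your syllabus uses.

```python
import statistics

# Hypothetical heights (in cm) of a small sample of five students.
heights = [160, 165, 170, 158, 172]

sample_mean = statistics.mean(heights)

# statistics.variance divides by (n - 1); statistics.pvariance divides by n.
variance_n_minus_1 = statistics.variance(heights)
variance_n = statistics.pvariance(heights)

print(sample_mean, variance_n_minus_1, variance_n)
```

For these five values the mean is 165, the variance with divisor \( n - 1 \) is 37, and the variance with divisor \( n \) is 29.6.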
The "Different Samples" Problem:
If your friend picks a different group of 50 students, they might get an average of \( 168 \text{ cm} \). This is normal! Different samples will lead to different estimates, and sometimes to different conclusions. This is why choosing the right sampling method is so important.
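You can see this effect in a quick simulation. The sketch below assumes a made-up population of 1000 heights (generated at random, not real data) and draws two separate samples of 50 from it.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed (arbitrary) so the run is repeatable

# Hypothetical population: 1000 heights in cm, made up for this illustration.
population = [rng.gauss(166, 8) for _ in range(1000)]

# Two people each take their own simple random sample of 50.
sample_a = rng.sample(population, k=50)
sample_b = rng.sample(population, k=50)

mean_a = statistics.mean(sample_a)
mean_b = statistics.mean(sample_b)

print(f"Population mean: {statistics.mean(population):.1f} cm")
print(f"Sample A mean:   {mean_a:.1f} cm")
print(f"Sample B mean:   {mean_b:.1f} cm")
```

The two sample means will usually be close to the population mean, but not equal to it or to each other.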
Key Takeaway: Sample data provides an estimate, but it’s rarely 100% perfect. Different samples produce different results.
3. Sampling Techniques: How to Pick Your Group
How you pick your sample determines if your results are "fair" or "biased." The syllabus requires you to know these specific methods:
A. Simple Random Sampling
Every member of the population has an equal chance of being chosen, and every possible sample of the required size is equally likely.
How to do it: Give every item a number and use a random number generator to pick your sample. It’s like pulling names out of a hat!
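The "number everyone, then generate random numbers" recipe can be sketched like this. The list of names and the sample size of 3 are hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed (arbitrary) so the run is repeatable

# Hypothetical population list; in practice this is your sampling frame.
students = ["Amy", "Ben", "Cara", "Dev", "Ella", "Finn", "Gita", "Hugo"]

# Step 1: give every member a number (1, 2, 3, ...).
numbered = {number: name for number, name in enumerate(students, start=1)}

# Step 2: let a random number generator pick 3 different numbers.
chosen_numbers = rng.sample(list(numbered), k=3)

# Step 3: the people with those numbers form the sample.
sample = [numbered[number] for number in chosen_numbers]
print(sample)
```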
B. Systematic Sampling
You pick a starting point at random and then take every \( k^{th} \) item, where \( k = \frac{\text{Population Size}}{\text{Sample Size}} \).
Example: You have a list of 1000 people and want a sample of 50. You pick a random start between 1 and 20, then pick every \( 20^{th} \) person on the list.
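The example above (a list of 1000, a sample of 50) might be carried out as in this sketch. The variable names and the seed are illustrative choices, not part of any standard method.

```python
import random

rng = random.Random(3)  # fixed seed (arbitrary) so the run is repeatable

population_size = 1000
sample_size = 50
interval = population_size // sample_size  # 1000 / 50 = 20

# Random starting position between 1 and 20 inclusive.
start = rng.randint(1, interval)

# Take the person at `start`, then every 20th position after that.
positions = list(range(start, population_size + 1, interval))

print(start, positions[:5], len(positions))
```

Whatever the starting point, this produces exactly 50 positions, each 20 apart.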
C. Stratified Sampling
The population is divided into groups (called strata) based on a characteristic (like age or gender). You then take a random sample from each group proportional to the size of the group.
The Formula: \( \text{Number to sample from group} = \frac{\text{Group Size}}{\text{Population Size}} \times \text{Total Sample Size} \)
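Here is the formula applied to a made-up college with three year groups. The group sizes and the total sample size of 60 are assumptions chosen so the numbers work out neatly.

```python
# Hypothetical college: the year-group sizes are invented for this example.
strata_sizes = {"Year 10": 150, "Year 11": 100, "Year 12": 50}
total_sample_size = 60

population_size = sum(strata_sizes.values())  # 300

# Number to sample from group = group size / population size * total sample size
allocation = {
    group: round(size * total_sample_size / population_size)
    for group, size in strata_sizes.items()
}

print(allocation)
```

This gives 30, 20 and 10 students from the three groups. When the formula does not give whole numbers, round each value and check that the allocations still add up to the total sample size.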
D. Quota Sampling
Similar to stratified sampling, but not random. You are told to find a certain number of people from specific groups.
Example: A researcher stands in a shopping center and is told to interview 20 men and 20 women. They just stop the first people they see until the "quota" is full.
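To show why this is not random, the sketch below fills a quota from people in the order they happen to walk past. The names, and the small quota of 2 men and 2 women, are hypothetical.

```python
# Hypothetical passers-by in the order they walk past: (name, group).
passers_by = [
    ("Ali", "man"), ("Bea", "woman"), ("Carl", "man"), ("Dan", "man"),
    ("Eve", "woman"), ("Fay", "woman"), ("Gus", "man"),
]

quota = {"man": 2, "woman": 2}  # a small quota, assumed for illustration
sample = []

for name, group in passers_by:
    if quota[group] > 0:  # this group's quota is not yet full
        sample.append(name)
        quota[group] -= 1
    if all(remaining == 0 for remaining in quota.values()):
        break  # every quota is full, so stop interviewing

print(sample)
```

Dan is skipped because the quota for men is already full, and Fay and Gus are never reached. Who ends up in the sample depends entirely on who arrives first, not on chance.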
E. Opportunity (Convenience) Sampling
You just pick the people who are available at the time.
Example: Interviewing the first 10 people who walk into the library on a Tuesday morning. It's easy, but often biased!
F. Cluster Sampling
The population is divided into "clusters" (like different towns). You pick a few clusters at random and then sample everyone inside those clusters.
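A small sketch of the two steps, assuming four invented towns with a handful of labelled residents each:

```python
import random

rng = random.Random(5)  # fixed seed (arbitrary) so the run is repeatable

# Hypothetical clusters: each town maps to the list of its residents.
towns = {
    "Ashford": ["A1", "A2", "A3"],
    "Barton": ["B1", "B2"],
    "Calder": ["C1", "C2", "C3", "C4"],
    "Denby": ["D1", "D2"],
}

# Step 1: pick 2 whole clusters at random.
chosen_towns = rng.sample(list(towns), k=2)

# Step 2: include everyone inside the chosen clusters.
sample = [person for town in chosen_towns for person in towns[town]]

print(chosen_towns, sample)
```

Notice that the randomness is in which towns are chosen, not in which people are chosen within a town.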
G. Self-Selected Sampling
People volunteer to be part of the sample.
Example: An online survey or a radio call-in.
Did you know? Self-selected samples are often biased because it is mostly people with strong opinions who bother to join in!
Key Takeaway: Random methods (Simple, Systematic, Stratified, Cluster) are usually fairer. Non-random methods (Quota, Opportunity, Self-selected) are easier but riskier.
4. Bias and Practicality
Even with the best intentions, things can go wrong. You need to be able to critique a sampling method in your exam.
What is Bias?
Bias happens when a sample doesn't represent the population fairly. A biased sample tends to over-estimate or under-estimate the truth in a consistent direction.
Sources of Bias to Watch Out For:
• Sampling Frame Bias: When your "list" of the population is missing people (e.g., using a phone book misses people without landlines).
• Non-response Bias: When people chosen for the sample refuse to answer, so their views are missing from the results.
• Location/Time Bias: Sampling outside a gym at 6 AM only gets you a specific type of person!
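To make the phone book example concrete, here is a toy calculation. The ten ages are invented, and they are deliberately set up so that landline owners are older; real data may look quite different.

```python
import statistics

# Hypothetical population of 10 adults: (age, has_landline). The data is
# invented so that landline owners are older, purely for illustration.
population = [
    (70, True), (65, True), (58, True), (62, True),
    (22, False), (30, False), (25, False),
    (35, False), (28, False), (41, False),
]

# The true population mean uses everyone.
population_mean = statistics.mean([age for age, _ in population])

# A phone-book sampling frame only contains people with landlines.
frame = [age for age, has_landline in population if has_landline]
frame_mean = statistics.mean(frame)

print(population_mean, frame_mean)
```

The true mean age is 43.6, but the mean for people in the phone book is 63.75. Any sample drawn only from that list would over-estimate the average age, however randomly it was picked.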
Practical Issues:
Sometimes you can't be perfectly random because it’s too expensive, takes too long, or is physically impossible (you can't give every fish in the ocean a number!). In the exam, you might be asked to suggest a better way to sample while keeping these practical limits in mind.
Common Mistake to Avoid: Don't assume a sample is "bad" just because it isn't random. Sometimes Quota sampling is the only practical way to ensure you get a mix of different ages or backgrounds quickly.
Key Takeaway: Always ask: "Does this sample truly represent the whole population, or is it tilted in one direction?"
Summary Checklist
Before you move on, make sure you can:
• Define Population and Sample.
• Explain why different samples give different mean/variance estimates.
• Describe how to carry out Simple Random, Systematic, and Stratified sampling.
• Identify Bias in a given scenario.
• Discuss why one method might be more practical than another.
Don't worry if this seems tricky at first! The more examples you look at, the easier it becomes to spot the patterns. Keep practicing!