Introduction to Sampling

Welcome to the world of Statistics! In this chapter, we are going to look at how we gather data. Imagine you want to know the favorite pizza topping of everyone in the UK. You couldn't possibly ask all 67 million people—it would take forever and cost a fortune! Instead, you ask a smaller group and use their answers to make an educated guess about everyone else. This is the heart of sampling.

We will learn the specific techniques used to pick that smaller group and how to make sure our results are fair and accurate. Don't worry if it seems like a lot of terms at first; we will break them down one by one!

1. Population and Samples

Before we can pick a technique, we need to understand the two main "groups" in statistics:

The Population

The population is the entire group of items or people that you are interested in.
Example: All the cod in the North Sea, or every student currently enrolled at your college.

The Sample

A sample is a subset of the population: the smaller part that you actually study.
Example: Catching 100 cod to measure them, or interviewing 50 students in the cafeteria.

Making Inferences

When we use data from a sample to talk about the whole population, we call this inference.
Important Note: Different samples can lead to different conclusions! If you sample only the elite athletes in a school, you might "infer" that the whole school is incredibly fit. If you sample only the students in the gaming club, you might get a different result. This is why how we choose the sample matters so much.
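To see how much the choice of sample can change an inference, here is a small Python sketch. The school, the group sizes and the fitness scores are all made up for illustration; they are assumptions, not real data.

```python
# Hypothetical school of 100 students with made-up fitness scores (out of 100).
athletes = [90] * 20        # 20 elite athletes
everyone_else = [50] * 80   # the other 80 students
population = athletes + everyone_else

def mean(values):
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

population_mean = mean(population)    # the true "big picture"
athletes_only_mean = mean(athletes)   # what we would infer from athletes alone

print(population_mean)     # 58.0
print(athletes_only_mean)  # 90.0
```

Sampling only the athletes would suggest a typical score of 90, when the figure for the whole (hypothetical) school is 58.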

Quick Review:
Population: The "Big Picture" (everyone/everything).
Sample: The "Small Snapshot" (the ones we actually check).
Inference: Using the snapshot to guess what the big picture looks like.

2. The "Gold Standard": Random Sampling

To be fair, every member of a population should have an equal chance of being picked. This leads us to Simple Random Sampling (SRS).

Simple Random Sampling

In a simple random sample, every possible sample of the required size has an equal probability of being selected.
Analogy: It’s like putting everyone’s name into a giant hat, shaking it well, and pulling out 10 names.

How to do it:
1. Assign a unique number to every member of the population (this list is called a sampling frame).
2. Use a random number generator (on your calculator or a computer) to pick as many different numbers as your sample size, ignoring any repeats.
3. The people/items matching those numbers become your sample.
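
The three steps above can be sketched in Python using the standard library's random module. This is only a minimal sketch: the function name simple_random_sample, the sampling frame of 500 numbered students and the seed are assumptions made for illustration.

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Pick sample_size different members so that every possible
    sample of that size is equally likely."""
    rng = random.Random(seed)  # a seed just makes the run repeatable
    return rng.sample(sampling_frame, sample_size)

# Step 1: a hypothetical sampling frame, students numbered 1 to 500.
sampling_frame = list(range(1, 501))

# Steps 2 and 3: generate 10 different random numbers; these are the sample.
sample = simple_random_sample(sampling_frame, 10, seed=1)
print(sample)
```

random.sample never picks the same member twice, which matches pulling names out of a hat without putting them back.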

Key Takeaway: Random sampling helps avoid bias (favoritism), but you need a full list of the population to do it, which isn't always possible.

3. Other Sampling Techniques

Sometimes a simple random sample isn't practical. Here are the other methods you need to know for the MEI syllabus:

Systematic Sampling

This is where you choose individuals at regular intervals from a list.
Example: Choosing every 10th person who walks through a door.
Step-by-step:
1. Calculate the interval \( k = \frac{\text{Population Size}}{\text{Sample Size}} \), rounding down to a whole number if it is not an integer.
2. Pick a random starting point between 1 and \( k \).
3. Keep adding \( k \) to that starting number to find your next participants.
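
Here is a minimal Python sketch of those steps. The function name systematic_sample and the list of 200 numbered people are assumptions made for illustration, and the interval is rounded down when the division is not exact.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Take every k-th member of an ordered list, from a random start."""
    if not 0 < sample_size <= len(population):
        raise ValueError("sample_size must be between 1 and the population size")
    rng = random.Random(seed)
    k = len(population) // sample_size   # step 1: the interval, rounded down
    start = rng.randint(1, k)            # step 2: random start from 1 to k inclusive
    # Step 3: positions start, start + k, start + 2k, ... (counting from 1).
    positions = [start + i * k for i in range(sample_size)]
    return [population[p - 1] for p in positions]

# Hypothetical list of 200 people, numbered in the order they appear on the list.
population = list(range(1, 201))
sample = systematic_sample(population, 20, seed=2)
print(sample)
```

With 200 people and a sample of 20, the interval is \( k = 10 \), so the sample is 20 people spaced exactly 10 apart on the list.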

Stratified Sampling

Use this when the population has distinct groups (called strata) that might behave differently (e.g., different age groups or genders). Members are chosen at random within each stratum, and the number taken from each stratum is in proportion to its share of the population.
Formula: \( \text{Number to sample from group} = \frac{\text{Number in group}}{\text{Total population}} \times \text{Total sample size} \)
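
As a worked example of the formula, the Python sketch below uses a made-up college with 300 Year 12 students and 200 Year 13 students and a total sample of 50. The function name stratified_sample and all the numbers are assumptions for illustration. When the formula does not give a whole number the result is rounded, so the total can end up slightly different from the target.

```python
import random

def stratified_sample(strata, total_sample_size, seed=None):
    """Randomly sample from each stratum in proportion to its size."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Number in group / total population * total sample size, rounded.
        n = round(len(members) * total_sample_size / population_size)
        sample[name] = rng.sample(members, n)  # random choice within the stratum
    return sample

# Hypothetical college: 300 students in Year 12 and 200 in Year 13.
strata = {
    "Year 12": [f"Y12-{i}" for i in range(1, 301)],
    "Year 13": [f"Y13-{i}" for i in range(1, 201)],
}
chosen = stratified_sample(strata, 50, seed=3)
counts = {name: len(members) for name, members in chosen.items()}
print(counts)  # {'Year 12': 30, 'Year 13': 20}
```

Year 12 makes up 300 out of 500 students, so it gets \( \frac{300}{500} \times 50 = 30 \) places, and Year 13 gets the remaining 20.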

Quota Sampling

This is like stratified sampling, but not random. An interviewer is told to find a certain number of people in specific categories.
Example: "Go find 20 men and 20 women over the age of 50." Once the 'quota' is full, they stop.
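
The lack of randomness is easy to see in code. In this minimal sketch the function name quota_sample, the category labels and the list of passers-by are all hypothetical. The interviewer simply takes people in the order they turn up until each quota is full.

```python
def quota_sample(passers_by, quotas):
    """Take people in arrival order until every category's quota is full."""
    remaining = dict(quotas)   # copy, so the caller's quotas are not changed
    sample = []
    for person in passers_by:
        category = person["category"]
        if remaining.get(category, 0) > 0:
            sample.append(person)
            remaining[category] -= 1
        if all(count == 0 for count in remaining.values()):
            break              # every quota is full, so the interviewer stops
    return sample

# Hypothetical people, listed in the order the interviewer meets them.
passers_by = [
    {"name": "A", "category": "men over 50"},
    {"name": "B", "category": "men over 50"},
    {"name": "C", "category": "men over 50"},
    {"name": "D", "category": "women over 50"},
    {"name": "E", "category": "under 50"},
    {"name": "F", "category": "women over 50"},
    {"name": "G", "category": "women over 50"},
]
quotas = {"men over 50": 2, "women over 50": 2}
sample = quota_sample(passers_by, quotas)
names = [person["name"] for person in sample]
print(names)  # ['A', 'B', 'D', 'F']
```

Person C is turned away because the quota for men is already full, and G is never asked. Who ends up in the sample depends entirely on who arrives first.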

Cluster Sampling

The population is divided into groups that are similar to each other (clusters). You randomly pick a few whole clusters and sample everyone inside them.
Example: If you want to sample UK Year 12s, you might randomly pick 5 schools and interview every Year 12 student in those specific schools.
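
The school example can be sketched in Python as follows. The function name cluster_sample, the 20 schools and the 30 students per school are assumptions made for illustration.

```python
import random

def cluster_sample(clusters, number_of_clusters, seed=None):
    """Randomly choose whole clusters, then include everyone inside them."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), number_of_clusters)
    sample = [member for name in chosen_names for member in clusters[name]]
    return chosen_names, sample

# Hypothetical population: 20 schools, each with 30 Year 12 students.
clusters = {
    f"School {i}": [f"School {i} student {j}" for j in range(1, 31)]
    for i in range(1, 21)
}
chosen_names, sample = cluster_sample(clusters, 5, seed=4)
print(chosen_names)
print(len(sample))  # 150
```

Only the choice of schools is random. Once a school is picked, all 30 of its students are included, giving 5 × 30 = 150 students.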

Opportunity (or Convenience) Sampling

This means choosing people who are available at the time and fit the criteria.
Example: Standing outside a supermarket on a Tuesday morning and asking the first 10 people you see.
Warning: This is often very biased because it only includes people who happen to be there at that specific time.

Self-Selected Sampling

Participants volunteer to be part of the sample.
Example: An online poll or a "mail-in" survey.
Did you know? Self-selected samples are often biased because people with very strong opinions are more likely to volunteer than people who don't care much.

4. Evaluating Sampling Techniques

In your exam, you might be asked to pick the best method or explain why a method is bad. You should consider bias and practicality.

Common Pitfalls (Avoid these mistakes!)

Confusing Stratified and Quota: Remember, Stratified uses random selection within the groups; Quota uses opportunity selection (the first people the interviewer finds).
Confusing Cluster and Stratified: In Stratified, you take a few people from every group. In Cluster, you take everyone from a few groups.
Ignoring Bias: Always check if the sampling method excludes a certain type of person. If you sample "working habits" by calling landlines at 2 PM, you will miss everyone who is currently at work!

Summary Table for Quick Review

Method: Simple Random
Pros: Free from selection bias; every possible sample is equally likely.
Cons: Need a full list (sampling frame) of everyone.

Method: Stratified
Pros: Ensures all groups are represented fairly.
Cons: Complex to organize; need detailed info on the population.

Method: Opportunity
Pros: Easy and cheap.
Cons: High chance of bias.

Method: Systematic
Pros: Spreads the sample evenly across the list.
Cons: Can be biased if the list has a hidden pattern.

Final Key Takeaways

• A Population is everyone; a Sample is a small part of them.
• Random sampling gives every possible sample an equal chance.
• Bias occurs when certain members of the population are more (or less) likely to be included than others.
• Choosing a technique is a balance between being statistically perfect and being practical in the real world.

Don't worry if this seems tricky at first. Once you practice identifying these methods in exam questions, the differences between "Quota", "Stratified" and "Cluster" will become much clearer!