Welcome to Scatter Graphs and Correlation!

In this chapter, we are going to explore how two different things (we call these variables) might be related to each other. For example, does the temperature tend to rise when there are more hours of sunshine? Do taller people tend to have bigger feet?

By the end of these notes, you will be able to spot patterns in data and use them to predict values you have not measured. Don't worry if you find graphs a bit tricky; we will take it one step at a time!

What is a Scatter Graph?

A scatter graph is a way of displaying data for two variables on a grid. We plot these as points, just like you would with coordinates \( (x, y) \).

Prerequisite Check: Remember, to plot a point, you go "along the corridor" (the x-axis) and then "up or down the stairs" (the y-axis).

Imagine you measured the height of your friends and their shoe sizes. On a scatter graph, each person would be represented by a single dot. The position of that dot tells you both their height and their shoe size at the same time!

Quick Review:

• A scatter graph uses dots to show the relationship between two sets of data.
• Each dot represents one "pair" of data (like one person's height and shoe size).
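
To make the idea of a data "pair" concrete, here is a minimal Python sketch. The heights and shoe sizes are made-up numbers for illustration only, and the names heights_cm, shoe_sizes and points are hypothetical.

```python
# Made-up measurements for five friends (illustration only).
heights_cm = [150, 155, 160, 165, 170]
shoe_sizes = [4, 5, 5, 6, 7]

# Pair each height with the matching shoe size: one (x, y) point per person.
points = list(zip(heights_cm, shoe_sizes))

# Each pair would be plotted as a single dot on the scatter graph.
for x, y in points:
    print(f"height {x} cm, shoe size {y}")
```

Each tuple in points is one dot: the first number is the x-coordinate and the second is the y-coordinate.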

Understanding Correlation

When we look at all the dots on a scatter graph together, we look for a pattern. This pattern is called correlation. Correlation tells us if there is a relationship between the two variables.

1. Positive Correlation

This happens when both variables increase together. As the value on the x-axis goes up, the value on the y-axis also goes up.

Example: The more hours you spend revising, the higher your test score is likely to be.
Visual Trick: If the dots seem to be heading "uphill" from left to right (like an airplane taking off), it is positive correlation.

2. Negative Correlation

This happens when one variable increases while the other decreases. As the value on the x-axis goes up, the value on the y-axis goes down.

Example: The more time you spend watching TV, the less time you have for sleep.
Visual Trick: If the dots seem to be heading "downhill" from left to right (like a slide), it is negative correlation.

3. No Correlation

This happens when there is no clear pattern at all. The dots are just scattered everywhere like spilled cereal! This suggests the two things are not related.

Example: Your score in a Maths test and the length of your hair. Having longer hair won't make you better (or worse) at Maths!

Key Takeaway:

Positive = Both go up together (Uphill).
Negative = One goes up, one goes down (Downhill).
No Correlation = No relationship (Messy dots).

Strength of Correlation

Not all relationships are perfect. We use the terms strong and weak to describe how clear the pattern is:

Strong Correlation: The dots are very close together and form a narrow "path." It's very easy to see the direction.
Weak Correlation: The dots are more spread out, but you can still see a general "uphill" or "downhill" trend.
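
In these notes, direction and strength are judged by eye. If you want a number to back up your judgement, one common measure is the correlation coefficient \( r \), which lies between \( -1 \) and \( 1 \). The sketch below is an addition to the notes, not part of them: the data is made up, and the cut-offs 0.3 and 0.7 are illustrative assumptions, not official definitions of "weak" and "strong".

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r for paired data (between -1 and 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(r, weak=0.3, strong=0.7):
    """Turn r into words. The cut-offs are assumptions for illustration."""
    if abs(r) < weak:
        return "no correlation"
    strength = "strong" if abs(r) >= strong else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"

# Made-up data: hours revised against test score.
hours = [1, 2, 3, 4, 5]
scores = [40, 50, 55, 70, 80]
print(describe(pearson_r(hours, scores)))

# Made-up data: hours of TV against hours of sleep.
tv_hours = [1, 2, 3, 4, 5]
sleep_hours = [9, 8, 8, 7, 6]
print(describe(pearson_r(tv_hours, sleep_hours)))
```

The sign of \( r \) matches the "uphill" or "downhill" direction, and values closer to \( 1 \) or \( -1 \) match a narrower "path" of dots.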

Did you know? Just because two things have a correlation, it doesn't always mean one causes the other. For example, ice cream sales and shark attacks both go up in the summer because the weather is hot, but eating ice cream doesn't cause sharks to bite!

The Line of Best Fit

If there is a correlation, we can draw a Line of Best Fit. This is a straight line that goes through the middle of the data points.

How to draw a good Line of Best Fit:
1. Use a ruler and a sharp pencil (this is a must!).
2. Try to follow the general direction of the dots.
3. Try to have roughly the same number of dots above the line as below the line.
4. Do not just connect the first dot to the last dot.
5. Do not force the line to go through \( (0,0) \) unless it naturally fits there.
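
The steps above describe drawing the line by eye. A computer usually takes a different route and calculates a "least squares" line instead. The sketch below shows that calculated version, not the by-eye method from these notes. The temperatures and sales figures are made up, and best_fit_line is a hypothetical helper name.

```python
def best_fit_line(xs, ys):
    """Least-squares straight line y = gradient * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x  # line passes through the mean point
    return gradient, intercept

# Made-up data: temperature (degrees C) against ice cream sales.
temperatures_c = [14, 18, 22, 26, 30]
sales = [18, 27, 35, 45, 55]

gradient, intercept = best_fit_line(temperatures_c, sales)
print(f"sales = {gradient:.2f} * temperature + ({intercept:.2f})")

# Check the rule of thumb: roughly as many dots above the line as below it.
above = sum(1 for x, y in zip(temperatures_c, sales) if y > gradient * x + intercept)
below = sum(1 for x, y in zip(temperatures_c, sales) if y < gradient * x + intercept)
print(f"{above} dots above, {below} dots below")
```

For this data the intercept is not zero, so the line does not pass through \( (0,0) \), and the dots split roughly evenly on either side of it.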

Key Takeaway:

The Line of Best Fit is like a "summary" of the data. It helps us ignore the small wobbles and see the main trend.

Making Predictions

The main reason we draw a Line of Best Fit is to predict values we don't know yet.

Example: If you have a graph showing temperature and ice cream sales, and you want to know how many ice creams you might sell at \( 25^{\circ}C \):
1. Find \( 25^{\circ}C \) on the bottom (x-axis).
2. Move your finger up to the Line of Best Fit.
3. Move your finger across to the side (y-axis) to read the number of sales.
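
Reading a value off the line is the same as putting the x-value into the line's equation \( y = mx + c \). Here is a minimal sketch, assuming a hypothetical line with gradient 2.3 and intercept -14.6. Both numbers are made up, so read the real ones from your own graph.

```python
# Hypothetical line of best fit: sales = GRADIENT * temperature + INTERCEPT.
# Both numbers are made up for illustration.
GRADIENT = 2.3
INTERCEPT = -14.6

def predict_sales(temperature_c):
    """Go up from the x-axis to the line, then across to the y-axis."""
    return GRADIENT * temperature_c + INTERCEPT

estimate = predict_sales(25)
print(f"Estimated sales at 25 degrees C: about {round(estimate)}")
```

The answer is an estimate, so it makes sense to round it to a whole number of ice creams.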

Caution: Predictions are most reliable for values inside the range of data you already have (this is called interpolation). If you try to predict something far outside your data range (like ice cream sales at the temperature of the sun!), your prediction might be very wrong (this is called extrapolation).
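
A quick way to tell the two apart is to check whether the x-value lies between the smallest and largest x-values in your data. A minimal sketch follows, with made-up temperatures and a hypothetical function name prediction_type.

```python
def prediction_type(x, known_xs):
    """Label a prediction by whether x lies inside the measured x-range."""
    if min(known_xs) <= x <= max(known_xs):
        return "interpolation"
    return "extrapolation"

temperatures_c = [14, 18, 22, 26, 30]  # made-up measured temperatures

print(prediction_type(25, temperatures_c))  # 25 is between 14 and 30
print(prediction_type(45, temperatures_c))  # 45 is beyond the largest value
```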

Common Mistakes to Avoid

Connecting the dots: Never draw a "dot-to-dot" line. The Line of Best Fit must be a single, straight line.
Letting outliers pull the line: An outlier is a dot that is far away from all the others (a bit of a "loner"). When drawing your line, ignore the outliers so they don't pull your line away from the main group.
No ruler: A wobbly Line of Best Fit will lead to wrong predictions! Always use a ruler.
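
In these notes, outliers are spotted by eye. One way to sketch the same idea in code is to measure how far each dot sits above or below the line and flag any that are unusually far away. The line, the data and the cut-off of 10 below are all made-up assumptions for illustration; there is no single fixed rule for what counts as "too far".

```python
# Hypothetical line of best fit and cut-off (all made up for illustration).
GRADIENT = 2.3
INTERCEPT = -14.6
THRESHOLD = 10  # assumed cut-off for "far from the line"

# Made-up (temperature, sales) pairs; one of them does not fit the pattern.
points = [(14, 18), (18, 27), (20, 70), (22, 35), (26, 45), (30, 55)]

def vertical_distance(x, y):
    """How far the dot is above or below the line, measured vertically."""
    return abs(y - (GRADIENT * x + INTERCEPT))

outliers = [(x, y) for x, y in points if vertical_distance(x, y) > THRESHOLD]
print(outliers)
```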

Summary Quick-Check

1. What does positive correlation look like? It goes "uphill" from left to right.
2. When should you draw a Line of Best Fit? Only when there is a clear correlation (positive or negative).
3. What is an outlier? A piece of data that doesn't fit the pattern of the rest.
4. How do you estimate a missing value? Go from the known value on one axis to the Line of Best Fit, then across to the other axis.