Welcome to Two-Variable Data!

Ever wondered if there is a pattern between how much time you spend on social media and your test scores? Or if taller people tend to have bigger shoe sizes? In this chapter, we explore how two different "variables" (things we can measure) relate to each other. By the end of these notes, you'll be able to read scatterplots like a pro and use mathematical models to make predictions! Don't worry if math usually feels like a different language: we're going to break it down piece by piece.

1. What is a Scatterplot?

A scatterplot is just a graph that uses dots to show the relationship between two sets of data. One variable is on the x-axis (horizontal) and the other is on the y-axis (vertical).

Analogy: Think of a scatterplot as a "map" of data points. Each dot represents one person or one event. For example, if we are looking at height and weight, one dot represents one person's specific height and their specific weight.

Types of Associations

When we look at the "cloud" of dots, we are looking for a pattern, also called an association or correlation:

Positive Association: The dots generally move up from left to right. This means as one thing increases, the other tends to increase too. (Example: The more you study, the higher your grade tends to be.)

Negative Association: The dots generally move down from left to right. This means as one thing increases, the other decreases. (Example: The more miles you drive a car, the less gas is in the tank.)

No Association: The dots are scattered everywhere like spilled glitter. There is no clear pattern. (Example: Your shoe size and your favorite color.)

Quick Review:
Positive: \(x \uparrow, y \uparrow\)
Negative: \(x \uparrow, y \downarrow\)
No association: No pattern at all!
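
For readers who like to check a pattern with numbers, here is a minimal Python sketch that labels the direction of an association. It uses Pearson's correlation coefficient, a standard statistic that these notes do not otherwise cover, and the 0.3 cutoff for "no association" is an arbitrary assumption chosen only for illustration. The function name and the sample data are hypothetical.

```python
def association_direction(xs, ys, threshold=0.3):
    """Label the trend as 'positive', 'negative', or 'none'.

    Uses Pearson's correlation coefficient r. The threshold is an
    illustrative assumption, not an official cutoff.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y move together, and how much each varies on its own
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / (sxx * syy) ** 0.5
    if r > threshold:
        return "positive"
    if r < -threshold:
        return "negative"
    return "none"


# Made-up data: hours studied vs. test score
print(association_direction([1, 2, 3, 4, 5], [52, 58, 61, 70, 74]))  # positive
# Made-up data: miles driven vs. gallons of gas left
print(association_direction([0, 50, 100, 150, 200], [12, 10, 8.5, 6, 4]))  # negative
# Made-up data with no overall trend
print(association_direction([1, 2, 3, 4, 5], [3, 1, 3, 1, 3]))  # none
```

On the SAT you will judge the direction by eye, so treat this only as a way to confirm what the picture already shows.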

Key Takeaway: Scatterplots help us see at a glance whether two things are related. If the dots form a roughly "line-like" shape, the relationship is approximately linear!

2. The "Model" (Line of Best Fit)

Since data in the real world is messy, the dots usually don't form a perfect line. To make sense of the mess, we draw a line of best fit (or a trend line). This is a straight line that goes through the middle of the "cloud" of dots.

Interpreting the Equation

On the SAT, you will often see this line written as a linear equation: \( y = mx + b \).

The Slope (\(m\)): This tells you how much the \(y\) value is expected to change for every one-unit increase in \(x\).
Example: If the slope is \(5\), it means for every 1 extra hour you study, your score is predicted to go up by 5 points.

The Y-intercept (\(b\)): This is the predicted value of \(y\) when \(x\) is zero.
Example: If the y-intercept is \(40\), it means if you study for 0 hours, you are predicted to score 40 points.
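
To see the slope and y-intercept at work, here is a small Python sketch of the study-time example above, using \(m = 5\) and \(b = 40\). The function name is hypothetical; only the two numbers come from the examples.

```python
def predicted_score(hours, slope=5, intercept=40):
    """Predicted test score from the model y = mx + b."""
    return slope * hours + intercept


print(predicted_score(0))  # 40 -> the y-intercept: the prediction at 0 hours
print(predicted_score(3))  # 55
print(predicted_score(4))  # 60

# One extra hour of study changes the prediction by exactly the slope
print(predicted_score(4) - predicted_score(3))  # 5
```

Notice that the jump from 3 hours to 4 hours is 5 points, the same as the jump between any other pair of consecutive hours. That is what "rate of change" means.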

Did you know? The line of best fit doesn't have to touch any of the actual dots! It's just a summary of the general trend.

Common Mistake to Avoid: Don't confuse the slope with a "total." The slope is the rate of change. Always look for words like "per," "each," or "every" to identify the slope in a word problem.

Key Takeaway: The line of best fit is a "simplified version" of the data that helps us make predictions.
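
You will never have to compute a line of best fit by hand on the SAT, but it can help to see where one comes from. The sketch below assumes the usual "least squares" method, which these notes do not specify, and uses made-up study data. The function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: hours studied and test scores
hours = [1, 2, 3, 4, 5]
scores = [46, 49, 57, 59, 64]

slope, intercept = fit_line(hours, scores)
# slope is about 4.6 and intercept is about 41.2
print(round(slope, 1), round(intercept, 1))

# Compare each actual score with the value on the line at the same x
for x, y in zip(hours, scores):
    print(x, y, round(slope * x + intercept, 1))
```

In this made-up data set, the line passes close to every dot but does not land exactly on any of them, which is normal for a line of best fit.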

3. Making Predictions: Predicted vs. Actual

One of the most common SAT questions will ask you to compare a predicted value to an actual value.

Actual Value: This is the real-life data point (the dot on the graph).
Predicted Value: This is the value on the line of best fit for a specific \(x\).

Step-by-Step: How to find the difference

1. Find the \(x\) value the question is asking about on the horizontal axis.
2. Move your finger up to the dot to see the Actual value.
3. Move your finger to the line at that same \(x\) to see the Predicted value.
4. The vertical "gap" between the dot and the line is the error (often called a residual): \(\text{residual} = \text{actual} - \text{predicted}\).

If the dot is above the line, the model underestimated the actual value. If the dot is below the line, the model overestimated it.
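
The same comparison can be sketched in Python. This example reuses the model \(y = 5x + 40\) from Section 2; the two data points and the function names are made up for illustration. It uses the standard convention that the residual is the actual value minus the predicted value.

```python
def residual(x, actual_y, slope, intercept):
    """Actual value minus the value the line predicts at the same x."""
    predicted_y = slope * x + intercept
    return actual_y - predicted_y


def describe(res):
    """Translate the sign of a residual into words."""
    if res > 0:
        return "dot above the line: model underestimated"
    if res < 0:
        return "dot below the line: model overestimated"
    return "dot on the line: model was exact"


# A student who studied 2 hours and actually scored 55 (line predicts 50)
r1 = residual(2, 55, slope=5, intercept=40)
print(r1, describe(r1))  # 5 dot above the line: model underestimated

# A student who studied 4 hours and actually scored 52 (line predicts 60)
r2 = residual(4, 52, slope=5, intercept=40)
print(r2, describe(r2))  # -8 dot below the line: model overestimated
```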

Key Takeaway: The "Line" is the math's best guess; the "Dots" are what really happened.

4. Linear vs. Exponential Models

The SAT wants you to know the difference between a relationship that grows at a constant rate and one that grows faster and faster.

Linear Growth

Shape: A straight line.
Rule: You add the same amount every time.
Example: You save \$10 every week. (\(10, 20, 30, 40...\))

Exponential Growth

Shape: A curve that gets steeper and steeper.
Rule: You multiply by the same factor every time (which is the same as changing by the same percentage every time).
Example: A population of bacteria doubles every hour. (\(2, 4, 8, 16...\))

Memory Trick:
Linear = Line (Straight)
Exponential = Explosion (Gets big very fast!)

Key Takeaway: If a problem mentions a "constant rate of change" or a "fixed amount," think Linear. If it mentions "percent increase," "doubling," or "tripling," think Exponential.
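
Here is a short Python sketch of the "adding vs. multiplying" idea, built from the two examples above (saving \$10 a week and bacteria doubling). The function names are hypothetical, and the `classify` helper assumes a list of at least three nonzero numbers.

```python
import math


def linear_sequence(start, step, n):
    """Add the same amount every time."""
    return [start + step * i for i in range(n)]


def exponential_sequence(start, factor, n):
    """Multiply by the same factor every time."""
    return [start * factor ** i for i in range(n)]


def classify(values):
    """Guess the model from a list of at least three nonzero values."""
    pairs = list(zip(values, values[1:]))
    diffs = [b - a for a, b in pairs]
    ratios = [b / a for a, b in pairs]
    if all(math.isclose(d, diffs[0]) for d in diffs):
        return "linear"       # constant difference
    if all(math.isclose(r, ratios[0]) for r in ratios):
        return "exponential"  # constant ratio
    return "neither"


savings = linear_sequence(10, 10, 4)
bacteria = exponential_sequence(2, 2, 4)
print(savings, classify(savings))    # [10, 20, 30, 40] linear
print(bacteria, classify(bacteria))  # [2, 4, 8, 16] exponential
```

The same check works on a table in a test question: subtract neighboring values to look for a constant difference, then divide neighboring values to look for a constant ratio.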

5. Outliers: The Rebels of Data

Sometimes you’ll see a dot that is far away from all the others. This is called an outlier.

Analogy: If you are measuring the height of students in a 5th-grade class and an NBA player walks in, that NBA player's height is an outlier. It doesn't fit the pattern of the rest of the group.

Why it matters: Outliers can "pull" the line of best fit toward them, making the model less accurate for the rest of the data. When identifying a trend, we often look at the overall "cloud" and ignore the single weird points.
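
To see the "pull" in numbers, the sketch below fits a line to a small made-up data set twice: once without an outlier and once with one. It assumes the least-squares method for the line of best fit, which these notes do not specify, and the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data that follows y = 2x exactly
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
slope_without, _ = fit_line(xs, ys)

# Same data plus one point far above the pattern
slope_with, _ = fit_line(xs + [6], ys + [30])

print(round(slope_without, 2))  # 2.0
print(round(slope_with, 2))     # about 4.57
```

A single unusual point more than doubles the slope here, so the line no longer describes the other five points well.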

Quick Summary for the Test:
1. Look at the direction: Up = Positive, Down = Negative.
2. Interpret the slope: It's the "for every one" change.
3. Check the y-intercept: It's the starting value (where \(x = 0\)).
4. Spot the difference: Dots = Actual, Line = Predicted.
5. Linear vs. Exponential: Adding vs. Multiplying.

Don't worry if this seems tricky at first! Scatterplots are all about visual patterns. Once you start "seeing" the lines inside the clouds of dots, you'll be an expert!