Welcome to Statistical Distributions!

In this chapter, we are going to explore how we can use mathematical models to predict the likelihood of real-world events. Whether it’s the number of heads you get when flipping a coin or the average height of people in a room, statistical distributions help us make sense of the "randomness" around us.

By the end of these notes, you’ll be able to decide which model to use and how to calculate probabilities like a pro. Don't worry if it seems like a lot of new symbols at first; we'll break everything down step by step!

1. The Basics: Discrete vs. Continuous

Before we dive into specific models, we need to know what kind of data we are dealing with. In Statistics, data usually falls into two camps:

1. Discrete Data: Things you can count. For example, the number of siblings you have or the number of goals scored in a match. You can't have 2.5 siblings!
2. Continuous Data: Things you measure. For example, height, weight, or time. These can take any value within a range (like 172.53 cm).

Quick Review:
- Counting? Use a Discrete distribution (like the Binomial Distribution).
- Measuring? Use a Continuous distribution (like the Normal Distribution).

2. The Binomial Distribution (Discrete)

The Binomial distribution is used when you have a fixed number of trials, each trial ends in either a "Success" or a "Failure," and you want to count the number of successes.

When can we use it? The "BINS" Mnemonic

To use the Binomial model, the situation must meet these four criteria:

B – Binary: There are only two possible outcomes (Success or Failure).
I – Independent: The outcome of one trial doesn't affect any other trial (like flipping a coin).
N – Number: There is a fixed number of trials (\(n\)).
S – Success: The probability of success (\(p\)) stays the same for every trial.

The Notation

We write this as: \(X \sim B(n, p)\)

Example: If you flip a fair coin 10 times, \(n=10\) and \(p=0.5\). We write \(X \sim B(10, 0.5)\).

Calculating Probabilities

To find the probability of getting exactly \(r\) successes, we use the formula:
\(P(X = r) = \binom{n}{r} \times p^r \times (1-p)^{n-r}\)

Step-by-Step Example:
If a fair die is rolled 5 times, what is the probability of getting exactly two 6s?
1. Identify \(n\): There are 5 rolls, so \(n = 5\).
2. Identify \(p\): Probability of a 6 is \(1/6\).
3. Identify \(r\): We want exactly 2 successes, so \(r = 2\).
4. Plug it in: \(P(X=2) = \binom{5}{2} \times (1/6)^2 \times (5/6)^3 = 10 \times \frac{1}{36} \times \frac{125}{216} = \frac{1250}{7776} \approx 0.161\).
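
If you want to check a Binomial calculation like this away from your calculator, here is a minimal Python sketch of the formula. The function name `binomial_pmf` is just an illustrative choice, not something from the syllabus.

```python
from math import comb

def binomial_pmf(n: int, r: int, p: float) -> float:
    """Return P(X = r) for X ~ B(n, p) using the Binomial formula."""
    # comb(n, r) is the "n choose r" coefficient
    return comb(n, r) * p**r * (1 - p)**(n - r)

# The die example: 5 rolls, success = rolling a 6, exactly 2 successes
prob_two_sixes = binomial_pmf(n=5, r=2, p=1/6)
print(round(prob_two_sixes, 4))  # 0.1608

# Adding up P(X = r) for every possible r should give 1
total = sum(binomial_pmf(5, r, 1/6) for r in range(6))
```

The last line adds the probabilities for \(r = 0\) to \(r = 5\), which is a handy sanity check that nothing has been missed.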

Common Mistake: Forgetting that the probabilities must add up to 1. If the probability of success is \(p\), the probability of failure is always \(1 - p\) (often called \(q\)).

Key Takeaway:

The Binomial distribution is for counting successes in a set number of independent tries.

3. The Normal Distribution (Continuous)

The Normal distribution is the famous "Bell Curve." It is used for continuous data that clusters symmetrically around a central average, like heights or birth weights.

Key Features

1. Symmetrical: The left side is a mirror image of the right side.
2. The Mean (\(\mu\)): This is the peak of the curve. The mean, median, and mode are all the same!
3. The Standard Deviation (\(\sigma\)): This tells us how "spread out" the bell is. A big \(\sigma\) means a wide, flat bell. A small \(\sigma\) means a tall, thin bell.
4. Points of Inflection: These occur at exactly one standard deviation away from the mean (\(\mu \pm \sigma\)). This is where the curve changes from curving downwards (like the top of a hill) to curving upwards (levelling out into the tails).
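
For the curious, here is a sketch of why the points of inflection sit at \(\mu \pm \sigma\). It assumes the standard formula for the Normal curve, which these notes do not otherwise use, and you are not expected to reproduce it in the exam.

\[
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}
\]
\[
f'(x) = -\frac{x-\mu}{\sigma^2}\, f(x)
\]
\[
f''(x) = \left(\frac{(x-\mu)^2}{\sigma^4} - \frac{1}{\sigma^2}\right) f(x)
\]

Since \(f(x)\) is never zero, \(f''(x) = 0\) only when \((x-\mu)^2 = \sigma^2\), which gives \(x = \mu \pm \sigma\).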

The Notation

We write this as: \(X \sim N(\mu, \sigma^2)\)

Did you know? Because it's a continuous curve, the area under the curve between two values gives the probability of landing in that range. The total area under the whole curve is always exactly 1.
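
To see the "area equals probability" idea in action, here is a rough Python sketch that adds up thin rectangles under the curve. The heights model (mean 165 cm, standard deviation 7 cm) and the helper name `area_under_pdf` are made up purely for illustration.

```python
from statistics import NormalDist

def area_under_pdf(dist: NormalDist, lower: float, upper: float, steps: int = 10_000) -> float:
    """Approximate the area under the curve using thin rectangles (midpoint rule)."""
    width = (upper - lower) / steps
    return sum(dist.pdf(lower + (i + 0.5) * width) for i in range(steps)) * width

# Assumed model for illustration only: heights ~ N(165, 7^2)
heights = NormalDist(mu=165, sigma=7)

# Area under (almost) the whole curve: 8 standard deviations either side of the mean
total_area = area_under_pdf(heights, 165 - 8 * 7, 165 + 8 * 7)

# Area of the strip between 160 cm and 170 cm, i.e. P(160 < X < 170)
strip_area = area_under_pdf(heights, 160, 170)
```

Here `total_area` comes out extremely close to 1, and `strip_area` agrees with the probability your calculator would give for the same range.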

Using Your Calculator

For AQA Paper 3, you are expected to use your calculator's statistical functions rather than formulas for the Normal distribution.
- Use NormalCD to find the probability between two values (e.g., "What is the probability a student is between 160 cm and 170 cm tall?").
- Use Inverse Normal if you know the probability but want to find the value (e.g., "How tall must a student be to be in the tallest 5%?").
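
If you want to double-check your calculator answers, Python's built-in `statistics.NormalDist` can do the same two jobs. This is only a sketch: the mean of 165 cm and standard deviation of 7 cm are assumed values, not figures from an exam question.

```python
from statistics import NormalDist

# Assumed model for illustration only: heights ~ N(165, 7^2)
heights = NormalDist(mu=165, sigma=7)

# Like NormalCD: P(160 < X < 170) is the difference of two cumulative probabilities
p_between = heights.cdf(170) - heights.cdf(160)

# Like Inverse Normal: the tallest 5% start where 95% of the area lies to the left
cutoff_top_5 = heights.inv_cdf(0.95)

print(f"P(160 < X < 170) = {p_between:.4f}")
print(f"Tallest 5% are above {cutoff_top_5:.1f} cm")
```

Notice that "tallest 5%" means an area of 0.95 to the left of the cutoff. Mixing up 0.05 and 0.95 is an easy slip on the calculator too.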

Analogy: Think of the Normal distribution like a hill. Most people are standing near the top (around the mean). As you move away from the center, fewer and fewer people are found there.

Key Takeaway:

The Normal distribution is defined by its mean and variance. It is perfectly symmetrical.

4. Choosing the Right Model (N3)

In the exam, you'll often have to justify why you chose a specific distribution.

Choose Binomial if:
- You are counting the number of "wins."
- You have a set number of trials.
- Outcomes are independent.

Choose Normal if:
- The data is continuous (measurements).
- The data is symmetrical about the mean.
- Most of the data lies close to the mean, with almost all of it within three standard deviations (no extreme outliers).

5. Linking Distributions

The syllabus asks you to understand the link between these two models.
Under certain conditions, the Binomial distribution starts to look very much like the Normal distribution. Specifically, if the number of trials (\(n\)) is large and the probability (\(p\)) is close to 0.5, the Binomial histogram is approximately bell-shaped. It is never a perfect bell, because the Binomial is still discrete, but the match gets closer as \(n\) grows.
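
To see how close the two shapes get, here is a small Python sketch comparing the Binomial bar heights with a Normal curve. It uses the standard results that a Binomial has mean \(np\) and variance \(np(1-p)\), which are not stated elsewhere in these notes, and the values \(n = 100\), \(p = 0.5\) are chosen just for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_pmf(n: int, r: int, p: float) -> float:
    """Return P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Illustrative values: a large n with p = 0.5
n, p = 100, 0.5

# Normal curve with the same mean and variance as the Binomial
mu = n * p
sigma = sqrt(n * p * (1 - p))
bell = NormalDist(mu, sigma)

# Biggest difference between a Binomial bar and the height of the Normal curve
largest_gap = max(abs(binomial_pmf(n, r, p) - bell.pdf(r)) for r in range(n + 1))
print(f"Largest gap between bar and curve: {largest_gap:.5f}")
```

The gap is small but not zero, which is why the Binomial histogram only approximates a bell shape. Try a smaller \(n\) or a \(p\) far from 0.5 and the gap grows.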

Quick Review Box:
- Binomial: \(P(X=r)\) finds probability of an exact number.
- Normal: \(P(X=x)\) is always 0! We only find the probability of a range (e.g., \(P(X < 10)\)).
- Standard Deviation: Measures spread. Small = consistent, Large = varied.

Summary Checklist

- Can I explain the BINS criteria for Binomial?
- Can I use my calculator to find Normal probabilities?
- Do I know that the area under the Normal curve equals 1?
- Can I identify the mean and standard deviation from \(N(10, 4)\)? (Careful: the second number is \(\sigma^2\), so \(\sigma = 2\)).
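
The last checklist item trips a lot of people up, so here is a tiny Python sketch of it. Note that `statistics.NormalDist` takes the standard deviation, not the variance, so you have to take the square root first, just as you do before typing \(\sigma\) into your calculator.

```python
from math import sqrt
from statistics import NormalDist

# X ~ N(10, 4): the second number is the variance, sigma squared
mean, variance = 10, 4

# Square-root the variance to get the standard deviation
x = NormalDist(mu=mean, sigma=sqrt(variance))

print(x.mean)   # 10.0
print(x.stdev)  # 2.0
```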

Don't worry if this seems tricky at first! The best way to master distributions is to practice using your calculator. Once you get the hang of the menus, the math becomes much easier!