Welcome to Statistical Distributions!

In this chapter, we move from just looking at data we've already collected to predicting what might happen in the future. Statistical distributions are like mathematical "blueprints" that tell us how likely different outcomes are in a random event. Whether you are predicting how many people will click an ad or how many seeds will sprout in a garden, these tools are your best friend!

Don't worry if the formulas look a bit intimidating at first. We will break them down step-by-step, and you'll soon see that most of the hard work is actually done by your calculator!


1. Discrete Probability Distributions

Before we dive into specific models, we need to understand what a Discrete Random Variable is.

  • Variable: A quantity that can change (usually called \( X \)).
  • Random: The outcome depends on chance.
  • Discrete: It can only take specific, separate values (like 0, 1, 2...). You can't have 1.5 siblings or flip a coin 2.3 times!

Representing the Distribution

We can show a distribution in two main ways: a table or a formula (probability mass function).

The Golden Rule: For any discrete probability distribution, the sum of all individual probabilities must equal 1. \( \sum P(X=x) = 1 \). If it doesn't add up to 1, it's not a valid distribution!

Example (Table): Imagine a three-sided spinner.
\( x \): 1, 2, 3
\( P(X=x) \): 0.2, 0.5, 0.3
Here, \( 0.2 + 0.5 + 0.3 = 1.0 \). It works!
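
If you like checking things on a computer, the Golden Rule can be sketched in a few lines of Python. This is only an illustration: the function name `is_valid_distribution` is made up for these notes, and the extra check that each probability lies between 0 and 1 is a standard requirement added here as an assumption, since the example above only checks the total.

```python
import math

# Probability table for the three-sided spinner from the example above.
spinner = {1: 0.2, 2: 0.5, 3: 0.3}

def is_valid_distribution(table):
    """Return True if every probability is in [0, 1] and they sum to 1.

    Hypothetical helper for illustration only.
    """
    in_range = all(0 <= p <= 1 for p in table.values())
    # Compare with a tolerance, because decimals such as 0.2 are not stored exactly.
    sums_to_one = math.isclose(sum(table.values()), 1.0)
    return in_range and sums_to_one

print(is_valid_distribution(spinner))                   # the spinner table
print(is_valid_distribution({1: 0.2, 2: 0.5, 3: 0.4}))  # a table that totals 1.1
```

The first table passes the check; the second fails because its probabilities total 1.1.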

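Example (Formula): Sometimes the probability is given as a function, like \( P(X=x) = kx \) for \( x = 1, 2, 3 \). To find \( k \), add \( 1k + 2k + 3k \) and set the total equal to 1. This gives \( 6k = 1 \), so \( k = \frac{1}{6} \), and the probabilities are \( \frac{1}{6}, \frac{2}{6}, \frac{3}{6} \).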
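
The same "find \( k \)" idea can be sketched in Python using exact fractions. The variable names here are chosen just for this illustration; the only input is the rule \( P(X=x) = kx \) for \( x = 1, 2, 3 \) from the example above.

```python
from fractions import Fraction

values = [1, 2, 3]

# P(X = x) = k * x, so the probabilities total k * (1 + 2 + 3).
# Setting that total equal to 1 gives k = 1 / (1 + 2 + 3).
k = Fraction(1, sum(values))

# Build the full probability table from the rule.
pmf = {x: k * x for x in values}

print("k =", k)
for x, prob in pmf.items():
    print(f"P(X = {x}) = {prob}")

# The Golden Rule: the probabilities must total exactly 1.
print("total =", sum(pmf.values()))
```

Using `Fraction` avoids rounding, so the total comes out as exactly 1 rather than something like 0.9999.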

Quick Review:

Key Takeaway: Discrete distributions deal with "countable" outcomes, and all probabilities in the "map" must total exactly 1.


2. The Binomial Distribution

The Binomial Distribution is a special type of distribution used when we have a fixed number of trials and each trial has only two possible outcomes: Success or Failure. The random variable \( X \) counts the number of successes.

When can we use it? (The BINS Mnemonic)

To use the Binomial model, four conditions must be met. Remember BINS:

  • B - Binary: There are only two outcomes (Success or Failure).
  • I - Independent: The result of one trial doesn't affect the next.
  • N - Number: There is a fixed number of trials (\( n \)).
  • S - Success: The probability of success (\( p \)) is the same for every trial.

Notation: We write this as \( X \sim B(n, p) \).
Where \( n \) is the number of trials and \( p \) is the probability of success.

Did you know? The word "Binomial" comes from "Bi" (two) and "nom" (name/term)—referring to those two outcomes, Success and Failure!


3. Calculating Binomial Probabilities

There are two ways to find the probability of getting exactly \( x \) successes: using a formula or using your calculator.

The Formula

The probability of getting exactly \( x \) successes in \( n \) trials is:
\( P(X=x) = \binom{n}{x} p^x (1-p)^{n-x} \)

Let's break that down:

  • \( \binom{n}{x} \): This is the number of ways to choose which \( x \) of the \( n \) trials are the successes (use the \( nCr \) button on your calculator).
  • \( p^x \): The probability of success raised to the number of successes you want.
  • \( (1-p)^{n-x} \): The probability of failure raised to the number of failures.
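
The three pieces of the formula can be sketched in Python, one line per piece. The function name `binomial_pmf` and the seed numbers (10 seeds, each sprouting with probability 0.3) are made up for illustration; they are not taken from an exam question.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p), built from the three parts of the formula."""
    ways = comb(n, x)               # nCr: which x of the n trials succeed
    successes = p ** x              # probability of success, x times
    failures = (1 - p) ** (n - x)   # probability of failure, n - x times
    return ways * successes * failures

# Hypothetical example: 10 seeds, each sprouting with probability 0.3.
# Probability that exactly 3 sprout:
print(round(binomial_pmf(3, 10, 0.3), 4))  # 0.2668
```

This should match the value your calculator gives for an exact probability with \( n = 10 \), \( p = 0.3 \), \( x = 3 \).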

Using Your Calculator (The Pro Way)

In your OCR exam, you are expected to use your calculator's statistical functions.

  • Binomial PD (Probability Distribution): Use this to find the probability of an exact value, e.g., \( P(X = 3) \).
  • Binomial CD (Cumulative Distribution): Use this to find the probability of a range of values, specifically "up to and including," e.g., \( P(X \le 3) \).

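Common Mistake to Avoid: If a question asks for "more than 3" (\( P(X > 3) \)), the CD function does not give this directly, because it returns "up to and including." Since \( P(X \le 3) \) and \( P(X > 3) \) together cover every outcome, calculate \( 1 - P(X \le 3) \). Always remember the total is 1!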
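
Here is a rough Python sketch of what PD and CD are doing, and of the "1 minus" trick. The function names `binomial_pd` and `binomial_cd` simply mirror the calculator labels, and \( n = 10 \), \( p = 0.3 \) are assumed values chosen for illustration.

```python
from math import comb

def binomial_pd(x, n, p):
    """P(X = x): probability of exactly x successes."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cd(x, n, p):
    """P(X <= x): add up the exact probabilities for 0, 1, ..., x."""
    return sum(binomial_pd(k, n, p) for k in range(x + 1))

n, p = 10, 0.3  # assumed values, for illustration only

p_at_most_3 = binomial_cd(3, n, p)        # P(X <= 3)
p_more_than_3 = 1 - binomial_cd(3, n, p)  # P(X > 3)  = 1 - P(X <= 3)
p_at_least_3 = 1 - binomial_cd(2, n, p)   # P(X >= 3) = 1 - P(X <= 2)

print(round(p_at_most_3, 4))    # 0.6496
print(round(p_more_than_3, 4))  # 0.3504
print(round(p_at_least_3, 4))   # 0.6172
```

Notice that "more than 3" and "at least 3" use different cut-offs: "at least 3" includes 3 itself, so you subtract \( P(X \le 2) \), not \( P(X \le 3) \).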

Quick Review:

Key Takeaway: Use \( X \sim B(n, p) \) for Success/Failure scenarios. For "exactly," use PD; for "up to," use CD. Always check if you need to do \( 1 - \text{something} \).


4. Modelling and Assumptions

In exam questions, you'll often be asked to criticise a model or state assumptions.

  • Condition: A requirement for the math to work (e.g., "The trials must be independent").
  • Assumption: When you assume a condition is met in a real-life story (e.g., "We assume one student catching a cold doesn't affect the probability of another student catching it").

Example: When sampling items from a very large population, we often assume independence even if the items are not replaced, because the population is so big that removing one item barely changes the probability for the next. The syllabus notes that you can assume the population is large enough for sampling without replacement to be treated this way unless told otherwise.
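
To see how tiny the change really is, here is a sketch comparing the exact probability for sampling without replacement against the binomial model. The exact calculation uses the hypergeometric formula, which goes beyond what these notes cover and is included only as a comparison. The numbers (a population of 10,000 items, 500 of them faulty, sample of 10) are invented for illustration.

```python
from math import comb

def exact_without_replacement(x, n, population, successes_in_population):
    """Exact P(X = x) when n items are drawn WITHOUT replacement
    (hypergeometric formula, shown only for comparison)."""
    failures_in_population = population - successes_in_population
    ways_to_pick_successes = comb(successes_in_population, x)
    ways_to_pick_failures = comb(failures_in_population, n - x)
    return ways_to_pick_successes * ways_to_pick_failures / comb(population, n)

def binomial_pd(x, n, p):
    """P(X = x) under the binomial model, which assumes independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Invented numbers: 10,000 items, 500 faulty, sample of 10.
population, faulty, n = 10_000, 500, 10
p = faulty / population  # 0.05

for x in range(4):
    exact = exact_without_replacement(x, n, population, faulty)
    model = binomial_pd(x, n, p)
    print(f"x = {x}: exact = {exact:.5f}, binomial = {model:.5f}")
```

With a population this large the two columns should agree to about three decimal places, which is why the binomial model is a reasonable choice. Try shrinking the population to 50 (with 5 faulty) to see the assumption start to break down.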

Encouraging Phrase: Context is king! When explaining assumptions, always refer back to the story in the question (e.g., talk about "seeds," "cars," or "votes," not just "\( n \)" and "\( p \)").


Summary Checklist

1. Do all my discrete probabilities add up to 1?
2. Does the situation fit BINS?
3. Have I identified \( n \) (trials) and \( p \) (probability)?
4. Am I using PD for "exactly" or CD for "less than or equal to"?
5. If the question says "at least \( k \)," am I doing \( 1 - P(X \le k - 1) \)? If it says "more than \( k \)," am I doing \( 1 - P(X \le k) \)?

You've reached the end of the Statistical Distributions notes! Keep practicing with your calculator—it's the best way to get comfortable with these concepts.