Welcome to the World of Continuous Random Variables!

In your A Level Mathematics journey, you've already met Discrete Random Variables: things you can count, like the number of heads in a series of coin tosses. Now, we are stepping into the "smooth" side of statistics: Continuous Random Variables (CRVs).

Think of the difference between walking up a flight of stairs (discrete steps) and sliding down a ramp (a continuous path). CRVs are used for things we measure rather than count, such as the exact time you wait for a bus, the height of a tree, or the weight of an apple. Don't worry if the calculus involved feels a bit heavy at first; we will break it down step-by-step!

1. The Probability Density Function (PDF)

A Probability Density Function, written as \(f(x)\), is a formula that describes the shape of a continuous distribution. Unlike discrete variables, the probability of a CRV being exactly one value (like exactly 1.50000... cm) is technically zero. Instead, we look at the probability of a value falling within a range.

Key Properties of a PDF:

  • The function is never negative: \(f(x) \ge 0\) for all \(x\).
  • The Golden Rule: The total area under the curve must equal 1. Mathematically: \(\int_{-\infty}^{\infty} f(x) dx = 1\).
  • Probability is Area: The probability that \(X\) is between \(a\) and \(b\) is the area under the curve between those two points: \(P(a \le X \le b) = \int_{a}^{b} f(x) dx\).
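
To see these properties in action, here is a minimal Python sketch. The density \(f(x) = 3x^2\) on \(0 \le x \le 1\) and the helper `integrate` (a simple midpoint rule) are assumptions chosen purely for illustration; in an exam you would integrate by hand.

```python
def f(x):
    """Illustrative PDF (an assumption): f(x) = 3x^2 on [0, 1], 0 elsewhere."""
    return 3 * x**2 if 0 <= x <= 1 else 0.0


def integrate(func, a, b, n=10_000):
    """Approximate the integral of func from a to b with the midpoint rule."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width


total_area = integrate(f, 0, 1)   # the Golden Rule: should be very close to 1
prob = integrate(f, 0.2, 0.5)     # P(0.2 <= X <= 0.5); by hand: 0.5^3 - 0.2^3 = 0.117

print(round(total_area, 4), round(prob, 4))
```

Notice that \(f(1) = 3\), which is bigger than 1. That is allowed, because only the area has to behave like a probability.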

Analogy: Imagine a PDF is like a pile of sand spread out along a line. The total amount of sand is "1 unit." To find the probability of a range, you are just measuring how much sand sits on that specific part of the line.

Common Mistake to Avoid: Thinking that \(f(x)\) is the probability. It isn't! The area is the probability, and \(f(x)\) itself is allowed to be greater than 1 as long as the total area is still 1. If you calculate an area (a probability) and get a number bigger than 1 or a negative number, double-check your integration!

Quick Review:
1. \(f(x) \ge 0\)
2. Total Area = 1
3. \(P(a < X < b) = P(a \le X \le b)\) (Boundaries don't add extra probability in CRVs!)

2. The Cumulative Distribution Function (CDF)

The Cumulative Distribution Function, written as \(F(x)\), tells us the probability that the variable is less than or equal to a certain value.

\(F(x) = P(X \le x) = \int_{-\infty}^{x} f(t) dt\)

The Relationship between PDF and CDF:

This is a "Two-Way Street":

  • To get from PDF to CDF: Integrate.
  • To get from CDF to PDF: Differentiate. \(f(x) = \frac{d}{dx}F(x)\).
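
The two-way street can be checked numerically. This sketch assumes the illustrative density \(f(x) = 3x^2\) on \(0 \le x \le 1\), whose CDF (found by integrating by hand) is \(F(x) = x^3\) on that range. The helper `derivative` is a hypothetical name for a simple central-difference estimate of the gradient.

```python
def f(x):
    """Illustrative PDF (an assumption): f(x) = 3x^2 on [0, 1], 0 elsewhere."""
    return 3 * x**2 if 0 <= x <= 1 else 0.0


def F(x):
    """Matching CDF, found by integrating f: 0 below 0, x^3 on [0, 1], 1 above 1."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x**3


def derivative(func, x, h=1e-5):
    """Estimate the gradient of func at x with a central difference."""
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 0.6
slope = derivative(F, x0)   # gradient of the CDF at x0
density = f(x0)             # PDF at x0; the two should agree (about 1.08)
prob = F(0.5) - F(0.2)      # P(0.2 <= X <= 0.5) straight from the CDF

print(round(slope, 4), round(density, 4), round(prob, 4))
```

Once the CDF is known, a probability over a range is just a subtraction, with no fresh integration needed.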

Did you know? The CDF always starts at 0 (on the far left) and finishes at 1 (on the far right) because it's accumulating all the probability as it goes along.

Key Takeaway: The CDF is your best friend for finding probabilities quickly without re-integrating every time, because \(P(a \le X \le b) = F(b) - F(a)\).

3. Expectation and Variance

Just like with discrete variables, we want to know the "average" (Mean) and the "spread" (Variance) of our data. Because we are dealing with continuous curves, we use integration instead of summation.

The Formulas:

  • Mean (Expectation): \(E(X) = \mu = \int_{-\infty}^{\infty} x f(x) dx\)
  • Variance: \(Var(X) = \sigma^2 = E(X^2) - [E(X)]^2\), where \(E(X^2) = \int_{-\infty}^{\infty} x^2 f(x) dx\)
  • Expectation of a function \(g(X)\): \(E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x) dx\)

Memory Aid: For the mean, you "multiply by \(x\) then integrate." For \(E(X^2)\), you "multiply by \(x^2\) then integrate." Always remember to subtract the mean squared at the end of your variance calculation!
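
Here is a short Python sketch of the "multiply then integrate" recipe. It assumes the illustrative density \(f(x) = 3x^2\) on \(0 \le x \le 1\), and `integrate` is a hypothetical midpoint-rule helper standing in for integration by hand.

```python
def f(x):
    """Illustrative PDF (an assumption): f(x) = 3x^2 on [0, 1], 0 elsewhere."""
    return 3 * x**2 if 0 <= x <= 1 else 0.0


def integrate(func, a, b, n=10_000):
    """Approximate the integral of func from a to b with the midpoint rule."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width


mean = integrate(lambda x: x * f(x), 0, 1)         # E(X): by hand, 3/4
mean_sq = integrate(lambda x: x**2 * f(x), 0, 1)   # E(X^2): by hand, 3/5
variance = mean_sq - mean**2                       # 3/5 - (3/4)^2 = 3/80

print(round(mean, 4), round(mean_sq, 4), round(variance, 4))
```

The last line of the calculation is the step that is easiest to forget: \(E(X^2)\) on its own is not the variance.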

4. Median, Quartiles, and Percentiles

Sometimes we want to find the "middle" value or the "top 10%."

  • The Median (\(m\)) is the value where half the area is to the left and half is to the right. Solve: \(F(m) = 0.5\).
  • The Lower Quartile (\(Q_1\)) is the value with 25% of the probability (area) below it. Solve: \(F(Q_1) = 0.25\).
  • The Upper Quartile (\(Q_3\)) is the value with 75% of the probability (area) below it. Solve: \(F(Q_3) = 0.75\).

Step-by-Step Process:
1. Find the CDF, \(F(x)\), by integrating your PDF.
2. Set \(F(x) = \text{target probability}\) (e.g., 0.5 for median).
3. Solve for \(x\). Ensure your answer falls within the valid range for that function!
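
When the equation in step 3 is awkward to solve algebraically, a numerical search gives the same answer. This sketch assumes the illustrative CDF \(F(x) = x^3\) on \(0 \le x \le 1\); `quantile` is a hypothetical helper that uses bisection, which works because a CDF never decreases.

```python
def F(x):
    """Illustrative CDF (an assumption): x^3 on [0, 1], 0 below, 1 above."""
    clipped = min(max(x, 0.0), 1.0)
    return clipped**3


def quantile(cdf, p, lo, hi, tol=1e-10):
    """Solve cdf(x) = p for x in [lo, hi] by repeatedly halving the interval."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid    # target lies to the right of mid
        else:
            hi = mid    # target lies at or to the left of mid
    return (lo + hi) / 2


median = quantile(F, 0.5, 0, 1)   # by hand: m^3 = 0.5, so m = 0.5 ** (1/3)
q1 = quantile(F, 0.25, 0, 1)      # lower quartile
q3 = quantile(F, 0.75, 0, 1)      # upper quartile

print(round(q1, 4), round(median, 4), round(q3, 4))
```

All three answers land inside \([0, 1]\), the valid range for this function, which is exactly the check that step 3 asks for.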

5. Special Distributions

The OCR syllabus highlights two specific distributions you need to master besides the Normal distribution.

A. Continuous Uniform Distribution

This is the "fair" distribution where \(X\) is equally likely to land anywhere in a range \([a, b]\): intervals of equal width inside the range have equal probability. The PDF looks like a rectangle.

  • PDF: \(f(x) = \frac{1}{b-a}\) for \(a \le x \le b\), and \(f(x) = 0\) otherwise.
  • Mean: \(E(X) = \frac{a+b}{2}\) (Right in the middle!).
  • Variance: \(Var(X) = \frac{(b-a)^2}{12}\).
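
These results come straight from the integration formulas for \(E(X)\) and \(E(X^2)\). A short derivation, offered as a check on the formulas rather than something to memorise:

\(E(X) = \int_{a}^{b} \frac{x}{b-a} dx = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2}\)

\(E(X^2) = \int_{a}^{b} \frac{x^2}{b-a} dx = \frac{b^3 - a^3}{3(b-a)} = \frac{a^2 + ab + b^2}{3}\)

\(Var(X) = \frac{a^2 + ab + b^2}{3} - \frac{(a+b)^2}{4} = \frac{a^2 - 2ab + b^2}{12} = \frac{(b-a)^2}{12}\)

For instance, with the illustrative values \(a = 2\) and \(b = 8\): \(E(X) = 5\) and \(Var(X) = \frac{36}{12} = 3\).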

B. Exponential Distribution

This is often used to model the time between events (like the time between radioactive decays or customers arriving in a shop).

  • PDF: \(f(x) = \lambda e^{-\lambda x}\) for \(x \ge 0\) (and \(f(x) = 0\) for \(x < 0\)), where \(\lambda > 0\) is the rate.
  • Mean: \(E(X) = \frac{1}{\lambda}\).
  • Variance: \(Var(X) = \frac{1}{\lambda^2}\).

Interesting Connection: The Exponential distribution is closely linked to the Poisson distribution. If events occur according to a Poisson process with rate \(\lambda\), then the time between those events follows an Exponential distribution with the same \(\lambda\).
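
Integrating the exponential PDF gives the CDF \(F(x) = 1 - e^{-\lambda x}\) for \(x \ge 0\), which makes waiting-time probabilities quick to compute. The sketch below assumes an illustrative rate of \(\lambda = 2\) events per unit time, and uses Python's standard `random.expovariate` to simulate waiting times as a rough check on the mean.

```python
import math
import random

lam = 2.0  # illustrative rate (an assumption): 2 events per unit time


def exp_cdf(x, lam):
    """CDF of the exponential distribution: 1 - e^(-lam * x) for x >= 0."""
    return 1 - math.exp(-lam * x) if x >= 0 else 0.0


p_within_half = exp_cdf(0.5, lam)           # P(X <= 0.5) = 1 - e^(-1)
p_longer_than_one = 1 - exp_cdf(1.0, lam)   # P(X > 1) = e^(-2)

# Simulate many waiting times; their average should be near 1/lam = 0.5.
random.seed(1)
samples = [random.expovariate(lam) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)

print(round(p_within_half, 4), round(p_longer_than_one, 4))
```

The simulated mean will not be exactly \(\frac{1}{\lambda}\), since it depends on the random sample, but it should sit close to 0.5.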

6. Functions of a Random Variable

Sometimes you know the distribution of \(X\), but you want to find the distribution of a related variable, like \(Y = X^3\) or \(Y = 2X + 5\).

How to solve these:

  1. Start with the CDF of \(Y\): \(F_Y(y) = P(Y \le y)\).
  2. Substitute the relationship: \(P(g(X) \le y)\).
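  3. Rearrange to get \(X\) on its own. If \(g\) is an increasing function, this gives \(P(X \le g^{-1}(y))\). If \(g\) is decreasing, the inequality flips: \(P(X \ge g^{-1}(y)) = 1 - F_X(g^{-1}(y))\).
  4. Either way, the answer is now written in terms of the CDF of \(X\) evaluated at a certain point!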
  5. Once you have the new CDF (\(F_Y(y)\)), differentiate it to find the new PDF (\(f_Y(y)\)).

Example: If \(Y = X^3\), then \(P(Y \le y) = P(X^3 \le y) = P(X \le y^{1/3}) = F_X(y^{1/3})\).
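
To carry this example through to the PDF, suppose (purely as an illustration, not something given above) that \(X\) is uniform on \([0, 2]\), so \(F_X(x) = \frac{x}{2}\) for \(0 \le x \le 2\). Since \(X\) runs from 0 to 2, \(Y = X^3\) runs from 0 to 8:

\(F_Y(y) = F_X(y^{1/3}) = \frac{y^{1/3}}{2}\) for \(0 \le y \le 8\)

\(f_Y(y) = \frac{d}{dy}F_Y(y) = \frac{1}{6} y^{-2/3}\) for \(0 < y \le 8\)

Check with the Golden Rule: \(\int_{0}^{8} \frac{1}{6} y^{-2/3} dy = \left[\frac{1}{2} y^{1/3}\right]_{0}^{8} = 1\).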

Key Takeaway: Always start with the CDF when transforming variables. It makes the logic much safer than trying to jump straight to the PDF.

Final Quick Review Box

The CRV Essentials:
- Integrate PDF to get Probabilities or CDF.
- Differentiate CDF to get PDF.
- Total Area must be 1.
- Mean is the "Average" (\(\int x f(x) dx\)).
- Median is the 0.5 point on the CDF.

You've got this! Continuous variables might seem abstract, but they are just the mathematical way of describing the infinite variety of the real world. Practice your integration, and the statistics will follow!