Introduction to Continuous Random Variables (CRVs)

Welcome! In your previous maths studies, you've looked at discrete random variables: things you can count, like the number of heads in ten coin flips or the number of students in a class. In this chapter, we move into the world of the continuous.

Think of a Continuous Random Variable (CRV) as something that can take any value within a range. While a digital clock shows discrete minutes, a traditional analogue clock hand moves continuously through every possible fraction of a second. We use CRVs to model things like time, height, or weight. Understanding CRVs is essential because the real world rarely happens in neat, whole-number steps!

1. The Probability Density Function (pdf)

For a discrete variable, we use a table to show probabilities. For a CRV, we use a function called the Probability Density Function, written as \( f(x) \).

Key Concept: Area = Probability
In a CRV, the probability of the variable falling between two values is the area under the curve \( f(x) \) between those two points. Because of this, two very important rules apply:
1. The total area under the entire curve must always equal 1.
\( \int_{-\infty}^{\infty} f(x) \, dx = 1 \)
2. The function \( f(x) \) can never be negative (you can't have negative probability!).
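
To see the two rules in action, here is a minimal sketch that checks them numerically. The pdf \( f(x) = \frac{3x^2}{8} \) on \( [0, 2] \) is a made-up example (it is not from these notes), and `integrate` is a hypothetical helper that approximates the area with the midpoint rule.

```python
def integrate(f, lower, upper, steps=100_000):
    """Approximate the integral of f over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return sum(f(lower + (i + 0.5) * width) for i in range(steps)) * width


def pdf(x):
    """Assumed example pdf: f(x) = 3x^2/8 on [0, 2], and 0 elsewhere."""
    return 3 * x**2 / 8 if 0 <= x <= 2 else 0.0


# Rule 1: the total area is 1. The pdf is 0 outside [0, 2],
# so integrating over [0, 2] covers the whole curve.
total_area = integrate(pdf, 0, 2)

# Rule 2: the pdf is never negative (checked on a grid from -1 to 3).
never_negative = all(pdf(-1 + 4 * i / 1000) >= 0 for i in range(1001))

print(round(total_area, 6), never_negative)
```

A numerical check like this only suggests that a function is a valid pdf. In an exam you would show the same thing by integrating exactly.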

Did you know?
For a CRV, the probability of the variable taking exactly one specific value is always zero, for example \( P(X = 2) = 0 \). This is because a single point has no width, and therefore no area. We only talk about probabilities over an interval, like \( P(1 < X < 3) \), and it makes no difference whether the endpoints are included: \( P(1 < X < 3) = P(1 \le X \le 3) \).

Quick Review:
- Discrete: Sum of probabilities \( \sum P(X=x) = 1 \)
- Continuous: Integral of pdf \( \int f(x) \, dx = 1 \)

2. Finding Probabilities

To find the probability that \( X \) lies between \( a \) and \( b \), you simply integrate the pdf between those limits:

\( P(a \le X \le b) = \int_{a}^{b} f(x) \, dx \)

Don't worry if this seems tricky at first! Remember that "finding the probability" is a fancy way of saying "find the area under the graph." If the region under the pdf is a simple shape like a rectangle or triangle, you can even use basic geometry instead of integration!
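
The sketch below compares the two routes (integration and geometry) on one example. The triangular pdf \( f(x) = \frac{x}{2} \) on \( [0, 2] \) is an assumed example, and `integrate` is a hypothetical midpoint-rule helper rather than anything defined in these notes.

```python
def integrate(f, lower, upper, steps=10_000):
    """Approximate the integral of f over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return sum(f(lower + (i + 0.5) * width) for i in range(steps)) * width


def pdf(x):
    """Assumed example pdf: f(x) = x/2 on [0, 2], and 0 elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0


# P(0 <= X <= 1) as the area under the pdf between 0 and 1.
prob_by_integration = integrate(pdf, 0, 1)

# The same region is a triangle with base 1 and height f(1) = 0.5.
prob_by_geometry = 0.5 * 1 * pdf(1)

print(prob_by_integration, prob_by_geometry)
```

Both routes should give a probability of about 0.25, since they measure the same area.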

3. Cumulative Distribution Function (cdf)

The Cumulative Distribution Function, written as \( F(x) \), represents the "running total" of the probability. It tells you the probability that the variable is less than or equal to a certain value \( x \).

The Relationship:
- To get \( F(x) \) from \( f(x) \): Integrate.
\( F(x) = P(X \le x) = \int_{-\infty}^{x} f(t) \, dt \)
- To get \( f(x) \) from \( F(x) \): Differentiate.
\( f(x) = \frac{d}{dx} F(x) \)

Common Mistake to Avoid:
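When integrating to find \( F(x) \), there are two valid approaches. Either integrate without limits and include the constant of integration (\( + C \)), then find \( C \) using the fact that \( F(x) = 0 \) at the lower bound; or use a dummy variable like \( t \) and integrate \( f(t) \) from the lower bound to \( x \). Whichever you choose, check your answer: \( F(x) \) must always equal 0 at the very start of the range and 1 at the very end.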
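
As a worked illustration of the dummy-variable approach, take the assumed example pdf \( f(x) = \frac{3x^2}{8} \) for \( 0 \le x \le 2 \) (a made-up function, not one from these notes). The lower bound of the range is 0, so the integral starts there:

```latex
% Assumed example pdf: f(x) = 3x^2/8 for 0 <= x <= 2, and 0 otherwise.
F(x) = \int_{0}^{x} \frac{3t^2}{8} \, dt
     = \left[ \frac{t^3}{8} \right]_{0}^{x}
     = \frac{x^3}{8}, \qquad 0 \le x \le 2

% Full cdf, including the regions outside the range:
F(x) =
\begin{cases}
0 & x < 0 \\
\dfrac{x^3}{8} & 0 \le x \le 2 \\
1 & x > 2
\end{cases}
```

Checking the ends: \( F(0) = 0 \) and \( F(2) = \frac{8}{8} = 1 \), as required. Differentiating \( \frac{x^3}{8} \) gives back \( \frac{3x^2}{8} \), which confirms the link between the cdf and the pdf.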

4. Mean, Variance, and Standard Deviation

Just like with discrete variables, we want to know the average (mean) and the spread (variance) of our CRV.

The Expected Value (Mean):
\( E(X) = \mu = \int x f(x) \, dx \)
Analogy: Think of this as the "balance point" of the area under the curve.

The Variance:
To find the variance, first find \( E(X^2) \):
\( E(X^2) = \int x^2 f(x) \, dx \)
Then use the familiar formula:
\( Var(X) = E(X^2) - [E(X)]^2 \)

Standard Deviation:
This is simply the square root of the variance: \( \sigma = \sqrt{Var(X)} \).

Key Takeaway: Always find \( E(X) \) first, then \( E(X^2) \), then the variance. It's a three-step process!
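
Here is a rough numerical sketch of the three-step process. It assumes the example pdf \( f(x) = \frac{3x^2}{8} \) on \( [0, 2] \), which is made up for illustration, and a hypothetical midpoint-rule helper called `integrate`.

```python
import math


def integrate(f, lower, upper, steps=100_000):
    """Approximate the integral of f over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return sum(f(lower + (i + 0.5) * width) for i in range(steps)) * width


def pdf(x):
    """Assumed example pdf: f(x) = 3x^2/8 on [0, 2], and 0 elsewhere."""
    return 3 * x**2 / 8 if 0 <= x <= 2 else 0.0


# Step 1: E(X) is the integral of x * f(x).
mean = integrate(lambda x: x * pdf(x), 0, 2)

# Step 2: E(X^2) is the integral of x^2 * f(x).
mean_of_square = integrate(lambda x: x**2 * pdf(x), 0, 2)

# Step 3: Var(X) = E(X^2) - [E(X)]^2, then take the square root.
variance = mean_of_square - mean**2
std_dev = math.sqrt(variance)

print(round(mean, 4), round(variance, 4), round(std_dev, 4))
```

Integrating by hand gives \( E(X) = 1.5 \), \( E(X^2) = 2.4 \) and \( Var(X) = 0.15 \) for this pdf, so the numerical values should agree closely with those.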

5. Median and Quartiles

The median (\( m \)) is the value where exactly half the area lies to the left and half to the right. To find it, solve for \( m \) in this equation:
\( F(m) = 0.5 \) or \( \int_{-\infty}^{m} f(x) \, dx = 0.5 \)

Similarly, for quartiles:
- Lower Quartile (\( Q_1 \)): Solve \( F(Q_1) = 0.25 \)
- Upper Quartile (\( Q_3 \)): Solve \( F(Q_3) = 0.75 \)

Step-by-Step for Median:
1. Find the expression for the cdf, \( F(x) \).
2. Set that expression equal to 0.5.
3. Solve for \( x \), keeping only the solution that lies inside the range of \( X \). That's your median, \( m \)!
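
When the equation \( F(x) = 0.5 \) is awkward to solve by hand, the same steps can be sketched numerically. The example below assumes the cdf \( F(x) = \frac{x^3}{8} \) on \( [0, 2] \) (a made-up example) and uses bisection; `quantile` is a hypothetical helper name, and the method relies on \( F \) being increasing across the range.

```python
def cdf(x):
    """Assumed example cdf: F(x) = x^3/8 on [0, 2] (from the pdf f(x) = 3x^2/8)."""
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return x**3 / 8


def quantile(p, lower=0.0, upper=2.0, tolerance=1e-10):
    """Solve F(x) = p by bisection on [lower, upper]."""
    while upper - lower > tolerance:
        middle = (lower + upper) / 2
        if cdf(middle) < p:
            lower = middle  # the solution lies to the right of middle
        else:
            upper = middle  # the solution lies at or to the left of middle
    return (lower + upper) / 2


median = quantile(0.5)           # solves F(m) = 0.5
lower_quartile = quantile(0.25)  # solves F(Q1) = 0.25
upper_quartile = quantile(0.75)  # solves F(Q3) = 0.75

print(round(lower_quartile, 4), round(median, 4), round(upper_quartile, 4))
```

For this cdf the exact answers are cube roots: solving \( \frac{m^3}{8} = 0.5 \) gives \( m = \sqrt[3]{4} \), and the quartiles are \( \sqrt[3]{2} \) and \( \sqrt[3]{6} \).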

6. Linear Functions and Expectation of \( g(X) \)

Sometimes we don't just want the mean of \( X \), but the mean of a function of \( X \), like \( 5X^3 \) or \( 6X^{-1} \).

Expectation of a function \( g(X) \):
\( E(g(X)) = \int g(x) f(x) \, dx \)

Linear Transformations:
If you have a linear transformation \( aX + b \):
- Mean: \( E(aX + b) = aE(X) + b \)
- Variance: \( Var(aX + b) = a^2 Var(X) \)
Memory Trick: Adding a constant (\( b \)) shifts the whole graph but doesn't change how spread out it is, so \( b \) disappears in the variance formula!
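
A short worked example may help here. It assumes the made-up pdf \( f(x) = \frac{3x^2}{8} \) for \( 0 \le x \le 2 \), and the transformation \( 2X + 3 \) is chosen purely for illustration.

```latex
% Assumed example pdf: f(x) = 3x^2/8 for 0 <= x <= 2, and 0 otherwise.

% Expectation of a function, g(X) = 6X^{-1}:
E(6X^{-1}) = \int_{0}^{2} \frac{6}{x} \cdot \frac{3x^2}{8} \, dx
           = \int_{0}^{2} \frac{9x}{4} \, dx
           = \left[ \frac{9x^2}{8} \right]_{0}^{2} = 4.5

% Mean and variance of X itself:
E(X)   = \int_{0}^{2} x \cdot \frac{3x^2}{8} \, dx   = \left[ \frac{3x^4}{32} \right]_{0}^{2} = 1.5
E(X^2) = \int_{0}^{2} x^2 \cdot \frac{3x^2}{8} \, dx = \left[ \frac{3x^5}{40} \right]_{0}^{2} = 2.4
Var(X) = 2.4 - 1.5^2 = 0.15

% Linear transformation with a = 2, b = 3:
E(2X + 3)   = 2E(X) + 3 = 2(1.5) + 3 = 6
Var(2X + 3) = 2^2 \, Var(X) = 4(0.15) = 0.6
```

Notice that \( E(6X^{-1}) = 4.5 \) is not the same as \( \frac{6}{E(X)} = 4 \). The shortcut rules only apply to linear transformations; for any other function you need to integrate \( g(x) f(x) \).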

7. The Rectangular (Uniform) Distribution

The simplest CRV is the Rectangular Distribution. This is used when the variable is equally likely to fall anywhere in a range \( [a, b] \), so the pdf has the same height across the whole range.

The Formulas:
- pdf: \( f(x) = \frac{1}{b-a} \) for \( a \le x \le b \) (and 0 otherwise).
- Mean: \( E(X) = \frac{a+b}{2} \) (The exact middle).
- Variance: \( Var(X) = \frac{(b-a)^2}{12} \)

Quick Review:
If a bus is equally likely to arrive at any time between 0 and 10 minutes from now, then \( a=0 \) and \( b=10 \). The height of the pdf is \( \frac{1}{10-0} = 0.1 \). The average wait time is \( \frac{0+10}{2} = 5 \) minutes.
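
The bus example can be sketched in a few lines of code. The function names are hypothetical, and `uniform_cdf` is an addition that is not in the formula list above: it comes from integrating the flat pdf from \( a \) to \( x \), which gives \( \frac{x-a}{b-a} \).

```python
def uniform_pdf(x, a, b):
    """Height of the rectangular pdf: 1/(b - a) inside [a, b], 0 outside."""
    return 1 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a, b):
    """P(X <= x), found by integrating the flat pdf from a to x."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


def uniform_mean(a, b):
    """The exact middle of the range."""
    return (a + b) / 2


def uniform_variance(a, b):
    """Var(X) = (b - a)^2 / 12."""
    return (b - a) ** 2 / 12


a, b = 0, 10  # the bus arrives between 0 and 10 minutes from now
height = uniform_pdf(4, a, b)
mean_wait = uniform_mean(a, b)
variance_wait = uniform_variance(a, b)

# Probability of waiting between 2 and 5 minutes: a rectangle of width 3.
prob_2_to_5 = uniform_cdf(5, a, b) - uniform_cdf(2, a, b)

print(height, mean_wait, round(variance_wait, 4), round(prob_2_to_5, 4))
```

The last probability can also be found by geometry: a rectangle of width 3 and height 0.1 has area 0.3.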

8. Combining Independent Variables

If you have two independent random variables \( X \) and \( Y \) (meaning one does not affect the other), the following rules apply whether they are discrete or continuous:

Sum of Expectations:
\( E(X + Y) = E(X) + E(Y) \)

Sum of Variances:
\( Var(X + Y) = Var(X) + Var(Y) \)

Important Note: This rule for variance only works if the variables are independent. If they are linked, the formula becomes much more complicated (but you don't need to worry about that for this section!). The rule for expectations, by contrast, holds even when the variables are not independent.
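
A quick simulation can make these rules feel more concrete. This is only a sketch under assumptions: \( X \) and \( Y \) are taken to be independent rectangular variables on \( [0, 10] \) and \( [0, 4] \) (made-up ranges), and the sample size and seed are arbitrary. Simulated values will be close to the theoretical ones, not exactly equal.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

n = 100_000
xs = [rng.uniform(0, 10) for _ in range(n)]  # X: rectangular on [0, 10]
ys = [rng.uniform(0, 4) for _ in range(n)]   # Y: rectangular on [0, 4], drawn independently
sums = [x + y for x, y in zip(xs, ys)]

mean_sum = statistics.fmean(sums)
var_x = statistics.pvariance(xs)
var_y = statistics.pvariance(ys)
var_sum = statistics.pvariance(sums)

# Theory: E(X + Y) = 5 + 2 = 7 and Var(X + Y) = 100/12 + 16/12.
print(round(mean_sum, 2), round(var_sum, 2), round(var_x + var_y, 2))
```

The two variance figures printed at the end should be close to each other and to the theoretical value \( \frac{100}{12} + \frac{16}{12} \approx 9.67 \).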

Key Takeaway Summary:
- pdf \( f(x) \): The curve. Area under it is probability. Total area = 1.
- cdf \( F(x) \): The running total. \( F(x) = P(X \le x) \).
- Mean \( E(X) \): Integrate \( x f(x) \).
- Median: Where \( F(x) = 0.5 \).
- Rectangular: The "fair" distribution where the pdf is a flat line.