Welcome to Continuous Random Variables!

In your previous studies, you’ve likely worked with Discrete Random Variables: things you can count, like the number of heads in a series of coin tosses or the result of a die roll. In this chapter, we step into the world of Continuous Random Variables (CRVs). These deal with data that can take any value within a range, like the time it takes for a kettle to boil, the height of a tree, or the exact weight of an apple.

Don't worry if this seems a bit abstract at first! Think of it as moving from a staircase (where you can only stand on specific steps) to a smooth ramp (where you can stand at any height). We will use some calculus to find probabilities, but we'll break it down step-by-step.

1. The Probability Density Function (pdf)

For a CRV, we use a function called the Probability Density Function, written as \(f(x)\). This function describes the shape of the distribution.

The Golden Rule: For \(f(x)\) to be a valid pdf, two things must be true:
1. The function can never be negative: \(f(x) \geq 0\) for all \(x\).
2. The total area under the curve must equal 1. Mathematically, this means: \(\int_{-\infty}^{\infty} f(x) dx = 1\).
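We can check the Golden Rule symbolically. Here is a quick sketch using SymPy with a hypothetical pdf, \(f(x) = x/8\) for \(0 \leq x \leq 4\) (and 0 elsewhere), chosen purely for illustration:

```python
import sympy as sp

x = sp.symbols('x')
# Hypothetical pdf: f(x) = x/8 on 0 <= x <= 4, and 0 elsewhere.
f = x / 8

# Rule 1: f(x) >= 0 holds on [0, 4], since x is never negative there.
# Rule 2: the total area under the curve must be exactly 1.
total_area = sp.integrate(f, (x, 0, 4))
print(total_area)  # 1, so f is a valid pdf
```

Because \(f(x) = 0\) outside \([0, 4]\), integrating over the support alone is the same as integrating from \(-\infty\) to \(\infty\).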

Analogy: The Rain Gauge

Imagine a rain gauge with a strange shape. The total amount of water it can hold is exactly 1 liter. The height of the gauge at any point is like the \(f(x)\). To find how much "probability water" is in a specific section, you measure the area of that section.

Quick Review:
- Discrete: Probabilities are values at specific points.
- Continuous: Probabilities are areas under a curve.
- Key Fact: The probability of a CRV being exactly one value is zero! \(P(X = 5) = 0\). We only measure intervals, like \(P(4.9 < X < 5.1)\).

2. Finding Probabilities in an Interval

To find the probability that an observation lies between two values \(a\) and \(b\), we calculate the area under the curve between those points using integration.

\(P(a < X < b) = \int_{a}^{b} f(x) dx\)

Step-by-Step Process:

1. Identify the range \([a, b]\) you are interested in.
2. Set up the integral of the function \(f(x)\) between these limits.
3. Integrate and solve!

Common Mistake: Forgetting the limits. Always check the range where the function is defined. If a function is defined for \(0 < x < 4\) and you are asked for \(P(X > 3)\), your integral should go from 3 to 4, not 3 to infinity!
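The warning about limits can be sketched with SymPy, again using the hypothetical pdf \(f(x) = x/8\) on \(0 < x < 4\):

```python
import sympy as sp

x = sp.symbols('x')
f = x / 8  # hypothetical pdf, defined only on 0 < x < 4

# P(X > 3): the upper limit is 4 (the edge of the support), NOT infinity,
# because f(x) = 0 for all x above 4.
prob = sp.integrate(f, (x, 3, 4))
print(prob)  # 7/16
```

Integrating from 3 to infinity with this definition of `f` would wrongly keep adding area from a formula that no longer applies past \(x = 4\).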

3. Measures of Average: Mean and Median

Just like with discrete data, we want to find the "center" of our CRV.

The Mean (Expected Value)

The Mean, or \(E(X)\), represents the "balance point" of the probability distribution. The formula is:
\(E(X) = \int x f(x) dx\)

(Hint: You are just multiplying your function by \(x\) before you integrate!)
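As a sketch, here is the mean of the same hypothetical pdf \(f(x) = x/8\) on \(0 \leq x \leq 4\), computed exactly as the hint describes:

```python
import sympy as sp

x = sp.symbols('x')
f = x / 8  # hypothetical pdf on 0 <= x <= 4

# Multiply by x, then integrate over the support.
E_X = sp.integrate(x * f, (x, 0, 4))
print(E_X)  # 8/3
```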

The Median and Quartiles

The Median (\(m\)) is the value where 50% of the area lies to the left and 50% to the right. To find it, solve for \(m\) in this equation:
\(\int_{-\infty}^{m} f(x) dx = 0.5\)

Similarly:
- For the Lower Quartile (\(Q_1\)), the area is 0.25.
- For the Upper Quartile (\(Q_3\)), the area is 0.75.

Did you know? The median is the "fair cut" point. If the pdf were a piece of cake, the median is where you would cut it so that two people get exactly the same amount of cake!
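Solving the median equation can be sketched with SymPy. Sticking with the hypothetical pdf \(f(x) = x/8\) on \(0 \leq x \leq 4\):

```python
import sympy as sp

x, m = sp.symbols('x m', positive=True)
f = x / 8  # hypothetical pdf on 0 <= x <= 4

# Area from the lower end of the support up to m must equal 0.5
area = sp.integrate(f, (x, 0, m))  # gives m**2/16
median = sp.solve(sp.Eq(area, sp.Rational(1, 2)), m)[0]
print(median)  # 2*sqrt(2), roughly 2.83
```

Declaring `m` as positive discards the negative root automatically; by hand, you would reject it because it lies outside the support. Swapping 0.5 for 0.25 or 0.75 gives \(Q_1\) and \(Q_3\) the same way.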

4. Measures of Spread: Variance and Standard Deviation

To see how "spread out" our data is, we calculate the Variance and Standard Deviation.

The Variance Formula

\(Var(X) = E(X^2) - [E(X)]^2\)

To find \(E(X^2)\), we use a similar integral to the mean, but with \(x^2\):
\(E(X^2) = \int x^2 f(x) dx\)

Memory Aid: "The Mean of the Squares minus the Square of the Mean." (MS - SM)

Standard Deviation

The Standard Deviation is simply the square root of the variance: \(\sigma = \sqrt{Var(X)}\).

Key Takeaway:
1. Find \(E(X)\) using \(\int x f(x) dx\).
2. Find \(E(X^2)\) using \(\int x^2 f(x) dx\).
3. Plug them into the Variance formula.
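The three steps above can be sketched in SymPy for the hypothetical pdf \(f(x) = x/8\) on \(0 \leq x \leq 4\):

```python
import sympy as sp

x = sp.symbols('x')
f = x / 8  # hypothetical pdf on 0 <= x <= 4

E_X  = sp.integrate(x * f, (x, 0, 4))       # Step 1: E(X) = 8/3
E_X2 = sp.integrate(x**2 * f, (x, 0, 4))    # Step 2: E(X^2) = 8
var  = E_X2 - E_X**2                        # Step 3: 8 - 64/9 = 8/9
sigma = sp.sqrt(var)                        # standard deviation
print(var, sigma)  # 8/9  2*sqrt(2)/3
```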

5. Expectation of Functions: \(E(g(X))\)

Sometimes, we aren't just interested in \(X\), but a function of \(X\), such as \(X^3\) or \(1/X\). The syllabus requires you to handle functions like \(5X^3\), \(18X^{-3}\), or \(6X^{-1}\).

The rule is simple: replace the \(x\) in the mean formula with your new function \(g(x)\):
\(E(g(X)) = \int g(x) f(x) dx\)

Example: To find \(E(18X^{-3})\), you would calculate \(\int (18x^{-3}) f(x) dx\).
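A sketch of that example, assuming the hypothetical pdf \(f(x) = x/8\) on \(0 < x < 4\) and taking \(g(X) = 6X^{-1}\) (one of the syllabus functions, and one where the algebra simplifies nicely):

```python
import sympy as sp

x = sp.symbols('x', positive=True)
f = x / 8  # hypothetical pdf on 0 < x < 4

# E(6X^{-1}): replace x with 6/x in the mean formula, keep f(x) as-is.
E_g = sp.integrate((6 / x) * f, (x, 0, 4))
print(E_g)  # 3
```

Notice the \(x\) in \(6/x\) cancels with the \(x\) in \(f(x)\), leaving a constant integrand; spotting cancellations like this saves time by hand.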

6. Linear Transformations

What if we change the scale of our data? For example, \(X\) might be a temperature in Celsius that we convert to a new scale of the form \(aX + b\). We use these handy "short-cut" rules:

For Expectation: \(E(aX + b) = aE(X) + b\)
(The mean shifts and scales exactly as you'd expect.)

For Variance: \(Var(aX + b) = a^2 Var(X)\)
(Adding \(b\) doesn't change the spread, and scaling by \(a\) increases the variance by \(a^2\) because variance is "squared" units.)

Quick Review Box:
If \(E(X) = 10\) and \(Var(X) = 4\):
- \(E(2X + 5) = 2(10) + 5 = 25\)
- \(Var(2X + 5) = 2^2 \times 4 = 16\)
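We can sketch a check that the short-cut rules agree with direct integration, using the hypothetical pdf \(f(x) = x/8\) on \(0 \leq x \leq 4\) and the transformation \(Y = 2X + 5\):

```python
import sympy as sp

x = sp.symbols('x')
f = x / 8  # hypothetical pdf on 0 <= x <= 4
E_X   = sp.integrate(x * f, (x, 0, 4))                 # 8/3
var_X = sp.integrate(x**2 * f, (x, 0, 4)) - E_X**2     # 8/9

# The long way: integrate with Y = 2X + 5 substituted in directly.
E_Y   = sp.integrate((2*x + 5) * f, (x, 0, 4))
var_Y = sp.integrate((2*x + 5)**2 * f, (x, 0, 4)) - E_Y**2

# The short-cut rules give identical answers.
assert E_Y == 2 * E_X + 5
assert var_Y == 2**2 * var_X
print(E_Y, var_Y)  # 31/3  32/9
```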

7. Independent Random Variables

If you have two random variables, \(X\) and \(Y\), that are independent (meaning the outcome of one doesn't affect the other), you can combine their means and variances easily.

Sum of Expectations: \(E(X + Y) = E(X) + E(Y)\)
Sum of Variances: \(Var(X + Y) = Var(X) + Var(Y)\)

Important Note: This variance rule only works if the variables are independent! Also, even if you are subtracting, you still add the variances: \(Var(X - Y) = Var(X) + Var(Y)\). The uncertainty (spread) always increases when you combine two variables, whether you add them or subtract them.
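The "add the variances even when subtracting" rule can be sketched with a hypothetical example: take \(X\) and \(Y\) independent and both uniform on \([0, 1]\), so each has variance \(1/12\) and their joint pdf is \(1\) on the unit square:

```python
import sympy as sp

x, y = sp.symbols('x y')
# X, Y independent and uniform on [0, 1]: joint pdf = 1 * 1 = 1.
joint = sp.Integer(1)

E_sum    = sp.integrate((x + y) * joint, (x, 0, 1), (y, 0, 1))      # 1
E_sum2   = sp.integrate((x + y)**2 * joint, (x, 0, 1), (y, 0, 1))   # 7/6
var_sum  = E_sum2 - E_sum**2                                        # 1/6

E_diff   = sp.integrate((x - y) * joint, (x, 0, 1), (y, 0, 1))      # 0
E_diff2  = sp.integrate((x - y)**2 * joint, (x, 0, 1), (y, 0, 1))   # 1/6
var_diff = E_diff2 - E_diff**2                                      # 1/6

# Each uniform has variance 1/12; both the sum AND the difference
# come out to 1/12 + 1/12 = 1/6.
print(var_sum, var_diff)
```

Note that \(E(X - Y) = 0\) while \(Var(X - Y)\) is just as large as \(Var(X + Y)\): the means subtract, but the spreads still add.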

Summary Table: Key Formulas

Total Probability: \(\int f(x) dx = 1\)
Mean \(E(X)\): \(\int x f(x) dx\)
\(E(X^2)\): \(\int x^2 f(x) dx\)
Variance \(Var(X)\): \(E(X^2) - [E(X)]^2\)
Median \(m\): Solve \(\int_{-\infty}^{m} f(x) dx = 0.5\)

Final Tip: Always sketch the function if you can! It helps you visualize the area and check if your calculated mean or median looks "sensible" on the graph.