Welcome to Continuous Probability Distributions!
In your earlier studies, you worked with discrete random variables: things you can count, like the number of heads in ten coin flips or the score on a die. In this chapter, we step into the world of continuous random variables. These are things we measure, like time, height, or temperature. Because a measurement can take any value within a range (like 1.5 seconds, 1.51 seconds, or 1.5123... seconds), we need a slightly different toolkit to handle them. Don't worry if this seems tricky at first; it's mostly just applying the integration and differentiation skills you already have to the world of statistics!
1. The Probability Density Function (PDF)
For continuous variables, we use a function called the Probability Density Function, usually written as \(f(x)\).
Think of it like this: Imagine a histogram where the bars get thinner and thinner until they form a smooth curve. That curve is your PDF. The most important thing to remember is that for a continuous variable, the area under the curve represents the probability.
Key Properties of \(f(x)\):
1. The function can never be negative: \(f(x) \geq 0\) for all \(x\).
2. The total area under the whole curve must equal 1: \(\int_{-\infty}^{\infty} f(x) dx = 1\).
3. The probability that \(X\) falls between two values \(a\) and \(b\) is the area between them: \(P(a < X \leq b) = \int_{a}^{b} f(x) dx\).
Common Mistake to Avoid: In continuous distributions, the probability of the variable being exactly one value is zero! For example, \(P(X = 5) = 0\). This is because the "area" of a single point is zero. Therefore, \(P(X < 5)\) is exactly the same as \(P(X \leq 5)\).
Quick Review: To find a probability, just integrate the PDF between your two limits!
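Here is a minimal Python sketch of the three properties in action. The PDF \(f(x) = kx^2\) on \([0, 3]\) is a made-up example (it is not from any particular exam question), and the helper name `integrate` is just an illustrative choice. It uses Simpson's rule to approximate the area, so treat the results as numerical checks rather than exact working.

```python
def integrate(f, a, b, n=1000):
    """Approximate the integral of f from a to b using Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Hypothetical example: f(x) = k * x**2 for 0 <= x <= 3, and 0 elsewhere.
# Property 2 (total area = 1) fixes the constant k.
area_without_k = integrate(lambda x: x**2, 0, 3)  # integral of x^2 from 0 to 3 is 9
k = 1 / area_without_k                            # so k = 1/9

def pdf(x):
    """The assumed example PDF."""
    return k * x**2 if 0 <= x <= 3 else 0.0

# Property 3: P(1 < X <= 2) is the area under the PDF between 1 and 2.
p = integrate(pdf, 1, 2)  # the exact answer by hand is 7/27

print("k =", k)
print("P(1 < X <= 2) =", p)
```

By hand, the same working is \(\int_0^3 kx^2 dx = 9k = 1\), so \(k = \frac{1}{9}\), and \(P(1 < X \leq 2) = \left[\frac{x^3}{27}\right]_1^2 = \frac{7}{27}\).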
2. The Cumulative Distribution Function (CDF)
The Cumulative Distribution Function, written as \(F(x)\), tells us the probability that the variable is less than or equal to a certain value.
Analogy: Imagine filling a bucket with water. The PDF tells us how fast the water is flowing at any moment, but the CDF tells us the total amount of water in the bucket at time \(x\).
How to calculate the CDF:
\(F(x_0) = P(X \leq x_0) = \int_{-\infty}^{x_0} f(x) dx\)
When you find \(F(x)\), you will usually get a piecewise function. It will start at 0, grow as it "accumulates" probability, and must end at 1 once you've covered all possible values of \(x\).
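To see what "piecewise" means here, the sketch below writes out the CDF for an assumed example PDF, \(f(x) = \frac{x^2}{9}\) on \([0, 3]\). Integrating from 0 up to \(x\) gives \(\frac{x^3}{27}\) inside the range; the other two pieces (0 before the range, 1 after it) are easy to forget.

```python
def cdf(x):
    """CDF for the assumed example PDF f(x) = x**2 / 9 on [0, 3]."""
    if x < 0:
        return 0.0       # no probability has accumulated yet
    if x > 3:
        return 1.0       # all of the probability has accumulated
    return x**3 / 27     # integral of t**2 / 9 from 0 to x

# P(1 < X <= 2) is a difference of two CDF values.
p = cdf(2) - cdf(1)

for x in (-1, 0, 1.5, 3, 10):
    print(x, cdf(x))
print("P(1 < X <= 2) =", p)
```

Notice that once you have \(F(x)\), a probability like \(P(1 < X \leq 2)\) is just \(F(2) - F(1)\), with no further integration needed.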
3. Switching Between PDF and CDF
The relationship between the PDF (\(f(x)\)) and the CDF (\(F(x)\)) is one of the most useful tools in this chapter. It's just simple calculus!
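1. To get from PDF to CDF: Integrate! \(F(x) = \int_{-\infty}^{x} f(t) dt\). In practice, integrate from the lowest possible value of the variable up to \(x\); using these limits takes care of the constant of integration.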
2. To get from CDF to PDF: Differentiate! \(f(x) = \frac{dF(x)}{dx}\).
Memory Aid:
Differentiate to go Down (from the total amount \(F(x)\) to the density \(f(x)\)).
Integrate to Increase (from the density \(f(x)\) to the accumulated total \(F(x)\)).
Key Takeaway: If a question gives you the CDF and asks for the PDF, just find the derivative of the function for each part of its range.
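As a quick sanity check of this relationship, the sketch below differentiates a CDF numerically and compares the result with the PDF. The CDF \(F(x) = \frac{x^3}{27}\) on \([0, 3]\) is an assumed example, and `numerical_derivative` is a hypothetical helper using a central difference; in an exam you would differentiate by hand.

```python
def cdf(x):
    """Assumed example CDF: F(x) = x**3 / 27 on [0, 3]."""
    if x < 0:
        return 0.0
    if x > 3:
        return 1.0
    return x**3 / 27

def pdf(x):
    """PDF obtained by differentiating F by hand: f(x) = x**2 / 9 on [0, 3]."""
    return x**2 / 9 if 0 <= x <= 3 else 0.0

def numerical_derivative(F, x, h=1e-5):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Compare the numerical slope of the CDF with the PDF at a few interior points.
for x in (0.5, 1.0, 2.5):
    print(x, numerical_derivative(cdf, x), pdf(x))
```

The two columns should agree to several decimal places at points inside the range. At the boundaries (here \(x = 0\) and \(x = 3\)) the CDF can have a corner, so the derivative is taken piece by piece.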
4. Mean, Variance, and Expectation
Just like with discrete variables, we want to find the "average" (Mean) and the "spread" (Variance) of our data.
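The Formulas (in practice, the limits are the ends of the range where \(f(x)\) is non-zero):
Mean (Expected Value): \(E(X) = \mu = \int_{-\infty}^{\infty} x f(x) dx\)
Expected Value of a Function: \(E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x) dx\)
Variance: \(Var(X) = \sigma^2 = E(X^2) - [E(X)]^2\), where \(E(X^2) = \int_{-\infty}^{\infty} x^2 f(x) dx\).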
Step-by-Step for Variance:
1. Find \(E(X)\) by integrating \(x \times f(x)\).
2. Find \(E(X^2)\) by integrating \(x^2 \times f(x)\).
3. Subtract the square of your first answer from your second answer. Don't forget to square the mean! This is the most common place where marks are lost.
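The three steps above can be sketched in Python as follows. The PDF \(f(x) = \frac{x^2}{9}\) on \([0, 3]\) is an assumed example, and `integrate` is an illustrative Simpson's rule helper, so the printed values are numerical approximations of the hand-worked answers.

```python
def integrate(f, a, b, n=1000):
    """Approximate the integral of f from a to b using Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def pdf(x):
    """Assumed example PDF: f(x) = x**2 / 9 on [0, 3], and 0 elsewhere."""
    return x**2 / 9 if 0 <= x <= 3 else 0.0

# Step 1: E(X) is the integral of x * f(x).
mean = integrate(lambda x: x * pdf(x), 0, 3)

# Step 2: E(X^2) is the integral of x^2 * f(x).
mean_sq = integrate(lambda x: x**2 * pdf(x), 0, 3)

# Step 3: subtract the SQUARE of the mean from E(X^2).
variance = mean_sq - mean**2

print("E(X)   =", mean)
print("E(X^2) =", mean_sq)
print("Var(X) =", variance)
```

By hand: \(E(X) = \left[\frac{x^4}{36}\right]_0^3 = 2.25\), \(E(X^2) = \left[\frac{x^5}{45}\right]_0^3 = 5.4\), so \(Var(X) = 5.4 - 2.25^2 = 0.3375\).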
5. Mode, Median, and Quartiles
These are different ways to describe the "center" or "position" of the distribution.
The Mode: This is the value of \(x\) where the PDF, \(f(x)\), is at its maximum.
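How to find it: Look at the function (a quick sketch helps). If it’s a curve, use differentiation to find the stationary point (\(f'(x) = 0\)), check that it is a maximum rather than a minimum, and compare it with the values of \(f(x)\) at the ends of the range. If it’s a straight line, or the function is always increasing or always decreasing, the mode will be at one of the boundaries.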
The Median (\(m\)): This is the value where half the area is to the left and half is to the right.
How to find it: Solve \(F(m) = 0.5\).
Quartiles and Percentiles: These work just like the median. For the lower quartile (\(Q_1\)), solve \(F(Q_1) = 0.25\). For the 90th percentile, solve \(F(x) = 0.90\).
Key Takeaway: Always use the CDF (\(F(x)\)) to find the median and quartiles. It's much easier than integrating from scratch every time!
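When \(F(x) = p\) cannot be rearranged neatly, a numerical solver does the same job. The sketch below uses bisection on an assumed example CDF, \(F(x) = \frac{x^3}{27}\) on \([0, 3]\); the function name `solve_cdf` is a hypothetical helper, and the method relies on \(F\) never decreasing.

```python
def cdf(x):
    """Assumed example CDF: F(x) = x**3 / 27 on [0, 3]."""
    if x < 0:
        return 0.0
    if x > 3:
        return 1.0
    return x**3 / 27

def solve_cdf(F, p, lo, hi, tol=1e-10):
    """Find x in [lo, hi] with F(x) = p by bisection (F must be non-decreasing)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < p:
            lo = mid   # not enough probability yet, so move right
        else:
            hi = mid   # at or past the target, so move left
    return (lo + hi) / 2

median = solve_cdf(cdf, 0.5, 0, 3)
q1 = solve_cdf(cdf, 0.25, 0, 3)
p90 = solve_cdf(cdf, 0.90, 0, 3)

print("median          =", median)
print("lower quartile  =", q1)
print("90th percentile =", p90)
```

For this particular example you can also solve by hand: \(\frac{m^3}{27} = 0.5\) gives \(m = \sqrt[3]{13.5}\), which is a useful check on the numerical answer.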
6. Skewness
Skewness describes the "lean" of the distribution. You can often tell the skewness by looking at the shape of the graph, but you may need to justify it using the values you've calculated.
Positive Skew: The "tail" is on the right. Usually, \(Mode < Median < Mean\).
Negative Skew: The "tail" is on the left. Usually, \(Mean < Median < Mode\).
Zero Skew: The distribution is perfectly symmetrical. If it has a single peak, \(Mean = Median = Mode\).
Did you know? Many real-world measurements, like household income, have a positive skew because a few people earn a very large amount, pulling the "mean" to the right!
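To justify skewness from calculated values, you can compare the three measures directly. The sketch below applies the orderings listed above to an assumed example, \(f(x) = \frac{x^2}{9}\) on \([0, 3]\), whose mean, median, and mode are worked out by hand in the comments. The function `describe_skew` is a hypothetical helper, and because the orderings are only rules of thumb ("usually"), it returns "inconclusive" when neither pattern fits.

```python
import math

def describe_skew(mean, median, mode):
    """Classify skew using the usual ordering of mean, median, and mode."""
    if math.isclose(mean, median) and math.isclose(median, mode):
        return "zero skew"
    if mode < median < mean:
        return "positive skew"
    if mean < median < mode:
        return "negative skew"
    return "inconclusive"

# Assumed example: f(x) = x**2 / 9 on [0, 3].
mean = 2.25                # E(X) = [x^4 / 36] from 0 to 3
median = 13.5 ** (1 / 3)   # solves m^3 / 27 = 0.5
mode = 3.0                 # f is increasing, so the maximum is at the upper boundary

result = describe_skew(mean, median, mode)
print(result)
```

Here the mean is less than the median, which is less than the mode, so the ordering points to a negative skew, matching the long tail on the left of the curve.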
7. The Continuous Uniform Distribution
This is a special, simple case where the probability is constant over a specific range \([a, b]\). It is also called the Rectangular Distribution because its PDF looks like a rectangle.
Key Facts for \(X \sim U(a, b)\):
PDF: \(f(x) = \frac{1}{b - a}\) for \(a \leq x \leq b\), and \(f(x) = 0\) otherwise. (Because the height \(\times\) width must equal 1).
Mean: \(E(X) = \frac{a + b}{2}\) (Exactly in the middle!).
Variance: \(Var(X) = \frac{(b - a)^2}{12}\).
CDF: \(F(x) = \frac{x - a}{b - a}\) for \(a \leq x \leq b\), with \(F(x) = 0\) for \(x < a\) and \(F(x) = 1\) for \(x > b\).
Quick Tip: You are often asked to derive the mean and variance for the uniform distribution. To do this, simply use the standard \(E(X)\) and \(Var(X)\) integration formulas using the PDF \(f(x) = \frac{1}{b-a}\).
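One way to set out that derivation is sketched below. It uses only the standard integral formulas and the PDF \(f(x) = \frac{1}{b-a}\) on \([a, b]\); your course may expect slightly different layout or intermediate steps.

```latex
\begin{align*}
E(X)   &= \int_a^b \frac{x}{b-a}\,dx
        = \frac{b^2 - a^2}{2(b-a)}
        = \frac{(b-a)(b+a)}{2(b-a)}
        = \frac{a+b}{2} \\[4pt]
E(X^2) &= \int_a^b \frac{x^2}{b-a}\,dx
        = \frac{b^3 - a^3}{3(b-a)}
        = \frac{a^2 + ab + b^2}{3} \\[4pt]
Var(X) &= E(X^2) - [E(X)]^2
        = \frac{a^2 + ab + b^2}{3} - \frac{(a+b)^2}{4} \\
       &= \frac{4a^2 + 4ab + 4b^2 - 3a^2 - 6ab - 3b^2}{12}
        = \frac{a^2 - 2ab + b^2}{12}
        = \frac{(b-a)^2}{12}
\end{align*}
```

The key algebra steps are the factorisations \(b^2 - a^2 = (b-a)(b+a)\) and \(b^3 - a^3 = (b-a)(a^2 + ab + b^2)\), which let the \((b - a)\) in the denominator cancel.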
Chapter Summary Checklist
Can you...
- Integrate a PDF to find probabilities or the value of a constant \(k\)?
- Switch between PDF and CDF using differentiation and integration?
- Calculate the Mean and Variance using the integral formulas?
- Find the Median and Mode of a given distribution?
- Recognize and use the shortcuts for the Continuous Uniform Distribution?
- Describe the skewness of a distribution with a clear justification?
Don't be afraid of the integration! Most exam questions use simple powers of \(x\), so as long as you can handle \(\int kx^n dx = \frac{kx^{n+1}}{n+1} + c\) (for \(n \neq -1\)), you are well on your way to mastering this chapter!