Welcome to Probability Generating Functions!

In this chapter, we are going to learn a very clever "shortcut" in statistics. A Probability Generating Function (PGF) is a way of "packaging" an entire probability distribution into a single algebraic expression. Imagine having a giant, messy list of probabilities and being able to fold it up into one simple expression in \(t\): that is exactly what a PGF does!

By the end of these notes, you’ll be able to create these functions, use them to find the mean and variance, and even see what happens when we add different random variables together. Don't worry if it seems abstract at first; once you see the patterns, it’s just like working with algebra.


1. What is a PGF?

For a discrete random variable \(X\) that takes non-negative integer values (\(0, 1, 2, ...\)), the Probability Generating Function, denoted by \(G_X(t)\), is defined as:

\(G_X(t) = E(t^X) = \sum_{x} P(X=x)t^x\)

Breaking it down:

Think of the variable \(t\) as a "placeholder." The power of \(t\) tells you the value of \(X\), and the coefficient (the number in front of that power of \(t\)) tells you the probability of that value occurring.

Example: If a random variable \(X\) has probabilities \(P(X=0)=0.2\), \(P(X=1)=0.5\), and \(P(X=2)=0.3\), the PGF is:
\(G_X(t) = 0.2t^0 + 0.5t^1 + 0.3t^2\)
\(G_X(t) = 0.2 + 0.5t + 0.3t^2\)

Did you know? The "100% Rule": Since the sum of all probabilities must equal 1, if you let \(t = 1\), the PGF will always equal 1. So, \(G_X(1) = 1\). This is a great way to check your work!

Key Takeaway: A PGF is just a polynomial (or an infinite power series, when \(X\) can take infinitely many values) where the coefficients are the probabilities of the distribution.
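To see the "packaging" idea in action, here is a minimal Python sketch that stores the example table above as a list of coefficients and evaluates the PGF. The names `coeffs` and `pgf` are illustrative choices, not standard notation, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

# Probability table from the example: P(X=0)=0.2, P(X=1)=0.5, P(X=2)=0.3.
# Position x in the list holds P(X = x), i.e. the coefficient of t^x.
coeffs = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]

def pgf(coeffs, t):
    """Evaluate G_X(t) = sum of P(X = x) * t**x."""
    return sum(p * t**x for x, p in enumerate(coeffs))

g_at_one = pgf(coeffs, 1)                # the "100% Rule": should be exactly 1
g_at_half = pgf(coeffs, Fraction(1, 2))  # 0.2 + 0.5(0.5) + 0.3(0.25)

print(g_at_one, g_at_half)  # 1 21/40
```

If `pgf(coeffs, 1)` is not 1, the table does not describe a valid probability distribution.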


2. PGFs for Standard Distributions

You don't always have to build a PGF from scratch. The Edexcel syllabus requires you to know (and be able to derive) the PGFs for the most common distributions. Here is your "Cheat Sheet":

The Binomial Distribution: \(X \sim B(n, p)\)

\(G_X(t) = (q + pt)^n\)
(where \(q = 1 - p\))

The Poisson Distribution: \(X \sim Po(\lambda)\)

\(G_X(t) = e^{\lambda(t-1)}\)

The Geometric Distribution: \(X \sim Geo(p)\)

\(G_X(t) = \frac{pt}{1 - qt}\)
Note: This is for the version of the distribution where \(X\) starts at 1.

The Negative Binomial Distribution: \(X \sim Negative B(r, p)\)

\(G_X(t) = (\frac{pt}{1 - qt})^r\)

Memory Trick: Notice that the Negative Binomial PGF is just the Geometric PGF raised to the power of \(r\). This makes sense because a Negative Binomial variable is the sum of \(r\) independent Geometric variables, each with the same \(p\), and the PGF of a sum of independent variables is the product of their PGFs.

Key Takeaway: Memorize these four forms! They are the foundation for almost every exam question in this chapter.
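If you want to convince yourself that the four closed forms really do match the definition \(\sum P(X=x)t^x\), a rough numerical check like the one below can help. This is only a sketch: the parameter values and the number of terms kept in each truncated sum are arbitrary assumptions, and the function names are made up for illustration.

```python
from math import comb, exp, factorial, isclose

# Arbitrary illustrative values (not from the notes).
n, p, lam, r, t = 5, 0.3, 2.0, 3, 0.7
q = 1 - p

def binomial(x):
    return comb(n, x) * p**x * q**(n - x)

def poisson(x):
    return exp(-lam) * lam**x / factorial(x)

def geometric(x):          # version where X starts at 1
    return p * q**(x - 1) if x >= 1 else 0.0

def negative_binomial(x):  # number of trials up to the r-th success
    return comb(x - 1, r - 1) * p**r * q**(x - r) if x >= r else 0.0

def series(pmf, terms):
    """Truncated sum of P(X = x) * t**x for x = 0 .. terms - 1."""
    return sum(pmf(x) * t**x for x in range(terms))

# Each pair is (truncated sum from the definition, closed form from the cheat sheet).
checks = {
    "binomial": (series(binomial, n + 1), (q + p * t)**n),
    "poisson": (series(poisson, 60), exp(lam * (t - 1))),
    "geometric": (series(geometric, 200), p * t / (1 - q * t)),
    "negative binomial": (series(negative_binomial, 400), (p * t / (1 - q * t))**r),
}

for name, (from_definition, closed_form) in checks.items():
    print(name, isclose(from_definition, closed_form, rel_tol=1e-9))
```

A numerical match is not a proof, so you still need to be able to derive each form algebraically.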


3. Finding Mean and Variance using PGFs

This is where PGFs become really useful. Instead of using the long \(\sum xP(x)\) formulas, we can use calculus.

Finding the Mean (\(E(X)\))

To find the expected value, we find the first derivative of the PGF and then plug in \(t = 1\).

1. Find \(G'_X(t)\)
2. Let \(t = 1\)
3. \(E(X) = G'_X(1)\)

Finding the Variance (\(Var(X)\))

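Finding the variance takes two derivatives. The second derivative at \(t=1\) does not give \(E(X^2)\) directly; it gives \(G''_X(1) = E(X(X-1)) = E(X^2) - E(X)\), which is called the second "factorial moment." Adding \(E(X)\) back gives \(E(X^2)\), and subtracting \([E(X)]^2\) then gives the variance. That is where this formula comes from: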

1. Find \(G''_X(t)\) and let \(t = 1\)
2. \(Var(X) = G''_X(1) + G'_X(1) - [G'_X(1)]^2\)

Don't worry if this seems tricky at first! Many students forget to add the \(G'_X(1)\) or forget to subtract the mean squared. Just remember this rhythm: "Second derivative plus mean minus mean squared."
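As a worked example of the rhythm above, here is the calculation for \(X \sim Po(\lambda)\), using the Poisson PGF from Section 2. It recovers the familiar result that the mean and variance are both \(\lambda\).

```latex
\begin{align*}
G_X(t)   &= e^{\lambda(t-1)} \\
G'_X(t)  &= \lambda e^{\lambda(t-1)}   &&\Rightarrow\quad E(X) = G'_X(1) = \lambda \\
G''_X(t) &= \lambda^2 e^{\lambda(t-1)} &&\Rightarrow\quad G''_X(1) = \lambda^2 \\
Var(X)   &= G''_X(1) + G'_X(1) - [G'_X(1)]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda
\end{align*}
```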

Quick Review Box:
- \(E(X) = G'(1)\)
- \(Var(X) = G''(1) + E(X) - [E(X)]^2\)
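The same recipe can be sketched in Python for the small example from Section 1, \(G_X(t) = 0.2 + 0.5t + 0.3t^2\). The helper names `differentiate` and `evaluate` are hypothetical, and the polynomial is stored as a plain list of coefficients.

```python
from fractions import Fraction

# Coefficients of G_X(t) = 0.2 + 0.5t + 0.3t^2 (index = power of t).
coeffs = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]

def differentiate(coeffs):
    """Differentiate a polynomial [c0, c1, c2, ...] term by term."""
    return [x * c for x, c in enumerate(coeffs)][1:]

def evaluate(coeffs, t):
    """Evaluate the polynomial at t."""
    return sum(c * t**x for x, c in enumerate(coeffs))

first = differentiate(coeffs)   # G'(t)  = 0.5 + 0.6t
second = differentiate(first)   # G''(t) = 0.6

mean = evaluate(first, 1)                        # E(X) = G'(1)
variance = evaluate(second, 1) + mean - mean**2  # G''(1) + G'(1) - [G'(1)]^2

print(mean, variance)  # 11/10 49/100
```

You can confirm the answer the long way: \(E(X^2) = 0.5 + 4(0.3) = 1.7\), so \(Var(X) = 1.7 - 1.1^2 = 0.49\).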


4. Sums of Independent Random Variables

What if you have two independent random variables and you want to know the probability distribution of their total? PGFs make this incredibly easy.

If \(X\) and \(Y\) are independent random variables, and you define a new variable \(Z = X + Y\), then the PGF of \(Z\) is simply the product of the individual PGFs:

\(G_{X+Y}(t) = G_X(t) \times G_Y(t)\)

Analogy:

Imagine \(G_X(t)\) is the blueprint for a LEGO car and \(G_Y(t)\) is the blueprint for a LEGO trailer. If you want the blueprint for the car and trailer combined (\(X+Y\)), you just put the two blueprints together. For PGFs, "putting together" means multiplying.

Example: If \(X \sim Po(\lambda)\) and \(Y \sim Po(\mu)\) are independent:
\(G_X(t) = e^{\lambda(t-1)}\)
\(G_Y(t) = e^{\mu(t-1)}\)
\(G_{X+Y}(t) = e^{\lambda(t-1)} \times e^{\mu(t-1)} = e^{(\lambda+\mu)(t-1)}\)
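The result has exactly the form of a Poisson PGF, and a PGF identifies its distribution uniquely. This shows that the sum of two independent Poisson variables is also a Poisson variable with a new mean of \(\lambda + \mu\), so \(X + Y \sim Po(\lambda + \mu)\)!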

Key Takeaway: Summing independent variables in the "real world" equals multiplying their PGFs in the "math world."
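For distributions given as tables, multiplying PGFs is just multiplying polynomials. The sketch below takes \(X\) from the Section 1 example and, purely as an assumption for illustration, a fair-coin variable \(Y\) with \(P(Y=0)=P(Y=1)=0.5\) that is independent of \(X\). The function name `multiply` is a made-up helper.

```python
from fractions import Fraction

def multiply(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of t)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            out[i + j] += pa * pb   # t^i * t^j contributes to t^(i+j)
    return out

g_x = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]  # X from Section 1
g_y = [Fraction(1, 2), Fraction(1, 2)]                   # hypothetical fair coin Y

g_z = multiply(g_x, g_y)   # coefficients of G_{X+Y}(t)

for z, prob in enumerate(g_z):
    print(f"P(Z={z}) = {prob}")
# P(Z=0) = 1/10
# P(Z=1) = 7/20
# P(Z=2) = 2/5
# P(Z=3) = 3/20
```

Notice that the coefficient of \(t^2\) is \(P(X=1)P(Y=1) + P(X=2)P(Y=0) = 0.25 + 0.15 = 0.4\), which is exactly how you would find \(P(X+Y=2)\) by listing the cases.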


5. Common Mistakes to Avoid

Even the best students can slip up on these. Keep an eye out for:

  • Forgetting to set \(t=1\): After differentiating and substituting, your answer should be a number (or an expression in the parameters, such as \(\lambda\) or \(np\)). If your mean or variance still has a \(t\) in it, you've missed the final step!
  • Chain Rule Errors: When differentiating things like \((q + pt)^n\), remember to multiply by the derivative of the inside (which is \(p\)).
  • Mixing up p and q: Always remember that \(p\) is the probability of success and \(q\) is failure (\(1-p\)). Check which one the question is asking for!
  • Geometric Starting Point: Be careful if a question defines the Geometric distribution differently (e.g., starting from \(X=0\) instead of \(X=1\)). The standard PGF provided above is for \(X \in \{1, 2, 3, ...\}\).

Summary Checklist

Before moving on to practice questions, make sure you can:

  • Write down a PGF from a probability table.
  • State the PGF for Binomial, Poisson, Geometric, and Negative Binomial distributions.
  • Differentiate a PGF to find the Mean (\(G'(1)\)).
  • Apply the variance formula (\(G''(1) + G'(1) - [G'(1)]^2\)).
  • Multiply PGFs to find the distribution of the sum of independent variables.

You've got this! PGFs are a powerful tool that turn difficult probability problems into simple algebra. Keep practicing the derivatives, and the rest will fall into place.