Welcome to Dynamic Programming!

Welcome to one of the most powerful tools in Decision Mathematics! If you’ve ever had to plan a long journey with multiple stops, or tried to figure out the best way to spend a limited budget over several months, you’ve already used the logic of Dynamic Programming (DP).

In this chapter, we explore a "divide and conquer" approach. Instead of trying to solve one giant, scary problem all at once, we break it down into smaller, bite-sized "stages" and work our way through them. Don't worry if it looks like a lot of data at first—once you learn the rhythm of the table, it becomes very repetitive (in a good way!).

1. The Core Idea: Bellman’s Principle of Optimality

Before we look at tables and numbers, we need to understand the golden rule of DP. It’s called Bellman’s Principle of Optimality.

The Principle: An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.

Wait, in plain English please?
Imagine the shortest route from London to Edinburgh goes through York. Bellman is simply saying that the "York to Edinburgh" part of that journey must be the shortest possible way to get from York to Edinburgh. If it wasn't, you could just take a better path from York to Edinburgh and your whole London-Edinburgh trip would be even shorter!

Quick Tip: In the exam, you are almost always required to work backwards. We start at the finish line and work back to the start. This ensures that every sub-path we find is already the "best" it can be.

Key Takeaway: Any part of an optimal path is itself an optimal path.

2. The Language of Dynamic Programming

To build our tables, we need to speak the language. There are two main variables you need to know:

  • Stage: This usually represents "how many steps are left until the end." If you are 3 stops away from the finish, you are at Stage 3.
  • State: This describes your current condition or location at the beginning of a stage. (e.g., "I am currently at Vertex C").

Analogy: A Video Game
Think of a game with 5 levels.
The Stage is which level you are on (counting down to the boss).
The State is how much health or ammo you have at the start of that level.

3. Solving Problems via Tabulation

The Edexcel syllabus requires you to use a specific table format. It might look intimidating, but it follows a logical flow. We usually work from Stage \(n\) (the last step) back to Stage 1 (the first step).

The Standard Table Columns:

  1. Stage: The number of steps remaining.
  2. State: Where you are starting from in this stage.
  3. Action: Where you decide to go next.
  4. Destination: The state you land in for the next stage.
  5. Value: The cost/profit of the current action + the best value from the previous stage.
  6. Optimal Value: The "best" result for that specific state.

Did you know?
Working backwards is called a Bottom-up approach. It feels counter-intuitive to plan a journey from the destination back to the start, but it prevents you from having to re-calculate the same things over and over again!

4. Different Types of Objectives

Depending on the question, you might be asked to find different types of "optimal" paths. Pay close attention to these words:

A. Minimising (e.g., Shortest Path or Lowest Cost)

You want the smallest total sum.
Formula: \(Value = (\text{Current Cost}) + (\text{Optimal value from previous stage})\)

B. Maximising (e.g., Most Profit)

You want the largest total sum.
Formula: \(Value = (\text{Current Profit}) + (\text{Optimal value from previous stage})\)

C. Minimax (The "Safety First" route)

You want to find a route where the maximum penalty you face is as small as possible. This is common in problems involving risk or "maximum weight" on a bridge.
Formula: \(Value = \max(\text{Current Value, Optimal value from previous stage})\)
Then, you choose the minimum of those maximums for your optimal column.

D. Maximin (The "Best of the Worst" route)

You want to ensure that your minimum gain is as large as possible.
Formula: \(Value = \min(\text{Current Value, Optimal value from previous stage})\)
Then, you choose the maximum of those minimums for your optimal column.

Quick Review Box:
- Minimise: Smallest total.
- Maximise: Largest total.
- Minimax: Smallest of the maximums.
- Maximin: Largest of the minimums.

5. Step-by-Step: Building the Table

Don't worry if this seems tricky at first! Follow these steps for a standard shortest path problem:

  1. Identify the Stages: Draw vertical lines through the network. The final destination is Stage 0 (nothing left to do). The nodes leading directly to it are in Stage 1.
  2. Start the Table: Create your columns. Start with Stage 1.
  3. Fill Stage 1: List the states that can reach the destination. The "Optimal Value" is simply the weight of the arc to the destination.
  4. Move to Stage 2: List the states that can reach the Stage 1 states. For each action, add the arc weight to the "Optimal Value" you just found in Stage 1.
  5. Pick the Winner: For each state, look at all possible actions and pick the one that gives the best result (e.g., the minimum). Put this in the "Optimal Value" column.
  6. Repeat: Continue until you reach Stage \(n\) (your starting point).
  7. Trace Back: Once the table is done, start at your initial node and follow the "Action" column to find your path.

6. Common Mistakes to Avoid

  • Working Forwards: Unless the question explicitly asks for it (which is rare in DM2), always work backwards from the destination.
  • Arithmetic Slips: Because DP is a chain, one tiny addition error in Stage 1 will ruin every single stage after it. Double-check your sums!
  • Mixing up Maximin/Minimax: Remember: Minimax is minimizing a maximum. Maximin is maximizing a minimum. Write the formula at the top of your page to stay on track.
  • Missing Actions: Ensure every possible path from a state is considered in your table rows.

Summary & Key Takeaways

Congratulations! You've covered the fundamentals of Dynamic Programming. Here are the bits to remember for your revision:

  • Bellman’s Principle: An optimal path is made of optimal sub-paths.
  • Backward Planning: We build the table from the end to the start.
  • Tabulation: Accurate tables are the key to full marks. Keep your columns neat!
  • State vs Stage: A Stage is a "step" in time/distance; a State is "where you are" at that step.

Keep practicing the table layout—it's the most common way this is tested. You've got this!