Welcome to Data Structures!
In this chapter, we are going to explore how computers organize data. Think of it like this: if you have a messy bedroom, it’s hard to find your favorite socks. But if you have a chest of drawers, a shoe rack, and a laundry basket, finding things becomes easy! Data structures are just different "furniture" we use to store data in a computer's memory so we can use it efficiently.
Don't worry if some of these names sound intimidating at first. By the end of these notes, you’ll see that they are mostly based on simple, real-world ideas.
1. Static vs. Dynamic Data Structures
Before we look at specific structures, we need to understand how they grow (or don't grow!).
Static Data Structures: These are fixed in size. You decide how much space you need at the start, and it cannot change while the program is running. Analogy: A wooden egg carton. It always has 12 slots, whether you have 1 egg or 12.
Dynamic Data Structures: These can grow and shrink in size. They are great for when you don't know how much data you will have. Analogy: A guest list for a party. You can keep adding names as more people RSVP.
Quick Review: Static is fixed and memory-efficient but risky if you run out of space. Dynamic is flexible but uses more processing power to manage the memory.
2. The Basics: Arrays, Records, and Tuples
Arrays
An Array is a collection of data items of the same type stored together. Each item has an index (a number starting at 0) so we can find it.
- 1D Array: A simple list. Example: [ "Apple", "Banana", "Cherry" ]
- 2D Array: Like a grid or a spreadsheet. You need two coordinates to find an item: [row, column].
- 3D Array: Like a block of flats or a Rubik's cube. You need three coordinates: [floor, row, column].
Common Mistake: Forgetting that most programming languages start counting at 0, not 1!
Records and Tuples
Tuples: An ordered set of values that can be different data types (e.g., an Integer and a String). Once created, you usually cannot change them (they are immutable). Example: (12, "High Street", "London").
Records: These are similar to Tuples but use named fields. Example: A "Student" record might have fields like ID, FirstName, and Surname. It’s like a single row in a database table.
Key Takeaway: Arrays are for lists of the same thing; Records/Tuples are for grouping different details about one thing.
3. Stacks and Queues (ADTs)
These are called Abstract Data Types (ADTs). This just means we care more about how they behave than how they are built.
The Stack (LIFO)
A Stack is a LIFO structure: Last-In, First-Out.
Analogy: A stack of dinner plates. You put the last one on top, and it’s the first one you take off.
Operations:
- Push: Add an item to the top.
- Pop: Remove the item from the top.
- Peek: Look at the top item without removing it.
- IsEmpty: Check if the stack is empty.
The Queue (FIFO)
A Queue is a FIFO structure: First-In, First-Out.
Analogy: A line of people waiting for a bus. The person who got there first gets on the bus first.
Operations:
- Enqueue: Add an item to the back.
- Dequeue: Remove an item from the front.
Did you know? Computers use stacks to remember where they were when a function is called (the "Call Stack") and use queues to manage print jobs!
4. Linked Lists
A Linked List is a dynamic structure where each item (called a Node) "points" to the next one. Imagine a treasure hunt where each clue tells you where to find the next clue.
A Node contains two things:
- Data: The actual value (e.g., "Hello").
- Pointer: The memory address of the next node.
The very last node points to Null, which tells the computer "This is the end!"
How to add/remove: You don't move the data; you just change where the pointers point. It’s like cutting a chain and hooking the links back together in a different order.
5. Trees and Binary Search Trees (BST)
A Tree is a hierarchical structure. It starts with a Root node at the top and "branches" down to Child nodes. Nodes with no children are called Leaves.
Binary Search Trees (BST)
A BST is a special tree where every node has at most two children. It follows a strict rule:
- Values less than the parent go to the Left.
- Values greater than the parent go to the Right.
Traversing a Tree: There are two main ways to visit every node:
- Breadth-First: You visit all nodes on one level before moving down to the next level. (Think of it like scanning a book line by line).
- Depth-First (Post-Order): You go as deep as possible down a branch before "visiting" the nodes. In Post-Order, you visit the children before the parent.
Key Takeaway: BSTs make searching for data incredibly fast! Every time you move left or right, you ignore half of the remaining data.
6. Graphs
A Graph is a collection of nodes (called Vertices) connected by lines (called Edges).
Analogy: A map of the London Underground. Each station is a vertex, and the tracks are edges.
Types of Graphs:
- Directed: Edges have arrows (one-way streets).
- Undirected: Edges work in both directions (two-way streets).
- Weighted: Edges have "costs" or "weights" (e.g., the distance in miles between two cities).
Common Mistake: Students often confuse Trees and Graphs. Remember: A Tree is just a specific type of Graph that has no "cycles" (loops) and is all connected to one root.
7. Hash Tables
A Hash Table is designed for instant searching. Instead of looking through a list, we use a Hashing Algorithm to calculate exactly where the data should be stored.
How it works:
- Take a piece of data (the Key).
- Run it through a Hash Function (a mathematical formula).
- The result is the Index (the "address") where the data is stored.
Complexity: Ideally, finding something in a hash table takes \(O(1)\) time—it’s nearly instant!
Collisions: Sometimes two different keys result in the same index. This is a collision. Computers handle this by using "overflow" areas or "chaining" (creating a small linked list at that index).
Summary Checklist
To succeed in your exam, make sure you can:
- Explain the difference between Static and Dynamic.
- Draw a Stack (Push/Pop) and a Queue (Enqueue/Dequeue).
- Describe how a Linked List uses pointers to stay connected.
- Apply the Left < Parent < Right rule to build a Binary Search Tree.
- Identify Directed vs Undirected edges in a Graph.
- Explain why Hash Tables are so fast for searching.
Don't worry if this seems like a lot to memorize. Draw these structures out on paper—it’s much easier to understand when you can see the "furniture" you're building!