Welcome to the World of Hash Tables!

Have you ever wondered how a computer can find one specific user out of millions in a split second? It doesn't look through a list one by one—that would take forever! Instead, it uses a clever data structure called a Hash Table.

In this chapter, we are going to learn how these "speed-demons" of the data world work. Don't worry if it sounds a bit mathematical at first; once you see the patterns, it’s as simple as finding your coat on a numbered peg!

1. What is a Hash Table?

A Hash Table is a data structure that creates a mapping between keys (like a username or an ID number) and values (like a user's profile data).

Think of it like a massive wall of lockers in a gym. Each locker has a number (the index or address). Instead of searching every locker to find your bag, you have a "magic formula" that tells you exactly which locker number belongs to you based on your name.

Why do we use them?

The main reason we use hash tables is speed. They allow us to find, add, or delete data almost instantly, regardless of how much data we have stored.

Did you know? Hash tables are the secret sauce behind database indexing, caches, and even how some programming languages store variables!

Key Takeaway:

A Hash Table maps a key to a specific location in memory so we can find it instantly.

2. The Magic Formula: Hashing Algorithms

How do we decide exactly where a piece of data should go? We use a Hashing Algorithm (or Hash Function).

A hash function takes a key as an input and turns it into a number. This number is used as the address (the index in an array) where the data will be stored.

A Simple Example: The Modulo Method

A very common "simple" hashing algorithm is the Modulo method.

Let’s say we have a hash table with 10 slots (indexed 0 to 9). Our formula is:
\( \text{address} = \text{key} \pmod{\text{table\_size}} \)

If our key is 124:
\( 124 \pmod{10} = 4 \)
So, the data for key 124 is stored at Index 4.

Step-by-Step: Storing a Key

1. Take the Key (e.g., 57).
2. Run it through the Hash Function (e.g., \( 57 \pmod{10} = 7 \)).
3. Store the data at that Index (Address 7).

Don't worry if the math seems dry—just remember that the function is just a set of rules to pick a seat for the data!

Quick Review:

Hash Function: The calculation that turns a key into an index address.

3. When Things Get Crowded: Collisions

Imagine you are at a party, and the host says, "Everyone sit in the chair that matches the last digit of your phone number." If your number ends in 5, you go to chair 5. But what if someone else’s number also ends in 5?

In Computer Science, this is called a Collision.

Definition: A Collision occurs when two different keys produce the same hash value (address).

Example: Using our \( \text{key} \pmod{10} \) rule:
Key 25 hashes to Index 5.
Key 85 also hashes to Index 5.
UH OH! We can't put two pieces of data in the same slot.

Key Takeaway:

Collisions are unavoidable in the real world because we usually have more possible keys than we have slots in our table.

4. Fixing the Problem: Handling Collisions

When a collision happens, we need a "Plan B" to find a new home for the data. This is often called Rehashing.

Linear Probing (The "Next Door" Method)

The simplest way to handle a collision is to look for the next available empty slot in the table.

How it works:
1. Calculate the address (e.g., Index 5).
2. Is Index 5 full? Yes.
3. Look at Index 6. Is it empty? If yes, put the data there!
4. If Index 6 is also full, keep moving to 7, 8, and so on.

Searching for Data with Collisions

If we want to find key 85 later, the computer:
1. Hashes 85 to get Index 5.
2. Checks Index 5. It finds key 25 there instead! (It knows this isn't the right one).
3. It then checks the next slots (using the same collision rule) until it finds 85 or an empty space.

Common Mistake to Avoid: Students often think hashing is "random." It isn't! A hash function must be deterministic—meaning if you put the same key in, you must get the same address out every single time.

Memory Aid: The "BUS" Check

When thinking about collisions, think of a BUS:
B - Is the slot Busy?
U - Use the next slot.
S - Search until you find it!

Quick Review Box:

Collision: Two keys, one address.
Rehashing: The process of finding a new location when a collision occurs (e.g., moving to the next empty slot).

Chapter Summary

- Hash Tables are high-speed data structures that map Keys to Values.
- A Hash Function calculates the Address (Index) where data is stored.
- A Collision happens when two keys hash to the same address.
- Rehashing (like Linear Probing) is used to find an alternative slot when a collision occurs.