Welcome to Database Design!
Ever tried to find a specific sock in a messy drawer? It’s frustrating, right? Designing a database is exactly like organizing that drawer. If you just throw everything in at once, you’ll never find what you need, and you might end up with three left socks and no right ones. In this chapter, we are going to learn how to design databases that are neat, efficient, and error-free using a process called Normalisation. Don’t worry if this seems a bit "maths-heavy" at first—it’s actually just about using logic to keep things tidy!
Step 1: The Blueprint (Entity Relationship Modelling)
Before we build a database, we need a plan. This is called a Data Model. Think of this as the architectural drawing of a house before the bricks are laid.
Entities and Attributes
An Entity is simply a "thing" we want to store data about (like a Student, a Book, or a Car). Each entity has Attributes, which are the specific details (like StudentName or DateOfBirth).
Entity Relationship (ER) Diagrams
We use diagrams to show how these things connect. For example, one Customer can place many Orders. In your exam, you might be asked to draw these using simple boxes and lines.
Memory Tip: Always ask yourself "Can one of 'A' have many of 'B'?" to find the relationship.
Entity Descriptions
To write down our design clearly, we use a standard format. We write the Entity name, then list the attributes in brackets. We underline the Primary Key (the unique ID).
Example: Student(StudentID, FirstName, LastName, CourseID)
Key Takeaway: Good design starts with a clear plan showing what data we are collecting and how the different "things" in our system relate to each other.
Step 2: Relational Database Essentials
A Relational Database is a collection of tables that are linked together. To make these links work, we use special types of "Keys":
1. Primary Key: A unique identifier for every record in a table. Think of it like your National Insurance number or a school ID—no two people have the same one.
2. Composite Primary Key: Sometimes, one column isn't enough to be unique. A composite key is when we use two or more columns together to create a unique ID. For example, in a "Lesson" table, RoomNumber and PeriodNumber together might be the unique key.
3. Foreign Key: This is the "bridge." It is a column in one table that is a Primary Key in another table. It’s how we link a Student to their Teacher.
Quick Review Box:
- Primary Key: Unique ID for the current table.
- Foreign Key: Links to a Primary Key in a different table.
- Attribute: A column/header in your table.
Step 3: What is Normalisation?
Normalisation is the process of tidying up our database. We do this for two main reasons:
- To eliminate Data Redundancy (storing the same piece of info in multiple places).
- To maintain Data Integrity (making sure the data is accurate and easy to update).
Real-World Example: If a student changes their phone number and you have recorded it in five different tables, you have to update it five times. If you miss one, your data is wrong! Normalisation ensures you only store that number once.
Did you know? Normalisation was invented by E.F. Codd in the 1970s. He created a set of rules called "Normal Forms" to ensure databases were logically perfect.
Step 4: The Three Levels of Normalisation
To get a database into Third Normal Form (3NF), we have to pass through three stages. Students often find this tricky, but just follow these simple rules:
First Normal Form (1NF)
A table is in 1NF if there are no repeating groups of attributes. This means every "cell" in your table must contain only one single value (it must be atomic).
Common Mistake: Having a column called "Subject" and putting "Maths, Physics, CS" all in one box. In 1NF, those must be separate records.
Second Normal Form (2NF)
To be in 2NF, the table must first be in 1NF. Then, you must remove Partial Dependencies.
This only applies if you have a Composite Primary Key. Every non-key attribute must depend on the entire composite key, not just part of it.
Example: If your key is (StudentID, CourseID), and you have a column for StudentAddress, that address only depends on the StudentID. It doesn't care about the CourseID. That's a partial dependency! You must move the address to a separate Student table.
Third Normal Form (3NF)
To be in 3NF, the table must be in 2NF. Then, you must remove Transitive Dependencies.
This means "No non-key attribute should depend on another non-key attribute." All attributes should only depend on the Primary Key.
Analogy: Imagine a table FootballTeam(PlayerID, Name, ClubName, ClubManager). The ClubManager depends on the ClubName, not directly on the PlayerID. This is a transitive dependency. You should move the Club information to its own table.
The Golden Mnemonic:
To remember the rules of 3NF, database experts say: "The data must depend on The Key (1NF), The Whole Key (2NF), and Nothing But The Key (3NF), so help me Codd!"
Key Takeaway: By the time you reach 3NF, every piece of information in a table is strictly about the Primary Key of that table and nothing else. This makes your database lean, fast, and reliable.
Quick Summary Checklist
- 1NF: No lists in cells. Data is atomic.
- 2NF: No partial dependencies (look out for composite keys!).
- 3NF: No transitive dependencies (non-keys shouldn't rely on other non-keys).
- Goal: Minimise duplication and keep data accurate.
Final Encouragement: Normalisation takes practice! If you’re struggling, try drawing the tables out on paper and circling which bits of information "belong" to which key. It will start to click soon!