Welcome to Relational Databases!

In this chapter, we are moving away from simple flat files and diving into relational databases. Think of this as the difference between a messy pile of papers on a desk and a perfectly organized filing cabinet where everything is linked. Databases are the backbone of almost every app you use, from Instagram to your school’s grading system. By the end of these notes, you’ll understand how to organize data so it’s efficient, accurate, and easy to find!

1. What is a Relational Database?

At its simplest, a relational database is a way of storing data in several tables that are linked together. Instead of having one giant, clunky table (a flat-file), we split the data into smaller, logical tables.

Analogy: Imagine a music app. Instead of one table listing the song, artist, artist's age, and album for every single track, you have one table for Artists, one for Albums, and one for Songs. They "relate" to each other using special ID numbers.

Key Terms You Need to Know:

Attribute: This is just a fancy word for a column or field in a table. It’s a specific category of information. Example: "DateOfBirth" or "Surname".

Primary Key: This is a unique identifier for every record (row) in a table. No two rows can have the same primary key. Example: Your "Student ID" at school.

Composite Primary Key: Sometimes, one single attribute isn't enough to be unique. A composite key uses two or more attributes combined to create a unique identifier. Example: In a "Classroom Booking" table, combining "RoomNumber" and "Period" could act as a unique key.

Foreign Key: This is a Primary Key from one table that appears in another table to link them together. It acts as a "bridge" between tables.

Quick Review:
- Primary Key: Unique ID in its own table.
- Foreign Key: A Primary Key from a different table used as a link.

Key Takeaway: Relational databases use multiple tables linked by keys to reduce data duplication and keep things organized.

2. Database Design and Normalisation

Don't worry if normalisation sounds intimidating! It’s just a step-by-step process of "tidying up" your database to make it as efficient as possible. We do this to avoid data redundancy (storing the same thing twice) and data inconsistency (having two different versions of the "truth").

The Three Levels of Normalisation

To reach Third Normal Form (3NF), we follow these steps:

First Normal Form (1NF)

A table is in 1NF if there are no repeating groups of attributes. Every "cell" in your table must contain only one piece of data.

Second Normal Form (2NF)

A table is in 2NF if it is already in 1NF and all non-key attributes are fully dependent on the primary key. This usually only matters if you have a Composite Key. You shouldn't have data in a table that only relates to part of the key.

Third Normal Form (3NF)

A table is in 3NF if it is in 2NF and has no transitive dependencies. This is a complex way of saying: "All attributes must depend only on the key." If an attribute depends on another attribute that isn't the key, it needs to move to its own table.

Memory Aid for 3NF: A famous way to remember this is that all data must depend on:
"The Key, the Whole Key, and Nothing But the Key" (so help me Codd!).

Why bother normalising?

1. Save Space: You aren't typing the same address or name over and over.
2. Data Integrity: If a customer changes their phone number, you only have to update it in one place, not fifty.
3. Easier Searching: Smaller, cleaner tables are faster for the computer to process.

Key Takeaway: Normalisation (1NF -> 2NF -> 3NF) removes messy duplicates and ensures every piece of data is stored in the most logical place.

3. Structured Query Language (SQL)

SQL is the language we use to "talk" to the database. You need to know how to perform four main actions (CRUD: Create, Read, Update, Delete) plus how to create a table.

Defining a Table

To create a new table, we use CREATE TABLE.
Example:
CREATE TABLE Students (
  StudentID INT PRIMARY KEY,
  FirstName VARCHAR(20),
  Surname VARCHAR(20)
);

Managing Data

SELECT: Used to find and show data.
SELECT FirstName FROM Students WHERE Surname = 'Smith';

INSERT: Used to add a new record.
INSERT INTO Students (StudentID, FirstName, Surname) VALUES (101, 'Ali', 'Khan');

UPDATE: Used to change existing data.
UPDATE Students SET Surname = 'Jones' WHERE StudentID = 101;

DELETE: Used to remove a record.
DELETE FROM Students WHERE StudentID = 101;

Common Mistake to Avoid: When using UPDATE or DELETE, always remember your WHERE clause! If you forget it, you might accidentally delete everyone in the database or change everyone's name to 'Jones'.

Key Takeaway: SQL is a standard language for defining tables and manipulating the data inside them.

4. Client-Server Databases

In the real world, many people (clients) try to access the same database (server) at the same time. This can cause "traffic jams" known as concurrency issues.

The Problem: Lost Updates

Imagine two people trying to book the very last seat on a flight at the exact same millisecond. If the system isn't careful, it might let both of them book it, or one person's booking might "overwrite" the other's. This is a lost update.

The Solutions:

To keep data safe (preserve integrity), we use these methods:

Record Locks: While one person is editing a record, the database "locks" it so no one else can change it until the first person is finished. It's like taking a book out of a library; no one else can read it until you bring it back.

Serialisation: This ensures that transactions are processed one after another, rather than at the same time, so they don't interfere with each other.

Timestamp Ordering: Every transaction gets a "time sticker." If two people try to change the same data, the one with the earlier timestamp gets processed first.

Commitment Ordering: Transactions are ordered based on their dependencies and the time they are "committed" (finalized) to ensure they don't clash.

Did you know? Banks use these techniques every second to make sure your balance doesn't accidentally double (or disappear!) when you and a friend both send money at the same time.

Key Takeaway: Concurrent access can break data. We use locks and ordering techniques to make sure every update is handled correctly without losing data.