Welcome to the World of Databases!

In this chapter, we are exploring how computers store and organize massive amounts of information. Think of a database as the "brain" behind your favorite apps. Whether it's Spotify keeping track of millions of songs or your school keeping records of students, they all use databases to keep things tidy and searchable. By the end of these notes, you’ll understand how to design, build, and manage these digital filing cabinets like a pro!

1. The Basics: Flat Files vs. Relational Databases

Before we get into the clever stuff, we need to know the two main ways data is stored.

Flat File Databases

Imagine a single giant spreadsheet where everything is crammed into one table. This is a Flat File database. While it's simple for small lists (like a shopping list), it becomes a nightmare for big projects because you end up typing the same information over and over again. This is called data redundancy.

Relational Databases

This is the gold standard for modern computing. Instead of one big table, a Relational Database splits data into several smaller tables that are linked together. Example: Instead of writing a customer's address every time they buy a DVD, you have one table for 'Customers' and one for 'Orders', and you link them using a unique ID.

2. Keys: The "ID Cards" of Databases

To keep everything organized and linked, we use different types of Keys:

  • Primary Key: A unique identifier for every record in a table. Think of it like your Passport Number or Fingerprint—no two people have the same one!
  • Foreign Key: This is a Primary Key from one table that appears in another table to create a link (a relationship) between them.
  • Secondary Key: Also called an 'index'. This is a field used to search for data quickly. Example: You might use 'Surname' as a secondary key because, while not unique, it’s a common way people want to search.

Quick Review: Relational databases use Primary Keys and Foreign Keys to link tables together and avoid messy, repeated data.

3. Entity Relationship (ER) Modelling

Before building a database, you need a map. This is called an Entity Relationship Diagram (ERD). We look at the "Entities" (the things we store data about) and how they talk to each other.

There are three main types of relationships:

  1. One-to-One: One husband has one wife (usually!).
  2. One-to-Many: One mother can have many children. (This is the most common type in databases).
  3. Many-to-Many: Many students study many subjects. (Note: In a database, we usually have to break these down into two one-to-many relationships using a 'link table').

4. Normalisation: The "Tidying Up" Process

Normalisation is just a fancy word for making a database efficient. We do this to eliminate redundancy (repeated data) and ensure data integrity (making sure the data stays accurate).

Don’t worry if this seems tricky! Just remember this famous mnemonic for the three stages (Normal Forms):
"The Key, The Whole Key, and Nothing but the Key."

First Normal Form (1NF)

A table is in 1NF if there are no repeating groups of data and every piece of data is "atomic" (broken down into its smallest part). Example: You wouldn't put "John Smith" in one box; you'd have a 'First Name' column and a 'Surname' column.

Second Normal Form (2NF)

The table must already be in 1NF. To get to 2NF, you must remove partial dependencies. This means every column must depend on the whole Primary Key. This usually applies when you have a "Composite Key" (a primary key made of two columns).

Third Normal Form (3NF)

The table must be in 2NF. To reach 3NF, you remove transitive dependencies. This means no column should depend on anything other than the Primary Key. "Nothing but the Key!"

Key Takeaway: Normalisation makes databases smaller, faster, and less likely to have errors when you update information.

5. Structured Query Language (SQL)

SQL is the language we use to talk to databases. You need to be able to read and write basic commands:

  • SELECT: Which columns do you want?
  • FROM: Which table is the data in?
  • WHERE: What are the search criteria? (e.g., Price > 10)
  • ORDER BY: Do you want it in alphabetical or numerical order?
  • JOIN: Linking two tables together using their keys.

Example Query:
SELECT Title, Author
FROM Books
WHERE Genre = 'Sci-Fi'
ORDER BY Title ASC

6. Managing Data: Capture, Select, and Exchange

How does data actually get into and out of the system?

  • Capturing: Getting data into the database. This could be via online forms, barcode scanners, or OCR (Optical Character Recognition).
  • Selecting: Choosing the specific data you need (usually using SQL).
  • Managing: Keeping data updated, secure, and backed up.
  • Exchanging: Moving data between systems. Common formats for this are JSON and XML because they are easy for different types of computers to read.

7. Transactions and ACID

A Transaction is a single operation on a database. Think of transferring money from your bank account to a friend's. You need to make sure the money leaves your account and arrives in theirs. If only half happens, it's a disaster!

To ensure transactions are safe, we use the ACID rules:

  • Atomicity: The transaction must be "all or nothing." If any part fails, the whole thing is cancelled.
  • Consistency: The database must follow all its rules (e.g., you can't have a negative bank balance if the rules forbid it).
  • Isolation: If two people try to update the same record at once, the transactions must be processed one after the other, not at the same time.
  • Durability: Once a change is made, it's permanent, even if the power goes out a second later.

Record Locking

To help with Isolation, databases use Record Locking. If you are editing a record, the database "locks" it so nobody else can change it until you are finished. This prevents data from being overwritten or corrupted.

Did you know? Without Record Locking, two people could book the same seat on a plane at the exact same time!

8. Referential Integrity

This is a simple but vital concept. Referential Integrity ensures that you can't have a "dangling" link. If a 'Student' table has a Foreign Key pointing to a 'Class' ID, that Class must actually exist in the 'Classes' table. You can't delete a class if there are still students assigned to it!

Quick Review Box

- Flat File: One table.
- Relational: Multiple linked tables.
- Primary Key: Unique ID.
- SQL: The language used to query data.
- ACID: Rules to keep transactions safe.
- Normalisation: The process of tidying tables to 3NF.

Don't worry if Normalisation feels like a puzzle at first—practice with a few tables and you'll soon start seeing the patterns! You've got this!