Welcome to the World of Databases!

Ever wondered how Spotify keeps track of millions of songs, or how your school manages thousands of student records without losing anything? The secret is Databases! In this chapter, we will learn how to organize information so it is easy to find, update, and keep safe. Don't worry if it seems like a lot of terms at first; think of a database as just a very smart, digital filing cabinet.


1. The Building Blocks: Tables, Records, and Fields

Before we build a complex system, we need to understand the basic structure of a Relational Database.

What makes up a Database?

Imagine a spreadsheet for a class list:

  • Table: The entire collection of data about one specific subject (e.g., a "Students" table).
  • Record (or Row): A single horizontal entry. It represents one "thing" (e.g., one specific student).
  • Field (or Column): A single vertical category. It represents one "attribute" or characteristic (e.g., "Phone Number" or "Date of Birth").
  • Attribute: This is just another name for a Field. It describes a property of the entity.

Quick Review: If you have a table of "Books," the Author is a field, and "Harry Potter and the Philosopher's Stone" is part of a record.


2. The "Keys" to the Database

In a database, we need ways to identify data and link tables together. We use Keys for this.

Types of Keys

  • Primary Key: A unique identifier for every record. No two records can have the same Primary Key. Example: Your Admission Number or NRIC.
  • Composite Key: When one field isn't enough to be unique, we combine two or more fields to make a unique identifier. Example: Combining "Course ID" and "Student ID" to identify a specific enrollment.
  • Foreign Key: A field in one table that is the Primary Key in another table. This is how we "link" tables together.
  • Secondary Key: A field that is not the primary key but is used frequently to search for data. Example: Searching for a student by "Last Name" instead of their ID number.

Memory Aid: Think of the Primary Key as your Fingerprint (unique to you) and the Foreign Key as a Reference Letter (it points to someone else).


3. Data Redundancy and Dependency

A good database should be "lean and mean." We want to avoid two big problems:

Data Redundancy

This happens when the same piece of data is stored in multiple places. Example: Writing a student's home address in five different tables.
Why is it bad? It wastes storage space and leads to "Update Anomalies" (if the student moves, you might forget to update their address in all five places, leading to inconsistent data).

Data Dependency

This refers to the relationship between fields. We want fields to depend only on the Primary Key. If a piece of information (like a "Doctor's Phone Number") depends on something other than the Primary Key (like the "Doctor's Name"), it can cause confusion when data is deleted or changed.

Key Takeaway: We want to reduce redundancy to keep data accurate and save space!


4. Normalization: The Three Steps to a Perfect Database

Normalization is the process of organizing a database to reduce redundancy. For H2 Computing, you need to know how to get a table into Third Normal Form (3NF).

The Step-by-Step Process

  1. First Normal Form (1NF): Remove "repeating groups." Every cell must contain only one single value (atomic value). No lists allowed inside a box!
  2. Second Normal Form (2NF): Must be in 1NF first. Then, remove Partial Dependencies. This means every field must depend on the entire Primary Key (this is usually only an issue if you have a Composite Key).
  3. Third Normal Form (3NF): Must be in 2NF first. Then, remove Transitive Dependencies. This means "Non-key fields should not depend on other non-key fields."

Mnemonics: To reach 3NF, a field must depend on: "The Key (1NF), the Whole Key (2NF), and Nothing But the Key (3NF), so help me Codd!" (E.F. Codd invented this system).


5. Entity-Relationship (ER) Diagrams

An ER Diagram is a map of how tables relate to each other. It uses boxes and lines to show connections.

  • Entity: Represented by a rectangle (e.g., [STUDENT]).
  • Relationship: The connection between them.

Common Relationship Types

  • One-to-One (1:1): One person has one passport.
  • One-to-Many (1:M): One mother can have many children. (This is the most common in databases!)
  • Many-to-Many (M:N): Many students take many subjects. Note: In a real database, we usually break these down into two 1:M relationships using a "Link Table."

6. SQL: Talking to Your Database

Structured Query Language (SQL) is the language we use to "ask" the database for information or to make changes.

Essential SQL Statements

  • SELECT: Used to pick which fields you want to see.
  • FROM: Which table the data is coming from.
  • WHERE: Filters the data (like a "search" condition).
  • ORDER BY: Sorts the results (ASC for ascending, DESC for descending).
  • INSERT INTO: Adds a new record.
  • UPDATE: Changes existing data.
  • DELETE: Removes a record.
  • JOIN: Combines rows from two or more tables based on a related column.

Example Query:
SELECT Name, Score FROM Students WHERE Score > 70 ORDER BY Score DESC;
(This would show the names and scores of all students who scored above 70, starting with the highest score.)


7. SQL vs. NoSQL: The Modern Choice

Traditional databases use SQL and strict tables. But sometimes, data is too messy or moves too fast for tables. That's where NoSQL comes in.

Comparing the Two

SQL (Relational):

  • Uses structured tables with rows and columns.
  • Great for data that doesn't change its "shape" often (e.g., bank accounts).
  • Strong consistency (data is always accurate across the system).

NoSQL (Non-Relational):

  • Addresses shortcomings like Scaling (NoSQL can handle massive amounts of data across many servers easily).
  • Flexible "Schema" (You don't need a fixed table; data can be stored as documents, graphs, or key-value pairs).
  • Great for "Big Data" and real-time web apps (e.g., social media feeds).

Did you know? SQL databases are like a Library (very organized, specific spots for everything), while NoSQL is like a Laundry Basket (easy to toss things in quickly, flexible, but requires a different way to find things).


8. Working with Databases in Python

In your practical exams, you will likely use the sqlite3 module in Python to work with SQL databases.

The Python-SQL Workflow

  1. Connect: Create a connection to the database file.
  2. Cursor: Create a "cursor" object (think of this as your digital "pointing finger" that executes commands).
  3. Execute: Use cursor.execute("SQL STATEMENT HERE").
  4. Commit: Use connection.commit() to save your changes!
  5. Close: Always close the connection when finished.

Common Mistake: Forgetting to commit! If you don't commit, your INSERT or UPDATE commands won't be saved to the file, and you'll wonder why your data disappeared!


Quick Review Box

Primary Key: Unique ID.
Foreign Key: Link to another table.
Normalization: Reducing data repetition (Redundancy).
1NF: No lists in cells.
2NF: No partial dependency.
3NF: No transitive dependency.
SQL: The language of databases.
NoSQL: Flexible, handles "Big Data" better than SQL.