Welcome to the World of Data Manipulation!
Hello there! In this chapter, we are going to explore one of the most important parts of Information Technology: Data and Databases. Think of data as the raw ingredients in a kitchen. On their own, they might not seem like much, but when you manipulate them correctly, you can create something incredibly useful—like a delicious meal (or in our case, Information).
Don't worry if some of the terms sound a bit "techy" at first. We’ll break everything down step-by-step with simple examples and analogies. Let's get started!
5.1 Data and Information
Before we can manipulate data, we need to know exactly what it is and how it differs from information.
The Difference Between Data and Information
Data is raw, unprocessed facts and figures. It has no meaning on its own.
Example: 15, 08, 1995. These are just numbers.
Information is data that has been processed, organized, or structured so that it has meaning and context.
Example: "15/08/1995" identified as a "Date of Birth." Now it means something!
Analogy: The Lego Bricks
Imagine a giant pile of individual Lego bricks on the floor. That is Data. When you follow the instructions and build a castle, the finished castle is Information. The act of building the castle is Processing.
Structured vs. Unstructured Data
Not all data looks the same. We usually put it into two categories:
1. Structured Data: This is data that is highly organized, usually in a table with rows and columns. It is very easy for computers to search and analyze.
Example: A spreadsheet of student names and grades.
2. Unstructured Data: This is data that doesn't have a pre-defined format. It is much harder for computers to "read" automatically.
Example: Photos, videos, social media posts, or handwritten notes.
Quick Review Box:
- Data: Raw facts (The "ingredients").
- Information: Processed data (The "meal").
- Structured: Organized (Like a tidy filing cabinet).
- Unstructured: Unorganized (Like a messy pile of papers).
Key Takeaway: Organizations want to extract meaningful information from data because it helps them make better decisions, like a shop seeing which products are selling best so they can order more.
5.2 Structured Data and Databases
A database is an organized collection of structured data. We use Relational Databases to link different pieces of information together.
The Structure of a Relational Database
To understand a database, imagine a digital version of a school's filing system:
- Tables: A database is made of one or more tables. A table contains data about one specific thing (like "Students" or "Teachers").
- Fields: These are the columns in the table. Each field holds one specific piece of information.
Example: "FirstName" or "PhoneNumber."
- Records: These are the rows in the table. One record contains all the information about one specific item or person.
Example: All the details for one student named "Alex."
Understanding "Keys"
Keys are like the glue that holds a database together.
1. Primary Key: A unique identifier for every record in a table. No two records can have the same Primary Key.
Example: Your Student ID number. Even if two people have the same name, their ID is unique!
2. Foreign Key: This is a field in one table that links to the Primary Key in another table. It's how we create a relationship between tables.
3. Composite Key: Sometimes, one field isn't enough to be unique. A Composite Key is when you combine two or more fields together to create a unique identifier.
Memory Aid: The Key Trick
- Primary: "Number One" (The main ID).
- Foreign: "From Somewhere Else" (It points to another table).
- Composite: "Compound" (A mix of two things).
Entities and Relationships
An Entity is just a fancy word for the "thing" we are storing data about (like a Customer, a Product, or a Sale). In a database, entities "talk" to each other through Relationships:
- One-to-One (1:1): One person has one passport.
- One-to-Many (1:M): One teacher has many students. (This is the most common!).
- Many-to-Many (M:M): Many students take many different subjects.
Did you know? We use Entity Relationship Diagrams (ERDs) to draw maps of how these tables link together before we build the database. It’s like a blueprint for a house!
Key Takeaway: Databases use tables, fields, and records to stay organized, and "Keys" are used to link everything together accurately.
5.3 Structured Query Language (SQL)
Now that our data is in a database, how do we get it out? We use SQL (pronounced "S-Q-L" or "Sequel"). SQL is the language we use to "talk" to the database.
Why use SQL?
SQL allows us to manipulate data—this means we can find specific records, add new data, change existing data, or delete things we no longer need.
Common SQL Commands
Don't worry if this seems like a lot of code—it's actually very logical!
1. Finding Data (Queries):
We use SELECT to pick the fields, FROM to pick the table, and WHERE to filter the results.
Example: SELECT Name FROM Students WHERE Grade = 'A';
2. Adding and Changing Data:
- INSERT: Adds a new row (record) to a table.
- UPDATE (Amend): Changes data that is already there.
- DELETE: Removes a record forever. (Be careful with this one!)
3. Grouping and Counting:
- COUNT: Tells you how many records there are.
- ORDER BY: Sorts your results (A-Z or highest to lowest).
- GROUP BY: Groups similar items together (e.g., showing how many students are in each class).
Using Wildcards
Sometimes you don't know the exact word you are looking for. We use Wildcards as "fill-in-the-blanks":
- % (Percent): Represents any number of characters.
Example: 'S%' will find "Sam", "Sarah", and "Steven".
- _ (Underscore): Represents exactly one single character.
Example: 'H_t' will find "Hat" and "Hot", but not "Heat".
Linking Tables (JOIN and UNION)
- JOIN: This is used to combine rows from two or more tables based on a related column (usually a Primary Key and Foreign Key).
- UNION: This is used to combine the result-set of two or more SELECT statements into one big list.
Common Mistake to Avoid:
Students often forget to put semicolons (;) at the end of their SQL statements. Think of a semicolon as the "period" or "full stop" at the end of a sentence in the database world!
Key Takeaway: SQL is the tool that lets us perform actions on our data. By using commands like SELECT, INSERT, and JOIN, we can turn a massive database into useful information.
Final Quick Review
- Data is raw; Information has meaning.
- Relational Databases use Tables linked by Keys.
- Primary Keys are unique IDs; Foreign Keys link to other tables.
- SQL is the language used to select, add, update, and delete data.
- Wildcards (%) and (_) help us search when we don't have the full name.
You've reached the end of the "Manipulating Data" notes! Great job! Take a break, and then try writing a simple SQL query to see how it feels. You've got this!