Welcome to the World of Character Encoding!

Ever wondered how your computer knows that a specific bunch of 1s and 0s should look like the letter 'A' on your screen? Or how it manages to show a thumbs-up emoji?

In this chapter, we are going to explore Character Encoding. Think of it as a secret decoder ring that computers use to translate binary numbers into the letters, numbers, and symbols we use every day. Don't worry if binary numbers feel a bit "maths-heavy" – we’re going to break it down step-by-step!

1. What is a Character Set?

Before we dive into the specific codes, we need to understand the "big picture." A character set is basically a giant look-up table. It contains all the characters (letters, numbers, symbols) that a computer system can recognize, and it assigns a unique binary number (a character code) to every single one.

The Analogy: Imagine you and your best friend have a secret code. You decide that 1 = 'Hello', 2 = 'Goodbye', and 3 = 'Lunch?'. If you send the number "3", your friend knows exactly what you mean. That "list" of rules is your character set!

Quick Review: Key Terms

Character: A single symbol, like 'a', 'Z', '!', or '7'.
Character Set: The complete collection of characters that a computer can represent.
Character Code: The specific binary number assigned to one character.

Key Takeaway: Without a character set, a computer would just see a mess of binary numbers and wouldn't know how to display text at all.
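If you fancy seeing this idea in action, here is a tiny sketch in Python (Python isn't part of this chapter; the made-up codes below are purely for illustration, not real ASCII values):

```python
# A miniature, invented character set: each character maps to a unique code.
# (These codes are made up for illustration; real systems use ASCII/Unicode.)
mini_charset = {'A': 1, 'B': 2, '!': 3}

# Encoding: look up the code for a character.
code = mini_charset['B']
print(code)  # 2

# Decoding: reverse the look-up table to get the character back.
reverse = {code: char for char, code in mini_charset.items()}
print(reverse[2])  # B
```

Notice that decoding only works because every character has a *unique* code, which is exactly why a character set must never assign the same code twice.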

2. ASCII: The "Original" Code

ASCII (which stands for American Standard Code for Information Interchange) was one of the first major character sets. Specifically, the AQA syllabus focuses on 7-bit ASCII.

How it works:
• It uses 7 bits for each character.
• This means it can represent \(2^7 = 128\) different characters.
• It includes English letters (capital and lowercase), numbers, and basic punctuation (like ! ? . ,).
• It also includes control characters (like the code sent by the 'Enter' key) and the space character.
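You can check these facts for yourself in Python (not required by the syllabus, but handy for revision): the built-in `ord()` gives a character's code, and `chr()` goes the other way.

```python
# 7-bit ASCII covers codes 0-127, i.e. 2**7 = 128 characters.
print(2 ** 7)      # 128
print(ord('A'))    # 65
print(ord('!'))    # 33
print(chr(32))     # prints a space character

# Every 7-bit ASCII code fits in 7 binary digits:
print(format(ord('A'), '07b'))  # 1000001
```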

The "Sequence" Trick (Very important for exams!)

In ASCII, character codes run in alphabetical sequence. This is a lifesaver in exams! If you know the code for 'A', you can figure out the rest just by counting.

Example:
If the exam tells you that 'A' is 65, you can easily find 'C':
• A = 65
• B = 66
• C = 67

Common Mistake to Avoid: Capital letters and lowercase letters have different codes! 'A' is not the same as 'a'. In ASCII, 'A' starts at 65, but 'a' starts at 97.

Did you know? Even though ASCII only uses 7 bits, computers usually store it in a full 8-bit byte, leaving the 8th bit as a 0.
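The "sequence trick" and the capital/lowercase gap are easy to verify with a couple of lines of Python (again, just for exploring — in the exam you'll count by hand):

```python
# The alphabetical-sequence trick: letter codes run consecutively,
# so you can count forward from a known code.
print(ord('A'))             # 65
print(chr(ord('A') + 2))    # C  (65 + 2 = 67)

# Capitals and lowercase have different codes:
print(ord('a'))             # 97
print(ord('a') - ord('A'))  # 32 (the gap between the two cases)
```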

3. Unicode: The Global Upgrade

ASCII was great, but it had a big problem: 128 characters isn't enough for the whole world! What about Chinese characters, Arabic script, or even just the € symbol? That's why Unicode was created.

Why Unicode is better:
• It uses more bits (often 16 or 32 bits), allowing for millions of possible characters.
• It can represent every language in the world.
• It includes emojis! ✊🎨🚀

Unicode and ASCII: Best Friends

Unicode was designed to be "backward compatible" with ASCII. This means that for the first 128 characters (codes 0 to 127), Unicode uses the exact same codes as ASCII.

Example: The character code for 'A' is 65 in ASCII, and it is also 65 in Unicode!
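In Python, `ord()` actually reports Unicode code points, which makes the backward compatibility easy to see for yourself (not exam-required, just a nice check):

```python
# For characters that exist in 7-bit ASCII, the Unicode code point
# is identical to the old ASCII code.
print(ord('A'))    # 65 (same as ASCII)

# Characters outside ASCII simply get larger code points:
print(ord('€'))    # 8364   (too big for ASCII's 0-127 range)
print(ord('🚀'))   # 128640 (emojis live way beyond ASCII)
```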

Don't worry if this seems tricky: You don't need to know the specific technical versions (like UTF-8). You just need to know that Unicode is larger and more inclusive than ASCII.

Key Takeaway: ASCII is small and simple (128 characters), while Unicode is huge and global (enough for every language and emoji).

4. How to use an Encoding Table

In your exam, you might be given a small table and asked to convert between characters and codes. Let's practice!

Step-by-Step: Converting Character to Code

1. Look at the character you are given (e.g., 'B').
2. Find 'B' in the table provided.
3. Look at the number next to it (e.g., 66).
4. Convert that number into binary if the question asks for it.
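The steps above can be sketched in a few lines of Python, if you want to double-check your by-hand answers (in the exam you'll use the printed table, not `ord()`):

```python
# Character -> code -> binary, mirroring the steps above.
char = 'B'
code = ord(char)            # steps 2-3: look up the character's code
print(code)                 # 66
print(format(code, '07b'))  # step 4: write it as 7-bit binary -> 1000010
```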

Step-by-Step: Converting Code to Character

1. Take the binary number (e.g., 0100 0011).
2. Convert it to a decimal number (e.g., 67).
3. Find 67 on the encoding table.
4. See which character it represents (e.g., 'C').
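Going the other way looks like this in Python (the same steps, just automated, so you can check your counting):

```python
# Binary -> decimal -> character, mirroring the steps above.
binary = '0100 0011'
decimal = int(binary.replace(' ', ''), 2)  # step 2: binary to decimal
print(decimal)                             # 67
print(chr(decimal))                        # steps 3-4: look it up -> C
```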

Memory Aid: ASCII starts with A, which is 65. If you forget, think "Age 65 is when people used to retire... and ASCII is an old system!"

Quick Summary Review

1. Character Set: A list of characters and their binary codes.
2. 7-bit ASCII: Can represent 128 characters. Good for English, but limited.
3. Unicode: Can represent millions of characters. Used for all languages and emojis.
4. Sequence: Codes run in order (A=65, B=66, C=67).
5. Compatibility: Unicode uses the same codes as ASCII for the first 128 characters (codes 0 to 127).

Final Tip: When you see a question about "why we moved from ASCII to Unicode," the answer is almost always "to represent more characters from different languages and symbols."