Introduction: Welcome to Information Coding!

Ever wondered how a computer knows that a 1 is the number one, but also knows when a 1 is just a character you typed in an email? Or how it manages to send messages across the world without them getting scrambled?

In this chapter, we are going to dive into Information Coding Systems. This is the "secret language" computers use to turn human symbols into 1s and 0s. We’ll look at how text is represented and how computers spot mistakes when data is being moved. Don't worry if it seems like a lot—we’ll break it down bit by bit!


4.5.5.1 Character Form of a Decimal Digit

It is tempting to assume that the digit 5 is always stored the same way inside a computer. However, a computer treats numbers differently depending on how they are being used.

The Difference Between Binary and Character Codes

  • Pure Binary Representation: This is used when the computer wants to do maths. If you want to calculate \(5 + 2\), the computer sees the value of 5 as \(101_2\).
  • Character Code Representation: This is used when the computer treats the number like text. If you type the digit "5" into a Word document, the computer doesn't care about its mathematical value. It just needs a "label" or a "code" to know which shape to show on the screen.

Common Mistake: Students often think the character "5" is stored as binary \(101_2\). In reality, in the ASCII system, the character "5" is actually stored as the code 53 (binary \(00110101_2\))!

Key Takeaway: Use Pure Binary for calculations and Character Codes for displaying text on a screen.
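You can see this difference for yourself. The sketch below uses Python's built-in `ord()` and `bin()` to compare the value five with the character "5":

```python
# The value five vs. the character "5" - two different things to a computer.

value = 5        # pure binary: used for arithmetic
char = "5"       # character code: used for display

print(bin(value))           # 0b101      -> the mathematical value five
print(ord(char))            # 53         -> the ASCII code for the character "5"
print(bin(ord(char)))       # 0b110101   -> i.e. 00110101 with leading zeros

print(value + 2)            # 7    -> arithmetic works on values
print(char + "2")           # 52   -> "adding" characters just joins text!
```

The last two lines are the key: `5 + 2` gives `7`, but `"5" + "2"` gives the text `"52"`, because character codes are labels, not values.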


4.5.5.2 ASCII and Unicode

To make sure all computers understand the same characters, we use standard systems. Think of these like a universal dictionary where every letter has a specific page number.

1. ASCII (American Standard Code for Information Interchange)

ASCII was one of the first major systems. It originally used 7 bits, which allowed for \(2^7 = 128\) different characters. This was enough for:

  • English uppercase and lowercase letters
  • Digits 0-9
  • Punctuation symbols
  • Control characters (like "Enter" or "Delete")
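If you want to explore the ASCII table yourself, Python's `ord()` (character to code) and `chr()` (code to character) make it easy. This short sketch prints a few well-known codes:

```python
# Exploring the ASCII table with ord() and chr().

for ch in ["A", "a", "0", "5"]:
    print(ch, "->", ord(ch))    # A->65, a->97, 0->48, 5->53

print(chr(65))                  # A   -> chr() goes the other way

# Every 7-bit ASCII code fits in the range 0-127.
print(ord("~"))                 # 126 -> the last printable ASCII character
```

Notice that the digits "0" to "9" occupy codes 48 to 57, which is why the character "5" is stored as 53 rather than 5.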

2. Unicode

As the internet grew, 128 characters weren't enough. We needed symbols for Chinese, Arabic, emojis, and mathematical symbols.

Unicode was introduced to solve this. It uses a much larger number of bits per character (commonly 16 or 32). Because it has more bits, it can represent over a million different characters, covering almost every language on Earth.

Analogy: ASCII is like a small local post office with 128 mailboxes. Unicode is like a massive global distribution center with millions of mailboxes for everyone in the world.

Quick Review: Why was Unicode introduced?
ASCII didn't have enough space for non-English characters or special symbols. Unicode provides a unique code for every character, no matter the language or platform.
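Python strings are Unicode, so `ord()` also works for characters far beyond ASCII. This sketch shows that ASCII is simply the first 128 "mailboxes" of Unicode, while other languages and emojis live at much higher code points:

```python
# Unicode code points: ASCII is the first 128 entries of a much bigger table.

for ch in ["A", "é", "中", "😀"]:
    print(ch, "->", ord(ch), hex(ord(ch)))

# "A"  -> 65      0x41     (same code as in ASCII)
# "é"  -> 233     0xe9     (beyond 7-bit ASCII)
# "中" -> 20013   0x4e2d   (Chinese character)
# "😀" -> 128512  0x1f600  (emoji - needs far more than 8 bits)
```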


4.5.5.3 Error Checking and Correction

When data is sent from one place to another (like downloading a file), interference or "noise" can cause a 1 to flip into a 0. We need ways to catch these mistakes!

1. Parity Bits

An extra bit (the parity bit) is added to a string of binary code to make the total number of 1s either even or odd.

  • Even Parity: The total number of 1s (including the parity bit) must be an even number.
  • Odd Parity: The total number of 1s must be an odd number.

Example (Even Parity): You want to send \(1101101\). There are five 1s. To make it even, the parity bit becomes 1. The final code is \(1101101\mathbf{1}\).
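The even-parity example above can be sketched in a few lines of Python (the function names here are just illustrative):

```python
def add_even_parity(bits: str) -> str:
    """Append a parity bit so the total count of 1s is even."""
    parity = "1" if bits.count("1") % 2 == 1 else "0"
    return bits + parity

def passes_even_parity(bits: str) -> bool:
    """The receiver's check: is the count of 1s even?"""
    return bits.count("1") % 2 == 0

sent = add_even_parity("1101101")   # five 1s, so the parity bit is 1
print(sent)                          # 11011011

print(passes_even_parity(sent))      # True  -> arrived intact
print(passes_even_parity("11001011"))  # False -> one bit flipped in transit
```

Note the weakness mentioned in the summary below: if *two* bits flip, the count of 1s is even again and the error slips through undetected.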

2. Majority Voting

This is a clever but "expensive" way to fix errors. The computer sends every bit three times. If one bit gets flipped, the computer looks at the other two and takes the "majority" as the truth.

Example: You want to send a 1. You send 1 1 1. If the receiver gets 1 0 1, it assumes the 0 was an error and corrects it back to 1.
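A minimal sketch of majority voting: the sender triples every bit, and the receiver takes a vote within each group of three. The cost is clear in the code, as the transmitted string is three times longer:

```python
def encode_majority(bits: str) -> str:
    """Send every bit three times."""
    return "".join(b * 3 for b in bits)

def decode_majority(received: str) -> str:
    """For each group of three, take whichever bit appears at least twice."""
    out = []
    for i in range(0, len(received), 3):
        group = received[i:i + 3]
        out.append("1" if group.count("1") >= 2 else "0")
    return "".join(out)

print(encode_majority("10"))        # 111000 -> 3x the data
print(decode_majority("101000"))    # 10     -> the flipped bit is out-voted
```

Majority voting can *correct* a single flipped bit per group, but if two bits in the same group flip, the vote goes the wrong way.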

3. Checksums

The computer runs a mathematical formula on the data being sent and produces a total (the checksum). This total is sent along with the data. The receiver runs the same formula. If their total doesn't match the checksum sent, an error has occurred.
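Real systems use formulas such as CRC, but a toy checksum (summing all the byte values modulo 256, an assumption made for this sketch) shows the idea:

```python
def checksum(data: bytes) -> int:
    """A toy checksum: add up every byte value, keep the total mod 256."""
    return sum(data) % 256

payload = b"HELLO"
total = checksum(payload)        # sender computes this and sends it along
print(total)                     # 116

# Receiver runs the same formula on what arrived:
print(checksum(b"HELLO") == total)   # True  -> totals match, data looks fine
print(checksum(b"HELLQ") == total)   # False -> totals differ, error detected!
```

Like parity, a simple checksum detects errors but cannot say *which* byte was damaged, so the usual fix is to ask the sender to retransmit.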

4. Check Digits

You see these every day on barcodes and ISBN numbers on books. A check digit is a single digit at the end of a long identification number. It is calculated based on the other digits. If you type the number in wrong, the calculation won't match the check digit, and the system will throw an error.

Did you know? Check digits are specifically designed to catch human errors, like swapping two numbers around when typing (e.g., typing 45 instead of 54).
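A real example is the ISBN-10 scheme used on books: the first nine digits are weighted 10 down to 2, and the check digit is chosen so the whole weighted sum is divisible by 11. A sketch:

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Compute the ISBN-10 check digit for a 9-digit string.

    Digits are weighted 10, 9, ..., 2; the check digit makes the
    weighted total divisible by 11 (the digit 10 is written as 'X').
    """
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)

print(isbn10_check_digit("030640615"))   # 2 -> full ISBN is 0-306-40615-2

# Swapping two adjacent digits changes the weighted sum,
# so the check digit no longer matches:
print(isbn10_check_digit("036040615"))   # a different digit -> typo caught
```

Because the weights differ for neighbouring positions, swapping two adjacent digits always changes the total, which is exactly the human typing error this scheme is designed to catch.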

Summary Table for Error Checking:

  • Parity Bits: Simple, but can't fix the error (only detect it).
  • Majority Voting: Can fix errors, but uses 3x more data.
  • Checksums: Great for large blocks of data.
  • Check Digits: Used for data entry (barcodes/IDs).

Don't worry if the binary maths feels tricky at first! Just remember the main goal: we need to represent human characters as numbers, and we need to make sure those numbers don't change while they travel through the computer.

Key Takeaways for the Exam:

  • ASCII = 7 or 8 bits (English only).
  • Unicode = 16 or 32 bits (Global languages).
  • Numbers can be Pure Binary (value) or Character Codes (labels).
  • Errors happen when bits flip; we use Parity, Majority Voting, Checksums, and Check Digits to find or fix them.