Welcome to Topic 2: Data Representation!

Ever wondered how your computer knows the difference between a high-definition movie, a text message, and a video game? At its heart, a computer is just a collection of tiny switches that can be either ON or OFF. In this chapter, we will learn how computers use these simple switches (binary) to represent everything in our digital world. Don't worry if it seems like a lot of maths at first. Once you learn the patterns, it's like learning a secret code!

2.1 The Power of Binary

What is Binary?

Computers use binary because they are made of transistors that act like switches. We represent "OFF" as 0 and "ON" as 1. These individual 0s and 1s are called bits (short for Binary Digits).

Did you know? A single bit is the smallest unit of data a computer can store. It's like a single light switch in your house!

Number of States

If you have a certain number of bits, how many different things can you represent? There is a simple formula for this: \( 2^n \), where \( n \) is the number of bits.
Example: If you have 3 bits, you can represent \( 2^3 = 8 \) different states (000, 001, 010, 011, 100, 101, 110, 111).
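If you want to check the \( 2^n \) rule for yourself, here is a minimal Python sketch. The function names `count_states` and `list_states` are made up for this example and are not part of any standard library.

```python
from itertools import product

def count_states(n_bits: int) -> int:
    """Return how many different patterns n_bits can hold (2 to the power n)."""
    return 2 ** n_bits

def list_states(n_bits: int) -> list[str]:
    """List every bit pattern of the given length, in counting order."""
    # product("01", repeat=3) yields ('0','0','0'), ('0','0','1'), and so on
    return ["".join(bits) for bits in product("01", repeat=n_bits)]

print(count_states(3))  # 8
print(list_states(3))   # ['000', '001', '010', '011', '100', '101', '110', '111']
```

Try changing the 3 to a 4: every extra bit doubles the number of states.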

Unsigned vs. Signed Integers

Computers need to store both positive and negative numbers:
1. Unsigned Integers: These store only zero and positive numbers (0 and upwards). Every bit has a positive place value.
2. Two's Complement (Signed Integers): This is a method used to represent both positive and negative numbers. The most significant bit (the one furthest to the left) has a negative place value, while all the other bits keep their usual positive values.

Quick Review: In an 8-bit Two's Complement number, the left-most bit represents \( -128 \) instead of \( +128 \). This allows us to store numbers from \( -128 \) to \( +127 \).
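The place-value idea can be sketched in a few lines of Python. This is only an illustration: `twos_complement_to_denary` is a hypothetical helper name, and it assumes the input is a string of 0s and 1s.

```python
def twos_complement_to_denary(bits: str) -> int:
    """Decode a two's complement bit string by adding up its place values."""
    n = len(bits)
    total = 0
    for position, bit in enumerate(bits):
        weight = 2 ** (n - 1 - position)   # 128, 64, 32, ... for 8 bits
        if position == 0:
            weight = -weight               # the left-most bit counts as negative
        total += weight * int(bit)
    return total

print(twos_complement_to_denary("10000000"))  # -128
print(twos_complement_to_denary("01111111"))  # 127
print(twos_complement_to_denary("11111011"))  # -5
```

The last example works out as \( -128 + 64 + 32 + 16 + 8 + 2 + 1 = -5 \).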

Binary Addition

Adding binary is just like normal addition, but you carry over when you hit 2!
- \( 0 + 0 = 0 \)
- \( 0 + 1 = 1 \)
- \( 1 + 1 = 0 \) (carry 1)
- \( 1 + 1 + 1 = 1 \) (carry 1)

Common Mistake: Forgetting about Overflow. If you add two 8-bit numbers and the result needs 9 bits, it no longer fits in the space available, and the computer might simply drop the extra bit. This is an overflow error and can cause programs to crash or give wrong answers!
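Here is one way the column-by-column rules above could be written in Python, including a check for overflow. It is a sketch, not how a real processor does it: `add_binary` is a hypothetical name, and it assumes both inputs are unsigned bit strings of the same length.

```python
def add_binary(a: str, b: str) -> tuple[str, bool]:
    """Add two equal-length unsigned bit strings.

    Returns the result (same length as the inputs) and an overflow flag.
    """
    carry = 0
    result = []
    # Work from the right-most column to the left, just like on paper
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))   # the digit written in this column
        carry = total // 2              # 1 if the column added up to 2 or 3
    # A carry left over after the last column means the answer did not fit
    return "".join(reversed(result)), carry == 1

print(add_binary("00000101", "00000011"))  # ('00001000', False)  5 + 3 = 8
print(add_binary("10010110", "01110001"))  # ('00000111', True)   150 + 113 overflows
```

In the second example the true answer is 263, which needs 9 bits. Only the lowest 8 bits (7) are kept, so the stored result is wrong.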

Binary Shifts

Shifting bits left or right is a fast way to multiply or divide.
- Logical Shift Left: Moves bits to the left and fills the right with 0s. A shift of 1 place multiplies the number by 2 (as long as no 1 is pushed off the left-hand end).
- Logical Shift Right: Moves bits to the right and fills the left with 0s. A shift of 1 place divides the number by 2 (dropping any remainder).
- Arithmetic Shift: Used for signed numbers. When shifting right, the sign bit is copied into the empty places on the left, so a negative number stays negative.
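The three shifts can be tried out on bit strings with a short Python sketch. The function names are invented for this example, and the code assumes fixed-width strings of 0s and 1s (bits pushed off the end are simply lost).

```python
def logical_shift_left(bits: str, places: int = 1) -> str:
    """Drop bits from the left and fill the right with 0s."""
    places = min(places, len(bits))
    return bits[places:] + "0" * places

def logical_shift_right(bits: str, places: int = 1) -> str:
    """Drop bits from the right and fill the left with 0s."""
    places = min(places, len(bits))
    return "0" * places + bits[:len(bits) - places]

def arithmetic_shift_right(bits: str, places: int = 1) -> str:
    """Drop bits from the right and fill the left with copies of the sign bit."""
    places = min(places, len(bits))
    return bits[0] * places + bits[:len(bits) - places]

print(logical_shift_left("00001100"))      # 00011000  (12 becomes 24)
print(logical_shift_right("00001101"))     # 00000110  (13 becomes 6, remainder dropped)
print(arithmetic_shift_right("11111000"))  # 11111100  (-8 becomes -4)
print(logical_shift_right("11111000"))     # 01111100  (sign bit lost: now +124)
```

The last two lines show why signed numbers need the arithmetic version: a logical shift right turns a negative two's complement number into a positive one.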

Hexadecimal (Hex)

Hexadecimal is a Base-16 system. It uses 0-9 and then A, B, C, D, E, F for values 10-15.
Why use it? It is much easier for humans to read and remember than long strings of binary. One Hex digit represents exactly 4 bits (a nibble).
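Because each hex digit stands for one nibble, you can convert by splitting the binary into groups of 4. A minimal Python sketch of that idea (with made-up function names `binary_to_hex` and `hex_to_binary`) might look like this:

```python
HEX_DIGITS = "0123456789ABCDEF"

def binary_to_hex(bits: str) -> str:
    """Convert a bit string to hex, one nibble (4 bits) at a time."""
    # Pad with 0s on the left so the length is a multiple of 4
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    nibbles = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(nibble, 2)] for nibble in nibbles)

def hex_to_binary(hex_string: str) -> str:
    """Convert each hex digit back into its 4-bit pattern."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_string)

print(binary_to_hex("10100011"))  # A3
print(hex_to_binary("2F"))        # 00101111
```

Notice that an 8-bit byte always fits in exactly two hex digits.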

Takeaway: Binary is for computers; Hex is a "shorthand" for humans to understand binary more easily.

2.2 Representing Text, Images, and Sound

Text: ASCII

To store text, computers give every character a unique number. 7-bit ASCII is a standard code that can represent 128 different characters (including capital letters, lowercase, numbers, and symbols).
Example: In ASCII, the letter 'A' is represented by the denary number 65.
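Python's built-in `ord()` and `chr()` let you look up character codes directly. The helper `text_to_ascii_bits` below is a hypothetical name used only for this sketch; it assumes every character is in the 7-bit ASCII range (0 to 127).

```python
def text_to_ascii_bits(text: str) -> list[str]:
    """Return the 7-bit ASCII pattern for each character in text."""
    patterns = []
    for character in text:
        code = ord(character)            # the character's number, e.g. 'A' -> 65
        if code > 127:
            raise ValueError(f"{character!r} is not a 7-bit ASCII character")
        patterns.append(format(code, "07b"))
    return patterns

print(ord("A"))                  # 65
print(chr(66))                   # B
print(text_to_ascii_bits("Hi"))  # ['1001000', '1101001']
```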

Images: Bitmaps

Digital images are made of tiny dots called pixels (picture elements).
- Resolution: The number of pixels in the image (e.g., width x height).
- Colour Depth: The number of bits used for each pixel. The more bits per pixel, the more colours you can have!
Formula: \( \text{Total Colours} = 2^{\text{colour depth}} \)
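The colour formula is easy to try in Python. The sketch below also estimates the size of the pixel data as width x height x colour depth. That size calculation is an assumption added here, not a formula from these notes, and it ignores any extra information a real image file stores (such as a header). Both function names are made up for the example.

```python
def total_colours(colour_depth: int) -> int:
    """Number of different colours available with this many bits per pixel."""
    return 2 ** colour_depth

def bitmap_size_bits(width: int, height: int, colour_depth: int) -> int:
    """Estimated size of the pixel data only (assumption: no metadata)."""
    return width * height * colour_depth

print(total_colours(1))    # 2
print(total_colours(8))    # 256
print(total_colours(24))   # 16777216

size = bitmap_size_bits(100, 100, 8)
print(size, "bits =", size // 8, "bytes")  # 80000 bits = 10000 bytes
```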

Sound: Analogue to Digital

Sound is naturally an analogue wave. To store it, the computer takes "snapshots" of the wave at regular intervals. This is called sampling.
- Sample Rate: How many samples are taken per second (measured in Hertz, Hz).
- Bit Depth: The number of bits used to record the amplitude (height) of the wave at each sample.
- Sample Interval: The time gap between each sample.
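To see how these three ideas fit together, here is a rough Python sketch that "samples" a simple sine wave. It rests on several assumptions: the wave is a pure tone with amplitude between -1 and 1, the recording is mono, and the size estimate (sample rate x bit depth x duration) ignores any file header. All function names are invented for this example.

```python
import math

def sample_interval_seconds(sample_rate_hz: int) -> float:
    """Time gap between two samples: one second divided by the sample rate."""
    return 1 / sample_rate_hz

def sample_wave(wave_hz: float, sample_rate_hz: int,
                bit_depth: int, duration_s: float) -> list[int]:
    """Take snapshots of a sine wave and store each one as a whole number."""
    levels = 2 ** bit_depth                       # how many amplitude values exist
    sample_count = int(sample_rate_hz * duration_s)
    samples = []
    for i in range(sample_count):
        t = i * sample_interval_seconds(sample_rate_hz)
        amplitude = math.sin(2 * math.pi * wave_hz * t)   # between -1 and 1
        # Squash the amplitude into the range 0 .. levels - 1 and round it
        samples.append(round((amplitude + 1) / 2 * (levels - 1)))
    return samples

def sound_size_bits(sample_rate_hz: int, bit_depth: int, duration_s: int) -> int:
    """Estimated size of the sample data only (assumption: mono, no header)."""
    return sample_rate_hz * bit_depth * duration_s

print(sample_wave(1, 8, 3, 1))           # 8 samples, each between 0 and 7
print(sound_size_bits(44100, 16, 10))    # 7056000 bits for 10 seconds
```

Raising the sample rate gives more samples, and raising the bit depth gives more possible values per sample. Both increase the size.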

Analogy: An image's resolution is like the number of megapixels a camera has, and a sound's sample rate is like the number of frames per second in a video. More is usually better quality, but creates a bigger file!

Takeaway: Converting real-world data (images/sound) into binary always involves a trade-off between quality and file size.

2.3 Data Storage and Compression

Measuring Data

You need to know your units! In Computer Science, we use binary multiples (based on 1024):
- Bit: 1 or 0
- Nibble: 4 bits
- Byte: 8 bits
- Kibibyte (KiB): 1024 bytes
- Mebibyte (MiB): 1024 KiB
- Gibibyte (GiB): 1024 MiB
- Tebibyte (TiB): 1024 GiB

Memory Aid: Killy Might Get Tired (Kibi, Mebi, Gibi, Tebi).
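Since each unit is 1024 times the one before, converting between them is just repeated multiplying or dividing by 1024. A small Python sketch (with hypothetical helpers `to_bytes` and `describe_size`) shows the pattern:

```python
UNITS = ["bytes", "KiB", "MiB", "GiB", "TiB"]

def to_bytes(amount: int, unit: str) -> int:
    """Convert an amount in the given unit into bytes."""
    # "bytes" is position 0, "KiB" is position 1, and so on
    return amount * 1024 ** UNITS.index(unit)

def describe_size(num_bytes: int) -> str:
    """Pick the largest unit that keeps the number at 1 or above."""
    size = float(num_bytes)
    for unit in UNITS:
        if size < 1024 or unit == UNITS[-1]:
            return f"{size:g} {unit}"
        size /= 1024

print(to_bytes(1, "KiB"))         # 1024
print(to_bytes(2, "MiB"))         # 2097152
print(describe_size(1536))        # 1.5 KiB
print(describe_size(1024 ** 3))   # 1 GiB
```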

Data Compression

Compression makes files smaller so they take up less storage space and are faster to send over the internet.

1. Lossy Compression

This removes data permanently to reduce file size. You lose some quality, but the detail that is removed is usually chosen so that humans can hardly tell the difference. JPEG images and MP3 songs are common examples.
Warning: You cannot get the original data back once it's gone!

2. Lossless Compression

This shrinks the file without losing any information. It looks for patterns in the data to store it more efficiently.
Example: If a file has 100 "A"s in a row, instead of writing "AAAA...", it writes "100xA". This is great for text files or program code where every character matters.
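The "count the repeats" idea in the example is known as run-length encoding. Here is a minimal Python sketch of it, assuming the data is a plain string. The names `rle_encode` and `rle_decode` are made up for this example, and real file formats store the counts differently.

```python
from itertools import groupby

def rle_encode(text: str) -> list[tuple[int, str]]:
    """Replace each run of repeated characters with a (count, character) pair."""
    # groupby collects neighbouring characters that are the same
    return [(len(list(run)), character) for character, run in groupby(text)]

def rle_decode(pairs: list[tuple[int, str]]) -> str:
    """Rebuild the original text exactly from the (count, character) pairs."""
    return "".join(character * count for count, character in pairs)

encoded = rle_encode("A" * 100)
print(encoded)                              # [(100, 'A')]
print(rle_encode("AAABBC"))                 # [(3, 'A'), (2, 'B'), (1, 'C')]
print(rle_decode(encoded) == "A" * 100)     # True
```

Decoding gives back exactly what went in, which is what makes this lossless. It only saves space when the data has long runs of repeats.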

Takeaway: Use Lossy for media (photos/music) to save lots of space. Use Lossless for documents and code where you can't afford to lose a single bit!

Final Quick Review

Binary: Base-2 (0, 1).
Hexadecimal: Base-16 (0-F).
Character Set: A list of characters and their binary codes (e.g. ASCII).
Bitmap: Image made of pixels.
Sampling: Measuring an analogue sound wave at regular intervals to convert it to digital.
Compression: Reducing file size (Lossy = data lost, Lossless = no data lost).

Don't worry if this seems tricky at first! Practice converting small numbers between binary and denary, and the rest will start to click. You've got this!