Welcome to Data Storage!

Ever wondered how a computer can store a high-definition movie, a catchy song, or a 500-page essay using nothing but electronic switches? In this chapter, we are going to pull back the curtain and see how computers turn everything into binary (1s and 0s). Don't worry if it sounds a bit "Matrix-y" at first—we'll take it step-by-step!

1. Units of Data Storage

Computers are made of billions of tiny switches that can either be ON (1) or OFF (0). Because of this, everything must be stored in binary format to be processed.

The Hierarchy of Data

Just like we use grams and kilograms for weight, computers use specific units for data:

  • Bit: A single 0 or 1. The smallest unit.
  • Nibble: 4 bits (half a byte).
  • Byte: 8 bits. (The "building block" of data).
  • Kilobyte (KB): 1,000 bytes.
  • Megabyte (MB): 1,000 KB.
  • Gigabyte (GB): 1,000 MB.
  • Terabyte (TB): 1,000 GB.
  • Petabyte (PB): 1,000 TB.

Quick Memory Aid: To remember the order, try this: Big Nice Boys Keep Many Great Toys Packed. (Bit, Nibble, Byte, KB, MB, GB, TB, PB).

Note: For your exam, OCR uses 1,000 as the multiplier, but using 1,024 is also acceptable!
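The unit ladder above can be sketched in a few lines of Python. This is only an illustration: the function name `to_readable` and the `UNITS` list are made up for this example, and the default multiplier of 1,000 follows the note above (pass 1,024 to use the alternative).

```python
# Units from the byte upward, in the order used in these notes.
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB"]

def to_readable(num_bytes, multiplier=1000):
    """Express a byte count in the largest unit that keeps the value at 1 or more."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop when the value fits in this unit, or when we run out of units.
        if value < multiplier or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= multiplier

print(to_readable(2_500_000))    # 2.5 MB
print(to_readable(4096, 1024))   # 4 KB
```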

Calculating File Sizes

You might be asked to calculate how much space a file takes up. Here are the "golden rules" (each one gives an answer in bits; divide by 8 to get bytes):

  • Text Files: \( \text{bits per character} \times \text{number of characters} \)
  • Images: \( \text{colour depth} \times \text{image height (px)} \times \text{image width (px)} \)
  • Sound: \( \text{sample rate (Hz)} \times \text{duration (s)} \times \text{bit depth} \)
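The three formulas above can be tried out with a short Python sketch. The function names and the example figures (500 characters, a 100 × 200 image, 10 seconds of sound) are assumptions chosen for illustration, and the sound formula is used as written, so it describes a single channel.

```python
def text_size_bits(bits_per_char, num_chars):
    return bits_per_char * num_chars

def image_size_bits(colour_depth, height_px, width_px):
    return colour_depth * height_px * width_px

def sound_size_bits(sample_rate_hz, duration_s, bit_depth):
    return sample_rate_hz * duration_s * bit_depth

def bits_to_bytes(bits):
    # 8 bits in a byte
    return bits / 8

print(bits_to_bytes(text_size_bits(8, 500)))          # 500.0 bytes
print(bits_to_bytes(image_size_bits(8, 100, 200)))    # 20000.0 bytes
print(bits_to_bytes(sound_size_bits(44100, 10, 16)))  # 882000.0 bytes
```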

Key Takeaway: Computers use binary because they are made of transistors (switches). There are 4 bits in a nibble and 8 bits in a byte; from bytes up to petabytes, we multiply by 1,000 at each step.


2. Storing Numbers

Computers store our regular numbers (Denary) by converting them into Binary (Base 2).

Binary to Denary (and vice versa)

To convert an 8-bit binary number, use a table like this:

128 | 64 | 32 | 16 | 8 | 4 | 2 | 1

If there is a '1' in the column, add that number. If there is a '0', ignore it.

Example: 00001011 is \( 8 + 2 + 1 = 11 \).
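The table method can be written as a small Python sketch. The function names are illustrative, and the code assumes 8-bit values (0 to 255), matching the table above.

```python
PLACE_VALUES = [128, 64, 32, 16, 8, 4, 2, 1]

def binary_to_denary(bits):
    """Add up the place value of every column that contains a '1'."""
    return sum(value for value, bit in zip(PLACE_VALUES, bits) if bit == "1")

def denary_to_binary(number):
    """Work left to right: write 1 if the place value fits, then subtract it."""
    bits = ""
    for value in PLACE_VALUES:
        if number >= value:
            bits += "1"
            number -= value
        else:
            bits += "0"
    return bits

print(binary_to_denary("00001011"))  # 11
print(denary_to_binary(11))          # 00001011
```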

Hexadecimal

Hexadecimal (Base 16) is a shorthand for binary. It's much easier for humans to read! It uses 0–9 and then A–F (where A=10, B=11, C=12, D=13, E=14, F=15).

Quick Review: One Hex digit represents exactly 4 bits (a nibble). So, a 2-digit Hex number represents one full Byte.
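Here is a rough Python sketch of the nibble idea: split a byte into two groups of 4 bits and look each one up. The function names and the example byte are assumptions made for illustration.

```python
HEX_DIGITS = "0123456789ABCDEF"

def byte_to_hex(bits):
    """Split 8 bits into two nibbles and convert each to one hex digit."""
    high, low = bits[:4], bits[4:]
    return HEX_DIGITS[int(high, 2)] + HEX_DIGITS[int(low, 2)]

def hex_to_denary(hex_pair):
    """The first digit is worth 16s, the second digit is worth 1s."""
    return HEX_DIGITS.index(hex_pair[0]) * 16 + HEX_DIGITS.index(hex_pair[1])

print(byte_to_hex("10101111"))  # AF
print(hex_to_denary("AF"))      # 175
```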

Binary Shifts

Shifting bits Left or Right is a fast way to multiply or divide:

  • Left Shift: Multiplies the number. Each place shifted left doubles the value (1 place = ×2, 2 places = ×4, 3 places = ×8).
  • Right Shift: Divides the number. Each place shifted right halves the value (1 place = ÷2, 2 places = ÷4, 3 places = ÷8).

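Common Mistake: In a right shift, any bits that "fall off" the end are lost, so the answer is rounded down (11 shifted right 1 place gives 5, not 5.5). In a left shift, a 1 that falls off the left end is also lost, which gives the wrong answer!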
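A shift can be pictured as sliding the bits along and filling the gap with 0s. The sketch below works on 8-bit strings; the function names are illustrative only.

```python
def shift_left(bits, places):
    """Drop bits from the left end and pad with 0s on the right."""
    return bits[places:] + "0" * places

def shift_right(bits, places):
    """Pad with 0s on the left and drop bits from the right end."""
    return "0" * places + bits[:len(bits) - places]

start = "00001011"                      # 11 in denary
doubled = shift_left(start, 1)
halved = shift_right(start, 1)

print(doubled, int(doubled, 2))         # 00010110 22
print(halved, int(halved, 2))           # 00000101 5
```

Notice that halving 11 gives 5 rather than 5.5: the 1 that fell off the right-hand end is gone.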

Binary Addition and Overflow

When adding binary, remember: \( 0+0=0 \), \( 0+1=1 \), \( 1+1=10 \) (write 0, carry 1), and \( 1+1+1=11 \) (write 1, carry 1).

Overflow Error: This happens when the result of an addition is too big to fit into the bits available (for 8 bits, any answer above 255). The computer simply doesn't have a place to put the "extra" bit!
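The column-by-column rules can be sketched in Python like this. The function name `add_8bit` is made up for this example, and it assumes both inputs are 8-bit strings.

```python
def add_8bit(a, b):
    """Add two 8-bit binary strings column by column, right to left."""
    result = ""
    carry = 0
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result = str(total % 2) + result   # the digit to write down
        carry = total // 2                 # the digit to carry
    overflow = carry == 1                  # a carry left over has nowhere to go
    return result, overflow

print(add_8bit("00001011", "00000110"))  # ('00010001', False)  11 + 6 = 17
print(add_8bit("11111111", "00000001"))  # ('00000000', True)   255 + 1 overflows
```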

Key Takeaway: Binary is for computers; Hex is for humans to read binary easily. Binary shifts are "math shortcuts."


3. Storing Characters

To store text, the computer uses a Character Set. This is a look-up table that links a binary number to a specific character.

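  • ASCII: Uses 7 bits per character, giving 128 characters (enough for English letters, digits and common symbols). Each character is usually stored in one 8-bit byte, and Extended ASCII uses all 8 bits to represent 256 characters.
  • Unicode: Uses more bits (up to 32 per character). It can represent well over 100,000 characters, covering most of the world's writing systems and even Emojis!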

Did you know? In character sets, codes are ordered. If 'A' is code 65, then 'B' will be code 66.
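Python's built-in `ord()` and `chr()` functions let you look up character codes yourself. The helper `to_binary_codes` below is a made-up name for illustration; it shows each code as an 8-bit binary number.

```python
print(ord("A"))            # 65
print(ord("B"))            # 66
print(chr(ord("A") + 2))   # C  (two codes after 'A')

def to_binary_codes(text):
    """Return the 8-bit binary code for each character in the text."""
    return [format(ord(ch), "08b") for ch in text]

print(to_binary_codes("AB"))  # ['01000001', '01000010']
```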

Key Takeaway: More bits per character = more unique characters we can represent, but the file size will be larger.


4. Storing Images

Digital images are made of tiny dots called Pixels (Picture Elements). Each pixel is assigned a binary code to represent its colour.

Important Factors:

  • Resolution: The number of pixels in the image (Width x Height). Higher resolution = sharper image but bigger file.
  • Colour Depth: The number of bits used for each pixel. The more bits you use, the more colours you can have (e.g., 2 bits = 4 colours, 8 bits = 256 colours).
  • Metadata: "Data about data." This is extra info stored inside the file, like the height, width, and date the photo was taken.
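To see how colour depth and pixels fit together, here is a small Python sketch. The 4 × 4 image, the function names, and the choice of `.` and `#` as the two "colours" are all assumptions for illustration.

```python
def number_of_colours(colour_depth):
    """Each extra bit doubles the number of colours available."""
    return 2 ** colour_depth

print(number_of_colours(2))   # 4
print(number_of_colours(8))   # 256

# A hypothetical 4 x 4 image with a colour depth of 1 bit per pixel.
IMAGE_ROWS = ["0110",
              "1001",
              "1001",
              "0110"]

def render(rows, symbols=".#"):
    """Swap each pixel's binary code for the symbol that stands for its colour."""
    return "\n".join("".join(symbols[int(bit)] for bit in row) for row in rows)

print(render(IMAGE_ROWS))
```

The whole picture takes 1 × 4 × 4 = 16 bits, which matches the image formula from Section 1.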

Key Takeaway: Quality vs. File Size is a trade-off. Increasing resolution or colour depth makes the image look better but uses more storage.


5. Storing Sound

Sound is naturally analogue (a continuous wave). Computers must convert this into digital signals through Sampling.

How Sampling Works:

  1. The amplitude (height) of the sound wave is measured at regular intervals.
  2. Each measurement is turned into a binary number.
  3. These numbers are stored in order to recreate the wave.

Sound Quality:

  • Sample Rate: How many measurements are taken each second (measured in Hertz/Hz). More samples per second = a closer match to the original wave.
  • Bit Depth: How many bits are available for each sample. More bits = more accurate "snapshots" of the wave's height.
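The sampling steps can be imitated with a rough Python sketch. It is only a model: a sine wave stands in for a real sound, and the function name and parameter values are assumptions chosen to keep the output short.

```python
import math

def sample_wave(frequency_hz, sample_rate_hz, duration_s, bit_depth):
    """Measure a sine wave at regular intervals and store each reading in binary."""
    levels = 2 ** bit_depth                      # number of values a sample can take
    num_samples = int(sample_rate_hz * duration_s)
    samples = []
    for n in range(num_samples):
        t = n / sample_rate_hz                   # time of this measurement
        amplitude = math.sin(2 * math.pi * frequency_hz * t)  # between -1 and 1
        level = round((amplitude + 1) / 2 * (levels - 1))     # nearest whole level
        samples.append(format(level, f"0{bit_depth}b"))
    return samples

# 8 samples per second for 1 second, 3 bits per sample.
samples = sample_wave(frequency_hz=1, sample_rate_hz=8, duration_s=1, bit_depth=3)
print(samples)
print(len(samples) * 3, "bits in total")   # 24 bits in total
```

With only 3 bits, every reading is forced onto one of 8 levels, so the peak of the wave is stored as 111 and the lowest point as 000. Raising the bit depth or sample rate gives a closer copy of the wave, but more bits to store.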

Key Takeaway: Higher sample rate and bit depth = better playback quality but a much larger file size.


6. Compression

Why do we need compression? To make files smaller so they take up less storage and can be sent over the internet faster!

1. Lossy Compression

This permanently removes some data from the file. It looks for things the human eye or ear can't easily notice.

  • Pros: Massive reduction in file size.
  • Cons: Quality is reduced; data is gone forever.
  • Example: MP3, JPEG.
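To get a feel for "gone forever", here is a deliberately simple Python sketch. It is not how MP3 or JPEG actually work; it just throws away the least important bits of each value (an assumed technique for illustration), and the function names and sample values are made up.

```python
def reduce_bit_depth(samples, old_bits, new_bits):
    """Keep only the most significant bits of each sample (a simple lossy step)."""
    drop = old_bits - new_bits
    return [s >> drop for s in samples]

def restore(samples, old_bits, new_bits):
    """Scale the values back up; the dropped bits cannot be recovered."""
    drop = old_bits - new_bits
    return [s << drop for s in samples]

original = [200, 201, 203, 90]                # 8 bits each
smaller = reduce_bit_depth(original, 8, 4)    # 4 bits each: half the size
restored = restore(smaller, 8, 4)

print(smaller)    # [12, 12, 12, 5]
print(restored)   # [192, 192, 192, 80]
```

The restored values are close to the originals but not identical, and there is no way to get the exact originals back.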

2. Lossless Compression

This makes the file smaller without losing any information. It usually works by finding patterns in the data.

  • Pros: No loss in quality; the original file can be perfectly restored.
  • Cons: File size isn't reduced as much as with lossy.
  • Example: PNG, ZIP files.
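One simple way of "finding patterns" is run-length encoding (RLE), which replaces a run of repeated values with a count and the value. The notes above do not name a specific method, so treat this Python sketch as one illustrative example; the function names are made up.

```python
def rle_encode(data):
    """Replace each run of repeated characters with a (count, character) pair."""
    runs = []
    for ch in data:
        if runs and runs[-1][1] == ch:
            runs[-1] = (runs[-1][0] + 1, ch)   # extend the current run
        else:
            runs.append((1, ch))               # start a new run
    return runs

def rle_decode(runs):
    """Rebuild the original data exactly from the runs."""
    return "".join(ch * count for count, ch in runs)

encoded = rle_encode("WWWWBBBW")
print(encoded)                 # [(4, 'W'), (3, 'B'), (1, 'W')]
print(rle_decode(encoded))     # WWWWBBBW
```

Decoding gives back exactly what went in, which is what makes the method lossless.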

Quick Review Box: Use Lossy for streaming music or photos where "good enough" is okay. Use Lossless for text documents or software where every single bit is essential!

Key Takeaway: Lossy = Smallest size but lower quality. Lossless = Perfect quality but larger size.