🖼️Images as Data

Key Image Compression Algorithms

Why This Matters

When you're working with digital images, you're constantly making tradeoffs between file size and image quality—and the AP exam expects you to understand why different compression methods exist and when to use each one. These algorithms aren't just technical trivia; they represent fundamental concepts in computer science: lossy vs. lossless compression, data redundancy, and how mathematical transformations can reduce information while preserving what matters most.

You're being tested on your ability to analyze how images are stored and transmitted as data, which connects directly to broader themes of data representation, abstraction, and algorithmic efficiency. Don't just memorize which format does what—know the underlying principle each algorithm demonstrates and be ready to explain why JPEG works well for photos but terribly for logos, or how run-length encoding exploits patterns in data.


Lossy Compression Formats

These formats achieve smaller file sizes by permanently discarding some image data. The key principle: human perception doesn't notice certain details, so we can remove them strategically.

JPEG (Joint Photographic Experts Group)

  • Lossy compression—reduces file size by discarding visual information humans are less likely to notice, particularly in high-frequency color variations
  • 24-bit color depth supports over 16 million colors, making it ideal for photographs and complex gradients
  • No transparency support—cannot store alpha channels, so it's unsuitable for logos or images requiring see-through backgrounds
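The "16 million colors" figure is just arithmetic on the bit depth, and the same arithmetic shows why compression matters in the first place. The sketch below uses hypothetical helper names (color_count, raw_size_bytes) and assumes an uncompressed image with no header or padding.

```python
def color_count(bits_per_pixel: int) -> int:
    """Number of distinct colors a pixel can take at a given bit depth."""
    return 2 ** bits_per_pixel


def raw_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of the raw pixel data, ignoring headers and row padding."""
    return width * height * bits_per_pixel // 8


print(color_count(24))                  # 16777216 (24-bit color, as in JPEG)
print(color_count(8))                   # 256 (an 8-bit palette)
print(raw_size_bytes(1920, 1080, 24))   # 6220800 bytes before any compression
```

A single uncompressed 1920 x 1080 photo is about 6.2 MB of pixel data, which is the baseline that every format in this guide is trying to shrink.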

HEIF (High Efficiency Image File Format)

  • Advanced lossy compression that can produce files up to roughly 50% smaller than JPEG at similar quality, with the exact savings depending on the image and encoder settings
  • HDR and transparency support make it more versatile than JPEG for professional and mobile photography
  • HEVC codec compatibility—uses the same compression technology as modern video, enabling efficient image sequences and animations

WebP

  • Dual-mode compression—supports both lossy and lossless options in a single format, giving developers flexibility
  • Typically smaller files than JPEG (in lossy mode) and PNG (in lossless mode) at comparable quality, reducing bandwidth and improving web performance
  • Transparency and animation support—combines the best features of JPEG, PNG, and GIF into one format

Compare: JPEG vs. WebP—both handle photographic images well, but WebP achieves 25-35% smaller files at equivalent quality and supports transparency. If an FRQ asks about optimizing web performance, WebP is your modern answer.


Lossless Compression Formats

These formats reduce file size without losing any of the stored pixel data. The tradeoff: files are usually larger than their lossy counterparts, but the image can be reconstructed exactly.

PNG (Portable Network Graphics)

  • Lossless compression—every pixel is preserved exactly, making it ideal for images that will be edited multiple times
  • Transparency support through alpha channels allows for smooth edges and see-through regions in graphics
  • Best for sharp edges—text, logos, and icons stay crisp because no data is approximated or discarded

GIF (Graphics Interchange Format)

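  • 256-color palette limit (8-bit) makes it unsuitable for photographs but efficient for simple graphics; the compression itself is lossless, but reducing a full-color image to 256 colors discards color information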
  • Animation support—the only legacy format that handles frame-based animation natively in browsers
  • Binary transparency only—pixels are either fully transparent or fully opaque, no partial transparency

JPEG 2000

  • Dual-mode flexibility—supports both lossy and lossless compression with superior quality at low bit rates
  • Wavelet-based compression provides better detail preservation than standard JPEG, especially at high compression ratios
  • Professional use cases—common in medical imaging, archives, and digital cinema where quality is critical

Compare: PNG vs. GIF—both are lossless, but PNG supports millions of colors and smooth transparency while GIF is limited to 256 colors with binary transparency. Use GIF only when you need animation; otherwise, PNG wins.
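The difference between binary and smooth transparency can be made concrete with the standard alpha-blending formula, result = (foreground x alpha + background x (255 - alpha)) / 255. This is a minimal sketch, not how any particular decoder is written; the function names and the threshold of 128 used to imitate GIF-style on/off transparency are assumptions for illustration.

```python
def blend_channel(fg: int, bg: int, alpha: int) -> int:
    """Blend one 8-bit channel; alpha runs from 0 (transparent) to 255 (opaque)."""
    return round((fg * alpha + bg * (255 - alpha)) / 255)


def blend_pixel(fg: tuple, bg: tuple, alpha: int) -> tuple:
    """Blend an (R, G, B) foreground pixel over a background pixel."""
    return tuple(blend_channel(f, b, alpha) for f, b in zip(fg, bg))


def binary_alpha(alpha: int, threshold: int = 128) -> int:
    """Imitate on/off transparency: every pixel is fully opaque or fully clear."""
    return 255 if alpha >= threshold else 0


red, white = (255, 0, 0), (255, 255, 255)

# Smooth (PNG-style) transparency: a half-transparent red pixel over white.
print(blend_pixel(red, white, 128))                # (255, 127, 127)

# Binary (GIF-style) transparency: the same pixel snaps to fully opaque red.
print(blend_pixel(red, white, binary_alpha(128)))  # (255, 0, 0)
```

The in-between pink value is what lets PNG edges fade smoothly into a background; with only on/off transparency, edge pixels jump straight from the foreground color to the background.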


Underlying Compression Techniques

These are the mathematical and algorithmic methods that power the formats above. Understanding these helps you explain how compression actually works.

Discrete Cosine Transform (DCT)

  • Frequency-domain transformation—converts spatial pixel data into frequency components, separating important visual information from fine details
  • Foundation of JPEG compression—enables controlled quality loss by quantizing (rounding) less important frequency values
  • Focuses on perceptual relevance because human eyes are more sensitive to low-frequency changes, so JPEG's quantization step preserves low-frequency coefficients more precisely than high-frequency ones
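To see the transform-then-quantize idea in action, here is a minimal one-dimensional sketch. It assumes a single row of 8 grayscale pixels and a made-up list of quantization step sizes; real JPEG works on 8x8 blocks with a two-dimensional DCT and standardized or encoder-chosen tables, so treat this as an illustration only.

```python
import math


def dct(samples):
    """Orthonormal 1-D DCT-II: pixel values -> frequency coefficients."""
    size = len(samples)
    coeffs = []
    for k in range(size):
        scale = math.sqrt((1 if k == 0 else 2) / size)
        total = sum(
            value * math.cos(math.pi * (2 * n + 1) * k / (2 * size))
            for n, value in enumerate(samples)
        )
        coeffs.append(scale * total)
    return coeffs


def idct(coeffs):
    """Inverse of dct(): frequency coefficients -> pixel values."""
    size = len(coeffs)
    samples = []
    for n in range(size):
        total = 0.0
        for k, coeff in enumerate(coeffs):
            scale = math.sqrt((1 if k == 0 else 2) / size)
            total += scale * coeff * math.cos(math.pi * (2 * n + 1) * k / (2 * size))
        samples.append(total)
    return samples


def quantize(coeffs, steps):
    """The lossy step: divide by a step size and round to an integer."""
    return [round(c / s) for c, s in zip(coeffs, steps)]


def dequantize(levels, steps):
    return [level * s for level, s in zip(levels, steps)]


row = [52, 55, 61, 66, 70, 61, 64, 73]     # sample pixel values
steps = [10, 12, 14, 16, 24, 32, 40, 48]   # hypothetical: coarser at high frequency

levels = quantize(dct(row), steps)
restored = [round(v) for v in idct(dequantize(levels, steps))]
print("quantized coefficients:", levels)
print("restored pixels:       ", restored)
```

Running idct(dct(row)) with no quantization returns the original pixels (up to floating-point error), which shows that the transform alone loses nothing. The loss comes entirely from rounding in quantize, and larger step sizes push more high-frequency coefficients to zero.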

Run-Length Encoding (RLE)

  • Pattern-based compression—replaces sequences like "AAAAA" with "5A," dramatically shrinking repetitive data
  • Lossless and simple—preserves all original data using a straightforward algorithm
  • Limited effectiveness—works well for simple graphics with solid colors but provides minimal compression for complex photographs
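The "AAAAA" to "5A" idea takes only a few lines to write out. This is a minimal sketch that stores each run as a (count, symbol) pair; the function names and the sample rows are made up for illustration, and real formats pack runs into bytes rather than Python tuples.

```python
from itertools import groupby


def rle_encode(data: str) -> list[tuple[int, str]]:
    """Collapse each run of identical symbols into a (count, symbol) pair."""
    return [(len(list(run)), symbol) for symbol, run in groupby(data)]


def rle_decode(pairs: list[tuple[int, str]]) -> str:
    """Expand (count, symbol) pairs back into the original string."""
    return "".join(symbol * count for count, symbol in pairs)


graphic_row = "WWWWWWWWBBBWWWWW"   # solid regions, like a simple logo
noisy_row = "WBWBBWBWWBWBWBBW"     # rapidly changing values, like a photo

encoded = rle_encode(graphic_row)
print(encoded)                      # [(8, 'W'), (3, 'B'), (5, 'W')]
print(rle_decode(encoded) == graphic_row)   # True: nothing was lost

# The noisy row needs 13 pairs for 16 pixels, so RLE barely helps.
print(len(rle_encode(noisy_row)))   # 13
```

The two rows have the same length, but the first compresses to 3 runs and the second to 13, which is the "limited effectiveness" point in practice.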

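Compare: DCT vs. RLE. The DCT re-expresses pixel data as frequency coefficients so that a later quantization step can discard the least noticeable detail (the lossy part), while RLE simply shortens repeated runs without losing anything (lossless). They solve different problems and are often used together in compression pipelines; JPEG, for example, run-length encodes the zeros left behind by quantization.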

Huffman Coding

  • Variable-length encoding—assigns shorter binary codes to frequently occurring values and longer codes to rare ones
  • Lossless optimization—reduces file size without any data loss by exploiting statistical patterns
  • Used alongside other methods, as when JPEG applies Huffman coding after the DCT and quantization steps to further compress the transformed data
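Here is a minimal sketch of how shorter codes end up on the most frequent values. It builds the code table by repeatedly merging the two least frequent groups, which is the classic Huffman procedure; the function name and the sample message are made up, and real encoders such as JPEG's store their tables in a more compact form.

```python
import heapq
from collections import Counter


def huffman_codes(data: str) -> dict[str, str]:
    """Return a {symbol: bit string} table built from symbol frequencies."""
    freq = Counter(data)
    # Each heap entry is (weight, unique tiebreaker, {symbol: code so far}).
    heap = [(weight, i, {symbol: ""}) for i, (symbol, weight) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                       # only one distinct symbol
        return {symbol: "0" for symbol in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, group1 = heapq.heappop(heap)  # two least frequent groups
        w2, _, group2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in group1.items()}
        merged.update({s: "1" + code for s, code in group2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


message = "AAAAAAAABBBCCD"                   # A is common, D is rare
codes = huffman_codes(message)
encoded = "".join(codes[symbol] for symbol in message)

for symbol in sorted(codes):
    print(symbol, codes[symbol])
print(len(encoded), "bits, versus", 2 * len(message), "bits at a fixed 2 bits per symbol")
```

For this message the most common symbol, A, gets a 1-bit code while the rare C and D get 3-bit codes, and the total drops from 28 bits to 23 with no information lost.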

Lempel-Ziv-Welch (LZW) Compression

  • Dictionary-based approach—builds a table of repeated patterns during compression, replacing them with shorter references
  • Used by GIF and, as an optional compression scheme, by TIFF, where it is particularly effective for images with repetitive structures like logos and diagrams
  • Lossless but variable efficiency—works best on images with redundant data, less effective on high-detail photographs
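The dictionary-building idea can be sketched as follows. This is a simplified version that assumes single-byte characters and outputs a plain list of integer codes; real GIF encoders add details such as variable-width codes and dictionary resets, which are left out here.

```python
def lzw_compress(data: str) -> list[int]:
    """Replace repeated sequences with codes from a table built on the fly."""
    table = {chr(i): i for i in range(256)}   # start with every single character
    next_code = 256
    current = ""
    output = []
    for ch in data:
        candidate = current + ch
        if candidate in table:
            current = candidate               # keep extending the known sequence
        else:
            output.append(table[current])     # emit the longest known sequence
            table[candidate] = next_code      # remember the new, longer one
            next_code += 1
            current = ch
    if current:
        output.append(table[current])
    return output


def lzw_decompress(codes: list[int]) -> str:
    """Rebuild the same table while decoding, so no table needs to be stored."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    result = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:               # code defined by this very step
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        result.append(entry)
        table[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(result)


pattern = "ABABABABABAB"
codes = lzw_compress(pattern)
print(codes)                                  # [65, 66, 256, 258, 257, 260]
print(lzw_decompress(codes) == pattern)       # True
```

Twelve characters shrink to six codes because entries like 256 ("AB") and 258 ("ABA") stand in for longer and longer repeats. On data with few repeated sequences, the table rarely gets reused and the savings disappear.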

Compare: Huffman Coding vs. LZW—both are lossless, but Huffman assigns codes based on individual value frequency while LZW identifies and replaces repeated sequences. Think of Huffman as letter-level optimization and LZW as word-level optimization.


Quick Reference Table

Concept | Best Examples
--- | ---
Lossy compression for photos | JPEG, HEIF, WebP (lossy mode)
Lossless compression for graphics | PNG, GIF, WebP (lossless mode)
Transparency support | PNG, GIF, WebP, HEIF
Animation support | GIF, WebP, HEIF
Frequency-domain transformation | DCT (used in JPEG)
Pattern-based lossless encoding | RLE, LZW
Statistical encoding | Huffman Coding
Modern/next-gen formats | WebP, HEIF, JPEG 2000

Self-Check Questions

  1. Which two compression techniques are both lossless but use fundamentally different approaches—one based on repeated sequences and one based on value frequency?

  2. A web developer needs to display a company logo with a transparent background. Why would JPEG be a poor choice, and which format would you recommend instead?

  3. Compare and contrast JPEG and PNG: What type of compression does each use, and what types of images is each best suited for?

  4. If an FRQ asks you to explain how JPEG achieves smaller file sizes than the original image, which underlying technique (DCT, RLE, or Huffman) would be most important to discuss, and why?

  5. WebP and HEIF are both considered "next-generation" formats. What advantages do they share over older formats like JPEG and PNG, and why haven't they completely replaced those legacy formats yet?