Image compression sits at the heart of nearly every computer vision application you'll encounter—from how your smartphone stores photos to how autonomous vehicles process visual data in real time. You're being tested on your understanding of the fundamental trade-off between file size and image quality, and more importantly, on why different techniques make different choices along that spectrum. The concepts here connect directly to signal processing, information theory, and human visual perception.
Don't just memorize which format uses which algorithm. Instead, focus on understanding the underlying mechanisms: transform-based compression, entropy coding, spatial redundancy exploitation, and the lossy vs. lossless paradigm. When you can explain why JPEG discards high-frequency data or how wavelet transforms outperform DCT for certain images, you're thinking like a computer vision engineer—and that's exactly what exam questions will demand.
Transform-based techniques convert image data from the spatial domain into the frequency domain, where redundant information becomes easier to identify and remove. The core insight is that most image energy concentrates in low-frequency components, while high frequencies (fine details) can often be reduced without noticeable quality loss.
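To make the energy-concentration idea concrete, here is a minimal sketch (assuming NumPy and SciPy are available; the 8×8 gradient block is made up for illustration) that applies a 2D DCT to a smooth block, checks where the energy lands, and reconstructs the block after zeroing the high-frequency coefficients:

```python
import numpy as np
from scipy.fftpack import dct, idct  # SciPy's type-II DCT, the transform used per 8x8 block in JPEG

def dct2(block):
    """Separable 2D DCT: apply the 1D transform along each axis in turn."""
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(coeffs):
    """Inverse 2D DCT."""
    return idct(idct(coeffs, axis=0, norm="ortho"), axis=1, norm="ortho")

# A smooth 8x8 block (a gentle gradient), typical of natural image content.
x, y = np.meshgrid(np.arange(8), np.arange(8))
block = (128 + 10 * x + 5 * y).astype(float)

coeffs = dct2(block)

# Fraction of total energy in the 4x4 low-frequency corner: well above 99% here.
energy_lowfreq = np.sum(coeffs[:4, :4] ** 2) / np.sum(coeffs ** 2)
print(f"energy in low-frequency corner: {energy_lowfreq:.4%}")

# Zero out the high-frequency coefficients and reconstruct: the error stays small.
compressed = coeffs.copy()
compressed[4:, :] = 0
compressed[:, 4:] = 0
reconstruction = idct2(compressed)
print("max reconstruction error:", np.abs(reconstruction - block).max())
```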
Compare: DCT vs. Wavelet Transform—both convert to the frequency domain, but DCT operates on fixed blocks (8×8 in baseline JPEG) while wavelets analyze the image at multiple scales. For images with sharp edges (like text or diagrams), wavelets in JPEG 2000 outperform DCT-based JPEG. If asked about compression artifacts, DCT's blocking artifacts vs. wavelets' ringing artifacts is a key distinction.
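For the multi-scale side of that comparison, a short sketch using the PyWavelets package (an assumption—it is not part of the standard library, and the test image is synthetic) shows how one 2D wavelet step splits an image into a coarse approximation plus three detail subbands, and how repeating the step yields the pyramid of scales that block-based DCT lacks:

```python
import numpy as np
import pywt  # PyWavelets, assumed installed (pip install PyWavelets)

# A synthetic 64x64 "image" with a sharp vertical edge, the kind of content
# where wavelet coders such as JPEG 2000 tend to beat block-based DCT.
image = np.zeros((64, 64))
image[:, 32:] = 255.0

# One level of the 2D discrete wavelet transform: approximation + 3 detail bands.
cA, (cH, cV, cD) = pywt.dwt2(image, "haar")
print("approximation:", cA.shape, "details:", cH.shape, cV.shape, cD.shape)

# Repeating the transform on the approximation gives the multi-scale view;
# pywt.wavedec2 performs that recursion for you.
coeffs = pywt.wavedec2(image, "haar", level=3)
print("number of decomposition levels:", len(coeffs) - 1)
```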
Entropy coding exploits statistical redundancy in data—the fact that some values appear more frequently than others. These techniques assign shorter codes to common patterns and longer codes to rare ones, achieving compression without any information loss.
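The sketch below (standard library only; the symbol values and their frequencies are made up for illustration) builds a Huffman code with a heap and shows that frequent symbols end up with shorter codewords:

```python
import heapq
from collections import Counter

def huffman_code(data):
    """Build a prefix code: frequent symbols get short codewords."""
    freq = Counter(data)
    # Each heap entry: (frequency, tie-breaker, {symbol: codeword-so-far}).
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate case: only one distinct symbol
        return {sym: "0" for sym in heap[0][2]}
    counter = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

# Skewed symbol statistics, as in a quantized image where 0 dominates.
data = [0] * 70 + [1] * 15 + [2] * 10 + [3] * 5
codes = huffman_code(data)
for sym in sorted(codes):
    print(sym, codes[sym])  # the most frequent symbol gets the shortest codeword

bits = sum(len(codes[s]) for s in data)
print(f"{bits} bits vs {len(data) * 2} bits with a fixed 2-bit code")
```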
Compare: Huffman Coding vs. RLE—both are lossless, but they exploit different redundancies. RLE targets spatial redundancy (repeated adjacent values), while Huffman targets statistical redundancy (frequent symbols). Many practical systems use both: RLE first to reduce repeated values, then Huffman to encode the result efficiently.
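Here is a minimal run-length encoder (a sketch, not any standard's exact format) applied to a scan line with long constant runs—the case where RLE dramatically outperforms symbol-by-symbol Huffman. In a combined pipeline, its output (runs and values) would then be handed to an entropy coder such as Huffman:

```python
def rle_encode(values):
    """Collapse runs of identical values into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]

def rle_decode(runs):
    return [v for v, n in runs for _ in range(n)]

# A scan line from a binary diagram: long white and black runs.
row = [255] * 120 + [0] * 8 + [255] * 128
runs = rle_encode(row)
print(runs)                      # [(255, 120), (0, 8), (255, 128)]
assert rle_decode(runs) == row   # lossless round trip
```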
Image formats and standards are complete file formats or codec families that combine multiple techniques—transforms, quantization, and entropy coding—into practical, standardized systems.
Compare: JPEG vs. PNG—JPEG is lossy and excels at photographs; PNG is lossless and excels at graphics with transparency. The classic exam question: which format for a logo overlay on a photo? PNG for the logo (sharp edges, transparency), JPEG for the photo background (continuous tones, smaller file).
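A quick way to see the trade-off is to encode the same image both ways and compare byte counts. The sketch below assumes Pillow (PIL) and NumPy are installed, uses a synthetic photo-like image, and the exact sizes will vary with content and quality settings:

```python
import io
import numpy as np
from PIL import Image  # Pillow, assumed installed

# A synthetic "photo-like" image: smooth gradients plus a little noise.
h, w = 256, 256
y, x = np.mgrid[0:h, 0:w]
photo = (x / 2 + y / 4 + np.random.default_rng(0).normal(0, 3, (h, w))).clip(0, 255)
img = Image.fromarray(photo.astype(np.uint8)).convert("RGB")

def encoded_size(image, fmt, **kwargs):
    buf = io.BytesIO()
    image.save(buf, format=fmt, **kwargs)
    return buf.tell()

print("PNG  (lossless):  ", encoded_size(img, "PNG"), "bytes")
print("JPEG (quality 85):", encoded_size(img, "JPEG", quality=85), "bytes")
# For continuous-tone content, JPEG is typically much smaller;
# for a flat-color logo with sharp edges, PNG usually wins instead.
```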
Advanced compression techniques go beyond standard transforms to exploit more complex patterns in image data, often achieving impressive compression ratios at the cost of computational complexity.
Compare: Vector Quantization vs. Fractal Compression—both identify patterns within images, but VQ uses a fixed codebook of representative blocks while fractal methods find self-similar transformations. VQ is practical and widely used; fractal compression remains largely academic despite its elegant mathematics.
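A sketch of vector quantization (using SciPy's classic k-means routine; the block size and codebook size are arbitrary choices for illustration): image blocks are flattened into vectors, a small codebook of representative blocks is learned, and each block is then stored as just the index of its nearest codeword.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq  # k-means training and codebook lookup

rng = np.random.default_rng(0)

# Fake 64x64 grayscale image; split it into 4x4 blocks (16-dimensional vectors).
image = rng.integers(0, 256, size=(64, 64)).astype(float)
blocks = (image.reshape(16, 4, 16, 4)
               .transpose(0, 2, 1, 3)
               .reshape(-1, 16))            # 256 blocks of 16 pixels each

# Learn a codebook of 32 representative blocks, then encode each block as the
# index of its nearest codeword (a 5-bit index instead of 16 raw pixel values).
codebook, _ = kmeans(blocks, 32)
indices, _ = vq(blocks, codebook)

reconstructed_blocks = codebook[indices]     # decoding is just a table lookup
mse = np.mean((reconstructed_blocks - blocks) ** 2)
print("codebook:", codebook.shape, "indices:", indices.shape, "MSE:", round(mse, 1))
```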
Video compression builds on image compression but adds temporal redundancy exploitation—the fact that consecutive frames are usually very similar.
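A toy frame-differencing sketch (NumPy and the standard library zlib; real codecs use motion-compensated prediction rather than plain subtraction) shows why temporal redundancy matters: the difference between consecutive frames is mostly zeros and therefore far more compressible than a second full frame.

```python
import numpy as np
import zlib

rng = np.random.default_rng(0)

# Two consecutive "frames": identical except for a small moving patch.
frame1 = rng.integers(0, 256, size=(120, 160), dtype=np.uint8)
frame2 = frame1.copy()
frame2[40:60, 50:70] = rng.integers(0, 256, size=(20, 20), dtype=np.uint8)

# Intra coding: compress each frame independently.
intra = len(zlib.compress(frame1.tobytes())) + len(zlib.compress(frame2.tobytes()))

# Inter coding: compress frame1 plus the (mostly zero) difference to frame2.
residual = frame2.astype(np.int16) - frame1.astype(np.int16)
inter = len(zlib.compress(frame1.tobytes())) + len(zlib.compress(residual.tobytes()))

print("independent frames:", intra, "bytes   frame + residual:", inter, "bytes")
```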
Understanding when to use each approach is as important as understanding how they work. The choice depends entirely on your application's requirements for quality, file size, and reconstruction fidelity.
Compare: Lossless vs. Lossy—this is the central conceptual divide in compression. Lossless (PNG, lossless JPEG 2000) preserves everything but achieves limited compression; lossy (JPEG, standard JPEG 2000) achieves dramatic compression by exploiting perceptual limitations. The exam loves asking: when would you choose each?
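The sketch below (standard library zlib plus NumPy; the crude quantization step is only a stand-in for a real lossy codec) makes the divide concrete: the lossless path restores every pixel exactly but saves less, while coarsening the data first shrinks the output at the cost of a bounded error.

```python
import numpy as np
import zlib

rng = np.random.default_rng(0)

# Smooth synthetic image with mild noise, stored as 8-bit grayscale.
y, x = np.mgrid[0:128, 0:128]
image = ((x + y) / 2 + rng.normal(0, 2, (128, 128))).clip(0, 255).astype(np.uint8)

# Lossless: deflate the raw pixels; decompression restores them bit-for-bit.
lossless = zlib.compress(image.tobytes(), level=9)
restored = np.frombuffer(zlib.decompress(lossless), dtype=np.uint8).reshape(128, 128)
assert np.array_equal(restored, image)

# "Lossy": quantize to 16 gray levels before deflating; smaller, but not exact.
quantized = (image // 16).astype(np.uint8)
lossy = zlib.compress(quantized.tobytes(), level=9)
approx = np.frombuffer(zlib.decompress(lossy), dtype=np.uint8).reshape(128, 128) * 16

print("raw:", image.nbytes, "lossless:", len(lossless), "lossy:", len(lossy), "bytes")
print("max error after lossy round trip:",
      int(np.abs(approx.astype(int) - image.astype(int)).max()))
```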
| Concept | Best Examples |
|---|---|
| Transform-based compression | DCT, Wavelet Transform |
| Entropy coding | Huffman Coding, RLE |
| Lossy image formats | JPEG, lossy JPEG 2000 |
| Lossless image formats | PNG, lossless JPEG 2000 |
| Exploits spatial redundancy | RLE, Vector Quantization |
| Exploits statistical redundancy | Huffman Coding |
| Exploits self-similarity | Fractal Compression |
| Video compression | MPEG standards (H.264/AVC, H.265/HEVC) |
Both DCT and Wavelet Transform convert images to the frequency domain. What specific advantage does wavelet-based JPEG 2000 have over DCT-based JPEG for images containing text or sharp edges?
You need to compress a medical X-ray for archival storage where diagnostic accuracy is critical. Which compression approach (lossy or lossless) would you choose? Name two specific formats that support it.
Compare and contrast how Huffman Coding and Run-Length Encoding achieve compression. What type of redundancy does each exploit, and in what scenario would RLE dramatically outperform Huffman?
A web developer asks whether to use JPEG or PNG for a company logo that needs to appear over various background colors. Which format should they choose and why? What if they were compressing a photograph instead?
Explain how MPEG video compression builds upon still-image compression techniques like JPEG. What additional type of redundancy does video compression exploit that isn't present in single images?