Data compression sits at the heart of Information Theory—it's the practical application of Shannon's foundational insights about entropy and redundancy. When you study compression methods, you're really studying how to exploit statistical patterns in data to represent information more efficiently. Every algorithm here connects back to core concepts you'll be tested on: entropy limits, redundancy reduction, source coding theorems, and the fundamental trade-off between compression ratio and computational complexity.
Don't just memorize which algorithm does what—understand why each approach works and when it's optimal. The exam will ask you to compare methods, identify which technique suits a given data type, and explain the theoretical principles that make compression possible. Know the mechanism behind each method, and you'll be ready for any FRQ that asks you to analyze or design a compression strategy.
These algorithms build directly on Shannon's entropy—they assign shorter codes to more probable symbols, approaching the theoretical compression limit. The key insight: if you know the probability distribution of your source, you can encode common symbols with fewer bits.
Compare: Huffman vs. Arithmetic Coding—both are entropy coders targeting the Shannon limit, but Huffman must assign each symbol a whole number of bits, while arithmetic coding can effectively spend fractional bits per symbol. For FRQs on compression efficiency, arithmetic coding is your example of getting closer to the entropy bound of H bits per symbol.
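A minimal sketch of the Huffman side of this comparison, assuming a Python setting and a toy input string (the helper name `huffman_codes` is made up for illustration): it merges the two least probable subtrees repeatedly, so the most frequent symbol ends up with the shortest whole-bit codeword.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code from the symbol frequencies in `text`
    and return a dict mapping each symbol to its binary codeword."""
    freq = Counter(text)
    # Heap entries: (frequency, tie-breaker, {symbol: codeword-so-far}).
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate single-symbol source
        return {sym: "0" for sym in heap[0][2]}
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)     # two least probable subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}        # left branch gets '0'
        merged.update({s: "1" + c for s, c in c2.items()})  # right branch gets '1'
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

codes = huffman_codes("abracadabra")
print(codes)  # 'a' (most frequent) gets the shortest codeword; 'c' and 'd' get the longest
```

Because every codeword length is an integer, Huffman can waste up to nearly a bit per symbol on skewed distributions (e.g. a symbol with probability 0.9 still costs 1 bit); that slack is exactly what arithmetic coding removes.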
Instead of modeling symbol probabilities, these algorithms find and replace repeated sequences with shorter references. The mechanism: build a dictionary of patterns as you scan the data, then substitute future occurrences with dictionary pointers.
Compare: Entropy-based vs. Dictionary-based methods—Huffman/arithmetic exploit symbol frequency, while LZW exploits sequence repetition. If an FRQ gives you data with long repeated substrings, dictionary methods win; for data with skewed symbol distributions, entropy coding is your answer.
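A minimal LZW sketch, assuming string input and a dictionary initialized with single characters (the function name `lzw_compress` is illustrative, not from any particular library): repeated substrings get learned once and then replaced by single dictionary codes.

```python
def lzw_compress(data: str):
    """Minimal LZW compressor: returns a list of dictionary indices.
    Illustrative sketch only; real implementations pack indices into bits."""
    dictionary = {chr(i): i for i in range(256)}  # start with single-character entries
    next_code = 256
    w = ""
    output = []
    for ch in data:
        wc = w + ch
        if wc in dictionary:
            w = wc                            # keep extending the current match
        else:
            output.append(dictionary[w])      # emit code for the longest known prefix
            dictionary[wc] = next_code        # learn the new pattern
            next_code += 1
            w = ch
    if w:
        output.append(dictionary[w])
    return output

# Repeated substrings collapse into single codes as the dictionary grows.
print(lzw_compress("TOBEORNOTTOBEORTOBEORNOT"))
```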
These techniques don't compress directly—they transform data into a representation where compression becomes more effective. The principle: convert data to a domain (frequency, wavelet) where energy concentrates in fewer coefficients.
Compare: DCT vs. Wavelet transforms—both concentrate signal energy into a few coefficients, but DCT operates on fixed-size blocks in the frequency domain while wavelets provide multi-resolution analysis. Wavelets handle edges better (fewer blocking artifacts), which is why JPEG 2000 outperforms classic JPEG at high compression ratios.
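A small numerical illustration of the energy-compaction idea behind DCT-based transform coding, using a hand-rolled orthonormal DCT-II in NumPy rather than any particular library routine (the 8-sample block is an assumed example): for smooth data, most of the energy lands in the first few coefficients, which is what makes aggressive quantization of the rest possible.

```python
import numpy as np

def dct_ii(x):
    """Orthonormal 1-D DCT-II (same transform family as JPEG's 8-point DCT)."""
    n = len(x)
    k = np.arange(n)
    # Row k of the basis: cos(pi/n * k * (i + 0.5)) evaluated at samples i
    basis = np.cos(np.pi / n * np.outer(k, np.arange(n) + 0.5))
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)              # DC term gets a different normalization
    return scale * (basis @ x)

# A smooth 8-sample block: energy piles up in the first few coefficients,
# so the later (high-frequency) coefficients can be coarsely quantized or dropped.
block = np.array([50, 55, 61, 66, 70, 73, 75, 76], dtype=float)
print(np.round(dct_ii(block), 1))
```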
These algorithms target specific types of redundancy in data—consecutive repetitions or predictable differences. They work best when data has obvious local patterns.
Compare: RLE vs. Delta Encoding—RLE exploits spatial repetition (identical consecutive values), while delta encoding exploits temporal correlation (small changes between values). For video compression questions, delta encoding between frames is your key example; for binary image compression, think RLE.
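Two tiny sketches of the local-redundancy methods above, with assumed example inputs (not from the text): RLE collapses runs of identical values into (value, count) pairs, while delta encoding replaces values with small differences that are cheaper to code afterward.

```python
def run_length_encode(values):
    """RLE: collapse runs of identical consecutive values into (value, count) pairs."""
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1][1] += 1               # extend the current run
        else:
            encoded.append([v, 1])            # start a new run
    return [tuple(pair) for pair in encoded]

def delta_encode(values):
    """Delta encoding: store the first value, then successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

# RLE shines on long runs (e.g. rows of a binary image):
print(run_length_encode([0, 0, 0, 0, 1, 1, 0, 0, 0]))  # [(0, 4), (1, 2), (0, 3)]
# Delta encoding shines on slowly varying signals (e.g. pixels across video frames):
print(delta_encode([100, 101, 103, 104, 104, 105]))     # [100, 1, 2, 1, 0, 1]
```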
| Concept | Best Examples |
|---|---|
| Entropy coding (Shannon limit) | Huffman, Arithmetic coding, PPM |
| Dictionary-based (sequence repetition) | LZW, LZ77/DEFLATE (ZIP, gzip) |
| Transform coding (frequency domain) | DCT, Wavelet compression |
| Preprocessing transforms | Burrows-Wheeler Transform |
| Local redundancy exploitation | RLE, Delta encoding |
| Lossy compression | DCT (JPEG), Wavelet (JPEG 2000) |
| Lossless compression | Huffman, LZW, Arithmetic, BWT + RLE |
| Adaptive/context modeling | Arithmetic coding, PPM |
Which two methods both approach the Shannon entropy limit but differ in how they assign code lengths? Explain why one can outperform the other for certain probability distributions.
Compare and contrast dictionary-based compression (LZW) with entropy-based compression (Huffman). Under what data conditions would you choose one over the other?
If you're compressing video with high frame-to-frame similarity, which combination of methods would you use, and why does each component help?
The Burrows-Wheeler Transform doesn't compress data itself. Explain the principle behind why it improves subsequent compression, and name an algorithm that typically follows it.
FRQ-style: You're designing a compression system for medical images where some quality loss is acceptable but edge detail must be preserved. Which transform method would you recommend over DCT, and what theoretical property makes it superior for this application?