Data compression sits at the heart of nearly everything you do on a computer or the internet. When you stream music, send photos, or download files, compression algorithms are working behind the scenes to make that data small enough to transmit efficiently and store affordably. The AP CSP exam tests your understanding of how data is represented digitally and the trade-offs involved in different approaches to managing that data—compression is a perfect case study for both concepts.
You're being tested on more than just knowing that "JPEG is lossy." The exam wants you to understand why certain techniques work better for certain data types, what information gets sacrificed in lossy compression, and how algorithms like Huffman coding exploit patterns to achieve smaller file sizes. Don't just memorize which formats use which methods—know what concept each technique illustrates and be ready to explain the trade-offs involved.
Lossless compression reduces file size while preserving all original data, allowing perfect reconstruction. These techniques exploit patterns and redundancy in data without discarding any information.
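To make "perfect reconstruction" concrete, here is a minimal sketch using Python's standard `zlib` module, a general-purpose lossless compressor. The sample data is an assumption chosen to be highly repetitive; real files will compress less dramatically.

```python
import zlib

# Hypothetical sample data: a short pattern repeated many times (lots of redundancy).
original = b"ABABABABABABABAB" * 50

compressed = zlib.compress(original)    # lossless compression
restored = zlib.decompress(compressed)  # exact reconstruction

print(len(original), "bytes before,", len(compressed), "bytes after")
print("Identical after round trip:", restored == original)
```

The key point is the last line: every byte comes back unchanged, which is what separates lossless from lossy methods.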
Compare: Run-Length Encoding vs. Huffman Coding—both are lossless, but RLE exploits spatial repetition (same values in a row) while Huffman exploits frequency patterns (some symbols appear more often overall). If an FRQ asks about compressing a simple black-and-white image, RLE is your go-to example.
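The difference between the two techniques can be sketched in a few lines of Python. This is an illustrative sketch, not a production codec: the function names are made up for this example, and the Huffman function only reports how many bits each symbol would get rather than building the actual bit strings.

```python
import heapq
from collections import Counter
from itertools import groupby

def rle_encode(data):
    """Collapse each run of identical symbols into a (symbol, run_length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]

def rle_decode(pairs):
    """Expand (symbol, run_length) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in pairs)

def huffman_code_lengths(data):
    """Return {symbol: code length in bits} for a Huffman code over data."""
    freq = Counter(data)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries: (total frequency, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, i, {symbol: 0}) for i, (symbol, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol inside moves one level deeper.
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

# RLE: one row of a black-and-white image (W = white, B = black).
row = "WWWWWWWWBBBWWWW"
pairs = rle_encode(row)
print(pairs)  # [('W', 8), ('B', 3), ('W', 4)]

# Huffman: frequent symbols get shorter codes, wherever they appear.
lengths = huffman_code_lengths("AAAAAAAABBBCCD")
print(lengths)  # A gets 1 bit, B gets 2, C and D get 3 each
```

In the Huffman example the 14 symbols need 23 bits in total (8×1 + 3×2 + 2×3 + 1×3), compared with 28 bits for a fixed 2-bit code. RLE, by contrast, only helps because the identical pixels sit next to each other.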
Lossy compression achieves much higher compression ratios by permanently removing data deemed less important. These techniques rely on the limits of human perception: most people don't notice when certain fine details disappear.
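One simple way to see what "permanently removing data" means is quantization: rounding values to a coarser scale. The sketch below is a simplified illustration of that one idea, not the actual JPEG or MP3 algorithm; the sample values and the step size of 8 are assumptions.

```python
def quantize(samples, step):
    """Round each sample to the nearest multiple of step, discarding fine detail."""
    return [round(s / step) * step for s in samples]

# Hypothetical brightness values (0-255) from a row of pixels.
samples = [13, 14, 15, 200, 201, 199, 57, 58]
lossy = quantize(samples, step=8)

print(lossy)  # [16, 16, 16, 200, 200, 200, 56, 56]
print("Distinct values before:", len(set(samples)), "after:", len(set(lossy)))
```

The quantized list has only 3 distinct values instead of 8, so it is far more repetitive and compresses better. The cost is that the original values cannot be recovered: nothing in `lossy` says whether a 16 started out as 13, 14, or 15.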
Compare: JPEG vs. PNG—both compress images, but JPEG is lossy (smaller files, some quality loss) while PNG is lossless (larger files, perfect quality). Choose JPEG for photographs where minor quality loss is acceptable; choose PNG for graphics, text, or anything requiring exact pixel preservation.
Some compression methods target specific data characteristics rather than general patterns. These techniques work by understanding what makes certain data types predictable.
Compare: Delta Encoding vs. Dictionary-Based Compression—delta encoding exploits sequential similarity (nearby values are similar) while dictionary compression exploits repeated patterns (same sequences appear multiple times). Delta encoding is your best example when discussing video compression or sensor data.
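Both ideas can be sketched briefly in Python. The function names and sample data here are assumptions for illustration, and the LZW sketch shows only the encoding half with a starting dictionary of 256 single characters.

```python
from itertools import accumulate

def delta_encode(values):
    """Store the first value, then only the change from each value to the next."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    """Rebuild the original values with a running sum of the deltas."""
    return list(accumulate(deltas))

def lzw_encode(text):
    """Replace repeated sequences with dictionary codes (LZW-style, encoder only)."""
    dictionary = {chr(i): i for i in range(256)}
    current, codes = "", []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate  # keep extending the known sequence
        else:
            codes.append(dictionary[current])        # emit code for the known part
            dictionary[candidate] = len(dictionary)  # learn the new, longer sequence
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes

# Hypothetical sensor readings: large values, small changes between neighbors.
readings = [1000, 1002, 1003, 1003, 1001, 1004]
deltas = delta_encode(readings)
print(deltas)  # [1000, 2, 1, 0, -2, 3]

codes = lzw_encode("ABABABAB")
print(codes)  # [65, 66, 256, 258, 66]
```

The deltas are small numbers that need fewer bits than the full readings, and the running sum restores the readings exactly. The LZW output uses 5 codes for 8 characters because the sequences "AB" and "ABA" were added to the dictionary and reused.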
The fundamental choice in compression comes down to what you're willing to sacrifice. Understanding this trade-off is one of the most commonly tested concepts on the AP CSP exam.
Compare: Lossless vs. Lossy Compression—lossless preserves all data (perfect for text, code, medical images) while lossy sacrifices some data for dramatically smaller files (ideal for media streaming). An FRQ might ask you to justify which approach fits a given scenario—always consider whether the data can tolerate any loss.
| Format | Compression Type | Primary Technique |
|---|---|---|
| PNG | Lossless | DEFLATE (LZ77 dictionary matching + Huffman coding) |
| GIF | Lossless | LZW dictionary compression |
| JPEG | Lossy | DCT + quantization |
| MP3 | Lossy | Perceptual coding |
| FLAC | Lossless | Predictive coding + entropy coding |
| MP4/MPEG | Lossy | Inter-frame (differences between frames) + intra-frame compression |
| ZIP | Lossless | Huffman + dictionary-based |

| Concept | Best Examples |
|---|---|
| Exploiting repetition | Run-Length Encoding, Dictionary-Based (LZW) |
| Exploiting frequency patterns | Huffman Coding |
| Exploiting sequential similarity | Delta Encoding, Video inter-frame compression |
| Exploiting human perception limits | JPEG (vision), MP3 (hearing), MPEG (both) |
| Lossless image formats | PNG, GIF (limited to 256 colors), TIFF (typically lossless) |
| Lossy media formats | JPEG, MP3, MP4/MPEG |
| Trade-off decision factors | Accuracy needs, file size constraints, editing requirements |
Which two compression techniques both exploit patterns in data but target different types of patterns—and what's the key difference between them?
A hospital needs to store X-ray images for patient records. Should they use JPEG or PNG compression, and what principle guides this decision?
Explain why delta encoding is particularly effective for video compression but would be nearly useless for compressing a collection of unrelated photographs.
Compare Huffman coding and Run-Length Encoding: under what conditions would each technique be most effective, and could they ever be used together?
An FRQ describes a music streaming service that wants to reduce bandwidth costs. Explain the trade-offs involved in increasing the compression ratio of their audio files, referencing specific techniques and their effects on user experience.