Image enhancement sits at the foundation of nearly every computer vision pipeline. Before any algorithm can detect objects, recognize faces, or segment scenes, the input image often needs preprocessing to correct for poor lighting, sensor noise, or low contrast. You're being tested on your understanding of how these techniques manipulate pixel values and when to apply each one.
The techniques here demonstrate core principles: intensity transformations, spatial domain operations, frequency domain analysis, and adaptive processing. Exam questions will ask you to select the appropriate technique for a given scenario, explain the mathematical basis behind a method, or compare approaches for noise reduction versus edge preservation. Don't just memorize definitions. Know what problem each technique solves and the tradeoffs involved.
Point operations (intensity transformations) act directly on pixel values through a mathematical function, mapping each input intensity to an output intensity without considering neighboring pixels. The key principle: a point operation applies the same transformation function to every pixel independently.
Histogram equalization redistributes pixel intensities to produce a more uniform histogram, which spreads out the most frequent intensity values and increases global contrast.
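As a minimal sketch of the idea, assuming an 8-bit grayscale image stored as a NumPy array, the cumulative histogram can be used as a lookup table that remaps each intensity. The function name `equalize_histogram` and the tiny test image are illustrative only, and production libraries may differ in how they round or normalize.

```python
import numpy as np

def equalize_histogram(image: np.ndarray, levels: int = 256) -> np.ndarray:
    """Equalize an unsigned-integer grayscale image using its cumulative histogram."""
    hist = np.bincount(image.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest intensity present
    total = image.size
    if total == cdf_min:               # only one intensity: nothing to spread out
        return image.copy()
    # Rescale the CDF so the darkest present level maps to 0 and the brightest to levels - 1
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * (levels - 1))
    lut = np.clip(lut, 0, levels - 1).astype(image.dtype)
    return lut[image]                  # apply the lookup table to every pixel

# Hypothetical low-contrast image squeezed into the range 100..103
img = np.array([[100, 100, 101, 101],
                [101, 102, 102, 103]], dtype=np.uint8)
equalized = equalize_histogram(img)
```

In this toy case the four crowded input levels end up spread across the full 0 to 255 range, while their relative ordering is preserved.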
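Contrast stretching linearly maps the existing intensity range $[r_{\min}, r_{\max}]$ to the full available range $[0, L-1]$:

$$s = \frac{r - r_{\min}}{r_{\max} - r_{\min}}\,(L - 1)$$

where $r$ is the input intensity, $s$ is the output intensity, and $L$ is the number of gray levels (256 for an 8-bit image).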
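The linear mapping can be sketched in a few lines, assuming an 8-bit image and the usual min-max form of the stretch (output = (r - r_min) / (r_max - r_min) × 255). The name `stretch_contrast` and the sample values are illustrative.

```python
import numpy as np

def stretch_contrast(image: np.ndarray, levels: int = 256) -> np.ndarray:
    """Linearly map the image's [min, max] intensity range onto [0, levels - 1]."""
    r = image.astype(np.float64)
    r_min, r_max = r.min(), r.max()
    if r_max == r_min:                 # flat image: no range to stretch
        return image.copy()
    s = (r - r_min) / (r_max - r_min) * (levels - 1)
    return np.round(s).astype(image.dtype)

# Hypothetical image that only uses intensities 50..150
img = np.array([[50, 75],
                [100, 150]], dtype=np.uint8)
stretched = stretch_contrast(img)
```

Because the mapping depends only on the minimum and maximum, a single very dark or very bright outlier pixel can leave the rest of the image almost unchanged. That sensitivity is one practical difference from histogram equalization.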
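Gamma correction is a nonlinear power-law transformation defined as $s = c\,r^{\gamma}$, where $c$ is a scaling constant and $\gamma$ controls the curve shape. With intensities normalized to $[0, 1]$, $\gamma < 1$ brightens dark regions and $\gamma > 1$ darkens the image.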
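A short sketch of the power-law transform (output = c × input^gamma), assuming 8-bit input that is normalized to [0, 1] before the exponent is applied. The function name `gamma_correct` and the sample row are illustrative.

```python
import numpy as np

def gamma_correct(image: np.ndarray, gamma: float, c: float = 1.0) -> np.ndarray:
    """Apply s = c * r**gamma to an 8-bit image, with r normalized to [0, 1]."""
    r = image.astype(np.float64) / 255.0
    s = c * np.power(r, gamma)
    return np.clip(np.round(s * 255.0), 0, 255).astype(np.uint8)

img = np.array([[0, 64, 128, 255]], dtype=np.uint8)
brighter = gamma_correct(img, gamma=0.5)   # exponent below 1 lifts mid-tones
darker = gamma_correct(img, gamma=2.0)     # exponent above 1 pushes mid-tones down
```

With `c = 1`, the endpoints 0 and 255 stay fixed for any positive exponent. Only the intermediate intensities move, which is why gamma changes perceived brightness without clipping the range.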
Compare: Histogram Equalization vs. Contrast Stretching. Both improve contrast, but histogram equalization adapts to the image's actual distribution while contrast stretching applies a fixed linear mapping. If an exam question mentions "adaptive contrast improvement," histogram equalization is your answer.
Spatial filters modify pixel values based on the values of neighboring pixels within a defined kernel (or window). The underlying mechanism: convolution of the image with a filter mask determines whether you smooth, sharpen, or detect features.
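Unsharp masking sharpens by adding a scaled version of the high-frequency detail back into the original image:

$$g(x, y) = f(x, y) + k\,\big[f(x, y) - f_{\text{blur}}(x, y)\big]$$

The term $f(x, y) - f_{\text{blur}}(x, y)$ isolates the detail (edges and texture) that was removed by blurring, and the scale factor $k$ controls how strongly that detail is added back.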
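The following sketch implements sharpened = original + k × (original − blurred). It assumes a 3×3 box blur as the smoothing step purely for simplicity (a Gaussian blur is the more common choice), and the names `box_blur`, `unsharp_mask`, and `k` are illustrative.

```python
import numpy as np

def box_blur(image: np.ndarray, size: int = 3) -> np.ndarray:
    """Average each pixel with its size x size neighborhood (edge-replicated borders)."""
    pad = size // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    total = np.zeros((h, w), dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + h, dx:dx + w]
    return total / (size * size)

def unsharp_mask(image: np.ndarray, k: float = 1.0) -> np.ndarray:
    """Add k times the detail layer (original minus blurred) back to the original."""
    f = image.astype(np.float64)
    detail = f - box_blur(f)           # high-frequency content removed by the blur
    return np.clip(f + k * detail, 0, 255)

# Hypothetical vertical step edge: 50 on the left, 150 on the right
img = np.full((5, 6), 50.0)
img[:, 3:] = 150.0
sharp = unsharp_mask(img, k=1.0)
```

Flat regions have zero detail and are left alone, while the pixels on either side of the edge are pushed further apart. The jump across the edge grows from 100 to roughly 167 in this example.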
Choosing the right filter depends on the type of noise:
The matching rule: salt-and-pepper noise responds best to median filtering (because the median ignores the extreme outlier values), while Gaussian noise responds better to Gaussian or mean filters (which average out the distributed noise).
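To make the matching rule concrete, here is a small sketch that runs a 3×3 mean filter and a 3×3 median filter over a flat image containing a single "salt" pixel. The helper `filter3x3` and the test image are illustrative assumptions, not a library API.

```python
import numpy as np

def filter3x3(image: np.ndarray, reducer) -> np.ndarray:
    """Apply a reducer (such as np.mean or np.median) over every 3x3 neighborhood."""
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    h, w = image.shape
    # Stack the nine shifted views so axis 0 indexes the neighbors of each pixel
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(3) for dx in range(3)])
    return reducer(windows, axis=0)

# Flat gray image with one salt pixel in the middle
img = np.full((5, 5), 100.0)
img[2, 2] = 255.0

mean_out = filter3x3(img, np.mean)
median_out = filter3x3(img, np.median)
```

The median output is exactly flat again, because one outlier among nine values never reaches the middle of the sorted list. The mean output still carries a smeared bright blob (about 117 instead of 100) across the whole 3×3 area around the corrupted pixel.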
Fixed filters apply the same operation everywhere, which is a problem when noise levels or image content vary across the frame. Adaptive filters adjust their behavior based on local statistics like mean and variance.
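One common textbook formulation of this idea is the adaptive local noise-reduction filter, sketched below under the assumption that the noise variance is known or estimated in advance. The output is g − (noise variance / local variance) × (g − local mean), with the ratio capped at 1. The names `neighborhoods`, `adaptive_local_filter`, and `noise_var` are illustrative, and other adaptive filters use different rules.

```python
import numpy as np

def neighborhoods(image: np.ndarray, size: int = 3) -> np.ndarray:
    """Stack every size x size neighborhood along axis 0 (edge-replicated borders)."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    h, w = image.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(size) for dx in range(size)])

def adaptive_local_filter(image: np.ndarray, noise_var: float, size: int = 3) -> np.ndarray:
    """Smooth strongly where local variance is near the noise level, weakly near edges."""
    g = image.astype(np.float64)
    windows = neighborhoods(g, size)
    local_mean = windows.mean(axis=0)
    local_var = windows.var(axis=0)
    # ratio = noise_var / local_var, left at 1 where the local variance is zero
    ratio = np.ones_like(g)
    np.divide(noise_var, local_var, out=ratio, where=local_var > 0)
    ratio = np.minimum(ratio, 1.0)     # never smooth more than a plain mean filter
    return g - ratio * (g - local_mean)

# Strong step edge: local variance is far above the assumed noise variance
edge = np.zeros((5, 6))
edge[:, 3:] = 200.0
smoothed_edge = adaptive_local_filter(edge, noise_var=25.0)

# Nearly flat patch: local variance is below the assumed noise variance
flat = np.full((5, 5), 100.0)
flat[2, 2] = 103.0
smoothed_flat = adaptive_local_filter(flat, noise_var=100.0)
```

Near the edge the ratio is tiny, so the pixel is returned almost unchanged. In the nearly flat patch the ratio saturates at 1, and the filter behaves like a local mean.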
Compare: Median Filter vs. Gaussian Filter. Both reduce noise, but median filtering is nonlinear and excels at removing salt-and-pepper noise while preserving edges. Gaussian filtering is linear and better for Gaussian-distributed noise but blurs edges. Exam questions often ask which filter to choose for a specific noise type.
Edge detection identifies locations where intensity changes rapidly, marking boundaries between regions. The mathematical basis: edges correspond to high values of the first derivative (gradient) or zero-crossings of the second derivative (Laplacian).
Gradient-based operators compute the magnitude of intensity change. The Sobel operator, for example, computes separate horizontal ($G_x$) and vertical ($G_y$) derivatives, then combines them:

$$|\nabla f| = \sqrt{G_x^2 + G_y^2}$$
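A minimal sketch of the Sobel gradient magnitude, sqrt(Gx² + Gy²), using the standard 3×3 Sobel kernels. The helper `correlate3x3` applies the kernel without flipping it (correlation rather than true convolution). That changes only the sign of each derivative, not the magnitude. All names here are illustrative.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)   # responds to horizontal intensity change
SOBEL_Y = SOBEL_X.T                                  # responds to vertical intensity change

def correlate3x3(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide a 3x3 kernel over the image (edge-replicated borders, no kernel flip)."""
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def sobel_magnitude(image: np.ndarray) -> np.ndarray:
    gx = correlate3x3(image, SOBEL_X)
    gy = correlate3x3(image, SOBEL_Y)
    return np.hypot(gx, gy)            # sqrt(gx**2 + gy**2)

# Vertical step edge between columns 2 and 3
img = np.zeros((5, 6))
img[:, 3:] = 10.0
magnitude = sobel_magnitude(img)
```

The response is zero in the flat regions and peaks in the two columns that straddle the step. In practice the magnitude is usually thresholded afterwards to obtain a binary edge map.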
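The Canny edge detector is a multi-stage pipeline (Gaussian smoothing, gradient computation, non-maximum suppression, double thresholding, and edge tracking by hysteresis) and remains a widely used standard for accuracy and noise robustness.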
Prewitt (3×3) and Roberts (2×2) operators are simpler gradient-based alternatives, but they're more sensitive to noise than Canny.
Thresholding converts a grayscale image to binary: $g(x, y) = 1$ if $f(x, y) > T$, else $g(x, y) = 0$, where $T$ is the threshold. This is a critical step in segmentation pipelines.
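The sketch below pairs the basic threshold rule (1 where the pixel exceeds T, else 0) with Otsu's method, which picks T automatically by maximizing the between-class variance of the histogram. It assumes an 8-bit image; the function names and the toy image are illustrative.

```python
import numpy as np

def threshold(image: np.ndarray, t: float) -> np.ndarray:
    """Binary image: 1 where the pixel is strictly greater than t, else 0."""
    return (image > t).astype(np.uint8)

def otsu_threshold(image: np.ndarray, levels: int = 256) -> int:
    """Return the threshold that maximizes between-class variance (Otsu's criterion)."""
    hist = np.bincount(image.ravel(), minlength=levels).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                        # probability of the class at or below t
    mu = np.cumsum(prob * np.arange(levels))       # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(levels)
    valid = denom > 1e-12                          # skip thresholds that leave a class empty
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))

# Hypothetical bimodal image: dark pixels near 10, bright pixels near 200
img = np.array([[10, 10, 12, 12],
                [200, 200, 202, 202]], dtype=np.uint8)
t = otsu_threshold(img)
binary = threshold(img, t)
```

For a cleanly bimodal histogram like this one, any threshold between the two clusters gives the same split. `np.argmax` simply returns the first such value.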
Compare: Global Thresholding vs. Adaptive Thresholding. Global works when lighting is uniform, but adaptive thresholding handles shadows and gradients by computing thresholds from local neighborhoods. If an exam scenario mentions "varying illumination," adaptive is the correct choice.
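As an illustrative sketch of the difference, the code below thresholds each pixel against the mean of its own 3×3 neighborhood plus a small offset, and compares the result with a single global threshold on an image whose background brightens from left to right. The rule "pixel > local mean + offset" is one simple choice for bright objects on a darker background. The names, window size, and offset are assumptions for this example.

```python
import numpy as np

def local_mean(image: np.ndarray, size: int = 3) -> np.ndarray:
    """Mean of each size x size neighborhood (edge-replicated borders)."""
    pad = size // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    total = np.zeros((h, w), dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + h, dx:dx + w]
    return total / (size * size)

def adaptive_threshold(image: np.ndarray, size: int = 3, offset: float = 5.0) -> np.ndarray:
    """Mark pixels that are brighter than their local mean by more than offset."""
    return (image > local_mean(image, size) + offset).astype(np.uint8)

# Background ramps from 0 to 90; two bright spots (+40) sit at columns 2 and 7
row = np.array([0, 10, 60, 30, 40, 50, 60, 110, 80, 90], dtype=np.float64)
img = np.tile(row, (5, 1))

global_mask = (img > 55).astype(np.uint8)   # one threshold for the whole frame
adaptive_mask = adaptive_threshold(img)     # threshold follows the local brightness
```

The global threshold picks up the bright end of the ramp as false foreground, while the local rule isolates only the two spots that stand out from their surroundings.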
Frequency domain methods transform images using the Fourier Transform, allowing you to manipulate specific frequency components directly. The core insight: low frequencies carry overall structure and smooth variations; high frequencies encode edges, noise, and fine details.
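The Discrete Fourier Transform (DFT) converts spatial data to a frequency representation. For an $M \times N$ image $f(x, y)$:

$$F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-j 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right)}$$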
Once in the frequency domain, you multiply by a filter function to keep or remove specific frequencies, then apply the inverse transform to get back to a spatial image.
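The transform, multiply, inverse-transform sequence can be sketched with NumPy's FFT routines. This example assumes an ideal (hard cutoff) low-pass filter, which is the simplest choice but causes ringing on real images. The name `ideal_lowpass` and the `cutoff` parameter are illustrative.

```python
import numpy as np

def ideal_lowpass(image: np.ndarray, cutoff: float) -> np.ndarray:
    """Keep only frequencies within `cutoff` of the spectrum center, then invert."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))   # move zero frequency to the center
    h, w = image.shape
    v = np.arange(h) - h // 2
    u = np.arange(w) - w // 2
    distance = np.hypot(v[:, None], u[None, :])      # distance from the zero frequency
    mask = distance <= cutoff                        # filter function: 1 inside, 0 outside
    filtered = spectrum * mask
    return np.real(np.fft.ifft2(np.fft.ifftshift(filtered)))

# Constant brightness 5 plus a checkerboard, the highest spatial frequency possible
yy, xx = np.indices((8, 8))
img = 5.0 + np.where((yy + xx) % 2 == 0, 1.0, -1.0)

smooth = ideal_lowpass(img, cutoff=1.0)      # removes the checkerboard
unchanged = ideal_lowpass(img, cutoff=10.0)  # cutoff beyond every frequency: passes everything
```

Inverting the mask (`distance > cutoff`) would give the corresponding ideal high-pass filter, which keeps the checkerboard and discards the constant brightness.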
Compare: Spatial Domain vs. Frequency Domain Filtering. Spatial filtering is intuitive and computationally efficient for small kernels. Frequency domain filtering excels for large kernels and gives precise control over which frequencies to modify. The key relationship: convolution in the spatial domain equals multiplication in the frequency domain. This means a large spatial convolution (expensive) can be done as a simple multiplication after a Fourier Transform (often faster).
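The convolution theorem can be checked numerically. The sketch below computes a circular (wrap-around) convolution directly in the spatial domain and again as a pointwise product of FFTs, then compares the two. The function names and the random test data are illustrative.

```python
import numpy as np

def circular_convolve(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Direct circular convolution: a sum of shifted, weighted copies of the image."""
    out = np.zeros(image.shape, dtype=np.float64)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * np.roll(image, shift=(dy, dx), axis=(0, 1))
    return out

def fft_convolve(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Same result through the frequency domain: multiply the two spectra."""
    padded = np.zeros(image.shape, dtype=np.float64)
    padded[:kernel.shape[0], :kernel.shape[1]] = kernel   # zero-pad kernel to image size
    product = np.fft.fft2(image) * np.fft.fft2(padded)
    return np.real(np.fft.ifft2(product))

rng = np.random.default_rng(0)
image = rng.random((8, 8))
kernel = rng.random((3, 3))

spatial = circular_convolve(image, kernel)
frequency = fft_convolve(image, kernel)
```

The direct version costs work proportional to the kernel area for every pixel, while the FFT version costs about the same regardless of kernel size. That is why the frequency route tends to win for large kernels. To get ordinary (non-wrapping) convolution this way, both arrays would first be zero-padded to at least image size plus kernel size minus one in each dimension.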
| Concept | Best Examples |
|---|---|
| Point operations (intensity transforms) | Histogram Equalization, Contrast Stretching, Gamma Correction |
| Linear spatial filtering | Gaussian Smoothing, Mean Filter, Sharpening Kernels |
| Nonlinear spatial filtering | Median Filter, Adaptive Filtering |
| Edge detection | Sobel, Canny, Prewitt |
| Segmentation | Image Thresholding, Otsu's Method |
| Frequency domain | Fourier Transform, Low-pass/High-pass Filters |
| Detail enhancement | Unsharp Masking, High-pass Filtering |
| Noise-specific solutions | Median (salt-and-pepper), Gaussian filter (Gaussian noise) |
Which two techniques both improve contrast but differ in whether they adapt to the image's histogram distribution? Explain when you'd choose one over the other.
You're given an image corrupted by salt-and-pepper noise. Compare the effectiveness of mean filtering versus median filtering, and justify which you'd select.
Explain how unsharp masking achieves sharpening without directly computing derivatives. What parameter controls the strength of the effect?
An image has uneven lighting across the frame. Compare global thresholding with adaptive thresholding. Which would you recommend and why?
Describe the relationship between spatial domain convolution and frequency domain multiplication. Why might you choose frequency domain filtering for a very large smoothing kernel?