Definition of Gabor transform
The Gabor transform is a time-frequency analysis technique that represents a signal in both the time and frequency domains simultaneously. It's designed for non-stationary signals, where the frequency content changes over time. Dennis Gabor introduced the concept in 1946 specifically to address the limitation of the Fourier transform, which tells you what frequencies are present but not when they occur.
Relationship to STFT
The Gabor transform is a specific instance of the Short-Time Fourier Transform (STFT). Both work by sliding a window along the signal and computing the Fourier transform of each windowed segment. The distinction is that the Gabor transform specifically uses a Gaussian window function, which achieves the theoretical minimum of the Heisenberg uncertainty bound for joint time-frequency localization. Any STFT with a Gaussian window is a Gabor transform.
Differences from wavelet transform
The Gabor transform uses a fixed window size for all frequencies, producing uniform time-frequency resolution across the entire plane. The wavelet transform, by contrast, uses variable-sized windows: narrow windows for high frequencies (good time resolution) and wide windows for low frequencies (good frequency resolution).
This makes wavelets better suited for signals with sharp transients at high frequencies and slowly varying low-frequency components. The Gabor transform is preferable when you want consistent resolution at all frequencies, or when the signal's characteristics don't vary dramatically across frequency bands.
Mathematical formulation
The Gabor transform maps a one-dimensional time-domain signal into a two-dimensional time-frequency representation by projecting the signal onto a set of time-frequency shifted Gaussian functions, called Gabor atoms.
Continuous Gabor transform
The continuous Gabor transform of a signal $x(t)$ is defined as:
$$G_x(\tau, \omega) = \int_{-\infty}^{\infty} x(t)\, g^*(t - \tau)\, e^{-j\omega t}\, dt$$
where:
- $g(t)$ is the Gaussian window function
- $\tau$ is the time shift parameter (centers the window)
- $\omega$ is the frequency parameter
- $^*$ denotes complex conjugation
You can read this as: slide the Gaussian window to time $\tau$, multiply the signal by the conjugated window, then extract the frequency content at $\omega$ via the complex exponential.
Discrete Gabor transform
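In practice, you compute the Gabor transform on discrete signals with discrete time-frequency shifts:
$$G[m, k] = \sum_{n=0}^{L-1} x[n]\, g^*[n - ma]\, e^{-j 2\pi k n / M}$$
- $x[n]$: discrete input signal of length $L$
- $g[n]$: discrete Gaussian window
- $m$, $k$: discrete time and frequency shift indices
- $a$: time step between successive windows (hop size)
- $M$: total number of frequency bins (DFT length)
The product of the hop size $a$ and the frequency step $L/M$ (measured in bins of a length-$L$ DFT), taken relative to the signal length $L$, determines whether the transform is oversampled or critically sampled. That ratio is $a/M$, so $a = M$ is critical sampling and $a < M$ is oversampling (more on this below).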
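As a minimal sketch of the discrete transform described above, the following NumPy code evaluates every coefficient as an explicit sum. It assumes circular (periodic) window shifts and a hop size that divides the signal length. The function names `periodic_gaussian` and `gabor_direct` and all parameter values are illustrative choices, not part of any library.

```python
import numpy as np

def periodic_gaussian(L, sigma):
    """Gaussian window of length L, centered at n = 0 and wrapped circularly."""
    n = np.arange(L)
    d = (n + L // 2) % L - L // 2        # signed circular distance from sample 0
    return np.exp(-d**2 / (2.0 * sigma**2))

def gabor_direct(x, g, a, M):
    """G[m, k] = sum_n x[n] * conj(g[n - m*a]) * exp(-2j*pi*k*n/M), by direct summation."""
    L = len(x)
    n = np.arange(L)
    num_frames = L // a                  # assumption: a divides L
    G = np.zeros((num_frames, M), dtype=complex)
    for m in range(num_frames):
        gm = np.roll(g, m * a)           # window shifted circularly to n = m*a
        for k in range(M):
            G[m, k] = np.sum(x * np.conj(gm) * np.exp(-2j * np.pi * k * n / M))
    return G

# Illustrative parameters: signal length, hop size, number of frequency bins
L, a, M = 128, 8, 32
n = np.arange(L)
x = np.cos(2 * np.pi * 4 * n / M)        # stationary tone centered on bin k = 4
G = gabor_direct(x, periodic_gaussian(L, sigma=6.0), a, M)

# Strongest non-negative-frequency bin in each time frame
peak_bins = np.argmax(np.abs(G[:, : M // 2]), axis=1)
```

For this stationary tone every frame peaks in the same bin. The double loop is deliberately naive so that it mirrors the definition term by term.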
Gabor coefficients
The Gabor coefficients encode two things at each time-frequency location:
- Magnitude $|G[m, k]|$: its square is the signal energy concentrated at that point in the time-frequency plane
- Phase $\arg G[m, k]$: local phase structure, which carries information about the fine temporal alignment of frequency components
Both are important. Magnitude is used most often for visualization and feature extraction, but phase is critical for reconstruction and for applications like phase vocoding in audio.
Gabor functions
Gabor functions (atoms) are the elementary building blocks of the transform. Each atom is a Gaussian envelope modulated by a complex sinusoid at a specific frequency and centered at a specific time.
Gaussian window function
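The Gaussian window is defined as:
$$g(t) = \frac{1}{(\pi\sigma^2)^{1/4}}\, e^{-t^2 / (2\sigma^2)}$$
The prefactor normalizes the window to unit energy; other normalizations are also common. The parameter $\sigma$ (standard deviation of the Gaussian envelope) controls the window width. A larger $\sigma$ spreads the window over more time, while a smaller $\sigma$ concentrates it. The Gaussian is chosen because it is the only window shape that achieves equality in the Heisenberg uncertainty bound.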
Time-frequency resolution
The time-frequency resolution is governed entirely by $\sigma$:
- Narrow window (small $\sigma$): good time resolution, poor frequency resolution
- Wide window (large $\sigma$): good frequency resolution, poor time resolution
Once you fix $\sigma$, the resolution is the same everywhere in the time-frequency plane. This is a fundamental constraint you cannot escape, only manage through your choice of $\sigma$ for a given application.
Uncertainty principle
The Heisenberg-Gabor uncertainty principle states:
$$\Delta t \cdot \Delta f \geq \frac{1}{4\pi}$$
where $\Delta t$ and $\Delta f$ are the effective (RMS) time and frequency spreads of the window, with $f$ in hertz. No window function can beat this bound. The Gaussian window is special because it achieves equality, meaning it provides the tightest possible joint localization. This is precisely why Gabor chose it.
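The bound can be checked numerically. The sketch below estimates the RMS time and frequency spreads of a sampled window from its energy density in each domain, and compares a Gaussian with a two-sided exponential window. The helper name `spreads`, the grid, and the choice of comparison window are assumptions made for illustration.

```python
import numpy as np

def spreads(g, t):
    """RMS time and frequency spreads of a window g sampled on the uniform grid t."""
    dt = t[1] - t[0]

    # Time spread: standard deviation of the normalized energy density |g(t)|^2
    p_t = np.abs(g) ** 2
    p_t /= p_t.sum()
    t_mean = np.sum(t * p_t)
    delta_t = np.sqrt(np.sum((t - t_mean) ** 2 * p_t))

    # Frequency spread: same computation on |FFT(g)|^2, with f in cycles per unit time
    f = np.fft.fftfreq(len(t), d=dt)
    p_f = np.abs(np.fft.fft(g)) ** 2
    p_f /= p_f.sum()
    f_mean = np.sum(f * p_f)
    delta_f = np.sqrt(np.sum((f - f_mean) ** 2 * p_f))
    return delta_t, delta_f

t = np.linspace(-20.0, 20.0, 4096, endpoint=False)
sigma = 1.0
gauss = np.exp(-t**2 / (2 * sigma**2))
two_sided_exp = np.exp(-np.abs(t))       # a non-Gaussian window for comparison

bound = 1.0 / (4.0 * np.pi)
prod_gauss = np.prod(spreads(gauss, t))  # sits at the bound, up to discretization error
prod_exp = np.prod(spreads(two_sided_exp, t))  # strictly above the bound
```

Changing `sigma` trades time spread against frequency spread for the Gaussian, but the product stays at the bound.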
Properties of Gabor transform
Linearity
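The Gabor transform is linear. For signals $x_1(t)$ and $x_2(t)$ with constants $\alpha$ and $\beta$:
$$G_{\alpha x_1 + \beta x_2}(\tau, \omega) = \alpha\, G_{x_1}(\tau, \omega) + \beta\, G_{x_2}(\tau, \omega)$$
This means you can analyze composite signals component by component and superpose the results. Linearity holds for the complex coefficients, not for the spectrogram $|G_x|^2$, which is quadratic in the signal and therefore picks up cross-terms between components. Because the transform is linear, it composes predictably with other linear operations such as filtering, which simplifies many processing pipelines.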
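A quick numerical check of this property, written as a sketch with an assumed FFT-based helper `gabor` (circular shifts, as many frequency bins as samples) and arbitrary test signals:

```python
import numpy as np

def gabor(x, g, a):
    """Gabor coefficients with hop a and len(x) frequency bins (circular shifts)."""
    return np.array([np.fft.fft(x * np.conj(np.roll(g, m * a)))
                     for m in range(len(x) // a)])

rng = np.random.default_rng(0)
L, a = 64, 4
d = (np.arange(L) + L // 2) % L - L // 2
g = np.exp(-d**2 / (2 * 5.0**2))         # periodic Gaussian window, sigma = 5 samples

x1 = rng.standard_normal(L)
x2 = rng.standard_normal(L)
alpha, beta = 2.0, -0.5 + 1.5j

# Transform of a linear combination vs. linear combination of transforms
lhs = gabor(alpha * x1 + beta * x2, g, a)
rhs = alpha * gabor(x1, g, a) + beta * gabor(x2, g, a)
linear_ok = np.allclose(lhs, rhs)

# The squared magnitude is quadratic in the signal, so it is generally not additive
spec_sum = np.abs(gabor(x1, g, a)) ** 2 + np.abs(gabor(x2, g, a)) ** 2
spec_additive = np.allclose(np.abs(gabor(x1 + x2, g, a)) ** 2, spec_sum)
```

Here `linear_ok` is true while `spec_additive` is false, which is the practical reason to modify coefficients rather than spectrogram values when you intend to invert afterwards.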
Invertibility
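The Gabor transform is invertible under appropriate conditions. The inverse continuous Gabor transform is:
$$x(t) = \frac{1}{C_g} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} G_x(\tau, \omega)\, g(t - \tau)\, e^{j\omega t}\, d\tau\, d\omega$$
The normalization constant $C_g = 2\pi \int |g(t)|^2\, dt$ depends on the window and must be finite and nonzero (the factor $2\pi$ comes from using angular frequency $\omega$). This plays the role of an admissibility condition; for the Gabor transform it only requires a window with finite energy. In the discrete case, invertibility depends on whether the Gabor atoms form a frame for the signal space. Oversampled systems are generally easier to invert stably than critically sampled ones.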
Parseval's theorem
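Energy is preserved under the Gabor transform, up to a constant set by the window:
$$\int_{-\infty}^{\infty} |x(t)|^2\, dt = \frac{1}{2\pi \|g\|^2} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |G_x(\tau, \omega)|^2\, d\tau\, d\omega$$
where $\|g\|^2 = \int |g(t)|^2\, dt$ is the window energy, equal to 1 for a unit-energy window.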
This guarantees that the total energy you measure in the time-frequency plane equals the energy of the original signal. It's essential for any energy-based analysis or when you modify coefficients and want to predict the effect on signal power.
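In the discrete setting the same identity holds exactly when the shifted windows form a tight frame. The sketch below builds such a window from a Gaussian by dividing out the sum of squared shifts, then compares the two energies. The normalization trick, the circular shifts, and the variable names are assumptions of this example; the factor `1 / L` accounts for NumPy's unnormalized FFT.

```python
import numpy as np

L, a, sigma = 96, 6, 5.0                 # illustrative sizes; a must divide L
d = (np.arange(L) + L // 2) % L - L // 2
g = np.exp(-d**2 / (2 * sigma**2))       # periodic Gaussian window

# S[n] = sum_m g[n - m*a]^2 is periodic with period a, so dividing g by sqrt(S)
# makes the squared shifted windows sum to exactly 1 at every sample.
shifts = [np.roll(g, m * a) for m in range(L // a)]
S = np.sum(np.square(shifts), axis=0)
g_tight = g / np.sqrt(S)

rng = np.random.default_rng(1)
x = rng.standard_normal(L)

# One length-L FFT per frame (L frequency bins, hop a)
G = np.array([np.fft.fft(x * np.roll(g_tight, m * a)) for m in range(L // a)])

energy_signal = np.sum(np.abs(x) ** 2)
energy_tf = np.sum(np.abs(G) ** 2) / L   # time-frequency energy, FFT scaling removed
```

With the plain Gaussian `g` in place of `g_tight`, the two energies differ by a factor that varies slightly with the signal, because the sum of squared shifts is only approximately constant.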
Computation of Gabor transform

Efficient algorithms
Direct computation of the Gabor transform requires an inner product for every point on the time-frequency grid, which gets expensive fast. The standard efficient approach exploits the structure of the transform:
- Segment the signal using the shifted Gaussian window at each time step
- Apply the FFT to each windowed segment to obtain the frequency coefficients
- Collect the results into the time-frequency matrix
This reduces the per-frame cost from $O(M^2)$ to $O(M \log M)$ for $M$ frequency bins. For the inverse transform, efficient dual window computation and overlap-add methods are used.
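The three steps above can be sketched as follows and compared against an explicit DFT sum. This example assumes circular window shifts, a full-length window, and a number of bins `M` that divides the signal length, so each windowed segment is folded (time-aliased) to `M` samples before the FFT. The names `gabor_dft` and `gabor_fft` are hypothetical.

```python
import numpy as np

def gabor_dft(x, g, a, M):
    """Reference version: explicit DFT kernel applied to every windowed frame."""
    L = len(x)
    n = np.arange(L)
    E = np.exp(-2j * np.pi * np.outer(np.arange(M), n) / M)   # (M, L) kernel
    return np.array([E @ (x * np.conj(np.roll(g, m * a))) for m in range(L // a)])

def gabor_fft(x, g, a, M):
    """FFT version: window, fold the segment to length M, then one length-M FFT."""
    L = len(x)
    frames = []
    for m in range(L // a):
        h = x * np.conj(np.roll(g, m * a))          # step 1: windowed segment
        folded = h.reshape(L // M, M).sum(axis=0)   # fold to M samples (M divides L)
        frames.append(np.fft.fft(folded))           # step 2: FFT of the frame
    return np.array(frames)                         # step 3: time-frequency matrix

rng = np.random.default_rng(0)
L, a, M = 128, 8, 32
d = (np.arange(L) + L // 2) % L - L // 2
g = np.exp(-d**2 / (2 * 6.0**2))
x = rng.standard_normal(L)

same = np.allclose(gabor_dft(x, g, a, M), gabor_fft(x, g, a, M))
```

Folding works because the kernel `exp(-2j*pi*k*n/M)` repeats every `M` samples. When the window is shorter than `M`, the fold reduces to simply cutting out the segment.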
Oversampling vs. critical sampling
The sampling density in the time-frequency plane is controlled by the product $ab$, where $a$ is the time step and $b$ is the frequency step (in units such as seconds and hertz, so that $ab$ is dimensionless). The redundancy of the representation is $1/(ab)$.
- Critical sampling ($ab = 1$): the minimum number of coefficients needed for perfect reconstruction. Maximally efficient in storage, but the dual window may be poorly localized and the system is sensitive to perturbations.
- Oversampling ($ab < 1$): more coefficients than strictly necessary, introducing redundancy. The redundancy makes the representation more robust, the dual window better localized, and coefficient modification (e.g., for denoising) more stable.
Most practical implementations use moderate oversampling (redundancy factors of 2 to 4).
Numerical stability
Stability depends on the frame bounds $A$ and $B$ of the Gabor system. The condition number $B/A$ controls how noise and rounding errors propagate through reconstruction. Poorly chosen combinations of window width and sampling parameters can make $B/A$ very large, leading to unstable inversion.
To maintain stability:
- Use oversampling to improve the frame bound ratio
- Normalize the window appropriately
- Avoid sampling parameters near the critical density boundary where frame bounds degrade
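For small problems the frame bounds can be computed directly, which makes the effect of oversampling visible. The sketch below stacks all atoms of a finite, periodic Gabor system into a matrix and reads the bounds off its singular values. The function name `gabor_frame_bounds`, the sizes, and the width rule `sigma = sqrt(a*M / (2*pi))` (a heuristic that balances the window against the sampling lattice) are assumptions of this example.

```python
import numpy as np

def gabor_frame_bounds(L, a, M, sigma):
    """Frame bounds (A, B) of a periodic Gaussian Gabor system on signals of length L."""
    n = np.arange(L)
    d = (n + L // 2) % L - L // 2
    g = np.exp(-d**2 / (2.0 * sigma**2))
    # One row per atom: window shifted by m*a, modulated to frequency bin k of M
    atoms = [np.roll(g, m * a) * np.exp(2j * np.pi * k * n / M)
             for m in range(L // a) for k in range(M)]
    # Eigenvalues of the frame operator are the squared singular values
    s = np.linalg.svd(np.array(atoms), compute_uv=False)
    return s[-1] ** 2, s[0] ** 2

L, M = 48, 8
ratios = {}
for a in (8, 4, 2):                        # redundancy M/a = 1, 2, 4
    sigma = np.sqrt(a * M / (2 * np.pi))   # heuristic width matched to the lattice
    A, B = gabor_frame_bounds(L, a, M, sigma)
    ratios[a] = A / B                      # values near 1 mean well conditioned
```

At critical sampling (`a = 8`) the lower bound collapses toward zero, so `B/A` is enormous; at redundancy 2 the ratio `A/B` is already of order one. This brute-force approach scales poorly with `L` and is meant only as an illustration.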
Applications of Gabor transform
Time-frequency analysis
The primary use of the Gabor transform is visualizing and analyzing how frequency content evolves over time. The squared magnitude $|G_x(\tau, \omega)|^2$ is called the spectrogram (when derived from a Gabor/STFT). Speech signals, musical audio, EEG, and ECG all exhibit time-varying spectral content that the Gabor transform reveals clearly.
Feature extraction
Gabor coefficients serve as localized time-frequency features for classification and pattern recognition. Because each coefficient captures energy at a specific time and frequency, they naturally encode the kind of discriminative structure needed for tasks like:
- Speech and speaker recognition
- Image texture classification (using 2D Gabor filters)
- Mechanical fault diagnosis from vibration signals
The localization properties of the Gaussian window make these features more robust than global Fourier features for non-stationary data.
Denoising and compression
Both tasks operate by modifying Gabor coefficients before inverting back to the time domain.
- Denoising: apply thresholding (hard or soft) to suppress coefficients dominated by noise while preserving signal components. Oversampled representations work better here because the redundancy allows more aggressive thresholding without introducing artifacts.
- Compression: discard or quantize small coefficients to reduce storage. The trade-off is between reconstruction fidelity and compression ratio.
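A minimal sketch of the denoising workflow (analyze, hard-threshold, invert) is shown below. It assumes a heavily oversampled transform with circular shifts, a weighted overlap-add inverse, a known noise level, and a threshold of three times the RMS noise coefficient. All of these are illustrative choices rather than a recommended recipe.

```python
import numpy as np

def analysis(x, g, a):
    """Gabor coefficients with hop a and len(x) frequency bins (real window g)."""
    return np.array([np.fft.fft(x * np.roll(g, m * a)) for m in range(len(x) // a)])

def synthesis(G, g, a):
    """Weighted overlap-add inverse for real signals; exact when no coefficient is changed."""
    L = G.shape[1]
    num = np.zeros(L)
    den = np.zeros(L)
    for m in range(G.shape[0]):
        gm = np.roll(g, m * a)
        num += np.fft.ifft(G[m]).real * gm   # re-window each inverted frame
        den += gm ** 2                       # sum of squared shifted windows
    return num / den

rng = np.random.default_rng(0)
L, a, sigma, noise_std = 256, 8, 16.0, 0.5
n = np.arange(L)
d = (n + L // 2) % L - L // 2
g = np.exp(-d**2 / (2 * sigma**2))

clean = np.cos(2 * np.pi * 20 * n / L)
noisy = clean + noise_std * rng.standard_normal(L)

G = analysis(noisy, g, a)
# RMS magnitude of a pure-noise coefficient is noise_std * sqrt(sum(g^2))
threshold = 3.0 * noise_std * np.sqrt(np.sum(g ** 2))
G_denoised = np.where(np.abs(G) >= threshold, G, 0.0)   # hard thresholding
denoised = synthesis(G_denoised, g, a)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
roundtrip_ok = np.allclose(synthesis(G, g, a), noisy)
```

Compression follows the same pattern, with the thresholding step replaced by quantizing or discarding small coefficients before storage.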
Variants and extensions
Gabor frames
Gabor frame theory provides the mathematical foundation for the discrete Gabor transform. A collection of Gabor atoms $\{g_{m,k}\}$ forms a frame if there exist constants $0 < A \leq B < \infty$ such that:
$$A\, \|x\|^2 \leq \sum_{m,k} |\langle x, g_{m,k} \rangle|^2 \leq B\, \|x\|^2$$
for all signals $x$. When this holds, stable reconstruction is guaranteed via the dual frame. Frame theory also extends beyond Gaussian windows, allowing other window shapes as long as the frame condition is satisfied.
Multiwindow Gabor transform
Instead of a single Gaussian, the multiwindow variant uses several window functions simultaneously. Different windows can target different signal characteristics: a narrow window captures transients, while a wider window resolves closely spaced frequency components. The combined representation is richer, though at the cost of increased redundancy and computation.
Adaptive Gabor transform
The adaptive Gabor transform adjusts the window parameters (width, shape) locally based on the signal's time-frequency structure. The goal is to overcome the fixed-resolution limitation of the standard Gabor transform. For example, regions with rapid transients get a narrower window, while tonal regions get a wider one. This is conceptually appealing but computationally more demanding and requires a reliable criterion for local adaptation.
Relationship to other transforms
Fourier transform
The Fourier transform can be viewed as a limiting case of the Gabor transform where the window spans the entire signal (as $\sigma \to \infty$ the Gaussian flattens toward a constant, infinitely long window). It provides perfect frequency resolution but zero time localization. The Gabor transform trades some frequency precision for the ability to track spectral changes over time.
Wavelet transform
The wavelet transform uses a multi-resolution approach: the analysis window scales with frequency. At high frequencies, the wavelet has short duration (good time resolution); at low frequencies, it has long duration (good frequency resolution). The Gabor transform, by contrast, uses the same window at every frequency, giving uniform resolution.
Neither is universally better. Wavelets suit signals with broadband transients and slowly varying baselines. The Gabor transform suits signals where uniform resolution is acceptable or preferred, and where phase coherence across frequency is important.
Wigner-Ville distribution
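The Wigner-Ville distribution (WVD) is a quadratic (bilinear) time-frequency representation defined as:
$$W_x(t, \omega) = \int_{-\infty}^{\infty} x\!\left(t + \frac{\tau}{2}\right) x^*\!\left(t - \frac{\tau}{2}\right) e^{-j\omega\tau}\, d\tau$$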
It achieves excellent time-frequency concentration for single-component signals but produces cross-terms (interference artifacts) for multi-component signals. Smoothing the WVD with a Gaussian kernel in both time and frequency yields the Gabor spectrogram, which suppresses cross-terms at the expense of some resolution. This connection highlights the Gabor transform as a practical compromise between resolution and interference suppression.