The Wigner-Ville distribution (WVD) provides a joint time-frequency representation of a signal, mapping it into a two-dimensional plane where you can see how energy is distributed across both time and frequency simultaneously. This makes it especially useful for non-stationary signals whose frequency content changes over time.

The WVD is a quadratic (bilinear) time-frequency distribution belonging to Cohen's class. Unlike linear representations such as the short-time Fourier transform (STFT), it is built from a product of the signal with a shifted copy of itself. This quadratic structure gives it superior resolution but also introduces the cross-term artifacts discussed later.

Mathematical formulation

The Wigner-Ville distribution of a signal $x(t)$ is defined as:

$W_x(t,f) = \int_{-\infty}^{\infty} x\!\left(t+\frac{\tau}{2}\right) x^*\!\left(t-\frac{\tau}{2}\right) e^{-j2\pi f \tau}\, d\tau$

where:

$x(t)$ is the analytic signal
$x^*(t)$ is its complex conjugate
$t$ is the time variable
$f$ is the frequency variable
$\tau$ is a time-lag variable

The product $x(t+\frac{\tau}{2})\, x^*(t-\frac{\tau}{2})$ is called the instantaneous autocorrelation function. The WVD is simply the Fourier transform of this product with respect to $\tau$. That's the core idea: at each time instant $t$, you compute a local autocorrelation centered on $t$, then transform it to get the frequency content at that moment.

Note that the WVD is typically applied to the analytic signal (the original real signal plus $j$ times its Hilbert transform) rather than the real signal directly. Using the analytic signal eliminates interference between positive and negative frequency components.
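
The construction of the analytic signal can be sketched in a few lines. This is a minimal illustration, assuming NumPy is available; the helper name analytic_signal and the 32 Hz test tone are hypothetical choices made for the example. It zeroes the negative-frequency FFT bins and doubles the positive ones, which is the same kind of analytic signal that scipy.signal.hilbert returns.

```python
import numpy as np

def analytic_signal(x_real):
    """Analytic signal via the FFT: drop negative frequencies, double positive ones."""
    x_real = np.asarray(x_real, dtype=float)
    N = x_real.size
    X = np.fft.fft(x_real)
    h = np.zeros(N)
    h[0] = 1.0                       # keep DC unchanged
    if N % 2 == 0:
        h[N // 2] = 1.0              # keep the Nyquist bin unchanged
        h[1:N // 2] = 2.0            # double the positive frequencies
    else:
        h[1:(N + 1) // 2] = 2.0
    return np.fft.ifft(X * h)        # negative-frequency bins stay at zero

fs = 256.0                           # sampling rate in Hz (illustrative)
t = np.arange(256) / fs
x_real = np.cos(2 * np.pi * 32.0 * t)
z = analytic_signal(x_real)

Z = np.fft.fft(z)
pos_energy = np.sum(np.abs(Z[1:128]) ** 2)   # bins 1..127: positive frequencies
neg_energy = np.sum(np.abs(Z[129:]) ** 2)    # bins 129..255: negative frequencies
```

The real part of z reproduces the original cosine, while its spectrum has essentially no energy at negative frequencies, which is what removes the positive/negative interference in the WVD.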

Relation to time-frequency analysis

The WVD maps a one-dimensional signal into a two-dimensional time-frequency representation, providing information about how signal energy is distributed across both domains simultaneously. It's particularly well-suited for analyzing:

Chirp signals (linearly frequency-modulated), where the WVD produces a perfectly concentrated line along the instantaneous frequency
Frequency-hopping signals, common in communications and radar
Transient events in biomedical or seismic data

For a linear chirp, the WVD achieves ideal concentration along the true instantaneous frequency trajectory, something no linear time-frequency method can match.
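
The chirp result follows directly from the definition. As a short worked derivation, take a unit-amplitude linear chirp with starting frequency $f_0$ and chirp rate $\alpha$ (both symbols are introduced here for illustration):

$x(t) = e^{j2\pi\left(f_0 t + \frac{\alpha}{2} t^2\right)}, \qquad f_i(t) = f_0 + \alpha t$

The quadratic phase terms cancel in the instantaneous autocorrelation, leaving a pure complex exponential in $\tau$:

$x\!\left(t+\frac{\tau}{2}\right) x^*\!\left(t-\frac{\tau}{2}\right) = e^{j2\pi (f_0 + \alpha t)\tau}$

Taking the Fourier transform with respect to $\tau$ gives a delta function that sits exactly on the instantaneous frequency:

$W_x(t,f) = \int_{-\infty}^{\infty} e^{j2\pi (f_0 + \alpha t)\tau}\, e^{-j2\pi f \tau}\, d\tau = \delta\!\left(f - f_0 - \alpha t\right)$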

Properties of Wigner-Ville distribution

The WVD satisfies a number of mathematical properties that make it theoretically attractive. Understanding these properties clarifies both why the WVD is powerful and where its guarantees hold.

Time-frequency resolution

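The WVD offers high resolution in both time and frequency domains simultaneously. Unlike the STFT-based spectrogram, the WVD does not use a fixed analysis window, so it avoids the window-length trade-off of windowed methods, where a short window sharpens time localization but blurs frequency and a long window does the opposite. The uncertainty principle still limits how concentrated any signal can be; the WVD simply adds no window-induced smearing on top of that limit.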

This means the WVD can resolve signal components that are closely spaced in time, in frequency, or both. For a single-component signal like a linear chirp, the WVD concentrates energy exactly along the instantaneous frequency with no smearing.

However, this resolution advantage comes with a cost: for multi-component signals, the same quadratic structure that provides high auto-term resolution also generates cross-terms between every pair of components.

Marginal properties

The WVD satisfies both marginal properties, which connect the time-frequency distribution back to familiar time-domain and frequency-domain quantities.

Time marginal: Integrating the WVD over all frequencies yields the instantaneous power: $\int_{-\infty}^{\infty} W_x(t,f)\, df = |x(t)|^2$
Frequency marginal: Integrating the WVD over all time yields the energy spectral density: $\int_{-\infty}^{\infty} W_x(t,f)\, dt = |X(f)|^2$

where $X(f)$ is the Fourier transform of $x(t)$. These properties guarantee that the WVD is consistent with the known energy distributions in each domain individually. Not all time-frequency distributions satisfy both marginals, so this is a distinguishing feature of the WVD.
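
The time marginal can be verified in two steps from the definition. Integrating over $f$ first turns the complex exponential into a delta function in the lag variable:

$\int_{-\infty}^{\infty} W_x(t,f)\, df = \int_{-\infty}^{\infty} x\!\left(t+\frac{\tau}{2}\right) x^*\!\left(t-\frac{\tau}{2}\right) \left[\int_{-\infty}^{\infty} e^{-j2\pi f \tau}\, df\right] d\tau$

$= \int_{-\infty}^{\infty} x\!\left(t+\frac{\tau}{2}\right) x^*\!\left(t-\frac{\tau}{2}\right) \delta(\tau)\, d\tau = x(t)\, x^*(t) = |x(t)|^2$

The frequency marginal follows from the same argument applied to the equivalent frequency-domain form of the WVD.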

Moyal's formula

Moyal's formula relates the inner product of two signals in the time domain to the inner product of their WVDs in the time-frequency plane:

$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} W_x(t,f)\, W_y(t,f)\, dt\, df = |\langle x, y \rangle|^2$

Setting $y = x$ shows that the squared WVD integrates to the squared signal energy, $\|x\|^4$, so the WVD's overall scale is tied directly to the signal's energy. The formula also has practical value: you can measure the similarity between two signals by computing the overlap of their WVDs, which is useful in signal detection and matched filtering.

Convolution and multiplication properties

The WVD interacts cleanly with convolution and multiplication:

Convolution property (time domain): If $z(t) = x(t) * y(t)$, then $W_z(t,f) = \int_{-\infty}^{\infty} W_x(t-\tau, f)\, W_y(\tau, f)\, d\tau$. The WVD of the convolution equals the convolution of the individual WVDs along the time axis.
Multiplication property (frequency domain): If $z(t) = x(t) \cdot y(t)$, then $W_z(t,f) = \int_{-\infty}^{\infty} W_x(t, f-\nu)\, W_y(t, \nu)\, d\nu$. The WVD of the product equals the convolution of the individual WVDs along the frequency axis.

These properties are useful when analyzing signals that have passed through linear time-invariant systems (convolution with an impulse response) or have been modulated (multiplication with a carrier).

Advantages of Wigner-Ville distribution

High resolution in time-frequency domain

Every other member of Cohen's class can be written as a smoothed version of the WVD, so the WVD offers the sharpest auto-term localization in the class. It is not constrained by a fixed window length (as in the STFT) or by the choice of mother wavelet (as in the wavelet transform). For a single-component linear chirp, this translates to perfectly sharp localization of energy along the instantaneous frequency.

This makes the WVD especially effective for signals with closely spaced frequency components or rapidly varying frequency content, such as chirps or frequency-hopping sequences.

Ability to analyze non-stationary signals

The ordinary Fourier transform shows which frequencies a signal contains but not when they occur, so it cannot capture how frequency content evolves over time. The WVD provides a full joint time-frequency representation, making it naturally suited for non-stationary signals. The instantaneous frequency and group delay can be recovered as first moments of the distribution (its local centroid along frequency at each time, and along time at each frequency), which is critical in radar target identification, sonar classification, and speech analysis.

Preservation of signal energy

Because the WVD satisfies both marginal properties and Moyal's formula, total signal energy is faithfully represented in the time-frequency plane. No energy is lost or artificially created by the transformation. This makes the WVD a reliable basis for energy-based detection, classification, and feature extraction.

Disadvantages of Wigner-Ville distribution

Presence of cross-terms

The most significant drawback of the WVD is the appearance of cross-terms (also called interference terms). For a signal with $N$ components, the WVD produces $N$ auto-terms (the true signal components) plus $\binom{N}{2}$ cross-terms. So a signal with just 3 components generates 3 cross-terms; a signal with 10 components generates 45.

Cross-terms arise because the WVD is quadratic: the instantaneous autocorrelation function mixes every pair of signal components. These artifacts typically appear midway between the auto-terms in the time-frequency plane and oscillate at a rate proportional to the separation between the components they connect.
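
A two-tone example makes the geometry concrete. The sketch below is a minimal illustration, assuming NumPy; the function name wvd, the tone frequencies, and the FFT convention (bin $k$ corresponds to frequency $k f_s / (2N)$, with no leading factor of 2) are choices made for this example. It computes a discrete WVD of two complex tones at 16 Hz and 40 Hz and inspects the row midway between them.

```python
import numpy as np

def wvd(x):
    """Discrete WVD of a complex signal; returns W[k, n] (frequency bin k, time n).

    Bin k corresponds to frequency k * fs / (2 * N) under this FFT convention.
    """
    x = np.asarray(x, dtype=complex)
    N = x.size
    W = np.zeros((N, N))
    for n in range(N):
        mmax = min(n, N - 1 - n)                  # lags that stay inside the record
        m = np.arange(-mmax, mmax + 1)
        r = np.zeros(N, dtype=complex)
        r[m % N] = x[n + m] * np.conj(x[n - m])   # instantaneous autocorrelation
        W[:, n] = np.fft.fft(r).real              # real because r[-m] = conj(r[m])
    return W

N, fs = 128, 128.0
t = np.arange(N) / fs
f1, f2 = 16.0, 40.0
x = np.exp(2j * np.pi * f1 * t) + np.exp(2j * np.pi * f2 * t)
W = wvd(x)

k1 = int(round(2 * f1 * N / fs))      # auto-term row for f1
k2 = int(round(2 * f2 * N / fs))      # auto-term row for f2
k_mid = (k1 + k2) // 2                # row at (f1 + f2) / 2, where the cross-term sits
n0 = N // 2

auto_peak = W[k1, n0]
cross_peak = W[k_mid, n0]
cross_row = W[k_mid, :]
# Dominant oscillation of the cross-term along time, in Hz (bin spacing is fs / N = 1 Hz)
osc_bin = int(np.argmax(np.abs(np.fft.rfft(cross_row))))
```

In this setup the cross-term row lies at 28 Hz, halfway between the tones. Its peak is roughly twice the auto-term peak, it swings between positive and negative values, and it oscillates along time at the difference frequency $f_2 - f_1 = 24$ Hz.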

Interpretation difficulties due to cross-terms

Cross-terms can have amplitudes comparable to or even exceeding the auto-terms: for two components of equal amplitude, the cross-term peak is up to twice the auto-term peak. They may overlap with true signal components, making it difficult to distinguish real features from artifacts. For multi-component signals, the time-frequency plane can become cluttered to the point of being unreadable without additional processing.

The WVD can also take negative values, which means it cannot be interpreted as a true energy density (a probability-like distribution). Negative regions often coincide with cross-term locations, further complicating interpretation.

Computational complexity

Computing the WVD requires evaluating the instantaneous autocorrelation at each time point and then taking its Fourier transform. For a signal of length $N$, that is one length-$N$ FFT per time sample, so the full distribution costs $O(N^2 \log N)$ operations and $O(N^2)$ storage. A spectrogram with window length $L$ and a hop proportional to $L$ costs only about $O(N \log L)$. For long signals or high sampling rates, the difference can be prohibitive, especially in real-time applications.

Variants of Wigner-Ville distribution

Several modified distributions have been developed to suppress cross-terms while retaining as much of the WVD's resolution as possible. Each variant introduces some form of smoothing, which reduces cross-terms at the cost of degraded auto-term resolution.

Pseudo Wigner-Ville distribution

The pseudo Wigner-Ville distribution (PWVD) applies a window function $h(\tau)$ to the instantaneous autocorrelation before computing the Fourier transform:

$PW_x(t,f) = \int_{-\infty}^{\infty} h(\tau)\, x\!\left(t+\frac{\tau}{2}\right) x^*\!\left(t-\frac{\tau}{2}\right) e^{-j2\pi f \tau}\, d\tau$

The window limits the extent of the lag variable $\tau$, which is equivalent to smoothing the distribution along the frequency axis. This suppresses cross-terms that oscillate in the frequency direction, that is, those between components separated in time. Common window choices include Hamming, Hann, and Gaussian windows. A shorter window suppresses more of these cross-terms but reduces frequency resolution.
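
The effect of the lag window can be sketched numerically. The example below is a minimal illustration, assuming NumPy; the function name pseudo_wvd, the burst positions, and the window lengths are hypothetical choices. It builds two short bursts at the same frequency but different times, then compares a rectangular lag window (which reproduces the plain WVD) against a short Hann window.

```python
import numpy as np

def pseudo_wvd(x, h):
    """Pseudo-WVD with an odd-length lag window h (center sample = lag 0).

    Returns W[k, n]; bin k corresponds to k / (2 * N) cycles per sample.
    """
    x = np.asarray(x, dtype=complex)
    h = np.asarray(h, dtype=float)
    N = x.size
    half = (h.size - 1) // 2
    W = np.zeros((N, N))
    for n in range(N):
        mmax = min(n, N - 1 - n, half)            # stay inside signal and window
        m = np.arange(-mmax, mmax + 1)
        r = np.zeros(N, dtype=complex)
        r[m % N] = h[m + half] * x[n + m] * np.conj(x[n - m])
        W[:, n] = np.fft.fft(r).real
    return W

N = 128
idx = np.arange(N)

def burst(center, width=4.0):
    """Gaussian envelope centered on sample `center`."""
    return np.exp(-0.5 * ((idx - center) / width) ** 2)

# Two bursts at the same frequency (0.125 cycles/sample), centered at n = 32 and n = 96
x = (burst(32) + burst(96)) * np.exp(2j * np.pi * 0.125 * idx)

W_plain = pseudo_wvd(x, np.ones(127))      # rectangular window: plain WVD
W_short = pseudo_wvd(x, np.hanning(33))    # Hann lag window covering |m| <= 16

cross_plain = np.abs(W_plain[:, 64]).max() # cross-term midway in time (n = 64)
cross_short = np.abs(W_short[:, 64]).max()
auto_plain = W_plain[:, 32].max()          # auto-term at the first burst
auto_short = W_short[:, 32].max()
```

With the rectangular window, the cross-term at $n = 64$ is larger than either auto-term. The short Hann window removes it almost entirely, because the cross-term lives at lags near $\pm 32$ samples, while the auto-term peak drops only slightly.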

Smoothed pseudo Wigner-Ville distribution

The smoothed pseudo Wigner-Ville distribution (SPWVD) adds a second smoothing step: after applying the lag window $h(\tau)$, it also convolves the result with a time-domain smoothing window $g(t)$. Time smoothing targets cross-terms that oscillate along the time axis, which arise between components separated in frequency. Together the two windows provide independent control over smoothing in the time and frequency directions.

The SPWVD offers more flexibility than the PWVD, since you can tune the time and frequency smoothing separately to match the signal's characteristics. The trade-off remains the same: more smoothing means fewer cross-terms but blurrier auto-terms.

Cone-shaped kernel

The cone-shaped kernel distribution (Zhao-Atlas-Marks) uses a kernel whose support in the time-lag ($t$ - $\tau$) plane is shaped like a cone: it is nonzero only where $|t|$ is at most proportional to $|\tau|$. Viewed in the ambiguity domain (the $\tau$ - $\nu$ plane), this acts as a low-pass weighting that preserves the region near the origin, where auto-terms are concentrated, and attenuates outer regions where cross-terms tend to appear.

The cone's shape and parameters can be adjusted to optimize cross-term suppression for specific signal types while preserving auto-term sharpness.

Choi-Williams distribution

The Choi-Williams distribution (CWD) uses an exponential kernel:

$\Phi(\tau, \nu) = \exp\!\left(-\frac{(\tau \nu)^2}{\sigma}\right)$

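where $\sigma$ is a parameter controlling the kernel's spread. A smaller $\sigma$ suppresses more cross-terms but also degrades resolution, while as $\sigma \to \infty$ the kernel tends to 1 and the CWD reduces to the WVD. Because the kernel equals 1 along both axes of the ambiguity plane, the CWD still satisfies the marginal properties. The CWD is a popular choice because it provides a good balance between cross-term reduction and resolution, and the single parameter $\sigma$ makes it straightforward to tune.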
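
The role of $\sigma$ is easy to see by evaluating the kernel on a small grid. This is a minimal sketch, assuming NumPy; the function name choi_williams_kernel, the grid ranges, and the two $\sigma$ values are illustrative choices rather than recommended settings.

```python
import numpy as np

def choi_williams_kernel(tau, nu, sigma):
    """Exponential kernel exp(-(tau * nu)^2 / sigma) on the ambiguity plane."""
    return np.exp(-(tau * nu) ** 2 / sigma)

# Small grid of lags (rows) and frequency shifts (columns); index 2 is zero on both axes
tau = np.linspace(-1.0, 1.0, 5)[:, None]
nu = np.linspace(-4.0, 4.0, 5)[None, :]

wide = choi_williams_kernel(tau, nu, sigma=10.0)    # mild smoothing, closer to the WVD
narrow = choi_williams_kernel(tau, nu, sigma=0.1)   # strong cross-term suppression
```

Both kernels equal 1 along the $\tau = 0$ and $\nu = 0$ axes. Away from the axes, the small-$\sigma$ kernel falls to nearly zero, which is where the stronger cross-term suppression and the loss of resolution both come from.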

Applications of Wigner-Ville distribution

Speech signal analysis

The WVD is used to study the time-varying spectral content of speech, including formant trajectories, pitch contours, and transitions between phonemes. Its high resolution helps capture the rapid spectral changes that occur during consonant-vowel transitions, which are difficult to resolve with the spectrogram alone.

Radar signal processing

Radar systems frequently use chirp waveforms and encounter frequency-hopping targets. The WVD's ability to perfectly represent linear chirps makes it valuable for pulse compression analysis, target classification based on micro-Doppler signatures, and detection of weak returns in noise and clutter. In practice, smoothed variants (SPWVD, CWD) are often preferred to handle multi-component radar scenes.

Sonar signal processing

In underwater acoustics, the WVD helps analyze marine mammal vocalizations, classify ship-radiated noise, and characterize seismic signals. These signals are typically non-stationary and broadband, making the WVD's joint time-frequency representation well-suited for source detection, localization, and environmental monitoring.

Biomedical signal analysis

The WVD is applied to physiological signals such as EEG, ECG, and EMG to identify transient events and time-varying spectral patterns. Examples include detecting epileptic seizures in EEG (which produce characteristic time-frequency signatures), analyzing heart rate variability in ECG, and studying muscle activation patterns in EMG. The WVD's ability to capture brief, non-stationary events is particularly valuable in clinical diagnostics.

Comparison with other time-frequency distributions

Spectrogram vs Wigner-Ville distribution

Feature	Spectrogram (STFT)	WVD
Resolution	Limited by window size (time-frequency trade-off)	High in both time and frequency
Cross-terms	Confined to regions where components overlap; negligible for well-separated components	Present between every pair of components
Interpretability	Straightforward (always non-negative)	Can be negative; cross-terms complicate reading
Computational cost	About $O(N \log L)$ for window length $L$	$O(N^2 \log N)$

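The spectrogram is the squared magnitude of the STFT, so it's always non-negative. It is still a quadratic distribution, but its interference terms are confined to regions where components overlap in the time-frequency plane, so well-separated components show no visible cross-terms. The price is resolution: the spectrogram is actually a smoothed version of the WVD, where the smoothing kernel is the WVD of the analysis window.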

Wavelet transform vs Wigner-Ville distribution

The continuous wavelet transform (CWT) provides multi-resolution analysis: better time resolution at high frequencies and better frequency resolution at low frequencies. This is well-matched to many natural signals. The CWT's scalogram (squared magnitude of the CWT) is, like the spectrogram, largely free of cross-terms when components are well separated.

Compared to the WVD, the wavelet transform trades peak resolution for interpretability and robustness. The WVD can achieve sharper localization for single-component signals, but the wavelet transform handles multi-component signals more gracefully without cross-term contamination.

Cohen's class of distributions

The WVD is the "prototype" member of Cohen's class. Every distribution in Cohen's class can be written as a 2D filtered version of the WVD:

$C_x(t,f) = \int\!\!\int W_x(t',f')\, \Pi(t-t', f-f')\, dt'\, df'$

where $\Pi$ is the kernel function. Different kernels yield different distributions:

WVD: $\Pi = \delta(t)\delta(f)$ (no smoothing, maximum resolution, maximum cross-terms)
Spectrogram: Kernel determined by the analysis window
Choi-Williams: Exponential kernel
Born-Jordan: Sinc-type kernel
Zhao-Atlas-Marks (cone kernel): Cone-shaped kernel

The choice of distribution depends on the signal characteristics, the acceptable level of cross-terms, and computational constraints.

Implementation of Wigner-Ville distribution

Discrete Wigner-Ville distribution

For discrete-time signals $x[n]$, the discrete WVD is defined as:

$W_x[n,k] = 2\sum_{m=-M}^{M} x[n+m]\, x^*[n-m]\, e^{-j\frac{4\pi}{N}mk}$

where $n$ is the discrete time index, $k$ is the discrete frequency index, $m$ is the discrete lag with maximum value $M$, and $N$ is the number of frequency bins (typically chosen as a power of 2 for FFT efficiency).

Two practical issues arise in the discrete case:

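Aliasing: Because the product $x[n+m]\, x^*[n-m]$ steps through the signal two samples at a time, the discrete WVD repeats in frequency with half the usual period ($f_s/2$ instead of $f_s$, where $f_s$ is the sampling rate). A real signal sampled at the Nyquist rate therefore aliases. The standard remedies are to oversample the signal by a factor of 2 or to use the analytic signal, whose spectrum occupies only the positive half of the band.
Periodicity: The discrete WVD is periodic in frequency, and implementations that index the signal circularly also wrap around in time, which can introduce artifacts at the boundaries.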

Fast algorithms for computation

Evaluating the discrete WVD directly from the definition costs $O(N^3)$ operations for $N$ time samples, $N$ frequency bins, and up to $N$ lags per bin. Several strategies reduce this:

FFT-based computation: At each time index $n$, form the instantaneous autocorrelation vector, then apply an FFT to get the frequency slice. This is the standard approach and brings the per-time-point cost to $O(N \log N)$, with total cost $O(N^2 \log N)$ over all time points.
Recursive algorithms: Exploit the overlap between consecutive autocorrelation vectors to update the WVD incrementally, reducing redundant computation.
Decimation and pruning: For signals where only a portion of the time-frequency plane is of interest, compute only the relevant slices.
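
The FFT-based approach can be sketched as follows. This is a minimal illustration, assuming NumPy and an analytic (complex) input; the function name wvd_fft and the chirp parameters are hypothetical. It uses a plain length-$N$ FFT over the lag index, so bin $k$ corresponds to $k/(2N)$ cycles per sample and the leading factor of 2 from the definition above is omitted.

```python
import numpy as np

def wvd_fft(x):
    """FFT-based discrete WVD of a complex signal; returns W[k, n].

    One length-N FFT per time index, so the total cost is O(N^2 log N).
    """
    x = np.asarray(x, dtype=complex)
    N = x.size
    W = np.zeros((N, N))
    for n in range(N):
        mmax = min(n, N - 1 - n)                  # largest lag inside the record
        m = np.arange(-mmax, mmax + 1)
        r = np.zeros(N, dtype=complex)
        r[m % N] = x[n + m] * np.conj(x[n - m])   # negative lags wrap to the end
        W[:, n] = np.fft.fft(r).real              # conjugate-symmetric r -> real FFT
    return W

# Linear chirp sweeping from 0.05 to 0.45 cycles/sample (illustrative values)
N = 128
n = np.arange(N)
f0, alpha = 0.05, 0.4 / N
x = np.exp(2j * np.pi * (f0 * n + 0.5 * alpha * n ** 2))

W = wvd_fft(x)
freqs = np.arange(N) / (2.0 * N)          # frequency of each bin, cycles/sample
ridge = freqs[np.argmax(W, axis=0)]       # peak frequency at each time index
inst_freq = f0 + alpha * n                # true instantaneous frequency
bin_width = 1.0 / (2.0 * N)
```

Away from the record edges, where only a few lags are available, the ridge follows the instantaneous frequency to within half a frequency bin. Summing each column over $k$ and dividing by $N$ also recovers $|x[n]|^2$, the discrete counterpart of the time marginal.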

Software tools and libraries

Several tools support WVD computation:

MATLAB: The wvd() function in the Signal Processing Toolbox computes the WVD directly. The Time-Frequency Toolbox (TFTB) provides tfrwv() along with implementations of the PWVD, SPWVD, and other Cohen's class distributions.
Python: The tftb package (a Python port of the MATLAB TFTB) provides WVD and related functions. SciPy does not include a dedicated WVD function, but you can implement one using scipy.fft. Libraries such as ssqueezepy focus on related time-frequency tools (synchrosqueezed CWT and STFT), so check the documentation before relying on them for the WVD itself.
Specialized packages: Libraries like libtfr (C/Python) provide optimized time-frequency routines, such as multitaper and reassigned spectrograms, for large-scale or real-time applications.