11.2 Statistical measures of large-scale structure

Written by the Fiveable Content Team • Last updated August 2025

Large-scale structure in the universe isn't random. Galaxies cluster together, forming patterns we can measure. These patterns encode information about the universe's composition, its initial conditions, and how structure has grown over cosmic time.

To quantify these patterns, cosmologists rely on statistical tools, primarily the two-point correlation function and the matter power spectrum. These are complementary descriptions of the same underlying clustering signal, one in real space and one in Fourier space.

Statistical Measures of Large-Scale Structure

Two-point correlation function for galaxies

The two-point correlation function $\xi(r)$ measures the excess probability of finding a pair of galaxies separated by distance $r$, compared to what you'd expect if galaxies were scattered randomly. It's the most direct way to quantify clustering.

A common estimator is:

$$\xi(r) = \frac{DD(r)}{RR(r)} - 1$$

  • $DD(r)$ is the number of galaxy pairs with separation $r$ in the observed data.
  • $RR(r)$ is the expected number of pairs at that separation in a random (unclustered) catalog with the same survey geometry, rescaled to the same number of objects as the data.

When $\xi(r) > 0$, galaxies are more clustered than random at that scale. When $\xi(r) < 0$, pairs at that separation are rarer than random (anticorrelation, as happens on scales comparable to voids). At very large separations, $\xi(r) \to 0$ because the universe approaches homogeneity.

In practice, more sophisticated estimators (like the Landy-Szalay estimator, which also uses cross-counts $DR(r)$ between data and random catalogs) are preferred because they reduce bias from edge effects and survey geometry.

Estimating the correlation function involves:

  1. Constructing a random catalog that matches the survey's angular footprint and radial selection function.
  2. Counting galaxy-galaxy pairs $DD(r)$, random-random pairs $RR(r)$, and data-random pairs $DR(r)$ at each separation bin.
  3. Normalizing the pair counts and applying the chosen estimator.
  4. Repeating across a large enough galaxy sample to beat down shot noise. Surveys like SDSS, DES, and the upcoming Euclid and LSST/Rubin Observatory provide the millions of galaxies needed for reliable statistics.
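
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a survey pipeline: uniform random points in a toy box stand in for both the data and the random catalog, pairs are counted by brute force (real analyses use tree or grid codes), and the names `pair_counts` and `landy_szalay` are invented for this example. The Landy-Szalay form used is $\xi = (DD - 2DR + RR)/RR$, with each count normalized by its total number of pairs.

```python
import numpy as np

def pair_counts(a, b, edges, auto=False):
    """Histogram of pair separations between point sets a and b (brute force)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    if auto:
        # Auto-counts: keep each unordered pair once and drop self-pairs
        d = d[np.triu_indices(len(a), k=1)]
    counts, _ = np.histogram(d.ravel(), bins=edges)
    return counts.astype(float)

def landy_szalay(data, rand, edges):
    """Landy-Szalay estimate of xi(r) from normalized DD, DR, RR counts."""
    nd, nr = len(data), len(rand)
    dd = pair_counts(data, data, edges, auto=True) / (nd * (nd - 1) / 2)
    rr = pair_counts(rand, rand, edges, auto=True) / (nr * (nr - 1) / 2)
    dr = pair_counts(data, rand, edges) / (nd * nr)
    return (dd - 2 * dr + rr) / rr

rng = np.random.default_rng(0)
box = 100.0                                    # toy box side (assumed units)
data = rng.uniform(0, box, size=(400, 3))      # unclustered stand-in for galaxies
rand = rng.uniform(0, box, size=(800, 3))      # random catalog, same geometry
edges = np.linspace(5.0, 50.0, 10)             # separation bin edges
xi = landy_szalay(data, rand, edges)
```

Because the toy "data" here are themselves unclustered, `xi` should scatter around zero in every bin; a clustered catalog would give positive values at small separations.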

Power spectrum and correlation function

The power spectrum $P(k)$ is the Fourier-space counterpart of the correlation function. While $\xi(r)$ tells you about clustering at a given physical separation, $P(k)$ tells you the amplitude of density fluctuations at a given spatial frequency (wavenumber $k$, with units of inverse length).

The two are related by a Fourier transform:

$$P(k) = \int \xi(r) \, e^{-i\vec{k} \cdot \vec{r}} \, d^3r$$

They contain the same information, but each has practical advantages. The power spectrum is often preferred for theoretical work because different Fourier modes evolve independently in the linear regime, making predictions cleaner. The correlation function is sometimes easier to estimate directly from survey data and is more intuitive for identifying features at specific physical scales (like the BAO peak at $\sim 100 \, h^{-1}$ Mpc).
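
For an isotropic $\xi(r)$, the angular part of the transform can be done analytically, leaving $P(k) = 4\pi \int r^2 \, \xi(r) \, \frac{\sin(kr)}{kr} \, dr$. The sketch below checks this numerically. It assumes a Gaussian toy $\xi(r)$, chosen only because its transform is known in closed form (real galaxy correlation functions are not Gaussian), and the function name `power_from_xi` is made up for the example.

```python
import numpy as np

def power_from_xi(k, r, xi):
    """Isotropic transform: P(k) = 4*pi * int r^2 xi(r) sin(kr)/(kr) dr."""
    kr = np.outer(k, r)
    # np.sinc(x) is sin(pi x)/(pi x), so sin(kr)/(kr) = np.sinc(kr / pi)
    integrand = 4.0 * np.pi * r**2 * xi * np.sinc(kr / np.pi)
    # Trapezoid rule along the r axis
    return np.sum(0.5 * (integrand[:, 1:] + integrand[:, :-1]) * np.diff(r), axis=1)

s = 5.0                                   # toy correlation length (assumed)
r = np.linspace(0.0, 60.0, 6001)
xi_toy = np.exp(-r**2 / (2 * s**2))       # Gaussian toy correlation function
k = np.array([0.01, 0.1, 0.3])

p_numeric = power_from_xi(k, r, xi_toy)
# Closed-form 3D Fourier transform of the Gaussian, for comparison
p_exact = (2 * np.pi * s**2) ** 1.5 * np.exp(-(k * s) ** 2 / 2)
```

The two arrays should agree closely, which is a convenient sanity check before applying the same integral to a measured $\xi(r)$.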

The shape of $P(k)$ depends on the underlying cosmological model, the matter content of the universe, and the physics of structure formation. Its amplitude reflects the overall level of clustering.

Shape and amplitude of the galaxy power spectrum

The power spectrum's shape carries distinct physical information at different scales:

  • Large scales (small $k$): The spectrum turns over near the scale corresponding to the particle horizon at matter-radiation equality. On scales larger than this, the primordial spectrum is roughly preserved. The turnover scale depends on $\Omega_m h^2$, so measuring it constrains the total matter density.
  • Intermediate scales: The slope encodes the relative amounts of baryonic and dark matter. Baryon acoustic oscillations (BAO) imprint a series of wiggles on $P(k)$, set by the sound horizon at recombination ($\sim 150$ Mpc comoving, which is $\sim 100 \, h^{-1}$ Mpc for $h \approx 0.7$). These wiggles act as a standard ruler for measuring cosmic distances.
  • Small scales (large $k$): Non-linear gravitational collapse, galaxy mergers, and baryonic feedback processes (AGN heating, supernova-driven outflows) reshape the spectrum. Predictions here require N-body simulations or effective models rather than simple linear theory.
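
The dependence of the turnover on the matter density can be illustrated with a fitting formula. The sketch below uses the BBKS (Bardeen, Bond, Kaiser and Szalay 1986) cold dark matter transfer function as a stand-in for a full Boltzmann-code calculation. It ignores baryons, so it has no BAO wiggles, and it describes linear theory only. With $k$ in $h\,\mathrm{Mpc}^{-1}$ the shape parameter is $\Gamma \approx \Omega_m h$, which is the same statement as the turnover scaling with $\Omega_m h^2$ in $\mathrm{Mpc}^{-1}$. The function names and the two $\Gamma$ values are chosen for illustration.

```python
import numpy as np

def bbks_transfer(k, gamma):
    """BBKS CDM transfer function; k in h/Mpc, gamma ~ Omega_m * h."""
    q = k / gamma
    poly = 1 + 3.89 * q + (16.1 * q) ** 2 + (5.46 * q) ** 3 + (6.71 * q) ** 4
    return np.log(1 + 2.34 * q) / (2.34 * q) * poly ** -0.25

def power_shape(k, gamma, n_s=0.96):
    """Unnormalized linear P(k) ~ k^n_s T(k)^2 (no baryon wiggles)."""
    return k**n_s * bbks_transfer(k, gamma) ** 2

k = np.logspace(-4, 1, 2000)
# Locate the turnover (peak of P) for a low and a high matter density
k_peak_low = k[np.argmax(power_shape(k, gamma=0.15))]
k_peak_high = k[np.argmax(power_shape(k, gamma=0.30))]
```

Doubling $\Gamma$ moves the peak to a wavenumber about twice as large, since in this formula $k$ enters only through $k/\Gamma$.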

The amplitude of the power spectrum is typically parameterized by $\sigma_8$, the root-mean-square density fluctuation in spheres of radius $8 \, h^{-1}$ Mpc. This depends on both the primordial fluctuation amplitude and the matter density $\Omega_m$.
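
In terms of the power spectrum, the variance in spheres of radius $R$ is $\sigma_R^2 = \frac{1}{2\pi^2} \int k^2 P(k) \, W^2(kR) \, dk$, where $W(x) = 3(\sin x - x\cos x)/x^3$ is the Fourier transform of a spherical top-hat. The sketch below evaluates this integral and rescales a spectrum to a chosen $\sigma_8$. The spectrum `toy_shape` is a hypothetical placeholder, not a real cosmological model, and the integration limits and target value are assumptions made for the example.

```python
import numpy as np

def tophat_window(x):
    """Fourier transform of a spherical top-hat: 3 (sin x - x cos x) / x^3."""
    return 3.0 * (np.sin(x) - x * np.cos(x)) / x**3

def sigma_r(power, radius, k_min=1e-4, k_max=1e2, n=20000):
    """RMS fluctuation in spheres of the given radius for a callable P(k)."""
    lnk = np.linspace(np.log(k_min), np.log(k_max), n)
    k = np.exp(lnk)
    # Integrate in ln k, so k^2 dk becomes k^3 dlnk
    integrand = k**3 * power(k) * tophat_window(k * radius) ** 2 / (2 * np.pi**2)
    return np.sqrt(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(lnk)))

def toy_shape(k, k0=0.02):
    """Hypothetical unnormalized spectrum with a turnover near k0 (h/Mpc)."""
    return k / (1 + (k / k0) ** 2) ** 2

target = 0.8                                      # assumed sigma_8 value
amplitude = (target / sigma_r(toy_shape, 8.0)) ** 2
sigma8 = sigma_r(lambda k: amplitude * toy_shape(k), 8.0)
```

Since $\sigma_R^2$ is linear in the amplitude of $P(k)$, fixing $\sigma_8$ is equivalent to fixing the overall normalization of the spectrum.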

A critical complication is galaxy bias: galaxies don't perfectly trace the underlying matter distribution. More massive halos (and the luminous galaxies they host) are more strongly clustered than the dark matter itself. The bias factor $b$ relates the galaxy power spectrum to the matter power spectrum: $P_{\text{gal}}(k) \approx b^2 \, P_{\text{matter}}(k)$. This relationship is scale-independent only on large scales; on smaller scales, bias becomes more complex.
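
As a rough illustration of why the scale cut matters, the sketch below estimates a constant $b$ from the ratio $P_{\text{gal}}/P_{\text{matter}}$ using only modes below a chosen $k_{\max}$. Everything numerical here is invented for the example: the power-law matter spectrum, the true bias of 2, the quadratic scale-dependent term, and the cut at $k_{\max} = 0.1$.

```python
import numpy as np

def linear_bias(k, p_gal, p_matter, k_max=0.1):
    """Estimate a constant bias from the mean P_gal / P_matter ratio at k < k_max."""
    large_scale = k < k_max
    return np.sqrt(np.mean(p_gal[large_scale] / p_matter[large_scale]))

k = np.linspace(0.01, 0.5, 50)
p_matter = 1.0e4 * (k / 0.02) ** -1.5           # hypothetical matter spectrum
b_true = 2.0
# Toy scale-dependent bias that only matters at large k (an assumption)
p_gal = b_true**2 * p_matter * (1 + 0.5 * (k / 0.3) ** 2)

b_large_scales = linear_bias(k, p_gal, p_matter, k_max=0.1)
b_all_scales = linear_bias(k, p_gal, p_matter, k_max=np.inf)
```

Restricting to large scales recovers a value close to the input bias, while including the small-scale modes pulls the estimate upward in this toy setup.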

Statistical measures of galaxy clustering

Extracting cosmological information from clustering measurements requires large galaxy surveys covering significant cosmic volume. Current and upcoming surveys span a range of approaches:

  • SDSS measured over a million galaxy redshifts, providing some of the first high-precision measurements of $P(k)$ and the BAO feature.
  • DES uses photometric redshifts over a wide area, combining clustering with weak lensing.
  • Euclid and LSST/Rubin will map billions of galaxies out to higher redshifts, dramatically improving constraints.

These clustering measurements constrain cosmological parameters because the shape and amplitude of $P(k)$ depend sensitively on $\Omega_m$, $\Omega_b$, $H_0$, the spectral index $n_s$, $\sigma_8$, and the dark energy equation of state $w$. Observed clustering is compared to theoretical predictions using Bayesian inference, typically implemented with Markov Chain Monte Carlo (MCMC) sampling to explore the high-dimensional parameter space.
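
To make the inference step concrete, here is a deliberately tiny Metropolis sampler that fits a single parameter, the bias $b$ in $P_{\text{gal}} = b^2 P_{\text{matter}}$, to mock data. It is a sketch under strong assumptions: the mock spectrum, the 5% Gaussian errors, the flat prior on $b > 0$, and the step size are all invented here, and real analyses fit many parameters at once with full covariance matrices and dedicated sampling packages.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical mock data: power-law matter spectrum, true bias of 2, 5% errors
k = np.linspace(0.02, 0.2, 20)
p_matter = 1.0e4 * (k / 0.02) ** -1.5
b_true = 2.0
p_err = 0.05 * b_true**2 * p_matter
p_obs = b_true**2 * p_matter + p_err * rng.normal(size=k.size)

def log_posterior(b):
    """Gaussian log-likelihood with a flat prior on b > 0."""
    if b <= 0:
        return -np.inf
    resid = (p_obs - b**2 * p_matter) / p_err
    return -0.5 * np.sum(resid**2)

def metropolis(log_post, start, step, n_steps, rng):
    """Random-walk Metropolis sampler for a single parameter."""
    chain = np.empty(n_steps)
    current, lp_current = start, log_post(start)
    for i in range(n_steps):
        proposal = current + step * rng.normal()
        lp_proposal = log_post(proposal)
        # Accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < lp_proposal - lp_current:
            current, lp_current = proposal, lp_proposal
        chain[i] = current
    return chain

chain = metropolis(log_posterior, start=1.5, step=0.02, n_steps=5000, rng=rng)
posterior = chain[1000:]                      # discard burn-in
b_mean, b_std = posterior.mean(), posterior.std()
```

The mean and scatter of the post-burn-in samples approximate the posterior mean and uncertainty on $b$; the same accept/reject logic extends to a vector of cosmological parameters.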

Measuring the correlation function and power spectrum at different redshifts reveals how clustering evolves over cosmic time. The growth rate of structure is particularly powerful as a cosmological probe: in general relativity, it's determined by $\Omega_m(z)$, but modified gravity theories predict different growth rates. Comparing the observed evolution of clustering with these predictions provides one of the cleanest tests of gravity on cosmological scales and helps distinguish dark energy models from modifications to general relativity.
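
A widely used approximation makes this comparison explicit: the growth rate $f = d\ln D / d\ln a$ is written as $f(z) \approx \Omega_m(z)^{\gamma}$, with $\gamma \approx 0.55$ for general relativity in a $\Lambda$CDM background. The sketch below evaluates it for flat $\Lambda$CDM with radiation neglected. The value $\Omega_{m,0} = 0.3$ and the alternative index $\gamma = 0.68$ are illustrative assumptions, the latter standing in for the kind of larger index some modified-gravity models predict.

```python
import numpy as np

def omega_m_z(z, omega_m0=0.3):
    """Matter density parameter at redshift z in flat LCDM (radiation neglected)."""
    a3 = (1.0 + z) ** 3
    return omega_m0 * a3 / (omega_m0 * a3 + 1.0 - omega_m0)

def growth_rate(z, omega_m0=0.3, gamma=0.55):
    """Approximate growth rate f(z) ~ Omega_m(z)^gamma."""
    return omega_m_z(z, omega_m0) ** gamma

z = np.array([0.0, 0.5, 1.0, 2.0])
f_gr = growth_rate(z)                 # GR-like growth index
f_alt = growth_rate(z, gamma=0.68)    # illustrative alternative index
```

The two curves differ most at low redshift, where $\Omega_m(z)$ is furthest from 1, which is why growth measurements at $z \lesssim 1$ are especially useful for this test.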