Fundamentals of light fields
A light field is a complete record of the radiance (light intensity and direction) for every ray traveling through a scene. Unlike a traditional photograph, which collapses all angular information onto a flat sensor, a light field preserves how light arrives at each point, not just how much light arrives. This makes post-capture refocusing, novel view synthesis, and accurate depth estimation possible from a single capture.
Light field representation
A light field describes radiance along rays as a high-dimensional function that encodes both spatial position and angular direction. In practice, it's most commonly represented as a 4D function L(u, v, s, t), parameterized by the intersection of rays with two parallel planes. You can think of each ray as being indexed by where it crosses the first plane, at (u, v), and where it crosses the second plane, at (s, t).
Because the light field stores intensity and direction for each ray, you can computationally reconstruct different viewpoints and focal planes from a single capture session.
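The two-plane representation maps naturally onto a 4D array. A minimal sketch, using a toy synthetic light field (all dimensions and the shifting-square pattern are illustrative, not from any real capture):

```python
import numpy as np

# Hypothetical two-plane light field: 8x8 angular samples (u, v),
# each a 64x64 grayscale view indexed spatially by (s, t).
U, V, S, T = 8, 8, 64, 64
lf = np.zeros((U, V, S, T), dtype=np.float32)

# Fill with a toy pattern: a bright square whose position shifts
# with the angular coordinates, mimicking parallax between views.
for u in range(U):
    for v in range(V):
        view = np.zeros((S, T), dtype=np.float32)
        view[20 + u : 30 + u, 20 + v : 30 + v] = 1.0
        lf[u, v] = view

# One ray = one sample: radiance along the ray crossing the first
# plane at (u, v) = (3, 4) and the second plane at (s, t) = (25, 25).
radiance = lf[3, 4, 25, 25]
print(radiance)  # 1.0 (the point lies inside the square in this view)
```

Each fixed (u, v) slice is one sub-aperture view; each fixed (s, t) slice shows how a single scene point appears across viewing directions.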
Plenoptic function
The plenoptic function is the theoretical foundation for light fields. It's a seven-dimensional function:

P(x, y, z, θ, φ, λ, t)
Each variable captures a different aspect of light in the scene:
- (x, y, z): the 3D position of the observation point
- (θ, φ): the angular direction of the incoming ray
- λ: wavelength (color)
- t: time
In practice, nobody works with all seven dimensions at once. Light field photography simplifies this by fixing position to a plane and dropping wavelength (handling color with standard RGB channels), which brings you down to the 4D or 5D representations below.
4D vs 5D light fields
- 4D light fields parameterize rays by their intersections with two parallel planes. This works well for static scenes captured from a single region of space, and it keeps computational cost manageable.
- 5D light fields add time as a fifth dimension, enabling capture of dynamic scenes and light field video.
The trade-off is straightforward: higher dimensionality captures more information but dramatically increases storage and processing demands.
Light field capture devices
Several hardware approaches exist for recording light fields, each balancing spatial resolution, angular resolution, portability, and cost differently.
Plenoptic cameras
A plenoptic camera places a microlens array between the main lens and the image sensor. Each microlens captures a tiny image of the scene from a slightly different angle, encoding angular information directly onto the sensor.
- Single-shot capture makes them suitable for dynamic scenes
- The fundamental trade-off: spatial resolution is sacrificed for angular resolution, so the effective image resolution is significantly lower than the sensor's pixel count
- Notable examples: Lytro (consumer), Raytrix (industrial)
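The resolution trade-off above is easy to quantify. A sketch with assumed numbers (the 40-megapixel sensor and 10×10-pixel microlens patch are hypothetical, not specs of any particular camera):

```python
# Hypothetical plenoptic camera parameters.
sensor_pixels = 40_000_000        # 40 MP sensor
pixels_per_microlens = 10 * 10    # each microlens covers a 10x10 patch

# Each microlens yields ONE spatial sample; the pixels underneath it
# record angular samples instead of extra spatial detail.
spatial_resolution = sensor_pixels // pixels_per_microlens
angular_samples = pixels_per_microlens

print(spatial_resolution)  # 400000 -> effective 0.4 MP output images
print(angular_samples)     # 100 angular samples (10x10 directions)
```

The sensor's pixel budget is split between the two resolutions: a 100× gain in angular sampling costs a 100× drop in spatial resolution.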
Camera arrays
Camera arrays use multiple synchronized cameras arranged in a grid or other configuration. Each camera captures a full-resolution image from its own viewpoint.
- Larger baselines between cameras improve depth estimation accuracy
- Achieve high spatial and angular resolution simultaneously
- Require careful calibration and synchronization across all cameras
- Examples: Stanford Multi-Camera Array, Light Stage systems (used in film VFX)
Handheld light field cameras
These are compact devices designed for consumer or field use. They typically use microlens arrays or coded apertures to record angular information in a portable form factor.
- Portability comes at the cost of light field quality (fewer angular samples, smaller baseline)
- Examples include the Lytro Illum and Pelican Imaging's multi-aperture array camera
Light field rendering
Rendering transforms raw light field data into 2D images, novel views, or depth maps. This is where the post-capture flexibility of light fields becomes tangible.
Ray tracing techniques
To generate a novel view from a light field, you select and interpolate rays from the captured dataset that correspond to the desired virtual camera position and orientation.
Common approaches include:
- Splatting: projecting light field samples onto the output image plane
- Voxel-based rendering: discretizing the scene volume and accumulating ray contributions
- Surface light field rendering: mapping captured radiance onto reconstructed geometry
Higher rendering quality generally requires more computation, so real-time applications often use approximations.
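The ray selection step above can be sketched as a nearest-neighbor lookup in the two-plane parameterization. The array layout and function name are illustrative; real renderers interpolate between several rays rather than snapping to one:

```python
import numpy as np

def nearest_ray(lf, u, v, s, t):
    """Return the captured radiance sample closest to the requested
    continuous ray coordinates (u, v, s, t) by rounding each
    coordinate to the nearest recorded sample."""
    U, V, S, T = lf.shape
    ui = int(np.clip(round(u), 0, U - 1))
    vi = int(np.clip(round(v), 0, V - 1))
    si = int(np.clip(round(s), 0, S - 1))
    ti = int(np.clip(round(t), 0, T - 1))
    return lf[ui, vi, si, ti]

# Toy 4D light field filled with distinct values so lookups are checkable.
lf = np.arange(2 * 2 * 4 * 4, dtype=np.float32).reshape(2, 2, 4, 4)
print(nearest_ray(lf, 0.9, 0.2, 1.6, 2.4))  # rounds to lf[1, 0, 2, 2] -> 42.0
```

A virtual camera generates one such (u, v, s, t) query per output pixel; quality improves as the lookup is replaced by quadrilinear interpolation over the 4D neighborhood.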
Depth estimation from light fields
Light fields are especially powerful for depth estimation because they contain angular parallax information. Two key techniques:
- Epipolar Plane Image (EPI) analysis: Slicing the 4D light field along specific dimensions produces 2D images where depth appears as the slope of lines. Steeper slopes correspond to closer objects.
- Multi-view stereo: Treating each angular sample as a separate camera view and applying stereo matching algorithms.
Light field depth estimation generally outperforms traditional two-view stereo, though occlusions and low-texture regions remain challenging.
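The slope-to-depth relationship in EPI analysis can be demonstrated on a synthetic epipolar plane image. This toy example (all sizes and the baked-in disparity are made up) builds an EPI for a single point and recovers its disparity from the line slope:

```python
import numpy as np

# Build a synthetic EPI: a bright point whose spatial position s
# shifts linearly with the angular index u. The slope of the
# resulting line is the point's disparity.
n_views, width = 9, 64
true_disparity = 2.0  # pixels of shift per angular step (closer = steeper)
epi = np.zeros((n_views, width), dtype=np.float32)
for u in range(n_views):
    epi[u, int(10 + true_disparity * u)] = 1.0

# Recover the slope by fitting a line to the bright pixel's position
# in each row; disparity is inversely proportional to depth.
positions = epi.argmax(axis=1)
slope, intercept = np.polyfit(np.arange(n_views), positions, 1)
print(round(slope, 2))  # 2.0, matching the disparity we baked in
```

Real EPI methods estimate slopes densely (e.g. with structure tensors) rather than tracking a single maximum, but the geometry is the same: one scene point, one line, slope proportional to disparity.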
Synthetic aperture photography
By computationally combining multiple angular views from the light field, you can simulate apertures of different sizes after capture.
- Summing all angular views simulates a very large aperture, producing shallow depth of field and strong bokeh
- Using fewer views simulates a smaller aperture with deeper focus
- This enables post-capture refocusing and depth-of-field control that's impossible with a single conventional photograph
- Artifacts can appear in low-light conditions due to noise amplification during view combination
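The aperture-size effect above can be shown in one dimension. In this toy sketch (view count and widths are illustrative), an out-of-focus point shifts by one pixel per view, so averaging more views smears it over more pixels:

```python
import numpy as np

# Sub-aperture views of an off-focus point: its position shifts by
# 1 pixel per view (parallax), so averaging without refocusing blurs it.
n_views, width = 9, 32
views = np.zeros((n_views, width), dtype=np.float32)
for u in range(n_views):
    views[u, 10 + u] = 1.0

small_aperture = views[4:5].mean(axis=0)   # 1 view  ~ tiny aperture
large_aperture = views.mean(axis=0)        # 9 views ~ wide aperture

# The point stays sharp with a small aperture but smears across
# 9 pixels with a large one: shallow depth of field, emulated in software.
print((small_aperture > 0).sum())  # 1
print((large_aperture > 0).sum())  # 9
```

Objects on the focal plane have zero parallax across views, so they stay sharp no matter how many views are summed; only off-plane content blurs.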

Applications of light fields
Refocusing after capture
Post-capture refocusing works by computationally shifting and summing the sub-aperture images in the light field. Shifting by different amounts brings different depth planes into focus.
- You can also create all-in-focus images by combining the sharpest regions from multiple focal planes
- Practical uses include portrait photography (choosing the focus point later) and macro imaging (where depth of field is extremely shallow)
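The shift-and-sum idea can be sketched in one dimension. This is a simplified illustration (the `refocus` helper and the integer-pixel shifts are assumptions; real pipelines use sub-pixel shifts in 2D):

```python
import numpy as np

def refocus(views, disparity):
    """Shift each sub-aperture view by (view index * disparity) and
    average: the depth plane with that disparity comes into focus."""
    out = np.zeros_like(views[0])
    for u, view in enumerate(views):
        out += np.roll(view, -int(round(u * disparity)))
    return out / len(views)

# Views of a point at disparity 2: its position shifts 2 px per view.
n_views, width = 5, 32
views = [np.zeros(width, dtype=np.float32) for _ in range(n_views)]
for u in range(n_views):
    views[u][8 + 2 * u] = 1.0

focused = refocus(views, 2.0)    # matching shift: the point realigns
defocused = refocus(views, 0.0)  # wrong plane: the point stays smeared
print(focused.max())    # 1.0 (all views pile onto one pixel)
print(defocused.max())  # 0.2 (energy spread over 5 pixels)
```

Sweeping the disparity parameter produces a focal stack, from which an all-in-focus composite can be assembled by picking the sharpest response per pixel.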
View synthesis
View synthesis generates viewpoints that weren't directly captured. The light field is interpolated between existing angular samples to produce new perspectives.
- Enables parallax effects, 3D viewing experiences, and free-viewpoint video
- Critical for virtual reality and telepresence applications
- Quality degrades near occlusion boundaries where interpolation must fill in previously hidden regions
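The simplest form of this interpolation blends the four nearest captured views bilinearly in the angular coordinates. A minimal sketch (array layout and the constant toy views are illustrative):

```python
import numpy as np

def interpolate_view(lf, u, v):
    """Synthesize the view at fractional angular position (u, v) by
    bilinearly blending the four nearest captured views."""
    u0, v0 = int(np.floor(u)), int(np.floor(v))
    fu, fv = u - u0, v - v0
    return ((1 - fu) * (1 - fv) * lf[u0, v0]
            + fu * (1 - fv) * lf[u0 + 1, v0]
            + (1 - fu) * fv * lf[u0, v0 + 1]
            + fu * fv * lf[u0 + 1, v0 + 1])

# Toy 2x2 angular grid of constant 4x4 views with values 0, 1, 2, 3.
lf = np.array([[np.full((4, 4), u * 2 + v) for v in range(2)]
               for u in range(2)], dtype=np.float32)
view = interpolate_view(lf, 0.5, 0.5)
print(view[0, 0])  # 1.5, the average of 0, 1, 2, 3
```

Blending views directly like this causes ghosting for content off the focal plane, which is why better methods warp views using an estimated depth map before blending.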
Depth estimation
Depth maps extracted from light fields support 3D reconstruction, object segmentation, and scene understanding. The angular redundancy in light field data provides more robust depth estimates than a single stereo pair, particularly in scenes with reflective surfaces or fine geometric detail.
Handling scenes with many overlapping depth layers (e.g., foliage, fences) remains an active research challenge.
Computational light field photography
Computational techniques reduce the hardware burden of light field capture and improve the quality of reconstructed data.
Compressive light field sensing
Full light field capture requires enormous amounts of data, but light fields tend to be sparse in certain transform domains. Compressive sensing exploits this sparsity:
- A coded aperture or optical mask modulates the incoming light during capture
- The sensor records a compressed measurement (fewer samples than a full light field)
- A reconstruction algorithm recovers the full light field from these compressed measurements
This approach can enable light field capture with a conventional camera sensor, though reconstruction is computationally expensive.
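The capture model in the steps above is a linear measurement y = Φx, where the mask determines Φ and a sparse-recovery algorithm (not shown) solves for x. A forward-model sketch with illustrative dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Flattened light field with only a few nonzero coefficients;
# this sparsity is what makes recovery from few measurements possible.
n = 256                      # full light field dimension (illustrative)
m = 64                       # number of coded measurements (m << n)
x = np.zeros(n)
x[rng.choice(n, size=5, replace=False)] = rng.normal(size=5)

# The coded aperture / optical mask acts as a measurement matrix:
# each sensor reading is one coded linear combination of many rays.
phi = rng.normal(size=(m, n)) / np.sqrt(m)
y = phi @ x                  # compressed capture: 64 numbers, not 256

print(y.shape)  # (64,) -- a reconstruction algorithm would recover x
```

Recovery then amounts to finding the sparsest x consistent with y, e.g. via basis pursuit or greedy methods, which is where the computational expense noted above comes from.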
Light field reconstruction
When only sparse or noisy light field samples are available, reconstruction algorithms fill in the missing data. Techniques include:
- Sparse coding / dictionary learning: representing light field patches as combinations of learned basis elements
- Deep learning: training neural networks to predict dense light fields from sparse inputs
- Optimization-based methods: enforcing consistency constraints across angular views
These methods outperform simple interpolation by exploiting the structural regularity of light fields.
Super-resolution techniques
Light field super-resolution increases spatial resolution beyond what the capture device provides. The key insight is that multiple low-resolution angular views contain slightly shifted versions of the same scene, and these sub-pixel shifts can be combined to reconstruct higher-resolution images.
- Sub-pixel shift methods: align and merge angular views
- Learning-based methods: train networks on paired low/high-resolution light field data
- Bayesian inference: probabilistically combine evidence across views
Preserving angular consistency while boosting spatial resolution is the central difficulty.
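The sub-pixel shift idea reduces, in the ideal noise-free case, to interleaving samples from offset views. A 1D toy sketch (the half-pixel offsets are assumed known; estimating them is the hard part in practice):

```python
import numpy as np

# A "high-resolution" 1D signal, and two low-resolution views of it
# offset by half a (low-res) pixel relative to each other.
hi = np.arange(16, dtype=np.float32)
view_a = hi[0::2]  # samples at even positions: 0, 2, 4, ...
view_b = hi[1::2]  # samples at odd positions:  1, 3, 5, ...

# Because the shift between views is known, interleaving their samples
# reconstructs the full-resolution signal exactly in this toy case.
recon = np.empty_like(hi)
recon[0::2] = view_a
recon[1::2] = view_b
print(np.array_equal(recon, hi))  # True
```

Real light field super-resolution faces fractional, depth-dependent shifts plus noise and blur, so it replaces this exact interleave with registration followed by regularized fusion.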
Challenges in light field imaging
Data storage requirements
A single light field capture can be orders of magnitude larger than a conventional photograph. A 4D light field with, say, 9×9 angular views at full sensor resolution generates 81 times the data of a single image.
- Standard image and video codecs (JPEG, H.264) aren't designed for angular dimensions
- Specialized light field compression algorithms must balance compression ratio against preserving the angular information needed for refocusing and view synthesis
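The scale of the problem is clear from simple arithmetic. Assuming a hypothetical capture of 9×9 angular views at 8 megapixels each (both numbers illustrative):

```python
# Hypothetical capture: 9x9 angular views at 8-megapixel spatial
# resolution, 3 color channels, 1 byte per channel.
views = 9 * 9
pixels_per_view = 8_000_000
bytes_per_pixel = 3  # 8-bit RGB

single_image = pixels_per_view * bytes_per_pixel
light_field = views * single_image

print(single_image / 1e6)          # 24.0 MB for one conventional image
print(light_field / 1e9)           # ~1.94 GB uncompressed per capture
print(light_field // single_image) # 81x the data, as expected
```

At those rates even short light field video sequences reach terabyte scale, which is what motivates the specialized compression work above.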
Processing complexity
Light field algorithms operate on high-dimensional data, making them computationally demanding. Real-time rendering and analysis often require GPU acceleration or specialized hardware. There's a persistent trade-off between processing speed and output quality.

Hardware limitations
- Plenoptic cameras sacrifice spatial resolution for angular resolution; you can't have both without larger sensors
- Miniaturizing light field optics for mobile devices remains difficult
- Dynamic range and low-light performance lag behind conventional cameras
- High frame rate light field video capture multiplies already large data volumes
- Specialized optics and sensors increase manufacturing costs
Light field displays
Light field displays aim to present 3D imagery without requiring the viewer to wear glasses, reproducing the natural depth cues (parallax, accommodation) that flat screens cannot.
Integral imaging displays
These displays use arrays of small lenses or pinholes to project different views in different directions, creating an autostereoscopic 3D image visible from multiple viewpoints. They provide both horizontal and vertical parallax and support natural accommodation cues. Resolution and viewing angle remain limited by the lens array pitch. The Nintendo 3DS used a simpler parallax barrier variant of this concept; Leia Inc. produces more advanced versions.
Multi-layer displays
Multi-layer displays stack several LCD or OLED panels and control the light emission from each layer to approximate a light field. By carefully computing the pattern on each layer (a factorization problem), they can produce 3D images with wide viewing angles. Challenges include precise layer alignment and managing light interactions between layers. MIT Media Lab's tensor display is a well-known research prototype.
Holographic displays
Holographic displays reconstruct actual light wavefronts, producing true 3D images with full parallax and correct accommodation. They require spatial light modulators with very high bandwidth and complex rendering pipelines. Achieving large display sizes and full-color reproduction at reasonable cost remains an open problem. Looking Glass Factory produces commercial near-holographic displays, though true holographic displays are still largely in the research stage.
Light fields in computer vision
Light fields give computer vision algorithms access to angular information that single images lack, enabling stronger performance on several core tasks.
Scene understanding
Light field data improves semantic scene analysis by providing built-in depth cues and multi-view consistency. Angular reflectance patterns help estimate material properties (distinguishing matte from specular surfaces, for example). Light fields also support occlusion reasoning and transparency detection, since you can observe how occluding edges shift across angular views.
3D reconstruction
Angular parallax in light fields enables dense depth estimation from a single capture, without requiring the camera to move. This produces detailed 3D reconstructions that improve on traditional multi-view stereo in low-texture or reflective regions. Applications include VR content creation and 3D modeling, though scaling to large scenes and achieving real-time performance are ongoing challenges.
Object recognition
Light field features combine spatial appearance with depth and angular information, making recognition more robust to partial occlusions. Multi-view analysis lets algorithms "see around" foreground objects to some extent. Fine-grained classification benefits from subtle angular variations in surface appearance. Developing efficient recognition architectures that handle the high dimensionality of light field data is an active research area.
Future directions
Machine learning for light fields
Deep learning is being applied across the entire light field pipeline: capture optimization, compression, rendering, super-resolution, and scene understanding. End-to-end learned systems can jointly optimize multiple stages. The main challenge is designing network architectures that efficiently process high-dimensional light field data without excessive memory and compute requirements.
Light field video
Extending light fields to the temporal domain enables 4D motion analysis and immersive telepresence. Light field video demands efficient compression and streaming solutions, since data rates scale with both angular and temporal resolution. Achieving high frame rates while maintaining angular sampling quality is a significant engineering challenge.
Mobile light field applications
Integrating light field capture into smartphones could bring computational refocusing and improved depth sensing to consumer devices. Dual and multi-camera phone systems already capture limited angular information. True mobile light field cameras require further miniaturization of optics and more energy-efficient processing. Augmented reality applications stand to benefit most from the improved depth sensing that mobile light fields would provide.