Cognitive Psychology

Key Attention Theories

Why This Matters

Attention is the gateway to everything else in cognition. Without it, there's no perception, no memory encoding, no learning. These theories aren't just historical footnotes; they represent fundamentally different answers to a question that still drives research today: where and when does the brain decide what information matters?

You're being tested on your ability to distinguish between early selection and late selection models, understand how attentional resources get allocated, and explain why we sometimes miss obvious things right in front of us.

The theories here fall into distinct camps based on the mechanisms they propose. Some focus on filtering and selection (when does irrelevant info get blocked?), others on resource allocation (how much mental fuel do we have?), and still others on spatial and feature-based processing (where and what do we attend to?). Don't just memorize names and dates. Know what problem each theory solves and where it falls short.


Filter and Selection Models

These theories tackle the classic bottleneck problem: we can't process everything, so when does selection happen? The debate between early and late selection shaped decades of attention research.

Broadbent's Filter Theory

Broadbent proposed that the brain filters information based on physical features like pitch, location, or loudness before any analysis of meaning occurs. This is the purest early selection model.

  • A single-channel bottleneck allows only one stream of information through at a time; everything else is discarded entirely
  • Dichotic listening experiments provided initial support: participants could accurately report information from one ear (the attended channel) while recalling almost nothing from the other
  • The key limitation is that this model predicts you should never notice meaningful content on the unattended channel, which turns out to be wrong

Treisman's Attenuation Theory

Treisman modified Broadbent's model to fix its biggest problem. Instead of completely blocking unattended information, her filter attenuates it, turning it down like a volume dial rather than switching it off.

  • A dictionary unit with variable activation thresholds explains why certain stimuli break through. Important words (your name, "fire") have permanently low thresholds, so even a weakened signal can activate them.
  • This elegantly accounts for the cocktail party effect: you're focused on one conversation, but you still hear your name spoken across the room. Broadbent's all-or-nothing filter couldn't explain that.
  • Unattended information is still processed less than attended information, keeping this firmly in the early selection camp.
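
The difference between an all-or-nothing filter and an attenuating one can be sketched in a few lines of Python. This is a toy illustration, not a fitted model: the attenuation factor, the signal strengths, the threshold values, and the function names are all assumptions chosen only to show the logic.

```python
# Toy illustration only: every number here is made up, not an empirical estimate.

ATTENUATION = 0.3  # hypothetical gain applied to the unattended channel

# Hypothetical "dictionary unit": a lower threshold means easier activation.
THRESHOLDS = {"your_name": 0.2, "fire": 0.2, "table": 0.8, "seven": 0.8}


def broadbent_passes(word, attended):
    """All-or-nothing filter: unattended words never reach analysis of meaning."""
    return attended


def treisman_passes(word, attended, signal=1.0):
    """Attenuating filter: an unattended word is weakened, then compared
    against that word's activation threshold."""
    strength = signal if attended else signal * ATTENUATION
    return strength >= THRESHOLDS[word]


for word in ("your_name", "table"):
    print(word,
          broadbent_passes(word, attended=False),
          treisman_passes(word, attended=False))
# your_name False True
# table False False
```

In this sketch the unattended "your_name" breaks through only under attenuation, because the weakened signal still clears its low threshold, while an ordinary word like "table" stays below threshold in both models.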

Deutsch and Deutsch's Late Selection Theory

Deutsch and Deutsch took the opposite approach: all incoming information is fully processed to the level of meaning. The bottleneck doesn't occur at perception but at the point of deciding what to respond to.

  • Relevance determines awareness rather than physical characteristics; the filter moves much later in the processing stream
  • Response selection is where filtering occurs. You process everything semantically, but only the most relevant information reaches conscious awareness and triggers a response.
  • The main criticism is parsimony: if the brain processes everything fully, that seems like a massive waste of cognitive resources. Why build all that processing machinery just to throw most of the output away?

Compare: Broadbent vs. Treisman vs. Deutsch & Deutsch. All three address the bottleneck problem but place the filter at different points. Broadbent says early (physical features only). Treisman says early but leaky (attenuation with variable thresholds). Deutsch & Deutsch say late (after meaning is extracted). If a question asks about the cocktail party effect, Treisman's model is typically the strongest answer because it explains both the general filtering of irrelevant info and the occasional breakthrough of important unattended stimuli.


Resource and Capacity Models

Rather than asking where selection occurs, these theories ask how much attention we have and how it gets divided. Think of attention as fuel rather than a filter.

Kahneman's Capacity Model

Kahneman proposed that attention is a single, limited pool of mental resources that gets allocated flexibly based on task demands and arousal level.

  • A central allocation policy determines how resources are distributed. Four factors shape this policy in Kahneman's account: enduring dispositions (you automatically attend to your name), momentary intentions (you choose to focus on studying), evaluation of demands (when tasks need more capacity than is available, one gets priority), and arousal level (more arousal means more total capacity, up to a point).
  • This model explains dual-task performance well. You can do two easy things at once if their combined demands don't exceed your total capacity, but performance drops as soon as they do.
  • The weakness is that it treats all attention as one undifferentiated pool, which can't fully explain why some task pairings interfere more than others.
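
The dual-task prediction of a single-pool model reduces to simple arithmetic, sketched below. The demand values, the capacity of 1.0, and the proportional drop in performance once capacity is exceeded are illustrative assumptions, not part of Kahneman's formal account.

```python
def dual_task_performance(demand_a, demand_b, capacity=1.0):
    """Return the fraction of the required resources that a single shared
    pool can supply (1.0 means both tasks are fully resourced).

    Hypothetical rule: performance is perfect while combined demand fits
    within capacity, then falls in proportion to the shortfall.
    """
    total_demand = demand_a + demand_b
    if total_demand <= capacity:
        return 1.0
    return capacity / total_demand


easy_pair = dual_task_performance(0.3, 0.4)   # combined demand fits the pool
hard_pair = dual_task_performance(0.7, 0.7)   # combined demand exceeds the pool
print(easy_pair, hard_pair < easy_pair)
```

Note that the function only looks at how much each task demands, never at what kind of task it is. That is exactly the limitation described above: a single pool cannot predict that some pairings interfere more than others at the same total demand.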

Multiple Resource Theory (Wickens)

Wickens argued that instead of one general pool, we have separate resource pools for different processing dimensions: visual vs. auditory input, verbal vs. spatial coding, and manual vs. vocal responses.

  • Task compatibility predicts interference. Two visual-spatial tasks compete heavily for the same pool, but a visual task paired with an auditory task draws from different pools and produces less interference.
  • This explains why you can listen to a podcast while driving on a familiar road (different modalities) but struggle to read a text while driving (both compete for the visual pool).
  • Practical applications are significant. Cockpit displays, car dashboards, and surgical interfaces are designed around these principles to minimize same-pool competition.
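
One rough way to picture the multiple-pool idea is to describe each task by the pool it uses on each dimension and count the overlaps. The task codings below (including `texting`, a hypothetical task that involves reading and typing a reply) are simplifying assumptions for illustration, not Wickens's own ratings.

```python
# The three processing dimensions named above.
DIMENSIONS = ("input", "code", "response")


def shared_pools(task_a, task_b):
    """List the dimensions on which two tasks draw from the same pool.
    A value of None means the task makes no demand on that dimension."""
    return [dim for dim in DIMENSIONS
            if task_a[dim] is not None and task_a[dim] == task_b[dim]]


# Hypothetical task codings, chosen only to illustrate the idea.
driving = {"input": "visual", "code": "spatial", "response": "manual"}
podcast = {"input": "auditory", "code": "verbal", "response": None}
texting = {"input": "visual", "code": "verbal", "response": "manual"}

print(shared_pools(driving, podcast))  # []
print(shared_pools(driving, texting))  # ['input', 'response']
```

More shared pools means more predicted interference, which is the design logic behind moving some information in a cockpit or dashboard to the auditory channel.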

Load Theory of Selective Attention (Lavie)

Lavie's Load Theory offers an elegant resolution to the early vs. late selection debate by arguing that both can occur, depending on the demands of the task.

  • Under high perceptual load (the task uses up all your perceptual capacity), distractors are automatically excluded because there's no leftover capacity to process them. This looks like early selection.
  • Under low perceptual load (the task is easy), spare capacity involuntarily "spills over" to process irrelevant distractors. This looks like late selection.
  • Cognitive load works differently and in the opposite direction. High cognitive load depletes executive control resources, which actually increases distractor interference because you lose the ability to actively suppress irrelevant information.
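
The opposite effects of the two kinds of load can be made concrete with a toy function. Everything here is an assumption for illustration: the 0-to-1 load scales, the `max_suppression` parameter, and the multiplicative form are not taken from Lavie's work.

```python
def distractor_interference(perceptual_load, cognitive_load,
                            capacity=1.0, max_suppression=0.5):
    """Toy index of distractor interference (0.0 means none).

    Assumptions: loads run from 0 to 1; leftover perceptual capacity spills
    over to distractors; executive control, when fully available, can
    suppress at most `max_suppression` of what spilled over.
    """
    spare = max(0.0, capacity - perceptual_load)   # what distractors receive
    control = max(0.0, 1.0 - cognitive_load)       # executive control left over
    return spare * (1.0 - max_suppression * control)


high_perceptual = distractor_interference(1.0, 0.0)
low_perceptual = distractor_interference(0.2, 0.0)
low_perceptual_busy_mind = distractor_interference(0.2, 1.0)
print(high_perceptual < low_perceptual < low_perceptual_busy_mind)  # True
```

Raising perceptual load pushes interference toward zero, while raising cognitive load pushes it up, which matches the two directions described in the bullets above.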

Compare: Kahneman's Capacity Model vs. Multiple Resource Theory. Both treat attention as a resource, but Kahneman proposes one general pool while Wickens argues for multiple specialized pools. Multiple Resource Theory better explains why specific task pairings cause more interference than others. Load Theory adds a further layer by showing that the amount of load, not just the type, determines whether filtering is early or late.


Spatial Attention Models

These theories focus on where attention goes in space: how we select locations in our visual environment and shift focus between them.

Spotlight Theory of Attention

The simplest spatial model treats attention as a moveable beam that illuminates one region of the visual field while leaving surrounding areas in relative darkness.

  • In early versions, the spotlight had a fixed size but flexible location. Later modifications (the zoom lens model) allowed the beam to expand or contract, with a tradeoff: a wider beam means less processing intensity at any given point.
  • Stimuli falling within the spotlight are detected faster and more accurately than stimuli outside it.
  • The model is useful as a starting metaphor but oversimplifies things. It doesn't explain how attention shifts between locations or what happens at the neural level.

Posner's Orienting of Attention Theory

Posner broke spatial attention into three distinct operations: disengage from the current location, move to the new location, and engage at the new location. This gave researchers a way to study each component separately.

  • Endogenous cues (a central arrow telling you where to look) produce voluntary, slower orienting. Exogenous cues (a sudden flash in the periphery) produce reflexive, faster capture. These rely on partially different brain mechanisms.
  • Validity effects in the classic Posner cueing task demonstrate this cleanly: valid cues (correctly predicting target location) speed responses, while invalid cues slow them because you must disengage before reorienting.
  • Posner's work also identified three distinct brain networks for attention: alerting (maintaining readiness), orienting (selecting spatial locations), and executive control (resolving conflict). This framework became foundational for attention neuroscience.
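
The validity effect follows directly from the three operations if each one is assumed to take some time. The millisecond values below are hypothetical placeholders, not measured durations, and the additive structure is a simplification.

```python
# Hypothetical durations in milliseconds, for illustration only.
BASE_RT_MS = 300     # detect and respond once attention is on the target
DISENGAGE_MS = 30
MOVE_MS = 20
ENGAGE_MS = 20


def predicted_rt(cue):
    """Predicted response time for a valid or invalid cue.

    Valid cue: attention is already engaged at the target location.
    Invalid cue: attention must disengage, move, and re-engage first.
    """
    if cue == "valid":
        return BASE_RT_MS
    if cue == "invalid":
        return BASE_RT_MS + DISENGAGE_MS + MOVE_MS + ENGAGE_MS
    raise ValueError(f"unknown cue type: {cue}")


validity_effect = predicted_rt("invalid") - predicted_rt("valid")
print(validity_effect)  # 70
```

In this sketch, a patient who has trouble disengaging would be modeled by increasing `DISENGAGE_MS` alone, which slows invalid trials while leaving valid trials untouched.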

Compare: Spotlight Theory vs. Posner's Orienting Theory. Both address spatial attention, but the Spotlight model is more metaphorical while Posner specifies the cognitive operations involved and maps them onto brain systems. Posner's framework is more useful for explaining clinical findings, such as why patients with parietal lobe damage have trouble disengaging attention from one side of space.


Feature and Search Models

These theories explain how we find things, whether scanning a cluttered desk or searching for a friend in a crowd. They address how features get combined and how search is guided.

Feature Integration Theory (Treisman)

Treisman (yes, the same Treisman) proposed a two-stage process for visual processing:

  1. Pre-attentive stage: Basic features like color, orientation, size, and motion are detected automatically and in parallel across the entire visual field. This is fast and effortless.
  2. Attentive stage: Focused attention binds those individual features together into coherent objects. This is slower and serial.
  • Without focused attention, features can be incorrectly combined, producing illusory conjunctions (e.g., seeing a blue square when the display actually contained a blue circle and a red square).
  • This explains the pop-out vs. serial search distinction. A red item among green items pops out instantly (single-feature search, handled pre-attentively). But finding a red circle among red squares and green circles requires slow, item-by-item search because you need to bind color and shape together.
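
The pop-out vs. serial distinction shows up as a difference in how search time grows with the number of items. The sketch below assumes a target is always present, a serial self-terminating search that checks half the items plus one half on average, and made-up timing constants.

```python
def search_rt(set_size, search_type, base_ms=400, per_item_ms=40):
    """Predicted target-present search time in milliseconds (toy values).

    feature:     parallel, pre-attentive detection, so set size does not matter.
    conjunction: serial, self-terminating search; on average (n + 1) / 2
                 items are checked before the target is found.
    """
    if search_type == "feature":
        return base_ms
    if search_type == "conjunction":
        return base_ms + per_item_ms * (set_size + 1) / 2
    raise ValueError(f"unknown search type: {search_type}")


for n in (4, 32):
    print(n, search_rt(n, "feature"), search_rt(n, "conjunction"))
# 4 400 500.0
# 32 400 1060.0
```

A flat line across set sizes is the signature of pop-out; a line that climbs with each added item is the signature of serial search.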

Guided Search Theory (Wolfe)

Wolfe's Guided Search Theory builds on Feature Integration Theory but adds a critical role for top-down knowledge in directing the search process.

  • Both bottom-up salience (a bright flash grabs you) and top-down goals (you're looking for your red car) contribute to an activation map that prioritizes likely target locations. Attention visits high-activation areas first.
  • This explains why knowing what you're looking for dramatically speeds search. Your prior knowledge reshapes the activation map so you don't waste time on obviously wrong items.
  • Guided Search also handles search asymmetries: finding a tilted line among vertical lines is easier than finding a vertical line among tilted lines, because the tilted line generates stronger bottom-up activation.
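
The activation map can be sketched as a weighted sum of bottom-up salience and top-down feature matches. The scene, the salience values, and the equal default weights are all hypothetical; Wolfe's actual model is considerably more detailed.

```python
def activation(item, target, top_down_weight=1.0, bottom_up_weight=1.0):
    """Combine bottom-up salience with the number of target features matched."""
    matches = sum(item[feature] == value for feature, value in target.items())
    return bottom_up_weight * item["salience"] + top_down_weight * matches


def search_order(items, target, **weights):
    """Return item names in the order attention would visit them."""
    ranked = sorted(items,
                    key=lambda item: activation(item, target, **weights),
                    reverse=True)
    return [item["name"] for item in ranked]


# Hypothetical scene: salience values are made up for illustration.
scene = [
    {"name": "flashing sign", "color": "yellow", "shape": "sign", "salience": 0.9},
    {"name": "blue car", "color": "blue", "shape": "car", "salience": 0.1},
    {"name": "red truck", "color": "red", "shape": "truck", "salience": 0.3},
    {"name": "red car", "color": "red", "shape": "car", "salience": 0.2},
]
target = {"color": "red", "shape": "car"}

print(search_order(scene, target))
# ['red car', 'red truck', 'blue car', 'flashing sign']
print(search_order(scene, target, top_down_weight=0.0))
# ['flashing sign', 'red truck', 'red car', 'blue car']
```

With top-down guidance switched on, the target is visited first even though it is not the most salient item. With guidance switched off, attention goes to the flashing sign instead.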

Compare: Feature Integration Theory vs. Guided Search Theory. Both involve two stages, but Guided Search emphasizes how top-down knowledge actively steers attention during search rather than treating the pre-attentive stage as purely stimulus-driven. Feature Integration Theory better explains binding errors and illusory conjunctions; Guided Search better explains efficient real-world search behavior where you rarely check every item.


Quick Reference Table

Concept | Best Examples
Early selection | Broadbent's Filter Theory, Treisman's Attenuation Theory
Late selection | Deutsch & Deutsch's Late Selection Theory
Single resource pool | Kahneman's Capacity Model
Multiple resource pools | Multiple Resource Theory (Wickens)
Load-dependent selection | Load Theory (Lavie)
Spatial attention | Spotlight Theory, Posner's Orienting Theory
Feature binding | Feature Integration Theory (Treisman)
Visual search | Guided Search Theory (Wolfe), Feature Integration Theory
Bottleneck location debate | Broadbent vs. Treisman vs. Deutsch & Deutsch

Self-Check Questions

  1. Both Treisman's Attenuation Theory and Deutsch & Deutsch's Late Selection Theory can explain the cocktail party effect. How do their explanations differ mechanistically, and what kind of evidence would distinguish between them?

  2. You're designing a car dashboard. Which theory, Kahneman's Capacity Model or Multiple Resource Theory, provides more useful design guidance, and why?

  3. A participant in a visual search task is looking for a blue square among blue circles and red squares. According to Feature Integration Theory, will this produce pop-out or serial search? Explain the mechanism.

  4. How does Load Theory resolve the early vs. late selection debate that Broadbent's Filter Theory originally sparked? Be specific about the role of perceptual load vs. cognitive load.

  5. A driver fails to notice a pedestrian while adjusting the GPS. Which two theories would you combine for the strongest explanation, and what specific concepts from each would you use?