Intro to Computer Architecture

CPU Pipeline Stages

Why This Matters

Understanding CPU pipeline stages is fundamental to grasping how modern processors achieve high performance. You're being tested on concepts like instruction-level parallelism, pipeline hazards, throughput vs. latency tradeoffs, and the fetch-decode-execute cycle. These stages don't exist in isolation: they work together to allow multiple instructions to be "in flight" simultaneously, which is why a 5-stage pipeline can theoretically improve throughput by up to 5x compared to single-cycle execution.
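
To see where the "up to 5x" figure comes from, here is a minimal sketch of the standard idealized model. It assumes no stalls and equal stage delays, and the function name `ideal_speedup` is just an illustrative choice. Time is counted in stage-delay units: an unpipelined design spends all k stage delays on each instruction, while a pipeline spends k cycles filling and then finishes one instruction per cycle.

```python
def ideal_speedup(num_stages: int, num_instructions: int) -> float:
    """Ideal pipeline speedup, assuming no stalls and equal stage delays."""
    # Unpipelined: each instruction uses every stage before the next one starts.
    unpipelined_time = num_stages * num_instructions
    # Pipelined: num_stages cycles for the first instruction to finish,
    # then one more instruction completes every cycle.
    pipelined_time = num_stages + (num_instructions - 1)
    return unpipelined_time / pipelined_time


# The speedup starts at 1 and approaches the stage count as the program grows.
for n in (1, 5, 100, 1_000_000):
    print(n, round(ideal_speedup(5, n), 3))
```

The speedup only approaches 5 for long instruction streams, and real pipelines fall short of it whenever hazards force stalls.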

When exam questions ask about pipeline stalls, data hazards, or control hazards, they're really testing whether you understand what each stage does and what resources it needs. Don't just memorize the stage names. Know what hardware components are active at each stage, what data flows between stages, and what happens when dependencies force the pipeline to wait. This conceptual understanding will help you tackle FRQ scenarios involving hazard detection, forwarding, and branch prediction.


Instruction Preparation Stages

These first two stages focus on getting the instruction ready for execution: fetching it from memory and figuring out what it actually means. Both stages interact heavily with memory and control logic before any real computation happens.

Instruction Fetch (IF)

  • Program Counter (PC) provides the memory address: the CPU reads the instruction at this address from instruction memory or cache
  • Instruction Register (IR) stores the fetched instruction, holding it stable while the next stage decodes it
  • PC increments automatically (typically by 4 bytes when instructions are a fixed 32 bits wide, as in MIPS), preparing to fetch the next sequential instruction unless a branch occurs
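
The three IF steps above can be sketched as one small function. This is an illustrative model, not real hardware: it assumes fixed 4-byte instructions, represents instruction memory as a Python dict keyed by byte address, and uses the made-up name `fetch`.

```python
from typing import Dict, Tuple


def fetch(pc: int, instruction_memory: Dict[int, int]) -> Tuple[int, int]:
    """Model one IF step: return (ir, next_pc), assuming 4-byte instructions."""
    ir = instruction_memory[pc]  # read the instruction word the PC points at
    next_pc = pc + 4             # default: the next sequential instruction
    return ir, next_pc


# Hypothetical two-instruction program stored at addresses 0x0 and 0x4:
#   add $t0, $t1, $t2   encodes as 0x012A4020
#   lw  $t0, 4($t1)     encodes as 0x8D280004
imem = {0x0: 0x012A4020, 0x4: 0x8D280004}
ir, pc = fetch(0x0, imem)
print(hex(ir), hex(pc))
```

A taken branch would overwrite `next_pc` with the branch target instead of `pc + 4`.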

Instruction Decode (ID)

  • Opcode field is parsed to determine the operation type; this generates control signals that configure all downstream stages
  • Register file is read to retrieve operand values specified by the source register fields (typically rs and rt in MIPS)
  • Sign extension occurs for immediate values, converting 16-bit immediates to 32-bit values for ALU operations
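
Field extraction and sign extension are easy to see in code. The sketch below assumes the standard 32-bit MIPS field layout (opcode in bits 31-26, rs in 25-21, rt in 20-16, rd in 15-11, funct in 5-0, immediate in 15-0). The function names are illustrative, and a real decoder would also turn the opcode into control signals.

```python
from typing import Dict


def sign_extend_16(value: int) -> int:
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode(instruction: int) -> Dict[str, int]:
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "opcode": (instruction >> 26) & 0x3F,
        "rs": (instruction >> 21) & 0x1F,
        "rt": (instruction >> 16) & 0x1F,
        "rd": (instruction >> 11) & 0x1F,
        "funct": instruction & 0x3F,
        "imm": sign_extend_16(instruction),
    }


# lw $t0, -4($t1): opcode 0x23, rs = 9 ($t1), rt = 8 ($t0), immediate = 0xFFFC
fields = decode(0x8D28FFFC)
print(fields["opcode"], fields["rs"], fields["rt"], fields["imm"])
```

All fields are extracted for every instruction; the opcode then tells later stages which ones are meaningful (for example, `rd` and `funct` only matter for R-type instructions).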

Compare: IF vs. ID. Both happen before any computation, but IF interacts with instruction memory while ID interacts with the register file. If an FRQ asks where a data hazard is detected, ID is your answer since that's where register values are read.
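
As a concrete case of hazard detection in ID, here is a sketch of the textbook load-use check for a classic 5-stage MIPS pipeline. The parameter names mirror pipeline register fields (ID/EX and IF/ID) but are otherwise illustrative; it assumes the load's destination is its rt field.

```python
def load_use_hazard(id_ex_mem_read: bool, id_ex_rt: int,
                    if_id_rs: int, if_id_rt: int) -> bool:
    """True when the instruction in ID reads a register that a load
    currently in EX has not yet fetched from memory."""
    return id_ex_mem_read and id_ex_rt in (if_id_rs, if_id_rt)


# lw $t0, 0($t1) is in EX (destination rt = 8) while
# add $t2, $t0, $t3 is in ID (sources rs = 8, rt = 11): stall needed.
print(load_use_hazard(True, 8, 8, 11))
```

When the check is true, the pipeline holds the instruction in ID for one cycle and inserts a bubble into EX.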


Computation and Data Stages

These middle stages perform the actual work: calculating results and accessing data memory. This is where the ALU does its job and where load/store instructions interact with the memory hierarchy.

Execute (EX)

  • ALU performs the core operation: arithmetic (+, -), logical (AND, OR), or address calculation for memory instructions
  • ALU inputs come from either register values (R-type) or a register plus sign-extended immediate (I-type)
  • Branch target addresses are calculated here by adding the sign-extended offset (shifted left by 2 in MIPS, since it counts instructions rather than bytes) to PC + 4, even before knowing if the branch is taken
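
The EX-stage work above can be sketched in a few lines. This is a simplified model: the operation names and helper functions are illustrative, results are truncated to 32 bits, and the branch target follows the MIPS convention (an assumption here) of adding the word offset, shifted left by 2, to PC + 4.

```python
MASK32 = 0xFFFFFFFF

# A tiny ALU: each entry maps an operation name to a two-input function.
ALU_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
}


def alu(op: str, a: int, b: int) -> int:
    """Apply the selected operation and keep the low 32 bits of the result."""
    return ALU_OPS[op](a, b) & MASK32


def branch_target(pc_plus_4: int, offset: int) -> int:
    """MIPS-style target: PC + 4 plus the sign-extended word offset times 4."""
    return (pc_plus_4 + (offset << 2)) & MASK32


# R-type: both inputs are register values.
print(alu("add", 7, 5))
# Load/store: base register plus sign-extended immediate gives the address.
print(hex(alu("add", 0x1000, 8)))
# Branch three instructions backward from PC + 4 = 0x1008.
print(hex(branch_target(0x1008, -3)))
```

The same adder path serves both arithmetic and address calculation, which is why loads and stores need the EX stage even though they do no "math" from the programmer's point of view.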

Memory Access (MEM)

  • Data memory is accessed only for load/store instructions; all other instruction types pass through this stage without memory interaction
  • Effective address from EX stage determines the memory location; loads read data while stores write register values to memory
  • Cache misses create stalls since the pipeline must wait for data to arrive from slower memory levels, a major source of performance loss
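
Here is a minimal sketch of the MEM stage's behavior, assuming two control signals named `mem_read` and `mem_write` (names borrowed from the usual MIPS datapath diagrams) and a dict standing in for data memory. It ignores caches and stalls entirely.

```python
from typing import Dict, Optional


def memory_access(mem_read: bool, mem_write: bool, address: int,
                  store_value: int, data_memory: Dict[int, int]) -> Optional[int]:
    """Model one MEM step; only loads and stores touch data memory."""
    if mem_write:                      # store: write the register value
        data_memory[address] = store_value
    if mem_read:                       # load: return the value read
        return data_memory[address]
    return None                        # every other instruction passes through


dmem = {0x100: 7}
print(memory_access(True, False, 0x100, 0, dmem))   # load from 0x100
memory_access(False, True, 0x104, 42, dmem)         # store 42 to 0x104
print(dmem[0x104])
```

For an R-type instruction both signals are false, so the ALU result simply rides through the EX/MEM and MEM/WB pipeline registers untouched.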

Compare: EX vs. MEM. EX uses the ALU for computation, while MEM uses data memory for storage. R-type instructions only need EX; load/store instructions need both. This distinction matters for understanding which hazards affect which instruction types.


Result Completion Stage

The final stage ensures computed results become visible to future instructions. Without this stage, no instruction would ever produce a lasting effect on processor state.

Write Back (WB)

  • Destination register receives the result: either the ALU output (R-type, immediate operations) or loaded data (load instructions)
  • Register file write port is used here, with the write register number determined by the instruction format (rd for R-type, rt for loads)
  • Store instructions do no work in this stage since they write to memory, not registers; their work completed in MEM (they still occupy the stage for one cycle, but nothing is written)
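
The WB stage boils down to one multiplexer and one conditional write. The sketch below assumes control signals named `reg_write` and `mem_to_reg` (again following common MIPS datapath naming) and models the register file as a 32-entry list, with register 0 hardwired to zero as in MIPS.

```python
from typing import List


def write_back(registers: List[int], reg_write: bool, mem_to_reg: bool,
               dest_reg: int, alu_result: int, load_data: int) -> None:
    """Model one WB step on a 32-entry register file."""
    if not reg_write:      # stores and branches write no register
        return
    if dest_reg == 0:      # register 0 always reads as zero in MIPS
        return
    # Loads take their value from memory; everything else uses the ALU result.
    registers[dest_reg] = load_data if mem_to_reg else alu_result


regs = [0] * 32
write_back(regs, True, False, 8, 15, 99)   # R-type result into $t0
write_back(regs, True, True, 9, 15, 99)    # loaded value into $t1
write_back(regs, False, False, 10, 15, 99) # store: nothing written
print(regs[8], regs[9], regs[10])
```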

Compare: MEM vs. WB. Both can provide the final result, but MEM provides data from memory (loads) while WB writes any result to registers. Understanding this split is essential for implementing data forwarding paths.
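
To make the forwarding idea concrete, here is a sketch of the usual selection logic for one ALU input in a classic 5-stage MIPS pipeline. The function and its string return values are illustrative; real hardware produces a multiplexer select signal. It assumes results can be forwarded from the EX/MEM and MEM/WB pipeline registers, with the newer EX/MEM value taking priority.

```python
def forward_select(id_ex_src: int,
                   ex_mem_reg_write: bool, ex_mem_rd: int,
                   mem_wb_reg_write: bool, mem_wb_rd: int) -> str:
    """Choose where an ALU input should come from for source register id_ex_src."""
    # Newest result first: the instruction one ahead is now in MEM.
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == id_ex_src:
        return "EX/MEM"
    # Otherwise the instruction two ahead, now in WB.
    if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == id_ex_src:
        return "MEM/WB"
    # No dependency in flight: use the value read from the register file in ID.
    return "REGFILE"


# add $t0, ... immediately followed by sub ..., $t0, ...:
print(forward_select(8, True, 8, False, 0))
```

The `!= 0` checks keep writes aimed at register 0 from being forwarded, since that register must always read as zero.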


Quick Reference Table

| Concept | Best Examples |
| --- | --- |
| Memory interaction | IF (instruction memory), MEM (data memory) |
| Register file access | ID (read), WB (write) |
| ALU usage | EX stage exclusively |
| Control signal generation | ID stage |
| Address calculation | EX (for branches and memory operations) |
| Pipeline register boundaries | IF/ID, ID/EX, EX/MEM, MEM/WB |
| Stages skipped by some instructions | MEM (by R-type), WB (by stores) |

Self-Check Questions

  1. Which two stages interact with memory, and what type of memory does each access?

  2. If a load instruction is followed immediately by an add instruction that uses the loaded value, at which stage is the data hazard detected, and why?

  3. Compare and contrast what happens during the EX stage for an R-type arithmetic instruction versus a load instruction.

  4. A store instruction (sw) uses the MEM stage but not the WB stage. Explain why this makes sense given what each stage does.

  5. If you were implementing data forwarding to reduce stalls, which stages would need forwarding paths between them, and what values would be forwarded?