
💾 Intro to Computer Architecture

Computer Performance Metrics


Why This Matters

When you're studying computer architecture, you're really learning how to answer one fundamental question: how do we make computers faster? But "faster" isn't as simple as it sounds. Performance metrics give you the vocabulary and mathematical tools to quantify speed, identify bottlenecks, and predict the impact of design changes. Every architectural decision—from pipeline depth to cache size to parallel processing—ultimately shows up in these numbers.

On exams, you're being tested on more than definitions. You need to understand how metrics relate to each other, why some metrics can be misleading in isolation, and how to apply formulas like the CPU performance equation and Amdahl's Law to real scenarios. Don't just memorize what each metric measures—know what architectural factors influence it and when to use one metric over another.


Time-Based Metrics

These metrics measure performance in terms of how long things take—the most intuitive way to evaluate speed. Time-based metrics answer the user's real question: "How fast will my program run?"

Execution Time

  • Total time to complete a task—the ultimate measure of performance from the user's perspective
  • Calculated using the CPU performance equation: $\text{Execution Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Speed}}$
  • Lower is always better—this is the metric that actually matters for comparing real-world performance
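The CPU performance equation can be sketched as a one-line function. The program size, CPI, and clock rate below are hypothetical numbers chosen for a clean result:

```python
def execution_time(instruction_count, cpi, clock_hz):
    """CPU performance equation: time = (instructions × CPI) / clock rate."""
    return instruction_count * cpi / clock_hz

# Hypothetical program: 2 billion instructions, CPI of 1.5, 3 GHz clock.
t = execution_time(2e9, 1.5, 3e9)
print(t)  # 1.0 seconds
```

Note how all three factors matter equally: halving CPI buys you exactly as much as doubling the clock speed.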

Latency

  • Delay from task initiation to completion—measures responsiveness rather than raw speed
  • Critical for real-time applications like gaming, video conferencing, and interactive systems where delays are noticeable
  • Affected by memory access times, network delays, and pipeline stalls—not just CPU speed

Compare: Execution Time vs. Latency—both measure time, but execution time focuses on total duration while latency emphasizes delay before response. FRQs often ask you to optimize for one or the other, and the strategies differ significantly.


Rate-Based Metrics

Rate metrics express performance as work completed per unit time. They're useful for comparisons but can be misleading without context about what kind of work is being measured.

Clock Speed (Frequency)

  • Measured in Hertz (Hz)—indicates how many cycles the CPU completes per second (modern processors run in GHz)
  • Not a complete performance indicator—a 3 GHz processor isn't necessarily faster than a 2 GHz one if architectures differ
  • Interacts with CPI and instruction count—higher frequency only helps if other factors remain constant

MIPS (Million Instructions Per Second)

  • Quantifies instruction throughput: $\text{MIPS} = \frac{\text{Instruction Count}}{\text{Execution Time} \times 10^6}$
  • Misleading across different architectures—a CISC processor may do more work per instruction than a RISC processor
  • Ignores instruction complexity—simple NOPs count the same as complex floating-point operations
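A short sketch of why MIPS can mislead across architectures. The instruction counts are hypothetical, but the setup mirrors the CISC/RISC caveat above: both machines finish the same task in the same time, yet the one executing more (simpler) instructions posts a higher MIPS rating:

```python
def mips(instruction_count, execution_time_s):
    """MIPS = instruction count / (execution time × 10^6)."""
    return instruction_count / (execution_time_s * 1e6)

# Same task, same 1-second wall-clock time on both machines.
print(mips(2e9, 1.0))  # "CISC-like" machine: 2000.0 MIPS
print(mips(5e9, 1.0))  # "RISC-like" machine: 5000.0 MIPS, yet no faster
```

Higher MIPS, identical execution time: the rating measured instruction count, not useful work.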

FLOPS (Floating-Point Operations Per Second)

  • Measures floating-point calculation speed—essential for scientific computing, simulations, and graphics
  • More meaningful than MIPS for numerical workloads—directly measures the operations that matter for these applications
  • Reported in megaFLOPS, gigaFLOPS, or teraFLOPS—supercomputers are ranked by peak FLOPS

Compare: MIPS vs. FLOPS—MIPS counts all instructions while FLOPS counts only floating-point operations. Use MIPS for general-purpose comparisons; use FLOPS when evaluating scientific or graphics workloads. Neither tells the whole story alone.

Throughput

  • Work completed per unit time—measured in tasks, transactions, or jobs processed
  • Critical for servers and batch processing—where processing volume matters more than individual task speed
  • Can improve even if individual latency stays constant—through parallelism and pipelining

Efficiency Metrics

These metrics describe how well the processor uses its resources. They reveal architectural efficiency independent of clock speed.

Instructions Per Cycle (IPC)

  • Average instructions completed each clock cycle—higher IPC means better resource utilization
  • Varies by workload and architecture—branch-heavy code typically has lower IPC than straight-line computation
  • Modern superscalar processors target IPC > 1—executing multiple instructions simultaneously

Cycles Per Instruction (CPI)

  • Average cycles needed per instruction: $\text{CPI} = \frac{\text{Total Cycles}}{\text{Instruction Count}}$
  • Lower CPI is better—indicates more efficient instruction execution
  • Affected by pipeline stalls, cache misses, and instruction mix—complex instructions and memory delays increase CPI

Compare: IPC vs. CPI—these are mathematical inverses ($\text{IPC} = \frac{1}{\text{CPI}}$). Use whichever makes your calculation cleaner. IPC emphasizes "instructions completed" while CPI emphasizes "cycles consumed."
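The CPI formula and the IPC/CPI inverse relationship in a few lines, using hypothetical cycle and instruction counts:

```python
def cpi(total_cycles, instruction_count):
    """CPI = total cycles / instruction count."""
    return total_cycles / instruction_count

# Hypothetical run: 3 billion cycles to retire 2.4 billion instructions.
c = cpi(3e9, 2.4e9)
print(c)      # 1.25 cycles per instruction
print(1 / c)  # 0.8 instructions per cycle (IPC is the inverse)
```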


Analytical Tools

These aren't raw measurements—they're frameworks for predicting and understanding performance improvements.

Amdahl's Law

  • Predicts maximum speedup from partial optimization: $\text{Speedup} = \frac{1}{(1-P) + \frac{P}{S}}$, where $P$ is the parallelizable fraction and $S$ is the speedup of that portion
  • Reveals diminishing returns—if only 90% of code is parallelizable, maximum speedup is 10× no matter how many processors you add
  • Essential for optimization decisions—tells you where improvements will actually matter
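Amdahl's Law is easy to check numerically. A minimal sketch, using the 90%-parallelizable example from above:

```python
def amdahl_speedup(p, s):
    """Overall speedup when fraction p of the work is sped up by factor s."""
    return 1.0 / ((1 - p) + p / s)

# 90% parallelizable, infinitely many processors: speedup caps at 10x.
print(amdahl_speedup(0.9, float("inf")))  # ~10.0
# The same 90% on only 8 processors falls well short of that ceiling.
print(amdahl_speedup(0.9, 8))             # ~4.7
```

The serial fraction $(1-P)$ dominates quickly—this is the "diminishing returns" behavior the bullet points describe.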

Benchmarks (e.g., SPEC)

  • Standardized test suites that measure performance across representative workloads
  • Enable fair comparisons—same tests run on different systems eliminate workload variability
  • SPEC CPU benchmarks are industry standard—separate suites for integer (SPECint) and floating-point (SPECfp) performance

Compare: Amdahl's Law vs. Benchmarks—Amdahl's Law is theoretical (predicts limits), while benchmarks are empirical (measure actual performance). Use Amdahl's Law to guide design decisions; use benchmarks to validate real-world results.


Quick Reference Table

| Concept | Best Examples |
| --- | --- |
| Time-based performance | Execution Time, Latency |
| Rate-based throughput | Clock Speed, MIPS, FLOPS, Throughput |
| Architectural efficiency | IPC, CPI |
| Theoretical analysis | Amdahl's Law |
| Empirical measurement | Benchmarks (SPEC) |
| CPU performance equation components | Instruction Count, CPI, Clock Speed |
| Parallel speedup limits | Amdahl's Law |
| Workload-specific metrics | FLOPS (scientific), Throughput (servers) |

Self-Check Questions

  1. If Processor A has a higher clock speed than Processor B but longer execution time for the same program, what metric must be worse for Processor A?

  2. A program spends 80% of its time in parallelizable code. Using Amdahl's Law, what is the maximum possible speedup with infinite processors?

  3. Compare and contrast MIPS and FLOPS—when would each metric be most appropriate for evaluating processor performance?

  4. Given the CPU performance equation, how would doubling the clock speed while also doubling the CPI affect execution time?

  5. Why might a processor with IPC of 2.0 running at 2 GHz outperform a processor with IPC of 0.8 running at 4 GHz? Which metrics would you calculate to prove this?