💾 Intro to Computer Architecture

Computer Performance Metrics


Why This Matters

When you're studying computer architecture, you're really learning how to answer one fundamental question: how do we make computers faster? But "faster" isn't as simple as it sounds. Performance metrics give you the vocabulary and mathematical tools to quantify speed, identify bottlenecks, and predict the impact of design changes. Every architectural decision, from pipeline depth to cache size to parallel processing, ultimately shows up in these numbers.

On exams, you're tested on more than definitions. You need to understand how metrics relate to each other, why some metrics can be misleading in isolation, and how to apply formulas like the CPU performance equation and Amdahl's Law to real scenarios. Don't just memorize what each metric measures. Know what architectural factors influence it and when to use one metric over another.


Time-Based Metrics

These metrics measure performance in terms of how long things take. Time is the most direct way to evaluate speed, and it answers the user's real question: "How fast will my program run?"

Execution Time

Execution time (also called CPU time) is the total time the processor spends working on a task. It's the single most reliable measure of performance from the user's perspective.

You calculate it with the CPU performance equation:

$$\text{Execution Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}$$

Notice that three independent factors feed into this: how many instructions the program requires, how many cycles each instruction takes on average, and how fast the clock ticks. A change to any one of these changes execution time, which is why this equation shows up constantly in exam problems. Lower execution time is always better.
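The equation above can be checked with a quick numerical sketch. The instruction count, CPI, and clock rate below are made-up illustrative values, not measurements of any real chip:

```python
# CPU performance equation:
#   execution time = (instruction count * CPI) / clock rate
# All three inputs are illustrative, invented values.
instruction_count = 2_000_000_000  # 2 billion instructions
cpi = 1.5                          # average cycles per instruction
clock_rate_hz = 2.5e9              # 2.5 GHz

execution_time_s = (instruction_count * cpi) / clock_rate_hz
print(execution_time_s)  # 1.2 seconds
```

Halving any one factor halves the result, which is exactly why the three factors are worth tracking separately.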

Latency

Latency is the delay from when a task is initiated to when it completes or first responds. While execution time measures total duration, latency emphasizes responsiveness.

  • Critical for real-time applications like gaming, video conferencing, and interactive systems where even small delays are noticeable
  • Affected by memory access times, network delays, and pipeline stalls, not just raw CPU speed
  • A system can have high throughput but poor latency (or vice versa), so you need to know which one the application cares about

Compare: Execution Time vs. Latency: both measure time, but execution time focuses on total duration while latency emphasizes delay before response. Exam questions often ask you to optimize for one or the other, and the strategies differ significantly.


Rate-Based Metrics

Rate metrics express performance as work completed per unit time. They're useful for comparisons but can be misleading without context about what kind of work is being measured.

Clock Speed (Frequency)

Clock speed is measured in Hertz (Hz) and indicates how many cycles the CPU completes per second. Modern processors run in the GHz range (billions of cycles per second).

Here's the catch: clock speed alone doesn't determine performance. A 3 GHz processor isn't necessarily faster than a 2 GHz one if the 3 GHz chip needs more cycles per instruction. Clock speed only tells you one piece of the CPU performance equation. It interacts with CPI and instruction count, so higher frequency only helps if those other factors stay constant (or don't get worse).
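A quick sketch of that catch, with hypothetical numbers for both chips:

```python
# Same program (same instruction count) on two hypothetical chips.
instructions = 1_000_000_000

time_a = (instructions * 2.0) / 3.0e9  # 3 GHz chip, but CPI = 2.0 -> ~0.667 s
time_b = (instructions * 1.2) / 2.0e9  # 2 GHz chip with CPI = 1.2 -> 0.6 s

# The lower-clocked chip wins because its better CPI more than
# compensates for the slower clock.
print(time_a > time_b)  # True
```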

MIPS (Million Instructions Per Second)

MIPS quantifies instruction throughput:

$$\text{MIPS} = \frac{\text{Instruction Count}}{\text{Execution Time} \times 10^6}$$

Higher MIPS means more instructions executed per second, but this metric has well-known problems:

  • Misleading across architectures: A CISC processor may accomplish more work per instruction than a RISC processor, so fewer MIPS could still mean faster program completion.
  • Ignores instruction complexity: A simple no-op (NOP) counts the same as a complex floating-point multiply.
  • Can vary across programs: The same processor will report different MIPS values on different workloads, making it unreliable as a single comparison number.
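The cross-architecture caveat can be made concrete with a hedged sketch (all instruction counts and times below are invented for illustration):

```python
# MIPS = instruction count / (execution time * 10^6)
# Invented numbers: a RISC-style chip needs more (simpler) instructions
# than a CISC-style chip for the same program.
risc_instructions, risc_time_s = 4.0e9, 4.0
cisc_instructions, cisc_time_s = 1.5e9, 3.0

risc_mips = risc_instructions / (risc_time_s * 1e6)  # 1000.0 MIPS
cisc_mips = cisc_instructions / (cisc_time_s * 1e6)  # 500.0 MIPS

# Double the MIPS, yet the program finishes a second later.
print(risc_mips > cisc_mips, risc_time_s > cisc_time_s)  # True True
```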

FLOPS (Floating-Point Operations Per Second)

FLOPS measures how quickly a processor performs floating-point arithmetic, which is often the bottleneck in scientific computing, physics simulations, machine learning, and graphics rendering.

  • More meaningful than MIPS for numerical workloads because it directly measures the operations those applications depend on
  • Reported at various scales: megaFLOPS ($10^6$), gigaFLOPS ($10^9$), teraFLOPS ($10^{12}$), and petaFLOPS ($10^{15}$)
  • Supercomputers are ranked by peak FLOPS (the TOP500 list uses the LINPACK benchmark to measure this)

Compare: MIPS vs. FLOPS: MIPS counts all instructions while FLOPS counts only floating-point operations. Use MIPS for general-purpose comparisons; use FLOPS when evaluating scientific or graphics workloads. Neither tells the whole story alone.

Throughput

Throughput measures the total work completed per unit time, such as tasks, transactions, or jobs processed.

  • Critical for servers and batch processing, where processing volume matters more than individual task speed
  • Can improve even if individual latency stays constant, through parallelism and pipelining
  • For example, a pipelined processor might not finish any single instruction faster, but it finishes more instructions per second overall
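The pipelining example above can be sketched numerically, assuming a hypothetical 5-stage pipeline with 1 ns stages:

```python
# Pipelining: per-instruction latency is unchanged, but throughput rises.
# Hypothetical 5-stage pipeline, 1 ns per stage, 1 million instructions.
stages = 5
stage_ns = 1.0
n = 1_000_000

unpipelined_ns = n * stages * stage_ns        # one instruction at a time: 5,000,000 ns
pipelined_ns = (stages + (n - 1)) * stage_ns  # overlap after the pipeline fills: 1,000,004 ns

# Each instruction still takes 5 ns start-to-finish (latency),
# yet the pipeline retires ~1 instruction per ns in steady state (throughput).
print(unpipelined_ns, pipelined_ns)
```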

Efficiency Metrics

These metrics describe how well the processor uses its resources. They reveal architectural efficiency independent of clock speed.

Instructions Per Cycle (IPC)

IPC is the average number of instructions completed each clock cycle. Higher IPC means the processor is getting more useful work done per tick of the clock.

  • Varies by workload and architecture: branch-heavy code typically has lower IPC than straight-line computation because mispredicted branches waste cycles
  • Modern superscalar processors target IPC > 1 by executing multiple instructions simultaneously through techniques like out-of-order execution and multiple functional units

Cycles Per Instruction (CPI)

CPI is the average number of clock cycles needed to complete one instruction:

$$\text{CPI} = \frac{\text{Total Cycles}}{\text{Instruction Count}}$$

Lower CPI is better because it means instructions execute more efficiently. CPI increases when the processor stalls, which happens due to pipeline hazards, cache misses, and complex instructions that require multiple cycles.

When a program uses a mix of instruction types, you can calculate average CPI as a weighted sum:

$$\text{CPI}_{\text{avg}} = \sum_{i} (\text{CPI}_i \times \text{Fraction}_i)$$

where each $\text{CPI}_i$ is the cycle cost of instruction type $i$ and $\text{Fraction}_i$ is how often that type appears.
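A minimal sketch of the weighted sum, assuming a made-up mix of ALU operations, loads, and branches:

```python
# Weighted-average CPI for an invented instruction mix:
#   ALU ops:  CPI 1, 50% of instructions
#   loads:    CPI 3, 30%
#   branches: CPI 2, 20%
mix = [(1, 0.50), (3, 0.30), (2, 0.20)]

cpi_avg = sum(cpi_i * fraction_i for cpi_i, fraction_i in mix)
print(cpi_avg)  # 1.8 (up to float rounding)
```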

Compare: IPC vs. CPI: these are mathematical inverses ($\text{IPC} = \frac{1}{\text{CPI}}$). Use whichever makes your calculation cleaner. IPC emphasizes "instructions completed" while CPI emphasizes "cycles consumed."


Analytical Tools

These aren't raw measurements. They're frameworks for predicting and understanding performance improvements before you build anything.

Amdahl's Law

Amdahl's Law predicts the maximum speedup you can achieve by improving only part of a system:

$$\text{Speedup} = \frac{1}{(1-P) + \frac{P}{S}}$$

where $P$ is the fraction of execution time that can be improved and $S$ is the speedup applied to that fraction.

Here's how to apply it step by step:

  1. Identify what fraction of execution time is affected by the improvement (that's $P$).
  2. Determine how much faster that portion becomes (that's $S$).
  3. Plug into the formula. The $(1-P)$ term represents the part you can't improve, which sets a hard ceiling on overall speedup.

The key insight: the unimproved portion dominates as $S$ grows large. If only 90% of your code is parallelizable ($P = 0.9$), then even with infinite processors ($S \to \infty$), maximum speedup is $\frac{1}{1 - 0.9} = 10\times$. That remaining 10% serial code becomes the bottleneck. This is why Amdahl's Law is so important for optimization decisions: it tells you where improvements will actually matter.
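The law is easy to encode and experiment with; a minimal sketch:

```python
# Amdahl's Law: overall speedup when a fraction p of execution time
# is accelerated by a factor s. (p and s are hypothetical inputs.)
def amdahl_speedup(p: float, s: float) -> float:
    return 1.0 / ((1.0 - p) + p / s)

# 90% of the work sped up 10x: 1 / (0.1 + 0.09) ~ 5.26x overall
print(amdahl_speedup(0.9, 10))

# Even an enormous s cannot pass the 1/(1-p) = 10x ceiling
print(amdahl_speedup(0.9, 1e12))
```

Plugging in ever-larger values of `s` is a quick way to see the ceiling flatten out.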

Benchmarks (e.g., SPEC)

Benchmarks are standardized test suites that measure performance across representative workloads.

  • They enable fair comparisons by running the same tests on different systems, eliminating workload variability
  • SPEC CPU benchmarks are the industry standard, with separate suites for integer performance (SPECint) and floating-point performance (SPECfp)
  • Results are reported as ratios relative to a reference machine, so a SPECint score of 50 means 50× faster than the reference on integer workloads
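A one-line sketch of the ratio arithmetic, using invented times. (Real SPEC scores aggregate many component benchmarks, typically via a geometric mean of per-benchmark ratios, so this shows only the basic idea:)

```python
# SPEC-style ratio sketch: score = (reference machine time) / (measured time).
# Both times below are invented for illustration.
reference_time_s = 1000.0
measured_time_s = 20.0

score = reference_time_s / measured_time_s
print(score)  # 50.0 -> "50x faster than the reference"
```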

Compare: Amdahl's Law vs. Benchmarks: Amdahl's Law is theoretical (predicts limits), while benchmarks are empirical (measure actual performance). Use Amdahl's Law to guide design decisions; use benchmarks to validate real-world results.


Quick Reference Table

| Concept | Best Examples |
| --- | --- |
| Time-based performance | Execution Time, Latency |
| Rate-based throughput | Clock Speed, MIPS, FLOPS, Throughput |
| Architectural efficiency | IPC, CPI |
| Theoretical analysis | Amdahl's Law |
| Empirical measurement | Benchmarks (SPEC) |
| CPU performance equation components | Instruction Count, CPI, Clock Rate |
| Parallel speedup limits | Amdahl's Law |
| Workload-specific metrics | FLOPS (scientific), Throughput (servers) |

Self-Check Questions

  1. If Processor A has a higher clock speed than Processor B but longer execution time for the same program, what metric must be worse for Processor A? (Hint: look at the CPU performance equation and think about what else could differ.)

  2. A program spends 80% of its time in parallelizable code. Using Amdahl's Law, what is the maximum possible speedup with infinite processors? (You should get 5×.)

  3. Compare and contrast MIPS and FLOPS. When would each metric be most appropriate for evaluating processor performance?

  4. Given the CPU performance equation, how would doubling the clock speed while also doubling the CPI affect execution time? (Work through the algebra.)

  5. Why might a processor with IPC of 2.0 running at 2 GHz outperform a processor with IPC of 0.8 running at 4 GHz? Which metrics would you calculate to prove this?