💾 Intro to Computer Architecture Unit 1 – Computer Architecture: Core Concepts

Computer architecture forms the foundation of modern computing systems, encompassing the design and organization of hardware and software components. This unit explores key elements like CPUs, memory systems, and I/O devices, as well as fundamental concepts such as data representation and instruction set architectures.

The study delves into CPU design, memory hierarchies, and performance optimization techniques. It also covers advanced topics like multiprocessor systems, GPU architectures, and emerging technologies such as quantum computing and neuromorphic systems, providing a comprehensive overview of this rapidly evolving field.

Key Components of Computer Architecture

  • Encompasses the design and organization of a computer system's hardware and software components
  • Includes the CPU (Central Processing Unit) which executes instructions and performs arithmetic and logical operations
  • Incorporates memory systems for storing data and instructions (RAM, cache, hard drives)
  • Features input/output (I/O) devices for interacting with the external environment (keyboard, mouse, display)
  • Utilizes buses for communication and data transfer between components
    • Address bus carries memory addresses
    • Data bus transfers data between components
    • Control bus carries control signals and synchronizes operations
  • Defines the instruction set architecture (ISA) specifying the machine language instructions supported by the processor
  • Focuses on optimizing performance, power efficiency, and cost-effectiveness in computer system design
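The bus roles above can be sketched in a few lines of Python. This is a hypothetical model for illustration, not real hardware behavior: the dictionary stands in for memory, and the `bus_read` function and its comments label which bus carries what.

```python
# Hypothetical sketch: a memory read over the three system buses.
memory = {0x10: 0xAB, 0x14: 0xCD}  # address -> stored byte (toy memory)

def bus_read(address):
    # Control bus: the CPU asserts a "read" signal (modeled implicitly here).
    # Address bus: carries the target memory address to the memory system.
    # Data bus: returns the value stored at that address to the CPU.
    return memory[address]

print(hex(bus_read(0x10)))  # 0xab
```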

Data Representation and Storage

  • Computers represent and store data using the binary number system (0s and 1s)
  • Each binary digit (bit) represents the smallest unit of data
  • Bits are grouped into larger units called bytes (8 bits) or words (typically 32 or 64 bits)
  • Numeric data is represented using fixed-point or floating-point formats
    • Fixed-point represents integers and fractions with a fixed number of bits for each part
    • Floating-point represents real numbers using a mantissa and exponent (IEEE 754 standard)
  • Character data is encoded using ASCII (American Standard Code for Information Interchange) or Unicode standards
  • Images are represented using pixel grids with color values (RGB, CMYK) or compressed formats (JPEG, PNG)
  • Data is stored in memory cells or on secondary storage devices (hard drives, SSDs)
  • Memory is organized into addressable locations, each with a unique memory address
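A minimal Python sketch of these representations, using the standard `struct` module: it shows an integer's binary form, packs it into a 32-bit word, and then unpacks the sign, exponent, and mantissa fields of an IEEE 754 single-precision float.

```python
import struct

# A byte is 8 bits; Python integers can be viewed in binary directly.
value = 202
print(bin(value))  # '0b11001010'

# Pack the same integer into a 4-byte (32-bit) little-endian word.
word = struct.pack("<I", value)
assert len(word) == 4

# IEEE 754 single precision: pack a float, then inspect its raw bit fields.
bits = struct.unpack("<I", struct.pack("<f", 1.5))[0]
sign = bits >> 31              # 1 sign bit
exponent = (bits >> 23) & 0xFF # 8 exponent bits (biased by 127)
mantissa = bits & 0x7FFFFF     # 23 mantissa (fraction) bits
print(sign, exponent, mantissa)
```

For 1.5 (binary 1.1 × 2^0), the biased exponent is 127 and the mantissa's leading fraction bit is set, illustrating how one word encodes sign, scale, and precision.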

Instruction Set Architecture (ISA)

  • Defines the interface between hardware and software in a computer system
  • Specifies the set of machine language instructions supported by the processor
  • Includes instruction formats, addressing modes, data types, and registers
  • CISC (Complex Instruction Set Computing) architectures have a large number of complex instructions
    • Examples: x86 (Intel), 68000 (Motorola)
  • RISC (Reduced Instruction Set Computing) architectures have a smaller set of simpler instructions
    • Examples: ARM, MIPS, SPARC
  • Instructions are fetched from memory, decoded, executed, and results are stored back in memory or registers
  • Assembly language provides a human-readable representation of machine language instructions
  • Compilers translate high-level programming languages into machine language instructions
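The fetch-decode-execute cycle can be illustrated with a toy accumulator machine. The three opcodes (`LOAD`, `ADD`, `STORE`) are invented for this sketch and do not correspond to any real ISA.

```python
# A toy accumulator machine with a hypothetical 3-instruction ISA.
program = [
    ("LOAD", 5),   # acc = 5
    ("ADD", 3),    # acc = acc + 3
    ("STORE", 0),  # memory[0] = acc
]
memory = [0] * 8
acc = 0  # accumulator register
pc = 0   # program counter

while pc < len(program):
    opcode, operand = program[pc]  # fetch and decode
    if opcode == "LOAD":           # execute
        acc = operand
    elif opcode == "ADD":
        acc += operand
    elif opcode == "STORE":
        memory[operand] = acc      # store result back to memory
    pc += 1                        # advance to the next instruction

print(memory[0])  # 8
```

Real processors do the same loop in hardware, with the decode step selecting datapath operations instead of Python branches.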

CPU Design and Organization

  • The CPU is the brain of the computer, responsible for executing instructions and performing computations
  • Consists of the arithmetic logic unit (ALU), control unit, registers, and cache memory
  • Fetches instructions from memory, decodes them, executes operations, and stores results
  • Pipelining improves performance by overlapping the execution of multiple instructions
    • Stages: instruction fetch, decode, execute, memory access, write back
  • Superscalar architectures execute multiple instructions simultaneously using multiple execution units
  • Out-of-order execution reorders instructions to maximize resource utilization and minimize dependencies
  • Branch prediction techniques (static, dynamic) optimize the execution of conditional branches
  • Multi-core processors integrate multiple CPU cores on a single chip for parallel processing
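The benefit of pipelining can be quantified with a simple cycle count, assuming an ideal pipeline with no stalls or hazards (an assumption this sketch makes explicit):

```python
def cycles_unpipelined(n_instr, n_stages):
    # Without pipelining, each instruction occupies all stages
    # before the next instruction can begin.
    return n_instr * n_stages

def cycles_pipelined(n_instr, n_stages):
    # Ideal pipeline: fill the stages once, then retire
    # one instruction per cycle (no stalls or hazards assumed).
    return n_stages + (n_instr - 1)

print(cycles_unpipelined(100, 5))  # 500
print(cycles_pipelined(100, 5))    # 104
```

With 100 instructions in a 5-stage pipeline, the ideal speedup approaches 5x; real pipelines fall short because of branch mispredictions and data hazards.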

Memory Hierarchy and Management

  • Memory hierarchy organizes storage devices based on speed, capacity, and cost
    • Registers, cache, main memory (RAM), secondary storage (hard drives, SSDs)
  • Registers are the fastest and most expensive, located within the CPU
  • Cache memory (L1, L2, L3) stores frequently accessed data and instructions to reduce memory access latency
    • Temporal locality: recently accessed data is likely to be accessed again
    • Spatial locality: nearby memory locations are likely to be accessed together
  • Main memory (RAM) stores active programs and data, accessed by the CPU through the memory controller
  • Virtual memory allows the use of secondary storage (hard drive) as an extension of main memory
    • Paging divides memory into fixed-size pages, swapped between main memory and secondary storage
    • Segmentation divides memory into variable-size segments based on logical divisions of a program
  • Memory management unit (MMU) handles address translation and memory protection
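Spatial locality and cache behavior can be demonstrated with a small direct-mapped cache simulation. The parameters (4 lines, 4-byte blocks) are chosen arbitrarily for illustration.

```python
class DirectMappedCache:
    def __init__(self, num_lines=4, block_size=4):
        self.num_lines = num_lines
        self.block_size = block_size
        self.tags = [None] * num_lines  # one tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, address):
        block = address // self.block_size   # which memory block
        index = block % self.num_lines       # which cache line it maps to
        tag = block // self.num_lines        # identifies the block in that line
        if self.tags[index] == tag:
            self.hits += 1
        else:
            self.misses += 1
            self.tags[index] = tag           # fill the line on a miss

cache = DirectMappedCache()
for addr in range(16):   # sequential scan exploits spatial locality
    cache.access(addr)
print(cache.hits, cache.misses)  # 12 4
```

Each 4-byte block misses once and then hits three times, so a purely sequential scan already achieves a 75% hit rate.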

Input/Output Systems

  • I/O systems enable communication between the computer and external devices
  • Includes input devices (keyboard, mouse, touchscreen) and output devices (display, printer, speakers)
  • I/O controllers manage the transfer of data between the CPU, memory, and I/O devices
    • Examples: USB controller, graphics card, network interface card (NIC)
  • Interrupts allow I/O devices to signal the CPU when they require attention
    • Interrupt handler routines process the interrupts and perform necessary actions
  • Direct memory access (DMA) enables I/O devices to access memory directly, bypassing the CPU
  • Buses (PCIe, USB, SATA) provide standardized interfaces for connecting I/O devices to the system
  • Device drivers are software components that facilitate communication between the operating system and I/O devices
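The interrupt mechanism can be sketched as a dispatch table, analogous to a hardware interrupt vector table. The IRQ numbers and handler names here are invented for illustration.

```python
# Hypothetical interrupt dispatch: the CPU looks up a handler
# (interrupt service routine) by IRQ number in a vector table.
log = []

def timer_handler():
    log.append("timer tick")

def keyboard_handler():
    log.append("key pressed")

interrupt_vector = {0: timer_handler, 1: keyboard_handler}

def raise_interrupt(irq):
    handler = interrupt_vector.get(irq)
    if handler is not None:
        handler()  # transfer control to the interrupt service routine

raise_interrupt(0)
raise_interrupt(1)
print(log)  # ['timer tick', 'key pressed']
```

In real hardware the CPU saves its state, jumps through the vector table, runs the handler, and resumes the interrupted program.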

Performance Metrics and Optimization

  • Performance metrics evaluate the efficiency and speed of a computer system
  • Clock speed (measured in Hz) determines the number of clock cycles per second
  • Instructions per cycle (IPC) measures the average number of instructions executed per clock cycle
  • Execution time is the total time taken to complete a task, influenced by clock speed and IPC
  • Throughput represents the number of tasks completed per unit of time
  • Latency is the delay between the initiation of an operation and its completion
  • Amdahl's Law states that the performance improvement of a system is limited by its sequential (non-parallelizable) parts
  • Optimization techniques include:
    • Instruction-level parallelism (ILP) exploits parallelism within a single instruction stream
    • Data-level parallelism (DLP) performs the same operation on multiple data elements simultaneously
    • Thread-level parallelism (TLP) executes multiple threads concurrently on different processor cores
    • Compiler optimizations (loop unrolling, code vectorization) improve code efficiency
    • Cache optimization (blocking, prefetching) reduces memory access latency
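Amdahl's Law can be evaluated directly. With parallelizable fraction p and n processors, speedup = 1 / ((1 - p) + p/n):

```python
def amdahl_speedup(parallel_fraction, n_processors):
    # Amdahl's Law: speedup = 1 / ((1 - p) + p / n)
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_processors)

print(amdahl_speedup(0.9, 10))    # ≈ 5.26
print(amdahl_speedup(0.9, 1000))  # ≈ 9.91, approaching the 1/(1-p) = 10x limit
```

Even with 90% of a program parallelized, a thousand processors cannot exceed a 10x speedup: the serial 10% dominates, which is why reducing sequential bottlenecks matters as much as adding cores.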

Advanced Architectures and Emerging Technologies

  • Multiprocessor systems integrate multiple processors within a single system for parallel processing
    • Shared memory multiprocessors share a common memory address space
    • Distributed memory multiprocessors have separate memory for each processor
  • GPU (Graphics Processing Unit) architectures are optimized for parallel processing of graphics and general-purpose computations
  • Heterogeneous computing combines different types of processors (CPU, GPU, FPGA) to leverage their unique strengths
  • Neuromorphic computing mimics the structure and function of biological neural networks for energy-efficient and adaptive computing
  • Quantum computing utilizes quantum bits (qubits) and quantum operations for solving certain complex problems exponentially faster than classical computers
  • Near-memory and in-memory computing architectures place computation closer to memory to reduce data movement and improve performance
  • 3D chip stacking technologies (e.g., through-silicon vias) enable vertical integration of multiple chip layers for increased density and bandwidth
  • Emerging non-volatile memory technologies (PCM, MRAM, ReRAM) offer higher density, lower power consumption, and persistent storage compared to traditional DRAM and NAND flash


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.
