Load balancing sits at the heart of parallel and distributed computing—it's the mechanism that transforms a collection of individual servers into a unified, high-performance system. When you're tested on this topic, you're being asked to demonstrate understanding of resource allocation, fault tolerance, scalability, and performance optimization. The algorithms themselves are just implementations of deeper principles about how distributed systems manage competing demands for finite resources.
Don't just memorize which algorithm does what. Instead, focus on why each approach exists: What problem does it solve? What assumptions does it make about the environment? When would it fail? Understanding the trade-offs between simplicity and adaptability, static and dynamic allocation, and stateless and stateful routing will serve you far better on exams than rote recall. Each algorithm represents a different answer to the fundamental question: How do we distribute work fairly and efficiently?
Stateless, simple algorithms make routing decisions without considering current server state. They're predictable, low-overhead, and work well when server capabilities and request costs are uniform. The key trade-off: simplicity comes at the cost of responsiveness to changing conditions.
Compare: Round Robin vs. Random—both are stateless and simple, but Round Robin guarantees even distribution while Random only achieves it probabilistically. For exam questions about deterministic vs. probabilistic load balancing, this is your key distinction.
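The deterministic-vs-probabilistic distinction is easy to see in code. Below is a minimal sketch of both algorithms, assuming a hypothetical pool of three servers named `s1`–`s3`:

```python
import itertools
import random

servers = ["s1", "s2", "s3"]  # hypothetical server pool

# Round Robin: deterministic -- cycles through servers in fixed order,
# so distribution is guaranteed even.
_rr = itertools.cycle(servers)
def round_robin():
    return next(_rr)

# Random: probabilistic -- distribution is even only in expectation,
# over many requests.
def random_choice():
    return random.choice(servers)
```

Six Round Robin calls visit each server exactly twice; six Random calls might not.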
Connection-tracking algorithms count active connections to make smarter routing decisions. They respond to actual load rather than assumed load, which makes them more adaptive but requires state maintenance.
Compare: Least Connection vs. Weighted Least Connection—both track active connections, but the weighted version accounts for server heterogeneity. If an FRQ asks about load balancing in a mixed-capacity cluster, Weighted Least Connection is your answer.
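The difference the weighting makes can be shown in a few lines. This sketch assumes hypothetical connection counts and capacity weights; note how the weighted version routes to the high-capacity server even though it has more raw connections:

```python
# Hypothetical state: active connection counts and capacity weights.
active = {"s1": 10, "s2": 4, "s3": 7}
weights = {"s1": 1, "s2": 1, "s3": 4}  # s3 assumed 4x as powerful

def least_connection():
    # Fewest raw connections wins.
    return min(active, key=lambda s: active[s])

def weighted_least_connection():
    # Normalize load by capacity: connections per unit of weight.
    return min(active, key=lambda s: active[s] / weights[s])
```

Least Connection picks `s2` (4 connections), but Weighted Least Connection picks `s3`: 7 connections spread over 4x the capacity is the lightest relative load.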
Performance-metric-based algorithms measure actual server performance to make routing decisions. They're the most responsive but require continuous monitoring infrastructure. The principle: let observed behavior, not assumptions, drive decisions.
Compare: Least Response Time vs. Least Bandwidth—both measure real-time metrics, but they optimize for different bottlenecks. Response time targets latency-sensitive applications; bandwidth targets throughput-sensitive applications. Know which metric matters for which workload type.
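A small sketch makes the "different bottlenecks" point concrete. The metric values below are invented for illustration; the same server pool yields different answers depending on which metric you optimize:

```python
# Hypothetical observed metrics per server.
metrics = {
    "s1": {"resp_ms": 120, "mbps_in_use": 300},
    "s2": {"resp_ms": 45,  "mbps_in_use": 800},
    "s3": {"resp_ms": 80,  "mbps_in_use": 150},
}

def least_response_time():
    # Optimizes latency: route to the fastest-responding server.
    return min(metrics, key=lambda s: metrics[s]["resp_ms"])

def least_bandwidth():
    # Optimizes throughput: route to the server using the least bandwidth.
    return min(metrics, key=lambda s: metrics[s]["mbps_in_use"])
```

Here the latency metric selects `s2` while the bandwidth metric selects `s3`: same cluster, different bottleneck, different routing decision.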
Some applications require that a client consistently reaches the same server—for session data, caching efficiency, or stateful protocols. Session-persistence algorithms sacrifice load optimization for this affinity.
Compare: IP Hash vs. Least Connection—IP Hash prioritizes consistency (same client always hits same server), while Least Connection prioritizes balance (requests go where load is lowest). This trade-off between affinity and fairness is a classic distributed systems question.
Real-time adaptive algorithms adjust their behavior based on observed conditions, representing the most sophisticated approach to load balancing. They treat load balancing as a continuous optimization problem rather than a fixed policy.
Compare: Dynamic vs. Adaptive Load Balancing—Dynamic reacts to current conditions; Adaptive predicts future conditions based on learned patterns. Both are real-time, but Adaptive adds a predictive layer. For questions about proactive vs. reactive resource management, this distinction matters.
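The reactive-vs-predictive distinction can be sketched with one extra moving part. Assume hypothetical load readings; the "adaptive" version adds a simple forecast (an exponentially weighted moving average here, chosen only as an illustrative predictor) on top of the current measurement:

```python
ALPHA = 0.5  # EWMA smoothing factor (assumed for illustration)

current_load = {"s1": 0.9, "s2": 0.2}   # instantaneous measurements
ewma_load = {"s1": 0.1, "s2": 0.95}     # history: s2 is usually the busy one

def dynamic_pick():
    # Reactive: route purely on the current measurement.
    return min(current_load, key=current_load.get)

def adaptive_pick():
    # Predictive: fold the current measurement into a learned forecast,
    # then route on the forecast rather than the raw snapshot.
    for s in current_load:
        ewma_load[s] = ALPHA * current_load[s] + (1 - ALPHA) * ewma_load[s]
    return min(ewma_load, key=ewma_load.get)
```

With these numbers, the dynamic policy chases the momentary dip and picks `s2`, while the adaptive policy, remembering that `s2` is usually overloaded, picks `s1`. That divergence is exactly the proactive-vs-reactive distinction an FRQ would probe.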
| Concept | Best Examples |
|---|---|
| Stateless/Simple | Round Robin, Random |
| Capacity-Aware | Weighted Round Robin, Weighted Least Connection |
| Connection-Tracking | Least Connection, Weighted Least Connection |
| Performance-Metric Based | Least Response Time, Least Bandwidth |
| Session Persistence | IP Hash |
| Real-Time Adaptation | Dynamic Load Balancing, Adaptive Load Balancing |
| Homogeneous Servers | Round Robin, Least Connection, Random |
| Heterogeneous Servers | Weighted Round Robin, Weighted Least Connection |
Which two algorithms both use server weights, and what additional factor does one consider that the other ignores?
You're designing a load balancer for a video streaming service where network bandwidth is limited but servers have similar specs. Which algorithm would you choose and why?
Compare and contrast IP Hash and Least Connection in terms of their trade-offs between session persistence and load optimization. When would you sacrifice one for the other?
A system administrator notices that Round Robin is causing some servers to become overloaded while others sit idle. What assumptions has Round Robin made that aren't holding true, and which algorithm would you recommend instead?
If an FRQ asks you to design a load balancing strategy for a cloud application with unpredictable traffic patterns and servers of varying capabilities, which algorithm category should you focus on and what monitoring infrastructure would it require?