Load balancing sits at the heart of cloud computing architecture—it's the traffic cop that determines whether your distributed system performs smoothly or collapses under pressure. You're being tested on your understanding of scalability, fault tolerance, availability, and resource optimization, and load balancing techniques demonstrate all of these principles in action. Every algorithm represents a different trade-off between simplicity, intelligence, and overhead.
Don't just memorize which algorithm does what—understand why you'd choose one over another. Exam questions will present scenarios and ask you to recommend the appropriate technique, or they'll probe whether you understand the underlying mechanisms: session persistence, connection tracking, health monitoring, and weighted distribution. Know the concept each technique illustrates, and you'll handle any question they throw at you.
Static algorithms (Round Robin, Weighted Round Robin, Random) distribute traffic using predetermined rules without considering real-time server state. They're simple to implement but lack adaptability—the load balancer makes decisions without knowing how servers are actually performing.
Compare: Round Robin vs. Random—both are stateless and simple, but Round Robin guarantees even distribution while Random only achieves it probabilistically. If an exam asks about the simplest deterministic approach, Round Robin is your answer.
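A minimal Python sketch of that difference (the server names are placeholders): Round Robin steps through the pool in a fixed order, while Random picks independently on every request.

```python
import itertools
import random

servers = ["app-1", "app-2", "app-3"]  # hypothetical backend pool

# Round Robin: deterministic, cycles through the pool in order
rr_cycle = itertools.cycle(servers)
def round_robin():
    return next(rr_cycle)

# Random: stateless, distribution is even only in expectation
def random_choice():
    return random.choice(servers)

print([round_robin() for _ in range(6)])    # app-1, app-2, app-3, app-1, ...
print([random_choice() for _ in range(6)])  # order varies run to run
```

Over a long stream of requests both converge to roughly equal shares, but only Round Robin guarantees it request by request.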
Dynamic algorithms (Least Connection, Least Response Time, Least Bandwidth) make routing decisions based on real-time server metrics. They're more intelligent but require continuous monitoring infrastructure to track server state.
Compare: Least Connection vs. Least Response Time—both are dynamic, but Least Connection only counts connections while Least Response Time factors in how quickly servers respond. For FRQs about user experience optimization, Least Response Time is the stronger choice.
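A minimal sketch of Least Connection, assuming the load balancer tracks open connections itself (the counters and pool here are hypothetical): each new request goes to whichever server currently has the fewest active connections.

```python
# Active connection count per server, maintained by the load balancer
active = {"app-1": 0, "app-2": 0, "app-3": 0}

def least_connection() -> str:
    # Pick the server with the fewest open connections right now
    server = min(active, key=active.get)
    active[server] += 1   # connection opened
    return server

def release(server: str) -> None:
    active[server] -= 1   # connection closed

chosen = least_connection()
# ... proxy the request to `chosen` ...
release(chosen)
```

Least Response Time works the same way but ranks servers by a continuously measured latency (often combined with connection count) instead of raw connection totals.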
Session-persistence algorithms such as IP Hash ensure clients maintain consistent server relationships across multiple requests. They solve the session persistence problem—keeping user state intact when applications store session data locally on servers.
Compare: IP Hash vs. Least Connection—IP Hash sacrifices optimal load distribution to maintain session affinity, while Least Connection optimizes distribution but may route the same user to different servers. Choose based on whether your application requires stateful sessions or can handle stateless requests.
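A minimal Python sketch of IP Hash (pool and IP are placeholders): hashing the client IP makes the mapping deterministic, so the same client lands on the same server every time.

```python
import hashlib

servers = ["app-1", "app-2", "app-3"]  # hypothetical pool

def ip_hash(client_ip: str) -> str:
    # Same IP -> same digest -> same server, for as long as the pool is unchanged
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

print(ip_hash("203.0.113.7"))  # identical result on every call for this IP
```

The trade-offs follow directly from the code: distribution depends on how client IPs happen to hash, not on server load, and with simple modulo hashing like this, adding or removing a server remaps many clients at once (consistent hashing is the usual mitigation).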
Content-aware algorithms (URL-Based and Content-Based routing) inspect request content to make intelligent routing decisions. They operate at Layer 7 (application layer) rather than Layer 4 (transport layer), enabling application-specific optimization.
For example, requests to /api/ hit application servers while requests to /static/ hit content servers.
Compare: URL-Based vs. Content-Based—URL-Based routes on the path (simple pattern matching), while Content-Based examines the actual request content (requires parsing). URL-Based is faster; Content-Based is more precise for heterogeneous workloads.
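A minimal sketch of URL-Based routing (the path prefixes and pool names are illustrative): the balancer parses the HTTP request line at Layer 7 and matches the path against a prefix table.

```python
# Path-prefix routing table; first matching prefix wins
POOLS = {
    "/api/":    ["api-1", "api-2"],
    "/static/": ["cdn-1", "cdn-2"],
}
DEFAULT_POOL = ["web-1", "web-2"]

def route(path: str) -> list[str]:
    for prefix, pool in POOLS.items():
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL

print(route("/api/orders/42"))    # -> application server pool
print(route("/static/logo.png"))  # -> static content pool
```

Content-Based routing would go further and inspect headers or the request body (for example, Content-Type), which costs more parsing per request but allows finer-grained decisions.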
Health monitoring isn't a load balancing algorithm itself—it's the foundation that makes all other algorithms reliable. Without it, load balancers route traffic to failed servers.
Compare: Health Monitoring vs. Least Response Time—Health Monitoring is binary (healthy/unhealthy), while Least Response Time is continuous (faster/slower). Production systems use both: health checks remove dead servers, then dynamic algorithms optimize among the living.
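A minimal sketch of that two-step pattern, assuming each server exposes a /health endpoint that returns 200 when alive (the URLs and endpoint are hypothetical): health checks prune the pool, then any routing algorithm chooses among the survivors.

```python
import urllib.request

servers = ["http://app-1:8080", "http://app-2:8080", "http://app-3:8080"]  # hypothetical pool

def is_healthy(base_url: str) -> bool:
    # Binary check: a 200 response within the timeout marks the server healthy
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False

# Step 1: remove dead servers; step 2: run the routing algorithm on what's left
healthy_pool = [s for s in servers if is_healthy(s)]
```

In production the checks run continuously in the background rather than per request, so a crashed server drops out of rotation within one check interval.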
| Concept | Best Examples |
|---|---|
| Static distribution (no monitoring) | Round Robin, Weighted Round Robin, Random |
| Dynamic load awareness | Least Connection, Least Response Time, Least Bandwidth |
| Session persistence | IP Hash |
| Application-layer routing | URL-Based, Content-Based |
| Fault tolerance | Server Health Monitoring |
| Heterogeneous server pools | Weighted Round Robin, Content-Based |
| Latency optimization | Least Response Time |
| Bandwidth-intensive workloads | Least Bandwidth |
Which two algorithms are both static (require no real-time monitoring) but differ in whether distribution is deterministic or probabilistic?
A web application stores user shopping carts in server memory without shared session storage. Which load balancing technique ensures users don't lose their carts, and what's its main drawback?
Compare and contrast Least Connection and Least Bandwidth—what metric does each optimize for, and when would you choose one over the other?
Your architecture uses microservices with separate server pools for /api/, /images/, and /video/. Which load balancing approach enables this routing, and at which OSI layer does it operate?
An FRQ describes a system where the load balancer continues sending traffic to a crashed server. What mechanism is missing, and how would adding it improve the system's fault tolerance?