Load Balancing

Distributing incoming work across multiple servers so no single one is overwhelmed. Covered formally in SE464.

Resources:

AWS: What is load balancing?
AWS Elastic Load Balancing
Mastering Chaos: A Netflix Guide to Microservices
Wikipedia: Load balancing (computing)

From ECE459 L31

Typical “server farm” shape: $n$ approximately identical servers behind a dispatcher that assigns jobs. Alternative: work stealing, where jobs are reassigned after the fact when one server's queue piles up.

Task-assignment policies [HB13]

Random: each job goes to a randomly chosen server
Round-robin: job $i$ goes to server $i \bmod n$
Shortest-Queue: pick the server with the shortest queue
Size-Interval-Task-Assignment (SITA): each server handles one range of job sizes, e.g. short jobs to one server, medium to another, long to a third
Least-Work-Left: pick the server with the smallest sum of remaining work
Central-Queue: servers pull from a shared queue when idle (no direct assignment)
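The dispatcher-side policies above can be sketched as small selection functions. This is a minimal illustration, not the formulation from [HB13]: the function names, the `cutoffs` parameter for SITA, and the tie-breaking rule (lowest server index wins) are all assumptions made for the example. Central-Queue is left out because it involves no dispatcher decision.

```python
import bisect
import random
from typing import Sequence


def pick_random(n: int, rng: random.Random) -> int:
    """Random: any of the n servers, chosen uniformly (assumed distribution)."""
    return rng.randrange(n)


def pick_round_robin(i: int, n: int) -> int:
    """Round-robin: job i goes to server i mod n."""
    return i % n


def pick_shortest_queue(queue_lengths: Sequence[int]) -> int:
    """Shortest-Queue: server with the fewest queued jobs (ties: lowest index)."""
    return min(range(len(queue_lengths)), key=queue_lengths.__getitem__)


def pick_least_work_left(work_left: Sequence[float]) -> int:
    """Least-Work-Left: server with the smallest total remaining work."""
    return min(range(len(work_left)), key=work_left.__getitem__)


def pick_sita(job_size: float, cutoffs: Sequence[float]) -> int:
    """SITA: server k handles sizes in [cutoffs[k-1], cutoffs[k]).

    `cutoffs` is a hypothetical ascending list of size boundaries;
    len(cutoffs) + 1 servers are assumed.
    """
    return bisect.bisect_right(cutoffs, job_size)
```

Note that Shortest-Queue and Least-Work-Left differ only in what the dispatcher measures: number of jobs versus total job size, which requires knowing (or estimating) job sizes in advance.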

Which policy minimizes mean response time? Nobody knows. It depends on job-size variability and other factors, and the question is under-studied. Jeff: “PhD, anyone?”
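One way to explore the question empirically is to replay a job trace under different policies and compare mean response times. The sketch below is a toy model under stated assumptions, not anything from the lecture: each server is first-come-first-served, job sizes are known on arrival, and the `policy(i, size, work_left)` signature is invented for the example.

```python
from typing import Callable, List, Sequence, Tuple

Job = Tuple[float, float]  # (arrival_time, size), sorted by arrival_time
Policy = Callable[[int, float, Sequence[float]], int]


def mean_response_time(jobs: Sequence[Job], n: int, policy: Policy) -> float:
    """Replay `jobs` on n FCFS servers and return the mean response time.

    Response time = completion time - arrival time.
    """
    if not jobs:
        raise ValueError("need at least one job")
    free_at: List[float] = [0.0] * n  # when each server clears its backlog
    total = 0.0
    for i, (arrival, size) in enumerate(jobs):
        # Remaining work on each server as seen at this arrival.
        work_left = [max(0.0, t - arrival) for t in free_at]
        s = policy(i, size, work_left)
        start = max(arrival, free_at[s])
        free_at[s] = start + size
        total += free_at[s] - arrival
    return total / len(jobs)


def round_robin(i: int, size: float, work_left: Sequence[float]) -> int:
    return i % len(work_left)


def least_work_left(i: int, size: float, work_left: Sequence[float]) -> int:
    return min(range(len(work_left)), key=work_left.__getitem__)
```

For the trace `[(0.0, 10.0), (0.0, 1.0), (1.0, 1.0)]` on two servers, round-robin queues the third job behind the long one and gives a mean of 7.0, while least-work-left gives 4.0. A single hand-picked trace says nothing general; the ranking can change with the job-size distribution, which is the point of the open question.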

Queueing Theory
Microservices

🛠️ Steven Gong

