M/M/k Queue
Poisson arrivals (rate ), exponential service times (rate per server), identical servers sharing one FIFO queue.
Utilization per server:
The probability of all servers busy (thus arrivals having to queue) is the Erlang C formula:
From it:
Key practical result: shared queue beats sharded queue. Given servers handling arrival rate :
- One queue, servers: average wait is much smaller.
- independent M/M/1’s, each seeing : average wait is much larger.
Why: with independent queues you get stuck behind one slow job; in a shared queue any free server takes the next customer. This is why supermarkets moved to single-line, multi-register. Also why server farms put work on a shared queue, not round-robin to per-server queues.
From ECE459 L32
Per-server utilization . Intermediate shorthand (no intrinsic meaning, just keeps formulas tidy):
Probability a new job has to wait (all servers busy):
Completion and queue length:
Printer example continued [Wil13b]
From 1: one slow printer min; one 2× printer .
Two slow printers (, ):
- .
- .
- min.
Two takeaways: (1) doubling printers made jobs ~4× faster, not 2×. (2) the single fast printer still wins at low utilization; it processes each job faster. The two-printer solution pulls ahead as utilization climbs toward 100%.