M/M/k Queue

Poisson arrivals (rate ), exponential service times (rate per server), identical servers sharing one FIFO queue.

Utilization per server:

The probability of all servers busy (thus arrivals having to queue) is the Erlang C formula:

From it:

Key practical result: shared queue beats sharded queue. Given servers handling arrival rate :

  • One queue, servers: average wait is much smaller.
  • independent M/M/1’s, each seeing : average wait is much larger.

Why: with independent queues you get stuck behind one slow job; in a shared queue any free server takes the next customer. This is why supermarkets moved to single-line, multi-register. Also why server farms put work on a shared queue, not round-robin to per-server queues.

From ECE459 L32

Per-server utilization . Intermediate shorthand (no intrinsic meaning, just keeps formulas tidy):

Probability a new job has to wait (all servers busy):

Completion and queue length:

Printer example continued [Wil13b]

From 1: one slow printer min; one 2× printer .

Two slow printers (, ):

  • .
  • .
  • min.

Two takeaways: (1) doubling printers made jobs ~4× faster, not 2×. (2) the single fast printer still wins at low utilization; it processes each job faster. The two-printer solution pulls ahead as utilization climbs toward 100%.