Lock-Free Programming
A lock-free data structure still has critical sections, but they are executed without lock ownership: each operation commits its whole mutation with a single atomic read-modify-write (typically CAS) inside a retry loop, so the commit point is one constant-time atomic instruction (Buhr §11.1). No thread ever holds a lock.
Why call it lock-free if you're still spinning?
Because no lock is owned, no thread can acquire one, die, and leave others stranded. But Buhr is explicit that lock-free is still locking (the name is a misnomer): you’re spinning on a conceptual lock implemented via CAS, which is still busy-waiting. The “free” means no ownership, not no waiting.
The distinguishing properties come directly from that no-ownership shape:
- no deadlock possible: hold-and-wait can’t happen without ownership
- no underutilization on preemption: an interrupted thread doesn’t block others; they keep finishing their operations
- limited set of critical sections: the whole mutation has to fit in one (possibly double-wide) atomic op. Stack, single-producer-single-consumer queue, counters: yes. Arbitrary tree rotation: no
- no progress guarantee per thread: an unlucky thread can lose every CAS race forever. If you need that guarantee, go to wait-free
- breaks rule 5 without SMR: freed nodes plus dereference races eventually cause OOM or segfault, violating bounded waiting for the memory allocator
Core pattern: CAS retry loop
“If nobody changed this out from under me, commit; else reread and try again.” Canonical lock-free stack push from Buhr §11.1.2:
```cpp
void Stack::push( Node &n ) {
    for ( ;; ) {                              // busy wait
        n.next = top;                         // link to current top
        if ( CAS( top, n.next, &n ) ) break;  // commit if top unchanged
    }
}
```

The three CAS variants from §11.1 differ in what they do on failure:
- `CAS( val, comp, nval )`: returns `bool`; on failure the retry has to re-read
- `CASV( val, &comp, nval )`: on failure writes the current value into `comp`, saving the `n.next = top` reload
- `CASVD( ... )`: double-wide, lets you pack `(pointer, counter)` and defeat ABA
See Lock-Free Stack for the full push/pop walkthrough.
Lock vs lock-free (Buhr §11.2 summary)
| | lock | lock-free |
|---|---|---|
| critical section scope | arbitrary | fits in one atomic op |
| deadlock | possible | impossible |
| interrupted holder | stalls everyone | others progress |
| memory reclamation | trivial | needs SMR |
| performance | comparable in general | comparable in general |
The “no performance difference in general” is Buhr’s point: reach for lock-free because you need the progress properties, not because you think it’s faster.
Lock-free ≠ wait-free
- Lock-free: some thread always makes progress, individuals can starve
- Wait-free: every thread finishes in a bounded number of steps
The CAS retry loop above is lock-free but not wait-free.
Data structures worth knowing
- Lock-Free Stack: canonical example, push/pop via CAS on `top`
- Lock-free queue: Michael-Scott, or a SPSC ring buffer; needs more care than the stack
- Lock-free counter: just `fetch_add` (also wait-free)
ECE459 framing
ECE459 L12 frames lock-free as: “if any thread suspended mid-operation still lets others finish theirs.” It also notes that restrictions are allowed: a queue permitting one appender and one remover, but not two removers, is still lock-free.
Lock-free is not about speed
Lock-free guarantees forward progress, not throughput. Sometimes CAS is faster than lock/unlock; sometimes the algorithm ends up slower. Reach for it when you must (signal handlers, thread-death tolerance, progress guarantees), not because it’s “fast.”