Concurrency

Lock-Free Programming

A lock-free data structure has operations that are critical sections, but they are performed without ownership: each operation commits in a single constant-time atomic step (Buhr §11.1). No thread holds a lock; mutation happens via a single atomic read-modify-write (typically CAS) in a retry loop.

Why call it lock-free if you're still spinning?

Because there’s no owned lock, no thread can hold one, die, and leave others stranded. But Buhr is explicit that “lock-free” is a misnomer: it is still locking. You’re spinning on a conceptual lock implemented via CAS, which is still busy-waiting. The “free” means no ownership, not no waiting.

The distinguishing properties come directly from that no-ownership shape:

  • no deadlock possible: hold-and-wait can’t happen without ownership
  • no underutilization on preemption: an interrupted thread doesn’t block others, who keep finishing their operations
  • limited set of critical sections: the whole mutation has to fit in one (possibly double-wide) atomic op. Stack, single-producer-single-consumer queue, counters: yes. Arbitrary tree rotation: no
  • no progress guarantee per thread: an unlucky thread can lose every CAS race forever. If you need that guarantee, go to wait-free
  • breaks rule 5 without SMR: freed nodes and dereference races eventually cause OOM or segfault, violating bounded waiting for the memory allocator

Core pattern: CAS retry loop

“If nobody changed this out from under me, commit; else reread and try again.” Canonical lock-free stack push from Buhr §11.1.2:

void Stack::push( Node &n ) {
    for ( ;; ) {                          // busy wait
        n.next = top;                     // link to current top
        if ( CAS( top, n.next, &n ) ) break;
    }
}

The three CAS variants from §11.1 differ in what they do on failure:

  • CAS(val, comp, nval): returns bool; the retry has to re-read
  • CASV(val, &comp, nval): on failure writes the current value into comp, saving the n.next = top reload
  • CASVD(...): double-wide; lets you pack (pointer, counter) and defeat ABA

See Lock-Free Stack for the full push/pop walkthrough.

Lock vs lock-free (Buhr §11.2 summary)

                          lock                     lock-free
  critical section scope  arbitrary                fits in one atomic op
  deadlock                possible                 impossible
  interrupted holder      stalls everyone          others progress
  memory reclamation      trivial                  needs SMR
  performance             comparable in general    comparable in general

The “no performance difference in general” is Buhr’s point: reach for lock-free because you need the progress properties, not because you think it’s faster.

Lock-free ≠ wait-free

  • Lock-free: some thread always makes progress; individuals can starve
  • Wait-free: every thread finishes in a bounded number of steps

The CAS retry loop above is lock-free but not wait-free.

Data structures worth knowing

  • Lock-Free Stack: the canonical example; push/pop via CAS on top
  • Lock-free queue: Michael-Scott (multi-producer/multi-consumer) or a simpler SPSC ring; needs more care than the stack
  • Lock-free counter: just fetch_add (also wait-free)

ECE459 framing

ECE459 L12 frames lock-free as: “if any thread suspended mid-operation still lets others finish theirs.” It also notes that restrictions are allowed: a queue permitting one appender and one remover, but not two removers, is still lock-free.

Lock-free is not about speed

Lock-free guarantees forward progress, not throughput. Sometimes CAS is faster than lock/unlock; sometimes the algorithm ends up slower. Reach for it when you must (signal handlers, thread-death tolerance, progress guarantees), not because it’s “fast.”

Resources: