Lock-Free Programming

Lock-Free Stack

A lock-free stack uses CAS in a retry loop to atomically update the top pointer without a mutex. push and pop are wait-free in the common case; under contention, at least one thread always makes progress. The subtle failure mode is the ABA problem, fixed with a versioned double-wide CAS.

Why does a stack work lock-free but not, say, a doubly-linked list?

Because a stack has a single mutation point, the top pointer. One CAS updates it atomically, and the rest of the structure (node contents, next pointers) is either read-only (push: new node; pop: old top) or becomes garbage. A doubly-linked list needs to atomically update both prev and next of multiple nodes, no single CAS does that. You’d need transactional memory or complex SMR protocols.

Push

class Stack {
    Node *top;
public:
    struct Node { /* data */ Node *next; };
    void push( Node &n ) {
        for ( ;; ) {                          // busy wait
            n.next = top;                     // link to current top
            if ( CAS( top, n.next, &n ) ) break;   // try to swing top
            // CAS failed ⇒ top changed since we read it ⇒ retry
        }
    }
};

CASV variant: on failure, CASV copies the changed value back into the compare arg ⇒ eliminates re-reading top at the start of each iteration (no n.next = top; reset at loop head).

Pointer walk-through. Before: top = 0x211d8 → 0x384e0. A new node n is allocated at 0x4ffb8.

             top
          0x211d8
          0x4ffb8          0x384e0
                           0x211d8            ← n.next = top (read)
                             n
          top = &n  ⇒  top = 0x4ffb8

CAS succeeds iff top == n.next, i.e. no other push/pop happened between the read and the CAS. On failure, the new top value is reloaded (or CASV writes it into n.next for free) and the loop retries.

Pop

Node *Stack::pop() {
    Node *t;
    for ( ;; ) {
        t = top;
        if ( t == nullptr ) return nullptr;   // empty
        if ( CAS( top, t, t->next ) ) return t;
    }
}

Pop walk-through with same addresses:

             top
          0x211d8
          0x211d8 ← t             (t = top)
          0x384e0 ← t->next       0x384e0
          top = t->next ⇒ top = 0x384e0

CAS succeeds iff top == t, meaning nobody else pushed or popped between the t = top read and the CAS.

⚠️ SMR required, between t = top and CAS(top, t, t->next), another thread can pop+free t. Dereferencing t->next then is undefined. See hazard pointers.

ABA vulnerability

Between t = top and CAS(top, t, t->next), other threads can pop t and push it back, leaving top == t with t->next now stale. CAS succeeds wrongly and corrupts the stack. The fix (double-wide CAS on (top, count), or LL/SC) lives in ABA Problem.