Compare-and-Swap Instruction
Compare-and-swap atomically reads a memory location, compares it to an expected value, and writes a new value if they match, all in one uninterruptible step. Also called compare-and-exchange (the cmpxchg instruction on x86). Covered in ECE459 L12.
CAS returns bool. If val == comp, write nval; else nothing:
bool CAS( int &val, int comp, int nval ) {
// atomic
if ( val == comp ) { val = nval; return true; }
return false;
}
// spinlock:
while ( ! CAS( Lock, OPEN, CLOSED ) );CASV (compare-and-set-value): on failure, copies the current value of val back into comp by reference. Saves a reload in retry loops. In Stack::push, CAS failure needs n.next = top; at the top of each iteration to refresh, whereas CASV writes the fresh top into n.next for free:
bool CASV( int &val, int &comp, int nval ) {
if ( val == comp ) { val = nval; return true; }
comp = val;
return false;
}CASVD (compare-and-set-value-double-wide): operates on a 64/128-bit atom for a 32/64-bit machine. Lets you pack (pointer, counter) into a single atomic operation, which solves ABA probabilistically for lock-free structures:
bool CASVD( uintS_t &val, uintS_t &comp, uintS_t nval ) {
if ( val == comp ) { val = nval; return true; }
comp = val;
return false;
}See Lock-Free Stack for CASVD use on (top, count).
Lock-Free Programming
Why is CAS the workhorse of lock-free code?
It packs read, compare, and write into one atomic instruction. That’s enough to build a retry-on-conflict loop without any explicit locking.
The canonical CAS loop: read the current value, compute the new value, CAS; on conflict reread and retry.
loop {
let old = atomic.load(Ordering::Acquire);
let new = f(old);
if atomic.compare_exchange(old, new, Ordering::AcqRel, Ordering::Acquire).is_ok() {
break;
}
}Only one concurrent CAS on a given location succeeds, so the system as a whole always makes progress even if an individual thread has to spin.
Where CAS shows up:
- Head and tail pointer updates in lock-free stacks and queues (see Lock-Free Stack)
- Flipping a lock flag from free to held (the spinlock example below)
- Installing a singleton pointer exactly once for lazy initialization
- Atomic counters when
fetch_addisn’t expressive enough, e.g. saturating or bounded increments - Work-stealing deques in concurrent schedulers
Rust: compare_and_swap(expected, new, Ordering) on atomic types. Also swap (no comparison, just returns old value), and the more advanced compare_exchange and compare_exchange_weak.
Spinlock via CAS
use std::sync::atomic::{AtomicBool, Ordering, spin_loop_hint};
let my_lock = AtomicBool::new(false);
while my_lock.compare_and_swap(false, true, Ordering::SeqCst) == true {
spin_loop_hint();
}
// critical section
my_lock.store(false, Ordering::SeqCst);spin_loop_hint() tells the CPU it’s fine to yield a hyperthread or enter low-power mode while spinning. Spinlocks only make sense when the expected wait is shorter than two context switches.
ABA false positives
CAS can succeed even when the value round-tripped A→B→A while you weren’t looking. See ABA Problem.