CS343

Eliding

Eliding is a sequential optimization where the compiler removes “unnecessary” operations — loads by caching a value in a register, writes that appear immediately overwritten, loops with no body, sleep(1) in a sequential context. Safe when the code is single-threaded; catastrophic when another thread mutates the variable.

Why is this a concurrency bug and not just a compiler bug?

Because the compiler is playing by sequential rules — “I read flag once, then looked at it in a loop; no one else touches it in this function, so cache it in a register.” From the sequential perspective that’s correct. The compiler doesn’t know another thread can modify the variable. The fix isn’t to disable the optimization globally — it’s to mark the variable so the compiler knows it’s shared.

The classic bug

// Task 1                            Task 2
flag = false;                        register = flag;      // load once
                                     while ( register );   // spin on register!

The register is a private copy — T1’s write to memory never affects T2’s loop variable. T2 spins forever.

Other elision examples

x = 0;              // "unnecessary — immediate change"
x = 3;              // compiler keeps only this
 
for ( int i = 0; i < 10000; i += 1 ) {}   // no body ⇒ deleted
 
sleep( 1 );         // sequential program doesn't need delays
// but in a concurrent program, the delay may be critical for signal visibility

Fixes

  • volatile qualifier (C/C++) — forces loads/stores to/from registers at every sequence point. Created for longjmp and memory-mapped I/O. Prevents elision but does not prevent reordering or cache effects.
  • Java volatile / C++11 std::atomic — stronger: prevents both elision and disjoint reordering ⇒ sequential consistency for that variable.
  • Wrap the variable in an atomic type when it’s truly shared.

On architectures with few registers

Practically all variables end up implicitly volatile anyway because they don’t fit in registers and get reloaded on every use. This hides the bug on such machines — until you port to x86-64 with 16 registers and it explodes.