Locking Granularity
Locking granularity refers to how much data a single lock protects. Choosing it is a trade-off between parallelism, overhead, and bug-proneness, covered in ECE459 L11.
Why does it matter?
Critical sections should be “as large as they need to be but no larger.” Too coarse kills parallelism. Too fine means deadlocks and lock-management overhead.
Coarse-grained uses few locks (maybe one):
- easier to implement
- one lock means no deadlock
- lowest memory and setup cost
- parallel program collapses to sequential
Python’s GIL locks the whole interpreter: only one thread executes Python bytecode at a time. Only I/O-bound threads see a benefit from threading, and CPU-bound threaded Python is often slower than the sequential version because of contention on the GIL. OS kernels have had similar big kernel locks; Linux had one from its first SMP support until it was fully removed in 2011 (kernel 2.6.39).
Fine-grained uses many small locks:
- maximizes parallelization
- wasted memory and setup time if the program isn’t very parallel
- exposes you to deadlocks and “did I grab the right lock?” bugs
Databases lock fields, records, or tables depending on the scope of the operation. Object-level locking also works, but watch out for transactional needs: an operation that must update several objects atomically still has to coordinate across their locks.
Sizing the critical section in Rust
In Rust, the critical section ends when the MutexGuard is dropped. Shrink it by introducing an inner scope { ... } or by calling drop(guard) explicitly.
In L11’s producer-consumer example, the guard lives by default until the end of the loop body, so every unrelated call runs with the lock held. Wrap just the buf-touching statements in an inner block that returns to_consume:
```rust
// Before — guard held for the whole loop body
let mut buf = buffer.lock().unwrap();
let current_consume_space = buf.consumer_count;
let next_consume_space = (current_consume_space + 1) % buf.buffer.len();
let to_consume = *buf.buffer.get(current_consume_space).unwrap();
buf.consumer_count = next_consume_space;
spaces.add_permits(1);    // unrelated to buf
permit.forget();          // unrelated to buf
consume_item(to_consume); // the actual work
```

```rust
// After — guard dropped after the inner block
let to_consume = {
    let mut buf = buffer.lock().unwrap();
    let current_consume_space = buf.consumer_count;
    let next_consume_space = (current_consume_space + 1) % buf.buffer.len();
    let to_consume = *buf.buffer.get(current_consume_space).unwrap();
    buf.consumer_count = next_consume_space;
    to_consume
};
spaces.add_permits(1);
permit.forget();
consume_item(to_consume);
```

With thread::sleep added to consume_item to simulate real work, hyperfine reports ~2.8 s before and ~1.1 s after. The same shrink applies symmetrically on the producer side.
Three concerns when using locks
- Overhead: memory, initialization/destruction cost, and acquire/release time, all of which scale with the number of locks
- Contention: most time spent “locking” is really time spent waiting for a lock. Shrink the critical section or split the lock
- Deadlocks: the more locks a thread holds at once, the more ways threads can form a waiting cycle