Cache Consistency
Cache consistency answers when a write becomes visible to other cores — a timing question, distinct from coherence (which is about whether duplicates eventually agree).
Why?
Coherence alone isn’t enough. Even if the protocol guarantees every cache will eventually hold the latest value, there’s still a gap between “core 0 writes” and “core 3 can read it.” During that gap, a read racing a write (R/W) can flicker between the old and the new value, and two racing writes (W/W) can be seen in different orders by different cores. Consistency is the rule, binding on both the writer and the readers, for what may be observed during that gap.
Eager vs. lazy
Two points on the design space (CS 343 §10.2.2):
| | Writer waits for ack? | Reader sees | Cost |
|---|---|---|---|
| Eager | yes, from all cores | always fresh | complex, expensive, stalls the writer |
| Lazy | no — writer proceeds after updating its own cache | may read stale data briefly | cheap, what real hardware ships |
Eager is the “obvious” correct thing, but it serializes every write behind an all-cores round-trip. Unusable at scale.
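To make the two rows concrete, here is a toy Python model of the trade-off. It is a sketch under loud assumptions: `ToyCaches` and its methods are hypothetical names, each cache is just a dictionary, and "waiting for an ack" is reduced to counting the cores the writer would have to hear back from.

```python
class ToyCaches:
    """One private copy of memory per core; missing addresses read as 0."""

    def __init__(self, num_cores):
        self.copies = [{} for _ in range(num_cores)]
        self.in_flight = []  # lazy writes not yet delivered to the other cores

    def read(self, core, addr):
        return self.copies[core].get(addr, 0)

    def eager_write(self, writer, addr, value):
        """Update every copy before returning; return how many acks were awaited."""
        acks = 0
        for core, copy in enumerate(self.copies):
            copy[addr] = value
            if core != writer:
                acks += 1  # the writer stalls until this core acknowledges
        return acks

    def lazy_write(self, writer, addr, value):
        """Update only the writer's copy; the writer waits for no acks."""
        self.copies[writer][addr] = value
        self.in_flight.append((addr, value))
        return 0

    def deliver(self):
        """Background propagation: apply in-flight writes to every copy, in order."""
        for addr, value in self.in_flight:
            for copy in self.copies:
                copy[addr] = value
        self.in_flight.clear()


caches = ToyCaches(num_cores=4)
eager_acks = caches.eager_write(writer=0, addr="x", value=1)
fresh = caches.read(core=3, addr="x")       # core 3 already sees the new x
lazy_acks = caches.lazy_write(writer=0, addr="y", value=7)
stale = caches.read(core=3, addr="y")       # core 3 still sees the old y
caches.deliver()
caught_up = caches.read(core=3, addr="y")   # core 3 sees the new y after delivery
```

In this model the eager writer's stall grows with the core count, while the lazy writer never waits and the reader pays with a window of staleness instead.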
How is lazy actually implemented?
A small FIFO (the store buffer) sits between the core and L1. On a store, the value goes into the buffer and the core retires the instruction immediately; the buffer drains into L1 (and fires off coherence messages) in the background. Other cores only see the write once the drain completes.
A memory fence opts back into eager behaviour at that point in the program: the core drains its store buffer and waits for the acks before any later memory operation executes.
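The store buffer and the fence can be sketched as a small deterministic simulation. This is an illustrative model, not a description of any specific CPU: the `Core` class is hypothetical, a shared dictionary stands in for "L1 plus coherence", and the model assumes a core can read its own buffered stores (store-to-load forwarding), which the text above does not cover.

```python
from collections import deque


class Core:
    """A core with a FIFO store buffer in front of globally visible memory."""

    def __init__(self, shared_memory):
        self.memory = shared_memory   # stand-in for L1 + coherence traffic
        self.store_buffer = deque()   # pending (addr, value) pairs, oldest first

    def store(self, addr, value):
        # The store "retires" at once; nothing is visible to other cores yet.
        self.store_buffer.append((addr, value))

    def load(self, addr):
        # Assumption: a core sees its own newest buffered store to this address.
        for buffered_addr, value in reversed(self.store_buffer):
            if buffered_addr == addr:
                return value
        return self.memory.get(addr, 0)

    def drain_one(self):
        # Background drain: the oldest buffered store becomes globally visible.
        if self.store_buffer:
            addr, value = self.store_buffer.popleft()
            self.memory[addr] = value

    def fence(self):
        # Fence: do not continue until every buffered store is visible.
        while self.store_buffer:
            self.drain_one()


def store_buffering_litmus(use_fence):
    """Each core writes one flag, then reads the other core's flag."""
    memory = {}
    core0, core1 = Core(memory), Core(memory)
    core0.store("x", 1)
    core1.store("y", 1)
    if use_fence:
        core0.fence()
        core1.fence()
    return core0.load("y"), core1.load("x")


without_fence = store_buffering_litmus(use_fence=False)
with_fence = store_buffering_litmus(use_fence=True)
```

Without the fence, both cores read 0 because each write is still sitting in its owner's buffer; with the fence, both read 1. The interleaving here is fixed by the simulation, whereas real hardware would only show the stale result some of the time.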
What lazy still guarantees
Lazy sounds broken — readers can see stale data — but it’s not a free-for-all:
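- Writes from one core eventually become visible in (largely) the same order as issued: a FIFO store buffer drains in program order, and only the weaker models, such as PSO, let stores to different addresses overtake each other.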
- Inside a critical section, the write to the shared variable lands before the write releasing the lock. So a reader who acquires the lock sees the update. This is why locks “just work” even under lazy consistency.
- Outside a lock, a reader that needs to see a specific write has to spin until it shows up; a fence on the writer’s side, issued right after the store, bounds how long that takes.
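The lock guarantee in the list above follows from drain order, which a short simulation can show. This is a sketch with hypothetical names (`run_critical_section`, the `fifo` switch) and a simplified lock encoded as 1 for held and 0 for free; the non-FIFO branch models imaginary hardware purely for contrast.

```python
from collections import deque


def run_critical_section(fifo=True):
    """Writer updates `shared`, then releases the lock; reader spins on the lock.

    Returns (spins, value of `shared` seen by the reader after the lock is free).
    """
    memory = {"lock": 1, "shared": 0}   # the writer already holds the lock
    store_buffer = deque()

    # Writer, inside its critical section: both stores only reach the buffer.
    store_buffer.append(("shared", 42))
    store_buffer.append(("lock", 0))    # the release is issued last

    spins = 0
    while memory["lock"] != 0:          # reader spins until the lock looks free
        spins += 1
        # One buffered store drains per spin: oldest first if FIFO,
        # newest first in the hypothetical non-FIFO case.
        addr, value = store_buffer.popleft() if fifo else store_buffer.pop()
        memory[addr] = value

    return spins, memory["shared"]


in_order = run_critical_section(fifo=True)
out_of_order = run_critical_section(fifo=False)
```

With FIFO draining, the reader spins twice and then finds `shared` equal to 42, because the data write became visible before the release. If the release were allowed to drain first, the reader would get past the lock after one spin and still read the stale 0. That failure is the one the ordering guarantee rules out.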
This is why the memory consistency model treats lazy caches as a relaxation axis (the “Lazy cache” column in the TSO/PSO/WO/RC table). Lazy consistency is the hardware behaviour; the memory model is the programmer-visible contract layered on top.
Coherence vs. consistency at a glance
| | Question | Mechanism |
|---|---|---|
| Coherence | Do all cached copies eventually agree? | invalidate / broadcast protocol |
| Consistency | When does a write become visible? | eager / lazy ack policy |
You need both. A coherent-but-inconsistent cache would eventually converge but let readers see arbitrary intermediate states. A consistent-but-incoherent cache would sync writes in a tight window but then diverge forever.