Cache Coherency

Directory-Based Cache Coherence

A cache-coherence scheme in which a hardware directory records, for each cache line, which cores hold a copy. On a write, the directory is consulted and invalidate (or update) messages are sent only to the cores that actually hold the line.

The alternative is snooping.

Why?

Snooping broadcasts every coherence transaction to every cache, which tends to saturate a shared bus beyond roughly 8 cores. Directories scale to dozens or hundreds of cores by turning broadcasts into targeted point-to-point messages, at the cost of storing and maintaining the directory itself.
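The scaling argument can be sketched with a deliberately simplified message count. This is a toy model, not a measurement: it assumes one message per snooped cache, one request to the directory plus one invalidate per sharer, and it ignores acknowledgements and data transfers. The function names are hypothetical.

```python
def snoop_messages(num_cores: int) -> int:
    """Broadcast snooping: every other cache sees the writer's request."""
    return num_cores - 1


def directory_messages(num_sharers: int) -> int:
    """Directory: one request to the directory, then one invalidate per sharer."""
    return 1 + num_sharers


if __name__ == "__main__":
    # Assumption: the line being written is shared by 2 other cores.
    for cores in (4, 8, 64, 256):
        print(f"{cores:>3} cores: snoop={snoop_messages(cores):>3}  "
              f"directory={directory_messages(2)}")
```

In this model the snooping cost grows with the core count, while the directory cost depends only on how many cores actually share the line.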

A directory entry is literally a row keyed by cache-line address:

Line address   State      Sharers (bitmap)
0xA340         Shared     [1, 0, 1, 0, 0, 0, 1, 0] → cores 0, 2, 6
0xB800         Modified   [0, 0, 0, 1, 0, 0, 0, 0] → core 3, dirty

On every load, eviction, or write that changes who holds a line, the directory has to be updated. The directory is typically sharded across memory controllers or co-located with the shared L3 slices.
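A minimal software sketch of such a directory, assuming a simple three-state (Invalid / Shared / Modified) protocol and an 8-core system. The class and method names are hypothetical, and real hardware also handles write-backs, acknowledgements, and races that are omitted here. The bitmap is stored as an integer in which bit i stands for core i.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    INVALID = "Invalid"
    SHARED = "Shared"
    MODIFIED = "Modified"


@dataclass
class Entry:
    state: State = State.INVALID
    sharers: int = 0  # bitmap: bit i set means core i holds a copy


class Directory:
    def __init__(self, num_cores=8):
        self.num_cores = num_cores
        self.entries = {}  # line address -> Entry

    def sharers_of(self, addr):
        """Decode the bitmap into a list of core ids."""
        bits = self.entries.get(addr, Entry()).sharers
        return [c for c in range(self.num_cores) if (bits >> c) & 1]

    def load(self, core, addr):
        """A read miss: add the core as a sharer."""
        entry = self.entries.setdefault(addr, Entry())
        if entry.state is State.MODIFIED and entry.sharers == 1 << core:
            return  # the owner already has the line; nothing changes
        # Assumption: a previous owner writes back and keeps a shared copy.
        entry.sharers |= 1 << core
        entry.state = State.SHARED

    def write(self, core, addr):
        """A write: return the cores to invalidate, then record the new owner."""
        entry = self.entries.setdefault(addr, Entry())
        targets = [c for c in self.sharers_of(addr) if c != core]
        entry.sharers = 1 << core
        entry.state = State.MODIFIED
        return targets

    def evict(self, core, addr):
        """An eviction: clear the core's bit, drop the entry if nobody is left."""
        entry = self.entries.get(addr)
        if entry is None:
            return
        entry.sharers &= ~(1 << core)
        if entry.sharers == 0:
            del self.entries[addr]


if __name__ == "__main__":
    d = Directory()
    for core in (0, 2, 6):
        d.load(core, 0xA340)
    print(d.sharers_of(0xA340))   # [0, 2, 6]
    print(d.write(3, 0xA340))     # invalidates go to [0, 2, 6]
    print(d.sharers_of(0xA340))   # [3]
```

The write path is where the targeted messaging shows up: only the cores returned by `write` receive an invalidate, rather than every cache in the system.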

Email analogy

  • Snooping = reply-all on a huge mailing list. Everyone gets every message; most ignore it.
  • Directory = check a contact list first, then send DMs only to the cores that hold the line.

Modern many-core CPUs use directory protocols or hybrid “snoop filter” schemes (a directory-like filter that lets snoops skip cores guaranteed not to have the line).