Memcache

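memcached was originally written by Brad Fitzpatrick for LiveJournal (2003). Facebook scaled it into a large distributed caching layer it calls memcache (described in the paper "Scaling Memcache at Facebook").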

Memcache sits between the web/app server and the data store, not between the end-user client (browser/mobile) and your web server.

How do we know if the cache is stale?

When the source of truth changes, the writer deletes (invalidates) the relevant cache keys.

  • Next read misses and refills with fresh data.
  • If invalidation is correct and arrives promptly, then cached data is “fresh enough” most of the time.
  • So invalidation answers: “Has something changed that should make this key wrong?”
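
A minimal sketch of that delete-on-write pattern, to make the read and write paths concrete. Everything here is an assumption for illustration: `CachedStore`, `read`, `write`, and `db_reads` are hypothetical names, and two in-process maps stand in for the database and the memcache cluster (a real setup would reach both over the network).

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins: in-process maps instead of a real database
// and a real memcache cluster.
using Store = std::unordered_map<std::string, std::string>;

struct CachedStore {
    Store db;          // source of truth
    Store cache;       // look-aside cache in front of it
    int db_reads = 0;  // counts trips to the source of truth

    // Read path: try the cache first; on a miss, read the db and refill.
    std::optional<std::string> read(const std::string& key) {
        if (auto hit = cache.find(key); hit != cache.end()) {
            return hit->second;
        }
        ++db_reads;
        auto row = db.find(key);
        if (row == db.end()) {
            return std::nullopt;
        }
        cache[key] = row->second;  // refill with fresh data
        return row->second;
    }

    // Write path: change the source of truth, then delete (invalidate)
    // the cached key so the next read misses and refills.
    void write(const std::string& key, const std::string& value) {
        db[key] = value;
        cache.erase(key);
    }
};
```

After a `write`, the first `read` of that key goes to `db` and later reads are served from `cache` until the next `write` deletes the key again.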

What about on the CPU?

On a single machine, CPU caches are hardware-coherent (on mainstream multicore systems such as x86 and ARM):

  • If core A writes to a memory location, the cache-coherence protocol ensures other cores’ cached copies are invalidated or updated before they’re allowed to read a stale value.
  • The program sees a consistent model as long as you obey the language + hardware memory model (use atomics/locks when you have data races).
  • So you don’t usually ask “is my L1 cache stale?” — the hardware makes it “as-if” it isn’t.

But as we saw in CS343, CPU caches can look “incorrect” in practice (two big cases):

  1. Data races / missing synchronization
     • If two threads access the same variable and at least one writes, without atomics/locks, the behavior can be undefined (in C/C++) or otherwise not what you expect.
     • You might see “old” values, weird reordering, etc.
     • This isn’t the cache being “wrong”; it’s the program violating the rules that make coherence + ordering meaningful.
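
A small sketch of the fix for the data-race case, assuming two threads that bump one shared counter. The function name `count_with_atomics` is made up for illustration. The racy variant (a plain `int` incremented from both threads) is described only in a comment, since running it would be undefined behavior.

```cpp
#include <atomic>
#include <thread>

// Two threads each add `per_thread` increments to one shared counter.
// With a plain `int counter` this would be a data race (undefined
// behavior in C++), and increments could be lost. std::atomic makes
// each increment an indivisible read-modify-write.
int count_with_atomics(int per_thread) {
    std::atomic<int> counter{0};

    auto work = [&counter, per_thread] {
        for (int i = 0; i < per_thread; ++i) {
            // Relaxed ordering is enough here: only the count matters,
            // and join() below orders the final load after all writes.
            counter.fetch_add(1, std::memory_order_relaxed);
        }
    };

    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();
    return counter.load();
}
```

The cache-coherence hardware is the same in both variants; what changes is that the atomic version obeys the language rules, so the result is well defined.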

For Memcache: the product tolerates “eventual correctness” within some window (often seconds). In exchange you get:

  • massive read scalability
  • low latency
  • high availability even during partial failures

For CPU: Most software is written assuming:

  • When you use mutexes/atomics, memory updates protected by them become reliably visible to other threads.
  • When you don’t, you get data races, and then all bets are off (in C/C++ it’s literally undefined behavior).
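
One way to picture the "visible in a reliable way" assumption is a release/acquire handoff. This is a sketch with hypothetical names (`publish_and_read`, `payload`, `ready`), not the only way to synchronize; a mutex around both accesses would give the same guarantee.

```cpp
#include <atomic>
#include <thread>

// A writer fills in plain data, then publishes it through an atomic flag.
// The release store and the acquire load that observes it form a
// happens-before edge, so the reader must see the finished payload.
int publish_and_read() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // the synchronization point
    int seen = 0;

    std::thread writer([&] {
        payload = 42;                                  // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });

    std::thread reader([&] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();  // wait until the flag is set
        }
        seen = payload;  // ordered after the writer's payload = 42
    });

    writer.join();
    reader.join();
    return seen;
}
```

If `ready` were a plain `bool`, both the flag and the payload would be accessed in a data race, and the guarantee would disappear.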

Granularity: coherence works on cache lines (e.g., 64 bytes), not on individual variables.
This is why you can get false sharing: two unrelated variables in the same cache line, written by different cores, cause lots of invalidations even though no data is logically shared.
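
A layout sketch of the usual mitigation, assuming a 64-byte line as in the example above (the real size depends on the CPU). `PackedCounters`, `PaddedCounters`, and `kCacheLine` are hypothetical names. The idea is to align each hot variable to its own cache line so writes by one core do not invalidate the line holding the other.

```cpp
#include <atomic>
#include <cstddef>

// Assumption: 64-byte cache lines; check the target CPU before relying on it.
constexpr std::size_t kCacheLine = 64;

// Two counters packed side by side: small enough to sit in one cache
// line, so threads updating `a` and `b` separately can still contend.
struct PackedCounters {
    std::atomic<long> a{0};
    std::atomic<long> b{0};
};

// Each counter aligned to the start of its own cache line.
struct PaddedCounters {
    alignas(kCacheLine) std::atomic<long> a{0};
    alignas(kCacheLine) std::atomic<long> b{0};
};

static_assert(sizeof(PackedCounters) <= kCacheLine,
              "packed counters fit inside a single line");
static_assert(sizeof(PaddedCounters) == 2 * kCacheLine,
              "padded counters occupy one line each");
```

The cost is memory: the padded struct is 128 bytes instead of a handful, so this is worth doing only for variables that different threads write frequently.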

https://chatgpt.com/share/697163dc-d230-8002-89ed-dba26d4b6cb2