Distributed Cache

Learned in SE464.

Examples: Redis, Memcached, ElastiCache, Riak.

  • Originally developed to reduce load on relational databases
  • Cache responses to frequent DB requests or other materialized application data
  • Always support timed expiration of data
  • Use the same basic key-value abstraction as NoSQL distributed DBs
  • Store data across many nodes
  • Have the same data consistency issues as NoSQL databases
  • Often optimized to do everything in memory, though many can also persist to disk
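
A minimal sketch of the first three points, assuming a single in-process dictionary stands in for a cache node. The names TTLCache, cached_query and load_from_db are hypothetical and do not come from Redis or Memcached. It shows a cache-aside read with timed expiration: a hit is served from memory, while a miss or an expired entry falls through to the database.

```python
import time


class TTLCache:
    """In-process stand-in for a cache node: key -> (value, expiry time)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so expiry can be exercised in tests
        self.store = {}

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.store[key]  # lazily drop the expired entry
            return None
        return value

    def set(self, key, value):
        self.store[key] = (value, self.clock() + self.ttl)


def cached_query(cache, key, load_from_db):
    """Cache-aside read: try the cache, fall back to the DB on a miss.

    Simplification: a stored value of None is treated as a miss.
    """
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)  # the expensive call the cache protects
        cache.set(key, value)
    return value
```

Calling cached_query twice with the same key inside the TTL window should reach load_from_db only once; after the TTL passes, the next call reloads from the database.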

Distributed Cache vs NoSQL

Very similar in architecture, but differ in goals:

NoSQL Database

  • Items are permanent/persistent
  • All items stored on disk (some cached in RAM)
  • Scale is the primary goal

Distributed Cache

  • Items expire
  • Items stored in RAM (though maybe persisted to disk)
  • Speed is the primary goal
  • RAM capacity is limited; once reached, start evicting oldest/least-used items
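
The eviction rule in the last point can be sketched as a least-recently-used (LRU) policy. This is one common choice and an assumption here, since the notes only say "oldest/least-used"; the class name LRUCache and the capacity counted in items (rather than bytes of RAM) are also simplifications for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used item."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # ordered from least to most recently used

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used
```

With a capacity of 2, setting a and b, reading a, and then setting c evicts b, because a was touched more recently.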

Example use cases (Redis)

Real-time stock prices, real-time analytics, leaderboards, real-time communication, anywhere Memcached was previously used.

Keeping nodes in sync

Three coarse strategies (SE464 framing):

  • Expire entries after a TTL.
  • Coherent cache: push new data or invalidations to the other nodes on every write. Hardware cache coherence protocols (MSI/MESI) carry over directly.
  • Versioned data: never mutate; new data gets a new name (filename/URL).
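
The versioned-data strategy can be sketched as follows, assuming the new name is derived from a hash of the content. That naming scheme, and the helper names versioned_name and publish, are hypothetical choices for illustration. Because an entry is never overwritten, a cached copy can never be stale and no invalidation messages are needed.

```python
import hashlib


def versioned_name(base, content):
    """Derive an immutable name from the content (hypothetical scheme)."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    return f"{base}.{digest}"


cache = {}  # stand-in for any cache node; entries are never mutated


def publish(base, content):
    """Store content under a fresh name and return that name."""
    name = versioned_name(base, content)
    cache[name] = content
    return name


v1 = publish("app.js", b"console.log(1)")
v2 = publish("app.js", b"console.log(2)")
# v1 and v2 are different names, and cache[v1] still holds the old content.
```

Readers that still hold the old name keep seeing a consistent old version; they pick up the change only once they are handed the new name.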

The L08 exam question works through pseudocode for a write-back, MESI-style software cache spread across nodes and backed by a DB:

  • Get: on a local hit, return the cached value; otherwise ask peers (a peer holding the item moves it to Shared, first writing back to the DB if it was Modified); otherwise fall through to the DB and cache the item as Exclusive.
  • Update: Shared → broadcast an invalidate, then go Modified; Exclusive → go Modified silently (no messages); Modified → no state change.
  • Shutdown: flush all Modified items to the DB, then send leave messages.
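
The three operations above can be sketched in Python. This is a single-process approximation and not the exam's reference answer: peers are reached by direct method access instead of network messages, a shared list stands in for cluster membership, and a dict stands in for the DB. Updating a key that is not cached locally is not covered by the notes; the sketch assumes it behaves like the Shared case (invalidate peers, then go Modified).

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"


class Node:
    def __init__(self, db, cluster):
        self.db = db            # dict standing in for the backing DB
        self.cluster = cluster  # shared list of live nodes (hypothetical)
        self.cache = {}         # key -> [value, State]; absent means Invalid
        cluster.append(self)

    def peers(self):
        return [n for n in self.cluster if n is not self]

    def get(self, key):
        if key in self.cache:                 # local hit
            return self.cache[key][0]
        for peer in self.peers():             # ask peers
            entry = peer.cache.get(key)
            if entry is not None:
                if entry[1] is State.MODIFIED:
                    self.db[key] = entry[0]   # write back the dirty value
                entry[1] = State.SHARED       # owner moves M/E -> S
                self.cache[key] = [entry[0], State.SHARED]
                return entry[0]
        value = self.db[key]                  # fall through to the DB
        self.cache[key] = [value, State.EXCLUSIVE]
        return value

    def update(self, key, value):
        entry = self.cache.get(key)
        # Assumption: an uncached (Invalid) key is handled like Shared.
        if entry is None or entry[1] is State.SHARED:
            for peer in self.peers():
                peer.cache.pop(key, None)     # broadcast invalidate
        # Exclusive -> Modified silently; Modified stays Modified.
        self.cache[key] = [value, State.MODIFIED]

    def shutdown(self):
        for key, (value, state) in self.cache.items():
            if state is State.MODIFIED:
                self.db[key] = value          # flush dirty items to the DB
        self.cache.clear()
        self.cluster.remove(self)             # stands in for leave messages
```

Because the cache is write-back, the DB only sees a new value when a Modified item is requested by a peer or when its owner shuts down; until then the DB copy may be stale.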