Distributed Cache
Learned in SE464.
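Examples: Redis, Memcached, ElastiCache (AWS-managed Redis/Memcached), Riak (primarily a distributed key-value database, but sometimes listed here).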
- Originally developed to reduce load on relational databases
- Cache responses to frequent DB requests or other materialized application data
- Always support timed expiration of data
- Use the same basic key-value abstraction as NoSQL distributed DBs
- Store data across many nodes
- Have the same data consistency issues as NoSQL databases
- Often optimized to do everything in-memory; some (e.g. Redis) can also persist to disk, while others (e.g. Memcached) are memory-only
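A minimal sketch of the "cache responses to frequent DB requests" idea with timed expiration, assuming a single in-process dictionary as a stand-in for the distributed cache. The names `TTLCache`, `cached_query`, and `query_db` are hypothetical and not taken from any of the systems above.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Single-node stand-in for a key-value cache with timed expiration."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # entry has expired: treat it as a miss
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._store[key] = (self.clock() + self.ttl, value)


def cached_query(cache: TTLCache, key: str,
                 query_db: Callable[[str], str]) -> str:
    """Cache-aside read: try the cache first, fall back to the DB on a miss."""
    value = cache.get(key)
    if value is None:
        value = query_db(key)  # hypothetical (and presumably slow) DB call
        cache.set(key, value)
    return value
```

Repeated calls to `cached_query` with the same key only reach the database once per TTL window, which is the load reduction these caches were originally built for.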
Distributed Cache vs NoSQL
Very similar in architecture, but differ in goals:
NoSQL Database
- Items are permanent/persistent
- All items stored on disk (some cached in RAM)
- Scale is the primary goal
Distributed Cache
- Items expire
- Items stored in RAM (optionally persisted to disk)
- Speed is the primary goal
- RAM capacity is limited; once it is reached, the cache evicts items, typically the least recently used (LRU) or least frequently used (LFU)
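Eviction under a capacity limit can be sketched as follows. This is an illustration of least-recently-used eviction only, assuming capacity is counted in items rather than bytes; `LRUCache` is a hypothetical name, and real caches offer several eviction policies.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Bounded cache that evicts the least recently used item when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        # Insertion order doubles as recency order: oldest first, newest last.
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key: str, value: str) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
```

With a capacity of 2, writing `a` and `b`, reading `a`, then writing `c` evicts `b`, because `a` was touched more recently.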
Example use cases (Redis)
Real-time stock prices, real-time analytics, leaderboards, real-time communication, anywhere Memcached was previously used.
Keeping nodes in sync
Three coarse strategies (SE464 framing):
- Expire entries after a TTL.
- Coherent cache: push new data or invalidations to the other nodes on every write. Hardware cache coherence protocols (MSI/MESI) carry over directly, with nodes in place of CPU caches and the DB in place of main memory.
- Versioned data: never mutate; new data gets a new name (filename/URL).
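The versioned-data strategy can be sketched by deriving the name from the content, so a cached entry can never go stale. This assumes a content-hash naming scheme, which is one common way to get "new data, new name"; the helper `versioned_name` is hypothetical.

```python
import hashlib


def versioned_name(base: str, content: bytes) -> str:
    """Return a name that changes whenever the content changes.

    Example shape: "app.js" -> "app.<12 hex chars>.js".
    """
    digest = hashlib.sha256(content).hexdigest()[:12]
    stem, dot, ext = base.rpartition(".")
    if not dot:  # no extension: just append the digest
        return f"{base}.{digest}"
    return f"{stem}.{digest}.{ext}"


# Entries are never overwritten, so no TTL or invalidation is needed.
cache = {}
v1 = versioned_name("app.js", b"console.log(1)")
v2 = versioned_name("app.js", b"console.log(2)")
cache[v1] = b"console.log(1)"
cache[v2] = b"console.log(2)"
```

The trade-off is that readers need some way to learn the current name, and old versions linger until they expire or are evicted.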
The L08 exam question writes pseudocode for a write-back MESI software cache across nodes backed by a DB:
- Get: local hit returns; else ask peers (a peer holding the item downgrades to Shared, M→S or E→S, writing back to the DB first if it was dirty, and the requester caches it as Shared); else fall through to the DB and cache as Exclusive.
- Update: Shared → broadcast invalidate, go Modified; Exclusive → go Modified silently; Modified → write locally, with no state change and no messages.
- Shutdown: flush all Modified items to the DB, then send leave messages.
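The Get/Update/Shutdown rules above can be sketched in a single process. This is not the exam's pseudocode: it assumes nodes call each other directly instead of sending messages, a shared dict stands in for the DB, an absent key means Invalid, and an update to an uncached key is handled like the Shared case (that case is not covered by the notes). The names `Node`, `State`, and `_share` are hypothetical.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"


class Node:
    def __init__(self, db: Dict[str, str], cluster: List["Node"]):
        self.db = db            # shared dict standing in for the database
        self.cluster = cluster  # shared list of live nodes
        self.lines: Dict[str, Tuple[State, str]] = {}  # absent key = Invalid
        cluster.append(self)

    def _peers(self) -> List["Node"]:
        return [n for n in self.cluster if n is not self]

    def _share(self, key: str) -> Optional[str]:
        """A peer asked for key: downgrade to Shared, writing back if dirty."""
        if key not in self.lines:
            return None
        state, value = self.lines[key]
        if state is State.MODIFIED:
            self.db[key] = value  # write-back happens only now
        self.lines[key] = (State.SHARED, value)
        return value

    def get(self, key: str) -> str:
        if key in self.lines:                  # local hit
            return self.lines[key][1]
        for peer in self._peers():             # ask peers
            value = peer._share(key)
            if value is not None:
                self.lines[key] = (State.SHARED, value)
                return value
        value = self.db[key]                   # fall through to the DB
        self.lines[key] = (State.EXCLUSIVE, value)
        return value

    def update(self, key: str, value: str) -> None:
        state = self.lines[key][0] if key in self.lines else None
        if state is State.SHARED or state is None:
            # Broadcast invalidate. The None (Invalid) case is an assumption.
            for peer in self._peers():
                peer.lines.pop(key, None)
        # Exclusive and Modified need no messages: nobody else has a copy.
        self.lines[key] = (State.MODIFIED, value)

    def shutdown(self) -> None:
        for key, (state, value) in self.lines.items():
            if state is State.MODIFIED:
                self.db[key] = value           # flush dirty items
        self.lines.clear()
        self.cluster.remove(self)              # stands in for "leave" messages
```

Because the cache is write-back, the DB can lag behind: after `a.update("x", "2")` the DB still holds the old value until another node reads `x` or `a` shuts down.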