CPU Performance Walls

Memory Wall

The memory wall is the widening gap between CPU speed and main memory speed: CPUs got faster much quicker than DRAM did, so on modern processors the bottleneck is no longer compute but waiting on memory.

Why?

From roughly 1980–2005, CPU frequency doubled about every 2 years while DRAM speed doubled only every ~6 years. The ratio compounded, so by the mid-2000s a single cache miss cost hundreds of compute cycles. Runtime used to be dominated by page faults (disk); now it’s dominated by cache misses (DRAM). Basically every modern CPU trick (multi-level caches, the miss shadow, out-of-order execution, prefetching) exists to hide this gap.

From the Sun World Wide Analyst Conference 2003:

                  CPU freq scaling   DRAM scaling
doubling period   every 2 years      every 6 years

Cost of a memory access today:

Level   Latency
L1      2–3 cycles
L2      tens of cycles
L3      tens to ~100+ cycles
DRAM    200–300 cycles

Why DRAM stays slow. DRAM is 1 transistor + 1 capacitor per bit: cheap and dense but physically slow (capacitors need refreshes, rows need to be activated). SRAM is 6 transistors per bit: fast but uses ~6× the area.

SRAM really uses 6x the area??

That’s why SRAM is used only for on-die caches. DDR improved bandwidth (two transfers per cycle), not latency.

Mitigations (all responses to the wall):

  • Multi-level cache hierarchy (L1/L2/L3) with staged latency
  • Miss shadow: useful work during a pending load
  • OoO + renaming: overlap multiple misses
  • Prefetching: pull likely-needed data into cache early
  • SSDs/nonvolatile memory: the next layer down is also getting closer to DRAM speed

From ECE459 L06. One of the four walls alongside the power wall, ILP limits, and the speed of light.