Memory-Bound Program
Notes copied from Chapter 4 of PMPP.
Memory-bound programs are programs whose execution speed is limited by memory access throughput.
One of the fundamental concepts is the compute-to-global-memory-access ratio.
Memory is expensive
Compute is a lot faster than global memory access, so you want this ratio (floating-point operations performed per global memory access) to be as high as possible.
The book motivates this through matrix multiplication.
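For reference, here is a minimal sketch of what a naive matrix multiplication kernel looks like, with one thread computing one element of P. This is a reconstruction for these notes, not the book's exact listing: the names matMulNaive and matMulNaiveHost, the 16x16 block size, and the square row-major layout are all assumptions.

```cuda
#include <cuda_runtime.h>

// Naive kernel (assumed layout: square width x width matrices, row-major).
// Each thread computes one element of P = M * N.
__global__ void matMulNaive(const float* M, const float* N, float* P, int width) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < width && col < width) {
        float pValue = 0.0f;  // running sum lives in a register
        for (int k = 0; k < width; ++k) {
            // Per iteration: two global memory loads (one from M, one from N)
            // feeding two floating-point operations (one multiply, one add).
            pValue += M[row * width + k] * N[k * width + col];
        }
        P[row * width + col] = pValue;  // one global memory store per thread
    }
}

// Hypothetical host wrapper: copies inputs to the device, launches the kernel,
// and copies the result back. Error checking is omitted to keep the sketch short.
void matMulNaiveHost(const float* hM, const float* hN, float* hP, int width) {
    size_t bytes = (size_t)width * width * sizeof(float);
    float *dM, *dN, *dP;
    cudaMalloc(&dM, bytes);
    cudaMalloc(&dN, bytes);
    cudaMalloc(&dP, bytes);
    cudaMemcpy(dM, hM, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dN, hN, bytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);  // assumed block size
    dim3 grid((width + block.x - 1) / block.x, (width + block.y - 1) / block.y);
    matMulNaive<<<grid, block>>>(dM, dN, dP, width);

    cudaMemcpy(hP, dP, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dM);
    cudaFree(dN);
    cudaFree(dP);
}
```

The inner loop is the part that matters: nothing is reused between iterations or between threads, so every operand comes straight from global memory.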
The naive matrix multiplication kernel is slow.
Every single time you access M[...], N[...], or P[...], that is a global memory access.
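EXERCISE: Calculate the compute-to-memory-access ratio. It is 1.0: each iteration of the inner loop does two global memory loads (one element of M, one of N) and two floating-point operations (one multiply, one add), so 2 / 2 = 1.0.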
Fundamental question to answer
Is memory access done per thread, per block, or per warp?
- It seems to be done on a per-warp basis. More on this is covered in Chapter 5.