CUDA Optimization
The following optimization techniques are taken from Table 6.1 of the PMPP book (Programming Massively Parallel Processors).
- Maximize occupancy
- Enable memory coalescing by being aware of the order in which threads access global memory (DRAM)
- Minimize control divergence
- Tiling
- Privatization (apply updates to a thread- or block-private copy of the data, then merge into the shared copy)
- Thread coarsening
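
To make a few of these concrete, here is a minimal sketch of a tiled matrix multiplication. It is not taken from the book; the names `tiledMatMul`, `maxErrorTiledMatMul`, and the tile width `TILE = 16` are assumptions chosen for illustration. It shows tiling (each block stages a tile of `A` and `B` in shared memory so every loaded value is reused `TILE` times), coalescing (threads with consecutive `threadIdx.x` touch consecutive global addresses), and limited control divergence (the boundary checks only diverge in blocks on the matrix edge). Error checking of CUDA calls is left out for brevity, and a CUDA-capable GPU plus `nvcc` are assumed.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define TILE 16  // hypothetical tile width: 16 x 16 = 256 threads per block

// Computes C = A * B for square n x n row-major matrices.
__global__ void tiledMatMul(const float* A, const float* B, float* C, int n) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    for (int t = 0; t < (n + TILE - 1) / TILE; ++t) {
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;
        // Coalesced loads: consecutive threadIdx.x reads consecutive addresses.
        // Out-of-range elements are padded with zero.
        As[threadIdx.y][threadIdx.x] =
            (row < n && aCol < n) ? A[row * n + aCol] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] =
            (bRow < n && col < n) ? B[bRow * n + col] : 0.0f;
        __syncthreads();  // tile fully loaded before anyone reads it

        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // everyone done before the tile is overwritten
    }
    if (row < n && col < n) C[row * n + col] = acc;
}

// Host helper: runs the kernel and returns the maximum absolute
// difference against a straightforward CPU reference.
float maxErrorTiledMatMul(int n) {
    std::vector<float> hA(n * n), hB(n * n), hC(n * n);
    for (int i = 0; i < n * n; ++i) {
        hA[i] = (i % 7) * 0.5f;
        hB[i] = static_cast<float>(i % 5) - 2.0f;
    }

    size_t bytes = static_cast<size_t>(n) * n * sizeof(float);
    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    tiledMatMul<<<grid, block>>>(dA, dB, dC, n);
    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);

    float maxErr = 0.0f;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            float ref = 0.0f;
            for (int k = 0; k < n; ++k) ref += hA[i * n + k] * hB[k * n + j];
            maxErr = std::fmax(maxErr, std::fabs(ref - hC[i * n + j]));
        }
    }
    return maxErr;
}

int main() {
    // 100 is not a multiple of TILE, so the boundary checks are exercised.
    float err = maxErrorTiledMatMul(100);
    std::printf("max abs error: %g\n", err);
    return err < 1e-3f ? 0 : 1;
}
```

The block size here is fixed at 256 threads; whether that gives good occupancy depends on the register and shared-memory usage on the target GPU, so treat `TILE` as a tuning parameter rather than a recommendation.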