CUDA Best Practices
Tips
From https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html#memory-optimizations
High Priority: Minimize data transfer between the host and the device, even if it means running some kernels on the device that do not show performance gains when compared with running them on the host CPU.
- Peak theoretical bandwidth between device memory and the GPU: 898 GB/s (e.g., on an NVIDIA Tesla V100)
- Peak theoretical bandwidth between host memory and device memory: 16 GB/s on PCIe x16 Gen3
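Because device memory bandwidth is roughly 50x higher than the PCIe link, the structure below matters more than any individual kernel's speedup. A minimal sketch (the array `x` and the two trivial element-wise kernels `scale` and `offset` are illustrative assumptions, not from the guide): data is copied to the device once, both steps run as kernels, and only the final result comes back. Even if `offset` alone would be no faster on the GPU than on the CPU, keeping it on the device avoids an extra round trip over PCIe.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float *x, int n, float s) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}

__global__ void offset(float *x, int n, float o) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += o;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    float *h_x = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) h_x[i] = 1.0f;

    float *d_x;
    cudaMalloc(&d_x, bytes);

    // One host-to-device transfer for the whole pipeline.
    cudaMemcpy(d_x, h_x, bytes, cudaMemcpyHostToDevice);

    dim3 block(256), grid((n + block.x - 1) / block.x);
    scale<<<grid, block>>>(d_x, n, 2.0f);
    // Intermediate data stays resident on the device:
    // no copy back to the host between kernels.
    offset<<<grid, block>>>(d_x, n, 1.0f);

    // One device-to-host transfer for the final result.
    cudaMemcpy(h_x, d_x, bytes, cudaMemcpyDeviceToHost);

    printf("h_x[0] = %f\n", h_x[0]);  // expect 3.0

    cudaFree(d_x);
    free(h_x);
    return 0;
}
```

The anti-pattern this avoids is copying `x` back to the host after `scale`, running the `offset` step on the CPU, and copying it to the device again for later work: each extra round trip costs two PCIe transfers that can easily dwarf the kernel time.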