Streaming Multiprocessor

CUDA Core / Streaming Processor (SP)

CUDA cores are the standard floating-point units in an NVIDIA GPU.

They are the smallest execution units within the GPU: each one performs scalar arithmetic (chiefly floating-point operations) for one thread at a time.

CUDA Core = ALU

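A CUDA core is analogous to an ALU (arithmetic logic unit) in a traditional CPU, not to a full CPU core. It has no instruction fetch, decode, or scheduling logic of its own; the SM's warp schedulers provide that for all the cores in the SM.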

How many CUDA cores per SM?

“To use the full possible power of a GPU you need much more threads per SM than the SM has SPs.”
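The quote can be made concrete with a little arithmetic. The sketch below assumes Hopper-class figures (compute capability 9.0: up to 2,048 resident threads per SM, 128 FP32 cores per SM, 32 threads per warp); the constant and function names are illustrative only, not part of any NVIDIA API.

```python
# Rough oversubscription estimate for one SM.
# Assumed figures (Hopper, compute capability 9.0); adjust for other GPUs.
MAX_RESIDENT_THREADS_PER_SM = 2048
FP32_CORES_PER_SM = 128
WARP_SIZE = 32


def threads_per_core(resident_threads: int, cores: int) -> float:
    """Resident threads available to keep each core busy."""
    return resident_threads / cores


def resident_warps(resident_threads: int, warp_size: int = WARP_SIZE) -> int:
    """Number of warps the SM can switch between to hide latency."""
    return resident_threads // warp_size


ratio = threads_per_core(MAX_RESIDENT_THREADS_PER_SM, FP32_CORES_PER_SM)
warps = resident_warps(MAX_RESIDENT_THREADS_PER_SM)
print(f"{ratio:.0f} resident threads per FP32 core, {warps} resident warps per SM")
```

With these assumed numbers an SM can hold 16 times more threads than it has FP32 cores. The surplus is what lets the scheduler switch to another warp whenever one stalls on memory.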

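GH100 (Hopper): The full GH100 die has 144 Streaming Multiprocessors (SMs). Each SM contains 128 FP32 CUDA cores, for a total of 144 × 128 = 18,432 CUDA cores. Shipping products do not enable the whole die: NVIDIA's published figure for the H100 SXM5 is 132 SMs (16,896 CUDA cores), and the H200 and GH200 are generally listed with the same count rather than all 144.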
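The total is simply the number of enabled SMs times the cores per SM. A minimal sketch, assuming the 128 FP32 cores per SM quoted for Hopper; the 132-SM case is NVIDIA's published H100 SXM5 configuration, included for comparison, and the function name is hypothetical.

```python
# Total FP32 CUDA cores = enabled SMs x FP32 cores per SM.
FP32_CORES_PER_SM = 128  # assumed: Hopper SM


def total_cuda_cores(enabled_sms: int, cores_per_sm: int = FP32_CORES_PER_SM) -> int:
    """Return the FP32 CUDA core count for a GPU with `enabled_sms` SMs."""
    return enabled_sms * cores_per_sm


full_die = total_cuda_cores(144)   # full GH100 die
h100_sxm5 = total_cuda_cores(132)  # H100 SXM5 (132 SMs enabled)
print(full_die, h100_sxm5)
```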

Note

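There are FP32 and FP64 CUDA cores. The headline "CUDA cores per SM" figure counts the FP32 cores; a Hopper SM, for example, has 64 FP64 cores alongside its 128 FP32 cores.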