CUTLASS
I didn’t even know this existed until I went down the rabbit hole of trying to understand matrix multiplication.
https://github.com/NVIDIA/cutlass
Resources
- Blog: https://developer.nvidia.com/blog/cutlass-linear-algebra-cuda/
- Quick intro at PyTorch conference: https://www.youtube.com/watch?v=yCyZEJrlrfY&ab_channel=PyTorch