CUTLASS
I didn’t even know this was a thing until I went down the rabbit hole of trying to understand matrix multiplication.
https://github.com/NVIDIA/cutlass
Resources
- Blog: https://developer.nvidia.com/blog/cutlass-linear-algebra-cuda/
- Lightning Talk: Harnessing NVIDIA Tensor Cores: An Exploration of CUTLASS & OpenAI.. - Matthew Nicely (quick intro given at the PyTorch conference)