Automatic Mixed Precision (AMP)
torch.cuda.amp provides convenience methods for mixed precision, where some operations use the torch.float32 (float) datatype and other operations use torch.float16.
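The usual pattern from the recipe linked below wraps the forward pass in autocast and scales the loss with GradScaler. A minimal sketch follows; the toy model, optimizer, and synthetic data are placeholder assumptions so the example is self-contained.

```python
import torch

model = torch.nn.Linear(64, 10).cuda()            # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()              # scales the loss to avoid float16 gradient underflow

for _ in range(3):                                # stand-in for a real dataloader loop
    inputs = torch.randn(32, 64, device="cuda")
    targets = torch.randint(0, 10, (32,), device="cuda")

    optimizer.zero_grad()
    # Ops inside the autocast region run in float16 or float32 as appropriate.
    with torch.cuda.amp.autocast():
        outputs = model(inputs)
        loss = loss_fn(outputs, targets)

    scaler.scale(loss).backward()                 # backward pass on the scaled loss
    scaler.step(optimizer)                        # unscales gradients, then calls optimizer.step()
    scaler.update()                               # adjusts the scale factor for the next iteration
```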
Resources
- https://pytorch.org/tutorials/recipes/recipes/amp_recipe.html
- Some ops, like linear layers and convolutions, are much faster in float16 or bfloat16.
- Other ops, like reductions, often require the dynamic range of float32.
Mixed precision tries to match each op to its appropriate datatype, which can reduce your network’s runtime and memory footprint.
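As a rough illustration of that matching, the sketch below checks which dtypes autocast picks; treating matmul as a float16-eligible op and sum as a representative reduction is an assumption based on the autocast op reference.

```python
import torch

a = torch.randn(1024, 1024, device="cuda")        # float32 inputs
b = torch.randn(1024, 1024, device="cuda")

with torch.cuda.amp.autocast():
    c = a @ b          # matmul is eligible to autocast to float16
    s = c.sum()        # reductions like sum are kept in float32 for dynamic range

print(c.dtype)  # expected: torch.float16
print(s.dtype)  # expected: torch.float32
```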