Automatic Mixed Precision (AMP)
torch.cuda.amp provides convenience methods for mixed precision, where some operations use the torch.float32 (float) datatype and other operations use torch.float16.
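A minimal sketch of the usual autocast + GradScaler training pattern; the model, optimizer, learning rate, and tensor shapes below are illustrative placeholders, not taken from the original text.

```python
import torch

model = torch.nn.Linear(512, 512).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

for _ in range(10):
    inputs = torch.randn(64, 512, device="cuda")
    targets = torch.randn(64, 512, device="cuda")

    optimizer.zero_grad()
    # Run the forward pass under autocast so eligible ops use float16.
    with torch.cuda.amp.autocast():
        outputs = model(inputs)
        loss = torch.nn.functional.mse_loss(outputs, targets)

    # Scale the loss to reduce gradient underflow in float16,
    # then unscale and step via the GradScaler helpers.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```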
Resources
- https://pytorch.org/tutorials/recipes/recipes/amp_recipe.html
- Some ops, like linear layers and convolutions, are much faster in float16 or bfloat16.
- Other ops, like reductions, often require the dynamic range of float32.
Mixed precision tries to match each op to its appropriate datatype, which can reduce your network’s runtime and memory footprint.
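As a quick illustration (an assumed toy check, not from the original text), the snippet below inspects the dtypes autocast picks for a matmul versus a softmax on CUDA; the exact per-op behavior is governed by PyTorch's autocast op lists, so treat the expected dtypes as an assumption to verify against your PyTorch version.

```python
import torch

a = torch.randn(8, 8, device="cuda")
b = torch.randn(8, 8, device="cuda")

with torch.cuda.amp.autocast():
    mm = a @ b          # matmul-like ops are eligible to run in float16
    s = mm.softmax(-1)  # softmax is typically kept in float32 for range

print(mm.dtype)  # expected: torch.float16
print(s.dtype)   # expected: torch.float32
```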