Automatic Mixed Precision (AMP)
torch.cuda.amp provides convenience methods for mixed precision, where some operations use the torch.float32 (float) datatype and other operations use torch.float16.
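The usual pattern from the recipe linked below wraps the forward pass in autocast and scales the loss with GradScaler. A minimal sketch follows; the toy model, optimizer, and synthetic data are placeholder assumptions so the example is self-contained.

```python
import torch

model = torch.nn.Linear(64, 10).cuda()            # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()              # scales the loss to avoid float16 gradient underflow

for _ in range(3):                                # stand-in for a real dataloader loop
    inputs = torch.randn(32, 64, device="cuda")
    targets = torch.randint(0, 10, (32,), device="cuda")

    optimizer.zero_grad()
    # Ops inside the autocast region run in float16 or float32 as appropriate.
    with torch.cuda.amp.autocast():
        outputs = model(inputs)
        loss = loss_fn(outputs, targets)

    scaler.scale(loss).backward()                 # backward pass on the scaled loss
    scaler.step(optimizer)                        # unscales gradients, then calls optimizer.step()
    scaler.update()                               # adjusts the scale factor for the next iteration
```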
Resources
- https://pytorch.org/tutorials/recipes/recipes/amp_recipe.html
- Some ops, like linear layers and convolutions, are much faster in float16 or bfloat16.
- Other ops, like reductions, often require the dynamic range of float32.
Mixed precision tries to match each op to its appropriate datatype, which can reduce your network’s runtime and memory footprint.
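As a rough illustration of that matching, the sketch below checks which dtypes autocast picks; treating matmul as a float16-eligible op and sum as a representative reduction is an assumption based on the autocast op reference.

```python
import torch

a = torch.randn(1024, 1024, device="cuda")        # float32 inputs
b = torch.randn(1024, 1024, device="cuda")

with torch.cuda.amp.autocast():
    c = a @ b          # matmul is eligible to autocast to float16
    s = c.sum()        # reductions like sum are kept in float32 for dynamic range

print(c.dtype)  # expected: torch.float16
print(s.dtype)  # expected: torch.float32
```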