Quantization-Aware Training (QAT)
QAT lowers the model's precision (typically by simulating quantization in the forward pass) and then continues training for a while so the weights adapt to the new precision. In practice, quantization pipelines benefit from including at least a short QAT phase to keep real-world accuracy loss minimal. Because it reuses the regular training process to adapt the model to the quantized regime, QAT is generally more effective than post-training quantization alone, but also more computationally expensive. A minimal sketch of the mechanism follows below.
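The sketch below illustrates the core idea in PyTorch: weights are "fake-quantized" during the forward pass while gradients flow through unchanged (a straight-through estimator), so ordinary training adapts the model to the low-precision regime. The `fake_quantize` helper, the `QATLinear` layer, and the toy training loop are illustrative assumptions, not a reference implementation.

```python
import torch
from torch import nn

def fake_quantize(x, num_bits=8):
    # Simulate symmetric uniform quantization in the forward pass while
    # letting gradients pass through unchanged (straight-through estimator).
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.detach().abs().max() / qmax + 1e-12
    x_q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    # Forward uses the quantized value; backward sees the identity.
    return x + (x_q - x).detach()

class QATLinear(nn.Linear):
    # Linear layer whose weights are fake-quantized during training,
    # so the optimizer adapts them to the reduced precision.
    def forward(self, x):
        return nn.functional.linear(x, fake_quantize(self.weight), self.bias)

# Hypothetical usage: continue training with fake quantization in place.
model = nn.Sequential(QATLinear(16, 32), nn.ReLU(), QATLinear(32, 4))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
for _ in range(10):  # short fine-tuning loop on toy data
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After this adaptation phase, the fake-quantized weights can be converted to true low-precision storage for deployment; frameworks such as PyTorch provide dedicated QAT workflows that automate this conversion.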