brevitas

https://github.com/Xilinx/brevitas

Primarily for quantization-aware training.

Does Brevitas only do INT8?

  • No, Brevitas is not limited to INT8 quantization. It is a PyTorch-based library for quantization-aware training (QAT), and it supports multiple quantization bit widths, including:
      • INT8 (8-bit)
      • INT4 (4-bit)
      • INT2 (2-bit)
      • INT1 (1-bit, binary quantization)
      • Custom bit widths (e.g., INT6, INT5)
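
To make the bit-width list concrete, here is a minimal pure-Python sketch of what an N-bit signed integer quantizer does to a value. This is not Brevitas code: `int_range` and `fake_quantize` are hypothetical helper names, the scale is picked by hand, and the 1-bit (binary) case is left out because it is usually handled with a sign function instead of rounding. In Brevitas itself the bit width is normally set per layer through keyword arguments such as `weight_bit_width`; check the library docs for the exact options.

```python
def int_range(bit_width):
    """Return (min, max) of a signed two's-complement integer with bit_width bits."""
    if bit_width < 2:
        raise ValueError("this sketch covers bit widths >= 2 only")
    return -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1


def fake_quantize(x, bit_width, scale):
    """Round x to the nearest integer level, clamp to the N-bit range, rescale.

    The result is still a float, but it can only take 2**bit_width distinct
    values. This is the basic idea behind simulating quantization in training.
    """
    lo, hi = int_range(bit_width)
    q = round(x / scale)        # nearest integer level (Python rounds ties to even)
    q = max(lo, min(hi, q))     # saturate to the representable range
    return q * scale


# Integer ranges for the bit widths listed above (binary excluded).
for bits in (8, 6, 5, 4, 2):
    print(bits, int_range(bits))

# The same value at different bit widths, with a hand-picked scale of 0.1.
print(fake_quantize(0.37, 8, 0.1))
print(fake_quantize(5.0, 4, 0.1))   # saturates at the 4-bit maximum level
```

Fewer bits means fewer representable levels and earlier saturation, which is why low bit widths usually need training with the quantizer in the loop and not just post-training conversion.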