“Recent Advances in Post-training Quantization,” a Presentation from Intel

Alexander Kozlov, Deep Learning R&D Engineer at Intel, presents the “Recent Advances in Post-training Quantization” tutorial at the September 2020 Embedded Vision Summit.

The use of low-precision arithmetic (8-bit and smaller data types) is key to deploying deep neural network inference with high performance, low cost and low power consumption. Shifting to low-precision arithmetic requires a model quantization step, which can be performed at training time (quantization-aware training) or after training (post-training quantization). Post-training quantization is an easy way to quantize already-trained models and provides a good accuracy/performance trade-off.
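To make the idea concrete, below is a minimal sketch of the affine (scale and zero-point) mapping that underlies typical 8-bit post-training quantization. This is a generic illustration, not Kozlov's specific algorithm; the function names and the use of NumPy are assumptions for the example.

```python
import numpy as np

def quantize_affine(x, num_bits=8):
    """Affine (asymmetric) quantization of a float tensor to num_bits unsigned integers.
    Illustrative sketch only; real toolkits also calibrate ranges on sample data."""
    qmin, qmax = 0, 2**num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    # Guard against a degenerate range (e.g., a constant tensor).
    scale = max(x_max - x_min, 1e-8) / (qmax - qmin)
    zero_point = int(np.clip(round(qmin - x_min / scale), qmin, qmax))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize_affine(q, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return scale * (q.astype(np.float32) - zero_point)

# Example: quantize simulated activations and measure the quantization error.
x = np.random.randn(1000).astype(np.float32)
q, scale, zp = quantize_affine(x)
x_hat = dequantize_affine(q, scale, zp)
print("max abs error:", np.abs(x - x_hat).max())
```

Post-training methods of the kind surveyed in the talk work to shrink exactly this reconstruction error, for example by choosing better clipping ranges or per-channel scales, without retraining the model.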

In this talk, Kozlov reviews recent advances in post-training quantization methods and algorithms that help to reduce quantization error. He also shows the performance speed-up that can be achieved for various models when using 8-bit quantization.

See here for a PDF of the slides.
