Kees Vissers, Distinguished Engineer at Xilinx, presents the “Exploiting Reduced Precision for Machine Learning on FPGAs” tutorial at the May 2018 Embedded Vision Summit.
Machine learning algorithms such as convolutional neural networks have become essential for embedded vision. Their implementation using floating-point computation requires significant compute and memory resources. Research over the last two years has shown that reducing the precision of network weights, inputs and activations results in more efficient implementations with a minimal reduction in accuracy.
With FPGAs, it is possible to customize hardware circuits to operate on these reduced-precision formats: 16-bit, 8-bit and even lower precision. This significantly reduces the hardware cost and power consumption of inference engine implementations. In this talk, Vissers shows detailed results of the accuracy and implementation cost for several reduced-precision neural networks on a set of embedded platforms. From these design points, he extracts the Pareto-optimal results for accuracy versus precision of both weights and activations, ranging from 16-bit to 8-bit, and down to only a few bits.
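To illustrate the precision-versus-accuracy trade-off discussed in the talk, here is a minimal sketch of symmetric uniform quantization in NumPy. The function names and the per-tensor scaling scheme are illustrative assumptions, not the actual quantization method used in the networks Vissers presents; the sketch simply shows how representation error grows as the bit-width shrinks:

```python
import numpy as np

def quantize_symmetric(x, n_bits):
    """Uniformly quantize x to signed n-bit integer codes (symmetric range)."""
    qmax = 2 ** (n_bits - 1) - 1           # e.g. 127 for 8-bit
    scale = np.max(np.abs(x)) / qmax       # one scale factor for the whole tensor
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from the integer codes."""
    return q * scale

# Demo: quantization error at decreasing bit-widths
rng = np.random.default_rng(0)
w = rng.standard_normal(1000).astype(np.float32)  # stand-in for a weight tensor
for bits in (16, 8, 4, 2):
    q, s = quantize_symmetric(w, bits)
    err = np.max(np.abs(w - dequantize(q, s)))
    print(f"{bits}-bit: max abs error = {err:.5f}")
```

On an FPGA, the integer codes produced this way can feed narrow custom multipliers and accumulators, which is the source of the hardware-cost and power savings the talk quantifies.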