
Gian Marco Iodice, Software Engineer at ARM, presents the "Using SGEMM and FFTs to Accelerate Deep Learning" tutorial at the May 2016 Embedded Vision Summit.

Matrix multiplication and the Fast Fourier Transform (FFT) are numerical foundation stones for a wide range of scientific algorithms. With the emergence of deep learning, they are becoming even more important, particularly as use cases extend into mobile and embedded devices. In this presentation, Iodice discusses and analyzes how these two key, computationally intensive algorithms can be used to gain significant performance improvements for convolutional neural network (CNN) implementations.

After a brief introduction to the nature of CNN computations, Iodice explores the use of GEMM (General Matrix Multiplication) and mixed-radix FFTs to accelerate 3D convolution. He shows examples of OpenCL implementations of these functions and highlights their advantages, limitations and trade-offs. Central to the techniques explored is an emphasis on cache-efficient memory accesses and the crucial role of reduced-precision data types.
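To make the two approaches concrete, here is a minimal sketch in Python with NumPy. It is not the OpenCL code shown in the talk. It assumes a single input of shape (channels, height, width), a filter bank of shape (filters, channels, kernel height, kernel width), stride 1 and no padding. The function names (conv2d_direct, im2col, conv2d_gemm, conv2d_fft) are hypothetical and chosen only for illustration. The GEMM path unrolls input patches into columns so the whole convolution becomes one matrix multiplication; the FFT path multiplies spectra and keeps only the part of the circular result that is free of wrap-around.

```python
import numpy as np

def conv2d_direct(x, w):
    """Reference: direct 'valid' convolution as used in CNNs (cross-correlation).
    x: (C, H, W) input, w: (K, C, R, S) filters."""
    C, H, W = x.shape
    K, _, R, S = w.shape
    out = np.zeros((K, H - R + 1, W - S + 1), dtype=x.dtype)
    for k in range(K):
        for i in range(H - R + 1):
            for j in range(W - S + 1):
                out[k, i, j] = np.sum(x[:, i:i + R, j:j + S] * w[k])
    return out

def im2col(x, R, S):
    """Unroll every R x S patch (across all channels) into one column."""
    C, H, W = x.shape
    OH, OW = H - R + 1, W - S + 1
    cols = np.empty((C * R * S, OH * OW), dtype=x.dtype)
    row = 0
    for c in range(C):
        for r in range(R):
            for s in range(S):
                # All input values that meet filter tap (c, r, s)
                cols[row] = x[c, r:r + OH, s:s + OW].reshape(-1)
                row += 1
    return cols

def conv2d_gemm(x, w):
    """Convolution as a single matrix multiplication (GEMM)."""
    K, C, R, S = w.shape
    _, H, W = x.shape
    cols = im2col(x, R, S)             # (C*R*S, OH*OW)
    out = w.reshape(K, -1) @ cols      # (K, C*R*S) x (C*R*S, OH*OW)
    return out.reshape(K, H - R + 1, W - S + 1)

def conv2d_fft(x, w):
    """Convolution via pointwise multiplication in the frequency domain."""
    K, C, R, S = w.shape
    _, H, W = x.shape
    # CNN "convolution" is cross-correlation, so flip the kernels first.
    w_flipped = w[:, :, ::-1, ::-1]
    X = np.fft.rfft2(x, s=(H, W))              # (C, H, W//2 + 1)
    Wf = np.fft.rfft2(w_flipped, s=(H, W))     # kernels zero-padded to H x W
    Y = (X[None] * Wf).sum(axis=1)             # accumulate over channels
    y = np.fft.irfft2(Y, s=(H, W))             # circular convolution, (K, H, W)
    return y[:, R - 1:, S - 1:]                # drop wrapped-around rows/columns

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 10))
w = rng.standard_normal((4, 3, 3, 5))
ref = conv2d_direct(x, w)
print(np.allclose(conv2d_gemm(x, w), ref), np.allclose(conv2d_fft(x, w), ref))
```

The sketch also hints at the trade-offs the abstract mentions: im2col duplicates input values, so the GEMM path trades extra memory for one large, cache-friendly multiplication, while the FFT path pads every kernel to the input size, which tends to pay off only for larger kernels.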


