“A Survey of Model Compression Methods,” a Presentation from Instrumental

Rustem Feyzkhanov, Staff Machine Learning Engineer at Instrumental, presents the “A Survey of Model Compression Methods” tutorial at the May 2023 Embedded Vision Summit.

One of the main challenges when deploying computer vision models to the edge is optimizing the model for inference speed, memory footprint and energy consumption. In this talk, Feyzkhanov provides a comprehensive survey of model compression approaches, which are crucial for harnessing the full potential of deep learning models on edge devices.

Feyzkhanov explores pruning, weight clustering and knowledge distillation, explaining how these techniques work and how to use them effectively. He also examines inference frameworks, including ONNX, TFLite and OpenVINO. Feyzkhanov discusses how these frameworks support model compression and explores the impact of hardware considerations on the choice of framework. He concludes with a comparison of the techniques presented, considering implementation complexity and typical efficiency gains.
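
The talk covers these techniques at a conceptual level; as a concrete point of reference, the snippet below is a minimal sketch of one of them, unstructured L1-magnitude pruning, using PyTorch's torch.nn.utils.prune utilities. The toy two-layer model and the 50% sparsity target are illustrative assumptions, not material from the presentation.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Hypothetical toy model, used only to illustrate the API.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

# Zero out the 50% of weights with the smallest absolute value in each
# Linear layer (unstructured L1-magnitude pruning).
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # fold the pruning mask into the weights

# Report the fraction of zero-valued parameters across the whole model.
total = sum(p.numel() for p in model.parameters())
zeros = sum((p == 0).sum().item() for p in model.parameters())
print(f"Global sparsity: {zeros / total:.1%}")
```

In practice, a pruned model is usually fine-tuned afterwards to recover accuracy, and the zeroed weights only translate into size or speed gains when the deployment framework or hardware can exploit the sparsity.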

See here for a PDF of the slides.

