Josh Morris, Engineering Manager at DSP Concepts, presents the “Comparing ML-Based Audio with ML-Based Vision: An Introduction to ML Audio for ML Vision Engineers” tutorial at the May 2022 Embedded Vision Summit.

As embedded processors become more powerful, our ability to implement complex machine learning solutions at the edge is growing. Vision has led the way, solving problems as far-reaching as facial recognition and autonomous navigation. Now, ML audio is starting to appear in more and more edge applications, for example in the form of voice assistants, voice user interfaces and voice communication systems.

Although audio data is quite different from video and image data, ML audio solutions often use many of the same techniques initially developed for video and images. In this talk, Morris introduces the ML techniques commonly used for audio at the edge, and compares and contrasts them with those commonly used for vision. You’ll get inspired to integrate ML-based audio into your next solution.
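One common bridge between the two domains, which the talk touches on, is converting a 1-D audio waveform into a 2-D time-frequency representation (a spectrogram) so that image-style models such as CNNs can be applied directly. The sketch below, using only NumPy (the function name and parameters are illustrative, not from the talk), shows the basic idea:

```python
import numpy as np

def stft_spectrogram(signal, frame_len=256, hop=128):
    """Compute a magnitude spectrogram: slice the waveform into
    overlapping frames, apply a Hann window, and take the FFT of
    each frame. The result is a 2-D time-frequency "image" that
    vision-style models can consume."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequency bins
    spectrum = np.fft.rfft(frames, axis=1)
    return np.abs(spectrum).T  # shape: (freq_bins, time_frames)

# Example: one second of a 440 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
spec = stft_spectrogram(tone)
print(spec.shape)  # (frame_len // 2 + 1, n_frames)
```

In practice, production audio pipelines typically add a mel-scale filterbank and log compression on top of this raw spectrogram, but the core move is the same: recast audio as an image-like tensor so vision techniques transfer.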

See here for a PDF of the slides.

