Mumtaz Vauhkonen, Senior Director of AI at Skyworks Solutions, presents the “Multimodal Enterprise-scale Applications in the Generative AI Era” tutorial at the May 2025 Embedded Vision Summit.
As artificial intelligence makes rapid strides in the use of large language models, the need for multimodality arises in many application scenarios. Just as humans use multiple sensory systems to solve problems and arrive at decisions, AI problem-solving in many applications is enriched by multimodal inputs.
In this presentation, Vauhkonen explores the process of building multimodal applications at scale, focusing on the core aspects of quality dataset creation, multimodal data fusion techniques and model pipelines for enterprise applications. She also examines the challenges that arise in bringing these applications to production and techniques for addressing them.
See here for a PDF of the slides.

