LETTER FROM THE EDITOR |
Dear Colleague, Last week’s Embedded Vision Summit was a resounding success, with more than 1,200 attendees learning from more than 85 presentations, more than 65 exhibitors and hundreds of demos, as well as making valuable connections. Presentation videos and slide decks from the 2025 Embedded Vision Summit will begin appearing on the Embedded Vision Alliance website in the coming weeks. See you at the 2026 Summit, May 19-21! Brian Dipert |
VISION-LANGUAGE MODELS |
Deploying Efficient Vision-language Models At the Edge Recent large language models (LLMs) have demonstrated unprecedented performance on a variety of natural language processing (NLP) tasks. Their versatile language processing capabilities have made it possible to build NLP applications that are genuinely useful in real-life scenarios. However, because LLMs require substantial computational power, ordinary users must rely on high-performance cloud services to run them. Cloud deployment imposes financial costs on users and carries the risk of exposing sensitive personal data to external parties. These issues are driving demand for running language models on edge devices. There has also been a recent increase in the use of vision-language models (VLMs), which attach an image encoder to an LLM so that it can use information from images, and demand for running VLMs at the edge is growing as well. VLMs, however, require even more computational resources because they include a vision transformer-based image encoder. In this article, Nota AI describes how it reconstructed one VLM, LLaVA, into a lighter model and deployed it on an example mobile edge device. |
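One common ingredient in this kind of model lightening is post-training weight quantization, which stores weights in low-precision integers and rescales them at inference time. The sketch below shows symmetric per-tensor int8 quantization in NumPy; it is a generic illustration of the idea, not Nota AI's actual method, and the function names are hypothetical.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w is approximated by scale * q."""
    scale = np.max(np.abs(w)) / 127.0  # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 tensor from the int8 codes."""
    return q.astype(np.float32) * scale

# Example: quantize a random weight matrix and measure the reconstruction error.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = float(np.max(np.abs(w - w_hat)))
```

Storing `q` instead of `w` cuts the weight footprint by 4x versus float32, at the cost of a rounding error bounded by half the quantization step (`scale / 2`).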
Building Real-time Multimodal XR Applications With the recent advancements in generative AI and vision foundation models, VLMs present a new wave of visual computing in which models are capable of highly sophisticated perception and deep contextual understanding. These intelligent solutions offer a promising means of enhancing semantic comprehension in extended reality (XR) settings. By integrating VLMs, developers can significantly improve how XR applications interpret and interact with user actions, making them more responsive and intuitive. This tutorial from NVIDIA walks you through how to leverage the company’s AI Blueprint for video search and summarization and enhance it to support audio in an XR environment. NVIDIA explains the step-by-step process, from setting up the environment to seamless integration, for real-time speech recognition and immersive interactions. |
PERCEPTUAL INTELLIGENCE ON APPLICATION PROCESSORS |
A Step-by-step Tutorial for Enabling Pose Detection The goal of this design guide from Qualcomm is to walk developers through the steps necessary to run a deep learning model using a Snapdragon SoC’s Hexagon NPU without the need to dig through endless docs, possibly losing your sanity, or, even worse, your motivation, in the process. The HRNetPose detection application is built in Python, using tools like Qualcomm AI Hub, onnxruntime-qnn, Jupyter Notebook, OpenCV, and NumPy. By the end of the tutorial, you will have a fully working application that runs on Hexagon and detects 17 body keypoints using the HRNetPose model. |
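HRNet-style pose models typically emit one confidence heatmap per keypoint, and the application's postprocessing step reduces those heatmaps to 17 (x, y) coordinates. The NumPy sketch below illustrates that decoding step on synthetic data; the shapes and helper names are illustrative assumptions, not taken from Qualcomm's tutorial code.

```python
import numpy as np

def decode_heatmaps(heatmaps):
    """Reduce (N, K, H, W) keypoint heatmaps to (N, K, 2) (x, y) peaks plus confidences."""
    n, k, h, w = heatmaps.shape
    flat = heatmaps.reshape(n, k, -1)
    idx = flat.argmax(axis=-1)          # flat index of each heatmap's peak
    conf = flat.max(axis=-1)            # peak value serves as a confidence score
    ys, xs = np.unravel_index(idx, (h, w))
    return np.stack([xs, ys], axis=-1), conf

# Synthetic example: 17 keypoints on a 64x48 grid (a common HRNet output resolution).
hm = np.zeros((1, 17, 64, 48), dtype=np.float32)
for k in range(17):
    hm[0, k, 10 + k, 5 + k] = 1.0       # place each keypoint's peak at (x=5+k, y=10+k)

kpts, conf = decode_heatmaps(hm)
print(kpts[0, 0])  # -> [ 5 10]
```

In a real pipeline, the decoded grid coordinates would then be scaled back to the original image resolution before drawing the skeleton with OpenCV.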
Build and Deploy AI Vision Models Want to deploy AI models to the edge, not the cloud? In this archive video of a live webcast from last week’s Embedded Vision Summit, Phil Nelson and Satya Mallick of OpenCV, Andy Ballester and Blythe Towal from EyePop.ai, and Qualcomm’s Meghan Stronach discuss how to enable real-time, offline inference at the edge with vision AI models and Snapdragon NPUs. |
FEATURED NEWS |
An Independent Report from BDTI Analyzes the MemryX MX3 M.2 AI Accelerator, Highlighting Ease of Use
Plainsight Technologies Introduces the Open Source OpenFilter for Scalable Computer Vision AI
NVIDIA Launches AI-first DGX Personal Computing Systems with Global Computer Makers
Sensing Tech Debuts Three Advanced Vision Solutions
Intel Unveils New GPUs for AI and Workstations |
EDGE AI AND VISION PRODUCT OF THE YEAR WINNER SHOWCASE |
Qualcomm Snapdragon 8 Elite Platform (Best Edge AI Processor) Qualcomm’s Snapdragon 8 Elite Platform is the 2025 Edge AI and Vision Product of the Year Award Winner in the Edge AI Processors category. This platform significantly enhances on-device experiences through remarkable processing power, groundbreaking AI advancements, and various mobile innovations.

The Snapdragon 8 Elite includes a new custom-built Qualcomm Oryon CPU, which delivers impressive speeds and efficiency to enhance every interaction. It provides a 45% performance boost, 44% greater power efficiency, and includes the mobile industry’s largest shared data cache. Additionally, Qualcomm’s Adreno GPU, with its newly designed architecture, achieves a 40% increase in performance and a 40% improvement in efficiency. Overall, users can expect a 27% reduction in power consumption.

The platform enhances user experiences with on-device AI, showcased through the Qualcomm AI Engine, which incorporates multimodal generative AI and personalized support. This AI Engine utilizes a variety of models, including large multimodal models (LMMs), large language models (LLMs), and large vision models (LVMs), while supporting the world’s largest generative AI model ecosystem. It also features Qualcomm’s faster Hexagon NPU, which provides an impressive 45% increase in performance per watt, driving AI capabilities to new levels. Moreover, Qualcomm’s new AI Image Signal Processor (ISP) works in tandem with the Hexagon NPU to enhance real-time image capture. Connectivity options include advanced AI-driven 5G and Wi-Fi 7 capabilities, facilitating seamless entertainment and productivity on the go. Please see here for more information on Qualcomm’s Snapdragon 8 Elite Platform.

The Edge AI and Vision Product of the Year Awards celebrate the innovation of the industry’s leading companies that are developing and enabling the next generation of edge AI and computer vision products.
Winning a Product of the Year award recognizes a company’s leadership in edge AI and computer vision as evaluated by independent industry experts. |