Edge AI and Vision Insights: September 3, 2025

MULTIMODAL AI ADVANCEMENTS

The Future of Visual AI: Efficient Multimodal Intelligence

AI is on the cusp of a revolution, driven by the convergence of several breakthroughs. One of the most significant of these advances is the development of large language models (LLMs) that can reason like humans, enabling them to make decisions and take actions based on complex, nuanced inputs. Another is the integration of natural language processing and computer vision through vision-language models (VLMs). In this 2025 Embedded Vision Summit keynote talk, Trevor Darrell, Professor at U.C. Berkeley, shares his perspective on the current state and trajectory of research advancing machine intelligence. He presents highlights of his group’s groundbreaking work, including methods for training vision models when labeled data is unavailable and techniques that enable robots to determine appropriate actions in novel situations. Particularly relevant to edge applications, much of Darrell’s work aims to overcome obstacles—such as massive memory and compute requirements—that limit the practical applications of state-of-the-art models. For example, he discusses approaches to making VLMs smaller and more efficient while retaining accuracy. He also shows how LLMs can be used as visual reasoning coordinators, overseeing the use of multiple task-specific models to enable superior performance.
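
To make the coordinator idea concrete, here is a minimal Python sketch, assuming a generic hosted LLM, in which the LLM arbitrates among answers proposed by task-specific vision models. The OpenAI client usage, model name and expert outputs are illustrative assumptions, not details of Darrell’s actual system.

from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY is set in the environment

def coordinate(question: str, expert_outputs: dict[str, str]) -> str:
    """Ask an LLM to reconcile answers from several vision 'experts'."""
    evidence = "\n".join(f"- {name}: {answer}" for name, answer in expert_outputs.items())
    prompt = (
        f"Question about an image: {question}\n"
        f"Candidate answers from specialized vision models:\n{evidence}\n"
        "Decide which answer is best supported and reply with it."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any instruction-following LLM works here
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Hypothetical outputs from a captioner, a VQA model and an object detector
print(coordinate(
    "What is the man holding?",
    {
        "captioner": "a man rides a bicycle while holding an umbrella",
        "vqa_model": "an umbrella",
        "detector": "person (0.98), umbrella (0.91), bicycle (0.88)",
    },
))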

Customizing Vision-language Models for Real-world Applications

Vision-language models (VLMs) have the potential to revolutionize various applications, and their performance can be improved through fine-tuning and customization. In this 2025 Embedded Vision Summit presentation, Monika Jhuria, Technical Marketing Engineer at NVIDIA, explores domain adaptation for VLMs, including the factors to consider when fine-tuning a VLM, such as dataset requirements. Jhuria examines two key approaches to customization: VLM fine-tuning, encompassing both full fine-tuning and memory-efficient methods such as low-rank adaptation (LoRA), and retrieval-augmented generation (RAG) for enhanced adaptability. Finally, she discusses metrics for validating the performance of VLMs and best practices for testing domain-adapted VLMs in real-world applications. You will gain a practical understanding of VLM fine-tuning and customization and will be equipped to make informed decisions about how to unlock the full potential of these models in your own projects.
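
As a rough sketch of the memory-efficient fine-tuning approach Jhuria describes, the snippet below wraps a VLM with LoRA adapters using the Hugging Face transformers and peft libraries; the checkpoint name, target modules and hyperparameters are illustrative assumptions rather than recommendations from the talk.

import torch
from transformers import AutoProcessor, AutoModelForVision2Seq
from peft import LoraConfig, get_peft_model

model_id = "llava-hf/llava-1.5-7b-hf"  # example checkpoint; substitute your own
processor = AutoProcessor.from_pretrained(model_id)  # prepares image-text pairs for training
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.float16)

# LoRA trains small low-rank adapter matrices instead of the full weights,
# sharply reducing trainable parameters and optimizer memory.
lora_config = LoraConfig(
    r=16,                                 # adapter rank: capacity vs. size trade-off
    lora_alpha=32,                        # scaling applied to adapter updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters

From here, training proceeds with a standard loop or the transformers Trainer over a domain dataset. RAG, by contrast, leaves the model weights untouched, retrieving domain-specific text or images into the prompt at inference time.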

IMAGE SENSOR EVOLUTIONS AND OPTIMIZATIONS

An Introduction to the MIPI CSI-2 Image Sensor Standard and Its Latest Advances

In this 2025 Embedded Vision Summit presentation, Haran Thanigasalam, Camera and Imaging Systems Consultant for the MIPI Alliance, provides an overview of the MIPI CSI-2 image sensor interface standard, covering its fundamental features and capabilities, including low-energy transport solutions, power spectral density reduction and latency reduction. He discusses the latest advances, such as always-on sentinel conduit, smart region of interest and multi-pixel compression. Thanigasalam also explores emerging features, including D-PHY embedded clock mode and event sensing and processing. This presentation provides a comprehensive understanding of the MIPI CSI-2 standard and its applications in imaging systems.

Technology and Market Trends in CMOS Image Sensors

In this 2025 Embedded Vision Summit interview, Shung Chieh, Senior Vice President at Eikon Systems, and Florian Domengie, Principal Technology and Market Analyst for Imaging at the Yole Group, examine the most important trends in CMOS image sensor (CIS) technology and markets, along with the factors driving these trends. In addition to highlighting trends in mainstream CIS products and markets, they explore emerging technologies, such as event-based image sensors, new sensing modalities, and sensors with augmented intelligence and processing capabilities, and assess the opportunities for these new approaches.

UPCOMING INDUSTRY EVENTS

Infrared Imaging: Technologies, Trends, Opportunities and Forecasts – Yole Group Webinar: September 23, 2025, 9:00 am PT

Embedded Vision Summit: May 11-13, 2026, Santa Clara, California

More Events

FEATURED NEWS

NVIDIA’s Blackwell-powered Jetson Thor is Now Available, Accelerating the Age of General Robotics

Andes Technology Further Expands Its Long-term Collaboration with Sequans Communications with AndesCore A25MP and N25F RISC-V CPU Core Licenses

Introducing the Arducam ezBOX Camera Module Series: Simplifying Embedded Vision Solutions

FRAMOS’ First Module Will be Available Soon for Sony’s IMX811 High-resolution Image Sensor

Chips&Media Launches Cframe60: Lossless and Lossy Frame Compression Standalone Hardware IP

More News

EDGE AI AND VISION PRODUCT OF THE YEAR WINNER SHOWCASE

Quadric Chimera QC Series GPNPU Processors (Best Edge AI Processor IP)

Quadric’s Chimera QC Series GPNPU Processors are the 2025 Edge AI and Vision Product of the Year Award Winner in the Edge AI Processor IP category. Designed specifically to tackle the machine learning inference deployment challenges faced by system-on-chip (SoC) developers, the Chimera general-purpose neural processor (GPNPU) family, which scales up to 800 TOPS, is the only fully C++ programmable neural processor solution that can run complete AI and machine learning models on a single architecture, eliminating the need to partition graphs among traditional CPUs, DSPs and matrix accelerators. Chimera processors execute every known graph operator at high performance, without relying on slower DSPs or CPUs for less commonly used layers, and this full programmability ensures that hardware built with Chimera GPNPUs can support future vision AI models, not just a limited selection of existing networks. The family’s simple yet powerful architecture delivers improved matrix computation performance compared with traditional approaches; its key differentiator is the ability to execute diverse workloads with great flexibility within a single processor.

The Chimera GPNPU family offers a unified processor architecture capable of handling matrix and vector operations alongside scalar (control) code in one execution pipeline. In conventional SoC architectures, these tasks are typically managed separately by an NPU, DSP, and real-time CPU, necessitating the division of code and performance tuning across two or three heterogeneous cores. In contrast, the Chimera GPNPU operates as a single software-controlled core, enabling the straightforward expression of complex parallel workloads. Driven entirely by code, the Chimera GPNPU empowers developers to continuously optimize the performance of their models and algorithms throughout the device’s lifecycle. This makes it ideal for running classic backbone networks, today’s newest Vision Transformers and Large Language Models, as well as any future networks that may be developed.
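
To make the partitioning problem concrete, the short sketch below uses ONNX Runtime, a generic inference runtime rather than Quadric’s toolchain, to show how a conventional flow splits a model between an accelerator and a CPU fallback; "model.onnx" is a placeholder for any exported vision model.

import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",  # placeholder: any exported vision model
    providers=[
        "CUDAExecutionProvider",  # stand-in for a dedicated accelerator
        "CPUExecutionProvider",   # fallback engine for unsupported layers
    ],
)
# ONNX Runtime warns at session creation when graph nodes could not be
# assigned to the preferred provider; those nodes run on the CPU fallback,
# which is exactly the heterogeneous split a single-core GPNPU avoids.
print(session.get_providers())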

Please see here for more information on Quadric’s Chimera QC Series GPNPU Processors. The Edge AI and Vision Product of the Year Awards celebrate the innovation of the industry’s leading companies that are developing and enabling the next generation of edge AI and computer vision products. Winning a Product of the Year award recognizes a company’s leadership in edge AI and computer vision as evaluated by independent industry experts.

Here you’ll find a wealth of practical technical insights and expert advice to help you bring AI and visual intelligence into your products without flying blind.

Contact

Address

Berkeley Design Technology, Inc.
PO Box #4446
Walnut Creek, CA 94596

Phone
+1 (925) 954-1411