LETTER FROM THE EDITOR

Dear Colleague,

Happy New Year, and Happy CES to those attending! If you’re heading to CES and want to see the latest in physical AI and computer vision technologies, check out our Directory of Alliance Members at CES to see what they are showing, where to find them, and easy ways to set up appointments for suite and demo visits.

Would you or a colleague benefit from an introduction to (or review of) foundational vision/AI techniques? If so, check out the insightful presentations below from experienced practitioners. (And stay tuned for more in our next Insights newsletter edition in two weeks.) We’ll also catch up on some news highlights you may have missed over the holidays.

Lastly, don’t miss your chance to save 25% on registration for the 2026 Embedded Vision Summit, coming up May 11-13 in Santa Clara, California! Register with code 26EVSUM-NL to take advantage of this price.

Without further ado, let’s dig in!

Erik Peters
BUILDING AND DEPLOYING REAL-WORLD ROBOTS
DEEP LEARNING LITERACY: BACK TO BASICS

Introduction to Deep Learning and Visual AI: Fundamentals and Architectures
This talk provides a high-level introduction to artificial intelligence and deep learning, covering the basics of machine learning and the key concepts of deep learning. Mohammad Haghighat, Senior Manager for CoreAI at eBay, explores the different types of deep learning architectures, including fully connected networks, convolutional neural networks (CNNs), recurrent neural networks (RNNs), 3D CNNs and transformers, highlighting their most common use cases and applications. He then focuses on visual AI, introducing CNNs as a fundamental architecture for image and video analysis. Haghighat discusses the building blocks of CNNs and explores example architectures such as Inception, ResNet and EfficientNet. Finally, he highlights some recent trends in visual AI such as vision transformers (ViTs), hybrid architectures and vision-language models (VLMs). You will gain a solid understanding of the fundamentals of deep learning and visual AI, as well as recent advancements and current trends in the field.

Introduction to DNN Training: Fundamentals, Process and Best Practices
Training a model is a crucial step in machine learning, but it can be overwhelming for beginners. In this talk, Kevin Weekly, CEO of Think Circuits, provides a comprehensive introduction to the fundamentals of model training. He introduces the different types of training, such as supervised, unsupervised and semi-supervised learning, and then delves into techniques for supervised training. He explains the training process, including error surfaces, optimization methods and back-propagation. Weekly explains key concepts such as trainable parameters and data requirements. He also discusses the main “knobs” that control the training process, such as hyperparameters, regularization and batch normalization, and provides an overview of metrics to monitor during training, including loss curves, model accuracy and precision. Additionally, he covers common problems that arise during training, such as overfitting and underfitting, and introduces approaches to address these issues. Finally, he touches on popular training frameworks and provides resources for further learning.
DATA, PIPELINES & MLOPS

Mastering the End-to-end Machine Learning Model Building Process: Best Practices and Pitfalls
In this talk, Paril Ghori, Senior Data Scientist at Caterpillar, explores the complete machine learning model building process, providing data scientists and ML engineers with practical insights and strategies for success. He examines each phase of the model life cycle, from data ingestion and pre-processing to feature engineering, model selection, training and fine-tuning. He explains best practices for model evaluation, validation and deployment, including effective MLOps integration to ensure seamless model monitoring and scalability. Ghori highlights real-world case studies and common pitfalls encountered during model development, offering actionable solutions and strategies to overcome challenges. He emphasizes optimizing workflows to improve performance and ensure reproducibility in complex projects. You’ll gain a deeper understanding of the entire model building process, along with insights that will help you build robust, efficient and scalable models to drive impactful business outcomes and support continuous innovation.

Multimodal Enterprise-scale Applications in the Generative AI Era
As artificial intelligence makes rapid strides in the use of large language models, the need for multimodality arises in many application scenarios. Just as humans use multiple sensory systems to solve problems and arrive at decisions, AI problem-solving in many applications is enriched by multimodal inputs. In this presentation, Mumtaz Vauhkonen, Senior Director of AI at Skyworks Solutions, explores the process of building multimodal applications at scale, focusing on the core aspects of quality dataset creation, multimodal data fusion techniques and model pipelines for enterprise applications. She also examines the challenges that arise in bringing these applications to production and techniques for addressing these challenges.
UPCOMING INDUSTRY EVENTS

Embedded Vision Summit: May 11-13, 2026, Santa Clara, California
FEATURED NEWS

NVIDIA has debuted the Nemotron 3 family of open models, featuring a hybrid latent MoE architecture
poLight ASA and Image Quality Labs have announced an M12-based RPi TLens development platform for rapid evaluation of high speed, constant field-of-view focusing functionality
NVIDIA’s agreement with Groq will accelerate AI inference at global scale
Samsung is rumored to be launching an in-house mobile GPU by 2027
Google has released FunctionGemma, a lightweight function-calling model aimed at on-device agents






