The Embedded Vision Summit is the preeminent conference on practical computer vision, covering applications at the edge and in the cloud. It attracts a global audience of over one thousand product creators, entrepreneurs and business decision-makers who are creating and using computer vision technology. The Embedded Vision Summit has experienced exciting growth over the last few years, with 98% of 2017 Summit attendees reporting that they’d recommend the event to a colleague. The next Summit will take place May 22-24, 2018 in Santa Clara, California. I encourage you to register for next year's Embedded Vision Summit while Super Early Bird discount rates are still available, using discount code NLEVI1205!
Google will deliver the free webinar "An Introduction to Developing Vision Applications Using Deep Learning and Google's TensorFlow Framework" on January 17, 2018 at 9 am Pacific Time, in partnership with the Embedded Vision Alliance. The webinar will be presented by Pete Warden, research engineer and technology lead on the mobile and embedded TensorFlow team. It will begin with an overview of deep learning and its use for computer vision tasks. Warden will then introduce Google's TensorFlow as a popular open source framework for deep learning development, training, and deployment, and provide an overview of the resources Google offers to enable you to kick-start your own deep learning project. He'll conclude with several case study design examples that showcase TensorFlow use and optimization on resource-constrained mobile and embedded devices. For more information and to register, see the event page.
Editor-In-Chief, Embedded Vision Alliance
DEEP LEARNING FOR VISION
Deep Visual Understanding from Deep Learning
Deep learning neural networks coupled with high-performance computing have led to remarkable advances in computer vision. For example, we now have the capability to detect and localize people or objects and determine their 3D pose and layout in a scene. But we are still quite short of "visual understanding," a much larger problem. Vision, for example, helps guide manipulation and locomotion, and this requires building dynamic models of the consequences of various actions. Further, we should not just detect people, objects and actions but also link them together, by what we call "visual semantic role labeling" – essentially identifying subject-verb-object relationships. And finally, we should be able to make predictions – what will happen next in a video stream? In this highly rated 2017 Embedded Vision Summit keynote talk, Jitendra Malik, Professor and Chair of the Department of Electrical Engineering and Computer Science at the University of California, Berkeley, reviews progress in deep visual understanding, gives an overview of the state of the art, and shows a tantalizing glimpse into what the future holds.
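To make the idea of a subject-verb-object relationship concrete, here is a minimal sketch of how such a triple might be represented in code. All class and field names here are illustrative assumptions, not part of Malik's work:

```python
from dataclasses import dataclass

@dataclass
class DetectedEntity:
    """A person or object detected in a frame (illustrative names)."""
    label: str   # e.g. "person", "cup"
    box: tuple   # (x, y, width, height) bounding box in pixels

@dataclass
class SemanticRole:
    """One subject-verb-object triple linking two detections,
    in the spirit of visual semantic role labeling (hypothetical schema)."""
    subject: DetectedEntity
    verb: str
    obj: DetectedEntity

# Link two detections into a relationship rather than reporting them in isolation.
person = DetectedEntity("person", (120, 40, 80, 200))
cup = DetectedEntity("cup", (200, 150, 30, 40))
role = SemanticRole(subject=person, verb="holds", obj=cup)
print(f"{role.subject.label} {role.verb} {role.obj.label}")  # person holds cup
```

The point of the structure is simply that detections become arguments of a relation, which is what distinguishes semantic role labeling from plain detection.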
The Battle Between Traditional Algorithms and Deep Learning: The 3 Year Horizon
Deep learning techniques are gaining in popularity for many vision tasks. Will they soon dominate every facet of embedded vision? In this presentation, Cormac Brick, Director of Machine Intelligence for Intel's Movidius Group, explores this question by examining the theory and practice of applying deep learning to real-world vision problems, with examples illustrating how this shift is happening more quickly in some areas and more slowly in others. Today it's widely accepted that deep learning techniques based on convolutional neural networks (CNNs) dominate image recognition tasks, while other vision tasks use hybrid approaches, and still others rely on classical vision techniques. These themes are further explored through a range of real-world examples: gesture tracking, which is moving to CNNs; SLAM, moving toward a hybrid approach; ISP (imaging pipelines), sticking with traditional algorithms; and geometry-based functions such as warping and point cloud manipulation, which will likely stay with classical approaches. The audience will gain a better understanding of the mix of approaches required to deploy a robust real-world embedded vision system.
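The CNN building block at the heart of this shift is a convolution followed by a nonlinearity. A minimal NumPy sketch of one such stage is below; the hand-picked edge kernel stands in for weights that a real CNN would learn from data, and production frameworks use heavily optimized kernels rather than loops like these:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 2D 'valid' cross-correlation (what deep learning frameworks
    call convolution): the core operation of a CNN layer."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

def relu(x):
    """Elementwise nonlinearity applied after the convolution."""
    return np.maximum(x, 0.0)

# A vertical-edge kernel, hand-designed here for illustration;
# in a trained CNN these weights would be learned.
edge = np.array([[-1.0, 0.0, 1.0]] * 3)

img = np.zeros((5, 5))
img[:, 2:] = 1.0  # left half dark, right half bright

feature_map = relu(conv2d_valid(img, edge))
# The feature map responds strongly at the dark-to-bright transition.
```

The same sliding-window structure underlies both learned CNN layers and many classical filters, which is part of why hybrid pipelines mixing the two are natural.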
VISION PROCESSORS AND CORES
A Multi-purpose Vision Processor for Embedded Systems
This presentation from Allied Vision's Michael Melle and Felix Nikolaus gives an overview of an innovative vision processor that delivers the superior image quality of industrial cameras while enabling the embedded vision system designer to meet the cost, size and power requirements of embedded vision applications. In addition to delivering superior image quality, the processor functions as a sensor bridge, an ISP and a signal transmitter for downstream processing. The processor eases the transition from industrial cameras, which rely on larger and more expensive commercial processors, to embedded applications that require smaller form factors, lower power and advanced image processing. Through an extensive library of pre-processing and image processing functions, designers can shift more image processing into the camera, reducing the demands on data transfer and host processing. The processor accepts inputs from most image sensors on the market and can handle high data rates and image resolutions.
Ultra-Efficient VPU: Low-power Deep Learning, Computer Vision and Computational Photography
This talk from Petronel Bigioi, General Manager at FotoNation, focuses on bringing intelligence to the edge to enable local devices to see and hear. It explores the power-consumption-vs.-flexibility dilemma by examining hard-coded and programmable architectures. It highlights FotoNation's hybrid design, in which the most appropriate processing path is chosen to achieve the best power, performance, programmability and silicon area. It also highlights FotoNation's approach to programmable deep neural network engines. And, since seeing is believing, Bigioi demonstrates use cases in face detection, face tracking, generic object tracking and more.