This blog post was originally published at Vision Systems Design's website. It is reprinted here with the permission of PennWell.
Conventional 2-D image sensors, found in the bulk of today's computer vision system designs, enable a tremendous breadth of vision capabilities. However, their inability to discern an object's distance from the sensor can make it difficult or impossible to implement some vision functions. And clever workarounds, such as supplementing 2-D sensed representations with known 3-D models of identified objects (human hands, bodies or faces, for example), can be too constraining in some cases.
In what kinds of applications would full 3-D sensing be of notable value? Consider, for example, a gesture interface implementation. The ability to discern motion not only up-and-down and side-to-side but also front-to-back greatly expands the variety, richness and precision of the suite of gestures that a system can decode. Or consider a biometrics application: face recognition. Depth sensing is valuable in determining that the object being sensed is an actual person's face, versus a photograph of that person's face; alternative means of accomplishing this objective, such as requiring the biometric subject to blink during the sensing cycle, are inelegant in comparison.
ADAS (advanced driver assistance system) and autonomous vehicle applications that benefit from 3-D sensors are abundant. You can easily imagine, for example, the added value of being able to determine not only that another vehicle or object is in the roadway ahead of you, but also to accurately discern its distance. The need for accurate, three-dimensional non-contact scanning of real-world objects, whether for a medical instrument, in conjunction with increasingly popular 3-D printers, or for some other purpose, is also obvious. And plenty of other compelling applications exist: 3-D videoconferencing, manufacturing defect screening, etc.
Two talks at the 2016 Embedded Vision Summit covered 3-D sensing application advantages and implementation specifics in detail. First is "Getting from Idea to Product with 3D Vision," co-presented by Intel's Anavai Ramesh and MathWorks' Avinash Nehemiah. To safely navigate autonomously, they suggest, cars, drones and robots need to understand their surroundings in three dimensions. For system developers, 3-D vision brings a slew of new concepts, terminology, and algorithms, such as SLAM (simultaneous localization and mapping), SFM (structure from motion) and visual odometry. Their talk focuses on the challenges engineers are likely to face while incorporating 3-D vision algorithms into products, and discusses practical approaches to solving these problems in real-world autonomous systems. Here's a preview:
I also recommend you check out the presentation "High-resolution 3D Reconstruction on a Mobile Processor" from Qualcomm's Michael Mangen. Mangen provides an overview of how you can implement high-resolution 3-D reconstruction – a capability typically requiring cloud or server processing – directly on a mobile device, thanks to advances in depth sensors and mobile processors. Here's a preview:
What are the different types of depth sensors available for consideration in your next design? For the answer to that question, I recommend you take a look at "3-D Sensors Bring Depth Discernment to Embedded Vision Designs," published on the Embedded Vision Alliance website. This technical article explains stereoscopic vision, structured light and time-of-flight technologies in detail, comparing them and suggesting which might be preferable in a particular situation. And if the earlier mentioned gesture interface and face recognition applications are of interest to you, I recommend two other technical articles: "The Gesture Interface: A Compelling Competitive Advantage in the Technology Race" and "Facial Analysis Delivers Diverse Vision Processing Capabilities."
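For a rough sense of how two of these approaches turn raw measurements into distance, here is a minimal Python sketch. It is not taken from the article mentioned above; it assumes an idealized, rectified pinhole stereo pair and a direct (pulsed) time-of-flight measurement, and the function and parameter names are hypothetical, chosen only for illustration.

```python
# Illustrative only: textbook depth relationships under idealized assumptions.
# All names below are hypothetical and not drawn from any particular sensor SDK.

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def stereo_depth_m(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth from a rectified stereo pair: Z = f * B / d.

    focal_length_px: focal length expressed in pixels
    baseline_m:      distance between the two camera centers, in meters
    disparity_px:    horizontal shift of the same point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px


def tof_depth_m(round_trip_time_s: float) -> float:
    """Depth from a direct time-of-flight measurement: Z = c * t / 2.

    The emitted light travels to the object and back, hence the division by two.
    """
    if round_trip_time_s < 0:
        raise ValueError("round-trip time cannot be negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0


if __name__ == "__main__":
    # Hypothetical numbers: 700-pixel focal length, 10 cm baseline, 35-pixel disparity.
    print(f"stereo depth: {stereo_depth_m(700.0, 0.10, 35.0):.2f} m")
    # Hypothetical 20-nanosecond round trip.
    print(f"time-of-flight depth: {tof_depth_m(20e-9):.2f} m")
```

Note the practical implication of the stereo relationship: because depth is inversely proportional to disparity, a fixed error in disparity produces a larger depth error for distant objects than for nearby ones.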
Depth sensing is one of the main focus areas of the next Embedded Vision Summit, taking place May 1-3, 2017 at the Santa Clara Convention Center in Santa Clara, California. The Summit is designed for product creators interested in incorporating visual intelligence into electronic systems and software. It aims to inspire attendees' imaginations about potential applications for practical computer vision technology through exciting presentations and demonstrations, to offer practical know-how that helps attendees incorporate vision capabilities into their hardware and software products, and to provide opportunities for attendees to meet and talk with leading vision technology companies and learn about their offerings.
For example, Chris Osterwood, Chief Technical Officer at Carnegie Robotics, will deliver the presentation "How to Choose a 3-D Vision Technology" at the upcoming Summit. His talk will provide an overview of various 3-D sensor technologies (passive stereo, active stereo, time of flight, structured light, 2-D and 3-D lasers, and monocular) and their capabilities and limitations, based on his company's experience selecting the right 3-D technology and sensor for a diverse range of autonomous robot designs. Check out the introductory video that follows, and then register now to secure your seat at the Summit. Super Early Bird registration rates are available for a limited time only using discount code colvsd0117. Additional information is now available on the Alliance website.
I'll be back again soon with more discussion on a timely computer vision topic. Until then, as always, I welcome your comments.
Editor-in-Chief, Embedded Vision Alliance