We’re excited to announce the keynote speaker for the May 22-24 Embedded Vision Summit: Dr. Kristen Grauman, an award-winning, pioneering researcher in computer vision and machine learning focused on visual recognition, video, and embodied perception. Dr. Grauman is a Professor in the Department of Computer Science at the University of Texas at Austin and a Research Director at Facebook AI Research (FAIR). In her keynote talk, “Frontiers in Perceptual AI: First-Person Video and Multimodal Perception,” Dr. Grauman will explore first-person or “egocentric” perception, which requires understanding the video and multimodal data that streams from wearable cameras and other sensors. The multimodal nature is particularly compelling, with opportunities to bring together audio, language, and vision.
In her presentation, Dr. Grauman will introduce Ego4D, a massive new open-source multimodal egocentric dataset that captures the daily-life activity of people around the world. Building on this resource, she’ll present her team’s ideas for searching egocentric videos with natural language queries, injecting semantics from text and speech into powerful video representations, and learning audio-visual models to, for example, understand a camera wearer’s physical environment or augment their hearing in busy places. She’ll also discuss the performance challenges raised by very long video sequences, along with ideas for scaling retrieval and video encoders.
The Embedded Vision Summit, returning this year to the Santa Clara, California, Convention Center, is the key event for system and application developers who are incorporating computer vision and perceptual AI into products. It attracts a unique audience of over 1,400 product creators, entrepreneurs and business decision-makers who are creating and using computer vision and edge AI technologies. It’s an ideal venue for learning, sharing insights and getting the word out about interesting new technologies, techniques, applications, products and practical breakthroughs in computer vision and edge AI.
Once again we’ll be offering a packed program with more than 90 sessions, more than 80 exhibitors, and hundreds of demos, all covering the technical and business aspects of practical computer vision, deep learning, edge AI and related technologies. Also returning this year is the Edge AI Deep Dive Day, a series of in-depth sessions focused on specific topics in perceptual AI at the edge. Registration is now open, and you can save 15% by using the code SUMMIT23-NL. Register today and tell a friend! You won’t want to miss what is shaping up to be our best Summit yet.
Editor-In-Chief, Edge AI and Vision Alliance
PROCESSING OPTIONS IN AUTONOMOUS SYSTEMS
Natural Intelligence Outperforms Artificial Intelligence for Autonomy and Vision
Mainstream approaches to AI for autonomy and computer vision make use of data-, energy- and compute-intensive techniques such as deep learning, which struggle to generalize and are fragile and opaque. In contrast, natural intelligence has already solved the fundamental problems in vision and autonomy, even in compute packages as small as the insect brain. (For example, honeybee brains comprise approximately one million neurons, occupying one cubic millimeter and consuming microwatts of power.) In this talk, James Marshall, Chief Scientific Officer at Opteran Technologies, outlines how Opteran is pioneering a novel approach by reverse engineering real brain circuits to produce solutions for vision and autonomy that are orders of magnitude more performant in every aspect. This enables robust, efficient, fast and explainable autonomy in previously impossible use cases, from nano drones to logistics solutions that don’t require special infrastructure and that can adapt to novel scenarios.
Build Smarter, Safer and More Efficient Autonomous Robots and Mobile Machines
Automation is expanding rapidly from the factory floor to the consumer’s front door. Examples include autonomous mobile robots used in warehouses, last-mile delivery and service robots, and advanced operator assistance systems used in off-highway machinery such as excavators and lifting platforms. Vision, AI and sensor fusion with functional safety are key technology enablers of these autonomous systems. In this presentation, Manisha Agrawal, Product Marketing Manager at Texas Instruments, discusses the sensing, processing and application development challenges in designing such functionally safe autonomous systems. She then introduces an example of an efficient autonomous system design that is optimized for performance, power, size and system cost using TI’s functional-safety-capable TDA4 family of scalable processors. Finally, she presents the associated application development environment using open-source, industry-standard APIs that enable automatic hardware acceleration without any hand coding.
RICH DATA FUSION FOR ENHANCED MACHINE UNDERSTANDING
Unifying Computer Vision and Natural Language Understanding for Autonomous Systems
As the applications of autonomous systems expand, many such systems need the ability to perceive using both vision and language, coherently. For example, some systems need to translate a visual scene into language. Others may need to follow language-based instructions when operating in environments that they understand visually. Or, they may need to combine visual and language inputs to understand their environments. In this talk, Mumtaz Vauhkonen, Lead Distinguished Scientist and Head of Computer Vision for Cognitive AI in AI&D at Verizon, introduces popular approaches to joint language-vision perception. She also presents a unique deep learning rule-based approach utilizing a universal language object model. This new model derives rules and learns a universal language of object interaction and reasoning structure from a corpus, which it then applies to the objects detected visually. She shows that this approach works reliably for frequently occurring actions. She also shows that this type of model can be localized for specific environments and can communicate with humans and other autonomous systems.
Combining Ultra-low-power Proximity Sensing and Ranging to Enable New Applications
Time-of-flight (ToF) sensors are widely used to provide depth maps, which enable machines to understand their surroundings in three dimensions for applications ranging from touchless user interfaces to obstacle detection for robotics. In this talk, Armita Abadian, Senior Technical Marketing Manager for Imaging in the Americas at STMicroelectronics, shows how her company’s ToF sensors can double as ultra-low-power proximity detectors with the help of a specially designed software driver. In proximity detector mode, the ToF sensor consumes about 150 μW. When an object appears close to the sensor, a single-bit output signal can be used to wake up an embedded processor. The ToF sensor can then be switched to depth sensing mode. These combined capabilities enable new levels of energy efficiency. For example, a touchless user interface system can remain in ultra-low-power mode until a user appears nearby—at which point the system can wake up and switch the ToF sensor into ranging mode to detect hand gestures and control the system.
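The two-mode flow described above can be sketched as a simple state machine: the sensor idles in ultra-low-power proximity mode, raises a single-bit wake signal when an object comes within range, and is then switched into full ranging (depth sensing) mode. The sketch below is illustrative only; the class, method names and the 200 mm wake threshold are hypothetical and do not represent the actual STMicroelectronics driver API.

```python
PROXIMITY_MODE = "proximity"  # ultra-low-power mode (~150 uW per the talk)
RANGING_MODE = "ranging"      # full time-of-flight depth sensing


class ToFSensorSim:
    """Minimal simulation of a dual-mode time-of-flight sensor."""

    def __init__(self, wake_threshold_mm=200):
        self.mode = PROXIMITY_MODE
        self.wake_threshold_mm = wake_threshold_mm

    def proximity_interrupt(self, distance_mm):
        # Single-bit output: 1 if an object is within the wake threshold.
        return 1 if distance_mm <= self.wake_threshold_mm else 0

    def on_sample(self, distance_mm):
        # In proximity mode, the host processor stays asleep until the
        # wake bit fires; then the sensor is switched to ranging mode
        # and starts reporting full distance measurements.
        if self.mode == PROXIMITY_MODE:
            if self.proximity_interrupt(distance_mm):
                self.mode = RANGING_MODE
                return ("wake", distance_mm)
            return ("sleep", None)
        return ("range", distance_mm)


sensor = ToFSensorSim(wake_threshold_mm=200)
for d in (1500, 900, 150, 120):  # object approaching the sensor
    print(sensor.on_sample(d))
```

In a real touchless-interface design, the “wake” event would trigger the embedded processor to power up and begin interpreting the ranging-mode measurements as hand gestures.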
DEEPX develops neural network processing units (NPUs) and other artificial intelligence (AI) technologies for edge applications. The company’s strengths in high performance and ultra-low power consumption enable it to deliver efficient, low-cost AI NPUs for the rapidly growing IoT industry.
Edge Impulse is a leading development platform for machine learning on edge devices. The company’s mission is to provide every developer and device maker with the best development and deployment experience for machine learning on the edge, focusing on sensor, audio, and computer vision applications.
EDGE AI AND VISION PRODUCT OF THE YEAR WINNER SHOWCASE
OrCam Technologies OrCam Read (Best Consumer Edge AI End Product)
OrCam Technologies’ OrCam Read was the 2022 Edge AI and Vision Product of the Year Award winner in the Consumer Edge AI End Product category. OrCam Read is the first of a new class of easy-to-use handheld digital readers that helps people with mild to moderate vision loss, as well as those with reading challenges, access the texts they need and more effectively accomplish their daily tasks. Whether reading an article for school, perusing a news story on a smartphone, reviewing a phone bill or ordering from a menu, OrCam Read is the only personal AI reader that can instantly capture and read full pages of text and digital screens out loud. All of OrCam Read’s information processing – from its text-to-speech functionality implemented to operate on the edge, to its voice-controlled operation using the “Hey OrCam” voice assistant, to the Natural Language Processing (NLP)- and Natural Language Understanding (NLU)-driven Smart Reading feature – happens locally, on the device, with no data connectivity required.
Please see here for more information on OrCam Technologies’ OrCam Read. The Edge AI and Vision Product of the Year Awards celebrate the innovation of the industry’s leading companies that are developing and enabling the next generation of edge AI and computer vision products. Winning a Product of the Year award recognizes a company’s leadership in edge AI and computer vision as evaluated by independent industry experts.
Here you’ll find a wealth of practical technical insights and expert advice to help you bring AI and visual intelligence into your products without flying blind.