Time is running out! Today is the last day to save 15% on your 2018 Embedded Vision Summit registration pass with our Early Bird Discount. Don’t miss the chance to accelerate your learning curve and uncover practical techniques in computer vision by attending some of our 80+ “how-to” sessions. Use promo code NLEVI0410 when you register online today to save!
Tomorrow, April 11 at 7 pm PT, Embedded Vision Alliance Founder Jeff Bier will deliver the free presentation "Computer Vision at the Edge and in the Cloud: Architectures, Algorithms, Processors, and Tools" at the monthly meeting of the IEEE Signal Processing Society in Santa Clara, California. Bier will discuss the benefits and trade-offs of edge, cloud, and hybrid vision processing models, and when you should consider each option. He will also provide an update on important recent developments in the technologies enabling vision, including processors, sensors, algorithms, development tools, services and standards, and will highlight some of the most interesting and promising end products and applications incorporating vision capabilities. For more information and to register, see the event page.
Editor-In-Chief, Embedded Vision Alliance
DEEP LEARNING FOR VISION
How CNNs Localize Objects with Increasing Precision and Speed
Locating objects in images (“detection”) quickly and efficiently enables object tracking and counting applications. By 2012, progress on techniques for detecting objects in images had plateaued, and techniques based on histograms of oriented gradients (HOG) were the state of the art. Soon, though, convolutional neural networks (CNNs), in addition to classifying objects, were beginning to prove effective at detecting them as well. Research in CNN-based object detection was jump-started by the groundbreaking region-based CNN (R-CNN). In this talk, Auro Tripathy, Founding Principal at ShatterLine Labs, follows the evolution of neural network algorithms for object detection, starting with R-CNN and proceeding through Fast R-CNN, Faster R-CNN and “You Only Look Once” (YOLO), up to the latest Single Shot MultiBox Detector (SSD). He examines the successive innovations in performance and accuracy embodied in these algorithms – a good way to understand the insights behind effective neural-network-based object localization. He also contrasts bounding-box approaches with pixel-level segmentation approaches and presents the pros and cons of each.
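All of the bounding-box detectors in that lineage, from R-CNN through SSD, share a couple of simple building blocks: scoring box overlap with intersection-over-union (IoU) and pruning duplicate predictions with non-maximum suppression (NMS). Here is a minimal, framework-free sketch of both; the function names and the 0.5 threshold are illustrative, not taken from the talk.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, threshold=0.5):
    """Keep the highest-scoring box, drop boxes that overlap it, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < threshold]
    return keep

# Two near-duplicate detections plus one distinct one:
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # the lower-scoring duplicate is suppressed
```

The single-shot detectors (YOLO, SSD) owe much of their speed to running exactly one such suppression pass over a dense grid of predictions, rather than scoring region proposals one at a time as R-CNN did.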
Designing CNN Algorithms for Real-time Applications
The real-time performance of CNN-based applications can be improved several-fold by making smart decisions at each step of the design process – from the selection of the machine learning framework and libraries used, to the design of the neural network algorithm, to the implementation of the algorithm on the target platform. This talk from Matthew Chiu, Founder of Almond AI, delves into how to evaluate the runtime performance of a CNN from a software architecture standpoint. It then explains in detail how to build a neural network from the ground up based on the requirements of the target hardware platform. Chiu shares his ideas on how to improve performance without sacrificing accuracy, by applying recent research on training very deep networks. He also shows examples of how network optimization can be achieved at the algorithm design level by making more efficient use of weights before the model is compressed via more traditional methods for deployment in a real-time application.
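To make the "efficient use of weights at the design level" idea concrete, here is a back-of-the-envelope comparison of a standard 3x3 convolution against a depthwise-separable factorization (the technique popularized by MobileNet). This is our own illustrative example of a design-level optimization, not a method drawn from the talk, and the layer sizes are arbitrary.

```python
def conv_params(in_ch, out_ch, k):
    """Weight count of a standard k x k convolution (biases omitted)."""
    return in_ch * out_ch * k * k

def depthwise_separable_params(in_ch, out_ch, k):
    """Depthwise k x k filter per input channel, then a 1x1 pointwise mix."""
    return in_ch * k * k + in_ch * out_ch

standard = conv_params(128, 128, 3)
separable = depthwise_separable_params(128, 128, 3)
print(f"standard: {standard} weights, separable: {separable} weights, "
      f"ratio: {standard / separable:.1f}x")
```

For this 128-in, 128-out 3x3 layer the factorization cuts the weight count by roughly 8x before any pruning or quantization is applied, which is exactly the kind of saving that compounds with the "more traditional" compression methods mentioned above.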
Computer Vision on ARM: Faster Ways to Optimize Software for Advanced Mobile Computing Platforms
Mobile computing platforms are ubiquitous, and therefore provide an exciting vehicle for the deployment of new computer vision and deep learning applications. This talk from Roberto Mijat, Director of Software Product Management in the Business Segment Group at ARM, explores real industry use cases where the adoption of optimized low-level primitives for ARM processors has enabled improved performance and optimal use of heterogeneous system resources. Mijat examines how these primitives are used for popular machine learning frameworks, and how they can be used for other frameworks and applications. He also provides a live demonstration and discusses key aspects of performance analysis.
Adventures in DIY Embedded Vision: The Can’t-miss Dartboard
Can a mechanical engineer with no background in computer vision build a complex, robust, real-time computer vision system? Yes, with a little help from his friends. Engineer, inventor and YouTube personality Mark Rober fulfilled a three-year-long dream to create a dartboard where the player gets a bullseye every time. Basically, the player throws a dart, and a motion capture system tracks the dart in the air. Rober's system uses the tracked X, Y and Z positions to predict where the dart will land, via regression analysis. Once the system knows where the dart will land, it moves the board to the right spot using stepper motors. All of this happens in around 400 ms. In this talk, Rober presents the design of his vision-controlled dartboard, highlights some of the challenges he faced in building it, and shares how he overcame these challenges.
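The regression step described above can be sketched in a few lines: since gravity makes the dart's height quadratic in time while its horizontal motion is roughly linear, fitting low-order polynomials to the tracked samples lets you extrapolate the impact point. The numbers, units and board distance below are made up for illustration and are not from Rober's actual system.

```python
import numpy as np

# Simulated motion-capture samples: time (s), x toward board (m), z height (m).
t = np.array([0.00, 0.05, 0.10, 0.15, 0.20])
x = 6.0 * t                        # roughly constant horizontal speed
z = 1.5 + 2.0 * t - 4.9 * t**2     # thrown slightly upward, falling under gravity

# Least-squares fits: x linear in t, z quadratic in t.
x_fit = np.polyfit(t, x, 1)
z_fit = np.polyfit(t, z, 2)

board_x = 2.37                     # regulation throwing distance, for illustration
t_hit = (board_x - x_fit[1]) / x_fit[0]   # time at which the dart reaches the board
z_hit = np.polyval(z_fit, t_hit)          # predicted height at impact
print(f"impact at t={t_hit:.3f} s, height z={z_hit:.3f} m")
```

With the predicted (t_hit, z_hit) in hand, the remaining budget inside the ~400 ms window goes to driving the stepper motors that re-position the board.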