fbpx

Auro Tripathy, Founding Principal at ShatterLine Labs, presents the "How CNNs Localize Objects with Increasing Precision and Speed" tutorial at the May 2017 Embedded Vision Summit.

Locating objects in images (“detection”) quickly and efficiently enables object tracking and counting applications on embedded visual sensors (fixed and mobile). By 2012, progress on techniques for detecting objects in images – a topic of perennial interest in computer vision – had plateaued, and techniques based on histogram of oriented gradients (HOG) were state of the art. Soon, though, convolutional neural networks (CNNs), in addition to classifying objects, were also beginning to become effective at simultaneously detecting objects. Research in CNN-based object detection was jump-started by the groundbreaking region-based CNN (R-CNN).

In this talk, Tripathy follows the evolution of neural network algorithms for object detection, starting with R-CNN and proceeding to Fast R-CNN, Faster R-CNN, “You Only Look Once” (YOLO), and up to the latest Single Shot Multibox detector. He examines the successive innovations in performance and accuracy embodied in these algorithms – which is a good way to understand the insights behind effective neural-network-based object localization. He also contrasts bounding-box approaches with pixel-level segmentation approaches and presents pros and cons.

logo_2020

May 18 - 21, Santa Clara, California

The preeminent event for practical, deployable computer vision and visual AI, for product creators who want to bring visual intelligence to products.

Here you’ll find a wealth of practical technical insights and expert advice to help you bring AI and visual intelligence into your products without flying blind.

Contact

Address

1646 North California Blvd.,
Suite 360
Walnut Creek, CA 94596 USA

Phone
Phone: +1 (925) 954-1411
Scroll to Top