Jitendra Malik, Arthur J. Chick Professor and Chair of the Department of Electrical Engineering and Computer Science at the University of California, Berkeley, presents the "Deep Visual Understanding from Deep Learning" tutorial at the May 2017 Embedded Vision Summit.

Deep learning and neural networks coupled with high-performance computing have led to remarkable advances in computer vision. For example, we now have a good capability to detect and localize people or objects and determine their 3D pose and layout in a scene. But we are still quite short of "visual understanding," a much larger problem.

For example, vision helps guide manipulation and locomotion, and this requires building dynamic models of the consequences of various actions. Furthermore, we should not just detect people, objects and actions but also link them together, by what we call "visual semantic role labeling," essentially identifying subject-verb-object relationships. And finally, we should be able to make predictions: what will happen next in a video stream? In this talk, Professor Malik reviews progress in deep visual understanding, gives an overview of the state of the art, and offers a tantalizing glimpse into what the future holds.


