Dr. Ren Wu, former distinguished scientist at Baidu's Institute of Deep Learning (IDL), presents the keynote talk, "Enabling Ubiquitous Visual Intelligence Through Deep Learning," at the May 2015 Embedded Vision Summit.
Deep learning has been making headlines lately in computer vision research. Inspired by the human brain, it employs massive replication of simple algorithms that learn to distinguish objects through training on vast numbers of examples. Neural networks trained in this way are gaining the ability to recognize objects as accurately as humans.
Some experts believe that deep learning will transform the field of vision, enabling the widespread deployment of visual intelligence in many types of systems and applications. But many practical problems must be solved before this goal can be reached. For example, how can we create the large sets of real-world images required to train neural networks? And given their heavy computational requirements, how can we deploy neural networks in mobile and wearable devices, which have tight cost and power consumption constraints?
In this talk, Ren shares an insider’s perspective on these and other critical questions related to the practical use of neural networks for vision, based on the pioneering work being conducted by his former team at Baidu.
Note 1: Regarding the ImageNet results included in this presentation, the organizers of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) have said: “Because of the violation of the regulations of the test server, these results may not be directly comparable to results obtained and reported by other teams.”
Note 2: The presenter, Ren Wu, has told the Embedded Vision Alliance that “There was some ambiguity with the rules. According to the ‘official’ interpretation of the rules, there should be no more than 52 submissions within a half year. For us, we achieved the reported results after 200 tests total within a half year. We believe there is no way to obtain any measurable gains, nor did we try to obtain any gains, from an ‘extra’ hundred tests as our networks have billions of parameters and are trained by tens of billions of training samples.”