Q&A: AI Researcher Roland Memisevic Discusses the Secret to Dataset Generation, Data-driven AI, and What’s Next in Machine Learning

This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm.

Why a change in mindset about capturing the right data is essential for advances in machine learning

Machine learning is rapidly changing how software and algorithms are developed. And data is the lifeblood driving the machine learning revolution. We sat down with Roland Memisevic, senior director of engineering at Qualcomm Canada and part of Qualcomm AI Research, to get the latest updates on creating datasets at scale, data-driven AI, the latest AI research trends, the big AI challenges to overcome, and what’s next in AI.

What led you to AI? Can you tell us about your work with Geoffrey Hinton, and your later academic career?

I’m the classic case. At around 17 years old, I read an AI book by Douglas Hofstadter that really piqued my interest and got me hooked on AI. In my mind, I was thinking of C-3PO and creating human-like robots for companions. This still excites me, and I believe that at some point in our lifetime we will at least see an accurate human interface that understands the world and can naturally communicate with us through a screen, if not an actual robot. We are going to better understand intelligence by building intelligent systems.

I got interested in neural nets around 2002 since they were a form of AI that actually seemed to work. When I decided to get a Ph.D. and pursue an academic career in neural networks, there were actually few opportunities and little funding for this research. One of the places doing leading research was Geoffrey Hinton’s Toronto lab, which I was lucky enough to join. Since neural network theory was a bit messy, not so principled or based on elegant math, it was met with skepticism. You definitely needed to have an engineering mindset to deal with the randomness and exploratory nature of developing neural nets. Some of that still exists today.

What prompted you to start TwentyBN?

Around 2008, it was becoming increasingly apparent that neural networks would have a big impact on speech and, a couple of years later, on computer vision. The only two ingredients missing for neural networks to flourish were compute and labeled data. This was a huge surprise for me and many of my colleagues at that time.

Around 2012, I became a faculty member at MILA, a research institute in artificial intelligence in Montreal. In my MILA research, I was blown away that large, labeled datasets could work so well, and I felt that something was broken in the traditional machine learning workflow, where researchers iterated over model architecture changes on whatever data was available. For TwentyBN, we envisioned a workflow where you focus more on the data than on the model: a data-centric approach where researchers create or improve capabilities in an AI system not by tweaking architectures but by being creative around generating data. As the data grows, the neural network architecture becomes less important. Many of the AI systems we have created are computationally quite simple and run well on power-constrained edge devices, although they solve fairly complex tasks. TwentyBN was created as a company with data at the center of importance.

Why was on-device AI inference an important topic for TwentyBN?

In many interactive applications, the AI needs to interface with and understand the world through sensors, like cameras and microphones, and provide an immediate response, so latency must be low. In addition, privacy is a big concern, so processing and keeping the personal data on the device is required. At TwentyBN, we developed a fitness app where an AI coach would motivate and provide feedback as you exercised, but you certainly did not want this video footage being sent to the cloud.

Generally, since any sensory processing happens ultimately at the edge and at least some degree of processing happens to the raw sensor signal, it is fair to say that there is always some inference at the edge involved. On the other hand, there is usually some cloud component for aggregate data or compute-intensive processing. So, in practically any scenario, we are nowadays dealing with hybrid requirements, and at TwentyBN we made sure that both components — edge and cloud — were available in our applications.

You’ve developed unique datasets over the years. Why?

Intuitively, it seemed like the right thing to do. When you put the data sourcing at the center of everything rather than model tweaking, you are automatically pushed to ask the right questions. As a neural network researcher or developer, you need to think about the data you need rather than the data that is there. However, not many people were trying to solve this problem until quite recently. Andrew Ng’s initiative with “MLOps” is really gaining traction and creating a wake-up moment for the AI community. For TwentyBN, data sourcing evolved as a function of research needs. When the AI system is trained to solve a certain class of tasks, follow-on tasks naturally emerge, requiring new data sourcing interfaces, which then lead to new capabilities, and so on. It is a cycle.

How were you able to scale the collection of high-quality labeled data at low cost?

Operationalizing was the key from the get-go. We built the tooling for our crowd acting platform where crowd workers are paid to act out and record on video the requested concepts from researchers. This was our main focus — creating the software dedicated to this purpose of collecting data. We made it efficient and scalable, added intuitive user interfaces, and of course iterated quickly over time to respond to customer needs — including our own needs.

How can the AI research community take advantage of these datasets?

Two popular datasets we created and licensed at TwentyBN were Something Something (email here for interest) and Jester (email here for interest).

What role can these datasets play in advancing AI research happening at Qualcomm AI Research?

There are two aspects that can play a role. First, the existing datasets, like gesture control, are useful for neural network development, and second, the crowd acting platform can efficiently create new data at scale. The existing data is also useful from a transfer learning perspective. We have always used data sourcing in a “cumulative” fashion, such that a neural network is trained on most of the data that was sourced over the years and fine-tuned on use case-specific data.
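
As a rough illustration of the “cumulative” recipe described above (train on the broad pool of data first, then fine-tune on use case-specific data), here is a minimal sketch in plain Python. It is not TwentyBN’s or Qualcomm’s code. The toy two-feature task, the helper names (`predict`, `train`, `accuracy`, `make_data`), the dataset sizes, and the learning rates are all assumptions chosen only to make the idea concrete.

```python
import math
import random

def predict(weights, x):
    """Probability of the positive class; weights[0] is the bias."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp so exp() stays well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def train(weights, data, epochs, lr):
    """Plain SGD on the logistic loss; returns a new weight list."""
    weights = list(weights)
    for _ in range(epochs):
        for x, y in data:
            err = predict(weights, x) - y
            weights[0] -= lr * err
            for i, xi in enumerate(x):
                weights[i + 1] -= lr * err * xi
    return weights

def accuracy(weights, data):
    """Fraction of examples classified correctly at a 0.5 cutoff."""
    hits = sum((predict(weights, x) >= 0.5) == (y == 1) for x, y in data)
    return hits / len(data)

def make_data(rng, n, threshold, margin=0.0):
    """Hypothetical toy task: label is 1 when x0 + x1 exceeds `threshold`."""
    data = []
    while len(data) < n:
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        s = x[0] + x[1] - threshold
        if abs(s) >= margin:  # optionally drop points too close to the boundary
            data.append((x, 1 if s > 0 else 0))
    return data

rng = random.Random(0)
# Large "cumulative" pool (assumed) and a small use case-specific set (assumed).
generic_data = make_data(rng, 200, threshold=0.0, margin=0.2)
use_case_data = make_data(rng, 40, threshold=0.6)

# Step 1: train from scratch on the broad, accumulated data.
pretrained = train([0.0, 0.0, 0.0], generic_data, epochs=50, lr=0.5)
# Step 2: start from those weights and fine-tune with a smaller learning rate.
fine_tuned = train(pretrained, use_case_data, epochs=20, lr=0.1)

print("pretrained on use-case data:", accuracy(pretrained, use_case_data))
print("fine-tuned on use-case data:", accuracy(fine_tuned, use_case_data))
```

The point of the sketch is only the structure: fine-tuning reuses the pretrained weights as its starting point instead of starting from zero, so the small use case-specific set only has to adjust a model that already captures the general pattern.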

What big challenges are left to overcome in data collection?

A big challenge is on the cultural side, in terms of adopting a completely different mindset and workflow to build AI systems. It is overcoming the entrenched mindset that is so common in the AI community, where you tinker with the neural network architecture rather than focus on getting good data. Once you realize data is the key, it is a matter of operationalizing the collection of good data and building a lot of tooling.

Another big challenge is that things don’t nicely compartmentalize in AI. Figuring out what is good data is often very specific to the application domain. As a result, there is a huge benefit to vertical integration, where the data, data collection, neural network design, and application are all done together. There is a very strong feedback loop when you have this end-to-end understanding and realize what data you need to keep improving. For example, the little discoveries that you make through application feedback inform the data that you need to collect.

Qualcomm Technologies recently brought you and the rest of the top-notch AI research team from TwentyBN on-board. What are your impressions of Qualcomm AI Research so far?

Qualcomm has a no-nonsense culture. There is an intellectual honesty and openness that encourages speaking the truth on technical matters without regard to politics or seniority. Technology decisions are made based on the facts. For me, that has been beautiful to see. It is definitely an engineering culture.

I’m also realizing how strong a position Qualcomm Technologies has as an edge compute player. It is a great place to be because the most important data is always being generated at the edge, and that is where you want AI to run.

You’ve been working at the intersection between cutting-edge AI research and consumer products. What is the key to successfully bringing computer vision innovation to market?

What we learned at TwentyBN developing the fitness application is that you need to vertically integrate across the end-to-end stack and operationalize the data collection process to become efficient and principled.

Looking toward the future, what are the most challenging problems in the field of AI right now?

A big challenge is how to make neural networks, which are a parallel mess, think more. AI is a paradigm change in computing, where we are going from serial computing and a von Neumann architecture to parallel processing of these big parallel messes.

I believe it makes sense to consider a “third compute paradigm” that is much more human-like. Human brains process data in a very parallel manner, unlike serial computers, yet humans have the capability to do serial thinking and reasoning as well. This is a huge research problem to understand. Humans have capabilities that are far superior to those of AI, such as creativity, common sense, and language. I believe that a key reason for these capabilities is that human symbol processing happens on a sub-symbolic substrate. Increasing the degree of “thinking” that can happen in a sub-symbolic, parallel mess is a research area where I hope we will see a lot of progress in the coming years. While it allows us to better understand those magical aspects of human cognition, it also allows us to make better use of AI accelerator hardware than we do today.

In terms of predictions, which areas of AI research are you expecting to see large progress and exciting breakthroughs?

I predict progress in system 2 cognition in neural networks, which is the deliberate type of thinking involved in focus, deliberation, reasoning, or analysis. This goes along with the third computing paradigm idea that I just mentioned.

AI will also enter our homes, transforming them into smart homes with multi-modal interaction. There is a lot of work to get this productized, but I expect a lot of progress ranging from truly smart TVs to robots.

Thanks, Roland!

To learn more about our latest research, visit the Qualcomm AI Research page. And if you are interested in joining our team and making an impact at scale, please apply for one of our open machine learning positions.
