The Evolution of Generative AI: From the Generic Cloud to the Personalized Intelligent Device

This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm.

An interview with Qualcomm’s head of AI on where generative AI is going

As part of Qualcomm Technologies’ executive outreach on the importance of on-device generative AI, the Senior Vice President of Product Management, Snapdragon Technologies and Roadmaps, Ziad Asghar, sat for an interview with TIRIAS Research Principal Analyst, Jim McGregor.

The focus of the discussion was the value that generative AI will bring to our lives and how it will evolve over time, moving toward on-device generative AI and personalized models that are contextually aware of their environment and who’s using them. What exactly will this mean to you in the future, and what will it mean to the ecosystem? Ziad shares his detailed responses. The following is a transcript of the interview.

Editor’s Note: The following interview was transcribed and edited for readability.

Jim McGregor (JM): Why is generative AI on device so critical for the industry?

Ziad Asghar (ZA): I’m sure you’re seeing all the excitement around generative AI. And that really comes from the fact that people can see the amazing productivity and entertainment use cases that you are able to bring up with generative AI.

But with on-device AI, you can take those use cases, bring them onto your smartphone, extended reality device, automobile or PC, and run them entirely, natively on the device.

That affords amazing benefits. You’re offloading all that work from the cloud, you have much more of a private experience, you can keep those queries on the device, and many other advantages as well. But the key point is that you get those experiences anywhere in the world. That’s how you get those generative AI experiences in the palm of people’s hands. And I think that will be very, very powerful.

There are use cases around both productivity and around pure entertainment. We are creating content, which was only in the realm of human behavior in the past. And now, as content creation has come into the machine world, there’s a lot of excitement. There’s a lot of benefit that we can derive from these capabilities.

JM: Have you done any studies or analysis on how much this might increase productivity? If I create an outline, an image or do other content creation, I can probably free up maybe 15–20% of my time. But I’m sure for other people, it might be even more.

ZA: Yes, that’s a great way to think about it. There is some apprehension about some of these techniques, but think about the benefit that they can bring. And the way I think about them on a daily basis is through this example.

Example: We have a video conference call on a topic. Generative AI could essentially transcribe that whole call and at the same time you can tell it to create three PowerPoint slides of what we discussed. It (generative AI) can generate that for you. And in the next meeting, when you want to present a summary, all that information is available to you.

So, the way I think about generative AI in the productivity context is that it can do a lot of the mundane or routine things that you have to do repeatedly. Generally, it can take routine work off your hands, freeing us up to do work that’s more novel and more innovative, the things that humans are amazing at doing. And that’s the true power of generative AI.

JM: When we think of generative AI today, we think of applications like ChatGPT. Isn’t that just the tip of the iceberg?

ZA: Indeed, we’re just getting started.

What we’ve shown so far, for example, runs entirely on the device, which means you could be flying on a plane and still be using generative AI in a use case like text-to-image. That’s no different from existing models like Stable Diffusion, DALL-E or Midjourney. What you do is enter a text query and the application creates an image for exactly what you entered.

But there are text-to-text models like Llama 2 coming out. We (Qualcomm) are working with Meta quite closely to be able to bring Llama 2 onto the device. But as you alluded to, this is just getting started.

What we see coming is multimodal AI. What that means is you have multiple modalities. So, you can have text-to-image, image-to-video and text-to-3D all in a single experience. And I think it really changes the way we interact with our devices.

At Qualcomm, we’re very excited about it, because it really changes each of our products. They get better and become more useful to the consumer. And at the same time, it adds a lot of value because we can do all of this on the device. So, this is definitely just the start of this revolution.

JM: I have to play devil’s advocate because this is Qualcomm. Qualcomm is the leader in wireless technology. We have 5G, we’re going towards 6G, not to mention Wi-Fi 7. So, what do I get from on-device that I’m not going to get from the cloud experience?

ZA: With 5G, we talk about the notion of hybrid AI. What “hybrid AI” means is that, for those capabilities that we can meet on the device, we will do that level of processing on the device. When you need to reach out to larger models and more assets, such as a very large 100 billion parameter model, you can use 5G to reach out and have greater collaboration between the cloud and the edge while still significantly offloading the cloud and maintaining an amazing user experience.
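
Editor’s Note: The split described above can be pictured as a simple routing decision. The Python sketch below is purely illustrative and is not Qualcomm’s implementation; the `Request` fields, the `route` function and the 13 billion parameter on-device limit are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass

# Assumption for illustration only: the largest model the device can host.
ON_DEVICE_PARAM_LIMIT = 13_000_000_000


@dataclass
class Request:
    model_params: int               # size of the model the task calls for
    contains_private_context: bool  # e.g. calendar or health data in the prompt
    network_available: bool         # is a 5G or Wi-Fi link currently usable?


def route(req: Request) -> str:
    """Decide where a generative AI request should run."""
    # Anything the device can handle stays on the device.
    if req.model_params <= ON_DEVICE_PARAM_LIMIT:
        return "device"
    # Private data or no connectivity: fall back to a smaller local model.
    if req.contains_private_context or not req.network_available:
        return "device-fallback"
    # Otherwise reach out to the larger model in the cloud.
    return "cloud"


small = route(Request(7_000_000_000, False, True))
large = route(Request(100_000_000_000, False, True))
large_private = route(Request(100_000_000_000, True, True))
large_offline = route(Request(100_000_000_000, False, False))
```

In this sketch only the large, non-private, connected request is sent to the cloud; everything else is handled locally.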

Generative AI in the cloud faces many challenges that can be overcome with on-device processing. As you may have seen on the news recently, companies are putting their proprietary code into a GPT tool to find bugs in code, for example. Now imagine if you could do that processing entirely on your PC.

Because you have the generative capability running locally on the device, it solves what is, in many cases, the biggest problem that people have with AI: privacy. At the same time, people are very excited about it because we have a lot more use cases and users coming to use these capabilities.

Finally, each one of these AI models is 10 to 200 times more complex than what was used two or three years ago. For a cloud vendor facing larger models, more use cases and more people, this is a multiplicative effect. We think the cloud cannot meet all these needs. That’s why generative AI, to be able to reach its full potential, needs to be on the device.

Additionally, your device knows at any given point in time if you’re sitting, walking or driving. It knows your preferences, your calendar and your age. And if we could basically augment some of those generative AI queries with that information, you get to a point where the generative AI experience is far better than what you can get on the cloud. I think that’s the Holy Grail.

That’s when the experience is vastly better on the device than what you can have in the cloud. I think that’s why we feel so excited about generative AI.

You don’t want that private contextual information going to the cloud for processing. You want that information to stay on your device, you want it to stay private. And I think that’s a huge advantage. Of course, there are also benefits of immediacy — you don’t have to wait for that information to come back to you from the cloud. I think these aspects will change the way people use generative AI.

JM: Personal awareness is an interesting point. Being able to customize the AI, especially when I think of future applications around video, like gaming or metaverses, creates the world around me according to who I am, not everyone else.

ZA: That’s a good example. We have techniques coming that allow you to fine-tune that AI model. So, what I envision is that as time goes by, the model that you have on the device will continue to become much more customized to each and every person’s needs. That model, in time, becomes far better than what you have in the cloud.

And that’s why on-device generative AI can do a far better job of creating the virtual world around you, of doing text-to-text, doing text-to-image and all those scenarios. There are techniques out there like low-rank adaptation and others that are coming that will allow us to be able to do that on the device with the capabilities that we have on our Snapdragon processors.
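
Editor’s Note: Low-rank adaptation (LoRA) keeps the original weight matrix W frozen and learns two much smaller matrices, B and A, so that the effective weight is W + (alpha / r) * B * A, where r is the rank. The Python sketch below illustrates that idea on toy matrices. It is not Qualcomm’s on-device method; the helper names, the toy values and the 4096 x 4096 layer size are assumptions made for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where A is r x d_in and B is d_out x r."""
    r = len(a)                      # rank of the adapter
    delta = matmul(b, a)            # d_out x d_in low-rank update
    s = alpha / r
    return [[w_ij + s * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def lora_trainable_params(d_out, d_in, r):
    """Number of values in A and B together."""
    return r * (d_in + d_out)


# Toy example: a frozen 2 x 2 weight and a rank-1 adapter.
w = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 2.0]]        # 1 x 2
b = [[0.5], [0.0]]      # 2 x 1
effective = lora_effective_weight(w, a, b, alpha=1.0)

# Hypothetical 4096 x 4096 layer with rank 8.
full_params = 4096 * 4096
adapter_params = lora_trainable_params(4096, 4096, 8)
```

For the hypothetical 4096 x 4096 layer, the rank-8 adapter holds 65,536 values against 16,777,216 in the full matrix, which is why personalizing a model this way is plausible on a device.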

JM: It seems like there’s as much effort being put into shrinking models as there is into expanding models. Qualcomm showed the Stable Diffusion demo, and I have to admit that it was addictive. Playing with that for a half hour at Mobile World Congress was just incredible. But Qualcomm did a lot of quantization to enable it. So, what is Qualcomm doing to further help developers in not only building those models, but shrinking those models?

ZA: That’s the key point on how to make it run on the device. We’ve been talking about and focusing on doing a lot of this AI processing in the integer domain. A simple way to think about it, if we abstract away some of the details, is this: when the model is trained in the cloud, it’s trained in 16-bit or 32-bit floating point.

That results in lots of bits, which means every time you infer the model, you’re doing a lot of calculations. Alternatively, we have been focused on compressing these models by quantizing and pruning them, making them much smaller while using, for example, only four bits.

So, every inference run is just doing 4-bit processing rather than 16- or 32-bit. That translates into amazing capabilities in terms of power saving and how much concurrent AI processing can be done on the device. I feel the growth of large language models (LLMs) over the past few months has validated our strategy of essentially making the models smaller through quantizing them.
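
Editor’s Note: To make the arithmetic concrete, here is a minimal Python sketch of symmetric 4-bit quantization and its effect on model size. It is an illustration only, not Qualcomm’s toolchain; the function names, the sample weights and the single per-tensor scale factor are assumptions chosen for simplicity, and pruning is not shown.

```python
def quantize_int4(weights):
    """Map floats to signed 4-bit integers in [-8, 7] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the 4-bit integers."""
    return [v * scale for v in q]


def model_size_gb(num_params, bits_per_weight):
    """Storage needed for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


weights = [0.70, -0.33, 0.12, -0.02, 0.0]
q, scale = quantize_int4(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# A 7 billion parameter model at 16 bits versus 4 bits per weight.
size_fp16 = model_size_gb(7_000_000_000, 16)
size_int4 = model_size_gb(7_000_000_000, 4)
```

Under these assumptions, the weights of a 7 billion parameter model shrink from about 14 GB at 16 bits to about 3.5 GB at 4 bits, before counting overhead such as the scale factors, and each restored weight is off by at most half a quantization step.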

I’ll add another aspect. I think what is happening is the models are getting bigger as they become more capable. For example, GPT-3 was a 175 billion parameter model. But Llama 2 from Meta has seven and 13 billion parameter models and, in some respects, comes close to the capabilities of a model like GPT-3. So, the way I think about it is that initially, when a new capability is developed, the model is fairly large.

As people improve the algorithms over time, the model starts to become smaller. And another approach that we’ll see people take is what I call domain-specific models. With those domain-specific models, you’ll be able to train a model for that given application.

For example, in Internet of Things (IoT), let’s say you’re training something for medical triage or for the hospitality industry. You will train it only on the data that’s required for that application, and then the model can be significantly smaller and with very good accuracy. So, I think some of those techniques will make the experience amazing on the device.

JM: I have to think that this is going to change everything about how we work, how we live, how we play — even how we learn. So, colleges and universities are very interested in this. Is there an outreach from Qualcomm and the industry that’s going to help them migrate towards this?

ZA: I think there is significant societal benefit for a lot of this stuff that we’re talking about. So, in the example around education, today you have a class full of students and the teacher is basically delivering information to them at the same pace.

Within that classroom, there are some that are able to grasp that material faster than others. But with generative AI you can envision the ability to give the “ready to move on” feedback to a generative AI model that’s running on the device in front of each and every student. As a result, you can change the pace of delivery of that material.

What that means is you now have a customized academic plan for each student in that classroom.

That is where you get an amazing amount of benefit. We are working with the ecosystem to enable a lot of those capabilities.

There is a similar analogy to this in the healthcare segment as well. Today in elderly care, the only time when the remedy or the therapy changes is when the patient actually goes to the doctor’s office.

What if you had the ability to get all the information from the patient on a regular basis and be able to tweak the therapy, within certain limits and with the presence of healthcare providers who are looking at it, without them having to go to the doctor? You can change the experiences in a very big way.

So, I think there are many more such societal benefits. But to make that happen, you have to bring that capability onto the device. Why? Because you don’t want that healthcare information going out to the cloud; you’ve got to have it on the device.

I think those aspects will make generative AI an extremely powerful resource. The utility of this is going to be massive.

JM: I don’t think most people even realize that in healthcare, most general physicians are consulting databases of information anyway. So, it’s not like they’re making a decision based on your care, your injury or your illness — just out of the box or out of their brains. They are using database resources today.

ZA: Now, if you could have a GPT that’s trained on that material, specifically, rather than hundreds of other data concepts as well, it could actually do a very fine job of addressing those questions.

JM: And to your point, when we can take it on device, we finally have a Star Trek tricorder.

ZA: Exactly, and to take a similar example, we have “Knight Rider” cars as well. Those of us who grew up watching “Knight Rider” (an ’80s TV show) now have that experience within our automobile. You can talk to your car.

Hopefully, you won’t have that eject button but other than that, you can have exactly the same experience where it’s much more of a conversational experience with your car, rather than this very limited vocabulary that we have today in the navigation experience. Generative AI changes the experience entirely.

JM: From your standpoint, what do you think are going to be some of the most interesting applications or devices, or whatever else, that this is going to generate?

ZA: I think each and every device. Just take the example of your smartphone.

Today, if you want to reserve a restaurant, you first have to go to Yelp or some other website, figure out what restaurant it is, call up the restaurant, reserve it, then find the path to get there. Now, what if I had a virtual assistant that I can just talk to on the phone that goes behind the scenes to do exactly those three or four different applications, and is able to do the whole of the task for you?

It will change the way we interact with our devices. It changes that human-machine interface.

JM: I used to tell people that mobile apps eliminated some major steps in having to log on to a browser, go to a website and do something. The app is already on your device and you bring it up. Generative AI just takes that one step further to where you don’t even have to type it in; you can use the user interfaces, whether it’s through gesture, voice, text or it knows you.

ZA: Exactly, you can say, “I want food,” and it’s going to come up with the information you want. And in time, it might know that you like to eat Italian food without prompting.

My point is, it can actually learn and adapt at an amazing pace, and it could make that experience seamless.

For example, if you do home automation today, you have to go to one application to control your lights, another for your garage and yet another for something else. Instead, you could tell this virtual assistant on your device or your home hub to “turn everything off.” It takes care of all those things for you. I think that’s where we need to get to. Apps were the first user interface transition. And I think this virtual assistant can be the next major transition to change the way we interact with these devices.

JM: And I think another good point is that generative AI doesn’t necessarily take away jobs. This creates new opportunities, just like the PC did, just like the internet did.

ZA: Absolutely. With every tech transition, people have that concern (of losing jobs). But every time, technology allows us to be able to do more. And I think AI is definitely going to do that.

I usually refer to this as a force multiplier. What this means is that you can do the same work in less time or more work in the same time with generative AI. Perhaps it used to take you three hours to create slides, but generative AI can create those slides for you. It can draft your email. It can find the critical points in a PDF for you. It really saves the time you spend on busy work and allows you to do what humans do well: an amazing job of discovering new things.

Pat Lawlor
Director, Technical Marketing, Qualcomm Technologies, Inc.

Jerry Chang
Senior Manager, Marketing, Qualcomm Technologies
