5 Benefits of On-device Generative AI

This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm.

On-device AI processing offers must-have benefits for privacy, performance, personalization, cost, and energy

In the mid-’90s, the World Wide Web ushered in the era of massive remote data center computing now known as the cloud. And this shift paved the way to advancements in scientific modeling, design and simulation, research, and the world’s recent obsession with generative artificial intelligence (AI).

As discussed in our previous OnQ post, Hybrid AI trends by the numbers: Costs, resources, parameters and more, these advancements are accompanied by rising data center capital and operating costs. These prohibitive costs are increasingly creating a need, and an opportunity, to offload some workloads to edge devices such as tablets, smartphones, personal computers (PCs), vehicles, and extended reality (XR) headsets. But the benefits of migrating workloads to these devices extend well beyond the cost savings to data centers.

On-device AI is not new for us. For more than a decade, Qualcomm Technologies has been researching and working with customers, including original equipment manufacturers and application developers, to enhance the user experience through AI. Today it’s commonly used in radio frequency signal processing, battery management, audio processing, computational photography, video enhancement, and a variety of other on-device applications.

Extending on-device AI support to generative AI through optimized and/or specialized neural network models can further enhance the user experience through increased privacy and security, performance, and personalization while lowering the required costs and energy consumption.


On-device AI has several key benefits.

1. AI privacy and security

The transfer, storage, and use of data on multiple platforms and cloud services increases the potential for data tracking, data manipulation, and data theft.

On-device AI inherently helps protect users’ privacy since queries and personal information remain solely on the device. This is important for consumer data, as well as providing an additional level of protection for medical, enterprise, government, and other sensitive applications.

For example, a programming assistant app for generating code could run on the device without exposing confidential information to the cloud.


On-device AI provides low latency, high performance, and reliability for edge devices.

2. AI performance

AI performance can be measured in many ways, including processing performance and application latency. On-device processing performance of mobile devices has increased by double-digit percentages with each technology generation and is projected to continue this trend, allowing ever-larger generative AI models to run on device over time, especially as those models become more optimized.

For generative AI, application latency is also critical. While consumers may tolerate a wait when generating a report, a chatbot must respond in near real time to deliver a positive user experience. Processing generative AI models on device avoids the latency introduced by congested networks or busy cloud servers, and it improves reliability because a query can be executed anywhere, at any time.
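The latency argument above can be made concrete with simple arithmetic: a cloud response pays for the network round trip and server queueing on top of inference, while an on-device response pays for inference alone. The sketch below uses purely hypothetical timing figures chosen for illustration; they are not measurements of any particular network, server, or device.

```python
# Illustrative latency comparison for a single generative AI query.
# All timing figures below are hypothetical assumptions, not measurements.

def cloud_latency_ms(network_rtt_ms: float, queue_ms: float,
                     inference_ms: float) -> float:
    """Total response time when the query is sent to a cloud server:
    network round trip + time queued behind other requests + inference."""
    return network_rtt_ms + queue_ms + inference_ms

def on_device_latency_ms(inference_ms: float) -> float:
    """Total response time when the model runs locally: inference only."""
    return inference_ms

# Hypothetical figures: 80 ms round trip, 150 ms queueing on a congested
# server, 200 ms cloud inference vs. 400 ms for a smaller on-device model.
cloud = cloud_latency_ms(network_rtt_ms=80, queue_ms=150, inference_ms=200)
local = on_device_latency_ms(inference_ms=400)
print(f"cloud: {cloud} ms, on-device: {local} ms")
```

Even when the on-device model is slower at raw inference, removing the network and queueing terms can make the end-to-end response comparable or faster, and the on-device path has no dependence on network conditions at all.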


With sensor and contextual data, on-device AI enables personalized experiences.

3. AI and personalization

Along with increased privacy, a strong benefit to consumers of on-device generative AI will be enhanced personalization. On-device generative AI will enable the customization of models and responses to the user’s unique speech patterns, expressions, reactions, usage patterns, environment, and even external data, such as from a fitness tracker or medical device, for full contextual awareness. This capability allows generative AI to essentially create a unique digital persona or personas for each user over time. The same can be done for a group, organization, or enterprise to create common and cohesive responses.

Smartphones are a user’s most personal device, and generative AI will make the entire user experience all-the-more personal.


On-device AI can offload computing from the cloud, saving cost and enabling scale.

4. Cost of AI

As cloud providers struggle with the equipment and operating costs of running generative AI models, they are beginning to charge consumers for services that were initially free. These fees are likely to keep rising until costs come down or alternative business models emerge to offset them. Running generative AI on device not only reduces the cost to consumers; it also reduces costs for cloud service providers and network service providers, while freeing valuable resources for other high-value, high-priority tasks.


Efficient on-device AI processing can save energy and offload energy demands from the cloud.

5. AI and energy

The cost of running generative AI models on device versus in the cloud translates directly to the power required to run them. Inference for large generative AI models may require several AI accelerators, such as graphics processing units (GPUs) or tensor processing units (TPUs), and possibly several servers. According to TIRIAS Research Principal Analyst Jim McGregor, the idle power consumption of a single fully populated AI-accelerated server can approach one kilowatt, while its peak power consumption can reach several kilowatts. That figure multiplies by the number of servers required to run a generative AI model and by the number of times a model is run, which, as stated previously, is increasing exponentially. Add to this the power required to transfer data over complex networks to and from the cloud, and total power consumption is also on an exponential growth trend.
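The scaling argument above is simple multiplication: per-server power times server count times hours of operation. The sketch below works through one example using the rough figures cited in the text (about 1 kW idle, several kW peak for a fully populated AI-accelerated server); the fleet size and average utilization are hypothetical assumptions chosen only to illustrate the arithmetic.

```python
# Back-of-the-envelope data center energy estimate. The per-server power
# range (~1 kW idle to several kW peak) comes from the figures cited in
# the text; the fleet size and average draw below are hypothetical.

def fleet_energy_kwh(servers: int, avg_power_kw: float, hours: float) -> float:
    """Energy (kWh) consumed by `servers` machines, each drawing
    `avg_power_kw` on average, over `hours` hours of operation."""
    return servers * avg_power_kw * hours

# Hypothetical deployment: 100 AI-accelerated servers averaging 2 kW
# each (between the ~1 kW idle and multi-kW peak figures), over one day.
daily_kwh = fleet_energy_kwh(servers=100, avg_power_kw=2.0, hours=24)
print(f"daily energy: {daily_kwh} kWh")
```

Because the server count scales with how often models are run, the total grows with query volume even before counting the energy spent moving data across the network, which is the core of the exponential-growth concern the text describes.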

Edge devices with efficient AI processing offer leading performance per watt, especially compared with the cloud, and can run generative AI models at a fraction of the energy, particularly when data transport is counted alongside processing. The difference matters both for energy costs and for helping cloud providers offload data center energy consumption to meet their environmental and sustainability goals.

Pushing the boundaries of technology

The evolution of mobile technology pushed the boundaries of efficient processing for applications, images, videos, and sensors, and enabled the use of multiple user interfaces. Generative AI will further push the boundaries of on-device processing and will continue to enhance the personal computing experience. Qualcomm Technologies is working to enhance the performance of future smartphone, PC, vehicle and internet of things platforms while working with partners to bring generative AI on device through an open ecosystem. Look for more details in our future AI on the Edge OnQ posts.

Pat Lawlor
Director, Technical Marketing, Qualcomm Technologies

Jerry Chang
Senior Manager, Marketing, Qualcomm Technologies
