Google Announces LiteRT Qualcomm AI Engine Direct Accelerator

Google has announced a new LiteRT Qualcomm AI Engine Direct Accelerator, giving Android and embedded developers a much more direct path to Qualcomm NPUs for on-device AI and vision workloads.

Built on top of Qualcomm’s AI Engine Direct (“QNN”) SDK, the new accelerator replaces the older TensorFlow Lite QNN delegate and plugs directly into LiteRT, Google’s high-performance on-device ML runtime. The big win for engineers is that you no longer have to wrestle with low-level, vendor-specific SDKs or target each Snapdragon SoC separately: LiteRT now presents a unified API while the QNN backend talks to the underlying Qualcomm AI Stack.

For embedded AI and computer vision use cases, this matters because it makes the NPU a first-class, portable target. Google reports support for around 90 LiteRT ops, enough for full NPU delegation on most of the 72 benchmarked models across vision, audio, and NLP. On Snapdragon 8 Elite Gen 5, NPU inference can be up to 100× faster than CPU and about 10× faster than GPU, with more than 50 models running in under 5 ms. Latencies at that level enable real-time perception, tracking, and multimodal UX on device.
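
As a rough illustration of what a sub-5 ms inference means for a camera pipeline, the sketch below computes how much of each frame interval a single inference would consume. The 30 and 60 fps rates and the helper names are assumptions chosen for illustration; they are not figures from Google's announcement.

```python
# Back-of-the-envelope frame budget check (illustrative only).
# The 5 ms latency comes from the article; the frame rates are assumed.

def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds."""
    return 1000.0 / fps


def budget_fraction(inference_ms: float, fps: float) -> float:
    """Fraction of the per-frame budget consumed by one inference."""
    return inference_ms / frame_budget_ms(fps)


if __name__ == "__main__":
    for fps in (30, 60):
        share = budget_fraction(5.0, fps)
        print(f"{fps} fps: budget {frame_budget_ms(fps):.1f} ms, "
              f"5 ms inference uses {share:.0%}")
```

Under these assumptions, one 5 ms inference leaves most of each frame interval free for capture, pre- and post-processing, and rendering.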

The team also highlights a FastVLM-based scene-understanding demo: an int8/int16-quantized visual language model running fully on the Snapdragon NPU achieves ~0.12 s time-to-first-token and over 11k tokens/s prefill, enough for smooth, interactive “describe what the camera sees” experiences on a phone. For robotics, AR, or industrial vision developers, this is directly relevant to building low-latency perception stacks that share a single SoC with UI and control logic.
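
To put those two numbers side by side, here is a back-of-the-envelope calculation. It assumes, purely for illustration, that the whole time-to-first-token is spent on prefill at the quoted rate. The announcement does not break the latency down that way, so the result is an upper bound on prompt size under that assumption, not a measured figure.

```python
# Rough upper bound on prompt tokens processed before the first output token.
# Assumption (not from the source): time-to-first-token is entirely prefill.

def max_prefill_tokens(ttft_s: float, prefill_tokens_per_s: float) -> int:
    """Tokens that fit in the time-to-first-token at the given prefill rate."""
    return int(round(ttft_s * prefill_tokens_per_s))


if __name__ == "__main__":
    # 0.12 s and 11,000 tokens/s are the approximate figures quoted above.
    print(max_prefill_tokens(0.12, 11_000))
```

With the quoted figures this comes to roughly 1,320 tokens, which gives a sense of how much image and prompt context could be absorbed before the model starts responding.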

Developers can pull pre-optimized models from Qualcomm AI Hub and deploy them to Qualcomm NPUs via LiteRT’s ahead-of-time (AOT) compilation toolchain and Google Play’s on-device AI delivery. For more details, see Google’s LiteRT announcement and docs, and Qualcomm’s AI Engine Direct and AI Hub resources.
