Smarter, Faster, More Personal AI Delivered on Consumer Devices with Arm’s New Lumex CSS Platform, Driving Double-digit Performance Gains

News Highlights:

  • Arm Lumex CSS platform unlocks real-time on-device AI use cases like assistants, voice translation and personalization, with new SME2-enabled Arm CPUs delivering up to 5x faster AI performance

  • Developers can access SME2 performance with KleidiAI, now integrated into all major mobile OSes and AI frameworks, including PyTorch ExecuTorch, Google LiteRT, Alibaba MNN and Microsoft ONNX Runtime

  • For flagship devices, Arm Lumex CSS platform achieves an unprecedented six years of double-digit IPC performance gains

  • New Mali G1-Ultra redefines mobile entertainment and is built for gamers, with 2x ray tracing uplift

AI is no longer a feature; it’s the foundation of next-generation mobile and consumer technology. Users now expect real-time assistance, seamless communication and personalized content that is instant, private, and available on device, without compromise. Meeting these expectations requires more than incremental upgrades; it demands a step change that brings performance, privacy and efficiency together in a scalable way.

Introducing Arm Lumex

That’s why we’re introducing Arm Lumex, our most advanced compute subsystem (CSS) platform, purpose-built to accelerate AI experiences on flagship smartphones and next-gen PCs.

Lumex unites our highest-performing CPUs with Scalable Matrix Extension version 2 (SME2), GPUs and system IP, enabling the ecosystem to bring AI devices to market faster and deliver experiences ranging from desktop-class mobile gaming to real-time translation, smarter assistants, and personalized applications.

We are enabling SME2 across every CPU platform and by 2030, SME and SME2 will add over 10 billion TOPS of compute across more than 3 billion devices, delivering an exponential leap in on-device AI capability.
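As a rough sanity check on the projection above, the aggregate figures imply a modest average per-device capability (a back-of-envelope sketch, assuming the compute is spread evenly across the installed base, which the announcement does not claim):

```python
# Back-of-envelope check on the 2030 projection quoted above.
total_tops = 10e9   # "over 10 billion TOPS" of aggregate SME/SME2 compute
devices = 3e9       # "more than 3 billion devices"

avg_tops_per_device = total_tops / devices
print(round(avg_tops_per_device, 2))  # ~3.33 TOPS per device on average
```

In practice the distribution would be highly uneven, with flagship devices contributing far more than wearables.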

Partners can choose exactly how they build Lumex into their SoC – they can take the platform as delivered and leverage cutting-edge physical implementations tailored to their needs, reaping time-to-market and time-to-performance benefits. Alternatively, partners can configure the platform RTL for their targeted tiers and harden the cores themselves.

Lumex and our simplified naming conventions across the Arm portfolio were announced earlier this year.

The platform combines:

  • Next-generation SME2-enabled Armv9.3 CPU cluster including C1-Ultra and C1-Pro, powering flagship devices
  • New C1-Premium, purpose-built for the sub-flagship market, providing best-in-class area efficiency
  • New Mali G1-Ultra GPU with next-generation ray tracing enabling advanced graphics and gaming, plus a boost to AI performance
  • The most flexible and power-aware DynamIQ Shared Unit (DSU) Arm has delivered to date: C1-DSU
  • Optimized physical implementations for 3nm nodes
  • Deep integration across the software stack delivering seamless AI acceleration for developers using KleidiAI libraries

Accelerated AI Everywhere with SME2-Enabled CPUs

The SME2-enabled Arm C1 CPU cluster provides dramatic AI performance gains for real-world, AI-driven tasks:

  • Up to 5x uplift in AI performance
  • 4.7x lower latency for speech-based workloads
  • 2.8x faster audio generation

This leap in CPU AI compute enables real-time, on-device AI inference capabilities, providing users with smoother, faster experiences across interactions like audio generation, computer vision, and contextual assistants.

So what does this mean in real-world use cases? SME2 delivers a whole new level of responsiveness and efficiency. For example, our Smart Yoga Tutor demo app saw a 2.4x boost in text-to-speech generation, meaning users get instant feedback on their poses, all without draining battery life. Together with Alipay and vivo, we achieved a 40% reduction in LLM response time for user interactions, proving SME2 is delivering faster real-time generative AI on-device.
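For context, the quoted 40% latency reduction converts into a speedup factor as follows (simple arithmetic on the figure above, not an additional measurement):

```python
# Converting the quoted 40% LLM response-time reduction into a speedup.
reduction = 0.40                 # 40% lower response time
speedup = 1 / (1 - reduction)    # new_time = old_time * (1 - reduction)
print(round(speedup, 2))         # ~1.67x faster end-to-end
```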

SME2 isn’t just about speed; it’s also unlocking AI-powered capabilities that traditional CPUs can’t match. For example, neural camera denoising now runs at over 120 fps at 1080p, or 30 fps at 4K, all on a single core. That enables smartphone users to capture sharper, crystal-clear images even in the darkest scenes, allowing for smoother interactions and richer experiences on everyday devices.
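The two denoising figures are mutually consistent: assuming 1080p means 1920×1080 and 4K means 3840×2160, a 4K frame carries four times the pixels of a 1080p frame, so 120 fps at 1080p and 30 fps at 4K describe the same per-core pixel throughput:

```python
# Pixel-throughput check for the single-core denoising figures above.
pixels_1080p = 1920 * 1080   # 2,073,600 pixels per frame
pixels_4k = 3840 * 2160      # 8,294,400 pixels per frame (4x 1080p)

rate_1080p = pixels_1080p * 120   # pixels/second at 120 fps
rate_4k = pixels_4k * 30          # pixels/second at 30 fps

print(rate_1080p == rate_4k)      # True: both ~248.8 Mpixel/s
```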

Unlike cloud-first AI, which is constrained by latency, cost, and privacy concerns, Lumex brings intelligence directly to the device where it’s faster, safer, and always available. SME2 is being embraced by leading ecosystem players including Alibaba, Alipay, Samsung LSI, Tencent and vivo.

Architectural Freedom for Every Product Tier

Lumex offers partners the freedom to balance peak performance, sustained efficiency, and silicon area in products ranging from high-end smartphones and PCs to emerging AI-first form factors:

  • C1-Ultra – Flagship peak performance: +25% single-thread performance with a double-digit year-on-year IPC gain. Ideal for large-model inference, computational photography, content creation and generative AI
  • C1-Premium – C1-Ultra-class performance with greater area efficiency: 35% smaller area than C1-Ultra. Ideal for sub-flagship mobile segments, voice assistants and multitasking
  • C1-Pro – Sustained efficiency: +16% sustained performance. Ideal for video playback and streaming inference
  • C1-Nano – Extremely power-efficient: +26% efficiency in less area. Ideal for wearables and the smallest form factors

Enabling Desktop-Class Gaming and Faster AI Inference on Mali GPU

With over 12 billion Arm GPUs shipped to date, Arm is at the center of mobile gaming experiences. The new Arm Mali G1-Ultra GPU continues to push the boundaries of mobile gaming, delivering high-fidelity, console-class graphics. This is made possible by a brand-new Ray Tracing Unit v2 (RTUv2), powering advanced lighting, shadows and reflections, leading to a 2x uplift in ray tracing performance compared to its predecessor. For AI workloads, the G1-Ultra enables up to 20% faster inference performance, enhancing responsiveness across real-time applications.

The Mali G1-Ultra delivers 20% better performance across graphics benchmarks compared to the previous generation, with across-the-board improvements for leading titles, including Arena Breakout, Fortnite, Genshin Impact, and Honkai Star Rail. The G1-Premium and G1-Pro GPUs deliver superior performance and power-efficiency for constrained devices.

Finally, Developer-Friendly AI for Mobile

For developers, AI experiences just work on the Lumex platform. Through the KleidiAI integration across major frameworks, including PyTorch ExecuTorch, Google LiteRT, Alibaba MNN and Microsoft ONNX Runtime, apps automatically benefit from SME2 acceleration with no code changes required.

For developers building cross-platform apps, Lumex brings new portability:

  • Google apps like Gmail, YouTube and Google Photos are already SME2-ready, ensuring seamless integration as Lumex-based devices hit the market
  • Cross-platform portability means optimizations built for Android can seamlessly extend to Windows on Arm and other platforms
  • Partners like Alipay are already showcasing on-device LLMs running efficiently with SME2

Technology leaders – including Apple, Samsung, and MediaTek – are integrating AI acceleration capabilities for faster, more efficient on-device AI. Apple is powering Apple Intelligence; Samsung and MediaTek are improving responsiveness and efficiency of real-time AI applications such as translation, summarization, and personal assistants using Google Gemini.

Arm Lumex: Platform-Level Intelligence for the AI Era

Arm Lumex is more than our most advanced CSS platform for the consumer computing market; it’s the foundation for the next era of intelligent, AI-enabled experiences. Whether you’re an OEM or a developer, Lumex gives you the tools to deliver personal, private and high-performance AI at the edge, where it matters most. Built for the AI era, Lumex is where the future of mobile innovation begins.

Chris Bergey
SVP and GM of the Client Line of Business, Arm

Supporting Quotes:

“Through deep integration with SME2, MNN enables low-latency, quantized inference for billion-parameter models like Qwen on smartphones — showcasing Arm and Alibaba’s joint innovation in scalable, next-gen mobile AI.”
Xiaotang Jiang, Head of MNN, Taobao and Tmall Group, Alibaba

“The validation of LLM inference using SME2 has been completed on vivo’s next generation flagship smartphone through the close collaboration of Arm, Alipay and vivo. We observe that prefill and decode performance can be improved by over 40% and 25% respectively. These results demonstrate significant progress in CPU backend, and we are highly encouraged by the outcomes achieved so far.”
Xindan Weng, Head of Client Engineering, Alipay

“SME2-enhanced hardware enables more advanced AI models, like Gemma 3, to run directly on a wide range of devices. As SME2 continues to scale, it will enable mobile developers to seamlessly deploy the next generation of AI features across ecosystems. This will ultimately benefit end-users with low-latency experiences that are widely available on their smartphones.”
Iliyan Malchev, Distinguished Software Engineer, Android at Google

“At Honor, our mission is to bring premium experiences to more users, especially through our upper mid-range smartphones. By leveraging the Arm Lumex CSS platform, we’re able to deliver smooth performance, intelligent AI features, and outstanding power efficiency that elevate everyday mobile experiences.”
Honor

“AI is changing how we interact with our devices and the world around us, and the Arm ecosystem is driving important developments in this space. At Meta, we’re excited about the integration of Arm Kleidi and PyTorch’s ExecuTorch, allowing our applications to seamlessly run on next-generation technology that accelerates end-user experiences.”
Sy Choudhury, Director, AI Partnerships, Meta

“At Samsung, we’re excited to continue our collaboration with Arm by leveraging Arm’s compute subsystem platform to develop the next generation of flagship mobile products. This partnership enables us to push the boundaries of on-device AI, delivering smarter, faster, and more efficient experiences for our users.”
Nak Hee Seong, Vice President and Head of SOC IP Development Team at Samsung Electronics

“SME2 accelerates on-device large language models, like Tencent’s Hunyuan, by addressing key performance bottlenecks and enabling efficient LLM deployment on mobile for enhanced user experiences.”
Felix Yang, Distinguished Expert, Machine Learning Platform, Tencent
