Qualcomm Unveils AI200 and AI250: Redefining Rack-scale Data Center Inference Performance for the AI Era

Highlights:

  • Qualcomm AI200 and AI250 solutions deliver rack-scale performance and superior memory capacity for fast data center generative AI inference at industry-leading total cost of ownership (TCO). Qualcomm AI250 introduces an innovative memory architecture, offering a generational leap in effective memory bandwidth and efficiency for AI workloads.

  • Both solutions feature a rich software stack and seamless compatibility with leading AI frameworks, empowering enterprises and developers to deploy secure, scalable generative AI across data centers.

  • Products are part of a multi-generation data center AI inference roadmap with an annual cadence.

Oct 27, 2025 – SAN DIEGO – Qualcomm Technologies, Inc. today announced the launch of its next-generation AI inference-optimized solutions for data centers: the Qualcomm® AI200 and AI250 chip-based accelerator cards and racks. Building on the Company’s NPU technology leadership, these solutions offer rack-scale performance and superior memory capacity for fast generative AI inference at high performance per dollar per watt—marking a major leap forward in enabling scalable, efficient, and flexible generative AI across industries.

Qualcomm AI200 introduces a purpose-built rack-level AI inference solution designed to deliver low total cost of ownership (TCO) and optimized performance for large language model (LLM) and large multimodal model (LMM) inference and other AI workloads. It supports 768 GB of LPDDR per card for higher memory capacity and lower cost, enabling exceptional scale and flexibility for AI inference.

The Qualcomm AI250 solution will debut with an innovative memory architecture based on near-memory computing, providing a generational leap in efficiency and performance for AI inference workloads by delivering greater than 10x higher effective memory bandwidth and much lower power consumption. This enables disaggregated AI inferencing for efficient utilization of hardware while meeting customer performance and cost requirements.

Both rack solutions feature direct liquid cooling for thermal efficiency, PCIe for scale up, Ethernet for scale out, confidential computing for secure AI workloads, and a rack-level power consumption of 160 kW.

“With Qualcomm AI200 and AI250, we’re redefining what’s possible for rack-scale AI inference. These innovative new AI infrastructure solutions empower customers to deploy generative AI at unprecedented TCO, while maintaining the flexibility and security modern data centers demand,” said Durga Malladi, SVP & GM, Technology Planning, Edge Solutions & Data Center, Qualcomm Technologies, Inc. “Our rich software stack and open ecosystem support make it easier than ever for developers and enterprises to integrate, manage, and scale already trained AI models on our optimized AI inference solutions. With seamless compatibility for leading AI frameworks and one-click model deployment, Qualcomm AI200 and AI250 are designed for frictionless adoption and rapid innovation.”

Our hyperscaler-grade AI software stack, which spans end-to-end from the application layer to the system software layer, is optimized for AI inference. The stack supports leading machine learning (ML) frameworks, inference engines, generative AI frameworks, and LLM/LMM inference optimization techniques such as disaggregated serving. Developers benefit from seamless model onboarding and one-click deployment of Hugging Face models via Qualcomm Technologies’ Efficient Transformers Library and Qualcomm AI Inference Suite. Our software provides ready-to-use AI applications and agents, comprehensive tools, libraries, APIs, and services for operationalizing AI.

Qualcomm AI200 and AI250 are expected to be commercially available in 2026 and 2027, respectively. Qualcomm Technologies is committed to a data center roadmap with an annual cadence moving forward, focused on industry-leading AI inference performance, energy efficiency, and TCO. For more information, visit our website.

About Qualcomm

Qualcomm relentlessly innovates to deliver intelligent computing everywhere, helping the world tackle some of its most important challenges. Building on our 40 years of technology leadership in creating era-defining breakthroughs, we deliver a broad portfolio of solutions built with our leading-edge AI, high-performance, low-power computing, and unrivaled connectivity. Our Snapdragon® platforms power extraordinary consumer experiences, and our Qualcomm Dragonwing™ products empower businesses and industries to scale to new heights. Together with our ecosystem partners, we enable next-generation digital transformation to enrich lives, improve businesses, and advance societies. At Qualcomm, we are engineering human progress.
