This market research report was originally published at Tractica's website. It is reprinted here with the permission of Tractica.
A new proposition is emerging as 5G networks roll out and artificial intelligence (AI) use cases proliferate across devices and the cloud. Both megatrends – 5G and AI – may converge at the start of the 2020s, creating new opportunities and business models. In the past, I have been skeptical about the role 5G will play in AI, but there may be a different way to look at the issue.
AI Shifting from Cloud to Edge
First, a bit on how AI works today and how it is shifting. Current AI models are largely processed on the cloud, in terms of both training and inference. The AI we are referring to here is the deep learning flavor that is largely vision- and language-focused – or what we call perception machines.
These deep learning models are getting larger and much more complex, which lends itself well to the large compute resources available in the cloud, mostly powered by graphics processing units (GPUs) today. The deep learning flavor of AI is gaining the most traction in the consumer internet services market. This is where the large hyperscaler companies, including the FAANGs (Facebook, Amazon, Apple, Netflix, Google) and the BATs (Baidu, Alibaba, Tencent), have built AI-first use cases and business models.
We are still in the first innings of how AI use cases will develop. For the most part, the use cases are centered on consumer internet services, with others emerging in areas as diverse as retail, healthcare, manufacturing, advertising, and business services. Tractica has built a taxonomy of more than 300 use cases across 30 industry sectors, sizing the AI opportunity through 2025. Thus, we have a sense of where the market is today and how it could develop in the future from a use case and industry perspective.
A shift is beginning in the consumer devices market, and in some other markets like security cameras and automotive, where hardware and software advances allow AI model inference to run on the device itself. We explored this trend in greater detail in the Artificial Intelligence for Edge Devices report, which covers the shift of AI processing to the device in eight categories: robots, drones, augmented and virtual reality (AR/VR), smart speakers, PCs/tablets, mobile, security cameras, and automotive.
Tractica’s report estimates that by 2025, five of the eight device categories will have a 100% attach rate of AI processors running AI models on the device. It also compares AI edge silicon and AI cloud silicon and estimates that the AI edge silicon market will be 3x-4x that of AI cloud silicon by 2025. Therefore, the emerging narrative for AI processing is that AI at the edge will overshadow AI processing in the cloud over the next 6-7 years.
Device Edge to Network Edge
AI edge mainly refers to the device itself and does not include the network edge, or the fog, as it is termed in Internet of Things (IoT) circles. Essentially, the network edge is a base station, a network router, or a physical location such as a digital subscriber line (DSL) or cable box placed at the curbside, somewhere near the end device or end user.
With 5G coming into the picture with its millisecond latencies and gigabit bandwidth, the cloud is shifting closer to the device because the device has almost instantaneous connectivity to cloud resources. But that instantaneous connectivity is mostly defined from the 5G base station to the device, especially when referring to millisecond latencies and gigabit bandwidth. Yes, end-to-end latency is much improved, but the biggest boost to connectivity is in the last mile of the network – essentially the air interface, where 5G brings a lot of its smarts.
Therefore, in a 5G world, we can imagine some of the AI processing being shifted or offloaded from the cloud and the device to the edge of the network. This would reduce hardware compute and power requirements, especially on the device, where these can make or break business models.
For example, autonomous cars are being built with massive hardware compute like NVIDIA’s Drive PX Pegasus, which is targeted at Level 5 automation and provides 320 TOPS of compute at 500 W of power. This is assuming that all the AI inference will be done on the car itself (i.e., at the edge). These compute capabilities add a major cost to the car itself, not to mention the cooling systems needed to manage 500 W of heat dissipation.
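To put those figures in perspective, the short calculation below works out the compute efficiency implied by the 320 TOPS and 500 W quoted above. The `onboard_watts` helper is purely illustrative: it assumes, as a simplification not made in the report, that on-car power scales linearly with the share of inference kept on the car.

```python
# Figures quoted in the text for NVIDIA's Drive PX Pegasus.
PEGASUS_TOPS = 320   # trillions of operations per second
PEGASUS_WATTS = 500  # power draw in watts


def tops_per_watt(tops: float, watts: float) -> float:
    """Compute efficiency: operations delivered per watt consumed."""
    return tops / watts


def onboard_watts(offload_fraction: float, full_watts: float = PEGASUS_WATTS) -> float:
    """Hypothetical on-car power if a fraction of inference is offloaded.

    Assumption (not from the report): power scales linearly with the
    share of compute that stays on the car.
    """
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    return full_watts * (1.0 - offload_fraction)


efficiency = tops_per_watt(PEGASUS_TOPS, PEGASUS_WATTS)
print(f"{efficiency:.2f} TOPS per watt")  # prints "0.64 TOPS per watt"
print(f"{onboard_watts(0.5):.0f} W on the car with half the work offloaded")
```

Under that linear assumption, offloading half of the inference workload would leave roughly 250 W to power and cool on the vehicle; real hardware would not scale this cleanly.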
The Drive PX Pegasus costs in the range of $15,000, which is a significant chunk of a car’s bill of materials. In fact, the high cost of NVIDIA’s Drive system led to Tesla developing its own chips and dumping NVIDIA. If there were a way to reduce the cost of the AI hardware, it would have a significant impact on the Level 5 robo-taxi business case.
Edge AI Data Centers Create New 5G Business Models
With a 5G network in place, in theory we could offload part of the AI processing to the network edge. Doing so would essentially create an AI edge data center, thereby reducing the cost of the AI hardware on the car itself. These mini data centers could be spread out across a geofenced square mile of the city that offers Level 5 robo-taxi services using the existing 5G rooftops or even lamp posts as offload points.
The offload to the network edge is only possible with 5G networks and their millisecond latency and gigabit bandwidth, and that is why 5G could be the turning point where AI processing shifts from the device edge to the network edge. 5G would enable an additional layer of intelligence at the network edge, which devices could rely on for processing AI, mostly inference but also training in the future. This is where a truly decentralized AI processing network emerges, and 5G is the enabler for this future.
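One way to picture the offload decision is as a latency budget: a device sends work to the network edge only if the round trip plus remote compute still meets the application's deadline. The sketch below is a minimal illustration of that idea. The `InferencePath` class, the `choose_path` function, and every number in it are hypothetical placeholders, not measurements or figures from Tractica.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InferencePath:
    """One place where inference could run (all values hypothetical)."""
    name: str
    network_round_trip_ms: float  # time to reach the compute and return
    compute_ms: float             # time to run the model once there

    @property
    def total_ms(self) -> float:
        return self.network_round_trip_ms + self.compute_ms


def choose_path(paths: List[InferencePath], deadline_ms: float) -> Optional[InferencePath]:
    """Return the fastest path that meets the deadline, or None if none does."""
    feasible = [p for p in paths if p.total_ms <= deadline_ms]
    return min(feasible, key=lambda p: p.total_ms) if feasible else None


# Illustrative numbers only: a modest on-device chip, a nearby edge
# data center reached over a low-latency link, and a distant cloud.
paths = [
    InferencePath("device", network_round_trip_ms=0.0, compute_ms=40.0),
    InferencePath("network_edge", network_round_trip_ms=2.0, compute_ms=10.0),
    InferencePath("cloud", network_round_trip_ms=60.0, compute_ms=5.0),
]

best = choose_path(paths, deadline_ms=30.0)
print(best.name if best else "no path meets the deadline")
```

With these placeholder values, only the network edge fits inside a 30 ms budget, which is the scenario the argument above depends on: the last-mile latency has to be small enough that remote compute beats a cheaper on-device chip.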
These words should be music to the ears of 5G service providers. In fact, Tractica has been presenting this business case to several operators. Chinese operators in particular have shown enthusiasm for the concept, as they have been struggling with the 5G business case.
There are a few details that need to be worked out. How does the operator charge for the AI edge data center? Is it by processor time, data processed, or simply space rental? Who will provide the hardware for the AI edge data center, the service provider or the end customer? Will the economics of shifting the processing to the AI edge data center work in favor of the device vendor, in this case the autonomous car manufacturer?
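The three charging options raised above (processor time, data processed, space rental) can be compared with a simple calculation. The following is a rough sketch only: the function names, usage figures, and rates are all invented for illustration and do not reflect any operator's pricing.

```python
def charge_by_processor_time(hours: int, rate_per_hour: int) -> int:
    """Bill for the processor hours consumed."""
    return hours * rate_per_hour


def charge_by_data_processed(terabytes: int, rate_per_tb: int) -> int:
    """Bill for the volume of data pushed through the edge data center."""
    return terabytes * rate_per_tb


def charge_by_space_rental(rack_units: int, rate_per_unit: int) -> int:
    """Flat monthly rent for space when the customer supplies the hardware."""
    return rack_units * rate_per_unit


# Hypothetical monthly usage and rates for one customer (arbitrary units).
monthly_bills = {
    "processor_time": charge_by_processor_time(hours=200, rate_per_hour=3),
    "data_processed": charge_by_data_processed(terabytes=5, rate_per_tb=100),
    "space_rental": charge_by_space_rental(rack_units=2, rate_per_unit=400),
}

cheapest = min(monthly_bills, key=monthly_bills.get)
for model, bill in monthly_bills.items():
    print(f"{model}: {bill}")
print(f"cheapest for this customer: {cheapest}")
```

Which model comes out ahead depends entirely on the workload mix, which is why the question of who supplies the hardware and how usage is metered remains open.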
The answers to these questions will emerge over the next few years as the telecom ecosystem, including service providers and telecom vendors, starts talking to the AI application providers and device manufacturers. While an automotive business case is described above, depending on an operator's strengths or existing business capabilities, the focus could also be on industrial or security cameras. Different economics and ROIs will emerge depending on the use case, and so there is a lot of work that needs to happen to make the 5G-enabled business proposition real.
However, the time to start building on these propositions is now if industry players want to stay ahead of the competition. The 5G-enabled AI edge data center is a business-to-business (B2B) proposition that will sit beside others like content caching or cloud gaming at the edge. As AI penetrates almost every application, though, the importance of an AI edge data center will increase. In other words, as operators create 5G network slices for different services, many of them will be slices of AI edge data center services.
Research Director, Tractica