Cheng Wang, Co-founder and CTO of Flex Logix, presents the “Challenges in Architecting Vision Inference Systems for Transformer Models” tutorial at the May 2023 Embedded Vision Summit.
When used correctly, transformer neural networks can deliver greater accuracy with less computation. But transformers are challenging for existing AI engine architectures because they rely on many compute functions, such as softmax and layer normalization, that were not required by previously prevalent convolutional neural networks.
In this talk, Wang explores key transformer compute requirements and highlights how they differ from CNN compute requirements. He then introduces Flex Logix's InferX X1 AI accelerator silicon IP. Wang shows how the dynamic TPU array architecture used by InferX efficiently executes transformer neural networks. He also explains how InferX integrates into your system and how it scales to meet varying cost and performance requirements.
See here for a PDF of the slides.