Rajy Meeyakhan Rawther, PMTS Software Architect in the Machine Learning Software Engineering group at AMD, presents the “Parallelizing Machine Learning Applications in the Cloud with Kubernetes: A Case Study” tutorial at the September 2020 Embedded Vision Summit.
In this talk, Rawther presents techniques for obtaining the best inference performance when deploying machine learning applications in the cloud. With the increasing use of AI in applications ranging from image classification and object detection to natural language processing, it is vital to deploy AI applications in ways that are scalable and efficient. Much work has focused on distributing DNN training for parallel execution using machine learning frameworks (TensorFlow, MXNet, PyTorch and others). Less work has addressed scaling and deploying trained models on multi-processor systems.
Rawther presents a case study of scaling an image classification application in the cloud using multiple Kubernetes pods. She explores the factors and bottlenecks affecting performance and examines techniques for building a scalable application pipeline.
See here for a PDF of the slides.