“Image Tokenization for Distributed Neural Cascades,” a Presentation from Google and VeriSilicon

Derek Chow, Software Engineer at Google, and Shang-Hung Lin, Vice President of NPU Technology at VeriSilicon, co-present the “Image Tokenization for Distributed Neural Cascades” tutorial at the May 2025 Embedded Vision Summit.

Multimodal LLMs promise to bring exciting new abilities to devices! As foundation models become more capable, their compute requirements grow as well. It is not uncommon for LLMs to reach tens of billions of parameters, and model size is growing faster than the compute that embedded processors can provide.

In this talk, Chow and Lin introduce the concept of a “neural cascade,” a scheme that allows for division of computation across devices. They present a recipe for constructing a neural cascade from a pre-existing LLM and they show how this system harmonizes edge and cloud devices to enable new experiences.
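The abstract does not spell out how the computation is divided, so the following is only a toy sketch of the general idea, not the speakers' recipe. It assumes a two-stage split in which the device turns an image into a short list of discrete tokens and only those tokens are passed to a larger model elsewhere. The names `edge_tokenize`, `cloud_stage`, and `run_cascade`, the patch size, and the quantization scheme are all hypothetical and chosen for illustration.

```python
from typing import List

# A grayscale image as rows of pixel values in the range 0-255.
Image = List[List[int]]


def edge_tokenize(image: Image, patch: int = 2, levels: int = 4) -> List[int]:
    """Hypothetical edge stage: map each patch x patch block to one token id.

    A real tokenizer would be a learned network; here the token is simply
    the quantized mean intensity of the block.
    """
    height, width = len(image), len(image[0])
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            block = [
                image[r][c]
                for r in range(top, min(top + patch, height))
                for c in range(left, min(left + patch, width))
            ]
            mean = sum(block) / len(block)
            # Quantize the mean into one of `levels` token ids.
            tokens.append(min(int(mean * levels / 256), levels - 1))
    return tokens


def cloud_stage(tokens: List[int]) -> str:
    """Hypothetical cloud stage: a stand-in for the large model.

    It only ever sees tokens, never the raw pixels.
    """
    bright = sum(1 for t in tokens if t >= 2)
    return "mostly bright" if bright * 2 > len(tokens) else "mostly dark"


def run_cascade(image: Image) -> str:
    """Run the edge stage, then hand its tokens to the cloud stage."""
    tokens = edge_tokenize(image)  # would run on the device
    return cloud_stage(tokens)     # would run remotely, fed with tokens only


if __name__ == "__main__":
    sample = [
        [0, 0, 255, 255],
        [0, 0, 255, 255],
        [200, 200, 10, 10],
        [200, 200, 10, 10],
    ]
    print(edge_tokenize(sample))
    print(run_cascade(sample))
```

In this sketch a 16-pixel image is reduced to 4 tokens before anything leaves the device, which illustrates why splitting a model at a token boundary can reduce both on-device compute and the data sent over the network.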

See here for a PDF of the slides.
