“Depth Estimation from Monocular Images Using Geometric Foundation Models,” a Presentation from Toyota Research Institute

October 15, 2025

Rareș Ambruș, Senior Manager for Large Behavior Models at Toyota Research Institute, presents the “Depth Estimation from Monocular Images Using Geometric Foundation Models” tutorial at the May 2025 Embedded Vision Summit.

In this presentation, Ambruș looks at recent advances in depth estimation from images. He first focuses on estimating metric depth from monocular camera images captured in different domains and with different camera parameters.
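To illustrate why camera parameters matter for metric depth, here is a minimal sketch that back-projects a depth map into 3D points under a standard pinhole camera model. It is not taken from the presentation; the function name `unproject`, the toy depth values, and the intrinsics `fx`, `fy`, `cx`, `cy` are all assumptions chosen for illustration.

```python
def unproject(depth, fx, fy, cx, cy):
    """Back-project a metric depth map into 3D camera-frame points.

    depth is a list of rows, each holding per-pixel depth in meters.
    fx, fy are focal lengths in pixels; cx, cy is the principal point.
    Assumes an ideal pinhole camera with no lens distortion.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            x = (u - cx) * z / fx  # horizontal offset scaled by depth
            y = (v - cy) * z / fy  # vertical offset scaled by depth
            points.append((x, y, z))
    return points


# Hypothetical 2x2 depth map (meters) and two hypothetical cameras.
depth = [[2.0, 2.0], [4.0, 4.0]]
points = unproject(depth, fx=100.0, fy=100.0, cx=0.5, cy=0.5)
points_long_lens = unproject(depth, fx=200.0, fy=200.0, cx=0.5, cy=0.5)
```

The same depth values yield different 3D points once the focal length changes, which is one reason a model that predicts metric depth across cameras has to account for the intrinsics in some way.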

Next, Ambruș looks at extensions to the multi-view setting and covers an efficient diffusion-based architecture capable of encoding hundreds of images and rendering depth and RGB images from novel viewpoints. Throughout the presentation, he focuses on the interplay between architectural inductive bias, training data, and optimization objectives, and on their combined effect on building geometric foundation models that estimate 3D structure from images.

See here for a PDF of the slides.
