Articles

Optimizing Transformer-based Diffusion Models for Video Generation with NVIDIA TensorRT

This article was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. State-of-the-art image diffusion models take tens of seconds to process a single image. This makes video diffusion even more challenging, requiring significant computational resources and high costs. By leveraging the latest FP8 quantization features on NVIDIA Hopper GPUs […]

Enable Pose Detection on Snapdragon X Elite: Step-by-step Tutorial

This article was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm. I know why you’re here: you’ve decided to buy your first device with a Snapdragon X Elite processor. Awesome choice! You’ve now ventured over to Qualcomm AI Hub, grabbed a model and excitedly watched as it downloaded. “Hmmm okay…

Video Understanding: Qwen2-VL, An Expert Vision-language Model

This article was originally published at Tenyks’ website. It is reprinted here with the permission of Tenyks. Qwen2-VL, an advanced vision language model built on Qwen2 [1], sets new benchmarks in image comprehension across varied resolutions and ratios, while also tackling extended video content. Though Qwen2-VL excels on many fronts, this article explores the model’s

Build Real-time Multimodal XR Apps with NVIDIA AI Blueprint for Video Search and Summarization

This article was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. With the recent advancements in generative AI and vision foundational models, VLMs present a new wave of visual computing wherein the models are capable of highly sophisticated perception and deep contextual understanding. These intelligent solutions offer a promising

Scalable Video Search: Cascading Foundation Models

This article was originally published at Tenyks’ website. It is reprinted here with the permission of Tenyks. Video has become the lingua franca of the digital age, but its ubiquity presents a unique challenge: how do we efficiently extract meaningful information from this ocean of visual data? In Part 1 of this series, we navigate

Building a Simple VLM-based Multimodal Information Retrieval System with NVIDIA NIM

This article was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. In today’s data-driven world, the ability to retrieve accurate information from even modest amounts of data is vital for developers seeking streamlined, effective solutions for quick deployments, prototyping, or experimentation. One of the key challenges in information retrieval

AutoML Decoded: The Ultimate Guide and Tools Comparison

This article was originally published at Tryolabs’ website. It is reprinted here with the permission of Tryolabs. The quest for efficient and user-friendly solutions has led to the emergence of a game-changing concept: Automated Machine Learning (AutoML). AutoML is the process of automating the tasks involved in the entire Machine Learning lifecycle, such as data

Zero-Shot AI: The End of Fine-tuning as We Know It?

This article was originally published at Tenyks’ website. It is reprinted here with the permission of Tenyks. Models like SAM 2, LLaVA or ChatGPT can perform tasks without task-specific training. This has people wondering whether the old way of training AI (i.e., fine-tuning) is becoming outdated. In this article, we compare two models: YOLOv8 (fine-tuning)

Here you’ll find a wealth of practical technical insights and expert advice to help you bring AI and visual intelligence into your products without flying blind.

Contact

Address

Berkeley Design Technology, Inc.
PO Box #4446
Walnut Creek, CA 94596

Phone: +1 (925) 954-1411