Power Cloud-native Microservices at the Edge with NVIDIA JetPack 6.0, Now GA

This article was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA.

NVIDIA JetPack SDK powers NVIDIA Jetson modules, offering a comprehensive solution for building end-to-end accelerated AI applications. JetPack 6 expands the Jetson platform’s flexibility and scalability with microservices and a host of new features. It’s the most downloaded version of JetPack in 2024.

With the JetPack 6.0 production release now generally available, developers can confidently bring these new capabilities to the most advanced embedded AI and robotics applications. This post highlights the key features and new AI workflows.

JetPack 6 feature highlights

JetPack 6 enables an expanding array of Linux-based distributions on Jetson. These include Ubuntu Server from Canonical, RHEL 9.4 by Red Hat, SUSE, Wind River Linux, RedHawk Real Time OS, and various Yocto-based distributions. These Linux-based distributions provide commercially supported enterprise offerings on Jetson to deploy and manage Jetson-based products with confidence.

The ability to run any Linux kernel enables Jetson customers to use their kernel version of choice and avoids having to spend resources backporting their drivers to a specific Jetson Linux kernel. Jetson customers can maintain their kernels independent of the JetPack road map.

With JetPack 6, you have the freedom to upgrade the compute stack without upgrading the Jetson Linux BSP. This feature has been especially popular in the community.

In addition, JetPack 6 adds Jetson Platform Services to Jetson Linux BSP and Jetson AI Stack. Jetson Platform Services is a suite of prebuilt and customizable services designed to accelerate AI application development on Jetson devices. These collections of modular services enable true cloud-native applications that are API-driven and disaggregated.

Modular, disaggregated, fungible architecture with Jetson Platform Services

Jetson Platform Services, now available as part of JetPack 6, provides a modular architecture with a large collection of customizable software and reusable microservices for building vision AI applications. It offers foundation services for infrastructural capabilities, AI services for insight generation, and a reference cloud for secure edge-to-cloud connectivity.

The diverse set of microservices includes Video Storage Toolkit (VST), AI perception service based on NVIDIA DeepStream, generative AI inference service, analytics service, and more. Each provides APIs to configure and access the functionality of the microservices.

These APIs are presented externally to the system using the API gateway (Ingress) foundation service, which follows a standard pattern in cloud-native architectures: exposing the APIs within a system through a single gateway. Client applications exercise microservice functionality by invoking the respective APIs through the API gateway service.
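
The single-gateway pattern can be illustrated with a minimal Python sketch. The route prefixes, service names, and ports below are hypothetical and are not taken from the Jetson Platform Services configuration; the sketch only shows how a gateway might map an external request path to an internal service by longest matching prefix.

```python
from typing import Dict, Optional

# Hypothetical route table: external path prefix -> internal service address.
# Prefixes and ports are illustrative assumptions, not the real configuration.
ROUTES: Dict[str, str] = {
    "/vst/": "http://localhost:30000",
    "/analytics/": "http://localhost:30010",
    "/vlm/": "http://localhost:30020",
}


def resolve(path: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the internal URL for an external request path, or None."""
    # Pick the longest matching prefix so that more specific routes win.
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        return None
    prefix = max(matches, key=len)
    # Strip the routing prefix and forward the remainder to the service.
    return routes[prefix] + "/" + path[len(prefix):]
```

Because clients only ever see the gateway's paths, an internal service can move to a different port or be replaced without any change on the client side.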

Figure 1. NVIDIA JetPack 6.0 stack

Jetson Platform Services also provides an IoT cloud module that enables clients to be authenticated and authorized when accessing these APIs remotely. This IoT cloud module is cloud agnostic and can run from any public or private cloud.

Figure 2. Cloud-native workflow on NVIDIA Jetson

AI services

A collection of AI services provides optimized video processing and AI inference capabilities leveraging a combination of AI models, multi-object tracking, and streaming analytics techniques. These are containerized software with standardized APIs that can be integrated into an end application, as illustrated with the reference workflows.

AI inference service for VLM

Vision Language Models (VLMs) enable semantic understanding of images and videos by combining vision modalities with LLMs. The AI inference service for VLM enables accessing VLM functionality through standardized APIs. The service can be instantiated using one of two supported models (VILA or LLaVA), and provides two main capabilities:

Set conditions for creating alerts from streaming video through natural language prompts
Query (prompt) the video and get responses back using natural language

VLMs generally place considerable demands on GPU and memory. They come in different sizes based on the number of parameters. VILA is available in 13B, 7B, and 2.7B variants. A model's ability to grasp the semantics of an image improves as the parameter count increases, but at the cost of higher GPU usage and memory utilization. Select the model based on your choice of Jetson platform and on the system resources that remain available alongside your other workloads.
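
As a rough guide to this trade-off, the memory taken by the model weights alone can be estimated as parameter count times bits per parameter. The sketch below is a back-of-the-envelope calculation, not a measurement: the 16-bit and 4-bit widths are illustrative assumptions, and the estimate ignores the vision encoder's activations, the KV cache, and runtime overhead, all of which add to the real footprint.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9


# Compare the three VILA sizes at two assumed weight precisions.
for size in (13, 7, 2.7):
    fp16 = weight_memory_gb(size, 16)
    int4 = weight_memory_gb(size, 4)
    print(f"VILA {size}B: ~{fp16:.1f} GB at 16-bit, ~{int4:.1f} GB at 4-bit")
```

Since the CPU and GPU on Jetson share the same physical memory, this figure has to fit alongside the operating system and every other service running on the device.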

This service can be integrated into generative AI workflows, as detailed in the next section.

AI analytics service

Video analytics applications often involve analyzing the movement of people or objects across a camera’s field of view. The AI analytics service operates on metadata generated by an inference service, such as detection or tracking. This service takes the streaming metadata and generates spatial and temporal insights into object movement. Core functionality of this service includes:

Line crossing (tripwire): Define virtual polylines in the camera’s field of view and maintain counts of objects crossing the line over a period of time.
Region of interest: Define enclosed polygons and maintain time-series counts of objects within the region. For example, this can be used to detect when the number of people waiting in a checkout line reaches a certain limit.
Behavior analytics: Helps retrieve trajectories of objects moving through the camera’s field of view. This capability can be used to understand trends in object movement by creating heat-map visualizations (Figure 3).
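
The geometry behind line crossing and region-of-interest counting can be sketched in a few lines of Python. This is a generic illustration and not the service's implementation: the function names are hypothetical, the tripwire is a single segment (a polyline would be checked one segment at a time), and objects are reduced to one tracked point each.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Signed area of triangle abc; the sign tells which side of a->b c is on."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def crosses(p_prev: Point, p_curr: Point, a: Point, b: Point) -> bool:
    """True if the move p_prev -> p_curr strictly crosses tripwire segment a-b."""
    d1 = _orient(a, b, p_prev)
    d2 = _orient(a, b, p_curr)
    d3 = _orient(p_prev, p_curr, a)
    d4 = _orient(p_prev, p_curr, b)
    return d1 * d2 < 0 and d3 * d4 < 0


def count_crossings(track: List[Point], a: Point, b: Point) -> int:
    """Count how many consecutive steps of one trajectory cross the tripwire."""
    return sum(crosses(p, q, a, b) for p, q in zip(track, track[1:]))


def in_polygon(p: Point, poly: List[Point]) -> bool:
    """Ray-casting test: is point p inside the polygon (vertices in order)?"""
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        # Does a horizontal ray from p toward +x cross this edge?
        if (y1 > p[1]) != (y2 > p[1]):
            x_at_y = x1 + (p[1] - y1) * (x2 - x1) / (y2 - y1)
            if p[0] < x_at_y:
                inside = not inside
    return inside


def roi_count(positions: List[Point], poly: List[Point]) -> int:
    """Number of objects currently inside the region of interest."""
    return sum(in_polygon(p, poly) for p in positions)
```

Calling `roi_count` once per frame and storing the result with a timestamp yields the kind of time-series count described above, for example to flag when a checkout queue exceeds a chosen limit.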

Figure 3. Heat-map visualizations are available through the AI analytics service
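
A heat map of this kind is, at its simplest, a grid of counts built from trajectory points. The following minimal sketch assumes trajectories are available as (x, y) pixel coordinates; the function name, grid size, and frame dimensions are hypothetical choices for illustration and do not reflect how the service computes its visualizations.

```python
from typing import Iterable, List, Tuple


def heatmap(points: Iterable[Tuple[float, float]], width: float, height: float,
            cols: int, rows: int) -> List[List[int]]:
    """Count trajectory points per grid cell over a width x height frame."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        # Ignore points that fall outside the frame.
        if 0 <= x < width and 0 <= y < height:
            c = int(x * cols / width)
            r = int(y * rows / height)
            grid[r][c] += 1
    return grid
```

Cells with higher counts correspond to areas where objects spend more time or pass through more often, which a visualization layer can then map to colors.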

All the analytics highlighted here can be generated and extracted using APIs. For more information about the AI analytics service, see the Jetson Platform Services release documentation.

Foundation services

Foundation services provide domain-agnostic capabilities for assembling production-grade AI systems, including camera management, storage management, IoT, API gateway, and message bus. The associated services can be conveniently installed through SDK Manager (beginning with the JetPack 6.0 GA release) and are then deployed as Linux services. Foundation services include:

Video Storage Toolkit (VST) service: Supports automatic discovery of ONVIF-compliant cameras, along with ingestion, storage, and streaming of video streams from cameras. The downstream AI services or any application can consume these streams from VST through standard streaming protocols such as RTSP or WebRTC. VST is optimized for handling large numbers of connected cameras and leverages underlying hardware-accelerated support in Jetson for video decode and encode, scaling and preprocessing, and overlay generation.
Storage service: Storage provisioning and management supports automatic provisioning of SATA and NVMe storage attached to a Jetson device to supplement on-board storage and storage allocation among various microservices. Storage service supports logical volumes spanning multiple drives (including addition of drives over time) and disk encryption (for data-at-rest protection) through standard LUKS capability provided by Jetson Linux.
Network services: Supports configuration of Ethernet interfaces used for connecting to IP cameras using on-board or external PoE switches and sets up DHCP for IP address allocation during camera startup.
Redis service: A unified system message bus on Jetson that supports messaging and synchronization between various microservices along with serving as a time series database for analytics.
API gateway (Ingress): Most microservices publish APIs for other services and applications to invoke. The Ingress service provides a standard mechanism to present these API endpoints. Incoming requests are routed to the appropriate microservices based on configured routes thereby keeping the underlying microservices architecture abstracted away from the API consumer.
Monitoring: To monitor your application and the services running on the device, the Monitoring service provides hooks to collect this data using Prometheus. It also includes a Grafana dashboard for visualization, which can be accessed remotely using the API gateway (Ingress) service. It includes a system monitoring service for tracking system utilization including CPU and GPU, memory, and disk (collected using node-exporter).
IoT gateway: For applications that use any cloud services, the IoT gateway service provides a provisioning agent to authenticate the device and securely connect to the cloud. It establishes a bidirectional TCP connection with the cloud to enable the device (which may be behind a firewall) to communicate with the cloud. Incoming traffic is forwarded to registered internal endpoints through the Ingress service. It also supports push notifications for events from various microservices to external clients through the cloud, and can be extended to support custom events derived from user-created microservices.
Firewall: If you need a firewall to protect your device, especially in production, this service sets up UFW (Uncomplicated Firewall) with some default rules, which can be modified per your needs to control ingress and egress network traffic to the system.
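
To give a feel for the data the Monitoring service collects, the sketch below parses a few lines in the Prometheus text exposition format, which is what exporters such as node-exporter serve. It is a simplified reader for illustration only: the sample values are made up, and the parser assumes there is no trailing timestamp on a sample line.

```python
from typing import Dict


def parse_metrics(text: str) -> Dict[str, float]:
    """Map each sample's name (including any labels) to its value."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated field on the line.
        name, value = line.rsplit(None, 1)
        samples[name] = float(value)
    return samples


# Made-up sample in the exposition format, using node-exporter metric names.
SAMPLE = """\
# HELP node_memory_MemAvailable_bytes Memory information field MemAvailable_bytes.
# TYPE node_memory_MemAvailable_bytes gauge
node_memory_MemAvailable_bytes 4.2e+09
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""

metrics = parse_metrics(SAMPLE)
print(metrics["node_memory_MemAvailable_bytes"])
```

In practice a Prometheus server scrapes and stores these samples, and the dashboard queries the server rather than reading the raw text.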

Enabling generative AI workflows

With Jetson Platform Services, you can quickly build AI applications for the edge. To further accelerate this journey, there are several reference workflows you can use, including generative AI workflows. These workflows illustrate best practices for configuring and instantiating various Jetson services. They provide a recipe for building complex vision AI applications using APIs and the services outlined previously. You can customize or build on top of these workflows using APIs. The workflows are packaged as a Docker Compose file, along with a reference mobile application to show how to leverage the APIs.

AI-NVR

The AI-NVR (Network Video Recorder) is an end-to-end reference application for building AI-based NVR solutions. Its features include video management and storage, people occupancy and heat map metrics, user authentication and authorization, device security and encrypted storage, and a reference mobile app. This workflow has been optimized for channel throughput and performance by utilizing all the different accelerators on the Jetson platform. The reference workflow uses the DeepStream AI perception service with a highly accurate NVIDIA PeopleNet model and a multi-object tracker. You have the flexibility to customize the perception service with your own AI model, or bring your own perception service. To learn more about this workflow, watch the AI-NVR Overview.

Generative AI-powered alerts at the edge

With generative AI-powered alerts, you can use VLMs to extract insights from videos and generate alerts using natural language. Combining both vision and language modalities, these models are trained on large datasets consisting of text, images, and videos and can understand natural language prompts and perform visual question answering.

VLMs extend beyond basic object detection and classification and provide a deeper contextual understanding of the scene. This workflow offers two capabilities. First, you can use APIs to set alerts on the input video stream in natural language, for example, “alert if there is a fire.” Second, you can perform Q&A on the video.

Generative AI-powered alerts enable live Q&A on video using VLMs

Zero-shot detection using generative AI

The zero-shot detection workflow uses the NanoOWL model, an open-vocabulary model that can detect any number of objects. Unlike a traditional object detection model, which is trained on a fixed number of classes, the open-vocabulary model is trained on internet-scale data, which makes it capable of detecting most common objects without explicit training on those classes. With this workflow, users can dynamically detect objects by using APIs to prompt the model with the classes to detect. To learn more about this workflow, see Bringing Generative AI to the Edge with NVIDIA Metropolis Microservices for Jetson.

Expanded Jetson support

Jetson Platform Services is compatible with all devices in the Orin family, from Orin Nano to AGX Orin. The foundation services are supported across all these devices, and can be installed using SDK Manager. Similarly, the AI-NVR workflow is supported on all devices, though the number of streams will vary depending on the hardware configuration.

For the VLM reference workflows, model choice needs to be made with the Jetson platform in mind. Refer to the VLM reference page on Jetson AI Lab for information about expected stream count on Jetson AGX Orin and Orin Nano. When deciding on the model, also take into account any other workloads specific to your case that might require GPU and memory resources.

Enabling production deployment

Production systems require robust, reliable hardware. NVIDIA has a deep partnership with many OEMs that can provide a production-quality carrier board and packaging. Some of our partners have also integrated and validated the aforementioned workflows and Jetson Platform Services. This guarantees that all the services will work out of the box on their platform. Partners that have integrated JetPack 6 and Jetson Platform Services include:

Yuan
Aetina
Aaeon
Advantech
AVerMedia
Seeed Studio
CRG

Once you build your system and create your application, the final step in productization is the deployment and management of the application. Applications might also need to be frequently updated in the field, which requires a remote over-the-air (OTA) update. We’re delighted to partner with a few of the leading fleet management companies who have integrated Jetson Platform Services and can provide a turnkey solution to deploy and update your edge applications. Partners include:

Namla
Allxon
Mender

Summary

NVIDIA JetPack 6.0 delivers a host of new features, from enhancements at the Linux BSP layer and the AI stack, to a new way of building edge applications. It introduces Jetson Platform Services, a collection of cloud-native, modular services that come with standardized APIs that can be quickly integrated into workflows.

Take advantage of these services and workflows to accelerate generative AI application development on the edge. To get started on your next generative AI application, download JetPack 6.0. For technical questions, visit the NVIDIA Jetson forum.

Related resources

GTC session: A New Class of Cloud-Native Applications at the Far Edge With Generative AI
GTC session: Bringing Generative AI and Vision AI to Production at the Edge with Metropolis Microservices for Jetson
GTC session: Taking Generative AI and Vision AI to Production at the Edge
NGC Containers: GenAI SD NIM
Webinar: Accelerate Edge AI Development With Metropolis APIs and Microservices on Jetson
Webinar: JetPack 6: The Biggest Upgrade to JetPack Ever

Chintan Shah
Senior Product Manager, NVIDIA

Suhas Hariharapura Sheshadri
Product Manager, NVIDIA

Bhanu Pisupati
Design and Development Lead, Metropolis, NVIDIA

Sudesh Kamath
Contributing Developer, Metropolis, NVIDIA
