R²D²: Adapting Dexterous Robots with NVIDIA Research Workflows and Models

This blog post was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA.

Robotic arms are used today for assembly, packaging, inspection, and many other applications. However, they are still preprogrammed to perform specific, often repetitive tasks. To meet the growing need for adaptability in most environments, perceptive arms are needed that can make decisions and adjust their behavior based on real-time data. This brings more flexibility across tasks in collaborative environments and improves safety through hazard awareness.

This edition of NVIDIA Robotics Research and Development Digest (R2D2) explores several robot dexterity, manipulation, and grasping workflows and AI models from NVIDIA Research (listed below) and how they can address key robot challenges such as adaptability and data scarcity:

  • DextrAH-RGB: A workflow for dexterous grasping from stereo RGB input.
  • DexMimicGen: A data generation pipeline for bimanual dexterous manipulation using imitation learning (IL). Featured at ICRA 2025.
  • GraspGen: A synthetic dataset of over 57 million grasps for different robots and grippers.

What are dexterous robots?

Dexterous robots manipulate objects with precision, adaptability, and efficiency. Robot dexterity involves fine motor control, coordination, and the ability to handle a wide range of tasks, often in unstructured environments. Key aspects of robot dexterity include grip, manipulation, tactile sensitivity, agility, and coordination.

Robot dexterity is crucial in industries like manufacturing, healthcare, and logistics, enabling automation of tasks that traditionally require human-like precision.

NVIDIA robot dexterity and manipulation workflows and models

Dexterous grasping is a challenging task in robotics, requiring robots to manipulate a wide variety of objects with precision and speed. Traditional methods struggle with reflective objects and do not generalize well to new objects or dynamic environments.

NVIDIA Research addresses these challenges by developing end-to-end foundation models and workflows that enable robust manipulation across objects and environments.

DextrAH-RGB for dexterous grasping

DextrAH-RGB is a workflow that performs dexterous arm-hand grasping from stereo RGB input. Using this workflow, policies are trained entirely in simulation and can generalize to novel objects when deployed. DextrAH-RGB is trained at scale in simulation across different objects using NVIDIA Isaac Lab.

Video 1. Training DextrAH-RGB in simulation

The training pipeline consists of two stages. First, a teacher policy is trained in simulation using reinforcement learning (RL). The teacher is a privileged fabric-guided policy (FGP) that acts on a geometric fabric action space. Geometric fabrics are a form of vectorized low-level control that defines motion as joint position, velocity, and acceleration signals transmitted as commands to the robot’s controller. By embedding collision avoidance and goal-reaching behaviors, the fabric ensures safety and reactivity at deployment while enabling rapid iteration.
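As a rough intuition for the fabric action space, the sketch below blends a policy acceleration command with hand-written goal-attraction and joint-limit terms, then integrates it into position, velocity, and acceleration commands. This is a conceptual toy, not the actual geometric fabric formulation used in DextrAH-RGB; the control period, joint limits, and gains are arbitrary assumptions.

```python
import numpy as np

# Conceptual sketch of a fabric-style action space (illustrative only).
# The policy commands an acceleration-like target; the fabric layer blends it
# with goal-attraction and joint-limit-repulsion terms, then integrates it into
# the joint position/velocity/acceleration commands sent to the controller.

DT = 1.0 / 60.0           # control period (assumed)
Q_MIN, Q_MAX = -2.5, 2.5  # symmetric joint limits in radians (assumed)

def fabric_step(q, qd, policy_accel, q_goal, k_goal=4.0, k_limit=20.0, damping=2.0):
    """Blend the policy acceleration with safety/goal terms and integrate once."""
    goal_term = k_goal * (q_goal - q) - damping * qd            # attract toward the goal
    margin_lo = np.maximum(q - Q_MIN, 1e-3)                     # distance to lower limit
    margin_hi = np.maximum(Q_MAX - q, 1e-3)                     # distance to upper limit
    limit_term = k_limit * (1.0 / margin_lo - 1.0 / margin_hi)  # push away from joint limits
    qdd = policy_accel + goal_term + limit_term                 # fused acceleration command
    qd_next = qd + qdd * DT                                     # integrate velocity
    q_next = q + qd_next * DT                                   # integrate position
    return q_next, qd_next, qdd                                 # commands for the controller

# Example: a 7-DOF arm with a zero policy action still moves toward the goal.
q, qd, q_goal = np.zeros(7), np.zeros(7), np.full(7, 0.5)
for _ in range(10):
    q, qd, qdd = fabric_step(q, qd, np.zeros(7), q_goal)
print(q.round(3))
```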

The teacher policy includes an LSTM layer that reasons about and adapts to the world’s physics. This helps it incorporate corrective behaviors, such as regrasping and assessing grasp success, in response to the current dynamics. The first stage of training ensures robustness and adaptability through domain randomization: physics, visual, and perturbation parameters are varied to progressively increase environmental difficulty as the teacher policy is trained.
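The snippet below sketches what such curriculum-style randomization can look like: sampling ranges for a few physics, visual, and perturbation parameters widen as training progresses. The parameter names and ranges are illustrative assumptions, not the values used in DextrAH-RGB.

```python
import numpy as np

# Curriculum-style domain randomization sketch (illustrative only).
# As training progresses, the sampling ranges for physics, visual, and
# perturbation parameters widen so environments get progressively harder.

BASE_RANGES = {
    "friction":        (0.9, 1.1),   # contact friction multiplier
    "object_mass":     (0.95, 1.05), # object mass multiplier
    "camera_jitter_m": (0.0, 0.005), # camera pose noise (meters)
    "push_force_n":    (0.0, 0.5),   # random perturbation force (newtons)
}

def sample_randomization(progress, rng):
    """Sample one environment's parameters; `progress` in [0, 1] widens the ranges."""
    params = {}
    for name, (lo, hi) in BASE_RANGES.items():
        center = 0.5 * (lo + hi)
        half = 0.5 * (hi - lo) * (1.0 + 3.0 * progress)  # up to 4x wider at the end
        value = rng.uniform(center - half, center + half)
        params[name] = float(np.clip(value, 0.0, None))  # keep physical parameters non-negative
    return params

rng = np.random.default_rng(0)
print(sample_randomization(progress=0.0, rng=rng))  # narrow ranges early in training
print(sample_randomization(progress=1.0, rng=rng))  # wide ranges late in training
```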

In the second stage of training, the teacher policy is distilled into an RGB-based student policy in simulation using photorealistic tiled rendering. This step uses an imitation learning framework called DAgger. The student policy receives RGB images from a stereo camera, enabling it to implicitly infer depth and object positions.
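A minimal DAgger-style distillation loop looks like the sketch below: the student drives rollouts from its own image-like observations, the frozen teacher relabels the visited states using its privileged inputs, and the student is trained with supervision on the aggregated dataset. The toy dimensions, networks, and rollout function are stand-ins, not the DextrAH-RGB implementation.

```python
import torch
import torch.nn as nn

# Minimal DAgger distillation sketch (illustrative; not the DextrAH-RGB code).
# The student sees only "RGB" observations (here a toy vector standing in for
# stereo images); the teacher sees privileged state and supplies action labels
# on the states the student itself visits.

torch.manual_seed(0)
OBS_DIM, STATE_DIM, ACT_DIM = 32, 8, 4  # toy dimensions (assumed)

teacher = nn.Linear(STATE_DIM, ACT_DIM)  # stands in for the frozen RL teacher
student = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, ACT_DIM))
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

def rollout_student(num_steps=64):
    """Toy rollout: return the student's observations and matching privileged states."""
    obs = torch.randn(num_steps, OBS_DIM)      # stand-in for rendered stereo RGB
    state = torch.randn(num_steps, STATE_DIM)  # stand-in for privileged sim state
    return obs, state

dataset_obs, dataset_act = [], []
for dagger_iter in range(5):
    obs, state = rollout_student()                        # student drives the rollout
    with torch.no_grad():
        labels = teacher(state)                           # teacher relabels visited states
    dataset_obs.append(obs); dataset_act.append(labels)   # aggregate the dataset
    all_obs, all_act = torch.cat(dataset_obs), torch.cat(dataset_act)
    for _ in range(20):                                   # supervised student update
        loss = nn.functional.mse_loss(student(all_obs), all_act)
        optimizer.zero_grad(); loss.backward(); optimizer.step()
    print(f"iter {dagger_iter}: distillation loss {loss.item():.4f}")
```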

Figure 1. DextrAH-RGB training pipeline

Simulation-to-real with Boston Dynamics Atlas MTS robot

NVIDIA and Boston Dynamics have been working together to train and deploy DextrAH-RGB. Figure 2 and Video 2 show a robotic system driven by a generalist policy with robust, zero-shot simulation-to-real grasping capabilities deployed on Atlas’ upper torso.

Figure 2. Training the teacher policy for Atlas at scale using NVIDIA Isaac Lab

The system exhibits a variety of grasps powered by Atlas’ three-fingered grippers, which can accommodate both lightweight and heavy objects, and shows emergent failure detection and retry behaviors.

Video 2. The Boston Dynamics Atlas MTS robot successfully grasps industrial objects using DextrAH-RGB

DexMimicGen for bimanual manipulation data generation

DexMimicGen is a workflow for bimanual manipulation data generation that uses a small number of human demonstrations to generate large-scale trajectory datasets. The aim is to reduce the tedious task of manual data collection by enabling robots to learn actions in simulation, which can be transferred to the real world. This workflow addresses the challenge of data scarcity in IL for bimanual dexterous robots like humanoids.

DexMimicGen uses simulation-based augmentation for generating datasets. First, a human demonstrator collects a handful of demonstrations using a teleoperation device. DexMimicGen then generates a large dataset of demonstration trajectories in simulation. For example, in the initial publication, researchers generated 21K demos from just 60 human demos using DexMimicGen. Finally, a policy is trained on the generated dataset using IL to perform a manipulation task and deployed to a physical robot.
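The core data-generation idea can be sketched as follows: a recorded end-effector segment is stored relative to the object it manipulates and replayed against a newly randomized object pose to synthesize a fresh trajectory. This is a simplified illustration of the MimicGen-style transform; the real pipeline operates per subtask and per arm, with collision checking and success filtering.

```python
import numpy as np

# Simplified MimicGen/DexMimicGen-style trajectory generation (illustrative only).
# A demo segment is expressed relative to the object it manipulates, then
# re-anchored to a freshly randomized object pose to produce a new demonstration.

def pose(x, y, yaw):
    """4x4 homogeneous transform for a planar pose (toy stand-in for full SE(3))."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    T[:2, 3] = [x, y]
    return T

# Source demo: object pose and a short end-effector segment, both in the world frame.
T_obj_src = pose(0.40, 0.00, 0.0)
demo_eef_world = [pose(0.40, -0.10, 0.0), pose(0.40, -0.02, 0.0), pose(0.40, 0.00, 0.0)]

# Express the demo relative to the source object, then re-anchor it to a new object pose.
demo_eef_rel = [np.linalg.inv(T_obj_src) @ T for T in demo_eef_world]

rng = np.random.default_rng(1)
T_obj_new = pose(rng.uniform(0.3, 0.5), rng.uniform(-0.1, 0.1), rng.uniform(-0.5, 0.5))
generated_eef_world = [T_obj_new @ T_rel for T_rel in demo_eef_rel]

for T in generated_eef_world:
    print(np.round(T[:2, 3], 3))  # synthesized end-effector waypoints in the world frame
```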

Figure 3. DexMimicGen workflow

Bimanual manipulation is challenging because the two arms must coordinate precisely across different kinds of tasks. Parallel tasks, like picking up a different object with each arm, require independent control policies. Coordinated tasks, like lifting a large object, require the arms to synchronize motion and timing. Sequential tasks require subtasks to be completed in a certain order, like moving a box with one hand and then putting an object in it with the other.

DexMimicGen accounts for these varying requirements during data generation with a parallel, coordination, and sequential taxonomy of subtasks: asynchronous execution strategies for independent arm subtasks, synchronization mechanisms for coordinated subtasks, and ordering constraints for sequential subtasks. This ensures precise alignment and logical task execution during data generation.
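As a concrete illustration of this taxonomy, the sketch below encodes subtasks tagged as parallel, coordination, or sequential and derives an execution schedule: parallel subtasks start together on their respective arms, coordination subtasks run alone so both arms stay synchronized, and sequential subtasks wait on their predecessors. The data structure and task names are hypothetical, not DexMimicGen’s internal representation.

```python
from dataclasses import dataclass, field

# Toy encoding of the parallel / coordination / sequential subtask taxonomy
# as a scheduling data structure (illustrative; not the DexMimicGen implementation).

@dataclass
class Subtask:
    name: str
    arm: str                                    # "left", "right", or "both"
    kind: str                                   # "parallel" | "coordination" | "sequential"
    after: list = field(default_factory=list)   # subtask names that must finish first

def schedule(subtasks):
    """Return execution batches: each batch is a list of subtasks started together."""
    done, batches, pending = set(), [], list(subtasks)
    while pending:
        ready = [t for t in pending if all(dep in done for dep in t.after)]
        if not ready:
            raise ValueError("cyclic or unsatisfiable ordering constraints")
        coord = [t for t in ready if t.kind == "coordination"]
        batch = coord[:1] if coord else ready    # coordination subtasks run alone, synchronized
        batches.append([t.name for t in batch])
        done.update(t.name for t in batch)
        pending = [t for t in pending if t.name not in done]
    return batches

tasks = [
    Subtask("pick_can_left",  "left",  "parallel"),
    Subtask("pick_can_right", "right", "parallel"),
    Subtask("lift_tray",      "both",  "coordination", after=["pick_can_left", "pick_can_right"]),
    Subtask("place_tray",     "both",  "sequential",   after=["lift_tray"]),
]
print(schedule(tasks))
# [['pick_can_left', 'pick_can_right'], ['lift_tray'], ['place_tray']]
```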

Figure 4. Successfully sorting cans using a model trained on data generated with DexMimicGen

When deployed in the real world, DexMimicGen enabled a humanoid to achieve a 90% success rate in a can sorting task using data generated through its real-to-sim-to-real pipeline. For comparison, when trained on human demonstrations alone, the model achieved 0% success. These observations highlight DexMimicGen’s effectiveness in reducing human effort while enabling robust robot learning for complex manipulation tasks.

GraspGen dataset for multiple robots and grippers

To support research, GraspGen provides a new simulated dataset, hosted on Hugging Face, with over 57 million grasps across three different grippers. The dataset includes 6D gripper transformations and success labels for different object meshes.

Figure 5. Proposed grasps for a range of different objects in the dataset

The three grippers are the Franka Panda gripper, the Robotiq 2F-140 industrial gripper, and a single-contact suction gripper. GraspGen was entirely generated in simulation, demonstrating the benefits of automatic data generation to scale datasets in size and diversity.
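After downloading the dataset from Hugging Face (for example with huggingface_hub’s snapshot_download), consuming it might look like the sketch below: each grasp is treated as a 4x4 gripper pose in the object frame plus a binary success label, and unsuccessful grasps are filtered out. The record layout and field names here are assumptions for illustration; refer to the dataset card for the actual schema.

```python
import numpy as np

# Sketch of consuming a GraspGen-style record (illustrative only).
# The field names and layout below are assumptions, not the published schema.
# Each grasp is assumed to be a 4x4 gripper pose in the object frame plus a
# binary success label.

record = {
    "object_mesh": "mug_0001.obj",                  # hypothetical mesh identifier
    "gripper": "robotiq_2f_140",                    # hypothetical gripper identifier
    "grasp_poses": np.tile(np.eye(4), (5, 1, 1)),   # (N, 4, 4) 6D gripper transforms
    "success": np.array([1, 0, 1, 1, 0], dtype=bool),
}

successful = record["grasp_poses"][record["success"]]          # keep successful grasps only
print(f"{record['gripper']}: {len(successful)} successful grasps for {record['object_mesh']}")
for T in successful[:2]:
    position, rotation = T[:3, 3], T[:3, :3]                   # split the 6-DOF pose
    print("grasp position:", position, "rotation det:", round(float(np.linalg.det(rotation)), 3))
```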

Figure 6. The coordinate frame convention for three grippers in the simulated GraspGen dataset: the Robotiq 2F-140 gripper (left), a single-contact suction gripper (center), and the Franka Panda gripper (right)

Summary

To meet the increasing need for adaptability in most environments, robotic arms must be able to make decisions and adjust their behavior based on real-time data. This post explored several robot dexterity, manipulation, and grasping workflows and AI models, and how they address key robot challenges such as adaptability and data scarcity.

This post is part of our NVIDIA Robotics Research and Development Digest (R2D2) to give developers deeper insight into the latest breakthroughs from NVIDIA Research across physical AI and robotics applications.

Stay up to date by subscribing to the newsletter and following NVIDIA Robotics on YouTube, Discord, and NVIDIA Developer Forums. To start your robotics journey, enroll in free NVIDIA Robotics Fundamentals courses.

Acknowledgments

For their contributions to the research mentioned in this post, thanks to Arthur Allshire, Mohak Bhardwaj, Mark Carlson, Yu-Wei Chao, Clemens Eppner, Gina Fay, Jim Fan, Dieter Fox, Ankur Handa, Zhenyu Jiang, Kevin Lin, Michael Lutter, Ajay Mandlekar, Adithyavairavan Murali, Nathan Ratliff, Fabio Ramos, Alberto Rodriguez, Ritvik Singh, Balakumar Sundaralingam, Karl Van Wyk, Weikang Wan, Wentao Yuan, Jun Yamada, Yuqi Xie, Zhenjia Xu, and Yuke Zhu.

Asawaree Bhide
Technical Marketing Engineer, NVIDIA

Yichao Pan
Senior Product Manager, NVIDIA

Nathan Ratliff
Director, NVIDIA
