The team trained their RL models in NVIDIA's Isaac Gym and transferred the learned policies to physical robots at a robotics lab in Europe. (Image Credit: University of Toronto)
Researchers at the University of Toronto, NVIDIA, and other organizations have unveiled a system that uses highly efficient deep reinforcement learning and simulated environments to train robotic hands at reduced cost. The complete setup, including physical robot hardware, training, and inference, can be purchased for just under $10,000.
The team built their system on TriFinger, an open-source robotic hand with three fingers and three joints per finger (nine degrees of freedom in total) that was designed to reduce the cost of robotics research and originally ran on the PyBullet physics engine. The overall goal was to improve simulated learning while keeping costs down. To achieve this, the researchers replaced PyBullet, a CPU-based simulator that is comparatively slow, noisy, and difficult to train in, with NVIDIA's Isaac Gym, a highly efficient simulated environment that runs on desktop-grade GPUs. Isaac Gym uses NVIDIA's GPU-accelerated PhysX engine to run thousands of parallel simulations on a single GPU, delivering approximately 100,000 samples per second on an RTX 3090.
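To give a rough sense of why that parallelism matters, here is a minimal sketch (in PyTorch, not the Isaac Gym API itself) of how stepping thousands of environments as one batched GPU operation multiplies sample throughput; the environment count, observation size, and toy dynamics are illustrative assumptions.

```python
import torch

# Illustrative sketch only -- not the actual Isaac Gym API. It shows how
# stepping thousands of simulated environments as a single batched GPU
# operation yields very high sample throughput.
NUM_ENVS = 16384          # assumed count; Isaac Gym runs thousands of envs per GPU
OBS_DIM, ACT_DIM = 41, 9  # OBS_DIM is assumed; TriFinger actuates nine joints

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
obs = torch.zeros(NUM_ENVS, OBS_DIM, device=device)

def step_batched(obs: torch.Tensor, actions: torch.Tensor) -> torch.Tensor:
    """Stand-in for one physics step applied to every environment at once."""
    return obs + 0.01 * actions.mean(dim=1, keepdim=True)  # placeholder dynamics

actions = torch.randn(NUM_ENVS, ACT_DIM, device=device)
obs = step_batched(obs, actions)  # one call advances all environments,
                                  # producing NUM_ENVS samples per step
```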
Thanks to the GPU-powered virtual environment's efficiency, the researchers were able to train their reinforcement learning models in a high-fidelity simulation without sacrificing speed. Higher fidelity makes the training environment more realistic, so the models require fewer modifications when moved to physical robots, which in turn shrinks the sim2real gap.
To test their reinforcement learning system, the team used a sample object-manipulation task. The RL model receives proprioceptive data along with eight keypoints that represent the target object's pose in 3D Euclidean space. The model's output is the set of torques applied to the motors of the robot's nine joints. The system relies on Proximal Policy Optimization (PPO), an efficient model-free RL algorithm.
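As a concrete illustration of that keypoint representation, the sketch below encodes a cube's 6-DoF pose as its eight corner positions rather than a position-plus-quaternion vector; the cube size, function name, and use of SciPy are assumptions for illustration, not details from the paper.

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Hedged sketch: represent a cube's 6-DoF pose as its eight corner keypoints,
# the kind of representation the policy observation builds on.
CUBE_HALF_EXTENT = 0.0325  # meters; assumed cube size

def pose_to_keypoints(position: np.ndarray, quat_xyzw: np.ndarray) -> np.ndarray:
    """Return an (8, 3) array of corner positions for a cube at the given pose."""
    # The eight corners of an axis-aligned cube centered at the origin.
    signs = np.array([[sx, sy, sz] for sx in (-1, 1)
                                   for sy in (-1, 1)
                                   for sz in (-1, 1)], dtype=float)
    corners = signs * CUBE_HALF_EXTENT
    # Rotate into the object's orientation, then translate to its world position.
    rot = Rotation.from_quat(quat_xyzw)
    return rot.apply(corners) + position

# Example: identity orientation, cube resting just above the arena floor.
kp = pose_to_keypoints(np.array([0.0, 0.0, 0.0325]),
                       np.array([0.0, 0.0, 0.0, 1.0]))
print(kp.shape)  # (8, 3) -> flattened, these corners feed the policy observation
```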
To make the model more robust, the team applied domain randomization: adding random noise to different aspects of the environment during training. After training the system, the researchers tested it in the real world via remote access to the TriFinger robots, replacing the simulator's proprioceptive data and image input with the physical robot's sensor and camera data. With that swap, the learned policy carried over to the real robot.
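A minimal sketch of that kind of randomization follows; the specific parameters, value ranges, and noise level are illustrative assumptions rather than the paper's actual settings.

```python
import numpy as np

rng = np.random.default_rng()

# Hedged sketch of domain randomization as described above. The parameter
# names and ranges are illustrative assumptions, not the paper's values.
def randomize_env_params() -> dict:
    """Draw perturbed physics parameters at the start of each episode."""
    return {
        "object_mass": rng.uniform(0.07, 0.12),  # kg, varied per episode
        "friction":    rng.uniform(0.5, 1.2),
        "motor_scale": rng.uniform(0.9, 1.1),    # simulated actuator mismatch
    }

def noisy_observation(obs: np.ndarray, sigma: float = 0.01) -> np.ndarray:
    """Add Gaussian noise so the policy cannot overfit to exact sensor readings."""
    return obs + rng.normal(0.0, sigma, size=obs.shape)
```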
The keypoint-based object tracking ensured the robot could handle object manipulation across various poses, scales, and conditions. The team claims the technique can also work on robotic hands with more degrees of freedom, although they couldn't measure the sim2real gap in that case because they didn't have access to such a physical robot.
The approach could also be applied to other RL systems that feature navigation and pathfinding, opening the door to training mobile robots. The team also believes this project presents “a path for a democratization of robot learning and a viable solution through large scale simulation and robotics-as-a-service.”
Have a story tip? Message me at: http://twitter.com/Cabe_Atwell