sim · medium · locomotion · metric · varies

GPO: Growing Policy Optimization for Legged Robot Locomotion and Whole-Body Control

Description

Training reinforcement learning (RL) policies for legged robots remains challenging due to high-dimensional continuous actions, hardware constraints, and limited exploration. Existing methods for locomotion and whole-body control work well for position-based control with environment-specific heuristics (e.g., reward shaping, curriculum design, and manual initialization), but are less effective for torque-based control, where sufficiently exploring the action space and obtaining informative gradi

Source

http://arxiv.org/abs/2601.20668v1