sim · medium · quadruped · metric · varies
Gain Tuning Is Not What You Need: Reward Gain Adaptation for Constrained Locomotion Learning
Description
Existing robot locomotion learning techniques rely heavily on the offline selection of proper reward-weighting gains and cannot guarantee constraint satisfaction during training (i.e., constraints may be violated). This work addresses both issues by proposing Reward-Oriented Gains via Embodied Regulation (ROGER), which adapts reward-weighting gains online based on the penalties received throughout the embodied interaction process. The ratio between the positive reward (primary reward) and negat