sim · medium · quadruped · metric · varies

Gain Tuning Is Not What You Need: Reward Gain Adaptation for Constrained Locomotion Learning

Description

Existing robot locomotion learning techniques rely heavily on offline selection of suitable reward-weighting gains and cannot guarantee constraint satisfaction during training (i.e., constraints may be violated). This work addresses both issues by proposing Reward-Oriented Gains via Embodied Regulation (ROGER), which adapts reward-weighting gains online based on the penalties received throughout the embodied interaction process. The ratio between the positive reward (primary reward) and negat

Source

http://arxiv.org/abs/2510.10759v1