simmediumroboticsmetric · varies

Behavior-Constrained Reinforcement Learning with Receding-Horizon Credit Assignment for High-Performance Control

Description

Learning high-performance control policies that remain consistent with expert behavior is a fundamental challenge in robotics. Reinforcement learning can discover high-performing strategies but often departs from desirable human behavior, whereas imitation learning is limited by demonstration quality and struggles to improve beyond expert data. We propose a behavior-constrained reinforcement learning framework that improves beyond demonstrations while explicitly controlling deviation from expert

Source

http://arxiv.org/abs/2604.03023v1