sim · medium · offline-rl · metric · varies

From Static to Dynamic: Enhancing Offline-to-Online Reinforcement Learning via Energy-Guided Diffusion Stratification

Description

Transitioning from offline to online reinforcement learning (RL) poses critical challenges due to distributional shifts between the fixed behavior policy in the offline dataset and the evolving policy during online learning. Although this issue is widely recognized, few methods attempt to explicitly assess or utilize the distributional structure of the offline data itself, leaving a research gap in adapting learning strategies to different types of samples. To address this challenge, we propose

Source

http://arxiv.org/abs/2511.03828v1