sim · medium · offline-rl · metric · varies

Offline Reinforcement Learning with Wasserstein Regularization via Optimal Transport Maps

Description

Offline reinforcement learning (RL) aims to learn an optimal policy from a static dataset, making it particularly valuable in scenarios where data collection is costly, such as robotics. A major challenge in offline RL is distributional shift, where the learned policy deviates from the dataset distribution, potentially leading to unreliable out-of-distribution actions. To mitigate this issue, regularization techniques have been employed. While many existing methods utilize density ratio-based me

Source

http://arxiv.org/abs/2507.10843v1