sim · medium · offline-rl · metric · varies

Beyond Static Datasets: Robust Offline Policy Optimization via Vetted Synthetic Transitions

Description

Offline Reinforcement Learning (ORL) holds immense promise for safety-critical domains like industrial robotics, where real-time environmental interaction is often prohibitive. A primary obstacle in ORL remains the distributional shift between the static dataset and the learned policy, which typically mandates high degrees of conservatism that can restrain potential policy improvements. We present MoReBRAC, a model-based framework that addresses this limitation through Uncertainty-Aware latent s

Source

http://arxiv.org/abs/2601.18107v1