sim · medium · offline-rl · metric · varies

Policy-Driven World Model Adaptation for Robust Offline Model-based Reinforcement Learning

Description

Offline reinforcement learning (RL) offers a powerful paradigm for data-driven control. Compared to model-free approaches, offline model-based RL (MBRL) explicitly learns a world model from a static dataset and uses it as a surrogate simulator, improving data efficiency and enabling potential generalization beyond the dataset support. However, most existing offline MBRL methods follow a two-stage training procedure: first learning a world model by maximizing the likelihood of the observed transi
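
The two-stage procedure described above can be illustrated with a minimal sketch. It is not the paper's method: it assumes a toy linear system with Gaussian noise, for which maximizing the likelihood of observed transitions reduces to least squares. The names `A_true`, `B_true`, `world_model`, `policy`, and `rollout` are hypothetical and introduced only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground-truth linear dynamics, used only to synthesize a static dataset.
A_true = np.array([[0.9, 0.1], [0.0, 0.95]])
B_true = np.array([[0.0], [0.5]])

# Static offline dataset of (state, action, next_state) transitions.
n = 2000
states = rng.normal(size=(n, 2))
actions = rng.normal(size=(n, 1))
noise = 0.01 * rng.normal(size=(n, 2))
next_states = states @ A_true.T + actions @ B_true.T + noise

# Stage 1: fit the world model by maximum likelihood.
# Assumption: Gaussian noise with fixed isotropic variance, so the
# maximum-likelihood estimate of the mean dynamics is the least-squares fit.
X = np.hstack([states, actions])
W, *_ = np.linalg.lstsq(X, next_states, rcond=None)  # shape (3, 2)


def world_model(state, action):
    """Predict the next state with the learned (frozen) model."""
    return np.concatenate([state, action]) @ W


def policy(state):
    """Placeholder linear policy; a real method would optimize this in stage 2."""
    return np.array([-0.5 * state[1]])


def rollout(start, horizon):
    """Use the learned model as a surrogate simulator for the policy."""
    trajectory = [start]
    state = start
    for _ in range(horizon):
        state = world_model(state, policy(state))
        trajectory.append(state)
    return np.stack(trajectory)


trajectory = rollout(np.array([1.0, 1.0]), horizon=10)
```

In this sketch the model is fit once and then frozen, so any policy evaluated in `rollout` sees the same learned dynamics regardless of how the policy changes.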

Source

http://arxiv.org/abs/2505.13709v3