sim | medium | offline-rl | metric · varies
Policy-Driven World Model Adaptation for Robust Offline Model-based Reinforcement Learning
Description
Offline reinforcement learning (RL) offers a powerful paradigm for data-driven control. Compared to model-free approaches, offline model-based RL (MBRL) explicitly learns a world model from a static dataset and uses it as a surrogate simulator, improving data efficiency and enabling potential generalization beyond the dataset support. However, most existing offline MBRL methods follow a two-stage training procedure: first learning a world model by maximizing the likelihood of the observed transi