sim · medium · offline-rl · metric: varies
Manifold-Constrained Energy-Based Transition Models for Offline Reinforcement Learning
Description
Model-based offline reinforcement learning is brittle under distribution shift: policy improvement drives rollouts into state–action regions weakly supported by the dataset, where compounding model error yields severe value overestimation. We propose Manifold-Constrained Energy-Based Transition Models (MC-ETM), which train conditional energy-based transition models using a manifold projection–diffusion negative sampler. MC-ETM learns a latent manifold of next states and generates near-manifold