sim · medium · offline-rl · metric · varies

Forecasting in Offline Reinforcement Learning for Non-stationary Environments

Description

Offline Reinforcement Learning (RL) provides a promising avenue for training policies from pre-collected datasets when gathering additional interaction data is infeasible. However, existing offline RL methods often assume stationarity or consider only synthetic perturbations at test time. These assumptions frequently fail in real-world scenarios characterized by abrupt, time-varying offsets. Such offsets can lead to partial observability, causing agents to misperceive their true state and degrade per

Source

http://arxiv.org/abs/2512.01987v3