sim · medium · offline-rl · metric · varies

Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning

Description

Offline reinforcement learning (RL) aims to learn a policy from a static dataset without further interaction with the environment. Collecting sufficiently large datasets for offline RL is exhausting, since data collection requires a colossal number of interactions with the environment and becomes difficult when interaction with the environment is restricted. Hence, how an agent can learn the best policy from a minimal static dataset is a crucial issue in offline RL, similar to the sample efficiency problem

Source

http://arxiv.org/abs/2505.05701v2