sim · medium · offline-rl · metric · varies

Should We Ever Prefer Decision Transformer for Offline Reinforcement Learning?

Description

In recent years, extensive work has explored the application of the Transformer architecture to reinforcement learning problems. Among these, Decision Transformer (DT) has gained particular attention in the context of offline reinforcement learning due to its ability to frame return-conditioned policy learning as a sequence modeling task. Most recently, Bhargava et al. (2024) provided a systematic comparison of DT with more conventional MLP-based offline RL algorithms, including Behavior Cloning

Source

http://arxiv.org/abs/2507.10174v1