sim · medium · offline-rl · metric · varies

Mental Accounts for Actions: EWA-Inspired Attention in Decision Transformers

Description

Transformers have emerged as a compelling architecture for sequential decision-making by modeling trajectories via self-attention. In reinforcement learning (RL), they enable return-conditioned control without relying on value function approximation. Decision Transformers (DTs) exploit this by casting RL as supervised sequence modeling, but they are restricted to offline data and lack exploration. Online Decision Transformers (ODTs) address this limitation through entropy-regularized training on
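As a rough illustration of what "return-conditioned" sequence modeling means in the Decision Transformer setting, the sketch below computes returns-to-go for a toy trajectory and interleaves them with states and actions into one token sequence. It assumes the standard (return-to-go, state, action) layout from the original Decision Transformer formulation. The function names, the string placeholders for states and actions, and the default `gamma = 1.0` are hypothetical choices for this example and are not taken from the paper summarized here.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float], gamma: float = 1.0) -> List[float]:
    """Return rtg[t] = sum over k >= t of gamma**(k - t) * rewards[k].

    gamma = 1.0 (undiscounted) is an assumption made for this sketch.
    """
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the suffix sum computed so far.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


def interleave(
    rtg: Sequence[float], states: Sequence[Any], actions: Sequence[Any]
) -> List[Tuple[str, Any]]:
    """Flatten a trajectory into (rtg, state, action) tokens in time order."""
    tokens: List[Tuple[str, Any]] = []
    for g, s, a in zip(rtg, states, actions):
        tokens.extend([("rtg", g), ("state", s), ("action", a)])
    return tokens


# Toy trajectory; the state and action values are placeholders.
rewards = [1.0, 0.0, 2.0]
rtg = returns_to_go(rewards)
tokens = interleave(rtg, ["s0", "s1", "s2"], ["a0", "a1", "a2"])
```

A sequence model trained with a supervised loss to predict each action token from the preceding tokens can then be prompted at test time with a desired return in the first `rtg` slot, which is the sense in which control is conditioned on return rather than derived from a learned value function.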

Source

http://arxiv.org/abs/2509.15498v1