sim · medium · imitation · metric: varies

Filtering Learning Histories Enhances In-Context Reinforcement Learning

Description

Transformer models (TMs) have exhibited remarkable in-context reinforcement learning (ICRL) capabilities, allowing them to generalize to, and improve in, previously unseen environments without retraining or fine-tuning. This is typically achieved by imitating the complete learning histories of a source RL algorithm across a large number of pretraining environments, which, however, may transfer suboptimal behaviors inherited from the source algorithm or dataset. Therefore, in this work, we ad…
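The core idea named in the title, filtering learning histories before imitation, can be illustrated with a minimal sketch. All names below (`filter_history`, the `keep_fraction` parameter, the episode dictionary layout) are illustrative assumptions, not the paper's actual interface, and the paper's filtering criterion may well differ from the simple return-based cutoff used here.

```python
# Hypothetical sketch: prune a source RL algorithm's learning history by
# episode return before using it for imitation pretraining, so that
# suboptimal episodes are not imitated. Names and the filtering rule are
# illustrative assumptions, not the paper's method.

def filter_history(history, keep_fraction=0.5):
    """Keep the top `keep_fraction` of episodes ranked by return."""
    ranked = sorted(history, key=lambda ep: ep["return"], reverse=True)
    k = max(1, int(len(ranked) * keep_fraction))
    kept = ranked[:k]
    # Restore chronological order so the pretraining sequence still
    # resembles an improving learning trajectory in context.
    kept.sort(key=lambda ep: ep["step"])
    return kept

# Toy learning history: returns rise (noisily) as the source algorithm learns.
history = [
    {"step": 0, "return": -5.0},
    {"step": 1, "return": 1.0},
    {"step": 2, "return": 3.5},
    {"step": 3, "return": 2.0},
]
print([ep["step"] for ep in filter_history(history)])  # → [2, 3]
```

Keeping only high-return episodes trades context diversity for behavior quality; the fraction kept would be a tuning knob in any real pipeline.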

Source

http://arxiv.org/abs/2505.15143v1