sim · medium · offline-rl · metric · varies

Multi-agent In-context Coordination via Decentralized Memory Retrieval

Description

Large transformer models, trained on diverse datasets, have demonstrated impressive few-shot performance on previously unseen tasks without requiring parameter updates. This capability has also been explored in Reinforcement Learning (RL), where agents interact with the environment to retrieve context and maximize cumulative rewards, showcasing strong adaptability in complex settings. However, in cooperative Multi-Agent Reinforcement Learning (MARL), where agents must coordinate toward a shared

Source

http://arxiv.org/abs/2511.10030v1