sim · medium · offline-rl · metric · varies

Puzzle it Out: Local-to-Global World Model for Offline Multi-Agent Reinforcement Learning

Description

Offline multi-agent reinforcement learning (MARL) aims to solve cooperative decision-making problems in multi-agent systems using pre-collected datasets. Existing offline MARL methods primarily constrain training within the dataset distribution, resulting in overly conservative policies that struggle to generalize beyond the support of the data. While model-based approaches offer a promising solution by expanding the original dataset with synthetic data generated from a learned world model, the

Source

http://arxiv.org/abs/2601.07463v2