sim · medium · offline-rl · metric · varies

Learning Massively Multitask World Models for Continuous Control

Description

General-purpose control demands agents that act across many tasks and embodiments, yet research on reinforcement learning (RL) for continuous control remains dominated by single-task or offline regimes, reinforcing a view that online RL does not scale. Inspired by the foundation model recipe (large-scale pretraining followed by light RL), we ask whether a single agent can be trained on hundreds of tasks with online interaction. To accelerate research in this direction, we introduce a new benchmar

Source

http://arxiv.org/abs/2511.19584v2