sim · medium · offline-rl · metric · varies

A2Perf: Real-World Autonomous Agents Benchmark

Description

Autonomous agents and systems cover a number of application areas, from robotics and digital assistants to combinatorial optimization, all sharing common, unresolved research challenges. It is not sufficient for agents to merely solve a given task; they must generalize to out-of-distribution tasks, perform reliably, and use hardware resources efficiently during training and inference, among other requirements. Several methods, such as reinforcement learning and imitation learning, are commonly used…

Source

http://arxiv.org/abs/2503.03056v1