sim · medium · atari · metric: varies

Is Deep Reinforcement Learning Really Superhuman on Atari? Leveling the playing field

Description

Consistent and reproducible evaluation of Deep Reinforcement Learning (DRL) is not straightforward. In the Arcade Learning Environment (ALE), small changes in environment parameters such as stochasticity or the maximum allowed play time can lead to very different performance. In this work, we discuss the difficulties of comparing different agents trained on ALE. In order to take a step further towards reproducible and comparable DRL, we introduce SABER, a Standardized Atari BEnchmark for general
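To make the point about environment parameters concrete, here is a minimal sketch of how the same policy can receive very different scores when only the evaluation settings change. It does not use ALE itself: `EvalSettings`, `CountingEnv`, and `run_episode` are hypothetical names introduced for illustration, the toy environment stands in for a real game, and play time is counted in environment steps as a simplifying assumption.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalSettings:
    """Hypothetical container for the two evaluation parameters named above."""
    repeat_action_probability: float  # stochasticity: chance the previous action is repeated
    max_steps_per_episode: int        # cap on play time, counted here in environment steps


class CountingEnv:
    """Toy stand-in for a game: reward 1 for action 1, else 0; never ends on its own."""

    def reset(self):
        return 0

    def step(self, action):
        reward = 1.0 if action == 1 else 0.0
        return 0, reward, False  # observation, reward, done


def run_episode(env, policy, settings, seed=0):
    """Play one episode under the given settings and return the total reward."""
    rng = random.Random(seed)
    obs = env.reset()
    previous_action = 0
    total = 0.0
    for _ in range(settings.max_steps_per_episode):
        action = policy(obs)
        # Sticky action: with some probability the previous action is executed instead.
        if rng.random() < settings.repeat_action_probability:
            action = previous_action
        obs, reward, done = env.step(action)
        total += reward
        previous_action = action
        if done:
            break
    return total


def always_one(obs):
    """Fixed policy used for every evaluation below."""
    return 1


short_deterministic = run_episode(CountingEnv(), always_one, EvalSettings(0.0, 100))
long_deterministic = run_episode(CountingEnv(), always_one, EvalSettings(0.0, 1000))
fully_sticky = run_episode(CountingEnv(), always_one, EvalSettings(1.0, 100))
```

In this toy setup the identical policy scores 100 with a 100-step cap, 1000 with a 1000-step cap, and 0 when every action is replaced by the previous one, so scores reported under different settings are not directly comparable.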

Source

http://arxiv.org/abs/1908.04683v5