
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning

Description

In this work we present a new agent architecture, called Reactor, which combines multiple algorithmic and architectural contributions to produce an agent with higher sample-efficiency than Prioritized Dueling DQN (Wang et al., 2016) and Categorical DQN (Bellemare et al., 2017), while giving better run-time performance than A3C (Mnih et al., 2016). Our first contribution is a new policy evaluation algorithm called Distributional Retrace, which brings multi-step off-policy updates to the distribut
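For orientation, the multi-step off-policy idea that Distributional Retrace builds on can be illustrated with the ordinary, scalar-valued Retrace(λ) target. The sketch below is not the Reactor implementation and is not the distributional variant described above; it is a minimal NumPy illustration under stated assumptions. The function name `retrace_targets`, its argument layout, and the single-trajectory setting are all hypothetical choices made for this example.

```python
import numpy as np


def retrace_targets(q, pi, mu, actions, rewards, bootstrap_value,
                    gamma=0.99, lam=1.0):
    """Scalar Retrace(lambda) targets for one trajectory (illustrative only).

    q:               [T, A] action values Q(x_t, .) for t = 0..T-1
    pi:              [T, A] target-policy probabilities pi(. | x_t)
    mu:              [T]    behaviour-policy probability of the action taken
    actions:         [T]    integer actions a_t
    rewards:         [T]    rewards r_t
    bootstrap_value: scalar estimate of E_pi[Q(x_T, .)] after the last step
    """
    q = np.asarray(q, dtype=float)
    pi = np.asarray(pi, dtype=float)
    mu = np.asarray(mu, dtype=float)
    actions = np.asarray(actions)
    rewards = np.asarray(rewards, dtype=float)

    T = len(rewards)
    idx = np.arange(T)
    q_taken = q[idx, actions]            # Q(x_t, a_t)
    v = (pi * q).sum(axis=1)             # E_pi[Q(x_t, .)]
    # Truncated importance weights: c_t = lam * min(1, pi(a_t|x_t) / mu(a_t|x_t))
    c = lam * np.minimum(1.0, pi[idx, actions] / mu)

    targets = np.empty(T)
    next_v = bootstrap_value             # expected value at x_{t+1}
    next_correction = 0.0                # c_{t+1} * (G_{t+1} - Q(x_{t+1}, a_{t+1}))
    # Backward recursion:
    # G_t = r_t + gamma * (E_pi Q(x_{t+1}, .) + c_{t+1} * (G_{t+1} - Q(x_{t+1}, a_{t+1})))
    for t in reversed(range(T)):
        targets[t] = rewards[t] + gamma * (next_v + next_correction)
        next_v = v[t]
        next_correction = c[t] * (targets[t] - q_taken[t])
    return targets
```

With `lam=0` the recursion reduces to a one-step expected-value backup, and when the target and behaviour policies agree and `lam=1` it recovers the full multi-step return. The truncation at 1 is what keeps the off-policy correction from blowing up when the behaviour policy rarely takes an action the target policy prefers.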

Source

http://arxiv.org/abs/1704.04651v2