sim · medium · atari · metric · varies

Is Policy Learning Overrated?: Width-Based Planning and Active Learning for Atari

Description

Width-based planning has shown promising results on Atari 2600 games using pixel input, while using substantially fewer environment interactions than reinforcement learning. Recent width-based approaches compute a feature vector for each screen, using either a hand-designed feature set or a variational autoencoder trained on game screens (VAE-IW), and prune screens that have no novel features during the search. We propose Olive (Online-VAE-IW), which updates the VAE features online using active
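The novelty pruning mentioned above can be illustrated with a minimal sketch of a width-1 style search on a toy domain. This is not the Olive or VAE-IW implementation: the function `iw1_search`, the toy `successors` and `features` functions, and the `max_nodes` budget are all hypothetical names chosen for illustration. In VAE-IW the features would be derived from the VAE encoding of each screen; here they are hand-written tuples.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, List, Set


def iw1_search(
    root: int,
    successors: Callable[[int], Iterable[int]],
    features: Callable[[int], Iterable[Hashable]],
    max_nodes: int = 1000,
) -> List[int]:
    """Breadth-first search that keeps a state only if it shows a new feature."""
    seen: Set[Hashable] = set(features(root))  # features observed so far
    kept = [root]                              # states that survived pruning
    queue = deque([root])
    while queue and len(kept) < max_nodes:
        state = queue.popleft()
        for child in successors(state):
            novel = set(features(child)) - seen
            if not novel:
                continue  # pruned: every feature of this state was already seen
            seen |= novel
            kept.append(child)
            queue.append(child)
    return kept


# Toy domain (an assumption for illustration): states are integers,
# each state can step forward by 1 or 2.
def successors(state: int) -> List[int]:
    return [state + 1, state + 2]


# Toy features: the remainder mod 3 and whether the state is at least 4.
def features(state: int) -> Set[Hashable]:
    return {("mod3", state % 3), ("big", state >= 4)}


kept = iw1_search(0, successors, features)
print(kept)
```

Although the toy state space is infinite, the search terminates on its own: once every feature value has been seen, all further states are pruned. The same property is what keeps the number of expanded screens small when the feature set is compact.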

Source

http://arxiv.org/abs/2109.15310v2