
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent

Description

We propose HyperAgent, a reinforcement learning (RL) algorithm based on the hypermodel framework for exploration in RL. HyperAgent allows for the efficient incremental approximation of posteriors associated with an optimal action-value function ($Q^\star$) without the need for conjugacy and follows the greedy policies w.r.t. these approximate posterior samples. We demonstrate that HyperAgent offers robust performance in large-scale deep RL benchmarks. It can solve Deep Sea hard exploration probl
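
To make the idea in the description concrete, the following is a minimal toy sketch of a hypermodel used for index sampling and greedy action selection. It is not the paper's implementation: the tabular linear form, the class `TabularHypermodel`, and all parameter names are assumptions chosen for illustration, and the incremental training update is omitted entirely. A hypermodel maps a random index `z` to a full action-value function; drawing a fresh `z` plays the role of drawing one approximate posterior sample of $Q^\star$, and the agent then acts greedily with respect to that sample.

```python
import random


class TabularHypermodel:
    """Toy linear hypermodel (illustrative assumption, not the paper's architecture).

    Q(s, a; z) = mu[s][a] + sum_k sigma[s][a][k] * z[k]
    """

    def __init__(self, n_states, n_actions, index_dim, prior_scale=1.0, seed=0):
        self.rng = random.Random(seed)
        self.n_actions = n_actions
        self.index_dim = index_dim
        # Mean estimate of the action values, initialised to zero.
        self.mu = [[0.0] * n_actions for _ in range(n_states)]
        # Index-dependent perturbation weights that encode uncertainty.
        self.sigma = [
            [
                [self.rng.gauss(0.0, prior_scale) for _ in range(index_dim)]
                for _ in range(n_actions)
            ]
            for _ in range(n_states)
        ]

    def sample_index(self):
        """Draw a standard Gaussian index z."""
        return [self.rng.gauss(0.0, 1.0) for _ in range(self.index_dim)]

    def q_values(self, state, z):
        """Return Q(state, a; z) for every action a."""
        return [
            self.mu[state][a]
            + sum(w * zk for w, zk in zip(self.sigma[state][a], z))
            for a in range(self.n_actions)
        ]

    def greedy_action(self, state, z):
        """Act greedily with respect to the sampled action-value function."""
        q = self.q_values(state, z)
        return max(range(self.n_actions), key=q.__getitem__)


model = TabularHypermodel(n_states=3, n_actions=2, index_dim=4)
z = model.sample_index()  # one index stands in for one approximate posterior sample
action = model.greedy_action(0, z)
```

In this sketch, holding `z` fixed for a whole episode would give temporally consistent exploration, while resampling `z` yields a different plausible value function; how and when the index is resampled in HyperAgent itself is specified in the paper, not here.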

Source

http://arxiv.org/abs/2402.10228v5