sim · medium · atari · metric · varies

Resetting the Optimizer in Deep RL: An Empirical Study

Description

We focus on the task of approximating the optimal value function in deep reinforcement learning. This iterative process consists of solving a sequence of optimization problems in which the loss function changes from one iteration to the next. The common approach to solving this sequence of problems is to employ modern variants of stochastic gradient descent, such as Adam. These optimizers maintain their own internal parameters, such as estimates of the first-order and second-order moments of the gradient.
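To make the optimizer's internal parameters concrete, here is a minimal sketch of the standard Adam update in plain Python. It is not the paper's code: the class name `Adam`, the `reset` method, and the default hyperparameters are illustrative assumptions. The state consists of the first-moment estimate `m`, the second-moment estimate `v`, and the step counter `t`; resetting, as the title suggests, is taken here to mean setting all three back to zero.

```python
import math


class Adam:
    """Minimal Adam over a flat list of floats, written out to expose its state.

    Hypothetical helper for illustration; hyperparameter defaults are assumptions.
    """

    def __init__(self, n_params, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.n_params = n_params
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.reset()

    def reset(self):
        # Discard everything accumulated from earlier gradients.
        self.m = [0.0] * self.n_params  # first-order moment estimates
        self.v = [0.0] * self.n_params  # second-order moment estimates
        self.t = 0                      # step counter used for bias correction

    def step(self, params, grads):
        self.t += 1
        new_params = []
        for i, (p, g) in enumerate(zip(params, grads)):
            # Exponential moving averages of the gradient and its square.
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            # Bias correction compensates for the zero initialization.
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            new_params.append(p - self.lr * m_hat / (math.sqrt(v_hat) + self.eps))
        return new_params


opt = Adam(n_params=2)
params = opt.step([1.0, -1.0], [0.5, -2.0])
opt.reset()  # state no longer carries information from the step above
```

In a value-function training loop, one possible use would be to call `opt.reset()` each time the loss function changes, so that moment estimates from the previous optimization problem do not carry over. Where exactly to place that call is an assumption of this sketch, not something the text above specifies.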

Source

http://arxiv.org/abs/2306.17833v2