simmediumatarimetric · varies

Conformal Symplectic Optimization for Stable Reinforcement Learning

Description

Training deep reinforcement learning (RL) agents necessitates overcoming the highly unstable nonconvex stochastic optimization inherent in the trial-and-error mechanism. To tackle this challenge, we propose a physics-inspired optimization algorithm called relativistic adaptive gradient descent (RAD), which enhances long-term training stability. By conceptualizing neural network (NN) training as the evolution of a conformal Hamiltonian system, we present a universal framework for transferring lon

Source

http://arxiv.org/abs/2412.02291v2