sim · medium · quadruped · metric: varies

First Order Model-Based RL through Decoupled Backpropagation

Description

There is growing interest in reinforcement learning (RL) methods that leverage simulator derivatives to improve learning efficiency. While early gradient-based approaches have demonstrated superior performance compared to derivative-free methods, accessing simulator gradients is often impractical because they are costly to implement or simply unavailable. Model-based RL (MBRL) can approximate these gradients via learned dynamics models, but solver efficiency suffers from compounding prediction errors.
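To make the compounding-error issue concrete, here is a minimal toy sketch. It is not the paper's method. It assumes a scalar linear system `x_{t+1} = a*x_t + b*u_t` with a linear policy `u_t = -k*x_t`; the function names and all numeric values are hypothetical and chosen only for illustration. It compares a rollout under the "true" dynamics with a rollout under a slightly misestimated "learned" model, and computes the first-order policy gradient of a quadratic cost through each.

```python
def rollout(a, b, k, x0, horizon):
    """Roll out x_{t+1} = a*x_t + b*u_t under the policy u_t = -k*x_t."""
    xs = [x0]
    for _ in range(horizon):
        u = -k * xs[-1]
        xs.append(a * xs[-1] + b * u)
    return xs


def cost_and_grad(a, b, k, x0, horizon):
    """Return sum_t x_t^2 and its derivative w.r.t. k.

    The derivative is propagated step by step through the dynamics,
    which is what differentiating through a model amounts to here.
    """
    x, dx = x0, 0.0          # dx tracks d x_t / d k
    cost, dcost = x * x, 0.0
    for _ in range(horizon):
        closed_loop = a - b * k
        # Product rule on x_{t+1} = (a - b*k) * x_t, using the old x.
        dx = closed_loop * dx - b * x
        x = closed_loop * x
        cost += x * x
        dcost += 2.0 * x * dx
    return cost, dcost


# Hypothetical values: the learned model overestimates `a` by 2%.
A_TRUE, A_LEARNED, B, K, X0 = 1.0, 1.02, 0.5, 0.2, 1.0

true_states = rollout(A_TRUE, B, K, X0, horizon=5)
model_states = rollout(A_LEARNED, B, K, X0, horizon=5)
state_errors = [abs(m - t) for m, t in zip(model_states, true_states)]

_, grad_true = cost_and_grad(A_TRUE, B, K, X0, horizon=5)
_, grad_model = cost_and_grad(A_LEARNED, B, K, X0, horizon=5)

if __name__ == "__main__":
    print("per-step state error:", state_errors)
    print("gradient via true dynamics:   ", grad_true)
    print("gradient via learned dynamics:", grad_model)
```

In this toy setting the one-step model error is small, yet the gap between predicted and true states widens over the first few steps of the rollout, and the gradient computed through the learned model differs from the one computed through the true dynamics.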

Source

http://arxiv.org/abs/2509.00215v2