sim · medium · atari · metric · varies
Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity
Description
Reinforcement learning (RL) is empirically successful in complex nonlinear Markov decision processes (MDPs) with continuous state spaces. By contrast, the majority of theoretical RL literature requires the MDP to satisfy some form of linear structure in order to guarantee sample-efficient RL. Such efforts typically assume that the transition dynamics or value function of the MDP are described by linear functions of the state features. To resolve this discrepancy between theory and practice, we intro