sim · medium · atari · metric · varies
Flow Models for Unbounded and Geometry-Aware Distributional Reinforcement Learning
Description
We introduce a new architecture for Distributional Reinforcement Learning (DistRL) that models return distributions using normalizing flows. This approach gives return distributions flexible, unbounded support, in contrast to categorical approaches such as C51 that rely on fixed or bounded representations. It also offers richer modeling capacity than quantile-based approaches, capturing multi-modality, skewness, and tail behavior. Our method is significantly more parameter-efficient than cate