sim · medium · offline-rl · metric · varies

FairDICE: Fairness-Driven Offline Multi-Objective Reinforcement Learning

Description

Multi-objective reinforcement learning (MORL) aims to optimize policies in the presence of conflicting objectives, where linear scalarization is commonly used to reduce vector-valued returns into scalar signals. While effective for certain preferences, this approach cannot capture fairness-oriented goals such as Nash social welfare or max-min fairness, which require nonlinear and non-additive trade-offs. Although several online algorithms have been proposed for specific fairness objectives, a un
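To make the contrast between linear scalarization and fairness-oriented objectives concrete, here is a minimal toy sketch. It is not the FairDICE algorithm. The three candidate return vectors, the weights, and all function names are hypothetical, chosen only for illustration, and returns are assumed strictly positive so the logarithm is defined.

```python
import math

# Hypothetical two-objective return vectors for three candidate policies.
# A and B each favor one objective; C is balanced but lies below the line
# segment joining A and B.
candidates = {"A": (10.0, 1.0), "B": (1.0, 10.0), "C": (5.0, 5.0)}


def linear_scalarization(returns, weights):
    """Weighted sum of the per-objective returns."""
    return sum(w * r for w, r in zip(weights, returns))


def nash_social_welfare(returns):
    """Log form of Nash social welfare: sum of log returns (returns must be > 0)."""
    return sum(math.log(r) for r in returns)


def max_min(returns):
    """Max-min fairness score: the return of the worst-off objective."""
    return min(returns)


def best(score):
    """Name of the candidate with the highest score."""
    return max(candidates, key=lambda name: score(candidates[name]))


linear_choice = best(lambda r: linear_scalarization(r, (0.6, 0.4)))
nsw_choice = best(nash_social_welfare)
maxmin_choice = best(max_min)

print("linear (0.6, 0.4):", linear_choice)
print("Nash social welfare:", nsw_choice)
print("max-min:", maxmin_choice)
```

In this toy setting, every nonnegative weight vector makes the weighted sum prefer A or B over C, while both fairness scores prefer the balanced candidate C. That is the sense in which these objectives require nonlinear trade-offs between the per-objective returns.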

Source

http://arxiv.org/abs/2506.08062v2