sim · medium · offline-rl · metric · varies

Multi-Policy Pareto Front Tracking Based Online and Offline Multi-Objective Reinforcement Learning

Description

Multi-objective reinforcement learning (MORL) plays a pivotal role in addressing real-world multi-criteria decision-making problems. Multi-policy (MP) methods are widely used to obtain high-quality Pareto front approximations for MORL problems. However, traditional MP methods rely only on online reinforcement learning (RL) and adopt an evolutionary framework with a large policy population. This may lead to sample inefficiency and/or overwhelmed agent-environment interact
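
To make the term "Pareto front approximation" concrete, the following is a minimal sketch of non-dominated filtering over a set of policies. It is not the method of the paper. It assumes every objective is maximized, and the names `dominates`, `pareto_front`, and `policy_returns`, along with the numbers, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b in every objective and
    strictly better in at least one (assumption: all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: Sequence[Sequence[float]]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for q in points)
    ]


# Hypothetical two-objective returns, one row per policy (illustrative values only).
policy_returns = [
    [1.0, 5.0],
    [2.0, 4.0],
    [1.5, 3.0],  # dominated by [2.0, 4.0]
    [3.0, 1.0],
    [0.5, 0.5],  # dominated by every other row
]

front = pareto_front(policy_returns)
print(front)
```

The filter compares every pair of points, so its cost grows quadratically with the number of policies. That is acceptable for a small population, and a larger one would call for a sort-based method.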

Source

http://arxiv.org/abs/2508.02217v1