sim medium offline-rl metric · varies
Multi-Policy Pareto Front Tracking Based Online and Offline Multi-Objective Reinforcement Learning
Description
Multi-objective reinforcement learning (MORL) plays a pivotal role in addressing real-world multi-criteria decision-making problems. Multi-policy (MP) methods are widely used to obtain high-quality Pareto front approximations for MORL problems. However, traditional MP methods rely only on online reinforcement learning (RL) and adopt an evolutionary framework with a large policy population. This may lead to sample inefficiency and/or overwhelmed agent-environment interact