← Back to Benchmarks
simmediumoffline-rlmetric · varies
Unsupervised Meta-Testing with Conditional Neural Processes for Hybrid Meta-Reinforcement Learning
Description
We introduce Unsupervised Meta-Testing with Conditional Neural Processes (UMCNP), a novel hybrid few-shot meta-reinforcement learning (meta-RL) method that uniquely combines, yet distinctly separates, parameterized policy gradient-based (PPG) and task inference-based few-shot meta-RL. Tailored for settings where the reward signal is missing during meta-testing, our method increases sample efficiency without requiring additional samples in meta-training. UMCNP leverages the efficiency and scalabi