sim · medium · locomotion · metric · varies

LAPP: Large Language Model Feedback for Preference-Driven Reinforcement Learning

Description

We introduce Large Language Model-Assisted Preference Prediction (LAPP), a novel framework for robot learning that enables efficient, customizable, and expressive behavior acquisition with minimum human effort. Unlike prior approaches that rely heavily on reward engineering, human demonstrations, motion capture, or expensive pairwise preference labels, LAPP leverages large language models (LLMs) to automatically generate preference labels from raw state-action trajectories collected during reinf
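The idea of turning preference labels over trajectory pairs into a learning signal can be illustrated with a minimal sketch. It assumes a Bradley-Terry preference model over a linear reward, a common choice in preference-based reinforcement learning that the description above does not confirm LAPP uses. The function `mock_llm_label` is a hypothetical stand-in for an LLM query, and all other names and hyperparameters are illustrative only.

```python
import math
import random


def trajectory_return(weights, trajectory):
    """Sum of linear rewards (weights . features) over all steps of a trajectory."""
    return sum(sum(w * x for w, x in zip(weights, step)) for step in trajectory)


def preference_probability(weights, traj_a, traj_b):
    """Bradley-Terry probability that traj_a is preferred over traj_b."""
    diff = trajectory_return(weights, traj_a) - trajectory_return(weights, traj_b)
    diff = max(-50.0, min(50.0, diff))  # clamp to keep exp() numerically safe
    return 1.0 / (1.0 + math.exp(-diff))


def mock_llm_label(traj_a, traj_b):
    """Hypothetical stand-in for an LLM: prefers the larger total of feature 0."""
    score_a = sum(step[0] for step in traj_a)
    score_b = sum(step[0] for step in traj_b)
    return 1.0 if score_a > score_b else 0.0


def train_reward_model(pairs, labels, dim, lr=0.05, epochs=200):
    """Fit linear reward weights by SGD on the preference cross-entropy loss."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for (traj_a, traj_b), label in zip(pairs, labels):
            p = preference_probability(weights, traj_a, traj_b)
            for i in range(dim):
                # Difference of summed features between the two trajectories.
                feat_diff = sum(s[i] for s in traj_a) - sum(s[i] for s in traj_b)
                # Gradient of the cross-entropy loss is (p - label) * feat_diff.
                weights[i] -= lr * (p - label) * feat_diff
    return weights


def random_trajectory(rng, length=5, dim=2):
    """Synthetic trajectory: `length` steps of `dim` random features each."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(length)]


def accuracy(weights, pairs, labels):
    """Fraction of pairs where the learned model agrees with the labels."""
    hits = 0
    for (traj_a, traj_b), label in zip(pairs, labels):
        predicted = 1.0 if preference_probability(weights, traj_a, traj_b) > 0.5 else 0.0
        hits += predicted == label
    return hits / len(pairs)


rng = random.Random(0)
pairs = [(random_trajectory(rng), random_trajectory(rng)) for _ in range(50)]
labels = [mock_llm_label(a, b) for a, b in pairs]
weights = train_reward_model(pairs, labels, dim=2)
```

In a real pipeline the synthetic trajectories would be replaced by rollouts from the policy being trained, and the learned reward would feed back into the reinforcement learning loop; how LAPP does this is not specified in the text above.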

Source

http://arxiv.org/abs/2504.15472v1