sim · medium · robotics · metric · varies

AsgardBench -- Evaluating Visually Grounded Interactive Planning Under Minimal Feedback

Description

With AsgardBench we aim to evaluate visually grounded, high-level action sequence generation and interactive planning. The focus is on adapting a plan during execution based on visual observations, not on navigation or low-level manipulation. Among embodied AI benchmarks, AsgardBench targets the capability category of interactive planning, which is more demanding than offline high-level planning because it requires agents to revise plans in response to environmental fee

Source

http://arxiv.org/abs/2603.15888v2