sim · medium · robotics · metric · varies
AsgardBench -- Evaluating Visually Grounded Interactive Planning Under Minimal Feedback
Description
With AsgardBench we aim to evaluate visually grounded, high-level action sequence generation and interactive planning. The focus is specifically on plan adaptation during execution based on visual observations, rather than on navigation or low-level manipulation. In the landscape of embodied AI benchmarks, AsgardBench targets the capability category of interactive planning, which is more sophisticated than offline high-level planning because it requires agents to revise plans in response to environmental fee