sim · medium · robotics · metric · varies

EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning

Description

The pursuit of artificial general intelligence (AGI) has been accelerated by Multimodal Large Language Models (MLLMs), which exhibit superior reasoning, generalization capabilities, and proficiency in processing multimodal inputs. A crucial milestone in the evolution of AGI is the attainment of human-level planning, a fundamental ability for making informed decisions in complex environments and solving a wide range of real-world problems. Despite the impressive advancements in MLLMs, a question ...

Source

http://arxiv.org/abs/2312.06722v3