sim · medium · vision-robot · metric · varies

VideoAesBench: Benchmarking the Video Aesthetics Perception Capabilities of Large Multimodal Models

Description

Large multimodal models (LMMs) have demonstrated outstanding capabilities in various visual perception tasks, which has in turn made their evaluation important. However, video aesthetic quality assessment, a fundamental human ability, remains underexplored for LMMs. To address this, we introduce VideoAesBench, a comprehensive benchmark for evaluating LMMs' understanding of video aesthetic quality. VideoAesBench has several significant characteristics: (1) D

Source

http://arxiv.org/abs/2601.21915v2