FALCON: Future-Aware Learning with Contextual Object-Centric Pretraining for UAV Action Recognition

Description

We introduce FALCON, a unified self-supervised video pretraining approach for UAV action recognition from raw RGB aerial footage, requiring no additional preprocessing at inference. UAV videos exhibit severe spatial imbalance: large, cluttered backgrounds dominate the field of view, causing reconstruction-based pretraining to waste capacity on uninformative regions and under-learn action-relevant human/object cues. FALCON addresses this by integrating object-aware masked autoencoding with object

Source

http://arxiv.org/abs/2409.18300v2