sim · medium · robotics · metric · varies

AnyUser: Translating Sketched User Intent into Domestic Robots

Description

We introduce AnyUser, a unified robotic instruction system that lets users specify domestic tasks intuitively by drawing free-form sketches on camera images, optionally accompanied by language. AnyUser interprets these multimodal inputs (sketch, vision, language) as spatial-semantic primitives and generates executable robot actions without requiring prior maps or models. Its novel components include multimodal fusion for understanding and a hierarchical policy for robust action generation. Efficacy is shown via extensive evaluations: (1) Q

Source

http://arxiv.org/abs/2604.04811v1