sim · medium · mobile-manipulation · metric: varies
SAGA: Open-World Mobile Manipulation via Structured Affordance Grounding
Description
We present SAGA, a versatile and adaptive framework for visuomotor control that generalizes across environments, task objectives, and user specifications. To learn this capability efficiently, our key idea is to disentangle high-level semantic intent from low-level visuomotor control by explicitly grounding task objectives in the observed environment. Using an affordance-based task representation, we express diverse and complex behaviors in a unified, structured form. By leveraging mu