sim · medium · navigation · metric · varies
MANSION: Multi-floor lANguage-to-3D Scene generatIOn for loNg-horizon tasks
Description
Real-world robotic tasks are long-horizon and often span multiple floors, demanding rich spatial reasoning. However, existing embodied benchmarks are largely confined to single-floor in-house environments and so fail to reflect the complexity of real-world tasks. We introduce MANSION, the first language-driven framework for generating building-scale, multi-floor 3D environments. Aware of vertical structural constraints, MANSION generates realistic, navigable whole-building structures with div