sim · medium · vision-robot · metric · varies

TrianguLang: Geometry-Aware Semantic Consensus for Pose-Free 3D Localization

Description

Localizing objects and parts from natural language in 3D space is essential for robotics, AR, and embodied AI, yet existing methods face a trade-off between the accuracy and geometric consistency of per-scene optimization and the efficiency of feed-forward inference. We present TrianguLang, a feed-forward framework for 3D localization that requires no camera calibration at inference. Unlike prior methods that treat views independently, we introduce Geometry-Aware Semantic Attention (GASA), which

Source

http://arxiv.org/abs/2603.08096v2