sim · medium · robotics · metric · varies

VISITRON: Visual Semantics-Aligned Interactively Trained Object-Navigator

Description

Interactive robots navigating photo-realistic environments need to be trained to effectively leverage and handle the dynamic nature of dialogue in addition to the challenges underlying vision-and-language navigation (VLN). In this paper, we present VISITRON, a multi-modal Transformer-based navigator better suited to the interactive regime inherent to Cooperative Vision-and-Dialog Navigation (CVDN). VISITRON is trained to: i) identify and associate object-level concepts and semantics between the

Source

http://arxiv.org/abs/2105.11589v2