← Back to Benchmarks
simmediumnavigationmetric · varies
DiffPixelFormer: Differential Pixel-Aware Transformer for RGB-D Indoor Scene Segmentation
Description
Indoor semantic segmentation is fundamental to computer vision and robotics, supporting applications such as autonomous navigation, augmented reality, and smart environments. Although RGB-D fusion leverages complementary appearance and geometric cues, existing methods often depend on computationally intensive cross-attention mechanisms and insufficiently model intra- and inter-modal feature relationships, resulting in imprecise feature alignment and limited discriminative representation. To addr