Indoor Segmentation and Support Inference from RGBD Images

Nathan Silberman, Derek Hoiem, Pushmeet Kohli, Rob Fergus · 2012

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset: 2 · background: 1

citation-polarity summary

use dataset: 2 · background: 1

representative citing papers

RelFlexformer: Efficient Attention 3D-Transformers for Integrable Relative Positional Encodings

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

RelFlexformers enable flexible integrable 3D RPE in attention via NU-FFT, generalizing prior methods to heterogeneous token positions with O(L log L) complexity.

Learning to Perceive "Where": Spatial Pretext Tasks for Robust Self-Supervised Learning

cs.CV · 2026-05-11 · unverdicted · novelty 6.0

Spatial Prediction pretext task learns spatial structure in self-supervised learning by regressing relative position and scale between image views, yielding more structured representations and better generalization.

Vision Transformers Need Better Token Interaction

cs.CV · 2026-05-22 · unverdicted · novelty 5.0

Replacing softmax attention with entmax-1.5 in DINOv1 ViT-S/16 improves semantic segmentation mIoU on three benchmarks while keeping ImageNet linear-probing accuracy unchanged.

Seed1.5-VL Technical Report

cs.CV · 2025-05-11 · unverdicted · novelty 4.0

Seed1.5-VL is a compact multimodal model that sets new records on dozens of vision-language benchmarks and outperforms prior systems on agent-style tasks.

citing papers explorer

Showing 4 of 4 citing papers.

RelFlexformer: Efficient Attention 3D-Transformers for Integrable Relative Positional Encodings
cs.LG · 2026-05-11 · unverdicted · none · ref 51

Learning to Perceive "Where": Spatial Pretext Tasks for Robust Self-Supervised Learning
cs.CV · 2026-05-11 · unverdicted · none · ref 10

Vision Transformers Need Better Token Interaction
cs.CV · 2026-05-22 · unverdicted · none · ref 19

Seed1.5-VL Technical Report
cs.CV · 2025-05-11 · unverdicted · none · ref 96
