Visual Genome: Connecting language and vision using crowdsourced dense image annotations. International Journal of Computer Vision, 123(1):32–73.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Multi-hop Relational Contrastive Learning: Extending Spatial Contrastive Pre-training Beyond Pairwise Relations
MRCL extends pairwise spatial contrastive pre-training to multi-hop paths in scene graphs, yielding NDCG@5 = 0.748 on GQA graph retrieval and gains on spatial recognition and QA tasks.