In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Yaser Al-Onaizan, Mohit Bansal, and Yun-Nung Chen (Eds.)

Unifying Multimodal Retrieval via Document Screenshot Embedding · 2024 · DOI 10.18653/v1/2024.emnlp-main.373

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

BERAG: Bayesian Ensemble Retrieval-Augmented Generation for Knowledge-based Visual Question Answering

cs.CL · 2026-04-24 · unverdicted · novelty 7.0

BERAG applies Bayesian ensemble weighting of individual documents via token-by-token posterior updates in retrieval-augmented generation, yielding gains on knowledge-based visual QA tasks.

Document-as-Image Representations Fall Short for Scientific Retrieval

cs.IR · 2026-04-20 · conditional · novelty 7.0

Document-as-image representations underperform text-based and interleaved multimodal approaches for scientific document retrieval on a new LaTeX-derived benchmark.

Visual Late Chunking: An Empirical Study of Contextual Chunking for Efficient Visual Document Retrieval

cs.CV · 2026-04-11 · unverdicted · novelty 7.0

ColChunk adaptively chunks visual document patches into contextual multi-vectors via clustering, cutting storage by over 90% while raising average nDCG@5 by 9 points.

Overview of the EReL@MIR 2025 Multimodal Document Retrieval Challenge (Track 1)

cs.CV · 2026-06-02 · unverdicted · novelty 1.0

The EReL@MIR 2025 Track 1 challenge evaluates single systems on two multimodal retrieval tasks and finds that Qwen2-VL decoder-based embedders dominate, with a training-free entry within 0.1 points of the fine-tuned winner.

citing papers explorer

Showing 4 of 4 citing papers.

BERAG: Bayesian Ensemble Retrieval-Augmented Generation for Knowledge-based Visual Question Answering cs.CL · 2026-04-24 · unverdicted · none · ref 19
BERAG applies Bayesian ensemble weighting of individual documents via token-by-token posterior updates in retrieval-augmented generation, yielding gains on knowledge-based visual QA tasks.
Document-as-Image Representations Fall Short for Scientific Retrieval cs.IR · 2026-04-20 · conditional · none · ref 3
Document-as-image representations underperform text-based and interleaved multimodal approaches for scientific document retrieval on a new LaTeX-derived benchmark.
Visual Late Chunking: An Empirical Study of Contextual Chunking for Efficient Visual Document Retrieval cs.CV · 2026-04-11 · unverdicted · none · ref 20
ColChunk adaptively chunks visual document patches into contextual multi-vectors via clustering, cutting storage by over 90% while raising average nDCG@5 by 9 points.
Overview of the EReL@MIR 2025 Multimodal Document Retrieval Challenge (Track 1) cs.CV · 2026-06-02 · unverdicted · none · ref 15
The EReL@MIR 2025 Track 1 challenge evaluates single systems on two multimodal retrieval tasks and finds that Qwen2-VL decoder-based embedders dominate, with a training-free entry within 0.1 points of the fine-tuned winner.
