PixelRAG shows that operating RAG entirely over web screenshots outperforms text-based retrieval on NQ, SimpleQA, MMSearch, LiveVQA, and MoNaCo, with up to 18.1% accuracy gains and 3x token savings via image compression.
arXiv preprint arXiv:2410.21311 , year=
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
Presents DocFormBench benchmark and DocFormFlow workflow for content-aware LLM document formatting, claiming higher accuracy and lower token use via decoupled localization and modification.
citing papers explorer
-
What to Format and How: A Benchmark and Workflow Approach for Document Formatting
Presents DocFormBench benchmark and DocFormFlow workflow for content-aware LLM document formatting, claiming higher accuracy and lower token use via decoupled localization and modification.