ProSA auditing framework shows that structural loss metrics track OCR instability and downstream QA degradation far better than area-based footprint measures across two document parsers on 1,000 pages.
Title resolution pending
5 Pith papers cite this work. Polarity classification is still indexing.
years
2026 5representative citing papers
OptiVerse is a new benchmark spanning neglected optimization domains that shows LLMs suffer sharp accuracy drops on hard problems due to modeling and logic errors, with a Dual-View Auditor Agent proposed to improve performance.
PA-BDM adapts block diffusion by switching to causal intra-block denoising and dynamically committing reliable prefixes to KV cache, yielding higher accuracy and 71.6% higher throughput than a comparable baseline on document benchmarks.
High OCR accuracy on standard metrics does not guarantee strong downstream RAG performance because structural and semantic errors cause retrieval and generation failures on challenging industrial documents.