EpiCurveBench supplies 1,000 epidemic curve images and ECS metric shows top VLMs reach only 52.3% while correlating 1.5-3.6 times more strongly than DTW with downstream epidemiological statistics.
Structchart: Perception, structuring, reasoning for visual chart understanding
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 4roles
method 2polarities
use method 2representative citing papers
Introduces a benchmark for MLLM-based chart data extraction from unlabeled images and a human-centered training framework that reaches SOTA numerical accuracy with a 7B model.
InternVL 2.5 is the first open-source MLLM to surpass 70% on the MMMU benchmark via model, data, and test-time scaling, with a 3.7-point gain from chain-of-thought reasoning.
Survey proposing a taxonomy for document parsing into pipeline-based systems and VLM-driven unified models, reviewing components, metrics, benchmarks, and challenges.
citing papers explorer
-
EpiCurveBench: Evaluating VLMs on Epidemic Curve Digitization
EpiCurveBench supplies 1,000 epidemic curve images and ECS metric shows top VLMs reach only 52.3% while correlating 1.5-3.6 times more strongly than DTW with downstream epidemiological statistics.
-
Making Multimodal LLMs Reliable Chart Data Extractors: A Benchmark and Training Framework
Introduces a benchmark for MLLM-based chart data extraction from unlabeled images and a human-centered training framework that reaches SOTA numerical accuracy with a 7B model.
-
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
InternVL 2.5 is the first open-source MLLM to surpass 70% on the MMMU benchmark via model, data, and test-time scaling, with a 3.7-point gain from chain-of-thought reasoning.
-
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction
Survey proposing a taxonomy for document parsing into pipeline-based systems and VLM-driven unified models, reviewing components, metrics, benchmarks, and challenges.