Title resolution pending

Association for Computational Linguistics · 2025 · DOI 10.18653/v1/2025.acl-long.736

8 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

ChinaHeritaQA: A Culturally-Grounded Visual Question Answering Dataset for World Heritage Sites in China

cs.CV · 2026-06-08 · unverdicted · novelty 7.0

ChinaHeritaQA is a new bilingual VQA benchmark dataset with 2,279 images and 14,133 QA pairs for evaluating cultural reasoning abilities of VLMs on Chinese World Heritage sites across seven cognitive dimensions.

Grounded or Guessing? LVLM Confidence Estimation via Blind-Image Contrastive Ranking

cs.CL · 2026-05-11 · unverdicted · novelty 7.0 · 2 refs

BICR trains a lightweight probe on contrastive hidden states from real versus blind images to detect visual grounding in LVLM predictions, outperforming baselines on calibration and discrimination with fewer parameters.

MemQ: Integrating Q-Learning into Self-Evolving Memory Agents over Provenance DAGs

cs.AI · 2026-05-08 · unverdicted · novelty 7.0 · 3 refs

MemQ improves LLM agent performance by using eligibility traces over provenance DAGs to assign credit to dependent memories, achieving top success rates on six benchmarks, with the largest gains on complex multi-step tasks.

SASAV: Self-Directed Agent for Scientific Analysis and Visualization

cs.GR · 2026-04-03 · unverdicted · novelty 7.0

SASAV introduces the first fully autonomous multi-agent system for scientific data analysis and visualization that operates without external prompting or human-in-the-loop feedback.

CulMind: Benchmarking Multimodal Understanding and Reasoning in Chinese Cultural Heritage

cs.CL · 2026-06-19 · unverdicted · novelty 6.0

Introduces the CulMind benchmark, the CulMind-R reasoning subset, and the ReaScore metric to evaluate MLLMs on multimodal understanding of Chinese cultural heritage and on reasoning quality.

Make Each Token Count: Towards Improving Long-Context Performance with KV Cache Eviction

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

A unified learnable KV eviction policy with cross-layer calibration reduces memory and matches or exceeds full-cache performance on long-context tasks by retaining useful tokens and limiting attention dilution.

AICA-Bench: Holistically Examining the Capabilities of VLMs in Affective Image Content Analysis

cs.CV · 2026-04-07 · unverdicted · novelty 6.0

AICA-Bench evaluates 23 VLMs on affective image analysis, identifies weak intensity calibration and shallow descriptions as limitations, and proposes training-free Grounded Affective Tree Prompting to improve performance.

RAVE: Re-Allocating Visual Attention in Large Multimodal Models

cs.CV · 2026-05-18 · unverdicted · novelty 5.0 · 2 refs

RAVE is a lightweight pair-gating addition to self-attention that improves visual token allocation in LMMs and delivers an average 3-point gain on multimodal benchmarks, largest on perception-heavy tasks.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Grounded or Guessing? LVLM Confidence Estimation via Blind-Image Contrastive Ranking

cs.CL · 2026-05-11 · unverdicted · none · ref 35 · 2 links

BICR trains a lightweight probe on contrastive hidden states from real versus blind images to detect visual grounding in LVLM predictions, outperforming baselines on calibration and discrimination with fewer parameters.
