arXiv preprint arXiv:2502.14149 (2025)

He, R · 2025 · arXiv 2502.14149

3 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

When to Trust the Answer: Question-Aligned Semantic Nearest Neighbor Entropy for Safer Surgical VQA

cs.CV · 2025-11-03 · conditional · novelty 7.0

QA-SNNE adds question-answer alignment via bilateral gating to semantic nearest neighbor entropy, yielding higher AUROC for uncertainty detection in surgical VQA models under both standard and rephrased questions.

SurgLQA: Scalable Long-Horizon Surgical Video Question Answering

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

SurgLQA introduces FTC for compact long-range video representations and TMS for adaptive test-time scaling, reporting gains on restructured Colon-LQA and REAL-Colon-VQA benchmarks.

SurgViVQA: Temporally-Grounded Video Question Answering for Surgical Scene Understanding

cs.CV · 2025-11-05 · conditional · novelty 5.0

SurgViVQA adds temporal video encoding to surgical VideoQA and reports 9-11% gains in keyword accuracy over image-only baselines on two datasets plus improved robustness to question rephrasing.

citing papers explorer


When to Trust the Answer: Question-Aligned Semantic Nearest Neighbor Entropy for Safer Surgical VQA · cs.CV · 2025-11-03 · conditional · none · ref 3
SurgLQA: Scalable Long-Horizon Surgical Video Question Answering · cs.CV · 2026-05-18 · unverdicted · none · ref 9
SurgViVQA: Temporally-Grounded Video Question Answering for Surgical Scene Understanding · cs.CV · 2025-11-05 · conditional · none · ref 5
