Leveraging logical rules in knowledge editing: A cherry on the top
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
Introduces a benchmark using logical rules from knowledge graphs to generate multi-hop questions that evaluate whether knowledge edits in LLMs propagate to entailed facts, finding up to 24% performance gaps for methods like ROME and FT.
citing papers explorer
- Flattery in Motion: Benchmarking and Analyzing Sycophancy in Video-LLMs
VISE is the first benchmark for sycophancy in Video-LLMs, with two training-free mitigation strategies based on key-frame selection and internal representation steering.
- Benchmarking Knowledge Editing using Logical Rules
Introduces a benchmark using logical rules from knowledge graphs to generate multi-hop questions that evaluate whether knowledge edits in LLMs propagate to entailed facts, finding up to 24% performance gaps for methods like ROME and FT.