pith:YNQJL4CM
Seeing the Scene Matters: Revealing Forgetting in Video Understanding Models with a Scene-Aware Long-Video Benchmark
Vision-language models forget long-range scene context in videos, shown by a new benchmark with sharp accuracy drops.
arxiv:2603.27259 v3 · 2026-03-28 · cs.CV
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{YNQJL4CMKFECWAPFYQ6KXDPJKD}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Our evaluation reveals a sharp drop in accuracy when VLMs attempt to answer scene-level questions, indicating significant forgetting of long-range context.
This rests on the assumption that the authors' definition of a scene, a coherent segment with consistent visual and semantic contexts, accurately isolates long-range forgetting, and that the benchmark's video selection and question design introduce no other confounds.
SceneBench shows VLMs lose accuracy on scene-level questions in long videos due to forgetting, and Scene-RAG retrieval improves performance by 2.5%.
Receipt and verification
| First computed | 2026-06-23T01:12:04.172364Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
c36095f04c51482b01e5c43cab8de950e4981d45e47e73177019011181475703
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/YNQJL4CMKFECWAPFYQ6KXDPJKD \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: c36095f04c51482b01e5c43cab8de950e4981d45e47e73177019011181475703
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "9b4a89f2c7c43f0f960d61d77cc48477fdefe91ef8f2eef88e3cedeea7cb2d5c",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by-nc-sa/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-03-28T12:44:19Z",
    "title_canon_sha256": "3ca3f6151f7d08480faf12c4676262d71303d8b81b2330b5e919c0d9e9c76115"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2603.27259",
    "kind": "arxiv",
    "version": 3
  }
}
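The verification command serializes the canonical record with sorted keys and compact separators, then takes the SHA-256 of the UTF-8 bytes. The following is a minimal offline sketch of those same steps in Python, applied to the record printed above. The names `canonical_json`, `canonical_hash`, `RECORD`, and `EXPECTED` are illustrative and not part of any Pith tooling. Whether the digest reproduces the listed canonical hash depends on the served record being identical to the one shown here, so the script reports match or mismatch rather than assuming either.

```python
import hashlib
import json


def canonical_json(record: dict) -> str:
    """Serialize with sorted keys and no insignificant whitespace,
    mirroring the json.dumps arguments in the verification command."""
    return json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )


def canonical_hash(record: dict) -> str:
    """Return the hex SHA-256 digest of the canonical UTF-8 serialization."""
    return hashlib.sha256(canonical_json(record).encode("utf-8")).hexdigest()


# Record copied from the "Canonical record JSON" section (assumed complete).
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "9b4a89f2c7c43f0f960d61d77cc48477fdefe91ef8f2eef88e3cedeea7cb2d5c",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by-nc-sa/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-03-28T12:44:19Z",
        "title_canon_sha256": "3ca3f6151f7d08480faf12c4676262d71303d8b81b2330b5e919c0d9e9c76115",
    },
    "schema_version": "1.0",
    "source": {"id": "2603.27259", "kind": "arxiv", "version": 3},
}

# Hash listed under "Canonical hash" on this page.
EXPECTED = "c36095f04c51482b01e5c43cab8de950e4981d45e47e73177019011181475703"

if __name__ == "__main__":
    digest = canonical_hash(RECORD)
    print(digest)
    print("match" if digest == EXPECTED else "mismatch")
```

Because keys are sorted before hashing, the digest does not depend on the order in which fields appear in the source JSON, only on their names and values.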