Streammem: Query-agnostic KV cache memory for streaming video understanding. arXiv preprint arXiv:2508.15717

15 Pith papers cite this work. Polarity classification is still indexing.

hub tools

citation-role summary

background 3 · baseline 1

citation-polarity summary

years

2026: 14 · 2025: 1

representative citing papers

FlowNar: Scalable Streaming Narration for Long-Form Videos

cs.CV · 2026-05-30 · unverdicted · novelty 6.0

FlowNar achieves bounded memory and 3x higher throughput for streaming narration on Ego4D, EgoExo4D, and EpicKitchens100 by combining dynamic historical context removal with a Cross Linear Attentive Memory module.

Linear Scaling Video VLMs for Long Video Understanding

cs.CV · 2026-05-29 · unverdicted · novelty 5.0

StateKV is an inference-time technique that replaces quadratic self-attention prefill in video VLMs with a fixed-capacity importance-based recurrent state, keeping accuracy near full attention on long-video benchmarks without retraining.

Watch, Remember, Reason: Human-View Video Understanding with MLLMs

cs.CV · 2026-06-05 · unverdicted · novelty 4.0

This survey frames video MLLM research through a human-view formulation of perceptual representations, memory states, reasoning traces, and predictions, then reviews methods, datasets, benchmarks, and open problems.

citing papers explorer

Showing 3 of 3 citing papers after filters.