pith. machine review for the scientific record.

Hawkeye: Training video-text LLMs for grounding text in videos. CoRR, abs/2403.10228

9 Pith papers cite this work. Polarity classification is still indexing.


fields: cs.CV (8), cs.MM (1)

years: 2026 (9)

verdicts: unverdicted (9)

representative citing papers

MarkIt: Training-Free Visual Markers for Precise Video Temporal Grounding

cs.MM · 2026-04-28 · unverdicted · novelty 7.0

MarkIt uses a query-to-mask bridge with open-vocabulary segmentation to add visual markers and frame indices to videos, enabling Vid-LLMs to achieve state-of-the-art temporal grounding on moment retrieval and highlight detection benchmarks.

ViLL-E: Video LLM Embeddings for Retrieval

cs.CV · 2026-04-13 · unverdicted · novelty 6.0

ViLL-E introduces a dynamic embedding mechanism and joint contrastive-generative training for VideoLLMs, delivering up to 7% gains in temporal localization and 4% in video retrieval while enabling new zero-shot capabilities.
