Anomaly Locality in Video Surveillance

arxiv: 1901.10364 · v1 · pith:VR7Q3Z2Q · submitted 2019-01-29 · 💻 cs.CV

Federico Landi, Cees G. M. Snoek, Rita Cucchiara

classification 💻 cs.CV

keywords spatiotemporal, surveillance, videos, anomaly, locality, anomalies, dataset, detection

This paper strives for the detection of real-world anomalies, such as burglaries and assaults, in surveillance videos. Although anomalies are generally local, as they happen in a limited portion of the frame, no previous work on the subject has studied the contribution of locality. In this work, we explore the impact of considering spatiotemporal tubes instead of whole-frame video segments. For this purpose, we enrich existing surveillance videos with spatial and temporal annotations: the result is the first dataset for anomaly detection with bounding box supervision in both its train and test sets. Our experiments show that a network trained with spatiotemporal tubes performs better than an analogous model trained with whole-frame videos. In addition, we discover that locality is robust to different kinds of errors in the tube extraction phase at test time. Finally, we demonstrate that our network can provide spatiotemporal proposals for unseen surveillance videos leveraging only video-level labels. By doing so, we enlarge our spatiotemporal anomaly dataset without the need for further human labeling.
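
To make the contrast between spatiotemporal tubes and whole-frame segments concrete, here is a minimal sketch in Python with NumPy. It is not the authors' extraction pipeline, which the abstract does not describe. The function name `extract_tube`, the `(T, H, W, C)` video layout, the `(x1, y1, x2, y2)` box format, and the choice to crop with the union of the per-frame boxes (so every frame of the tube has the same size) are all assumptions made for illustration.

```python
import numpy as np

def extract_tube(video, boxes, t_start):
    """Crop a spatiotemporal tube from a video of shape (T, H, W, C).

    boxes holds one hypothetical (x1, y1, x2, y2) box per frame,
    covering frames t_start .. t_start + len(boxes) - 1.
    """
    boxes = np.asarray(boxes)
    # Assumption: use the union of all boxes so the crop size is fixed.
    x1, y1 = boxes[:, 0].min(), boxes[:, 1].min()
    x2, y2 = boxes[:, 2].max(), boxes[:, 3].max()
    t_end = t_start + len(boxes)
    return video[t_start:t_end, y1:y2, x1:x2]

# Dummy 16-frame video and three annotated frames starting at frame 5.
video = np.zeros((16, 240, 320, 3), dtype=np.uint8)
boxes = [(100, 50, 160, 120), (104, 52, 164, 122), (108, 54, 168, 124)]

tube = extract_tube(video, boxes, t_start=5)   # local region only
whole_frame = video[5:8]                       # same frames, full extent
```

The tube covers the same three frames as the whole-frame segment but only the annotated region, which is the kind of input the abstract compares against whole-frame training.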

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Weakly-Supervised Spatiotemporal Anomaly Detection
cs.CV · 2026-05 · unverdicted · novelty 5.0

A multiple instance learning approach with ranking loss localizes spatiotemporal anomalies in videos using only video-level normal/anomalous labels on the UCF Crime2Local dataset.
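
As a rough illustration of what a multiple instance learning ranking loss can look like, the sketch below uses a common hinge formulation: the highest-scoring instance in an anomalous bag should outscore the highest-scoring instance in a normal bag by a margin. Whether the citing paper uses exactly this form is not stated in the summary above, so the function name `mil_ranking_loss`, the margin of 1.0, and the example scores are assumptions.

```python
import numpy as np

def mil_ranking_loss(anomalous_scores, normal_scores, margin=1.0):
    """Hinge ranking loss between two bags of instance scores.

    Each bag holds the anomaly scores of the instances (for example,
    tubes or segments) from one video with only a video-level label.
    """
    top_anomalous = np.max(anomalous_scores)  # most suspicious instance
    top_normal = np.max(normal_scores)        # hardest normal instance
    return float(max(0.0, margin - top_anomalous + top_normal))

# Bag where one anomalous instance clearly stands out.
loss_separated = mil_ranking_loss(np.array([0.1, 0.9, 0.2]),
                                  np.array([0.1, 0.2, 0.1]))

# Bag where a normal instance outscores every anomalous one.
loss_confused = mil_ranking_loss(np.array([0.2, 0.3]),
                                 np.array([0.6, 0.1]))
```

The loss is smaller when the anomalous bag contains a clearly higher-scoring instance, which is what lets video-level labels alone push individual instances toward the right scores.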