
arxiv: 1608.08128 · v3 · pith:J3ERXGFRnew · submitted 2016-08-29 · 💻 cs.CV

Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks

classification 💻 cs.CV
keywords been, neural, network, achieve, different, recurrent, results, temporally

This thesis explores different approaches that use Convolutional and Recurrent Neural Networks to classify and temporally localize activities in videos, and proposes an implementation to achieve this. As a first step, features are extracted from video frames using a state-of-the-art 3D Convolutional Neural Network. These features are fed into a recurrent neural network that solves the activity classification and temporal localization tasks in a simple and flexible way. Different architectures and configurations have been tested in order to achieve the best performance and learning on the video dataset provided. In addition, different kinds of post-processing applied to the trained network's output have been studied to achieve better results on the temporal localization of activities in the videos. The results provided by the neural network developed in this thesis were submitted to the ActivityNet Challenge 2016 at CVPR, achieving competitive results with a simple and flexible architecture.
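The abstract mentions post-processing the network's output to improve temporal localization but does not say which methods were studied. As a rough illustration only, the sketch below assumes the network emits one activity probability per clip, smooths that sequence with a moving average, and thresholds it into time segments. The function names, the window size `k`, the `threshold`, and `clip_seconds` are all hypothetical and are not taken from the thesis.

```python
from typing import List, Tuple


def smooth(probs: List[float], k: int = 5) -> List[float]:
    """Centered moving average over up to k neighbouring clips (assumed post-processing)."""
    half = k // 2
    out = []
    for i in range(len(probs)):
        # The window is truncated at the sequence boundaries.
        window = probs[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def extract_segments(
    probs: List[float],
    threshold: float = 0.5,     # hypothetical decision threshold
    clip_seconds: float = 1.0,  # hypothetical duration covered by one clip
) -> List[Tuple[float, float]]:
    """Turn per-clip probabilities into (start, end) segments in seconds."""
    segments = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # a segment opens at this clip
        elif p < threshold and start is not None:
            segments.append((start * clip_seconds, i * clip_seconds))
            start = None
    if start is not None:
        # The activity runs until the end of the video.
        segments.append((start * clip_seconds, len(probs) * clip_seconds))
    return segments


if __name__ == "__main__":
    raw = [0.1, 0.8, 0.2, 0.9, 0.9, 0.1, 0.0]
    print(extract_segments(smooth(raw, k=3)))
```

Smoothing before thresholding is one common way to keep a single noisy clip from splitting an activity into fragments. Other choices, such as median filtering or per-class thresholds, would fit the same interface.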



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Simple vs complex temporal recurrences for video saliency prediction

    cs.CV 2019-07 unverdicted novelty 4.0

    Both ConvLSTM and exponential moving average modifications to a static saliency model achieve state-of-the-art video saliency prediction on DHF1K after SALICON pre-training and yield similar maps.