
arxiv: 1812.10328 · v1 · submitted 2018-12-26 · 💻 cs.CV

Recognition: unknown

A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition

Authors on Pith no claims yet
classification 💻 cs.CV
keywords: activity, group, framework, accuracy, CNNs, collective, convolutional, dataset

In this work, we present a framework based on multi-stream convolutional neural networks (CNNs) for group activity recognition. The CNN streams are trained separately on different modalities, and their predictions are fused at the end. Each stream has two branches that predict the group activity from person-level and scene-level representations. A new modality based on human pose estimation is introduced to give the model additional information. We evaluate our method on the Volleyball and Collective Activity datasets. Experimental results show that the proposed framework achieves state-of-the-art results with both multiple-frame and single-frame input, reaching 90.50% and 86.61% accuracy on the Volleyball dataset, respectively, and 87.01% accuracy for multiple-frame group activity recognition on the Collective Activity dataset.
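The late-fusion step described in the abstract can be illustrated with a small sketch. The abstract does not state the fusion rule, so the (optionally weighted) averaging of softmax probabilities below is an assumption, not the authors' method. The stream name "rgb", the three-class logits, and the helper names `softmax` and `fuse_streams` are likewise hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(branch_logits, weights=None):
    """Late fusion: weighted average of per-branch class probabilities.

    branch_logits maps a (modality, branch) pair to that branch's logits.
    Averaging is an assumed fusion rule; the abstract only says that
    predictions are fused at the end.
    """
    probs = [softmax(logits) for logits in branch_logits.values()]
    if weights is None:
        weights = [1.0] * len(probs)
    weight_sum = sum(weights)
    n_classes = len(probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probs)) / weight_sum
        for c in range(n_classes)
    ]


# Hypothetical logits for two modalities, each with a person-level
# and a scene-level branch, over three made-up activity classes.
branch_logits = {
    ("rgb", "person"): [2.0, 0.5, 0.1],
    ("rgb", "scene"): [1.5, 1.0, 0.2],
    ("pose", "person"): [0.3, 2.2, 0.1],
    ("pose", "scene"): [1.8, 0.4, 0.3],
}

fused = fuse_streams(branch_logits)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

Because each branch is trained separately, a fusion step of this kind needs only the branches' output scores, so a new modality can be added without retraining the existing streams.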

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Attention-Guided Dual-Stream Learning for Group Engagement Recognition: Fusing Transformer-Encoded Motion Dynamics with Scene Context via Adaptive Gating

    cs.CV 2026-04 unverdicted novelty 6.0

    DualEngage fuses transformer-encoded student motion dynamics with 3D scene features via softmax-gated fusion to recognize group engagement in classroom videos, reporting 96.21% average accuracy on a university dataset.