
arxiv: 2303.10741 · v2 · pith:PH3S7HJ6new · submitted 2023-03-19 · 💻 cs.CV · cs.LG

Computer Vision Estimation of Emotion Reaction Intensity in the Wild

classification 💻 cs.CV cs.LG
keywords emotion reaction intensity model affective computer emotional emotions

Emotions play an essential role in human communication. Developing computer vision models for automatic recognition of emotion expression can aid in a variety of domains, including robotics, digital behavioral healthcare, and media analytics. There are three types of emotional representations that are traditionally modeled in affective computing research: Action Units, Valence-Arousal (VA), and Categorical Emotions. As part of an effort to move beyond these representations towards more fine-grained labels, we describe our submission to the newly introduced Emotional Reaction Intensity (ERI) Estimation challenge in the 5th competition for Affective Behavior Analysis in-the-Wild (ABAW). We developed four deep neural networks trained in the visual domain and a multimodal model trained with both visual and audio features to predict emotion reaction intensity. Our best performing model on the Hume-Reaction dataset achieved an average Pearson correlation coefficient of 0.4080 on the test set using a pre-trained ResNet50 model. This work provides a first step towards the development of production-grade models that predict emotion reaction intensities rather than discrete emotion categories.
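
The headline metric in the abstract, an average Pearson correlation coefficient, can be illustrated with a minimal sketch. It assumes that predictions and labels are arranged as arrays of shape (samples, emotions), that a coefficient is computed per emotion column, and that the per-emotion coefficients are then averaged. The function name `mean_pearson` and the array layout are hypothetical and are not taken from the paper or the challenge toolkit.

```python
import numpy as np


def mean_pearson(y_true, y_pred):
    """Average per-column Pearson correlation (illustrative helper, not the authors' code).

    y_true, y_pred: array-likes of shape (n_samples, n_emotions).
    Returns (mean coefficient, array of per-emotion coefficients).
    A column with zero variance yields NaN for that emotion.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.ndim != 2 or y_true.shape != y_pred.shape:
        raise ValueError("expected two arrays of identical 2-D shape")

    # Center each emotion column on its own mean.
    t = y_true - y_true.mean(axis=0)
    p = y_pred - y_pred.mean(axis=0)

    # Pearson r per column: covariance over the product of standard deviations.
    numerator = (t * p).sum(axis=0)
    denominator = np.sqrt((t ** 2).sum(axis=0) * (p ** 2).sum(axis=0))
    per_emotion = numerator / denominator

    return per_emotion.mean(), per_emotion


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.random((100, 7))  # synthetic intensities, for illustration only
    preds = labels + 0.5 * rng.standard_normal((100, 7))  # noisy synthetic predictions
    score, per_emotion = mean_pearson(labels, preds)
    print(f"mean Pearson r: {score:.4f}")
    print("per-emotion r:", np.round(per_emotion, 4))
```

Because Pearson correlation is invariant to the scale and offset of the predictions, a model scored this way is rewarded for ranking intensities correctly rather than for matching their absolute values.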



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Fine-tuning a multimodal large language model for clinician-grade autism behavioral scoring from short home videos

    cs.CV 2026-06 unverdicted novelty 6.0

    Fine-tuning Gemini 2.5 Pro with LoRA on 400 home videos improves per-feature agreement with clinicians by 40% and zero-shot ASD diagnosis F1 by 53% on held-out data, with classifier pipelines reaching 77% accuracy.

  2. Facial Expression Recognition in the Deep Learning Era: A Systematic Multi-Criteria Review of Methods, Models, Datasets, Performance, Challenges, and Future Research Directions

    cs.CV 2026-06 unverdicted novelty 4.0

    This survey organizes deep learning FER literature into five evolutionary phases and a seven-criteria taxonomy, compares datasets and performance, and outlines challenges.