Recognition: 2 theorem links
HQTN-SER: Speech Emotion Recognition with Hybrid Quantum Tensor Networks
Pith reviewed 2026-05-15 01:47 UTC · model grok-4.3
The pith
A hybrid quantum tensor network achieves consistent accuracies above 73 percent on three speech emotion benchmarks while using only a few qubits.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
HQTN-SER uses an MPS-inspired quantum tensor network module that enforces structured qubit interactions to model correlations in speech representations with a small number of trainable parameters, then fuses the quantum measurement features with a learned classical latent embedding for classification. On RAVDESS, SAVEE, and MDER the model reaches 80.12 percent, 78.26 percent, and 73.51 percent accuracy respectively, with stable training and low qubit requirements, establishing tensor network structure as an effective hardware-aware design choice for quantum-assisted speech emotion recognition.
What carries the argument
The MPS-inspired quantum tensor network module that enforces structured interactions among a small number of qubits to model correlations in speech feature representations.
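The sweep described above can be sketched with a small statevector simulation. Everything below (the qubit count, one trainable rotation per sweep step, Z-basis readout) is an illustrative assumption for exposition, not the paper's actual circuit:

```python
import numpy as np

N = 4  # number of qubits; the paper's exact low-qubit count is assumed here

def ry(theta):
    # Single-qubit Y rotation.
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([[c, -s], [s, c]])

def kron_all(ops):
    # Tensor product of per-wire 2x2 operators into a full 2^N matrix.
    out = ops[0]
    for op in ops[1:]:
        out = np.kron(out, op)
    return out

def apply_1q(state, gate, wire):
    ops = [np.eye(2)] * N
    ops[wire] = gate
    return kron_all(ops) @ state

def apply_cnot(state, control, target):
    # CNOT = |0><0|_c (x) I  +  |1><1|_c (x) X_t.
    P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
    X = np.array([[0.0, 1.0], [1.0, 0.0]])
    ops0 = [np.eye(2)] * N
    ops0[control] = P0
    ops1 = [np.eye(2)] * N
    ops1[control] = P1
    ops1[target] = X
    return (kron_all(ops0) + kron_all(ops1)) @ state

def mps_block(features, weights):
    # Angle-encode one feature per qubit, then sweep left to right with a
    # local trainable rotation followed by a nearest-neighbor CNOT.
    state = np.zeros(2 ** N)
    state[0] = 1.0
    for i in range(N):
        state = apply_1q(state, ry(features[i]), i)
    for i in range(N - 1):
        state = apply_1q(state, ry(weights[i]), i)
        state = apply_cnot(state, i, i + 1)
    # Pauli-Z expectations: the "quantum measurement features" a classical
    # fusion head would consume.
    Z = np.diag([1.0, -1.0])
    expvals = []
    for i in range(N):
        ops = [np.eye(2)] * N
        ops[i] = Z
        expvals.append(float(state @ kron_all(ops) @ state))
    return expvals

feats = np.array([0.1, 0.5, -0.3, 0.8])
out = mps_block(feats, np.zeros(N - 1))  # four values in [-1, 1]
```

With all sweep weights at zero, the first qubit's expectation reduces to cos of its encoded feature, since a CNOT leaves Z on its control unchanged; that is a quick sanity check on the simulator.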
If this is right
- Tensor network connectivity supports stable training of quantum-assisted models on standard speech emotion datasets.
- Low qubit counts suffice for competitive accuracy when the network structure matches the data correlations.
- The same hybrid design supplies a reproducible baseline for future quantum affective computing experiments.
- Structured quantum modules can add value in tasks where classical models alone struggle with subtle nonlinear patterns.
Where Pith is reading between the lines
- The approach could transfer to other audio classification problems such as speaker verification if the tensor structure captures temporal dependencies effectively.
- Testing on larger or noisier real-world recordings would clarify whether the stability advantage persists outside controlled benchmarks.
- Similar MPS-style connectivity might benefit quantum models in neighboring domains like video emotion analysis that also involve sequential subtle signals.
Load-bearing premise
The observed accuracy and convergence stability come specifically from the quantum tensor network connectivity rather than the classical fusion network or the shared preprocessing steps.
What would settle it
An ablation that replaces the quantum tensor network with an equivalent classical dense layer while freezing every other component including qubit count and training protocol; equal or higher accuracy in the classical version would falsify the claim that the tensor structure supplies the benefit.
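One concrete form such a classical control could take is an MPS contraction in the style of Stoudenmire and Schwab, with parameter count set by the bond dimension. The feature map, dimensions, and random tensors below are illustrative assumptions, not the paper's baseline:

```python
import numpy as np

def feature_map(x):
    # Local 2-d embedding per feature, a common choice in classical
    # tensor-network classifiers; purely illustrative here.
    return np.stack([np.cos(x), np.sin(x)], axis=1)  # shape (n, 2)

def mps_score(x, tensors):
    # Contract the embedded features against an MPS, left to right.
    # tensors[0]: (2, D); tensors[1..n-2]: (2, D, D); tensors[-1]: (2, D).
    phi = feature_map(x)
    v = phi[0] @ tensors[0]                                  # (D,)
    for i in range(1, len(x) - 1):
        v = np.einsum('p,d,pdq->q', phi[i], v, tensors[i])   # sweep step
    return float(np.einsum('p,d,pd->', phi[-1], v, tensors[-1]))

rng = np.random.default_rng(0)
n, D = 8, 4  # 8 input features, bond dimension 4 (both assumed)
tensors = ([rng.normal(size=(2, D))]
           + [rng.normal(size=(2, D, D)) for _ in range(n - 2)]
           + [rng.normal(size=(2, D))])
score = mps_score(rng.normal(size=n), tensors)  # scalar logit
```

The parameter count is 2D for each boundary tensor plus 2D² per interior site, which is the quantity one would match against the quantum module's trainable parameters when building the ablation.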
Original abstract
Speech emotion recognition (SER) remains fragile in real-world conditions because emotional cues are subtle, speaker-dependent, and easily confounded by recording variability, while high-performing deep models typically rely on large and carefully curated training sets. Quantum machine learning offers an alternative way to introduce nonlinear correlation modeling with compact modules, yet existing quantum SER studies remain limited and the impact of circuit structure is not well understood. This paper presents HQTN-SER, a hybrid quantum-classical framework that investigates how quantum tensor network connectivity can support SER under small-qubit settings. HQTN-SER introduces (i) an MPS-inspired quantum tensor network module that enforces structured interactions to model correlations in speech representations with a small number of trainable parameters, and (ii) a fusion strategy that combines quantum measurement features with a learned classical latent embedding for end-to-end emotion classification. We evaluate HQTN-SER on three public benchmarks (RAVDESS, SAVEE, and MDER) under a unified preprocessing and training protocol. The proposed model achieves consistent performance across datasets, RAVDESS = 80.12%, SAVEE = 78.26% and MDER = 73.51% accuracy, with stable convergence and low qubit counts, showing that tensor network structure can be an effective and hardware-aware design choice for quantum-assisted SER. The results provide a reproducible baseline and clarify when structured quantum modules can add value to affective computing today.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces HQTN-SER, a hybrid quantum-classical framework for speech emotion recognition that uses an MPS-inspired quantum tensor network module to enforce structured interactions on speech representations with few trainable parameters, fused with a classical latent embedding for end-to-end classification. Evaluated under a unified preprocessing and training protocol on the RAVDESS, SAVEE, and MDER benchmarks, the model reports accuracies of 80.12%, 78.26%, and 73.51% respectively, together with claims of stable convergence and low qubit counts, arguing that tensor-network connectivity offers an effective hardware-aware design choice for quantum-assisted SER and supplies a reproducible baseline.
Significance. If the attribution of performance gains to the MPS-inspired quantum tensor module can be substantiated through appropriate controls, the work would provide a concrete, reproducible baseline for quantum machine learning in affective computing under small-qubit regimes. It would clarify when structured quantum connectivity adds value beyond classical fusion networks, informing hardware-aware circuit design for signal-processing tasks.
Major comments (2)
- [§4 (Experiments) and §5 (Results)] The reported accuracies (RAVDESS 80.12%, SAVEE 78.26%, MDER 73.51%) and stable convergence are presented without ablation studies that isolate the contribution of the MPS-inspired quantum tensor network. No baselines are supplied that (a) remove the quantum module entirely, (b) replace it with a classical tensor network or MLP of matched parameter count, or (c) alter quantum connectivity while keeping the fusion stage fixed. Without these controls the performance cannot be rigorously attributed to the quantum tensor structure rather than the classical latent embedding or preprocessing choices.
- [§3 (Methods)] The description of the quantum circuit implementation lacks explicit equations for the MPS-inspired tensor contraction, the measurement operators, and the precise fusion mechanism between quantum features and the classical embedding. This prevents verification of the claimed low-qubit regime and parameter efficiency.
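For orientation, the kind of contraction the referee asks to be written out takes, in the classical tensor-network literature, a form like the following (notation assumed for illustration, not taken from the paper):

```latex
% Generic MPS contraction of a local feature map \phi against site tensors
% A^{(i)}, with physical indices p_i and bond indices d_i.
f(\mathbf{x}) \;=\; \sum_{\{p_i\},\,\{d_i\}}
  A^{(1)}_{p_1 d_1}\, A^{(2)}_{p_2 d_1 d_2} \cdots A^{(n)}_{p_n d_{n-1}}
  \prod_{i=1}^{n} \phi_{p_i}(x_i)
```

In the quantum variant, the site tensors become parameterized unitaries and the output is read off through measurement operators, which is exactly the piece the referee asks to see specified.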
Minor comments (2)
- [Abstract and §5] The abstract and results sections report point accuracies without error bars, standard deviations across runs, or the number of independent trials; reporting these would strengthen the consistency claim.
- [Figures] Figure captions and axis labels in the convergence plots could be expanded to indicate the exact loss function and optimizer used.
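A minimal way to report the requested spread, assuming accuracies collected over independent seeds (the numbers below are invented for illustration, not the paper's results):

```python
import numpy as np

# Hypothetical per-seed test accuracies for one dataset (illustrative only).
acc = np.array([0.801, 0.795, 0.806, 0.798, 0.803])

mean = acc.mean()
std = acc.std(ddof=1)          # sample standard deviation across runs
sem = std / np.sqrt(len(acc))  # standard error of the mean
print(f"accuracy = {mean:.3f} ± {std:.3f} (n = {len(acc)} seeds)")
```

Reporting mean ± sample standard deviation together with the number of seeds is the smallest change that would let readers judge whether the cross-dataset "consistency" is within run-to-run noise.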
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which help strengthen the attribution of results and the clarity of the methodological description. We address each major point below and will incorporate revisions to provide the requested controls and explicit formulations.
Point-by-point responses
- Referee: [§4 (Experiments) and §5 (Results)] The reported accuracies (RAVDESS 80.12%, SAVEE 78.26%, MDER 73.51%) and stable convergence are presented without ablation studies that isolate the contribution of the MPS-inspired quantum tensor network. No baselines are supplied that (a) remove the quantum module entirely, (b) replace it with a classical tensor network or MLP of matched parameter count, or (c) alter quantum connectivity while keeping the fusion stage fixed. Without these controls the performance cannot be rigorously attributed to the quantum tensor structure rather than the classical latent embedding or preprocessing choices.
  Authors: We agree that ablation studies are required to rigorously attribute performance gains to the MPS-inspired quantum tensor network rather than the classical components. In the revised manuscript we will add a dedicated ablation subsection in §4/§5 that includes: (a) a purely classical baseline without the quantum module, (b) a classical tensor network or MLP with matched parameter count, and (c) variants that alter quantum connectivity while keeping the fusion stage fixed. These controls will be evaluated under the same unified protocol to substantiate the claims. Revision: yes.
- Referee: [§3 (Methods)] The description of the quantum circuit implementation lacks explicit equations for the MPS-inspired tensor contraction, the measurement operators, and the precise fusion mechanism between quantum features and the classical embedding. This prevents verification of the claimed low-qubit regime and parameter efficiency.
  Authors: We acknowledge that the current textual description in §3, while outlining the overall architecture, does not supply the explicit mathematical details needed for full verification. In the revision we will insert precise equations for the MPS-inspired tensor contraction, the measurement operators, and the fusion operation that combines quantum measurement features with the classical latent embedding. These additions will directly support the low-qubit and parameter-efficiency claims. Revision: yes.
Circularity Check
No circularity: empirical results from direct evaluation on benchmarks
Full rationale
The paper proposes HQTN-SER as a hybrid architecture combining an MPS-inspired quantum tensor network module with classical fusion and reports accuracies (RAVDESS 80.12%, SAVEE 78.26%, MDER 73.51%) from evaluation under a unified preprocessing and training protocol on three public datasets. No derivation chain, equations, or first-principles results are presented that reduce any claimed performance or convergence property to the model's own inputs or fitted parameters by construction. No self-citations are used to justify uniqueness or load-bearing premises, and no predictions are obtained by renaming or refitting quantities already present in the training data. The central claims rest on empirical measurement rather than self-referential logic, rendering the analysis self-contained.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel (tagged unclear)
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Passage: "MPS-structured variational circuit... local trainable blocks... interleaved with CNOT gates between adjacent qubits only... nearest-neighbor entanglement... left-to-right sweep"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction (tagged unclear)
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Passage: "HQTN-SER achieves... RAVDESS = 80.12%... stable convergence and low qubit counts"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis, S. Kollias, W. Fellenz, and J. Taylor, "Emotion recognition in human-computer interaction," IEEE Signal Processing Magazine, vol. 18, no. 1, pp. 32–80, 2001.
- [2] M. El Ayadi, M. S. Kamel, and F. Karray, "Survey on speech emotion recognition: Features, classification schemes, and databases," Pattern Recognition, vol. 44, no. 3, pp. 572–587, 2011.
- [3] B. Schuller, "Speech emotion recognition: Two decades in a nutshell, benchmarks, and ongoing trends," Communications of the ACM, vol. 61, pp. 90–99, Apr. 2018.
- [4] G. Alhussein, I. Ziogas, S. Saleem, and L. J. Hadjileontiadis, "Speech emotion recognition in conversations using artificial intelligence: a systematic review and meta-analysis," Artificial Intelligence Review, vol. 58, no. 7, p. 198, 2025.
- [5] J. H. Chowdhury, S. Ramanna, and K. Kotecha, "Speech emotion recognition with light weight deep neural ensemble model using hand crafted features," Scientific Reports, vol. 15, no. 1, p. 11824, 2025.
- [6] Y. Wu, Q. Mi, and T. Gao, "A comprehensive review of multimodal emotion recognition: Techniques, challenges, and future directions," Biomimetics, vol. 10, no. 7, p. 418, 2025.
- [7] M. Schuld and N. Killoran, "Quantum machine learning in feature Hilbert spaces," Physical Review Letters, vol. 122, no. 4, Feb. 2019.
- [8] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, "Quantum machine learning," Nature, vol. 549, no. 7671, pp. 195–202, 2017.
- [9] N. Innan, A. Sawaika, A. Dhor, S. Dutta, S. Thota, H. Gokal, N. Patel, M. A.-Z. Khan, I. Theodonis, and M. Bennai, "Financial fraud detection using quantum graph neural networks," Quantum Machine Intelligence, vol. 6, no. 1, p. 7, 2024.
- [10] N. Innan, O. I. Siddiqui, S. Arora, T. Ghosh, Y. P. Koçak, D. Paragas, A. A. O. Galib, M. A.-Z. Khan, and M. Bennai, "Quantum state tomography using quantum machine learning," Quantum Machine Intelligence, vol. 6, no. 1, p. 28, 2024.
- [11] N. Innan, A. Marchisio, M. Bennai, and M. Shafique, "LEP-QNN: Loan eligibility prediction using quantum neural networks," in 2025 IEEE International Conference on Quantum Computing and Engineering (QCE), vol. 1. IEEE, 2025, pp. 1864–1872.
- [12] N. Innan, M. Kashif, A. Marchisio, Y.-S. Gan, F. Barbaresco, and M. Shafique, "QUAV: Quantum-assisted path planning and optimization for UAV navigation with obstacle avoidance," in 2025 IEEE International Conference on Quantum Artificial Intelligence (QAI). IEEE, 2025, pp. 208–215.
- [13] P. K. Choudhary, N. Innan, M. Shafique, and R. Singh, "HQNN-FSP: A hybrid classical-quantum neural network for regression-based financial stock market prediction," Quantum Machine Intelligence, vol. 8, no. 1, p. 55, 2026.
- [14] Y.-Y. Hong and D. J. D. Lopez, "A review on quantum machine learning in applied systems and engineering," IEEE Access, 2025.
- [15] G. Balachandran, S. Ranjith, G. Jagan, and T. Chenthil, "Advanced speech emotion recognition utilizing optimized equivariant quantum convolutional neural network for accurate emotional state classification," Knowledge-Based Systems, vol. 316, p. 113414, 2025.
- [16] M. Norval and Z. Wang, "Quantum AI in speech emotion recognition," Entropy, vol. 27, no. 12, p. 1201, 2025.
- [17] M. A. Kucharski, "A survey on quantum machine learning in speech acoustics," in 2025 IEEE 25th International Symposium on Computational Intelligence and Informatics (CINTI). IEEE, 2025, pp. 235–240.
- [18] M. Larocca, S. Thanasilp, S. Wang, K. Sharma, J. Biamonte, P. J. Coles, L. Cincio, J. R. McClean, Z. Holmes, and M. Cerezo, "Barren plateaus in variational quantum computing," Nature Reviews Physics, vol. 7, no. 4, pp. 174–189, 2025.
- [19] N. Innan, M. A.-Z. Khan, and M. Bennai, "Financial fraud detection: a comparative study of quantum machine learning models," International Journal of Quantum Information, vol. 22, no. 02, p. 2350044, 2024.
- [20] D. Vyskubov, K. Vyskubov, N. Innan, and M. Shafique, "Scaling laws for hybrid quantum neural networks: Depth, width, and quantum-centric diagnostics," arXiv preprint arXiv:2604.06007, 2026.
- [21] A. Kardashin, A. Uvarov, and J. Biamonte, "Quantum machine learning tensor network states," Frontiers in Physics, vol. 8, p. 586374, Mar. 2021.
- [22] M. Swain, A. Routray, and P. Kabisatpathy, "Databases, features and classifiers for speech emotion recognition: a review," International Journal of Speech Technology, vol. 21, no. 1, pp. 93–120, 2018.
- [23] R. A. Khalil, E. Jones, M. I. Babar, T. Jan, M. H. Zafar, and T. Alhussain, "Speech emotion recognition using deep learning techniques: A review," IEEE Access, vol. 7, pp. 117327–117345, 2019.
- [24] D. Issa, M. F. Demirci, and A. Yazici, "Speech emotion recognition with deep convolutional neural networks," Biomedical Signal Processing and Control, vol. 59, p. 101894, 2020.
- [25] Q. Mao, M. Dong, Z. Huang, and Y. Zhan, "Learning salient features for speech emotion recognition using convolutional neural networks," IEEE Transactions on Multimedia, vol. 16, pp. 2203–2213, Dec. 2014.
- [26] S. Solanki, J. Agarwal, A. Jain, A. K. Dubey, A. Panwar, and P. Priyadarshi, "Evaluating multi-layer perceptron and recurrent neural networks for speech emotion recognition," in 2025 3rd International Conference on Communication, Security, and Artificial Intelligence (ICCSAI), vol. 3. IEEE, 2025, pp. 349–354.
- [27] M. Sperber, G. Neubig, N.-Q. Pham, and A. Waibel, "Self-attentional models for lattice inputs," in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019, pp. 1185–1197.
- [28] A. Radford, J. W. Kim, T. Xu, G. Brockman, C. McLeavey, and I. Sutskever, "Robust speech recognition via large-scale weak supervision," in International Conference on Machine Learning. PMLR, 2023, pp. 28492–28518.
- [29] S. Latif, R. Rana, S. Khalifa, R. Jurdak, J. Qadir, and B. Schuller, "Survey of deep representation learning for speech emotion recognition," IEEE Transactions on Affective Computing, vol. 14, no. 2, pp. 1634–1654, 2021.
- [30] I. Barradas, Z. N. Khan, and A. Peer, "Emotion recognition from peripheral physiological signals: A systematic review of trends, challenges and opportunities," ACM Transactions on Interactive Intelligent Systems, vol. 16, no. 1, pp. 1–41, 2026.
- [31] R. Soltani, B. Emna, and H. Ltifi, "Quantum-enhanced cortical deep echo state network for fast and accurate speech emotion recognition," Quantum Machine Intelligence, vol. 7, Aug. 2025.
- [32] S. Mittal, Y. Chand, M. Kumar, and N. K. Kundu, "Hybrid quantum machine learning based human speech emotion recognition," in 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW). IEEE, 2025, pp. 1–5.
- [33] Z. Qu, Z. Chen, S. Dehdashti, and P. Tiwari, "QFSM: A novel quantum federated learning algorithm for speech emotion recognition with minimal gated unit in 5G IoV," IEEE Transactions on Intelligent Vehicles, vol. PP, pp. 1–12, Jan. 2024.
- [34] T. Rajapakshe, R. Rana, F. Riaz, S. Khalifa, and B. W. Schuller, "Representation learning with parameterised quantum circuits for advancing speech emotion recognition," Scientific Reports, 2025.
- [35] K. Dave, N. Innan, B. K. Behera, Z. Mumtaz, S. Al-Kuwari, and A. Farouk, "SentiQNF: A novel approach to sentiment analysis using quantum algorithms and neuro-fuzzy systems," IEEE Transactions on Computational Social Systems, 2025.
- [36] S. S. J. Krishna, M. Anish, A. M. Posonia, J. A. Mayan, and P. Asha, "Gesture and emotion detection using quantum computing for enhanced recognition and analysis," in 2024 International Conference on Expert Clouds and Applications (ICOECA). IEEE, 2024, pp. 530–535.
- [37] R. Golchha, M. Sahu, and V. Bhateja, "Quantum-based deep learning method for recognition of facial expressions," Neural Computing and Applications, vol. 37, no. 16, pp. 10163–10173, 2025.
- [38] R. Orús, "A practical introduction to tensor networks: Matrix product states and projected entangled pair states," Annals of Physics, vol. 349, pp. 117–158, Oct. 2014.
- [39] E. Stoudenmire and D. J. Schwab, "Supervised learning with tensor networks," in Advances in Neural Information Processing Systems, D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett, Eds., vol. 29. Curran Associates, Inc., 2016.
- [40] S. Livingstone and F. Russo, "The Ryerson audio-visual database of emotional speech and song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English," PLOS ONE, vol. 13, p. e0196391, May 2018.
- [41] S. Haq, P. J. B. Jackson, and J. D. Edge, "Audio-visual feature selection and reduction for emotion classification," in Auditory-Visual Speech Processing, 2008, pp. 185–190.
- [42] M. Amine Soumiaa, "Moroccan dialect emotion recognition dataset," 2024.
- [43] K.-C. Chen, W. Ma, and X. Xu, "Consensus-based distributed quantum kernel learning for speech recognition," in 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW). IEEE, 2025, pp. 1–5.