Evaluation of Conversational Agents: Understanding Culture, Context and Environment in Emotion Detection

Auxane Boch; Emmanuel Ahene; Martha Teiko Teye; Twum Frimpong; Yaw Marfo Missah

arxiv: 2605.30099 · v1 · pith:64RGMGZKnew · submitted 2026-05-28 · 💻 cs.CV

Pith reviewed 2026-06-29 08:22 UTC · model grok-4.3

classification 💻 cs.CV

keywords emotion detection · conversational AI · sarcasm · cultural factors · Black African society · convolutional neural network · multimodal data · AFME algorithm

The pith

A model combining speech and images detects emotions and sarcasm at 85-96 percent accuracy while addressing cultural factors in Black African society.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an emotion prediction model for conversational AI that incorporates cultural, contextual, and environmental factors specific to Black African society. It combines speech and image data using a three-layer Convolutional Neural Network and a new Audio-Frame Mean Expression algorithm to detect seven basic emotions along with sarcasm. This approach achieves accuracies between 85 and 96 percent by emphasizing pre-processing and post-processing stages. A sympathetic reader would care because generalized emotion detection systems have overlooked these cultural differences, potentially leading to less effective and less ethical AI applications in diverse regions.

Core claim

We develop an emotion prediction model with accuracies ranging between 85% and 96%. Our model combines both speech and image data to detect the seven basic emotions with a focus on also identifying sarcasm. It uses 3-layers of the Convolutional Neural Network in addition to a new Audio-Frame Mean Expression (AFME) algorithm and focuses on model pre-processing and post-processing stages. In the end, our proposed solution contributes to maintaining the credibility of an emotion recognition system in conversational AIs.

What carries the argument

The Audio-Frame Mean Expression (AFME) algorithm, a new method for processing audio frames to capture mean expressions, paired with a 3-layer Convolutional Neural Network to enable multimodal emotion and sarcasm detection.
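The paper supplies no pseudocode for AFME, so the sketch below is only one hypothetical reading of the name, not the authors' algorithm: average the per-frame emotion probability vectors of a clip and report the top-scoring label. The function name `mean_expression`, the label list, and the example probabilities are all assumptions made for illustration.

```python
from statistics import fmean

# Hypothetical label set: the paper says "seven basic emotions" without listing them here.
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def mean_expression(frame_probs):
    """Average per-frame probability vectors; return (top label, mean vector).

    frame_probs: list of lists, one probability vector per frame.
    This is an illustrative guess at what a "mean expression" could be.
    """
    if not frame_probs:
        raise ValueError("need at least one frame")
    n = len(EMOTIONS)
    if any(len(p) != n for p in frame_probs):
        raise ValueError("each frame needs one probability per emotion")
    mean_vec = [fmean(p[i] for p in frame_probs) for i in range(n)]
    best = max(range(n), key=mean_vec.__getitem__)
    return EMOTIONS[best], mean_vec

# Made-up classifier outputs for three frames of one clip.
frames = [
    [0.1, 0.0, 0.0, 0.7, 0.1, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.6, 0.1, 0.1, 0.1],
    [0.5, 0.0, 0.0, 0.3, 0.1, 0.1, 0.0],
]
label, mean_vec = mean_expression(frames)
print(label)
```

Averaging smooths out a single outlier frame (the third frame above leans "angry"), which is one plausible motivation for a mean-based post-processing step.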

If this is right

The model improves emotion recognition accuracy in culturally specific contexts.
It enables better sarcasm detection by integrating cultural considerations.
It supports more credible conversational AI systems for Black African users.
Focus on pre- and post-processing stages enhances overall system reliability.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

This model could be tested for transferability to other cultural contexts to see if the cultural factors are unique or generalizable.
Integrating this approach with existing conversational agents might reduce miscommunications in diverse user bases.
Future work could explore real-time implementation in human-robot interactions within specific environments.

Load-bearing premise

The model successfully incorporates and validates cultural, contextual, and environmental factors specific to Black African society in its emotion detection performance.

What would settle it

Running the model on emotion datasets from other cultural groups and observing if the accuracy drops below the reported range or fails to identify culturally nuanced expressions would falsify the claim of successful incorporation of those factors.
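A minimal sketch of that check, assuming per-group lists of true and predicted labels are available. The group names, the toy labels, and the helper `accuracy_by_group` are hypothetical; only the 85% lower bound comes from the paper's reported range.

```python
REPORTED_LOW = 0.85  # lower end of the paper's reported accuracy range

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def accuracy_by_group(results):
    """Map each group name to (accuracy, falls_below_reported_range)."""
    report = {}
    for group, (y_true, y_pred) in results.items():
        acc = accuracy(y_true, y_pred)
        report[group] = (acc, acc < REPORTED_LOW)
    return report

# Toy, made-up predictions for two hypothetical evaluation groups.
results = {
    "group_a": (["happy", "sad", "angry", "happy"], ["happy", "sad", "angry", "happy"]),
    "group_b": (["happy", "sad", "angry", "happy"], ["happy", "happy", "angry", "sad"]),
}
report = accuracy_by_group(results)
for group, (acc, below) in report.items():
    print(f"{group}: accuracy={acc:.2f} below_reported_range={below}")
```

A drop on its own would not isolate culture from other differences between datasets, such as recording conditions or label schemes, so a comparison like this is a starting point rather than a verdict.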

Figures

Figures reproduced from arXiv: 2605.30099 by Auxane Boch, Emmanuel Ahene, Martha Teiko Teye, Twum Frimpong, Yaw Marfo Missah.

**Figure 5.** It also clearly observed that the

**Figure 6.**

**Figure 8.**

**Figure 9.**

**Figure 10.**

Original abstract

Valuable decisions and highly prioritized analysis now depend on applications such as facial biometrics, social media photo tagging, and human-robot interactions. However, the ability to deploy such applications successfully depends on their efficiency on tested use cases, taking possible edge cases into consideration. Over the years, many generalized solutions have been implemented to mimic human emotions, including sarcasm. However, factors such as geographical location or cultural difference have not been fully explored, despite their relevance to resolving ethical issues and improving conversational AI (Artificial Intelligence). In this paper, we seek to address the potential challenges in the usage of conversational AI within Black African society. We develop an emotion prediction model with accuracies ranging between 85% and 96%. Our model combines both speech and image data to detect the seven basic emotions with a focus on also identifying sarcasm. It uses 3-layers of the Convolutional Neural Network in addition to a new Audio-Frame Mean Expression (AFME) algorithm and focuses on model pre-processing and post-processing stages. In the end, our proposed solution contributes to maintaining the credibility of an emotion recognition system in conversational AIs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper promises culturally-aware emotion detection for Black African society but delivers only a generic 3-layer CNN pipeline with no data, validation, or mechanism for those factors.

The paper sets out to improve conversational AI by addressing emotion detection in Black African contexts, claiming accuracies of 85-96% with a 3-layer CNN on speech and images plus a new AFME algorithm for sarcasm detection. It highlights the importance of cultural factors.

What is new here is the application to this specific society and the introduction of the AFME algorithm. The paper does a decent job of motivating why generalized models may fall short on cultural differences and ethical issues in AI deployment.

The main soft spot is that nothing in the description shows how the model actually incorporates or validates those cultural, contextual, or environmental factors. No datasets from the region are mentioned, no annotations for culture, and no experiments isolating those effects. The technical part reads like any other multimodal emotion recognition pipeline. The performance numbers come without baselines, splits, or error analysis, so it's hard to know if they mean anything.

This leaves the work with a gap between its stated goals and what is delivered. A reader looking for methods that handle cultural variation won't get usable insights from this.

I wouldn't bring it to a reading group. It doesn't seem ready for peer review because the core claim about cultural focus lacks any demonstration in the provided text. Desk reject seems appropriate.

Referee Report

3 major / 2 minor

Summary. The manuscript claims to develop a multi-modal emotion prediction model for conversational AI that detects seven basic emotions plus sarcasm by combining speech and image inputs. It uses a 3-layer CNN together with a new Audio-Frame Mean Expression (AFME) algorithm, reports accuracies of 85–96%, and positions the work as addressing cultural, contextual, and environmental challenges specific to Black African society.

Significance. A validated, culturally grounded emotion model for an under-represented population would be a meaningful contribution to inclusive conversational AI. The current manuscript, however, supplies neither the datasets, cultural annotations, nor controlled experiments needed to substantiate that positioning, so the claimed significance cannot be assessed.

major comments (3)

[Abstract] Abstract: the central motivation and contribution statements assert that the model addresses 'potential challenges in the usage of conversational AI within Black African society' and incorporates 'cultural, contextual, and environmental factors.' No dataset drawn from the target population, no cultural annotations, no environment-specific features, and no ablation or validation isolating cultural effects are described anywhere in the manuscript. This renders the societal claim an unsupported assertion rather than a demonstrated property of the model.
[Abstract] Abstract and model description: accuracies 'ranging between 85% and 96%' are stated without any reference to datasets, train/test splits, baselines, error bars, cross-validation procedure, or how cultural factors were measured or controlled. The performance claim therefore lacks any supporting derivation or evidence.
[Model description] Model description: the technical pipeline (3-layer CNN + AFME) is presented as a generic multi-modal architecture for the seven basic emotions and sarcasm. No mechanism is given for incorporating or validating Black African cultural/contextual factors despite the explicit motivation, making the cultural focus load-bearing yet unaddressed.
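To make the missing evaluation details concrete, here is a minimal sketch of a stratified k-fold protocol with a majority-class baseline, reporting mean and standard deviation across folds. It uses only the standard library; the label distribution, the fold count, and the function names are assumptions for illustration and do not reflect the paper's data or procedure.

```python
import random
from collections import Counter, defaultdict
from statistics import mean, stdev

def stratified_folds(labels, k, seed=0):
    """Assign each sample index to one of k folds, spreading every class evenly."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds

def majority_baseline_cv(labels, k=5, seed=0):
    """Per-fold accuracy of always predicting the training split's most common label."""
    scores = []
    for held_out in stratified_folds(labels, k, seed):
        test_idx = set(held_out)
        train = [lab for i, lab in enumerate(labels) if i not in test_idx]
        guess = Counter(train).most_common(1)[0][0]
        scores.append(sum(labels[i] == guess for i in held_out) / len(held_out))
    return scores

# Toy, made-up label distribution: 60 neutral, 25 happy, 15 sarcasm.
labels = ["neutral"] * 60 + ["happy"] * 25 + ["sarcasm"] * 15
scores = majority_baseline_cv(labels, k=5)
print(f"baseline accuracy: {mean(scores):.2f} +/- {stdev(scores):.2f}")
```

A reported accuracy is only informative relative to a floor like this one: on an imbalanced label set, a trivial predictor can already score well above chance.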

minor comments (2)

The manuscript introduces the AFME algorithm but provides neither pseudocode, equations, nor implementation details sufficient for reproduction.
No references to prior culturally aware emotion-recognition datasets or benchmarks are supplied to situate the claimed novelty.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. The feedback highlights important gaps in how the manuscript positions its contributions relative to the evidence provided. We address each point below and will revise the manuscript accordingly to ensure claims are appropriately scoped and supported.

Referee: [Abstract] Abstract: the central motivation and contribution statements assert that the model addresses 'potential challenges in the usage of conversational AI within Black African society' and incorporates 'cultural, contextual, and environmental factors.' No dataset drawn from the target population, no cultural annotations, no environment-specific features, and no ablation or validation isolating cultural effects are described anywhere in the manuscript. This renders the societal claim an unsupported assertion rather than a demonstrated property of the model.

Authors: We agree that the manuscript does not include datasets, annotations, or experiments drawn from Black African populations or that isolate cultural effects. The cultural context serves as the initial motivation for the work but is not demonstrated through specific validation in the current version. We will revise the abstract, introduction, and conclusion to remove or qualify these societal claims and present the work as a general multi-modal emotion detection model.
revision: yes

Referee: [Abstract] Abstract and model description: accuracies 'ranging between 85% and 96%' are stated without any reference to datasets, train/test splits, baselines, error bars, cross-validation procedure, or how cultural factors were measured or controlled. The performance claim therefore lacks any supporting derivation or evidence.

Authors: The reported accuracy range is based on internal experiments, but the manuscript does not provide the required details on datasets, splits, baselines, or validation procedures. We will add a new Experiments section that includes these elements, along with any available error bars or cross-validation information, to substantiate the performance claims.
revision: yes

Referee: [Model description] Model description: the technical pipeline (3-layer CNN + AFME) is presented as a generic multi-modal architecture for the seven basic emotions and sarcasm. No mechanism is given for incorporating or validating Black African cultural/contextual factors despite the explicit motivation, making the cultural focus load-bearing yet unaddressed.

Authors: The described pipeline is a general architecture without explicit mechanisms for cultural or contextual adaptation. We will revise the model description and related sections to clarify that cultural factors are not incorporated in the current implementation and are positioned as motivation for future extensions rather than a demonstrated feature of this work.
revision: yes
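The error bars promised in the second response could be produced in several ways; one common option is a percentile bootstrap over test-set outcomes. The sketch below assumes a list of per-sample correctness flags is available. The function name, the resample count, and the 180-of-200 toy outcome are hypothetical and are not results from the paper.

```python
import random

def bootstrap_accuracy_ci(correct, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for accuracy.

    correct: list of 0/1 flags, one per test sample (1 = prediction was right).
    Returns (point_estimate, lower_bound, upper_bound).
    """
    if not correct:
        raise ValueError("need at least one test sample")
    rng = random.Random(seed)
    n = len(correct)
    # Resample the test set with replacement and recompute accuracy each time.
    stats = sorted(sum(rng.choices(correct, k=n)) / n for _ in range(n_resamples))
    lo = stats[round((alpha / 2) * n_resamples)]
    hi = stats[round((1 - alpha / 2) * n_resamples) - 1]
    return sum(correct) / n, lo, hi

# Toy, made-up outcome: 180 correct predictions out of 200 test samples.
correct = [1] * 180 + [0] * 20
point, lo, hi = bootstrap_accuracy_ci(correct)
print(f"accuracy={point:.3f}, 95% CI=({lo:.3f}, {hi:.3f})")
```

Reporting an interval like this alongside each accuracy figure would show whether a range such as 85% to 96% reflects real differences between conditions or just sampling noise.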

Circularity Check

0 steps flagged

No circularity detected; claims are descriptive assertions without a derivation chain that reduces to inputs.

The provided abstract and description contain no equations, no fitted parameters presented as predictions, no self-citations, and no derivation steps. The model is described as a 3-layer CNN plus new AFME algorithm reporting 85-96% accuracy on seven emotions plus sarcasm, with a stated motivation around Black African cultural factors. However, the absence of any mathematical chain or reduction means there is nothing to inspect for self-definitional equivalence or fitted-input-as-prediction patterns. The mismatch between motivation and technical description is a claim-support issue, not circularity by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

Only the abstract is available, so the ledger is necessarily incomplete; the AFME algorithm is presented as novel without external validation or derivation details.

invented entities (1)

Audio-Frame Mean Expression (AFME) algorithm · no independent evidence
purpose: Processing audio frames to aid emotion detection alongside CNN image processing
Introduced in the abstract as a new component but no independent evidence or derivation is supplied.

pith-pipeline@v0.9.1-grok · 5741 in / 1234 out tokens · 23081 ms · 2026-06-29T08:22:31.850530+00:00 · methodology

