Digital Skin, Digital Bias: Uncovering Tone-Based Biases in LLMs and Emoji Embeddings
Pith reviewed 2026-05-10 17:18 UTC · model grok-4.3
The pith
LLMs support skin tone modifiers in emojis more robustly than dedicated embedding models, but both classes exhibit skewed sentiments and inconsistent meanings across skin tones.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central discovery is that dedicated emoji embedding models like emoji2vec and emoji-sw2v show severe deficiencies in handling skin tone modifiers, whereas LLMs such as Llama, Gemma, Qwen, and Mistral provide robust support. However, across both types of models, investigations reveal systemic disparities including skewed sentiment polarities and inconsistent semantic meanings associated with emojis of different skin tones.
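At the input level, "support for skin tone modifiers" has a concrete Unicode meaning, sketched below: a base emoji followed by one of the five Fitzpatrick modifiers (U+1F3FB through U+1F3FF) forms a toned variant, and a model supports the modifier only if it represents each combined sequence rather than silently dropping the modifier codepoint. This is pure Unicode mechanics, not a claim about the paper's exact probing procedure.

```python
# Minimal sketch of Unicode skin-tone variant construction: a base emoji
# plus a Fitzpatrick modifier (U+1F3FB..U+1F3FF) yields one of five
# skin-toned variants -- the input space the paper's evaluations probe.
BASE = "\U0001F44D"  # thumbs-up
FITZPATRICK = [chr(cp) for cp in range(0x1F3FB, 0x1F400)]

variants = [BASE + mod for mod in FITZPATRICK]
for v in variants:
    # each variant is exactly two codepoints: base emoji + tone modifier
    print(v, [hex(ord(c)) for c in v])
```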
What carries the argument
The multi-faceted evaluation framework that examines semantic consistency, representational similarity, sentiment polarity, and core biases in skin-toned emoji representations.
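One plausible way such a framework operationalizes semantic consistency is the mean cosine similarity between a base emoji's vector and each toned variant's vector; the sketch below uses that definition, with hypothetical function names rather than the paper's own.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def tone_consistency(base_vec, tone_vecs):
    """Mean similarity of the toned variants to the base emoji; a low
    score flags tone-dependent drift in meaning (hypothetical metric)."""
    return sum(cosine(base_vec, v) for v in tone_vecs) / len(tone_vecs)

# identical vectors => perfect consistency of 1.0
print(tone_consistency([1.0, 0.0], [[1.0, 0.0]] * 5))
```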
Load-bearing premise
The chosen metrics for semantic consistency, sentiment polarity, and representational similarity accurately measure tone-specific biases without being confounded by other aspects of the models or training data.
What would settle it
A follow-up experiment that applies alternative sentiment classifiers or conducts human judgment studies on the same emoji-skin tone pairs and finds uniform polarity and meanings across tones would disprove the presence of systemic disparities.
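That test can be phrased as a simple check: score the same base emoji under each tone with any sentiment classifier and measure the spread; a spread near zero, replicated across classifiers and human raters, would count against systemic disparity. The classifier is stubbed out below and the scores are illustrative stand-ins, not the paper's measurements.

```python
def polarity_spread(scores_by_tone):
    """Max minus min sentiment polarity across the tone variants of one
    base emoji; near-zero spread means uniform sentiment across tones."""
    vals = list(scores_by_tone.values())
    return max(vals) - min(vals)

# illustrative scores for one base emoji (stand-ins for classifier output)
scores = {"light": 0.62, "medium-light": 0.61, "medium": 0.60,
          "medium-dark": 0.58, "dark": 0.55}
print(round(polarity_spread(scores), 2))  # 0.07
```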
Original abstract
Skin-toned emojis are crucial for fostering personal identity and social inclusion in online communication. As AI models, particularly Large Language Models (LLMs), increasingly mediate interactions on web platforms, the risk that these systems perpetuate societal biases through their representation of such symbols is a significant concern. This paper presents the first large-scale comparative study of bias in skin-toned emoji representations across two distinct model classes. We systematically evaluate dedicated emoji embedding models (emoji2vec, emoji-sw2v) against four modern LLMs (Llama, Gemma, Qwen, and Mistral). Our analysis first reveals a critical performance gap: while LLMs demonstrate robust support for skin tone modifiers, widely-used specialized emoji models exhibit severe deficiencies. More importantly, a multi-faceted investigation into semantic consistency, representational similarity, sentiment polarity, and core biases uncovers systemic disparities. We find evidence of skewed sentiment and inconsistent meanings associated with emojis across different skin tones, highlighting latent biases within these foundational models. Our findings underscore the urgent need for developers and platforms to audit and mitigate these representational harms, ensuring that AI's role on the web promotes genuine equity rather than reinforcing societal biases.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents the first large-scale comparative study of tone-based biases in skin-toned emoji representations between specialized emoji embedding models (emoji2vec and emoji-sw2v) and four LLMs (Llama, Gemma, Qwen, Mistral). It claims that LLMs show robust support for skin tone modifiers while emoji models have severe deficiencies, and through analysis of semantic consistency, representational similarity, sentiment polarity, and biases, finds evidence of skewed sentiment and inconsistent meanings across different skin tones.
Significance. If the findings hold after addressing methodological details, this study would be significant for understanding how AI models may perpetuate societal biases in digital communication symbols. It provides a multi-faceted empirical investigation that could guide mitigation strategies for developers and platforms. The comparative aspect across model classes is a strength, though the absence of controls for training data frequencies limits the ability to isolate tone-specific effects.
major comments (2)
- [Abstract] The abstract asserts evidence of disparities in semantic consistency, sentiment polarity, and core biases but supplies no sample sizes, statistical tests, model versions, or controls; without these details the support for the central claims cannot be evaluated and the multi-faceted investigation remains non-reproducible.
- [Methods/Results] (metrics sections): The analysis of semantic consistency, representational similarity, and sentiment polarity does not include frequency-matched controls or ablations on balanced subsets; the observed skewed sentiment and inconsistent meanings across skin tones may therefore reflect training-data frequency imbalances or base-emoji co-occurrence patterns rather than model-internal tone-based bias, which is load-bearing for the claim of systemic disparities.
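The control the second comment asks for could look like the following sketch: bucket emoji variants by corpus frequency and compare tones only within a bucket, so that frequency imbalance cannot masquerade as tone bias. The bucket edges and example counts are illustrative assumptions, not figures from the paper.

```python
from collections import defaultdict

def bucket_by_frequency(freqs, edges=(10, 100, 1000)):
    """Group items into frequency buckets; tone comparisons are then made
    only within a bucket (a frequency-matched control)."""
    buckets = defaultdict(list)
    for item, f in freqs.items():
        # bucket index = number of edges the frequency meets or exceeds
        buckets[sum(f >= e for e in edges)].append(item)
    return dict(buckets)

freqs = {"thumbsup_light": 5000, "thumbsup_dark": 4800,
         "wave_light": 50, "wave_dark": 8}
print(bucket_by_frequency(freqs))
# the thumbs-up variants share a bucket; the two wave variants do not,
# so a naive wave comparison would be confounded by frequency
```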
minor comments (2)
- [Methods] Specify the exact model versions and parameter counts for the LLMs (e.g., Llama-2-7B vs. Llama-3) and embedding dimensions for emoji2vec to enable direct replication.
- [Methods] Clarify how sentiment polarity is computed (lexicon-based, model-based, or human-annotated) and report inter-annotator agreement if applicable.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback, which highlights important issues of reproducibility and methodological controls. We address each major comment point by point below and describe the corresponding revisions.
Point-by-point responses
- Referee: [Abstract] The abstract asserts evidence of disparities in semantic consistency, sentiment polarity, and core biases but supplies no sample sizes, statistical tests, model versions, or controls; without these details the support for the central claims cannot be evaluated and the multi-faceted investigation remains non-reproducible.
Authors: We agree that the abstract should provide more concrete details to support evaluation and reproducibility. In the revised manuscript we will update the abstract to specify the exact model versions (Llama-3-8B, Gemma-2-9B, Qwen2-7B, Mistral-7B), the number of base emojis and skin-tone variants analyzed (10 base emojis × 5 tones), sample sizes for each metric, and the statistical procedures used (paired t-tests for sentiment, cosine similarity thresholds for consistency, and representational similarity via vector comparisons). revision: yes
- Referee: [Methods/Results] (metrics sections): The analysis of semantic consistency, representational similarity, and sentiment polarity does not include frequency-matched controls or ablations on balanced subsets; the observed skewed sentiment and inconsistent meanings across skin tones may therefore reflect training-data frequency imbalances or base-emoji co-occurrence patterns rather than model-internal tone-based bias, which is load-bearing for the claim of systemic disparities.
Authors: We acknowledge the validity of this concern. The current analyses do not include explicit frequency-matched controls, which limits causal attribution to tone-specific bias versus data frequency. For the two emoji embedding models we will add frequency counts from their training corpora and perform ablations on balanced base-emoji subsets. For the LLMs, detailed token-level training frequencies are not publicly available, so direct matching is not feasible; we will instead add a limitations discussion and proxy analyses of co-occurrence patterns drawn from public web corpora. The revised methods and results sections will present these additions and qualify the interpretation of systemic disparities accordingly. revision: partial
- Not addressed: full frequency-matched controls for the LLMs, because their training-data frequencies are proprietary and inaccessible.
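The paired t-test mentioned in the first response reduces, per tone pair, to the statistic below: differences of matched sentiment scores for the same base emojis under two tones, with n − 1 degrees of freedom. This is a generic sketch of the statistic with illustrative scores, not the authors' code.

```python
import math
import statistics

def paired_t(x, y):
    """Paired t statistic for matched scores, e.g. the same base emojis
    scored under a light versus a dark tone modifier."""
    d = [a - b for a, b in zip(x, y)]
    # t = mean(d) / (sd(d) / sqrt(n)), with n - 1 degrees of freedom
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))

light = [0.62, 0.70, 0.55, 0.66]   # illustrative sentiment scores
dark = [0.55, 0.64, 0.51, 0.60]
print(round(paired_t(light, dark), 2))
```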
Circularity Check
No circularity in empirical model comparison
Full rationale
The paper is a purely empirical study that compares existing LLMs and emoji embedding models on metrics for semantic consistency, representational similarity, sentiment polarity, and bias. It reports observed performance gaps and disparities without any derivation chain, equations, fitted parameters renamed as predictions, or load-bearing self-citations that reduce the central claims to inputs by construction. All steps rely on direct evaluation of pre-trained models against external data, keeping the analysis self-contained.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
- [1] Mistral AI. 2024. Mistral-7B-Instruct-v0.3. https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3. Apache 2.0 License.
- [2] Francesco Barbieri and Jose Camacho-Collados. 2018. How gender and skin tone modifiers affect emoji semantics in Twitter. The Association for Computational Linguistics.
- [3] Natã M. Barbosa and Monchu Chen. 2019. Rehumanized crowdsourcing: A labeling framework addressing bias and ethics in machine learning. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–12.
- [4] Elena Barry, Shoaib Jameel, and Haider Raza. 2021. Emojional: Emoji Embeddings. In UK Workshop on Computational Intelligence. Springer, 312–324.
- [5]
- [6] André Brock. 2011. Beyond the pale: The Blackbird web browser's critical reception. New Media & Society 13, 7 (2011), 1085–1103.
- [7] André Brock. 2011. "When Keeping it Real Goes Wrong": Resident Evil 5, Racial Representation, and Gamers. Games and Culture 6, 5 (2011), 429–452.
- [8] Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan. 2017. Semantics derived automatically from language corpora contain human-like biases. Science 356, 6334 (2017), 183–186.
- [9] Stephen Cave and Kanta Dihal. 2020. The whiteness of AI. Philosophy & Technology 33, 4 (2020), 685–703.
- [10] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).
- [11]
- [12] Yunhe Feng, Zheng Lu, Wenjun Zhou, Zhibo Wang, and Qing Cao. 2020. New emoji requests from Twitter users: when, where, why, and what we can do about them. ACM Transactions on Social Computing 3, 2 (2020), 1–25.
- [13] Thomas B. Fitzpatrick. 1988. The validity and practicality of sun-reactive skin types I through VI. Archives of Dermatology 124, 6 (1988), 869–871.
- [14] Nikhil Garg, Londa Schiebinger, Dan Jurafsky, and James Zou. 2018. Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences 115, 16 (2018), E3635–E3644.
- [15] Dirk Hovy and Shrimai Prabhumoye. 2021. Five sources of bias in natural language processing. Language and Linguistics Compass 15, 8 (2021), e12432.
- [16] Tianran Hu, Han Guo, Hao Sun, Thuy-vy Thi Nguyen, and Jiebo Luo. 2017. Spice up your chat: the intentions and sentiment effects of using emojis. In Eleventh International AAAI Conference on Web and Social Media.
- [17] Linda K. Kaye, Helen J. Wall, and Stephanie A. Malone. 2016. "Turn that frown upside-down": A contextual account of emoticon usage on different virtual platforms. Computers in Human Behavior 60 (2016), 463–467.
- [18] Petra Kralj Novak, Jasmina Smailović, Borut Sluban, and Igor Mozetič. 2015. Sentiment of emojis. PLoS ONE 10, 12 (2015), e0144296.
- [19] Taku Kudo and John Richardson. 2018. SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. arXiv preprint arXiv:1808.06226 (2018).
- [20] Matt Kusner, Yu Sun, Nicholas Kolkin, and Kilian Weinberger. 2015. From word embeddings to document distances. In International Conference on Machine Learning. PMLR, 957–966.
- [21] Meta. 2024. Meta-Llama-3.2-1B-Instruct. https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct. Llama 3.2 Community License.
- [22] Meta. 2024. Meta-Llama-3.2-3B-Instruct. https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct. Llama 3.2 Community License.
- [23] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013).
- [24] Hannah Miller, Daniel Kluver, Jacob Thebault-Spieker, Loren Terveen, and Brent Hecht. 2017. Understanding emoji ambiguity in context: The role of text in emoji-related miscommunication. In Proceedings of the International AAAI Conference on Web and Social Media, Vol. 11. 152–161.
- [25] Kate M. Miltner. 2021. "One part politics, one part technology, one part history": Racial representation in the Unicode 7.0 emoji set. New Media & Society 23, 3 (2021), 515–534.
- [26] Saif Mohammad. 2018. Obtaining reliable human ratings of valence, arousal, and dominance for 20,000 English words. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 174–184.
- [27] Safiya Umoja Noble. 2018. Algorithms of Oppression. New York University Press.
- [28] Arielle Pardes. 2015. The solution to the emoji diversity problem: Make them all yellow. https://www.vice.com/en/article/wd7ejm/emoji-shouldve-made-all-their-characters-yellow-408
- [29] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 1532–1543.
- [30] Caroline Criado Perez. 2019. Invisible Women: Data Bias in a World Designed for Men. Abrams.
- [31] Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever, et al. 2018. Improving language understanding by generative pre-training. (2018).
- [32] Inioluwa Deborah Raji and Joy Buolamwini. 2019. Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial AI products. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society. 429–435.
- [33] Jens Helge Reelfs, Oliver Hohlfeld, Markus Strohmaier, and Niklas Henckell.
- [34]
- [35] Sashank Santhanam, Vidhushini Srinivasan, Shaina Glass, and Samira Shaikh.
- [36]
- [37] Rico Sennrich, Barry Haddow, and Alexandra Birch. 2015. Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909 (2015).
- [38] Aditya Shirsath. 2021. Training Word2Vec model for emojis using Twitter data. https://github.com/AdiShirsath/Emoji_Word2Vec
- [39] Robert Sparrow. 2020. Robotics has a race problem. Science, Technology, & Human Values 45, 3 (2020), 538–560.
- [40] Chris Sweeney and Maryam Najafian. 2019. A transparent framework for evaluating unintended demographic bias in word embeddings. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 1662–1667.
- [41] Gemma Team. 2024. Gemma. doi:10.34740/KAGGLE/M/3301.
- [42] Qwen Team et al. 2024. Qwen2 Technical Report.
- [43] Garreth W. Tigwell and David R. Flatla. 2016. Oh that's what you meant! Reducing emoji misunderstanding. In Proceedings of the 18th International Conference on Human-Computer Interaction with Mobile Devices and Services Adjunct. 859–866.
- [44] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in Neural Information Processing Systems 30 (2017).
- [45] Sarah Wiseman and Sandy J. J. Gould. 2018. Repurposing emoji for personalised communication: Why the pizza emoji means "I love you". In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 1–10.
- [46] J. Zimmerman. 2015. Racially diverse emoji are a nice idea. But will anyone use them? The Guardian (2015).