Contrastive Analysis of Linguistic Representations in Large Language Model Outputs through Structured Synthetic Data Generation and Abstracted N-gram Associations
Pith reviewed 2026-05-10 05:50 UTC · model grok-4.3
The pith
A framework creates minimal pairs of synthetic texts differing only by social group to detect subtle linguistic biases in language model outputs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The framework generates contextualized data from controlled combinations of situational scenarios and group markers, producing minimal pairs of texts that differ only in the referenced group. Linguistic forms in the generated texts are then abstracted, their disproportionate associations with groups are quantified using a variant of pointwise mutual information, and a fragment-ranking strategy prioritizes segments for expert review. Through this pipeline, the framework discovers linguistic and discursive patterns associated with different social groups in large language model outputs, bridging quantitative detection and qualitative interpretation.
What carries the argument
Minimal pairs of contextualized texts created by combining situational scenarios with group markers, with associations between abstracted linguistic forms and groups quantified by a pointwise mutual information variant and prioritized through fragment ranking.
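Since neither the abstraction procedure nor the exact PMI variant is spelled out in the abstract, the following Python sketch is only one plausible reading of this machinery: surface text is abstracted to lowercased word bigrams, and each (form, group) pair is scored with positive pointwise mutual information over pooled counts. All function and variable names are hypothetical, not the authors' implementation.

```python
# Illustrative sketch only: the paper's exact abstraction step and PMI
# variant are not specified, so this uses lowercased word bigrams and
# positive PMI over counts pooled from the generated texts.
import math
from collections import Counter

def abstract_forms(text, n=2):
    """Abstract a text into lowercased word n-grams (a stand-in for the
    paper's more general linguistic abstraction)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def pmi_associations(texts_by_group, n=2):
    """Score how disproportionately each abstracted form co-occurs with
    each group, using positive PMI."""
    per_group = {group: Counter() for group in texts_by_group}
    for group, texts in texts_by_group.items():
        for text in texts:
            per_group[group].update(abstract_forms(text, n))

    form_totals = Counter()
    for counts in per_group.values():
        form_totals.update(counts)
    grand_total = sum(form_totals.values())

    scores = {}
    for group, counts in per_group.items():
        group_total = sum(counts.values())
        for form, count in counts.items():
            p_joint = count / grand_total
            p_form = form_totals[form] / grand_total
            p_group = group_total / grand_total
            scores[(form, group)] = max(math.log2(p_joint / (p_form * p_group)), 0.0)
    return scores

# Toy usage with two groups and one generated text each.
scores = pmi_associations({
    "group_a": ["she bravely overcame her condition at work"],
    "group_b": ["she finished the quarterly report at work"],
})
```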
If this is right
- Subtle bias expressions can be characterized without relying on pre-determined lists of words or expressions.
- Analysis applies to full contextualized texts rather than isolated words or sentences, across narrative, task-oriented, or dialogic genres.
- Quantitative scores from the statistical associations can be used to rank and surface text fragments for targeted qualitative expert assessment (see the sketch after this list).
- The method supports detection of patterns in any genre by holding narrative conditions constant while varying only group references.
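A minimal sketch of what such fragment ranking could look like, assuming a sliding window over tokens and a dictionary mapping (form, group) pairs to association scores; the windowing and aggregation choices are assumptions, not the paper's procedure.

```python
# Hypothetical fragment-ranking step: slide a window over a generated text
# and rank windows by the summed association scores of the forms they
# contain, so that experts review the highest-signal fragments first.
def rank_fragments(text, group, scores, n=2, window=20, top_k=5):
    """scores maps (form, group) pairs to association values, e.g. the
    output of a PMI-style scoring step."""
    tokens = text.lower().split()
    fragments = []
    for start in range(max(len(tokens) - window + 1, 1)):
        chunk = tokens[start:start + window]
        forms = [" ".join(chunk[i:i + n]) for i in range(len(chunk) - n + 1)]
        signal = sum(scores.get((form, group), 0.0) for form in forms)
        fragments.append((signal, " ".join(chunk)))
    fragments.sort(key=lambda pair: pair[0], reverse=True)
    return fragments[:top_k]
```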
Where Pith is reading between the lines
- The same generation process could be applied directly to outputs from specific language models to compare bias patterns across different systems.
- Patterns identified this way might serve as targets for future model fine-tuning aimed at reducing group-linked linguistic differences.
- Applying the framework to real discourse data from news or social media could test whether synthetic findings match observed patterns outside controlled generation.
Load-bearing premise
Controlled combinations of situational scenarios and group markers can generate minimal pairs of texts that differ only in the referenced group while maintaining comparable narrative conditions without introducing confounding linguistic differences.
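A minimal sketch of the kind of controlled prompt construction this premise assumes: each scenario template carries a single slot that is filled with every group marker, so paired prompts are identical except for the marker. The templates, markers, and helper names below are illustrative, not taken from the paper.

```python
# Illustrative construction of minimal-pair prompts: every scenario template
# has a single slot, and paired prompts are identical except for the marker.
from itertools import product

SCENARIOS = [
    "Write a short story about {person} starting a new job at a bank.",
    "Describe how {person} plans a weekend trip with friends.",
]
GROUP_MARKERS = {
    "group_a": "a wheelchair user",
    "group_b": "a non-disabled person",
}

def minimal_pair_prompts(scenarios, markers):
    """Return {(scenario_index, group_id): prompt} for every combination."""
    return {
        (idx, group_id): template.format(person=marker)
        for (idx, template), (group_id, marker)
        in product(enumerate(scenarios), markers.items())
    }

prompts = minimal_pair_prompts(SCENARIOS, GROUP_MARKERS)
# The LLM outputs generated from these prompts form the contrastive corpus;
# texts sharing a scenario index but differing in group_id are minimal pairs.
```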
What would settle it
If the generated minimal-pair texts for different groups show consistent unintended differences in length, syntactic complexity, or lexical diversity unrelated to the group markers, the contrastive analysis would be invalidated.
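One way to probe that failure mode, sketched under assumptions not made in the paper: compare surface statistics such as length and type-token ratio of the generated texts across groups, and treat systematic gaps as evidence that generation artifacts, rather than group-linked patterns, are driving the associations.

```python
# Hypothetical confound check: compare surface statistics of generated texts
# across groups; systematic gaps in length or lexical diversity that are
# unrelated to the group markers would undermine the contrastive analysis.
from statistics import mean

def surface_profile(texts):
    """Mean length (tokens) and mean type-token ratio for a list of texts."""
    lengths = [len(t.split()) for t in texts]
    ttrs = [len(set(t.lower().split())) / max(len(t.split()), 1) for t in texts]
    return {"mean_length": mean(lengths), "mean_ttr": mean(ttrs)}

def confound_report(texts_by_group):
    """Per-group surface profiles plus each group's drift from the overall mean length."""
    profiles = {g: surface_profile(ts) for g, ts in texts_by_group.items()}
    overall = mean(p["mean_length"] for p in profiles.values())
    for profile in profiles.values():
        profile["length_drift"] = round(profile["mean_length"] / overall - 1.0, 3)
    return profiles
```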
Original abstract
We present a methodological framework to discover linguistic and discursive patterns associated to different social groups through contrastive synthetic text generation and statistical analysis. In contrast with previous approaches, we aim to characterize subtle expressions of bias, instead of diagnosing bias through a pre-determined list of words or expressions. We are also working with contextualized data instead of isolated words or sentences. Our methodology applies to textual productions in any genre, encompassing narrative, task-oriented or dialogic. Contextualized data are generated using controlled combinations of situational scenarios and group markers, creating minimal pairs of texts that differ only in the referenced group while maintaining comparable narrative conditions. To facilitate robust analysis, linguistic forms are generalized and associations between linguistic abstractions and groups are quantified using a variant of pointwise mutual information to detect expressions that appear disproportionately across groups. A fragment-ranking strategy then prioritizes text segments with a high concentration of biased linguistic signals, which allows for experts to assess the harmful potential of linguistic expressions in context, bridging quantitative analysis and qualitative interpretation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a methodological framework for discovering subtle linguistic and discursive patterns associated with different social groups via contrastive synthetic text generation. Controlled combinations of situational scenarios and group markers are used to produce minimal pairs of LLM-generated texts that differ only in the referenced group; linguistic forms are then abstracted (via n-grams), associations are quantified with a PMI variant, and a fragment-ranking procedure prioritizes segments for qualitative expert review. The approach is claimed to apply across genres, avoid reliance on pre-specified bias lexicons, and bridge quantitative detection with contextual interpretation.
Significance. If the generation step successfully isolates group effects and the PMI-based associations prove reliable, the framework could offer a scalable, context-aware alternative to lexicon-based bias detection, with potential utility for auditing LLM outputs and analyzing discourse in computational linguistics and fairness research.
major comments (2)
- [Abstract] Abstract: the central claim that 'controlled combinations of situational scenarios and group markers, creating minimal pairs of texts that differ only in the referenced group while maintaining comparable narrative conditions' is presented without any validation, ablation, or discussion of LLM-induced confounders (e.g., shifts in lexical choice, sentence complexity, or implicit content triggered by demographic markers). Because the subsequent n-gram abstraction and PMI scoring operate directly on these outputs, any generation artifact would be misattributed as a group-associated signal, rendering the pipeline circular.
- [Abstract] Abstract and methodology description: no empirical results, validation experiments, error analysis, or quantitative metrics are supplied to show that the PMI variant or fragment-ranking strategy actually surfaces bias expressions; all claims rest on the method's description alone, leaving the soundness of the approach untested.
minor comments (1)
- [Abstract] Abstract: the precise formulation of the 'variant of pointwise mutual information' and the abstraction procedure for n-grams are not specified, hindering assessment of novelty and reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below and commit to revisions that strengthen the presentation of the methodological framework.
Point-by-point responses
- Referee: [Abstract] Abstract: the central claim that 'controlled combinations of situational scenarios and group markers, creating minimal pairs of texts that differ only in the referenced group while maintaining comparable narrative conditions' is presented without any validation, ablation, or discussion of LLM-induced confounders (e.g., shifts in lexical choice, sentence complexity, or implicit content triggered by demographic markers). Because the subsequent n-gram abstraction and PMI scoring operate directly on these outputs, any generation artifact would be misattributed as a group-associated signal, rendering the pipeline circular.
Authors: We agree that the current manuscript would be improved by explicit discussion of the minimal-pair construction and potential LLM-induced confounders. The intended control relies on identical situational prompts that differ solely in the group marker (a minimal prompt-level verification is sketched after these responses), combined with instructions to preserve narrative structure and length. We acknowledge that LLMs can still introduce unintended variations in lexical choice or complexity. In the revised version we will expand the methodology section with a detailed account of prompt design, a new subsection on possible confounders, and mitigation strategies such as post-generation normalization checks. revision: yes
- Referee: [Abstract] Abstract and methodology description: no empirical results, validation experiments, error analysis, or quantitative metrics are supplied to show that the PMI variant or fragment-ranking strategy actually surfaces bias expressions; all claims rest on the method's description alone, leaving the soundness of the approach untested.
Authors: The manuscript presents the framework as a methodological contribution. We recognize that empirical demonstration of the PMI variant and ranking procedure would strengthen the claims. In the revision we will add a dedicated experimental section that applies the full pipeline to a collection of controlled synthetic scenarios, reports quantitative metrics on association scores, provides concrete examples of ranked fragments, and includes an error analysis of false positives and negatives. revision: yes
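As a concrete form of the prompt-level control described above, the sketch below checks that the two prompts in a pair differ only in the group-marker span; the helper name and the token-level diff criterion are assumptions rather than the authors' procedure.

```python
# Illustrative prompt-level verification: the token sequences of a prompt
# pair should differ by exactly one contiguous span (the group marker).
from difflib import SequenceMatcher

def differs_only_in_marker(prompt_a, prompt_b):
    """True if the two prompts differ by a single contiguous token-level edit."""
    matcher = SequenceMatcher(None, prompt_a.split(), prompt_b.split())
    edits = [op for op in matcher.get_opcodes() if op[0] != "equal"]
    return len(edits) == 1

assert differs_only_in_marker(
    "Write a short story about a blind employee starting a new job.",
    "Write a short story about a sighted employee starting a new job.",
)
```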
Circularity Check
No circularity: standard statistical pipeline on synthetic data
Full rationale
The paper presents a methodological framework that generates contextualized minimal-pair texts via controlled LLM prompts and then applies abstracted n-gram analysis with a PMI variant to quantify group associations. No equations, derivations, or fitted parameters are described that would reduce the output associations to the generation inputs by construction. The approach does not lean on load-bearing self-citations, uniqueness theorems, ansatz smuggling, or the renaming of known results as novel derivations. The central claim remains a standard application of contrastive statistical analysis to generated data, self-contained rather than validated against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Synthetic texts can be generated as minimal pairs differing only in group markers while preserving comparable narrative conditions.