Interventions in LLM-simulated user experiments induce distribution shifts in latent attributes that create confounding bias, diagnosable with negative control outcomes and partially mitigated by adding setting-relevant persona details.
Personas with Attitudes: Controlling LLM s for Diverse Data Annotation
9 Pith papers cite this work. Polarity classification is still indexing.
years
2026 9representative citing papers
Reward models for LLMs frequently select socially undesirable options across four social domains, show no overall best performer, and exhibit a bias-avoidance versus context-sensitivity trade-off.
Cross-lingual transfer and language-specific data efforts are interdependent and complementary for effective low-resource NLP, as demonstrated through Luxembourgish case studies and synthesis.
LLM-generated ML pipelines show higher bias (87.7% sensitive attributes) than conditional statements (59.2%), indicating that simple if-statement tests underestimate bias risk in practical code generation.
A feature-based decision tree with parsing-derived signals and heuristics detects LLM-generated code in a lightweight, CPU-only setup for SemEval-2026 Task 13.
Finetuning Qwen3-32B with data augmentation and self-training achieves competitive 8th-place ranking on SemEval-2026 conspiracy detection.
Finetuning LLMs with QLoRA and multilingual data augmentation for polarization detection, type, and manifestation in SemEval-2026 Task 9.
Fine-tuning LLMs by adapting the mdok approach produces competitive results on binary detection, source attribution, and hybrid/adversarial code identification in SemEval-2026 Task 13.
citing papers explorer
-
The Illusion of Intervention: Your LLM-Simulated Experiment is an Observational Study
Interventions in LLM-simulated user experiments induce distribution shifts in latent attributes that create confounding bias, diagnosable with negative control outcomes and partially mitigated by adding setting-relevant persona details.
-
Misaligned by Reward: Socially Undesirable Preferences in LLMs
Reward models for LLMs frequently select socially undesirable options across four social domains, show no overall best performer, and exhibit a bias-avoidance versus context-sensitivity trade-off.
-
Why Low-Resource NLP Needs More Than Cross-Lingual Transfer: Lessons Learned from Luxembourgish
Cross-lingual transfer and language-specific data efforts are interdependent and complementary for effective low-resource NLP, as demonstrated through Luxembourgish case studies and synthesis.
-
From If-Statements to ML Pipelines: Revisiting Bias in Code-Generation
LLM-generated ML pipelines show higher bias (87.7% sensitive attributes) than conditional statements (59.2%), indicating that simple if-statement tests underestimate bias risk in practical code generation.
-
FMI_SU_Yotkova_Kastreva at SemEval-2026 Task 13: Lightweight Detection of LLM-Generated Code via Stylometric Signals
A feature-based decision tree with parsing-derived signals and heuristics detects LLM-generated code in a lightweight, CPU-only setup for SemEval-2026 Task 13.
-
mdok-style at SemEval-2026 Task 10: Finetuning LLMs for Conspiracy Detection
Finetuning Qwen3-32B with data augmentation and self-training achieves competitive 8th-place ranking on SemEval-2026 conspiracy detection.
-
mdok-style at SemEval-2026 Task 9: Finetuning LLMs for Multilingual Polarization Detection
Finetuning LLMs with QLoRA and multilingual data augmentation for polarization detection, type, and manifestation in SemEval-2026 Task 9.
-
mcdok at SemEval-2026 Task 13: Finetuning LLMs for Detection of Machine-Generated Code
Fine-tuning LLMs by adapting the mdok approach produces competitive results on binary detection, source attribution, and hybrid/adversarial code identification in SemEval-2026 Task 13.
- Psychologically Potent, Computationally Invisible: LLMs Generate Social-Comparison-Eliciting Posts They Fail to Detect