Elisa Celis
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (3)
roles: background (2)
polarities: background (2)
citing papers explorer
- Is She Even Relevant? When BERT Ignores Explicit Gender Cues
A Dutch BERT model encodes gender linearly by epoch 20 but does not dynamically update its representations when explicit female cues contradict learned stereotypical associations in short sentence templates.
- Estimating Grammatical Gender Directions in Contextual Embeddings under Controlled and Natural Contexts
A framework estimates grammatical gender directions in contextual embeddings via controlled and natural contexts, finding unweighted controlled contexts and centroid estimators yield the purest directions.
- LaMDA: Language Models for Dialog Applications
LaMDA shows that fine-tuning on human-value annotations and consulting external knowledge sources significantly improves safety and factual grounding in large dialog models beyond what scaling alone achieves.