3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: background (1)
polarities: background (1)
representative citing papers
LLMs produce overly positive idealized depictions of disability in simulated social media posts that do not match real posts by people with disabilities and show topic bias favoring nondisabled people.
Model developers must address human concerns, preferences, values, and goals with rigor at every stage of the LLM pipeline rather than only in post-training.
citing papers explorer
- C-Mining: Unsupervised Discovery of Seeds for Cultural Data Synthesis via Geometric Misalignment
  C-Mining automatically mines high-fidelity Culture Points from raw multilingual text by treating cross-lingual geometric isolation in embeddings as a quantifiable signal for cultural specificity, then uses them to synthesize better instruction data.
- Shiny Stories, Hidden Struggles: Investigating the Representation of Disability Through the Lens of LLMs
  LLMs produce overly positive idealized depictions of disability in simulated social media posts that do not match real posts by people with disabilities and show topic bias favoring nondisabled people.
- Reflections and New Directions for Human-Centered Large Language Models
  Model developers must address human concerns, preferences, values, and goals with rigor at every stage of the LLM pipeline rather than only in post-training.