Transformer activations show spectral anti-concentration for concepts in the tail while syntax prefers high-variance directions, forming a dual geometry.
arXiv preprint arXiv:2505.13141
3 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: method (1)
polarities: use method (1)
representative citing papers
C-Mining automatically mines high-fidelity Culture Points from raw multilingual text by treating cross-lingual geometric isolation in embeddings as a quantifiable signal for cultural specificity, then uses them to synthesize better instruction data.
SPLIT benchmark finds Gemini-2.5-Flash and LLaMA-3.3-70B degrade in Ukrainian while DeepSeek-V3 stays stable, with weak human-AI agreement on cultural grounding.
citing papers explorer
- Concepts Whisper While Syntax Shouts: Spectral Anti-Concentration and the Dual Geometry of Transformer Representations
  Transformer activations show spectral anti-concentration for concepts in the tail while syntax prefers high-variance directions, forming a dual geometry.
- C-Mining: Unsupervised Discovery of Seeds for Cultural Data Synthesis via Geometric Misalignment
  C-Mining automatically mines high-fidelity Culture Points from raw multilingual text by treating cross-lingual geometric isolation in embeddings as a quantifiable signal for cultural specificity, then uses them to synthesize better instruction data.
- SPLIT: Cross-Lingual Empathy and Cultural Grounding in English and Ukrainian LLM Responses
  SPLIT benchmark finds Gemini-2.5-Flash and LLaMA-3.3-70B degrade in Ukrainian while DeepSeek-V3 stays stable, with weak human-AI agreement on cultural grounding.