GiLT augments Transformer language models with semantic dependency graphs by modulating attention, which improves syntactic generalization while keeping perplexity competitive and enabling better fine-tuning on downstream tasks.
Transactions of the Association for Computational Linguistics
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Amortized optimization with policy gradients and graph knowledge selects informative word subsets to explain black-box DLM outputs.
Citing papers explorer
- GiLT: Augmenting Transformer Language Models with Dependency Graphs