A Dependency Treebank of Spoken Second Language English
2 Pith papers cite this work. Polarity classification is still indexing.
Pith papers citing it: 2
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Can Reasoning Models Detect Changes to their Chains of Thought?
  Reasoning models detect modifications to their chains of thought with only modest accuracy and cannot reliably identify the nature of those modifications.
- Parser agreement and disagreement in L2 Korean UD: Implications for human-in-the-loop annotation
  Parser agreement between two adapted models serves as a reliable proxy for human correctness in L2 Korean UD annotation, with disagreements clustering in predictable linguistic areas like grammatical relations and clause boundaries.