Are large language models really good logical reasoners? A comprehensive evaluation and beyond
3 Pith papers cite this work. Polarity classification is still indexing.
citation summary
years: 2026 (3)
roles: background (1)
polarities: support (1)
citing papers explorer
- LGMT: Logic-Grounded Metamorphic Testing for Evaluating the Reasoning Reliability of LLMs
  LGMT applies metamorphic testing derived from first-order logic equivalences to detect reasoning inconsistencies in LLMs that static benchmarks miss.
- Wiring the 'Why': A Unified Taxonomy and Survey of Abductive Reasoning in LLMs
  The paper delivers the first survey of abductive reasoning in LLMs, a unified two-stage taxonomy, a compact benchmark, and an analysis of gaps relative to deductive and inductive reasoning.
- From Intention to Text: AI-Supported Goal Setting in Academic Writing
  WriteFlow is a voice-based AI system that scaffolds metacognitive regulation in academic writing by enabling iterative goal refinement, goal-text alignment, and evaluation of goal fulfillment, as demonstrated in user studies.