Wiring the 'Why': A Unified Taxonomy and Survey of Abductive Reasoning in LLMs
Pith reviewed 2026-05-10 17:53 UTC · model grok-4.3
The pith
A unified two-stage framework organizes research on abductive reasoning in large language models into hypothesis generation and selection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
This paper presents the first survey of abductive reasoning in LLMs, tracing its trajectory from philosophical foundations to contemporary AI implementations. To address the widespread conceptual confusion and disjointed task definitions prevalent in the field, we establish a unified two-stage definition that formally categorizes prior work. This definition disentangles abduction into Hypothesis Generation, where models bridge epistemic gaps to produce candidate explanations, and Hypothesis Selection, where the generated candidates are evaluated and the most plausible explanation is chosen. Building upon this foundation, we present a comprehensive taxonomy of the literature, categorizing prior work based on their abductive tasks, datasets, underlying methodologies, and evaluation strategies.
What carries the argument
The unified two-stage definition of abductive reasoning, separating Hypothesis Generation from Hypothesis Selection, which structures the taxonomy and benchmark analysis.
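The two-stage split can be sketched as a minimal pipeline. Everything below is illustrative: the function names, the fixed candidate pool, and the word-overlap scorer are toy stand-ins for what would be LLM calls in the surveyed systems, not the paper's implementation.

```python
# Sketch of the survey's two-stage framing of abduction:
# Stage 1 (Hypothesis Generation) proposes candidate explanations for an
# observation; Stage 2 (Hypothesis Selection) scores the candidates and
# keeps the most plausible one. All names and the scorer are hypothetical.

def generate_hypotheses(observation: str) -> list[str]:
    """Stage 1: produce candidate explanations (a fixed pool here;
    an LLM sampling step in practice)."""
    return [
        "the street is wet because it rained overnight",
        "the street is wet because a pipe burst",
        "the street is dry and nothing happened",
    ]

def plausibility(observation: str, hypothesis: str) -> float:
    """Toy plausibility judge: word overlap with the observation.
    Real systems use an LLM judge or a trained verifier instead."""
    obs = set(observation.lower().split())
    hyp = set(hypothesis.lower().split())
    return len(obs & hyp) / max(len(obs), 1)

def select_hypothesis(observation: str, candidates: list[str]) -> str:
    """Stage 2: evaluate candidates and return the most plausible one."""
    return max(candidates, key=lambda h: plausibility(observation, h))

observation = "the street is wet this morning"
best = select_hypothesis(observation, generate_hypotheses(observation))
```

The point of the sketch is that the two stages expose different failure modes: a model can propose no adequate candidate (a generation failure) even when its scoring of given candidates is sound, and vice versa.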
If this is right
- Previous studies on abductive reasoning in LLMs can be systematically categorized by tasks, datasets, methodologies, and evaluation strategies.
- LLMs demonstrate distinct performance patterns when generating candidate explanations versus selecting the most plausible one.
- Abductive reasoning performance relates to deductive and inductive reasoning capabilities, offering broader insights into model reasoning.
- Critical gaps exist in static benchmark designs, narrow domain coverage, limited training frameworks, and insufficient mechanistic understanding.
Where Pith is reading between the lines
- Training methods could be designed to target hypothesis generation and selection as separate skills to enhance overall abductive performance.
- Links between different reasoning types may support development of AI systems capable of multiple inference modes.
- Addressing the noted gaps would require creating benchmarks with wider domains and more dynamic designs.
- Techniques for understanding model internals could be focused on how explanations are formed during abductive tasks.
Load-bearing premise
The proposed two-stage split into Hypothesis Generation and Hypothesis Selection accurately and exhaustively organizes all prior abductive work without omitting important variants or forcing artificial boundaries on existing task definitions.
What would settle it
A significant collection of abductive reasoning research that cannot be classified into either the hypothesis generation stage or the hypothesis selection stage would falsify the completeness of the unified definition.
Figures
read the original abstract
Regardless of its foundational role in human discovery and sense-making, abductive reasoning--the inference of the most plausible explanation for an observation--has been relatively underexplored in Large Language Models (LLMs). Despite the rapid advancement of LLMs, the exploration of abductive reasoning and its diverse facets has thus far been disjointed rather than cohesive. This paper presents the first survey of abductive reasoning in LLMs, tracing its trajectory from philosophical foundations to contemporary AI implementations. To address the widespread conceptual confusion and disjointed task definitions prevalent in the field, we establish a unified two-stage definition that formally categorizes prior work. This definition disentangles abduction into Hypothesis Generation, where models bridge epistemic gaps to produce candidate explanations, and Hypothesis Selection, where the generated candidates are evaluated and the most plausible explanation is chosen. Building upon this foundation, we present a comprehensive taxonomy of the literature, categorizing prior work based on their abductive tasks, datasets, underlying methodologies, and evaluation strategies. In order to ground our framework empirically, we conduct a compact benchmark study of current LLMs on abductive tasks, together with targeted comparative analyses across model sizes, model families, evaluation styles, and the distinct generation-versus-selection task typologies. Moreover, by synthesizing recent empirical results, we examine how LLM performance on abductive reasoning relates to deductive and inductive tasks, providing insights into their broader reasoning capabilities. Our analysis reveals critical gaps in current approaches--from static benchmark design and narrow domain coverage to narrow training frameworks and limited mechanistic understanding of abductive processes...
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims to provide the first comprehensive survey of abductive reasoning in large language models (LLMs). It traces the concept from its philosophical origins to modern AI applications and introduces a unified two-stage definition, consisting of Hypothesis Generation and Hypothesis Selection, to resolve conceptual confusion and disjointed task definitions. It develops a taxonomy organizing the literature by abductive tasks, datasets, methodologies, and evaluation strategies, and performs a compact benchmark study comparing LLMs on these tasks, with analyses across model sizes, families, and task types. Finally, it synthesizes results to relate abductive reasoning to deductive and inductive capabilities while identifying gaps in benchmarks, domains, training, and mechanistic understanding.
Significance. If the proposed taxonomy and two-stage definition prove to be comprehensive and accurate, this work will serve as a foundational reference for standardizing research on abductive reasoning in LLMs. The empirical benchmark, despite being compact, offers valuable comparative insights, and the examination of relations to other reasoning types contributes to understanding broader LLM reasoning. The identification of critical gaps provides clear directions for future work. The synthesis of external literature is a strength, though the empirical component requires more detail for full impact.
minor comments (3)
- [Benchmark Study] The compact benchmark study lacks detailed methods, specific dataset lists, or statistical controls, limiting verifiability of the comparative analyses across model sizes, families, and generation-versus-selection typologies.
- A summary table mapping surveyed works to the two-stage taxonomy categories would improve readability and allow readers to quickly assess coverage.
- [Abstract] The abstract describes 'targeted comparative analyses' and 'synthesizing recent empirical results' without naming the models, metrics, or key quantitative findings, reducing standalone clarity.
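The generation-versus-selection comparison whose missing details the comments above flag can be sketched as a tiny harness. The items, the echo-model stub, and the 0.8 similarity threshold are assumptions for illustration only, not the paper's protocol.

```python
# Hedged sketch of a generation-vs-selection evaluation harness: the same
# items are scored in a selection style (pick from candidates, exact match)
# and a generation style (free-form answer, similarity threshold).
from difflib import SequenceMatcher

ITEMS = [
    {"obs": "grass wet at dawn", "gold": "dew formed overnight",
     "choices": ["dew formed overnight", "a drought began"]},
    {"obs": "alarm rang at 7am", "gold": "the alarm was set for 7am",
     "choices": ["the alarm was set for 7am", "the clock is broken"]},
]

def model_answer(obs, choices=None):
    """Stand-in for an LLM: echoes the first choice (selection style)
    or a canned guess (generation style)."""
    return choices[0] if choices else "dew formed overnight"

def selection_accuracy(items):
    """Selection: the chosen candidate must match the gold explanation."""
    hits = sum(model_answer(it["obs"], it["choices"]) == it["gold"]
               for it in items)
    return hits / len(items)

def generation_accuracy(items, threshold=0.8):
    """Generation: a free-form answer counts if sufficiently similar
    to the gold explanation (threshold is an illustrative choice)."""
    def sim(a, b):
        return SequenceMatcher(None, a, b).ratio()
    hits = sum(sim(model_answer(it["obs"]), it["gold"]) >= threshold
               for it in items)
    return hits / len(items)
```

Reporting both numbers per model, with prompts and thresholds stated, is the kind of detail that would make the comparative analyses verifiable.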
Simulated Author's Rebuttal
We thank the referee for the positive and constructive review, which recognizes the paper's role as a foundational survey and the value of the unified taxonomy, benchmark, and gap analysis. We appreciate the recommendation for minor revision and address the single point raised regarding the empirical component below.
read point-by-point responses
-
Referee: The empirical component requires more detail for full impact.
Authors: We agree that expanding the description of the compact benchmark would strengthen the manuscript. In the revision, we will add: (1) explicit details on task selection criteria and prompt templates used for generation vs. selection stages; (2) full per-model performance tables with standard deviations across runs; (3) a brief error analysis categorizing failure modes by hypothesis quality; and (4) justification for the benchmark's scope relative to the taxonomy. These additions will be placed in an expanded Section 5 without altering the compact nature of the study. revision: yes
Circularity Check
No significant circularity in survey and taxonomy synthesis
full rationale
This paper is a literature survey that proposes a two-stage taxonomy (Hypothesis Generation followed by Hypothesis Selection) to organize existing abductive reasoning work in LLMs. No equations, fitted parameters, predictions, or derivations appear in the provided text or abstract. The central claims rest on synthesis and categorization of external prior literature rather than any internal reduction to the paper's own inputs or self-citations. The two-stage split is presented as an organizing framework, not as a result derived from data or prior self-work within the manuscript. This is the expected non-circular outcome for a survey paper whose contributions are classificatory rather than predictive or deductive.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Sathyanarayanan N. Aakur and Sudeep Sarkar. Leveraging symbolic knowledge bases for commonsense natural language inference using pattern theory. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(11): 13185--13202, 2023. doi:10.1109/TPAMI.2023.3287837
-
[2]
Atocha Aliseda. Abductive Reasoning: Logical Investigations into Discovery and Explanation, volume 330 of Synthese Library. Springer Dordrecht, 2006. ISBN 978-1-4020-3907-2. doi:10.1007/1-4020-3907-7
-
[3]
A. Alkan, Shashwat Sourav, Maja Jabłońska, Simone Astarita, Rishabh Chakrabarty, N. Garuda, P. Khetarpal, Maciej Pi'oro, Dimitrios Tanoglidis, Kartheik G. Iyer, M. Polimera, Michael J. Smith, Tirthankar Ghosal, M. Huertas-Company, Sandor Kruk, Kevin Schawinski, and Ioana Ciucua. A survey on hypothesis generation for scientific discovery in the era of larg...
2025
-
[4]
Advancing abductive reasoning in knowledge graphs through complex logical hypothesis generation
Jiaxin Bai, Yicheng Wang, Tianshi Zheng, Yue Guo, Xin Liu, and Yangqiu Song. Advancing abductive reasoning in knowledge graphs through complex logical hypothesis generation. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.\ 1312--1329, Bangkok, Thailand, 2024. Association for Computati...
-
[5]
Steering large language model activations in sparse spaces
Reza Bayat, Ali Rahimi-Kalahroudi, Mohammad Pezeshki, Sarath Chandar, and Pascal Vincent. Steering large language model activations in sparse spaces. In Proceedings of the Conference on Language Modeling (COLM), 2025
2025
-
[6]
On relationships between induction and abduction: A logical point of view
Brigitte Bessant. On relationships between induction and abduction: A logical point of view. In Peter A. Flach and Antonis C. Kakas (eds.), Abduction and Induction: Essays on their Relation and Integration, volume 18 of Applied Logic Series, pp.\ 77--87. Kluwer Academic Publishers, Dordrecht, 2000. doi:10.1007/978-94-017-0606-3_5
-
[7]
Abductive commonsense reasoning
Chandra Bhagavatula, Ronan Le Bras, Chaitanya Malaviya, Keisuke Sakaguchi, Ari Holtzman, Hannah Rashkin, Doug Downey, Wen-tau Yih, and Yejin Choi. Abductive commonsense reasoning. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=Byg1v1HKDB
2020
-
[8]
A large annotated corpus for learning natural language inference
Samuel R. Bowman, Gabor Angeli, Christopher Potts, and Christopher D. Manning. A large annotated corpus for learning natural language inference. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp.\ 632--642, Lisbon, Portugal, 2015. Association for Computational Linguistics. doi:10.18653/v1/D15-1075. URL https://a...
-
[9]
Markus J. Buehler. In situ graph reasoning and knowledge expansion using graph-preflexor. Advanced Intelligent Discovery, 1(3): e202500006, 2025. doi:10.1002/aidi.202500006. URL https://doi.org/10.1002/aidi.202500006
-
[10]
e-snli: Natural language inference with natural language explanations
Oana-Maria Camburu, Tim Rocktäschel, Thomas Lukasiewicz, and Phil Blunsom. e-snli: Natural language inference with natural language explanations. In Advances in Neural Information Processing Systems, volume 31, 2018. URL https://papers.nips.cc/paper/8163-e-snli-natural-language-inference-with-natural-language-explanations
2018
-
[11]
Daniel G. Campos. On the distinction between Peirce's abduction and Lipton's inference to the best explanation. Synthese, 180: 419--442, 2011. doi:10.1007/s11229-009-9709-3. URL https://doi.org/10.1007/s11229-009-9709-3
-
[12]
Self-consistent narrative prompts on abductive natural language inference
Chunkit Chan, Xin Liu, Tsz Ho Chan, Jiayang Cheng, Yangqiu Song, Ginny Wong, and Simon See. Self-consistent narrative prompts on abductive natural language inference. In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (...
-
[13]
Abductivemllm: Boosting visual abductive reasoning within mllms
Boyu Chang, Qi Wang, Xi Guo, Zhixiong Nan, Yazhou Yao, and Tianfei Zhou. Abductivemllm: Boosting visual abductive reasoning within mllms. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 40, pp.\ 2698--2706, 2026. doi:10.1609/aaai.v40i4.37258. URL https://ojs.aaai.org/index.php/AAAI/article/view/37258
-
[14]
Nuo Chen, Yicheng Tong, Jiaying Wu, Minh Duc Duong, Qian Wang, Qingyun Zou, Bryan Hooi, and Bingsheng He. Beyond brainstorming: What drives high-quality scientific ideas? lessons from multi-agent collaboration. arXiv preprint arXiv:2508.04575, 2025. doi:10.48550/arXiv.2508.04575. URL https://arxiv.org/abs/2508.04575
-
[15]
Black swan: Abductive and defeasible video reasoning in unpredictable events
Aditya Chinchure, Sahithya Ravi, Raymond Ng, Vered Shwartz, Boyang Li, and Leonid Sigal. Black swan: Abductive and defeasible video reasoning in unpredictable events. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.\ 24201--24210, 2025. doi:10.1109/CVPR52734.2025.02254. URL https://blackswan-video.github.io/
-
[16]
On the Measure of Intelligence
François Chollet. On the measure of intelligence, 2019. URL https://arxiv.org/abs/1911.01547
-
[17]
Transformers as soft reasoners over language
Peter Clark, Oyvind Tafjord, and Kyle Richardson. Transformers as soft reasoners over language. In Christian Bessiere (ed.), Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI-20 , pp.\ 3882--3890. International Joint Conferences on Artificial Intelligence Organization, 7 2020. doi:10.24963/ijcai.2020/537. URL...
-
[18]
Towards automated circuit discovery for mechanistic interpretability
Arthur Conmy, Augustine N. Mavor-Parker, Aengus Lynch, Stefan Heimersheim, and Adrià Garriga-Alonso. Towards automated circuit discovery for mechanistic interpretability. In Advances in Neural Information Processing Systems, volume 36, 2023
2023
-
[19]
Inference to the best explanation in large language models
Dhairya Dalal, Marco Valentino, Andre Freitas, and Paul Buitelaar. Inference to the best explanation in large language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.\ 217--235, Bangkok, Thailand, 2024. Association for Computational Linguistics. doi:10.18653/v1/2024.acl-long.1...
-
[20]
True detective: A deep abductive reasoning benchmark undoable for GPT-3 and challenging for GPT-4
Maksym Del and Mark Fishel. True detective: A deep abductive reasoning benchmark undoable for GPT-3 and challenging for GPT-4. In Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023), pp. 314--322, Toronto, Canada, 2023. Association for Computational Linguistics. doi:10.18653/v1/2023.starsem-1.28. URL https://acla...
-
[21]
Abductive Reasoning in Science
Finnur Dellsén. Abductive Reasoning in Science. Elements in Philosophy of Science. Cambridge University Press, June 2024. ISBN 9781009500524. doi:10.1017/9781009353199
-
[22]
C. Delrieux. Abductive inference in defeasible reasoning: a model for research programmes. Journal of Applied Logic, 2(4): 409--437, 2004. doi:10.1016/j.jal.2004.07.003
-
[23]
Assessing the reasoning capabilities of LLMs in the context of evidence-based claim verification
John Dougrez-Lewis, Mahmud Elahi Akhter, Federico Ruggeri, Sebastian Löbbers, Yulan He, and Maria Liakata. Assessing the reasoning capabilities of LLMs in the context of evidence-based claim verification. In Wanxiang Che, Joyce Nabende, Ekaterina Shutova, and Mohammad Taher Pilehvar (eds.), Findings of the Association for Computational Linguistics: A...
-
[24]
Abduction
Igor Douven. Abduction. In Edward N. Zalta and Uri Nodelman (eds.), The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, Winter 2025 edition, 2025
2025
-
[25]
Validation of growing knowledge graphs by abductive text evidences
Jianfeng Du, Jeff Z. Pan, Sylvia Wang, Kunxun Qi, Yuming Shen, and Yu Deng. Validation of growing knowledge graphs by abductive text evidences. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01): 2784--2791, Jul. 2019. doi:10.1609/aaai.v33i01.33012784. URL https://ojs.aaai.org/index.php/AAAI/article/view/4130
-
[26]
e-CARE: a new dataset for exploring explainable causal reasoning
Li Du, Xiao Ding, Kai Xiong, Ting Liu, and Bing Qin. e-CARE: a new dataset for exploring explainable causal reasoning. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 432--446, Dublin, Ireland, 2022. Association for Computational Linguistics. doi:10.18653/v1/2022.acl-long.33. URL h...
-
[27]
Moral stories: Situated reasoning about norms, intents, actions, and their consequences
Denis Emelin, Ronan Le Bras, Jena D. Hwang, Maxwell Forbes, and Yejin Choi. Moral stories: Situated reasoning about norms, intents, actions, and their consequences. In Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih (eds.), Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp.\ 698--718, Onli...
-
[28]
Peter A. Flach and Antonis C. Kakas. Abductive and inductive reasoning: Background and issues. In Peter A. Flach and Antonis C. Kakas (eds.), Abduction and Induction: Essays on their Relation and Integration, volume 18 of Applied Logic Series, pp.\ 1--27. Kluwer Academic Publishers, Dordrecht, 2000. doi:10.1007/978-94-017-0606-3_1
-
[29]
Harry G. Frankfurt. Peirce's notion of abduction. The Journal of Philosophy, 55(14): 593--597, 1958. doi:10.2307/2021966. URL https://doi.org/10.2307/2021966
-
[30]
Leveraging medical knowledge graphs into large language models for diagnosis prediction: Design and application study
Yanjun Gao, Ruizhe Li, Emma Croxford, John Caskey, Brian W Patterson, Matthew Churpek, Timothy Miller, Dmitriy Dligach, and Majid Afshar. Leveraging medical knowledge graphs into large language models for diagnosis prediction: Design and application study. JMIR AI, 4: e58670, February 2025. ISSN 2817-1705. doi:10.2196/58670. URL http://dx.doi.org/10.2196/58670
-
[31]
Unifying deductive and abductive reasoning in knowledge graphs with masked diffusion model
Yisen Gao, Jiaxin Bai, Yi Huang, Xingcheng Fu, Qingyun Sun, and Yangqiu Song. Unifying deductive and abductive reasoning in knowledge graphs with masked diffusion model. In Proceedings of the ACM Web Conference 2026 (WWW '26), Dubai, United Arab Emirates, 2026a. Association for Computing Machinery. doi:10.1145/3774904.3792133. URL https://doi.org/10.114...
-
[32]
Controllable logical hypothesis generation for abductive reasoning in knowledge graphs
Yisen Gao, Jiaxin Bai, Tianshi Zheng, Ziwei Zhang, Qingyun Sun, Xingcheng Fu, Jianxin Li, and Yangqiu Song. Controllable logical hypothesis generation for abductive reasoning in knowledge graphs. In International Conference on Learning Representations, 2026b. URL https://openreview.net/forum?id=oTgJg0M9kY. Poster
2026
-
[33]
The third PASCAL recognizing textual entailment challenge
Danilo Giampiccolo, Bernardo Magnini, Ido Dagan, and Bill Dolan. The third PASCAL recognizing textual entailment challenge. In Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pp.\ 1--9, Prague, 2007. Association for Computational Linguistics. URL https://aclanthology.org/W07-1401/
2007
-
[34]
Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning
Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning. Nature, 2025. doi:10.1038/s41586-025-09422-z. URL https://www.nature.com/articles/s41586-025-09422-z
-
[35]
Whodunit: Evaluation benchmark for culprit detection in mystery stories, 2025
Kshitij Gupta. Whodunit: Evaluation benchmark for culprit detection in mystery stories, 2025. URL https://arxiv.org/abs/2502.07747
-
[36]
Patterns of Discovery: An Inquiry into the Conceptual Foundations of Science
Norwood Russell Hanson. Patterns of Discovery: An Inquiry into the Conceptual Foundations of Science. Cambridge University Press, Cambridge, 1958
1958
-
[37]
The inference to the best explanation
Gilbert Harman. The inference to the best explanation. Philosophical Review, 74(1): 88--95, 1965. doi:10.2307/2183532
- [38]
-
[39]
From reasoning to learning: A survey on hypothesis discovery and rule learning with large language models
Kaiyu He and Zhiyu Chen. From reasoning to learning: A survey on hypothesis discovery and rule learning with large language models. Transactions on Machine Learning Research, 2025. URL https://openreview.net/forum?id=d7W38UzUg0
2025
-
[40]
Gear: A general evaluation framework for abductive reasoning, 2025a
Kaiyu He, Peilin Wu, Mian Zhang, Kun Wan, Wentian Zhao, Xinya Du, and Zhiyu Chen. Gear: A general evaluation framework for abductive reasoning, 2025a. URL https://arxiv.org/abs/2509.24096
-
[41]
Kaiyu He, Mian Zhang, Shuo Yan, Peilin Wu, and Zhiyu Chen. Idea: Enhancing the rule learning ability of large language model agents through induction, deduction, and abduction. In Findings of the Association for Computational Linguistics: ACL 2025, pp. 13563--13597, Vienna, Austria, 2025b. Association for Computational Linguistics. doi:10.18653/v1/2025...
-
[42]
Zhitao He, Pengfei Cao, Yubo Chen, Kang Liu, Ruopeng Li, Mengshu Sun, and Jun Zhao. LEGO : A multi-agent collaborative framework with role-playing and iterative feedback for causality explanation generation. In Findings of the Association for Computational Linguistics: EMNLP 2023, pp.\ 9142--9163, Singapore, 2023. Association for Computational Linguistics...
-
[43]
The abduction of sherlock holmes: A dataset for visual abductive reasoning
Jack Hessel, Jena D. Hwang, Jae Sung Park, Rowan Zellers, Chandra Bhagavatula, Anna Rohrbach, Kate Saenko, and Yejin Choi. The abduction of sherlock holmes: A dataset for visual abductive reasoning. In Computer Vision -- ECCV 2022, volume 13696 of Lecture Notes in Computer Science, pp.\ 558--575. Springer, Cham, 2022. doi:10.1007/978-3-031-20059-5_32. URL...
-
[44]
A implies b: Circuit analysis in llms for propositional logical reasoning
Guan Zhe Hong, Nishanth Dikkala, Enming Luo, Cyrus Rashtchian, Xin Wang, and Rina Panigrahy. A implies b: Circuit analysis in llms for propositional logical reasoning. In Advances in Neural Information Processing Systems, 2025
2025
-
[45]
Shengxin Hong, Liang Xiao, Xin Zhang, and Jianxia Chen. Argmed-agents: Explainable clinical decision reasoning with large language models via argumentation schemes. In 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp.\ 1989--1996. IEEE, 2024. doi:10.1109/BIBM62325.2024.10822109. Also available as arXiv:2403.06294
-
[46]
Disentangling logic: The role of context in large language models' formal reasoning capabilities
Wenyue Hua, Kaijie Zhu, Lingyao Li, Lizhou Fan, Mingyu Jin, Shuhang Lin, Haochen Xue, Zelong Li, Jindong Wang, and Yongfeng Zhang. Disentangling logic: The role of context in large language models' formal reasoning capabilities. In Findings of the Association for Computational Linguistics: ACL 2025, pp.\ 19219--19242, Vienna, Austria, 2025. Association fo...
-
[47]
The relation of Peirce's abduction to inference to the best explanation
Yi Jiang. The relation of Peirce's abduction to inference to the best explanation. Chinese Semiotic Studies, 20(3): 485--496, 2024. doi:10.1515/css-2024-2022
-
[48]
Abduction and argumentation for explainable machine learning: A position survey
A. Kakas and Loizos Michael. Abduction and argumentation for explainable machine learning: A position survey, 2020
2020
-
[49]
Peirce and the autonomy of abductive reasoning
Tomis Kapitan. Peirce and the autonomy of abductive reasoning. Erkenntnis, 37: 1--26, 1992
1992
-
[50]
Minsu Kim and James Thorne. Epistemology of language models: Do language models have holistic knowledge? In Lun-Wei Ku, Andre Martins, and Vivek Srikumar (eds.), Findings of the Association for Computational Linguistics: ACL 2024, pp.\ 12644--12669, Bangkok, Thailand, August 2024. Association for Computational Linguistics. doi:10.18653/v1/2024.findings-ac...
-
[51]
Playgrounds for abstraction and reasoning
Subin Kim, Prin Phunyaphibarn, Donghyun Ahn, and Sundong Kim. Playgrounds for abstraction and reasoning. In NeurIPS 2022 Workshop on Neuro Causal and Symbolic AI (nCSI), 2022. URL https://openreview.net/forum?id=F4RNpByoqP
2022
-
[52]
Atp*: An efficient and scalable method for localizing llm behaviour to components
János Kramár, Tom Lieberum, Rohin Shah, and Neel Nanda. Atp*: An efficient and scalable method for localizing llm behaviour to components. arXiv preprint arXiv:2403.00745, 2024
-
[53]
Multi-modal action chain abductive reasoning (mar)
Mengze Li, Tianbao Wang, Jiahe Xu, Kairong Han, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Shiliang Pu, and Fei Wu. Multi-modal action chain abductive reasoning (mar). In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), pp.\ 4617--4628, Toronto, Canada, 2023. Association for Computational Lingui...
-
[54]
Qingchuan Li, Mingyue Cheng, Zirui Liu, Daoyu Wang, Yuting Zeng, and Tongxuan Liu. From hypothesis to premises: Llm-based backward logical reasoning with selective symbolic translation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(37): 31671--31679, 2026. doi:10.1609/aaai.v40i37.40434. URL https://ojs.aaai.org/index.php/AAAI/arti...
-
[55]
Visual abductive reasoning
Chen Liang, Wenguan Wang, Tianfei Zhou, and Yi Yang. Visual abductive reasoning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.\ 15544--15554, 2022. doi:10.1109/CVPR52688.2022.01512. URL https://openaccess.thecvf.com/content/CVPR2022/html/Liang_Visual_Abductive_Reasoning_CVPR_2022_paper.html
-
[56]
Tian Liang, Zhiwei He, Wenxiang Jiao, Xing Wang, Yan Wang, Rui Wang, Yujiu Yang, Shuming Shi, and Zhaopeng Tu. Encouraging divergent thinking in large language models through multi-agent debate. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pp.\ 17889--17904, Miami, Florida, USA, 2024. Association for Computati...
-
[57]
Let's verify step by step
Hunter Lightman, Vineet Kosaraju, Yuri Burda, Harrison Edwards, Bowen Baker, Teddy Lee, Jan Leike, John Schulman, Ilya Sutskever, and Karl Cobbe. Let's verify step by step. In International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=v8L0pN6EOi. Poster
2024
-
[58]
Shiyin Lin. Abductive inference in retrieval-augmented language models: Generating and validating missing premises, 2025. URL https://arxiv.org/abs/2511.04020
-
[59]
Inference to the Best Explanation
Peter Lipton. Inference to the Best Explanation. Routledge, London, 2nd edition, 2004
2004
-
[60]
Inference to the best explanation
Peter Lipton. Inference to the best explanation. In Stathis Psillos and Martin Curd (eds.), The Routledge Companion to Philosophy of Science, pp.\ 193--202. Routledge, Abingdon, 2008
2008
-
[61]
An incomplete loop: Instruction inference, instruction following, and in-context learning in language models
Emmy Liu, Graham Neubig, and Jacob Andreas. An incomplete loop: Instruction inference, instruction following, and in-context learning in language models. In Conference on Language Modeling, 2024. URL https://openreview.net/forum?id=nUNbjMDBWC
2024
-
[62]
Evaluating the logical reasoning abilities of large reasoning models, 2025
Hanmeng Liu, Yiran Ding, Zhizhang Fu, Chaoli Zhang, Xiaozhang Liu, and Yue Zhang. Evaluating the logical reasoning abilities of large reasoning models, 2025
2025
-
[63]
The magic of IF : Investigating causal reasoning abilities in large language models of code
Xiao Liu, Da Yin, Chen Zhang, Yansong Feng, and Dongyan Zhao. The magic of IF : Investigating causal reasoning abilities in large language models of code. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki (eds.), Findings of the Association for Computational Linguistics: ACL 2023, pp.\ 9009--9022, Toronto, Canada, July 2023. Association for Computati...
-
[64]
Llm discussion: Enhancing the creativity of large language models via discussion framework and role-play
Li-Chun Lu, Shou-Jen Chen, Tsung-Min Pai, Chan-Hung Yu, Hung yi Lee, and Shao-Hua Sun. Llm discussion: Enhancing the creativity of large language models via discussion framework and role-play. In Conference on Language Modeling, 2024. URL https://openreview.net/forum?id=ybaK4asBT2
2024
-
[65]
Man Luo, Shrinidhi Kumbhar, Ming Shen, Mihir Parmar, Neeraj Varshney, Pratyay Banerjee, Somak Aditya, and Chitta Baral. Towards logiglue: A brief survey and a benchmark for analyzing logical reasoning capabilities of language models. arXiv preprint arXiv:2310.00836, 2023. doi:10.48550/arXiv.2310.00836. URL https://arxiv.org/abs/2310.00836
-
[66]
Bill MacCartney and Christopher D. Manning. Natural logic for textual inference. In Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pp.\ 193--200, Prague, 2007. Association for Computational Linguistics. URL https://aclanthology.org/W07-1431/
2007
-
[67]
Toward mechanistic explanation of deductive reasoning in language models
Davide Maltoni and Matteo Ferrara. Toward mechanistic explanation of deductive reasoning in language models. arXiv preprint arXiv:2510.09340, 2025
-
[68]
ER-Reason: A Benchmark Dataset for LLM Clinical Reasoning in the Emergency Room
Nikita Mehandru, Niloufar Golchini, David Bamman, Travis Zack, Melanie F. Molina, and Ahmed Alaa. Er-reason: A benchmark dataset for llm-based clinical reasoning in the emergency room, 2025. URL https://arxiv.org/abs/2505.22919
-
[69]
Locating and editing factual associations in gpt
Kevin Meng, David Bau, Alex Andonian, and Yonatan Belinkov. Locating and editing factual associations in gpt. In Advances in Neural Information Processing Systems, volume 35, 2022. ROME
2022
-
[70]
A System of Logic
John Stuart Mill. A System of Logic. Harper & brothers, New York, 1858
-
[71]
Peirce-suit of truth - why inference to the best explanation and abduction ought not to be confused
Gerhard Minnameier. Peirce-suit of truth - why inference to the best explanation and abduction ought not to be confused. Erkenntnis, 60: 75--105, 2004
2004
-
[72]
Yunxiang Mo, Tianshi Zheng, Qing Zong, Jiayu Liu, Baixuan Xu, Yauwai Yim, Chunkit Chan, Jiaxin Bai, and Yangqiu Song. Dixitworld: Evaluating multimodal abductive reasoning in vision-language models with multi-agent dixit gameplay, 2025. URL https://arxiv.org/abs/2510.10117
-
[73]
Ha Thanh Nguyen, Randy Goebel, Francesca Toni, Kostas Stathis, and Ken Satoh. How well do sota legal reasoning models support abductive reasoning? In Proceedings of the International Conference on Logic Programming 2023 Workshops, volume 3437 of CEUR Workshop Proceedings, London, United Kingdom, 2023. URL https://ceur-ws.org/Vol-3437/paper1LPLR.pdf. Logic...
2023
-
[74]
In-context Learning and Induction Heads
Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova DasSarma, Tom Henighan, Ben Mann, Amanda Askell, Yuntao Bai, Anna Chen, Tom Conerly, Dawn Drain, Deep Ganguli, Zac Hatfield-Dodds, Danny Hernandez, Scott Johnston, Andy Jones, Jackson Kernion, Liane Lovitt, Kamal Ndousse, Dario Amodei, Tom Brown, Jack Clark, Jared Kaplan, Sam McCandlish, a...
-
[75]
From we to me: Theory informed narrative shift with abductive reasoning
Jaikrishna Manojkumar Patil, Divyagna Bavikadi, Kaustuv Mukherji, Ashby Steward-Nolan, Peggy-Jean Allin, Tumininu Awonuga, Joshua Garland, and Paulo Shakarian. From we to me: Theory informed narrative shift with abductive reasoning. arXiv preprint arXiv:2603.03320, 2026. doi:10.48550/arXiv.2603.03320. URL https://arxiv.org/abs/2603.03320
-
[76]
Social commonsense reasoning with multi-head knowledge attention
Debjit Paul and Anette Frank. Social commonsense reasoning with multi-head knowledge attention. In Findings of the Association for Computational Linguistics: EMNLP 2020, pp.\ 2969--2980, Online, 2020. Association for Computational Linguistics. doi:10.18653/v1/2020.findings-emnlp.267. URL https://aclanthology.org/2020.findings-emnlp.267/
-
[77]
Approaches to abductive reasoning: An overview
Gabriele Paul. Approaches to abductive reasoning: An overview. Artificial Intelligence Review, 7: 109--152, 1993. doi:10.1007/BF00849080. URL https://doi.org/10.1007/BF00849080
-
[78]
Collected Papers of Charles Sanders Peirce
Charles Sanders Peirce. Collected Papers of Charles Sanders Peirce. Harvard University Press, Cambridge, MA, 1931--1958. Volumes 1--6 edited by C. Hartshorne and P. Weiss (1931--1935); Volumes 7--8 edited by A.W. Burks (1958)
1931
-
[79]
Abduction as deductive saturation: A proof-theoretic inquiry
Mario Piazza, Gabriele Pulcini, and Andrea Sabatini. Abduction as deductive saturation: A proof-theoretic inquiry. Journal of Philosophical Logic, 52(6): 1575--1602, 2023. doi:10.1007/s10992-023-09718-3
-
[80]
Doing experiments and revising rules with natural language and probabilistic reasoning
Wasu Top Piriyakulkij, Cassidy Langenfeld, Tuan Anh Le, and Kevin Ellis. Doing experiments and revising rules with natural language and probabilistic reasoning. In Advances in Neural Information Processing Systems, 2024. URL https://openreview.net/forum?id=HXdAfK488A. Poster
2024