Multi-Step Deductive Reasoning Over Natural Language: An Empirical Study on Out-of-Distribution Generalisation

Alex Yuxuan Peng; Jiamou Liu; Michael Witbrock; Neset Tan; Qiming Bao; Tim Hartill; Zhenyun Deng

arxiv: 2207.14000 · v4 · pith:B5NLBOXNnew · submitted 2022-07-28 · 💻 cs.CL · cs.AI· cs.LG· cs.LO

Multi-Step Deductive Reasoning Over Natural Language: An Empirical Study on Out-of-Distribution Generalisation

Qiming Bao , Alex Yuxuan Peng , Tim Hartill , Neset Tan , Zhenyun Deng , Michael Witbrock , Jiamou Liu This is my paper

classification 💻 cs.CL cs.AIcs.LGcs.LO

keywords reasoningmodelattentiondeeplogicmulti-stepconceptrulesdatasetsdeeper

0 comments

read the original abstract

Combining deep learning with symbolic logic reasoning aims to capitalize on the success of both fields and is drawing increasing attention. Inspired by DeepLogic, an end-to-end model trained to perform inference on logic programs, we introduce IMA-GloVe-GA, an iterative neural inference network for multi-step reasoning expressed in natural language. In our model, reasoning is performed using an iterative memory neural network based on RNN with a gated attention mechanism. We evaluate IMA-GloVe-GA on three datasets: PARARULES, CONCEPTRULES V1 and CONCEPTRULES V2. Experimental results show DeepLogic with gated attention can achieve higher test accuracy than DeepLogic and other RNN baseline models. Our model achieves better out-of-distribution generalisation than RoBERTa-Large when the rules have been shuffled. Furthermore, to address the issue of unbalanced distribution of reasoning depths in the current multi-step reasoning datasets, we develop PARARULE-Plus, a large dataset with more examples that require deeper reasoning steps. Experimental results show that the addition of PARARULE-Plus can increase the model's performance on examples requiring deeper reasoning depths. The source code and data are available at https://github.com/Strong-AI-Lab/Multi-Step-Deductive-Reasoning-Over-Natural-Language.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

PrologMCP: A Standardized Prolog Tool Interface for LLM Agents
cs.AI 2026-06 conditional novelty 6.0

PrologMCP is a standardized MCP server for Prolog that lets LLM agents delegate inference, achieving near-perfect accuracy on PARARULE-Plus subsets where reasoning LLMs drop to 0.94-0.95.
Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models
cs.AI 2025-03 unverdicted novelty 5.0

The paper unifies perspectives on Long CoT in reasoning LLMs by introducing a taxonomy, detailing characteristics of deep reasoning and reflection, and discussing emergence phenomena and future directions.