Implicitly Learning to Reason in First-Order Logic

Brendan Juba; Vaishak Belle

arxiv: 1906.10106 · v1 · pith:BU6E7EM4new · submitted 2019-06-24 · 💻 cs.AI

Implicitly Learning to Reason in First-Order Logic

Vaishak Belle , Brendan Juba This is my paper

Pith reviewed 2026-05-25 17:14 UTC · model grok-4.3

classification 💻 cs.AI

keywords first-order logicPAC semanticsimplicit learnabilityquery answeringbackground knowledgesymmetries of constantsrelational theoriesmachine learning

0 comments

The pith

Symmetries among constants let first-order logic be learned implicitly from examples for query answering over infinite domains.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to establish that first-order logic background knowledge, partly given explicitly and partly as examples from a fixed distribution, supports robust query answering even for countably infinite domains. It achieves this by exploiting symmetries among constants and generalizing implicit learnability inside the PAC semantics framework. A sympathetic reader would care because earlier theoretical results had ruled out learning relational theories in this generality, while first-order logic is widely viewed as the right language for representing knowledge. The approach requires only minimal assumptions on the distribution and moves past the propositional limit that has constrained prior learning-to-reason work.

Core claim

We present a new theoretical approach to robustly learning to reason in first-order logic, and consider universally quantified clauses over a countably infinite domain. Our results exploit symmetries exhibited by constants in the language, and generalize the notion of implicit learnability to show how queries can be computed against (implicitly) learned first-order background knowledge.

What carries the argument

Symmetries exhibited by constants in the language, which generalize implicit learnability so that queries can be answered against learned first-order background knowledge under PAC semantics.

If this is right

Queries about first-order formulas can be answered using a mix of explicit formulas and implicitly learned background knowledge.
Generalization from finite samples succeeds for countably infinite domains when constant symmetries are present.
PAC semantics supplies a model-theoretic guarantee for first-order reasoning that requires few distributional assumptions.
The propositional limitation of earlier robust learning-to-reason results is overcome for universally quantified clauses.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If constant symmetries are common in practice, learning systems could handle open domains without enumerating every constant in advance.
The same symmetry argument might extend to other relational languages such as description logics.
One could test the claim by measuring query accuracy when training examples are drawn from a distribution that breaks constant symmetries.

Load-bearing premise

The probability distribution over examples must permit the symmetries of constants to enable generalization from finite samples to countably infinite domains under PAC semantics.

What would settle it

A concrete distribution over examples in which permuting constants produces incorrect query answers on new constants outside the training sample.

read the original abstract

We consider the problem of answering queries about formulas of first-order logic based on background knowledge partially represented explicitly as other formulas, and partially represented as examples independently drawn from a fixed probability distribution. PAC semantics, introduced by Valiant, is one rigorous, general proposal for learning to reason in formal languages: although weaker than classical entailment, it allows for a powerful model theoretic framework for answering queries while requiring minimal assumptions about the form of the distribution in question. To date, however, the most significant limitation of that approach, and more generally most machine learning approaches with robustness guarantees, is that the logical language is ultimately essentially propositional, with finitely many atoms. Indeed, the theoretical findings on the learning of relational theories in such generality have been resoundingly negative. This is despite the fact that first-order logic is widely argued to be most appropriate for representing human knowledge. In this work, we present a new theoretical approach to robustly learning to reason in first-order logic, and consider universally quantified clauses over a countably infinite domain. Our results exploit symmetries exhibited by constants in the language, and generalize the notion of implicit learnability to show how queries can be computed against (implicitly) learned first-order background knowledge.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper sketches a symmetry-based lift of implicit PAC learning to first-order clauses over countably infinite domains, but leaves the required condition on the example distribution unstated.

read the letter

The central new move is to take the propositional implicit-learnability results and extend them to universally quantified clauses in first-order logic whose domain is countably infinite. The authors do this by pointing to symmetries among the constants and claiming that these let finite samples support query answering against the learned background knowledge under PAC semantics. That direction addresses a real gap: earlier work had largely ruled out robust learning for relational theories in full generality, and first-order logic is the language people actually want for complex knowledge. Keeping the minimal distributional assumptions of PAC is also the right framing if the goal is robustness rather than classical entailment. The abstract is clear on the target setting and on what prior limitation is being removed. The soft spot is exactly the one the stress-test flags. The argument rests on the distribution D inducing enough symmetry on the constants so that finite samples yield uniform convergence over all groundings. No explicit condition on D is given in the abstract—no invariance under cofinite permutations, no finite-support restriction, nothing. Without that, an arbitrary D can put vanishing mass on all but finitely many constants and the symmetry claim collapses. The full paper must supply the precise assumption and show it is sufficient; otherwise the PAC guarantee for the infinite domain is unsupported. This is for people working on the intersection of PAC learning and logical reasoning who need to know whether the symmetry trick actually closes the gap. A reader who already knows the propositional results will get value from seeing how the infinite-domain case is handled, if the proofs are there. I would send it to peer review so the derivations and the exact statement of the distribution condition can be checked.

Referee Report

1 major / 1 minor

Summary. The paper proposes a theoretical approach to answering queries about first-order logic formulas when background knowledge is represented partly by explicit formulas and partly by examples drawn from a fixed distribution. It uses PAC semantics to learn universally quantified clauses over a countably infinite domain by exploiting symmetries among constants and generalizing the notion of implicit learnability.

Significance. If the central results hold, the work would be significant for extending PAC-based reasoning beyond propositional logic to first-order logic, a setting where prior theoretical results have been largely negative. The generalization of implicit learnability via constant symmetries offers a potential route to handling infinite domains with finite samples, provided the required uniformity conditions are established.

major comments (1)

[Abstract] Abstract (results paragraph): the claim that symmetries among constants permit PAC learning of universally quantified clauses over a countably infinite domain lacks an explicit characterization of the distribution D that would guarantee the necessary uniformity of convergence over all groundings. An arbitrary D can break the symmetry argument by concentrating mass on finitely many constants.

minor comments (1)

The abstract refers to 'our results' and 'theoretical findings' without including any derivations, proofs, or counterexample checks in the provided text; the full manuscript should place these in the main body with numbered equations or theorems.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the positive evaluation of the work's significance and for the constructive comment on the abstract. We address the point below.

read point-by-point responses

Referee: [Abstract] Abstract (results paragraph): the claim that symmetries among constants permit PAC learning of universally quantified clauses over a countably infinite domain lacks an explicit characterization of the distribution D that would guarantee the necessary uniformity of convergence over all groundings. An arbitrary D can break the symmetry argument by concentrating mass on finitely many constants.

Authors: We agree that the abstract would benefit from an explicit characterization of D. While the body of the paper (via the symmetry-based generalization of implicit learnability) establishes that the distribution must support uniformity over groundings to make the symmetry argument hold, an arbitrary D concentrating on finitely many constants could indeed invalidate the result. We will revise the abstract's results paragraph to state that D is required to be symmetric with respect to the constants (assigning positive mass to infinitely many constants in a permutation-invariant manner). This is a clarification of existing assumptions rather than an alteration of the theorems. revision: yes

Circularity Check

0 steps flagged

No circularity; derivation builds on external PAC semantics and constant symmetries without self-referential reduction

full rationale

The paper presents a generalization of implicit learnability to first-order logic over countably infinite domains by exploiting constant symmetries within the PAC framework introduced by Valiant. No equations, fitted parameters, or predictions are shown that reduce by construction to inputs. The central claim rests on an external model-theoretic framework rather than self-citations or ansatzes from the authors' prior work. The symmetry argument is introduced as a new theoretical approach without load-bearing self-referential definitions or uniqueness theorems imported from the same authors. The derivation is therefore self-contained against the stated assumptions on the distribution and language.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only abstract available; no explicit free parameters, axioms, or invented entities can be extracted. Ledger remains empty pending full text.

pith-pipeline@v0.9.0 · 5737 in / 969 out tokens · 20553 ms · 2026-05-25T17:14:07.447449+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages · 1 internal anchor

[1]

[Belle, 2017] V . Belle. Open-universe weighted model counting. In AAAI,

work page 2017
[2]

Billingsley

[Billingsley, 1995] P . Billingsley. Probability and Measure . Wiley-Interscience, 3 edition, April

work page 1995
[3]

The learnability of description logics with equalit y constraints

[Cohen and Hirsh, 1994 ] William W Cohen and Haym Hirsh. The learnability of description logics with equalit y constraints. Machine Learning, 17(2-3):169–199,

work page 1994
[4]

Polynomial learnability and inductive logic pro- gramming: Methods and results

[Cohen and Page, 1995 ] William W Cohen and C David Page. Polynomial learnability and inductive logic pro- gramming: Methods and results. New Generation Com- puting, 13(3-4):369–409,

work page 1995
[5]

Complexity theoretic limitations on learning dnf’s

[Daniely and Shalev-Shwartz, 2016 ] Amit Daniely and Shai Shalev-Shwartz. Complexity theoretic limitations on learning dnf’s. In COLT, pages 815–830,

work page 2016
[6]

De Raedt and S

[De Raedt and Dˇ zeroski, 1994] L. De Raedt and S. Dˇ zeroski. First-order jk-clausal theories are P AC-learnable. Artiﬁ- cial Intelligence, 70(1):375–392,

work page 1994
[7]

De Giacomo, Y

[Giacomo et al., 2011] G. De Giacomo, Y . Lesp´ erance, and H. J. Levesque. E ﬃcient reasoning in proper knowledge bases with unknown individuals. In IJCAI, pages 827–832,

work page 2011
[8]

Learning implicitly in reasoning in PAC-Semantics

[Juba, 2012] Brendan Juba. Learning implicitly in reasoning in pac-semantics. arXiv preprint arXiv:1209.0056 ,

work page internal anchor Pith review Pith/arXiv arXiv 2012
[9]

Implicit learning of common sense for reasoning

[Juba, 2013] Brendan Juba. Implicit learning of common sense for reasoning. In IJCAI, pages 939–946,

work page 2013
[10]

Toward e ﬃcient agnostic learning

[Kearns et al., 1994] Michael J Kearns, Robert E Schapire, and Linda M Sellie. Toward e ﬃcient agnostic learning. Machine Learning, 17(2-3):115–141,

work page 1994
[11]

Kersting, S

[Kersting et al., 2011] K. Kersting, S. Natarajan, and D. Poole. Statistical relational AI: Logic, probability an d computation

work page 2011
[12]

Learning to reason with a restricted view

[Khardon and Roth, 1999 ] Roni Khardon and Dan Roth. Learning to reason with a restricted view. Machine Learn- ing, 35(2):95–116,

work page 1999
[13]

Lakemeyer and H

[Lakemeyer and Levesque, 2002 ] G. Lakemeyer and H. J. Levesque. Evaluation-based reasoning with disjunctive in- formation in ﬁrst-order knowledge bases. In Proc. KR , pages 73–81,

work page 2002
[14]

[Levesque and Lakemeyer, 2001] H. J. Levesque and G. Lakemeyer. The logic of knowledge bases . The MIT Press,

work page 2001
[15]

Levesque

[Levesque, 1984] H.J. Levesque. A logic of implicit and ex- plicit belief. In Proceedings of the F ourth National Confer- ence on Artiﬁcial Intelligence , pages 198–202. American Association for Artiﬁcial Intelligence,

work page 1984
[16]

[Levesque, 1998] H. J. Levesque. A completeness result for reasoning with incomplete ﬁrst-order knowledge bases. In Proc. KR, pages 14–23,

work page 1998
[17]

Levesque

[Liu and Levesque, 2005 ] Y ongmei Liu and Hector J. Levesque. Tractable reasoning in ﬁrst-order knowledge bases with disjunctive information. In Proc. AAAI, pages 639–644,

work page 2005
[18]

Levesque

[Liu et al., 2004] Y ongmei Liu, Gerhard Lakemeyer, and Hector J. Levesque. A logic of limited belief for reason- ing with disjunctive information. In KR, pages 587–597,

work page 2004
[19]

McCarthy and P

[McCarthy and Hayes, 1969 ] J. McCarthy and P . J. Hayes. Some philosophical problems from the standpoint of artiﬁ- cial intelligence. In Machine Intelligence, pages 463–502,

work page 1969
[20]

Reading between the lines

[Michael, 2009] Loizos Michael. Reading between the lines. In IJCAI,

work page 2009
[21]

Partial observability and learnability

[Michael, 2010] Loizos Michael. Partial observability and learnability. Artiﬁcial Intelligence , 174(11):639–669,

work page 2010
[22]

[Moore, 1982] Robert C. Moore. The role of logic in knowl- edge representation and commonsense reasoning. In AAAI, pages 428–433,

work page 1982
[23]

Muggleton and L

[Muggleton and De Raedt, 1994 ] S. Muggleton and L. De Raedt. Inductive logic programming: Theory and methods. The Journal of Logic Programming , 19:629–679,

work page 1994
[24]

Richardson and P

[Richardson and Domingos, 2006 ] M. Richardson and P . Domingos. Markov logic networks. Machine learning, 62(1):107–136,

work page 2006
[25]

Srivastava, S

[Srivastava et al., 2014] S. Srivastava, S. J. Russell, P . Ruan, and X. Cheng. First-order open-universe pomdps. In UAI, pages 742–751,

work page 2014
[26]

Robust logics

[V aliant, 2000] Leslie G V aliant. Robust logics. Artiﬁcial Intelligence, 117(2):231–253, 2000

work page 2000

[1] [1]

[Belle, 2017] V . Belle. Open-universe weighted model counting. In AAAI,

work page 2017

[2] [2]

Billingsley

[Billingsley, 1995] P . Billingsley. Probability and Measure . Wiley-Interscience, 3 edition, April

work page 1995

[3] [3]

The learnability of description logics with equalit y constraints

[Cohen and Hirsh, 1994 ] William W Cohen and Haym Hirsh. The learnability of description logics with equalit y constraints. Machine Learning, 17(2-3):169–199,

work page 1994

[4] [4]

Polynomial learnability and inductive logic pro- gramming: Methods and results

[Cohen and Page, 1995 ] William W Cohen and C David Page. Polynomial learnability and inductive logic pro- gramming: Methods and results. New Generation Com- puting, 13(3-4):369–409,

work page 1995

[5] [5]

Complexity theoretic limitations on learning dnf’s

[Daniely and Shalev-Shwartz, 2016 ] Amit Daniely and Shai Shalev-Shwartz. Complexity theoretic limitations on learning dnf’s. In COLT, pages 815–830,

work page 2016

[6] [6]

De Raedt and S

[De Raedt and Dˇ zeroski, 1994] L. De Raedt and S. Dˇ zeroski. First-order jk-clausal theories are P AC-learnable. Artiﬁ- cial Intelligence, 70(1):375–392,

work page 1994

[7] [7]

De Giacomo, Y

[Giacomo et al., 2011] G. De Giacomo, Y . Lesp´ erance, and H. J. Levesque. E ﬃcient reasoning in proper knowledge bases with unknown individuals. In IJCAI, pages 827–832,

work page 2011

[8] [8]

Learning implicitly in reasoning in PAC-Semantics

[Juba, 2012] Brendan Juba. Learning implicitly in reasoning in pac-semantics. arXiv preprint arXiv:1209.0056 ,

work page internal anchor Pith review Pith/arXiv arXiv 2012

[9] [9]

Implicit learning of common sense for reasoning

[Juba, 2013] Brendan Juba. Implicit learning of common sense for reasoning. In IJCAI, pages 939–946,

work page 2013

[10] [10]

Toward e ﬃcient agnostic learning

[Kearns et al., 1994] Michael J Kearns, Robert E Schapire, and Linda M Sellie. Toward e ﬃcient agnostic learning. Machine Learning, 17(2-3):115–141,

work page 1994

[11] [11]

Kersting, S

[Kersting et al., 2011] K. Kersting, S. Natarajan, and D. Poole. Statistical relational AI: Logic, probability an d computation

work page 2011

[12] [12]

Learning to reason with a restricted view

[Khardon and Roth, 1999 ] Roni Khardon and Dan Roth. Learning to reason with a restricted view. Machine Learn- ing, 35(2):95–116,

work page 1999

[13] [13]

Lakemeyer and H

[Lakemeyer and Levesque, 2002 ] G. Lakemeyer and H. J. Levesque. Evaluation-based reasoning with disjunctive in- formation in ﬁrst-order knowledge bases. In Proc. KR , pages 73–81,

work page 2002

[14] [14]

[Levesque and Lakemeyer, 2001] H. J. Levesque and G. Lakemeyer. The logic of knowledge bases . The MIT Press,

work page 2001

[15] [15]

Levesque

[Levesque, 1984] H.J. Levesque. A logic of implicit and ex- plicit belief. In Proceedings of the F ourth National Confer- ence on Artiﬁcial Intelligence , pages 198–202. American Association for Artiﬁcial Intelligence,

work page 1984

[16] [16]

[Levesque, 1998] H. J. Levesque. A completeness result for reasoning with incomplete ﬁrst-order knowledge bases. In Proc. KR, pages 14–23,

work page 1998

[17] [17]

Levesque

[Liu and Levesque, 2005 ] Y ongmei Liu and Hector J. Levesque. Tractable reasoning in ﬁrst-order knowledge bases with disjunctive information. In Proc. AAAI, pages 639–644,

work page 2005

[18] [18]

Levesque

[Liu et al., 2004] Y ongmei Liu, Gerhard Lakemeyer, and Hector J. Levesque. A logic of limited belief for reason- ing with disjunctive information. In KR, pages 587–597,

work page 2004

[19] [19]

McCarthy and P

[McCarthy and Hayes, 1969 ] J. McCarthy and P . J. Hayes. Some philosophical problems from the standpoint of artiﬁ- cial intelligence. In Machine Intelligence, pages 463–502,

work page 1969

[20] [20]

Reading between the lines

[Michael, 2009] Loizos Michael. Reading between the lines. In IJCAI,

work page 2009

[21] [21]

Partial observability and learnability

[Michael, 2010] Loizos Michael. Partial observability and learnability. Artiﬁcial Intelligence , 174(11):639–669,

work page 2010

[22] [22]

[Moore, 1982] Robert C. Moore. The role of logic in knowl- edge representation and commonsense reasoning. In AAAI, pages 428–433,

work page 1982

[23] [23]

Muggleton and L

[Muggleton and De Raedt, 1994 ] S. Muggleton and L. De Raedt. Inductive logic programming: Theory and methods. The Journal of Logic Programming , 19:629–679,

work page 1994

[24] [24]

Richardson and P

[Richardson and Domingos, 2006 ] M. Richardson and P . Domingos. Markov logic networks. Machine learning, 62(1):107–136,

work page 2006

[25] [25]

Srivastava, S

[Srivastava et al., 2014] S. Srivastava, S. J. Russell, P . Ruan, and X. Cheng. First-order open-universe pomdps. In UAI, pages 742–751,

work page 2014

[26] [26]

Robust logics

[V aliant, 2000] Leslie G V aliant. Robust logics. Artiﬁcial Intelligence, 117(2):231–253, 2000

work page 2000