pith. sign in

arxiv: 2605.07710 · v2 · pith:MGTUWLE6new · submitted 2026-05-08 · 💻 cs.FL

Learning Tree Automata with Term Rewriting

Pith reviewed 2026-05-25 06:27 UTC · model grok-4.3

classification 💻 cs.FL
keywords tree automataAngluin learningterm rewritingmembership queriesquery complexitydeductive inferenceautomata learning
0
0 comments X

The pith

Supplying a term rewriting system allows the tree automaton learner to infer some membership query answers and reduce total queries asked.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends Angluin-style learning for tree automata by allowing a term rewriting system to be provided alongside the usual queries. This system encodes properties of the target language, such as the irrelevance of subtree ordering under certain symbols. The learner uses the rewrite rules to deduce answers to membership queries instead of posing them to the oracle. When the target language satisfies the properties, this approach decreases the number of queries required to identify the automaton. The authors illustrate the reduction with examples of natural properties of tree-structured data.

Core claim

The central claim is that incorporating a term rewriting system into the learning procedure for tree automata enables deductive inference of membership answers, which lowers query complexity while preserving the algorithm's ability to learn the target language correctly.

What carries the argument

The term rewriting system, whose rules are applied to infer membership of terms in the target language during the learning process.

If this is right

  • The number of membership queries decreases when the rewrite system captures equational properties of the language.
  • The method works for properties like commutativity of function symbols in tree terms.
  • Correctness holds as long as the rewrite system is consistent with the target language.
  • Significant query savings are possible for common properties of tree data such as order-irrelevance.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This technique might apply to learning other structures by supplying appropriate rewrite systems for their properties.
  • Applications in verifying tree-structured programs could benefit from reduced interaction with oracles.
  • Future work could explore how to derive suitable rewrite systems automatically from examples.

Load-bearing premise

The provided term rewriting system must be consistent with the target language and its deductive closure must correctly answer the membership queries it is applied to.

What would settle it

If the learner outputs an incorrect automaton after using the rewrite system to answer queries, while the actual language differs on one of those inferred memberships, the approach does not hold.

read the original abstract

We present an extension of the Angluin-style learning algorithm for tree automata that incorporates deductive inference. The learning algorithm is provided with a term rewriting system that specifies properties of the target tree language (e.g., the order of subtrees under a symbol f is irrelevant). This term rewriting system is used to infer answers to some queries, which reduces the query complexity of the learning algorithm. We present examples of rewrite systems that express natural properties of tree-structured data, which yield a significant reduction in the number of queries.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper extends the Angluin-style learning algorithm for tree automata by supplying a term rewriting system (TRS) that encodes properties of the target language (e.g., commutativity). The TRS is used to compute the deductive closure of observed membership answers, allowing the algorithm to infer additional query results and thereby reduce the total number of membership and equivalence queries. The manuscript supplies the necessary definitions of the augmented observation table, the inference procedure, and a proof that the resulting hypothesis is consistent with all observed and inferred answers and that the procedure terminates with a correct automaton whenever the supplied TRS is sound for the unknown target language. Concrete rewrite systems for natural tree-language properties are presented and shown to produce substantial query savings on illustrative examples.

Significance. If the central claim holds, the work demonstrates a practical and sound method for injecting domain knowledge into automata learning via an externally supplied, consistent TRS. This is a natural and standard augmentation of oracle-assisted learning that preserves correctness while lowering query complexity. The explicit proof of consistency and termination under the stated precondition, together with the concrete examples, constitutes a clear contribution to the literature on efficient learning of tree automata.

minor comments (3)
  1. [§3.2] §3.2: the definition of the deductive closure operator could be accompanied by a short pseudocode fragment showing how membership answers are propagated under a single rewrite step; the current prose description leaves the implementation details implicit.
  2. [§4] §4, Example 2: the reduction in query count is stated qualitatively; adding a small table that contrasts the number of queries required with and without the TRS on the same target language would make the claimed savings concrete and easier to verify.
  3. The termination argument relies on the TRS being both sound and terminating; while soundness is stated as a precondition, the manuscript should note explicitly that the supplied TRS is assumed to be terminating (or that the closure computation is performed only up to a bounded depth).

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the supportive summary and the recommendation of minor revision. No major comments were provided in the report, so we have no specific points to address point-by-point. We will perform a careful proofreading pass and incorporate any minor editorial suggestions during revision.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper presents an extension of Angluin's tree automaton learning algorithm that accepts an externally supplied term rewriting system as input to perform deductive inference on membership queries. The TRS is not derived from the algorithm's outputs or fitted parameters; it is a user-provided assumption whose consistency with the unknown target language is stated as a precondition. The construction augments the standard observation table with deductive closure under the rewrite rules, and the manuscript supplies explicit definitions, the inference procedure, and a termination/correctness argument that holds when the precondition is met. No load-bearing step reduces to a self-definition, a fitted input renamed as prediction, or a self-citation chain; the central claim remains independent of its own results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the assumption that any supplied term rewriting system is sound and complete for the properties of the target language; no free parameters or invented entities are mentioned.

axioms (1)
  • domain assumption The term rewriting system is consistent with the target tree language and its deductive closure yields correct membership answers.
    The algorithm uses the system to replace oracle queries; inconsistency would invalidate the inferred answers.

pith-pipeline@v0.9.0 · 5601 in / 1180 out tokens · 47065 ms · 2026-05-25T06:27:05.667476+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

39 extracted references · 39 canonical work pages · 1 internal anchor

  1. [1]

    A. V . Aho, M. R. Garey, and J. D. Ullman. The transitive reduction of a directed graph.SIAM Journal on Computing, 1(2):131–137, 1972

  2. [2]

    Closure properties and decision problems of dag automata.Inf

    Siva Anantharaman, Paliath Narendran, and Micha ¨el Rusinowitch. Closure properties and decision problems of dag automata.Inf. Process. Lett., 94(5):231–240, 2005

  3. [3]

    Learning regular sets from queries and counterexamples.Information and computation, 75(2):87– 106, 1987

    Dana Angluin. Learning regular sets from queries and counterexamples.Information and computation, 75(2):87– 106, 1987

  4. [4]

    Unification of concept terms in description logics

    Franz Baader and Paliath Narendran. Unification of concept terms in description logics. In Henri Prade, editor, ECAI 1998, pages 331–335. John Wiley and Sons, 1998

  5. [5]

    Cambridge University Press, 1998

    Franz Baader and Tobias Nipkow.Term rewriting and all that. Cambridge University Press, 1998

  6. [6]

    Extracting context-free grammars from recurrent neural networks using tree-automata learning and a* search

    Beno ˆıt Barbot, Benedikt Bollig, Alain Finkel, Serge Haddad, Igor Khmelnitsky, Martin Leucker, Daniel Neider, Rajarshi Roy, and Lina Ye. Extracting context-free grammars from recurrent neural networks using tree-automata learning and a* search. InICGI 2021, pages 113–129. PMLR, 2021

  7. [7]

    Berlinkov

    Mikhail V . Berlinkov. Approximating the minimum length of synchronizing words is hard.Theory Comput. Syst., 54(2):211–223, 2014

  8. [8]

    Automata learning and identification of the support of language models

    Satwik Bhattamishra, Michael Hahn, and Varun Kanade. Automata learning and identification of the support of language models. InThe Fourteenth International Conference on Learning Representations, 2026

  9. [9]

    On the regularity and learnability of ordered DAG lan- guages

    Henrik Bj ¨orklund, Johanna Bj¨orklund, and Petter Ericson. On the regularity and learnability of ordered DAG lan- guages. In Arnaud Carayol and Cyril Nicaud, editors,CIAA 2017, volume 10329 ofLecture Notes in Computer Science, pages 27–39. Springer, 2017. 15

  10. [10]

    Automata and logics for unranked and unordered trees

    Iovka Boneva and Jean-Marc Talbot. Automata and logics for unranked and unordered trees. In J ¨urgen Giesl, editor,RTA 2005, volume 3467 ofLecture Notes in Computer Science, pages 500–515. Springer, 2005

  11. [11]

    Pozn ´amka k homog´ennym experimentom s koneˇcn`ymi automatmi.Matematicko-fyzik ´alny ˇcasopis, 14(3):208–216, 1964

    J ´an ˇCern`y. Pozn ´amka k homog´ennym experimentom s koneˇcn`ymi automatmi.Matematicko-fyzik ´alny ˇcasopis, 14(3):208–216, 1964

  12. [12]

    Automata on dag representations of finite trees

    Witold Charatonik. Automata on dag representations of finite trees. 1999

  13. [13]

    Hubert Comon, Max Dauchet, R ´emi Gilleron, Florent Jacquemard, Denis Lugiez, Christof L¨oding, Sophie Tison, and Marc Tommasi.Tree Automata Techniques and Applications. 2008

  14. [14]

    Scalable tree-based register automata learning

    Simon Dierl, Paul Fiterau-Brostean, Falk Howar, Bengt Jonsson, Konstantinos Sagonas, and Fredrik T ˚aquist. Scalable tree-based register automata learning. InTACAS 2024, pages 87–108, 2024

  15. [15]

    Learning a regular tree language from a teacher

    Frank Drewes and Johanna H ¨ogberg. Learning a regular tree language from a teacher. In Zolt´an ´Esik and Zolt´an F¨ul¨op, editors,DLT 2003, volume 2710 ofLecture Notes in Computer Science, pages 279–291. Springer, 2003

  16. [16]

    Query learning of regular tree languages: How to avoid dead states.Theory Comput

    Frank Drewes and Johanna H ¨ogberg. Query learning of regular tree languages: How to avoid dead states.Theory Comput. Syst., 40(2):163–185, 2007

  17. [17]

    Reset sequences for monotonic automata.SIAM J

    David Eppstein. Reset sequences for monotonic automata.SIAM J. Comput., 19(3):500–510, 1990

  18. [18]

    Active automata learning with advice

    Michal Fica and Jan Otop. Active automata learning with advice. InECAI 2025, volume 413 ofFrontiers in Artificial Intelligence and Applications, pages 1655–1662. IOS Press, 2025

  19. [19]

    Active Automata Learning with Adaptive Distinguishing Sequences

    Markus Theo Frohme. Active automata learning with adaptive distinguishing sequences.CoRR, abs/1902.01139, 2019

  20. [20]

    Learning definable hypotheses on trees

    ´Emilie Grienenberger and Martin Ritzert. Learning definable hypotheses on trees. In Pablo Barcel ´o and Marco Calautti, editors,ICDT 2019, volume 127 ofLIPIcs, pages 24:1–24:18. Schloss Dagstuhl - Leibniz-Zentrum f ¨ur Informatik, 2019

  21. [21]

    Learning multiplicity tree automata

    Amaury Habrard and Jos ´e Oncina. Learning multiplicity tree automata. In Yasubumi Sakakibara, Satoshi Kobayashi, Kengo Sato, Tetsuro Nishino, and Etsuji Tomita, editors,ICGI 2006, volume 4201 ofLecture Notes in Computer Science, pages 268–280. Springer, 2006

  22. [22]

    The TTT algorithm: A redundancy-free approach to active automata learning

    Malte Isberner, Falk Howar, and Bernhard Steffen. The TTT algorithm: A redundancy-free approach to active automata learning. InRV 2014, pages 307–322, 2014

  23. [23]

    Four one-shot learners for regular tree languages and their polynomial characterizability.Theor

    Anna Kasprzik. Four one-shot learners for regular tree languages and their polynomial characterizability.Theor. Comput. Sci., 485:85–106, 2013

  24. [24]

    Learning tree automata with term rewriting

    Jakub Kopystia ´nski and Jan Otop. Learning tree automata with term rewriting. Into appear at IJCAI 2026. ijcai.org, 2026

  25. [25]

    Learning tree automata with term rewriting: code and data

    Jakub Kopystia ´nski and Jan Otop. Learning tree automata with term rewriting: code and data. https://github. com/jotop/LearnDFTAwithTRS, 2026

  26. [26]

    Information extraction from web documents based on local unranked tree automaton inference

    Raymond Kosala, Maurice Bruynooghe, Jan Van den Bussche, and Hendrik Blockeel. Information extraction from web documents based on local unranked tree automaton inference. InIJCAI 2003, pages 403–408. Morgan Kaufmann, 2003

  27. [27]

    Small test suites for active automata learning

    Loes Kruger, Sebastian Junges, and Jurriaan Rot. Small test suites for active automata learning. InTACAS 2024, pages 109–129, 2024

  28. [28]

    Actively learning el terminologies from large language models

    Matteo Magnini, Riccardo Squarcialupi, Martin Tunge Sterri, Ana Ozaki, and Rivera Castillo. Actively learning el terminologies from large language models. InECAI 2025, volume 413 ofFrontiers in Artificial Intelligence and Applications, pages 1792–1799. IOS Press, 2025. 16

  29. [29]

    Complexity of equivalence and learning for multiplicity tree automata.Journal of Machine Learning Research, 16:2465–2500, 2015

    Ines Marusic and James Worrell. Complexity of equivalence and learning for multiplicity tree automata.Journal of Machine Learning Research, 16:2465–2500, 2015

  30. [30]

    An introduction to description logics.Description logic handbook, 1:40, 2003

    Daniele Nardi, Ronald J Brachman, et al. An introduction to description logics.Description logic handbook, 1:40, 2003

  31. [31]

    Learning of structurally unambiguous probabilistic gram- mars

    Dolav Nitay, Dana Fisman, and Michal Ziv-Ukelson. Learning of structurally unambiguous probabilistic gram- mars. InAAAI 2021, pages 9170–9178. AAAI Press, 2021

  32. [32]

    Automatic grammar repair

    Moeketsi Raselimo and Bernd Fischer. Automatic grammar repair. In Eelco Visser, Dimitris S. Kolovos, and Emma S¨oderberg, editors,SLE 2021, pages 126–142. ACM, 2021

  33. [33]

    Learning regular tree languages from correction and equivalence queries.J

    Catalin Ionut Tirn ˘auc˘a and Cristina Tirn˘auc˘a. Learning regular tree languages from correction and equivalence queries.J. Autom. Lang. Comb., 12(4):501–524, 2007

  34. [34]

    Craig A. Tovey. A simplified np-complete satisfiability problem.Discret. Appl. Math., 8(1):85–89, 1984

  35. [35]

    Model learning.Communications of the ACM, 60(2):86–95, 2017

    Frits Vaandrager. Model learning.Communications of the ACM, 60(2):86–95, 2017

  36. [36]

    Vaandrager, Bharat Garhewal, Jurriaan Rot, and Thorsten Wißmann

    Frits W. Vaandrager, Bharat Garhewal, Jurriaan Rot, and Thorsten Wißmann. A new approach for active automata learning based on apartness. InTACAS 2022, pages 223–243, 2022

  37. [37]

    Learning pomset automata

    Gerco van Heerdt, Tobias Kapp´e, Jurriaan Rot, and Alexandra Silva. Learning pomset automata. In Stefan Kiefer and Christine Tasson, editors,ETAPS 2021, volume 12650, pages 510–530. Springer, 2021

  38. [38]

    Witwicki, Matei Zaharia, and Sanjit A

    Marcell Vazquez-Chanlatte, Karim Elmaaroufi, Stefan J. Witwicki, Matei Zaharia, and Sanjit A. Seshia.l ∗lm: Learning automata from examples using natural language oracles, 2025

  39. [39]

    Synthesizing transformations on hierarchically structured data

    Navid Yaghmazadeh, Christian Klinger, Isil Dillig, and Swarat Chaudhuri. Synthesizing transformations on hierarchically structured data. In Chandra Krintz and Emery D. Berger, editors,PLDI 2016, pages 508–521. ACM, 2016. 17