Learning Tree Automata with Term Rewriting
Pith reviewed 2026-05-25 06:27 UTC · model grok-4.3
The pith
Supplying a term rewriting system allows the tree automaton learner to infer some membership query answers and reduce total queries asked.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that incorporating a term rewriting system into the learning procedure for tree automata enables deductive inference of membership answers, which lowers query complexity while preserving the algorithm's ability to learn the target language correctly.
What carries the argument
The term rewriting system, whose rules are applied to infer membership of terms in the target language during the learning process.
If this is right
- The number of membership queries decreases when the rewrite system captures equational properties of the language.
- The method works for properties like commutativity of function symbols in tree terms.
- Correctness holds as long as the rewrite system is consistent with the target language.
- Significant query savings are possible for common properties of tree data such as order-irrelevance.
Where Pith is reading between the lines
- This technique might apply to learning other structures by supplying appropriate rewrite systems for their properties.
- Applications in verifying tree-structured programs could benefit from reduced interaction with oracles.
- Future work could explore how to derive suitable rewrite systems automatically from examples.
Load-bearing premise
The provided term rewriting system must be consistent with the target language and its deductive closure must correctly answer the membership queries it is applied to.
What would settle it
If the learner outputs an incorrect automaton after using the rewrite system to answer queries, while the actual language differs on one of those inferred memberships, the approach does not hold.
read the original abstract
We present an extension of the Angluin-style learning algorithm for tree automata that incorporates deductive inference. The learning algorithm is provided with a term rewriting system that specifies properties of the target tree language (e.g., the order of subtrees under a symbol f is irrelevant). This term rewriting system is used to infer answers to some queries, which reduces the query complexity of the learning algorithm. We present examples of rewrite systems that express natural properties of tree-structured data, which yield a significant reduction in the number of queries.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper extends the Angluin-style learning algorithm for tree automata by supplying a term rewriting system (TRS) that encodes properties of the target language (e.g., commutativity). The TRS is used to compute the deductive closure of observed membership answers, allowing the algorithm to infer additional query results and thereby reduce the total number of membership and equivalence queries. The manuscript supplies the necessary definitions of the augmented observation table, the inference procedure, and a proof that the resulting hypothesis is consistent with all observed and inferred answers and that the procedure terminates with a correct automaton whenever the supplied TRS is sound for the unknown target language. Concrete rewrite systems for natural tree-language properties are presented and shown to produce substantial query savings on illustrative examples.
Significance. If the central claim holds, the work demonstrates a practical and sound method for injecting domain knowledge into automata learning via an externally supplied, consistent TRS. This is a natural and standard augmentation of oracle-assisted learning that preserves correctness while lowering query complexity. The explicit proof of consistency and termination under the stated precondition, together with the concrete examples, constitutes a clear contribution to the literature on efficient learning of tree automata.
minor comments (3)
- [§3.2] §3.2: the definition of the deductive closure operator could be accompanied by a short pseudocode fragment showing how membership answers are propagated under a single rewrite step; the current prose description leaves the implementation details implicit.
- [§4] §4, Example 2: the reduction in query count is stated qualitatively; adding a small table that contrasts the number of queries required with and without the TRS on the same target language would make the claimed savings concrete and easier to verify.
- The termination argument relies on the TRS being both sound and terminating; while soundness is stated as a precondition, the manuscript should note explicitly that the supplied TRS is assumed to be terminating (or that the closure computation is performed only up to a bounded depth).
Simulated Author's Rebuttal
We thank the referee for the supportive summary and the recommendation of minor revision. No major comments were provided in the report, so we have no specific points to address point-by-point. We will perform a careful proofreading pass and incorporate any minor editorial suggestions during revision.
Circularity Check
No significant circularity identified
full rationale
The paper presents an extension of Angluin's tree automaton learning algorithm that accepts an externally supplied term rewriting system as input to perform deductive inference on membership queries. The TRS is not derived from the algorithm's outputs or fitted parameters; it is a user-provided assumption whose consistency with the unknown target language is stated as a precondition. The construction augments the standard observation table with deductive closure under the rewrite rules, and the manuscript supplies explicit definitions, the inference procedure, and a termination/correctness argument that holds when the precondition is met. No load-bearing step reduces to a self-definition, a fitted input renamed as prediction, or a self-citation chain; the central claim remains independent of its own results.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The term rewriting system is consistent with the target tree language and its deductive closure yields correct membership answers.
Reference graph
Works this paper leans on
-
[1]
A. V . Aho, M. R. Garey, and J. D. Ullman. The transitive reduction of a directed graph.SIAM Journal on Computing, 1(2):131–137, 1972
work page 1972
-
[2]
Closure properties and decision problems of dag automata.Inf
Siva Anantharaman, Paliath Narendran, and Micha ¨el Rusinowitch. Closure properties and decision problems of dag automata.Inf. Process. Lett., 94(5):231–240, 2005
work page 2005
-
[3]
Dana Angluin. Learning regular sets from queries and counterexamples.Information and computation, 75(2):87– 106, 1987
work page 1987
-
[4]
Unification of concept terms in description logics
Franz Baader and Paliath Narendran. Unification of concept terms in description logics. In Henri Prade, editor, ECAI 1998, pages 331–335. John Wiley and Sons, 1998
work page 1998
-
[5]
Cambridge University Press, 1998
Franz Baader and Tobias Nipkow.Term rewriting and all that. Cambridge University Press, 1998
work page 1998
-
[6]
Beno ˆıt Barbot, Benedikt Bollig, Alain Finkel, Serge Haddad, Igor Khmelnitsky, Martin Leucker, Daniel Neider, Rajarshi Roy, and Lina Ye. Extracting context-free grammars from recurrent neural networks using tree-automata learning and a* search. InICGI 2021, pages 113–129. PMLR, 2021
work page 2021
- [7]
-
[8]
Automata learning and identification of the support of language models
Satwik Bhattamishra, Michael Hahn, and Varun Kanade. Automata learning and identification of the support of language models. InThe Fourteenth International Conference on Learning Representations, 2026
work page 2026
-
[9]
On the regularity and learnability of ordered DAG lan- guages
Henrik Bj ¨orklund, Johanna Bj¨orklund, and Petter Ericson. On the regularity and learnability of ordered DAG lan- guages. In Arnaud Carayol and Cyril Nicaud, editors,CIAA 2017, volume 10329 ofLecture Notes in Computer Science, pages 27–39. Springer, 2017. 15
work page 2017
-
[10]
Automata and logics for unranked and unordered trees
Iovka Boneva and Jean-Marc Talbot. Automata and logics for unranked and unordered trees. In J ¨urgen Giesl, editor,RTA 2005, volume 3467 ofLecture Notes in Computer Science, pages 500–515. Springer, 2005
work page 2005
-
[11]
J ´an ˇCern`y. Pozn ´amka k homog´ennym experimentom s koneˇcn`ymi automatmi.Matematicko-fyzik ´alny ˇcasopis, 14(3):208–216, 1964
work page 1964
-
[12]
Automata on dag representations of finite trees
Witold Charatonik. Automata on dag representations of finite trees. 1999
work page 1999
-
[13]
Hubert Comon, Max Dauchet, R ´emi Gilleron, Florent Jacquemard, Denis Lugiez, Christof L¨oding, Sophie Tison, and Marc Tommasi.Tree Automata Techniques and Applications. 2008
work page 2008
-
[14]
Scalable tree-based register automata learning
Simon Dierl, Paul Fiterau-Brostean, Falk Howar, Bengt Jonsson, Konstantinos Sagonas, and Fredrik T ˚aquist. Scalable tree-based register automata learning. InTACAS 2024, pages 87–108, 2024
work page 2024
-
[15]
Learning a regular tree language from a teacher
Frank Drewes and Johanna H ¨ogberg. Learning a regular tree language from a teacher. In Zolt´an ´Esik and Zolt´an F¨ul¨op, editors,DLT 2003, volume 2710 ofLecture Notes in Computer Science, pages 279–291. Springer, 2003
work page 2003
-
[16]
Query learning of regular tree languages: How to avoid dead states.Theory Comput
Frank Drewes and Johanna H ¨ogberg. Query learning of regular tree languages: How to avoid dead states.Theory Comput. Syst., 40(2):163–185, 2007
work page 2007
-
[17]
Reset sequences for monotonic automata.SIAM J
David Eppstein. Reset sequences for monotonic automata.SIAM J. Comput., 19(3):500–510, 1990
work page 1990
-
[18]
Active automata learning with advice
Michal Fica and Jan Otop. Active automata learning with advice. InECAI 2025, volume 413 ofFrontiers in Artificial Intelligence and Applications, pages 1655–1662. IOS Press, 2025
work page 2025
-
[19]
Active Automata Learning with Adaptive Distinguishing Sequences
Markus Theo Frohme. Active automata learning with adaptive distinguishing sequences.CoRR, abs/1902.01139, 2019
work page internal anchor Pith review Pith/arXiv arXiv 1902
-
[20]
Learning definable hypotheses on trees
´Emilie Grienenberger and Martin Ritzert. Learning definable hypotheses on trees. In Pablo Barcel ´o and Marco Calautti, editors,ICDT 2019, volume 127 ofLIPIcs, pages 24:1–24:18. Schloss Dagstuhl - Leibniz-Zentrum f ¨ur Informatik, 2019
work page 2019
-
[21]
Learning multiplicity tree automata
Amaury Habrard and Jos ´e Oncina. Learning multiplicity tree automata. In Yasubumi Sakakibara, Satoshi Kobayashi, Kengo Sato, Tetsuro Nishino, and Etsuji Tomita, editors,ICGI 2006, volume 4201 ofLecture Notes in Computer Science, pages 268–280. Springer, 2006
work page 2006
-
[22]
The TTT algorithm: A redundancy-free approach to active automata learning
Malte Isberner, Falk Howar, and Bernhard Steffen. The TTT algorithm: A redundancy-free approach to active automata learning. InRV 2014, pages 307–322, 2014
work page 2014
-
[23]
Four one-shot learners for regular tree languages and their polynomial characterizability.Theor
Anna Kasprzik. Four one-shot learners for regular tree languages and their polynomial characterizability.Theor. Comput. Sci., 485:85–106, 2013
work page 2013
-
[24]
Learning tree automata with term rewriting
Jakub Kopystia ´nski and Jan Otop. Learning tree automata with term rewriting. Into appear at IJCAI 2026. ijcai.org, 2026
work page 2026
-
[25]
Learning tree automata with term rewriting: code and data
Jakub Kopystia ´nski and Jan Otop. Learning tree automata with term rewriting: code and data. https://github. com/jotop/LearnDFTAwithTRS, 2026
work page 2026
-
[26]
Information extraction from web documents based on local unranked tree automaton inference
Raymond Kosala, Maurice Bruynooghe, Jan Van den Bussche, and Hendrik Blockeel. Information extraction from web documents based on local unranked tree automaton inference. InIJCAI 2003, pages 403–408. Morgan Kaufmann, 2003
work page 2003
-
[27]
Small test suites for active automata learning
Loes Kruger, Sebastian Junges, and Jurriaan Rot. Small test suites for active automata learning. InTACAS 2024, pages 109–129, 2024
work page 2024
-
[28]
Actively learning el terminologies from large language models
Matteo Magnini, Riccardo Squarcialupi, Martin Tunge Sterri, Ana Ozaki, and Rivera Castillo. Actively learning el terminologies from large language models. InECAI 2025, volume 413 ofFrontiers in Artificial Intelligence and Applications, pages 1792–1799. IOS Press, 2025. 16
work page 2025
-
[29]
Ines Marusic and James Worrell. Complexity of equivalence and learning for multiplicity tree automata.Journal of Machine Learning Research, 16:2465–2500, 2015
work page 2015
-
[30]
An introduction to description logics.Description logic handbook, 1:40, 2003
Daniele Nardi, Ronald J Brachman, et al. An introduction to description logics.Description logic handbook, 1:40, 2003
work page 2003
-
[31]
Learning of structurally unambiguous probabilistic gram- mars
Dolav Nitay, Dana Fisman, and Michal Ziv-Ukelson. Learning of structurally unambiguous probabilistic gram- mars. InAAAI 2021, pages 9170–9178. AAAI Press, 2021
work page 2021
-
[32]
Moeketsi Raselimo and Bernd Fischer. Automatic grammar repair. In Eelco Visser, Dimitris S. Kolovos, and Emma S¨oderberg, editors,SLE 2021, pages 126–142. ACM, 2021
work page 2021
-
[33]
Learning regular tree languages from correction and equivalence queries.J
Catalin Ionut Tirn ˘auc˘a and Cristina Tirn˘auc˘a. Learning regular tree languages from correction and equivalence queries.J. Autom. Lang. Comb., 12(4):501–524, 2007
work page 2007
-
[34]
Craig A. Tovey. A simplified np-complete satisfiability problem.Discret. Appl. Math., 8(1):85–89, 1984
work page 1984
-
[35]
Model learning.Communications of the ACM, 60(2):86–95, 2017
Frits Vaandrager. Model learning.Communications of the ACM, 60(2):86–95, 2017
work page 2017
-
[36]
Vaandrager, Bharat Garhewal, Jurriaan Rot, and Thorsten Wißmann
Frits W. Vaandrager, Bharat Garhewal, Jurriaan Rot, and Thorsten Wißmann. A new approach for active automata learning based on apartness. InTACAS 2022, pages 223–243, 2022
work page 2022
-
[37]
Gerco van Heerdt, Tobias Kapp´e, Jurriaan Rot, and Alexandra Silva. Learning pomset automata. In Stefan Kiefer and Christine Tasson, editors,ETAPS 2021, volume 12650, pages 510–530. Springer, 2021
work page 2021
-
[38]
Witwicki, Matei Zaharia, and Sanjit A
Marcell Vazquez-Chanlatte, Karim Elmaaroufi, Stefan J. Witwicki, Matei Zaharia, and Sanjit A. Seshia.l ∗lm: Learning automata from examples using natural language oracles, 2025
work page 2025
-
[39]
Synthesizing transformations on hierarchically structured data
Navid Yaghmazadeh, Christian Klinger, Isil Dillig, and Swarat Chaudhuri. Synthesizing transformations on hierarchically structured data. In Chandra Krintz and Emery D. Berger, editors,PLDI 2016, pages 508–521. ACM, 2016. 17
work page 2016
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.