Recognition: no theorem link
Minimization of Streaming Transducers
Pith reviewed 2026-05-14 21:42 UTC · model grok-4.3
The pith
Streaming transducers admit minimal models under general criteria that yield effective minimization for string-to-tree variants.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We provide general criteria for the existence of minimal models of streaming transducers, namely devices that read an input word and produce an output value by iteratively updating an internal memory. This abstract model subsumes classical (sub)sequential transducers, streaming string-to-string transducers, polynomial automata, and variants of streaming string-to-tree transducers. We then instantiate these criteria to obtain effective minimization results for variants of the latter model, where outputs are terms constructed incrementally by extending (tuples of) terms either at the leaves or at the roots.
What carries the argument
General criteria for the existence of minimal models of streaming transducers, which are devices that read an input word and produce an output by iteratively updating internal memory.
Load-bearing premise
The abstract streaming transducer model correctly subsumes the cited classical models and the general criteria can be instantiated to yield effective procedures for the string-to-tree variants without hidden restrictions on determinism or output structure.
What would settle it
A specific streaming transducer instance satisfying the general criteria for which no minimal model exists or for which the instantiated minimization procedure fails to produce a smaller equivalent transducer.
Figures
read the original abstract
We provide general criteria for the existence of minimal models of streaming transducers, namely devices that read an input word and produce an output value by iteratively updating an internal memory. This abstract model subsumes classical (sub)sequential transducers (Sch\"utzenberger), streaming string-to-string transducers (Alur-\v{C}ern\'y), polynomial automata (Benedikt et al.), and variants of streaming string-to-tree transducers (Alur-D'Antoni). We then instantiate these criteria to obtain effective minimization results for variants of the latter model, where outputs are terms constructed incrementally by extending (tuples of) terms either at the leaves or at the roots.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces general criteria for the existence of minimal models of streaming transducers, an abstract model that subsumes classical models including sequential transducers (Schützenberger), streaming string-to-string transducers (Alur-Černý), polynomial automata (Benedikt et al.), and variants of streaming string-to-tree transducers (Alur-D'Antoni). It derives these criteria via a lattice-theoretic argument on memory-update functions and instantiates them to obtain effective minimization results for string-to-tree variants where outputs are terms constructed incrementally by extending tuples of terms at the leaves or at the roots.
Significance. If the results hold, the work offers a significant unification of minimization across transducer models in automata theory. Strengths include the explicit embeddings of prior models into the abstract framework, the parameter-free lattice-theoretic derivation of minimality criteria in Section 3, and the effective procedures obtained in Section 5 by direct specialization using finite-state memory restrictions and the inductive structure of output terms.
minor comments (2)
- [§3] §3: The lattice-theoretic argument on memory-update functions would benefit from an explicit small example showing how the partial order is used to identify minimal elements, to improve accessibility for readers unfamiliar with the technique.
- [§5] §5: The termination argument for the leaf- and root-extension cases relies on the finite-state restriction; a brief remark on the precise complexity (e.g., polynomial in the number of states) would strengthen the effectiveness claim.
Simulated Author's Rebuttal
We thank the referee for their positive assessment of our work and for recommending minor revision. We are pleased that the unification of minimization criteria across transducer models, the lattice-theoretic derivation, and the effective procedures for string-to-tree variants were viewed as significant contributions. As no specific major comments were raised in the report, we have no detailed points to address point-by-point at this stage. We will incorporate any minor editorial or presentational improvements in the revised manuscript.
Circularity Check
No significant circularity in the derivation chain
full rationale
The paper develops general criteria for minimal models of streaming transducers via a lattice-theoretic argument on memory-update functions (Section 3), after establishing subsumption of classical models through explicit embeddings of Schützenberger sequential transducers, Alur-Černý streaming string transducers, Benedikt polynomial automata, and Alur-D'Antoni string-to-tree variants. The instantiations for leaf- and root-extension string-to-tree cases (Section 5) follow by direct specialization of those criteria, with effectiveness and termination following from the finite-state memory restriction and inductive term structure; no steps reduce by construction to inputs, fitted parameters, or self-citation chains. All cited prior models are external and independent, and the central claims rest on the abstract lattice argument rather than renaming or smuggling ansatzes.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The abstract streaming transducer model subsumes classical (sub)sequential transducers, streaming string-to-string transducers, polynomial automata, and variants of streaming string-to-tree transducers.
Reference graph
Works this paper leans on
-
[1]
J. Ad´ amek, H. Herrlich, and G.E. Strecker.Abstract and Concrete Categories: The Joy of Cats. John Wiley & Sons, 1990. URL:http://katmat.math.uni-bremen. de/acc
work page 1990
-
[2]
R. Alur and P. Cern´ y. Expressiveness of streaming string transducer. InFSTTCS’10, volume 8 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 1–12. Schloss Dagstuhl - Leibniz-Zentrum f¨ ur Informatik, 2010
work page 2010
-
[3]
R. Alur and L. D’Antoni. Streaming tree transducers.J. ACM, 64(5):31:1–31:55, 2017.doi:10.1145/3092842
-
[4]
R. Alur, L. D’Antoni, J. Deshmukh, M. Raghothaman, and Y. Yuan. Regular func- tions and cost register automata. InLICS’13, pages 13–22. IEEE, 2013
work page 2013
-
[5]
Decision Problems for Additive Regular Functions
R. Alur and M. Raghothaman. Decision problems for additive regular func- tions. InICALP’20, volume 7966 ofLecture Notes in Computer Science, pages 37–48. Springer, 2013. URL:https://arxiv.org/abs/1304.7029,doi:10.1007/ 978-3-642-39212-2_8
work page internal anchor Pith review Pith/arXiv arXiv 2013
-
[6]
D. Angluin. Learning regular sets from queries and counterexamples.Inf. Comput., 75(2):87–106, 1987.doi:10.1016/0890-5401(87)90052-6
- [7]
- [8]
-
[9]
M. Balachander, E. Filiot, R. Gentilini, and N. Tzevelekos. Register automata with permutations. In P. Gawrychowski, F. Mazowiecki, and M. Skrzypczak, editors, MFCS’25, volume 345 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 14:1–14:18. Schloss Dagstuhl - Leibniz-Zentrum f¨ ur Informatik, 2025. URL: 10.4230/LIPIcs.MFCS.2025.14,doi:10...
-
[10]
F. Baschenis, O. Gauwin, A. Muscholl, and G. Puppis. Minimizing resources of sweeping and streaming string transducers. In I. Chatzigiannakis, M. Mitzen- macher, Y. Rabani, and D. Sangiorgi, editors,ICALP’16, volume 55 ofLeib- niz International Proceedings in Informatics (LIPIcs), pages 114:1–114:14. Schloss Dagstuhl – Leibniz-Zentrum f¨ ur Informatik, 20...
-
[11]
A. Baumgartner, T. Kutsia, J. Levy, and M. Villaret. Higher-order pattern anti- unification in linear time.Journal of Automated Reasoning, 58(2):293–310, 2017. doi:10.1007/s10817-016-9383-3
-
[12]
J.P. Bell and D. Smertnig. Noncommutative rational p´ olya series.Selecta Mathemat- ica, 27(3), 2021.doi:10.1007/s00029-021-00629-2
-
[13]
J.P. Bell and D. Smertnig. Computing the linear hull: Deciding deterministic? and unambiguous? for weighted automata over fields. InLICS’23, pages 1–13. IEEE,
-
[14]
URL:https://arxiv.org/abs/2209.02260,doi:10.1109/LICS56636.2023. 10175691
-
[15]
Y.I. Benalioua, N. Lhote, and P.-A. Reynier. Minimizing cost register automata over a field. In R. Kr´ aloviˇ c and A. Kuˇ cera, editors,MFCS’24, volume 306 ofLeibniz In- ternational Proceedings in Informatics (LIPIcs), pages 23:1–23:15. Schloss Dagstuhl – Leibniz-Zentrum f¨ ur Informatik, 2024. URL:https://drops.dagstuhl.de/ entities/document/10.4230/LIP...
-
[16]
Minimizing Streaming String Transducers: An algebraic approach
Y.I. Benalioua, N. Lhote, and P.-A. Reynier. Minimizing streaming string transducers: An algebraic approach, 2026.arXiv:2604.11567
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[17]
M. Benedikt, T. Duff, A. Sharad, and J. Worrell. Polynomial automata: Zeroness and applications. InLICS’17, pages 1–12. IEEE, 2017
work page 2017
-
[18]
M. Benedikt, C. Ley, and G. Puppis. What you must remember when processing data words. InPODS’10, volume 619 ofCEUR Workshop Proceedings. CEUR-WS.org, 2010
work page 2010
-
[19]
A. Boiret, R. Pi´ orkowski, and J. Schmude. Reducing transducer equivalence to register automata problems solved by Hilbert method. InFSTTCS’18, volume 122 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 48:1–48:16. Schloss Dagstuhl - Leibniz-Zentrum f¨ ur Informatik, 2018
work page 2018
-
[20]
M. Boja´ nczyk. Transducers with origin information. InICALP’14, number 8572 in Lecture Notes in Computer Science, pages 26–37. Springer, 2014. 28
work page 2014
-
[21]
M. Boja´ nczyk, B. Klin, and S. Lasota. Automata theory in nominal sets.LMCS, 10(3), 2014
work page 2014
-
[22]
J.W. Carlyle and A. Paz. Realizations by stochastic finite automata.Journal of Computer and System Sciences, 5(1):26–40, 1971
work page 1971
- [23]
- [24]
- [25]
-
[26]
Logics with rigidly guarded data tests
T. Colcombet, C. Ley, and G. Puppis. Logics with rigidly guarded data tests.Logical Methods in Computer Science, http://arxiv.org/abs/1410.2022, 2015
work page internal anchor Pith review Pith/arXiv arXiv 2022
-
[27]
T. Colcombet and D. Petrisan. Automata minimization: a functorial approach.Log. Methods Comput. Sci., 16(1), 2020.doi:10.23638/LMCS-16(1:32)2020
-
[28]
T. Colcombet, D. Petrisan, and R. Stabile. Learning automata and transducers: A categorical approach. In C. Baier and J. Goubault-Larrecq, editors,CSL’21, volume 183 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 15:1–15:17. Schloss Dagstuhl - Leibniz-Zentrum f¨ ur Informatik, 2021.doi:10.4230/LIPICS.CSL. 2021.15
-
[29]
T. Colcombet, S. van Gool, and R. Morvan. First-order separation over count- able ordinals. In P. Bouyer and L. Schr¨ oder, editors,FoSSaCS’20, volume 13242 ofLecture Notes in Computer Science, pages 264–284. Springer, 2022.doi:10.1007/ 978-3-030-99253-8\_14
work page 2022
- [30]
-
[31]
D.A. Cox, J. Little, and D. O’Shea.Ideals, Varieties, and Algorithms. Springer, 4 edition, 2015
work page 2015
-
[32]
K. Culik II and J. Karhum¨ aki. The equivalence of finite valued transducers (on HDT0L languages) is decidable.Theor. Comput. Sci., 47:71–84, 1986
work page 1986
-
[33]
L. Daviaud, P.-A. Reynier, and J.-M. Talbot. A generalised twinning property for minimisation of cost register automata. InLICS’16, pages 857–866. ACM, 2016. doi:10.1145/2933575.2934549
-
[34]
de la Higuera.Grammatical Inference: Learning Automata and Grammars
C. de la Higuera.Grammatical Inference: Learning Automata and Grammars. Cam- bridge University Press, 2010
work page 2010
-
[35]
Eilenberg.Automata, Languages, and Machines
S. Eilenberg.Automata, Languages, and Machines. Volume B. Academic Press, 1976
work page 1976
-
[36]
S. Eilenberg and M.-P. Sch¨ utzenberger. On pseudovarieties.Advances in Mathemat- ics, 19(3):413–418, 1976.doi:10.1016/0001-8708(76)90029-3. 29
-
[37]
J. Eisner. Simpler and more general minimization for weighted finite-state automata. InNAACL-HLT’03, NAACL’03, pages 64–71. Association for Computational Lin- guistics, 2003.doi:10.3115/1073445.1073454
- [38]
-
[39]
E. Filiot, S.N. Krishna, and A. Trivedi. First-order definable string transforma- tions. In V. Raman and S.P. Suresh, editors,FSTTCS’20, volume 29 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 147–159. Schloss Dagstuhl - Leibniz-Zentrum f¨ ur Informatik, 2014.doi:10.4230/LIPICS.FSTTCS.2014.147
-
[40]
M. Fliess. Matrices de hankel.Journal de Math´ ematiques Pures et Appliqu´ ees, 53(9):197–222, 1974. Corrigendum: Journal de Math´ ematiques Pures et Appliqu´ ees, vol. 54, p. 481, 1975
work page 1974
-
[41]
S. Gerdjikov. A general class of monoids supporting canonisation and minimisation of (sub)sequential transducers. In S.T. Klein, C. Mart´ ın-Vide, and D. Shapira, editors, Language and Automata Theory and Applications, pages 143–155. Springer, 2018. doi:10.1007/978-3-319-77313-1_11
-
[42]
J.A. Goguen. Minimal realization of machines in closed categories.Bulletin of the American Mathematical Society, 78(5):777–783, 1972
work page 1972
-
[43]
D. Hilbert. Ueber die theorie der algebraischen formen.Mathematische Annalen, 36(4):473–531, 1890.doi:10.1007/BF01208378
-
[44]
J.E. Hopcroft and J.D. Ullman.Introduction to Automata Theory, Languages and Computation. Addison-Wesley, 1979
work page 1979
-
[45]
E. Hrushovski, J. Ouaknine, A. Pouly, and J. Worrell. On strongest algebraic program invariants.J. ACM, 70(5), 2023.doi:10.1145/3614319
-
[46]
M. Kaminski and N. Francez. Finite-memory automata.Theoretical Computer Sci- ence, 134(2):329–363, 1994.doi:10.1016/0304-3975(94)90242-9
-
[47]
M. Karr. Affine relationships among variables of a program.Acta Informatica, 6(2):157–169, 1976.doi:10.1007/BF00268497
-
[48]
Laurence.Normalisation et apprentissage de transductions d’arbres en mots
G. Laurence.Normalisation et apprentissage de transductions d’arbres en mots. PhD thesis, ´Ecole doctorale Sciences pour l’Ing´ enieur (Lille), 2014. Th` ese de doctorat. URL:https://hal.science/tel-01053084/
work page 2014
-
[49]
A. Maletti. Myhill-Nerode theorem for sequential transducers over unique GCD- monoids. InInternational Conference on Implementation and Application of Au- tomata, pages 323–324. Springer, 2004
work page 2004
-
[50]
A. Martelli and U. Montanari. An efficient unification algorithm.ACM Transactions on Programming Languages and Systems (TOPLAS), 4(2):258–282, 1982.doi:10. 1145/357162.357169
-
[51]
J. Moerman, M. Sammartino, A. Silva, B. Klin, and M. Szynwelski. Learning nominal automata. In G. Castagna and A.D. Gordon, editors,POPL’17, pages 613–625. ACM, 2017.doi:10.1145/3009837.3009879. 30
-
[52]
A. Muscholl and G. Puppis. Equivalence of finite-valued streaming string transducers is decidable. In C. Baier, I. Chatzigiannakis, P. Flocchini, and S. Leonardi, editors, ICALP’19, volume 132 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 122:1–122:15. Schloss Dagstuhl - Leibniz-Zentrum f¨ ur Informatik, 2019.doi: 10.4230/LIPIcs.ICALP...
-
[53]
A. Muscholl and G. Puppis. The many facets of string transducers (invited paper). In STACS’19, volume 126 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 2:1–2:22. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2019
work page 2019
-
[54]
F. Pfenning. Unification and anti-unification in the calculus of constructions. In LICS’91, pages 74–85. IEEE Computer Society, 1991.doi:10.1109/LICS.1991. 151632
-
[55]
G.D. Plotkin. A note on inductive generalization.Machine Intelligence, 5:153–163, 1970
work page 1970
- [56]
-
[57]
J.F. Ritt. Prime and composite polynomials.Transactions of the American Mathe- matical Society, 23(1):51–66, 1922.doi:10.2307/1988911
-
[58]
R.L. Rivest and R.E. Schapire. Inference of finite automata using homing sequences. Information and Computation, 103(2):299–347, 1993.doi:10.1006/inco.1993. 1021
-
[59]
J.A. Robinson. A machine-oriented logic based on the resolution principle.Journal of the ACM, 12(1):23–41, 1965.doi:10.1145/321250.321253
-
[60]
C. Corrales Rodrig´ a˜ nez. A note on Ritt’s theorem on decomposition of polyno- mials.Journal of Pure and Applied Algebra, 68:293–296, 1990.doi:10.1016/ 0022-4049(90)90086-W
work page 1990
-
[61]
O. Saarikivi and M. Veanes. Minimization of symbolic transducers. In R. Majumdar and V. Kuncak, editors,CAV’20, volume 10427 ofLecture Notes in Computer Science, pages 176–196. Springer, 2017.doi:10.1007/978-3-319-63390-9\_10
-
[62]
M.-P. Sch¨ utzenberger. On finite monoids having only trivial subgroups.Information and Control, 8:190–194, 1965
work page 1965
-
[63]
M.P. Sch¨ utzenberger. Sur une variante des fonctions sequentielles.TCS, 4(1):47–57, 1977.doi:10.1016/0304-3975(77)90055-X
-
[64]
H. Seidl, S. Maneth, and G. Kemper. Equivalence of deterministic top-down tree-to- string transducers is decidable.J. ACM, 65(4), 2018.doi:10.1145/3182653
-
[65]
M. Shahbaz and R. Groz. Inferring mealy machines. In A. Cavalcanti and D. Dams, editors,Formal Methods, Second World Congress (FM), volume 5850 of Lecture Notes in Computer Science, pages 207–222. Springer, 2009.doi:10.1007/ 978-3-642-05089-3\_14. 31
work page 2009
-
[66]
J.M. Vilar. Query learning of subsequential transducers. In L. Miclet and C. de la Higuera, editors,Grammatical Inference: Learning Syntax from Sentences, 3rd In- ternational Colloquium, ICGI-96, Montpellier, France, September 25-27, 1996, Pro- ceedings, volume 1147 ofLecture Notes in Computer Science, pages 72–83. Springer, 1996.doi:10.1007/BFB0033343. 3...
-
[67]
sincef;h ′=h;g, we haveh ′(f(⟨R⟩))=g(h(⟨R⟩))
-
[68]
sincegis an inclusion map, we haveg(h(⟨R⟩))=h(⟨R⟩)⊆⟨Q⟩, and hence, by monotonicity, cl(g(h(⟨R⟩)))⊆cl(⟨Q⟩)=⟨Q⟩
-
[69]
sincefis epi, by Lemma 12 we have⟨R ′⟩⊆cl(f(⟨R⟩)). Putting all together and using Lemma 11 to pull out the closure, we get Rng(h′)=h ′(⟨R′⟩)⊆h′(cl(f(⟨R⟩)))⊆cl(h′(f(⟨R⟩)))=cl(g(h(⟨R⟩)))⊆⟨Q⟩. We can now definedash ′with the codomain restricted to⟨Q⟩. This can implemented at the level of specifications, as follows. Given a specification( ˆh′, ˇh′)ofh ′, the ...
-
[70]
This trivially satisfies the first property of the lemma fors=ε
and we letG[ε]be an EGCD ofH[ε,−]andR[ε,−]=G[ε]/H[ε,−]. This trivially satisfies the first property of the lemma fors=ε. For the inductive step, we consider a words∈Σ∗and a lettera∈Σ, and we define first ∂a[s]andR[sa,−], and thenG[sa]. In doing so we assume as inductive hypothesis that G[s]andR[s,−]are already defined. We then define∂ a[s]as an EGCD of th...
-
[71]
Sinceeis epi, by Lemma 12, ˆeis a surjective partial function, and hence for every q′∈QB, the set ˆe−1(q′)is non-empty (this set contains control states ofI, which are words over Σ). Hereafter,sis tacitly assumed to be an arbitrary word from ˆe −1(q′) that induces a non-trivial residual vectorR[s,−]
-
[72]
Sinceeis a morphism fromItoB, we also havef s =ˇe(s)
For everyt∈Σ∗, the outputφ(st)ofB, seen as an update from the singleton domain D▷, factorizes asf s;h t, wheref s =ˇχ▷s(q▷)andh t =ˇχt◁(q′), namely,f s (resp.h t) is the update induced by▷s(resp.t◁) inBstarting fromq ▷(resp.q ′). Sinceeis a morphism fromItoB, we also havef s =ˇe(s)
-
[73]
BecauseBcomputes the transductionφ, we have H[s,−]= (φ(st))t∈Σ∗=f s;V q′
LetV q′=(h t)t∈Σ∗. BecauseBcomputes the transductionφ, we have H[s,−]= (φ(st))t∈Σ∗=f s;V q′. Moreover, by Lemma 20,G[s]is an EGCD ofH[s,−]andR[s,−]is the correspond- ing residual vector. We thus have R[s,−]=G[s]/(f s;V q′)
-
[74]
Now, letg s be an EGCD of the vectorV q′, and recall from Lemma 17 that any two EGCDs ofV q′are isomorphic. In particular, sinceV q′depends only onq ′, for all s, s′∈ˆe−1(q′), there exists an isomorphismi s,s′between the codomains ofg s andg s′ such that gs =g s′;i s,s′
-
[75]
We have H[s,−]=f s;V q′=f s;W s,q′ and similarly fors ′∈ˆe−1(q′)
LetW s,q′=g s/V q′be the residual vector ofV q′via the EGCDg s. We have H[s,−]=f s;V q′=f s;W s,q′ and similarly fors ′∈ˆe−1(q′). The isomorphismi s,s′witnessingg s ≅gs′also induces a≅-equivalence between the corresponding residual vectors: R[s,−] is,s′ ≅R[s′,−]. In particular, all the vectorsR[s,−]fors∈ˆe−1(q′)lie in the same≅-class. We can now specify t...
-
[76]
the closure cl(D 1∪D2)of the union of two constrained domainsD 1, D2,
-
[77]
the closure cl(f(D))of the application of an updatefto a constrained domainD. In the following we shall focus on these two sub-problems, and finally discuss termination of the fixpoint computation. For both sub-problems we shall exploit Lemma 32, which shows that every constrained domain admits an equivalent parametric form computable as the most general ...
-
[78]
the images ofe, seen as occurrences of subterms insideT, are pairwisedisjoint, meaning they are not contained one in another as subterms
-
[79]
A functioneas above is called asubterm embeddingofSintoT, and it is denoted for short bye∶S↪T
the images ofecover all the variables that appear inT. A functioneas above is called asubterm embeddingofSintoT, and it is denoted for short bye∶S↪T. Below is an example: S = { , , } T = ⎧⎪⎪⎨⎪⎪⎩ , ⎫⎪⎪⎬⎪⎪⎭ When we writeS, T, . . .↪U, V, . . .we mean that there are embeddings of every multiset of the left hand-side into every multiset of the right hand-side...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.