pith. machine review for the scientific record. sign in

cs.FL

Formal Languages and Automata Theory

Covers automata theory, formal language theory, grammars, and combinatorics on words. This roughly corresponds to ACM Subject Classes F.1.1, and F.4.3. Papers dealing with computational complexity should go to cs.CC; papers dealing with logic should go to cs.LO.

0
cs.FL 2026-05-14 Recognition

Conventions enable composition of policies for multiple omega-regular goals

Decoupled Planning for Multiple Omega-Regular Objectives

Büchi needs no scheduler info while parity requires knowing which agent is active to ensure all goals are met.

Figure from the paper full image
abstract click to expand
We study the problem of generating paths on a graph that satisfy a collection of {\omega}-regular objectives. We propose a decoupled framework in which each objective is assigned to an independent agent that selects a local policy, while a scheduler -- oblivious to the graph and objective -- dynamically composes these policies into a single path. We ask when such a composition satisfies all objectives, assuming their conjunction is realizable. The framework enables modular policy design but raises fundamental compositional challenges. We show that even extremely fair deterministic schedulers do not ensure correctness, and that stochastic schedulers, while necessary, are insufficient without coordination. For safety objectives, we demonstrate that fully decentralized implementations are impossible, and we introduce a protocol for synchronizing on maximal safe actions. For non-safety objectives, we introduce conventions -- simple, a priori restrictions agreed upon before the graph or objectives are revealed -- that guarantee satisfaction of all objectives when followed by all agents. We characterize minimally restrictive conventions for major subclasses of {\omega}-regular objectives. In particular, B\"uchi objectives admit universal composition of finite-memory policies without scheduler communication; co-B\"uchi objectives require only knowledge of whether the agent was scheduled; and parity objectives additionally require knowledge of which agent was scheduled.
0
0
cs.FL 2026-05-13 Recognition

Flattening lets QuAK analyze nested quantitative automata

Extending QuAK with Nested Quantitative Automata

Reductions preserve threshold answers so existing procedures check properties with unbounded values like average response time.

Figure from the paper full image
abstract click to expand
Quantitative automata (QAs) extend finite-state omega-automata with weighted transitions to specify quantitative system properties. However, their finite weight sets rule out properties like average response time, where response times can be arbitrarily large. Nested quantitative automata (NQAs) overcome this limitation: a parent automaton spawns child automata to compute unbounded values over finite infixes and aggregates them into a final result. Despite this expressiveness, NQAs have lacked practical tool support to date. We address this gap by extending the Quantitative Automata Kit (QuAK), a software tool for QA analysis, to support NQAs. Our core contribution is implementing a suite of flattening procedures that reduce NQAs to QAs, leveraging QuAK's existing decision procedures. These reductions preserve the answers to threshold decision problems, while allowing users to specify properties in the more expressive NQA formalism. The tool handles all combinations of parent aggregators, including limits and averages, and child functions, extrema and monotonic or bounded summations, for which emptiness and universality are known to be decidable. Experiments on response-time and resource-consumption benchmarks demonstrate QuAK's effectiveness.
0
0
cs.FL 2026-05-13 Recognition

Syntactic obligations translate to MTBDD automata for fast synthesis

Fast Obligation Translation and Synthesis

Minimal deterministic weak automata are built directly as diagrams, enabling on-the-fly controller synthesis without full state expansion.

Figure from the paper full image
abstract click to expand
Syntactic obligations are a fragment of LTL formulas that translate to deterministic weak $\omega$-automata (DWA). We show that syntactic obligations can be very efficiently converted to minimal DWA represented using multi-terminal binary decision diagrams (MTBDDs), and that synthesis of such specifications can be solved directly on the MTBDD representation on the fly. Our implementation in Spot shows substantial runtime improvements in translation and synthesis.
0
0
cs.FL 2026-05-13 2 theorems

Sentence-interface types identify bounded-fan-out linear MCFGs from positive data

Finite Sentence-Interface Control for Learning Bounded-Fan-Out Linear MCFGs under Fixed Monoid Typing

Fixed monoid typing plus tuple substitutability yield polynomial-time exact reconstruction once a characteristic sample appears.

Figure from the paper full image
abstract click to expand
We study positive-data learning of bounded-fan-out linear multiple context-free grammars under a fixed explicit finite monoid homomorphism \(h\). The main obstacle beyond the context-free case is that an MCFG nonterminal derives a tuple whose components may be placed in a surrounding sentence in different orders. We introduce sentence-interface types as finite external control objects for such tuple occurrences. A type records the permutation of tuple components in the final sentence together with the \(h\)-values of the boundary intervals between them. For reduced working binary linear nondeleting MCFG presentations whose string languages satisfy \((f,h)\)-tuple substitutability, we build a typed refinement, a finite characteristic sample, and a canonical positive-data learner. Once the sample contains this characteristic sample and remains contained in the target language, the learner reconstructs the language exactly. Consequently, for fixed fan-out bound \(f\) and fixed explicit \(h\), the resulting class is identifiable in the limit from positive data. Moreover, the hypothesis associated with any given finite sample is constructible in polynomial time for fixed \(f\) and fixed \(h\), including output size. Thus sentence-interface control is the finite mechanism that lifts fixed-\(h\) distributional reconstruction from context-free grammars to bounded-fan-out linear MCFGs.
0
0
cs.FL 2026-05-12 Recognition

General criteria guarantee minimal models for streaming transducers

Minimization of Streaming Transducers

The criteria cover sequential, polynomial, and string-to-tree variants and yield effective minimization procedures for term outputs.

Figure from the paper full image
abstract click to expand
We provide general criteria for the existence of minimal models of streaming transducers, namely devices that read an input word and produce an output value by iteratively updating an internal memory. This abstract model subsumes classical (sub)sequential transducers (Sch\"utzenberger), streaming string-to-string transducers (Alur-\v{C}ern\'y), polynomial automata (Benedikt et al.), and variants of streaming string-to-tree transducers (Alur-D'Antoni). We then instantiate these criteria to obtain effective minimization results for variants of the latter model, where outputs are terms constructed incrementally by extending (tuples of) terms either at the leaves or at the roots.
0
0
cs.FL 2026-05-12 2 theorems

Quotients of permutation automata attain every positive accepting-state count

A Unary-to-Nonunary Transition in the Accepting-State Spectrum of Right Quotient for Permutation Automata

Nonempty inputs allow any number of accepting states in the minimal quotient automaton, completing the spectrum for larger alphabets.

abstract click to expand
This paper resolves the open larger-alphabet quotient case in the accepting-state complexity theory of permutation automata. Rauch and Holzer showed that, in the unary setting, the attainable right-quotient accepting-state complexities are exactly $[1,mn]$. We prove that over arbitrary alphabets the exact spectrum is $g^{\operatorname{asc}}_{-1,\mathrm{PFA}}(m,n)=\{0\}$ if $m=0$ or $n=0$, and $g^{\operatorname{asc}}_{-1,\mathrm{PFA}}(m,n)=\mathbb{N}_{>0}$ if $m,n\ge 1$. Thus, once both input languages are nonempty, every positive accepting-state complexity is attainable for right quotient, and $0$ is the only unavoidable magic value. The proof has two parts. First, we show that if $m,n\ge 1$, then the quotient language $KL^{-1}$ cannot be empty when $K$ and $L$ are accepted by permutation automata with $\operatorname{asc}(K)=m$ and $\operatorname{asc}(L)=n$; this follows from the bijectivity of the transition action. Second, for every $m,n\ge 1$ and every $\alpha\ge m$, we construct a ternary witness pair $(A^{\mathrm{q}}_{m,\alpha},B^{\mathrm{q}}_{n,\alpha})$ such that $\operatorname{asc}(L(A^{\mathrm{q}}_{m,\alpha}))=m$, $\operatorname{asc}(L(B^{\mathrm{q}}_{n,\alpha}))=n$, and $\operatorname{asc}(L(A^{\mathrm{q}}_{m,\alpha})L(B^{\mathrm{q}}_{n,\alpha})^{-1})=\alpha$. The high-range construction is group-theoretic: the words accepted by $B^{\mathrm{q}}_{n,\alpha}$ induce exactly a point stabilizer in a symmetric group, and the standard quotient construction then saturates the original final set of $A^{\mathrm{q}}_{m,\alpha}$ to a full orbit, yielding a minimal quotient automaton with exactly $\alpha$ final states. Combined with the known unary interval $[1,mn]$, this yields the complete spectrum and resolves the larger-alphabet right-quotient case for permutation automata.
0
0
cs.FL 2026-05-12 2 theorems

R-alignment gives bounded-depth forest decompositions

A Factorization Theorem for Forest Algebras

A new compatibility condition on morphisms lets Simon-style factorization work for forests, with counterexamples showing the condition is ne

abstract click to expand
Simon's factorization theorem is a celebrated tool in algebraic automata theory, providing bounded-depth decompositions of words with respect to morphisms into finite semigroups. We develop an analogue of Simon's theorem for \emph{forests} in the setting of forest algebras. In contrast with words, this presents a basic difficulty: recursively factoring a forest requires keeping track of where each subforest ``fits''. This difficulty ripples throughout the proof, and we overcome it by augmenting the free forest algebra and by developing a framework that supports recursive factorization of forests, along with its semantic implications. Our main result identifies a new semantic restriction on morphisms (called $\mathcal{R}$-alignment) which intuitively ensures that different ways of cutting a forest remain compatible (in a certain sense) at the semigroup level. Under this condition, we prove that every morphism admits decompositions of bounded depth. We also prove that without this restriction, there are morphisms for which no bounded-depth decomposition exists (under our notion of decomposition).
0
0
cs.FL 2026-05-11 2 theorems

Asymptotic Hausdorff turns edit distances into language metrics

Asymptotic Hausdorff and Language Similarity

The construction ignores finite mismatches and yields computable distances between regular languages based on their long-word behavior.

Figure from the paper full image
abstract click to expand
We introduce the \textit{Asymptotic Hausdorff} lifting, denoted $\mathbb{AH}_{d}$, a general method for lifting an element-level metric $d$ to a (pseudo-) metric on sets, that captures asymptotic similarity in infinite domains equipped with a notion of size. The construction is designed to be insensitive to finite deviations and to avoid the limitations of classical Hausdorff-based approaches, which are often overly sensitive to outliers and fail to reflect asymptotic behavior. Formal languages provide a central motivating instance of this framework, where elements are words and sets are languages. When applied to normalized edit distances, the Asymptotic Hausdorff lifting yields metric-valued distances between languages that reflect asymptotic edit behavior while preserving metric structure. We study the equivalence classes of regular languages induced by $\mathbb{AH}_{d}$ for normalized edit distances $d$, and characterize their asymptotic essence. Focusing in particular on the normalized edit distance of Marzal and Vidal, $\textsf{ned}$, we investigate the computation of $\mathbb{AH}_{\textsf{ned}}$ for regular languages and for bounded context-free languages.
0
0
cs.FL 2026-05-11 3 theorems

Automata over infinite alphabets can have non-rational Parikh images

Star Complexity of Parikh Images of Languages over Infinite Alphabets

A two-register counterexample shows the commutative projection need not be rational and separates automata from context-free grammars in the

Figure from the paper full image
abstract click to expand
It has been conjectured that the Parikh (commutative) image of every language over an infinite alphabet recognized by an automaton with registers is defined by a rational expression. This conjecture is known to hold for all languages recognized by one-register automata. We refine this result by proving that the star-height of the Parikh image of any language recognized by a one-register automaton is universally bounded by two. Furthermore, we show that one-register context-free languages have rational commutative images of arbitrarily high star height. We then disprove the conjecture for multiple registers, as well as disprove the equivalence of commutative expressive power between context-free grammars and automata over infinite alphabets. In other words, we show that Parikh's theorem fails for infinite alphabets.
0
0
cs.FL 2026-05-11 2 theorems

NL separated from logCFL via pebble automata entropy

Entropy of pebble automata and space complexity

The separation immediately shows logspace differs from polynomial time and nondeterministic logspace differs from P.

abstract click to expand
Let L denote the class Logpsace and NL the class NLogspace. We use logCFL to denote the closure under logspace reductions of the set of context-free languages. We prove that NL is different from logCFL. This result implies L different from Ptime and the stronger separation NL different from Ptime.
0
0
cs.FL 2026-05-11 2 theorems

Quantum Büchi automata accept languages as limits of finite-word quantum languages

Measure Many Quantum Finite Automata on Infinite Words

The model is closed under union, emptiness is semi-decidable, and universality, inclusion, equivalence, and membership are undecidable.

abstract click to expand
We define a quantum computational model over infinite words, called Measure-Many Quantum B\"uchi Automata (MMQBA), which extends Measure-many Quantum Finite automata (MMQFA) to the infinite word setting with B\"uchi acceptance condition. In MMQBA, the quantum state evolves through unitary transformations followed by repeated projective measurements. An infinite word is accearaq2ppted with respect to a cutpoint p is in (0, 1] if (i) the run visits accepting states infinitely often, (ii) the limiting cumulative acceptance probability is at least p, and (iii) the limiting cumulative rejection lprobability is strictly less than p. We formalize the semantics of MMQBA, establish a language-theoretic characterization showing that MMQBA languages are precisely of the form lim(L(M, p)) for MMQFA M , and develop a decomposition of the non-halting subspace. We prove that MMQBA is closed under union but not under intersection or complementation. On the algorithmic side, we show that the emptiness problem is semi-decidable, while universality, inclusion, equivalence, and membership remain undecidable.
0
0
cs.FL 2026-05-11 Recognition

SMT algorithm learns minimal weighted automata over semirings

SMT-Based Active Learning of Weighted Automata

It terminates for all finite semirings and returns smaller models with fewer teacher queries than prior methods.

Figure from the paper full image
abstract click to expand
We present an SMT-based active learning algorithm for nondeterministic weighted automata (WFAs) as a practical and robust alternative to Hankel/L*-style methods. Our algorithm is parametric in a given semiring and, if it terminates, guaranteed to produce minimal WFAs. We prove partial correctness and provide a sufficient termination condition, which in particular implies termination for all finite semirings. Our extensive experimental evaluation shows that our algorithm is capable of learning numerous minimal WFAs over both finite and infinite semirings, vastly outperforms a naive baseline, and is competitive with a state-of-the-art algorithm while producing significantly smaller automata and requiring less interaction with the teacher.
0
0
cs.FL 2026-05-11 2 theorems

Term rewriting cuts queries to learn tree automata

Learning Tree Automata with Term Rewriting

A supplied rewrite system deduces answers to some membership and equivalence queries, lowering the total number required for languages with

abstract click to expand
We present an extension of the Angluin-style learning algorithm for tree automata that incorporates deductive inference. The learning algorithm is provided with a term rewriting system that specifies properties of the target tree language (e.g., the order of subtrees under a symbol f is irrelevant). This term rewriting system is used to infer answers to some queries, which reduces the query complexity of the learning algorithm. We present examples of rewrite systems that express natural properties of tree-structured data, which yield a significant reduction in the number of queries.
0
0
cs.FL 2026-05-08 2 theorems

DFA primality decision is NP-hard

Deciding DFA-Primality is NP-Hard

A SAT reduction shows that checking whether a minimal DFA is an intersection of smaller ones is NP-hard.

abstract click to expand
A DFA $\mathcal{A}$ is composite if there exist DFAs $\mathcal{A}_1,\dots,\mathcal{A}_t$ with $\mathcal{L}(\mathcal{A}) = \bigcap_{i=1}^{t} \mathcal{L}(\mathcal{A}_i)$ such that each $\mathcal{A}_i$ has strictly less states than the minimal DFA deciding $\mathcal{L}(\mathcal{A})$. Otherwise, it is prime. Prime-DFA is the problem of deciding primality for a given DFA. It was defined by Kupferman and Mosheiff in 2015 and it was shown to be NL-hard and in ExpSpace. This paper proves the NP-hardness of Prime-DFA, thereby making the first progress in closing this doubly-exponential gap. It proves the NP-hardness by a reduction from the propositional logic satisfiability problem. The correctness of the reduction relies on an involved characterization of primality for a class of DFAs which contains those that can occur in the reduction.
0
0
cs.FL 2026-05-08

Temporal causal models encode linear bounded automata

Temporal Causal Models as a Model of Computation

With countably many variables they are Turing complete and link causality to computation theory.

abstract click to expand
Causal models, also known as Structural Equation Models (SEM), are a well-known formalism for representing and reasoning about causal dependencies between events. In this paper, we show that Temporal SEMs (TSEMs), which extend SEMs to support causal reasoning in temporal settings, can be interpreted as a model of computation. We prove that TSEMs can encode Linear Bounded Automata, and thus causal settings representable in context sensitive languages. We also prove that TSEMs with countably many variables are Turing complete. These results establish a formal connection between causal reasoning and classical models of computation, enabling the integration of counterfactual reasoning techniques from causal inference into the theory of computation.
0
0
cs.FL 2026-05-08

Edit distance computable for finite-valued transducers

Edit Distance of Finite-Valued Transducers

A constant bound on outputs per input lets an algorithm calculate the minimal edits needed to match every output of one transducer to the 2.

abstract click to expand
Transducers generalise automata by producing output word(s) for each input word, thereby defining a relation over words. A transducer is said to be finite-valued if, for every input word, it produces at most $k$ output words, for some constant $k$. If $k = 1$, then the transducer is said to be functional. The edit distance between two transducers is the minimal number of edits required to transform every output of one transducer into some output of the other, for each input word. This notion has been studied for functional transducers, where it is shown to be computable. However, it is uncomputable for transducers in general. In this work, we show the computability of the edit distance of finite-valued transducers, a class that is strictly more expressive than functional transducers.
0
0
cs.FL 2026-05-08 Recognition

Viability games gain decidability beyond pure counters

Infinite-state Games with Energy Objectives Beyond Counters

Full classification shows decidable cases on pushdown-counter hybrids where reachability itself is undecidable.

abstract click to expand
In the theory of games on infinite-state arenas, there is a stark contrast between (i) recursion-based models such as pushdown systems and extensions on one hand, and (ii) counter-based models like vector addition systems with states (VASS) on the other. For pushdown systems and extensions, there is a rich variety of decidable and well-understood games, whereas on VASS arenas, even extremely simple games are undecidable. Here, a VASS is an automaton with counters that can be incremented and decremented, but not tested for zero. Crucially, the counters can only assume non-negative values. However, certain VASS games become decidable when using energy semantics: An energy game is played on a system with counters, but the arena includes configurations with negative counters. The requirement that the counters stay non-negative is, instead, part of the winning condition of the existential player. We study an analogue of energy semantics -- legality of instructions as part of the winning condition rather than arena -- on a broad class of infinite-state systems, where we call them viability games. Specifically, we study viability games in the framework of valence systems over graph monoids, where (undirected, loops allowed) graphs specify various infinite-state systems, such as pushdowns, VASS counters, integer counters, and combinations thereof. In our main results, we provide a complete description of the decidability and complexity landscape of viability games across valence systems over graph monoids. Our results reveal encouraging decidability properties. For example, in certain combinations of pushdowns and counters, viability games are decidable, despite non-termination games being undecidable there. Moreover, viability games are even decidable for certain systems where (single-player) control-state reachability is undecidable.
0
0
cs.FL 2026-05-08

Viability games decidable on mixed pushdown-counter systems

Infinite-state Games with Energy Objectives Beyond Counters

A full classification shows energy-style winning conditions remain tractable even when reachability is undecidable.

abstract click to expand
In the theory of games on infinite-state arenas, there is a stark contrast between (i) recursion-based models such as pushdown systems and extensions on one hand, and (ii) counter-based models like vector addition systems with states (VASS) on the other. For pushdown systems and extensions, there is a rich variety of decidable and well-understood games, whereas on VASS arenas, even extremely simple games are undecidable. Here, a VASS is an automaton with counters that can be incremented and decremented, but not tested for zero. Crucially, the counters can only assume non-negative values. However, certain VASS games become decidable when using energy semantics: An energy game is played on a system with counters, but the arena includes configurations with negative counters. The requirement that the counters stay non-negative is, instead, part of the winning condition of the existential player. We study an analogue of energy semantics -- legality of instructions as part of the winning condition rather than arena -- on a broad class of infinite-state systems, where we call them viability games. Specifically, we study viability games in the framework of valence systems over graph monoids, where (undirected, loops allowed) graphs specify various infinite-state systems, such as pushdowns, VASS counters, integer counters, and combinations thereof. In our main results, we provide a complete description of the decidability and complexity landscape of viability games across valence systems over graph monoids. Our results reveal encouraging decidability properties. For example, in certain combinations of pushdowns and counters, viability games are decidable, despite non-termination games being undecidable there. Moreover, viability games are even decidable for certain systems where (single-player) control-state reachability is undecidable.
0
0
cs.FL 2026-05-06 3 theorems

Hennie machines define linear size-to-height tree functions

Tree transducers of linear size-to-height increase (and the additive conjunction of linear logic)

Bounded-visit transducers with label rewriting sit between macro tree transducers and MSO interpretations and form a composition-closed, non

abstract click to expand
We investigate a natural generalization to trees of Hennie machines, a known automaton model for regular string functions. Tree-to-tree Hennie machines are tree-walking tree transducers with the ability to rewrite the node labels of their input tree, subject to a bounded visit restriction. Interestingly, they do not merely compute regular tree functions (i.e. MSO transductions), but a larger class of functions with linear size-to-height increase (LSHI). We prove that this class sits between LSHI macro tree transducers (MTTs) and MSO set interpretations. To argue for its robustness, we show that it is closed under precomposition (resp. postcomposition) by MTTs of linear size (resp. height) increase. As a consequence, it contains the entire composition hierarchy of MTTs of linear height increase; we also prove that this composition hierarchy is strict. Finally, we give an alternative characterization of this function class based on a lambda-calculus with linear types. The key difference with similar characterizations of MSO transductions is the use of additive tuples in the encoding of output trees. Our equivalence proof, using game semantics / geometry of interaction, is heavily inspired by an analogous result on higher-order recursion schemes.
0
0
cs.FL 2026-05-06

Hyper-minimization is decidable for well-typed register automata

Hyper-Minimization for Deterministic Register Automata

The algorithm yields the smallest equivalent automaton in both states and registers among all well-typed DRAs.

Figure from the paper full image
abstract click to expand
We investigate hyper-minimization for deterministic register automata (DRAs). We begin by introducing DRA counterparts of classical notions from deterministic finite automata. Building on these foundations, we present an algorithm for hyper-minimizing well-typed DRAs, where each state is associated with a unique register type. The resulting automata are minimal with respect to both the number of states and registers among all well-typed DRAs. We prove the correctness of the proposed algorithm, thereby establishing the decidability of hyper-minimization for well-typed DRAs.
0
0
cs.FL 2026-05-04

NFA square roots require up to n cubed states

Nondeterministic state complexity of square root

Proves that n^3 states are both sufficient and necessary in the worst case for the square-root operation.

abstract click to expand
We investigate the nondeterministic state complexity of the square-root operation $\sqrt{L}=\{\,w \mid ww\in L\,\}$ on regular languages represented by nondeterministic finite automata. For an $n$-state NFA accepting $L$, it was previously known that $\sqrt{L}$ can be accepted by an NFA with at most $n^{3}$ states, while the best lower bound was only (n-1)(n-2)(n-3). In this paper, we close this gap completely and prove that $n^{3}$ states are sufficient and necessary in the worst case.
0
0
cs.FL 2026-05-01

Greedy dueling sequence converges to Thue-Morse at specific rate

The speed of convergence in greedy Galois games

The speed at which the shooter sequence approaches the infinite Thue-Morse pattern as p goes to zero is calculated explicitly.

Figure from the paper full image
abstract click to expand
In 2013 Cooper and Dutle invented a dueling scenario where Alice and Bob shoot at each other until one is hit. Each shot is successful with some fixed probability $p$, $0 < p < 1$. The shooting order is given by a greedy algorithm, where at each step a shot is assigned to the player whose current probability of success is smaller. Cooper and Dutle observed that as $p \rightarrow 0$, the resulting sequence of shots (by Alice or Bob) converges to the infinite Thue-Morse sequence t, but left the speed of convergence as an open problem. In this note we determine the speed of this convergence.
0
0
cs.FL 2026-04-30

Regular languages split into constant or logarithmic FO rank growth

Finite-Horizon First-Order Rank Profiles of Regular Languages

Aperiodic syntactic monoids keep the needed quantifier depth fixed for short words, while others force logarithmic growth with no middle gap

abstract click to expand
We introduce the finite-horizon first-order rank profile of a language $L \subseteq \Sigma^*$: the least quantifier rank needed by an $\mathrm{FO}[<]$ sentence to classify membership in $L$ correctly on all words of length at most $n$. The invariant measures quantifier depth only; formula size is deliberately not bounded. First, we prove a rank calculus that is independent of regularity. Every language satisfies $\rho_L(n) \le \lceil \log_2 n \rceil + 4$, via balanced first-order distance formulas and exact-word definitions. Moreover, $\sup_n \rho_L(n) < \infty$ holds exactly when $L$ is globally $\mathrm{FO}[<]$-definable, and the supremum equals the minimum quantifier rank of such a definition. Second, for regular languages we prove a sharp aperiodicity gap: if the syntactic monoid of $L$ is aperiodic, then $\rho_L(n) = O(1)$; otherwise $\rho_L(n) = \log_2 n + O_L(1)$. The lower bound extracts a nontrivial cyclic component from the syntactic monoid and combines it with an Ehrenfeucht-Fraisse power lemma for long repetitions of a fixed word. Thus, for full $\mathrm{FO}[<]$ quantifier rank, regular languages admit no intermediate finite-horizon growth between bounded and logarithmic rank.
0
0
cs.FL 2026-04-30

Fixed-parameter graph languages identifiable in the limit

Distributional Learning of Graph Languages Generated by Fixed-Interface Clause Systems

An algorithm reconstructs clause systems from positive examples and membership queries once bounds on degree and clause size are fixed.

Figure from the paper full image
abstract click to expand
Distributional learning provides a framework for studying the learnability of structured languages from positive data. In this paper, we extend this framework to graph languages generated by fixed-interface clause systems. We formulate fixed-interface graph pattern clause systems and define a learning model based on positive presentations and membership queries. We consider a bounded class of graph languages satisfying the finite context property under a bounded-degree assumption. The bounds are expressed by a parameter tuple $(\Delta,m,s,t,w,d)$, which controls both the generated graph class and the structural complexity of the clause systems. We give an oracle-guided learning algorithm that constructs hypotheses from boundary representations induced by observed positive examples. The proof shows that target contexts eventually appear in the sample, target clauses are reconstructed over the corresponding predicate representatives, and spurious clauses are excluded by membership queries. Hence, for every fixed parameter tuple $(\Delta,m,s,t,w,d)$, the target language is identifiable in the limit from positive data and membership queries. We also prove that the learner has polynomial-time update on $\mathcal{FICSL}^{\mathrm{FCP}}_{\Delta}(m,s,t,w,d)$. Thus, the paper gives a parameterized reformulation of distributional learning for regular FGS-style graph languages in a fixed-interface setting.
0
0
cs.FL 2026-04-29

Ideal automata split into smaller intersection components in poly time

Decomposition of Automata recognizing Ideals

NL decides decomposability into intersections or unions; a polynomial algorithm constructs valid ideal-preserving intersection factors when

abstract click to expand
Minimizing the size of finite automata is a fundamental problem in theoretical computer science. Beyond standard minimization, further reductions can be achieved by decomposing an automaton into smaller components whose languages combine via intersection or union to recover the original language. However, in general, no polynomial-time algorithm is known for computing such decompositions. In this paper, we focus on automata that recognize ideals, that is, languages at level 1/2 in the Straubing-Th\'erien hierarchy. Equivalently, these languages are expressible as a finite union of languages of the form $\Sigma^*a_1\Sigma^*\dots\Sigma^*a_n\Sigma^*$ where $\Sigma$ is an alphabet and $a_i$ are letters of $\Sigma$. We show that the two problems of deciding whether such a language can be decomposed into an intersection or a union of smaller automata are decidable in NL. Moreover, we provide a polynomial-time algorithm that computes a decomposition into an intersection, if one exists, while ensuring that the resulting components also recognize ideal languages.
0
0
cs.FL 2026-04-29

Transducer output Hamming distance check is NL-complete for fixed k

Hamming distance between finite transducers

The problem is co-NP-complete when k is binary and the maximum distance is at most quadratic when bounded.

abstract click to expand
We study bounded deviation of non-deterministic finite transducers under the Hamming distance: the bounded comparison problem asks, given two transducers and $k \in \mathbb{N}$, whether for every input the two transducers produce words at Hamming distance at most $k$. This problem is known to be decidable in polynomial time when $k$ is fixed, and in co-NP otherwise. We show that the problem is NL-complete when $k$ is fixed, co-NP-complete when $k$ is given in binary, and it is DP-complete to decide if the distance is exactly $k$. We also prove that if the two transducers have bounded comparison, then the maximal distance is at most quadratic in the size of both transducers, and that this bound is asymptotically tight. We prove the results on deviations problem, which asks similar questions on the distance of the pairs of input and output of a single transducer, and show that these two families of problems are logspace many-one equivalent.
0
0
cs.FL 2026-04-28

Lower bounds computed for reachability in random-clock automata

Minimum Reachability Probabilities in Rectangular Automata with Random Clocks

The technique separates stochastic and nondeterministic choices to support worst-case safety analysis instead of only optimistic upper-bound

Figure from the paper full image
abstract click to expand
Control applications for cyber-physical systems must make reliably safe control decisions in the presence of continuous dynamics as well as stochastic uncertainty. Providing safety guarantees for such systems requires formal modeling and analysis techniques that capture these aspects. For modeling, in this paper we consider rectangular automata with random clocks under prophetic scheduling. For this model class, existing methods can compute only upper bounds on reachability probabilities, enabling optimistic, best-case safety reasoning. We complement this view by introducing a novel method to compute lower bounds, thereby enabling worst-case analysis that is essential for safety-critical applications. Although both upper and lower bounds rely on reachability analysis, they are not dual: computing lower bounds requires an explicit separation of stochastic and nondeterministic choices along executions. We implement our approach and demonstrate its practical feasibility on an electric vehicle charging scenario, showing that meaningful worst-case guarantees can be obtained.
0
0
cs.FL 2026-04-28

Regular grammars give singly-exponential recognizers for SP graphs

Regular Grammars as Effective Representations of Recognizable Sets of Series-Parallel Graphs

The construction improves the prior double-exponential bound and proves ExpTime-completeness for intersection and inclusion.

Figure from the paper full image
abstract click to expand
Series-parallel (SP) graphs are binary edge-labeled graphs with a designated source and target vertex, built using serial and parallel composition. A set of graphs is recognizable if membership depends only on its image under a homomorphism into a finite algebra. For SP-graphs, and more generally, for graphs of bounded tree-width, recognizability coincides with definability in Counting Monadic Second-Order (CMSO) logic. Despite this strong logical characterization, the conciseness and algorithmic effectiveness of syntactic representations of recognizable sets of SP (and bounded-tree-width) graphs remain poorly understood. Building on previously introduced regular grammars for SP-graphs, we show that recognizable sets admit concise and effective syntactic representations. The main contribution is an improved construction of finite recognizer algebras whose size is singly-exponential in the size of a regular grammar, improving upon the previously known double-exponential bound. As a consequence, the problems of intersection and language inclusion for sets represented by regular grammars are shown to be ExpTime-complete, thus improving on a previously known 2ExpTime upper bound.
0
0
cs.FL 2026-04-28

VAS reachability improves to F_{d-2} bound from F_d

Improving Reachability in Vector Addition Systems through Pumpability

Refined pumpability analysis and projection technique tighten upper bounds on configuration reachability for vector addition systems.

abstract click to expand
Vector addition systems (VAS) constitute an important model of computation and concurrency that is equally expressive as the Petri net model. Recently, a lot of research has been conducted on vector addition systems with states (VASS), which are VASes equipped with a finite state control. Results on VASS naturally carry over to VAS, but no straightforward improvement is available. In this paper, we investigate the reachability problem in VAS in fixed dimensions. Based on a pumpability analysis of VAS that refines Rackoff's extraction for VASS, we obtain an F_{d-2} upper bound for the d-dimensional VAS reachability problem, improving the F_d upper bound inherited from the d-dimensional VASS reachability problem. Low-dimensional VASes are also considered. In particular, we establish a PSPACE upper bound for reachability in 4-dimensional VAS and an ELEMENTARY upper bound for 5-dimensional VAS, while the same upper bounds were known only for 2-VASS and 3-VASS, respectively. The result for 4-VAS particularly hinges on a simplified projection technique developed for geometrically 2-dimensional VASSes, whose reachability problem is shown to be equivalent to 2-VASS.
0
0
cs.FL 2026-04-27

Local algebraic checks decide Eve-positionality for omega-regular languages

An algebraic characterisation of Eve-positional languages

A finite list of elementary conditions on the language algebra guarantees that Eve needs no history memory to play optimally.

abstract click to expand
We present a new algebraic characterisation of Eve-positionality for $\omega$-regular languages. It involves only a limited number of elementary local properties to be checked. An $\omega$-regular language is Eve-positional if, in all games with this language as objective, the existential player (Eve) can play optimally without keeping any information concerning the history of the moves seen so far. This notion plays a crucial role in verification, automata theory and synthesis. Our proof heavily relies on a recent result of Casares and Ohlmann which states several characterisations of Eve-positionality for $\omega$-regular languages. More precisely, it relies on their a 1-to-2 player lift result: for an $\omega$-regular language, being Eve-positionally over all finite Eve-only arenas suffices for being Eve-positional over all two-player arenas.
0
0
cs.FL 2026-04-24

Algorithm learns state machines with registers and guards

Active Inference of Extended Finite State Machine Models with Registers and Guards

Black-box active learning works without resets or assumptions about data flow, enabling modeling of systems where internal values affect how

Figure from the paper full image
abstract click to expand
Extended finite state machines (EFSMs) model stateful systems with internal data variables and have numerous applications in software engineering. A major advantage of this type of model lies in its ability to model both the data flow and the data-dependent control behaviour. In the absence of such models, it is desirable to reverse-engineer them by observing the system's behaviour. However, existing approaches generally require the ability to reset the system during inference, or can only handle situations where the control flow depends exclusively on the input parameters, and not on the values of the stored data. In this work, we present a black-box active learning algorithm that infers EFSMs with guards and registers, and which significantly relaxes the assumptions that have to be made about the system in comparison to previous attempts.
0
0
cs.FL 2026-04-22

Forbidden context and order collapse CDGS to five classical classes

Forbidden-Context & Ordered Grammar Systems

Four regulation scenarios added to cooperating distributed grammar systems reduce exactly to known families from regulated rewriting and end

Figure from the paper full image
abstract click to expand
In this paper, we consider combining the ideas of forbidden random context grammars as well as of ordered grammars with cooperating distributed grammar systems (CDGS). We focus on investigating their generative capacities. Both ideas can be added to CDGS in two ways: either having (e.g.) a strict order of the rules in each component, or having a strict order on the components. This leads to four different scenarios, only some of them have been addressed in the literature before. While in the area of CDGS, many inclusions among language classes have been %are still open questions for decades, the proposed addition of forbidden random context and ordered regulation variants leads to a clear picture which allows us to get down to only five different classes of languages well known from classical regulated rewriting. This way, we also solve some open problems from the literature.
0
0
cs.FL 2026-04-22

Binary languages define all graphs via word patterns

On Languages Describing Large Graph Classes

Words from palindromes, Dyck words, and similar languages act as edge patterns; suitable restrictions capture every graph or large families.

Figure from the paper full image
abstract click to expand
In this work, we introduce a new notion for representing graph classes with formal languages. In contrast to the seminal work by Kitaev and Pyatkin to represent graphs by words, we use formal binary languages in order to have a set of patterns (given by the languages' words) defining the edges in the graph. In particular, we investigate famous languages like the palindromes, copy-words, Lyndon words, and Dyck words to represent all graphs or specific graph classes by restricting these languages.
0
0
cs.FL 2026-04-20

Weighted automata model financial payoffs under uncertainty

Weighted Automata and Regular Expressions for Financial Systems

They translate to regular expressions that let analysts compute extremal values across market scenarios.

Figure from the paper full image
abstract click to expand
We introduce weighted finite finance automata (WFFA), a formal framework for modeling and analyzing quantitative properties of financial systems driven by uncertain economic variables such as stock prices, interest rates, and exchange rates. The model provides a compositional and language-theoretic approach to scenario-based financial analysis, enabling systematic evaluation of financial instruments and trading strategies. To specify such systems, we introduce weighted finance regular expressions, a declarative language for quantitative financial properties. We establish a Kleene-Sch\"utzenberger-type correspondence between WFFAs and weighted finance regular expressions, together with effective translation procedures between the two formalisms. On the algorithmic side, we investigate fundamental decision and optimization problems for WFFAs, including the computation of extremal payoffs, and identify expressive yet computationally tractable subclasses. These results provide a foundation for formal, compositional, and efficient analysis of financial systems under multiple market scenarios.
0
0
cs.FL 2026-04-17

The paper develops new techniques for embedding word structures from Euclidean Bianchi…

On Word Representations and Embeddings in Complex Matrices

New techniques are constructed for word representations of Euclidean Bianchi groups inside 2x2 complex matrices, offering a framework for…

abstract click to expand
Embeddings of word structures into matrix semigroups provide a natural bridge between combinatorics on words and linear algebra. However, low-dimensional matrix semigroups impose strong structural restrictions on possible embeddings. Certain finitely generated groups admit faithful representations in SL(2, C) and other similar matrix groups. On the other hand, it is known that the product of two free semigroups on two generators cannot be embedded into the 2x2 complex matrices. In this paper we study embeddings of word structures into low-dimensional matrix semigroups over the complex numbers and develop new techniques for constructing word representations of the Euclidean Bianchi groups. These representations provide a symbolic framework and a natural first step towards analysing fundamental decision problems in 2x2 matrix semigroups.
0
0
cs.FL 2026-04-15

Normality iff automaton gamblers' capital converges exponentially

Characterizing normality via automata and random matrix products

Probabilistic finite automata characterize normal sequences by requiring exponential convergence of expected capital to a finite limit.

Figure from the paper full image
abstract click to expand
For a fixed alphabet A, an infinite sequence X is said to be normal if every word w over A appears in X with the same frequency as any other word of the same length. A classical result relates normality to finite automata as follows: a sequence X is normal if and only if all gambling strategies implementable with finite deterministic automata lose all their capital when trying to predict the next bit of X after seing the ones before. More precisely, Schnorr and Stimm (1972) proved that the capital goes exponentially fast to zero unless the automaton represents the gambler that never bets, in which case the capital remains constant. In this paper we show that an analogous result holds when considering probabilistic automata: a sequence X is normal if and only if for any gambling strategy implementable with probabilistic finite automaton it holds that the expected value of the capital of the gambler converges exponentially fast to a finite value when playing against X. To obtain this result, we show a more general statement related to the convergence of martingales given by finite sets of non-negative matrices {M a } a$\in$A . In particular, we show that X is normal if and only if ||vM X[1] . . . M X[n] || converges exponentially fast to a finite value for any non-negative starting vector v. Moreover, we distinguish three distinctive behaviours that this sequence can attain, and prove that the problem of recognizing, given a family of matrices, to which case it belongs, is decidable.
0
0
cs.FL 2026-04-14 Recognition

Bimachine bijection minimizes registers of string transducers in polynomial time

Minimizing Streaming String Transducers: An algebraic approach

The correspondence also shows NP-completeness for joint state-register minimization and for fixed automata cases.

Figure from the paper full image
abstract click to expand
In this work, we study minimization of rational functions given as appending streaming string transducers (aSST for short). We rely on an algebraic presentation of these functions, known as bimachines, to address the minimization of both states and registers of aSST. First, we show a bijection between a subclass of aSST and bimachines, which maps the numbers of states and registers of the aSST to two natural parameters of the bimachine. Using known results on the minimization of bimachines, this yields a Ptime (resp. NP) procedure to minimize this subclass of aSST with respect to registers (resp. both states and registers). In a second step, we introduce a new model of bimachines, named asynchronous bimachines, which allows to lift the bijection to the whole class of aSST. Based on this, we prove that register minimization with a fixed underlying automaton is NP-complete.
1 0
0
cs.FL 2026-04-09 2 theorems

Quantum automata need exactly Θ(n²) classical states for exact simulation

The Quadratic State Cost of Classical Simulation of One-Way Quantum Finite Automata

O(n²) probabilistic states always suffice and some n-state quantum automata require n²-1

abstract click to expand
Generalized finite automata (GFAs), probabilistic finite automata (PFAs), and one-way general quantum finite automata (1gQFA) recognize the same strict-cutpoint languages, but the state complexity of exact probabilistic simulation has remained unclear. This paper determines that worst-case cost exactly: every \(n\)-state 1gQFA admits exact strict-cutpoint simulation by a one-way PFA with \(O(n^2)\) states, via the standard \(n^2\)-dimensional mixed-state linearization together with an explicit alphabet-preserving construction that converts each \(k\)-state GFA into a one-way PFA with at most \(2k+6\) states; conversely, for every \(n\ge 2\), there exists an \(n\)-state 1gQFA for which every equivalent one-way PFA requires at least \(n^2-1\) states, obtained from a prepare--test construction and a Vapnik--Chervonenkis dimension argument. Hence the worst-case probabilistic state cost of exact strict-cutpoint simulation is \(\Theta(n^2)\).
1 0
0
cs.FL 2026-04-08 2 theorems

Symbolic states yield compact reachability graph for Petri net product lines

Efficient Construction of Reachability Graphs for Petri Net Product Lines

Family-preserving successor rules and on-the-fly merging let one graph stand in for all product configurations.

Figure from the paper full image
abstract click to expand
This paper presents a set of algorithms for computing the reachability graph of Petri Net Product Lines (PNPLs). These algorithms address the combined challenges of concurrency and variability that arise from product-line configurations. The proposed approach integrates symbolic state representations with family-based variability handling to generate a compact, parameterised reachability graph that captures behaviour across all products without exhaustive product enumeration. The main contributions are threefold. First, we introduce a symbolic state encoding adapted to PNPL semantics. Second, we define a family-preserving successor generation procedure that applies feature constraints during exploration. Third, we propose reduction techniques to mitigate state-space explosion, including on-the-fly merging of equivalent symbolic states and selective abstraction of irrelevant state details. We prove soundness and completeness of the construction with respect to standard per-product semantics and analyse computational complexity. An implementation integrated into our modelling tool demonstrates substantial savings in memory and time compared with naive product-based exploration, while preserving diagnostic and verification capabilities. The results indicate that the method enables practical reachability analysis for realistically sized product-line models, thereby facilitating verification and design-space exploration in configurable concurrent systems.
0
0
cs.FL 2026-04-03 Recognition

Streaming algorithm learns PDFAs with PAC guarantees

PAC learning PDFA from data streams

A sketch-based merge heuristic processes data streams and proves learnability while speeding up for large samples.

Figure from the paper full image
abstract click to expand
This is an extended version of our publication Learning state machines from data streams: A generic strategy and an improved heuristic, International Conference on Grammatical Inference (ICGI) 2023, Rabat, Morocco. It has been extended with a formal proof on PAC-bounds, and the discussion and analysis of a similar approach has been moved from the appendix and now has a full dedicated section. State machine models are models that simulate the behavior of discrete event systems, capable of representing systems such as software systems, network interactions, and control systems, and have been researched extensively. The nature of most learning algorithms however is the assumption that all data be available at the beginning of the algorithm, and little research has been done in learning state machines from streaming data. In this paper, we want to close this gap further by presenting a generic method for learning state machines from data streams, as well as a merge heuristic that uses sketches to account for incomplete prefix trees. We implement our approach in an open-source state merging library and compare it with existing methods. We show the effectiveness of our approach with respect to run-time, memory consumption, and quality of results on a well known open dataset. Additionally, we provide a formal analysis of our algorithm, showing that it is capable of learning within the PAC framework, and show a theoretical improvement to increase run-time, without sacrificing correctness of the algorithm in larger sample sizes.
0

browse all of cs.FL → full archive · search · sub-categories