arxiv: 2605.07031 · v1 · submitted 2026-05-07 · 💻 cs.FL

Recognition: 2 theorem links

· Lean Theorem

Deciding DFA-Primality is NP-Hard

Daniel Alexander Spenner

Authors on Pith no claims yet

Pith reviewed 2026-05-11 01:33 UTC · model grok-4.3

classification 💻 cs.FL

keywords DFA primalityNP-hardnessSAT reductiondeterministic finite automataregular language intersectionautomata complexity

0 comments

The pith

Deciding whether a given DFA is prime is NP-hard.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

A DFA is prime when its language cannot be written as the intersection of languages from strictly smaller DFAs. Prime-DFA is the decision problem that asks whether an input DFA has this property. The problem was already known to be NL-hard and contained in ExpSpace. This paper shows it is NP-hard by a polynomial-time reduction from SAT whose correctness rests on a characterization of primality that applies to the automata arising in the construction.

Core claim

Prime-DFA is NP-hard. The proof is a reduction from propositional satisfiability that builds, for each formula, a DFA whose primality encodes whether the formula is satisfiable; the reduction is shown correct by an explicit characterization of which automata in the relevant syntactic class are prime.

What carries the argument

A polynomial-time reduction from SAT to Prime-DFA whose validity depends on a primality characterization for the class of DFAs produced by the reduction.

If this is right

Prime-DFA lies in NP intersect coNP or higher, narrowing the previous NL-to-ExpSpace gap.
No polynomial-time algorithm for Prime-DFA exists unless P equals NP.
The same reduction technique may apply to related questions about DFA decompositions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The result suggests that proving NP-completeness for Prime-DFA is now the next natural target.
The primality characterization might be reusable for other intersection-based properties of regular languages.

Load-bearing premise

The primality characterization for the class of DFAs that appear in the reduction is correct and the constructed automata belong to that class.

What would settle it

An explicit DFA produced by the reduction from an unsatisfiable formula that is nevertheless prime, or from a satisfiable formula that is nevertheless composite.

read the original abstract

A DFA $\mathcal{A}$ is composite if there exist DFAs $\mathcal{A}_1,\dots,\mathcal{A}_t$ with $\mathcal{L}(\mathcal{A}) = \bigcap_{i=1}^{t} \mathcal{L}(\mathcal{A}_i)$ such that each $\mathcal{A}_i$ has strictly less states than the minimal DFA deciding $\mathcal{L}(\mathcal{A})$. Otherwise, it is prime. Prime-DFA is the problem of deciding primality for a given DFA. It was defined by Kupferman and Mosheiff in 2015 and it was shown to be NL-hard and in ExpSpace. This paper proves the NP-hardness of Prime-DFA, thereby making the first progress in closing this doubly-exponential gap. It proves the NP-hardness by a reduction from the propositional logic satisfiability problem. The correctness of the reduction relies on an involved characterization of primality for a class of DFAs which contains those that can occur in the reduction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims to establish NP-hardness of Prime-DFA (deciding whether a given DFA is prime, i.e., its language cannot be expressed as the intersection of languages of strictly smaller DFAs) via a polynomial-time reduction from SAT. The reduction constructs, for each formula, a DFA A such that A is prime precisely when the formula is unsatisfiable; correctness is asserted to follow from an involved characterization of primality that applies to the class of DFAs arising in the construction.

Significance. If the reduction and the supporting primality characterization are correct, the result supplies the first concrete progress toward closing the complexity gap for Prime-DFA (previously known to be NL-hard and in ExpSpace). The approach of reducing from SAT while relying on a tailored characterization of primality for a restricted DFA class constitutes a non-trivial technical step in automata theory.

major comments (2)

[Abstract; Reduction and Characterization sections] The abstract and the proof of the main theorem rest on an 'involved characterization of primality for a class of DFAs which contains those that can occur in the reduction.' No independent verification of this characterization (machine-checked proof, exhaustive enumeration on small instances, or alternative combinatorial argument) is supplied, leaving the central equivalence between primality and unsatisfiability unverifiable.
[Reduction construction] The reduction must guarantee that every DFA produced by the SAT-to-DFA construction lies inside the class for which the primality characterization is proved, and that the characterization is both sound and complete for that class. The manuscript provides no explicit argument ruling out composite DFAs inside the class that the construction could generate, which would break the equivalence.

minor comments (1)

[Throughout] The definition of composite/prime DFAs is stated clearly in the abstract; ensure that the same terminology and the reference to the minimal DFA are used uniformly in all subsequent sections.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the careful review and for highlighting the need for greater clarity around the primality characterization and the reduction. We address each major comment below. The core proof of the characterization is combinatorial and contained in the manuscript; we are prepared to expand explanatory material in a revision but cannot supply machine-checked verification at this stage.

read point-by-point responses

Referee: [Abstract; Reduction and Characterization sections] The abstract and the proof of the main theorem rest on an 'involved characterization of primality for a class of DFAs which contains those that can occur in the reduction.' No independent verification of this characterization (machine-checked proof, exhaustive enumeration on small instances, or alternative combinatorial argument) is supplied, leaving the central equivalence between primality and unsatisfiability unverifiable.

Authors: The primality characterization for the relevant class of DFAs is stated and proved in full in the Characterization section. The proof proceeds by showing both directions: (i) if the formula is satisfiable then the constructed DFA admits a non-trivial intersection decomposition, and (ii) if the DFA is composite then the formula must be satisfiable, via an explicit reconstruction of a satisfying assignment from any decomposition. The argument relies only on the structural properties of the constructed automata (unique accepting paths for each clause and variable gadget) and does not invoke external results. While we acknowledge that a machine-checked formalization or exhaustive enumeration on small instances is absent, the combinatorial proof is self-contained and can be checked by direct inspection of the lemmas. In a revision we will insert additional intermediate lemmas and a small worked example to improve readability. revision: partial
Referee: [Reduction construction] The reduction must guarantee that every DFA produced by the SAT-to-DFA construction lies inside the class for which the primality characterization is proved, and that the characterization is both sound and complete for that class. The manuscript provides no explicit argument ruling out composite DFAs inside the class that the construction could generate, which would break the equivalence.

Authors: The Reduction section defines the DFA construction explicitly and immediately proves (Lemma 4.3) that every automaton generated from a SAT instance belongs to the class C for which the characterization is stated. The class C is defined by three syntactic restrictions on the transition graph (single initial state, clause-variable gadget separation, and unique accepting paths), all of which are preserved by the construction. Soundness and completeness of the characterization for C are established by the two directions of the main theorem, which together show that a DFA in C is composite if and only if the formula is satisfiable. No composite DFA inside C can arise from the construction without violating the unsatisfiability assumption, because any decomposition would yield a satisfying assignment via the path-labeling argument in the proof. We will add an explicit corollary in the revision that restates this closure property. revision: partial

standing simulated objections not resolved

Supplying a machine-checked proof of the primality characterization in a proof assistant, which would require a substantial separate formalization effort beyond the scope of the current manuscript.

Circularity Check

0 steps flagged

No circularity: standard SAT reduction with independent primality characterization

full rationale

The derivation proceeds by a direct polynomial-time reduction from the independently NP-complete problem SAT to Prime-DFA. For each formula the construction produces a DFA A whose language is the intersection of strictly smaller DFAs precisely when the formula is unsatisfiable; primality of A is then shown equivalent to unsatisfiability by means of a characterization of primality that is proved inside the paper for the specific class of DFAs generated by the construction. This characterization is not obtained by self-definition, by fitting parameters to the target result, or by any self-citation chain; the only external citation is to Kupferman & Mosheiff 2015 (different authors) for the original definition of Prime-DFA and its NL-hardness/ExpSpace membership. No equation or lemma reduces to its own input by construction, and the central claim therefore rests on an independent proof rather than on a renaming or tautological restatement of the input.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The proof relies on standard automata-theoretic facts rather than new postulates.

axioms (1)

standard math Standard properties of deterministic finite automata, including the existence of a unique minimal DFA for each regular language and closure of regular languages under intersection.
These facts are invoked in the definition of primality and in the reduction; they are background results from automata theory.

pith-pipeline@v0.9.0 · 5464 in / 1342 out tokens · 46578 ms · 2026-05-11T01:33:56.302356+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

The correctness of the reduction relies on an involved characterization of primality for a class of DFAs which contains those that can occur in the reduction.
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

A minimal linear safety ADFA+ is prime if and only if it does not have the mp-prop (Theorem 3.9).

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

13 extracted references · 13 canonical work pages

[1]

MIT Press, Cambridge, MA (2021), https://mitpress.mit.edu/9780262044776

Christel Baier and Joost - Pieter Katoen. Principles of Model Checking . MIT Press, 2008. URL: https://mitpress.mit.edu/9780262026499/principles-of-model-checking/

work page arXiv 2008
[2]

Stephen A. Cook. The complexity of theorem-proving procedures. In Michael A. Harrison, Ranan B. Banerji, and Jeffrey D. Ullman, editors, Proceedings of the 3rd Annual ACM Symposium on Theory of Computing, May 3-5, 1971, Shaker Heights, Ohio, USA , pages 151--158. ACM , 1971. https://doi.org/10.1145/800157.805047 doi:10.1145/800157.805047

work page doi:10.1145/800157.805047 1971
[3]

de Roever, Hans Langmaack, and Amir Pnueli, editors

Willem P. de Roever, Hans Langmaack, and Amir Pnueli, editors. Compositionality: The Significant Difference, International Symposium, COMPOS'97, Bad Malente, Germany, September 8-12, 1997. Revised Lectures , volume 1536 of Lecture Notes in Computer Science . Springer, 1998. https://doi.org/10.1007/3-540-49213-5 doi:10.1007/3-540-49213-5

work page doi:10.1007/3-540-49213-5 1997
[4]

Assisted problem solving and decompositions of finite automata

Peter Gazi and Branislav Rovan. Assisted problem solving and decompositions of finite automata. In Viliam Geffert, Juhani Karhum \" a ki, Alberto Bertoni, Bart Preneel, Pavol N \' a vrat, and M \' a ria Bielikov \' a , editors, SOFSEM 2008: Theory and Practice of Computer Science, 34th Conference on Current Trends in Theory and Practice of Computer Scienc...

work page doi:10.1007/978-3-540-77566-9 2008
[5]

Mark Gold

E. Mark Gold. Complexity of automaton identification from given data. Inf. Control. , 37(3):302--320, 1978. https://doi.org/10.1016/S0019-9958(78)90562-4 doi:10.1016/S0019-9958(78)90562-4

work page doi:10.1016/s0019-9958(78)90562-4 1978
[6]

Unary prime languages

Isma \" e l Jecker, Orna Kupferman, and Nicolas Mazzocchi. Unary prime languages. In Javier Esparza and Daniel Kr \' a l', editors, 45th International Symposium on Mathematical Foundations of Computer Science, MFCS 2020, August 24-28, 2020, Prague, Czech Republic , volume 170 of LIPIcs , pages 51:1--51:12. Schloss Dagstuhl - Leibniz-Zentrum f \" u r Infor...

work page doi:10.4230/lipics.mfcs.2020.51 2020
[7]

Decomposing permutation automata

Isma \" e l Jecker, Nicolas Mazzocchi, and Petra Wolf. Decomposing permutation automata. In Serge Haddad and Daniele Varacca, editors, 32nd International Conference on Concurrency Theory, CONCUR 2021, August 24-27, 2021, Virtual Conference , volume 203 of LIPIcs , pages 18:1--18:19. Schloss Dagstuhl - Leibniz-Zentrum f \" u r Informatik, 2021. URL: https:...

work page doi:10.4230/lipics.concur.2021.18 2021
[8]

Prime languages

Orna Kupferman and Jonathan Mosheiff. Prime languages. Inf. Comput. , 240:90--107, 2015. URL: https://doi.org/10.1016/j.ic.2014.09.010, https://doi.org/10.1016/J.IC.2014.09.010 doi:10.1016/J.IC.2014.09.010

work page doi:10.1016/j.ic.2014.09.010 2015
[9]

Niklas Lauffer, Beyazit Yalcinkaya, Marcell Vazquez - Chanlatte, Ameesh Shah, and Sanjit A. Seshia. Learning deterministic finite automata decompositions from examples and demonstrations. In Alberto Griggio and Neha Rungta, editors, 22nd Formal Methods in Computer-Aided Design, FMCAD 2022, Trento, Italy, October 17-21, 2022 , pages 325--330. TU Wien Acade...

work page doi:10.34727/2022/isbn.978-3-85448-053-2 2022
[10]

Decomposition of safe languages

Alon Netser and Orna Kupferman. Decomposition of safe languages. Amirim Research Project Report from the Hebrew University , 2018

work page 2018
[11]

Decomposing finite languages

Daniel Alexander Spenner. Decomposing finite languages. In J \' e r \^ o me Leroux, Sylvain Lombardy, and David Peleg, editors, 48th International Symposium on Mathematical Foundations of Computer Science, MFCS 2023, August 28 to September 1, 2023, Bordeaux, France , volume 272 of LIPIcs , pages 83:1--83:14. Schloss Dagstuhl - Leibniz-Zentrum f \" u r Inf...

work page doi:10.4230/lipics.mfcs.2023.83 2023
[12]

Compositionality in the science of system design

Stavros Tripakis. Compositionality in the science of system design. Proc. IEEE , 104(5):960--972, 2016. https://doi.org/10.1109/JPROC.2015.2510366 doi:10.1109/JPROC.2015.2510366

work page doi:10.1109/jproc.2015.2510366 2016
[13]

Vardi and Pierre Wolper

Moshe Y. Vardi and Pierre Wolper. An automata-theoretic approach to automatic program verification. In Proceedings of the Symposium on Logic in Computer Science (LICS '86), Cambridge, Massachusetts, USA, June 16-18, 1986 , pages 332--344. IEEE Computer Society, 1986. URL: https://hdl.handle.net/2268/116609

work page 1986