arxiv: 2604.27024 · v1 · submitted 2026-04-29 · 💻 cs.FL · cs.LO

Recognition: unknown

Finite-Horizon First-Order Rank Profiles of Regular Languages

Madina Bazarova , Faruk Alpay

Authors on Pith no claims yet

Pith reviewed 2026-05-07 11:09 UTC · model grok-4.3

classification 💻 cs.FL cs.LO

keywords finite-horizon rank profileregular languagesfirst-order logicaperiodicitysyntactic monoidquantifier rankEhrenfeucht-Fraisse lemma

0 comments

The pith

Regular languages have either bounded or logarithmic finite-horizon first-order rank profiles, separated exactly by whether their syntactic monoid is aperiodic.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the finite-horizon first-order rank profile of a language as the smallest quantifier rank of an FO[<] sentence that decides membership correctly for every word up to a given length n. It first shows that every language obeys an upper bound of ceiling of log base two of n plus four by constructing balanced first-order formulas for distances and exact words. For regular languages the profile is then proved to stay constant when the syntactic monoid is aperiodic but to grow exactly as log base two of n plus a constant otherwise, establishing that no intermediate growth rates are possible. A sympathetic reader would care because this links an algebraic property of the language directly to the logical depth required on finite but arbitrary horizons.

Core claim

We prove a sharp aperiodicity gap for regular languages: if the syntactic monoid of L is aperiodic, then ρ_L(n) = O(1); otherwise ρ_L(n) = log₂ n + O_L(1). This follows from a general rank calculus that holds for every language and yields ρ_L(n) ≤ ⌈log₂ n⌉ + 4, together with a lower bound obtained by extracting a nontrivial cyclic component from the syntactic monoid and combining it with an Ehrenfeucht-Fraisse power lemma for long repetitions of a fixed word. Consequently regular languages admit no intermediate finite-horizon growth between bounded and logarithmic rank.

What carries the argument

The finite-horizon first-order rank profile ρ_L(n), the minimal quantifier rank of an FO[<] sentence that classifies all words of length at most n correctly.

If this is right

Every language satisfies ρ_L(n) ≤ ⌈log₂ n⌉ + 4.
The profile ρ_L(n) is bounded over all n if and only if L is globally definable in FO[<].
Non-aperiodic regular languages require quantifier rank that grows logarithmically with n even when only finite horizons are considered.
Aperiodic regular languages admit a fixed quantifier rank that works for all finite lengths.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The separation shows that the group structure inside the syntactic monoid is the precise algebraic marker that forces increasing logical depth on longer finite inputs.
One could look for analogous sharp gaps when the logic is extended by modular quantifiers or when the languages are taken from larger classes such as context-free.
The universal log n upper bound supplies a concrete resource estimate for any finite-horizon verification task phrased in first-order logic.

Load-bearing premise

The lower bound assumes that every non-aperiodic syntactic monoid contains a cyclic component that can be turned into a forcing construction via repeated words in an Ehrenfeucht-Fraisse argument.

What would settle it

A regular language whose syntactic monoid contains a nontrivial cycle yet whose rank profile stays bounded for all n, or grows at a rate strictly between constant and logarithmic.

read the original abstract

We introduce the finite-horizon first-order rank profile of a language $L \subseteq \Sigma^*$: the least quantifier rank needed by an $\mathrm{FO}[<]$ sentence to classify membership in $L$ correctly on all words of length at most $n$. The invariant measures quantifier depth only; formula size is deliberately not bounded. First, we prove a rank calculus that is independent of regularity. Every language satisfies $\rho_L(n) \le \lceil \log_2 n \rceil + 4$, via balanced first-order distance formulas and exact-word definitions. Moreover, $\sup_n \rho_L(n) < \infty$ holds exactly when $L$ is globally $\mathrm{FO}[<]$-definable, and the supremum equals the minimum quantifier rank of such a definition. Second, for regular languages we prove a sharp aperiodicity gap: if the syntactic monoid of $L$ is aperiodic, then $\rho_L(n) = O(1)$; otherwise $\rho_L(n) = \log_2 n + O_L(1)$. The lower bound extracts a nontrivial cyclic component from the syntactic monoid and combines it with an Ehrenfeucht-Fraisse power lemma for long repetitions of a fixed word. Thus, for full $\mathrm{FO}[<]$ quantifier rank, regular languages admit no intermediate finite-horizon growth between bounded and logarithmic rank.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper defines a finite-horizon FO rank profile and proves regular languages split cleanly into bounded or logarithmic growth based on syntactic monoid aperiodicity.

read the letter

The main thing to know is that this paper gives a precise split for regular languages: their finite-horizon first-order rank is either O(1) or log n plus a language-dependent constant, with no intermediate rates, and it ties the split directly to whether the syntactic monoid is aperiodic. It also gives a general upper bound of ceiling log2 n plus 4 that holds for every language, not just regular ones, via balanced distance formulas and exact-word definitions in FO[<].

Referee Report

0 major / 1 minor

Summary. The paper introduces the finite-horizon first-order rank profile ρ_L(n) of a language L ⊆ Σ*, defined as the least quantifier rank of an FO[<] sentence that correctly classifies membership in L for all words of length at most n. It proves that every language satisfies ρ_L(n) ≤ ⌈log₂ n⌉ + 4 via balanced distance formulas, that sup_n ρ_L(n) < ∞ if and only if L is FO[<]-definable (with the value equal to the minimal such rank), and that regular languages exhibit a sharp gap: ρ_L(n) = O(1) when the syntactic monoid is aperiodic, and ρ_L(n) = log₂ n + O_L(1) otherwise, via extraction of a nontrivial cyclic component combined with an Ehrenfeucht-Fraïssé power lemma.

Significance. If the results hold, the work establishes a precise algebraic-logical dichotomy for the finite-horizon quantifier complexity of regular languages, showing that no intermediate growth rates between bounded and logarithmic are possible. Strengths include the language-independent upper bound with explicit constructions in §3, the reduction to standard FO[<] definability for the aperiodic case, and the monoid-dependent constants in the lower bound.

minor comments (1)

[§3] §3: the balanced distance formulas achieve the general upper bound, but a short illustrative example of how the +4 additive constant arises from the exact-word definitions would improve readability without altering the proof.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their accurate summary of the paper and for recommending acceptance. No major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external standard results

full rationale

The paper's central claims rest on explicit constructions and standard external theorems rather than self-referential reductions. The universal upper bound ρ_L(n) ≤ ⌈log₂ n⌉ + 4 is derived from balanced first-order distance formulas and exact-word definitions that apply to arbitrary languages and do not presuppose the target rank profile. For regular languages, the O(1) case when the syntactic monoid is aperiodic follows from the known FO[<]-definability of aperiodic regular languages (Schützenberger's theorem) together with a uniform quantifier-rank bound extracted from the monoid presentation; this is an independent application of a classical result. The logarithmic lower bound is obtained by extracting a generator of a nontrivial cyclic subgroup from the syntactic monoid and invoking the Ehrenfeucht-Fraïssé power lemma on long word repetitions; both steps are standard and the constants are shown to depend only on the monoid. No step equates a derived quantity to a fitted parameter or renames an input as a prediction, and no load-bearing premise collapses to a self-citation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

The central claims rest on the new definition of ρ_L(n) and standard background results from automata theory and finite model theory; no free parameters are introduced.

axioms (2)

domain assumption Syntactic monoids classify regular languages and their algebraic properties determine logical definability
Invoked to obtain the aperiodicity gap.
standard math Ehrenfeucht-Fraisse games characterize the quantifier rank of FO[<] formulas
Used for the lower bound via power lemma on repeated words.

invented entities (1)

finite-horizon first-order rank profile ρ_L(n) no independent evidence
purpose: Measures minimal quantifier rank needed for correct classification on words of length ≤ n
Newly defined invariant; no independent evidence outside the paper.

pith-pipeline@v0.9.0 · 5559 in / 1514 out tokens · 71158 ms · 2026-05-07T11:09:55.678856+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

11 extracted references

[1]

Richard Büchi

J. Richard Büchi. Weak second-order arithmetic and finite automata.Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 6:66–92, 1960

1960
[2]

On finite monoids having only trivial subgroups.Information and Control, 8(2):190–194, 1965

Marcel-Paul Schützenberger. On finite monoids having only trivial subgroups.Information and Control, 8(2):190–194, 1965

1965
[3]

MIT Press, 1971

Robert McNaughton and Seymour Papert.Counter-Free Automata. MIT Press, 1971

1971
[4]

Volume B

Samuel Eilenberg.Automata, Languages, and Machines. Volume B. Academic Press, 1976

1976
[5]

North Oxford Academic, 1986

Jean-Éric Pin.Varieties of Formal Languages. North Oxford Academic, 1986

1986
[6]

Birkhäuser, 1994

Howard Straubing.Finite Automata, Formal Logic, and Circuit Complexity. Birkhäuser, 1994

1994
[7]

Languages, automata, and logic

Wolfgang Thomas. Languages, automata, and logic. In Grzegorz Rozenberg and Arto Salomaa, editors,Handbook of Formal Languages, Volume 3, pages 389–455. Springer, 1997

1997
[8]

Regular languages

Sheng Yu. Regular languages. In Grzegorz Rozenberg and Arto Salomaa, editors,Handbook of Formal Languages, Volume 1, pages 41–110. Springer, 1997

1997
[9]

Springer, 2004

Leonid Libkin.Elements of Finite Model Theory. Springer, 2004

2004
[10]

Separating regular languages with first-order logic.Logical Methods in Computer Science, 12(1), 2016

Thomas Place and Marc Zeitoun. Separating regular languages with first-order logic.Logical Methods in Computer Science, 12(1), 2016

2016
[11]

On all things star-free

Thomas Place and Marc Zeitoun. On all things star-free. In46th International Colloquium on Automata, Languages, and Programming (ICALP 2019), volume 132 ofLIPIcs, pages 126:1–126:14, 2019. 19

2019