pith. machine review for the scientific record. sign in

arxiv: 2605.09077 · v1 · submitted 2026-05-09 · 💻 cs.LO · cs.FL

Recognition: 2 theorem links

· Lean Theorem

Set Automata and Limits of Decidability of Two-Variable Logic on Data Words

Amaldev Manuel, Shibashis Guha, S P Rishal

Authors on Pith no claims yet

Pith reviewed 2026-05-12 01:49 UTC · model grok-4.3

classification 💻 cs.LO cs.FL
keywords two-variable logicdata wordsset automataguarded predicatesdecidabilitymonoidsclass automatamulticounter automata
0
0 comments X

The pith

Two-variable logic on data words extended with guarded predicates is decidable exactly for idempotent monoids whose two-sided ideals are linearly ordered.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a precise boundary for when extending two-variable logic on data words with guarded regular predicates keeps satisfiability decidable. It introduces set automata as a model equivalent to existing class automata, then isolates a subclass of ordered quasi-normal set automata whose emptiness problem reduces to that of ordered multicounter automata. The logic with predicates recognised by a semigroup S matches quasi-normal set automata whose transformations come from S, so the subclass yields decidability precisely when S is a linear band monoid. This matters because data words model sequences with repeated data values, and the result draws the line between fragments that can be automatically verified and those that cannot.

Core claim

The two-variable logic on data words extended with guarded predicates of the form L̃(x,y) recognised by a semigroup S is expressively equivalent to quasi-normal set automata whose transformation semigroup is S. When S is a linear band monoid the automata are ordered, and their emptiness problem is decidable by reduction to ordered multicounter automata. Consequently the extension is decidable exactly when S is an idempotent monoid whose two-sided ideals are linearly ordered. Set automata are introduced as an equivalent formalism to class automata and inherit their undecidability in general.

What carries the argument

Ordered quasi-normal set automata, which capture the extended logic for suitable monoids and reduce emptiness to ordered multicounter automata.

If this is right

  • Satisfiability for the extended logic is decidable whenever the recognising monoid is a linear band monoid.
  • Set automata are expressively equivalent to class automata on data words.
  • Emptiness of ordered quasi-normal set automata is decidable by reduction to emptiness of ordered multicounter automata.
  • The decidable cases of the extended logic are exactly those monoids that are idempotent with linearly ordered two-sided ideals.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The monoid characterisation could be used as a quick syntactic check before attempting verification of a data-word specification.
  • Similar guarded-predicate extensions might be analysed in other automata models by translating them into set automata.
  • The reduction to multicounter automata may supply new algorithms for related problems on data words with limited repetition patterns.
  • The boundary identified here could guide the design of restricted query languages that remain decidable while covering useful properties of data values.

Load-bearing premise

The exact equivalence between the guarded two-variable logic and quasi-normal set automata with transformation semigroup S, together with the decidability of emptiness for the ordered subclass via reduction to ordered multicounter automata.

What would settle it

An explicit guarded predicate from an idempotent monoid with non-linearly ordered two-sided ideals for which satisfiability is undecidable, or a concrete ordered quasi-normal set automaton whose emptiness cannot be decided by the multicounter reduction.

read the original abstract

We extend the two-variable logic on data words with guarded regular binary predicates of the form $\widetilde{L}(x,y)$ that is true if positions $x$ and $y$ are in the same class and the factor strictly between $x$ and $y$ is in the regular language $L$. We characterise the class of monoids for which the extension of the two-variable logic with guarded predicates recognised by the monoid is decidable, namely the class of idempotent monoids whose two-sided ideals are linearly ordered. For this, we introduce an automata formalism, set automata, that is equivalent to the class automata of Boja\'nczyk and Lasota and thus has an undecidable emptiness problem. We identify a subclass of set automata called ordered quasi-normal set automata that has a decidable emptiness problem by reduction to the emptiness problem of ordered multicounter automata. We show that the two-variable logic extended with guarded regular predicates recognised by a semigroup $S$ is expressively equivalent to a quasi-normal set automaton with the semigroup of transformations $S$. In particular, if $S$ is a linear band monoid then the resulting automaton is ordered, and the decidability result follows.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper extends two-variable logic on data words with guarded regular binary predicates of the form L̃(x,y) recognized by a monoid. It introduces set automata, proves them equivalent to the class automata of Bojańczyk and Lasota (hence undecidable emptiness), and isolates the subclass of ordered quasi-normal set automata whose emptiness reduces to that of ordered multicounter automata. It then establishes that the extended logic over semigroup S is expressively equivalent to quasi-normal set automata whose transformation monoid is exactly S, and that these automata are ordered precisely when S is an idempotent monoid with linearly ordered two-sided ideals, yielding the decidability characterization.

Significance. If the stated equivalences and reductions hold, the work supplies a clean algebraic characterization of the decidability frontier for this natural extension of FO² on data words. The set-automata formalism and the quasi-normal subclass provide reusable tools that connect logic, automata, and semigroup theory, building directly on prior results about class automata and multicounter automata emptiness.

minor comments (2)
  1. The abstract uses both 'idempotent monoids whose two-sided ideals are linearly ordered' for the characterization and 'linear band monoid' for the special case that yields ordered automata; a brief clarification of the relationship (or proof that they coincide) would remove potential reader confusion.
  2. The reduction from ordered quasi-normal set automata to ordered multicounter automata is asserted to preserve decidability, but a short remark on how the ordering of the transformation monoid is maintained through the construction would strengthen the presentation.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful and positive assessment of our work, including the accurate summary of the main contributions and the recommendation for minor revision. We will incorporate any minor changes suggested during the revision process.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The derivation proceeds by introducing set automata as a formalism equivalent to the externally known class automata of Bojańczyk-Lasota (with undecidable emptiness), then defining the syntactic subclass of ordered quasi-normal set automata whose emptiness reduces to the known decidable emptiness problem for ordered multicounter automata. The central equivalence theorem asserts that the guarded two-variable logic over semigroup S is expressively equivalent to a quasi-normal set automaton whose transformation monoid is exactly S, with the ordering property holding precisely when S is an idempotent monoid with linearly ordered ideals. These steps are presented as explicit theorems with reductions to prior external automata results rather than any self-definitional closure, fitted-parameter prediction, or load-bearing self-citation chain internal to the paper. The characterization therefore remains independent of its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The paper rests on standard results from automata theory (equivalence of set automata to class automata, decidability of ordered multicounter emptiness) and introduces new formalisms whose properties are proved within the work.

axioms (2)
  • standard math Emptiness for ordered multicounter automata is decidable
    Invoked as the target problem in the reduction establishing decidability of the ordered quasi-normal subclass.
  • domain assumption Set automata are equivalent in power to class automata of Bojańczyk and Lasota
    Stated directly in the abstract as the foundation for using set automata to model the logic.
invented entities (2)
  • set automata no independent evidence
    purpose: Automata model equivalent to class automata for capturing the extended logic
    New formalism introduced to study emptiness and decidability limits.
  • ordered quasi-normal set automata no independent evidence
    purpose: Subclasse of set automata whose emptiness is decidable
    Defined to obtain the decidability result for linear band monoids.

pith-pipeline@v0.9.0 · 5520 in / 1496 out tokens · 74691 ms · 2026-05-12T01:49:56.283994+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

22 extracted references · 22 canonical work pages

  1. [1]

    We ensure that the first position is labelled by transitions fromI, i.e., ∀x(∀y y+ 1̸=x→ ⋁ δ∈I Xδ(x))

    LetI⊆∆denote the set of transitions from an initial state of the automaton. We ensure that the first position is labelled by transitions fromI, i.e., ∀x(∀y y+ 1̸=x→ ⋁ δ∈I Xδ(x))

  2. [2]

    We ensure that the last position is labelled by a transition inF, i.e., ∀x(∀y x+ 1̸=y→ ⋁ δ∈F Xδ(x))

    Let F⊆∆denote the set of transitions to a final state of the automaton. We ensure that the last position is labelled by a transition inF, i.e., ∀x(∀y x+ 1̸=y→ ⋁ δ∈F Xδ(x))

  3. [3]

    We state that consecutive positions are labelled by compatible transitions, i.e., ∀x∀y(x+ 1 =y→ ⋁ δ,δ′∈∆, comp(δ,δ′) Xδ(x)∧Xδ′(y))

    We say a pair of transitions(δ,δ′)arecompatible, denoted as comp(δ,δ′), if the target state ofδis the source state ofδ′. We state that consecutive positions are labelled by compatible transitions, i.e., ∀x∀y(x+ 1 =y→ ⋁ δ,δ′∈∆, comp(δ,δ′) Xδ(x)∧Xδ′(y))

  4. [4]

    We ensure that class-minimal positions are labelled by transitions from∆¯0, i.e., ∀x(¬∃yy 1 =x→ ⋁ δ∈∆¯0 Xδ(x))

    Let∆ ¯0 be the set of transitions where the characteristic vector is the zero-vector. We ensure that class-minimal positions are labelled by transitions from∆¯0, i.e., ∀x(¬∃yy 1 =x→ ⋁ δ∈∆¯0 Xδ(x))

  5. [5]

    Let F⊆∆denote the set of transitions to a local final state ofA

    Assume that the set automatonA has the acceptance condition (2) given in Lemma 41. Let F⊆∆denote the set of transitions to a local final state ofA. We state that every class-maximal position is labelled by a transition to a local final state, i.e., ∀x(∀yx 1̸=y→ ⋁ δ∈F Xδ(x))

  6. [6]

    That is, the labelling by theY-predicates are consistent with the run indicated by theX-predicates

    It remains to ensure that the data values present in the setsS are according to the updates made by the transitions in∆. That is, the labelling by theY-predicates are consistent with the run indicated by theX-predicates. Toward this, we introduce the below formula. Letρ∈2S be a characteristic vector. To capture that the characteristic vector of membership...

  7. [7]

    Let2 Z denote the characteristic vectors of the monadic predicatesZ

    Finally, we ensure that the data values present in the setsS in two class-successive positions are consistent with the global updates in the positions strictly between them. Let2 Z denote the characteristic vectors of the monadic predicatesZ. LetΣ ′= Σ×2Z. LetΣ ′⊆Σ×2X be the set of all vectors that denotes the interpretation of the monadic variablesY. Let...

  8. [8]

    (F) The data valued is inFm if and only if there is a factor defined by the latestαi in d’s class and the latest occurrence ofd whose image ism

    for somex∈αd i (0,xi 0)with the imagem. (F) The data valued is inFm if and only if there is a factor defined by the latestαi in d’s class and the latest occurrence ofd whose image ism. That is to say, the factorw(xi 0,x0) has imagem. On encountering a pair(ℓ,d′)the automaton performs the following actions in addition to those described for the sets inXand...

  9. [9]

    (F) The data valued is inFm if and only if there is a factor defined by the latestαi ofd that has not been witnessed and the latest occurrence ofd whose image ism

    for somex∈αd i (0,xi 0)with the imagem. (F) The data valued is inFm if and only if there is a factor defined by the latestαi ofd that has not been witnessed and the latest occurrence ofd whose image ism. That is to say, the factorw (xi 0,x0) has imagem. On encountering a pair(ℓ,d′)the automaton performs the following actions in addition to those described...

  10. [10]

    (F) The data valued is inFm if and only if there is a factor defined by the latestβi of d that has not been witnessed and the latest occurrence ofd whose image ism

    for somex∈βd i (0,xi 0)with the imagem. (F) The data valued is inFm if and only if there is a factor defined by the latestβi of d that has not been witnessed and the latest occurrence ofd whose image ism. That is to say, the factorw (xi 0,x0) has imagem. 32 Set Automata and Limits of Decidability of Two-Variable Logic on Data Words On encountering a pair(...

  11. [11]

    Let∆I⊆∆denote the set of transitions from an initial state

    The first position is labelled by a transition from an initial state. Let∆I⊆∆denote the set of transitions from an initial state. We write ∃x ( (∀yx<y∨x=y)→ ⋁ γj∈∆I γj(x) )

  12. [12]

    Let∆F⊆∆denote the set of transitions to a final state

    The last position is labelled by a transition to a final state. Let∆F⊆∆denote the set of transitions to a final state. We write ∃x ( (∀yy <x∨y=x)→ ⋁ γj∈∆F γj(x) )

  13. [13]

    Then, we state that consecutive positions are labelled by compatible transitions

    We say two transitionsγi,γj∈∆arecompatible, denoted as comp(γi,γj), if the target state ofγi is the source state ofγj. Then, we state that consecutive positions are labelled by compatible transitions. We write ∀x∀y ( y=x+ 1∧γi(x)→ ⋁ γi,γj∈∆, comp(γi,γj) γj(y) ) . 34 Set Automata and Limits of Decidability of Two-Variable Logic on Data Words 5.Each class i...

  14. [14]

    We write ∀x∀y ⋀ i∈{1,2} ( ⋁ γιi∈∆ιi γδi∈∆δi ( γιi(x)∧γδi(y)∨γδi(x)∧γιi(y) ) ∧x∼y→¬^J∆∗∆ zi∆∗K(x,y) )

    There is no position with a label from∆zi between two positions with labels from∆ιi and∆ δi of the same class, fori∈{1,2}. We write ∀x∀y ⋀ i∈{1,2} ( ⋁ γιi∈∆ιi γδi∈∆δi ( γιi(x)∧γδi(y)∨γδi(x)∧γιi(y) ) ∧x∼y→¬^J∆∗∆ zi∆∗K(x,y) ) . We observe that the languages∆∗∆ z1∆∗and∆ ∗∆ z2∆∗belong to the given familyF withΣ = ∆, renaming∆ z1 asaand∆ z2 asb. This concludes...

  15. [15]

    We say two transitionsγi,γj ∈∆are compatible, denoted as comp(γi,γj), if the target state ofγi is the source state ofγj

    The string projection is a word in(∆c∗ 1c∗ 2)∗. We say two transitionsγi,γj ∈∆are compatible, denoted as comp(γi,γj), if the target state ofγi is the source state ofγj. Then, we state that two positions labelled by transitions in∆with no such position in between, are labelled by compatible transitions. This is written as anEMSO2[Σ,<, +1] formula

  16. [16]

    This is written as anEMSO2[Σ,< ,+1,∼, 1]formula

    Each class is a word in∆ιic+ i ∆ δi or∆ zi fori∈{1, 2}. This is written as anEMSO2[Σ,< ,+1,∼, 1]formula

  17. [17]

    Let∆I⊆∆denote the set of transitions from an initial state

    The first position is labelled by a transition from an initial state. Let∆I⊆∆denote the set of transitions from an initial state. Then, we write ∃x ( (∀yx<y∨x=y)→ ⋁ γj∈∆I γj(x) ) . S. Guha, A. Manuel, and S P. Rishal 35

  18. [18]

    Let∆F⊆∆denote the set of transitions to a final state

    The last position is labelled by a transition to a final state. Let∆F⊆∆denote the set of transitions to a final state. Then, we write ∃x ( (∀yy <x∨y=x)→ ⋁ γj∈∆F γj(x) )

  19. [19]

    This is a regular property that can be written as an EMSO2[Σ,<,+1]formula

    For each position with a label from∆zi, we ensure that the previous factor of the form ∆c∗ 1c∗ 2 does not have anyci’s. This is a regular property that can be written as an EMSO2[Σ,<,+1]formula

  20. [20]

    We write ∀x∀y ⋀ i∈{1,2} ( ci(x)∧ci(y)∧x 1 =y→ ^J(c1 +c 2)∗∆(c 1 +c 2)∗K(x,y) )

    Between two positions labelled byci of the same class, there is exactly one position with label from∆, fori∈{1,2}. We write ∀x∀y ⋀ i∈{1,2} ( ci(x)∧ci(y)∧x 1 =y→ ^J(c1 +c 2)∗∆(c 1 +c 2)∗K(x,y) ) . We observe that the language(c1 +c2)∗∆(c1 +c2)∗belongs to the given familyF with alphabetΣ = ∆∪{c1,c 2}, renaming∆asa. Hence, the result follows.◀ Proofs for Sec...

  21. [21]

    2.(F⊆Q,Fℓ⊆Q):q∈Fand at every class maximal position the state reached is inFℓ

    (F⊆Q,C⊆2Y ): q∈Fand the characteristic vector of each data value present inXis inC. 2.(F⊆Q,Fℓ⊆Q):q∈Fand at every class maximal position the state reached is inFℓ

  22. [22]

    4.(F⊆Q,S⊆Y):q∈Fand the subsetSof the sets are empty

    (F⊆Q,C⊆2Y ): q∈Fand at every class maximal position with data valued, the characteristic vector ofdis inC. 4.(F⊆Q,S⊆Y):q∈Fand the subsetSof the sets are empty. Proof. (1⇒2): LetA1 = (Q,Y,A, ∆,I,F,C ), with acceptance condition (1). The idea is to verify that at each class maximal positioni, the characteristic vectorci of the corresponding data value is su...