arxiv: 2605.09077 · v1 · submitted 2026-05-09 · 💻 cs.LO · cs.FL

Recognition: 2 theorem links

· Lean Theorem

Set Automata and Limits of Decidability of Two-Variable Logic on Data Words

Amaldev Manuel, Shibashis Guha, S P Rishal

Authors on Pith no claims yet

Pith reviewed 2026-05-12 01:49 UTC · model grok-4.3

classification 💻 cs.LO cs.FL

keywords two-variable logicdata wordsset automataguarded predicatesdecidabilitymonoidsclass automatamulticounter automata

0 comments

The pith

Two-variable logic on data words extended with guarded predicates is decidable exactly for idempotent monoids whose two-sided ideals are linearly ordered.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a precise boundary for when extending two-variable logic on data words with guarded regular predicates keeps satisfiability decidable. It introduces set automata as a model equivalent to existing class automata, then isolates a subclass of ordered quasi-normal set automata whose emptiness problem reduces to that of ordered multicounter automata. The logic with predicates recognised by a semigroup S matches quasi-normal set automata whose transformations come from S, so the subclass yields decidability precisely when S is a linear band monoid. This matters because data words model sequences with repeated data values, and the result draws the line between fragments that can be automatically verified and those that cannot.

Core claim

The two-variable logic on data words extended with guarded predicates of the form L̃(x,y) recognised by a semigroup S is expressively equivalent to quasi-normal set automata whose transformation semigroup is S. When S is a linear band monoid the automata are ordered, and their emptiness problem is decidable by reduction to ordered multicounter automata. Consequently the extension is decidable exactly when S is an idempotent monoid whose two-sided ideals are linearly ordered. Set automata are introduced as an equivalent formalism to class automata and inherit their undecidability in general.

What carries the argument

Ordered quasi-normal set automata, which capture the extended logic for suitable monoids and reduce emptiness to ordered multicounter automata.

If this is right

Satisfiability for the extended logic is decidable whenever the recognising monoid is a linear band monoid.
Set automata are expressively equivalent to class automata on data words.
Emptiness of ordered quasi-normal set automata is decidable by reduction to emptiness of ordered multicounter automata.
The decidable cases of the extended logic are exactly those monoids that are idempotent with linearly ordered two-sided ideals.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The monoid characterisation could be used as a quick syntactic check before attempting verification of a data-word specification.
Similar guarded-predicate extensions might be analysed in other automata models by translating them into set automata.
The reduction to multicounter automata may supply new algorithms for related problems on data words with limited repetition patterns.
The boundary identified here could guide the design of restricted query languages that remain decidable while covering useful properties of data values.

Load-bearing premise

The exact equivalence between the guarded two-variable logic and quasi-normal set automata with transformation semigroup S, together with the decidability of emptiness for the ordered subclass via reduction to ordered multicounter automata.

What would settle it

An explicit guarded predicate from an idempotent monoid with non-linearly ordered two-sided ideals for which satisfiability is undecidable, or a concrete ordered quasi-normal set automaton whose emptiness cannot be decided by the multicounter reduction.

read the original abstract

We extend the two-variable logic on data words with guarded regular binary predicates of the form $\widetilde{L}(x,y)$ that is true if positions $x$ and $y$ are in the same class and the factor strictly between $x$ and $y$ is in the regular language $L$. We characterise the class of monoids for which the extension of the two-variable logic with guarded predicates recognised by the monoid is decidable, namely the class of idempotent monoids whose two-sided ideals are linearly ordered. For this, we introduce an automata formalism, set automata, that is equivalent to the class automata of Boja\'nczyk and Lasota and thus has an undecidable emptiness problem. We identify a subclass of set automata called ordered quasi-normal set automata that has a decidable emptiness problem by reduction to the emptiness problem of ordered multicounter automata. We show that the two-variable logic extended with guarded regular predicates recognised by a semigroup $S$ is expressively equivalent to a quasi-normal set automaton with the semigroup of transformations $S$. In particular, if $S$ is a linear band monoid then the resulting automaton is ordered, and the decidability result follows.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper cleanly characterizes the monoids that keep two-variable logic with guarded predicates decidable on data words by introducing set automata and a decidable subclass.

read the letter

The paper pins down exactly when you can add guarded regular predicates to two-variable logic on data words and still have decidable satisfiability. It does this by defining set automata, proving they match the known class automata, and then isolating the ordered quasi-normal subclass whose emptiness reduces to ordered multicounter automata. The main theorem says the extended logic is decidable precisely for idempotent monoids whose two-sided ideals are linearly ordered, because the logic matches a quasi-normal automaton whose transformation monoid is S and that automaton is ordered exactly when S has those properties. This is new: the set automata model and the monoid characterization do not appear in the cited prior work. It organizes scattered decidability results around a single structural condition on the monoid and gives a uniform reduction path. The reductions look direct and build on established emptiness results for multicounter automata without obvious circularity. The soft spots are in the details of the equivalences. The claim that the logic is expressively equivalent to the quasi-normal automaton with transformation semigroup S needs the encoding of the guarded predicates to be exact in both directions, and any looseness there would weaken the boundary. The step that turns the ordering of ideals into an ordered automaton also requires careful checking to ensure no extra power sneaks in. These are standard technical points rather than load-bearing gaps. This is for people working on automata and logic for data words, especially in verification and database theory. A reader focused on decidability frontiers in these logics will get a usable boundary result. I would send it to peer review. The characterization is sharp enough and the new formalism is concrete enough to merit referee time.

Referee Report

0 major / 2 minor

Summary. The paper extends two-variable logic on data words with guarded regular binary predicates of the form L̃(x,y) recognized by a monoid. It introduces set automata, proves them equivalent to the class automata of Bojańczyk and Lasota (hence undecidable emptiness), and isolates the subclass of ordered quasi-normal set automata whose emptiness reduces to that of ordered multicounter automata. It then establishes that the extended logic over semigroup S is expressively equivalent to quasi-normal set automata whose transformation monoid is exactly S, and that these automata are ordered precisely when S is an idempotent monoid with linearly ordered two-sided ideals, yielding the decidability characterization.

Significance. If the stated equivalences and reductions hold, the work supplies a clean algebraic characterization of the decidability frontier for this natural extension of FO² on data words. The set-automata formalism and the quasi-normal subclass provide reusable tools that connect logic, automata, and semigroup theory, building directly on prior results about class automata and multicounter automata emptiness.

minor comments (2)

The abstract uses both 'idempotent monoids whose two-sided ideals are linearly ordered' for the characterization and 'linear band monoid' for the special case that yields ordered automata; a brief clarification of the relationship (or proof that they coincide) would remove potential reader confusion.
The reduction from ordered quasi-normal set automata to ordered multicounter automata is asserted to preserve decidability, but a short remark on how the ordering of the transformation monoid is maintained through the construction would strengthen the presentation.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful and positive assessment of our work, including the accurate summary of the main contributions and the recommendation for minor revision. We will incorporate any minor changes suggested during the revision process.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The derivation proceeds by introducing set automata as a formalism equivalent to the externally known class automata of Bojańczyk-Lasota (with undecidable emptiness), then defining the syntactic subclass of ordered quasi-normal set automata whose emptiness reduces to the known decidable emptiness problem for ordered multicounter automata. The central equivalence theorem asserts that the guarded two-variable logic over semigroup S is expressively equivalent to a quasi-normal set automaton whose transformation monoid is exactly S, with the ordering property holding precisely when S is an idempotent monoid with linearly ordered ideals. These steps are presented as explicit theorems with reductions to prior external automata results rather than any self-definitional closure, fitted-parameter prediction, or load-bearing self-citation chain internal to the paper. The characterization therefore remains independent of its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The paper rests on standard results from automata theory (equivalence of set automata to class automata, decidability of ordered multicounter emptiness) and introduces new formalisms whose properties are proved within the work.

axioms (2)

standard math Emptiness for ordered multicounter automata is decidable
Invoked as the target problem in the reduction establishing decidability of the ordered quasi-normal subclass.
domain assumption Set automata are equivalent in power to class automata of Bojańczyk and Lasota
Stated directly in the abstract as the foundation for using set automata to model the logic.

invented entities (2)

set automata no independent evidence
purpose: Automata model equivalent to class automata for capturing the extended logic
New formalism introduced to study emptiness and decidability limits.
ordered quasi-normal set automata no independent evidence
purpose: Subclasse of set automata whose emptiness is decidable
Defined to obtain the decidability result for linear band monoids.

pith-pipeline@v0.9.0 · 5520 in / 1496 out tokens · 74691 ms · 2026-05-12T01:49:56.283994+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear
We characterise the class of monoids for which the extension ... is decidable, namely the class of idempotent monoids whose two-sided ideals are linearly ordered.
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear
ordered quasi-normal set automata ... reduction to the emptiness problem of ordered multicounter automata

Reference graph

Works this paper leans on

22 extracted references · 22 canonical work pages

[1]

We ensure that the first position is labelled by transitions fromI, i.e., ∀x(∀y y+ 1̸=x→ ⋁ δ∈I Xδ(x))

LetI⊆∆denote the set of transitions from an initial state of the automaton. We ensure that the first position is labelled by transitions fromI, i.e., ∀x(∀y y+ 1̸=x→ ⋁ δ∈I Xδ(x))

work page
[2]

We ensure that the last position is labelled by a transition inF, i.e., ∀x(∀y x+ 1̸=y→ ⋁ δ∈F Xδ(x))

Let F⊆∆denote the set of transitions to a final state of the automaton. We ensure that the last position is labelled by a transition inF, i.e., ∀x(∀y x+ 1̸=y→ ⋁ δ∈F Xδ(x))

work page
[3]

We state that consecutive positions are labelled by compatible transitions, i.e., ∀x∀y(x+ 1 =y→ ⋁ δ,δ′∈∆, comp(δ,δ′) Xδ(x)∧Xδ′(y))

We say a pair of transitions(δ,δ′)arecompatible, denoted as comp(δ,δ′), if the target state ofδis the source state ofδ′. We state that consecutive positions are labelled by compatible transitions, i.e., ∀x∀y(x+ 1 =y→ ⋁ δ,δ′∈∆, comp(δ,δ′) Xδ(x)∧Xδ′(y))

work page
[4]

We ensure that class-minimal positions are labelled by transitions from∆¯0, i.e., ∀x(¬∃yy 1 =x→ ⋁ δ∈∆¯0 Xδ(x))

Let∆ ¯0 be the set of transitions where the characteristic vector is the zero-vector. We ensure that class-minimal positions are labelled by transitions from∆¯0, i.e., ∀x(¬∃yy 1 =x→ ⋁ δ∈∆¯0 Xδ(x))

work page
[5]

Let F⊆∆denote the set of transitions to a local final state ofA

Assume that the set automatonA has the acceptance condition (2) given in Lemma 41. Let F⊆∆denote the set of transitions to a local final state ofA. We state that every class-maximal position is labelled by a transition to a local final state, i.e., ∀x(∀yx 1̸=y→ ⋁ δ∈F Xδ(x))

work page
[6]

That is, the labelling by theY-predicates are consistent with the run indicated by theX-predicates

It remains to ensure that the data values present in the setsS are according to the updates made by the transitions in∆. That is, the labelling by theY-predicates are consistent with the run indicated by theX-predicates. Toward this, we introduce the below formula. Letρ∈2S be a characteristic vector. To capture that the characteristic vector of membership...

work page
[7]

Let2 Z denote the characteristic vectors of the monadic predicatesZ

Finally, we ensure that the data values present in the setsS in two class-successive positions are consistent with the global updates in the positions strictly between them. Let2 Z denote the characteristic vectors of the monadic predicatesZ. LetΣ ′= Σ×2Z. LetΣ ′⊆Σ×2X be the set of all vectors that denotes the interpretation of the monadic variablesY. Let...

work page
[8]

(F) The data valued is inFm if and only if there is a factor defined by the latestαi in d’s class and the latest occurrence ofd whose image ism

for somex∈αd i (0,xi 0)with the imagem. (F) The data valued is inFm if and only if there is a factor defined by the latestαi in d’s class and the latest occurrence ofd whose image ism. That is to say, the factorw(xi 0,x0) has imagem. On encountering a pair(ℓ,d′)the automaton performs the following actions in addition to those described for the sets inXand...

work page
[9]

(F) The data valued is inFm if and only if there is a factor defined by the latestαi ofd that has not been witnessed and the latest occurrence ofd whose image ism

for somex∈αd i (0,xi 0)with the imagem. (F) The data valued is inFm if and only if there is a factor defined by the latestαi ofd that has not been witnessed and the latest occurrence ofd whose image ism. That is to say, the factorw (xi 0,x0) has imagem. On encountering a pair(ℓ,d′)the automaton performs the following actions in addition to those described...

work page
[10]

(F) The data valued is inFm if and only if there is a factor defined by the latestβi of d that has not been witnessed and the latest occurrence ofd whose image ism

for somex∈βd i (0,xi 0)with the imagem. (F) The data valued is inFm if and only if there is a factor defined by the latestβi of d that has not been witnessed and the latest occurrence ofd whose image ism. That is to say, the factorw (xi 0,x0) has imagem. 32 Set Automata and Limits of Decidability of Two-Variable Logic on Data Words On encountering a pair(...

work page
[11]

Let∆I⊆∆denote the set of transitions from an initial state

The first position is labelled by a transition from an initial state. Let∆I⊆∆denote the set of transitions from an initial state. We write ∃x ( (∀yx<y∨x=y)→ ⋁ γj∈∆I γj(x) )

work page
[12]

Let∆F⊆∆denote the set of transitions to a final state

The last position is labelled by a transition to a final state. Let∆F⊆∆denote the set of transitions to a final state. We write ∃x ( (∀yy <x∨y=x)→ ⋁ γj∈∆F γj(x) )

work page
[13]

Then, we state that consecutive positions are labelled by compatible transitions

We say two transitionsγi,γj∈∆arecompatible, denoted as comp(γi,γj), if the target state ofγi is the source state ofγj. Then, we state that consecutive positions are labelled by compatible transitions. We write ∀x∀y ( y=x+ 1∧γi(x)→ ⋁ γi,γj∈∆, comp(γi,γj) γj(y) ) . 34 Set Automata and Limits of Decidability of Two-Variable Logic on Data Words 5.Each class i...

work page
[14]

We write ∀x∀y ⋀ i∈{1,2} ( ⋁ γιi∈∆ιi γδi∈∆δi ( γιi(x)∧γδi(y)∨γδi(x)∧γιi(y) ) ∧x∼y→¬^J∆∗∆ zi∆∗K(x,y) )

There is no position with a label from∆zi between two positions with labels from∆ιi and∆ δi of the same class, fori∈{1,2}. We write ∀x∀y ⋀ i∈{1,2} ( ⋁ γιi∈∆ιi γδi∈∆δi ( γιi(x)∧γδi(y)∨γδi(x)∧γιi(y) ) ∧x∼y→¬^J∆∗∆ zi∆∗K(x,y) ) . We observe that the languages∆∗∆ z1∆∗and∆ ∗∆ z2∆∗belong to the given familyF withΣ = ∆, renaming∆ z1 asaand∆ z2 asb. This concludes...

work page
[15]

We say two transitionsγi,γj ∈∆are compatible, denoted as comp(γi,γj), if the target state ofγi is the source state ofγj

The string projection is a word in(∆c∗ 1c∗ 2)∗. We say two transitionsγi,γj ∈∆are compatible, denoted as comp(γi,γj), if the target state ofγi is the source state ofγj. Then, we state that two positions labelled by transitions in∆with no such position in between, are labelled by compatible transitions. This is written as anEMSO2[Σ,<, +1] formula

work page
[16]

This is written as anEMSO2[Σ,< ,+1,∼, 1]formula

Each class is a word in∆ιic+ i ∆ δi or∆ zi fori∈{1, 2}. This is written as anEMSO2[Σ,< ,+1,∼, 1]formula

work page
[17]

Let∆I⊆∆denote the set of transitions from an initial state

The first position is labelled by a transition from an initial state. Let∆I⊆∆denote the set of transitions from an initial state. Then, we write ∃x ( (∀yx<y∨x=y)→ ⋁ γj∈∆I γj(x) ) . S. Guha, A. Manuel, and S P. Rishal 35

work page
[18]

Let∆F⊆∆denote the set of transitions to a final state

The last position is labelled by a transition to a final state. Let∆F⊆∆denote the set of transitions to a final state. Then, we write ∃x ( (∀yy <x∨y=x)→ ⋁ γj∈∆F γj(x) )

work page
[19]

This is a regular property that can be written as an EMSO2[Σ,<,+1]formula

For each position with a label from∆zi, we ensure that the previous factor of the form ∆c∗ 1c∗ 2 does not have anyci’s. This is a regular property that can be written as an EMSO2[Σ,<,+1]formula

work page
[20]

We write ∀x∀y ⋀ i∈{1,2} ( ci(x)∧ci(y)∧x 1 =y→ ^J(c1 +c 2)∗∆(c 1 +c 2)∗K(x,y) )

Between two positions labelled byci of the same class, there is exactly one position with label from∆, fori∈{1,2}. We write ∀x∀y ⋀ i∈{1,2} ( ci(x)∧ci(y)∧x 1 =y→ ^J(c1 +c 2)∗∆(c 1 +c 2)∗K(x,y) ) . We observe that the language(c1 +c2)∗∆(c1 +c2)∗belongs to the given familyF with alphabetΣ = ∆∪{c1,c 2}, renaming∆asa. Hence, the result follows.◀ Proofs for Sec...

work page
[21]

2.(F⊆Q,Fℓ⊆Q):q∈Fand at every class maximal position the state reached is inFℓ

(F⊆Q,C⊆2Y ): q∈Fand the characteristic vector of each data value present inXis inC. 2.(F⊆Q,Fℓ⊆Q):q∈Fand at every class maximal position the state reached is inFℓ

work page
[22]

4.(F⊆Q,S⊆Y):q∈Fand the subsetSof the sets are empty

(F⊆Q,C⊆2Y ): q∈Fand at every class maximal position with data valued, the characteristic vector ofdis inC. 4.(F⊆Q,S⊆Y):q∈Fand the subsetSof the sets are empty. Proof. (1⇒2): LetA1 = (Q,Y,A, ∆,I,F,C ), with acceptance condition (1). The idea is to verify that at each class maximal positioni, the characteristic vectorci of the corresponding data value is su...

work page