An Erd\H{o}s--Szekeres type result for words with repeats

Abigail Ollson; Jun Yan; Kyle Celano; Niraj Velankar

arxiv: 2510.23573 · v4 · pith:ACG7FNTJnew · submitted 2025-10-27 · 🧮 math.CO

An ErdH{o}s--Szekeres type result for words with repeats

Kyle Celano , Abigail Ollson , Niraj Velankar , Jun Yan This is my paper

Pith reviewed 2026-05-18 03:19 UTC · model grok-4.3

classification 🧮 math.CO MSC 05A0568R15

keywords Erdős–Szekeres theoremwords with repeatspattern avoidanceunavoidable setscombinatorics on wordsorder-isomorphic subwords

0 comments

The pith

Every word with kn^6+1 repeats must contain one of seven specific patterns.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves an Erdős–Szekeres style result for finite words over the natural numbers that permit repeated entries. It defines a repeat as any occurrence of a value after its first appearance and a pattern as a not-necessarily-consecutive subword order-isomorphic to a given sequence. The central result states that any such word containing kn^6+1 repeats must include at least one of seven listed patterns, including a run of k+2 identical symbols, certain double-rising or double-falling blocks, and repeating blocks such as 012⋯n012⋯n. When k equals 1 the bound is shown to be tight by an explicit construction of a word with exactly n^6 repeats that avoids every one of the seven patterns.

Core claim

We prove that every word with kn^6+1 repeats contains one of the following patterns: 0^{k+2}, 0011⋯nn, nn⋯1100, 012⋯n012⋯n, 012⋯nn⋯210, n⋯210012⋯n, n⋯210n⋯210. Moreover, when k=1, this is best possible by constructing a word with n^6 repeats that does not contain any of these patterns.

What carries the argument

The seven enumerated patterns that become unavoidable once the number of repeats in the word exceeds kn^6.

If this is right

Any word exceeding the repeat threshold is forced to realize at least one of the seven order-isomorphic subwords.
The quantitative bound kn^6+1 is tight for k=1 because an explicit avoiding word of length n^6 exists.
The result supplies a concrete threshold beyond which certain rising, falling, or repeated blocks must appear in sequences that allow repetitions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar avoidance thresholds might be derived for other finite collections of target patterns.
The construction for k=1 could be generalized to produce large avoiding words for higher k.
The patterns identified here may serve as a basis set for studying longer unavoidable configurations in words with bounded repeat multiplicity.

Load-bearing premise

The claim depends on these seven patterns being exactly the complete set of minimal unavoidable configurations once the repeat count passes the given threshold.

What would settle it

A single explicit word containing kn^6+1 repeats but none of the seven listed patterns would falsify the statement.

read the original abstract

We prove an Erd\H{o}s--Szekeres type result for finite words over $\mathbb{N}$ with repeated values. Specifically, we define a \emph{repeat} in a word to be an occurrence of a value which is not its first occurrence. We define an occurrence of a \emph{pattern} $\pi$ in a word $w$ to be a (not necessarily consecutive) subword of $w$ that is order isomorphic to $\pi$. In this note, we show that every word with $kn^6+1$ repeats contains one of the following patterns: $0^{k+2}$, $0011\cdots nn$, $nn\cdots1100$, $012 \cdots n012 \cdots n$, $012 \cdots nn\cdots 210$, $n\cdots 210012\cdots n$, $n\cdots 210n\cdots 210$. Moreover, when $k=1$, we show that this is best possible by constructing a word with $n^6$ repeats that does not contain any of these patterns.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This extends Erdős-Szekeres to words allowing repeats by bounding repeats before forcing one of seven patterns, with a matching construction only for k=1.

read the letter

This paper gives a bound on the number of repeats in a word before it must contain one of seven specific patterns. Specifically, kn^6 +1 repeats forces a pattern like 0^{k+2} or one of the mixed repeat patterns listed in the abstract. For k=1 they show this is tight by building a word with n^6 repeats that avoids all of them. The contribution is a natural extension of Erdős-Szekeres to sequences with repeated entries, where repeats are counted separately from the distinct values. They define patterns via order isomorphism on subwords, which fits the setting. The proof of the upper bound and the explicit construction for the lower bound when k=1 are both present, which is good. The construction for tightness at k=1 is the strongest part, as it provides a concrete example rather than just an existence argument. This makes the result more convincing for that case. The potential issue is in the upper bound. It seems to rely on case analysis or a structural breakdown of how repeats can be added without creating the forbidden patterns. If that analysis misses some configuration, then the kn^6 threshold might not hold, and words could have more repeats while avoiding the seven patterns. The paper does not show a matching construction for k greater than 1, so the general bound could perhaps be improved or might need adjustment. This is relevant for people studying extremal combinatorics on words. Anyone working on generalizations of permutation patterns to words with repeats would get value from seeing how the bound is derived and the construction is built. The work shows clear thinking on the problem and engages with the literature on Erdős-Szekeres. I would bring this to a reading group to go over the proof. It is worth citing in related work on pattern avoidance. It deserves peer review to check the details of the case analysis and see if the exponent 6 can be lowered.

Referee Report

2 major / 2 minor

Summary. The paper proves an Erdős–Szekeres-type result for finite words over ℕ: any word with kn^6 + 1 repeats must contain one of seven specified patterns (0^{k+2}, 0011⋯nn, nn⋯1100, 012⋯n012⋯n, 012⋯nn⋯210, n⋯210012⋯n, or n⋯210n⋯210). For k=1 the bound is shown to be tight by an explicit construction of a word with exactly n^6 repeats that avoids all seven patterns.

Significance. The result supplies a quantitative bound on unavoidable order-isomorphic patterns in words once the number of repeats exceeds a polynomial threshold in n. The matching construction for k=1 is a concrete strength, as it demonstrates that the stated threshold is best possible in that case and supplies an explicit extremal example.

major comments (2)

[§3] §3 (upper-bound proof): the structural decomposition that classifies words avoiding the seven patterns according to the order of first occurrences and the placement of subsequent repeats is load-bearing for the kn^6 claim. The argument must show that every possible placement of repeats either creates one of the seven patterns or is bounded by kn^6; if any configuration is omitted, the quantitative bound fails. Please supply an explicit enumeration of the cases or a diagram that makes the exhaustion visible.
[§4] §4 (construction for k=1): the explicit word with n^6 repeats is asserted to avoid all seven patterns, but the verification that it contains no occurrence of, e.g., 012⋯n012⋯n or n⋯210n⋯210 is only sketched. A short table or inductive argument confirming that each forbidden pattern is blocked would strengthen the lower-bound claim.

minor comments (2)

[§1] The seven target patterns are introduced in the abstract and §1 with abbreviated notation (e.g., 0011⋯nn). A single displayed definition block that writes each pattern in full, together with a brief explanation of what “order-isomorphic” means in this context, would improve readability.
[§2] Notation for repeats (occurrences that are not first appearances) is used throughout but never given a formal symbol. Introducing a short-hand such as r(w) for the number of repeats would make the statements of the main theorems easier to parse.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading, the positive assessment of the result, and the recommendation for minor revision. We address each major comment below and will revise the manuscript to improve the clarity of the arguments.

read point-by-point responses

Referee: [§3] §3 (upper-bound proof): the structural decomposition that classifies words avoiding the seven patterns according to the order of first occurrences and the placement of subsequent repeats is load-bearing for the kn^6 claim. The argument must show that every possible placement of repeats either creates one of the seven patterns or is bounded by kn^6; if any configuration is omitted, the quantitative bound fails. Please supply an explicit enumeration of the cases or a diagram that makes the exhaustion visible.

Authors: We agree that the structural decomposition in Section 3 would benefit from greater explicitness to make the exhaustion of cases fully transparent. In the revised version we will insert a diagram that depicts the possible orderings of first occurrences together with the admissible placements of subsequent repeats. We will also add an enumerated list of the resulting configurations, showing in each case either that one of the seven patterns is forced or that the total number of repeats is at most kn^6. This revision directly addresses the concern that an omitted configuration could invalidate the quantitative bound. revision: yes
Referee: [§4] §4 (construction for k=1): the explicit word with n^6 repeats is asserted to avoid all seven patterns, but the verification that it contains no occurrence of, e.g., 012⋯n012⋯n or n⋯210n⋯210 is only sketched. A short table or inductive argument confirming that each forbidden pattern is blocked would strengthen the lower-bound claim.

Authors: We thank the referee for this concrete suggestion. In the revision we will augment Section 4 with a short table that records, for each of the seven patterns, the structural reason it is absent from the constructed word. For the two more involved patterns 012⋯n012⋯n and n⋯210n⋯210 we will supply a brief inductive argument on n that confirms avoidance. These additions will make the verification of the extremal example fully rigorous while preserving the explicit nature of the construction. revision: yes

Circularity Check

0 steps flagged

Direct combinatorial proof and explicit construction are self-contained

full rationale

The paper establishes the upper bound via structural decomposition of words avoiding the seven listed patterns, using the sequence of first occurrences and relative placements of repeats to bound their total number by kn^6. This is paired with an explicit construction achieving n^6 repeats for k=1 without any of the patterns. No step reduces by definition to its own output, no parameters are fitted to data and then relabeled as predictions, and no load-bearing premise rests on a self-citation chain. The argument is a standard case-analysis proof in extremal combinatorics on words and remains independent of its own conclusions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The proof relies on standard definitions of order-isomorphic patterns and the notion of repeats as non-first occurrences; no free parameters, invented entities, or ad-hoc axioms are introduced in the abstract statement.

axioms (1)

standard math A pattern occurrence is a (not necessarily consecutive) subword order-isomorphic to the template sequence.
This is the standard definition in permutation pattern avoidance extended to words.

pith-pipeline@v0.9.0 · 5729 in / 1449 out tokens · 34944 ms · 2026-05-18T03:19:32.078774+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Every word with kn⁶+1 repeats contains one of the following patterns: 0^{k+2}, 0011⋯nn, nn⋯1100, 012⋯n012⋯n, …

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.