arxiv: 2605.12215 · v1 · submitted 2026-05-12 · 🧮 math.CO

Recognition: 2 theorem links

· Lean Theorem

A Tighter Upper Bound for the Number of Distinct Squares in Circular Words

Shuo Li, Yuan Song

Pith reviewed 2026-05-13 04:24 UTC · model grok-4.3

classification 🧮 math.CO

keywords circular wordssquare factorsupper boundscombinatorics on wordscyclic rotationsdistinct squares

0 comments

The pith

Any circular word of length n has at most 5/3 n distinct square factors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a new upper bound showing that the distinct square factors across all cyclic rotations of any finite word w of length n total at most 5/3 n. This improves the prior bound of 1.8 n from Charalampopoulos et al. and narrows the gap to their conjectured sharp bound of 1.5 n. A square is any factor of the form uu, and the circular word [w] collects every rotation of w so that the count considers squares appearing in any of them. The result follows from a refined counting argument that tracks how squares can overlap or repeat across rotations.

Core claim

We improve this upper bound to 5/3 n. Given a finite word w of length n, let [w] denote the corresponding circular word, i.e., the set of all cyclic rotations of w. We study the number of distinct square factors of the elements of [w].

What carries the argument

Combinatorial case analysis of overlaps and positions of squares across the cyclic rotations in [w].

If this is right

The total count of distinct uu factors across all rotations of any word is at most 5/3 n.
This bound holds uniformly for every finite word and every cyclic shift.
The known upper bound is now strictly tighter than the previous 1.8 n result.
The gap between the proven bound and the conjectured 1.5 n limit is reduced.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the same overlap-counting technique can be tightened further, it may reach the conjectured 1.5 n bound.
The counting method may extend to related problems such as bounding other periodic factors in circular strings.

Load-bearing premise

The case analysis correctly accounts for all possible overlaps and placements of squares in the rotations without missing configurations or overcounting distinct factors.

What would settle it

A concrete word of length n whose rotations together contain more than 5/3 n distinct squares.

Figures

Figures reproduced from arXiv: 2605.12215 by Shuo Li, Yuan Song.

**Figure 1.** Figure 1: Γ1(p ω) a b c ab ba ac ca [p]1 = {a, b, c} [p]2 = {ab, ba, ac, ca} Γ2(p ω) ab ba ac ca aba aca bac cab [p]2 = {ab, ba, ac, ca} [p]3 = {aba, bac, aca, cab} Γ3(p ω) aba bac aca cab abac acab baca caba [p]3 = {aba, bac, aca, cab} [p]4 = {abac, baca, acab, caba} [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗

read the original abstract

A \emph{square} is a word of the form $uu$, where $u$ is a nonempty finite word. Given a finite word $w$ of length $n$, let $[w]$ denote the corresponding \emph{circular word}, i.e., the set of all cyclic rotations of $w$. We study the number of distinct square factors of the elements of $[w]$. Amit and Gawrychowski first showed that this number is upper bounded by $3.14n$. In a recent article, Charalampopoulos et al. improved this upper bound to $1.8n$ and conjectured that the sharp upper bound is $1.5n$. In this note, we improve this upper bound to $\frac{5}{3}n$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This note tightens the upper bound on distinct squares in circular words to 5/3 n through a refined count of overlaps across rotations.

read the letter

This paper tightens the upper bound for distinct squares in circular words to 5/3 n. That's the main point to take away. Li and Song start from the 1.8n bound of Charalampopoulos et al. and derive a better constant through a more detailed look at square overlaps and their behavior under cyclic rotations. The argument partitions the squares by period and counts how many distinct ones can survive across the rotations of the word. The work does what it sets out to do: it delivers an explicit improvement without introducing new concepts or heavy machinery. The progression from 3.14n to 1.8n to 5/3 n shows steady refinement in the field. The main concern is the completeness of the case analysis. The bound depends on showing that no additional distinct squares appear beyond what their counting allows in the circular case. Any overlooked configuration of nested or overlapping squares would weaken the 5/3 factor. Since the paper is a short note, the details of that split are crucial and should be checked carefully. This is for researchers focused on combinatorics on words, particularly those studying repetition bounds or square factors. A reader following the conjecture toward 1.5n would find it relevant as another data point. I would recommend sending it for peer review. The result is a clear, falsifiable improvement, and the subfield has the expertise to validate the casework.

Referee Report

2 major / 2 minor

Summary. The manuscript improves the known upper bound on the number of distinct square factors across all cyclic rotations of a length-n word from 1.8n (Charalampopoulos et al.) to (5/3)n. The argument proceeds by a combinatorial case analysis that partitions squares according to their periods and the manner in which they appear (or fail to appear) under cyclic rotation, then charges the distinct squares to obtain the coefficient 5/3.

Significance. A correct proof would constitute a concrete, parameter-free tightening of the bound on square-richness in circular words and would narrow the gap to the 1.5n conjecture. Because the derivation is purely combinatorial and contains no fitted constants, it would be a useful incremental result in combinatorics on words.

major comments (2)

[§3] §3 (main counting argument): the case split on square periods and their survival across rotations is presented as exhaustive, yet the text does not supply an independent counting lemma or enumeration that would confirm no configuration of nested or multiply-overlapping squares has been omitted. Any missed configuration permitting an extra distinct square in some rotation would falsify the 5/3 coefficient.
[Theorem 1] Theorem 1 and its proof: the charging scheme assigns each distinct square to a unique rotation or period class, but the argument does not explicitly bound the number of squares that could be charged to the same class when the word contains long runs of identical letters or high-period squares that appear in multiple non-consecutive rotations.

minor comments (2)

The abstract states the new bound but does not recall the exact statement of the 1.8n result it improves; a one-sentence reminder would aid readability.
Notation for the set [w] of rotations is introduced without an explicit definition of what constitutes a 'distinct square factor' of an element of [w]; a short clarifying sentence would prevent ambiguity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive suggestions. The comments highlight areas where the presentation of the case analysis and charging argument can be strengthened for clarity. We address each major comment below and will revise the manuscript to incorporate additional explicit statements and a supporting lemma.

read point-by-point responses

Referee: [§3] §3 (main counting argument): the case split on square periods and their survival across rotations is presented as exhaustive, yet the text does not supply an independent counting lemma or enumeration that would confirm no configuration of nested or multiply-overlapping squares has been omitted. Any missed configuration permitting an extra distinct square in some rotation would falsify the 5/3 coefficient.

Authors: We agree that the exhaustiveness of the case split in Section 3 would benefit from an independent verification. Every square is classified by its minimal period p and by whether it survives under all cyclic rotations or only a subset (determined by the gcd of p and the rotation distance). Nested and multiply-overlapping squares are handled by the period hierarchy: if two squares overlap with periods p and q where p divides q, the finer period governs the charging. To make this rigorous, we will insert a new Lemma 3.1 that enumerates all admissible overlap configurations for a fixed period class and proves that no additional distinct square can appear outside the counted cases. This lemma will be proved by contradiction on the possible border lengths. revision: yes
Referee: [Theorem 1] Theorem 1 and its proof: the charging scheme assigns each distinct square to a unique rotation or period class, but the argument does not explicitly bound the number of squares that could be charged to the same class when the word contains long runs of identical letters or high-period squares that appear in multiple non-consecutive rotations.

Authors: The charging scheme assigns each square to the lexicographically smallest rotation in which it occurs as a factor, together with its minimal period class. For unary runs (period 1), at most floor(n/2) distinct squares exist and they are all charged to the single rotation class consisting of the all-identical word; this is already bounded by 1/2 n, well below the 5/3 coefficient. For high-period squares appearing in non-consecutive rotations, the proof shows that if a square of period p appears at two rotations distance d apart with gcd(p,d) = g > 1, then it must also appear at all multiples of g, forcing it to be charged only to the canonical representative of its orbit. We will add a short paragraph after the charging definition that explicitly states these two bounds, together with a reference to the period lemma used in the case analysis. revision: partial

Circularity Check

0 steps flagged

No circularity; combinatorial bound derived independently via case analysis on rotations and overlaps.

full rationale

The paper improves the known upper bound on distinct square factors in circular words from 1.8n (Charalampopoulos et al.) to 5/3 n through direct combinatorial counting of square occurrences across cyclic rotations of w. No equations reduce to fitted parameters, no self-definitional loops appear, and the cited prior bounds are external results used only as starting points. The derivation relies on partitioning configurations of periods and overlaps, which is a standard proof technique independent of the target bound itself. This is the expected outcome for a short note presenting a tighter combinatorial estimate.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard definitions and combinatorial properties of words and squares with no new entities, fitted parameters, or ad-hoc axioms introduced.

axioms (2)

standard math A square is a word of the form uu where u is a nonempty finite word.
Basic definition from combinatorics on words used to frame the problem.
domain assumption The circular word [w] is the set of all cyclic rotations of the finite word w.
Standard definition for studying factors in circular strings.

pith-pipeline@v0.9.0 · 5423 in / 1200 out tokens · 87723 ms · 2026-05-13T04:24:40.581756+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear
Main Theorem: Sq([w]) ≤ (5/3)n proved by case analysis on splitting of circuits C(p,·) in Rauzy graphs Γ_i(W) for W = w²
IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear
Use of primitive words, Power([w]), Class_p(w), small circuits sc_i(w) and Indep(Γ(W)) ≤ 2n

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages

[1]

Theory of Computing Systems 69, 31 (2025) Title Suppressed Due to Excessive Length 11

Allouche, J.P., Campbell, J.M., Li, S., Shallit, J., Stipulanti, M.: The reflection complexity of sequences over finite alphabets. Theory of Computing Systems 69, 31 (2025) Title Suppressed Due to Excessive Length 11

work page 2025
[2]

In: String Process- ing and Information Retrieval (SPIRE 2017)

Amit, M., Gawrychowski, P.: Distinct squares in circular words. In: String Process- ing and Information Retrieval (SPIRE 2017). Lecture Notes in Computer Science, vol. 10508, pp. 27–37. Springer (2017)

work page 2017
[3]

Greenwood Press (1982)

Berge, C.: The Theory of Graphs and Its Applications. Greenwood Press (1982)

work page 1982
[4]

In: Frid, A., Merca¸ s, R

Brlek, S., Li, S.: On the number of distinct squares in finite sequences: Some old and new results. In: Frid, A., Merca¸ s, R. (eds.) Combinatorics on Words: 14th Interna- tional Conference, WORDS 2023, Ume˚ a, Sweden, June 12–16, 2023, Proceedings. Lecture Notes in Computer Science, vol. 13899, pp. 35–44. Springer, Cham (2023)

work page 2023
[5]

Combinatorial Theory 5(1), Article 3 (2025)

Brlek, S., Li, S.: On the number of squares in a finite word. Combinatorial Theory 5(1), Article 3 (2025)

work page 2025
[6]

In: 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)

Charalampopoulos, P., Mohamed, M., Radoszewski, J., Rytter, W., Walen, T., Zuba, W.: Improved bounds on the maximum number of distinct squares in circular words. In: 37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026). Lecture Notes in Computer Science (dec 2025)

work page 2026
[7]

Deza, A., Franek, F., Thierry, A.: How many double squares can a string contain? Discrete Applied Mathematics 180, 52–69 (2015)

work page 2015
[8]

Fraenkel, A.S., Simpson, J.: How many squares can a string contain? J. Comb. Theory, Ser. A 82(1), 112–120 (1998)

work page 1998
[9]

Ilie, L.: A note on the number of squares in a word. Theor. Comput. Sci. 380(3), 373–376 (2007)

work page 2007
[10]

The Electronic Journal of Combinatorics 9(1), N10 (2002)

JamesD.Currie: There are ternary circular square-free words of lengthnforn≤18. The Electronic Journal of Combinatorics 9(1), N10 (2002)

work page 2002
[11]

AdvOL-Report 2 (2013)

Lam, N.H.: On the number of squares in a string. AdvOL-Report 2 (2013)

work page 2013
[12]

The Electronic Journal of Combinatorics 31(3), P3.14 (2024)

Li, S., Pachocki, J., Radoszewski, J.: A note on the maximum number ofk-powers in a finite word. The Electronic Journal of Combinatorics 31(3), P3.14 (2024)

work page 2024
[13]

Cambridge University Press, Cambridge (1997)

Lothaire, M.: Combinatorics on Words. Cambridge University Press, Cambridge (1997)

work page 1997
[14]

Cambridge University Press, Cam- bridge (2005)

Lothaire, M.: Applied Combinatorics on Words. Cambridge University Press, Cam- bridge (2005)

work page 2005
[15]

Seminar on Number Theory (Univervit´ e de Bordeaux I, Talence) 25, 1–16 (1983)

Rauzy, G.: Suites ` a termes dans un alphabet fini. Seminar on Number Theory (Univervit´ e de Bordeaux I, Talence) 25, 1–16 (1983)

work page 1983
[16]

Journal of Number Theory 64(2), 31 (1994)

Rote, G.: Sequences with subword complexity 2n. Journal of Number Theory 64(2), 31 (1994)

work page 1994
[17]

The Electronic Journal of Com- binatorics 17(1), R140 (2010)

Shur, A.M.: On ternary square-free circular words. The Electronic Journal of Com- binatorics 17(1), R140 (2010)

work page 2010
[18]

Thierry, A.: A proof that a word of length n has less than 1.5n distinct squares (2020),https://arxiv.org/abs/2001.02996

work page arXiv 2020
[19]

Norske Videnskabers Selskabs Skrifter Mat.-Nat

Thue, A.: ¨Uber unendliche zeichenreihen. Norske Videnskabers Selskabs Skrifter Mat.-Nat. Kl. 7, 1–22 (1906)

work page 1906