Recognition: 2 theorem links
· Lean Theorem · Minimax optimal submatrix detection: Sharp non-asymptotic rates
Pith reviewed 2026-05-12 02:20 UTC · model grok-4.3
The pith
Matching non-asymptotic bounds establish the exact minimal signal strength needed to detect a hidden submatrix in a Gaussian matrix, for any configuration of matrix and submatrix dimensions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under the model with i.i.d. N(0,1) entries under the null and an unknown s1 × s2 block of i.i.d. N(μ,1) entries under the alternative, the smallest signal strength μ* permitting a test with both Type I and Type II errors below a prescribed small level is characterized by matching non-asymptotic upper and lower bounds that hold uniformly over all d1, d2, s1, s2. A combination of novel test statistics achieves these bounds, and the same construction can be made adaptive to unknown sparsity levels.
What carries the argument
The matching non-asymptotic upper and lower bounds on the critical signal strength μ*, realized by a combination of novel test statistics.
If this is right
- Reliable detection with controlled errors is possible if and only if the signal strength exceeds the derived threshold.
- The same threshold and tests apply without any conditions relating the matrix dimensions to the submatrix dimensions.
- Detection procedures remain optimal when the submatrix dimensions are unknown in advance.
- The fundamental limit is fully characterized in finite samples rather than only in asymptotic regimes.
Where Pith is reading between the lines
- The same bounding technique may extend to detection problems with multiple submatrices or with row and column correlations.
- Practical implementations could test whether the proposed statistics remain computationally feasible for matrices with thousands of rows and columns.
- The explicit form of the bounds might suggest analogous sharp thresholds for submatrix detection under non-Gaussian noise models.
Load-bearing premise
The observed matrix has independent standard normal entries except inside one unknown rectangular block of fixed size, where the entries are independent normals sharing a common positive mean μ.
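This premise fixes the sampling model completely; a minimal sampler sketch (function and variable names are ours, not the paper's notation):

```python
import numpy as np

def sample_matrix(d1, d2, s1, s2, mu, rng):
    """Draw one matrix from the planted-submatrix model described above.

    mu = 0 gives the null (pure noise); mu > 0 plants an s1 x s2 block
    with common mean mu at a uniformly random set of rows and columns.
    """
    Y = rng.standard_normal((d1, d2))
    if mu > 0:
        rows = rng.choice(d1, size=s1, replace=False)
        cols = rng.choice(d2, size=s2, replace=False)
        Y[np.ix_(rows, cols)] += mu  # shift only the planted block
    return Y

rng = np.random.default_rng(0)
Y0 = sample_matrix(20, 30, 4, 5, 0.0, rng)  # one draw from the null
Y1 = sample_matrix(20, 30, 4, 5, 3.0, rng)  # one draw from the alternative
```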
What would settle it
For any fixed d1, d2, s1, s2, compute or simulate the smallest μ at which some test keeps both error probabilities below 1/3 and check whether this value lies strictly above or below the paper's explicit bound.
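A minimal Monte Carlo version of that check, using a contiguous-window scan as a computationally cheap stand-in for the paper's full combinatorial scan (all design choices here, including the 2/3-quantile calibration, are ours, not the paper's procedure):

```python
import numpy as np

def scan_stat(Y, s1, s2):
    """Max sum over contiguous s1 x s2 windows -- a cheap proxy for a full
    scan over all row/column subsets.  2-D prefix sums give every window
    sum in O(d1*d2)."""
    P = np.zeros((Y.shape[0] + 1, Y.shape[1] + 1))
    P[1:, 1:] = Y.cumsum(0).cumsum(1)
    S = P[s1:, s2:] - P[:-s1, s2:] - P[s1:, :-s2] + P[:-s1, :-s2]
    return S.max()

def error_sum(mu, d1=30, d2=30, s1=5, s2=5, reps=200, seed=0):
    """Estimated Type I + Type II error of the scan test; the threshold is
    the empirical 2/3 null quantile, so the Type I error is roughly 1/3."""
    rng = np.random.default_rng(seed)
    null = [scan_stat(rng.standard_normal((d1, d2)), s1, s2) for _ in range(reps)]
    thresh = np.quantile(null, 2 / 3)
    miss = 0
    for _ in range(reps):
        Y = rng.standard_normal((d1, d2))
        Y[:s1, :s2] += mu  # planted as a contiguous block so the window scan covers it
        miss += scan_stat(Y, s1, s2) <= thresh
    return 1 / 3 + miss / reps  # nominal Type I (1/3) + estimated Type II
```

Sweeping mu over a grid and locating where the error sum drops below 1 gives a simulated threshold to compare against the paper's explicit bound.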
Original abstract
We consider the problem of detecting a hidden submatrix of size $s_1 \times s_2$ in a high-dimensional Gaussian matrix of size $d_1 \times d_2$. Under the null hypothesis, the observed matrix has i.i.d.\ entries with distribution $N(0,1)$. Under the alternative hypothesis, there exists an unknown submatrix of size $s_1 \times s_2$ with i.i.d.\ entries with distribution $N(\mu, 1)$ for some $\mu>0$, while all other entries outside the submatrix are i.i.d.\ $N(0,1)$. Specifically, we provide non-asymptotic upper and lower bounds on the smallest signal strength $\mu^*$ that is both necessary and sufficient to ensure the existence of a test with small enough Type I and Type II errors. We also derive novel minimax-optimal tests achieving these fundamental limits, and describe extensions of these tests that are adaptive to unknown sparsity levels $s_1$ and $s_2$. Our proposed detection procedure is a careful combination of novel test statistics which may be of independent interest. In contrast with previous work, which required restrictive assumptions on $d_1, d_2, s_1$ and $s_2$, our non-asymptotic upper and lower bounds match for any configuration of these parameters.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper considers detection of an unknown s1 × s2 submatrix with entries N(μ,1) inside a d1 × d2 matrix whose entries are otherwise i.i.d. N(0,1). It derives non-asymptotic upper and lower bounds on the critical signal strength μ* that are claimed to match exactly for arbitrary d1,d2,s1,s2 (no regime restrictions), constructs novel test statistics that attain these bounds, and provides adaptive extensions for unknown sparsity levels.
Significance. If the claimed exact matching of non-asymptotic bounds holds for all parameter configurations, the work would constitute a substantial advance in the minimax theory of sparse signal detection in matrices, supplying sharp, regime-free characterizations and explicit optimal procedures where earlier results were limited to asymptotic regimes or specific sparsity-density relations.
minor comments (3)
- [Section 2 / Theorem 1] The abstract states that the upper and lower bounds match for any configuration of d1,d2,s1,s2; the main text should explicitly display the common expression for μ* (presumably in the form of a threshold involving log terms and sparsity ratios) and verify the equality in the most extreme regimes (e.g., s1=1 or s2=d2).
- [Section 3] The novel test statistics are described as a 'careful combination' of procedures; their explicit definitions, computational complexity, and the precise way they achieve the lower bound should be stated with equation numbers so that readers can verify optimality without reconstructing the argument.
- [Section 4] The adaptive versions for unknown s1,s2 are mentioned only briefly; the paper should clarify whether the adaptation incurs an extra logarithmic factor or preserves the exact non-adaptive rate.
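On that last point: a standard way to adapt over unknown (s1, s2) is a Bonferroni union over a dyadic grid of candidate sparsities, which typically costs at most an iterated-logarithmic term. A sketch of that generic construction (ours, not necessarily the paper's; `pvalue_at` is a hypothetical callback returning the p-value of the non-adaptive test at a candidate sparsity pair):

```python
def dyadic_grid(d):
    """Candidate sparsity levels 1, 2, 4, ..., d -- O(log d) values."""
    grid, s = [], 1
    while s < d:
        grid.append(s)
        s *= 2
    grid.append(d)
    return grid

def adaptive_reject(pvalue_at, d1, d2, alpha=0.05):
    """Bonferroni over the O(log d1 * log d2) grid: run each candidate test
    at level alpha/K and reject if any rejects.  The level split by K is
    what produces the extra (at most log-log) term in the threshold."""
    grid = [(a, b) for a in dyadic_grid(d1) for b in dyadic_grid(d2)]
    K = len(grid)
    return any(pvalue_at(s1, s2) <= alpha / K for s1, s2 in grid)
```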
Simulated Author's Rebuttal
We thank the referee for the positive summary of our manuscript and the recommendation for minor revision. The referee accurately captures our contributions in deriving non-asymptotic upper and lower bounds on the critical signal strength μ* for submatrix detection that match exactly for arbitrary d1, d2, s1, s2 with no regime restrictions, along with the novel tests and adaptive extensions. We confirm that these claims hold as stated in the abstract and full paper, providing regime-free characterizations where prior work was limited.
Circularity Check
No significant circularity identified
full rationale
The paper's central result consists of matching non-asymptotic upper and lower bounds on the minimal detectable signal strength μ* in the standard planted submatrix model with i.i.d. Gaussian entries. Lower bounds are derived via information-theoretic techniques (e.g., reductions to simple hypothesis testing and analysis of distances between the null and composite alternative measures), while upper bounds follow from explicit construction and error-probability analysis of novel test statistics. These steps operate directly on the model assumptions without parameter fitting, self-referential definitions, or load-bearing reliance on prior self-citations. The matching of bounds for arbitrary d1,d2,s1,s2 is presented as the outcome of the analysis rather than an input by construction. No patterns of self-definition, fitted-input prediction, or ansatz smuggling appear in the derivation chain.
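The lower-bound route named here is the classical second-moment (χ²) calculation; in outline (our rendering, which matches the $\mathbb{E}[\exp(\mu^2 XY)]$ quantities appearing in the appendix excerpts of the reference graph):

```latex
% Second-moment method, schematically.  P_0 is the null law, P_{R,C} the
% alternative with planted rows R and columns C, and \pi the uniform prior.
\[
  \bar P_\pi = \mathbb{E}_{(R,C)\sim\pi}\bigl[P_{R,C}\bigr],
  \qquad
  L = \frac{d\bar P_\pi}{dP_0}.
\]
% For two independent draws (R,C) and (R',C') from \pi, a Gaussian
% computation gives the chi-square-type second moment
\[
  \mathbb{E}_{P_0}\bigl[L^2\bigr]
  = \mathbb{E}\,\exp\!\bigl(\mu^2\,\lvert R\cap R'\rvert\cdot\lvert C\cap C'\rvert\bigr),
\]
% i.e. a term of the form E[exp(mu^2 XY)] with X = |R cap R'| and
% Y = |C cap C'|.  If this stays close to 1 for mu^2 below the threshold,
% then TV(P_0, \bar P_\pi) is small and every test has total error near 1.
```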
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The matrix entries are independent and identically distributed as standard normal under the null hypothesis.
- domain assumption Under the alternative, there is a submatrix with i.i.d. N(μ,1) entries and the rest N(0,1).
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · connection unclear · Theorem 1: $(\mu^*)^2 \asymp (\psi_{12} + \psi_{21}) \wedge \phi_{12} \wedge \phi_{21}$, with $\psi, \phi$ defined via $\log(1 + \tfrac{d}{s}\log\tfrac{ed}{s})$ terms; the tests are linear, truncated-$\chi^2$, and Bonferroni scan statistics.
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · connection unclear · Lower bound via a second-moment method on the likelihood ratio under a uniform prior over planted supports.
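The rate in the first theorem entry can be evaluated numerically. A small calculator, using our reading of the $\psi$ and $\phi$ quantities from the excerpts in the reference graph; the paper's exact constants and truncation conventions are not recoverable from this page, so treat the output as order-of-magnitude only:

```python
from math import log

def rate_squared(d1, d2, s1, s2):
    """(mu*)^2 up to universal constants, following the quantities named in
    the theorem summary above.  This is our reconstruction: the inner logs
    are regularized as log(1 + d/s), and phi21 is taken symmetric to phi12
    (the excerpts only display special cases)."""
    psi12 = log(1 + d2 * s1 / s2**2 * log(1 + d1 / s1)) / s1
    psi21 = log(1 + d1 * s2 / s1**2 * log(1 + d2 / s2)) / s2
    phi12 = d1 / s1**2 * log(1 + d2 / s2**2)
    phi21 = d2 / s2**2 * log(1 + d1 / s1**2)
    return min(psi12 + psi21, phi12, phi21)

# Example: a 1000 x 1000 matrix; larger planted blocks need weaker signal.
r_small = rate_squared(1000, 1000, 5, 5)
r_big = rate_squared(1000, 1000, 20, 20)
```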
Reference graph
Works this paper leans on
- [1] Ingster, Y. I. (1982). On the minimax nonparametric detection of signals in white Gaussian noise. Problemy Peredachi Informatsii, 18(2):61–73. Ingster, Y. I. (1987). Minimax testing of nonparametric hypotheses on a distribution density in the $L_p$ metrics. Theory of Probability & Its Applications, 31(2):333–337. Ingster, ...
- [2] In the general sparsity setting, meaning that $s_1 \le c_1 d_1$ and $s_2 \le c_2 d_2$ for sufficiently small constants $c_1, c_2 > 0$, we show that there exists a constant $c_\mu > 0$ such that if $\mu^2 \le c_\mu R$, then (41) holds. The proof of this claim constitutes the primary technical difficulty of the derivation of our lower bound, and relies on a precise analysis of the moment generating funct...
- [3] Otherwise, we place ourselves in the very dense setting and assume without loss of generality that $s_1 \ge c d_1$ for a constant $c \in (0,1)$. In this case, the problem roughly reduces to the sparse signal detection problem in a standard Gaussian sequence model. In Section A.2, we prove that there exist constants $c_\mu, c' > 0$ such that if $\mu^2 \le c_\mu \tfrac{d_1}{s_1^2}\log\bigl(1 + c'\tfrac{d_2}{s_2^2}\bigr)$, ...
- [4] Suppose that $c_\mu^{-1}\tfrac{s_2}{\log(c d_2/s_2)} \le C_*\tfrac{s_1^2}{d_1}$. Then if $\mu^2 < C_*^{-1}\tfrac{d_1}{s_1^2}\log\tfrac{c d_2}{s_2} + \tfrac{c_\mu}{s_2}\log\tfrac{C_*}{2e}$, it holds $\mathbb{E}\bigl[\exp(\mu^2 XY)\,\mathbb{1}\bigl(X \ge \lceil C_* s_1^2/d_1\rceil\bigr)\bigr] < \alpha$.
- [5] Suppose that $C_*\tfrac{s_1^2}{d_1} < c_\mu^{-1}\tfrac{s_2}{\log(c d_2/s_2)} \le s_1$. Then if $\mu^2 < \tfrac{c_\mu}{s_2}\log\bigl(\tfrac{s_2 d_1 \log(c d_2/s_2)}{2e\,c_\mu s_1^2}\bigr)$, it holds $\mathbb{E}\bigl[\exp(\mu^2 XY)\,\mathbb{1}\bigl(X \ge \lceil C_* s_1^2/d_1\rceil\bigr)\bigr] < \alpha$.
- [6] Suppose that $s_1 < c_\mu^{-1}\tfrac{s_2}{\log(c d_2/s_2)}$. Then if $\mu^2 < \tfrac{1}{s_1}\log\tfrac{c d_2}{s_2} + \tfrac{c_\mu}{s_2}\log\tfrac{d_1}{2e s_1}$, it holds $\mathbb{E}\bigl[\exp(\mu^2 XY)\,\mathbb{1}\bigl(X \ge \lceil C_* s_1^2/d_1\rceil\bigr)\bigr] < \alpha$. Proof. Define $A = \{\lceil C_* s_1^2/d_1\rceil, \dots, s_1\}$ and $g$ as in (43). For $k \in A$, it holds $\tfrac{c_\mu k}{s_2}\log\tfrac{k d_1}{2e s_1^2} \ge \tfrac{c_\mu C_* s_1^2}{s_2 d_1}\log\tfrac{C_*}{2e} \ge c_\mu C_* (2e)^{-4}\log\tfrac{C_*}{2e} > 0$, where the second inequality follows from the assumption $s_1^2 \ge$ ...
- [7] Suppose that $c_\mu^{-1}\tfrac{s_2}{\log(c d_2/s_2)} \le C_*\tfrac{s_1^2}{d_1}$. Then $f$ is increasing over $A$, and by our assumption on $\mu^2$, it holds $\mu^2 < C_*^{-1}\tfrac{d_1}{s_1^2}\log\tfrac{c d_2}{s_2} + \tfrac{c_\mu}{s_2}\log\tfrac{C_*}{2e} \le \min_{k\in A} f(k) \le \min_{k\in A} g(k)$.
- [8] Suppose that $C_*\tfrac{s_1^2}{d_1} < c_\mu^{-1}\tfrac{s_2}{\log(c d_2/s_2)} \le s_1$. Then by our assumption on $\mu^2$, we have $\mu^2 < \tfrac{c_\mu}{s_2}\log\bigl(\tfrac{s_2 d_1 \log(c d_2/s_2)}{2e\,c_\mu s_1^2}\bigr) \le \min_{k\in A} f(k) \le \min_{k\in A} g(k)$.
- [9] Suppose that $s_1 < c_\mu^{-1}\tfrac{s_2}{\log(c d_2/s_2)}$. Then $f$ is decreasing over $A$ and is minimized at $k = s_1$. Therefore by our assumption on $\mu^2$, we have $\mu^2 < \tfrac{1}{s_1}\log\tfrac{c d_2}{s_2} + \tfrac{c_\mu}{s_2}\log\tfrac{d_1}{2e s_1} = \min_{k\in A} f(k) \le \min_{k\in A} g(k)$. By Lemma 2, the proof is complete. [A.3.1 Simplification of the rate] For any $s_1, s_2, d_1, d_2 \in \mathbb{N}$, we define the following quantities $\psi(s_1, s_2, d_1,$ ...
- [10] Then the following two properties hold: $(\psi_{12} + \psi_{21}) \wedge \phi_{12} \asymp \psi_{21} \wedge \phi_{12}$ (47) and $\psi_{21} \wedge \phi_{12} \asymp (\psi_{21} + \beta_{12}) \wedge \phi_{12}$ (48).
- [11] It follows that, for any $\alpha > 0$, there exists a constant $c_\mu > 0$ such that, whenever $\mu^2 \le c_\mu R$, we have $\mathbb{E}\bigl[\exp(\mu^2 XY)\,\mathbb{1}\bigl(X \ge C_* s_1^2/d_1\bigr)\bigr] < \alpha$. Proof of Lemma 8.
- [12] For any $s_1, s_2, d_1, d_2$, we let $\tilde\phi_{12} = \tfrac{d_1}{s_1^2}\log\bigl(1 + \tfrac{d_2}{s_2^2}\bigr)$. We start by showing that $\psi_{21} \wedge \phi_{12} \asymp \psi_{21} \wedge \tilde\phi_{12}$. This is clear if $\tfrac{d_1}{s_1^2} \le C$ by definition of $\phi_{12}$. Assume now that $\tfrac{d_1}{s_1^2} > C$, which implies that $\phi_{12} = \infty$ and $\psi_{21} \wedge \phi_{12} = \psi_{21}$. By the assumption $s_1^2 \ge \bar c\, d_1 s_2$, we obtain $s_2 \le \tfrac{s_1^2}{d_1 \bar c} \le \tfrac{1}{\bar c}$. Therefore, we have $\tilde\phi_{12} = \tfrac{d_1}{s_1^2}\log\bigl(1 + \tfrac{d_2}{s_2^2}\bigr) \ge \tfrac{d_1}{s_1^2}\log\bigl(1 +$ ...
- [13] Now, let $\alpha > 0$ and let $c_\mu, c > 0$ be the constants defined in Lemma 7. Assume first that $c_\mu^{-1}\tfrac{s_2}{\log(c d_2/s_2)} \le C_*\tfrac{s_1^2}{d_1}$. Recalling that $\tfrac{d_2}{s_2} \ge \tfrac{1}{c'}$, where $c'$ can be taken arbitrarily small, we have $R \le \phi_{12} = \tfrac{d_1}{s_1^2}\log\bigl(1 + \tfrac{d_2}{s_2^2}\bigr) \le \tfrac{d_1}{s_1^2}\log\tfrac{2 d_2}{s_2}$ provided $c'$ is small enough, $\le 2\tfrac{d_1}{s_1^2}\log\tfrac{c d_2}{s_2}$ provided $c'$ is small enough, $\le C_*^{-1}\tfrac{d_1}{s_1^2}\log\tfrac{c d_2}{s_2} +$ ...
- [14] Then for some sufficiently small constant $c > 0$ depending only on $\mu$, it holds that $M \ge cR$. Hence, for any $\alpha > 0$, there exists a constant $c'_\mu > 0$ such that, whenever $\mu^2 \le c'_\mu R$, we have $\mathbb{E}\bigl[\exp(\mu^2 XY)\,\mathbb{1}\bigl(1 \vee C_*\tfrac{s_1^2}{d_1} \le X \le s_1 \wedge \tfrac{s_2}{\log(d_1 s_2/s_1^2)}\bigr)\bigr] < \alpha$.
- [15] Moreover, we have $R \ge \tfrac{1}{2}(\psi_{12} + \beta_{21}) \wedge \phi_{12} \wedge \phi_{21}$ if $s_1 \le \tfrac{s_2}{\log(d_1 s_2/s_1^2)}$, and $R \ge \tfrac{1}{2}(\psi_{21} + \beta_{12}) \wedge \phi_{12} \wedge \phi_{21}$ otherwise (51). Proof of Lemma 9.
- [16] We start by showing that $M \ge c(\psi_{21} + \psi_{12})$. Using Lemma 4 and Lemma 25.(i), we obtain $2M \ge \tfrac{1}{s_1}\log\bigl(1 + c_\mu\tfrac{d_2 s_1}{s_2^2}\log\tfrac{d_1}{2e s_1}\bigr) + \tfrac{\log(d_1 s_2/s_1^2)}{s_2}\log\bigl(1 + c_\mu\tfrac{d_2}{s_2}\log\tfrac{d_1 s_2}{s_1^2}\log\tfrac{d_1 s_2}{2e s_1^2\log(d_1 s_2/s_1^2)}\bigr) \ge \tfrac{c_\mu}{s_1}\log\bigl(1 + \tfrac{d_2 s_1}{s_2^2}\log\tfrac{d_1}{2e s_1}\bigr) + c_\mu\tfrac{\log(d_1 s_2/s_1^2)}{s_2}\log\bigl(1 + \tfrac{d_2}{s_2}\log\tfrac{d_1 s_2}{s_1^2}\log\tfrac{d_1 s_2}{2e s_1^2\log(d_1 s_2/s_1^2)}\bigr)$ ...
- [17] We now prove equation (51). Assume first that $s_1 \le \tfrac{s_2}{\log(d_1 s_2/s_1^2)}$. Then we obtain $\tfrac{d_1}{s_1} \le \tfrac{d_1 s_2/s_1^2}{\log(d_1 s_2/s_1^2)}$, hence $\tfrac{d_1 s_2}{s_1^2} \ge \tfrac{d_1}{s_1}\log\tfrac{d_1}{s_1}$, i.e. $\tfrac{1}{s_1} \ge \tfrac{1}{s_2}\log\tfrac{d_1}{s_1}$ by Lemma 24.(ii). Note that, since $\tfrac{d_2 s_1}{s_2^2} \ge \tfrac{d_1}{s_1} \ge \tfrac{1}{c'}$ can be made arbitrarily large by taking $c'$ small enough, we have $\beta_{21} \le \tfrac{1}{s_2}\log\tfrac{d_1}{s_1} \le \tfrac{1}{s_1} \le \tfrac{1}{s_1}\log\bigl(1 + \tfrac{d_2 s_1}{s_2^2}$ ...
- [18] Then for any $\alpha > 0$, there exist constants $c_\mu > 0$ and $C_* \ge 1$ such that if $1 \vee \bigl\lceil C_*\tfrac{s_1^2}{d_1}\bigr\rceil \vee \bigl\lceil \tfrac{C_* s_2^2/d_2}{\log(d_1 s_2^2/(s_1^2 d_2))}\bigr\rceil \le s_1 \wedge \bigl\lfloor \tfrac{s_2}{\log(d_1 s_2/s_1^2)}\bigr\rfloor$ and $\mu^2 \le c_\mu R$, then $\mathbb{E}\bigl[\exp(\mu^2 XY)\,\mathbb{1}\bigl(C_*\tfrac{s_1^2}{d_1} \vee \tfrac{C_* s_2^2/d_2}{\log(d_1 s_2^2/(s_1^2 d_2))} \le X \le s_1 \wedge \tfrac{s_2}{\log(d_1 s_2/s_1^2)}\bigr)\bigr] < \alpha$.
- [19] Moreover, we have $R \ge (\psi_{21} + \beta_{12}) \wedge \phi_{12} \wedge \phi_{21}$ if $s_2 \le s_1\log\tfrac{d_1}{s_1}$, and $R \ge (\psi_{12} + \beta_{21}) \wedge \phi_{12} \wedge \phi_{21}$ otherwise. Proof of Lemma 11.
- [20] By the assumption $\tfrac{d_1 s_2}{s_1^2} \ge \bar c^{-1} \ge 16 e^4$, we can repeat the steps leading to equation (52) to obtain $\tfrac{\log(d_1 s_2/s_1^2)}{s_2}\log\bigl(1 + c_\mu\tfrac{d_2}{s_2}\log\tfrac{d_1 s_2}{s_1^2}\log\tfrac{d_1 s_2}{2e s_1^2\log(d_1 s_2/s_1^2)}\bigr) \ge \tfrac{c_\mu}{4}\,\tfrac{1}{s_2}\log\bigl(1 + \tfrac{d_1 s_2}{s_1^2}\log\tfrac{d_2}{s_2}\bigr) = \tfrac{c_\mu}{4}\psi_{21}$. It immediately follows that, for some small enough constant $c$, we have $\mu^2 \le cR \le c(\psi_{12} + \psi_{21}) \le \tfrac{1}{2}\,\tfrac{1}{s_1}\log\bigl(1 + c_\mu d_2$ ...
- [21] Assume now that $s_2 \le s_1\log\tfrac{d_1}{s_1}$. Then we have $\psi_{12} = \tfrac{1}{s_1}\log\bigl(1 + \tfrac{d_2 s_1}{s_2^2}\log\tfrac{d_1}{s_1}\bigr) \ge \tfrac{1}{s_1}\log\bigl(1 + \tfrac{d_2}{s_2}\bigr) \ge \beta_{12}$, which yields the desired result. Assume now that $s_2 > s_1\log\tfrac{d_1}{s_1}$. Then $\psi_{21} = \tfrac{1}{s_2}\log\bigl(1 + \tfrac{d_1 s_2}{s_1^2}\log\tfrac{d_2}{s_2}\bigr) > \tfrac{1}{s_2}\log\bigl(\tfrac{d_1}{s_1}\log\tfrac{d_1}{s_1}\log\tfrac{d_2}{s_2}\bigr) \ge \tfrac{1}{s_2}\log\tfrac{d_1}{s_1} \ge \beta_{21}$. This concludes the proof. The lemma below e...
- [22] Then for any $\alpha_4 > 0$, there exist constants $c_\mu > 0$ and $C_* \ge 1$ such that if $\mu^2 \le c_\mu R$, then $\mathbb{E}\bigl[\exp(\mu^2 XY)\,\mathbb{1}\bigl(X \ge \tfrac{s_2}{\log(d_1 s_2/s_1^2)} \vee C_*\tfrac{s_1^2}{d_1}\bigr)\bigr] < \alpha_4$.
- [23] Moreover, it holds that $R \ge \tfrac{1}{4}\bigl((\psi_{21} + \beta_{12}) + (\psi_{12} + \beta_{21})\bigr) \wedge \phi_{12} \wedge \phi_{21}$ if $s_1 \le s_2\log\tfrac{d_2}{s_2}$, and $R \ge \tfrac{1}{2}(\psi_{21} + \beta_{12}) \wedge \phi_{12} \wedge \phi_{21}$ otherwise. Proof of Lemma 12.
- [24] Assume first that $s_1 < s_2\log(d_2/s_2)$. Note that the relation $s_2 < s_1\log\bigl((d_1 s_2)/s_1^2\bigr)$ implies that $\tfrac{d_1 s_2}{s_1^2} < \tfrac{d_1}{s_1}\log\tfrac{d_1 s_2}{s_1^2} \le 2\tfrac{d_1}{s_1}\log\tfrac{2 d_1}{s_1}$ by Lemma 24.(i), which yields $s_2 \le 4 s_1\log(d_1/s_1)$ (53). Now, we obtain $\tfrac{d_2 s_1}{s_2^2}\log\tfrac{d_1}{s_1} = \tfrac{d_2}{s_2}\cdot\tfrac{s_1}{s_2}\log\tfrac{d_1}{s_1} \ge \tfrac{1}{4c'} \ge 1$ provided $c' \le 1/4$, and $\tfrac{d_1 s_2}{s_1^2}\log\tfrac{d_2}{s_2} \ge \tfrac{1}{\bar c}\log\tfrac{1}{c'} \ge 1$ provided $c'$ and $\bar c$ are small enough, which ensures...
- [25] Assume first that $s_1 < s_2\log\tfrac{d_2}{s_2}$. Then we have $\psi_{12} + \psi_{21} = \tfrac{1}{s_1}\log\bigl(1 + \tfrac{d_2 s_1}{s_2^2}\log\tfrac{d_1}{s_1}\bigr) + \tfrac{1}{s_2}\log\bigl(1 + \tfrac{d_1 s_2}{s_1^2}\log\tfrac{d_2}{s_2}\bigr) \ge \tfrac{1}{s_1}\log\bigl(1 + \tfrac{d_2}{4 s_2}\bigr) + \tfrac{1}{s_2}\log\bigl(\tfrac{d_1 s_2}{s_1^2}\log\tfrac{d_2}{s_2}\bigr) > \tfrac{1}{4 s_1}\log\bigl(1 + \tfrac{d_2}{s_2}\bigr) + \tfrac{1}{s_2}\log\tfrac{d_1}{s_1} \ge \tfrac{1}{4}(\beta_{12} + \beta_{21})$, which yields $R \ge \tfrac{1}{8}\bigl((\psi_{21} + \beta_{12}) + (\psi_{12} + \beta_{21})\bigr) \wedge \phi_{12} \wedge \phi_{21}$, as desired. Assume now that $s_1 \ge s_2$ ...
- [26] Then the result follows by Lemma 16. Similarly, if $\tilde R = \phi_{21}$ and $d_1 \ge s_1^2$, the result follows by Lemma 16. Finally, if none of the conditions above are satisfied, then we have $\Delta^* = \Delta^{h_2}_{\mathrm{lin}}$ and the result follows by Lemma 14. The proof is complete. [B.6 Analysis of adaptive tests; B.6.1 Adaptive Bonferroni corrected linear test] Lemma 18. Let $\alpha \in (0,1)$ be given. Defin...
- [27] We have not assumed that $s_1^* \in \Omega_1$. However, it is true that $s_1^+ := d_1 2^{-\lceil \log_2(d_1/s_1)\rceil}$ belongs to $\Omega_1$. Since $s_1^* \le s_1^+$, there exists a set $S_1^+ \in \mathcal{P}_{s_1^+}(d_1)$ such that $S_1^* \subset S_1^+$. Define the statistic $t_{S_1^+} = \sum_{j=1}^{d_2} \bar Y_{S_1^+, j} = \tfrac{1}{\sqrt{s_1^+ d_2}}\sum_{j=1}^{d_2}\sum_{i \in S_1^+} Y_{ij}$. Under the alternative hypothesis, $\mathbb{E}_X[t_{S_1^+}] = \mu s_2 s_1^*/\sqrt{s_1^+ d_2}$ and $\mathrm{var}(t_{S_1^+}) = 1$. By our assumption on $\mu$ an...
- [28] Lemma 23. Let $f : S \to (0,\infty)$ be differentiable, where $S \subseteq (0,\infty)$ is an interval. ... Then for any $k \in \{1, \dots, n\}$, we have $\Pr(X = k) \le \bigl(\tfrac{2enp}{k}\bigr)^k \exp(-np)$. Proof. Using the bound $\binom{n}{k} \le \bigl(\tfrac{ne}{k}\bigr)^k$ (Appendix A in Roch (2024)), we have $\Pr(X = k) = \binom{n}{k} p^k (1-p)^{n-k} \le \bigl(\tfrac{npe}{k}\bigr)^k (1-p)^{n-k} = \bigl(\tfrac{npe}{k}\bigr)^k (1-p)^n (1-p)^{-k} \le \bigl(\tfrac{npe}{k}\bigr)^k (1-p)^n (1/2)^{-k} = \bigl(\tfrac{2enp}{k}\bigr)^k (1-p)^n \le \bigl(\tfrac{2enp}{k}\bigr)^k \exp(-np)$, where the final inequality uses $(1-x)^b \le \exp(-xb)$ for any $x, b \ge 0$...
- [29] First case: assume $\log(d_1) > s_2$. Then the quantities involved in the rate simplify as follows: $\psi_{12} = \log\bigl(1 + \tfrac{d_2}{s_2^2}\log(e d_1)\bigr) \asymp \log\tfrac{e d_2}{s_2} + \log\tfrac{e\log(d_1)}{s_2}$; $\psi_{21} = \tfrac{1}{s_2}\log\bigl(1 + d_1\log\bigl(e\binom{d_2}{s_2}\bigr)\bigr) \asymp \tfrac{\log d_1}{s_2} + \tfrac{\log\log(e\binom{d_2}{s_2})}{s_2}$; hence $\psi_{12} + \psi_{21} \asymp \log\tfrac{e d_2}{s_2} + \tfrac{\log d_1}{s_2}$. Moreover, we also have $\phi_{21} = \tfrac{d_2}{s_2^2}\log(1 + d_1) \gtrsim \log\tfrac{e d_2}{s_2} +$ ...
- [30] Assume now that $\log(d_1) < s_2$. We have $\psi_{12} = \log\bigl(1 + \tfrac{d_2}{s_2^2}\log(e d_1)\bigr)$, and $\phi_{12} = d_1\log\bigl(1 + \tfrac{d_2}{s_2^2}\bigr) \ge \psi_{12}$ by Lemma 25.(ii), while $\phi_{21} = \tfrac{d_2}{s_2^2}\log(1 + d_1) \gtrsim \psi_{12}$. Assume first that $\tfrac{d_2}{s_2^2}\log(e d_1) < 1$. It follows that $\psi_{12} \asymp \phi_{21}$, so that $(\mu^*)^2 \asymp (\psi_{12} + \psi_{21}) \wedge \phi_{12} \wedge \phi_{21} \asymp \phi_{21}$, as claimed. Now, assume that $\tfrac{d_2}{s_2^2}\log(e d_1) \ge 1$. Then by Lemma 25.(ii), we have $s_2\psi_{12} \ge \log\bigl(1 + d_2$ ...
discussion (0)