pith · machine review for the scientific record

arxiv: 2605.10713 · v1 · submitted 2026-05-11 · 📊 stat.ML · cs.IT · cs.LG · math.IT · math.ST · stat.TH

Recognition: 2 Lean theorem links

Price of Quality: Sufficient Conditions for Sparse Recovery using Mixed-Quality Data

David Gamarnik, Youssef Chaabouni

Pith reviewed 2026-05-12 04:21 UTC · model grok-4.3

classification 📊 stat.ML · cs.IT · cs.LG · math.IT · math.ST · stat.TH
keywords sparse recovery · mixed-quality data · heterogeneous noise · LASSO · price of quality · information-theoretic bounds · agnostic decoder · sample complexity

The pith

Sparse recovery succeeds with mixed-quality data under a linear Price of Quality trade-off.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates conditions under which a sparse signal can be recovered from a combination of high-quality low-noise measurements and low-quality high-noise measurements. It shows that recovery is guaranteed whenever the two sample counts satisfy a linear trade-off, whose slope defines the Price of Quality: the number of low-quality samples needed to replace one high-quality sample. For decoders that do not know the quality of each sample, this ratio is bounded by two, meaning one high-quality sample can be replaced by at most two low-quality ones. The LASSO estimator remains effective as long as the average noise level is below the standard threshold, demonstrating robustness to the mixture of qualities.
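A minimal formalization of the setting, assuming the standard sparse linear model the abstract describes; the threshold N⋆ and the exact shape of the condition below are illustrative placeholders, not formulas quoted from the paper.

    % Mixed-quality model: n_1 high-quality and n_2 low-quality linear
    % measurements of an s-sparse signal \beta^\star \in \mathbb{R}^p,
    % with \sigma_1^2 < \sigma_2^2:
    y_i = \langle x_i, \beta^\star \rangle + z_i, \qquad
    z_i \sim \begin{cases}
      \mathcal{N}(0, \sigma_1^2) & i \le n_1 \\
      \mathcal{N}(0, \sigma_2^2) & n_1 < i \le n_1 + n_2 .
    \end{cases}

    % Illustrative shape of the linear sufficient condition, with \rho the
    % Price of Quality (low-quality samples needed per high-quality sample);
    % the paper's agnostic bound is \rho \le 2:
    n_1 + \frac{n_2}{\rho} \;\ge\; N^\star(s, p, \sigma_1, \sigma_2).

Under this shape, removing k high-quality samples keeps the condition satisfied when ρk additional low-quality samples are added, which is the replacement reading of the Price of Quality.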

Core claim

It is sufficient for (n1, n2) to satisfy a linear trade-off defining the Price of Quality. In the agnostic setting one high-quality sample is never worth more than two low-quality samples, while in the informed setting the price can grow arbitrarily large. The LASSO recovery threshold matches the homogeneous-noise case and depends only on the average noise level.

What carries the argument

The Price of Quality linear trade-off between high- and low-quality sample counts, and the average-noise invariance of the LASSO recovery threshold.

If this is right

  • Sufficient conditions exist for both information-theoretic recovery and LASSO-based algorithmic recovery in the mixed-quality setting.
  • The agnostic decoder requires only a bounded Price of Quality, never larger than two.
  • The informed decoder can achieve arbitrarily better trade-offs by using knowledge of per-sample variances.
  • LASSO performance is governed solely by the mean noise level and does not degrade due to heterogeneity.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Without knowing qualities in advance, a modest number of high-quality samples remains useful, but the majority of the data can safely be of lower quality.
  • The average-noise invariance may extend to other convex methods for sparse recovery under data heterogeneity.
  • The results suggest checking whether real measurement systems with varying precision follow the same average-noise rule for recovery success.

Load-bearing premise

The observations follow a standard sparse linear model with additive Gaussian noise taking exactly two distinct variances, and the signal is exactly sparse.

What would settle it

A simulation or construction where LASSO fails to recover despite the average noise variance being below the homogeneous threshold, or where the minimal number of low-quality samples needed to replace one high-quality sample under an agnostic decoder exceeds two.
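A sketch of that falsification experiment, assuming the two-variance Gaussian model; the sample sizes, variances, and regularization rule are illustrative placeholders, and scikit-learn's Lasso stands in for the paper's estimator.

    # Sketch: does LASSO support recovery track the AVERAGE noise variance
    # under a two-variance mixture? All constants below are illustrative.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    p, s = 500, 5                    # ambient dimension, sparsity
    n1, n2 = 100, 400                # high- / low-quality sample counts
    sigma1, sigma2 = 0.1, 1.0        # per-group noise standard deviations
    n = n1 + n2

    beta = np.zeros(p)
    beta[rng.choice(p, s, replace=False)] = 1.0      # s-sparse signal

    X = rng.standard_normal((n, p))                  # i.i.d. Gaussian design
    noise = np.concatenate([sigma1 * rng.standard_normal(n1),
                            sigma2 * rng.standard_normal(n2)])
    y = X @ beta + noise

    # The claimed invariance: an agnostic decoder behaves as if the noise
    # were homogeneous at the average variance.
    sigma_bar = np.sqrt((n1 * sigma1**2 + n2 * sigma2**2) / n)
    lam = sigma_bar * np.sqrt(2 * np.log(p) / n)     # standard LASSO scaling

    fit = Lasso(alpha=lam, fit_intercept=False).fit(X, y)
    recovered = set(np.flatnonzero(fit.coef_)) == set(np.flatnonzero(beta))
    print(f"sigma_bar={sigma_bar:.3f}  exact support recovered: {recovered}")

Sweeping sigma2 or the n1/n2 split at a fixed average variance, and watching whether the recovery boundary moves, is the direct test of the average-noise rule.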

read the original abstract

We study sparse recovery when observations come from mixed-quality sources: a small collection of high-quality measurements with small noise variance and a larger collection of lower-quality measurements with higher variance. For this heterogeneous-noise setting, we establish sample-size conditions for information-theoretic and algorithmic recovery. On the information-theoretic side, we show that it is sufficient for $(n_1, n_2)$ to satisfy a linear trade-off defining the Price of Quality: the number of low-quality samples needed to replace one high-quality sample. In the agnostic setting, where the decoder is completely agnostic to the quality of the data, it is uniformly bounded, and in particular one high-quality sample is never worth more than two low-quality samples for this sufficient condition to hold. In the informed setting, where the decoder is informed of per-sample variances, the price of quality can grow arbitrarily large. On the algorithmic side, we analyze the LASSO in the agnostic setting and show that the recovery threshold matches the homogeneous-noise case and only depends on the average noise level, revealing a striking robustness of computational recovery to data heterogeneity. Together, these results give the first conditions for sparse recovery with mixed-quality data and expose a fundamental difference between how the information-theoretic and algorithmic thresholds adapt to changes in data quality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper studies sparse recovery from mixed-quality observations consisting of n1 high-quality samples with low noise variance and n2 low-quality samples with higher variance under a standard sparse linear model with additive Gaussian noise. It derives information-theoretic sufficient conditions showing that recovery is possible whenever (n1, n2) satisfy a linear trade-off defining the Price of Quality. In the agnostic setting the price is uniformly bounded by 2; in the informed setting it can grow arbitrarily large. For the LASSO in the agnostic setting the recovery threshold matches the homogeneous-noise case and depends only on the average noise level.

Significance. If the derivations hold, the work supplies the first explicit sample-complexity conditions for sparse recovery under heterogeneous data quality, a setting of clear practical relevance. The contrast between the bounded agnostic price and the unbounded informed price, together with the demonstrated robustness of LASSO to variance heterogeneity, clarifies how information-theoretic and algorithmic thresholds respond differently to measurement quality. The reduction of the LASSO result to the known homogeneous case via averaging is a clean technical observation.

major comments (2)
  1. [information-theoretic analysis] The information-theoretic sufficiency claim (abstract) is established by exhibiting a decoder whose success is shown via probabilistic arguments. These arguments in the sparse-recovery literature require the design matrix to obey RIP-type or uniform concentration properties adapted to the two distinct noise variances; the manuscript must explicitly state and verify these matrix regularity conditions for the mixed-variance model; otherwise the linear trade-off holds only conditionally.
  2. [LASSO analysis] The algorithmic claim that the LASSO recovery threshold matches the homogeneous-noise case and depends only on the average noise level (abstract) is load-bearing for the robustness conclusion. The proof should confirm that no extra terms arise from the variance split, for example by showing that the sub-Gaussian parameter or the effective noise level reduces exactly to the averaged variance without additional factors.
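One concrete shape for the matrix regularity condition raised in major comment 1 is a restricted-eigenvalue statement over the pooled design; this is the generic sparse-recovery form, not necessarily the paper's exact variant.

    % Restricted eigenvalue condition on the pooled design X \in \mathbb{R}^{n \times p}:
    % there is \kappa > 0 such that, for every support S with |S| \le s,
    \frac{1}{n} \|X\Delta\|_2^2 \;\ge\; \kappa \|\Delta\|_2^2
    \qquad \text{for all } \Delta \text{ with } \|\Delta_{S^c}\|_1 \le 3 \|\Delta_S\|_1 .
    % For i.i.d. sub-Gaussian designs this holds with high probability once
    % n = n_1 + n_2 \gtrsim s \log p; the condition constrains only the design,
    % so the two noise variances enter the analysis only through the noise term.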
minor comments (1)
  1. [abstract] The abstract states the main claims clearly but contains no theorem references, proof sketches, or explicit assumptions on the design matrix, making immediate verification of the central trade-off difficult.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review of our manuscript. The comments help clarify the presentation of our results on sparse recovery with mixed-quality data. We address each major comment below and will revise the manuscript to incorporate the suggested clarifications.

read point-by-point responses
  1. Referee: [information-theoretic analysis] The information-theoretic sufficiency claim (abstract) is established by exhibiting a decoder whose success is shown via probabilistic arguments. These arguments in the sparse-recovery literature require the design matrix to obey RIP-type or uniform concentration properties adapted to the two distinct noise variances; the manuscript must explicitly state and verify these matrix regularity conditions for the mixed-variance model; otherwise the linear trade-off holds only conditionally.

    Authors: We agree that an explicit statement of the matrix regularity conditions strengthens the result. The probabilistic arguments in the paper rely on standard concentration inequalities that extend directly to the heterogeneous variance setting, but we will add a dedicated paragraph in the revised manuscript stating the precise RIP-type conditions adapted to the two noise variances and confirming that they hold under the usual sub-Gaussian assumptions on the design matrix entries. This will make the sufficient conditions unconditional. revision: yes

  2. Referee: [LASSO analysis] The algorithmic claim that the LASSO recovery threshold matches the homogeneous-noise case and depends only on the average noise level (abstract) is load-bearing for the robustness conclusion. The proof should confirm that no extra terms arise from the variance split, for example by showing that the sub-Gaussian parameter or the effective noise level reduces exactly to the averaged variance without additional factors.

    Authors: We appreciate this point. The proof of the LASSO result proceeds by showing that the effective noise vector remains sub-Gaussian with parameter determined exactly by the average variance; the heterogeneity does not introduce additional multiplicative factors because the concentration bounds combine linearly across the two groups. We will insert an explicit lemma or remark in the revised version that isolates this reduction and confirms the absence of extra terms, thereby clarifying the match to the homogeneous case. revision: yes
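A sketch of the reduction described in response 2, assuming independent Gaussian noise; the lemma shape is generic rather than the paper's exact statement.

    % Conditional on the design, the noise correlation with column j is exactly
    % Gaussian, with a variance that pools the two groups:
    \frac{1}{n} X_j^\top z \,\Big|\, X \;\sim\;
      \mathcal{N}\!\Big(0, \tfrac{1}{n^2} \textstyle\sum_{i=1}^n x_{ij}^2 \sigma_i^2 \Big),
    \qquad
    \mathbb{E}\Big[ \tfrac{1}{n^2} \textstyle\sum_{i} x_{ij}^2 \sigma_i^2 \Big]
      \;=\; \frac{n_1 \sigma_1^2 + n_2 \sigma_2^2}{n^2} \;=\; \frac{\bar\sigma^2}{n},
      \qquad \bar\sigma^2 := \frac{n_1 \sigma_1^2 + n_2 \sigma_2^2}{n}.
    % With E[x_{ij}^2] = 1, concentration of this weighted sum yields the
    % homogeneous-case bound with \sigma^2 replaced by \bar\sigma^2 and no
    % extra factor from the variance split.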

Circularity Check

0 steps flagged

No circularity: sufficient conditions derived information-theoretically; LASSO reduces to homogeneous case via averaging without tautology

full rationale

The paper establishes a linear trade-off (Price of Quality) as a sufficient condition for recovery by probabilistic arguments exhibiting a decoder under the mixed-variance Gaussian model. This is not self-definitional, as the bound (e.g., at most 2 in agnostic case) follows from the analysis rather than being presupposed. The LASSO result matches the homogeneous threshold by depending only on average noise level, which is a direct calculation and not a fitted input renamed as prediction. No load-bearing self-citations, uniqueness theorems from authors, or ansatz smuggling are present. The derivation chain is self-contained against standard sparse recovery benchmarks and does not reduce to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the standard sparse linear regression model with two noise variances and the existence of suitable decoders; no new free parameters, axioms beyond domain-standard assumptions, or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption Sparse linear model with additive Gaussian noise of two distinct variances
    The setting is defined with high-quality and low-quality measurements having different noise variances.

pith-pipeline@v0.9.0 · 5539 in / 1397 out tokens · 48104 ms · 2026-05-12T04:21:38.867998+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages

  1. [1]

    (34) Similarly to (34), we obtain the following expression forM −∆2 i (θ): M−∆2 i (θ) =    1q 1−2|U∪V| (−θ+2θ2σ2

    if|U∪V| −θ+ 2θ 2σ2 1 <1/2 +∞else. (34) Similarly to (34), we obtain the following expression forM −∆2 i (θ): M−∆2 i (θ) =    1q 1−2|U∪V| (−θ+2θ2σ2

  2. [2]

    reasonable

    if|U∪V| −θ+ 2θ 2σ2 2 <1/2 +∞else. (35) Therefore, for anyθ≥0: M−∆1 i (θ)n1 M−∆2 i (θ)n2 =    1−2|U∪V| −θ+ 2θ 2σ2 1 −n1/2 × 1−2|U∪V| −θ+ 2θ 2σ2 2 −n2/2 if |U∪V| −θ+ 2θ 2σ2 1 <1/2 |U∪V| −θ+ 2θ 2σ2 2 <1/2 +∞else. Remark A.1(Best Chernoff bound).To find the best Chernoff bound (33), we need to solve the optimization problem in (33), defined b...

  3. [3]

    We know that: X j∈U X1 ij − X j∈V X1 ij d = X j∈U∪V X1 ij ∼ N(0,|U∪V|)

    =E X 1 i ,Z1 i h e−θ[⟨X 1 i ,β⋆−β(S)⟩ 2+2Z1 i ⟨X 1 i ,β⋆−β(S)⟩]/σ2 1 i =E X 1 i h e−θ⟨X 1 i ,β⋆−β(S)⟩ 2/σ2 1 EZ1 i h e−2θZ 1 i ⟨X 1 i ,β⋆−β(S)⟩/σ 2 1 X1 i ii =E X 1 i h e−θ⟨X 1 i ,β⋆−β(S)⟩ 2/σ2 1 MZ1 i |X 1 i −2θ⟨X1 i , β⋆ −β(S)⟩/σ 2 1 i =E X 1 i h e−θ⟨X 1 i ,β⋆−β(S)⟩ 2/σ2 1 e 1 2(−2θ⟨X 1 i ,β⋆−β(S)⟩/σ 2 1) 2 σ2 1 i =E Xi h e(−θ+2σ2 1)⟨X 1 i ,β⋆−β(S)⟩ 2/σ...

  4. [4]

    (40) 21 Published as a conference paper at ICLR 2026 Similarly to (40), we obtain the following expression forM −∆2 i (θ/σ2 2): M−∆2 i (θ/σ2

    = ( 1√ 1−2|U∪V|(−θ+2θ 2)/σ2 1 if|U∪V| −θ+ 2θ 2 /σ2 1 <1/2 +∞else. (40) 21 Published as a conference paper at ICLR 2026 Similarly to (40), we obtain the following expression forM −∆2 i (θ/σ2 2): M−∆2 i (θ/σ2

  5. [5]

    (41) From above, we clearly have: arg min θ≥0 M−∆1 i (θ/σ2

    = ( 1√ 1−2|U∪V|(−θ+2θ 2)/σ2 2 if|U∪V| −θ+ 2θ 2 /σ2 2 <1/2 +∞else. (41) From above, we clearly have: arg min θ≥0 M−∆1 i (θ/σ2

  6. [6]

    = arg min θ≥0 M−∆2 i (θ/σ2

  7. [7]

    = arg min θ≥0 −θ+ 2θ 2 = 1 4 , and, takingθ ⋆ := 1/4we have: M−∆1 i (θ/σ2

  8. [8]

    = 1q 1 + |U∪V| 4σ2 1 andM −∆2 i (θ/σ2

  9. [9]

    = 1q 1 + |U∪V| 4σ2 2 . Therefore: inf θ≥0 M−∆1 i θ σ2 1 n1 M−∆2 i θ σ2 2 n2 = M−∆1 i θ⋆ σ2 1 n1 M−∆2 i θ⋆ σ2 2 n2 =    1q 1 + |U∪V| 4σ2 1    n1   1q 1 + |U∪V| 4σ2 2    n2 Therefore: inf θ≥0 M−∆1 i θ σ2 1 n1 M−∆2 i θ σ2 2 n2 = 1 + |U∪V| 4σ2 1 −n1/2 1 + |U∪V| 4σ2 2 −n2/2 . (42) Since|U∪V|= 2M(S)≥2δs, the above yields: inf θ≥0 M−∆1 i θ σ2 1 n1 M−...

  10. [10]

    log (p−s) n3 = (1 +o(1)) √ 1−δ s 2slog (p−s) n−s−1 + 2 (n−s) (n 1σ2 1 +n 2σ2

  11. [11]

    We consider two cases, depending on the asymptotic behavior of n1σ2 1+n2σ2 2 λ2pn2 : • Case 1:lim p→+∞ n1σ2 1+n2σ2 2 λ2pn2 = 0

    log (p−s) λ2pn3 . We consider two cases, depending on the asymptotic behavior of n1σ2 1+n2σ2 2 λ2pn2 : • Case 1:lim p→+∞ n1σ2 1+n2σ2 2 λ2pn2 = 0. • Case 2:lim p→+∞ n1σ2 1+n2σ2 2 λ2pn2 >0. Case 1:Assumelim p→+∞ n1σ2 1+n2σ2 2 λ2pn2 = 0. By the above inequality, we have: 1 λp E max j∈S c |Vj| XS, W ≥(1 +o(1)) √ 1−δ r 2slog (p−s) n−s−1 . Using condition (26),...

  12. [12]

    log (p−s) λ2pn3 ≥(1 +o(1)) √ 1−δ s 2 (n−s) (n 1σ2 1 +n 2σ2

  13. [13]

    Therefore, fornlarge enough, we have: E max j∈S c |Vj| XS, W ≥4λ p

    log (p−s) λ2pn3 = (1 +o(1)) √ 1−δ s 2 n−s n n1σ2 1 +n 2σ2 2 λ2pn2 p log (p−s) p→+∞ − →+∞. Therefore, fornlarge enough, we have: E max j∈S c |Vj| XS, W ≥4λ p. Now using Lemma D.3 on concentration of Gaussian maxima, we have for allη >0: P max j∈S c |Vj|<E max j∈S c |Vj| −η ≤exp − η2 2 (1 +δ)E[M p] . Fixingη :=E[max j∈S c |Vj|]/2andδ :=ε/2we get, fornlarge ...

  14. [14]

    log (p−s) λ2pn2 + (n1σ2 1 +n 2σ2

  15. [15]

    log 2 λ2pn2 . 29 Published as a conference paper at ICLR 2026 Taking the limsup asp→+∞and using conditions (27) and (28) we get, conditionally onT(δ) c: lim sup p→+∞ 1 λp E max j∈S c |Vj| XS, W ≤lim sup p→+∞ s (1 +δ) 2slog (p−s) (1 +ε) (2slog (p−s) +s+ 1)−s−1 ≤lim sup p→+∞ s (1 +δ) 2slog (p−s) (1 +ε) (2slog (p−s) +s+ 1)−s−1 ≤ r 1 +δ 1 +ε . Fixδ :=ε/4. By ...

  16. [16]

    log (s) n(n−s−1) . 46 Published as a conference paper at ICLR 2026 Therefore, we have: P max i∈S Ui ≥ρ ≤P max i∈S Ui ≥ρ T c +K s n1 + s n2 ≤E P max i∈S Ui ≥ρ T c, XS +K s n1 + s n2 (59) ≤ 1 ρ nλp √s n−s−1 + 2 s (n1σ2 1 +n 2σ2

  17. [17]

    Hence we have: P max i∈S Ui ≥ρ ≤ 1 ρ λp √s+ r (n1σ2 1 +n 2σ2

    log (s) n(n−s−1) ! +K s n1 + s n2 . Hence we have: P max i∈S Ui ≥ρ ≤ 1 ρ λp √s+ r (n1σ2 1 +n 2σ2

  18. [18]

    Using a similar argument, we establish the same bound for{−U i}i∈S, that: P max i∈S {−Ui} ≥ρ ≤ 1 ρ λp √s+ r (n1σ2 1 +n 2σ2

    log (s) n2 ! (1 +o p (1)) +o p (1),(60) which converges to0asp→+∞under condition (28). Using a similar argument, we establish the same bound for{−U i}i∈S, that: P max i∈S {−Ui} ≥ρ ≤ 1 ρ λp √s+ r (n1σ2 1 +n 2σ2

  19. [19]

    D.3.1 PROOF OFLEMMAD.7 Proof of Lemma D.7

    log (s) n2 ! (1 +o p (1)) +o p (1).(61) Bringing together (60) and (61) and using a union bound, we conclude: P max i∈S |Ui|< ρ p→+∞ − →1. D.3.1 PROOF OFLEMMAD.7 Proof of Lemma D.7. Mean ofY i.Recall that: Yi =−λ peT i 1 n X T S XS −1 ⃗b=−λ pneT i X T S XS −1 ⃗b Note thatX T S XS ∼ W s (Is, n). Using properties of the Wishart distribution (see Lemma 7.7.1...