arxiv: 2604.19484 · v1 · submitted 2026-04-21 · 💻 cs.CG · cs.CC· math.OC

Recognition: unknown

Maximum Solow--Polasky Diversity Subset Selection Is NP-hard Even in the Euclidean Plane

Michael T. M. Emmerich , Ksenia Pereverdieva , Andr\'e H. Deutz

Authors on Pith no claims yet

Pith reviewed 2026-05-10 01:05 UTC · model grok-4.3

classification 💻 cs.CG cs.CCmath.OC

keywords NP-hardnessSolow-Polasky diversitysubset selectionEuclidean planecomputational geometryunit-disk independent setnonlinear objectivereduction

0 comments

The pith

Selecting a fixed-size Solow-Polasky diversity subset is NP-hard for any finite point set in the Euclidean plane.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves that, for any fixed positive threshold theta zero, finding a subset of given size that maximizes the Solow-Polasky diversity score is NP-hard when points lie in two-dimensional Euclidean space. This result strengthens an earlier hardness proof that applied only to arbitrary metric spaces by showing the same intractability survives the strong restriction to planar Euclidean distances. A sympathetic reader would care because the result rules out efficient exact algorithms for a natural diversity measure used in ecology, facility location, and experimental design, even when distances are the familiar Euclidean ones. The proof works by reducing from the known NP-hard Geometric Unit-Disk Independent Set problem and controlling the nonlinear objective through a new comparison lemma on bounded perturbations of pairwise distances.

Core claim

For every fixed theta zero greater than zero, the problem of selecting a cardinality-k subset that maximizes one transpose times Z inverse times one, where Z sub i j equals e to the minus theta zero times Euclidean distance between points i and j, is NP-hard on finite point sets in R squared.

What carries the argument

Bounded-box comparison lemma that bounds the change in the nonlinear objective 1^T Z^{-1} 1 under small perturbations of the pairwise distances, allowing the reduction gap from unit-disk independent set to be transferred via two distance thresholds.

If this is right

The same NP-hardness holds for every fixed dimension d at least 2.
No polynomial-time exact algorithm exists for the problem unless P equals NP, even when all distances are Euclidean.
Practical solution methods must rely on approximation algorithms, heuristics, or exponential-time exact methods on planar instances.
The two-threshold geometric technique cannot be replaced by the exact two-distance construction that worked for arbitrary metrics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar hardness proofs may extend to other nonlinear diversity indicators whose Hessians or inverses behave monotonically with distance.
The bounded-box lemma technique could be reused to prove hardness for related subset-selection problems whose objective is a smooth function of the distance matrix.
If a polynomial-time algorithm is found for a restricted subclass of point sets, it would have to exploit structure that the reduction deliberately avoids.

Load-bearing premise

The reduction assumes that Geometric Unit-Disk Independent Set remains NP-hard when points are placed in the plane with the specific distance thresholds chosen for the diversity objective.

What would settle it

A concrete point configuration in the plane together with a cardinality k for which the maximum Solow-Polasky score can be computed in polynomial time while violating the yes-no gap claimed by the reduction.

Figures

Figures reproduced from arXiv: 2604.19484 by Andr\'e H. Deutz, Ksenia Pereverdieva, Michael T. M. Emmerich.

read the original abstract

We prove that, for every fixed $\theta_0>0$, selecting a subset of prescribed cardinality that maximizes the Solow--Polasky diversity indicator is NP-hard for finite point sets in $\mathbb{R}^2$ with the Euclidean metric, and therefore also for finite point sets in $\mathbb{R}^d$ for every fixed dimension $d \ge 2$. This strictly strengthens our earlier NP-hardness result for general metric spaces by showing that hardness persists under the severe geometric restriction to the Euclidean plane. At the same time, the Euclidean proof technique is different from the conceptually easier earlier argument for arbitrary metric spaces, and that general metric-space construction does not directly translate to the Euclidean setting. In the earlier proof one can use an exact construction tailored to arbitrary metrics, essentially exploiting a two-distance structure. In contrast, such an exact realization is unavailable in fixed-dimensional Euclidean space, so the present reduction requires a genuinely geometric argument. Our Euclidean proof is based on two distance thresholds, which allow us to separate yes-instances from no-instances by robust inequalities rather than by the exact construction used in the general metric setting. The main technical ingredient is a bounded-box comparison lemma for the nonlinear objective $\mathbf{1}^{\top}Z^{-1}\mathbf{1}$, where $Z_{ij}=e^{-\theta_0 d(x_i,x_j)}$. This lemma controls the effect of perturbations in the pairwise distances well enough to transfer the gap created by the reduction. The reduction is from \emph{Geometric Unit-Disk Independent Set}. We present the main argument in geometric form for finite subsets of $\mathbb{R}^2$, with an appendix supplying the bit-complexity details needed for polynomial-time reducibility.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper proves NP-hardness for Solow-Polasky maximization in the Euclidean plane with a new geometric reduction from unit-disk independent set, though the comparison lemma's error control is the part that needs verification.

read the letter

The main takeaway is that Solow-Polasky diversity subset selection remains NP-hard for any fixed theta zero when points are restricted to the Euclidean plane. The proof reduces from geometric unit-disk independent set and uses two distance thresholds plus a bounded-box comparison lemma to carry the gap across to the nonlinear objective one transpose Z inverse one, where Z uses the exponential of minus theta zero times distance.

Referee Report

1 major / 1 minor

Summary. The manuscript proves that, for every fixed θ₀ > 0, the problem of selecting a k-subset of a finite point set in the Euclidean plane that maximizes the Solow–Polasky diversity objective 1ᵀZ⁻¹1 (where Z_{ij} = exp(−θ₀ d(x_i, x_j))) is NP-hard. The proof proceeds by a many-one reduction from Geometric Unit-Disk Independent Set; the reduction employs a geometric embedding with two distance thresholds and relies on a bounded-box comparison lemma to guarantee that the objective gap between yes- and no-instances survives the distance perturbations introduced by the embedding. An appendix supplies the bit-complexity analysis required for polynomial-time reducibility.

Significance. If correct, the result is significant: it establishes NP-hardness under the restrictive Euclidean metric in fixed dimension 2, thereby strengthening the authors’ prior hardness result for arbitrary metrics. The bounded-box comparison lemma provides a reusable technique for controlling a nonlinear objective under bounded distance perturbations, and the explicit bit-complexity appendix is a positive contribution to making the reduction fully rigorous.

major comments (1)

[bounded-box comparison lemma] The bounded-box comparison lemma (introduced to transfer the objective gap) is load-bearing for the central claim. The manuscript must explicitly bound the Euclidean embedding error δ arising from the unit-disk construction and verify that the lemma’s perturbation tolerance is strictly smaller than the objective gap induced by differing independent-set sizes, for every instance size and every fixed θ₀ > 0. Without this quantitative verification the yes/no separation does not necessarily carry over.

minor comments (1)

[Introduction] The introduction’s high-level contrast between the exact two-distance construction (general metrics) and the two-threshold robust inequalities (Euclidean case) is helpful; a short additional sentence clarifying why the general-metric construction cannot be realized exactly in the plane would further aid readability.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and for identifying the need for a more explicit quantitative verification of the perturbation tolerance. We address the single major comment below and will revise the manuscript accordingly.

read point-by-point responses

Referee: [bounded-box comparison lemma] The bounded-box comparison lemma (introduced to transfer the objective gap) is load-bearing for the central claim. The manuscript must explicitly bound the Euclidean embedding error δ arising from the unit-disk construction and verify that the lemma’s perturbation tolerance is strictly smaller than the objective gap induced by differing independent-set sizes, for every instance size and every fixed θ₀ > 0. Without this quantitative verification the yes/no separation does not necessarily carry over.

Authors: We agree that an explicit, uniform bound relating the embedding error δ to the objective gap is required for a fully rigorous transfer of the yes/no separation. The manuscript already states that the bounded-box comparison lemma controls perturbations sufficiently to preserve the gap, and the appendix supplies bit-complexity bounds on the coordinates that implicitly limit δ. However, the referee is correct that a direct, instance-size-independent comparison (showing that the lemma’s tolerance exceeds the minimal gap for every n and every fixed θ₀ > 0) is only sketched rather than written out in full. In the revised version we will add, in the appendix, an explicit calculation: we first derive a concrete upper bound δ ≤ 2^{-Ω(n)} on the Euclidean distance error introduced by the geometric embedding of the unit-disk graph; we then expand the proof of the comparison lemma to obtain a Lipschitz-style estimate showing that the change in 1ᵀZ⁻¹1 is at most O(k² θ₀ δ) (or the precise constant derived from the matrix derivative); finally we verify that this quantity is strictly smaller than the gap of at least c(θ₀) > 0 that arises from any difference in independent-set cardinality. Because θ₀ is fixed and the gap lower bound depends only on θ₀ and the two distance thresholds of the reduction, the inequality holds uniformly for all instance sizes. This addition will be placed immediately after the statement of the lemma. revision: yes

Circularity Check

0 steps flagged

No circularity in the NP-hardness reduction

full rationale

The paper's derivation is a direct many-one reduction from the independently established NP-hardness of Geometric Unit-Disk Independent Set, augmented by a new bounded-box comparison lemma introduced and justified within this manuscript to control perturbations of the nonlinear objective 1^T Z^{-1} 1 under Euclidean distance errors. The self-citation to the authors' prior general-metric result is purely contextual, as the Euclidean proof explicitly employs a different two-threshold geometric construction that does not rely on or reduce to the earlier argument. No load-bearing step equates a derived quantity to a fitted input, self-definition, or unverified self-citation chain; the argument remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the established NP-hardness of Geometric Unit-Disk Independent Set together with the correctness of the newly introduced bounded-box comparison lemma; no free parameters are fitted and no new entities are postulated.

axioms (1)

domain assumption Geometric Unit-Disk Independent Set is NP-hard
Standard result in computational geometry invoked as the source problem for the reduction.

pith-pipeline@v0.9.0 · 5638 in / 1151 out tokens · 81977 ms · 2026-05-10T01:05:16.981744+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Exact Dynamic Programming for Solow--Polasky Diversity Subset Selection on Lines and Staircases
cs.CG 2026-04 unverdicted novelty 6.0

An O(kn^2) dynamic program exactly maximizes Solow-Polasky diversity on ordered line metrics and monotone l1 staircases.

Reference graph

Works this paper leans on

12 extracted references · 1 canonical work pages · cited by 1 Pith paper · 1 internal anchor

[1]

B. N. Clark, C. J. Colbourn, and D. S. Johnson. Unit disk graphs.Discrete Mathematics, 86(1–3):165– 177, 1990

1990
[2]

G. K. Das, G. D. da Fonseca, and R. K. Jallu. Efficient independent set approximation in unit disk graphs.Discrete Applied Mathematics, 280:63–70, 2020

2020
[3]

M. T. M. Emmerich, K. Pereverdieva, and A. H. Deutz. Selecting a Maximum Solow–Polasky Diversity Subset in General Metric Spaces Is NP-hard.arXiv preprint arXiv:2604.05495, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[4]

Huntsman

S. Huntsman. Diversity enhancement via magnitude. InEvolutionary Multi-Criterion Optimization, volume 13970 ofLecture Notes in Computer Science, pages 377–390. Springer, 2023

2023
[5]

Leinster and M

T. Leinster and M. W. Meckes. Maximizing diversity in biology and beyond.Entropy, 18(3):88, 2016

2016
[6]

Leinster.Entropy and Diversity: The Axiomatic Approach

T. Leinster.Entropy and Diversity: The Axiomatic Approach. Cambridge University Press, 2021

2021
[7]

Pereverdieva, A

K. Pereverdieva, A. Deutz, T. Ezendam, T. Bäck, H. Hofmeyer, and M. T. M. Emmerich. Compar- ative analysis of indicators for multi-objective diversity optimization. InEvolutionary Multi-Criterion Optimization, volume 15513 ofLecture Notes in Computer Science, pages 58–71. Springer, 2025

2025
[8]

A. R. Solow and S. Polasky. Measuring biological diversity.Environmental and Ecological Statistics, 1(2):95–103, 1994

1994
[9]

Tkachenko and H

A. Tkachenko and H. Wang. Dominating set, independent set, discretek-center, dispersion, and related problems for planar points in convex position. In O. Beyersdorff, M. Pilipczuk, E. Pimentel, and N. K. Thăng, editors,42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025), volume 327 ofLeibniz International Proceedings in In...

2025
[10]

Ulrich, J

T. Ulrich, J. Bader, and L. Thiele. Defining and optimizing indicator-based diversity measures in multiobjective search. In R. Schaefer, C. Cotta, J. Kołodziej, and G. Rudolph, editors,Parallel Problem Solving from Nature, PPSN XI, volume 6238 ofLecture Notes in Computer Science, pages 707–717. Springer, 2010

2010
[11]

Ulrich.Exploring Structural Diversity in Evolutionary Algorithms

T. Ulrich.Exploring Structural Diversity in Evolutionary Algorithms. PhD thesis, ETH Zurich, 2012

2012
[12]

Yevseyeva, E

I. Yevseyeva, E. B. Lenselink, A. de Vries, A. P. IJzerman, A. H. Deutz, and M. T. Emmerich. Appli- cation of portfolio optimization to drug discovery.Information Sciences, 475:29–43, 2019. 10

2019