On the cost of essentially fair clusterings

arXiv: 1811.10319 · v1 · pith:3D3EAF26 · submitted 2018-11-26 · cs.DS

Ioana O. Bercea, Martin Groß, Samir Khuller, Aounon Kumar, Clemens Rösner, Daniel R. Schmidt, Melanie Schmidt

keywords: fair, clustering, protected, approximation, center, classes, problem, already

Clustering is a fundamental tool in data mining. It partitions points into groups (clusters) and may be used to make decisions for each point based on its group. However, this process may harm protected (minority) classes if the clustering algorithm does not adequately represent them in desirable clusters -- especially if the data is already biased. At NIPS 2017, Chierichetti et al. proposed a model for fair clustering requiring the representation in each cluster to (approximately) preserve the global fraction of each protected class. Restricting to two protected classes, they developed both a 4-approximation for the fair $k$-center problem and an $O(t)$-approximation for the fair $k$-median problem, where $t$ is a parameter for the fairness model. For multiple protected classes, the best known result is a 14-approximation for fair $k$-center. We extend and improve the known results. Firstly, we give a 5-approximation for the fair $k$-center problem with multiple protected classes. Secondly, we propose a relaxed fairness notion under which we can give bicriteria constant-factor approximations for all of the classical clustering objectives $k$-center, $k$-supplier, $k$-median, $k$-means and facility location. The latter approximations are achieved by a framework that takes an arbitrary existing unfair (integral) solution and a fair (fractional) LP solution and combines them into an essentially fair clustering with a weakly supervised rounding scheme. In this way, a fair clustering can be established belatedly, in a situation where the centers are already fixed.
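The fairness requirement in the abstract (each cluster should roughly preserve the global fraction of every protected class) can be illustrated with a small check. The following is a minimal sketch, not the paper's algorithm: the function names `max_fairness_deviation` and `k_center_cost`, the use of absolute deviation as the fairness measure, and the toy data are all assumptions made for illustration. The paper's exact and "essentially fair" definitions may differ in detail.

```python
from collections import Counter
from math import dist


def class_fractions(labels):
    """Fraction of each protected class among the given labels."""
    counts = Counter(labels)
    n = len(labels)
    return {c: counts[c] / n for c in counts}


def max_fairness_deviation(assignment, classes):
    """Largest absolute gap between a cluster's class fraction and the
    global fraction, over all clusters and classes (hypothetical measure;
    0.0 means every cluster mirrors the global proportions exactly)."""
    global_frac = class_fractions(classes)
    clusters = {}
    for cluster_id, cls in zip(assignment, classes):
        clusters.setdefault(cluster_id, []).append(cls)
    worst = 0.0
    for members in clusters.values():
        local = class_fractions(members)
        for c, g in global_frac.items():
            worst = max(worst, abs(local.get(c, 0.0) - g))
    return worst


def k_center_cost(points, centers, assignment):
    """k-center objective: the largest distance from a point to its center."""
    return max(dist(p, centers[a]) for p, a in zip(points, assignment))


# Toy instance (assumed data): two clusters of two points each.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
centers = [(0, 0), (10, 0)]
assignment = [0, 0, 1, 1]

balanced = ["red", "blue", "red", "blue"]   # each cluster is half red, half blue
skewed = ["red", "red", "blue", "blue"]     # each cluster is a single class

print(max_fairness_deviation(assignment, balanced))  # 0.0
print(max_fairness_deviation(assignment, skewed))    # 0.5
print(k_center_cost(points, centers, assignment))    # 1.0
```

Both labelings have the same $k$-center cost on this instance, but only the first is fair under this measure. That gap between cost and fairness is what an approximation guarantee for the fair problem has to account for.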

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Fast and effective algorithms for fair clustering at scale
cs.LG 2026-05 conditional novelty 6.0

A framework plus three heuristics for fair clustering that give precise cost-fairness control and scale to millions of objects while beating existing solvers on benchmark data.