Differential Parity: Relative Fairness Between Two Sets of Decisions
Pith reviewed 2026-05-24 12:03 UTC · model grok-4.3
The pith
Differential parity defines relative fairness as the independence of decision differences from a sensitive attribute.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Differential parity holds that the difference between two decision sets should be statistically independent of a sensitive attribute; when a reference set of ground-truth or trusted decisions exists, this independence supplies a new group fairness condition distinct from separation and sufficiency, while in the absence of any reference it directly quantifies relative bias between the two sets.
What carries the argument
Differential parity, the statistical independence between the difference of two decision vectors and a sensitive attribute.
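As a minimal sketch of what checking this independence could look like when both decision sets cover the same subjects, the snippet below compares the empirical distribution of `d1 - d2` across groups. The function names, the toy data, and the use of total variation distance as the summary statistic are illustrative assumptions, not the paper's estimator.

```python
from collections import Counter


def difference_distribution(d1, d2, a, group):
    """Empirical distribution of d1 - d2 among subjects whose attribute equals `group`."""
    diffs = [x - y for x, y, g in zip(d1, d2, a) if g == group]
    if not diffs:
        raise ValueError(f"no subjects in group {group!r}")
    counts = Counter(diffs)
    return {value: n / len(diffs) for value, n in counts.items()}


def differential_parity_gap(d1, d2, a):
    """Largest total variation distance between any two groups' difference distributions.

    A value of 0.0 means the difference is distributed identically in every
    group in this sample; larger values indicate stronger dependence on `a`.
    """
    groups = sorted(set(a))
    dists = {g: difference_distribution(d1, d2, a, g) for g in groups}
    gap = 0.0
    for i, g in enumerate(groups):
        for h in groups[i + 1:]:
            support = set(dists[g]) | set(dists[h])
            tv = 0.5 * sum(abs(dists[g].get(v, 0.0) - dists[h].get(v, 0.0))
                           for v in support)
            gap = max(gap, tv)
    return gap


# Hypothetical toy data: two binary decision sets on the same eight subjects.
d1 = [1, 1, 0, 0, 1, 1, 0, 0]
d2 = [1, 0, 0, 0, 0, 0, 0, 0]
a = [0, 0, 0, 0, 1, 1, 1, 1]

gap = differential_parity_gap(d1, d2, a)
print(gap)  # 0.25
```

On a finite sample the gap will rarely be exactly zero, so in practice it would need to be paired with a significance test or a tolerance chosen for the application.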
If this is right
- When a reference decision set exists, differential parity supplies an additional group fairness test that can be checked alongside existing criteria.
- Without any reference, the measure still identifies which of two decision processes exhibits greater dependence on the sensitive attribute.
- The same framework applies to any pair of decision sources, including human versus algorithmic outputs or two different models.
- The bridging model extends the test to populations that never overlap, removing the requirement that both sets act on identical individuals.
Where Pith is reading between the lines
- The measure could be used to audit whether an updated model reduces or increases bias relative to its predecessor on new data.
- Repeated application over time would track whether fairness between successive decision systems is improving or drifting.
- The independence test could be adapted to multiple sensitive attributes simultaneously by checking joint independence.
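The last point, checking several sensitive attributes jointly, can be sketched by grouping subjects on the tuple of their attribute values. This is an illustration only: the helper names and data are hypothetical, and comparing group means is a weaker proxy for independence (equal means are necessary but not sufficient).

```python
from collections import defaultdict


def mean_difference_by_cell(d1, d2, *attributes):
    """Mean of d1 - d2 within each combination of attribute values."""
    cells = defaultdict(list)
    for x, y, key in zip(d1, d2, zip(*attributes)):
        cells[key].append(x - y)
    return {key: sum(v) / len(v) for key, v in cells.items()}


def joint_mean_gap(d1, d2, *attributes):
    """Spread (max minus min) of the mean difference across attribute cells."""
    means = mean_difference_by_cell(d1, d2, *attributes)
    return max(means.values()) - min(means.values())


# Hypothetical toy data with two binary sensitive attributes.
d1 = [1, 1, 0, 1, 1, 0, 1, 0]
d2 = [1, 0, 0, 0, 1, 0, 0, 0]
sex = [0, 0, 0, 0, 1, 1, 1, 1]
age = [0, 0, 1, 1, 0, 0, 1, 1]

print(joint_mean_gap(d1, d2, sex))       # 0.25
print(joint_mean_gap(d1, d2, age))       # 0.25
print(joint_mean_gap(d1, d2, sex, age))  # 0.5
```

In this toy data the joint gap exceeds either single-attribute gap, which is the kind of intersectional effect a one-attribute-at-a-time check can understate.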
Load-bearing premise
A machine learning model can be trained to predict what decisions would have been made on the other set's subjects with enough accuracy to estimate the true differential parity.
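To make the premise concrete, here is a minimal sketch of the bridging idea with a 1-nearest-neighbour predictor standing in for the machine learning model. The abstract does not specify the model, so the predictor, the single feature, and all names below are assumptions made for illustration.

```python
def nearest_neighbor_predict(train_x, train_y, query_x):
    """Predict each query's decision as that of its closest training point (1-NN)."""
    preds = []
    for q in query_x:
        best = min(
            range(len(train_x)),
            key=lambda i: sum((u - v) ** 2 for u, v in zip(train_x[i], q)),
        )
        preds.append(train_y[best])
    return preds


def mean_gap(d1, d2, a):
    """Mean of d1 - d2 in group a == 1 minus the mean in group a == 0 (binary attribute)."""
    diffs = {0: [], 1: []}
    for x, y, g in zip(d1, d2, a):
        diffs[g].append(x - y)
    return sum(diffs[1]) / len(diffs[1]) - sum(diffs[0]) / len(diffs[0])


# Population 1: decided only by the first decision maker (hypothetical data).
x1 = [(0.9,), (0.8,), (0.2,), (0.1,)]
a1 = [0, 1, 0, 1]
d1 = [1, 1, 0, 0]

# Population 2: decided only by the second decision maker (hypothetical data).
x2 = [(0.95,), (0.7,), (0.3,), (0.05,)]
d2 = [1, 0, 0, 0]

# Bridge: estimate what the second decision maker would decide on population 1.
d2_on_pop1 = nearest_neighbor_predict(x2, d2, x1)
estimated_gap = mean_gap(d1, d2_on_pop1, a1)
print(d2_on_pop1, estimated_gap)  # [1, 0, 0, 0] 0.5
```

The estimate is only as trustworthy as the predictor: any prediction error that differs between groups feeds directly into `estimated_gap`.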
What would settle it
Compute differential parity directly on an overlapping population where both decision sets are observed, then compare that value to the value obtained after replacing one set with model predictions; a large discrepancy would show the bridging step fails.
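The check described above can be sketched as follows, assuming a binary sensitive attribute and using the group difference in mean `d1 - d2` as a simple stand-in for the full independence statistic. The predicted decisions are supplied as a plain list, standing in for the output of whatever bridging model is used; all names and data are hypothetical.

```python
def mean_gap(d1, d2, a):
    """Mean of d1 - d2 in group a == 1 minus the mean in group a == 0."""
    diffs = {0: [], 1: []}
    for x, y, g in zip(d1, d2, a):
        diffs[g].append(x - y)
    return sum(diffs[1]) / len(diffs[1]) - sum(diffs[0]) / len(diffs[0])


def bridge_discrepancy(d1, d2_observed, d2_predicted, a):
    """Return (direct gap, bridged gap, absolute discrepancy) on an overlapping population."""
    direct = mean_gap(d1, d2_observed, a)
    bridged = mean_gap(d1, d2_predicted, a)
    return direct, bridged, abs(direct - bridged)


# Hypothetical overlapping population where both decision sets are observed.
d1 = [1, 1, 0, 0, 1, 1, 0, 0]
d2_observed = [1, 0, 0, 0, 1, 1, 0, 0]
d2_predicted = [1, 0, 0, 0, 1, 0, 0, 0]  # one prediction error, in group 1
a = [0, 0, 0, 0, 1, 1, 1, 1]

direct, bridged, discrepancy = bridge_discrepancy(d1, d2_observed, d2_predicted, a)
print(direct, bridged, discrepancy)  # -0.25 0.0 0.25
```

Here a single misprediction concentrated in one group is enough to erase the gap that the observed decisions show, which is the failure mode the test is meant to expose.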
Original abstract
With AI systems widely applied to assist humans in decision-making processes such as talent hiring, school admission, and loan approval, there is an increasing need to ensure that the decisions made are fair. One major challenge for analyzing fairness in decisions is that the standards are highly subjective and contextual: there is no consensus on what absolute fairness means for every scenario. Moreover, different fairness standards often conflict with each other. To bypass this issue, this work aims to test relative fairness in decisions. That is, instead of defining what "absolutely" fair decisions are, we propose to test the relative fairness of one decision set against another with differential parity: the difference between two sets of decisions should be independent of a certain sensitive attribute. This proposed notion of differential parity fairness has the following benefits: (1) it avoids the ambiguous and contradictory definition of what absolutely fair decisions are; (2) when a reference set (of ground truth or reliable fair decisions) is available, differential parity can serve as a new group fairness notion (similar to but different from separation and sufficiency); (3) even when no reference set is available, it reveals the relative preference or bias between different decision sets. One limitation of differential parity is that it requires the two sets of decisions under comparison to be made on the same data subjects. To overcome this limitation, we propose to utilize a machine learning model to bridge the gap between the two sets of decisions made on different data and estimate the differential parity.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes differential parity as a relative fairness metric: the difference between two sets of decisions should be independent of a sensitive attribute. It claims this bypasses debates over absolute fairness, serves as a group fairness notion (distinct from separation/sufficiency) when a reference set exists, reveals relative bias otherwise, and can be estimated via an ML model when the two decision sets apply to different subjects.
Significance. If the ML bridging construction can be made rigorous, differential parity would supply a practical, reference-based alternative to standard group fairness definitions for comparing decision systems. The manuscript's explicit acknowledgment of the same-subject limitation and attempt to address it via imputation is a constructive step.
major comments (1)
- [Abstract] Abstract (limitation paragraph): the claim that differential parity remains usable when decision sets are made on different subjects rests on training an ML model to impute missing decisions. For the resulting statistic (difference independent of A) to retain its interpretation, the imputation error must itself be independent of A. The abstract provides no training objective, validation procedure against A, or error-independence guarantee; without this property the estimated parity is confounded by the auxiliary model rather than reflecting the original decisions.
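One way to probe the error-independence condition raised in this comment is to measure the bridging model's signed error per group on a validation set where the true decisions are known. Because `d1 - d2_pred = (d1 - d2_true) - (d2_pred - d2_true)`, the group gap in the bridged mean difference equals the true gap minus the gap in mean signed error. The sketch below assumes a binary attribute and hypothetical data; equal mean errors are a necessary check, not a proof of independence.

```python
def mean_signed_error_by_group(d_true, d_pred, a):
    """Mean of (predicted - true) decision within each sensitive-attribute group."""
    errors = {}
    for t, p, g in zip(d_true, d_pred, a):
        errors.setdefault(g, []).append(p - t)
    return {g: sum(v) / len(v) for g, v in errors.items()}


# Hypothetical validation set with known decisions and model predictions.
d2_true = [1, 0, 0, 0, 1, 1, 0, 0]
d2_pred = [1, 0, 0, 0, 1, 0, 0, 0]
a = [0, 0, 0, 0, 1, 1, 1, 1]

by_group = mean_signed_error_by_group(d2_true, d2_pred, a)
error_gap = by_group[1] - by_group[0]
print(by_group, error_gap)  # {0: 0.0, 1: -0.25} -0.25
```

A nonzero `error_gap` shifts the estimated parity by the same amount in the opposite direction, so it gives a direct read on how much of an estimated disparity could be an artifact of the auxiliary model.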
minor comments (1)
- [Abstract] The abstract would be strengthened by an explicit mathematical statement of differential parity (e.g., P(D1 - D2 | A) = P(D1 - D2)) before describing its benefits.
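Written out for discrete decisions, the statement the referee asks for might read as below. The symbols D_1, D_2, and A follow the referee's notation; the second line is a standard consequence of independence added here for illustration, not a formula taken from the paper.

```latex
\[
  (D_1 - D_2) \perp\!\!\!\perp A
  \quad\Longleftrightarrow\quad
  P(D_1 - D_2 = \delta \mid A = a) = P(D_1 - D_2 = \delta)
  \quad \text{for all } \delta \text{ and all } a \text{ with } P(A = a) > 0 .
\]
\[
  \Longrightarrow\quad
  \mathbb{E}[D_1 - D_2 \mid A = a] = \mathbb{E}[D_1 - D_2]
  \quad \text{for all such } a .
\]
```

The implication runs one way only: equal conditional means can hold even when the full distributions differ across groups.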
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the single major comment below and will revise the manuscript accordingly.
- Referee: [Abstract] Abstract (limitation paragraph): the claim that differential parity remains usable when decision sets are made on different subjects rests on training an ML model to impute missing decisions. For the resulting statistic (difference independent of A) to retain its interpretation, the imputation error must itself be independent of A. The abstract provides no training objective, validation procedure against A, or error-independence guarantee; without this property the estimated parity is confounded by the auxiliary model rather than reflecting the original decisions.
  Authors: We agree that the abstract's treatment of the ML imputation approach is insufficiently precise. The manuscript acknowledges the same-subject limitation and proposes an ML bridge, but does not articulate a training objective, validation against A, or error-independence condition in the abstract. Without such a guarantee the imputed differential parity can indeed be confounded. We will revise the abstract (and, if space permits, the limitation paragraph) to state explicitly that the bridging construction requires the auxiliary model's errors to be independent of A (or to be validated as such) and that this remains an assumption rather than a proven property of the current proposal.
  Revision: yes
Circularity Check
No circularity: definition is a direct statistical independence condition with no self-referential reduction.
The paper defines differential parity explicitly as the requirement that the difference between two decision sets is independent of the sensitive attribute. This is a primitive statistical notion introduced without reference to fitted parameters, prior self-citations, or any construction that would make the output equivalent to its inputs by definition. The ML bridging proposal for mismatched subjects is presented as an estimation technique rather than a load-bearing derivation step; no equations are supplied that would allow the estimated parity to reduce tautologically to the model outputs themselves. The central claim therefore remains self-contained and does not trigger any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: two sets of decisions can be compared directly when made on the same subjects, or through ML-estimated decisions when made on different subjects.