arxiv: 2205.08809 · v1 · submitted 2022-05-18 · 💻 cs.SE

Software Fairness: An Analysis and Survey

Pith reviewed 2026-05-24 12:02 UTC · model grok-4.3

classification 💻 cs.SE
keywords software fairness · bias in machine learning · fairness measures · requirements engineering · survey · unstructured data · white-box analysis

The pith

A survey of 164 papers shows software fairness research understudies specification, certain measures, unstructured data, and white-box methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper surveys work on engineering fairness as a software property in learning-based systems. It collects and categorizes 164 publications according to the fairness measures used, the tasks studied, the type of analysis performed, the core ideas of each approach, and the level of access to the system under test. The authors conclude that several areas remain neglected, including how to specify fairness requirements upfront, measures that handle conditional or intersectional effects, analysis of audio, image, or text data, and methods that inspect or modify the internal workings of models during training. These gaps matter because they leave open the possibility that deployed systems will exhibit bias in ways that current techniques cannot detect or prevent.

Core claim

By analyzing 164 publications on the fairness of learning-based software, the authors observe that fairness specification and requirements engineering receive little attention, that conditional, sequential, and intersectional fairness measures remain under-explored, that unstructured datasets such as audio, image, and text are rarely examined, and that white-box, in-processing machine learning analysis methods are seldom applied.

What carries the argument

A categorization scheme that groups each publication by the fairness measure evaluated, the task addressed, the analysis type, the main technical idea, and the access level (black-box, white-box, or grey-box).
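
As a rough illustration of how such a scheme yields the frequency counts behind the findings, the sketch below tags each publication along the five dimensions and computes the share of papers per category. The `PaperRecord` type, the `distribution` helper, the category labels, and the four sample records are all hypothetical and invented for this example; they are not taken from the survey's corpus or its actual taxonomy.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PaperRecord:
    """One surveyed publication, tagged along the five dimensions (hypothetical type)."""
    title: str
    fairness_measure: str  # assumed labels, e.g. "group", "individual", "intersectional"
    task: str              # assumed labels, e.g. "classification", "sentiment analysis"
    analysis_type: str     # assumed labels, e.g. "testing", "verification", "mitigation"
    main_idea: str         # free-text summary of the technical idea
    access_level: str      # "black-box", "white-box", or "grey-box"


def distribution(records, dimension):
    """Return the share of records in each category along one dimension."""
    counts = Counter(getattr(record, dimension) for record in records)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


# Hypothetical records, invented for illustration only.
corpus = [
    PaperRecord("Paper A", "group", "classification", "testing",
                "input generation", "black-box"),
    PaperRecord("Paper B", "group", "classification", "mitigation",
                "reweighting", "black-box"),
    PaperRecord("Paper C", "individual", "classification", "verification",
                "probabilistic checking", "white-box"),
    PaperRecord("Paper D", "intersectional", "sentiment analysis", "testing",
                "metamorphic testing", "black-box"),
]

access_shares = distribution(corpus, "access_level")
measure_shares = distribution(corpus, "fairness_measure")
```

A claim such as "white-box methods are seldom applied" then reduces to a low share in `access_shares`, which is why the category boundaries and the assignment procedure matter so much for the conclusions.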

If this is right

  • More research effort should go into defining and validating fairness requirements before systems are built.
  • Techniques for measuring conditional, sequential, and intersectional fairness need development and evaluation.
  • Fairness studies should expand beyond structured tabular data to include audio, image, and text inputs.
  • White-box and in-processing analysis methods should be explored as alternatives to black-box testing.
  • Policy-based bias handling and human-in-the-loop mitigation remain open directions for the field.
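
To make the point about intersectional measures concrete, here is a minimal sketch of why checking each protected attribute separately can miss bias. It uses a demographic-parity-style gap (the largest difference in positive-outcome rate between subgroups) as one simple group measure chosen for illustration; the survey does not prescribe this particular measure. The attribute names `attr1` and `attr2`, the helper functions, and the toy records are assumptions invented for the example.

```python
from collections import defaultdict


def positive_rates(rows, keys):
    """Positive-outcome rate for each subgroup defined by the given attribute keys."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group] += 1
        positives[group] += row["outcome"]
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(rows, keys):
    """Largest difference in positive-outcome rate between any two subgroups."""
    rates = positive_rates(rows, keys).values()
    return max(rates) - min(rates)


# Toy data, invented for illustration: two binary protected attributes and a
# binary decision. Each (attr1, attr2) combination has two individuals.
rows = [
    {"attr1": "a", "attr2": "x", "outcome": 1},
    {"attr1": "a", "attr2": "x", "outcome": 1},
    {"attr1": "a", "attr2": "y", "outcome": 0},
    {"attr1": "a", "attr2": "y", "outcome": 0},
    {"attr1": "b", "attr2": "x", "outcome": 0},
    {"attr1": "b", "attr2": "x", "outcome": 0},
    {"attr1": "b", "attr2": "y", "outcome": 1},
    {"attr1": "b", "attr2": "y", "outcome": 1},
]

gap_attr1 = parity_gap(rows, ["attr1"])                   # single attribute
gap_attr2 = parity_gap(rows, ["attr2"])                   # single attribute
gap_intersection = parity_gap(rows, ["attr1", "attr2"])   # intersectional subgroups
```

In this toy data each attribute on its own shows no gap, while the intersectional subgroups differ maximally, which is the kind of effect a single-attribute measure cannot surface.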

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Without progress on the identified gaps, bias in real-world deployments may stay hidden in multi-attribute or sequential decision settings.
  • Tracking how many new papers address the four under-studied areas could serve as a simple metric for field progress in future surveys.
  • Socio-technical systems that combine automated checks with human oversight may be needed to handle fairness issues the current technical literature does not yet cover.

Load-bearing premise

The 164 publications collected by the survey's search strategy give a complete picture of current research on software fairness for learning-based systems.

What would settle it

An independent search that locates dozens of additional papers explicitly addressing fairness specification, intersectional or sequential fairness, unstructured datasets, or white-box in-processing techniques would undermine the claim that these topics are under-studied.

Original abstract

In the last decade, researchers have studied fairness as a software property. In particular, how to engineer fair software systems? This includes specifying, designing, and validating fairness properties. However, the landscape of works addressing bias as a software engineering concern is unclear, i.e., techniques and studies that analyze the fairness properties of learning-based software. In this work, we provide a clear view of the state-of-the-art in software fairness analysis. To this end, we collect, categorize and conduct an in-depth analysis of 164 publications investigating the fairness of learning-based software systems. Specifically, we study the evaluated fairness measure, the studied tasks, the type of fairness analysis, the main idea of the proposed approaches, and the access level (e.g., black, white, or grey box). Our findings include the following: (1) Fairness concerns (such as fairness specification and requirements engineering) are under-studied; (2) Fairness measures such as conditional, sequential, and intersectional fairness are under-explored; (3) Unstructured datasets (e.g., audio, image, and text) are barely studied for fairness analysis; and (4) Software fairness analysis techniques hardly employ white-box, in-processing machine learning (ML) analysis methods. In summary, we observed several open challenges including the need to study intersectional/sequential bias, policy-based bias handling, and human-in-the-loop, socio-technical bias mitigation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a survey of 164 publications on fairness as a software property in learning-based systems. It collects and categorizes these works by evaluated fairness measures, studied tasks, type of fairness analysis performed, main ideas of the proposed approaches, and access level (black/white/grey box). From the resulting distributions the authors conclude that fairness specification and requirements engineering are under-studied, that conditional/sequential/intersectional fairness measures are under-explored, that unstructured data (audio, image, text) receive little attention, and that white-box in-processing ML techniques are rarely employed; several open challenges are listed.

Significance. A well-executed survey of this size supplies a needed map of an emerging sub-area at the intersection of software engineering and machine-learning fairness. The explicit categorization along multiple orthogonal dimensions (measures, tasks, analysis type, access level) allows concrete identification of gaps and can usefully guide subsequent research. The scale of the corpus (164 papers) is itself a contribution provided the selection process is transparent.

major comments (2)
  1. [§3 (Search Strategy and Inclusion Criteria)] The paper must supply the exact search strings, databases, time window, and inclusion/exclusion rules used to arrive at the final set of 164 papers. These details are load-bearing for findings (1)–(4); without them it is impossible to assess whether relevant work on, e.g., intersectional fairness or unstructured data was simply missed, rendering the “under-explored” claims unverifiable.
  2. [§4 (Categorization and Analysis)] The taxonomy used to assign papers to fairness-measure and analysis-type categories should be stated explicitly, together with any inter-rater agreement statistics or validation procedure. The four headline findings rest directly on the resulting frequency counts; any ambiguity in categorization undermines the quantitative basis for claiming under-exploration.
minor comments (2)
  1. [Table 1] Table 1 (or equivalent summary table) would benefit from an additional column indicating the publication year range of the surveyed papers.
  2. [Discussion] A short paragraph discussing threats to validity (publication bias, keyword choice, venue coverage) should be added even if the search protocol is expanded.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

Thank you for the constructive feedback. We address each major comment below and will revise the manuscript to improve methodological transparency.

Point-by-point responses
  1. Referee: §3 (Search Strategy and Inclusion Criteria): the paper must supply the exact search strings, databases, time window, and inclusion/exclusion rules used to arrive at the final set of 164 papers. These details are load-bearing for findings (1)–(4); without them it is impossible to assess whether relevant work on, e.g., intersectional fairness or unstructured data was simply missed, rendering the “under-explored” claims unverifiable.

    Authors: We agree that full transparency in the search process is required to substantiate the gap claims. The current manuscript provides only a high-level overview of the search in §3. In the revision we will expand §3 with the exact search strings, the complete list of databases queried, the precise time window, and the full inclusion/exclusion criteria that produced the final corpus of 164 papers.

    revision: yes

  2. Referee: §4 (Categorization and Analysis): the taxonomy used to assign papers to fairness-measure and analysis-type categories should be stated explicitly, together with any inter-rater agreement statistics or validation procedure. The four headline findings rest directly on the resulting frequency counts; any ambiguity in categorization undermines the quantitative basis for claiming under-exploration.

    Authors: We concur that the categorization taxonomy and assignment procedure must be documented explicitly. Section 4 currently reports the resulting distributions without defining category boundaries or the validation steps used. We will revise §4 to include explicit definitions for each dimension of the taxonomy together with a description of the assignment and validation process (including any inter-rater statistics or resolution protocol).

    revision: yes
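
For readers unfamiliar with inter-rater statistics, the sketch below computes Cohen's kappa for two raters who independently assign each paper to a category. Neither the referee comment nor the response names a specific statistic, so the choice of Cohen's kappa is an assumption, and the rater labels are invented for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each rater labelled at random with their own frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical access-level labels from two raters for four papers.
rater_a = ["black-box", "black-box", "white-box", "white-box"]
rater_b = ["black-box", "black-box", "white-box", "black-box"]

kappa = cohens_kappa(rater_a, rater_b)
```

Here the raters agree on three of four papers (observed agreement 0.75) against a chance level of 0.5, giving a kappa of 0.5. Reporting such a value per taxonomy dimension would let readers judge how stable the frequency counts are.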

Circularity Check

0 steps flagged

No circularity: standard literature survey with external corpus

full rationale

This is a survey paper that collects and categorizes 164 external publications on software fairness. Its four main findings (under-studied fairness concerns, measures, unstructured data, and white-box methods) are direct inferences from the collected corpus rather than products of any internal derivation, equation, fitted parameter, or self-referential construction. No load-bearing self-citations, uniqueness theorems, ansatzes, or renamings of known results occur. The representativeness of the search strategy is a methodological concern, but it does not create circularity by the paper's own definitions or reductions. The derivation chain is self-contained, as expected of a conventional literature review.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The survey's gap findings rest on the assumption that its literature collection is complete and its categorization scheme is unbiased; no free parameters or invented entities are introduced.

axioms (1)
  • Domain assumption: the literature search strategy employed captured all relevant publications on software fairness up to the time of the survey.
    The claims of under-studied areas depend directly on the completeness and representativeness of the 164-paper corpus.
