pith. machine review for the scientific record.

arxiv: 2605.11614 · v1 · submitted 2026-05-12 · 📊 stat.AP

Recognition: no theorem link

Fairness Testing for Algorithmic Pricing

Fei Huang, Giles Hooker

Pith reviewed 2026-05-13 01:42 UTC · model grok-4.3

classification 📊 stat.AP
keywords fairness testing · algorithmic pricing · discrimination audits · asymptotic variance · conditional demographic parity · proxy discrimination · insurance pricing · regression-based testing

The pith

Standard fairness audits for algorithmic pricing are invalid because the algorithms are deterministic.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Regulators commonly test algorithmic prices for discrimination by regressing outputs on protected attributes and controls, then applying ordinary least squares standard errors to the coefficients. The paper shows this procedure is structurally invalid: deterministic pricing means residuals reflect approximation error, not sampling variability, so classical standard errors are wrong in both direction and magnitude. The authors derive the correct asymptotic variance estimators for OLS and GLM audit regressions along with the proper cross-covariance term for proxy discrimination tests. When these corrected formulas are applied to quoted auto-insurance premiums from 34 Illinois insurers, every firm fails the conditional demographic parity test, with minority zip codes paying between $34 and $158 more per year than comparable-risk white zip codes.

Core claim

Algorithmic pricing systems are usually deterministic: the same inputs always produce the same price. When auditors fit a regression of those fixed prices on protected attributes and legitimate rating factors, the residuals represent model approximation error rather than random sampling noise. Classical OLS standard errors are therefore invalid in both direction and magnitude. The paper supplies the proper asymptotic variance expressions for both linear and generalized linear audit models, plus the adjusted cross-covariance needed for proxy discrimination tests. In the Illinois data, the corrected proxy discrimination test flags all 34 insurers as statistically significant, while the standard formula flags none.

What carries the argument

The corrected asymptotic variance estimators for OLS and GLM regressions of deterministic algorithmic prices, together with the adjusted cross-covariance formula for proxy discrimination testing.
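
As a rough illustration of why the classical formula can fail here, the textbook misspecification-robust ("sandwich") variance for OLS under i.i.d. sampling of audited units is sketched below. This is not the paper's own estimator, which also covers GLMs and a cross-covariance term; the symbols f, z_i, x_i, \beta^{*}, A, and B are notation assumed for this sketch only.

```latex
% Notation introduced for illustration; it is not taken from the paper.
% p_i = f(z_i): deterministic price for unit i, given the algorithm's full inputs z_i
% x_i: audit regressors (protected attribute and legitimate rating factors)
\beta^{*} = \arg\min_{\beta}\, \mathbb{E}\big[(p_i - x_i^{\top}\beta)^{2}\big],
\qquad e_i = p_i - x_i^{\top}\beta^{*}

\sqrt{n}\,(\hat\beta - \beta^{*}) \xrightarrow{d}
\mathcal{N}\big(0,\; A^{-1} B A^{-1}\big),
\qquad A = \mathbb{E}[x_i x_i^{\top}],\quad
B = \mathbb{E}[e_i^{2}\, x_i x_i^{\top}]

% Classical OLS formula, which matches the limit above only when B = \sigma^{2} A:
\widehat{\mathrm{Var}}_{\text{classical}}(\hat\beta) = \hat\sigma^{2}\,(X^{\top}X)^{-1}
```

Because e_i is a fixed function of the inputs rather than noise, nothing forces B to equal \sigma^{2} A, so the classical expression can be too large or too small depending on how the approximation error varies with the regressors.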

If this is right

  • All 34 Illinois insurers exhibit statistically significant conditional demographic parity violations under the corrected tests.
  • Minority zip codes pay $34 to $158 more per year than comparable-risk white zip codes after controlling for legitimate rating factors.
  • The standard proxy discrimination formula flags zero insurers, while the corrected version flags all 34, of which 16 exceed the substantive threshold.
  • The corrected variance framework applies to fairness testing of any deterministic algorithmic system that regulators evaluate with regression methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Regulators in credit and lending markets that rely on classical OLS standard errors may be systematically under-detecting discrimination in deterministic pricing systems.
  • Firms could adopt the corrected variance formulas for internal compliance checks before deploying new pricing algorithms.
  • Audit practice in other domains that use regression on deterministic model outputs would require analogous variance corrections to avoid false negatives.
  • Bootstrap or simulation-based checks on the same data could provide an independent verification route for the asymptotic formulas.
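
The bootstrap check in the last bullet could look roughly like the sketch below. It is only an illustration on synthetic data: the pricing rule, group share, and all function names are hypothetical, and the pairs bootstrap stands in as a generic independent check rather than anything the paper prescribes.

```python
import numpy as np

def ols_coef(X, y):
    # Least-squares coefficients of y on the columns of X.
    return np.linalg.lstsq(X, y, rcond=None)[0]

def pairs_bootstrap_se(X, y, j, n_boot=500, seed=0):
    """Bootstrap SE of coefficient j, resampling whole rows (x_i, y_i) together."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)      # sample units with replacement
        draws[b] = ols_coef(X[idx], y[idx])[j]
    return draws.std(ddof=1)

def classical_se(X, y, j):
    """Textbook OLS SE, s^2 (X'X)^{-1}, which treats residuals as i.i.d. noise."""
    n, k = X.shape
    resid = y - X @ ols_coef(X, y)
    cov = resid @ resid / (n - k) * np.linalg.inv(X.T @ X)
    return np.sqrt(cov[j, j])

# Hypothetical synthetic audit data (assumption, not the Illinois data):
# g is the protected-group indicator, x a rating factor with wider spread in group 1.
rng = np.random.default_rng(1)
n = 2000
g = rng.binomial(1, 0.2, n).astype(float)
x = rng.uniform(-1.0, 1.0, n) * np.where(g == 1, 2.0, 1.0)
y = 500.0 + 100.0 * x**3 + 50.0 * g           # deterministic price: no noise term
X = np.column_stack([np.ones(n), g, x])

boot = pairs_bootstrap_se(X, y, j=1)
classic = classical_se(X, y, j=1)
print(f"bootstrap SE: {boot:.2f}, classical SE: {classic:.2f}")
```

Resampling whole rows keeps each price attached to its own inputs, so the bootstrap spread reflects which units were sampled rather than any assumed noise in the prices. A large gap between the two numbers would suggest the classical formula is unreliable for that dataset.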

Load-bearing premise

That pricing algorithms produce fixed outputs for any given inputs, turning regression residuals into pure approximation error rather than statistical noise.

What would settle it

Simulate a known deterministic pricing function with planted discrimination coefficients, run both classical and corrected audit regressions, and check whether only the corrected variances achieve nominal coverage of the true coefficients.
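
A minimal sketch of such a simulation follows. The pricing rule, the planted surcharge of 50, the group share, and the function names are all hypothetical, and the Huber-White (HC0) sandwich estimator is used as a stand-in for the paper's corrected variance, whose exact form is not given in this summary.

```python
import numpy as np

def price(x, g):
    # Hypothetical deterministic pricing rule (no noise term): cubic in the
    # rating factor x, plus a planted surcharge of 50 for group g == 1.
    return 500.0 + 100.0 * x**3 + 50.0 * g

def audit(x, g):
    """Linear audit regression of price on [1, g, x].
    Returns the coefficient on g with its classical and sandwich (HC0) SEs."""
    X = np.column_stack([np.ones_like(x), g, x])
    y = price(x, g)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta                      # approximation error, not noise
    n, k = X.shape
    classical = resid @ resid / (n - k) * XtX_inv
    meat = (X * resid[:, None] ** 2).T @ X    # sum_i e_i^2 x_i x_i'
    sandwich = XtX_inv @ meat @ XtX_inv
    return beta[1], np.sqrt(classical[1, 1]), np.sqrt(sandwich[1, 1])

def coverage(n_reps=500, n=1000, target=50.0, seed=0):
    """Share of nominal 95% intervals that contain the planted coefficient."""
    rng = np.random.default_rng(seed)
    hits = np.zeros(2)
    for _ in range(n_reps):
        g = rng.binomial(1, 0.2, n).astype(float)
        # Group 1 has a wider spread of x, so its approximation error is larger.
        x = rng.uniform(-1.0, 1.0, n) * np.where(g == 1, 2.0, 1.0)
        b, se_c, se_s = audit(x, g)
        hits += [abs(b - target) <= 1.96 * se_c, abs(b - target) <= 1.96 * se_s]
    return hits / n_reps

classical_cov, sandwich_cov = coverage()
print(f"classical: {classical_cov:.3f}, sandwich: {sandwich_cov:.3f}")
```

The design keeps x symmetric around zero within each group, so the best linear approximation of the cubic rule leaves the coefficient on g at the planted value of 50, and coverage of 50 is a meaningful target. Under these assumptions one would expect the classical interval to fall well short of 95% while the sandwich interval stays close to it; a different pricing rule could push the classical error in the opposite direction.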

Figures

Figures reproduced from arXiv: 2605.11614 by Fei Huang, Giles Hooker.

Figure 1: End-to-end flow for the fairness audit protocol.
Original abstract

Algorithmic systems now set prices across auto insurance, credit, and lending markets, and regulators increasingly require firms to demonstrate that these systems do not discriminate against protected groups. The standard audit regresses pricing output on a protected attribute and legitimate rating factors, then tests the resulting coefficient using ordinary least squares standard errors. We show that this approach is structurally invalid. Pricing algorithms are usually deterministic, so residuals reflect approximation error rather than sampling variability, rendering classical standard errors invalid in both direction and magnitude. We derive correct asymptotic variance estimators for OLS and GLM audit regressions and the correct cross-covariance formula for proxy discrimination testing. Applied to quoted premiums from 34 Illinois auto insurers, every insurer fails the conditional demographic parity test, with minority zip codes paying $34-$158 more per year than comparable-risk white zip codes. The standard proxy discrimination formula flags zero insurers. However, our corrected formula identifies all 34 as statistically significant, of which 16 exceed the substantive threshold. Our framework provides statistically valid audit tools for any deterministic algorithmic system subject to regression-based fairness testing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript claims that standard OLS standard errors are structurally invalid for fairness audits of deterministic algorithmic pricing systems, as residuals represent approximation error rather than sampling variability. It derives corrected asymptotic variance estimators for OLS and GLM audit regressions, along with the appropriate cross-covariance for proxy discrimination testing. In an empirical application to quoted premiums from 34 Illinois auto insurers, the corrected methods reveal that all insurers fail the conditional demographic parity test, with minority zip codes paying $34 to $158 more per year than comparable-risk white zip codes, and 16 exceeding substantive thresholds, while the standard proxy discrimination formula flags none.

Significance. If the derivations hold, this provides a statistically valid framework for auditing deterministic algorithmic systems for fairness, directly relevant to regulatory requirements in insurance and credit markets. The empirical scale (34 insurers) and the contrast between standard and corrected results demonstrate practical impact. Credit is due for deriving the corrected formulas and for the reproducible application to real quoted-premium data.

major comments (2)
  1. [§2] §2 (Assumptions): The claim that classical OLS standard errors are invalid in both direction and magnitude rests on pricing being deterministic conditional on the observed audit covariates. If omitted factors induce conditional stochasticity (as raised by the stress-test note), the residuals contain an additional variance component not accounted for in the derived asymptotic variance, altering whether and how the classical formula is biased. This is load-bearing for the central invalidity argument.
  2. [§4] §4 (Derivations): The corrected asymptotic variance for the OLS coefficient and the cross-covariance formula for proxy discrimination are presented without an explicit statement of the regularity conditions under which the approximation error behaves as required for the asymptotics; this needs clarification to confirm the formulas apply to the finite-sample Illinois data.
minor comments (2)
  1. [Abstract] The abstract and introduction could more explicitly distinguish the conditioning set of the audit regression from the full input set of the pricing algorithm.
  2. [Table 2] Table 2: The reported premium differences lack standard errors from the corrected formula, which would aid interpretation of the $34-$158 range.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which help clarify the assumptions and strengthen the presentation of the derivations. We respond to each major comment below.

Point-by-point responses
  1. Referee: [§2] §2 (Assumptions): The claim that classical OLS standard errors are invalid in both direction and magnitude rests on pricing being deterministic conditional on the observed audit covariates. If omitted factors induce conditional stochasticity (as raised by the stress-test note), the residuals contain an additional variance component not accounted for in the derived asymptotic variance, altering whether and how the classical formula is biased. This is load-bearing for the central invalidity argument.

    Authors: We agree that the argument relies on determinism conditional on the audit covariates. In fairness audits of algorithmic pricing, the relevant object is the deployed system's output as a function of the observed inputs; any unobservables that induce additional stochasticity would render the audit regression misspecified for causal interpretation but would not restore classical sampling variability to the residuals. The stress-test note acknowledges potential omitted factors while preserving the deterministic assumption for the observed system. We will revise §2 to state this conditioning explicitly and discuss how omitted variables affect both the standard-error bias and the substantive interpretation of the audit. revision: partial

  2. Referee: [§4] §4 (Derivations): The corrected asymptotic variance for the OLS coefficient and the cross-covariance formula for proxy discrimination are presented without an explicit statement of the regularity conditions under which the approximation error behaves as required for the asymptotics; this needs clarification to confirm the formulas apply to the finite-sample Illinois data.

    Authors: We will add an explicit statement of the regularity conditions in §4. These include finite second moments of the approximation error, positive-definiteness of the limiting design matrix, and standard moment bounds ensuring asymptotic normality of the misspecified OLS estimator. With more than 1,000 zip codes per insurer, the Illinois application falls in the large-sample regime in which the corrected variance estimators are valid. We will also note that, under these conditions, the estimators are consistent, so the asymptotic approximation is reasonable at this sample size. revision: yes

Circularity Check

0 steps flagged

Derivations of asymptotic variances are independent and self-contained

Full rationale

The paper states the determinism assumption explicitly and derives the corrected asymptotic variance and cross-covariance formulas from first principles under that model. No quoted step reduces a claimed prediction or uniqueness result to a fitted parameter, self-citation chain, or ansatz imported from the authors' prior work. The central statistical claims rest on the stated modeling assumptions rather than on any circular reduction to the Illinois data or to previously fitted quantities. This is the normal case of an independent derivation; the reader's initial score of 2 reflects only the possibility of minor self-citation not visible in the provided text.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests primarily on the domain assumption of determinism in pricing algorithms and the validity of the newly derived asymptotic results; no free parameters or invented entities are described in the abstract.

axioms (1)
  • domain assumption Pricing algorithms are deterministic
    Explicitly stated in the abstract as the reason classical standard errors are invalid.

pith-pipeline@v0.9.0 · 5470 in / 1320 out tokens · 43921 ms · 2026-05-13T01:42:28.508335+00:00 · methodology

