Counterfactual Reasoning for Fair Clinical Risk Prediction

Daisy Yi Ding; Nigam H. Shah; Stephen Pfohl; Tony Duan

arXiv: 1907.06260 · v1 · pith:5B7IQKZMnew · submitted 2019-07-14 · cs.LG · cs.CY · stat.ML

Pith reviewed 2026-05-24 21:25 UTC · model grok-4.3

classification: cs.LG · cs.CY · stat.ML

keywords: counterfactual fairness, clinical risk prediction, machine learning, healthcare disparities, variational autoencoder, equalized odds, electronic health records

The pith

Clinical risk models can be trained toward individual-level fairness by requiring identical predictions for a patient and that patient's counterfactual whenever the factual and counterfactual outcomes match.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an augmented counterfactual fairness criterion that extends group-level equalized odds to individuals. It requires a model to output the same risk prediction for a real patient and a version of that patient with an altered sensitive attribute, provided the factual and counterfactual health outcomes are the same. A variational autoencoder is trained under an assumed causal graph to generate the necessary counterfactual patient records from electronic health data. The resulting models can be trained to meet this criterion while trading off some predictive performance on tasks such as length-of-stay and mortality prediction. A sympathetic reader would care because the approach offers a concrete route to reduce the risk that observational biases in health records produce systematically different treatment recommendations for otherwise similar patients.

Core claim

We develop an augmented counterfactual fairness criteria to extend the group fairness criteria of equalized odds to an individual level by requiring that the same prediction be made for a patient, and a counterfactual patient resulting from changing a sensitive attribute, if the factual and counterfactual outcomes do not differ. We use a variational autoencoder to perform counterfactual inference in the context of an assumed causal graph.
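
A hedged formalization may help fix ideas. The notation below is assumed here for illustration and is not quoted from the paper: $h$ is the risk model, $x$ is a patient's observed record with sensitive attribute $a$ and outcome $y$, and $x_{A \leftarrow a'}$ and $y_{A \leftarrow a'}$ are the record and outcome that same patient would have had under an intervention setting the attribute to $a'$.

```latex
\[
  y_{A \leftarrow a'} = y
  \;\Longrightarrow\;
  h\!\left(x_{A \leftarrow a'}\right) = h(x)
  \qquad \text{for every alternative value } a' \text{ of the sensitive attribute.}
\]
```

When counterfactuals are sampled from a generative model rather than known exactly, the equality would more naturally be read as equality between distributions of predictions. Which reading the paper adopts is not stated in this summary.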

What carries the argument

The augmented counterfactual fairness criterion, which demands equal model output for factual and counterfactual patients whenever their outcomes are identical, implemented through variational-autoencoder counterfactual generation under a fixed causal graph.
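
The criterion can be turned into a number that measures how far a model is from satisfying it. The following is a minimal sketch, not the paper's implementation: the function name, the use of a mean absolute gap, and the example scores are all assumptions made for illustration.

```python
def counterfactual_gap_penalty(p_factual, p_counterfactual, y_factual, y_counterfactual):
    """Mean absolute prediction gap over pairs whose outcomes match.

    Pairs whose factual and counterfactual outcomes differ are ignored,
    because the criterion places no constraint on them.
    """
    gaps = [
        abs(pf - pcf)
        for pf, pcf, yf, ycf in zip(p_factual, p_counterfactual, y_factual, y_counterfactual)
        if yf == ycf
    ]
    # With no matching-outcome pairs there is nothing to penalize.
    return sum(gaps) / len(gaps) if gaps else 0.0


# Hypothetical risk scores for three patients and their counterfactuals.
penalty = counterfactual_gap_penalty(
    p_factual=[0.2, 0.7, 0.9],
    p_counterfactual=[0.3, 0.7, 0.1],
    y_factual=[0, 1, 1],
    y_counterfactual=[0, 1, 0],
)
print(round(penalty, 3))
```

The third pair has a large prediction gap but is excluded, since its factual and counterfactual outcomes differ. Only the first two pairs contribute to the penalty.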

If this is right

Models trained this way can satisfy an individual-level fairness constraint while still producing usable risk scores for prolonged inpatient stay and mortality.
The method supplies a tunable knob between the strength of the fairness constraint and retained predictive accuracy inside the learned generative model.
The same construction can be applied to any clinical prediction task whose data admit an assumed causal graph and a variational autoencoder fit.
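
The "tunable knob" can be illustrated with a penalized objective: predictive loss plus a weight times a fairness gap. This is a sketch under stated assumptions rather than the paper's objective. The weight `lam`, the two candidate models, and every number below are hypothetical, and all pairs are assumed to have matching factual and counterfactual outcomes.

```python
import math


def log_loss(p, y):
    """Mean binary cross-entropy of predicted risks p against labels y."""
    return -sum(
        math.log(pi) if yi == 1 else math.log(1.0 - pi) for pi, yi in zip(p, y)
    ) / len(y)


def mean_gap(p_factual, p_counterfactual):
    """Mean absolute gap; here every pair is assumed to share its outcome."""
    return sum(abs(a - b) for a, b in zip(p_factual, p_counterfactual)) / len(p_factual)


def objective(model, y, lam):
    """Predictive loss plus lam times the fairness gap."""
    return log_loss(model["factual"], y) + lam * mean_gap(
        model["factual"], model["counterfactual"]
    )


y = [1, 0, 1, 0]
# Two hypothetical candidate models (all numbers are made up).
candidates = {
    "sharp_but_unequal": {
        "factual": [0.9, 0.1, 0.8, 0.2],
        "counterfactual": [0.6, 0.4, 0.5, 0.5],
    },
    "flat_but_equal": {
        "factual": [0.7, 0.3, 0.7, 0.3],
        "counterfactual": [0.7, 0.3, 0.7, 0.3],
    },
}


def preferred(lam):
    """Name of the candidate with the lower penalized objective."""
    return min(candidates, key=lambda name: objective(candidates[name], y, lam))


for lam in (0.0, 1.0):
    print(lam, preferred(lam))
```

With `lam = 0` the more accurate candidate wins despite its prediction gap. With `lam = 1` the candidate that treats factual and counterfactual records identically wins, at the cost of a higher log loss.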

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the assumed causal graph is misspecified, the fairness guarantee collapses even if the numerical criterion is met on generated data.
The approach could be tested by checking whether real interventions that alter a sensitive attribute (while holding other factors fixed) produce the same model output as predicted by the counterfactuals.
The criterion may be combined with other causal fairness notions that operate at the group level rather than replacing them.

Load-bearing premise

The variational autoencoder and chosen causal graph correctly recover the data-generating process so that generated counterfactuals are valid.
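
Counterfactual generation under a causal model is usually described in three steps: abduction (infer the individual's latent factors), action (change the sensitive attribute), and prediction (regenerate the record). The toy below is only a sketch of that pattern. It replaces the learned, probabilistic VAE encoder and decoder with a fixed additive model, and `GROUP_MEAN` and all function names are hypothetical.

```python
# Hypothetical group-specific feature means; in the paper's setting a learned
# VAE decoder would play this role, not a fixed lookup table.
GROUP_MEAN = {0: [1.0, 1.0], 1: [2.0, 0.0]}


def abduct(x, a):
    """Abduction: recover the patient's latent residual z given record x and attribute a."""
    return [xi - mi for xi, mi in zip(x, GROUP_MEAN[a])]


def generate(z, a):
    """Prediction: regenerate a record from latent z under attribute a."""
    return [zi + mi for zi, mi in zip(z, GROUP_MEAN[a])]


def counterfactual_record(x, a, a_new):
    """Abduction, action (swap a for a_new), then prediction."""
    z = abduct(x, a)
    return generate(z, a_new)


x_cf = counterfactual_record([5.0, 2.0], a=0, a_new=1)
print(x_cf)
```

The premise above amounts to assuming that the real encoder and decoder are as trustworthy as this toy's exact inversion. If the latent factors are recovered poorly, or the graph omits a pathway, the generated record is not the patient's true counterfactual.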

What would settle it

Train the model under the criterion and then generate a set of factual-counterfactual pairs whose true outcomes are known to be identical; if the model outputs differ on any pair, the criterion is not satisfied.
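
The check described above can be sketched as a small audit routine. The tuple layout, the function name, and the tolerance `tol` are assumptions; a tolerance is used because exact equality of floating-point risk scores is rarely attainable in practice.

```python
def find_violations(pairs, tol=1e-6):
    """Return indices of matching-outcome pairs whose predictions differ by more than tol.

    Each pair is (p_factual, p_counterfactual, y_factual, y_counterfactual).
    """
    return [
        i
        for i, (pf, pcf, yf, ycf) in enumerate(pairs)
        if yf == ycf and abs(pf - pcf) > tol
    ]


# Hypothetical pairs: the second breaks the criterion, the third is unconstrained.
pairs = [
    (0.40, 0.40, 1, 1),
    (0.40, 0.55, 0, 0),
    (0.10, 0.90, 0, 1),
]
violations = find_violations(pairs)
print(violations)
```

An empty result means the criterion holds on the supplied pairs up to the tolerance. It says nothing about whether the pairs themselves are valid counterfactuals.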

Original abstract

The use of machine learning systems to support decision making in healthcare raises questions as to what extent these systems may introduce or exacerbate disparities in care for historically underrepresented and mistreated groups, due to biases implicitly embedded in observational data in electronic health records. To address this problem in the context of clinical risk prediction models, we develop an augmented counterfactual fairness criteria to extend the group fairness criteria of equalized odds to an individual level. We do so by requiring that the same prediction be made for a patient, and a counterfactual patient resulting from changing a sensitive attribute, if the factual and counterfactual outcomes do not differ. We investigate the extent to which the augmented counterfactual fairness criteria may be applied to develop fair models for prolonged inpatient length of stay and mortality with observational electronic health records data. As the fairness criteria is ill-defined without knowledge of the data generating process, we use a variational autoencoder to perform counterfactual inference in the context of an assumed causal graph. While our technique provides a means to trade off maintenance of fairness with reduction in predictive performance in the context of a learned generative model, further work is needed to assess the generality of this approach.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper defines an individual-level extension of equalized odds via counterfactuals inferred from a VAE on an assumed graph, but supplies no validation that the counterfactuals are accurate enough to support the fairness claim.

The core move is to require that a model output the same prediction for a patient and their counterfactual version (after flipping the sensitive attribute) whenever the factual and counterfactual outcomes match. They implement this with a VAE that generates the counterfactuals inside a fixed causal graph, then apply it to length-of-stay and mortality prediction on EHR data. That combination of equalized odds with individual counterfactual fairness looks new relative to the cited prior work on group fairness and basic counterfactual fairness.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes an augmented counterfactual fairness criterion extending equalized odds to the individual level by requiring identical predictions for a factual patient and their counterfactual (obtained by intervening on a sensitive attribute) whenever the factual and counterfactual outcomes match. The criterion is applied to clinical risk prediction tasks (prolonged inpatient length of stay and mortality) on observational EHR data; because the criterion is ill-defined without the data-generating process, a variational autoencoder is used to perform counterfactual inference under an assumed causal graph, with the method positioned as enabling a controllable trade-off between fairness maintenance and predictive performance.

Significance. If the VAE recovers accurate counterfactual distributions under a correctly specified graph, the approach would supply a concrete mechanism for individual-level fairness in healthcare ML that is grounded in causal reasoning rather than purely observational group metrics, addressing a recognized limitation of standard fairness criteria in clinical settings.

major comments (2)

[Abstract] The augmented individual-level fairness criterion requires that predictions remain identical for factual and counterfactual patients when outcomes match; this guarantee is only well-defined given accurate counterfactual inference, yet the manuscript supplies no validation, sensitivity analysis, or recovery metrics demonstrating that the VAE under the assumed causal graph recovers the true counterfactual distributions.
[Abstract] The claimed ability to trade off fairness against predictive performance is described only at a high level; without reported quantitative results, error analysis, or ablation on the VAE or graph assumptions, it is impossible to evaluate whether the fairness guarantee is achieved at acceptable performance cost or whether graph misspecification breaks the criterion.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment below.

Referee: [Abstract] The augmented individual-level fairness criterion requires that predictions remain identical for factual and counterfactual patients when outcomes match; this guarantee is only well-defined given accurate counterfactual inference, yet the manuscript supplies no validation, sensitivity analysis, or recovery metrics demonstrating that the VAE under the assumed causal graph recovers the true counterfactual distributions.

Authors: We agree that the fairness criterion depends on accurate counterfactual inference from the VAE under the assumed graph. In observational EHR data, ground-truth counterfactuals are unavailable, so direct recovery metrics cannot be computed. We will add sensitivity analyses with respect to the causal graph and VAE parameters in the revised manuscript.
Revision: yes

Referee: [Abstract] The claimed ability to trade off fairness against predictive performance is described only at a high level; without reported quantitative results, error analysis, or ablation on the VAE or graph assumptions, it is impossible to evaluate whether the fairness guarantee is achieved at acceptable performance cost or whether graph misspecification breaks the criterion.

Authors: We acknowledge that the trade-off is described at a high level. The manuscript demonstrates the approach on the two tasks but lacks detailed ablations and error analysis. We will expand the experiments with quantitative results, error analysis, and ablations on the VAE and graph assumptions in the revision.
Revision: yes

Circularity Check

0 steps flagged

No circularity: fairness criterion and VAE inference remain independent of fitted outputs

The paper explicitly defines the augmented counterfactual fairness criterion as an extension of equalized odds requiring identical predictions for factual and counterfactual patients when outcomes match. It states that the criterion is ill-defined without the data generating process and therefore adopts an assumed causal graph plus VAE for inference, without any equation showing that the resulting predictions or fairness metric reduce by construction to quantities defined solely by the fitted VAE parameters. No self-citation is load-bearing for the central claim, no uniqueness theorem is imported from the authors' prior work, and no ansatz is smuggled via citation. The derivation is therefore self-contained against external benchmarks of the assumed graph and generative model.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central approach rests on an assumed causal graph and the ability of a VAE to perform valid counterfactual inference; these are not derived from data but posited to make the fairness criterion operational.

axioms (1)

Domain assumption: an assumed causal graph for the data-generating process.
Explicitly stated as necessary because the fairness criterion is ill-defined without knowledge of the data-generating process.
