Recognition: unknown
RepFlow: Representation Enhanced Flow Matching for Causal Effect Estimation
Pith reviewed 2026-05-09 15:21 UTC · model grok-4.3
The pith
RepFlow balances treated and control representations by minimizing a Wasserstein distance between them, then uses conditional flow matching to estimate both point and full distributional causal effects.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
RepFlow mitigates selection bias by minimizing the entropically regularized Wasserstein distance between treated and control representations, introduces an L2 normalization constraint on latent representations for numerical stability, and employs conditional flow matching so that the resulting balanced representations enable accurate capture of the full distribution of potential outcomes.
What carries the argument
Entropically regularized Wasserstein distance minimization to align treated and control representations, combined with L2 normalization and conditional flow matching to model potential outcome distributions.
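A minimal sketch of that balancing machinery, assuming a log-domain Sinkhorn implementation in PyTorch; the function name sinkhorn_distance, the latent dimensions, and the hyperparameters eps and n_iters are illustrative choices, not details taken from the paper.

```python
# Minimal sketch of the balancing term: log-domain Sinkhorn estimate of the
# entropically regularized Wasserstein distance between treated and control
# latents, computed on L2-normalized representations. Illustrative only; the
# paper's actual encoder, weights, and hyperparameters are not specified here.
import math
import torch
import torch.nn.functional as F

def sinkhorn_distance(z_a, z_b, eps=0.1, n_iters=50):
    """Entropic OT transport cost between two point clouds with uniform weights."""
    cost = torch.cdist(z_a, z_b, p=2) ** 2          # pairwise squared Euclidean costs
    n, m = cost.shape
    log_a, log_b = -math.log(n), -math.log(m)       # log of uniform marginal weights
    f = torch.zeros(n)
    g = torch.zeros(m)
    for _ in range(n_iters):                        # Sinkhorn updates on the dual potentials
        f = -eps * torch.logsumexp((g[None, :] - cost) / eps + log_b, dim=1)
        g = -eps * torch.logsumexp((f[:, None] - cost) / eps + log_a, dim=0)
    plan = torch.exp((f[:, None] + g[None, :] - cost) / eps + log_a + log_b)
    return (plan * cost).sum()                      # cost of the entropic transport plan

# L2-normalize latents before balancing, per the abstract's stability constraint.
z_treated = F.normalize(torch.randn(64, 16), dim=1)
z_control = F.normalize(torch.randn(128, 16), dim=1)
balance_loss = sinkhorn_distance(z_treated, z_control)
```

In the full method this term would presumably be minimized jointly with the outcome model, so the encoder learns representations whose treated and control marginals align.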
If this is right
- Enables estimation of full distributions of potential outcomes in addition to point estimates.
- Reduces selection bias effects in observational data for causal tasks.
- Achieves consistent outperformance over prior methods on point and distributional metrics across benchmarks.
- Applies directly to domains like healthcare and economics that need distributional causal insights.
Where Pith is reading between the lines
- The same balancing-plus-flow pattern could be tested with other conditional generative models such as diffusion processes for counterfactual sampling.
- If the representations truly remove selection bias, the approach might improve performance in high-dimensional or multi-treatment settings where traditional balancing struggles.
- Extending the framework to longitudinal data would require checking whether the Wasserstein term can be adapted to time-dependent representations.
Load-bearing premise
Minimizing the entropically regularized Wasserstein distance between treated and control representations plus L2 normalization produces representations free enough of selection bias for the flow model to recover true potential outcome distributions without adding new biases.
What would settle it
On synthetic data where the true distributions of potential outcomes are known, RepFlow's estimated distributions show larger discrepancies from ground truth than a version that omits the Wasserstein balancing step.
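One way to operationalize that check, sketched here as an illustration rather than taken from the paper: on synthetic data with known potential outcomes, compare each variant's sampled outcome distribution to ground truth with an empirical 1-D Wasserstein distance. The arrays below are synthetic stand-ins, not RepFlow outputs.

```python
# Illustrative falsification check: compare sampled potential-outcome distributions
# to a known ground truth with a 1-D empirical Wasserstein (quantile-matching) distance.
# All arrays are synthetic stand-ins; none come from the paper or its code.
import numpy as np

def empirical_w1(samples_a, samples_b, n_quantiles=200):
    """Approximate W1 between two 1-D samples by averaging quantile gaps."""
    qs = np.linspace(0.0, 1.0, n_quantiles)
    return np.mean(np.abs(np.quantile(samples_a, qs) - np.quantile(samples_b, qs)))

rng = np.random.default_rng(0)
y1_true = rng.normal(2.0, 1.0, size=5000)        # ground-truth Y(1) on synthetic data
y1_full = rng.normal(2.1, 1.05, size=5000)       # stand-in for the full model's samples
y1_ablate = rng.normal(2.6, 1.3, size=5000)      # stand-in for the no-balancing ablation

print("full model vs truth:", empirical_w1(y1_full, y1_true))
print("ablation   vs truth:", empirical_w1(y1_ablate, y1_true))
```

If the full model's discrepancy were not smaller than the ablation's across such synthetic settings, the load-bearing premise above would be in trouble.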
read the original abstract
Estimating causal effects from observational data has become increasingly critical in diverse fields including healthcare, economics, and social policy. The fundamental challenge in causal inference arises from the missing counterfactuals and the selection bias. Existing methods are largely limited to point estimates and lack the capacity for distribution modeling. In this work, we propose RepFlow, a novel framework that formulates causal effect estimation as a joint optimization problem integrating representation learning with Conditional Flow Matching (CFM). RepFlow mitigates selection bias by minimizing the entropically regularized Wasserstein distance between treated and control representations. To enhance numerical stability, we further introduce an $L_2$ normalization constraint on latent representations. This balanced representation enables the flow model to accurately capture the distribution of potential outcomes. Extensive experiments across a wide range of benchmarks demonstrate that RepFlow consistently outperforms existing methods in both point and distributional causal effect estimation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes RepFlow, a framework that formulates causal effect estimation as joint optimization of representation learning (via entropically regularized Wasserstein distance between treated/control groups plus L2 normalization) with Conditional Flow Matching (CFM) to model distributions of potential outcomes, claiming consistent outperformance over existing methods on benchmarks for both point and distributional estimates.
Significance. If the central claims hold, the work would offer a meaningful advance by extending causal inference beyond point estimates to full distributional modeling of counterfactuals using modern flow-based generative models. The explicit combination of Wasserstein balancing with CFM is a fresh direction, and the reported extensive benchmark experiments (if rigorously controlled) would provide useful empirical evidence for practitioners in healthcare and policy domains.
major comments (2)
- [Method] The central modeling assumption—that minimizing the entropically regularized Wasserstein distance between treated and control representations (plus L2 normalization) yields latents that are sufficient for the CFM to recover unbiased distributions of potential outcomes—is stated without supporting argument or derivation. No demonstration is given that marginal alignment removes selection bias while preserving the conditional information needed for correct extrapolation to the missing counterfactual regime, rather than fitting an artifact of the balancing objective.
- [Method] The abstract and method description provide no equations, no explicit loss function combining the Wasserstein term with the CFM objective, and no analysis of how the flow-matching training on balanced latents avoids introducing new biases when imputing counterfactuals.
minor comments (1)
- The abstract claims performance gains but supplies no error bars, no description of experimental controls, and no list of baselines; these details are essential for evaluating the strongest empirical claim.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and positive view of the work's potential contribution. We agree that the manuscript requires stronger theoretical grounding for the balancing assumption and more explicit mathematical details. We will make major revisions to address both points.
read point-by-point responses
Referee: The central modeling assumption—that minimizing the entropically regularized Wasserstein distance between treated and control representations (plus L2 normalization) yields latents that are sufficient for the CFM to recover unbiased distributions of potential outcomes—is stated without supporting argument or derivation. No demonstration is given that marginal alignment removes selection bias while preserving the conditional information needed for correct extrapolation to the missing counterfactual regime, rather than fitting an artifact of the balancing objective.
Authors: We acknowledge that the current manuscript presents the balancing step as a modeling choice without a dedicated supporting argument or derivation. In the revised version we will add a new subsection (3.2) that motivates the approach by connecting it to the literature on representation balancing for causal inference (e.g., Shalit et al., CFR). The rationale is that entropically regularized Wasserstein alignment reduces dependence between the latent representation and treatment assignment, thereby mitigating selection bias while the subsequent conditional flow-matching step models the outcome distribution given the (now balanced) latent and treatment. We will also add an ablation study that measures retained predictive power of the balanced latents for observed outcomes to show that conditional information is not collapsed. A full formal proof that marginal alignment guarantees unbiased distributional extrapolation remains an open theoretical question; we will therefore explicitly list this as a limitation and a direction for future work. revision: yes
Referee: The abstract and method description provide no equations, no explicit loss function combining the Wasserstein term with the CFM objective, and no analysis of how the flow-matching training on balanced latents avoids introducing new biases when imputing counterfactuals.
Authors: We agree that the abstract (by design) and the current method write-up omit the combined objective and bias analysis. In the revision we will (i) state the joint loss explicitly in Section 3: L = L_CFM(θ; Z, T, Y) + λ W_ε(μ_t, μ_c) + γ ||Z||_2^2, where L_CFM is the conditional flow-matching loss, W_ε is the entropically regularized Wasserstein distance between treated and control latent distributions, and the γ-weighted L2 term enforces the normalization constraint on latents; (ii) add a paragraph analyzing bias: because the flow is trained only on factual (Z, T, Y) pairs and counterfactuals are generated by swapping T while keeping the same balanced Z, the procedure inherits the standard ignorability assumption and does not introduce additional bias beyond what is already present in the representation; (iii) include a short discussion of how the flow-matching transport in latent space enables distributional imputation without re-introducing selection bias. These changes will be reflected in both the Method section and a new “Discussion of Assumptions and Limitations” paragraph. revision: yes
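To make the stated objective concrete, below is a minimal sketch of a joint loss of that form: a conditional flow-matching term with straight conditional paths, a compact Sinkhorn balance term (restated so the block runs standalone), and a soft L2 penalty on latents. It is one plausible reading of the rebuttal's formula, not the authors' code; the network shapes, the straight-path CFM parameterization, and the choice of a soft penalty rather than a hard normalization are all assumptions.

```python
# Sketch of a joint objective of the form L_CFM + lambda * W_eps + gamma * ||Z||^2,
# assuming straight conditional paths for flow matching and uniform-weight Sinkhorn
# for the balance term. Module and function names are illustrative, not the paper's.
import math
import torch
import torch.nn as nn

def cfm_loss(velocity_net, z, t, y, sigma_min=1e-3):
    """Conditional flow matching: regress the constant velocity of a straight path
    from Gaussian noise to the factual outcome y, conditioned on (z, t)."""
    s = torch.rand(y.shape[0], 1)                        # flow time in [0, 1]
    y0 = torch.randn_like(y)                             # base noise sample
    y_s = (1 - (1 - sigma_min) * s) * y0 + s * y         # point on the conditional path
    target_v = y - (1 - sigma_min) * y0                  # that path's velocity
    pred_v = velocity_net(torch.cat([y_s, s, z, t], dim=1))
    return ((pred_v - target_v) ** 2).mean()

def sinkhorn_cost(cost, eps=0.1, iters=50):
    """Compact log-domain Sinkhorn transport cost with uniform weights."""
    n, m = cost.shape
    la, lb = -math.log(n), -math.log(m)
    f, g = torch.zeros(n), torch.zeros(m)
    for _ in range(iters):
        f = -eps * torch.logsumexp((g[None, :] - cost) / eps + lb, dim=1)
        g = -eps * torch.logsumexp((f[:, None] - cost) / eps + la, dim=0)
    plan = torch.exp((f[:, None] + g[None, :] - cost) / eps + la + lb)
    return (plan * cost).sum()

def joint_loss(encoder, velocity_net, x, t, y, lam=1.0, gamma=0.1):
    z = encoder(x)                                        # latent representation Z
    treated = t.squeeze(1) > 0.5
    balance = sinkhorn_cost(torch.cdist(z[treated], z[~treated], p=2) ** 2)
    l2_penalty = (z ** 2).sum(dim=1).mean()               # soft reading of the L2 constraint
    return cfm_loss(velocity_net, z, t, y) + lam * balance + gamma * l2_penalty

# Toy usage with made-up dimensions: 10 covariates, 16-dim latent, scalar outcome.
encoder = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 16))
velocity_net = nn.Sequential(nn.Linear(1 + 1 + 16 + 1, 64), nn.ReLU(), nn.Linear(64, 1))
x, t, y = torch.randn(32, 10), torch.randint(0, 2, (32, 1)).float(), torch.randn(32, 1)
loss = joint_loss(encoder, velocity_net, x, t, y)
```

In this reading, counterfactual sampling would integrate the learned velocity field from noise to an outcome while feeding the same z with the treatment flag flipped, which is exactly where the referee's worry about inherited bias applies.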
Circularity Check
No significant circularity detected in derivation chain
full rationale
The paper proposes a joint optimization framework that combines representation learning (via entropically regularized Wasserstein distance between treated/control groups plus L2 normalization) with Conditional Flow Matching to estimate causal effects. No equations or derivation steps are visible that reduce a claimed prediction or result to its own inputs by construction, such as fitting a parameter and then relabeling a related quantity as a prediction. The Wasserstein term functions as an explicit regularizer for bias mitigation rather than a self-referential target. Performance claims rest on benchmark experiments rather than a closed mathematical loop or load-bearing self-citation chain. The argument therefore contains no internal circularity; its claims stand or fall on external benchmark validation.