Benchmarking Framework for Performance-Evaluation of Causal Inference Analysis

Yishai Shimoni, Chen Yanover, Ehud Karavani, Yaara Goldschmidt

classification: stat.ME, cs.LG, stat.ML

keywords: data, framework, causal, analysis, inference, outcome, validation, benchmarking

Causal inference analysis is the estimation of the effects of actions on outcomes. In the context of healthcare data, this means estimating the effect of counterfactual treatments (i.e., including treatments that were not observed) on a patient's outcome. Compared to classic machine learning methods, evaluation and validation of causal inference analysis is more challenging because ground-truth data on counterfactual outcomes can never be obtained in any real-world scenario. Here, we present a comprehensive framework for benchmarking algorithms that estimate causal effects. The framework includes unlabeled data for prediction, labeled data for validation, and code for automatic evaluation of algorithm predictions using both established and novel metrics. The data is based on real-world covariates, and the treatment assignments and outcomes are based on simulations, which provides the basis for validation. In this framework we address two questions: one of scaling, and the other of data censoring. The framework is available as open source code at https://github.com/IBM-HRL-MLHLS/IBM-Causal-Inference-Benchmarking-Framework
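
The idea of validating against simulated outcomes can be illustrated with a minimal sketch. It does not use the framework's code or data: the function names (`simulate`, `naive_ate`, `evaluate`), the random stand-in covariates, the logistic treatment model, and the constant effect of 2.0 are all assumptions made for illustration only. Because both potential outcomes are simulated, the true average treatment effect is known and an estimator's bias can be measured directly.

```python
import numpy as np


def simulate(covariates, rng, true_effect=2.0):
    """Simulate treatment assignment and both potential outcomes (hypothetical model)."""
    n = covariates.shape[0]
    # Treatment probability depends on the first covariate, which creates confounding.
    propensity = 1.0 / (1.0 + np.exp(-covariates[:, 0]))
    treatment = rng.binomial(1, propensity)
    noise = rng.normal(0.0, 1.0, size=n)
    y0 = covariates[:, 0] + noise   # potential outcome without treatment
    y1 = y0 + true_effect           # potential outcome with treatment
    # Only one potential outcome per unit is observed, as in real data.
    observed = np.where(treatment == 1, y1, y0)
    return treatment, observed, y0, y1


def naive_ate(treatment, observed):
    """Difference in mean observed outcome between treated and untreated units."""
    return observed[treatment == 1].mean() - observed[treatment == 0].mean()


def evaluate(estimated_ate, y0, y1):
    """Compare an estimate with the ground truth available only in simulation."""
    true_ate = (y1 - y0).mean()
    return {"true_ate": true_ate, "bias": estimated_ate - true_ate}


rng = np.random.default_rng(0)
# Stand-in covariates; the framework itself uses real-world covariates.
covariates = rng.normal(size=(5000, 3))
treatment, observed, y0, y1 = simulate(covariates, rng)
report = evaluate(naive_ate(treatment, observed), y0, y1)
print(report)
```

In this sketch the naive estimator overestimates the effect, since units with a larger first covariate are both more likely to be treated and have higher outcomes. An estimator that adjusts for the covariates could be scored with the same `evaluate` function.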

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

TCD-Arena: Assessing Robustness of Time Series Causal Discovery Methods Against Assumption Violations
cs.LG 2026-05 unverdicted novelty 7.0

TCD-Arena is a new customizable testing framework that runs millions of experiments to map how 33 different assumption violations affect time series causal discovery methods and shows ensembles can boost overall robustness.

RepFlow: Representation Enhanced Flow Matching for Causal Effect Estimation
cs.LG 2026-05 unverdicted novelty 5.0

RepFlow combines representation learning and conditional flow matching to estimate both point and distributional causal effects while mitigating selection bias via entropically regularized Wasserstein distance on norm...