Automated versus do-it-yourself methods for causal inference: Lessons learned from a data analysis competition

Dan Cervone; Jennifer Hill; Marc Scott; Uri Shalit; Vincent Dorie

arxiv: 1707.02641 · v5 · pith:YRU3VGB7new · submitted 2017-07-09 · 📊 stat.ME · stat.ML

Vincent Dorie, Jennifer Hill, Uri Shalit, Marc Scott, Dan Cervone

classification 📊 stat.ME stat.ML

keywords methods, inference, causal, data, researchers, strategies, analyses, analysis

Statisticians have made great progress in creating methods that reduce our reliance on parametric assumptions. However, this explosion in research has resulted in a breadth of inferential strategies that both create opportunities for more reliable inference and complicate the choices that an applied researcher has to make and defend. Relatedly, researchers advocating for new methods typically compare their method to at best 2 or 3 other causal inference strategies and test using simulations that may or may not be designed to tease out flaws equally in all the competing methods. The causal inference data analysis challenge, "Is Your SATT Where It's At?", launched as part of the 2016 Atlantic Causal Inference Conference, sought to make progress on both of these issues. The researchers creating the data testing grounds were distinct from the researchers submitting methods whose efficacy would be evaluated. Results from 30 competitors across the two versions of the competition (black box algorithms and do-it-yourself analyses) are presented along with post-hoc analyses that reveal information about the characteristics of causal inference strategies and settings that affect performance. The most consistent conclusion was that methods that flexibly model the response surface perform better overall than methods that fail to do so. Finally, new methods are proposed that combine features of several of the top-performing submitted methods.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Causal Discovery via Statistical Power (CDSP)
stat.ME 2026-05 unverdicted novelty 6.0

CDSP uses an effect-size asymmetry assumption and statistical power to estimate causal directions from bivariate data with uncertainty, reducing false discoveries by 18% on 100 benchmark pairs.

A renormalization-group inspired lattice-based framework for piecewise generalized linear models
stat.ME 2026-05 unverdicted novelty 6.0

RG-inspired lattice models for piecewise GLMs provide explicit interpretable partitions and a replica-analysis-derived scaling law for regularization that allows increasing complexity without expected rise in generali...