Hierarchical Causal Uplift Modeling in Overlapping Customer Journeys

Jorge Pellegrini

arxiv: 2604.24533 · v1 · submitted 2026-04-27 · 📊 stat.ME

Pith reviewed 2026-05-08 02:08 UTC · model grok-4.3

keywords causal uplift modeling · overlapping journeys · multiplicative effects · Monte Carlo estimation · marketing campaigns · incremental impact · synergies · customer journeys

The pith

A hierarchical model decomposes the pure incremental effects of overlapping marketing journeys by treating each as a multiplicative factor.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Platforms running several marketing journeys at the same time create overlapping user exposures that bias standard A/B tests toward understating each journey's true added value. This paper builds a model that separates those pure effects from the observed marginal lifts and from any synergies or cannibalization between journeys. Journeys are represented as multiplicative factors whose interactions are estimated together with uncertainty in overlap rates and single-journey results. When applied to roughly three million users the model finds that the underlying pure lifts are substantially larger than the experimentally recorded ones, that synergies are positive yet modest, and that the combined global lift it predicts matches the measured total. A reader would care because accurate recovery of incremental impact would let campaigns be sized and timed according to their real contribution rather than their diluted observed contribution.

Core claim

The Hierarchical Causal Lift Model decomposes pure and global effects under journey overlap by modeling each journey as a multiplicative causal factor and letting interaction terms capture synergies or cannibalizations. Regularized nonlinear least squares is paired with Monte Carlo simulation to propagate uncertainty from overlap proportions, observed lifts, and single-journey effects. On an active base of approximately three million users, the estimated pure lifts exceed the experimentally observed marginal lifts, modest positive synergies appear between journeys, and the model's predicted global lift closely reproduces the experimentally measured value.

What carries the argument

Multiplicative causal factors with interaction terms, estimated by regularized nonlinear least squares and Monte Carlo simulation that samples uncertainty in overlaps and lifts.
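As an illustration of how uncertainty can be propagated through a multiplicative structure, the sketch below samples uncertain inputs and pushes each draw through a lift formula. The abstract gives no equations, so the functional form in `global_lift`, the exposure shares, the Gaussian and Beta sampling distributions, and every number are assumptions made for this example, not the paper's specification. The sketch also skips the regularized least-squares fit and shows only the sampling step.

```python
import random
import statistics


def global_lift(a, b, g, p_a, p_b, p_ab):
    """Hypothetical global lift over the whole base (not the paper's equation).

    a, b : pure lifts of journeys A and B (0.08 means +8%)
    g    : interaction term; g > 0 is synergy, g < 0 is cannibalization
    p_a, p_b, p_ab : shares of users exposed to A only, B only, and both
    """
    lift_both = (1 + a) * (1 + b) * (1 + g) - 1
    return p_a * a + p_b * b + p_ab * lift_both


def monte_carlo_global_lift(n_draws=5000, seed=0):
    """Sample uncertain inputs; return (mean, 2.5%, 97.5%) of the global lift."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        a = rng.gauss(0.08, 0.01)         # assumed estimate and standard error
        b = rng.gauss(0.05, 0.01)
        g = rng.gauss(0.01, 0.005)
        p_ab = rng.betavariate(200, 800)  # assumed overlap share, centred on 0.20
        draws.append(global_lift(a, b, g, 0.30, 0.25, p_ab))
    draws.sort()
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws)]
    return statistics.mean(draws), lower, upper


if __name__ == "__main__":
    mean, lower, upper = monte_carlo_global_lift()
    print(f"global lift: {mean:.4f} (95% interval {lower:.4f} to {upper:.4f})")
```

The interval reflects only the uncertainty fed into the sampler. It says nothing about whether the assumed functional form is right, which is the concern raised later in the referee report.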

If this is right

Pure lifts of individual journeys are significantly larger than the marginal lifts recorded in standard experiments.
Interactions between journeys are positive but modest in size.
The global lift obtained by combining pure effects and interactions matches the total lift measured experimentally.
Incremental effects remain interpretable and recoverable even when journeys overlap.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the decomposition holds, platforms could adjust the timing or creative content of journeys to increase positive interactions rather than merely avoiding overlap.
The same multiplicative structure might be used to estimate incremental effects in non-marketing settings where multiple interventions occur simultaneously, such as public-health campaigns.
Simulated data sets in which true pure effects are known in advance could serve as an independent check on whether the Monte Carlo recovery is unbiased.
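The simulated-data check in the last point could start from something as small as the sketch below. It assumes a hypothetical data-generating process with four isolated exposure cells (no journey, A only, B only, both) and multiplicative effects; the baseline rate, cell size, and effect values are invented for the example, and the estimator is a plain ratio of cell rates rather than the paper's pipeline.

```python
import random


def simulate_rate(rng, p, n):
    """Observed conversion rate from n independent Bernoulli(p) users."""
    return sum(rng.random() < p for _ in range(n)) / n


def recovery_check(a=0.10, b=0.05, g=0.02, base=0.20, n=200_000, seed=1):
    """Simulate four exposure cells from known effects, then re-estimate them."""
    rng = random.Random(seed)
    r_none = simulate_rate(rng, base, n)
    r_a = simulate_rate(rng, base * (1 + a), n)
    r_b = simulate_rate(rng, base * (1 + b), n)
    r_ab = simulate_rate(rng, base * (1 + a) * (1 + b) * (1 + g), n)

    a_hat = r_a / r_none - 1
    b_hat = r_b / r_none - 1
    # Interaction: what the joint cell shows beyond the product of pure factors.
    g_hat = r_ab * r_none / (r_a * r_b) - 1
    return {"a": (a, a_hat), "b": (b, b_hat), "g": (g, g_hat)}


if __name__ == "__main__":
    for name, (truth, estimate) in recovery_check().items():
        print(f"{name}: true {truth:+.3f}, estimated {estimate:+.3f}")
```

A fuller version would replace the isolated cells with overlapping exposures and run the actual estimation procedure, so that any bias introduced by the overlap correction itself becomes visible.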

Load-bearing premise

Each journey functions as an independent multiplicative causal factor whose uncertainties are correctly captured by sampling overlap proportions, observed lifts, and single-journey effects.

What would settle it

A new experiment that isolates every journey and measures its lift directly would falsify the model if those isolated lifts do not match the pure lifts recovered from the overlapping data.

Original abstract

Digital travel platforms often operate multiple marketing journeys simultaneously, resulting in overlapping user exposures that bias the standard A/B lift estimation. Because traditional lift experiments assume treatment isolation, the observed lifts reflect only marginal effects and may substantially underestimate the total incremental impact of each journey. This work introduces a Hierarchical Causal Lift Model that decomposes pure and global effects under journey overlap. Each journey is modeled as a multiplicative causal factor, and the interaction terms capture potential synergies or cannibalizations. The model is estimated through a Monte Carlo framework that incorporates uncertainty in overlap proportions, observed lifts, and single-journey effects. Regularized non-linear least squares are complemented with Monte Carlo simulation to quantify parameter uncertainty and assess the robustness of the solution. Applied to an active user base of approximately three million users, the model reveals positive but modest synergies between journeys and shows that pure lifts are significantly larger than those observed experimentally. The predicted global lift closely matches the experimentally measured value, demonstrating the ability of the model to recover incremental effects in an interpretable manner.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The global lift match follows from the model structure, so it does not independently confirm the pure-lift decomposition.

The paper fits a multiplicative hierarchical model to marginal lifts from overlapping journeys, recovers larger pure effects plus modest positive interactions, and reports that the implied aggregate lift matches the experimental value. The Monte Carlo step propagates uncertainty in overlaps and single-journey estimates through regularized nonlinear least squares. That framing addresses a common practical problem in digital marketing where standard A/B tests ignore simultaneous exposures. The explicit interaction terms and uncertainty treatment are a clear incremental step beyond basic uplift models. The application to three million users shows the method can be run at scale. The central limitation is that the global match is automatic once the product structure and overlap proportions are fixed; any decomposition that respects the same functional form will reproduce the aggregate by construction. Without separate identification checks, ground-truth simulations, or hold-out experiments that isolate journeys, the claim that pure lifts are substantially larger remains tied to the modeling assumptions rather than external evidence. The abstract supplies no equations, no residual diagnostics, and no comparison to additive or other decompositions, which makes it hard to judge how sensitive the results are to post-hoc choices. This work is mainly for marketing analytics teams that already run multi-journey campaigns and need a pragmatic correction for overlap bias. Researchers focused on causal identification would want to see synthetic recovery tests before treating the pure-lift numbers as reliable. It deserves peer review because the problem is widespread and the proposed structure is concrete, but the authors should be asked to demonstrate that the decomposition recovers known effects when overlaps are present.

Referee Report

3 major / 2 minor

Summary. The paper introduces a Hierarchical Causal Lift Model to decompose pure causal lifts of individual marketing journeys from observed marginal lifts when journeys overlap. Each journey is treated as a multiplicative causal factor whose interactions capture synergies or cannibalization; parameters are obtained via regularized nonlinear least squares inside a Monte Carlo procedure that propagates uncertainty in overlap proportions, observed lifts, and single-journey effects. On an active user base of roughly three million users, the fitted model reports positive but modest synergies, substantially larger pure lifts than the experimentally observed marginal lifts, and a predicted global lift that closely reproduces the experimentally measured aggregate lift.

Significance. If the decomposition were shown to recover ground-truth pure effects, the framework would supply a practical, interpretable method for correcting overlap bias in multi-journey marketing experiments and for quantifying interaction effects. The Monte Carlo propagation of uncertainty in overlap proportions and the explicit multiplicative structure are constructive elements that could be extended to other overlapping-treatment settings.

major comments (3)

[Abstract and estimation procedure] The headline finding that pure lifts are 'significantly larger' than observed experimental lifts is obtained by fitting the multiplicative model directly to the observed marginal lifts. Because the reported global lift is a deterministic function of the fitted pure lifts, overlap proportions, and interaction coefficients, agreement between predicted and measured global lift holds by construction for any decomposition that preserves the product structure, and therefore supplies no independent evidence that the individual pure-lift estimates are correct.
[Methods and results sections] No simulation study or ground-truth recovery experiment is reported that demonstrates the hierarchical multiplicative model recovers known pure effects when overlap proportions and marginal lifts are generated from a known data-generating process; the Monte Carlo procedure only quantifies parameter uncertainty conditional on the assumed functional form.
[Results section] The claim of 'positive but modest synergies' rests on the sign and magnitude of the fitted interaction coefficients, yet no baseline comparison to an additive model, no sensitivity analysis to the regularization strength, and no cross-validation or hold-out assessment of the nonlinear least-squares fit are provided to show that the interaction terms are required by the data rather than artifacts of the chosen parameterization.
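The third major comment turns on the fact that an interaction is always measured against a baseline for "no interaction". The sketch below computes the interaction implied by the same three lifts on an additive scale and on a multiplicative scale. The lift values are hypothetical and chosen to show that the sign can flip; the two function names are introduced here for illustration only.

```python
def additive_interaction(lift_a, lift_b, lift_both):
    """Excess of the joint lift over the sum of the individual lifts."""
    return lift_both - (lift_a + lift_b)


def multiplicative_interaction(lift_a, lift_b, lift_both):
    """Excess of the joint factor over the product of the individual factors."""
    return (1 + lift_both) / ((1 + lift_a) * (1 + lift_b)) - 1


# Hypothetical lifts: +10% for A alone, +5% for B alone, +15.2% for both.
lift_a, lift_b, lift_both = 0.10, 0.05, 0.152

add = additive_interaction(lift_a, lift_b, lift_both)
mul = multiplicative_interaction(lift_a, lift_b, lift_both)
print(f"additive scale:       {add:+.4f}")
print(f"multiplicative scale: {mul:+.4f}")
```

With these numbers the additive reading is a small synergy and the multiplicative reading is a small cannibalization, which is why a comparison against an additive baseline matters before interpreting the fitted interaction terms.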

minor comments (2)

[Abstract] The abstract states that the model is estimated 'through a Monte Carlo framework' but does not specify the number of draws, the exact sampling distributions used for overlap proportions and observed lifts, or convergence diagnostics for the regularized NLS step.
[Model specification] Notation for the multiplicative factors and interaction terms is introduced without an explicit equation reference or table of parameter definitions, making it difficult to verify how the global lift is exactly reconstructed from the pure-lift estimates.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects of validation and robustness. We respond to each major comment below and indicate the planned revisions.

Referee: [Abstract and estimation procedure] The headline finding that pure lifts are 'significantly larger' than observed experimental lifts is obtained by fitting the multiplicative model directly to the observed marginal lifts. Because the reported global lift is a deterministic function of the fitted pure lifts, overlap proportions, and interaction coefficients, agreement between predicted and measured global lift holds by construction for any decomposition that preserves the product structure, and therefore supplies no independent evidence that the individual pure-lift estimates are correct.

Authors: We agree that the agreement between the predicted global lift and the experimentally measured aggregate lift follows by construction from preserving the multiplicative structure during fitting, and therefore does not provide independent validation of the pure-lift estimates. The model's primary utility is in decomposing the observed marginal lifts into pure effects and interactions to quantify overlap-induced bias. The result that pure lifts exceed marginal lifts is a direct consequence of the positive overlap proportions and near-unity interaction coefficients. In revision we will add explicit language in the abstract and discussion clarifying that the global-lift match is a consistency check rather than external validation.
revision: partial

Referee: [Methods and results sections] No simulation study or ground-truth recovery experiment is reported that demonstrates the hierarchical multiplicative model recovers known pure effects when overlap proportions and marginal lifts are generated from a known data-generating process; the Monte Carlo procedure only quantifies parameter uncertainty conditional on the assumed functional form.

Authors: We acknowledge that the current manuscript does not contain a simulation study demonstrating recovery of known pure effects. We will add a new simulation section that generates synthetic data from a known data-generating process with controlled overlaps and marginal lifts, applies the full estimation pipeline, and reports recovery accuracy for the pure lifts together with calibration of the Monte Carlo uncertainty intervals.
revision: yes

Referee: [Results section] The claim of 'positive but modest synergies' rests on the sign and magnitude of the fitted interaction coefficients, yet no baseline comparison to an additive model, no sensitivity analysis to the regularization strength, and no cross-validation or hold-out assessment of the nonlinear least-squares fit are provided to show that the interaction terms are required by the data rather than artifacts of the chosen parameterization.

Authors: We agree that additional model diagnostics are needed to support the interaction-term claims. In the revision we will add (i) a direct comparison of fit quality and residual diagnostics between the multiplicative model and an additive baseline, (ii) sensitivity plots of the interaction coefficients across a range of regularization strengths, and (iii) a hold-out evaluation in which the model is estimated on a random subset of journeys and assessed on the remainder to test whether the modest positive synergies improve predictive performance.
revision: yes
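The sensitivity analysis in item (ii) can be pictured with a one-parameter toy problem. The sketch assumes the pure lifts are already known, treats a handful of hypothetical joint-lift measurements as data, and fits a single interaction term with a ridge penalty. Because this toy model is linear in the interaction term, the minimizer has a closed form; the paper's multi-parameter nonlinear fit would need an iterative solver instead. All names and numbers are illustrative.

```python
def ridge_interaction(joint_lifts, lift_a, lift_b, lam):
    """Closed-form minimiser over g of
    sum((y - (c * (1 + g) - 1)) ** 2) + lam * g ** 2,
    where c = (1 + lift_a) * (1 + lift_b) and y runs over joint_lifts."""
    c = (1 + lift_a) * (1 + lift_b)
    excess = [y + 1 - c for y in joint_lifts]
    return c * sum(excess) / (len(joint_lifts) * c * c + lam)


joint_lifts = [0.171, 0.166, 0.169, 0.163, 0.168]  # hypothetical replicates
path = [(lam, ridge_interaction(joint_lifts, 0.10, 0.05, lam))
        for lam in (0.0, 0.1, 1.0, 10.0, 100.0)]

for lam, g in path:
    print(f"lambda={lam:>6}: g={g:+.5f}")
```

The estimate shrinks toward zero as the penalty grows, so a reported synergy that is "modest" may partly reflect the chosen regularization strength rather than the data alone.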

Circularity Check

1 step flagged

Pure-lift and synergy claims reduce to fitted parameters; global-lift match is by construction

specific steps

fitted input called prediction [Abstract]
"the model reveals positive but modest synergies between journeys and shows that pure lifts are significantly larger than those observed experimentally. The predicted global lift closely matches the experimentally measured value, demonstrating the ability of the model to recover incremental effects in an interpretable manner."

Pure lifts and interaction terms are the direct outputs of fitting the multiplicative model to observed marginal lifts. The global lift is defined as the product of these fitted quantities with overlap proportions; therefore any solution that reproduces the marginal observations will match the global lift by construction, rendering the match non-informative about the validity of the decomposition.
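One way to see why an aggregate match says little about the decomposition is to fix a target global lift and solve for the interaction term under several different choices of pure lifts. The sketch below does this for a hypothetical two-journey form of the global lift; the formula, exposure shares, and target value are assumptions for illustration and are not taken from the paper.

```python
def global_lift(a, b, g, p_a, p_b, p_ab):
    """Hypothetical global lift: exposure-share-weighted sum of cell lifts."""
    lift_both = (1 + a) * (1 + b) * (1 + g) - 1
    return p_a * a + p_b * b + p_ab * lift_both


def interaction_matching_target(target, a, b, p_a, p_b, p_ab):
    """Solve global_lift(a, b, g, ...) == target for g, given a and b."""
    lift_both = (target - p_a * a - p_b * b) / p_ab
    return (1 + lift_both) / ((1 + a) * (1 + b)) - 1


shares = dict(p_a=0.30, p_b=0.25, p_ab=0.20)  # assumed exposure shares
target = 0.065568                             # hypothetical measured global lift

candidates = []
for a, b in [(0.08, 0.05), (0.06, 0.07), (0.10, 0.03)]:
    g = interaction_matching_target(target, a, b, **shares)
    candidates.append((a, b, g, global_lift(a, b, g, **shares)))

for a, b, g, total in candidates:
    print(f"a={a:.2f} b={b:.2f} g={g:+.4f} -> global lift {total:.6f}")
```

Three different sets of pure lifts and interactions reproduce the same aggregate to floating-point precision, so the aggregate alone cannot tell them apart. Distinguishing them requires information beyond the global figure, such as the isolated-journey experiment described under "What would settle it".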

full rationale

The paper fits a multiplicative hierarchical model (pure lifts as factors, interactions for synergies) via regularized non-linear least squares to the observed marginal lifts, then reports the resulting pure lifts as 'significantly larger' and synergies as 'positive but modest.' The sole consistency check—that the model's implied global lift reproduces the experimental aggregate—is automatic for any decomposition preserving the product structure, since the global quantity is a deterministic function of the fitted pure lifts, overlaps, and interactions. Monte Carlo quantifies uncertainty conditional on the assumed form but supplies no external identification of the individual pure effects.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that journeys act multiplicatively and that fitted interaction terms plus Monte Carlo draws recover the pure effects; no new entities are postulated.

free parameters (2)

journey-specific multiplicative factors
Core parameters estimated by regularized non-linear least squares from observed lifts.
interaction coefficients
Parameters capturing synergies or cannibalizations, also fitted from data.

axioms (2)

domain assumption: Journeys act as multiplicative causal factors on the outcome
Invoked to decompose pure versus global effects under overlap.
domain assumption: Uncertainty in overlap proportions and lifts can be represented by Monte Carlo draws
Used to propagate uncertainty into the parameter estimates.

pith-pipeline@v0.9.0 · 5467 in / 1436 out tokens · 52182 ms · 2026-05-08T02:08:18.104878+00:00 · methodology

