Attributed, But Not Incremental: Cannibalization-Corrected Attribution for Large-Scale Advertising
Pith reviewed 2026-06-26 03:11 UTC · model grok-4.3
The pith
An experiment-calibrated framework converts sparse lift measurements into daily attribution corrections that reduce calibration error and measured cannibalization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that an experiment-calibrated attribution correction framework, which converts sparse lift measurements into daily correction estimates and allocates calibrated cannibalization volume across business hierarchies under structural consistency constraints, substantially reduces calibration error relative to raw attribution and fine-grained ML baselines; when deployed, the system supported budget and traffic adjustments that were followed by an approximately 15-percentage-point reduction in the measured cannibalization rate.
What carries the argument
The experiment-calibrated attribution correction framework that allocates calibrated cannibalization volume across business hierarchies under structural consistency constraints.
If this is right
- Corrected daily attribution signals support more accurate budget allocation and channel diagnosis than raw outputs.
- The framework produces lower calibration error than both uncorrected attribution and fine-grained ML baselines in offline forward validation.
- Structural consistency constraints keep the allocated correction volumes coherent across business hierarchies.
- Deployment of the corrected signals enables strategy adjustments that reduce measured cannibalization rates.
Where Pith is reading between the lines
- If the extrapolation step holds across markets, the method could lower the frequency of full-scale experiments needed for ongoing measurement.
- The approach implies that cannibalization bias is large enough in multi-channel systems to justify explicit correction layers rather than relying on model complexity alone.
- Similar sparse-anchor calibration could be tested in other domains where attribution or measurement systems face overlapping signals.
Load-bearing premise
Sparse lift measurements from incrementality experiments can be reliably extrapolated into daily correction estimates across all periods and hierarchies without material bias from unmodeled temporal or contextual shifts.
What would settle it
A new set of channel-level incrementality experiments run after the correction model is locked, with forward-in-time comparison of corrected attribution against the fresh lift readouts to check whether calibration error stays low.
Figures
read the original abstract
In large-scale paid acquisition and growth advertising systems, production attribution outputs are widely used for daily budget allocation and channel diagnosis. However, paid-attributed conversions such as daily new users (DNU) may systematically overstate true incremental growth when paid channels overlap with organic demand, brand-driven traffic, or other acquisition channels. This attribution-cannibalization mismatch can distort incremental ROI measurement and budget decisions at scale. We propose an experiment-calibrated attribution correction framework that uses incrementality experiments as causal anchors to convert sparse lift measurements into daily correction estimates. To make the corrected signal actionable at production granularity, we further allocate calibrated cannibalization volume across business hierarchies under structural consistency constraints. Offline forward-in-time validation against channel-level incrementality experiment readouts shows that the proposed framework substantially reduces calibration error relative to raw attribution and fine-grained ML baselines. Deployed across multiple global TikTok markets, the system supported budget and traffic strategy adjustments that were followed by an approximately 15-percentage-point reduction in the measured cannibalization rate.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes an experiment-calibrated attribution correction framework for large-scale advertising that converts sparse incrementality experiment lift measurements into daily cannibalization correction estimates and allocates the corrected volume across business hierarchies under structural consistency constraints. Offline forward-in-time validation against channel-level incrementality readouts is reported to show substantially lower calibration error than raw attribution or fine-grained ML baselines; deployment across global TikTok markets is claimed to have supported strategy adjustments followed by an approximately 15-percentage-point reduction in measured cannibalization rate.
Significance. If the central mapping from sparse lifts to stable daily corrections holds without material bias, the framework would provide a practical, production-scale method for aligning attribution outputs with incremental outcomes in paid acquisition systems, directly improving budget allocation and ROI diagnostics where cannibalization from organic/brand overlap is common.
major comments (2)
- [Abstract] Abstract (validation paragraph): The forward-in-time validation is anchored to the same sparse incrementality experiment readouts used to derive the daily corrections; this setup cannot detect systematic bias arising from temporal drift, seasonality, or unmeasured contextual shifts between experiment windows and production periods, leaving the claim of substantially reduced calibration error vulnerable to the extrapolation assumption identified in the stress-test note.
- [Abstract] Abstract (deployment paragraph): The reported 15-percentage-point reduction in measured cannibalization rate is presented as a deployment outcome, but without details on the measurement protocol, exclusion rules, or controls for concurrent changes in traffic strategy, it is unclear whether the reduction can be attributed to the corrected attribution signal rather than other operational adjustments.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below, with proposed revisions where the concerns identify areas for clarification or qualification.
read point-by-point responses
-
Referee: [Abstract] Abstract (validation paragraph): The forward-in-time validation is anchored to the same sparse incrementality experiment readouts used to derive the daily corrections; this setup cannot detect systematic bias arising from temporal drift, seasonality, or unmeasured contextual shifts between experiment windows and production periods, leaving the claim of substantially reduced calibration error vulnerable to the extrapolation assumption identified in the stress-test note.
Authors: We thank the referee for this observation. The forward-in-time validation uses a temporal split of the available experiment readouts, deriving corrections from earlier windows and evaluating against later ones to approximate prospective application. We acknowledge that this design cannot fully rule out bias from unmeasured temporal or contextual shifts, consistent with the extrapolation assumption already flagged in the stress-test note. We will revise the validation discussion and abstract to explicitly state this limitation and temper the strength of the calibration-error claim accordingly. revision: yes
-
Referee: [Abstract] Abstract (deployment paragraph): The reported 15-percentage-point reduction in measured cannibalization rate is presented as a deployment outcome, but without details on the measurement protocol, exclusion rules, or controls for concurrent changes in traffic strategy, it is unclear whether the reduction can be attributed to the corrected attribution signal rather than other operational adjustments.
Authors: We agree that the abstract deployment claim requires qualification to avoid over-attribution. The manuscript's deployment section describes the measurement approach, but the abstract is brief. We will revise the abstract paragraph to note that the reduction was observed following strategy adjustments informed by the framework, while acknowledging that concurrent operational changes may have contributed, and to direct readers to the full deployment details. revision: yes
Circularity Check
No significant circularity; derivation anchored on external experiments
full rationale
The framework converts sparse lift measurements from incrementality experiments into daily correction estimates and allocates under structural constraints. These experiments serve as independent causal anchors rather than self-derived inputs. Offline forward-in-time validation uses the same external readouts as benchmarks, and deployment results reference measured cannibalization reductions outside the model's fitted values. No self-definitional equations, fitted inputs renamed as predictions, or load-bearing self-citations appear in the derivation chain. The central claim remains falsifiable against external experiment data and does not reduce to its own inputs by construction.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Carlos Aguilar-Palacios, Sergio Muñoz-Romero, and José Luis Rojo-Álvarez. 2021. Causal Quantification of Cannibalization During Promotional Sales in Grocery Retail.IEEE Access9 (2021), 34078–34089. doi:10.1109/ACCESS.2021.3062222
-
[2]
Joel Barajas, Tom Zidar, and Mert Bay. 2020. Advertising Incrementality Measurement using Controlled Geo-Experiments: The Universal App Campaign Case Study. InProceedings of the 2020 KDD Workshop on Advertising and Data Mining. https://www.adkdd.org/papers/advertising-incrementality- measurement-using-controlled-geo-experiments%3A-the-universal-app- campa...
2020
-
[3]
John Bencina, Erkut Aykutlug, Yue Chen, Zerui Zhang, Stephanie Sorenson, Shao Tang, and Changshuai Wei. 2025. LiDDA: Data Driven Attribution at LinkedIn. arXiv preprint arXiv:2505.09861(2025). doi:10.48550/arXiv.2505.09861
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2505.09861 2025
-
[4]
Thomas Blake, Chris Nosko, and Steven Tadelis. 2015. Consumer Heterogeneity and Paid Search Effectiveness: A Large-Scale Field Experiment.Econometrica83, 1 (2015), 155–174. doi:10.3982/ECTA12423
-
[5]
Brian Dalessandro, Claudia Perlich, Ori Stitelman, and Foster Provost. 2012. Causally Motivated Attribution for Online Advertising. InProceedings of the 6th International Workshop on Data Mining for Online Advertising and Internet Economy (ADKDD ’12). Association for Computing Machinery, New York, NY, USA, Article 3, 9 pages. doi:10.1145/2351356.2351363
-
[6]
Causally Driven Incremental Multi Touch Attribution Using a Recurrent Neural Network
Ruihuan Du, Yu Zhong, Harikesh S. Nair, Bo Cui, and Ruyang Shou. 2019. Causally Driven Incremental Multi Touch Attribution Using a Recurrent Neural Network. arXiv preprint arXiv:1902.00215(2019). doi:10.48550/arXiv.1902.00215
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1902.00215 2019
-
[7]
Gordon, Florian Zettelmeyer, Neha Bhargava, and Dan Chapsky
Brett R. Gordon, Florian Zettelmeyer, Neha Bhargava, and Dan Chapsky. 2019. A Comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook.Marketing Science38, 2 (2019), 193–225. doi:10. 1287/mksc.2018.1135
arXiv 2019
-
[8]
Garrett A. Johnson, Randall A. Lewis, and Elmar I. Nubbemeyer. 2017. Ghost Ads: Improving the Economics of Measuring Online Ad Effectiveness.Journal of Marketing Research54, 6 (2017), 867–884. doi:10.1509/jmr.15.0297
-
[9]
Harang Ju, Michael Zhao, and Sinan Aral. 2025. Complementarity Between Paid and Organic Installs in Mobile App Advertising.arXiv preprint arXiv:2504.16151 (2025). doi:10.48550/arXiv.2504.16151
-
[10]
Randall A. Lewis and Justin M. Rao. 2015. The Unfavorable Economics of Mea- suring the Returns to Advertising.The Quarterly Journal of Economics130, 4 (2015), 1941–1973. doi:10.1093/qje/qjv023
-
[11]
Ning Li, Sai Kumar Arava, Chen Dong, William Yan, Abhishek Pani, and Linda Boyle. 2018. Deep Neural Net with Attention for Multi-channel Multi-touch Attribution. InProceedings of the 2018 KDD Workshop on Advertising and Data Mining. https://www.adkdd.org/papers/deep-neural-net-with-attention-for- multi-channel-multi-touch-attribution/2018
2018
-
[12]
Xuhui Shao and Lexin Li. 2011. Data-driven Multi-touch Attribution Models. InProceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’11). Association for Computing Machinery, New York, NY, USA, 258–264. doi:10.1145/2020408.2020453
-
[13]
Dongdong Yang, Kevin Dyer, and Senzhang Wang. 2020. Interpretable Deep Learn- ing Model for Online Multi-touch Attribution.arXiv preprint arXiv:2004.00384 (2020). doi:10.48550/arXiv.2004.00384
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.