Attributed, But Not Incremental: Cannibalization-Corrected Attribution for Large-Scale Advertising

Bowen Yuan; Donghui Li; Lijing Song; Qinxin Chen; Zili Yang

arxiv: 2606.26690 · v1 · pith:EKCW4QOQnew · submitted 2026-06-25 · 💻 cs.IR · cs.LG

Attributed, But Not Incremental: Cannibalization-Corrected Attribution for Large-Scale Advertising

Donghui Li , Bowen Yuan , Zili Yang , Qinxin Chen , Lijing Song This is my paper

Pith reviewed 2026-06-26 03:11 UTC · model grok-4.3

classification 💻 cs.IR cs.LG

keywords attribution correctioncannibalizationincrementality experimentsadvertising measurementbudget allocationcausal calibrationstructural constraints

0 comments

The pith

An experiment-calibrated framework converts sparse lift measurements into daily attribution corrections that reduce calibration error and measured cannibalization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to establish that raw paid attribution overstates true incremental conversions when channels overlap with organic demand or other sources, which distorts ROI and budget choices at scale. The proposed method anchors corrections on incrementality experiments, turns those sparse lifts into daily estimates, and distributes the calibrated cannibalization volumes across hierarchies while preserving structural consistency. Offline checks against held-out experiment readouts show lower error than raw attribution or fine-grained ML baselines. Real deployment across multiple markets coincided with an approximately 15-percentage-point drop in the observed cannibalization rate, making the corrected signal usable for daily decisions.

Core claim

The central claim is that an experiment-calibrated attribution correction framework, which converts sparse lift measurements into daily correction estimates and allocates calibrated cannibalization volume across business hierarchies under structural consistency constraints, substantially reduces calibration error relative to raw attribution and fine-grained ML baselines; when deployed, the system supported budget and traffic adjustments that were followed by an approximately 15-percentage-point reduction in the measured cannibalization rate.

What carries the argument

The experiment-calibrated attribution correction framework that allocates calibrated cannibalization volume across business hierarchies under structural consistency constraints.

If this is right

Corrected daily attribution signals support more accurate budget allocation and channel diagnosis than raw outputs.
The framework produces lower calibration error than both uncorrected attribution and fine-grained ML baselines in offline forward validation.
Structural consistency constraints keep the allocated correction volumes coherent across business hierarchies.
Deployment of the corrected signals enables strategy adjustments that reduce measured cannibalization rates.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the extrapolation step holds across markets, the method could lower the frequency of full-scale experiments needed for ongoing measurement.
The approach implies that cannibalization bias is large enough in multi-channel systems to justify explicit correction layers rather than relying on model complexity alone.
Similar sparse-anchor calibration could be tested in other domains where attribution or measurement systems face overlapping signals.

Load-bearing premise

Sparse lift measurements from incrementality experiments can be reliably extrapolated into daily correction estimates across all periods and hierarchies without material bias from unmodeled temporal or contextual shifts.

What would settle it

A new set of channel-level incrementality experiments run after the correction model is locked, with forward-in-time comparison of corrected attribution against the fresh lift readouts to check whether calibration error stays low.

Figures

Figures reproduced from arXiv: 2606.26690 by Bowen Yuan, Donghui Li, Lijing Song, Qinxin Chen, Zili Yang.

**Figure 2.** Figure 2: Overview of the proposed ETDC+HCA framework. [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Operational locality validation under a local pol [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗

read the original abstract

In large-scale paid acquisition and growth advertising systems, production attribution outputs are widely used for daily budget allocation and channel diagnosis. However, paid-attributed conversions such as daily new users (DNU) may systematically overstate true incremental growth when paid channels overlap with organic demand, brand-driven traffic, or other acquisition channels. This attribution-cannibalization mismatch can distort incremental ROI measurement and budget decisions at scale. We propose an experiment-calibrated attribution correction framework that uses incrementality experiments as causal anchors to convert sparse lift measurements into daily correction estimates. To make the corrected signal actionable at production granularity, we further allocate calibrated cannibalization volume across business hierarchies under structural consistency constraints. Offline forward-in-time validation against channel-level incrementality experiment readouts shows that the proposed framework substantially reduces calibration error relative to raw attribution and fine-grained ML baselines. Deployed across multiple global TikTok markets, the system supported budget and traffic strategy adjustments that were followed by an approximately 15-percentage-point reduction in the measured cannibalization rate.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a practical deployed method for fixing cannibalization in ad attribution via experiment calibration and hierarchical allocation, though the abstract leaves the daily extrapolation step too thin to assess fully.

read the letter

The core of this paper is a production framework that takes sparse lift measurements from incrementality experiments, converts them into daily correction factors for attribution, and then allocates the cannibalization volume across business hierarchies while enforcing structural consistency. That combination is what they present as new.

It does well by anchoring corrections on actual causal experiments rather than pure observational modeling. The offline forward-in-time validation shows lower calibration error than raw attribution or fine-grained ML baselines, and the deployment across TikTok markets produced a measurable 15-percentage-point drop in the cannibalization rate after the system informed budget and traffic changes. Those outcomes give the work some external grounding.

The soft spot is the missing detail on the extrapolation step. The abstract does not show the mapping from sparse experiment readouts to daily estimates, nor any error analysis or handling of temporal or contextual shifts. The stress-test concern about unmodeled drift between experiment windows and production days is plausible on the given information; without equations or data rules it is hard to tell whether the validation could catch systematic bias in that mapping. If the full paper supplies the math and checks, that would strengthen the central claim.

This is aimed at engineers and analysts running large-scale paid acquisition systems who already use attribution for daily decisions. A reader facing similar measurement distortions would get value from the experiment-anchored approach and the reported deployment impact.

I would send it to peer review. The problem is real for platform budget allocation, the use of independent experiment anchors supplies some falsifiability, and the production results are concrete even if the methods need more exposition.

Referee Report

2 major / 0 minor

Summary. The paper proposes an experiment-calibrated attribution correction framework for large-scale advertising that converts sparse incrementality experiment lift measurements into daily cannibalization correction estimates and allocates the corrected volume across business hierarchies under structural consistency constraints. Offline forward-in-time validation against channel-level incrementality readouts is reported to show substantially lower calibration error than raw attribution or fine-grained ML baselines; deployment across global TikTok markets is claimed to have supported strategy adjustments followed by an approximately 15-percentage-point reduction in measured cannibalization rate.

Significance. If the central mapping from sparse lifts to stable daily corrections holds without material bias, the framework would provide a practical, production-scale method for aligning attribution outputs with incremental outcomes in paid acquisition systems, directly improving budget allocation and ROI diagnostics where cannibalization from organic/brand overlap is common.

major comments (2)

[Abstract] Abstract (validation paragraph): The forward-in-time validation is anchored to the same sparse incrementality experiment readouts used to derive the daily corrections; this setup cannot detect systematic bias arising from temporal drift, seasonality, or unmeasured contextual shifts between experiment windows and production periods, leaving the claim of substantially reduced calibration error vulnerable to the extrapolation assumption identified in the stress-test note.
[Abstract] Abstract (deployment paragraph): The reported 15-percentage-point reduction in measured cannibalization rate is presented as a deployment outcome, but without details on the measurement protocol, exclusion rules, or controls for concurrent changes in traffic strategy, it is unclear whether the reduction can be attributed to the corrected attribution signal rather than other operational adjustments.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below, with proposed revisions where the concerns identify areas for clarification or qualification.

read point-by-point responses

Referee: [Abstract] Abstract (validation paragraph): The forward-in-time validation is anchored to the same sparse incrementality experiment readouts used to derive the daily corrections; this setup cannot detect systematic bias arising from temporal drift, seasonality, or unmeasured contextual shifts between experiment windows and production periods, leaving the claim of substantially reduced calibration error vulnerable to the extrapolation assumption identified in the stress-test note.

Authors: We thank the referee for this observation. The forward-in-time validation uses a temporal split of the available experiment readouts, deriving corrections from earlier windows and evaluating against later ones to approximate prospective application. We acknowledge that this design cannot fully rule out bias from unmeasured temporal or contextual shifts, consistent with the extrapolation assumption already flagged in the stress-test note. We will revise the validation discussion and abstract to explicitly state this limitation and temper the strength of the calibration-error claim accordingly. revision: yes
Referee: [Abstract] Abstract (deployment paragraph): The reported 15-percentage-point reduction in measured cannibalization rate is presented as a deployment outcome, but without details on the measurement protocol, exclusion rules, or controls for concurrent changes in traffic strategy, it is unclear whether the reduction can be attributed to the corrected attribution signal rather than other operational adjustments.

Authors: We agree that the abstract deployment claim requires qualification to avoid over-attribution. The manuscript's deployment section describes the measurement approach, but the abstract is brief. We will revise the abstract paragraph to note that the reduction was observed following strategy adjustments informed by the framework, while acknowledging that concurrent operational changes may have contributed, and to direct readers to the full deployment details. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation anchored on external experiments

full rationale

The framework converts sparse lift measurements from incrementality experiments into daily correction estimates and allocates under structural constraints. These experiments serve as independent causal anchors rather than self-derived inputs. Offline forward-in-time validation uses the same external readouts as benchmarks, and deployment results reference measured cannibalization reductions outside the model's fitted values. No self-definitional equations, fitted inputs renamed as predictions, or load-bearing self-citations appear in the derivation chain. The central claim remains falsifiable against external experiment data and does not reduce to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no free parameters, axioms, or invented entities are specified in the provided text.

pith-pipeline@v0.9.1-grok · 5718 in / 995 out tokens · 36670 ms · 2026-06-26T03:11:30.594575+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

13 extracted references · 10 canonical work pages · 2 internal anchors

[1]

Carlos Aguilar-Palacios, Sergio Muñoz-Romero, and José Luis Rojo-Álvarez. 2021. Causal Quantification of Cannibalization During Promotional Sales in Grocery Retail.IEEE Access9 (2021), 34078–34089. doi:10.1109/ACCESS.2021.3062222

work page doi:10.1109/access.2021.3062222 2021
[2]

Joel Barajas, Tom Zidar, and Mert Bay. 2020. Advertising Incrementality Measurement using Controlled Geo-Experiments: The Universal App Campaign Case Study. InProceedings of the 2020 KDD Workshop on Advertising and Data Mining. https://www.adkdd.org/papers/advertising-incrementality- measurement-using-controlled-geo-experiments%3A-the-universal-app- campa...

2020
[3]

John Bencina, Erkut Aykutlug, Yue Chen, Zerui Zhang, Stephanie Sorenson, Shao Tang, and Changshuai Wei. 2025. LiDDA: Data Driven Attribution at LinkedIn. arXiv preprint arXiv:2505.09861(2025). doi:10.48550/arXiv.2505.09861

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2505.09861 2025
[4]

Thomas Blake, Chris Nosko, and Steven Tadelis. 2015. Consumer Heterogeneity and Paid Search Effectiveness: A Large-Scale Field Experiment.Econometrica83, 1 (2015), 155–174. doi:10.3982/ECTA12423

work page doi:10.3982/ecta12423 2015
[5]

Brian Dalessandro, Claudia Perlich, Ori Stitelman, and Foster Provost. 2012. Causally Motivated Attribution for Online Advertising. InProceedings of the 6th International Workshop on Data Mining for Online Advertising and Internet Economy (ADKDD ’12). Association for Computing Machinery, New York, NY, USA, Article 3, 9 pages. doi:10.1145/2351356.2351363

work page doi:10.1145/2351356.2351363 2012
[6]

Causally Driven Incremental Multi Touch Attribution Using a Recurrent Neural Network

Ruihuan Du, Yu Zhong, Harikesh S. Nair, Bo Cui, and Ruyang Shou. 2019. Causally Driven Incremental Multi Touch Attribution Using a Recurrent Neural Network. arXiv preprint arXiv:1902.00215(2019). doi:10.48550/arXiv.1902.00215

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1902.00215 2019
[7]

Gordon, Florian Zettelmeyer, Neha Bhargava, and Dan Chapsky

Brett R. Gordon, Florian Zettelmeyer, Neha Bhargava, and Dan Chapsky. 2019. A Comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook.Marketing Science38, 2 (2019), 193–225. doi:10. 1287/mksc.2018.1135

arXiv 2019
[8]

Johnson, Randall A

Garrett A. Johnson, Randall A. Lewis, and Elmar I. Nubbemeyer. 2017. Ghost Ads: Improving the Economics of Measuring Online Ad Effectiveness.Journal of Marketing Research54, 6 (2017), 867–884. doi:10.1509/jmr.15.0297

work page doi:10.1509/jmr.15.0297 2017
[9]

Harang Ju, Michael Zhao, and Sinan Aral. 2025. Complementarity Between Paid and Organic Installs in Mobile App Advertising.arXiv preprint arXiv:2504.16151 (2025). doi:10.48550/arXiv.2504.16151

work page doi:10.48550/arxiv.2504.16151 2025
[10]

Jon Vaver and Jim Koehler

Randall A. Lewis and Justin M. Rao. 2015. The Unfavorable Economics of Mea- suring the Returns to Advertising.The Quarterly Journal of Economics130, 4 (2015), 1941–1973. doi:10.1093/qje/qjv023

work page doi:10.1093/qje/qjv023 2015
[11]

Ning Li, Sai Kumar Arava, Chen Dong, William Yan, Abhishek Pani, and Linda Boyle. 2018. Deep Neural Net with Attention for Multi-channel Multi-touch Attribution. InProceedings of the 2018 KDD Workshop on Advertising and Data Mining. https://www.adkdd.org/papers/deep-neural-net-with-attention-for- multi-channel-multi-touch-attribution/2018

2018
[12]

Xuhui Shao and Lexin Li. 2011. Data-driven Multi-touch Attribution Models. InProceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’11). Association for Computing Machinery, New York, NY, USA, 258–264. doi:10.1145/2020408.2020453

work page doi:10.1145/2020408.2020453 2011
[13]

Dongdong Yang, Kevin Dyer, and Senzhang Wang. 2020. Interpretable Deep Learn- ing Model for Online Multi-touch Attribution.arXiv preprint arXiv:2004.00384 (2020). doi:10.48550/arXiv.2004.00384

work page doi:10.48550/arxiv.2004.00384 2020

[1] [1]

Carlos Aguilar-Palacios, Sergio Muñoz-Romero, and José Luis Rojo-Álvarez. 2021. Causal Quantification of Cannibalization During Promotional Sales in Grocery Retail.IEEE Access9 (2021), 34078–34089. doi:10.1109/ACCESS.2021.3062222

work page doi:10.1109/access.2021.3062222 2021

[2] [2]

Joel Barajas, Tom Zidar, and Mert Bay. 2020. Advertising Incrementality Measurement using Controlled Geo-Experiments: The Universal App Campaign Case Study. InProceedings of the 2020 KDD Workshop on Advertising and Data Mining. https://www.adkdd.org/papers/advertising-incrementality- measurement-using-controlled-geo-experiments%3A-the-universal-app- campa...

2020

[3] [3]

John Bencina, Erkut Aykutlug, Yue Chen, Zerui Zhang, Stephanie Sorenson, Shao Tang, and Changshuai Wei. 2025. LiDDA: Data Driven Attribution at LinkedIn. arXiv preprint arXiv:2505.09861(2025). doi:10.48550/arXiv.2505.09861

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2505.09861 2025

[4] [4]

Thomas Blake, Chris Nosko, and Steven Tadelis. 2015. Consumer Heterogeneity and Paid Search Effectiveness: A Large-Scale Field Experiment.Econometrica83, 1 (2015), 155–174. doi:10.3982/ECTA12423

work page doi:10.3982/ecta12423 2015

[5] [5]

Brian Dalessandro, Claudia Perlich, Ori Stitelman, and Foster Provost. 2012. Causally Motivated Attribution for Online Advertising. InProceedings of the 6th International Workshop on Data Mining for Online Advertising and Internet Economy (ADKDD ’12). Association for Computing Machinery, New York, NY, USA, Article 3, 9 pages. doi:10.1145/2351356.2351363

work page doi:10.1145/2351356.2351363 2012

[6] [6]

Causally Driven Incremental Multi Touch Attribution Using a Recurrent Neural Network

Ruihuan Du, Yu Zhong, Harikesh S. Nair, Bo Cui, and Ruyang Shou. 2019. Causally Driven Incremental Multi Touch Attribution Using a Recurrent Neural Network. arXiv preprint arXiv:1902.00215(2019). doi:10.48550/arXiv.1902.00215

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1902.00215 2019

[7] [7]

Gordon, Florian Zettelmeyer, Neha Bhargava, and Dan Chapsky

Brett R. Gordon, Florian Zettelmeyer, Neha Bhargava, and Dan Chapsky. 2019. A Comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook.Marketing Science38, 2 (2019), 193–225. doi:10. 1287/mksc.2018.1135

arXiv 2019

[8] [8]

Johnson, Randall A

Garrett A. Johnson, Randall A. Lewis, and Elmar I. Nubbemeyer. 2017. Ghost Ads: Improving the Economics of Measuring Online Ad Effectiveness.Journal of Marketing Research54, 6 (2017), 867–884. doi:10.1509/jmr.15.0297

work page doi:10.1509/jmr.15.0297 2017

[9] [9]

Harang Ju, Michael Zhao, and Sinan Aral. 2025. Complementarity Between Paid and Organic Installs in Mobile App Advertising.arXiv preprint arXiv:2504.16151 (2025). doi:10.48550/arXiv.2504.16151

work page doi:10.48550/arxiv.2504.16151 2025

[10] [10]

Jon Vaver and Jim Koehler

Randall A. Lewis and Justin M. Rao. 2015. The Unfavorable Economics of Mea- suring the Returns to Advertising.The Quarterly Journal of Economics130, 4 (2015), 1941–1973. doi:10.1093/qje/qjv023

work page doi:10.1093/qje/qjv023 2015

[11] [11]

Ning Li, Sai Kumar Arava, Chen Dong, William Yan, Abhishek Pani, and Linda Boyle. 2018. Deep Neural Net with Attention for Multi-channel Multi-touch Attribution. InProceedings of the 2018 KDD Workshop on Advertising and Data Mining. https://www.adkdd.org/papers/deep-neural-net-with-attention-for- multi-channel-multi-touch-attribution/2018

2018

[12] [12]

Xuhui Shao and Lexin Li. 2011. Data-driven Multi-touch Attribution Models. InProceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’11). Association for Computing Machinery, New York, NY, USA, 258–264. doi:10.1145/2020408.2020453

work page doi:10.1145/2020408.2020453 2011

[13] [13]

Dongdong Yang, Kevin Dyer, and Senzhang Wang. 2020. Interpretable Deep Learn- ing Model for Online Multi-touch Attribution.arXiv preprint arXiv:2004.00384 (2020). doi:10.48550/arXiv.2004.00384

work page doi:10.48550/arxiv.2004.00384 2020