Learning to target with network interference

Hamsa Bastani; Osbert Bastani; Xiaomeng Wang; Zhimei Ren

arxiv: 2605.27794 · v1 · pith:3OZYB2KFnew · submitted 2026-05-27 · 📊 stat.ML · cs.LG· stat.ME

Learning to target with network interference

Xiaomeng Wang , Hamsa Bastani , Osbert Bastani , Zhimei Ren This is my paper

Pith reviewed 2026-06-29 10:22 UTC · model grok-4.3

classification 📊 stat.ML cs.LGstat.ME

keywords network interferencelinear banditsregret boundsadaptive targetingspillover effectsonline learningsparse models

0 comments

The pith

Knowledge of network interference structure sets the achievable regret in linear bandit targeting, with ignoring it causing inefficiency in large populations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that treating adaptive targeting under network interference as a standard linear bandit without accounting for spillovers leads to inefficient learning, especially in large groups. It derives regret lower bounds for three regimes of prior knowledge about the interference structure—full support, column support sizes, and none—and gives algorithms that nearly match those bounds in each case. This matters for settings where treatments affect connected individuals, such as social networks or markets, because it clarifies how much structural information is needed to learn efficiently. The work unifies the limits and methods across knowledge levels under a linear sparse interference model.

Core claim

In the linear model with sparse network interference, where each individual's outcome is affected by at most a few others, regret lower bounds characterize the fundamental limits for each level of knowledge about the interference structure. Algorithms are developed that achieve near-optimal regret when the full support is known, when only column support sizes are known, and when no prior knowledge is available, showing how the amount of structural information governs learning efficiency.

What carries the argument

Regret lower bounds and corresponding algorithms that adapt to the available information about the support of the interference matrix in the linear outcome model.

If this is right

Standard linear bandit methods are provably suboptimal when spillover effects exist but are ignored.
Full knowledge of the interference support permits the lowest regret among the three regimes.
Knowledge of column support sizes still improves regret over having no structural information.
Algorithms exist for each regime that can be deployed in targeting problems with network effects.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Collecting even partial network data before targeting could reduce total regret compared to starting with no information.
The sparsity assumption suggests that methods for estimating the support from initial observations may be worth developing to move between regimes.
Results could inform sample-efficient design in other domains with spillover, such as market experiments or epidemic control.

Load-bearing premise

The outcomes follow a linear model in which each individual is affected by at most a few others.

What would settle it

An experiment in a large sparse linear network where a standard linear bandit algorithm without structure information matches the regret of the structure-aware algorithms would contradict the lower bound on ignoring the network.

Figures

Figures reproduced from arXiv: 2605.27794 by Hamsa Bastani, Osbert Bastani, Xiaomeng Wang, Zhimei Ren.

**Figure 2.** Figure 2: Per-individual regret of proposed algorithms versus the network dimension [PITH_FULL_IMAGE:figures/full_fig_p016_2.png] view at source ↗

**Figure 3.** Figure 3: Per-individual cumulative regret comparison. Blue denotes the baseline; orange denotes [PITH_FULL_IMAGE:figures/full_fig_p017_3.png] view at source ↗

**Figure 4.** Figure 4: Cumulative regret comparison for different signal strengths. [PITH_FULL_IMAGE:figures/full_fig_p036_4.png] view at source ↗

**Figure 5.** Figure 5: Cumulative regret comparison for different sparsity levels. [PITH_FULL_IMAGE:figures/full_fig_p037_5.png] view at source ↗

read the original abstract

This paper studies adaptive targeting under network interference in a bandit setting, where treatments applied to one individual may affect others through spillover effects. We consider a linear model in a sparse regime, where each individual's outcome can be affected by at most a few others. We first establish a regret lower bound showing that ignoring the network structure and reducing the problem to a standard linear bandit inevitably leads to inefficient learning, particularly in large populations. To understand how structural information can be leveraged, we analyze regimes with varying levels of knowledge of the interference structure: (1) full support knowledge, (2) knowledge of the column support sizes, and (3) no prior knowledge. For each regime, we establish regret lower bounds characterizing the fundamental limits of learning, and develop algorithms that achieve near-optimal regret. Together, our results provide a unified view of how knowledge of the interference structure governs the efficiency of online learning under interference, and offer practical adaptive targeting algorithms in each setting. Numerical experiments on synthetic and real-world data demonstrate the practical benefits of our algorithms.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper shows ignoring network structure in sparse linear interference bandits leads to bad regret and gives near-matching algorithms across three knowledge regimes.

read the letter

The core result is a lower bound proving that a standard linear bandit approach wastes samples when interference is present, even in the sparse regime. They then give algorithms for full support knowledge, known column support sizes, and no knowledge at all, each nearly matching its own lower bound.

This separation into regimes is the useful part. It shows how much structure you need to know to get good rates, which is directly relevant for targeting applications. The linear sparse model is standard enough that the bounds feel grounded rather than contrived.

The experiments on synthetic and real data are mentioned as demonstrating practical gains, but the abstract gives no numbers or controls, so their weight is unclear. The proofs are also not visible here, though the stress-test found no internal gaps in the argument.

A minor soft spot is that the no-knowledge algorithm must still be doing some form of structure discovery; it would help to see how expensive that step is in practice. Otherwise the claims line up with the stated model.

This is for people working on bandits or causal inference with interference. It has enough new lower bounds and matching algorithms to deserve referee time rather than a desk reject.

Referee Report

0 major / 2 minor

Summary. The manuscript studies adaptive targeting under network interference in a bandit setting with a linear sparse model where each individual's outcome is affected by at most a few others. It first establishes a regret lower bound showing that ignoring the network structure and reducing to a standard linear bandit leads to inefficient learning in large populations. It then analyzes three regimes of knowledge about the interference structure—(1) full support knowledge, (2) knowledge of column support sizes, and (3) no prior knowledge—providing regret lower bounds characterizing fundamental limits and developing algorithms that achieve near-optimal regret in each regime. Numerical experiments on synthetic and real-world data demonstrate practical benefits.

Significance. If the results hold, the work provides a unified view of how knowledge of interference structure governs efficiency of online learning under network effects, with practical adaptive targeting algorithms for each knowledge regime. Credit is due for the regret lower bounds matched by algorithms across the three regimes and for the empirical validation on real data; these elements strengthen the contribution to the bandit literature with interference.

minor comments (2)

[Abstract] Abstract: the description of the experiments does not specify the performance metrics, baselines, or statistical controls used to demonstrate benefits; adding one sentence on these would improve clarity without altering the central claims.
[Introduction] The linear sparse model assumption is stated clearly but its implications for the regret bounds could be illustrated with a brief example in the introduction to aid readers unfamiliar with the sparse regime.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive review and recommendation of minor revision. The summary accurately captures the paper's contributions on regret lower bounds and near-optimal algorithms for linear bandits under sparse network interference in the three knowledge regimes. No specific major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper derives regret lower bounds for ignoring network structure and matching algorithms across knowledge regimes using standard linear bandit analysis under a sparse interference model. No quoted steps reduce predictions or bounds to fitted parameters by construction, no self-citation chains support load-bearing claims, and the central results rely on independent proof techniques rather than renaming or ansatz smuggling. This is the expected non-finding for a theoretical regret paper grounded in external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides insufficient detail to enumerate specific free parameters, axioms, or invented entities.

pith-pipeline@v0.9.1-grok · 5715 in / 963 out tokens · 26166 ms · 2026-06-29T10:22:55.234964+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

2 extracted references

[1]

In addition, for any v, r∈R d such thatsupp(v)∩supp(r) =∅and∥v∥ 0 +∥r∥ 0 ≤m, we have |v⊤Ar|= (v+r) ⊤A(v+r)−(v−r) ⊤A(v−r) 4 ≤x+ 1 32 .(21) We now apply the above results withm= 3s. Next, for an arbitrary vectorusatisfying∥u[Sc]∥1 ≤ 3∥u[S]∥1 for someS⊆[d]with|S| ≤s, we decompose its coordinates into blocks of sizeswith the first blockI 1 containing the indi...
[2]

By the definition of the blocks, we have for any k≥3that X k≥3 ∥u[Ik]∥2 ≤ X k≥3 1√s ∥u[Ik−1]∥1 ≤ 1√s ∥u[Ic 1]∥1 ≤ 3√s ∥u[I1]∥1 ≤3∥u[I 1]∥2

As for the second term, we have 1 T1 T1X t=1 KX k=3 X j∈Ik ujat,j X j∈I1∪I2 ujat,j ≤ KX k=3 1 T1 T1X t=1 X j∈Ik ujat,j X j∈I1∪I2 ujat,j ≤ x+ 1 32 KX k=3 ∥u[Ik]∥2∥u[I1 ∪I 2]∥2, where the last step uses the result in Equation (21). By the definition of the blocks, we have for any k≥3that X k≥3 ∥u[Ik]∥2 ≤ X k≥3 1√s ∥u[Ik−1]∥1 ≤ 1√s ∥u[Ic 1]∥1 ≤ 3√s ∥u[I1]∥1 ...

[1] [1]

In addition, for any v, r∈R d such thatsupp(v)∩supp(r) =∅and∥v∥ 0 +∥r∥ 0 ≤m, we have |v⊤Ar|= (v+r) ⊤A(v+r)−(v−r) ⊤A(v−r) 4 ≤x+ 1 32 .(21) We now apply the above results withm= 3s. Next, for an arbitrary vectorusatisfying∥u[Sc]∥1 ≤ 3∥u[S]∥1 for someS⊆[d]with|S| ≤s, we decompose its coordinates into blocks of sizeswith the first blockI 1 containing the indi...

[2] [2]

By the definition of the blocks, we have for any k≥3that X k≥3 ∥u[Ik]∥2 ≤ X k≥3 1√s ∥u[Ik−1]∥1 ≤ 1√s ∥u[Ic 1]∥1 ≤ 3√s ∥u[I1]∥1 ≤3∥u[I 1]∥2

As for the second term, we have 1 T1 T1X t=1 KX k=3 X j∈Ik ujat,j X j∈I1∪I2 ujat,j ≤ KX k=3 1 T1 T1X t=1 X j∈Ik ujat,j X j∈I1∪I2 ujat,j ≤ x+ 1 32 KX k=3 ∥u[Ik]∥2∥u[I1 ∪I 2]∥2, where the last step uses the result in Equation (21). By the definition of the blocks, we have for any k≥3that X k≥3 ∥u[Ik]∥2 ≤ X k≥3 1√s ∥u[Ik−1]∥1 ≤ 1√s ∥u[Ic 1]∥1 ≤ 3√s ∥u[I1]∥1 ...