A Thompson sampling algorithm jointly infers unknown network interference and learns optimal individual-level treatments, with sublinear regret for additive spillover models and an explore-then-commit variant for general neighborhood interference.
Therefore, the regret bound of Theorem 1 applies to ideal posterior sampling over the unknown graph and reward parameters.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: stat.ML (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Adaptive Policy Learning Under Unknown Network Interference