The paper establishes the first tilde O(epsilon^{-1}) upper bounds and matching lower bounds for forward-KL-regularized offline contextual bandits under single-policy concentrability in both tabular and general function approximation settings.
USSR computational mathematics and mathematical physics , volume=
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.LG 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
Semi-discrete Flow Matching produces terminal assignment regions that are topologically simple (open, simply connected, homeomorphic to the ball under assumption) yet geometrically distinct from optimal transport Laguerre cells, as they can be non-convex with curved boundaries.
citing papers explorer
-
Fast Rates for Offline Contextual Bandits with Forward-KL Regularization under Single-Policy Concentrability
The paper establishes the first tilde O(epsilon^{-1}) upper bounds and matching lower bounds for forward-KL-regularized offline contextual bandits under single-policy concentrability in both tabular and general function approximation settings.
-
Tessellations of Semi-Discrete Flow Matching
Semi-discrete Flow Matching produces terminal assignment regions that are topologically simple (open, simply connected, homeomorphic to the ball under assumption) yet geometrically distinct from optimal transport Laguerre cells, as they can be non-convex with curved boundaries.