Deep IV: A flexible approach for counterfactual prediction
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Cross-fitted Proximal Learning for Model-Based Reinforcement Learning
A K-fold cross-fitted proximal bridge estimator for reward-emission and observation-transition functions in confounded POMDPs, with an oracle-comparator error bound decomposed into nuisance and averaging terms.