Establishes stability of belief filters to model error in log-linear and neural-softmax POMDPs under mixing conditions and derives finite-sample guarantees for preference-based reward learning that decouple statistical error from model-mismatch bias.
2020.9304386
3 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (3)
verdicts: UNVERDICTED (3)
representative citing papers
Introduces bcw single-sigmoid model for GRNs that recovers product-of-logistics critical value 1/2^{m_i} and shared equilibrium, with Jacobian and stability comparisons.
A conformal prediction certification for belief-space safety filters focuses verification on reliable inference regions to produce less conservative yet high-probability safe filters than standard baselines in human-vehicle simulations.
citing papers explorer
- Preference-Based Reward Learning under Partial Observability with Inexact Dynamics