The work derives the optimal ratio of dynamics-to-reward samples that minimizes a bound on return error and characterizes the tradeoff between noisy but cheap rewards versus accurate but expensive ones in imagination-based policy optimization.
International Conference on Machine Learning
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.LG (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
A variational quantum classifier with normalized amplitude embeddings and bounded observables achieves competitive accuracy with improved robustness and stability over classical baselines in safety-critical settings.
citing papers explorer
- On Training in Imagination
The work derives the optimal ratio of dynamics-to-reward samples that minimizes a bound on return error and characterizes the tradeoff between noisy but cheap rewards versus accurate but expensive ones in imagination-based policy optimization.
- SAFE Quantum Machine Learning with Variational Quantum Classifiers
A variational quantum classifier with normalized amplitude embeddings and bounded observables achieves competitive accuracy with improved robustness and stability over classical baselines in safety-critical settings.