Guidance for twisted particle filter: a continuous-time perspective

Jianfeng Lu; Yuliang Wang

arxiv: 2409.02399 · v2 · submitted 2024-09-04 · 📊 stat.CO · math.OC

Guidance for twisted particle filter: a continuous-time perspective

Jianfeng Lu , Yuliang Wang This is my paper

Pith reviewed 2026-05-23 21:04 UTC · model grok-4.3

classification 📊 stat.CO math.OC

keywords twisted particle filtercontinuous timeneural networkKL divergencepath measuresimportance samplingMonte Carlosequential Monte Carlo

0 comments

The pith

A neural network trained to minimize KL divergence between path measures guides the Twisted-Path Particle Filter in continuous time.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the Twisted-Path Particle Filter, which parameterizes a twisting function with a neural network and trains it by minimizing a KL-divergence between path measures. This construction draws from control-based importance sampling methods that operate directly in continuous time. The goal is to lower the variance of Monte Carlo estimates for high-dimensional distributions and their normalizing constants. Numerical experiments are presented to show that the resulting algorithm improves approximation quality over standard particle filters.

Core claim

The Twisted-Path Particle Filter parameterizes its twisting function by a neural network and trains the network parameters to minimize a specific KL-divergence between path measures; the design is guided by existing control-based importance sampling algorithms in the continuous-time setting, and experiments indicate that the trained filter produces lower-variance Monte Carlo approximations than the untwisted particle filter.

What carries the argument

The neural-network-parameterized twisting function trained by minimizing KL divergence between path measures.

If this is right

Lower variance Monte Carlo estimates of normalizing constants become available for continuous-time models.
The same training procedure can be applied to other path-space importance samplers that admit a twisting function.
The continuous-time perspective supplies a principled objective for choosing the twisting function in discrete-time twisted particle filters.
High-dimensional filtering problems can be addressed without hand-crafting the twisting function.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method may extend to settings where the underlying process is only partially observed, provided the path-measure KL objective can still be estimated.
Because the training objective is defined on entire paths, the approach could be combined with existing continuous-time control methods to produce hybrid samplers.
If the KL minimization succeeds, the resulting filter may serve as a building block for more accurate sequential Monte Carlo algorithms in non-Markovian or infinite-dimensional state spaces.

Load-bearing premise

Training the neural network to minimize the chosen KL divergence between path measures produces a net reduction in the variance of the particle filter estimator.

What would settle it

A side-by-side run on the same continuous-time model in which the empirical variance of the Twisted-Path Particle Filter estimator, after training, is no smaller than that of the ordinary particle filter.

Figures

Figures reproduced from arXiv: 2409.02399 by Jianfeng Lu, Yuliang Wang.

**Figure 1.** Figure 1: Linear Gaussian model: compare TPPF (trained with LRE, LCE, or LRECE) and its competitors (BPF, iAPF and FA-APF). Boxplot for log Z using 1000 replicates, with configurations d ∈ {2, 5, 15, 20}. The red cross represents the mean and the red dash line represents the medium. d=2 d=5 d=15 d=20 BPF 0.60 1.14 3.85 5.95 TPPF(RE) 0.27 0.38 0.87 1.23 TPPF(CE) 0.34 0.82 3.51 5.70 TPPF(RECE) 0.31 0.54 0.86 1.21 FA-A… view at source ↗

**Figure 2.** Figure 2: Lorenz-96 model: compare TPPF (trained with LRE, LCE, or LRECE) and its competitors (BPF, iAPF and FA-APF). (a): Empirical standard deviation for different external force strength α under dimension d = 3, using 20 replicates. (b): Boxplot of log Z using 20 replicates for d = 3, α = 3.0. The red cross represents the mean and the red dash line represents the medium. As we can see from [PITH_FULL_IMAGE:figur… view at source ↗

read the original abstract

The particle filter (PF), also known as sequential Monte Carlo (SMC), approximates high-dimensional probability distributions and their normalizing constants in the discrete-time setting. To reduce the variance of the Monte Carlo approximation, various twisted particle filters (TPFs) have been proposed, in which a twisting function is chosen or learned to modify the Markov transition kernel. Guided by existing control-based importance sampling algorithms in the continuous-time setting, we propose a novel algorithm called the ``Twisted-Path Particle Filter'' (TPPF), in which the twisting function is parameterized by a neural network and trained to minimize a specific KL-divergence between path measures. Numerical experiments illustrate the capability of the proposed algorithm.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper offers a new continuous-time twisted particle filter with neural-net twisting trained on path KL, but the experiments stay illustrative and leave the net practical gain open.

read the letter

The central new idea is the Twisted-Path Particle Filter, where a neural network parameterizes the twisting function and is trained to minimize a KL divergence on path measures, drawing from continuous-time control-based importance sampling. This extends the discrete-time twisted particle filters by moving to a continuous-time path perspective with learnable twisting. The paper does a decent job laying out the motivation and the algorithm clearly. The soft spot is that the numerical experiments are presented as illustrations rather than rigorous benchmarks, so we lack data on whether the variance reduction outweighs the training effort in practice. The abstract didn't include derivations or error analysis, but the stress-test indicates the full paper avoids internal contradictions. This work is aimed at people in sequential Monte Carlo and stochastic control who want to explore variance reduction techniques. A reader looking for a new algorithm idea with some theoretical grounding could get value from it. It deserves a serious referee because it offers a concrete proposal that builds on established ideas without obvious flaws. I recommend sending it to peer review.

Referee Report

2 major / 3 minor

Summary. The paper proposes the Twisted-Path Particle Filter (TPPF) as an extension of twisted particle filters to the continuous-time setting. A neural network parameterizes the twisting function, which is trained by minimizing a KL divergence between path measures; the design is guided by existing control-based importance sampling methods. Numerical experiments are presented as illustrations of the algorithm's capability for Monte Carlo approximation of distributions and normalizing constants.

Significance. If the KL-trained twisting yields a net reduction in estimator variance after accounting for training cost, the continuous-time control perspective could provide a principled route to improved SMC performance on path-space problems. The explicit link to control-based IS is a constructive contribution that may aid future work on learned proposals.

major comments (2)

[§3] §3 (algorithm derivation): the manuscript states that the chosen KL objective between path measures produces an improved twisting function, but supplies no explicit variance bound or bias-variance decomposition showing that the resulting estimator variance is strictly smaller than the untwisted PF (or existing TPF baselines) for the same number of particles; without this, the central claim that the method 'improves Monte Carlo approximation' rests on the illustrative experiments alone.
[§5] §5 (numerical experiments): the reported runs use small state dimensions and short time horizons; no scaling study or comparison against a non-neural twisted filter (e.g., analytically chosen twisting) is given, so it remains unclear whether the NN parameterization delivers a practical advantage once training overhead is included.

minor comments (3)

[§2] Notation for the continuous-time path measure and the twisting function should be introduced with a single consistent symbol table; currently the same symbol appears to be reused for the discrete-time and continuous-time cases.
[§5] Figure captions should explicitly state the number of particles, the dimension of the state, and the training budget (epochs / samples) so that the plots can be reproduced without consulting the main text.
The reference list omits several standard works on continuous-time SMC and on control-based importance sampling (e.g., the original papers on the continuous-time Feynman-Kac framework); adding them would clarify the precise novelty of the KL choice.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful review and constructive comments. We address each major comment below, providing clarifications on the theoretical motivation and the scope of the numerical experiments.

read point-by-point responses

Referee: [§3] §3 (algorithm derivation): the manuscript states that the chosen KL objective between path measures produces an improved twisting function, but supplies no explicit variance bound or bias-variance decomposition showing that the resulting estimator variance is strictly smaller than the untwisted PF (or existing TPF baselines) for the same number of particles; without this, the central claim that the method 'improves Monte Carlo approximation' rests on the illustrative experiments alone.

Authors: We agree that the manuscript does not derive an explicit finite-particle variance bound. The KL objective is selected because it arises directly from the continuous-time control formulation of importance sampling, where the optimal twisting function minimizes a path-space cost that is known to yield the zero-variance estimator in the limit; this connection is the central guidance provided by the continuous-time perspective. A rigorous bias-variance decomposition for the resulting particle estimator is technically involved and lies beyond the scope of the present work, which focuses on algorithm derivation and the control-theoretic link. We will revise §3 to make this motivation and limitation explicit, while retaining the claim of improvement on the basis of the principled objective and supporting experiments. revision: partial
Referee: [§5] §5 (numerical experiments): the reported runs use small state dimensions and short time horizons; no scaling study or comparison against a non-neural twisted filter (e.g., analytically chosen twisting) is given, so it remains unclear whether the NN parameterization delivers a practical advantage once training overhead is included.

Authors: The experiments are explicitly described in the abstract and §5 as illustrations of the algorithm's capability rather than a comprehensive benchmark. The neural-network parameterization is intended for regimes in which closed-form twisting functions are unavailable; direct comparison to an analytic baseline is therefore not always feasible and would not demonstrate the method's intended use case. Training cost is acknowledged as part of the procedure, but the paper does not assert net computational superiority. We therefore do not plan revisions to §5, as expanding the experiments would shift the manuscript away from its stated focus on the continuous-time derivation. revision: no

Circularity Check

0 steps flagged

No significant circularity

full rationale

The derivation introduces a neural-network-parameterized twisting function trained on an external KL-divergence between path measures, guided by prior control-based importance sampling results. This objective is independent of the final particle filter estimator and does not reduce to it by construction; the claimed variance reduction is presented as an empirical consequence rather than a definitional identity. No self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the abstract or described chain. The numerical experiments are explicitly illustrative, leaving the central proposal self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract supplies no explicit free parameters, axioms, or invented entities. The neural-network weights are implicit fitting parameters of the method rather than part of the claim itself. Standard properties of KL divergence and path measures are assumed but not listed as paper-specific axioms.

pith-pipeline@v0.9.0 · 5634 in / 1274 out tokens · 17386 ms · 2026-05-23T21:04:58.620347+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We propose a novel algorithm called the Twisted-Path Particle Filter (TPPF), in which the twisting function is parameterized by a neural network and trained to minimize a specific KL-divergence between path measures.
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

the discrete-time model converges to a continuous-time limit, which can be solved through a series of well-studied control-based importance sampling algorithms.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Efficient Inference for Coupled Hidden Markov Models in Continuous Time and Discrete Space
stat.ML 2025-10 unverdicted novelty 6.0

Proposes Latent Interacting Particle Systems with an efficient parameterization of twist potentials to enable approximate posterior inference for coupled continuous-time hidden Markov models via twisted sequential Mon...

Reference graph

Works this paper leans on

70 extracted references · 70 canonical work pages · cited by 1 Pith paper · 5 internal anchors

[1]

Zero-variance importance sampling estimators for Markov process expectations

Hernan P Awad, Peter W Glynn, and Reuven Y Rubinstein. Zero-variance importance sampling estimators for Markov process expectations. Mathematics of Operations Re- search, 38(2):358–388, 2013

work page 2013
[2]

An intuitive proof of the data processing inequality

Normand J Beaudry and Renato Renner. An intuitive proof of the data processing inequality. arXiv preprint arXiv:1107.0740 , 2011

work page internal anchor Pith review Pith/arXiv arXiv 2011
[3]

Monte Carlo twisting for particle filters

Joshua J Bon, Christopher Drovandi, and Anthony Lee. Monte Carlo twisting for particle filters. arXiv preprint arXiv:2208.04288 , 2022

work page arXiv 2022
[4]

A variational representation for certain functionals of Brownian motion

Michelle Bou´ e and Paul Dupuis. A variational representation for certain functionals of Brownian motion. The Annals of Probability , 26(4):1641–1659, 1998

work page 1998
[5]

Optimized auxiliary particle filters: adapting mix- ture proposals via convex optimization

Nicola Branchini and V´ ıctor Elvira. Optimized auxiliary particle filters: adapting mix- ture proposals via convex optimization. In Uncertainty in Artificial Intelligence , pages 1289–1299. PMLR, 2021

work page 2021
[6]

A sequential particle filter method for static models

Nicolas Chopin. A sequential particle filter method for static models. Biometrika, 89(3):539–552, 2002. 32

work page 2002
[7]

Approximation by superpositions of a sigmoidal function

George Cybenko. Approximation by superpositions of a sigmoidal function. Mathemat- ics of control, signals and systems , 2(4):303–314, 1989

work page 1989
[8]

Theoretical guarantees for approximate sampling from smooth and log-concave densities

Arnak S Dalalyan. Theoretical guarantees for approximate sampling from smooth and log-concave densities. Journal of the Royal Statistical Society Series B: Statistical Methodology, 79(3):651–676, 2017

work page 2017
[9]

Feynman-Kac formulae

Pierre Del Moral. Feynman-Kac formulae. Springer, 2004

work page 2004
[10]

Sequential Monte Carlo samplers

Pierre Del Moral, Arnaud Doucet, and Ajay Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society Series B: Statistical Methodology, 68(3):411–436, 2006

work page 2006
[11]

On adaptive resampling strategies for sequential Monte Carlo methods

Pierrre Del Moral, Arnaud Doucet, and Ajay Jasra. On adaptive resampling strategies for sequential Monte Carlo methods. Bernoulli, 18(1):252–278, 2012

work page 2012
[12]

Large deviations, volume 342

Jean-Dominique Deuschel and Daniel W Stroock. Large deviations, volume 342. Amer- ican Mathematical Soc., 2001

work page 2001
[13]

Particle filtering

Petar M Djuric, Jayesh H Kotecha, Jianqui Zhang, Yufei Huang, Tadesse Ghirmai, M´ onica F Bugallo, and Joaquin Miguez. Particle filtering. IEEE signal processing magazine, 20(5):19–38, 2003

work page 2003
[14]

An introduction to sequential Monte Carlo methods

Arnaud Doucet, Nando De Freitas, and Neil Gordon. An introduction to sequential Monte Carlo methods. Sequential Monte Carlo methods in practice , pages 3–14, 2001

work page 2001
[15]

On sequential Monte Carlo sampling methods for Bayesian filtering

Arnaud Doucet, Simon Godsill, and Christophe Andrieu. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and computing , 10:197–208, 2000

work page 2000
[16]

A tutorial on particle filtering and smoothing: Fifteen years later

Arnaud Doucet, Adam M Johansen, et al. A tutorial on particle filtering and smoothing: Fifteen years later. Handbook of nonlinear filtering, 12(656-704):3, 2009

work page 2009
[17]

Temporal difference learning in continuous time and space

Kenji Doya. Temporal difference learning in continuous time and space. Advances in neural information processing systems, 8, 1995

work page 1995
[18]

Time series analysis by state space methods , volume 38

James Durbin and Siem Jan Koopman. Time series analysis by state space methods , volume 38. OUP Oxford, 2012

work page 2012
[19]

Stochastic calculus: a practical introduction

Richard Durrett. Stochastic calculus: a practical introduction . CRC press, 2018

work page 2018
[20]

Stochastic equations with delay: Optimal control via BSDEs and regular solutions of Hamilton-Jacobi-Bellman equations

Marco Fuhrman, Federica Masiero, and Gianmario Tessitore. Stochastic equations with delay: Optimal control via BSDEs and regular solutions of Hamilton-Jacobi-Bellman equations. SIAM Journal on Control and Optimization , 48(7):4624–4651, 2010

work page 2010
[21]

On transforming a certain class of stochastic processes by absolutely continuous substitution of measures

Igor Vladimirovich Girsanov. On transforming a certain class of stochastic processes by absolutely continuous substitution of measures. Theory of Probability & Its Appli- cations, 5(3):285–301, 1960

work page 1960
[22]

Monte Carlo methods in financial engineering , volume 53

Paul Glasserman. Monte Carlo methods in financial engineering , volume 53. Springer, 2004

work page 2004
[23]

Importance sampling for portfolio credit risk

Paul Glasserman and Jingyi Li. Importance sampling for portfolio credit risk. Man- agement science, 51(11):1643–1656, 2005

work page 2005
[24]

Importance sampling for stochastic simulations

Peter W Glynn and Donald L Iglehart. Importance sampling for stochastic simulations. Management science, 35(11):1367–1392, 1989

work page 1989
[25]

Novel approach to nonlinear/non-Gaussian Bayesian state estimation

Neil J Gordon, David J Salmond, and Adrian FM Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. In IEE proceedings F (radar and signal processing), volume 140, pages 107–113. IET, 1993. 33

work page 1993
[26]

The iterated auxiliary particle filter

Pieralberto Guarniero, Adam M Johansen, and Anthony Lee. The iterated auxiliary particle filter. Journal of the American Statistical Association , 112(520):1636–1647, 2017

work page 2017
[27]

Reinforcement learning with deep energy-based policies

Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, and Sergey Levine. Reinforcement learning with deep energy-based policies. In International conference on machine learn- ing, pages 1352–1361. PMLR, 2017

work page 2017
[28]

Solving high-dimensional partial differen- tial equations using deep learning

Jiequn Han, Arnulf Jentzen, and Weinan E. Solving high-dimensional partial differen- tial equations using deep learning. Proceedings of the National Academy of Sciences , 115(34):8505–8510, 2018

work page 2018
[29]

Nonasymptotic bounds for suboptimal im- portance sampling

Carsten Hartmann and Lorenz Richter. Nonasymptotic bounds for suboptimal im- portance sampling. SIAM/ASA Journal on Uncertainty Quantification , 12(2):309–346, 2024

work page 2024
[30]

Variational characterization of free energy: Theory and algorithms

Carsten Hartmann, Lorenz Richter, Christof Sch¨ utte, and Wei Zhang. Variational characterization of free energy: Theory and algorithms. Entropy, 19(11):626, 2017

work page 2017
[31]

Efficient rare event simulation by optimal nonequilibrium forcing

Carsten Hartmann and Christof Sch¨ utte. Efficient rare event simulation by optimal nonequilibrium forcing. Journal of Statistical Mechanics: Theory and Experiment , 2012(11):P11004, 2012

work page 2012
[32]

Model reduction algorithms for optimal control and importance sampling of diffusions

Carsten Hartmann, Christof Sch¨ utte, and Wei Zhang. Model reduction algorithms for optimal control and importance sampling of diffusions. Nonlinearity, 29(8):2298, 2016

work page 2016
[33]

Controlled sequential Monte Carlo

Jeremy Heng, Adrian N Bishop, George Deligiannidis, and Arnaud Doucet. Controlled sequential Monte Carlo. The Annals of Statistics , 48(5):2904–2929, 2020

work page 2020
[34]

DenseNet: Implementing Efficient ConvNet Descriptor Pyramids

Forrest Iandola, Matt Moskewicz, Sergey Karayev, Ross Girshick, Trevor Darrell, and Kurt Keutzer. Densenet: Implementing efficient convnet descriptor pyramids. arXiv preprint arXiv:1404.1869, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014
[35]

A new approach to linear filtering and prediction problems

Rudolph Emil Kalman. A new approach to linear filtering and prediction problems. 1960

work page 1960
[36]

Adaptive importance sampling with forward-backward stochastic differential equations

Omar Kebiri, Lara Neureither, and Carsten Hartmann. Adaptive importance sampling with forward-backward stochastic differential equations. In Stochastic Dynamics Out of Equilibrium: Institut Henri Poincar´ e, Paris, France, 2017, pages 265–281. Springer, 2019

work page 2017
[37]

Monte Carlo filter and smoother for non-Gaussian nonlinear state space models

Genshiro Kitagawa. Monte Carlo filter and smoother for non-Gaussian nonlinear state space models. Journal of computational and graphical statistics , 5(1):1–25, 1996

work page 1996
[38]

Bayesian estimates of equation system param- eters: an application of integration by Monte Carlo

Teun Kloek and Herman K Van Dijk. Bayesian estimates of equation system param- eters: an application of integration by Monte Carlo. Econometrica: Journal of the Econometric Society, pages 1–19, 1978

work page 1978
[39]

Sequential imputations and Bayesian missing data problems

Augustine Kong, Jun S Liu, and Wing Hung Wong. Sequential imputations and Bayesian missing data problems. Journal of the American statistical association , 89(425):278–288, 1994

work page 1994
[40]

Sixo: Smoothing inference with twisted objectives

Dieterich Lawson, Allan Ravent´ os, Andrew Warrington, and Scott Linderman. Sixo: Smoothing inference with twisted objectives. Advances in Neural Information Process- ing Systems, 35:38844–38858, 2022

work page 2022
[41]

Twisted variational sequential Monte Carlo

Dieterich Lawson, George Tucker, Christian A Naesseth, Chris Maddison, Ryan P Adams, and Yee Whye Teh. Twisted variational sequential Monte Carlo. In Third workshop on Bayesian Deep Learning (NeurIPS) , 2018

work page 2018
[42]

Propagation of chaos in path spaces via information theory

Lei Li, Yuelin Wang, and Yuliang Wang. Propagation of chaos in path spaces via information theory. arXiv preprint arXiv:2312.00339 , 2023. 34

work page arXiv 2023
[43]

and Wang, Y

Lei Li and Yuliang Wang. A sharp uniform-in-time error estimate for Stochastic Gra- dient Langevin Dynamics. arXiv preprint arXiv:2207.09304 , 2022

work page arXiv 2022
[44]

On a strongly convex approximation of a stochastic optimal control problem for importance sampling of metastable diffusions

Han Cheng Lie. On a strongly convex approximation of a stochastic optimal control problem for importance sampling of metastable diffusions . PhD thesis, 2016

work page 2016
[45]

Blind deconvolution via sequential imputations

Jun S Liu and Rong Chen. Blind deconvolution via sequential imputations. Journal of the american statistical association , 90(430):567–576, 1995

work page 1995
[46]

Sequential Monte Carlo methods for dynamic systems

Jun S Liu and Rong Chen. Sequential Monte Carlo methods for dynamic systems. Journal of the American statistical association , 93(443):1032–1044, 1998

work page 1998
[47]

Predictability: A problem partly solved

Edward N Lorenz. Predictability: A problem partly solved. In Proc. Seminar on predictability, volume 1. Reading, 1996

work page 1996
[48]

Im- proved bounds for discretization of Langevin diffusions: Near-optimal rates without convexity

Wenlong Mou, Nicolas Flammarion, Martin J Wainwright, and Peter L Bartlett. Im- proved bounds for discretization of Langevin diffusions: Near-optimal rates without convexity. Bernoulli, 28(3):1577–1601, 2022

work page 2022
[49]

On Bellman equations for continuous-time policy eval- uation i: discretization and approximation

Wenlong Mou and Yuhua Zhu. On Bellman equations for continuous-time policy eval- uation i: discretization and approximation. arXiv preprint arXiv:2407.05966 , 2024

work page arXiv 2024
[50]

Anytime Monte Carlo

Lawrence M Murray, Sumeetpal S Singh, and Anthony Lee. Anytime Monte Carlo. Data-Centric Engineering, 2:e7, 2021

work page 2021
[51]

On the optimal and suboptimal nonlinear fil- tering problem for discrete-time systems

M Netto, L Gimeno, and M Mendes. On the optimal and suboptimal nonlinear fil- tering problem for discrete-time systems. IEEE Transactions on Automatic Control , 23(6):1062–1067, 1978

work page 1978
[52]

Filtering via simulation: Auxiliary particle filters

Michael K Pitt and Neil Shephard. Filtering via simulation: Auxiliary particle filters. Journal of the American statistical association , 94(446):590–599, 1999

work page 1999
[53]

Improv- ing control based importance sampling strategies for metastable diffusions via adapted metadynamics

Enric Ribera Borrell, Jannes Quer, Lorenz Richter, and Christof Sch¨ utte. Improv- ing control based importance sampling strategies for metastable diffusions via adapted metadynamics. SIAM Journal on Scientific Computing , 46(2):S298–S323, 2024

work page 2024
[54]

Solving high-dimensional PDEs, approximation of path space measures and importance sampling of diffusions

Lorenz Richter. Solving high-dimensional PDEs, approximation of path space measures and importance sampling of diffusions . PhD thesis, BTU Cottbus-Senftenberg, 2021

work page 2021
[55]

Bayesian filtering and smoothing , volume 17

Simo S¨ arkk¨ a and Lennart Svensson. Bayesian filtering and smoothing , volume 17. Cambridge university press, 2023

work page 2023
[56]

Equivalence Between Policy Gradients and Soft Q-Learning

John Schulman, Xi Chen, and Pieter Abbeel. Equivalence between policy gradients and soft q-learning. arXiv preprint arXiv:1704.06440 , 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[57]

Learning to summarize with human feedback

Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, and Paul F Christiano. Learning to summarize with human feedback. Advances in Neural Information Processing Systems , 33:3008–3021, 2020

work page 2020
[58]

Learning to predict by the methods of temporal differences

Richard S Sutton. Learning to predict by the methods of temporal differences. Machine learning, 3:9–44, 1988

work page 1988
[59]

Policy gradient methods for reinforcement learning with function approximation

Richard S Sutton, David McAllester, Satinder Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. Advances in neural information processing systems, 12, 1999

work page 1999
[60]

An introduction to optimal control of FBSDE with incomplete information

Guangchen Wang, Zhen Wu, Jie Xiong, et al. An introduction to optimal control of FBSDE with incomplete information . Springer, 2018

work page 2018
[61]

Reinforcement learning in continuous time and space: A stochastic control approach.Journal of Machine Learning Research, 21(198):1–34, 2020

Haoran Wang, Thaleia Zariphopoulou, and Xun Yu Zhou. Reinforcement learning in continuous time and space: A stochastic control approach.Journal of Machine Learning Research, 21(198):1–34, 2020. 35

work page 2020
[62]

Mixture models, Monte Carlo, Bayesian updating, and dynamic models

Mike West. Mixture models, Monte Carlo, Bayesian updating, and dynamic models. Computing Science and Statistics , pages 325–325, 1993

work page 1993
[63]

Twisted particle filters

Nick Whiteley and Anthony Lee. Twisted particle filters. The Annals of Statistics , 42(1):115–141, 2014

work page 2014
[64]

Simple statistical gradient-following algorithms for connectionist reinforcement learning

Ronald J Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, 8:229–256, 1992

work page 1992
[65]

FUDGE: Controlled text generation with future discrim- inators

Kevin Yang and Dan Klein. FUDGE: Controlled text generation with future discrim- inators. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , pages 3511–3535, 2021

work page 2021
[66]

Ap- plications of the cross-entropy method to importance sampling and optimal control of diffusions

Wei Zhang, Han Wang, Carsten Hartmann, Marcus Weber, and Christof Sch¨ utte. Ap- plications of the cross-entropy method to importance sampling and optimal control of diffusions. SIAM Journal on Scientific Computing , 36(6):A2654–A2672, 2014

work page 2014
[67]

Probabilistic inference in language models via twisted sequential monte carlo.arXiv preprint arXiv:2404.17546, 2024

Stephen Zhao, Rob Brekelmans, Alireza Makhzani, and Roger Grosse. Probabilis- tic inference in language models via twisted sequential Monte Carlo. arXiv preprint arXiv:2404.17546, 2024

work page arXiv 2024
[68]

Solving time-continuous stochastic optimal control prob- lems: Algorithm design and convergence analysis of actor-critic flow

Mo Zhou and Jianfeng Lu. Solving time-continuous stochastic optimal control prob- lems: Algorithm design and convergence analysis of actor-critic flow. arXiv preprint arXiv:2402.17208, 2024

work page arXiv 2024
[69]

Fine-Tuning Language Models from Human Preferences

Daniel M Ziegler, Nisan Stiennon, Jeffrey Wu, Tom B Brown, Alec Radford, Dario Amodei, Paul Christiano, and Geoffrey Irving. Fine-tuning language models from hu- man preferences. arXiv preprint arXiv:1909.08593 , 2019

work page internal anchor Pith review Pith/arXiv arXiv 1909
[70]

Universal and Transferable Adversarial Attacks on Aligned Language Models

Andy Zou, Zifan Wang, J Zico Kolter, and Matt Fredrikson. Universal and transferable adversarial attacks on aligned language models. arXiv preprint arXiv:2307.15043, 2023. 36

work page internal anchor Pith review Pith/arXiv arXiv 2023

[1] [1]

Zero-variance importance sampling estimators for Markov process expectations

Hernan P Awad, Peter W Glynn, and Reuven Y Rubinstein. Zero-variance importance sampling estimators for Markov process expectations. Mathematics of Operations Re- search, 38(2):358–388, 2013

work page 2013

[2] [2]

An intuitive proof of the data processing inequality

Normand J Beaudry and Renato Renner. An intuitive proof of the data processing inequality. arXiv preprint arXiv:1107.0740 , 2011

work page internal anchor Pith review Pith/arXiv arXiv 2011

[3] [3]

Monte Carlo twisting for particle filters

Joshua J Bon, Christopher Drovandi, and Anthony Lee. Monte Carlo twisting for particle filters. arXiv preprint arXiv:2208.04288 , 2022

work page arXiv 2022

[4] [4]

A variational representation for certain functionals of Brownian motion

Michelle Bou´ e and Paul Dupuis. A variational representation for certain functionals of Brownian motion. The Annals of Probability , 26(4):1641–1659, 1998

work page 1998

[5] [5]

Optimized auxiliary particle filters: adapting mix- ture proposals via convex optimization

Nicola Branchini and V´ ıctor Elvira. Optimized auxiliary particle filters: adapting mix- ture proposals via convex optimization. In Uncertainty in Artificial Intelligence , pages 1289–1299. PMLR, 2021

work page 2021

[6] [6]

A sequential particle filter method for static models

Nicolas Chopin. A sequential particle filter method for static models. Biometrika, 89(3):539–552, 2002. 32

work page 2002

[7] [7]

Approximation by superpositions of a sigmoidal function

George Cybenko. Approximation by superpositions of a sigmoidal function. Mathemat- ics of control, signals and systems , 2(4):303–314, 1989

work page 1989

[8] [8]

Theoretical guarantees for approximate sampling from smooth and log-concave densities

Arnak S Dalalyan. Theoretical guarantees for approximate sampling from smooth and log-concave densities. Journal of the Royal Statistical Society Series B: Statistical Methodology, 79(3):651–676, 2017

work page 2017

[9] [9]

Feynman-Kac formulae

Pierre Del Moral. Feynman-Kac formulae. Springer, 2004

work page 2004

[10] [10]

Sequential Monte Carlo samplers

Pierre Del Moral, Arnaud Doucet, and Ajay Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society Series B: Statistical Methodology, 68(3):411–436, 2006

work page 2006

[11] [11]

On adaptive resampling strategies for sequential Monte Carlo methods

Pierrre Del Moral, Arnaud Doucet, and Ajay Jasra. On adaptive resampling strategies for sequential Monte Carlo methods. Bernoulli, 18(1):252–278, 2012

work page 2012

[12] [12]

Large deviations, volume 342

Jean-Dominique Deuschel and Daniel W Stroock. Large deviations, volume 342. Amer- ican Mathematical Soc., 2001

work page 2001

[13] [13]

Particle filtering

Petar M Djuric, Jayesh H Kotecha, Jianqui Zhang, Yufei Huang, Tadesse Ghirmai, M´ onica F Bugallo, and Joaquin Miguez. Particle filtering. IEEE signal processing magazine, 20(5):19–38, 2003

work page 2003

[14] [14]

An introduction to sequential Monte Carlo methods

Arnaud Doucet, Nando De Freitas, and Neil Gordon. An introduction to sequential Monte Carlo methods. Sequential Monte Carlo methods in practice , pages 3–14, 2001

work page 2001

[15] [15]

On sequential Monte Carlo sampling methods for Bayesian filtering

Arnaud Doucet, Simon Godsill, and Christophe Andrieu. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and computing , 10:197–208, 2000

work page 2000

[16] [16]

A tutorial on particle filtering and smoothing: Fifteen years later

Arnaud Doucet, Adam M Johansen, et al. A tutorial on particle filtering and smoothing: Fifteen years later. Handbook of nonlinear filtering, 12(656-704):3, 2009

work page 2009

[17] [17]

Temporal difference learning in continuous time and space

Kenji Doya. Temporal difference learning in continuous time and space. Advances in neural information processing systems, 8, 1995

work page 1995

[18] [18]

Time series analysis by state space methods , volume 38

James Durbin and Siem Jan Koopman. Time series analysis by state space methods , volume 38. OUP Oxford, 2012

work page 2012

[19] [19]

Stochastic calculus: a practical introduction

Richard Durrett. Stochastic calculus: a practical introduction . CRC press, 2018

work page 2018

[20] [20]

Stochastic equations with delay: Optimal control via BSDEs and regular solutions of Hamilton-Jacobi-Bellman equations

Marco Fuhrman, Federica Masiero, and Gianmario Tessitore. Stochastic equations with delay: Optimal control via BSDEs and regular solutions of Hamilton-Jacobi-Bellman equations. SIAM Journal on Control and Optimization , 48(7):4624–4651, 2010

work page 2010

[21] [21]

On transforming a certain class of stochastic processes by absolutely continuous substitution of measures

Igor Vladimirovich Girsanov. On transforming a certain class of stochastic processes by absolutely continuous substitution of measures. Theory of Probability & Its Appli- cations, 5(3):285–301, 1960

work page 1960

[22] [22]

Monte Carlo methods in financial engineering , volume 53

Paul Glasserman. Monte Carlo methods in financial engineering , volume 53. Springer, 2004

work page 2004

[23] [23]

Importance sampling for portfolio credit risk

Paul Glasserman and Jingyi Li. Importance sampling for portfolio credit risk. Man- agement science, 51(11):1643–1656, 2005

work page 2005

[24] [24]

Importance sampling for stochastic simulations

Peter W Glynn and Donald L Iglehart. Importance sampling for stochastic simulations. Management science, 35(11):1367–1392, 1989

work page 1989

[25] [25]

Novel approach to nonlinear/non-Gaussian Bayesian state estimation

Neil J Gordon, David J Salmond, and Adrian FM Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. In IEE proceedings F (radar and signal processing), volume 140, pages 107–113. IET, 1993. 33

work page 1993

[26] [26]

The iterated auxiliary particle filter

Pieralberto Guarniero, Adam M Johansen, and Anthony Lee. The iterated auxiliary particle filter. Journal of the American Statistical Association , 112(520):1636–1647, 2017

work page 2017

[27] [27]

Reinforcement learning with deep energy-based policies

Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, and Sergey Levine. Reinforcement learning with deep energy-based policies. In International conference on machine learn- ing, pages 1352–1361. PMLR, 2017

work page 2017

[28] [28]

Solving high-dimensional partial differen- tial equations using deep learning

Jiequn Han, Arnulf Jentzen, and Weinan E. Solving high-dimensional partial differen- tial equations using deep learning. Proceedings of the National Academy of Sciences , 115(34):8505–8510, 2018

work page 2018

[29] [29]

Nonasymptotic bounds for suboptimal im- portance sampling

Carsten Hartmann and Lorenz Richter. Nonasymptotic bounds for suboptimal im- portance sampling. SIAM/ASA Journal on Uncertainty Quantification , 12(2):309–346, 2024

work page 2024

[30] [30]

Variational characterization of free energy: Theory and algorithms

Carsten Hartmann, Lorenz Richter, Christof Sch¨ utte, and Wei Zhang. Variational characterization of free energy: Theory and algorithms. Entropy, 19(11):626, 2017

work page 2017

[31] [31]

Efficient rare event simulation by optimal nonequilibrium forcing

Carsten Hartmann and Christof Sch¨ utte. Efficient rare event simulation by optimal nonequilibrium forcing. Journal of Statistical Mechanics: Theory and Experiment , 2012(11):P11004, 2012

work page 2012

[32] [32]

Model reduction algorithms for optimal control and importance sampling of diffusions

Carsten Hartmann, Christof Sch¨ utte, and Wei Zhang. Model reduction algorithms for optimal control and importance sampling of diffusions. Nonlinearity, 29(8):2298, 2016

work page 2016

[33] [33]

Controlled sequential Monte Carlo

Jeremy Heng, Adrian N Bishop, George Deligiannidis, and Arnaud Doucet. Controlled sequential Monte Carlo. The Annals of Statistics , 48(5):2904–2929, 2020

work page 2020

[34] [34]

DenseNet: Implementing Efficient ConvNet Descriptor Pyramids

Forrest Iandola, Matt Moskewicz, Sergey Karayev, Ross Girshick, Trevor Darrell, and Kurt Keutzer. Densenet: Implementing efficient convnet descriptor pyramids. arXiv preprint arXiv:1404.1869, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014

[35] [35]

A new approach to linear filtering and prediction problems

Rudolph Emil Kalman. A new approach to linear filtering and prediction problems. 1960

work page 1960

[36] [36]

Adaptive importance sampling with forward-backward stochastic differential equations

Omar Kebiri, Lara Neureither, and Carsten Hartmann. Adaptive importance sampling with forward-backward stochastic differential equations. In Stochastic Dynamics Out of Equilibrium: Institut Henri Poincar´ e, Paris, France, 2017, pages 265–281. Springer, 2019

work page 2017

[37] [37]

Monte Carlo filter and smoother for non-Gaussian nonlinear state space models

Genshiro Kitagawa. Monte Carlo filter and smoother for non-Gaussian nonlinear state space models. Journal of computational and graphical statistics , 5(1):1–25, 1996

work page 1996

[38] [38]

Bayesian estimates of equation system param- eters: an application of integration by Monte Carlo

Teun Kloek and Herman K Van Dijk. Bayesian estimates of equation system param- eters: an application of integration by Monte Carlo. Econometrica: Journal of the Econometric Society, pages 1–19, 1978

work page 1978

[39] [39]

Sequential imputations and Bayesian missing data problems

Augustine Kong, Jun S Liu, and Wing Hung Wong. Sequential imputations and Bayesian missing data problems. Journal of the American statistical association , 89(425):278–288, 1994

work page 1994

[40] [40]

Sixo: Smoothing inference with twisted objectives

Dieterich Lawson, Allan Ravent´ os, Andrew Warrington, and Scott Linderman. Sixo: Smoothing inference with twisted objectives. Advances in Neural Information Process- ing Systems, 35:38844–38858, 2022

work page 2022

[41] [41]

Twisted variational sequential Monte Carlo

Dieterich Lawson, George Tucker, Christian A Naesseth, Chris Maddison, Ryan P Adams, and Yee Whye Teh. Twisted variational sequential Monte Carlo. In Third workshop on Bayesian Deep Learning (NeurIPS) , 2018

work page 2018

[42] [42]

Propagation of chaos in path spaces via information theory

Lei Li, Yuelin Wang, and Yuliang Wang. Propagation of chaos in path spaces via information theory. arXiv preprint arXiv:2312.00339 , 2023. 34

work page arXiv 2023

[43] [43]

and Wang, Y

Lei Li and Yuliang Wang. A sharp uniform-in-time error estimate for Stochastic Gra- dient Langevin Dynamics. arXiv preprint arXiv:2207.09304 , 2022

work page arXiv 2022

[44] [44]

On a strongly convex approximation of a stochastic optimal control problem for importance sampling of metastable diffusions

Han Cheng Lie. On a strongly convex approximation of a stochastic optimal control problem for importance sampling of metastable diffusions . PhD thesis, 2016

work page 2016

[45] [45]

Blind deconvolution via sequential imputations

Jun S Liu and Rong Chen. Blind deconvolution via sequential imputations. Journal of the american statistical association , 90(430):567–576, 1995

work page 1995

[46] [46]

Sequential Monte Carlo methods for dynamic systems

Jun S Liu and Rong Chen. Sequential Monte Carlo methods for dynamic systems. Journal of the American statistical association , 93(443):1032–1044, 1998

work page 1998

[47] [47]

Predictability: A problem partly solved

Edward N Lorenz. Predictability: A problem partly solved. In Proc. Seminar on predictability, volume 1. Reading, 1996

work page 1996

[48] [48]

Im- proved bounds for discretization of Langevin diffusions: Near-optimal rates without convexity

Wenlong Mou, Nicolas Flammarion, Martin J Wainwright, and Peter L Bartlett. Im- proved bounds for discretization of Langevin diffusions: Near-optimal rates without convexity. Bernoulli, 28(3):1577–1601, 2022

work page 2022

[49] [49]

On Bellman equations for continuous-time policy eval- uation i: discretization and approximation

Wenlong Mou and Yuhua Zhu. On Bellman equations for continuous-time policy eval- uation i: discretization and approximation. arXiv preprint arXiv:2407.05966 , 2024

work page arXiv 2024

[50] [50]

Anytime Monte Carlo

Lawrence M Murray, Sumeetpal S Singh, and Anthony Lee. Anytime Monte Carlo. Data-Centric Engineering, 2:e7, 2021

work page 2021

[51] [51]

On the optimal and suboptimal nonlinear fil- tering problem for discrete-time systems

M Netto, L Gimeno, and M Mendes. On the optimal and suboptimal nonlinear fil- tering problem for discrete-time systems. IEEE Transactions on Automatic Control , 23(6):1062–1067, 1978

work page 1978

[52] [52]

Filtering via simulation: Auxiliary particle filters

Michael K Pitt and Neil Shephard. Filtering via simulation: Auxiliary particle filters. Journal of the American statistical association , 94(446):590–599, 1999

work page 1999

[53] [53]

Improv- ing control based importance sampling strategies for metastable diffusions via adapted metadynamics

Enric Ribera Borrell, Jannes Quer, Lorenz Richter, and Christof Sch¨ utte. Improv- ing control based importance sampling strategies for metastable diffusions via adapted metadynamics. SIAM Journal on Scientific Computing , 46(2):S298–S323, 2024

work page 2024

[54] [54]

Solving high-dimensional PDEs, approximation of path space measures and importance sampling of diffusions

Lorenz Richter. Solving high-dimensional PDEs, approximation of path space measures and importance sampling of diffusions . PhD thesis, BTU Cottbus-Senftenberg, 2021

work page 2021

[55] [55]

Bayesian filtering and smoothing , volume 17

Simo S¨ arkk¨ a and Lennart Svensson. Bayesian filtering and smoothing , volume 17. Cambridge university press, 2023

work page 2023

[56] [56]

Equivalence Between Policy Gradients and Soft Q-Learning

John Schulman, Xi Chen, and Pieter Abbeel. Equivalence between policy gradients and soft q-learning. arXiv preprint arXiv:1704.06440 , 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[57] [57]

Learning to summarize with human feedback

Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, and Paul F Christiano. Learning to summarize with human feedback. Advances in Neural Information Processing Systems , 33:3008–3021, 2020

work page 2020

[58] [58]

Learning to predict by the methods of temporal differences

Richard S Sutton. Learning to predict by the methods of temporal differences. Machine learning, 3:9–44, 1988

work page 1988

[59] [59]

Policy gradient methods for reinforcement learning with function approximation

Richard S Sutton, David McAllester, Satinder Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. Advances in neural information processing systems, 12, 1999

work page 1999

[60] [60]

An introduction to optimal control of FBSDE with incomplete information

Guangchen Wang, Zhen Wu, Jie Xiong, et al. An introduction to optimal control of FBSDE with incomplete information . Springer, 2018

work page 2018

[61] [61]

Reinforcement learning in continuous time and space: A stochastic control approach.Journal of Machine Learning Research, 21(198):1–34, 2020

Haoran Wang, Thaleia Zariphopoulou, and Xun Yu Zhou. Reinforcement learning in continuous time and space: A stochastic control approach.Journal of Machine Learning Research, 21(198):1–34, 2020. 35

work page 2020

[62] [62]

Mixture models, Monte Carlo, Bayesian updating, and dynamic models

Mike West. Mixture models, Monte Carlo, Bayesian updating, and dynamic models. Computing Science and Statistics , pages 325–325, 1993

work page 1993

[63] [63]

Twisted particle filters

Nick Whiteley and Anthony Lee. Twisted particle filters. The Annals of Statistics , 42(1):115–141, 2014

work page 2014

[64] [64]

Simple statistical gradient-following algorithms for connectionist reinforcement learning

Ronald J Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, 8:229–256, 1992

work page 1992

[65] [65]

FUDGE: Controlled text generation with future discrim- inators

Kevin Yang and Dan Klein. FUDGE: Controlled text generation with future discrim- inators. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , pages 3511–3535, 2021

work page 2021

[66] [66]

Ap- plications of the cross-entropy method to importance sampling and optimal control of diffusions

Wei Zhang, Han Wang, Carsten Hartmann, Marcus Weber, and Christof Sch¨ utte. Ap- plications of the cross-entropy method to importance sampling and optimal control of diffusions. SIAM Journal on Scientific Computing , 36(6):A2654–A2672, 2014

work page 2014

[67] [67]

Probabilistic inference in language models via twisted sequential monte carlo.arXiv preprint arXiv:2404.17546, 2024

Stephen Zhao, Rob Brekelmans, Alireza Makhzani, and Roger Grosse. Probabilis- tic inference in language models via twisted sequential Monte Carlo. arXiv preprint arXiv:2404.17546, 2024

work page arXiv 2024

[68] [68]

Solving time-continuous stochastic optimal control prob- lems: Algorithm design and convergence analysis of actor-critic flow

Mo Zhou and Jianfeng Lu. Solving time-continuous stochastic optimal control prob- lems: Algorithm design and convergence analysis of actor-critic flow. arXiv preprint arXiv:2402.17208, 2024

work page arXiv 2024

[69] [69]

Fine-Tuning Language Models from Human Preferences

Daniel M Ziegler, Nisan Stiennon, Jeffrey Wu, Tom B Brown, Alec Radford, Dario Amodei, Paul Christiano, and Geoffrey Irving. Fine-tuning language models from hu- man preferences. arXiv preprint arXiv:1909.08593 , 2019

work page internal anchor Pith review Pith/arXiv arXiv 1909

[70] [70]

Universal and Transferable Adversarial Attacks on Aligned Language Models

Andy Zou, Zifan Wang, J Zico Kolter, and Matt Fredrikson. Universal and transferable adversarial attacks on aligned language models. arXiv preprint arXiv:2307.15043, 2023. 36

work page internal anchor Pith review Pith/arXiv arXiv 2023