arxiv: 2407.07368 · v8 · submitted 2024-07-10 · 📡 eess.SP · cs.LG

Semi-Supervised Model-Free Bayesian State Estimation from Compressed Measurements

Pith reviewed 2026-05-23 23:08 UTC · model grok-4.3

keywords: semi-supervised learning, Bayesian state estimation, compressed measurements, model-free processes, chaotic dynamical systems, data-driven methods, DANSE

The pith

SemiDANSE adds limited labeled pairs to regularize unsupervised learning and solves under-determined state estimation from compressed measurements in model-free processes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows experimentally that two existing unsupervised data-driven methods, DANSE and the deep Markov model (DMM), fail at Bayesian state estimation from compressed measurements (BSCM) when the process dynamics are unknown, because their training lacks regularization for the under-determined inverse problem. It introduces SemiDANSE, which augments a large amount of unlabeled measurement data with a small set of labeled measurement-state pairs to supply that regularization. A sympathetic reader would care because many real sensing tasks involve chaotic or black-box dynamics where full models cannot be written down, yet some paired examples can be obtained. If the claim holds, data-driven estimators become practical for state estimation from compressed measurements without requiring exact dynamical equations.

Core claim

The central claim is that SemiDANSE, a semi-supervised extension of DANSE, combines a large volume of unlabeled measurement time series with a limited set of labeled measurement-and-state pairs to regularize learning, and thereby solves the under-determined BSCM inverse problem for model-free processes. On benchmark chaotic dynamical systems and across several measurement systems, it is shown empirically to deliver state estimation performance competitive with the hybrid KalmanNet and with the model-driven extended and unscented Kalman filters, both of which know the dynamics exactly.

What carries the argument

SemiDANSE, the semi-supervised DANSE variant that injects limited labeled measurement-state pairs to regularize the unsupervised component for compressed measurement inversion.

If this is right

  • SemiDANSE solves BSCM tasks in which the temporal measurement dimension is lower than the state dimension.
  • The method operates without any knowledge of the underlying dynamical model.
  • It remains competitive when the measurement system changes among a handful of different linear or nonlinear mappings.
  • It succeeds on chaotic benchmark systems where purely unsupervised DANSE and deep Markov models fail to produce usable state estimates.
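
To make the first point concrete, here is a minimal sketch of why a measurement with fewer dimensions than the state leaves the state ambiguous. The 2×3 matrix `H` and both state vectors are hypothetical values chosen for illustration, not taken from the paper, and the sketch assumes a noiseless linear measurement.

```python
def matvec(H, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(h * xi for h, xi in zip(row, x)) for row in H]

# Hypothetical measurement matrix: 3-dimensional state, 2-dimensional measurement.
H = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]

x_a = [1.0, 2.0, 3.0]

# [1, 1, -1] lies in the null space of H, so adding any multiple of it
# changes the state without changing the measurement.
null_dir = [1.0, 1.0, -1.0]
x_b = [a + 2.0 * v for a, v in zip(x_a, null_dir)]

y_a = matvec(H, x_a)
y_b = matvec(H, x_b)

print("states differ:      ", x_a != x_b)
print("measurements match: ", y_a == y_b)
```

Because two distinct states produce the same measurement, the measurement alone cannot pick between them; some additional information (a dynamical model, a prior, or labeled pairs) has to break the tie.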

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same limited-label regularization pattern could be tested on other under-determined inverse problems that involve temporal data.
  • Varying the fraction of labeled pairs while holding total data fixed would reveal how little supervision is actually required for stable performance.
  • The approach suggests that hybrid semi-supervised wrappers might improve additional unsupervised time-series estimators beyond the DANSE family.
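
The label-fraction experiment suggested in the second point could be set up along the following lines. This is a sketch only: the helper `split_labeled`, the dataset size of 1000 sequences, and the list of fractions are all hypothetical and are not drawn from the paper.

```python
import random

def split_labeled(num_sequences, labeled_fraction, seed=0):
    """Split sequence indices into (labeled, unlabeled) for a fixed total size.

    Hypothetical helper: the total amount of data stays constant and only
    the share that carries state labels changes.
    """
    if not 0.0 <= labeled_fraction <= 1.0:
        raise ValueError("labeled_fraction must lie in [0, 1]")
    rng = random.Random(seed)
    ids = list(range(num_sequences))
    rng.shuffle(ids)
    num_labeled = round(labeled_fraction * num_sequences)
    return sorted(ids[:num_labeled]), sorted(ids[num_labeled:])

# Sweep a few hypothetical label fractions over the same 1000 sequences.
for fraction in (0.01, 0.02, 0.05, 0.1):
    labeled, unlabeled = split_labeled(1000, fraction)
    print(f"fraction={fraction}: {len(labeled)} labeled, {len(unlabeled)} unlabeled")
```

Fixing the seed keeps the shuffled order identical across fractions, so each larger labeled set contains the smaller ones and the sweep isolates the effect of adding labels.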

Load-bearing premise

The limited amount of labelled pairwise measurement-and-state data supplies sufficient regularization to make the unsupervised component solve the under-determined BSCM inverse problem for model-free processes.

What would settle it

If SemiDANSE state estimation errors remain substantially larger than those of the unscented Kalman filter on the same benchmark chaotic systems and measurement setups, the claim of competitive performance from the added regularization would be falsified.
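
The figure captions report errors as NMSE in dB. A common way to compute that quantity for one trajectory is sketched below; the exact normalization and averaging used in the paper are not given on this page, so the convention here is an assumption.

```python
import math

def nmse_db(true_states, estimated_states):
    """Normalized mean squared error in dB for one trajectory.

    Assumed convention: 10 * log10(sum ||x - x_hat||^2 / sum ||x||^2),
    with both sums taken over all time steps.
    """
    if len(true_states) != len(estimated_states):
        raise ValueError("trajectories must have the same length")
    err = 0.0
    ref = 0.0
    for x, x_hat in zip(true_states, estimated_states):
        err += sum((a - b) ** 2 for a, b in zip(x, x_hat))
        ref += sum(a ** 2 for a in x)
    if ref == 0.0:
        raise ValueError("true trajectory has zero energy")
    if err == 0.0:
        return float("-inf")  # perfect estimate
    return 10.0 * math.log10(err / ref)

# Hypothetical one-step example: error energy is 1% of the state energy.
print(nmse_db([[3.0, 4.0]], [[3.0, 4.5]]))
```

Lower values are better: an estimator whose NMSE sits several dB above that of the unscented Kalman filter on the same test trajectories would count against the competitiveness claim.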

Figures

Figures reproduced from arXiv: 2407.07368 by Anubhab Ghosh, Saikat Chatterjee, Yonina C. Eldar.

Figure 1: Schematic of the parameterization of the Gaussian prior
Figure 2: Schematic of SemiDANSE at the t’th time instant. The dotted lines represent information flow specifically during the learning phase, i.e. calculation of losses and gradients for RNN learning, D_semi represents the training dataset for SemiDANSE as defined in (13), L_s and L_u denote the supervised and the unsupervised loss respectively at the t’th instant as defined in (22) and (23) respectively. The dash-dot…
Figure 3: Visualization of three chaotic dynamical systems:
Figure 5: The average NMSE (in dB) on D_test versus SMNR (in dB) performances to illustrate the success of SemiDANSE for the BSCM problem setup described in Section IV-D, with σ_e^2 corresponding to −10 dB. SemiDANSE (κ = 0.02) is compared with the model-driven EKF and UKF, the hybrid KalmanNet [22] and the unsupervised learning-based DANSE. κ = 0.02 to keep the comparison consistent. We observe that SemiDANSE is com…
Figure 6: Demonstrating failure of DANSE and success of Semi
Figure 7: Time-wise plot of a (true) state trajectory
Figure 8: NMSE (in dB) vs. SMNR (in dB) for SemiDANSE on
Figure 10: Demonstrating failure of DANSE and success of
Figure 11: Additional example of the failure of DANSE and
Original abstract

We consider data-driven Bayesian state estimation from compressed measurements (BSCM) of a model-free process. The dimension of the temporal measurement vector is lower than that of the temporal state vector to be estimated, leading to an under-determined inverse problem. The underlying dynamical model of the state's evolution is unknown for a "model-free process." Hence, it is difficult to use traditional model-driven methods, for example, Kalman and particle filters. Instead, we consider data-driven methods. We experimentally show that two existing unsupervised learning-based data-driven methods fail to address the BSCM problem in a model-free process. The two methods are data-driven nonlinear state estimation (DANSE) and the deep Markov model (DMM). While DANSE provides good predictive/forecasting performance to model the temporal measurement data as a time series, its unsupervised learning lacks suitable regularization for tackling the BSCM task. We then propose a semi-supervised learning approach and develop a semi-supervised learning-based DANSE method, referred to as SemiDANSE. In SemiDANSE, we use a large amount of unlabelled data along with a limited amount of labelled data, i.e., pairwise measurement-and-state data, which provides the desired regularization. Using benchmark chaotic dynamical systems, we empirically show that the data-driven SemiDANSE provides competitive state estimation performance for BSCM using a handful of different measurement systems, against a hybrid method called KalmanNet and two model-driven methods (extended Kalman filter and unscented Kalman filter) that know the dynamical models exactly.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper addresses Bayesian state estimation from compressed measurements (BSCM) for model-free dynamical processes where the state dimension exceeds the measurement dimension. It argues that existing unsupervised data-driven methods (DANSE and DMM) fail due to insufficient regularization, proposes SemiDANSE which augments unsupervised training with limited labelled (measurement, state) pairs, and empirically demonstrates that SemiDANSE achieves competitive estimation performance against the hybrid KalmanNet and the model-aware EKF/UKF on benchmark chaotic systems across multiple measurement operators.

Significance. If the empirical claims are substantiated with proper statistical controls, the work would offer a practical semi-supervised route to model-free state estimation under compression, a setting where purely unsupervised methods are known to be under-regularized and model-based filters are inapplicable. The use of multiple measurement systems and direct comparison to exact-model baselines is a positive feature.

major comments (3)
  1. [Abstract / Experiments] Abstract and Experiments section: the central claim that 'unsupervised learning lacks suitable regularization' for BSCM rests on the assertion that DANSE and DMM 'fail' while SemiDANSE succeeds, yet no quantitative characterization is given of the labelled-data fraction or compression ratio at which the transition occurs. This quantification is load-bearing for the regularization premise.
  2. [Experiments] Experiments section: the reported comparisons to EKF, UKF and KalmanNet omit error bars, exact train/validation/test splits, hyperparameter-search protocol, and any statistical significance testing. Without these, it is impossible to determine whether SemiDANSE is statistically competitive or whether the results reflect post-hoc measurement-system choices.
  3. [SemiDANSE formulation / Experiments] § on SemiDANSE formulation: the paper does not provide an ablation or sensitivity analysis showing how performance scales with the number of labelled pairs; the weakest assumption—that a small labelled set suffices to regularize the under-determined inverse problem—therefore remains untested at the level required to support the competitiveness claim.
minor comments (1)
  1. [Notation / Experiments] Notation for the measurement operator and compression ratio should be introduced once and used consistently; currently the abstract refers to 'a handful of different measurement systems' without a table summarizing the operators and their dimensions.

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for the thorough review. We address the major comments below and will revise the manuscript accordingly to strengthen the experimental validation.

Point-by-point responses
  1. Referee: [Abstract / Experiments] Abstract and Experiments section: the central claim that 'unsupervised learning lacks suitable regularization' for BSCM rests on the assertion that DANSE and DMM 'fail' while SemiDANSE succeeds, yet no quantitative characterization is given of the labelled-data fraction or compression ratio at which the transition occurs. This quantification is load-bearing for the regularization premise.

    Authors: We agree that quantifying the transition would provide stronger support for the regularization premise. In the revised version, we will include additional experiments that vary the fraction of labeled data and report the performance of DANSE, DMM, and SemiDANSE across a range of labeled fractions and compression ratios to characterize the point at which unsupervised methods begin to fail.
    Revision: yes

  2. Referee: [Experiments] Experiments section: the reported comparisons to EKF, UKF and KalmanNet omit error bars, exact train/validation/test splits, hyperparameter-search protocol, and any statistical significance testing. Without these, it is impossible to determine whether SemiDANSE is statistically competitive or whether the results reflect post-hoc measurement-system choices.

    Authors: We acknowledge the omission of these details. In the revision, we will report error bars from multiple runs, specify the exact train/validation/test splits used, describe the hyperparameter search protocol, and include statistical significance testing (e.g., paired t-tests) to substantiate the competitiveness claims. We will also clarify that the measurement systems were chosen based on standard benchmarks rather than post-hoc selection.
    Revision: yes

  3. Referee: [SemiDANSE formulation / Experiments] § on SemiDANSE formulation: the paper does not provide an ablation or sensitivity analysis showing how performance scales with the number of labelled pairs; the weakest assumption—that a small labelled set suffices to regularize the under-determined inverse problem—therefore remains untested at the level required to support the competitiveness claim.

    Authors: We agree that an ablation study on the number of labeled pairs is important to validate the assumption that a small labeled set suffices. We will add a sensitivity analysis in the revised manuscript showing how the estimation performance of SemiDANSE scales with the number of labeled pairs (e.g., from 10 to 1000 pairs) for the benchmark systems.
    Revision: yes
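
The paired t-test mentioned in the second response could be computed roughly as follows. This is a sketch using only the Python standard library; the function name and the per-run error values are hypothetical placeholders, not results from the paper, and turning the statistic into a p-value would additionally need a Student-t distribution (for example from a statistics package).

```python
import math
import statistics

def paired_t_statistic(errors_a, errors_b):
    """Return (t statistic, degrees of freedom) for paired samples.

    Each pair holds the errors of two methods on the same test run,
    so the test is applied to the per-run differences.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("samples must be paired one-to-one")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    std_diff = statistics.stdev(diffs)  # sample standard deviation
    if std_diff == 0.0:
        raise ValueError("differences have zero variance")
    return mean_diff / (std_diff / math.sqrt(n)), n - 1

# Hypothetical NMSE values (dB) for two methods on the same five runs.
method_a = [-10.2, -9.8, -10.5, -10.1, -9.9]
method_b = [-9.9, -9.9, -10.0, -9.7, -9.6]
t_stat, dof = paired_t_statistic(method_a, method_b)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

Pairing by run matters here because every method is evaluated on the same test trajectories, so run-to-run variation cancels in the differences.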

Circularity Check

0 steps flagged

No circularity; empirical claims rest on external benchmarks

The paper advances a semi-supervised extension (SemiDANSE) of an existing unsupervised method and validates it solely through numerical experiments on standard chaotic dynamical systems, comparing against EKF, UKF (model-aware) and KalmanNet. No derivation, uniqueness theorem, or fitted-parameter prediction is offered; the central claim is that limited labelled pairs suffice for regularization, which is tested rather than assumed by construction. All load-bearing steps are external comparisons, so no step depends on the claim it is meant to support.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach assumes that a modest quantity of state-labeled pairs can regularize an otherwise unsupervised time-series model for an under-determined inverse problem; the neural network architecture and training hyperparameters are free parameters whose values are not reported in the abstract.

free parameters (1)
  • neural network architecture and training hyperparameters
    Standard in data-driven methods; values chosen to achieve reported performance but not specified.
axioms (1)
  • domain assumption: Limited labeled pairs provide the desired regularization for the BSCM task
    Invoked when stating that unsupervised DANSE fails without this addition.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. pDANSE: Particle-based Data-driven Nonlinear State Estimation from Nonlinear Measurements

    eess.SP · 2025-10 · unverdicted · novelty 6.0

    pDANSE enables nonlinear state estimation for model-free processes by using RNN-parameterized Gaussian priors and reparameterization-based particle sampling to compute posterior second-order statistics from nonlinear ...

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages · cited by 1 Pith paper · 1 internal anchor

  [1] S. Särkkä and L. Svensson, Bayesian filtering and smoothing, vol. 17, Cambridge university press, 2023.
  [2] D. Simon, Optimal state estimation: Kalman, H∞, and nonlinear approaches, John Wiley & Sons, 2006.
  [3] T. D. Barfoot, State estimation for robotics, Cambridge University Press, 2024.
  [4] R. E. Kalman, “New results in linear filtering and prediction theory,” J. Basic Eng., vol. 83, pp. 95–108, 1961.
  [5] R. E. Kalman, “A new approach to linear filtering and prediction problems,” Trans. ASME, D, vol. 82, pp. 35–44, 1960.
  [6] M. Gruber, “An approach to target tracking,” Tech. Rep., MIT Lexington Lincoln Lab, 1967.
  [7] S. J. Julier and J. K. Uhlmann, “Unscented filtering and nonlinear estimation,” Proceedings of the IEEE, vol. 92, no. 3, pp. 401–422, 2004.
  [8] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, “A tutorial on particle filters for online nonlinear/non-gaussian bayesian tracking,” IEEE Transactions on Signal Processing, vol. 50, no. 2, pp. 174–188, 2002.
  [9] J. Ko and D. Fox, “GP-BayesFilters: Bayesian filtering using Gaussian process prediction and observation models,” Autonomous Robots, vol. 27, pp. 75–90, 2009.
  [10] R. Frigola, F. Lindsten, T. B. Schön, and C. E. Rasmussen, “Bayesian inference and learning in gaussian process state-space models with particle mcmc,” Advances in Neural Information Processing Systems, vol. 26, 2013.
  [11] A. Svensson, A. Solin, S. Särkkä, and T. B. Schön, “Computationally efficient bayesian learning of gaussian process state space models,” in Artificial Intelligence and Statistics (AISTATS). PMLR, 2016, pp. 213–221.
  [12] L. Xu and R. Niu, “EKFNet: Learning system noise statistics from measurement data,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021, pp. 4560–4564.
  [13] V. Garcia Satorras, Z. Akata, and M. Welling, “Combining generative and discriminative models for hybrid inference,” Advances in Neural Information Processing Systems, vol. 32, 2019.
  [14] S. Jung, I. Schlangen, and A. Charlish, “A mnemonic Kalman filter for non-linear systems with extensive temporal dependencies,” IEEE Signal Processing Letters, vol. 27, pp. 1005–1009, 2020.
  [15] H. Wen, X. Chen, G. Papagiannis, C. Hu, and Y. Li, “End-to-end semi-supervised learning for differentiable particle filters,” in 2021 IEEE International Conference on Robotics and Automation (ICRA), 2021, pp. 5825–5831.
  [16] L. Girin, S. Leglaive, X. Bie, J. Diard, T. Hueber, and X. Alameda-Pineda, “Dynamical variational autoencoders: A comprehensive review,” Foundations and Trends in Machine Learning, vol. 15, no. 1-2, pp. 1–175, 2021.
  [17] M. Fraccaro, S. Kamronn, U. Paquet, and O. Winther, “A disentangled recognition and nonlinear dynamics model for unsupervised learning,” Advances in Neural Information Processing Systems, vol. 30, 2017.
  [18] R. G. Krishnan, U. Shalit, and D. Sontag, “Deep Kalman filters,” arXiv preprint arXiv:1511.05121, 2015.
  [19] R. Krishnan, U. Shalit, and D. Sontag, “Structured inference networks for nonlinear state space models,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2017, vol. 31.
  [20] A. Ghosh, A. Honoré, and S. Chatterjee, “DANSE: Data-Driven Non-Linear State Estimation of Model-Free Process in Unsupervised Bayesian Setup,” in 2023 31st European Signal Processing Conference (EUSIPCO), 2023, pp. 870–874.
  [21] A. Ghosh, A. Honoré, and S. Chatterjee, “DANSE: Data-Driven Non-Linear State Estimation of Model-Free Process in Unsupervised Learning Setup,” IEEE Transactions on Signal Processing, vol. 72, pp. 1824–1838, 2024.
  [22] G. Revach, N. Shlezinger, T. Locher, X. Ni, R. J. G. van Sloun, and Y. C. Eldar, “Unsupervised learned Kalman filtering,” in 2022 30th European Signal Processing Conference (EUSIPCO). IEEE, 2022, pp. 1571–1575.
  [23] G. Revach, N. Shlezinger, X. Ni, A. L. Escoriza, R. J. G. Van Sloun, and Y. C. Eldar, “KalmanNet: Neural network aided Kalman filtering for partially known dynamics,” IEEE Transactions on Signal Processing, vol. 70, pp. 1532–1547, 2022.
  [24] X. Ni, G. Revach, and N. Shlezinger, “Adaptive Kalmannet: Data-Driven Kalman Filter with Fast Adaptation,” in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024, pp. 5970–5974.
  [25] G. Choi, J. Park, N. Shlezinger, Y. C. Eldar, and N. Lee, “Split-KalmanNet: A Robust Model-Based Deep Learning Approach for State Estimation,” IEEE Transactions on Vehicular Technology, vol. 72, no. 9, pp. 12326–12331, Sept. 2023.
  [26] S. Chen, Y. Zheng, D. Lin, P. Cai, Y. Xiao, and S. Wang, “MAML-KalmanNet: A Neural Network-Assisted Kalman Filter Based on Model-Agnostic Meta-Learning,” IEEE Transactions on Signal Processing, 2025.
  [27] T. Li, Y. Song, and H. Fan, “From target tracking to targeting track: A data-driven yet analytical approach to joint target detection and tracking,” Signal Processing, vol. 205, pp. 108883, 2023.
  [28] O. Chapelle, B. Schölkopf, and A. Zien, Semi-Supervised Learning, The MIT Press, 1st edition, 2010.
  [29] X. Zhu and A. B. Goldberg, “Introduction to semi-supervised learning,” Synthesis Lectures on Artificial Intelligence and Machine Learning, 2009.
  [30] X. Yang, Z. Song, I. King, and Z. Xu, “A survey on deep semi-supervised learning,” IEEE Transactions on Knowledge and Data Engineering, 2022.
  [31] J. E. Van Engelen and H. H. Hoos, “A survey on semi-supervised learning,” Machine learning, vol. 109, no. 2, pp. 373–440, 2020.
  [32] G. Kostopoulos, S. Karlos, S. Kotsiantis, and O. Ragos, “Semi-supervised regression: A recent review,” Journal of Intelligent & Fuzzy Systems, vol. 35, no. 2, pp. 1483–1500, 2018.
  [33] E. N. Lorenz, “Deterministic nonperiodic flow,” Journal of atmospheric sciences, vol. 20, no. 2, pp. 130–141, 1963.
  [34] S. Čelikovský and G. Chen, “On the generalized lorenz canonical form,” Chaos, Solitons & Fractals, vol. 26, no. 5, pp. 1271–1276, 2005.
  [35] G. Chen and T. Ueta, “Yet another chaotic attractor,” International Journal of Bifurcation and chaos, vol. 9, no. 07, pp. 1465–1466, 1999.
  [36] O. E. Rössler, “An equation for continuous chaos,” Physics Letters A, vol. 57, no. 5, pp. 397–398, 1976.
  [37] K. Cho, B. van Merriënboer, D. Bahdanau, and Y. Bengio, “On the properties of neural machine translation: Encoder–decoder approaches,” in Proceedings of 8th Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8), 2014, pp. 103–111.
  [38] C. M. Bishop and N. M. Nasrabadi, Pattern recognition and machine learning, vol. 4, Springer, 2006.
  [39] J. C. Sprott, Chaos and time-series analysis, Oxford university press, 2003.
  [40] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997.
  [41] A. Paszke et al., “PyTorch: An imperative style, high-performance deep learning library,” Advances in Neural Information Processing Systems, vol. 32, 2019.
  [42] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in 3rd International Conference on Learning Representations (ICLR), 2015.
  [43] R. Labbe, “FilterPy - Kalman and Bayesian filters in Python,” URL: https://filterpy.readthedocs.io/en/latest/, 2018.
  [44] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning, MIT press, 2016.
  [45] P. J. Werbos, “Backpropagation through time: what it does and how to do it,” Proceedings of the IEEE, vol. 78, no. 10, pp. 1550–1560, 1990.
  [46] D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” in 2nd International Conference on Learning Representations (ICLR), Yoshua Bengio and Yann LeCun, Eds., 2014.