BAMIFun: Bayesian Multiple Imputation for Functional Data
Pith reviewed 2026-05-11 02:39 UTC · model grok-4.3
The pith
Bayesian multiple imputation for functional data provides accurate reconstructions and reliable uncertainty estimates by drawing from posterior distributions rather than single point estimates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BAMIFun imposes a Bayesian low-rank model that incorporates penalized spline representations to enforce smoothness of eigenfunctions and derives an efficient Gibbs sampler algorithm for posterior computation. For single-level functional data this enables multiple imputations that properly account for estimation uncertainties in downstream analysis. The framework extends to multiway functional data using a low-rank Functional Tensor Singular Value Decomposition model. Simulation studies show that compared to existing methods BAMIFun achieves accurate imputation while providing substantially improved coverage and more reliable downstream inference.
What carries the argument
Bayesian low-rank model with penalized spline representations for eigenfunctions and Gibbs sampling for posterior draws, extended via low-rank Functional Tensor Singular Value Decomposition for multiway data.
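For orientation, one generic way to write a model of this kind is sketched below. The notation ($Y_i$, $\mu$, $\xi_{ik}$, $\phi_k$, $B$, $\theta_k$, $P$, $\tau_k$, $\sigma^2$) is assumed here for illustration and is not taken from the paper, whose exact likelihood and priors may differ.

```latex
\begin{aligned}
Y_i(t_{ij}) &= \mu(t_{ij}) + \sum_{k=1}^{K} \xi_{ik}\,\phi_k(t_{ij}) + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2), \\
\phi_k(t) &= B(t)^\top \theta_k,
  \qquad p(\theta_k \mid \tau_k) \propto \exp\!\Big(-\tfrac{\tau_k}{2}\,\theta_k^\top P\,\theta_k\Big).
\end{aligned}
```

In this sketch $K$ plays the role of the low-rank dimension, $B(t)$ is a spline basis, and $P$ is a roughness penalty matrix whose weight $\tau_k$ controls the smoothness of the $k$-th eigenfunction. Under such a model, each retained Gibbs iteration would yield one draw of the unobserved $Y_i(t)$ from its posterior predictive distribution, and a subset of those draws would serve as the multiple imputations.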
If this is right
- Multiple imputations drawn from the posterior replace single point estimates and thereby avoid overconfident downstream inferences.
- Coverage probabilities for parameters in subsequent analyses improve relative to existing single-imputation functional principal component methods.
- The same framework applies directly to multiway functional data where no prior multiple-imputation methods existed.
- Case studies on physical activity trajectories and infant gut microbiome data confirm practical advantages when missingness is severe.
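The first two points can be made concrete with Rubin's combining rules, the standard way to pool analyses across imputed datasets (the rebuttal further down cites Rubin 1987). The following is a minimal sketch for a scalar parameter; the function name `pool_rubin` and its interface are hypothetical, and the paper's own downstream procedure may differ.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool M completed-data analyses with Rubin's combining rules.

    estimates: point estimate of one scalar parameter from each imputed dataset
    variances: squared standard error of that estimate from each imputed dataset
    Returns (pooled_estimate, total_variance, degrees_of_freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    inflated = (1 + 1 / m) * between
    total = within + inflated
    # Rubin's reference degrees of freedom; infinite if all draws agree exactly.
    df = math.inf if inflated == 0 else (m - 1) * (1 + within / inflated) ** 2
    return q_bar, total, df


# Illustrative numbers only: three imputations of one regression coefficient.
pooled, total_var, df = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

A single imputation amounts to reporting only the within-imputation variance. The between-imputation term is what carries the imputation uncertainty, so dropping it narrows the intervals and lowers coverage.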
Where Pith is reading between the lines
- If the penalized spline low-rank representation remains adequate for other classes of smooth trajectories, the method could be applied to longitudinal biomedical studies that record irregular time courses.
- Efficiency of the Gibbs sampler may allow scaling to larger numbers of subjects or denser grids once computational cost is profiled.
- The multiway extension suggests a route to handle tensor-valued functional observations with missing entries in fields such as neuroimaging.
Load-bearing premise
The low-rank structure with penalized splines sufficiently captures the true underlying functional variation and the Bayesian model correctly represents the posterior uncertainty under the chosen priors and sampling scheme.
What would settle it
In a simulation study where true functional curves are generated from a model whose effective rank exceeds the low-rank assumption, the empirical coverage of 95 percent intervals constructed from BAMIFun imputations falls substantially below the nominal level.
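Such a check reduces to counting how often the intervals contain the truth and asking whether the shortfall exceeds Monte Carlo noise. A minimal sketch follows; both helper names are hypothetical, and the intervals are assumed to come from independent simulation replications.

```python
import math


def empirical_coverage(lower, upper, truth):
    """Fraction of intervals [lower_r, upper_r] that contain truth_r."""
    if not truth or not (len(lower) == len(upper) == len(truth)):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(lo <= t <= hi for lo, hi, t in zip(lower, upper, truth))
    return hits / len(truth)


def coverage_z_score(coverage, n_reps, nominal=0.95):
    """Standardized gap between observed and nominal coverage.

    Uses the binomial standard error at the nominal level, so a strongly
    negative value indicates under-coverage beyond Monte Carlo error.
    """
    se = math.sqrt(nominal * (1 - nominal) / n_reps)
    return (coverage - nominal) / se


# Illustrative numbers only: four replications, two intervals cover the truth.
cov = empirical_coverage([0, 0, 0, 0], [1, 1, 1, 1], [0.5, 1.5, 1.0, -0.1])
```

With 1000 replications, for instance, an observed coverage of 0.90 sits about seven standard errors below 0.95, which would count as a substantial shortfall rather than simulation noise.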
Original abstract
Missing data are pervasive in modern functional datasets, where trajectories are often sparsely or irregularly observed. Although Functional Principal Component Analysis (FPCA) is widely used to reconstruct incomplete curves, existing FPCA-based approaches typically employ single imputation, leading to overly optimistic inferences in downstream analyses. To address these challenges, we develop a novel Bayesian multiple imputation framework for functional data (BAMIFun). For single-level functional data, we impose a Bayesian low-rank model that incorporates penalized spline representations to enforce smoothness of eigenfunctions and derive an efficient Gibbs sampler algorithm for posterior computation. In addition, we demonstrate and validate how to properly account for the estimation uncertainties in downstream analysis. Furthermore, we extend the framework to multiway functional data using a low-rank Functional Tensor Singular Value Decomposition (FTSVD) model, enabling Bayesian multiple imputation in settings not supported by existing methods. Simulation studies show that, compared to existing methods, BAMIFun achieves accurate imputation while providing substantially improved coverage and more reliable downstream inference. Case studies using a physical activity dataset and an infant gut microbiome dataset further demonstrate the practical advantages of our proposed methods under severe missingness. Code for our algorithms is available at https://github.com/ZirenJiang/BAMIFun.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces BAMIFun, a Bayesian multiple imputation framework for functional data with missingness. For single-level data it specifies a low-rank model using penalized splines for smooth eigenfunctions together with a Gibbs sampler; the approach is extended to multiway data via a low-rank Functional Tensor Singular Value Decomposition (FTSVD) model. The central claims are that the method yields accurate imputations, substantially improved coverage relative to existing single-imputation FPCA methods, and more reliable downstream inference, supported by simulation studies and two case studies on physical activity and infant gut microbiome data.
Significance. If the low-rank penalized-spline and FTSVD assumptions hold for the target data, the framework supplies a coherent mechanism for propagating imputation uncertainty into downstream functional-data analyses, addressing a recognized limitation of single-imputation approaches. The public release of code is a positive feature for reproducibility.
Major comments (3)
- [§4, Simulation studies] Data are generated from the same low-rank penalized-spline model that BAMIFun assumes; the reported coverage gains and downstream-inference improvements are therefore conditional on correct model specification and do not address performance under higher-rank or non-smooth trajectories, which is load-bearing for the claim of 'substantially improved coverage'.
- [§3.2, Downstream inference] The procedure for combining multiple imputations with a subsequent analysis model is outlined but lacks an explicit derivation or theorem establishing frequentist coverage or Bayesian calibration of the resulting intervals; without this, the assertion of 'more reliable downstream inference' rests only on empirical simulation results.
- [§3.3, Multiway extension] The FTSVD low-rank model is introduced without reported sensitivity checks on the chosen tensor rank or spline penalty parameters; because these are free parameters in the model, their misspecification directly affects the posterior uncertainty used for imputation.
Minor comments (2)
- [Abstract and §3.2] The abstract states that the method 'properly account[s] for the estimation uncertainties'; the corresponding section should include a short algorithmic box or pseudocode showing the exact steps for propagating the imputed draws into a generic downstream estimator.
- [Tables in §4] Table captions for simulation results should report the number of Monte Carlo replications and the exact missingness mechanisms used.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive comments. We address each major point below, indicating planned revisions where appropriate.
- Referee: §4 (Simulation studies): Data are generated from the same low-rank penalized-spline model that BAMIFun assumes; the reported coverage gains and downstream-inference improvements are therefore conditional on correct model specification and do not address performance under higher-rank or non-smooth trajectories, which is load-bearing for the claim of 'substantially improved coverage'.
  Authors: We agree that the primary simulations are conducted under the assumed model. To evaluate robustness, we will add new simulation scenarios in the revised manuscript that generate data from higher-rank models and non-smooth trajectories, reporting imputation accuracy and coverage under these misspecification settings.
  Revision: yes
- Referee: §3.2 (Downstream inference): The procedure for combining multiple imputations with a subsequent analysis model is outlined but lacks an explicit derivation or theorem establishing frequentist coverage or Bayesian calibration of the resulting intervals; without this, the assertion of 'more reliable downstream inference' rests only on empirical simulation results.
  Authors: We will revise Section 3.2 to include a clearer step-by-step derivation of the combining rules based on standard Bayesian multiple imputation theory (Rubin 1987), along with references to existing results on posterior calibration. A new general theorem on frequentist coverage is outside the scope of the current work, but the expanded explanation and simulation evidence will better support the downstream inference claims.
  Revision: partial
- Referee: §3.3 (Multiway extension): The FTSVD low-rank model is introduced without reported sensitivity checks on the chosen tensor rank or spline penalty parameters; because these are free parameters in the model, their misspecification directly affects the posterior uncertainty used for imputation.
  Authors: We will incorporate sensitivity analyses for the multiway extension, varying the tensor rank and penalty parameters in both the simulation studies and the infant gut microbiome case study, and report their impact on imputation uncertainty and downstream results.
  Revision: yes
Circularity Check
Bayesian hierarchical model for functional imputation is self-contained
The paper defines a standard Bayesian low-rank penalized-spline model for single-level functional data and an FTSVD extension for multiway data, then derives a Gibbs sampler for posterior sampling and multiple imputation. Simulation studies evaluate coverage and imputation accuracy under the model's own generative assumptions, which is standard practice and does not constitute a reduction of any claimed result to its inputs by construction. No load-bearing step relies on self-citation of an unverified uniqueness theorem or ansatz; the framework applies established Bayesian hierarchical modeling techniques to functional data without tautological equivalences.
Axiom & Free-Parameter Ledger
free parameters (2)
- low-rank dimension
- spline penalty parameters
axioms (2)
- domain assumption: Functional trajectories admit a low-rank representation with smooth eigenfunctions.
- standard math: The Gibbs sampler produces draws from the target posterior distribution.