Linked-Tucker Factorized Individualized Regression for Paired Multivariate Categorical Outcomes
Pith reviewed 2026-05-12 02:06 UTC · model grok-4.3
The pith
A linked Tucker factorization enables joint hurdle-ordinal modeling of paired zero-inflated dental outcomes with subject-specific spatial effects.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The linked Tucker factorization decomposes the coefficient tensors for the paired outcomes. Shared subject-mode factors capture dependence between caries and fluorosis, while separate spatial factors handle the tooth-surface and tooth-zone grids. The result is a parsimonious representation of subject-specific, spatially varying, and time-varying effects that supports posterior inference on how covariates influence presence versus severity.
What carries the argument
Linked Tucker tensor factorization, which decomposes the high-dimensional coefficient arrays with shared subject factors to link the two outcomes and separate spatial factors to accommodate distinct measurement grids.
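The linking idea can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the paper's parameterization: each coefficient tensor is taken to have three modes (subject, location, covariate), and every size, rank, and name below (`U_subject`, `V_surface`, `V_zone`, `W_caries`, `W_fluorosis`, `G_caries`, `G_fluorosis`) is hypothetical. The actual model may include further modes such as age.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: subjects, tooth surfaces (caries grid),
# tooth zones (fluorosis grid), covariates.
n_subj, n_surf, n_zone, n_cov = 50, 12, 8, 4
# Hypothetical Tucker ranks for the subject, spatial, and covariate modes.
r_subj, r_sp, r_cov = 3, 2, 2

# Shared subject-mode factor matrix: this is what links the two outcomes.
U_subject = rng.normal(size=(n_subj, r_subj))

# Outcome-specific spatial factors on the two distinct measurement grids.
V_surface = rng.normal(size=(n_surf, r_sp))
V_zone = rng.normal(size=(n_zone, r_sp))

# Outcome-specific covariate factors and core tensors.
W_caries = rng.normal(size=(n_cov, r_cov))
W_fluorosis = rng.normal(size=(n_cov, r_cov))
G_caries = rng.normal(size=(r_subj, r_sp, r_cov))
G_fluorosis = rng.normal(size=(r_subj, r_sp, r_cov))


def tucker(core, U, V, W):
    """Reconstruct a 3-mode coefficient tensor from a core and factor matrices."""
    return np.einsum("abc,ia,jb,kc->ijk", core, U, V, W)


B_caries = tucker(G_caries, U_subject, V_surface, W_caries)      # (50, 12, 4)
B_fluorosis = tucker(G_fluorosis, U_subject, V_zone, W_fluorosis)  # (50, 8, 4)

# Parameter counts: full coefficient arrays versus the factored form.
n_full = B_caries.size + B_fluorosis.size
n_factored = sum(a.size for a in (U_subject, V_surface, V_zone,
                                  W_caries, W_fluorosis, G_caries, G_fluorosis))
```

Because both reconstructions reuse `U_subject`, a subject's row of that matrix shifts the caries and fluorosis coefficients together, while the two spatial grids keep their own factor matrices and may differ in size.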
If this is right
- Population-level effect summaries are obtained by projecting individualized posterior linear predictors onto the design space.
- Wasserstein barycenters aggregate these summaries across tooth locations and anatomical classes.
- Associations between exposures and outcomes differ between the presence and severity model components.
- Fluoride exposure is associated with increased odds and severity of fluorosis while soda intake consistently increases caries risk.
- These associations vary across tooth locations, ages, and subpopulations defined by prior caries status.
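One plausible reading of "projecting individualized posterior linear predictors onto the design space" is a least-squares projection carried out draw by draw. The sketch below assumes that reading; the design matrix, the simulated coefficient draws, and all variable names are hypothetical stand-ins rather than output from the paper's model.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_cov, n_draws = 200, 3, 500

X = rng.normal(size=(n_obs, n_cov))  # hypothetical design matrix

# Hypothetical posterior draws of observation-specific coefficients:
# a common effect plus individual deviations.
beta_common = np.array([0.5, -1.0, 0.0])
beta_indiv = beta_common + 0.3 * rng.normal(size=(n_draws, n_obs, n_cov))

# Individualized linear predictors: one vector of length n_obs per draw.
eta = np.einsum("ij,dij->di", X, beta_indiv)

# Least-squares projection of each draw's predictor onto the column space of X.
# lstsq solves for all draws at once, one right-hand-side column per draw.
beta_pop, *_ = np.linalg.lstsq(X, eta.T, rcond=None)
beta_pop = beta_pop.T  # shape (n_draws, n_cov)

# Population-level summaries: posterior mean and 95% interval per covariate.
summary_mean = beta_pop.mean(axis=0)
summary_ci = np.percentile(beta_pop, [2.5, 97.5], axis=0)
```

Projecting each draw separately keeps the posterior uncertainty: `beta_pop` is itself a set of posterior draws, so intervals for the population-level effect come directly from its percentiles.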
Where Pith is reading between the lines
- The framework could be applied to other paired ordinal health outcomes where occurrence and progression need to be disentangled while respecting spatial heterogeneity.
- Targeted public-health interventions for dental disease might be designed around the observed location-specific exposure effects rather than uniform guidelines.
- If the proportional-odds assumption is violated in new data, the severity component would need replacement by a more flexible ordinal model while retaining the linked factorization.
- The horseshoe prior on the core tensor could be replaced by other sparsity-inducing priors to test robustness of the identified exposure effects.
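To make the role of the horseshoe prior concrete, the following sketch draws core-tensor elements from a standard horseshoe (half-Cauchy global and local scales) and inspects the implied shrinkage weights. The core dimensions and the unit scale choices are assumptions for illustration; the paper's hyperparameters are not stated here.

```python
import numpy as np

rng = np.random.default_rng(2)
core_shape = (3, 2, 2)  # hypothetical core tensor dimensions
n_draws = 20000

# Half-Cauchy scales: one global scale per draw, one local scale per element.
tau = np.abs(rng.standard_cauchy(size=(n_draws, 1, 1, 1)))
lam = np.abs(rng.standard_cauchy(size=(n_draws, *core_shape)))
z = rng.normal(size=(n_draws, *core_shape))

# Prior draws of the core tensor: each element is N(0, (tau * lam)^2).
core_draws = tau * lam * z

# Shrinkage weight for unit global scale: kappa near 1 means "shrunk to zero",
# kappa near 0 means "left almost untouched".
kappa = 1.0 / (1.0 + lam**2)
tail_mass = np.mean((kappa < 0.1) | (kappa > 0.9))
middle_mass = np.mean((kappa > 0.45) & (kappa < 0.55))
```

With unit scales, `kappa` follows a Beta(1/2, 1/2) distribution, so most of its mass sits near 0 and 1. Swapping in a different sparsity prior for a robustness check would amount to changing how `lam` is drawn.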
Load-bearing premise
The linked Tucker factorization represents the high-dimensional coefficient arrays without substantial information loss, and the proportional-odds assumption holds for the severity components of both outcomes.
What would settle it
Posterior predictive checks on the Iowa Fluoride Study data that show systematic misfit in the severity distributions for either outcome, or a simulation where the true coefficient tensors lack a low-rank linked Tucker structure yet the model recovers biased exposure effects, would indicate the factorization is inadequate.
Original abstract
We propose a joint individualized hurdle-ordinal regression model for paired zero-inflated ordinal outcomes with subject-specific, spatially varying, and time-varying covariate effects, motivated by the Iowa Fluoride Study (IFS). The two outcomes, dental caries and dental fluorosis, are measured repeatedly across ages at fine spatial resolution, yielding nested longitudinal data with substantial zero inflation, ordinality, and heterogeneity across individuals and locations. For each outcome, a hurdle component models disease presence, while a proportional-odds component models severity among positive observations. To parsimoniously represent the high-dimensional coefficient arrays, we introduce a linked Tucker tensor factorization. Shared subject-mode factors induce dependence between the caries and fluorosis coefficient tensors, while separate spatial factors accommodate the distinct measurement grids of tooth surfaces and tooth zones. A horseshoe prior on the core tensor elements encourages sparsity, and posterior computation is performed using the No-U-Turn Sampler in NumPyro. Population-level effect summaries are obtained by projecting individualized posterior linear predictors onto the design space, and Wasserstein barycenters aggregate these summaries across tooth locations and anatomical classes. Applied to the IFS, the model reveals spatially heterogeneous associations between early-life fluoride and dietary exposures and both outcomes. Fluoride exposure is associated with increased odds and severity of fluorosis, while soda intake consistently increases caries risk. These associations differ between presence and severity components and vary across tooth locations, ages, and subpopulations defined by prior caries status, highlighting the importance of the joint hurdle-ordinal framework for disentangling disease occurrence from disease progression in multilevel dental data.
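The hurdle-ordinal structure described in the abstract can be illustrated for a single observation. The sketch assumes logit links for both components and the common cumulative-logit form P(Y <= k | Y > 0) = sigmoid(c_k - eta); the function name, argument names, and numeric values are hypothetical, and the paper's exact link and sign conventions may differ.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def hurdle_ordinal_probs(eta_presence, eta_severity, cutpoints):
    """Return [P(Y=0), P(Y=1), ..., P(Y=K)] for one observation.

    eta_presence: linear predictor of the hurdle (presence) component.
    eta_severity: linear predictor of the proportional-odds component.
    cutpoints: increasing sequence of length K - 1 for K positive levels.
    """
    p_present = sigmoid(eta_presence)
    # Cumulative P(Y <= k | Y > 0), padded with 0 and 1 at the ends.
    cum = np.concatenate(
        ([0.0], sigmoid(np.asarray(cutpoints, dtype=float) - eta_severity), [1.0])
    )
    p_severity = np.diff(cum)  # P(Y = k | Y > 0) for k = 1..K
    return np.concatenate(([1.0 - p_present], p_present * p_severity))


probs = hurdle_ordinal_probs(eta_presence=-1.0, eta_severity=0.5,
                             cutpoints=[-0.5, 1.0])
probs_shifted = hurdle_ordinal_probs(eta_presence=-1.0, eta_severity=2.0,
                                     cutpoints=[-0.5, 1.0])
```

Changing `eta_severity` redistributes mass among the positive categories but leaves P(Y = 0) untouched, which is the sense in which the two components separate occurrence from progression.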
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a joint individualized hurdle-ordinal regression model for paired zero-inflated ordinal outcomes (dental caries and fluorosis) with subject-specific, spatially varying, and time-varying covariate effects. A linked Tucker tensor factorization is introduced to parsimoniously represent the high-dimensional coefficient arrays, with shared subject-mode factors, separate spatial factors, a horseshoe prior on the core tensor, and NUTS sampling in NumPyro. Population-level summaries are obtained via projection of posterior linear predictors and Wasserstein barycenters. The model is applied to the Iowa Fluoride Study (IFS) data, claiming to reveal spatially heterogeneous associations: fluoride exposure increases odds and severity of fluorosis, while soda intake increases caries risk, with differences between presence/severity components and across locations, ages, and subpopulations.
Significance. If the linked Tucker factorization faithfully represents the coefficient tensors with negligible loss and the proportional-odds assumption holds, the framework offers a novel, parsimonious approach to joint modeling of paired multivariate categorical outcomes with individualized and spatial heterogeneity. The combination of hurdle modeling for zero-inflation, tensor factorization for dimensionality reduction, and Wasserstein aggregation for summaries represents a technical advance with potential applicability to other multilevel longitudinal categorical data settings in epidemiology and beyond. The explicit joint treatment of presence and severity components is a clear strength.
major comments (2)
- [Abstract and model specification] The central claim of recovering spatially heterogeneous associations (Abstract) rests on the linked Tucker factorization preserving individualized and spatial structure in the coefficient arrays without substantial loss. No simulation studies or recovery metrics are described to verify that the shared subject factors plus separate spatial factors plus sparse core recover known heterogeneous effects; if the true rank or dependence structure is misaligned, the posterior linear predictors and barycenter summaries could attenuate or artifactually induce the reported heterogeneity.
- [Model formulation] The proportional-odds assumption in the severity components for both outcomes is invoked without reported diagnostics (e.g., score tests or posterior predictive checks for cumulative logit fit). Violation would undermine the separation of presence versus severity effects that is central to the joint hurdle-ordinal claim and the IFS interpretation.
minor comments (3)
- [Model specification] Notation for the linked Tucker factorization (shared subject factors, separate spatial factors, core tensor) should be introduced with explicit dimension indices and a diagram to clarify how the linking induces dependence between the two outcome tensors.
- [Posterior summaries] The description of Wasserstein barycenter aggregation across tooth locations and anatomical classes would benefit from a brief algorithmic outline or reference to the specific implementation used.
- [Discussion] The manuscript would be strengthened by explicit comparison to simpler alternatives (e.g., separate univariate models or non-tensorized multivariate regression) to quantify the gain from the linked factorization.
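As an indication of what the algorithmic outline requested in the second minor comment might contain, here is a sketch of a Wasserstein barycenter for scalar posterior summaries. It assumes one-dimensional summaries with equal numbers of draws per location, in which case the 2-Wasserstein barycenter is obtained by averaging quantile functions. The function name and the simulated draws are hypothetical, and the manuscript's actual implementation is not specified.

```python
import numpy as np


def wasserstein_barycenter_1d(samples, weights=None):
    """Barycenter draws for 1-D samples of shape (n_groups, n_draws).

    Sorting each row gives its empirical quantile function; the weighted
    average of those quantile functions is the 2-Wasserstein barycenter.
    """
    samples = np.asarray(samples, dtype=float)
    n_groups = samples.shape[0]
    if weights is None:
        weights = np.full(n_groups, 1.0 / n_groups)
    weights = np.asarray(weights, dtype=float)
    sorted_draws = np.sort(samples, axis=1)
    return weights @ sorted_draws


rng = np.random.default_rng(3)
# Hypothetical posterior draws of one effect at three tooth locations.
draws = np.stack([
    rng.normal(-1.0, 0.5, size=4000),
    rng.normal(0.0, 1.0, size=4000),
    rng.normal(1.0, 1.5, size=4000),
])
bary = wasserstein_barycenter_1d(draws)
pooled = draws.ravel()  # naive pooling, shown only for contrast
```

For Gaussian inputs the barycenter is again Gaussian, with mean and standard deviation equal to the averages of the inputs' means and standard deviations. Pooling the draws instead would inflate the spread by the between-location variation.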
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed review. We address each major comment below, proposing revisions to strengthen the manuscript where appropriate.
Point-by-point responses
- Referee: [Abstract and model specification] The central claim of recovering spatially heterogeneous associations (Abstract) rests on the linked Tucker factorization preserving individualized and spatial structure in the coefficient arrays without substantial loss. No simulation studies or recovery metrics are described to verify that the shared subject factors plus separate spatial factors plus sparse core recover known heterogeneous effects; if the true rank or dependence structure is misaligned, the posterior linear predictors and barycenter summaries could attenuate or artifactually induce the reported heterogeneity.
  Authors: We agree that simulation-based validation would strengthen confidence in the factorization's fidelity for recovering heterogeneous effects. The linked Tucker structure was specifically designed with shared subject-mode factors to capture cross-outcome dependence and separate spatial factors to respect distinct tooth-surface and tooth-zone grids, with the horseshoe prior promoting sparsity in the core tensor. Nevertheless, we will add a simulation study in the revised manuscript. Data will be generated under known spatially varying coefficient tensors aligned with the model structure; we will then report recovery metrics including coefficient MSE, coverage of credible intervals, and agreement between true and estimated Wasserstein barycenters. This will directly test for attenuation or artifactual heterogeneity under the assumed rank and dependence.
  Revision: yes
- Referee: [Model formulation] The proportional-odds assumption in the severity components for both outcomes is invoked without reported diagnostics (e.g., score tests or posterior predictive checks for cumulative logit fit). Violation would undermine the separation of presence versus severity effects that is central to the joint hurdle-ordinal claim and the IFS interpretation.
  Authors: We acknowledge that explicit validation of the proportional-odds assumption is necessary to support the separation of presence and severity components. In the revised manuscript we will include posterior predictive checks that compare observed versus replicated cumulative probabilities across severity categories, as well as score tests for the proportional-odds assumption applied to the fitted models. These diagnostics will be reported for the IFS analysis, with discussion of any detected violations and their implications for interpreting the presence-versus-severity distinctions.
  Revision: yes
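The posterior predictive check promised in the second response could look roughly like the sketch below, which compares observed and replicated cumulative proportions across severity categories. Everything here is a hypothetical stand-in: the data are simulated, and a Dirichlet posterior replaces the category probabilities that a fitted proportional-odds model would supply. In a real analysis the check would typically be stratified by covariates or tooth location.

```python
import numpy as np

rng = np.random.default_rng(4)
n_obs, n_draws, n_cat = 300, 400, 3

# Hypothetical observed severity levels among positive observations (0..n_cat-1).
true_probs = np.array([0.6, 0.3, 0.1])
y_obs = rng.choice(n_cat, size=n_obs, p=true_probs)

# Stand-in for model output: Dirichlet posterior draws of category
# probabilities under a flat prior (not the paper's model).
counts = np.bincount(y_obs, minlength=n_cat)
prob_draws = rng.dirichlet(counts + 1, size=n_draws)


def cumulative_props(y, n_cat):
    """Empirical P(Y <= k) for k = 0, ..., n_cat - 2."""
    return np.cumsum(np.bincount(y, minlength=n_cat))[:-1] / len(y)


obs_cum = cumulative_props(y_obs, n_cat)

# One replicated data set per posterior draw, summarized the same way.
rep_cum = np.array([
    cumulative_props(rng.choice(n_cat, size=n_obs, p=p), n_cat)
    for p in prob_draws
])

# Posterior predictive p-value per cutpoint; values near 0 or 1 flag misfit.
ppp = (rep_cum >= obs_cum).mean(axis=0)
```

Because the stand-in model is correctly specified for the simulated data, the p-values land near 0.5. Systematic departures toward 0 or 1 at particular cutpoints or strata would be the kind of evidence that bears on the proportional-odds assumption.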
Circularity Check
No significant circularity in derivation chain
full rationale
The paper defines a new joint hurdle-ordinal regression model with linked Tucker factorization for coefficient tensors (shared subject factors, separate spatial factors, sparse core, horseshoe prior, NUTS sampling). It then applies this model to IFS data to obtain posterior linear predictors, Wasserstein barycenter summaries, and empirical associations. These are constructive model specifications and data-driven inferences; no equation reduces a claimed prediction or result to a fitted parameter or self-citation by construction. The framework is self-contained against external benchmarks with no load-bearing self-referential steps.
Axiom & Free-Parameter Ledger
free parameters (2)
- Horseshoe prior hyperparameters
- Tucker factor dimensions
axioms (2)
- domain assumption: Proportional-odds assumption holds for the ordinal severity components
- ad hoc to paper: Linked Tucker factorization adequately captures the coefficient arrays