pith. machine review for the scientific record. sign in

arxiv: 2605.12577 · v1 · submitted 2026-05-12 · 📊 stat.AP

Recognition: 2 theorem links

· Lean Theorem

Circula-based multivariate distributions on the flat torus, with applications in structural biology

Authors on Pith no claims yet

Pith reviewed 2026-05-14 20:29 UTC · model grok-4.3

classification 📊 stat.AP
keywords flat toruscirculatorsion angleslatent variable modelmixture modelsprotein structuremultivariate distributionsstructural biology
0
0 comments X

The pith

A low-rank latent variable model yields the first closed-form normalized distributions on the flat torus that carry explicit covariance structure.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces circula defined through latent variable models with low-rank covariance to create multivariate distributions on the d-dimensional flat torus. The construction separates the modeling of dependencies from marginal distributions and supplies a closed-form normalized density for the first time. The same framework produces the first joint distributions for torsion angles of neighboring amino acids, covering both backbone and side-chain angles in proteins. Mixtures fitted on torii ranging from two to fourteen dimensions achieve state-of-the-art likelihood and sparsity on real structural data. The approach is positioned to support thermodynamic and kinetic descriptions of proteins that move beyond discrete conformation catalogs.

Core claim

Using a low rank covariance structure to define circulae based on a latent variable model, the authors design the first closed-form normalized distribution on the flat torus T^d with covariance structure. Building on this, they propose the first models for joint distributions of torsion angles (backbone and side-chains) for neighboring amino-acids in proteins, fitting mixtures on flat torii from T^2 to T^14 that are SOTA in likelihood and sparsity.

What carries the argument

Circula constructed from a latent variable model equipped with low-rank covariance, which supplies a normalized density on the flat torus while encoding pairwise and higher-order dependencies among angles.

Load-bearing premise

The low-rank covariance structure inside the latent variable model captures the essential dependencies among torsion angles without substantial loss of fidelity or introduction of artifacts on protein data.

What would settle it

If mixtures built from these circula yield lower likelihood or poorer sparsity than existing methods when fitted to the same sets of protein torsion angle measurements, the claim of first closed-form normalized distributions with usable covariance would be falsified.

Figures

Figures reproduced from arXiv: 2605.12577 by Alix Lh\'eritier, Fr\'ed\'eric Cazals, Guillaume Carri\`ere.

Figure 1
Figure 1. Figure 1: Modeling covariance on the flat torus: illustration with a 2D von Mises-wrapped Cauchy distribution (vM-wC). (A) The wrapped Cauchy circula correlates circular uniform vari￾ables. (B) The wrapped Cauchy circula is applied to variables of interest passed through the probability integral transform. (C) The variables of interest follow their marginal von Mises densities. (D-E) The vM-wC distribution is the pr… view at source ↗
Figure 2
Figure 2. Figure 2: Circula as a latent variable model [7]. Arrows represent modeled correlations. ρ is the length of the mean resultant vector, which acts as a concentration parameter [6]. The product ρkρl is the dependence value for the pair (k, l). 3.1 Dependency structure analysis With µi the circular mean of θi , we characterize the dependency structure in CBMDs (Eq. 4) using the so-called Jammalamadaka and Sarma (JS) co… view at source ↗
Figure 3
Figure 3. Figure 3: The Ramachandran trinity for a mixture with 50 components: credible region contours at 50% level for each unweighted component of the baseline and vM-wC mixture. For the latter, orange arrow indicate slanted components due to the covariance structure. 9 [PITH_FULL_IMAGE:figures/full_fig_p009_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Difference in bits per observation between the baseline and vM-wC mixtures, for the 20 × 20 pairs of amino acids. (Length(θvM-wC, X) − Length(θbaseline, X))/|X| 10 [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗
read the original abstract

Modeling dependencies between random variables independently from their marginals is fundamental in applications ranging from finance to (structural) biology. In this work, we undertake this problem using circula to model data living on the $d$-dimensional flat torus $\mathbb{T}^d$, making two contributions. First, using a low rank covariance structure to define circulae based on a latent variable model, we design the first closed-form normalized distribution on the flat torus $\mathbb{T}^d$--with covariance structure. Second, building on this framework, we propose the first models for joint distributions of torsion angles (backbone and side-chains) for neighboring amino-acids in proteins. In practice, we fit mixtures on flat torii from $\mathbb{T}^{2}$ to $\mathbb{T}^{14}$, and show they are SOTA in terms of likelihood and sparsity. We anticipate that these models will prove fundamental to move from discrete structural studies like in AlphaFold2, to thermodynamics and kinetics, which are the ultimate goals in theoretical biophysics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces 'circulae' as a new class of distributions on the d-dimensional flat torus T^d, constructed via a latent-variable model with low-rank covariance structure. This is presented as yielding the first closed-form normalized density on T^d that incorporates a covariance structure. The second contribution applies the framework to model joint distributions of backbone and side-chain torsion angles for neighboring amino acids in proteins, fitting mixture models on tori from T^2 to T^14 and claiming state-of-the-art performance in likelihood and sparsity relative to existing methods. The work positions these models as enabling a shift from discrete structural predictions (e.g., AlphaFold2) toward thermodynamic and kinetic analyses in biophysics.

Significance. If the low-rank latent construction indeed delivers a properly normalized closed-form density with usable covariance on T^d, the result would be significant for circular statistics and structural biology. It would provide a principled way to model continuous, correlated torsion angles in proteins, supporting probabilistic extensions beyond discrete rotamer libraries and potentially improving sampling for dynamics and folding pathways. The sparsity and mixture-fitting results up to dimension 14 are practically relevant if they hold under proper baselines.

major comments (3)
  1. [Abstract and §3] Abstract and §3 (latent-variable construction): the claim that the low-rank covariance latent model produces a 'closed-form normalized distribution' on T^d must be supported by an explicit normalization constant derivation; without it, it is unclear whether the low-rank constraint preserves the closed-form property or merely approximates a density that requires numerical normalization.
  2. [§5 and Table 2] §5 (protein torsion application) and Table 2: the SOTA likelihood and sparsity claims for T^2–T^14 mixtures rest on the assumption that low-rank Gaussian latents capture the essential multimodal and higher-order dependencies among torsion angles; the manuscript should include a direct comparison against non-low-rank baselines (e.g., full-rank von Mises or kernel density estimators) plus a diagnostic for residual multimodality or bias in the fitted densities.
  3. [§4] §4 (mixture fitting): the reported likelihood gains are load-bearing for the 'first models' claim, yet no cross-validation or held-out log-likelihood on independent protein datasets is described; without this, it is impossible to rule out overfitting to the training torsion statistics.
minor comments (2)
  1. [Introduction] Notation for the flat torus T^d and the circula density should be introduced consistently in the first section rather than appearing first in the abstract.
  2. [Figures] Figure captions for the torsion-angle visualizations should explicitly state the amino-acid pairs and the number of mixture components used.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed review. The comments identify areas where additional clarity and validation will strengthen the manuscript, and we address each point below with planned revisions.

read point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (latent-variable construction): the claim that the low-rank covariance latent model produces a 'closed-form normalized distribution' on T^d must be supported by an explicit normalization constant derivation; without it, it is unclear whether the low-rank constraint preserves the closed-form property or merely approximates a density that requires numerical normalization.

    Authors: We appreciate the request for explicit detail. Section 3 derives the normalization constant in closed form by integrating the latent Gaussian density over the torus; the low-rank structure allows the multi-dimensional integral to factor into a product of univariate integrals that admit closed-form expressions involving modified Bessel functions. To make this fully transparent, we will add a dedicated subsection in the revision that walks through the derivation step by step, confirming that the low-rank constraint preserves the closed-form property without numerical normalization. revision: yes

  2. Referee: [§5 and Table 2] §5 (protein torsion application) and Table 2: the SOTA likelihood and sparsity claims for T^2–T^14 mixtures rest on the assumption that low-rank Gaussian latents capture the essential multimodal and higher-order dependencies among torsion angles; the manuscript should include a direct comparison against non-low-rank baselines (e.g., full-rank von Mises or kernel density estimators) plus a diagnostic for residual multimodality or bias in the fitted densities.

    Authors: We agree that direct comparisons to non-low-rank baselines are needed to support the performance claims. In the revision we will add results for full-rank von Mises mixture models and kernel density estimators on the identical torsion datasets, together with diagnostic checks (marginal density overlays and residual correlation plots) that assess whether higher-order dependencies or multimodality remain uncaptured by the low-rank latent construction. revision: yes

  3. Referee: [§4] §4 (mixture fitting): the reported likelihood gains are load-bearing for the 'first models' claim, yet no cross-validation or held-out log-likelihood on independent protein datasets is described; without this, it is impossible to rule out overfitting to the training torsion statistics.

    Authors: We acknowledge that the current presentation reports in-sample likelihoods and that explicit held-out evaluation is required. We will revise §4 to include a 5-fold cross-validation protocol on the protein torsion data, reporting held-out log-likelihoods on independent test structures drawn from a disjoint PDB subset. This addition will directly address concerns about overfitting. revision: yes

Circularity Check

0 steps flagged

No significant circularity; central construction is an independent latent-variable definition

full rationale

The paper defines circulae on T^d via a novel low-rank latent-variable model that yields the first closed-form normalized distribution with covariance structure. No equations, self-citations, or fitted inputs are shown reducing this claim to a tautology, renaming, or load-bearing prior result by the same authors. The construction is presented as original rather than a re-expression of data or ansatz, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on the existence of a low-rank latent variable construction that produces a normalized closed-form density on the torus; this construction is treated as novel but its normalization and covariance properties are asserted without visible derivation steps in the abstract.

free parameters (1)
  • low-rank covariance parameters
    Parameters of the latent variable model used to define the circula covariance structure; their number and fitting procedure are not detailed in the abstract.
axioms (1)
  • domain assumption The flat torus geometry is an appropriate manifold for representing torsion angles in proteins.
    Invoked when mapping backbone and side-chain angles to T^d for the mixture models.

pith-pipeline@v0.9.0 · 5488 in / 1298 out tokens · 37531 ms · 2026-05-14T20:29:56.430308+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

52 extracted references

  1. [1]

    Springer, 2006

    Roger B Nelsen.An introduction to copulas. Springer, 2006

  2. [2]

    Fonctions de répartition à n dimensions et leurs marges

    M Sklar. Fonctions de répartition à n dimensions et leurs marges. InAnnales de l’ISUP, volume 8, pages 229–231, 1959

  3. [3]

    The estimation method of inference functions for margins for multivariate models

    Harry Joe and James Jianmeng Xu. The estimation method of inference functions for margins for multivariate models. 1996

  4. [4]

    Computing a nearest correlation matrix with factor structure.SIAM Journal on Matrix Analysis and Applications, 31(5):2603–2622, 2010

    Rüdiger Borsdorf, Nicholas J Higham, and Marcos Raydan. Computing a nearest correlation matrix with factor structure.SIAM Journal on Matrix Analysis and Applications, 31(5):2603–2622, 2010

  5. [5]

    Vine copula based modeling.Annual Review of Statistics and Its Application, 9(1):453–477, 2022

    Claudia Czado and Thomas Nagler. Vine copula based modeling.Annual Review of Statistics and Its Application, 9(1):453–477, 2022

  6. [6]

    Mardia and P

    K. Mardia and P. Jupp.Directional statistics, volume 494. John Wiley & Sons, 2009

  7. [7]

    On a class of circulas: copulas for circular distributions

    MC Jones, Arthur Pewsey, and Shogo Kato. On a class of circulas: copulas for circular distributions. Annals of the Institute of Statistical Mathematics, 67:843–862, 2015

  8. [8]

    Stereochemistry of polypeptide chain configurations

    Gopalasamudram Narayana Ramachandran. Stereochemistry of polypeptide chain configurations. J. Mol. Biol., 7:95–99, 1963

  9. [9]

    Shapovalov and R

    M. Shapovalov and R. Dunbrack. A smoothed backbone-dependent rotamer library for proteins derived from adaptive kernel density estimates and regressions.Structure, 19(6):844–858, 2011

  10. [10]

    Molprobity’s ultimate rotamer-library distributions for model validation

    Bradley J Hintze, Steven M Lewis, Jane S Richardson, and David C Richardson. Molprobity’s ultimate rotamer-library distributions for model validation. Proteins: Structure, Function, and Bioinformatics, 84(9):1177–1189, 2016

  11. [11]

    Molpro- bity: more and better reference data for improved all-atom structure validation.Protein science, 27(1):293–315, 2018

    Christopher J Williams, Jeffrey J Headd, Nigel W Moriarty, Michael G Prisant, Lizbeth L Videau, Lindsay N Deis, Vishal Verma, Daniel A Keedy, Bradley J Hintze, Vincent B Chen, et al. Molpro- bity: more and better reference data for improved all-atom structure validation.Protein science, 27(1):293–315, 2018

  12. [12]

    Amarasinghe, L

    P. Amarasinghe, L. Allison, P. Stuckey, M. Garcia de la Banda, A. Lesk, and A. Konagurthu. Getting ‘ϕψχal’with proteins: minimum message length inference of joint distributions of backbone and sidechain dihedral angles.Bioinformatics, 39(Supplement_1):i357–i367, 2023

  13. [13]

    PhiSiCal-Checkup: A bayesian framework to validate amino acid conformations within experimental protein structures

    Piyumi R Amarasinghe, Lloyd Allison, Craig J Morton, Peter J Stuckey, Maria Garcia de la Banda, Arthur M Lesk, and Arun S Konagurthu. PhiSiCal-Checkup: A bayesian framework to validate amino acid conformations within experimental protein structures. Proceedings of the National Academy of Sciences, 122(1):e2416301121, 2025

  14. [14]

    Jumper, R

    J. Jumper, R. Evans, A. Pritzel, T. Green, M. Figurnov, O. Ronneberger, K. Tunyasuvunakool, R. Bates, A. Žídek, A. Potapenko, et al. Highly accurate protein structure prediction with Al- phaFold. Nature, 596(7873):583–589, 2021

  15. [15]

    Hallen and B

    M. Hallen and B. Donald. Protein design by provable algorithms.Communications of the ACM, 62(10):76–84, 2019

  16. [16]

    Mocapy++–a toolkit for inference and learning in dynamic bayesian networks.BMC bioinformatics, 11(1):1–6, 2010

    Martin Paluszewski and Thomas Hamelryck. Mocapy++–a toolkit for inference and learning in dynamic bayesian networks.BMC bioinformatics, 11(1):1–6, 2010

  17. [17]

    Coutsias, C

    E. Coutsias, C. Seok, M. Jacobson, and K. Dill. A kinematic view of loop closure. Journal of computational chemistry, 25(4):510–528, 2004. 11

  18. [18]

    O’Donnell, C.H

    T. O’Donnell, C.H. Robert, and F. Cazals. Tripeptide loop closure: a detailed study of recon- structions based on Ramachandran distributions.Proteins: structure, function, and bioinformatics, 90(3):858–868, 2022

  19. [19]

    Lelièvre, G

    T. Lelièvre, G. Stoltz, and M. Rousset. Free energy computations: A mathematical perspective. World Scientific, 2010

  20. [20]

    An amino-domino model described by a cross-peptide-bond ramachandran plot defines amino acid pairs as local structural units

    Aviv A Rosenberg, Nitsan Yehishalom, Ailie Marx, and Alex M Bronstein. An amino-domino model described by a cross-peptide-bond ramachandran plot defines amino acid pairs as local structural units. Proceedings of the National Academy of Sciences, 120(44):e2301064120, 2023

  21. [21]

    Springer Science & Business Media, 2004

    Jean Jacod and Philip Protter.Probability essentials. Springer Science & Business Media, 2004

  22. [22]

    Topics in circular statistics, volume5

    SRaoJammalamadakaandAmbarSengupta. Topics in circular statistics, volume5. worldscientific, 2001

  23. [23]

    Inferences based on a bivariate distribution with von mises marginals

    Grace S Shieh and Richard A Johnson. Inferences based on a bivariate distribution with von mises marginals. Annals of the Institute of Statistical Mathematics, 57:789–802, 2005

  24. [24]

    N.J. Higham. Computing the nearest correlation matrix–a problem from finance.IMA Journal of Numerical Analysis, 22(3):329, 2002

  25. [25]

    On the generalized low rank approximation of the correlation matrices arising in the asset portfolio

    Xuefeng Duan, Jianchao Bai, Maojun Zhang, and Xinjun Zhang. On the generalized low rank approximation of the correlation matrices arising in the asset portfolio. Linear Algebra and its Applications, 461:1–17, 2014

  26. [26]

    Handbook of mathematical functions with formulas, graphs, and mathematical tables, 1988

    Milton Abramowitz, Irene A Stegun, and Robert H Romer. Handbook of mathematical functions with formulas, graphs, and mathematical tables, 1988

  27. [27]

    John Wiley & Sons, 2008

    Geoffrey J McLachlan and Thriyambakam Krishnan.The EM algorithm and extensions. John Wiley & Sons, 2008

  28. [28]

    Mario A. T. Figueiredo and Anil K. Jain. Unsupervised learning of finite mixture models.IEEE Transactions on pattern analysis and machine intelligence, 24(3):381–396, 2002

  29. [29]

    The protein data bank.Nucleic acids research, 28(1):235– 242, 2000

    Helen M Berman, John Westbrook, Zukang Feng, Gary Gilliland, Talapady N Bhat, Helge Weissig, Ilya N Shindyalov, and Philip E Bourne. The protein data bank.Nucleic acids research, 28(1):235– 242, 2000

  30. [30]

    Minimum message length estimation of mixtures of multi- variate gaussian and von mises-fisher distributions.Machine Learning, 100:333–378, 2015

    Parthan Kasarapu and Lloyd Allison. Minimum message length estimation of mixtures of multi- variate gaussian and von mises-fisher distributions.Machine Learning, 100:333–378, 2015

  31. [31]

    Finite mixtures of multivariate wrapped normal distributions for model based clustering of p-torus data.Journal of Computational and Graphical Statistics, 32(3):1215–1228, 2023

    Luca Greco, Pier Luigi Novi Inverardi, and Claudio Agostinelli. Finite mixtures of multivariate wrapped normal distributions for model based clustering of p-torus data.Journal of Computational and Graphical Statistics, 32(3):1215–1228, 2023

  32. [32]

    Efficient simulation of the von mises distribution.Journal of the Royal Statistical Society: Series C (Applied Statistics), 28(2):152–157, 1979

    DJ Best and Nicholas I Fisher. Efficient simulation of the von mises distribution.Journal of the Royal Statistical Society: Series C (Applied Statistics), 28(2):152–157, 1979

  33. [33]

    A short note on parameter approximation for von mises-fisher distributions: and a fast implementation of i s (x).Computational Statistics, 27(1):177–190, 2012

    Suvrit Sra. A short note on parameter approximation for von mises-fisher distributions: and a fast implementation of i s (x).Computational Statistics, 27(1):177–190, 2012

  34. [34]

    S. Kato. Personnal communication, 2024

  35. [35]

    A majorization algorithm for constrained correlation matrix approxima- tion

    Dan Simon and Jeff Abell. A majorization algorithm for constrained correlation matrix approxima- tion. Linear algebra and its applications, 432(5):1152–1164, 2010

  36. [36]

    Christophe Biernacki, Gilles Celeux, and Gérard Govaert. Choosing starting values for the em algorithm for getting the highest likelihood in multivariate gaussian mixture models.Computational Statistics & Data Analysis, 41(3-4):561–575, 2003

  37. [37]

    John Wiley & Sons, 2000

    Geoffrey J McLachlan and David Peel.Finite mixture models. John Wiley & Sons, 2000. 12

  38. [38]

    A new method for random initialization of the em algorithm for multivariate gaus- sian mixture learning

    Wojciech Kwedlo. A new method for random initialization of the em algorithm for multivariate gaus- sian mixture learning. InProceedings of the 8th International Conference on Computer Recognition Systems CORES 2013, pages 81–90. Springer, 2013

  39. [39]

    Adaptive seeding for gaussian mixture models

    Johannes Blömer and Kathrin Bujna. Adaptive seeding for gaussian mixture models. InPacific-asia conference on knowledge discovery and data mining, pages 296–308. Springer, 2016

  40. [40]

    A new iterative initialization of em algorithm for gaussian mixture models

    Jie You, Zhaoxuan Li, and Junli Du. A new iterative initialization of em algorithm for gaussian mixture models. Plos one, 18(4):e0284114, 2023

  41. [41]

    Mixtures of concentrated multivariate sine distributions with applications to bioinformatics.Journal of Applied Statistics, 39(11):2475–2492, 2012

    Kanti V Mardia, John T Kent, Zhengzheng Zhang, Charles C Taylor, and Thomas Hamelryck. Mixtures of concentrated multivariate sine distributions with applications to bioinformatics.Journal of Applied Statistics, 39(11):2475–2492, 2012

  42. [42]

    Estimating the dimension of a model

    Gideon Schwarz. Estimating the dimension of a model. The annals of statistics, pages 461–464, 1978

  43. [43]

    H. Akaike. A new look at the statistical model identification.IEEE Transactions on Automatic Control, 19(6):716–723, 1974

  44. [44]

    Rissanen

    J. Rissanen. Modeling by shortest data description.Automatica, 14(5):465–471, 1978

  45. [45]

    Pattern recognition and machine learning, volume4

    ChristopherMBishopandNasserMNasrabadi. Pattern recognition and machine learning, volume4. Springer, 2006

  46. [46]

    E. L. Lehmann and Joseph P. Romano.Testing statistical hypotheses. Springer Texts in Statistics. Springer, New York, third edition, 2005

  47. [47]

    Estimation and inference by compact coding.Journal of the Royal Statistical Society Series B: Statistical Methodology, 49(3):240–252, 1987

    Chris S Wallace and Peter R Freeman. Estimation and inference by compact coding.Journal of the Royal Statistical Society Series B: Statistical Methodology, 49(3):240–252, 1987

  48. [48]

    J.W. Milnor. Morse Theory. Princeton University Press, Princeton, NJ, 1963. m-mt-63

  49. [49]

    Hatcher.Algebraic Topology

    A. Hatcher.Algebraic Topology. Cambridge, 2002. key was h-at-02

  50. [50]

    A möbius transformation-induced distribution on the torus

    Shogo Kato and Arthur Pewsey. A möbius transformation-induced distribution on the torus. Biometrika, 102(2):359–370, 2015

  51. [51]

    frame of reference

    Daniel Lewandowski, Dorota Kurowicka, and Harry Joe. Generating random correlation matrices based on vines and extended onion method. Journal of multivariate analysis, 100(9):1989–2001, 2009. 13 S1 Background S1.1 von Mises distribution The von Mises distributionvm(µ, κ) (sometimes referred to as theCircular Normal distribution) is the most commonly used ...

  52. [52]

    un- certainty of each datum

    + Knz(N + 1) 2 − L(X|θ) (49) where Knz denotes the number of components with non-zero weights,n = |X| is the sample size and N is the number of free parameters in a single component (N = 3d for CBMD components from Eq. 4 1). Figueiredo and Jain [28] also introduce an estimation procedure based on EM, designed to minimize Length(θ, X) over the component pa...