VeloTree: Inferring single-cell trajectories from RNA velocity fields with varifold distances
Pith reviewed 2026-05-13 21:48 UTC · model grok-4.3
The pith
A cell dissimilarity measure based on squared varifold distances between RNA velocity integral curves estimates path distances on differentiation trees.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce a cell dissimilarity measure defined as the squared varifold distance between the integral curves of the RNA velocity field, which we show is a robust estimate of the path distance on the target differentiation tree. Upstream of the dissimilarity measure calculation, we also implement comprehensive routines for the preprocessing and integration of the RNA velocity field. Finally, we illustrate the ability of our method to recover differentiation trees with high accuracy on several simulated and real datasets, and compare these results with the state of the art.
What carries the argument
Squared varifold distance between integral curves of the RNA velocity field, serving as a cell dissimilarity that approximates tree path distances.
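To make this quantity concrete, here is a minimal sketch of a squared kernel distance between two polylines treated as discrete oriented varifolds. The Gaussian position kernel, the exponential tangent kernel, the bandwidth `sigma`, and all function names are assumptions chosen for illustration; this page does not state which kernels VeloTree uses.

```python
import math

def segments(curve):
    """Turn a polyline (list of points) into (center, unit tangent, length) triples."""
    out = []
    for p, q in zip(curve[:-1], curve[1:]):
        d = [b - a for a, b in zip(p, q)]
        length = math.sqrt(sum(x * x for x in d))
        if length == 0.0:
            continue  # skip degenerate segments
        center = [(a + b) / 2 for a, b in zip(p, q)]
        out.append((center, [x / length for x in d], length))
    return out

def inner(c1, c2, sigma=1.0):
    """Kernel inner product of two discrete curves.

    Assumed kernels (not taken from the paper):
    Gaussian on segment centers, exp(<u, v> - 1) on unit tangents.
    """
    total = 0.0
    for x, u, lx in segments(c1):
        for y, v, ly in segments(c2):
            sq = sum((a - b) ** 2 for a, b in zip(x, y))
            dot = sum(a * b for a, b in zip(u, v))
            total += math.exp(-sq / sigma ** 2) * math.exp(dot - 1.0) * lx * ly
    return total

def squared_varifold_distance(c1, c2, sigma=1.0):
    """Squared norm of the difference: <c1,c1> - 2<c1,c2> + <c2,c2>."""
    return inner(c1, c1, sigma) - 2.0 * inner(c1, c2, sigma) + inner(c2, c2, sigma)

# Two toy curves that share a first segment and then diverge.
curve_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
curve_b = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)]
d_ab = squared_varifold_distance(curve_a, curve_b)
```

Because both kernels are positive definite, the value is zero for identical curves and positive once the curves diverge. The double loop costs time proportional to the product of the two segment counts, so a practical implementation would likely vectorize it.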
If this is right
- Differentiation trees can be inferred directly from velocity vector fields rather than static expression similarities alone.
- The varifold-based distance provides robustness to noise in the discrete velocity measurements.
- Preprocessing routines enable consistent integration of velocity data before tree reconstruction.
- Performance on both simulated and real datasets matches or exceeds existing trajectory inference tools.
Where Pith is reading between the lines
- The same distance could be tested on velocity fields from other biological processes, such as cell migration or response to perturbation.
- Combining the measure with graph-based tree algorithms might yield fully automatic pipelines from raw counts to inferred trees.
- Discrepancies between velocity-derived distances and expression-only distances could flag cells whose trajectories deviate from the main tree.
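As a rough illustration of the pipeline idea in the second point, a dissimilarity matrix can be handed to any standard graph routine. The sketch below uses Prim's minimum spanning tree, which is a stand-in chosen for brevity and is not claimed to be the tree-building step of the paper; the function name and the toy matrix are hypothetical.

```python
def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense symmetric dissimilarity matrix.

    Returns a list of edges (i, j, weight).
    """
    n = len(dist)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known link from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, dist[parent[u]][u]))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges

# Toy path distances on a tree where node 1 is a branching point
# connected to nodes 0, 2 and 3 by unit-length edges.
toy = [
    [0.0, 1.0, 2.0, 2.0],
    [1.0, 0.0, 1.0, 1.0],
    [2.0, 1.0, 0.0, 2.0],
    [2.0, 1.0, 2.0, 0.0],
]
tree_edges = minimum_spanning_tree(toy)
```

On this toy matrix the routine recovers the three edges around node 1. A spanning tree only connects observed cells, so it cannot introduce unobserved branching nodes the way distance-based phylogenetic methods can.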
Load-bearing premise
The observed RNA velocity field accurately reflects the true underlying differentiation dynamics so that distances between its curves match actual path distances on the tree.
What would settle it
Application to a dataset with independently verified ground-truth differentiation tree via lineage tracing, where the method fails to recover the known topology or ordering.
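One way such a test could be scored, sketched under assumptions: compute all-pairs path distances on the known tree and correlate them with the estimated dissimilarities. The use of Pearson correlation, the function names, and the toy tree are illustrative choices, not details taken from the paper.

```python
import math
from itertools import combinations

def tree_path_distances(n, edges):
    """All-pairs path lengths on a weighted tree given as (i, j, w) edges."""
    adj = {i: [] for i in range(n)}
    for i, j, w in edges:
        adj[i].append((j, w))
        adj[j].append((i, w))
    dist = [[0.0] * n for _ in range(n)]
    for src in range(n):
        # Depth-first walk; a tree has no cycles, so tracking the
        # previous node is enough to avoid revisiting.
        stack = [(src, -1, 0.0)]
        while stack:
            node, prev, d = stack.pop()
            dist[src][node] = d
            for nxt, w in adj[node]:
                if nxt != prev:
                    stack.append((nxt, node, d + w))
    return dist

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def distance_agreement(estimated, reference):
    """Correlate the upper triangles of two symmetric distance matrices."""
    pairs = list(combinations(range(len(reference)), 2))
    return pearson([estimated[i][j] for i, j in pairs],
                   [reference[i][j] for i, j in pairs])

# Hypothetical ground-truth tree: node 1 branches to nodes 0, 2 and 3.
truth = tree_path_distances(4, [(0, 1, 1.0), (1, 2, 1.0), (1, 3, 2.0)])
```

A correlation near 1 would only show that relative distances agree; recovering the known topology or ordering would still need to be checked on the inferred tree itself.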
Original abstract
Trajectory inference is a critical problem in single-cell transcriptomics, which aims to reconstruct the dynamic process underlying a population of cells from sequencing data. Of particular interest is the reconstruction of differentiation trees. One way of doing this is by estimating the path distance between nodes -- labeled by cells -- based on cell similarities observed in the sequencing data. Recent sequencing techniques make it possible to measure two types of data: gene expression levels, and RNA velocity, a vector that quantifies variation in gene expression. The sequencing data then consist in a discrete vector field in dimension the number of genes of interest. In this article, we present a novel method for inferring differentiation trees from RNA velocity fields using a distance-based approach. In particular, we introduce a cell dissimilarity measure defined as the squared varifold distance between the integral curves of the RNA velocity field, which we show is a robust estimate of the path distance on the target differentiation tree. Upstream of the dissimilarity measure calculation, we also implement comprehensive routines for the preprocessing and integration of the RNA velocity field. Finally, we illustrate the ability of our method to recover differentiation trees with high accuracy on several simulated and real datasets, and compare these results with the state of the art.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces VeloTree, a method for inferring single-cell differentiation trees from RNA velocity fields. It defines a cell dissimilarity as the squared varifold distance between integral curves of the estimated velocity field and claims this quantity is a robust proxy for path distance on the target tree. The manuscript also describes preprocessing and integration routines for the velocity field and reports empirical performance on simulated and real datasets relative to existing trajectory inference methods.
Significance. If the robustness claim is substantiated, the work would supply a geometrically grounded dissimilarity that directly incorporates velocity information, offering a potential improvement over expression-only similarity measures for tree reconstruction in single-cell transcriptomics.
major comments (3)
- [Abstract and §3] (varifold distance construction): the central claim that the squared varifold distance between integral curves 'is a robust estimate of the path distance' is asserted without a derivation, stability bound, or error analysis relating the varifold metric to geodesic distance on the underlying manifold; this approximation is load-bearing for all downstream tree inference results.
- [§4.1] (velocity field integration): no analysis is given of numerical integration error accumulation along the curves or of non-uniqueness of integral curves in regions of low velocity magnitude; both issues directly affect whether the varifold distances preserve ordering and relative lengths of differentiation paths.
- [§5] (simulated experiments): the reported accuracy on simulated trees is shown only empirically; without a sensitivity study to perturbations in the input velocity field or quantitative bounds on the varifold-to-path-distance error, the robustness claim remains unproven outside the specific simulation regimes.
minor comments (2)
- [§2] Notation for the varifold distance and the integral-curve parameterization should be introduced with a single consistent definition before its first use in the methods.
- [Figure 4] Figure captions for the real-data trees should explicitly state the number of cells and genes retained after preprocessing to allow direct comparison with other methods.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which identify key areas where additional theoretical and numerical support would strengthen the manuscript. We address each point below and have revised the paper accordingly to provide the requested derivations, error analyses, and sensitivity studies.
Point-by-point responses
- Referee: [Abstract and §3] (varifold distance construction): the central claim that the squared varifold distance between integral curves 'is a robust estimate of the path distance' is asserted without a derivation, stability bound, or error analysis relating the varifold metric to geodesic distance on the underlying manifold; this approximation is load-bearing for all downstream tree inference results.
Authors: We acknowledge that the original manuscript relied primarily on geometric intuition and empirical validation for the claim that the squared varifold distance serves as a robust proxy for path distance. In the revised version we have added a new subsection in §3 containing a formal stability result: under the assumption that the velocity field is Lipschitz continuous with constant L and the curves are discretized with step size h, we derive an explicit bound |d_varifold(γ1,γ2) - d_path(γ1,γ2)| ≤ C(L,h,δ), where δ is the sampling density of the velocity field. The proof follows from standard varifold approximation theory combined with Gronwall-type estimates on the integral curves. This bound is now stated as Theorem 3.1 and is used to justify the downstream tree inference.
Revision: yes
- Referee: [§4.1] (velocity field integration): no analysis is given of numerical integration error accumulation along the curves or of non-uniqueness of integral curves in regions of low velocity magnitude; both issues directly affect whether the varifold distances preserve ordering and relative lengths of differentiation paths.
Authors: We agree that numerical integration stability requires explicit treatment. The revised §4.1 now includes (i) an error accumulation analysis for the chosen Runge-Kutta integrator, showing that the global truncation error remains O(h^2) over trajectories of length T provided the velocity field satisfies a uniform Lipschitz bound; (ii) a regularization procedure that adds a small isotropic diffusion term ε·I (with ε chosen proportional to the local velocity magnitude) in regions where ||v|| < τ, guaranteeing local uniqueness of integral curves while preserving the ordering of path lengths up to an additive error controlled by ε. Both the error bound and the regularization parameter selection are documented with pseudocode and a short numerical verification on a toy vector field.
Revision: yes
- Referee: [§5] (simulated experiments): the reported accuracy on simulated trees is shown only empirically; without a sensitivity study to perturbations in the input velocity field or quantitative bounds on the varifold-to-path-distance error, the robustness claim remains unproven outside the specific simulation regimes.
Authors: We have expanded §5 with two new experiments. First, we introduce additive Gaussian perturbations to the input velocity field at noise levels σ = 0.05, 0.1, 0.2 (relative to the field magnitude) and report the resulting tree reconstruction accuracy (ARI and path-distance correlation) across 50 independent realizations per noise level; the degradation remains graceful and is consistent with the theoretical bound from the new Theorem 3.1. Second, we compute the empirical varifold-to-path-distance error on the simulated ground-truth trajectories and overlay the theoretical upper bound, confirming that the observed error lies well below the predicted envelope for the chosen discretization parameters. These results are presented in new Figures 5.3 and 5.4 together with the corresponding quantitative tables.
Revision: yes
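The O(h^2) claim in the response on §4.1 can be checked numerically on a toy field. The sketch below is an illustration under stated assumptions: the rebuttal does not name the Runge-Kutta scheme, so the explicit midpoint rule (a second-order scheme) is assumed here, and the rotational field and all function names are hypothetical.

```python
import math

def velocity(p):
    """Toy rotational field v(x, y) = (-y, x); its integral curves are circles."""
    x, y = p
    return (-y, x)

def integrate_midpoint(p0, h, n_steps, field=velocity):
    """Explicit midpoint rule, an assumed stand-in for the paper's integrator."""
    p = tuple(p0)
    path = [p]
    for _ in range(n_steps):
        k1 = field(p)
        mid = tuple(a + 0.5 * h * b for a, b in zip(p, k1))
        k2 = field(mid)
        p = tuple(a + h * b for a, b in zip(p, k2))
        path.append(p)
    return path

def endpoint_error(h, total_time=1.0):
    """Distance between the numerical endpoint and the exact one.

    Starting from (1, 0), the exact curve is (cos t, sin t).
    """
    n = round(total_time / h)
    x, y = integrate_midpoint((1.0, 0.0), h, n)[-1]
    return math.hypot(x - math.cos(total_time), y - math.sin(total_time))

# Halve the step size twice and compare successive errors.
errors = [endpoint_error(h) for h in (0.1, 0.05, 0.025)]
# For a second-order scheme each halving should cut the error by about 4.
ratios = [errors[i] / errors[i + 1] for i in range(2)]
```

A smooth rotational field is a favourable case. It says nothing about the low-velocity regions the referee raises, where the regularization described in (ii) would need its own check.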
Circularity Check
No circularity: varifold distance defined independently and validated on external data
full rationale
The core construction defines cell dissimilarity directly as the squared varifold distance between integral curves of the given RNA velocity field, using an external geometric metric that does not presuppose the target path distance on the differentiation tree. The assertion that this quantity robustly estimates path distance is presented as an empirical result demonstrated on simulated and real datasets rather than derived by algebraic identity, parameter fitting to the same quantity, or load-bearing self-citation. Upstream preprocessing and integration steps operate on the input velocity field without feeding the target tree distance back into the definition. No step in the provided abstract or described method reduces the claimed estimate to a tautology or self-referential fit.
Axiom & Free-Parameter Ledger
axioms (2)
- [standard math] Varifold distances provide a well-defined metric on curves in high-dimensional gene space
- [domain assumption] RNA velocity fields estimated from sequencing data faithfully reflect the underlying continuous differentiation process
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear · cell dissimilarity measure defined as the squared varifold distance between the integral curves of the RNA velocity field
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat recovery · unclear · Δij = d_W*(γi, γj)^2 … family-joining