FLUX: Geometry-Aware Longitudinal Flow Matching with Mixture of Experts
Recognition: 2 Lean theorem links
Pith reviewed 2026-05-12 02:21 UTC · model grok-4.3
The pith
FLUX reconstructs longitudinal transport from unpaired snapshots and discovers latent regimes by learning a data-dependent metric and routing the velocity field through a sparse mixture of expert vector fields.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FLUX learns a data-dependent metric from pooled labeled and unlabeled observations, uses that metric to construct geometry-aware conditional paths between adjacent marginals, and decomposes the resulting velocity field into sparse expert vector fields selected by a Straight-Through Gumbel-Softmax router, thereby enabling simultaneous longitudinal transport reconstruction and unsupervised regime discovery.
What carries the argument
Geometry-aware conditional paths built from a learned data-dependent metric, followed by mixture-of-experts decomposition of the velocity field via Straight-Through Gumbel-Softmax routing.
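The mechanics can be sketched in a few lines. The numpy illustration below assumes straight-line conditional paths (standard conditional flow matching; FLUX instead builds geodesic paths under its learned metric) and uses simple affine expert fields with hard argmax routing standing in for the Straight-Through Gumbel-Softmax sample. All names, shapes, and the affine form are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def expert_velocity(x, t, W, b):
    """One expert: an illustrative affine vector field v(x, t) = W x + b t."""
    return x @ W.T + b * t

def moe_velocity(x, t, experts, router_logits):
    """Hard-route each sample to one expert (argmax stands in for the
    Straight-Through Gumbel-Softmax sample used during training)."""
    k = np.argmax(router_logits, axis=-1)
    out = np.empty_like(x)
    for i, (W, b) in enumerate(experts):
        mask = k == i
        if mask.any():
            out[mask] = expert_velocity(x[mask], t[mask], W, b)
    return out

# Conditional flow matching with straight-line paths:
# x_t = (1 - t) x0 + t x1, target velocity u = x1 - x0.
d, K, n = 2, 3, 64
experts = [(rng.normal(size=(d, d)) * 0.1, rng.normal(size=d) * 0.1)
           for _ in range(K)]
x0 = rng.normal(size=(n, d))
x1 = rng.normal(size=(n, d)) + 2.0
t = rng.uniform(size=(n, 1))
x_t = (1 - t) * x0 + t * x1
u_target = x1 - x0
logits = rng.normal(size=(n, K))      # a real router would condition on (x_t, t)
v_pred = moe_velocity(x_t, t, experts, logits)
loss = np.mean(np.sum((v_pred - u_target) ** 2, axis=-1))
```

In the geometry-aware version, `x_t` and `u_target` would come from geodesic interpolation under the learned metric rather than the straight line used here.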
If this is right
- The framework successfully reconstructs transport and recovers regime structure on manifold controls, a regime-switching Lorenz system, widefield cortical calcium imaging, and embryoid-body single-cell differentiation.
- Mixture-of-experts routing alone is insufficient for regime discovery when regimes are encoded in local dynamics; geometric metric learning is necessary.
- The same pipeline supplies a general strategy for extracting latent state transitions from any collection of unpaired longitudinal snapshots that lie on curved manifolds.
- Ablation results indicate that the router fails or weakens without the geometry-aware component, confirming that respecting manifold geometry and identifying regimes are coupled.
Where Pith is reading between the lines
- The same geometry-plus-routing decomposition could be applied to other high-dimensional longitudinal settings where regime shifts occur, such as neural population recordings during behavior or population dynamics in ecology.
- If the discovered regimes align with external variables like stimulus timing or developmental markers in new datasets, this would provide an external check on the unsupervised segmentation.
- Extending the router to allow overlapping or hierarchical regime membership might capture more gradual or nested transitions observed in some biological processes.
Load-bearing premise
Latent regimes are encoded in distinct local dynamics that a learned manifold metric plus mixture-of-experts router can separate and identify.
What would settle it
An ablation experiment on the widefield cortical imaging or embryoid-body datasets in which removing the geometry-aware metric component leaves regime-discovery performance unchanged or improved would falsify the claim that geometric learning is required for effective regime recovery.
Original abstract
Many biological systems evolve through continuous local dynamics while switching between latent regimes defined by learning, stimulus context, internal state, or developmental stage. These processes are often observed only as unpaired longitudinal snapshots: the same cells, neurons, or animals are not tracked as matched trajectories, even though population states are sampled across successive stages. This creates two coupled challenges. First, trajectories must respect curved low-dimensional manifolds embedded in high-dimensional biological measurements. Second, the model must identify when the transport mechanism itself changes. We introduce FLUX (FLow matching for Unpaired longitudinal data with miXture-of-experts), a geometry-aware longitudinal flow-matching framework for joint transport modeling and unsupervised regime discovery. FLUX learns a data-dependent metric from pooled labeled and unlabeled observations, uses that metric to construct geometry-aware conditional paths between adjacent marginals, and decomposes the resulting velocity field into sparse expert vector fields selected by a Straight-Through Gumbel-Softmax router. Across manifold controls, a regime-switching Lorenz system, widefield cortical calcium imaging during associative learning, and embryoid body single-cell differentiation, FLUX reconstructs longitudinal transport while recovering interpretable regime structure. Ablations show that mixture-of-experts routing alone is insufficient: FLUX without geometric learning can fit local transport but fails or weakens regime discovery when regimes are encoded in local dynamics. These results suggest that geometry-aware velocity decomposition provides a general strategy for discovering latent biological state transitions from unpaired longitudinal snapshots.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces FLUX, a geometry-aware longitudinal flow-matching model with mixture-of-experts for unpaired snapshot data. It learns a data-dependent metric from pooled observations, builds geometry-aware conditional paths between adjacent marginals, and decomposes the velocity field into sparse expert vector fields routed by a Straight-Through Gumbel-Softmax. Experiments on synthetic manifold controls, a regime-switching Lorenz system, widefield cortical calcium imaging, and embryoid-body single-cell differentiation show that FLUX recovers longitudinal transport and interpretable latent regimes; ablations indicate that MoE routing without the geometry-aware component fails or weakens regime discovery when regimes are encoded in local dynamics.
Significance. If the central claim holds, FLUX offers a principled way to jointly solve transport and unsupervised regime discovery on curved manifolds from unpaired longitudinal snapshots, a common setting in developmental biology and systems neuroscience. The empirical validation across four distinct regimes (synthetic controls, chaotic dynamics, neural population activity, and single-cell trajectories) and the explicit ablation isolating the geometry component are strengths; the framework could generalize to other high-dimensional longitudinal settings where both manifold structure and discrete state switches must be recovered.
major comments (3)
- [Ablation studies] Ablation studies (likely §4.3 or the supplementary ablation table): the claim that 'mixture-of-experts routing alone is insufficient' and that 'FLUX without geometric learning ... fails or weakens regime discovery' is load-bearing for the central thesis, yet the manuscript does not explicitly confirm that router capacity, number of experts, training protocol, optimizer settings, and evaluation metric were held identical between the full model and the geometry-ablated variant. Without these controls, performance differences could arise from implementation discrepancies rather than the geometry-aware metric.
- [Real-data experiments] Real-data evaluation sections (cortical imaging and embryoid-body experiments): regime recovery is assessed via post-hoc interpretability (e.g., alignment with known learning stages or developmental markers). Because ground-truth regime labels are unavailable, the manuscript should report at least one quantitative, held-out metric (e.g., predictive accuracy of regime labels on a small labeled subset or consistency of regime assignments across random seeds) to independently verify that the geometry component, rather than the router alone, enabled the discovery.
- [Methods] Metric-learning paragraph (early methods section): the data-dependent metric is learned from 'pooled labeled and unlabeled observations.' It is unclear whether this pooling introduces any leakage of temporal or regime information that would not be available in a strictly unsupervised longitudinal setting; a controlled experiment isolating the metric-learning step on purely unlabeled data would strengthen the claim that the geometry is discovered without supervision.
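For intuition on what a data-dependent metric learned from purely unlabeled pooled observations can look like, here is one common construction: the inverse of a kernel-weighted local covariance, under which distances shrink along directions well supported by the data and grow off-manifold. This is a hedged illustration of the general idea, not necessarily the metric FLUX learns; `sigma` and `eps` are illustrative parameters.

```python
import numpy as np

def local_metric(x, data, sigma=1.0, eps=1e-3):
    """Illustrative data-dependent Riemannian metric at point x:
    inverse of a Gaussian-kernel-weighted local covariance of the pooled
    observations. Note that no temporal ordering or regime labels enter
    the computation, only the feature vectors themselves."""
    diff = data - x                                        # (n, d)
    w = np.exp(-0.5 * np.sum(diff ** 2, axis=1) / sigma ** 2)
    cov = (w[:, None] * diff).T @ diff / w.sum()           # weighted local covariance
    return np.linalg.inv(cov + eps * np.eye(data.shape[1]))

rng = np.random.default_rng(0)
# Data concentrated along the x-axis: movement along x should be cheap,
# movement along y (off-manifold) expensive.
data = np.column_stack([rng.normal(0, 3.0, 500), rng.normal(0, 0.1, 500)])
M = local_metric(np.zeros(2), data, sigma=2.0)
```

Under this metric, `M[1, 1]` greatly exceeds `M[0, 0]`, so geodesics between marginals are pulled along the data manifold rather than cutting across it.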
minor comments (3)
- [Methods] Notation for the Straight-Through Gumbel-Softmax router should be introduced with an explicit equation (currently described only in prose) so that the sparsity and temperature schedule are unambiguous.
- [Figures] Figure captions for the regime-switching Lorenz and cortical-imaging results should state the number of random seeds and whether error bars represent standard deviation or standard error.
- [Abstract and figures] The abstract states that mixture-of-experts routing alone is insufficient without geometric learning; the corresponding ablation figure legend should repeat the exact hyperparameter settings used for both models to avoid reader ambiguity.
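The Straight-Through Gumbel-Softmax that the first minor comment asks to see written out explicitly can be sketched as follows. The temperature `tau` and the noise clipping are illustrative choices; the paper's actual temperature schedule is not specified here.

```python
import numpy as np

def st_gumbel_softmax(logits, tau=1.0, rng=None):
    """Straight-Through Gumbel-Softmax sample (forward pass only).

    Returns a hard one-hot selection and the soft relaxation. In an
    autograd framework the straight-through estimator passes gradients
    through the soft sample: y = y_hard + y_soft - stop_gradient(y_soft).
    """
    rng = rng or np.random.default_rng()
    u = rng.uniform(1e-9, 1.0, size=logits.shape)
    g = -np.log(-np.log(u))                       # Gumbel(0, 1) noise
    z = (logits + g) / tau
    y_soft = np.exp(z - z.max(-1, keepdims=True))
    y_soft /= y_soft.sum(-1, keepdims=True)       # softmax over experts
    y_hard = np.zeros_like(y_soft)
    y_hard[np.arange(len(logits)), y_soft.argmax(-1)] = 1.0
    return y_hard, y_soft

rng = np.random.default_rng(0)
hard, soft = st_gumbel_softmax(rng.normal(size=(5, 3)), tau=0.5, rng=rng)
```

Lower `tau` makes the soft sample approach the hard one-hot, at the cost of higher-variance gradients, which is why an explicit temperature schedule matters for reproducibility.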
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments on our manuscript. We address each major comment point by point below, providing clarifications and committing to revisions where the concerns identify areas for improved transparency or additional validation.
Point-by-point responses
Referee: [Ablation studies] Ablation studies (likely §4.3 or the supplementary ablation table): the claim that 'mixture-of-experts routing alone is insufficient' and that 'FLUX without geometric learning ... fails or weakens regime discovery' is load-bearing for the central thesis, yet the manuscript does not explicitly confirm that router capacity, number of experts, training protocol, optimizer settings, and evaluation metric were held identical between the full model and the geometry-ablated variant. Without these controls, performance differences could arise from implementation discrepancies rather than the geometry-aware metric.
Authors: We thank the referee for this important observation on experimental controls. In the ablation studies, router capacity, number of experts, training protocol, optimizer settings, and evaluation metrics were held identical between the full model and the geometry-ablated variant to isolate the contribution of the geometry-aware metric. We will revise Section 4.3 and the supplementary ablation table to explicitly document these controls. revision: yes
Referee: [Real-data experiments] Real-data evaluation sections (cortical imaging and embryoid-body experiments): regime recovery is assessed via post-hoc interpretability (e.g., alignment with known learning stages or developmental markers). Because ground-truth regime labels are unavailable, the manuscript should report at least one quantitative, held-out metric (e.g., predictive accuracy of regime labels on a small labeled subset or consistency of regime assignments across random seeds) to independently verify that the geometry component, rather than the router alone, enabled the discovery.
Authors: We agree that quantitative validation strengthens the real-data claims. Although ground-truth regime labels are unavailable, we have computed the consistency of regime assignments across random seeds as a held-out quantitative metric. This analysis shows higher consistency for the geometry-aware model. We will add these results to the cortical imaging and embryoid-body sections along with the supplementary material. revision: yes
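One concrete instance of the proposed seed-consistency check is the adjusted Rand index (ARI) between regime assignments produced by two training runs: 1.0 means identical partitions up to label permutation, near 0 means chance-level agreement. The sketch below is a generic pure-numpy implementation for illustration, not code from the paper.

```python
import numpy as np

def comb2(x):
    """Number of unordered pairs, C(x, 2), elementwise."""
    return x * (x - 1) / 2.0

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two clusterings of the same samples."""
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    n = len(a)
    # Contingency table of co-assignments.
    ua, ia = np.unique(a, return_inverse=True)
    ub, ib = np.unique(b, return_inverse=True)
    cont = np.zeros((len(ua), len(ub)))
    np.add.at(cont, (ia, ib), 1)
    sum_ij = comb2(cont).sum()
    sum_a = comb2(cont.sum(1)).sum()
    sum_b = comb2(cont.sum(0)).sum()
    expected = sum_a * sum_b / comb2(n)
    max_index = 0.5 * (sum_a + sum_b)
    return (sum_ij - expected) / (max_index - expected)

# Identical partitions under a label swap score exactly 1.0.
seeds_run1 = np.array([0, 0, 1, 1, 2, 2])
seeds_run2 = np.array([2, 2, 0, 0, 1, 1])   # same partition, relabeled
print(adjusted_rand_index(seeds_run1, seeds_run2))  # → 1.0
```

Because ARI is invariant to label permutation, it is well suited to comparing unsupervised regime assignments across seeds, where expert indices carry no intrinsic meaning.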
Referee: [Methods] Metric-learning paragraph (early methods section): the data-dependent metric is learned from 'pooled labeled and unlabeled observations.' It is unclear whether this pooling introduces any leakage of temporal or regime information that would not be available in a strictly unsupervised longitudinal setting; a controlled experiment isolating the metric-learning step on purely unlabeled data would strengthen the claim that the geometry is discovered without supervision.
Authors: We welcome the opportunity to clarify. The data-dependent metric is learned solely from the feature vectors of the pooled observations and does not incorporate temporal ordering, regime labels, or other supervisory signals. The phrase 'labeled and unlabeled' refers to the presence of time-point metadata in some datasets, which is not used for metric learning. We will add a controlled experiment in the supplementary material that performs metric learning on purely unlabeled data to confirm the absence of leakage. revision: yes
Circularity Check
No significant circularity detected; the derivation is self-contained and empirically grounded.
Full rationale
The paper defines FLUX as a composite model (data-dependent metric for geometry-aware paths + MoE router for velocity decomposition) and evaluates it via training on pooled observations followed by held-out reconstruction and post-hoc regime interpretability on synthetic manifolds, Lorenz systems, calcium imaging, and single-cell data. Ablations compare variants but do not reduce any claimed result to a fitted parameter renamed as prediction or to a self-citation chain; regime discovery is unsupervised yet assessed against independent metrics of transport fidelity and biological interpretability. No equation or claim equates the output regime structure to the input metric or router by construction. The central claims therefore rest on external data performance rather than definitional equivalence.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Biological measurements lie on curved low-dimensional manifolds that a data-dependent metric can capture from pooled observations.
invented entities (1)
- Sparse expert vector fields (no independent evidence)
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel (unclear): the relation between the paper passage and the cited Recognition theorem is ambiguous. Passage: "FLUX learns a data-dependent metric... decomposes the resulting velocity field into sparse expert vector fields selected by a Straight-Through Gumbel-Softmax router."
- IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection (unclear): the relation between the paper passage and the cited Recognition theorem is ambiguous. Passage: "Ablations show that mixture-of-experts routing alone is insufficient: FLUX without geometric learning can fit local transport but fails or weakens regime discovery."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.