pith. machine review for the scientific record.

arXiv: 2605.08497 · v1 · submitted 2026-05-08 · 🧬 q-bio.QM · cs.MS · q-bio.BM

MeTime: An R package for reproducible longitudinal metabolomics data analysis

Bharadwaj Marella, Gabi Kastenmueller, Josef J Bless, Lara Vehovec, Matthias Arnold, Patrick Weinisch, Vinh Tran, Yacoub A. Njipouombe Nsangou

Pith reviewed 2026-05-12 00:48 UTC · model grok-4.3

classification: 🧬 q-bio.QM · cs.MS · q-bio.BM
keywords: R package · longitudinal metabolomics · reproducibility · S4 container · omics data analysis · workflow piping · provenance tracking

The pith

MeTime is an R package that stores longitudinal metabolomics data, metadata, and analysis outputs in a single S4 container to keep workflows reproducible.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents MeTime as an open-source R package designed specifically for handling the complexities of longitudinal metabolomics studies. It centers on a unified container that holds multiple datasets along with their metadata and all generated results, allowing users to build analyses through a sequence of modular functions connected by piping. This structure keeps every step visible, retains intermediate outputs, and automatically produces HTML and PDF reports, addressing the common problem of scattered scripts that lose track of provenance in repeated analyses over time.

Core claim

MeTime introduces the metime_analyser S4 container that holds multiple datasets, associated metadata, and all analysis outputs in one object, enabling workflows built by piping modular functions that begin with data transformations, continue through calculations, and optionally include meta-analysis while preserving full provenance for iterative exploration and reproducible reporting.

What carries the argument

The metime_analyser S4 container combined with the mod_, calc_, and meta_ piping interface, which unifies storage of datasets and results and wraps existing methods such as PCA, mixed-effects regression, and WGCNA clustering under a consistent structure.
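
A minimal sketch of this pattern in base R may make it concrete. The class `toy_analyser`, its slot names, and the functions `log_step`, `mod_zscore`, and `calc_pca` are hypothetical illustrations and not the MeTime API. Only the general idea comes from the paper's description: one S4 object passed through piped mod_ and calc_ steps, each of which stores its output and records itself in the container.

```r
library(methods)

# Hypothetical stand-in for the paper's metime_analyser; slot names are assumptions.
setClass("toy_analyser", slots = c(
  datasets   = "list",  # named list of numeric matrices (samples x metabolites)
  results    = "list",  # outputs of calc_ steps
  provenance = "list"   # one entry per executed step, in execution order
))

# Append a provenance record (function name and parameters) to the container.
log_step <- function(obj, fun, params) {
  obj@provenance[[length(obj@provenance) + 1]] <- list(fun = fun, params = params)
  obj
}

# mod_ step (hypothetical): z-score every column of one dataset.
mod_zscore <- function(obj, which_data) {
  obj@datasets[[which_data]] <- scale(obj@datasets[[which_data]])
  log_step(obj, "mod_zscore", list(which_data = which_data))
}

# calc_ step (hypothetical): run PCA and keep the result inside the container.
calc_pca <- function(obj, which_data, name) {
  obj@results[[name]] <- prcomp(obj@datasets[[which_data]])
  log_step(obj, "calc_pca", list(which_data = which_data, name = name))
}

set.seed(1)
obj <- new("toy_analyser",
           datasets   = list(plasma = matrix(rnorm(40), nrow = 10)),
           results    = list(),
           provenance = list())

# Each step takes the container and returns an updated container.
out <- obj |>
  mod_zscore(which_data = "plasma") |>
  calc_pca(which_data = "plasma", name = "pca_plasma")

# The execution order can be read back from the provenance slot.
steps <- vapply(out@provenance, function(s) s$fun, character(1))
print(steps)
```

The sketch uses the native pipe `|>`, which needs R 4.1 or later; the magrittr `%>%` operator would work the same way here. Because R copies on modification, `obj` stays unchanged while `out` carries the transformed data, the PCA result, and the step log.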

If this is right

  • Users can apply a wide range of existing methods including dimensionality reduction, random forest imputation, and regression models through the same interface without rewriting code for each step.
  • All intermediate results and provenance remain inside the container, supporting iterative changes to the workflow while keeping prior outputs intact.
  • Automated generation of HTML and PDF reports follows directly from the retained data and steps, reducing manual documentation effort.
  • The design supports complex studies with multiple datasets by keeping everything unified rather than scattered across files.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same container approach could extend to other longitudinal omics data types that share similar needs for tracking multiple time-point measurements and metadata.
  • Integration with version-control systems might become simpler because the single container object can be saved and reloaded as one unit.
  • New modular functions could be added by users to incorporate methods not yet wrapped, provided they follow the existing piping pattern.
  • The emphasis on retaining all outputs may increase memory use in very large studies, requiring explicit subsetting steps that the package does not currently automate.
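
As a hedged illustration of the save-and-reload and memory points above, the following base R sketch writes a small S4 object to disk as one file and reads it back. The class `toy_container` and its slots are hypothetical and do not reflect the package's actual class definition; `saveRDS`, `readRDS`, and `object.size` are standard R functions.

```r
library(methods)

# Hypothetical container; not the package's actual class definition.
setClass("toy_container", slots = c(datasets = "list", results = "list"))

visit1 <- matrix(as.numeric(1:6), nrow = 2)
obj <- new("toy_container",
           datasets = list(visit1 = visit1),
           results  = list(col_means = colMeans(visit1)))

# Save the whole container (data plus stored results) as a single file.
path <- tempfile(fileext = ".rds")
saveRDS(obj, path)

# Reload it as one unit and check that the slots survived the round trip.
restored <- readRDS(path)
same_data    <- identical(obj@datasets, restored@datasets)
same_results <- identical(obj@results, restored@results)

# Inspect the in-memory footprint; this grows as more outputs are retained.
size_bytes <- as.numeric(object.size(obj))
print(c(same_data = same_data, same_results = same_results))
```

A binary `.rds` file does not diff meaningfully under version control, so in practice one might track the analysis script and treat the saved container as a build artifact. That is a design suggestion, not something the paper states.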

Load-bearing premise

That wrapping existing methods inside a consistent container and piping interface will produce meaningful improvements in reproducibility and usability for longitudinal metabolomics beyond what ad-hoc R scripts or other packages already achieve.

What would settle it

The central claim would be falsified by a direct comparison in which researchers complete the same longitudinal metabolomics study once with MeTime and once with custom scripts, and the two approaches show no difference in time to reproduce results or in the number of provenance errors.

Original abstract

MeTime is an open-source R package for reproducible analysis of longitudinal metabolomics data. It builds upon a central S4 container, metime_analyser, that stores multiple datasets, associated metadata and analysis outputs, enabling unified handling of complex longitudinal studies. Analyses are constructed by piping modular functions, beginning with data transformations (mod_), followed by calculations (calc_), and optional meta-analysis (meta_), so entire workflows remain transparent and easy to modify. MeTime wraps numerous existing methods within a consistent interface, including sample and metabolite distributions, correlation and distance matrices, dimensionality reduction (PCA, UMAP, t-SNE), random forest imputation and feature selection via Boruta, eigenmetabolites and WGCNA-based clustering, conservation index analysis, regression models (linear, mixed-effects, and generalized additive), and partial correlation networks. By retaining all intermediate results and provenance within the container, MeTime facilitates iterative exploration and ensures reproducible reporting via automatically generated HTML and PDF outputs. Comprehensive user guides, case studies and reference documentation accompany the package, making MeTime a versatile platform for longitudinal omics workflows.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript describes MeTime, an open-source R package for reproducible analysis of longitudinal metabolomics data. It introduces a central S4 container (metime_analyser) that stores multiple datasets, metadata, and analysis outputs, with workflows constructed via modular piping functions (mod_ for transformations, calc_ for calculations, meta_ for meta-analysis) that wrap existing methods including distributions, correlations, dimensionality reduction (PCA/UMAP/tSNE), imputation/feature selection (random forest/Boruta), clustering (eigenmetabolites/WGCNA), regression (linear/mixed-effects/GAM), and partial correlation networks. All intermediates and provenance are retained to support iterative exploration and automatic HTML/PDF report generation, accompanied by user guides and case studies.

Significance. If the described architecture functions as outlined, MeTime provides a practical advance for longitudinal omics workflows by enforcing a consistent, provenance-retaining interface that reduces fragmentation from ad-hoc scripts. Credit is due for the explicit design choices around S4 container modularity, automatic reporting, and comprehensive wrapping of standard methods, which directly support reproducibility goals in a field prone to complex, multi-dataset studies. These features, combined with open-source availability and documentation, position the package as a useful platform rather than a novel algorithmic contribution.

minor comments (2)
  1. [Abstract] Abstract and methods overview: the enumeration of wrapped techniques (e.g., Boruta, WGCNA, GAM) would be strengthened by explicit citations to the original method papers so readers can trace implementation details without external search.
  2. [Architecture description] The description of the metime_analyser container and piping interface is clear at a high level but lacks a concrete workflow diagram or pseudocode example showing how provenance is serialized across mod_/calc_/meta_ steps; adding one would improve immediate usability for new users.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive and constructive review of our manuscript describing the MeTime R package. We appreciate the recognition of the package's architecture for supporting reproducible longitudinal metabolomics workflows and the recommendation to accept.

Circularity Check

0 steps flagged

No significant circularity; software description only

full rationale

The manuscript presents an R package architecture (metime_analyser S4 container, mod_/calc_/meta_ piping, provenance retention, wrapped methods) without any derivation chain, predictions, fitted parameters, or first-principles results. No equations, uniqueness theorems, or self-citations of load-bearing mathematical claims appear. The central claim—that the container and workflow enable unified reproducible handling—follows directly from the explicit design description and does not reduce to its own inputs by construction. This is a standard honest non-finding for a methods/software paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a software package description rather than a theoretical derivation or empirical study, so the central claim rests on no free parameters, axioms, or invented entities.

pith-pipeline@v0.9.0 · 5532 in / 1024 out tokens · 46371 ms · 2026-05-12T00:48:37.998988+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

    central S4 container, metime_analyser, that stores multiple datasets, associated metadata and analysis outputs, enabling unified handling of complex longitudinal studies. Analyses are constructed by piping modular functions, beginning with data transformations (mod_), followed by calculations (calc_), and optional meta-analysis (meta_*)
