arxiv: 2605.09173 · v1 · submitted 2026-05-09 · 💻 cs.LG · cs.AI

Recognition: 2 theorem links

· Lean Theorem

WavesFM: Hierarchical Representation Learning for Longitudinal Wearable Sensor Waveforms

Peng Cao , Zhijian Yang , Tennison Liu , Jonathan Wang , Jiang Wu , Magdalena Proszewska , Arvind Pillai , Mingwu Gao

show 12 more authors

Amir Farjadian Lawrence Cai Emily Blanchard Daniel McDuff Pramod Rudrapatna Matthew Thompson Anupam Pathak Mark Malhotra Shwetak Patel Dina Katabi Paolo Di Achille Ming-Zher Poh

Authors on Pith no claims yet

Pith reviewed 2026-05-12 04:03 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords wearable sensorsself-supervised learningphysiological waveformslongitudinal datahierarchical representationshealth predictionfoundation model

0 comments

The pith

WavesFM's two-stage self-supervised framework learns both local features in short waveform segments and their evolution over multi-day sequences from longitudinal wearable sensor data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a foundation model that addresses the difficulty of using very long, high-frequency recordings from wearables such as photoplethysmography and accelerometry. It splits the self-supervised learning task into a first stage that pretrains an encoder on brief signal segments to capture morphological details and a second stage that trains a temporal encoder on the resulting sequence of embeddings across days. This decomposition allows the model to manage extreme sequence lengths while retaining both subtle local signatures and broader circadian or inter-day patterns. After pretraining on millions of hours of unlabeled data, the representations support strong performance when adapted to many different prediction problems. A reader would care because continuous free-living sensor data is abundant yet hard to interpret without extensive labels, and the approach offers a scalable route to extract health-relevant information from it.

Core claim

WavesFM demonstrates that decomposing self-supervised pretraining into a segment-level encoder for short physiological waveforms followed by a temporal encoder for sequences of those embeddings across multi-day horizons enables the capture of both local signal semantics and complex longitudinal dynamics. Pretrained on over 6.8M hours of recordings from 324k individuals for the segment stage and 5.3M hours from 10k individuals for the temporal stage, the resulting model achieves superior performance across 58 diverse downstream tasks spanning demographics, lifestyle, health conditions, and medications.

What carries the argument

The two-stage hierarchical self-supervised learning framework: a segment-level encoder pretrained to extract local embeddings from short waveforms, followed by a temporal encoder trained to model sequences of those embeddings over multi-day horizons.

If this is right

The framework scales self-supervised pretraining to extreme sequence lengths without prohibitive computation.
Both fine morphological details in waveforms and extended behavioral patterns such as circadian variations become available for downstream use.
Large-scale unlabeled wearable data can be leveraged to improve accuracy on tasks with scarce ground-truth labels.
The learned representations support better results across a broad range of phenotype predictions including demographics, lifestyle factors, health conditions, and medications.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The embeddings could support anomaly detection or trend forecasting in ongoing physiological monitoring beyond the reported tasks.
Similar hierarchical decomposition might apply to other continuous high-frequency sensor streams such as environmental or industrial time series.
Combining these representations with sparse clinical labels could enable more robust personalized inference in real-world health applications.

Load-bearing premise

The two-stage split and embedding aggregation from short segments to long sequences preserves all predictive information present in the original high-resolution longitudinal waveforms.

What would settle it

A controlled experiment showing that a single-stage model trained directly on raw long waveforms or using different aggregation methods matches or exceeds accuracy on the same 58 tasks would indicate the hierarchical separation discards useful information.

read the original abstract

Wearable sensors enable the continuous acquisition of high-resolution physiological waveforms, such as photoplethysmography and accelerometry, under free-living conditions. However, inferring health-related phenotypes from these signals presents significant challenges due to high sampling frequencies, multimodal dependencies, and extreme sequence lengths (e.g., weeks of recordings), compounded by a scarcity of ground-truth labels. To address these challenges, existing self-supervised learning (SSL) methodologies typically follow two paradigms: (1) learning rich morphological representations from short waveform segments while collapsing longitudinal dynamics through simple aggregation, or (2) modeling behavioral patterns from coarse, hand-crafted features (e.g. heart rate, step counts) spanning longer horizons but foregoing subtle, predictive signatures in raw waveforms. To bridge this gap, we propose WavesFM, a foundation model utilizing a two-stage SSL framework for longitudinal physiological data. Specifically, we decompose the learning problem into two stages: first, a segment-level encoder is pretrained to extract local embeddings from short waveforms; subsequently, a temporal encoder is trained to model the sequence of these embeddings across a multi-day horizon. This hierarchical approach overcomes the computational complexity of high-resolution, long-sequence data, allowing the overall model to capture both local signal semantics and the complex circadian and inter-day variations governing physiological dynamics. Pretrained on over 6.8M hours (N=324k individuals) of recordings for the first stage and 5.3M hours (N=10k) for the second stage, WavesFM demonstrates superior performance across 58 diverse tasks spanning demographics, lifestyle, health conditions, and medications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

WavesFM offers a practical two-stage SSL split for long wearable waveforms at large scale, but the abstract's superiority claim on 58 tasks lacks the numbers and controls needed to judge it.

read the letter

The main point on WavesFM is that it splits self-supervised pretraining into a segment encoder for short raw clips and a temporal encoder on the resulting embedding sequences, trained on millions of hours of wearable data. This is a straightforward way to handle the length and resolution of continuous PPG and accelerometer streams without immediate compute blowup, and it targets the gap between local morphology and multi-day patterns like circadian shifts. The scale stands out: 6.8M hours from 324k people for the first stage and 5.3M hours from 10k for the second. That kind of data volume is real work and gives the approach a shot at learning general representations for downstream health tasks. The framing around label scarcity and free-living recordings is also on target for digital health applications. What the paper does less well is the evaluation. The abstract states superior performance across 58 tasks covering demographics, lifestyle, conditions, and medications, yet supplies no metrics, baseline details, statistical tests, task selection rules, or checks for pretraining-evaluation leakage. Without those, the central empirical result is hard to assess. The stress-test worry about information loss in fixed segment embeddings also lands, because the description gives no ablations against joint fine-tuning of both stages or against end-to-end models on aggregated raw waveforms. If the segment embeddings drop subtle long-horizon signals, the temporal stage starts from an incomplete input. This work is aimed at researchers building foundation models for continuous sensor data in health. A reader working on SSL for time series or wearable analytics would find the architecture and scale worth discussing, but only after seeing the full tables and controls. It deserves peer review so the results can be checked and strengthened, even though the current write-up leaves the performance claims under-supported.

Referee Report

2 major / 1 minor

Summary. The paper introduces WavesFM, a two-stage self-supervised learning foundation model for longitudinal wearable sensor waveforms (e.g., PPG and accelerometry). It first pretrains a segment-level encoder on short clips from 6.8M hours of data across 324k individuals, then trains a temporal encoder on the resulting embedding sequences from 5.3M hours across 10k individuals, claiming this hierarchical approach captures both local morphology and multi-day dynamics to achieve superior performance on 58 diverse downstream tasks spanning demographics, lifestyle, health conditions, and medications.

Significance. If the two-stage decomposition successfully retains task-relevant information from raw high-resolution longitudinal waveforms, the work could meaningfully advance scalable representation learning for wearable health data, where standard SSL either collapses long-range dynamics or relies on coarse hand-crafted features. The reported pretraining scale on real free-living recordings is a concrete strength that, if paired with rigorous validation, would support broader adoption of hierarchical SSL in physiological signal modeling.

major comments (2)

Abstract: the central claim of superior performance across 58 tasks supplies no quantitative metrics, baseline comparisons, statistical tests, task selection criteria, or discussion of potential confounds such as pretraining-evaluation data leakage, leaving the empirical superiority assertion unsupported by evidence in the manuscript.
Two-stage SSL framework (as described in the abstract and methods): the architecture implicitly treats segment embeddings as a sufficient statistic for all downstream phenotypes, yet no ablations are reported that compare the frozen two-stage model against (a) joint end-to-end fine-tuning of both stages or (b) direct temporal modeling on temporally aggregated raw waveforms; without such controls it is impossible to confirm that the embedding aggregation step does not discard subtle longitudinal predictive signal.

minor comments (1)

Abstract: the description of the 58 tasks and the exact SSL objectives used in each stage lack sufficient detail for readers to assess novelty relative to prior contrastive or masked-modeling approaches on waveforms.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. We address each major comment point by point below, providing clarifications from the full paper and indicating where revisions will be made to strengthen the presentation.

read point-by-point responses

Referee: Abstract: the central claim of superior performance across 58 tasks supplies no quantitative metrics, baseline comparisons, statistical tests, task selection criteria, or discussion of potential confounds such as pretraining-evaluation data leakage, leaving the empirical superiority assertion unsupported by evidence in the manuscript.

Authors: The abstract is necessarily concise due to length limits and focuses on the high-level contribution. The full manuscript provides the requested details in the Results section (including Tables 2-5 and Figures 3-6), which report per-task metrics (e.g., AUC, F1, MAE), comparisons against 12 baselines, statistical significance via paired t-tests and Bonferroni correction, and task selection criteria (all publicly available UK Biobank and All of Us phenotypes meeting minimum sample-size thresholds). Data leakage is prevented by partitioning at the individual level with no temporal overlap between pretraining and evaluation cohorts, as stated in Section 3.4. To improve accessibility, we will revise the abstract to include two key quantitative highlights (average relative improvement and mention of significance) while retaining the overall claim. revision: yes
Referee: Two-stage SSL framework (as described in the abstract and methods): the architecture implicitly treats segment embeddings as a sufficient statistic for all downstream phenotypes, yet no ablations are reported that compare the frozen two-stage model against (a) joint end-to-end fine-tuning of both stages or (b) direct temporal modeling on temporally aggregated raw waveforms; without such controls it is impossible to confirm that the embedding aggregation step does not discard subtle longitudinal predictive signal.

Authors: The hierarchical decomposition is required for tractability: direct temporal modeling on raw 100 Hz waveforms over multi-day horizons exceeds available GPU memory even with aggressive downsampling, which is why we first learn a segment encoder on 6.8 M hours and then a temporal encoder on the resulting embeddings. We already include an ablation replacing the temporal encoder with mean pooling of segment embeddings (Section 4.3), demonstrating consistent gains from modeling temporal dynamics. We did not perform full end-to-end fine-tuning of both stages on the entire 58-task suite because of compute cost, but we agree this would be informative. We will add end-to-end fine-tuning results on a representative subset of 10 tasks and a discussion of memory constraints for raw-waveform baselines in the revised Methods and Results sections. revision: partial

Circularity Check

0 steps flagged

No circularity: hierarchical SSL performance claims are empirical and self-contained

full rationale

The paper describes a standard two-stage self-supervised pretraining pipeline (segment encoder on short clips, followed by temporal encoder on embeddings) trained on external wearable datasets using conventional SSL objectives. Downstream evaluation on 58 tasks is reported as empirical results with no equations, fitted parameters, or predictions that reduce by construction to quantities defined inside the paper. No self-citations are invoked as load-bearing uniqueness theorems, and the architecture choice does not smuggle in ansatzes or rename known results. The derivation chain is therefore independent of its own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the standard SSL assumption that pretext tasks on unlabeled data yield useful representations and on the unstated premise that the chosen segment length and temporal horizon preserve longitudinal information; no explicit free parameters or invented entities are described.

axioms (1)

domain assumption Self-supervised pretraining on unlabeled physiological waveforms produces embeddings that transfer to diverse downstream health-related tasks.
Invoked by the entire two-stage pretraining and fine-tuning pipeline.

pith-pipeline@v0.9.0 · 5662 in / 1363 out tokens · 39355 ms · 2026-05-12T04:03:07.991834+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean (Jcost uniqueness, washburn_uniqueness_aczel) washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

two-stage SSL framework: segment-level encoder pretrained via subject-contrastive learning on 15 s windows; temporal encoder trained on 5-min binned embeddings with masked reconstruction, factorized day-of-week / time-of-day positional encodings and dual-branch decoder
IndisputableMonolith/Foundation/DimensionForcing.lean (8-tick period, D=3 forcing) reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

multi-scale masking with patch sizes 1/2/4 (5/20/60 min) and 8-tick period never mentioned; no golden-ratio spacing or recognition-cost term appears

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages · 1 internal anchor

[1]

Wearables and the medical revolution.Personalized Medicine, 15(5):429–448, 2018

Jessilyn Dunn, Ryan Runge, and Michael Snyder. Wearables and the medical revolution.Personalized Medicine, 15(5):429–448, 2018

work page 2018
[2]

High-performance medicine: the convergence of human and artificial intelligence.Nature medicine, 25(1):44–56, 2019

Eric J Topol. High-performance medicine: the convergence of human and artificial intelligence.Nature medicine, 25(1):44–56, 2019

work page 2019
[3]

The role of the circadian system in fractal neurophysiological control.Biological Reviews, 88(4):873–894, 2013

Benjamin R Pittman-Polletta, Frank AJL Scheer, Matthew P Butler, Steven A Shea, and Kun Hu. The role of the circadian system in fractal neurophysiological control.Biological Reviews, 88(4):873–894, 2013

work page 2013
[4]

Circadian rhythms and the molecular clock in cardiovascular biology and disease.Nature Reviews Cardiology, 16(7):437–447, 2019

Sandra Crnko, Bastiaan C Du Pré, Joost PG Sluijter, and Linda W Van Laake. Circadian rhythms and the molecular clock in cardiovascular biology and disease.Nature Reviews Cardiology, 16(7):437–447, 2019

work page 2019
[5]

Morning surge in blood pressure and cardiovascular risk: evidence and perspectives

Kazuomi Kario. Morning surge in blood pressure and cardiovascular risk: evidence and perspectives. Hypertension, 56(5):765–773, 2010

work page 2010
[6]

Mustafa Halimeh, Yonghua Yang, Theodore Sheehan, Solveig Vieluf, Michele Jackson, Tobias Loddenkem- per, and Christian Meisel. Wearable device assessments of antiseizure medication effects on diurnal patterns of electrodermal activity, heart rate, and heart rate variability.Epilepsy & Behavior, 129:108635, 2022

work page 2022
[7]

Glucotypes reveal new patterns of glucose dysregulation.PLoS biology, 16(7): e2005143, 2018

Heather Hall, Dalia Perelman, Alessandra Breschi, Patricia Limcaoco, Ryan Kellogg, Tracey McLaughlin, and Michael Snyder. Glucotypes reveal new patterns of glucose dysregulation.PLoS biology, 16(7): e2005143, 2018. 11 WavesFM: Hierarchical Representation Learning for Longitudinal Wearable Sensor Waveforms

work page 2018
[8]

Wearable multimodal sensing for quantifying the cardiovascular autonomic effects of levodopa in parkinsonism.Frontiers in Network Physiology, 5:1543838, 2025

John A Berkebile, Omer T Inan, and Paul A Beach. Wearable multimodal sensing for quantifying the cardiovascular autonomic effects of levodopa in parkinsonism.Frontiers in Network Physiology, 5:1543838, 2025

work page 2025
[9]

Large-scale training of foundation models for wearable biosignals

Salar Abbaspourazad, Oussama Elachqar, Andrew Miller, Saba Emrani, Udhyakumar Nallasamy, and Ian Shapiro. Large-scale training of foundation models for wearable biosignals. InThe Twelfth International Conference on Learning Representations, 2024

work page 2024
[10]

Wearable accelerom- eter foundation models for health via knowledge distillation.arXiv preprint arXiv:2412.11276, 2024

Salar Abbaspourazad, Anshuman Mishra, Joseph Futoma, Andrew C Miller, and Ian Shapiro. Wearable accelerometer foundation models for health via knowledge distillation.arXiv preprint arXiv:2412.11276, 2024

work page arXiv 2024
[11]

Papagei: Open founda- tion models for optical physiological signals

Arvind Pillai, Dimitris Spathis, Fahim Kawsar, and Mohammad Malekzadeh. Papagei: Open founda- tion models for optical physiological signals. InThe Thirteenth International Conference on Learning Representations, 2025

work page 2025
[12]

Mithun Saha, Maxwell A Xu, Wanting Mao, Sameer Neupane, James M Rehg, and Santosh Kumar. Pulse-ppg: An open-source field-trained ppg foundation model for wearable applications across lab and field settings.Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 9(3): 1–35, 2025

work page 2025
[13]

Anyppg: An ecg-guided ppg foundation model trained on over 100,000 hours of recordings for holistic health profiling.arXiv preprint arXiv:2511.01747, 2025

Guangkun Nie, Gongzheng Tang, Yujie Xiao, Jun Li, Shun Huang, Deyun Zhang, Qinghao Zhao, and Shenda Hong. Anyppg: An ecg-guided ppg foundation model trained on over 100,000 hours of recordings for holistic health profiling.arXiv preprint arXiv:2511.01747, 2025

work page arXiv 2025
[14]

Gpt-ppg: a gpt-based foundation model for photoplethysmography signals.Physiological Measurement, 46 (5):055004, 2025

Zhaoliang Chen, Cheng Ding, Saurabh Kataria, Runze Yan, Minxiao Wang, Randall Lee, and Xiao Hu. Gpt-ppg: a gpt-based foundation model for photoplethysmography signals.Physiological Measurement, 46 (5):055004, 2025

work page 2025
[15]

Himae: Hierarchical masked autoencoders discover resolution-specific structure in wearable time series.arXiv preprint arXiv:2510.25785, 2025

Simon A Lee, Cyrus Tanade, Hao Zhou, Juhyeon Lee, Megha Thukral, Minji Han, Rachel Choi, Md Saz- zad Hissain Khan, Baiying Lu, Migyeong Gwak, et al. Himae: Hierarchical masked autoencoders discover resolution-specific structure in wearable time series.arXiv preprint arXiv:2510.25785, 2025

work page arXiv 2025
[16]

Xiao Gu, Wei Tang, Jinpei Han, Veer Sangha, Fenglin Liu, Shreyank N Gowda, Antonio H Ribeiro, Patrick Schwab, Kim Branson, Lei Clifton, et al. Cardiac health assessment across scenarios and devices using a multimodal foundation model pretrained on data from 1.7 million individuals.Nature Machine Intelligence, 8(2):220–233, 2026

work page 2026
[17]

Tailor, Jacob Sunshine, Yun Liu, Tim Althoff, Shrikanth Narayanan, Pushmeet Kohli, Jiening Zhan, Mark Malhotra, Shwetak Patel, Samy Abdel-Ghaffar, and Daniel McDuff

Girish Narayanswamy, Xin Liu, Kumar Ayush, Yuzhe Yang, Xuhai Xu, shun liao, Jake Garrison, Shyam A. Tailor, Jacob Sunshine, Yun Liu, Tim Althoff, Shrikanth Narayanan, Pushmeet Kohli, Jiening Zhan, Mark Malhotra, Shwetak Patel, Samy Abdel-Ghaffar, and Daniel McDuff. Scaling wearable foundation models. InThe Thirteenth International Conference on Learning R...

work page 2025
[18]

Lsm-2: Learning from incomplete wearable sensor data.arXiv preprint arXiv:2506.05321, 2025

Maxwell A Xu, Girish Narayanswamy, Kumar Ayush, Dimitris Spathis, Shun Liao, Shyam A Tailor, Ahmed Metwally, A Ali Heydari, Yuwei Zhang, Jake Garrison, et al. Lsm-2: Learning from incomplete wearable sensor data.arXiv preprint arXiv:2506.05321, 2025

work page arXiv 2025
[19]

Beyond sensor data: Foundation models of behavioral data from wearables improve health predictions

ErayErturk,FahadKamran,SalarAbbaspourazad,SeanJewell,HarshSharma,YujieLi,SineadWilliamson, Nicholas J Foti, and Joseph Futoma. Beyond sensor data: Foundation models of behavioral data from wearables improve health predictions. InForty-second International Conference on Machine Learning, 2025

work page 2025
[20]

Photoplethysmography in wearable devices: a comprehensive review of technological advances, current challenges, and future directions.Electronics, 12(13):2923, 2023

Kwang Bok Kim and Hyun Jae Baek. Photoplethysmography in wearable devices: a comprehensive review of technological advances, current challenges, and future directions.Electronics, 12(13):2923, 2023

work page 2023
[21]

A review of accelerometry-based wearable motion detectors for physical activity monitoring.Sensors, 10(8):7772–7788, 2010

Che-Chang Yang and Yeh-Liang Hsu. A review of accelerometry-based wearable motion detectors for physical activity monitoring.Sensors, 10(8):7772–7788, 2010

work page 2010
[22]

Motion artifact reduction in wearable photoplethysmography based on multi-channel sensors with multiple wavelengths.Sensors, 20(5):1493, 2020

Jongshill Lee, Minseong Kim, Hoon-Ki Park, and In Young Kim. Motion artifact reduction in wearable photoplethysmography based on multi-channel sensors with multiple wavelengths.Sensors, 20(5):1493, 2020. 12 WavesFM: Hierarchical Representation Learning for Longitudinal Wearable Sensor Waveforms

work page 2020
[23]

Correlates of heart rate measures with incidental physical activity and cardiorespiratory fitness in overweight female workers.Frontiers in physiology, 6:405, 2016

Laís Tonello, Felipe F Reichert, Iransé Oliveira-Silva, Sebastián Del Rosso, Anthony S Leicht, and Daniel A Boullosa. Correlates of heart rate measures with incidental physical activity and cardiorespiratory fitness in overweight female workers.Frontiers in physiology, 6:405, 2016

work page 2016
[24]

Daivaras Sokas, Egl˙e Tamuleviči¯ut˙e-Prascien˙e, Aurelija Beigien˙e, Vitalija Barasait˙e, Julius Marozas, Raimondas Kubilius, Raquel Bailón, and Andrius Petr˙enas. Wearable-based assessment of heart rate response to physical stressors in patients after open-heart surgery with frailty.IEEE Journal of Biomedical and Health Informatics, 27(4):1825–1834, 2023

work page 2023
[25]

Developmentandvalidation of an ecg-based 10-year risk prediction model for major adverse cardiac and cerebrovascular events in uk biobank.medRxiv, pages 2026–03, 2026

AdamSturge, StefanvanDuijvenboden, BarbaraCasadei, andAidenDoherty. Developmentandvalidation of an ecg-based 10-year risk prediction model for major adverse cardiac and cerebrovascular events in uk biobank.medRxiv, pages 2026–03, 2026

work page 2026
[26]

Assessment of physical activity in adults using wrist accelerometers.Epidemiologic reviews, 43(1):65–93, 2021

Fangyu Liu, Amal A Wanigatunga, and Jennifer A Schrack. Assessment of physical activity in adults using wrist accelerometers.Epidemiologic reviews, 43(1):65–93, 2021

work page 2021
[27]

Multiday rhythms modulate human heart rate: an observational study in healthy adults.bioRxiv, pages 2026–03, 2026

Rochelle I De Silva, Rachel E Stirling, Jodie Naim-Feil, Shivam Puri, Elizabeth Paratz, and Philippa J Karoly. Multiday rhythms modulate human heart rate: an observational study in healthy adults.bioRxiv, pages 2026–03, 2026

work page 2026
[28]

Objective evaluation of physical activity pattern using smart devices.Scientific Reports, 9(1):2006, 2019

Monika Šimaityt˙e, Andrius Petr˙enas, Julija Kravčenko, Eleni Kaldoudi, and Vaidotas Marozas. Objective evaluation of physical activity pattern using smart devices.Scientific Reports, 9(1):2006, 2019

work page 2006
[29]

The impact of chronotype on circadian rest-activity rhythm and sleep characteristics across the week.Chronobiology International, 38(11): 1575–1590, 2021

Chris Brooks, Nina Shaafi Kabiri, Jaspreet Bhangu, Xuemei Cai, Eve Pickering, Michael Kelley Erb, Sanford Auerbach, Paolo Bonato, Tara L Moore, Farzad Mortazavi, et al. The impact of chronotype on circadian rest-activity rhythm and sleep characteristics across the week.Chronobiology International, 38(11): 1575–1590, 2021

work page 2021
[30]

Siamquality: a convnet-based foundation model for photoplethysmography signals.Physiological Measurement, 45(8): 085004, 2024

Cheng Ding, Zhicheng Guo, Zhaoliang Chen, Randall J Lee, Cynthia Rudin, and Xiao Hu. Siamquality: a convnet-based foundation model for photoplethysmography signals.Physiological Measurement, 45(8): 085004, 2024

work page 2024
[31]

Self-supervised learning for human activity recognition using 700,000 person-days of wearable data.NPJ digital medicine, 7(1):91, 2024

Hang Yuan, Shing Chan, Andrew P Creagh, Catherine Tong, Aidan Acquah, David A Clifton, and Aiden Doherty. Self-supervised learning for human activity recognition using 700,000 person-days of wearable data.NPJ digital medicine, 7(1):91, 2024

work page 2024
[32]

Relcon: Relative contrastive learning for a motion foundation model for wearable data

Maxwell A Xu, Jaya Narain, Gregory Darnell, Haraldur T Hallgrimsson, Hyewon Jeong, Darren Forde, Richard Andres Fineman, Karthik Jayaraman Raghuram, James Matthew Rehg, and Shirley You Ren. Relcon: Relative contrastive learning for a motion foundation model for wearable data. InThe Thirteenth International Conference on Learning Representations, 2025

work page 2025
[33]

Wearable foundation models should go beyond static encoders.arXiv preprint arXiv:2603.19564, 2026

Yu Yvonne Wu, Yuwei Zhang, Hyungjun Yoon, Ting Dang, Dimitris Spathis, Tong Xia, Qiang Yang, Jing Han, Dong Ma, Sung-Ju Lee, et al. Wearable foundation models should go beyond static encoders.arXiv preprint arXiv:2603.19564, 2026

work page arXiv 2026
[34]

Sensorlm: Learning the language of wearable sensors

Yuwei Zhang, Kumar Ayush, Siyuan Qiao, A Ali Heydari, Girish Narayanswamy, Maxwell A Xu, Ahmed Metwally, Jinhua Xu, Jake Garrison, Xuhai Xu, et al. Sensorlm: Learning the language of wearable sensors. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025

work page 2025
[35]

Representation Learning with Contrastive Predictive Coding

Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation learning with contrastive predictive coding.arXiv preprint arXiv:1807.03748, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[36]

A simple framework for contrastive learning of visual representations

Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. InInternational conference on machine learning, pages 1597–1607. PmLR, 2020

work page 2020
[37]

Attention is all you need.Advances in neural information processing systems, 30, 2017

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need.Advances in neural information processing systems, 30, 2017. 13 WavesFM: Hierarchical Representation Learning for Longitudinal Wearable Sensor Waveforms

work page 2017
[38]

Lau, Jan C

Dominique Makowski, Tam Pham, Zen J. Lau, Jan C. Brammer, François Lespinasse, Hung Pham, Christo- pher Schölzel, and S. H. Annabel Chen. NeuroKit2: A python toolbox for neurophysiological signal processing.Behavior Research Methods, 53(4):1689–1696, feb 2021. doi: 10.3758/s13428-020-01516-y

work page doi:10.3758/s13428-020-01516-y 2021
[39]

The effect of obesity and weight loss on aortic pulse wave velocity as assessed by magnetic resonance imaging.Obesity, 18(12):2311–2316, 2010

Oliver J Rider, Upasana Tayal, Jane M Francis, Mohammed K Ali, Monique R Robinson, James P Byrne, Kieran Clarke, and Stefan Neubauer. The effect of obesity and weight loss on aortic pulse wave velocity as assessed by magnetic resonance imaging.Obesity, 18(12):2311–2316, 2010

work page 2010
[40]

Central blood pressure under angiotensin and calcium channel blockade, 2009

Michel E Safar, Athanase Protogerou, and Jacques Blacher. Central blood pressure under angiotensin and calcium channel blockade, 2009

work page 2009
[41]

Effects of antipsychotics on circadian rhythms in humans: a systematic review and meta-analysis.Progress in Neuro-Psychopharmacology and Biological Psychiatry, 108:110162, 2021

Eunsoo Moon, Paola Lavin, Kai-Florian Storch, and Outi Linnaranta. Effects of antipsychotics on circadian rhythms in humans: a systematic review and meta-analysis.Progress in Neuro-Psychopharmacology and Biological Psychiatry, 108:110162, 2021

work page 2021
[42]

The effect of alcohol on subsequent sleep in healthy adults: A systematic review and meta-analysis.Sleep Medicine Reviews, 80:102030, 2025

Carissa Gardiner, Jonathon Weakley, Louise M Burke, Gregory D Roach, Charli Sargent, Nirav Maniar, Minh Huynh, Dean J Miller, Andrew Townshend, and Shona L Halson. The effect of alcohol on subsequent sleep in healthy adults: A systematic review and meta-analysis.Sleep Medicine Reviews, 80:102030, 2025

work page 2025
[43]

Roformer: Enhanced transformer with rotary position embedding.Neurocomputing, 568:127063, 2024

Jianlin Su, Murtadha Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. Roformer: Enhanced transformer with rotary position embedding.Neurocomputing, 568:127063, 2024

work page 2024
[44]

Longshort-termmemory.Neuralcomputation,9(8):1735–1780, 1997

SeppHochreiterandJürgenSchmidhuber. Longshort-termmemory.Neuralcomputation,9(8):1735–1780, 1997

work page 1997
[45]

Masked autoencoders are scalable vision learners

Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, and Ross Girshick. Masked autoencoders are scalable vision learners. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 16000–16009, 2022. 14 WavesFM: Hierarchical Representation Learning for Longitudinal Wearable Sensor Waveforms A. Discussion A.1. Lim...

work page arXiv 2022