arxiv: 2604.10404 · v1 · submitted 2026-04-12 · 💻 cs.ET · cs.LG

Recognition: 2 theorem links

Sense Less, Infer More: Agentic Multimodal Transformers for Edge Medical Intelligence

Brandon Lee, Chengwei Zhou, Christopher Pulliam, Gourav Datta, Haotian Yu, Massoud Pedram, Steve Majerus, Xuming Chen, Zhaoyan Jia

Pith reviewed 2026-05-10 16:40 UTC · model grok-4.3

classification 💻 cs.ET cs.LG

keywords adaptive multimodal sensing · agentic transformers · edge medical intelligence · sensor gating · sigma-delta sampling · wearable health monitoring · energy-efficient inference

The pith

An end-to-end agentic framework learns to gate sensors and skip redundant samples, cutting usage by 48.8 percent while raising accuracy 1.9 percent on physiological monitoring tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Adaptive Multimodal Intelligence as a system that jointly trains a controller to choose active sensors and a sensing module to ignore temporally redundant patches. This setup, built around a cross-modal transformer that fuses partial inputs, is evaluated on three standard wearable datasets and reports both lower sensor activity and higher classification performance than prior methods. The motivation is that continuous streams from ECG, PPG, EMG, and IMU devices exhaust batteries in hours, so selective sensing directly extends usable lifetime without sacrificing diagnostic reliability. Joint optimization of accuracy, sparsity, alignment, and predictive coding lets the model adapt to missing modalities at inference time.

Core claim

By training a differentiable Gumbel-Sigmoid modality controller together with learnable-threshold Sigma-Delta patch sampling inside a foundation-encoder-plus-cross-modal-transformer backbone, the resulting model achieves robust fusion from sparser inputs and thereby delivers both energy reduction and accuracy gains on edge medical tasks.

What carries the argument

The Agentic Modality Controller (Gumbel-Sigmoid gating on model confidence and task relevance) paired with the Learned Sigma-Delta Sensing module (patch-wise operations with trainable thresholds) inside the jointly optimized multimodal prediction network.
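To make the gating mechanism concrete, here is a minimal pure-Python sketch of Gumbel-Sigmoid gating over four modalities. It is not the authors' implementation: the logit values, the temperature, the 0.5 hard threshold, and the function names are all assumptions chosen for illustration, and the penalty simply averages the soft gate values in the form of the gating term quoted further down this page.

```python
import math
import random

def gumbel_sigmoid(logit, temperature=0.5, rng=random):
    """Relaxed Bernoulli sample in [0, 1] for one sensor gate (illustrative)."""
    # Logistic noise equals the difference of two independent Gumbel(0, 1) samples.
    u = min(max(rng.random(), 1e-9), 1.0 - 1e-9)
    noise = math.log(u) - math.log(1.0 - u)
    return 1.0 / (1.0 + math.exp(-(logit + noise) / temperature))

def gate_sensors(logits, temperature=0.5, rng=random):
    """Return (soft, hard) gates per modality; hard gates threshold soft ones at 0.5."""
    soft = {m: gumbel_sigmoid(l, temperature, rng) for m, l in logits.items()}
    hard = {m: int(p > 0.5) for m, p in soft.items()}
    return soft, hard

def gating_penalty(soft):
    """Mean soft gate value: larger when more sensors are (softly) switched on."""
    return sum(soft.values()) / len(soft)

rng = random.Random(0)
# Hypothetical controller outputs; in the paper these come from a learned network.
logits = {"ECG": 2.0, "PPG": 0.0, "EMG": -2.0, "IMU": 1.0}
soft, hard = gate_sensors(logits, rng=rng)
penalty = gating_penalty(soft)
```

In a trainable version the soft values would carry gradients back to the controller while the hard values decide which sensors are actually powered; that split is a common design for relaxed gates, not something this page confirms about AMI.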

If this is right

Wearable devices can operate for longer periods on the same battery by activating fewer sensors.
The architecture supports dynamic computation graphs and masked operations that translate directly into hardware latency and power savings.
Performance remains stable even when some modalities are gated off, removing the need for always-on acquisition.
The same joint-training recipe applies across ECG, PPG, EMG, and IMU streams without separate per-modality pipelines.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same gating-plus-sparse-sampling pattern could be applied to non-medical edge sensing tasks such as industrial vibration monitoring or environmental IoT nodes.
Extending the learned thresholds to adapt online during deployment would further reduce the need for periodic retraining.
Because the transformer backbone already handles missing inputs, the framework may tolerate intermittent wireless dropouts in addition to deliberate sensor gating.

Load-bearing premise

That the learned gating thresholds and cross-modal predictions will continue to preserve accuracy and energy savings when the model encounters real hardware noise, new patients, or longer recording durations absent from the three evaluation datasets.

What would settle it

Running the trained model on physical wearable hardware with a fresh patient cohort over multi-day recordings and checking whether measured battery lifetime and diagnostic accuracy match the reported 48.8 percent sensor reduction and 1.9 percent accuracy lift.

Figures

Figures reproduced from arXiv: 2604.10404 by Brandon Lee, Chengwei Zhou, Christopher Pulliam, Gourav Datta, Haotian Yu, Massoud Pedram, Steve Majerus, Xuming Chen, Zhaoyan Jia.

**Figure 1.** Using power values reported in sensor datasheets (e.g., 0.3–1 mW IMU, 1–5 mW ECG, 6–15 mW EMG, 4–10 mW PPG), a wearable with a 300 mWh battery can support each sensor alone for hundreds of hours, yet combining them reduces runtime to under 10 hours. This mismatch between multimodal sensing and limited battery capacity severely restricts long-term, continuous monitoring scenarios suc…
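The battery arithmetic behind Figure 1 can be checked with a short sketch. It uses only the power ranges and the 300 mWh capacity quoted in the caption, and it assumes an ideal battery with sensor draw only (no processor, radio, or conversion losses). The `duty` parameter is a hypothetical knob that scales sensing power linearly with the fraction of time the sensors are active.

```python
BATTERY_MWH = 300.0  # battery capacity quoted in the Figure 1 caption

# (min_mW, max_mW) per sensor, as quoted in the Figure 1 caption
SENSOR_POWER_MW = {
    "IMU": (0.3, 1.0),
    "ECG": (1.0, 5.0),
    "EMG": (6.0, 15.0),
    "PPG": (4.0, 10.0),
}

def runtime_hours(power_mw, battery_mwh=BATTERY_MWH):
    """Ideal runtime in hours for a constant power draw in milliwatts."""
    return battery_mwh / power_mw

def combined_runtime(duty=1.0):
    """(best, worst) runtime in hours with all sensors active a fraction `duty` of the time."""
    low = sum(lo for lo, _ in SENSOR_POWER_MW.values()) * duty
    high = sum(hi for _, hi in SENSOR_POWER_MW.values()) * duty
    return runtime_hours(low), runtime_hours(high)

# Per-sensor (shortest, longest) runtime when that sensor runs alone.
per_sensor = {
    name: (runtime_hours(hi), runtime_hours(lo))
    for name, (lo, hi) in SENSOR_POWER_MW.items()
}
best, worst = combined_runtime()  # best ≈ 26.5 h, worst ≈ 9.7 h
```

Under these assumptions the "under 10 hours" figure corresponds to the upper end of the quoted ranges (31 mW combined), while the lower end (11.3 mW) gives about 26.5 hours. The EMG range alone works out to 20 to 50 hours rather than hundreds. If the reported 48.8% usage reduction translated linearly into sensing power, which is an assumption and not a measured result, `combined_runtime(duty=1 - 0.488)` would put the upper-end case at roughly 19 hours.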

**Figure 2.** Our Unified Agentic Multimodal Sensing and Inference Framework for Efficient, High-Accuracy Biomedical AI. However, these models rely on dense, continuous sampling from all sensors and incur heavy computational and energy costs. In parallel, sensor selection has been explored through sparsity methods [51, 54], RL-based policies [48], contextual bandits [14], differentiable gating [28], information-theore…

**Figure 4.** Thresholding and skipping in Sigma–Delta Sensing.
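The thresholding-and-skipping idea can be illustrated with a simple send-on-delta rule: a patch is sensed only when it differs enough from the last patch that was actually sensed, and is otherwise replaced by that reference. This is a sketch in the spirit of Sigma-Delta sensing, not the paper's module. The mean-absolute-difference measure, the fixed threshold (the paper learns its thresholds), and the toy signal are all assumptions.

```python
def sigma_delta_skip(patches, threshold):
    """Return (mask, reconstructed); mask[i] is 1 if patch i is sensed, 0 if skipped."""
    mask, reconstructed = [], []
    reference = None  # last patch that was actually sensed
    for patch in patches:
        if reference is None:
            change = float("inf")  # always sense the first patch
        else:
            # Mean absolute difference against the last sensed patch.
            change = sum(abs(a - b) for a, b in zip(patch, reference)) / len(patch)
        if change > threshold:
            reference = list(patch)
            mask.append(1)
        else:
            mask.append(0)
        # Skipped patches are filled in with the reference patch.
        reconstructed.append(list(reference))
    return mask, reconstructed

# Toy two-sample patches: small drift, a jump, small drift, a jump back.
signal = [[0.0, 0.1], [0.0, 0.12], [1.0, 1.1], [1.02, 1.1], [0.0, 0.1]]
mask, recon = sigma_delta_skip(signal, threshold=0.2)
sensing_rate = sum(mask) / len(mask)  # fraction of patches actually sensed
```

On this toy input the rule senses patches 0, 2, and 4 and skips the two near-duplicates, so the sensing rate is 0.6. A learned threshold would move that trade-off per modality or per patch position.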

**Figure 5.** Training pipeline with unrolled timesteps optimized via backpropagation through time (BPTT). Each fused state $S_t$ produces a prediction loss, gating loss, predictive coding loss, and a contrastive alignment loss computed against a memory bank. The controller's gating actions $A_t$ influence future observations, and all losses jointly update the model through temporal backpropagation.

**Figure 6.** Sensing rate heatmap over patches obtained from the proposed method on (Left) MHEALTH and (Right) HMC.

**Figure 7.** Per-iteration latency (left) and energy (right) as the sensing rate varies on MHEALTH. Measurements are across ARM CPU, Jetson (TensorRT), and A6000 (PyTorch and TensorRT). Latency is decomposed into AMC (Agentic Modality Controller) and FMPM (Foundation-backed Multimodal Prediction Model) components, and values are in √ms (square root of milliseconds). … deployments on Jetson and A6000 apply standard optimizations such as layer fusion, kernel autotuning, and FP16 execution.

Original abstract

Edge-based multimodal medical monitoring requires models that balance diagnostic accuracy with severe energy constraints. Continuous acquisition of ECG, PPG, EMG, and IMU streams rapidly drains wearable batteries, often limiting operation to under 10 hours, while existing systems overlook the high temporal redundancy present in physiological signals. We introduce Adaptive Multimodal Intelligence (AMI), an end-to-end framework that jointly learns when to sense and how to infer. AMI integrates three components: (1) a lightweight Agentic Modality Controller that uses differentiable Gumbel-Sigmoid gating to dynamically select active sensors based on model confidence and task relevance; (2) a Learned Sigma-Delta Sensing module that applies patch-wise Delta-Sigma operations with learnable thresholds to skip temporally redundant samples; and (3) a Foundation-backed Multimodal Prediction Model built on unimodal foundation encoders and a cross-modal transformer with temporal context, enabling robust fusion even under gated or missing inputs. These components are trained jointly via a multi-objective loss combining classification accuracy, sparsity regularization, cross-modal alignment, and predictive coding. AMI is hardware-aware, supporting dynamic computation graphs and masked operations, leading to real energy and latency savings. Across MHEALTH, HMC Sleep, and WESAD datasets, it reduces sensor usage by 48.8% while improving state-of-the-art accuracy by 1.9% on average.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper builds a joint sensing-and-inference pipeline for edge medical signals that cuts reported sensor use by nearly half with a small accuracy lift, but the numbers rest on thin validation that leaves generalization unproven.

The headline result is an end-to-end system called AMI that uses a differentiable Gumbel-Sigmoid controller to pick active sensors, learnable sigma-delta thresholds to drop redundant patches, and a foundation-model cross-modal transformer to fuse whatever remains. On MHEALTH, HMC Sleep, and WESAD it claims 48.8% lower sensor usage and 1.9% higher accuracy than prior work.

The integration of these pieces into one hardware-aware training loop is the clearest novelty; separate gating or skipping tricks exist, but the specific combination for physiological streams with masked fusion and joint multi-objective loss looks new relative to the cited literature. The paper also does a solid job stating the real battery-life limit of continuous wearables and showing how temporal redundancy can be turned into a trainable degree of freedom. The architecture choices for dynamic graphs and masked operations are practical and consistent with the goal.

The soft spots are mostly in the evaluation. The abstract gives percentage improvements without baselines, error bars, ablations, or patient-wise splits, so it is impossible to judge whether the 1.9% gain is robust or partly an artifact of post-hoc modality selection. The stress-test concern about distribution shift is fair: nothing in the provided description tests longer horizons, unseen patients, added noise, or actual on-device power draw, only proxy counts on the three training distributions. The loss weights and learnable thresholds are free parameters that could be tuned to the reported numbers.

This work is aimed at people building efficient multimodal pipelines for health monitoring who need concrete ideas for adaptive sensing. A reader already working on edge transformers or wearable signal processing could extract the gating and skipping modules as starting points.

I would send it to peer review. The technical pieces fit together coherently and the problem is worth solving, so referees can ask for the missing controls and stronger generalization checks rather than rejecting it outright.

Referee Report

3 major / 1 minor

Summary. The paper presents Adaptive Multimodal Intelligence (AMI), an end-to-end framework for edge medical intelligence that combines an Agentic Modality Controller using differentiable Gumbel-Sigmoid gating for dynamic sensor selection, a Learned Sigma-Delta Sensing module with learnable thresholds for skipping redundant samples, and a foundation-backed multimodal transformer for prediction under partial inputs. Jointly trained with a multi-objective loss, it claims to achieve 48.8% reduction in sensor usage and 1.9% average accuracy improvement over state-of-the-art on the MHEALTH, HMC Sleep, and WESAD datasets.

Significance. Should the empirical results prove robust, this contribution would be significant for the field of edge AI in healthcare, as it addresses the critical trade-off between continuous monitoring accuracy and battery life in wearables. The differentiable, hardware-aware design for adaptive sensing represents a technical advance over static or heuristic approaches, with potential for broader application in resource-constrained multimodal sensing scenarios. The joint optimization of gating, sampling, and inference is a strength.

major comments (3)

[Abstract] The headline performance claims (48.8% sensor reduction and +1.9% accuracy) are given as averages without per-dataset breakdowns, baseline specifications, error bars, or statistical tests, which are necessary to substantiate the improvements over prior work.
[Abstract] The evaluation lacks any mention of patient-independent splits, leave-one-subject-out cross-validation, or tests for robustness to noise and distribution shifts, undermining confidence in the generalization of the learned Gumbel-Sigmoid controller and Sigma-Delta thresholds to real-world unseen patients and hardware conditions.
[Abstract] Assertions of 'real energy and latency savings' and 'hardware-aware' benefits are supported only by proxy counts of sensor usage rather than direct measurements of power consumption or latency on target edge devices.

minor comments (1)

[Abstract] The description of the multi-objective loss would benefit from explicit equations or weighting details to allow reproducibility.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which identify key areas to improve the transparency and robustness of our evaluation. We address each point below and commit to revisions that strengthen the manuscript without altering its core contributions.

Referee: [Abstract] The headline performance claims (48.8% sensor reduction and +1.9% accuracy) are given as averages without per-dataset breakdowns, baseline specifications, error bars, or statistical tests, which are necessary to substantiate the improvements over prior work.

Authors: We agree that additional detail is warranted. In the revised manuscript we will add a compact per-dataset breakdown (including means, standard deviations across folds, and the exact baselines) either as a footnote to the abstract or as a new table referenced from the abstract. We will also report paired statistical tests supporting the accuracy gains. revision: yes
Referee: [Abstract] The evaluation lacks any mention of patient-independent splits, leave-one-subject-out cross-validation, or tests for robustness to noise and distribution shifts, undermining confidence in the generalization of the learned Gumbel-Sigmoid controller and Sigma-Delta thresholds to real-world unseen patients and hardware conditions.

Authors: We acknowledge the importance of subject-independent evaluation. Our experiments already follow a leave-one-subject-out protocol on all three datasets; we will explicitly document this protocol, together with the resulting per-subject variance, in the methods and abstract. We will further add controlled robustness experiments (additive sensor noise and simulated covariate shift) to the revised results section. revision: yes
Referee: [Abstract] Assertions of 'real energy and latency savings' and 'hardware-aware' benefits are supported only by proxy counts of sensor usage rather than direct measurements of power consumption or latency on target edge devices.

Authors: We agree that direct on-device measurements constitute stronger evidence. Sensor activation is the dominant power consumer in the target wearables, so the reported usage reduction is a first-order proxy; we will augment the revision with calibrated energy estimates drawn from published sensor power profiles and will quantify the latency benefit arising from the dynamic computation graph. Full end-to-end power traces on a specific microcontroller are beyond the present scope and will be noted as future work. revision: partial
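The leave-one-subject-out protocol mentioned in the second response can be sketched as a small split generator. This shows the general technique only; the subject labels and the function name are hypothetical, and nothing here reflects how the authors actually partitioned MHEALTH, HMC Sleep, or WESAD.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx), holding each subject out exactly once."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx

# Hypothetical mapping from signal windows (by index) to the subject they came from.
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
folds = list(leave_one_subject_out(subject_ids))
```

Because every window from the held-out subject lands in the test set, no subject contributes to both training and evaluation within a fold. That property is what the referee's generalization concern turns on.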

Circularity Check

0 steps flagged

No significant circularity; empirical results on external datasets

The paper's headline claims (48.8% sensor reduction, +1.9% average accuracy) are direct empirical measurements obtained by running the jointly trained AMI framework on the three external datasets MHEALTH, HMC Sleep, and WESAD. These quantities are not derived from any internal model equation, fitted parameter renamed as a prediction, or self-referential definition; they are observed outcomes of the training and evaluation procedure. The architectural components (Gumbel-Sigmoid controller, learnable Sigma-Delta thresholds, cross-modal transformer) are described as trained end-to-end via a composite loss, but the reported performance numbers remain independent experimental results rather than quantities forced by construction from the inputs or prior self-citations. No load-bearing uniqueness theorem, ansatz smuggling, or renaming of known results appears in the derivation chain.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The framework rests on standard assumptions about differentiability of Gumbel-Sigmoid and the existence of temporal redundancy in the chosen signals; no new physical entities are postulated.

free parameters (2)

multi-objective loss weights
The loss combines classification accuracy, sparsity regularization, cross-modal alignment, and predictive coding; relative weighting of these terms must be chosen or tuned.
learnable sigma-delta thresholds
Patch-wise thresholds are learned and directly control which samples are skipped.
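One plausible way to write the composite objective these free parameters enter is a weighted sum of the four terms named in the abstract. The weighted-sum form and the weight symbols $\lambda_{\text{gate}}$, $\lambda_{\text{align}}$, $\lambda_{\text{pc}}$ are assumptions made for illustration; only the shape of the gating term follows the excerpt quoted elsewhere on this page, with $M$ read as the number of modalities.

```latex
\mathcal{L}_{\text{total}}
  = \mathcal{L}_{\text{cls}}
  + \lambda_{\text{gate}}\,\mathcal{L}_{\text{gating}}
  + \lambda_{\text{align}}\,\mathcal{L}_{\text{align}}
  + \lambda_{\text{pc}}\,\mathcal{L}_{\text{pc}},
\qquad
\mathcal{L}_{\text{gating}} = \frac{1}{M}\sum_{m=1}^{M} p_{\text{soft}}^{(m)}
```

Written this way, each weight sets how strongly the corresponding term trades off against classification accuracy, which is why the ledger counts the weights as tunable.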

axioms (2)

domain assumption: Physiological signals contain sufficient temporal redundancy that skipping unchanged patches preserves diagnostic information
Invoked to justify the Learned Sigma-Delta Sensing module.
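standard math: Gumbel-Sigmoid relaxation provides low-variance reparameterized gradients for discrete sensor selection; these are exact for the relaxed objective but biased with respect to the underlying discrete one at nonzero temperature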
Used to make the modality controller differentiable.

pith-pipeline@v0.9.0 · 5568 in / 1487 out tokens · 43570 ms · 2026-05-10T16:40:39.196372+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear
a lightweight Agentic Modality Controller that uses differentiable Gumbel-Sigmoid gating... Learned Sigma-Delta Sensing module that applies patch-wise Delta-Sigma operations with learnable thresholds to skip temporally redundant samples... $\mathcal{L}_{\text{gating}} = \frac{1}{M}\sum_{m=1}^{M} p_{\text{soft}}^{(m)}$
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear
O(k*/ε²) sample complexity with logarithmic convergence

Reference graph

Works this paper leans on

55 extracted references · 17 canonical work pages · 2 internal anchors

[1] 2020. Shimmer3 Wearable Sensor Specifications. https://www.shimmersensing.com
[2] Salar Abbaspourazad et al. 2024. Large-scale Training of Foundation Models for Wearable Biosignals. In The Twelfth International Conference on Learning Representations. https://openreview.net/forum?id=pC3WJHf51j
[3] Abd Al Aleem et al. 2024. A Deep Learning Approach Using WESAD Data for Multi-Class Classification with Wearable Sensors. In 2024 6th Novel Intelligent and Leading Emerging Sciences Conference (NILES). IEEE, 194–197.
[4] N. Almujally et al. 2025. Wearable sensors-based assistive technologies for patient health monitoring. Frontiers in Bioengineering and Biotechnology 13 (2025), 1437877.
[5] Diego Alvarez-Estevez and Ronald Rijsman. 2022. Haaglanden Medisch Centrum sleep staging database (version 1.1). https://doi.org/10.13026/t79q-fr32 RRID:SCR_007345
[6] Diego Alvarez-Estevez and Roselyne M Rijsman. 2021. Inter-database validation of a deep learning approach for automatic sleep scoring. PloS one 16, 8 (2021), e0256111.
[7] Samaneh Aminikhanghahi and Diane J Cook. 2017. A survey of methods for time series change point detection. Knowledge and Information Systems 51, 2 (2017), 339–367.
[8] O. Banos et al. 2014. MHEALTH. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5TW22
[9] Yoshua Bengio et al. 2013. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013).
[10] Víctor Campos et al. 2017. Skip RNN: Learning to skip state updates in recurrent neural networks. In International Conference on Learning Representations.
[11] Emmanuel J Candès et al. 2006. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory 52, 2 (2006), 489–509.
[12] Feng Chen and TODO. 2010. Compressed sensing for wireless ECG bio-sensor networks. IEEE Transactions on Biomedical Engineering 57, 2 (2010), 139–148.
[13] Isaac Debache et al. 2020. A lean and performant hierarchical model for human activity recognition using body-mounted sensors. Sensors 20, 11 (2020), 3090.
[14] B. Demirel et al. 2022. Neural Contextual Bandits Based Dynamic Sensor Selection for Low-Power Body-Area Networks. In Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED). Boston, MA, USA, 1–6. https://doi.org/10.1145/3531437.3539713
[15] Thomas Elsken, Jan Hendrik Metzen, and Frank Hutter. 2019. Neural architecture search: A survey. Journal of Machine Learning Research 20, 1 (2019), 1997–2017.
[16] Ching Fang et al. 2024. Promoting cross-modal representations to improve multimodal foundation models for physiological signals. In Advancements In Medical Foundation Models: Explainability, Robustness, Security, and Beyond. https://openreview.net/forum?id=HNQxrUOvX4
[17] Oliver Faust et al. 2018. Deep Learning for Healthcare Applications Based on Physiological Signals: A Review. Computer Methods and Programs in Biomedicine 161 (2018), 1–13.
[18] Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta-learning for fast adaptation of deep networks. In International Conference on Machine Learning. 1126–1135.
[19] Sebastian Frey, Marco Guermandi, Simone Benatti, Victor Kartsch, Andrea Cossettini, and Luca Benini. 2023. BioGAP: a 10-Core FP-capable Ultra-Low Power IoT Processor, with Medical-Grade AFE and BLE Connectivity for Wearable Biosignal Processing. In 2023 IEEE International Conference on Omni-layer Intelligent Systems (COINS), Vol. 1. 1–7. https://doi.or... (doi:10.1109/coins57856.2023.10189286)
[20] R. Garnett et al. 2010. Bayesian optimization for sensor set selection. In Proceedings of the 9th ACM/IEEE International Conference on Information Processing in Sensor Networks. 209–219. https://doi.org/10.1145/1791212.1791238
[21] Yi Gu et al. 2026. Learning Contrastive Multimodal Fusion with Improved Modality Dropout for Disease Detection and Prediction. In Medical Image Computing and Computer Assisted Intervention – MICCAI 2025. 280–290.
[22] Song Han, Huizi Mao, and William J Dally. 2016. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. International Conference on Learning Representations (2016).
[23] Yi. Han et al. 2021. Dynamic neural networks: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence (2021).
[24] Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015).
[25] Texas Instruments. 2019. ADS1292R Low-Power Analog Front-End for ECG and Bioelectrical Measurements. Datasheet.
[26] Maxim Integrated. 2018. MAX30101 Optical Pulse Oximeter and Heart-Rate Sensor. Datasheet.
[27] B. Jacob et al. 2018. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2704–2713.
[28] Eric Jang, Shixiang Gu, and Ben Poole. 2017. Categorical reparameterization with Gumbel-Softmax. In International Conference on Learning Representations.
[29] W. Jiang et al. 2025. Towards Robust Multimodal Physiological Foundation Models: Handling Arbitrary Missing Modalities. arXiv preprint arXiv:2504.19596 (2025).
[30] M. Kolba and L. Collins. 2006. Information-theoretic Sensor Management for Multimodal Sensing. In 2006 IEEE International Symposium on Geoscience and Remote Sensing, Vol. 1. 3935–3938. https://doi.org/10.1109/IGARSS.2006.1009
[31] Christoph F. Kurz et al. 2025. Benchmarking vision–language models for diagnostics in emergency and critical care settings. npj Digital Medicine 8 (2025), 423. https://doi.org/10.1038/s41746-025-01837-2
[32] Shih-Chii Liu and Tobi Delbruck. 2010. Neuromorphic sensory systems. Current Opinion in Neurobiology 20, 3 (2010), 288–295.
[33] William Lotter, Gabriel Kreiman, and David Cox. 2017. Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning. In International Conference on Learning Representations.
[34] Shuo Ma et al. 2024. SleepMG: Multimodal generalizable sleep staging with inter-modal balance of classification and domain discrimination. In Proceedings of the 32nd ACM International Conference on Multimedia. 4004–4013.
[35] Kaden McKeen, Sameer Masood, Augustin Toma, Barry Rubin, and Bo Wang. 2025. Ecg-fm: An open electrocardiogram foundation model. JAMIA open 8, 5 (2025), ooaf122.
[36] Sparsh Mittal. 2016. A survey of techniques for approximate computing. Comput. Surveys 48, 4 (2016), 1–33.
[37] NVIDIA Corporation. 2023. TensorRT Developer Guide. https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/
[38] Peter O'Connor and Max Welling. 2016. Sigma delta quantized networks. arXiv preprint arXiv:1611.02024 (2016).
[39] Yanghua Peng et al. 2021. DL2: A Deep Learning-Driven Scheduler for Deep Learning Clusters. 1947–1960. https://doi.org/10.1109/TPDS.2021.3052895
[40] C. Pereira et al. 2024. Machine Learning Applied to Edge Computing and Wearable Devices for Healthcare: Systematic Mapping of the Literature. Sensors 24, 19 (2024). https://doi.org/10.3390/s24196322
[41] A. Pillai et al. 2025. PaPaGei: Open Foundation Models for Optical Physiological Signals. In The Thirteenth International Conference on Learning Representations. https://openreview.net/forum?id=kYwTmlq6Vn
[42] Rajesh PN Rao and Dana H Ballard. 1999. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience 2, 1 (1999), 79–87.
[43] Philip Schmidt et al. 2018. Introducing wesad, a multimodal dataset for wearable stress and affect detection. In Proceedings of the 20th ACM international conference on multimodal interaction. 400–408.
[44] Richard Schreier and Gabor C. Temes. 2005. Understanding Delta–Sigma Data Converters. IEEE Press.
[45] Abu Sebastian, Manuel Le Gallo, Riduan Khaddam-Aljameh, and Evangelos Eleftheriou. 2020. Memory devices and applications for in-memory computing. Nature Nanotechnology 15, 7 (2020), 529–544.
[46] Deepak Sharma, Arup Roy, Sankar Prasad Bag, Pawan Kumar Singh, and Youakim Badr. 2023. A hybrid deep learning-based approach for human activity recognition using wearable sensors. In Innovations in Machine and Deep Learning: Case Studies and Applications. Springer, 231–259.
[47] Basit Riaz Sheikh and Rajit Manohar. 2011. Energy-Efficient Pipeline Templates for High-Performance Asynchronous Circuits. J. Emerg. Technol. Comput. Syst. 7, 4, Article 19 (Dec. 2011), 26 pages. https://doi.org/10.1145/2043643.2043649
[48] Ali Tazarv et al. 2023. Active Reinforcement Learning for Personalized Stress Monitoring in Everyday Settings. In 2023 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), Vol. 1. 44–55. https://doi.org/10.1145/3580252.3586979
[49] Surat Teerapittayanon, Bradley McDanel, and HT Kung. 2016. BranchyNet: Fast inference via early exiting from deep neural networks. In 2016 23rd International Conference on Pattern Recognition (ICPR). IEEE, 2464–2469.
[50] R. Thapa et al. 2025. A Multimodal Sleep Foundation Model Developed with 500K Hours of Sleep Recordings for Disease Predictions. medRxiv (2025). https://doi.org/10.1101/2025.02.04.25321675
[51] Robert Tibshirani. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B 58, 1 (1996), 267–288.
[52] Bichen Wu et al. 2019. FBNet: Hardware-aware efficient convnet design via differentiable neural architecture search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 10734–10742.
[53] Jinghua Xu and Michael Staniek. 2025. Multimodal Transformers for Clinical Time Series Forecasting and Early Sepsis Prediction. In Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health). Association for Computational Linguistics. https://doi.org/10.18653/v1/2025.cl4health-1.8
[54] Hui Zou and Trevor Hastie. 2005. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B 67, 2 (2005), 301–320.