
arxiv: 2606.05481 · v1 · pith:WZKRFBM7new · submitted 2026-06-03 · 💻 cs.LG · cs.AI · eess.SP

Towards Unified and Data-Efficient Prognostics and Health Management with Tabular Foundation Models

Pith reviewed 2026-06-28 06:40 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · eess.SP
keywords tabular foundation models · prognostics and health management · in-context learning · data efficiency · condition monitoring · remaining useful life · PHM tasks

The pith

Converting unit-level signals to tabular rows lets foundation models handle multiple PHM tasks with top average ranks and strong low-data performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that tabular foundation models can address fragmented and sparsely labeled industrial PHM data by turning time-varying condition-monitoring signals into tabular rows suitable for in-context learning. This representation supports both diagnostic classification and prognostic remaining-useful-life estimation within a single framework. The models are compared against sequence models, transformer baselines, and gradient-boosted trees on shared benchmarks, where they record the best average ranks across tasks. PFN-based variants remain competitive when training data are scarce. The results indicate that the tabular format can retain enough temporal context for effective learning while providing a unified interface for heterogeneous PHM problems.

Core claim

By converting raw unit-level signals into tabular rows, tabular foundation models perform well across multiple PHM tasks—including prognostics and diagnostics—and achieve the best average ranks; PFN-based models are competitive in low-data regimes.
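As a minimal sketch of how an average rank across tasks might be computed, each model can be ranked per task by its error and the ranks then averaged. The page does not state the paper's exact protocol, so the tie handling (shared average rank) and the equal weighting of tasks are assumptions here; the function names and the toy numbers are hypothetical and are not results from the paper.

```python
def rank_scores(scores):
    """Rank scores ascending (1 = lowest error); ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def average_ranks(errors_by_task):
    """errors_by_task maps task name -> {model name: error}; lower is better."""
    per_model = {}
    for task_errors in errors_by_task.values():
        models = list(task_errors)
        ranks = rank_scores([task_errors[m] for m in models])
        for m, r in zip(models, ranks):
            per_model.setdefault(m, []).append(r)
    return {m: sum(r) / len(r) for m, r in per_model.items()}


# Made-up errors for illustration only; not results from the paper.
toy = {
    "task_a": {"model_x": 0.10, "model_y": 0.20, "model_z": 0.30},
    "task_b": {"model_x": 0.25, "model_y": 0.15, "model_z": 0.25},
}
print(average_ranks(toy))
```

In this toy case model_y has the lowest average rank even though model_x wins one task outright, which is why an average rank is usually read alongside the per-task scores.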

What carries the argument

The conversion of time-varying condition-monitoring signals into tabular rows that supports in-context learning with tabular foundation models.
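The conversion step can be sketched roughly as follows, assuming per-window summary statistics as described in the Figure 2 caption. The specific statistics (mean, standard deviation, min, max), the window-end time index, the choice of the last step's target as the row label, and the name `tabularize` are all illustrative assumptions, not the authors' implementation.

```python
from statistics import mean, pstdev


def tabularize(signal, targets, window, stride=1):
    """Turn one unit's multichannel signal into rows of per-window statistics.

    signal  : list of time steps, each a list of channel values
    targets : one label per time step (e.g. RUL or a fault class)
    Each row summarizes `window` consecutive steps and is paired with the
    target at the window's last step.
    """
    rows, labels = [], []
    n_channels = len(signal[0])
    for start in range(0, len(signal) - window + 1, stride):
        chunk = signal[start:start + window]
        row = []
        for c in range(n_channels):
            values = [step[c] for step in chunk]
            row += [mean(values), pstdev(values), min(values), max(values)]
        # Assumed time-aware feature: index of the window's last step in the run.
        row.append(start + window - 1)
        rows.append(row)
        labels.append(targets[start + window - 1])
    return rows, labels


# Toy run: 6 time steps, 2 channels, RUL counting down to 0.
signal = [[1.0, 10.0], [2.0, 10.0], [3.0, 12.0],
          [4.0, 12.0], [5.0, 14.0], [6.0, 14.0]]
rul = [5, 4, 3, 2, 1, 0]
rows, labels = tabularize(signal, rul, window=3, stride=1)
print(len(rows), len(rows[0]), labels)
```

Rows built this way from labeled units would form the in-context examples, and rows from the unit under test would be the queries. Mean, standard deviation, min, and max are order-invariant within a window, so any temporal information has to come from features such as the time index or from how windows overlap.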

If this is right

  • A single tabular foundation model can serve as a reusable interface for mixed diagnostic and prognostic problems without task-specific retraining.
  • PFN-based tabular models reduce the labeled data volume required for acceptable accuracy on industrial assets.
  • Performance hinges on constructing representative context examples during subsampling of the tabular rows.
  • Temporal ordering information can be retained sufficiently in the row format to support both classification and regression PHM objectives.
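The dependence on context construction can be illustrated with a small sketch that contrasts two ways of drawing a context from time-ordered rows. The figure captions mention uniform and contiguous blockwise subsampling; the exact procedures are not given on this page, so the single-block variant and the function names below are assumptions.

```python
import random


def uniform_subsample(n_rows, k, seed=0):
    """Pick k row indices uniformly at random, without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_rows), k))


def blockwise_subsample(n_rows, k, seed=0):
    """Pick one contiguous block of k row indices at a random offset."""
    rng = random.Random(seed)
    start = rng.randrange(n_rows - k + 1)
    return list(range(start, start + k))


# Toy table: 100 time-ordered rows with a rare class in the last 10 rows.
labels = [0] * 90 + [1] * 10
uni = uniform_subsample(len(labels), 20)
blk = blockwise_subsample(len(labels), 20)
print("uniform classes:  ", sorted({labels[i] for i in uni}))
print("blockwise classes:", sorted({labels[i] for i in blk}))
```

A contiguous block can miss a class or a degradation phase entirely when it falls in the wrong part of the run, whereas a uniform draw spreads the context over the whole trajectory. That is one plausible mechanism behind the sensitivity noted above.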

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same tabular conversion step could be tested on other fragmented time-series domains such as predictive maintenance in energy or transportation networks.
  • An ablation that varies the number of context rows per query would quantify exactly how much historical context the models need to match sequence-model accuracy.
  • If tabular foundation models continue to improve, they might allow maintenance planners to deploy one pretrained system across fleets with different sensor suites and failure modes.

Load-bearing premise

Turning time-varying signals into static tabular rows still keeps the temporal information needed for accurate diagnosis and remaining-life prediction.

What would settle it

A controlled test in which the same signals are presented with temporal order deliberately shuffled in the tabular rows, and performance on prognostic tasks drops sharply while diagnostic tasks remain stable, would show the representation fails to preserve required timing.
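One possible reading of this test, sketched under assumptions: shuffle the time steps inside each window before features are computed, then compare features and downstream error. The features below (mean and least-squares slope) and the function names are hypothetical choices made only to show which kinds of features can react to shuffling at all.

```python
import random
from statistics import mean


def slope(values):
    """Least-squares slope of values against their position 0..n-1."""
    n = len(values)
    t_bar = (n - 1) / 2
    v_bar = mean(values)
    num = sum((t - t_bar) * (v - v_bar) for t, v in enumerate(values))
    den = sum((t - t_bar) ** 2 for t in range(n))
    return num / den


def window_features(values):
    """One order-invariant feature (mean) and one order-sensitive one (slope)."""
    return {"mean": mean(values), "slope": slope(values)}


# Toy degradation window: a steadily rising sensor reading.
window = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
shuffled = window[:]
random.Random(0).shuffle(shuffled)  # fixed seed so the run is repeatable

print(window_features(window))
print(window_features(shuffled))
```

The mean is unchanged by shuffling while the slope generally is not, so a shuffle ablation is only informative when the tabular rows contain order-sensitive features. If every feature is order-invariant within a window, temporal information can only enter through the row-level time index or the window position.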

Figures

Figures reproduced from arXiv: 2606.05481 by Leandro von Krannichfeldt, Lev Telyatnikov, Olga Fink, Raffael Theiler.

Figure 1
Figure 1: Overview of the unified evaluation pipeline. Unit-level PHM signals are transformed into aligned feature–target windows and evaluated either by trained sequence models or by in-context tabular foundation models after tabularization. Validation data are used for model selection or tabular-shape selection before final test evaluation.
Figure 2
Figure 2: Our tabularization scheme. Time-series features are aggregated into their statistics on a per-window basis, along with the corresponding labels. These are then composed into a row in our table, providing richer context.
Figure 3
Figure 3 reports the effect of missing data on normalized MAE for PHME20. TabPFN remains the strongest model under this missing-data setting, followed by TabDPT, indicating that the tabular foundation models retain their advantage when incomplete inputs are present. Unexpectedly, however, the explicit TabPFN NaN-token variant underperforms the imputed TabPFN configuration, indicating that TabPFN’s proposed missing-…
Figure 4
Figure 4: Predictive quantile distribution heatmaps for PHME20 and Unibo. RUL distribution with and without tabularization of time.
Figure 5
Figure 5: Scaling behavior for PHME20, Unibo, and MZVAV. The left column shows uniformly subsampled scaling laws; the right column shows blockwise subsampled scaling laws. We show the top 5 models for each dataset, which include TabPFN and TabDPT in all cases. A complete version with all models is included in the Appendix.
Figure 6
Figure 6: Class balance in the MZVAV subset-ratio experiment, comparing uniform and blockwise subsampling (seed: 72).
Figure 7
Figure 7: Effect of sequence length on normalized MAE for selected prognostics datasets.
Figure 8
Figure 8: Data-efficiency scaling under aggregate and blockwise context subsampling for PHME20, Unibo, and MZVAV. Each row compares aggregate random subsampling with contiguous blockwise subsampling, highlighting when small contexts preserve or lose coverage of trajectories or diagnostic classes.
Original abstract

Data-driven Prognostics and Health Management (PHM) uses time-varying condition-monitoring data to diagnose system states and estimate remaining useful life in engineered assets. These tasks are central to maintenance planning, but industrial PHM data are often fragmented, partially observed, and poorly labeled, which hinders supervised learning. Foundation models offer a route toward reusable predictive systems, yet most time-series foundation models are designed for forecasting and assume long, coherent, regularly sampled sequences. To address this gap, we propose a framework for applying Tabular Foundation Models to industrial time series using in-context learning, and we evaluate them on a variety of PHM tasks. By converting raw unit-level signals into tabular rows, we show that these models perform well across multiple tasks - including prognostics, and diagnostics - and are highly data efficient. We compare them directly with sequence models, transformer baselines, and gradient-boosted trees under a common evaluation protocol. The results indicate that tabular foundation models achieve the best average ranks across prognostic and diagnostic tasks. Our findings further show that PFN-based models are competitive in low-data regimes, that temporal context can be preserved in the tabular representation, and that performance depends on representative context construction under subsampling. These results demonstrate that tabular foundation models provide a practical and general interface for heterogeneous PHM problems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper proposes converting unit-level time-varying condition-monitoring signals from PHM applications into tabular rows to enable in-context learning with Tabular Foundation Models (TFMs), particularly PFN-based ones. It claims these models achieve the best average ranks across prognostic and diagnostic tasks compared to sequence models, transformers, and gradient-boosted trees, while being highly data-efficient in low-data regimes. The work asserts that temporal context can be preserved in the tabular representation and that performance depends on representative context construction under subsampling, offering a unified interface for heterogeneous, fragmented PHM data.

Significance. If the empirical claims hold under rigorous verification, the result would be significant for PHM by demonstrating a practical route to reusable, data-efficient predictive systems that bypass the need for large labeled datasets or task-specific retraining. It would also extend the applicability of tabular foundation models beyond static tabular data to time-series condition monitoring, with potential impact on industrial maintenance where data fragmentation is common. The direct comparison under a common protocol and emphasis on low-data performance are strengths.

major comments (2)
  1. [Abstract] Abstract: The central performance claim (best average ranks across tasks and data efficiency) cannot be evaluated because the text supplies no dataset descriptions, metric definitions (e.g., how RUL error or diagnostic accuracy is computed), statistical tests for rank differences, or exclusion criteria for tasks/models. This information is load-bearing for any assertion of superiority.
  2. [Abstract] Abstract (framework and results paragraph): The assertion that 'temporal context can be preserved in the tabular representation' and that results 'depend on representative context construction under subsampling' is not supported by any description of the encoding (row ordering, explicit time deltas, cumulative statistics, or windowing) or by an ablation that isolates the contribution of temporal order versus marginal feature distributions. Without this, it is unclear whether the approach retains the sequential degradation dynamics required for prognostics.
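One common choice for the rank-difference test raised in comment 1 is the Friedman test over per-task ranks. The sketch below computes only the test statistic and assumes no tied errors within a task; the function name and the toy error matrix are hypothetical and do not come from the paper.

```python
def friedman_statistic(errors):
    """Friedman chi-square statistic.

    errors[i][j] is the error of model j on task i (lower is better).
    Assumes no ties within a task, which keeps the ranking step simple.
    """
    n, k = len(errors), len(errors[0])
    rank_sums = [0.0] * k
    for row in errors:
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


# Made-up errors: 4 tasks (rows) x 3 models (columns).
toy_errors = [
    [0.10, 0.20, 0.30],
    [0.12, 0.25, 0.22],
    [0.08, 0.15, 0.40],
    [0.11, 0.19, 0.21],
]
print(friedman_statistic(toy_errors))
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom (critical value about 5.99 for three models at the 0.05 level). With only a handful of tasks the chi-square approximation is rough, so an exact table or a post-hoc pairwise test would be the more careful route.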

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract. We agree that it should be more self-contained to support the central claims and will revise it accordingly while ensuring the full manuscript details remain clear. Below we respond to each major comment.

Point-by-point responses
  1. Referee: [Abstract] Abstract: The central performance claim (best average ranks across tasks and data efficiency) cannot be evaluated because the text supplies no dataset descriptions, metric definitions (e.g., how RUL error or diagnostic accuracy is computed), statistical tests for rank differences, or exclusion criteria for tasks/models. This information is load-bearing for any assertion of superiority.

    Authors: We agree the abstract should reference key evaluation elements for self-containment. The full manuscript (Sections 3 and 4) details the datasets (e.g., C-MAPSS variants and other PHM benchmarks), metrics (RMSE for RUL estimation, accuracy/F1 for diagnostics), the common protocol across models, and task inclusion criteria. Statistical significance tests on rank differences are not currently reported. We will revise the abstract to briefly note the evaluation setup, datasets, and metrics while keeping length constraints in mind; we will also explore adding rank significance tests if they can be computed without new experiments. revision: partial

  2. Referee: [Abstract] Abstract (framework and results paragraph): The assertion that 'temporal context can be preserved in the tabular representation' and that results 'depend on representative context construction under subsampling' is not supported by any description of the encoding (row ordering, explicit time deltas, cumulative statistics, or windowing) or by an ablation that isolates the contribution of temporal order versus marginal feature distributions. Without this, it is unclear whether the approach retains the sequential degradation dynamics required for prognostics.

    Authors: The abstract summarizes findings whose supporting details appear in the method (Section 2) and results (Section 5). Section 2 describes the signal-to-table conversion, including feature extraction, windowing, and how rows are ordered to retain temporal structure via cumulative statistics and time-aware features. Section 5 reports performance under varying subsampling regimes, showing sensitivity to context construction. An explicit ablation separating temporal ordering from marginal distributions is not present. We will revise the abstract to reference the encoding approach in Section 2 and clarify the subsampling results; we can expand the method description if needed but note that space in the abstract is limited. revision: partial

Circularity Check

0 steps flagged

No circularity: empirical benchmarking with no derivations or self-referential predictions

full rationale

The paper is an empirical evaluation of tabular foundation models on PHM tasks via conversion of time-series signals to tabular rows and in-context learning. It reports average ranks, data-efficiency comparisons, and observations about temporal context preservation, all grounded in experimental results under a shared protocol against baselines. No equations, fitted parameters renamed as predictions, self-citation load-bearing uniqueness theorems, or ansatzes appear in the derivation chain. The central claims rest on observable performance metrics rather than any reduction to inputs by construction. This is the standard case of a self-contained empirical study.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract introduces no free parameters, axioms, or invented entities; the approach rests on the pre-existing capabilities of tabular foundation models (e.g., PFNs) and the unstated premise that tabular conversion is information-preserving.



Reference graph

Works this paper leans on

66 extracted references · 32 canonical work pages · 3 internal anchors

  1. [1]

    Chronos-2: From univariate to universal forecasting

    Ansari,A.F.,Shchur,O.,Küken,J.,Auer,A.,Han,B.,Mercado,P.,Rangapuram,S.S.,Shen,H.,Stella,L.,Zhang,X.,Goswami,M.,Kapoor, S., Maddix, D.C., Guerron, P., Hu, T., Yin, J., Erickson, N., Desai, P.M., Wang, H., Rangwala, H., Karypis, G., Wang, Y., Bohlke-Schneider, M., 2025. Chronos-2: From univariate to universal forecasting. URL:https://arxiv.org/abs/2510.1582...

  2. [2]

    Chronos: Learning the language of time series

    Ansari,A.F.,Stella,L.,Turkmen,A.C.,Zhang,X.,Mercado,P.,Shen,H.,Shchur,O.,Rangapuram,S.S.,Arango,S.P.,Kapoor,S.,Zschiegner, J., Maddix, D.C., Wang, H., Mahoney, M.W., Torkkola, K., Wilson, A.G., Bohlke-Schneider, M., Wang, B., 2024. Chronos: Learning the language of time series. URL:https://openreview.net/forum?id=gerNCVqqtR. expert Certification

  3. [3]

    Arbel,M.,Salinas,D.,Hutter,F.,2026. Equitabpfn:Atarget-permutationequivariantpriorfittednetwork,in:AdvancesinNeuralInformation ProcessingSystems,CurranAssociates,Inc..pp.62586–62609.URL:https://proceedings.neurips.cc/paper_files/paper/2025/ file/5a66c7adffdbde9dd5e78820cbf6935c-Paper-Conference.pdf

  4. [4]

    Aircraft engine run-to-failure dataset under real flight conditions for prognostics and diagnostics

    Arias Chao, M., Kulkarni, C., Goebel, K., Fink, O., 2021. Aircraft engine run-to-failure dataset under real flight conditions for prognostics and diagnostics. Data 6, 5. doi:10.3390/data6010005

  5. [5]

    Adaptationofanelectrochemistry-basedli-ionbatterymodeltoaccountfordeteriorationobserved under randomized use

    Bole,B.,Kulkarni,C.S.,Daigle,M.,2014. Adaptationofanelectrochemistry-basedli-ionbatterymodeltoaccountfordeteriorationobserved under randomized use. Annual Conference of the PHM Society 6. doi:10.36001/phmconf.2014.v6i1.2490

  6. [6]

    To charge or to sell? ev pack useful life estimation via lstms, cnns, and autoencoders

    Bosello, M., Falcomer, C., Rossi, C., Pau, G., 2023. To charge or to sell? ev pack useful life estimation via lstms, cnns, and autoencoders. Energies 16, 2837. doi:10.3390/en16062837

  7. [7]

    Explore the time series forecasting potential of tabpfn leveraging the intrinsic periodicity of data, in: ICML 2025 Workshop on Foundation Models for Structured Data (FMSD)

    Cai, S., Sun, X., Zhong, H., 2025. Explore the time series forecasting potential of tabpfn leveraging the intrinsic periodicity of data, in: ICML 2025 Workshop on Foundation Models for Structured Data (FMSD). URL:https://openreview.net/forum?id=7JGD1kNlzU. workshop paper

  8. [8]

    Case Western Reserve University Bearing Data Center Website.https: //engineering.case.edu/bearingdatacenter/download-data-file

    Case Western Reserve University Bearing Data Center, . Case Western Reserve University Bearing Data Center Website.https: //engineering.case.edu/bearingdatacenter/download-data-file. Accessed: 2026-05-29

  9. [9]

    Large language model-based autonomous agent for prognostics and health management

    Cha, M., Yoon, S.i., Kim, S., Kang, D., Nam, K., Lee, T., Kim, J.Y., 2025. Large language model-based autonomous agent for prognostics and health management. Machines 13, 831. URL:https://www.mdpi.com/2075-1702/13/9/831, doi:10.3390/machines13090831

  10. [10]

    XGBoost: A scalable tree boosting system,

    Chen,T.,Guestrin,C.,2016. Xgboost:Ascalabletreeboostingsystem,in:Proceedingsofthe22ndACMSIGKDDInternationalConference on Knowledge Discovery and Data Mining, pp. 785–794. doi:10.1145/2939672.2939785

  11. [11]

    Long-termforecastingwithtide:Time-seriesdenseencoder

    Das,A.,Kong,W.,Leach,A.,Mathur,S.K.,Sen,R.,Yu,R.,2023. Long-termforecastingwithtide:Time-seriesdenseencoder. Transactions on Machine Learning Research

  12. [12]

    A decoder-only foundation model for time-series forecasting, in: Salakhutdinov, R., Kolter, Z., Heller, K., Weller, A., Oliver, N., Scarlett, J., Berkenkamp, F

    Das, A., Kong, W., Sen, R., Zhou, Y., 2024. A decoder-only foundation model for time-series forecasting, in: Salakhutdinov, R., Kolter, Z., Heller, K., Weller, A., Oliver, N., Scarlett, J., Berkenkamp, F. (Eds.), Proceedings of the 41st International Conference on Machine Learning, PMLR. pp. 10148–10167. URL:https://proceedings.mlr.press/v235/das24c.html

  13. [13]

    A survey on in-context learning, in: Proceedings of the 2024 conference on empirical methods in natural language processing, pp

    Dong, Q., Li, L., Dai, D., Zheng, C., Ma, J., Li, R., Xia, H., Xu, J., Wu, Z., Chang, B., et al., 2024. A survey on in-context learning, in: Proceedings of the 2024 conference on empirical methods in natural language processing, pp. 1107–1128

  14. [14]

    Forecastpfn: Synthetically-trained zero-shot forecasting

    Dooley, S., Khurana, G.S., Mohapatra, C., Naidu, S.V., White, C., 2024. Forecastpfn: Synthetically-trained zero-shot forecasting. Advances in Neural Information Processing Systems 36

  15. [15]

    The Faiss library

    Douze, M., Guzhva, A., Deng, C., Johnson, J., Szilvasy, G., Mazaré, P.E., Lomeli, M., Hosseini, L., Jégou, H., 2024. The faiss library. URL: https://arxiv.org/abs/2401.08281, doi:10.48550/arXiv.2401.08281,arXiv:2401.08281

  16. [16]

    Unifault:Afaultdiagnosisfoundationmodelfrombearingdata

    Eldele,E.,Ragab,M.,Qing,X.,Chen,Z.,Wu,M.,Li,X.,Lee,J.,etal.,2025. Unifault:Afaultdiagnosisfoundationmodelfrombearingdata. arXiv preprint arXiv:2504.01373

  17. [17]

    Only the curve shape matters: Training foundation models for zero-shot multivariate time series forecasting through next curve shape prediction

    Feng, C., Huang, L., Krompass, D., . Only the curve shape matters: Training foundation models for zero-shot multivariate time series forecasting through next curve shape prediction. URL:http://arxiv.org/abs/2402.07570,arXiv:2402.07570 [cs]

  18. [18]

    PHMForge: Evaluating LLM Agents on Industrial Prognostics through MCP-Native, Algorithm-Grounded Tools

    Feng, T., Chen, Y., Tsai, C.Y., Sun, Y., Das, A., El Maghraoui, K., Lin, S., Patel, D., 2026. PHMForge: Evaluating llm agents on industrial prognostics through mcp-native, algorithm-grounded tools. URL:https://arxiv.org/abs/2604.01532, doi:10.48550/arXiv.2604. 01532,arXiv:2604.01532

  19. [19]

    From physics to machine learning and back: Part II - Learning and observational bias in prognostics and health management (PHM)

    Fink, O., Nejjar, I., Sharma, V., Faghih Niresi, K., Sun, H., Dong, H., Xu, C., Wei, A., Bizzi, A., Theiler, R., Tian, Y., Von Krannichfeldt, L., Ma, Z., Garmaev, S., Zhang, Z., Zhao, M., Steiner, K., Kesmen, Y., 2026. From physics to machine learning and back: Part II - Learning and observational bias in prognostics and health management (PHM). Reliabili...

  20. [20]

    User’s guide for the commercial modular aero-propulsion system simulation (C-MAPSS)

    Frederick, D.K., DeCastro, J.A., Litt, J.S., 2007. User’s guide for the commercial modular aero-propulsion system simulation (C-MAPSS). Technical Report NASA/TM—2007-215026. NASA Glenn Research Center

  21. [21]

    Buildingfaultdetectiondatatoaiddiagnosticalgorithmcreationandperformance testing

    Granderson,J.,Lin,G.,Harding,A.,Im,P.,Chen,Y.,2020a. Buildingfaultdetectiondatatoaiddiagnosticalgorithmcreationandperformance testing. ScientificData7,65. URL:https://www.nature.com/articles/s41597-020-0398-6,doi:10.1038/s41597-020-0398-6

  22. [22]

    Granderson, J., Lin, G., Harding, A., Im, P., Chen, Y., 2020b. Dataset for building fault detection and diagnostics algorithm creation andperformancetestingURL:https://figshare.com/articles/dataset/LBNLDataSynthesisInventory_pdf/11752740,doi:10. 6084/m9.figshare.11752740.v3

  23. [23]

    Long-range transformers for dynamic spatiotemporal forecasting.arXiv:2109.12218

    Grigsby, J., Wang, Z., Qi, Y., 2021. Long-range transformers for dynamic spatiotemporal forecasting.arXiv:2109.12218

  24. [24]

    Creation of publicly available data sets for prognostics and diagnostics addressing data scenarios relevant to industrial applications

    Hagmeyer, S., Mauthe, F., Zeiler, P., 2021. Creation of publicly available data sets for prognostics and diagnostics addressing data scenarios relevant to industrial applications. International Journal of Prognostics and Health Management 12

  25. [25]

    Helwig, N., Pignanelli, E., Schütze, A., 2015. Condition monitoring of a complex hydraulic system using multivariate statistics, in: 2015 IEEE International Instrumentation and Measurement Technology Conference (I2MTC) Proceedings, pp. 210–215. doi:10.1109/I2MTC. 2015.7151267

  26. [26]

    Tabpfn: A transformer that solves small tabular classification problems in a second, in: The Eleventh International Conference on Learning Representations

    Hollmann, N., Müller, S., Eggensperger, K., Hutter, F., 2023. Tabpfn: A transformer that solves small tabular classification problems in a second, in: The Eleventh International Conference on Learning Representations. R. Theiler, L. Telyatnikov et al. Page 41 of 43 Towards Unified and Data-Efficient PHM

  27. [27]

    Accurate predictions on small data with a tabular foundation model

    Hollmann, N., Müller, S., Purucker, L., Krishnakumar, A., Körfer, M., Hoo, S.B., Schirrmeister, R.T., Hutter, F., 2025. Accurate predictions on small data with a tabular foundation model. Nature 637, 319–326

  28. [28]

    Hoo, S.B., Müller, S., Salinas, D., Hutter, F., 2024. The tabular foundation model tabpfn outperforms specialized time series forecasting models based on simple features, in: NeurIPS 2024 Workshop on Table Representation Learning (TRL). URL:https://neurips.cc/ virtual/2024/103164. neurIPS 2024 workshop paper

  29. [29]

    Remaining useful life prediction for experimental filtration system: A data challenge

    İnce, K., Sirkeci, E., Genç, Y., 2020. Remaining useful life prediction for experimental filtration system: A data challenge. PHM Society European Conference 5. doi:10.36001/phme.2020.v5i1.1317

  30. [30]

    Time-llm:Timeseriesforecasting by reprogramming large language models, in: Kim, B., Yue, Y., Chaudhuri, S., Fragkiadaki, K., Khan, M., Sun, Y

    Jin,M.,Wang,S.,Ma,L.,Chu,Z.,Zhang,J.,Shi,X.,Chen,P.Y.,Liang,Y.,Li,Y.F.,Pan,S.,Wen,Q.,2024. Time-llm:Timeseriesforecasting by reprogramming large language models, in: Kim, B., Yue, Y., Chaudhuri, S., Fragkiadaki, K., Khan, M., Sun, Y. (Eds.), International ConferenceonLearningRepresentations,pp.23857–23880. URL:https://proceedings.iclr.cc/paper_files/paper...

  31. [31]

    Carte: Pretraining and transfer for tabular learning, in: Forty-first International Conference on Machine Learning

    Kim, M.J., Grinsztajn, L., Varoquaux, G., 2024. Carte: Pretraining and transfer for tabular learning, in: Forty-first International Conference on Machine Learning

  32. [32]

    LeCun, Y

    LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 521, 436–444. doi:10.1038/nature14539

  33. [33]

    Bioinformatics , volume =

    Lee, J., Yoon, W., Kim, S., Kim, D., Kim, S., So, C.H., Kang, J., 2019. Biobert: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36, 1234–1240. URL:http://dx.doi.org/10.1093/bioinformatics/btz682, doi:10.1093/bioinformatics/btz682

  34. [34]

    Machinery health prognostics: A systematic review from data acquisition to remaining useful life prediction

    Lei, Y., Li, N., Guo, L., Li, N., Yan, T., Lin, J., 2018. Machinery health prognostics: A systematic review from data acquisition to remaining useful life prediction. Mechanical Systems and Signal Processing 104, 799–834. doi:10.1016/j.ymssp.2017.11.016

  35. [35]

    Smalldatachallengesforintelligentprognosticsandhealthmanagement:areview

    Li,C.,Li,S.,Feng,Y.,Gryllias,K.,Gu,F.,Pecht,M.,2024. Smalldatachallengesforintelligentprognosticsandhealthmanagement:areview. Artificial Intelligence Review 57, 214

  36. [36]

    ICL4RUL:In-contextlearning-basedaircraftengineremainingusefullifeprediction

    Liu,D.,Qu,G.,Xu,Y.,Qiu,T.,Ding,S.,Guo,K.,2025. ICL4RUL:In-contextlearning-basedaircraftengineremainingusefullifeprediction. IEEE Internet of Things Journal 12, 29766–29783. URL:https://ieeexplore.ieee.org/document/10998956, doi:10.1109/JIOT. 2025.3569131

  37. [37]

    Timer: Generative pre-trained transformers are large time series models

    Liu, Y., Zhang, H., Li, C., Huang, X., Wang, J., Long, M., . Timer: Generative pre-trained transformers are large time series models. URL: http://arxiv.org/abs/2402.02368,arXiv:2402.02368 [cs, stat]

  38. [38]

    TabDPT: Scaling tabular foundation models on real data URL:https://openreview.net/forum?id=pIZxEOZCId

    Ma, J., Thomas, V., Hosseinzadeh, R., Labach, A., Kamkari, H., Cresswell, J.C., Golestan, K., Yu, G., Caterini, A.L., Volkovs, M., 2025. TabDPT: Scaling tabular foundation models on real data URL:https://openreview.net/forum?id=pIZxEOZCId

  39. [39]

    Early fault classification in rotating machinery with limited data using tabpfn

    Magadán, L., Roldán-Gómez, J., Granda, J., Suárez, F., 2023. Early fault classification in rotating machinery with limited data using tabpfn. IEEE Sensors Journal 23, 30960–30970. doi:10.1109/JSEN.2023.3331100

  40. [40]

    Overview and analysis of publicly available degradation data sets for tasks within prognostics and health management, in: 35th european safety and reliability conference.(accepted)

    Mauthe, F., Steinmann, L., Neu, M., Zeiler, P., 2025. Overview and analysis of publicly available degradation data sets for tasks within prognostics and health management, in: 35th european safety and reliability conference.(accepted). Research Publishing

  41. [41]

    Asurveyofdeeplearningandfoundationmodelsfortime series forecasting

    Miller,J.A.,Aldosari,M.,Saeed,F.,Barna,N.H.,Rana,S.,Arpinar,I.B.,Liu,N.,. Asurveyofdeeplearningandfoundationmodelsfortime series forecasting. URL:http://arxiv.org/abs/2401.13912, doi:10.48550/arXiv.2401.13912,arXiv:2401.13912 [cs]

  42. [42]

    Nguyen, N., Sinthong, P., Kalagnanam, J., 2023

    Nie, Y., H. Nguyen, N., Sinthong, P., Kalagnanam, J., 2023. A time series is worth 64 words: Long-term forecasting with transformers, in: International Conference on Learning Representations

  43. [43]

    Can generalist foundation models outcompete special-purpose tuning? case study in medicine

    Nori, H., Lee, Y.T., Zhang, S., Carignan, D., Edgar, R., Fusi, N., King, N., Larson, J., Li, Y., Liu, W., Luo, R., McKinney, S.M., Ness, R.O., Poon, H., Qin, T., Usuyama, N., White, C., Horvitz, E., 2023. Can generalist foundation models outcompete special-purpose tuning? case study in medicine. arXiv preprint arXiv:2311.16452 URL:https://arxiv.org/abs/2311.16452

  44. [44]

    Acomparativestudyofdeeplearningmodelbasedequipmentfaultdiagnosisandprognosis

    Qiao,X.,Liow,H.Y.,Jauw,V.L.,Lim,C.S.,2025. Acomparativestudyofdeeplearningmodelbasedequipmentfaultdiagnosisandprognosis. International Journal of Prognostics and Health Management 16. doi:10.36001/IJPHM.2025.v16i1.4254

  45. [45]

    Tabicl: A tabular foundation model for in-context learning on large data, in: Proceedings of the 38th International Conference on Machine Learning (ICML 2025)

    Qu, J., Holzmüller, D., Varoquaux, G., Morvan, M.L., 2025. Tabicl: A tabular foundation model for in-context learning on large data, in: Proceedings of the 38th International Conference on Machine Learning (ICML 2025). URL:https://arxiv.org/abs/2502.05564. accepted at ICML 2025

  46. [46]

    Performance Benchmarking and Analysis of Prognostic Methods for CMAPSS Datasets

    Ramasso, E., Saxena, A., 2014. Performance Benchmarking and Analysis of Prognostic Methods for CMAPSS Datasets. International JournalofPrognosticsandHealthManagement5. URL:https://papers.phmsociety.org/index.php/ijphm/article/view/2236, doi:10.36001/ijphm.2014.v5i2.2236

  47. [47]

    A comprehensive review and evaluation framework for data-driven prognostics: Uncertainty, robustness, interpretability, and feasibility

    Salinas-Camus, M., Goebel, K., Eleftheroglou, N., 2025. A comprehensive review and evaluation framework for data-driven prognostics: Uncertainty, robustness, interpretability, and feasibility. Mechanical Systems and Signal Processing 237, 113015. doi:10.1016/j.ymssp. 2025.113015

  48. [48]

    Damage propagation modeling for aircraft engine run-to-failure simulation, in: 2008 International Conference on Prognostics and Health Management, pp

    Saxena, A., Goebel, K., Simon, D., Eklund, N., 2008. Damage propagation modeling for aircraft engine run-to-failure simulation, in: 2008 International Conference on Prognostics and Health Management, pp. 1–9. doi:10.1109/PHM.2008.4711414

  49. [49]

    The performance of lstm and bilstm in forecasting time series

    Siami-Namini, S., Tavakoli, N., Siami Namin, A., 2019. The performance of lstm and bilstm in forecasting time series, in: 2019 IEEE International Conference on Big Data (Big Data), pp. 3285–3292. doi:10.1109/BigData47090.2019.9005997

  50. [50]

    A tutorial for feature engineering in the prognostics and health management of gears and bearings

    Sim, J., Kim, S., Park, H.J., Choi, J.H., 2020. A tutorial for feature engineering in the prognostics and health management of gears and bearings. Applied Sciences 10, 5639. doi:10.3390/app10165639

  51. [51]

    Fault diagnosis of slewing bearing using audible sound signal based on timegan–tabpfn method

    Sun, L., Wu, J., Wang, J., Wen, S., Li, G., Liu, Y., 2025. Fault diagnosis of slewing bearing using audible sound signal based on timegan–tabpfn method. Journal of Vibration and Acoustics 147, 041002. doi:10.1115/1.4068223

  52. [52]

    Picid: A Modular Evaluation Infrastructure for Reproducible PHM Across Tasks and Domains

    Telyatnikov, L., Theiler, R., Von Krannichfeldt, L., Fink, O., 2026. Picid: A modular evaluation infrastructure for reproducible phm across tasks and domains. URL:https://arxiv.org/abs/2605.28345, doi:10.48550/arXiv.2605.28345, arXiv:2605.28345

  53. [53]

    From paper to benchmark: agentic, framework-based reproduction of under-specified methods in machine health intelligence

    Theiler, R., Comito, L., Leko, D., Von Krannichfeldt, L., Telyatnikov, L., Fink, O., 2026. From paper to benchmark: agentic, framework-based reproduction of under-specified methods in machine health intelligence. URL:https://arxiv.org/abs/2605.28371, doi:10.48550/arXiv.2605.28371, arXiv:2605.28371

  54. [54]

    Powerpm: Foundation model for power systems

    Tu, S., Zhang, Y., Zhang, J., Fu, Z., Zhang, Y., Yang, Y., 2024. Powerpm: Foundation model for power systems, in: Globerson, A., Mackey, L., Belgrave, D., Fan, A., Paquet, U., Tomczak, J., Zhang, C. (Eds.), Advances in Neural Information Processing Systems, Curran Associates, Inc., pp. 115233–115260. URL:https://proceedings.neurips.cc/paper_files/paper/...

  55. [55]

    Deep learning for smart manufacturing: Methods and applications

    Wang, J., Ma, Y., Zhang, L., Gao, R.X., Wu, D., 2018. Deep learning for smart manufacturing: Methods and applications. Journal of Manufacturing Systems 48, 144–156. doi:10.1016/j.jmsy.2018.01.003

  56. [56]

    Deep time series models: A comprehensive survey and benchmark

    Wang, Y., Wu, H., Dong, J., Liu, Y., Wang, C., Long, M., Wang, J., 2026. Deep time series models: A comprehensive survey and benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence URL:https://doi.org/10.1109/TPAMI.2026.3690845, doi:10.1109/TPAMI.2026.3690845

  57. [58]

    Wong, K.L., Bosello, M., Tse, R., Falcomer, C., Rossi, C., Pau, G., 2021b. Li-ion batteries state-of-charge estimation using deep lstm at various battery specifications and discharge cycles, in: Proceedings of the Conference on Information Technology for Social Good, Association for Computing Machinery, New York, NY, USA. p. 85–90. URL:https://doi.org/10.1145/3462203.3475878...

  58. [59]

    Unified training of universal time series forecasting transformers

    Woo, G., Liu, C., Kumar, A., Xiong, C., Savarese, S., Sahoo, D., 2024. Unified training of universal time series forecasting transformers, in: Salakhutdinov, R., Kolter, Z., Heller, K., Weller, A., Oliver, N., Scarlett, J., Berkenkamp, F. (Eds.), Proceedings of the 41st International Conference on Machine Learning, PMLR. pp. 53140–53164. URL:https://proce...

  59. [60]

    Hydra - a framework for elegantly configuring complex applications

    Yadan, O., 2019. Hydra - a framework for elegantly configuring complex applications. Github. URL:https://github.com/facebookresearch/hydra

  60. [61]

    Xjtu-sy rolling element bearing accelerated life test datasets: A tutorial

    Yaguo, L., Tianyu, H., Biao, W., Naipeng, L., Tao, Y., Jun, Y., 2019. Xjtu-sy rolling element bearing accelerated life test datasets: A tutorial. Journal of Mechanical Engineering 55, 1–6

  61. [62]

    Utilizing large-scale foundation models for prognostics and health management in wind turbines: Techniques, challenges, and future directions

    Yao, J., Han, T., 2026. Utilizing large-scale foundation models for prognostics and health management in wind turbines: Techniques, challenges, and future directions. Renewable and Sustainable Energy Reviews 227, 116527. doi:10.1016/j.rser.2025.116527

  62. [63]

    PDMBench: A Standardized Platform for Predictive Maintenance Research

    Zhang, S., Wang, T., Kulkarni, A., Adams, S., Bhattacharya, S., Tiyyagura, S.R., Bowen, E., Veeramani, B., Zhou, D., 2025. PDMBench: A Standardized Platform for Predictive Maintenance Research. URL:https://openreview.net/forum?id=oJhj8wOCNB. OpenReview preprint

  63. [64]

    Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting

    Zhang, Y., Yan, J., 2023. Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting, in: The eleventh international conference on learning representations

  64. [65]

    Deep learning and its applications to machine health monitoring

    Zhao, R., Yan, R., Chen, Z., Mao, K., Wang, P., Gao, R.X., 2019. Deep learning and its applications to machine health monitoring. Mechanical Systems and Signal Processing 115, 213–237. doi:10.1016/j.ymssp.2018.05.050

  65. [66]

    A comprehensive survey on pretrained foundation models: A history from bert to chatgpt

    Zhou, C., Li, Q., Li, C., Yu, J., Liu, Y., Wang, G., Zhang, K., Ji, C., Yan, Q., He, L., et al., 2025. A comprehensive survey on pretrained foundation models: A history from bert to chatgpt. International Journal of Machine Learning and Cybernetics 16, 9851–9915

  66. [67]

    Predictive maintenance in the industry 4.0: A systematic literature review

    Zonta, T., da Costa, C.A., da Rosa Righi, R., de Lima, M.J., da Trindade, E.S., Li, G., 2020. Predictive maintenance in the industry 4.0: A systematic literature review. Computers & Industrial Engineering 150, 106889. doi:10.1016/j.cie.2020.106889