pith. machine review for the scientific record.

arxiv: 2605.08963 · v1 · submitted 2026-05-09 · 📊 stat.ML · cs.LG

Survey-aware Machine Learning: A Guideline for Valid Population Health Inference based on Scoping Review

Alex A. T. Bui, Henry W. Zheng, Jeffrey Feng, YongKyung Oh

Pith reviewed 2026-05-12 01:47 UTC · model grok-4.3

classification 📊 stat.ML cs.LG
keywords survey data · machine learning · population inference · sampling weights · health surveys · bias · fairness · guidelines

The pith

A nine-step guideline integrates survey design into machine learning to produce valid population health inferences from data like NHANES.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Machine learning models trained on complex health surveys routinely ignore primary sampling units, stratification, and sampling weights. This practice violates independence assumptions and produces biased estimates, understated uncertainty, and fairness assessments that do not reflect population disparities. The paper proposes Survey-aware Machine Learning, a nine-step guideline that folds survey design metadata into every stage of the modeling pipeline. A scoping review of sixteen prior methodological papers identifies existing techniques for weighted training and design-based evaluation while noting gaps in hyperparameter tuning and deployment. The guideline supplies task-specific instructions so that different analytical goals receive the appropriate sequence of adjustments.

Core claim

The paper claims that standard machine learning workflows applied to survey data such as NHANES violate the independence assumptions underlying most training and evaluation procedures, and that a nine-step Survey-aware Machine Learning guideline remedies this by embedding primary sampling units, stratification variables, and sampling weights throughout data preparation, model fitting, validation, performance assessment, and deployment.

What carries the argument

The nine-step Survey-aware Machine Learning (SaML) guideline that places survey design metadata at every point in the machine learning lifecycle.

If this is right

  • Population estimates of health outcomes become representative of the full target population rather than the sampled individuals alone.
  • Uncertainty intervals properly reflect the complex sampling structure and avoid overconfidence.
  • Fairness evaluations capture true population disparities instead of sample-specific artifacts.
  • Task-specific variants of the guideline tell users exactly which steps apply to prediction, descriptive inference, or other objectives.
  • Explicit attention is directed to previously under-addressed stages such as hyperparameter tuning and model deployment.
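
The first consequence above, that weighted estimates represent the target population rather than the sample, can be illustrated with a small sketch. The data are hypothetical toy values (not NHANES figures), and the function names are illustrative only. Each weight is assumed to be the number of population members a respondent stands for.

```python
# Hypothetical toy data: (value, sampling_weight) pairs; not real survey values.
# Three respondents from an oversampled group carry small weights,
# one respondent from an undersampled group carries a large weight.
sample = [
    (130.0, 100.0),
    (132.0, 100.0),
    (128.0, 100.0),
    (110.0, 900.0),
]

def unweighted_mean(rows):
    """Plain sample mean: every respondent counts equally."""
    return sum(v for v, _ in rows) / len(rows)

def weighted_mean(rows):
    """Weighted mean: each value counts in proportion to its sampling weight."""
    total_w = sum(w for _, w in rows)
    return sum(v * w for v, w in rows) / total_w

naive = unweighted_mean(sample)   # 125.0, dominated by the oversampled group
design = weighted_mean(sample)    # 115.0, reflects the assumed population mix
print(naive, design)
```

The gap between the two numbers comes entirely from the unequal weights. This sketch covers point estimates only; it says nothing about variance, which also depends on strata and primary sampling units.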

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same checklist structure could be adapted for other data types that violate independence, such as clustered or time-series observations.
  • Automated software wrappers enforcing the nine steps would reduce the practical barrier for analysts working with public survey files.
  • Requiring SaML compliance in public health ML pipelines could improve reproducibility of published findings.
  • Empirical tests on surveys other than NHANES would clarify how far the guideline generalizes.

Load-bearing premise

That following the nine prescribed steps will eliminate the bias, underestimated uncertainty, and invalid fairness results caused by ignoring survey design features.

What would settle it

A head-to-head comparison on the same NHANES dataset showing that population-level estimates, confidence intervals, and fairness metrics remain materially unchanged when the nine-step guideline is followed versus when standard machine learning is used.

Figures

Figures reproduced from arXiv: 2605.08963 by Alex A. T. Bui, Henry W. Zheng, Jeffrey Feng, YongKyung Oh.

Figure 1: Sample composition vs. population estimates by age group (NHANES 2021–2023). Older adults are oversampled to enable precise subgroup estimation.
Figure 2: Comparison of unweighted and weighted estimates for continuous variables (Age, BMI, Systolic BP, Diastolic BP, from left to right). Error bars indicate 95% confidence intervals.
Figure 3: Sample composition vs. population estimates by age group (left) and race/ethnicity (right).
Figure 4: ROC curves under standard evaluation (left) and survey-weighted evaluation (right).
Figure 5: Precision-Recall curves under standard evaluation (left) and survey-weighted evaluation (right).
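
Figures 4 and 5 contrast standard and survey-weighted evaluation. As a rough sketch of what a weighted evaluation metric can look like, the block below computes a weighted AUC in its pairwise (Mann-Whitney) form, where each positive-negative pair counts in proportion to the product of the two sampling weights. The scores, labels, and weights are hypothetical, and this is not necessarily the procedure used in the paper. It yields a point estimate only, with no design-based standard error.

```python
def weighted_auc(scores, labels, weights):
    """Weighted probability that a positive case outscores a negative case.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 1]
    neg = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for sp, wp in pos:
        for sn, wn in neg:
            if sp > sn:
                wins += wp * wn
            elif sp == sn:
                wins += 0.5 * wp * wn
    total = sum(w for _, w in pos) * sum(w for _, w in neg)
    return wins / total

# Hypothetical model scores, outcomes, and sampling weights.
scores = [0.9, 0.4, 0.6, 0.2]
labels = [1, 1, 0, 0]
weights = [1.0, 3.0, 3.0, 1.0]

auc_standard = weighted_auc(scores, labels, [1.0] * len(scores))  # 0.75
auc_weighted = weighted_auc(scores, labels, weights)              # 0.4375
print(auc_standard, auc_weighted)
```

In this toy case the one misranked pair involves the two heavily weighted respondents, so the weighted AUC falls well below the unweighted one.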
Original abstract

Machine Learning (ML) models trained on complex health surveys such as the National Health and Nutrition Examination Survey (NHANES) often ignore primary sampling units, stratification variables, and sampling weights. This practice violates the independence assumptions of standard evaluation methods. As a result, estimates become biased, uncertainty is underestimated, and fairness assessments fail to reflect population-level disparities. We propose Survey-aware Machine Learning (SaML), a nine-step guideline that incorporates survey design metadata across the ML lifecycle. Through a scoping review of 16 methodological papers, we summarize existing work on weighted model training, design-based cross-validation, and survey-adjusted performance evaluation. We also identify gaps in hyperparameter tuning and deployment. We provide task-specific guidance that clarifies which steps are required for different analytical objectives. SaML provides a checklist for valid population inference from survey data.
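
The abstract mentions design-based cross-validation. One simple way to respect the sampling design when forming folds is to keep every primary sampling unit (PSU) intact within a single fold, so that correlated respondents never straddle the train/test split. The sketch below is an assumption about how that could be done, not the method of any reviewed paper; `psu_folds` and the toy identifiers are hypothetical. It treats PSU identifiers as unique only within a stratum, which is common in public survey files.

```python
def psu_folds(strata, psus, k):
    """Return one fold index per row such that all rows of a PSU share a fold.

    Clusters are identified by (stratum, psu) and dealt round-robin to k folds
    in sorted order, which also spreads each stratum's PSUs across folds.
    """
    clusters = sorted(set(zip(strata, psus)))
    fold_of = {cluster: i % k for i, cluster in enumerate(clusters)}
    return [fold_of[(s, p)] for s, p in zip(strata, psus)]

# Hypothetical design columns: two strata, two PSUs each, two rows per PSU.
strata = [1, 1, 1, 1, 2, 2, 2, 2]
psus = [1, 1, 2, 2, 1, 1, 2, 2]

folds = psu_folds(strata, psus, k=2)
print(folds)  # [0, 0, 1, 1, 0, 0, 1, 1]
```

Rows from the same PSU always land in the same fold, and each stratum contributes one PSU to each fold. With more folds than PSUs per stratum, some folds will miss some strata, which a production implementation would need to handle.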

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript conducts a scoping review of 16 methodological papers on survey-adjusted machine learning techniques and proposes Survey-aware Machine Learning (SaML), a nine-step guideline for incorporating sampling weights, strata, and primary sampling units across the ML lifecycle when analyzing complex health surveys such as NHANES. It summarizes existing approaches to weighted training, design-based cross-validation, and adjusted performance evaluation, identifies gaps in hyperparameter tuning and deployment, and offers task-specific guidance for different analytical objectives to support valid population inference.

Significance. The structured synthesis of existing survey-aware techniques into a checklist format could help standardize practices and reduce common errors in bias, uncertainty, and fairness estimation for population health applications. The scoping review consolidates dispersed methodological work, and the task-specific recommendations add practical value. However, the absence of any empirical demonstration that the full guideline improves outcomes limits its immediate contribution beyond a literature summary.

major comments (1)
  1. [§3 (SaML Guideline)] The central claim that the nine-step SaML guideline resolves bias, underestimated uncertainty, and invalid fairness assessments when ML is applied to survey data is unsupported by evidence. The manuscript presents no simulation study, real-data case study, before/after comparison, or benchmark against standard ML or existing survey methods to show that following the steps produces the claimed improvements in inference validity.
minor comments (2)
  1. [Scoping Review section] The methods for identifying and selecting the 16 papers (search strategy, databases, inclusion/exclusion criteria) are not described in sufficient detail to allow replication or assessment of coverage.
  2. [Task-specific guidance] A summary table mapping each of the nine steps to the analytical objectives (e.g., prediction vs. inference vs. fairness) would improve clarity and usability.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive review and for identifying the need to clarify the scope and evidentiary basis of our proposed guideline. We address the major comment below.

Point-by-point responses
  1. Referee: The central claim that the nine-step SaML guideline resolves bias, underestimated uncertainty, and invalid fairness assessments when ML is applied to survey data is unsupported by evidence. The manuscript presents no simulation study, real-data case study, before/after comparison, or benchmark against standard ML or existing survey methods to show that following the steps produces the claimed improvements in inference validity.

    Authors: We agree that the manuscript does not contain new empirical validation of the complete SaML guideline. As a scoping review, the paper synthesizes findings from the 16 included methodological papers, each of which provides evidence for specific components (weighted training, design-based cross-validation, and adjusted performance metrics). The nine-step guideline consolidates these existing approaches into a unified checklist rather than introducing or empirically testing a novel method. We will revise the manuscript to explicitly state that SaML is a literature-derived best-practice framework, tone down any implication of comprehensive resolution, and add a limitations section noting the absence of a unified empirical demonstration of the full guideline. This revision will also highlight the need for future studies to benchmark SaML against standard ML pipelines.
    revision: partial

Circularity Check

0 steps flagged

No circularity: SaML guideline is a synthesis from external scoping review

Full rationale

The paper's central contribution is a nine-step guideline synthesized via scoping review of 16 external methodological papers on survey-adjusted ML. No equations, fitted parameters, or predictions are defined in terms of the target result. The derivation chain consists of summarizing existing techniques (weighted training, design-based CV, adjusted evaluation) and identifying gaps; this does not reduce to self-definition, self-citation load-bearing, or renaming of known results by construction. The manuscript is self-contained against external benchmarks and contains no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the assumption that survey design features must be integrated at every ML stage and that the scoping review of 16 papers provides a sufficient basis for the guideline.

axioms (1)
  • domain assumption: Complex survey designs with primary sampling units, stratification, and weights violate the independence assumptions of standard ML evaluation methods.
    Core premise stated directly in the abstract as the source of biased estimates and underestimated uncertainty.
invented entities (1)
  • Survey-aware Machine Learning (SaML) (no independent evidence)
    purpose: Nine-step guideline for incorporating survey design metadata across the ML lifecycle
    Newly proposed framework synthesized from prior work with no independent empirical validation reported.
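
The domain assumption in the ledger above ties ignored design features to underestimated uncertainty. A quick way to see one part of that effect is Kish's approximation for the effective sample size under unequal weights, n_eff = (Σw)² / Σw². The sketch below uses hypothetical weights and captures only the weighting component; clustering within PSUs typically inflates variance further, and stratification can reduce it.

```python
def kish_effective_n(weights):
    """Kish's approximate effective sample size for unequal weights."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq

# Hypothetical sampling weights for four respondents.
weights = [100.0, 100.0, 100.0, 900.0]

n_eff = kish_effective_n(weights)   # about 1.71 effective respondents
deff_w = len(weights) / n_eff       # about 2.33: variance inflation from weighting
print(n_eff, deff_w)
```

An analysis that treats these four respondents as four independent, equally informative observations would report intervals that are too narrow by roughly the square root of `deff_w`, under the assumptions of this approximation.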

pith-pipeline@v0.9.0 · 5451 in / 1395 out tokens · 70623 ms · 2026-05-12T01:47:09.141681+00:00 · methodology
