pith. machine review for the scientific record.

arxiv: 2604.14547 · v1 · submitted 2026-04-16 · 💻 cs.LG

Recognition: unknown

Predicting Post-Traumatic Epilepsy from Clinical Records using Large Language Model Embeddings

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 12:14 UTC · model grok-4.3

classification 💻 cs.LG
keywords post-traumatic epilepsy · traumatic brain injury · large language models · clinical prediction · machine learning · feature embeddings · TRACK-TBI · epilepsy risk
0 comments

The pith

Routine clinical records encoded with large language model embeddings predict post-traumatic epilepsy when fused with tabular features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that pretrained large language models can serve as fixed feature extractors to turn acute clinical notes from traumatic brain injury patients into embeddings that improve prediction of post-traumatic epilepsy. These embeddings, when combined with standard tabular variables such as seizure history and injury details through a modality-aware fusion step and fed to gradient-boosted tree classifiers, reach an AUC-ROC of 0.892 and AUPRC of 0.798 under stratified cross-validation on a TRACK-TBI subset. The method outperforms tabular features alone because the embeddings capture contextual clinical information present in the text. A sympathetic reader would care since the inputs are data collected in the first days after injury, offering a route to early risk identification that does not require neuroimaging.

Core claim

The authors claim that embedding clinical records with pretrained LLMs captures contextual clinical information that improves prediction of post-traumatic epilepsy compared to tabular features alone, with the best results from a hybrid fusion strategy yielding an AUC-ROC of 0.892 and AUPRC of 0.798 on a curated TRACK-TBI subset using gradient-boosted tree classifiers.

What carries the argument

The modality-aware feature fusion strategy that integrates LLM-generated embeddings of clinical records with tabular clinical variables before classification by gradient-boosted trees.
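The review does not spell out what "modality-aware" fusion means mechanically. One plausible, minimal reading is that each modality is standardized separately before the vectors are concatenated and handed to the tree classifier; the sketch below illustrates that reading (function name and toy dimensions are hypothetical, not from the paper):

```python
import numpy as np

def fuse_modalities(tabular, embeddings):
    """Hypothetical modality-aware fusion: z-score each modality
    separately (so the wide embedding block cannot drown out the few
    tabular variables), then concatenate. One plausible reading of
    the paper's fusion step, not its actual code."""
    def zscore(x):
        mu = x.mean(axis=0)
        sd = x.std(axis=0) + 1e-8  # guard against constant columns
        return (x - mu) / sd
    return np.concatenate([zscore(tabular), zscore(embeddings)], axis=1)

# toy sizes: 4 patients, 3 tabular variables, 5-dim text embedding
tab = np.random.default_rng(0).normal(size=(4, 3))
emb = np.random.default_rng(1).normal(size=(4, 5))
fused = fuse_modalities(tab, emb)
print(fused.shape)  # (4, 8)
```

Per-modality standardization is one defensible design choice here; the paper may instead weight, project, or gate the modalities.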

If this is right

  • LLM embeddings improve performance over tabular features alone by capturing contextual clinical information.
  • Acute post-traumatic seizures, injury severity, neurosurgical intervention, and ICU stay are the strongest contributors to the predictions.
  • Routine acute clinical records contain information sufficient for early PTE risk prediction.
  • The hybrid approach functions as a complement to imaging-based methods.
  • Gradient-boosted trees handle the combined features effectively under stratified cross-validation.
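The last point leans on stratified cross-validation keeping the rare positive class balanced across folds. A minimal sketch of how stratified splitting preserves the positive rate on toy data (not the paper's code):

```python
import numpy as np

def stratified_folds(y, k, seed=0):
    """Minimal stratified k-fold split: shuffle indices within each
    class, then deal them round-robin, so every fold keeps roughly
    the same positive rate. A sketch, not the paper's pipeline."""
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(k)]
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        rng.shuffle(idx)
        for i, j in enumerate(idx):
            folds[i % k].append(int(j))
    return [sorted(f) for f in folds]

y = np.array([1] * 10 + [0] * 90)  # 10% positives, a rare outcome
folds = stratified_folds(y, k=5)
print([int(y[f].sum()) for f in folds])  # [2, 2, 2, 2, 2]
```

With only a handful of positive cases, unstratified splits can leave a fold with zero positives, making AUPRC undefined on that fold; stratification avoids this failure mode.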

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Hospitals lacking advanced imaging could still use existing notes for initial PTE screening if the method holds in new data.
  • The same embedding-plus-tabular pattern may extend to predicting other post-TBI outcomes when similar text records exist.
  • Results could change if documentation style or patient mix differs markedly from the TRACK-TBI cohort.
  • Fine-tuning the language models on larger medical text collections might raise accuracy further on bigger datasets.

Load-bearing premise

The TRACK-TBI subset is representative of broader traumatic brain injury populations, and the small number of positive PTE cases still permits stable performance estimates under stratified cross-validation without overfitting.

What would settle it

An external validation study on an independent cohort from a different hospital system in which the AUC-ROC drops below 0.75 would show that the predictive performance does not generalize.
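The metric this test would watch, AUC-ROC, can be computed directly from its rank interpretation: the probability that a randomly chosen positive case outscores a randomly chosen negative one. A small self-contained sketch:

```python
def auc_roc(y_true, scores):
    """AUC-ROC via the Mann-Whitney rank formulation: fraction of
    (positive, negative) pairs where the positive outscores the
    negative, with ties counting half."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y = [1, 1, 0, 0, 0]
s = [0.9, 0.4, 0.5, 0.2, 0.1]
print(auc_roc(y, s))  # 5/6 = 0.8333...
```

The 0.75 threshold proposed above would mean, under this interpretation, that a random PTE case outranks a random non-case only three times out of four.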

read the original abstract

Objective: Post-traumatic epilepsy (PTE) is a debilitating neurological disorder that develops after traumatic brain injury (TBI). Early prediction of PTE remains challenging due to heterogeneous clinical data, limited positive cases, and reliance on resource-intensive neuroimaging data. We investigate whether routinely collected acute clinical records alone can support early PTE prediction using language model-based approaches. Methods: Using a curated subset of the TRACK-TBI cohort, we developed an automated PTE prediction framework that implements pretrained large language models (LLMs) as fixed feature extractors to encode clinical records. Tabular features, LLM-generated embeddings, and hybrid feature representations were evaluated using gradient-boosted tree classifiers under stratified cross-validation. Results: LLM embeddings achieved performance improvements by capturing contextual clinical information compared to using tabular features alone. The best performance was achieved by a modality-aware feature fusion strategy combining tabular features and LLM embeddings, achieving an AUC-ROC of 0.892 and AUPRC of 0.798. Acute post-traumatic seizures, injury severity, neurosurgical intervention, and ICU stay are key contributors to the predictive performance. Significance: These findings demonstrate that routine acute clinical records contain information suitable for early PTE risk prediction using LLM embeddings in conjunction with gradient-boosted tree classifiers. This approach represents a promising complement to imaging-based prediction.
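The abstract's AUPRC figure is the area under the precision-recall curve, which for a ranked classifier is commonly computed as average precision. A minimal sketch of that computation on toy labels and scores (not the paper's evaluation code):

```python
def average_precision(y_true, scores):
    """AUPRC as average precision: walk predictions from highest
    score down, summing precision at each rank where a true positive
    appears, then divide by the number of positives."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = 0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            ap += tp / rank
    return ap / sum(y_true)

y = [1, 0, 1, 0, 0]
s = [0.9, 0.8, 0.7, 0.3, 0.2]
print(average_precision(y, s))  # (1/1 + 2/3) / 2 = 0.8333...
```

Unlike AUC-ROC, this metric degrades sharply with class imbalance, which is why an AUPRC of 0.798 on a rare outcome is the more informative of the two reported numbers.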

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that pretrained large language models can be used as fixed feature extractors on routine acute clinical records from a curated TRACK-TBI subset to predict post-traumatic epilepsy (PTE). Tabular features, LLM embeddings, and a modality-aware fusion of both are fed to gradient-boosted tree classifiers under stratified cross-validation; the fusion strategy yields the highest performance (AUC-ROC 0.892, AUPRC 0.798), with acute post-traumatic seizures, injury severity, neurosurgical intervention, and ICU stay identified as key contributors. The work positions this as a practical, imaging-free complement to existing PTE prediction methods.

Significance. If the performance numbers prove stable, the result would be meaningful for clinical translation: it shows that routinely collected textual and tabular data already encode sufficient signal for early PTE risk stratification without requiring neuroimaging. The fixed-embedding approach is computationally lightweight and avoids the need for domain-specific LLM fine-tuning, which is a practical strength for deployment in resource-limited settings.

major comments (3)
  1. [Methods and Results] Methods/Results: The manuscript reports AUC-ROC 0.892 and AUPRC 0.798 but does not state the total cohort size, the number of positive PTE cases, or the exact pretrained LLM (model name, version, and embedding dimension). Given the known low incidence of PTE, these omissions make it impossible to evaluate whether the stratified CV metrics are stable or whether the small positive set has produced optimistic, cohort-specific estimates.
  2. [Results] Results: No confidence intervals, standard deviations across folds, or statistical comparison (e.g., DeLong test) are provided for the AUC/AUPRC differences between tabular-only, embedding-only, and fusion models. Without these, the claim that the modality-aware fusion is superior cannot be assessed for robustness.
  3. [Methods] Methods: Class-imbalance handling (class weights, sampling, or loss modification) is not described despite the use of stratified cross-validation on an imbalanced outcome. This detail is load-bearing for interpreting the AUPRC of 0.798.
minor comments (2)
  1. [Abstract and Results] The abstract and results refer to 'modality-aware feature fusion' without a concise definition or pseudocode; a short diagram or equation showing how tabular and embedding vectors are combined would improve clarity.
  2. [Results] Feature-importance analysis is mentioned but the exact method (e.g., SHAP, gain, permutation) and whether it was computed on the fused or tabular-only model is not stated.
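On major comment 3: one standard imbalance-handling knob in gradient-boosted trees is a positive-class weight set to the negative-to-positive ratio (XGBoost exposes this as `scale_pos_weight`). Whether the paper used this exact mechanism is unknown; the sketch below only shows how the weight would be derived:

```python
def scale_pos_weight(y):
    """XGBoost-style positive-class weight: the ratio of negatives to
    positives, so each rare PTE case counts proportionally more in
    the loss. An assumption about the setup, not the paper's stated
    configuration."""
    pos = sum(y)
    neg = len(y) - pos
    return neg / pos

y = [1] * 10 + [0] * 90  # 10 PTE cases among 100 patients
print(scale_pos_weight(y))  # 9.0
```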

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We have addressed each major comment below and will revise the manuscript to incorporate the requested clarifications and additions.

read point-by-point responses
  1. Referee: [Methods and Results] Methods/Results: The manuscript reports AUC-ROC 0.892 and AUPRC 0.798 but does not state the total cohort size, the number of positive PTE cases, or the exact pretrained LLM (model name, version, and embedding dimension). Given the known low incidence of PTE, these omissions make it impossible to evaluate whether the stratified CV metrics are stable or whether the small positive set has produced optimistic, cohort-specific estimates.

    Authors: We agree these details are necessary to assess result stability given the low incidence of PTE. The revised manuscript will explicitly report the total size of the curated TRACK-TBI subset, the number of positive PTE cases, and the precise pretrained LLM (including model name, version, and embedding dimension) in the Methods section. revision: yes

  2. Referee: [Results] Results: No confidence intervals, standard deviations across folds, or statistical comparison (e.g., DeLong test) are provided for the AUC/AUPRC differences between tabular-only, embedding-only, and fusion models. Without these, the claim that the modality-aware fusion is superior cannot be assessed for robustness.

    Authors: We acknowledge that variability measures and formal statistical comparisons are needed to support the superiority claim. The revision will include per-fold standard deviations, confidence intervals for AUC-ROC and AUPRC, and statistical tests (such as DeLong) comparing the three modeling strategies. revision: yes

  3. Referee: [Methods] Methods: Class-imbalance handling (class weights, sampling, or loss modification) is not described despite the use of stratified cross-validation on an imbalanced outcome. This detail is load-bearing for interpreting the AUPRC of 0.798.

    Authors: We apologize for the omission. Stratified cross-validation was used to maintain class proportions across folds, and class weights were applied within the gradient-boosted tree classifiers. The revised Methods section will describe these imbalance-handling steps in full. revision: yes
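The interval estimates requested in comment 2 can also be obtained without DeLong's analytic machinery, via a percentile bootstrap over patients. This is a common substitute, sketched here on toy data; it is not the procedure the authors commit to:

```python
import random

def auc(y, s):
    """Rank-based AUC-ROC (ties count half)."""
    pos = [v for v, t in zip(s, y) if t == 1]
    neg = [v for v, t in zip(s, y) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(y, s, n_boot=500, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for AUC: resample
    patients with replacement and recompute the metric each time."""
    rng = random.Random(seed)
    n = len(y)
    aucs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        if 0 < sum(yb) < n:  # resample must contain both classes
            aucs.append(auc(yb, [s[i] for i in idx]))
    aucs.sort()
    return (aucs[int(alpha / 2 * len(aucs))],
            aucs[int((1 - alpha / 2) * len(aucs)) - 1])

# toy cohort: 100 patients, 10 positives, positives scored higher
rng = random.Random(1)
y = [1] * 10 + [0] * 90
s = [0.6 + 0.4 * rng.random() if t else rng.random() for t in y]
lo, hi = bootstrap_auc_ci(y, s)
print(round(lo, 3), round(hi, 3))
```

With few positive cases, such intervals are typically wide, which is precisely why the referee's request matters for the fusion-superiority claim.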

Circularity Check

0 steps flagged

No circularity: standard empirical ML pipeline with fixed pretrained extractors and cross-validation

full rationale

The paper describes a purely empirical pipeline: pretrained LLMs are used as fixed (non-fine-tuned) feature extractors on clinical text, combined with tabular features, fed to gradient-boosted tree classifiers, and evaluated under stratified cross-validation on a TRACK-TBI subset. No equations, uniqueness theorems, or predictions are derived; performance numbers (AUC-ROC 0.892, AUPRC 0.798) are obtained directly from held-out folds rather than being forced by construction from fitted parameters or self-citations. The central claim therefore rests on external data and standard ML procedures rather than reducing to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The claim rests on the domain assumption that clinical text contains extractable signals for PTE risk and that pretrained LLM embeddings preserve those signals without fine-tuning.

axioms (1)
  • domain assumption: Pretrained LLMs encode clinically relevant contextual information from acute TBI records that correlates with later epilepsy risk.
    LLMs are used as fixed feature extractors; no fine-tuning or domain adaptation is described.

pith-pipeline@v0.9.0 · 5542 in / 1380 out tokens · 52467 ms · 2026-05-10T12:14:36.768364+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

58 extracted references · 19 canonical work pages · 4 internal anchors

