arxiv: 2605.20052 · v2 · pith:3SJYQB7Wnew · submitted 2026-05-19 · 💻 cs.CL · cs.AI

PromptRad: Knowledge-Enhanced Multi-Label Prompt-Tuning for Low-Resource Radiology Report Labeling

Pith reviewed 2026-05-21 07:49 UTC · model grok-4.3

classification 💻 cs.CL cs.AI
keywords prompt tuning · radiology report labeling · low-resource learning · multi-label classification · UMLS · clinical NLP · masked language modeling · negation detection

The pith

PromptRad labels radiology reports accurately with prompt-tuning and medical synonyms using only 32 examples.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that reformulating multi-label radiology report labeling as masked language modeling, then enriching the verbalizer with UMLS synonyms, lets a small pre-trained language model perform well even when labeled data is extremely scarce. A sympathetic reader would care because clinical reports contain varied phrasing and negations, yet large-scale annotation for imaging research is blocked by the high cost of expert labels. The approach avoids adding new classification layers, instead tuning the model directly on prompt templates, which cuts the data requirement compared with standard fine-tuning. Experiments on liver CT reports confirm the method beats both dictionary rules and ordinary fine-tuning at this low data scale while staying competitive with much larger models.

Core claim

PromptRad reformulates multi-label classification as masked language modeling and incorporates synonyms from the UMLS Metathesaurus into a multi-word verbalizer to enrich category representations. By fine-tuning the PLM without additional classification layers, PromptRad requires substantially less labeled data than conventional fine-tuning. On liver CT reports it outperforms dictionary-based and fine-tuning baselines with only 32 labeled training examples and achieves competitive performance with GPT-4 despite using a much smaller model.
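
The reformulation in the core claim can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the template wording, the `build_prompt` and `predict_labels` helpers, and the 0.5 threshold are all assumptions introduced here, since the page does not reproduce the paper's actual prompt templates.

```python
# Hypothetical cloze template; the paper's real prompt wording is not given on this page.
TEMPLATE = "{report} Finding: {mask}."


def build_prompt(report: str, mask_token: str = "[MASK]", n_masks: int = 1) -> str:
    """Append a cloze slot with n_masks mask tokens to a report (illustrative only)."""
    masks = " ".join([mask_token] * n_masks)
    return TEMPLATE.format(report=report.strip(), mask=masks)


def predict_labels(label_scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Multi-label decision: every category is accepted or rejected independently.

    label_scores is assumed to hold one score per category, derived from the
    language-model head at the mask positions (no extra classification layer).
    """
    return sorted(label for label, score in label_scores.items() if score >= threshold)
```

Because each category is thresholded on its own, a report can receive zero, one, or several labels, which is what separates this setup from the single-answer multi-class prompting shown in Figure 2(a).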

What carries the argument

Multi-word verbalizer augmented with UMLS synonyms, which supplies enriched prompt tokens so the model can perform masked-language-modeling-style prediction over clinical categories.
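
One way a synonym-enriched, multi-word verbalizer could turn mask-position predictions into category scores is sketched below. Everything in it is an assumption made for illustration: the example `VERBALIZER` entries are hand-written rather than taken from UMLS, scoring a synonym by the mean log-probability of its tokens is one plausible choice, and taking the best synonym per category is another. The paper may aggregate differently.

```python
import math

# Hypothetical verbalizer: category -> synonyms, each synonym a tuple of tokens.
# In the paper these synonyms come from the UMLS Metathesaurus; these are made up.
VERBALIZER = {
    "liver cyst": [("cyst",), ("hepatic", "cyst")],
    "fatty liver": [("steatosis",), ("fatty", "liver")],
}


def synonym_score(mask_log_probs, tokens, floor=-20.0):
    """Mean log-probability of a synonym's tokens, using one mask slot per token.

    mask_log_probs[i] maps token -> log-probability at the i-th mask position.
    Tokens missing from a slot fall back to `floor` (an assumed default).
    """
    values = [mask_log_probs[i].get(tok, floor) for i, tok in enumerate(tokens)]
    return sum(values) / len(values)


def category_scores(mask_log_probs, verbalizer, floor=-20.0):
    """Score each category by its best-scoring synonym that fits the mask slots."""
    scores = {}
    for category, synonyms in verbalizer.items():
        usable = [s for s in synonyms if len(s) <= len(mask_log_probs)]
        scores[category] = max(
            (synonym_score(mask_log_probs, s, floor) for s in usable),
            default=floor,
        )
    return scores
```

Taking the maximum over synonyms means a single noisy synonym can raise a category's score, which is exactly the risk named under the load-bearing premise; averaging over synonyms would trade that risk for diluted signal on rare phrasings.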

If this is right

  • The method achieves strong multi-label accuracy on liver CT reports with only 32 labeled examples.
  • It matches or approaches GPT-4 performance while using a far smaller model.
  • It handles complex negation patterns better than dictionary or standard fine-tuning approaches.
  • No extra classification head is needed, so the same pre-trained language model serves both prompting and prediction.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same verbalizer enrichment could be tested on other report types such as chest X-ray or pathology notes to check cross-domain transfer.
  • Replacing the fixed UMLS synonym list with a learned or context-aware synonym selector might further reduce noise on rare findings.
  • Because the approach stays within the original language-model head, it could be combined with retrieval-augmented prompts to handle even scarcer data regimes.

Load-bearing premise

Adding UMLS synonyms to the multi-word verbalizer reliably enriches category representations without introducing noise or conflicting signals on negation and rare findings.

What would settle it

A controlled run on the same liver CT test set in which the UMLS synonyms are removed from the verbalizer and performance on negation-heavy or low-frequency findings drops sharply below the reported baseline.
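
The comparison described above could be scored along these lines. This is a sketch under stated assumptions: micro-averaged F1 is used as the metric (the page only says "F1-score"), predictions are represented as one set of labels per report, and `subset_idx` is a hypothetical list of indices marking the negation-heavy or low-frequency reports.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over parallel lists of label sets (0.0 if nothing to score)."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def ablation_gap(gold, pred_full, pred_ablated, subset_idx):
    """F1 of the full verbalizer minus F1 of the synonym-free one, on a subset."""
    g = [gold[i] for i in subset_idx]
    full = micro_f1(g, [pred_full[i] for i in subset_idx])
    ablated = micro_f1(g, [pred_ablated[i] for i in subset_idx])
    return full - ablated
```

A large positive gap on the negation and rare-finding subsets would support the premise; a gap near zero or below it would suggest the synonyms add little or introduce noise.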

Figures

Figures reproduced from arXiv: 2605.20052 by Chien-Hung Liao, Chi-Tung Cheng, Hung-Yu Kao, Ping-Chien Li, Tzu-Chin Lo, Ying-Jia Lin.

Figure 1: An example liver CT report from our dataset.
Figure 2: Differences in the approaches of (a) Prompt-tuning for multi-class classification on a general domain …
Figure 3: The workflow of the PromptRad report labeling system.
Figure 4: Accuracy on negation cases: reports that men…
Figure 5: Comparison in F1-score for using different numbers of reports for training. We ran experiments five times …

Original abstract

Automatic report labeling facilitates the identification of clinical findings from unstructured text and enables large-scale annotation for medical imaging research. Existing rule-based labelers struggle with the diverse descriptions in clinical reports, while fine-tuning pre-trained language models (PLMs) requires large amounts of labeled data that are often unavailable in clinical settings. In this paper, we propose PromptRad, a knowledge-enhanced multi-label prompt-tuning approach for radiology report labeling under low-resource settings. PromptRad reformulates multi-label classification as masked language modeling and incorporates synonyms from the UMLS Metathesaurus into a multi-word verbalizer to enrich category representations. By fine-tuning the PLM without additional classification layers, PromptRad requires substantially less labeled data than conventional fine-tuning. Experiments on liver CT (computed tomography) reports show that PromptRad outperforms dictionary-based and fine-tuning baselines with only 32 labeled training examples, and achieves competitive performance with GPT-4 despite using a much smaller model. Further analysis demonstrates that PromptRad captures complex negation patterns more effectively than existing methods, making it a promising solution for report labeling in data-scarce clinical scenarios. Our code is available at https://github.com/ila-lab/PromptRad.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes PromptRad, a knowledge-enhanced multi-label prompt-tuning method for radiology report labeling in low-resource settings. It reformulates multi-label classification as masked language modeling and incorporates UMLS Metathesaurus synonyms into multi-word verbalizers to enrich category representations. Experiments on liver CT reports claim that PromptRad outperforms dictionary-based and fine-tuning baselines using only 32 labeled training examples, achieves competitive performance with GPT-4 using a smaller model, and handles complex negation patterns more effectively.

Significance. If the low-resource empirical gains prove robust, the work would be significant for clinical NLP by demonstrating a data-efficient alternative to full fine-tuning or large generative models in annotation-scarce medical domains. The integration of domain knowledge via UMLS into prompt verbalizers and the avoidance of additional classification layers address a practical bottleneck, and the reported competitiveness with GPT-4 highlights potential efficiency advantages.

major comments (2)
  1. [Experiments] Experiments section (liver CT results with 32 examples): The reported outperformance over dictionary and fine-tuning baselines is presented without averaging F1 or AUC over multiple independent draws of the 32-example training set or standard deviations across random seeds. This directly affects the central low-resource claim, as the observed margins could depend on a single fortunate split rather than the method's properties.
  2. [Method] Method section (UMLS verbalizer construction): The claim that adding UMLS synonyms reliably enriches category representations without introducing noise is load-bearing for the knowledge-enhancement contribution, yet no ablation isolates the multi-word verbalizer's effect on negation or rare findings, leaving the weakest assumption untested.
minor comments (2)
  1. [Abstract] Abstract: The statement that PromptRad 'captures complex negation patterns more effectively' lacks any mention of the specific metrics, examples, or analysis method used to support this post-hoc claim.
  2. [Method] The paper provides a code link but does not specify the exact prompt templates or verbalizer word lists in the main text, which would aid reproducibility of the multi-label prompt setup.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below and will revise the paper to incorporate the suggested improvements where they strengthen the work.

point-by-point responses
  1. Referee: [Experiments] Experiments section (liver CT results with 32 examples): The reported outperformance over dictionary and fine-tuning baselines is presented without averaging F1 or AUC over multiple independent draws of the 32-example training set or standard deviations across random seeds. This directly affects the central low-resource claim, as the observed margins could depend on a single fortunate split rather than the method's properties.

    Authors: We agree that the current single-split results for the 32-example setting limit the strength of the low-resource claims. In the revised manuscript we will sample five independent 32-example training sets using different random seeds, rerun all methods, and report mean F1 and AUC together with standard deviations. These results will be added to the Experiments section and to Table 2.
    revision: yes

  2. Referee: [Method] Method section (UMLS verbalizer construction): The claim that adding UMLS synonyms reliably enriches category representations without introducing noise is load-bearing for the knowledge-enhancement contribution, yet no ablation isolates the multi-word verbalizer's effect on negation or rare findings, leaving the weakest assumption untested.

    Authors: We acknowledge that an explicit ablation isolating the multi-word UMLS verbalizer would more directly test its contribution to negation handling and rare findings. In the revision we will add an ablation comparing the full PromptRad verbalizer against a single-word baseline and against a version without UMLS synonyms, with separate analysis on negated and rare-finding subsets. This will be included in the Method and Analysis sections.
    revision: yes
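
The resampling protocol promised in the first response could look something like the sketch below. It is an illustration only: `sample_training_sets` and `summarize` are hypothetical helpers, the seed values are arbitrary, and plain random sampling is used, whereas a real multi-label experiment might need stratified sampling to keep rare findings represented in every 32-example draw.

```python
import random
import statistics


def sample_training_sets(pool_ids, k=32, seeds=(0, 1, 2, 3, 4)):
    """Draw one k-example training set per seed from the labeled pool.

    Each seed gets its own random.Random instance, so draws are reproducible
    and independent of any global random state.
    """
    pool = list(pool_ids)
    return {seed: random.Random(seed).sample(pool, k) for seed in seeds}


def summarize(scores):
    """Mean and sample standard deviation of per-seed scores (needs >= 2 scores)."""
    return statistics.mean(scores), statistics.stdev(scores)
```

Each method would be trained once per draw and evaluated on the same fixed test set, so that the reported mean and standard deviation reflect variation in the training sample rather than in the evaluation data.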

Circularity Check

0 steps flagged

No circularity in derivation or claims

The paper presents PromptRad as an empirical method that reformulates multi-label radiology report labeling as masked language modeling and augments a multi-word verbalizer with UMLS synonyms. All load-bearing results are experimental comparisons against dictionary and fine-tuning baselines on a fixed liver CT dataset using 32 examples. No equations or steps reduce by construction to fitted parameters renamed as predictions, no self-definitional loops appear in the method description, and no uniqueness theorems or ansatzes are imported via self-citation chains. The approach is self-contained against external benchmarks and does not rely on prior author work to justify its core construction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The method relies on the standard assumption that PLM masked language modeling can be repurposed for classification via verbalizers. No new axioms or invented entities are introduced; the UMLS synonym list is an external resource.

free parameters (1)
  • choice of prompt template and verbalizer words
    The specific wording of the prompt and which UMLS synonyms are selected are design choices that affect performance but are not derived from first principles.
axioms (1)
  • [domain assumption] The masked language modeling objective can be used directly for multi-label classification without additional classification layers
    Invoked when reformulating the task as MLM in the abstract.

pith-pipeline@v0.9.0 · 5768 in / 1230 out tokens · 28980 ms · 2026-05-21T07:49:44.564401+00:00 · methodology
