pith. machine review for the scientific record.

arxiv: 2605.00468 · v1 · submitted 2026-05-01 · 💻 cs.CL

Recognition: unknown

ReLay: Personalized LLM-Generated Plain-Language Summaries for Better Understanding, but at What Cost?

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 19:57 UTC · model grok-4.3

classification 💻 cs.CL
keywords plain language summaries · LLM personalization · health communication · comprehension outcomes · hallucinations · user biases · dataset evaluation

The pith

Personalizing plain-language summaries with LLMs improves comprehension but raises risks of bias reinforcement and hallucinations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the ReLay dataset to test whether large language models can generate personalized plain language summaries of research for individual lay readers, especially on health topics. It compares static expert-written summaries with interactive LLM-personalized versions using data from 50 participants, including their characteristics, needs, and outcomes. Personalization leads to better self-reported comprehension and higher quality ratings across five LLMs and two methods. At the same time, it increases the likelihood that summaries will reinforce existing user biases or include inaccurate information. The work exposes a clear trade-off that must be managed if AI is to make scientific information more accessible without undermining trust.

Core claim

Personalization of plain language summaries using LLMs improves comprehension and perceived quality for lay readers in health contexts, but raises the risk of reinforcing user biases and introducing hallucinations, as demonstrated through evaluations on the new ReLay dataset of 300 participant-PLS pairs.

What carries the argument

The ReLay dataset of 300 participant-PLS pairs from 50 lay participants, which records user characteristics, health information needs, comprehension outcomes, interaction logs, and quality ratings to compare static and LLM-personalized summaries.
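
To make the dataset's shape concrete, here is a minimal Python sketch of one participant-PLS record. The field names and types are illustrative guesses based on the description above, not the actual ReLay schema.

```python
from dataclasses import dataclass

@dataclass
class RelayRecord:
    """One hypothetical participant-PLS pair; names are illustrative, not the released schema."""
    participant_id: str
    abstract_id: str
    setting: str                # "static" (expert-written) or "interactive" (LLM-personalized)
    user_characteristics: dict  # e.g. education, health-literacy screener responses
    information_needs: list     # additional information the reader said they wanted
    term_familiarity: dict      # expert-selected terms -> self-rated familiarity
    interaction_log: list       # chatbot turns (interactive setting only)
    comprehension_score: float  # out of 12, as in Figure 6
    quality_ratings: dict       # e.g. explanation, importance, tailored, simplicity

record = RelayRecord(
    participant_id="P07", abstract_id="A2", setting="interactive",
    user_characteristics={"education": "bachelor"}, information_needs=["side effects"],
    term_familiarity={"biomarker": 2},
    interaction_log=[{"role": "user", "text": "What is a biomarker?"}],
    comprehension_score=9.0, quality_ratings={"simplicity": 4},
)
```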

If this is right

  • Personalization can be applied to health information but requires safeguards to limit hallucinations.
  • Different LLMs and methods vary in how well they balance gains in understanding with safety risks.
  • Interaction logs provide a practical way to measure how users engage with and understand summaries in real time (a sketch of such log-derived metrics follows this list).
  • Health communication tools need both effectiveness and trustworthiness to support better real-world decisions.
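
A minimal sketch of the engagement measures referenced above, assuming a simple chatbot-transcript format (not the dataset's actual log schema) and correlating them with comprehension gain via Spearman ρ, as in Figure 5. The numbers fed in are placeholders, not ReLay data.

```python
from scipy.stats import spearmanr

def engagement_metrics(log):
    """log: chatbot transcript as [{"role": "user"|"assistant", "text": ...}, ...] (assumed format)."""
    user_msgs = [turn["text"] for turn in log if turn["role"] == "user"]
    return {
        "turns": len(log),
        "user_messages": len(user_msgs),
        "mean_user_msg_len": sum(len(m.split()) for m in user_msgs) / max(len(user_msgs), 1),
    }

# Placeholder per-participant values; the paper reports rho ≈ .14 (p = .37) for turns.
turns = [4, 7, 3, 9, 5, 6, 2, 8]
comprehension_gain = [1.0, 0.5, 2.0, 0.0, 1.5, 1.0, 2.5, 0.5]
rho, p = spearmanr(turns, comprehension_gain)
print(f"Spearman rho = {rho:.2f}, p = {p:.2f}")
```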

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Future systems could add fact-checking layers to LLM personalization to reduce introduced errors (see the sketch after this list).
  • The same effectiveness-safety tension may appear in non-health domains like finance or education where accuracy matters.
  • Objective tests of understanding, rather than self-reports, would give stronger evidence on whether the benefits hold up.
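
A minimal sketch of what such a fact-checking layer could look like, not a method from the paper: each sentence of the generated PLS is treated as a claim and checked against the source abstract before display. The `is_supported` verifier here is a crude lexical-overlap stand-in for a real entailment or claim-verification model.

```python
import re

def split_claims(pls_text):
    """Naive sentence split; each sentence is treated as one checkable claim."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", pls_text) if s.strip()]

def is_supported(claim, source_abstract, threshold=0.5):
    """Stand-in verifier using crude lexical overlap; a real system would use an
    entailment or claim-verification model grounded in the source paper."""
    claim_terms = set(claim.lower().split())
    source_terms = set(source_abstract.lower().split())
    return len(claim_terms & source_terms) / max(len(claim_terms), 1) >= threshold

def gated_summary(pls_text, source_abstract):
    """Keep only claims the verifier can ground in the source; route the rest to review."""
    kept, flagged = [], []
    for claim in split_claims(pls_text):
        (kept if is_supported(claim, source_abstract) else flagged).append(claim)
    if flagged:
        print(f"{len(flagged)} claim(s) held for human review: {flagged}")
    return " ".join(kept)
```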

Load-bearing premise

The assumption that the sample of 50 participants and their self-reported comprehension measures, quality ratings, and interaction logs accurately reflect real-world understanding, bias reinforcement, and hallucination rates for broader audiences and health topics.

What would settle it

A larger study with objective knowledge tests instead of self-reports, showing no gains in comprehension and no rise in hallucinations or bias reinforcement, would challenge the reported trade-off.

Figures

Figures reproduced from arXiv: 2605.00468 by Alexandra Lee, Jingyuan Chen, Joey Chan, Lauren D. Gryboski, Lucy Lu Wang, Qingqing Zhu, Samuel Fang, Sheel Tanna, Yikun Han, Yue Guo, Zhiyong Lu.

Figure 1: RELAY construction illustration. Of the 397 recruited participants, 50 met eligibility criteria and completed both delivery settings, each involving three scientific abstracts. For the first three abstracts, participants reported their familiarity with terms selected by three medical expert annotators, indicated any additional information needs, read an expert-written PLS, and answered comprehension and ev… view at source ↗
Figure 2: Participant characteristics in the evaluation cohort. view at source ↗
Figure 3: Tukey mean-difference plot of within-participant comprehension scores. The blue line marks the observed mean difference (d̄ = 0.80), the shaded band its 95% CI. One dot is one participant (jittered for overlaps). Accompanying table (truncated at source):
  Dimension     Static  Interactive  ∆      p       pFDR
  Explanation   3.82    4.27         +0.45  < .001  .001
  Importance    3.97    4.35         +0.38  < .001  .001
  Tailored      3.73    4.09         +0.35  < .001  .001
  Simplicity    4.21    4.50         +0.29  < .001  .001
  Un…
  view at source ↗
Figure 4: Distributions of participant-reported topic familiarity, interest, and focus prefer… view at source ↗
Figure 5: Spearman correlations (ρ) between 18 background variables and 6 outcomes. Of 108 tests, 9 reached nominal significance (5.4 expected by chance). None survived Benjamini-Hochberg FDR correction (smallest q = .354). Neither conversation turns (ρ = .14, p = .37), the number of user messages (ρ = .14, p = .37), nor user message length (ρ = .10, p = .49) predicted the comprehension gain. We also tested whether educatio… view at source ↗
Figure 6: Paired slope plot of comprehension scores (out of 12) for all 50 participants. view at source ↗
Figure 7: Interface where participants were asked to rate term familiarity. view at source ↗
Figure 8: Interface where participants were asked to rate additional information they would… view at source ↗
Figure 9: Interface where participants were asked to select all answers which applied to the… view at source ↗
Figure 10: Interface where participants were asked to compare the expert-written summary… view at source ↗
Figure 11: Interface where participants interacted with a chatbot to ask questions about the… view at source ↗
Figure 12: Interface where participants were asked to select all that applied to the contents… view at source ↗
Figure 13: Interface where participants were asked to compare the AI-generated summary… view at source ↗
read the original abstract

Plain Language Summaries (PLS) aim to make research accessible to lay readers, but they are typically written in a one-size-fits-all style that ignores differences in readers' information needs and comprehension. In health contexts, this limitation is particularly important because misunderstanding scientific information can affect real-world decisions. Large language models (LLMs) offer new opportunities for personalizing PLS, but it remains unclear whether personalization helps, which strategies are most effective, and how to balance personalization with safety. We introduce ReLay, a dataset of 300 participant--PLS pairs from 50 lay participants in both static (expert-written) and interactive (LLM-personalized) settings. ReLay includes user characteristics, health information needs, information-seeking behavior, comprehension outcomes, interaction logs, and quality ratings. We use ReLay to evaluate five LLMs across two personalization methods. Personalization improves comprehension and perceived quality, but it also raises the risk of reinforcing user biases and introducing hallucinations, revealing a trade-off between personalization and safety. These findings highlight the need for personalization methods that are both effective and trustworthy for diverse lay audiences.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces the ReLay dataset of 300 participant-PLS pairs collected from 50 lay participants, comparing static expert-written plain-language summaries against interactive LLM-personalized versions in health contexts. It evaluates five LLMs across two personalization methods and reports that personalization improves comprehension and perceived quality while increasing risks of bias reinforcement and hallucinations, thereby identifying a trade-off between effectiveness and safety.

Significance. If the empirical findings on the trade-off are robust, the work is significant for providing a new public dataset to study personalized health communication and for highlighting practical risks in deploying LLMs for lay audiences. The dataset's inclusion of user characteristics, interaction logs, and quality ratings offers a concrete resource for follow-on research on accessible scientific summaries.

major comments (2)
  1. Abstract: The central claim that personalization 'raises the risk of reinforcing user biases and introducing hallucinations' is load-bearing for the paper's contribution, yet the abstract supplies no information on measurement protocols (e.g., pre/post attitude scales for bias or blinded source verification for hallucinations). The described data sources are limited to self-reported comprehension, quality ratings, and interaction logs, which do not directly quantify these safety risks.
  2. Methods/Results (inferred from abstract description): The directional findings rest on a sample of 50 participants without reported sample-size justification, statistical tests, effect sizes, or controls for demand effects and confounds. This leaves the observed differences open to alternative explanations and weakens support for generalization to health decisions.
minor comments (1)
  1. Abstract: The two personalization methods and the five LLMs evaluated are not named, reducing immediate clarity for readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript to improve clarity on measurement and analysis while maintaining the integrity of our findings.

read point-by-point responses
  1. Referee: Abstract: The central claim that personalization 'raises the risk of reinforcing user biases and introducing hallucinations' is load-bearing for the paper's contribution, yet the abstract supplies no information on measurement protocols (e.g., pre/post attitude scales for bias or blinded source verification for hallucinations). The described data sources are limited to self-reported comprehension, quality ratings, and interaction logs, which do not directly quantify these safety risks.

    Authors: We agree the abstract should briefly reference the protocols used. Hallucinations were identified via blinded expert comparison of generated PLS content against the original source paper for unsupported claims, while bias reinforcement was assessed by coding interaction logs for user acceptance of content aligning with pre-stated attitudes or misconceptions (detailed in Sections 4.3 and 4.4). We will revise the abstract to note these approaches and clarify that logs provide behavioral indicators supplementing self-reports. This makes the safety claims more transparent without altering the reported trade-off. revision: yes

  2. Referee: Methods/Results (inferred from abstract description): The directional findings rest on a sample of 50 participants without reported sample-size justification, statistical tests, effect sizes, or controls for demand effects and confounds. This leaves the observed differences open to alternative explanations and weakens support for generalization to health decisions.

    Authors: The 50-participant sample produced 300 PLS pairs across diverse health topics and was chosen to balance depth of interaction data with feasibility; we have added a post-hoc power analysis and justification in the revised Methods. We now report the statistical tests (paired comparisons and mixed models) and effect sizes in Results. We acknowledge the absence of explicit demand-effect controls (e.g., no deception check) as a limitation and have expanded the Discussion to discuss this and alternative explanations. Claims about generalization have been tempered to emphasize the exploratory nature of the work. revision: partial
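
For readers unfamiliar with the analyses the rebuttal names, here is a hedged sketch of a per-dimension paired comparison with Benjamini-Hochberg correction, the shape of analysis behind Figure 3's pFDR column. All values are simulated placeholders, not the ReLay data, and the mixed models mentioned in the response are not shown.

```python
import numpy as np
from scipy.stats import ttest_rel
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
dimensions = ["Explanation", "Importance", "Tailored", "Simplicity"]
pvals = []
for dim in dimensions:
    static = rng.normal(3.9, 0.6, size=50)            # placeholder static ratings, n = 50
    interactive = static + rng.normal(0.35, 0.5, 50)   # placeholder within-participant shift
    _, p = ttest_rel(interactive, static)              # paired comparison per dimension
    pvals.append(p)

reject, p_fdr, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")  # Benjamini-Hochberg
for dim, p, q, r in zip(dimensions, pvals, p_fdr, reject):
    print(f"{dim:12s} p={p:.4f}  pFDR={q:.4f}  significant={r}")
```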

Circularity Check

0 steps flagged

Empirical user study with newly collected data exhibits no circularity

full rationale

The paper describes an empirical user study that introduces the ReLay dataset of 300 participant-PLS pairs from 50 new lay participants, along with user characteristics, interaction logs, comprehension outcomes, and quality ratings. Central claims about personalization improving comprehension while raising bias and hallucination risks are derived directly from this fresh data collection and LLM evaluations across five models. No equations, fitted parameters, self-definitional constructs, or load-bearing self-citations reduce any result to prior inputs by construction. The derivation chain consists of standard experimental measurement and comparison, remaining self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the validity of the human-subject data collection and the LLM output analysis; no external benchmarks or formal verification are mentioned.

axioms (1)
  • domain assumption: Self-reported comprehension scores and quality ratings from 50 lay participants accurately reflect true understanding and safety risks in health contexts.
    This assumption underpins interpretation of all reported improvements and risks.

pith-pipeline@v0.9.0 · 5531 in / 1121 out tokens · 78968 ms · 2026-05-09T19:57:43.568718+00:00 · methodology

discussion (0)

