Exploring Reinforcement Learning for Fluid Transitions Between Clinical Mental Healthcare and Everyday Wellness Support
Pith reviewed 2026-06-27 21:20 UTC · model grok-4.3
The pith
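In a four-week exploratory study, many benefits of RL-optimized mental health intervention sequences appeared only after the interventions ended, and the most engaged participants sustained engagement better with RL-selected prompts than with a fixed one.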
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A contextual bandit that selects journaling prompts from clinical and wellness repertoires to optimize sustained journaling yields intervention benefits that appear after the active period ends and produces deepening engagement over time, in contrast to constant interventions that lead to later burnout and dropout.
What carries the argument
Contextual bandit that dynamically selects journaling prompts from clinical and wellness repertoires to optimize for sustained journaling.
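The summary does not say which bandit algorithm, context features, or reward signal the authors used. As a minimal sketch of the general idea only, the block below implements a tabular epsilon-greedy contextual bandit. The prompt names, the context labels, and the reward (1.0 if the participant journaled, else 0.0) are all hypothetical stand-ins, not details from the paper.

```python
import random
from collections import defaultdict

# Hypothetical prompt repertoires; the paper's actual prompts are not given here.
PROMPTS = {
    "clinical": ["thought_record", "behavioral_activation"],
    "wellness": ["gratitude_list", "best_possible_self"],
}
ARMS = [(kind, name) for kind, names in PROMPTS.items() for name in names]


class EpsilonGreedyBandit:
    """Tabular contextual bandit: one running mean reward per (context, arm)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, arm) -> number of updates
        self.values = defaultdict(float)  # (context, arm) -> mean observed reward

    def select(self, context):
        # Explore with probability epsilon, otherwise exploit the best-known arm.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[(context, arm)])

    def update(self, context, arm, reward):
        # Incremental mean: v <- v + (r - v) / n
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


# Usage: pick a prompt for an assumed context label, then record the outcome.
bandit = EpsilonGreedyBandit(ARMS, epsilon=0.1)
arm = bandit.select("low_mood")
bandit.update("low_mood", arm, reward=1.0)  # assumed reward: participant journaled
```

Because the policy in this sketch mixes clinical and wellness arms in one action set, it can move between repertoires as the observed rewards change, which is the behavior the core claim depends on.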
If this is right
- Benefits of RL-optimized intervention sequences may appear only after the active intervention period ends.
- RL-generated interventions can lead to deepening engagement over time.
- Constant interventions can produce burnout and later dropout.
- Systems blending clinical and wellness interventions may need to incorporate stepping-back periods.
- Intensity of interventions may need to be reduced at times to avoid burnout while still maximizing gains.
Where Pith is reading between the lines
- Additional safety protocols or individual risk checks may be required even when the system optimizes only for journaling.
- The same dynamic-selection approach could be applied to other health behaviors that wax and wane.
- Data on when to insert stepping-back periods could be collected to refine timing rules.
Load-bearing premise
Optimizing for sustained journaling as the single goal produces coherent clinical-wellness care journeys without additional safety measures or clinical oversight.
What would settle it
A follow-up study in which post-intervention benefits fail to appear or in which engagement with RL-generated prompts does not increase over time relative to constant prompts.
Original abstract
Mental health struggles wax and wane, yet clinical and wellness interventions typically operate separately, causing frequent breakdowns at care transitions. We explore reinforcement learning (RL) as a means to build digital health systems that deliver clinical and wellness interventions proactively, as part of a coherent care journey. We ask: what complexities does designing such a system involve? We built a contextual bandit that dynamically selects journaling prompts from clinical and wellness repertoires to optimize for an overarching health goal (sustained journaling) and deployed it in a four-week exploratory study (N=38). We found that, first, many benefits of RL-optimized intervention sequences appeared only after interventions ended, raising the question: Should systems that offer coherent clinical-wellness care journeys include stepping-back periods? If so, when and how? Second, participants most engaged with RL-generated interventions deepened their engagement over time, while those most engaged with a constant intervention tended to burn out and drop out later. This raises the question: When should a system blending clinical and wellness interventions reduce intensity to prevent burnout versus sustain it to maximize treatment gains?
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports an exploratory four-week study (N=38) in which a contextual bandit RL system dynamically selects journaling prompts drawn from clinical and wellness repertoires, with the single optimization target of sustained journaling. The central observations are that apparent benefits of the RL-optimized sequences emerged only after the active intervention period, and that participants who engaged most with the RL-generated prompts showed deepening engagement over time while those receiving a constant intervention tended to burn out and drop out.
Significance. If the reported post-intervention effects and engagement dynamics prove robust, the work supplies concrete empirical observations that can inform the design of hybrid clinical-wellness digital mental-health systems. It explicitly surfaces two actionable open questions—whether coherent care journeys should incorporate deliberate stepping-back periods and how intensity should be modulated to avoid burnout—thereby contributing to the HCI literature on care transitions without overclaiming validated clinical efficacy.
major comments (1)
- [Methods and Results] Methods and Results sections: the manuscript provides no description of the statistical methods, baseline comparisons, effect-size calculations, or precise operationalization of 'benefits' and 'engagement' used to support the two central findings (post-intervention emergence of benefits; differential burnout trajectories). These details are load-bearing for evaluating whether the observations can be distinguished from noise or regression to the mean.
minor comments (1)
- [Abstract] Abstract: the sample size (N=38) and study duration (four weeks) appear only in the body; including them in the abstract would improve immediate context for readers.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback highlighting the need for greater methodological transparency. We agree that the absence of detailed statistical descriptions, baselines, effect sizes, and operational definitions weakens the manuscript's ability to support its central claims. We will revise the Methods and Results sections to address this directly.
- Referee: [Methods and Results] Methods and Results sections: the manuscript provides no description of the statistical methods, baseline comparisons, effect-size calculations, or precise operationalization of 'benefits' and 'engagement' used to support the two central findings (post-intervention emergence of benefits; differential burnout trajectories). These details are load-bearing for evaluating whether the observations can be distinguished from noise or regression to the mean.
Authors: We fully agree that these details were omitted and are essential for interpreting the exploratory findings. In the revised manuscript we will add: (1) a Methods subsection specifying all statistical procedures (including any pre-registered or post-hoc tests, handling of missing data, and correction for multiple comparisons); (2) explicit baseline comparisons (pre-intervention scores, between-group contrasts where applicable); (3) effect-size reporting (e.g., Cohen’s d or rank-biserial correlations) for all key contrasts; and (4) precise operational definitions: 'benefits' as changes on validated mental-health and engagement scales, 'engagement' as daily prompt completion rate plus qualitative depth indicators. These additions will allow readers to evaluate the post-intervention effects and burnout trajectories against regression to the mean or noise.
revision: yes
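For readers unfamiliar with the two effect-size measures named in the response, the sketch below shows one common way to compute each for two independent groups: Cohen's d with a pooled standard deviation, and the rank-biserial correlation as the difference between the proportions of favorable and unfavorable pairs. The function names and the toy numbers are illustrative assumptions, not the authors' analysis code or study data.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Standardized mean difference using the pooled sample variance."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def rank_biserial(a, b):
    """(pairs where a > b minus pairs where a < b) / all pairs; ties count as zero."""
    wins = sum(x > y for x in a for y in b)
    losses = sum(x < y for x in a for y in b)
    return (wins - losses) / (len(a) * len(b))


# Toy scores for two hypothetical groups (not data from the study).
group_a = [2, 4, 6]
group_b = [1, 3, 5]
print(cohens_d(group_a, group_b))       # 0.5
print(rank_biserial(group_a, group_b))  # 3 of 9 net favorable pairs, about 0.333
```

The rank-biserial form makes no normality assumption, which may matter for a small sample such as N=38 with skewed engagement counts.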
Circularity Check
No significant circularity
The paper presents an exploratory four-week deployment study of a contextual bandit for selecting journaling prompts, with observations on post-intervention effects and engagement patterns. No derivation chain, equations, fitted parameters, or first-principles predictions are described that could reduce to the study's own inputs by construction. The work frames its contribution as surfacing open questions rather than asserting validated causal mechanisms or unified models, making self-contained empirical reporting the central content.
Wang and Yang. Healthcare Beyond Reaction Workshop @ IH ’26, July 05–08, 2026, Porto, Portugal.