COSMIC: Emotionally Intelligent Agents to Support Mental and Emotional Well-being in Extreme Isolation: Lessons from Analog Astronaut Training Missions

Alexandra Covaci; A. Xygkou-Tsiamoulou; Chee Siang Ang; Jenny Yiend; Zeqi Jia

arxiv: 2604.07589 · v1 · submitted 2026-04-08 · 💻 cs.HC

Pith reviewed 2026-05-10 17:11 UTC · model grok-4.3

keywords AI companions · extreme isolation · mental health support · analog astronaut missions · generative AI · affective computing · space psychology

The pith

COSMIC deploys a generative AI companion with a diffusion avatar to deliver ongoing emotional support in simulated space isolation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents COSMIC as the first formal test of a high-fidelity AI companion that uses large language models and a synthesized visual avatar to address mental and emotional strain in isolated confined environments. It describes a modular architecture that maintains conversation history through short-term and long-term memory so the system can offer continuous rather than one-off support. The authors outline an observational evaluation plan at the LunAres analog station to measure effects on psychological resilience. A reader would care because psychological risks are identified as a leading hazard for future long-duration spaceflight, and any workable affective support tool could lower that hazard.

Core claim

COSMIC constitutes the inaugural investigation into deploying a high-fidelity emotionally intelligent AI companion in an analog astronaut setting. By integrating a Large Language Model architecture with a diffusion-based digital avatar interface, the system transcends task-oriented automation to supply longitudinal affective support. A modular architecture with short- and long-term memory systems is detailed, together with a naturalistic observational framework for tracking psychological resilience at the LunAres Research Station.

What carries the argument

The COSMIC modular architecture that couples an LLM for conversational interaction with a diffusion-based digital avatar for visual empathy, sustained by short- and long-term memory modules for temporal continuity.
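The abstract does not say how the short- and long-term memory modules are implemented, so the following is only a minimal sketch of one way such a two-tier memory could work. Every name in it (`Turn`, `CompanionMemory`, `add_turn`, `recall`, `build_context`) is hypothetical. It also makes three assumptions: short-term memory is a fixed-size window of recent turns, turns evicted from that window are archived in long-term memory, and retrieval is naive keyword overlap, where a real system would more plausibly use embeddings or summaries.

```python
from collections import deque
from typing import NamedTuple


class Turn(NamedTuple):
    speaker: str  # "user" or "companion"
    text: str


class CompanionMemory:
    """Hypothetical two-tier memory; not the design described in the paper."""

    def __init__(self, short_term_size: int = 4):
        if short_term_size < 1:
            raise ValueError("short_term_size must be at least 1")
        # Short-term memory: sliding window over the most recent turns.
        self.short_term = deque(maxlen=short_term_size)
        # Long-term memory: archive of turns that left the window.
        self.long_term = []

    def add_turn(self, turn: Turn) -> None:
        # If the window is full, the oldest turn is about to be evicted,
        # so copy it into long-term memory before appending.
        if len(self.short_term) == self.short_term.maxlen:
            self.long_term.append(self.short_term[0])
        self.short_term.append(turn)

    def recall(self, query: str, k: int = 2) -> list:
        # Rank archived turns by how many words they share with the query
        # (ties broken by age, oldest first) and drop turns with no overlap.
        words = set(query.lower().split())
        scored = [
            (len(words & set(t.text.lower().split())), i, t)
            for i, t in enumerate(self.long_term)
        ]
        hits = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
        return [t for _, _, t in hits[:k]]

    def build_context(self, query: str) -> str:
        # Context for the language model: recalled long-term turns first,
        # then the recent window, then the new user message.
        lines = [f"[recalled] {t.speaker}: {t.text}" for t in self.recall(query)]
        lines += [f"{t.speaker}: {t.text}" for t in self.short_term]
        lines.append(f"user: {query}")
        return "\n".join(lines)


memory = CompanionMemory(short_term_size=2)
memory.add_turn(Turn("user", "I miss my family today"))
memory.add_turn(Turn("companion", "That sounds hard"))
memory.add_turn(Turn("user", "The greenhouse task went well"))
print(memory.build_context("thinking about family again"))
```

In this sketch the earliest turn has already left the two-turn window when the last query arrives, yet it is recalled because it shares the word "family" with the query. That is the kind of continuity across sessions the architecture is meant to provide.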

Load-bearing premise

That the described modular architecture with short- and long-term memory and a diffusion-based avatar will actually deliver effective long-term emotional support when placed in real analog isolation conditions.

What would settle it

An observational study at an analog station that records no reduction in standard psychological strain measures for participants using the full COSMIC system relative to a no-AI baseline would falsify the claimed efficacy.
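As a rough illustration of the comparison this test implies, the sketch below computes the difference in mean pre-to-post change on a strain score between a COSMIC group and a no-AI baseline. The function names and every number are hypothetical placeholders, not data from the paper. Which instrument supplies the score (for example a perceived-stress questionnaire) is also an assumption.

```python
from statistics import mean


def mean_change(pre, post):
    """Mean per-participant change (post minus pre) on a strain score."""
    if len(pre) != len(post) or not pre:
        raise ValueError("pre and post must be non-empty and the same length")
    return mean(after - before for before, after in zip(pre, post))


def difference_in_change(cosmic_pre, cosmic_post, base_pre, base_post):
    """Change in the COSMIC group minus change in the no-AI baseline.

    Negative: strain fell more (or rose less) with COSMIC.
    Zero or positive: no reduction relative to baseline.
    """
    return mean_change(cosmic_pre, cosmic_post) - mean_change(base_pre, base_post)


# Hypothetical placeholder scores, for illustration only.
cosmic_pre, cosmic_post = [20, 22, 18], [18, 19, 17]
base_pre, base_post = [21, 19, 20], [22, 20, 21]
print(difference_in_change(cosmic_pre, cosmic_post, base_pre, base_post))
```

A real analysis would also need an inferential test suited to very small crews. The sketch only fixes the direction of the comparison: a value at or above zero is the outcome that would count against the efficacy claim.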

Original abstract

As humanity pivots toward long-duration interplanetary travel, the psychological constraints of Isolated and Confined Environments (ICE) emerge as a primary mission risk. This paper presents COSMIC (COmpanion System for Mission Interaction and Communication) representing the inaugural investigation into the deployment of a high-fidelity, emotionally intelligent AI companion in an analog astronaut setting. By integrating a Large Language Model (LLM) architecture with a diffusion-based digital avatar interface, COSMIC transcends traditional task-oriented automation to provide longitudinal affective support. We detail a modular system architecture designed for temporal continuity through short- and long-term memory systems and outline a robust naturalistic observational framework for evaluating psychological resilience at the LunAres Research Station. This work constitutes the first formal submission in the field to evaluate the efficacy of state-of-the-art generative AI and synthesized visual empathy in mitigating the effects of extreme isolation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

COSMIC lays out a practical LLM-plus-avatar architecture for affective support in space analogs but delivers only a system description and proposed framework with no deployment data or results.

The core of this paper is a description of COSMIC, a companion system that pairs a large language model with a diffusion-based avatar to give ongoing emotional support in isolated confined environments. The authors explain a modular setup that includes short-term and long-term memory to maintain conversation history across sessions, and they map out a naturalistic observation plan at the LunAres Research Station to track psychological resilience. That architecture is concrete enough to be useful if someone is actually building similar tools for extreme settings. The emphasis on affective rather than purely functional interaction also matches the real mental-health risks of long-duration missions. The paper does a decent job spelling out how modern generative components could be combined for visual and conversational continuity.

The main gap is the lack of any actual outcomes. The text claims this is the first formal effort to evaluate efficacy of this kind of generative AI and visual empathy for isolation, yet it reports no pre/post measures, participant feedback, error rates, or comparative results from real analog use. The evaluation remains a plan rather than something executed, so the efficacy part is asserted rather than shown.

This is the kind of work that could interest HCI researchers or space psychology groups who need examples of how to structure an affective agent. It is less useful for anyone looking for measured evidence that such a system helps. I would send it to peer review in an applications or systems track because the topic is timely and the design choices are described in enough detail to be critiqued and built upon, even if the evaluation section needs substantial strengthening.

Referee Report

1 major / 1 minor

Summary. The manuscript introduces COSMIC, an AI companion system that integrates a Large Language Model with a diffusion-based digital avatar to deliver longitudinal affective support in Isolated and Confined Environments (ICE). It details a modular architecture incorporating short- and long-term memory systems for temporal continuity and outlines a naturalistic observational framework for evaluating psychological resilience during analog astronaut missions at the LunAres Research Station. The work positions itself as the first formal investigation and evaluation of state-of-the-art generative AI combined with synthesized visual empathy for mitigating the effects of extreme isolation.

Significance. If the described architecture is deployed and the proposed evaluation framework produces measurable outcomes on psychological resilience, the work could advance affective computing applications in human-computer interaction for extreme environments, with direct relevance to long-duration spaceflight risks. The modular design with explicit short- and long-term memory mechanisms is a constructive contribution for maintaining interaction continuity, and the use of diffusion models for visual empathy represents a timely extension of generative techniques to emotional support scenarios.

major comments (1)

[Abstract] Abstract: The manuscript claims to present the 'inaugural investigation into the deployment' of COSMIC and states that 'This work constitutes the first formal submission in the field to evaluate the efficacy of state-of-the-art generative AI and synthesized visual empathy in mitigating the effects of extreme isolation.' However, the text supplies only a high-level system architecture description and a prospective outline of an observational framework at LunAres, with no reported deployment, pre/post psychological measures, resilience metrics, participant feedback, error analysis, or comparative results. This gap directly undermines the central efficacy-evaluation claim.

minor comments (1)

[Abstract] Abstract and title: The title references 'Lessons from Analog Astronaut Training Missions,' yet the provided content focuses on system design and a future evaluation plan rather than concrete lessons, observations, or data drawn from completed missions.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and constructive feedback. We address the single major comment below and agree that certain claims in the abstract require revision to accurately reflect the manuscript's scope as a system description and proposed evaluation framework.

Referee: [Abstract] Abstract: The manuscript claims to present the 'inaugural investigation into the deployment' of COSMIC and states that 'This work constitutes the first formal submission in the field to evaluate the efficacy of state-of-the-art generative AI and synthesized visual empathy in mitigating the effects of extreme isolation.' However, the text supplies only a high-level system architecture description and a prospective outline of an observational framework at LunAres, with no reported deployment, pre/post psychological measures, resilience metrics, participant feedback, error analysis, or comparative results. This gap directly undermines the central efficacy-evaluation claim.

Authors: We acknowledge the validity of this observation. The current manuscript introduces the COSMIC architecture (LLM integration with diffusion-based avatar and short-/long-term memory modules) and outlines a naturalistic observational framework for future use at LunAres, but does not contain completed deployment data, psychological metrics, or efficacy results. The phrasing 'inaugural investigation into the deployment' and 'first formal submission... to evaluate the efficacy' therefore overstates the evaluative component. We will revise the abstract to state that this work presents the first integrated system of this type designed for affective support in ICE settings together with a proposed evaluation framework, with actual deployment and outcome measurement reserved for subsequent reports. This change will align the abstract with the manuscript content while preserving the novelty claim regarding the system design itself.

revision: yes

Circularity Check

0 steps flagged

No circularity; descriptive system proposal with no derivations or fitted claims.

The manuscript is a system-description paper that details a modular LLM-plus-diffusion-avatar architecture, short- and long-term memory components, and a proposed naturalistic observational framework at LunAres. No equations, quantitative predictions, fitted parameters, or derivation chains appear anywhere in the text. The central claim of being the first formal evaluation is a prospective assertion about the work itself rather than a result derived from prior steps that reduces to its own inputs by construction. No self-citations function as load-bearing uniqueness theorems, no ansatzes are smuggled in, and no renaming of known results occurs. The paper is therefore self-contained as an engineering proposal and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are specified in the abstract. The system relies on existing LLM and diffusion model technologies.

Reference graph

Works this paper leans on

8 extracted references · 8 canonical work pages

[1] Russell, D. W. (1996). UCLA Loneliness Scale (Version 3): Reliability and validity. Journal of Personality Assessment.

[2] Cohen, S., Kamarck, T., & Mermelstein, R. (1983). A global measure of perceived stress. Journal of Health and Social Behavior.

[3] NASA. (2025). Human Research Program (HRP) Evidence Reports on Isolated and Confined Environments.

[4] OpenAI. (2025). GPT-5 Technical Architecture and Reasoning Capabilities.

[5] LunAres Research Station. (2026). Standardized Analog Mission Training Protocols.

[6] Tong, F., Lederman, R., D’Alfonso, S., Berry, K., & Bucci, S. (2025). Development of a digital therapeutic alliance scale (MM-DTA) in the context of fully automated mental health apps. Behaviour & Information Technology. Advance online publication. https://doi.org/10.1080/0144929X.2025.246967

[7] Heuchert, J. P., & McNair, D. M. (2012). Profile of Mood States 2nd Edition (POMS 2). Multi-Health Systems.

[8] Spatola, N., Kühnlenz, B., & Cheng, G. (2021). Perception and evaluation in human–robot interaction: The Human–Robot Interaction Evaluation Scale (HRIES)—A multicomponent approach of anthropomorphism. International Journal of Social Robotics, 13(7), 1517-1539.

Correspondence: Dr. A. Xygkou-Tsiamoulou (A.Xygkou-Tsiamoulou@kent.ac.uk)
