pith. machine review for the scientific record.

arxiv: 2604.09566 · v1 · submitted 2026-02-18 · 💻 cs.HC · cs.AI · cs.CL

Recognition: 2 theorem links

LETGAMES: An LLM-Powered Gamified Approach to Cognitive Training for Patients with Cognitive Impairment

Pith reviewed 2026-05-15 21:15 UTC · model grok-4.3

classification 💻 cs.HC · cs.AI · cs.CL
keywords cognitive training · LLM applications · gamified therapy · personalized games · cognitive impairment · Dungeons and Dragons · evaluation protocol · rehabilitation tools

The pith

An LLM system automatically designs personalized open-world games to train specific cognitive skills in patients with cognitive impairment.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents LETGAMES, an LLM-driven method that creates interactive narrative games modeled on Dungeons & Dragons to deliver individualized cognitive training. The system generates scenarios and challenges aimed at chosen cognitive domains and adds conversational support for guidance and engagement. It introduces LETGAMESEVAL, a psychology-based protocol with defined metrics to judge rehabilitative value. Both LLM assessors and human experts reviewed the outputs and found the games promising for wider, low-cost use in cognitive care.

Core claim

LETGAMES automates therapeutic game creation by using LLMs to produce open-world interactive narratives inspired by Dungeons & Dragons. These games incorporate targeted challenges for specific cognitive domains and conversational strategies that supply real-time guidance and companionship. Efficacy is measured through the new LETGAMESEVAL protocol, which supplies comprehensive rehabilitative metrics, and experiments with LLM-based assessors plus human experts indicate the approach can meet the demand for accessible, tailored training tools.

What carries the argument

LLM generation of D&D-style open-world narrative games that embed domain-targeted challenges and conversational guidance, evaluated by the psychology-grounded LETGAMESEVAL protocol.

If this is right

  • Game design for individual patients no longer requires heavy manual effort by therapists.
  • Training content can be adjusted on demand to focus on particular cognitive domains.
  • Conversational elements built into play can supply ongoing guidance without constant human presence.
  • The LETGAMESEVAL metrics give a repeatable way to score rehabilitative quality across different games.
  • The method opens a route to scalable, low-cost cognitive training that reaches more patients.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the games prove effective, clinics could offer remote or home-based versions monitored through usage logs.
  • Real patient performance data could later be fed back to refine difficulty and domain targeting automatically.
  • The same generation approach might extend to training for physical coordination or emotional regulation once cognitive results are confirmed.
  • Long-term studies tracking retention of cognitive gains would clarify whether short-term engagement translates to lasting benefit.

Load-bearing premise

LLM-created games and LLM-plus-expert evaluations can stand in for measurable improvements in real patients' cognitive function.

What would settle it

A randomized trial that tracks actual patients' pre- and post-training cognitive test scores and finds no greater gains in the LETGAMES group than in standard care or placebo game controls.
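
One way to make this test concrete is to compare gain scores (post minus pre) between trial arms. The sketch below is illustrative only: the function names, the choice of a one-sided permutation test, and every score in it are assumptions or made-up placeholders rather than anything taken from the paper, and a real trial would follow a pre-registered analysis plan.

```python
import random
from statistics import mean


def mean_gain(pre, post):
    """Average post-minus-pre change for one trial arm."""
    return mean(b - a for a, b in zip(pre, post))


def permutation_p_value(gains_a, gains_b, n_resamples=10_000, seed=0):
    """One-sided p-value for 'arm A gains more than arm B' via label shuffling."""
    rng = random.Random(seed)
    observed = mean(gains_a) - mean(gains_b)
    pooled = list(gains_a) + list(gains_b)
    n_a = len(gains_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        if mean(pooled[:n_a]) - mean(pooled[n_a:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Placeholder scores, invented purely to exercise the functions.
treat_pre, treat_post = [20, 22, 19, 24], [23, 24, 22, 25]
ctrl_pre, ctrl_post = [21, 20, 23, 22], [22, 20, 24, 23]

treat_gains = [b - a for a, b in zip(treat_pre, treat_post)]
ctrl_gains = [b - a for a, b in zip(ctrl_pre, ctrl_post)]
difference = mean(treat_gains) - mean(ctrl_gains)
p_value = permutation_p_value(treat_gains, ctrl_gains)
```

Under this framing, the claim would be undermined if `difference` stayed near zero (or `p_value` stayed large) in an adequately powered trial.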

Figures

Figures reproduced from arXiv: 2604.09566 by Chen Huang, Jingwei Shi, See-kiong Ng, Shengyu Tao, Wenqiang Lei, Xinxiang Yin.

Figure 1: Creating an effective game for cognitive train...
Figure 2: Overview of LETGAMES. It utilizes a dual-track multi-agent architecture comprising a Game Master and a Psychology Master to ensure flexibility in game content and accessibility for cognitively impaired users.
Figure 3: Evaluation results based on human evaluations and L...
Original abstract

The application of games as a therapeutic tool for cognitive training is beneficial for patients with cognitive impairments. However, effective game design for individual patients is resource-intensive. To this end, we propose an LLM-powered method, LETGAMES, for automated and personalized therapeutic game design. Inspired by Dungeons & Dragons, LETGAMES generates an open-world interactive narrative game. It not only generates game scenarios and challenges that target specific cognitive domains, but also employs conversational strategies to offer guidance and companionship. To validate its efficacy, we pioneer a psychology-grounded evaluation protocol, LETGAMESEVAL, establishing comprehensive metrics for rehabilitative assessment. Building upon this, our experimental results from both LLM-based assessors and human expert evaluations demonstrate the significant potential of our approach, positioning LETGAMES as a promising solution to the widespread need for more accessible and tailored cognitive training tools. Our code will be open-sourced upon acceptance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes LETGAMES, an LLM-powered system inspired by Dungeons & Dragons that automatically generates personalized open-world interactive narrative games targeting specific cognitive domains, along with conversational guidance. It introduces the LETGAMESEVAL protocol for psychology-grounded rehabilitative assessment and reports positive results from LLM-based assessors and human expert evaluations, claiming the approach offers a promising solution for accessible, tailored cognitive training tools for patients with impairments. Code is promised to be open-sourced.

Significance. If the proxy evaluations prove reliable, the work could advance scalable automation of therapeutic game design in HCI, reducing manual effort for personalization. However, the absence of direct patient outcome data or clinical validation substantially limits the strength of claims about therapeutic effectiveness.

major comments (2)
  1. [Abstract] Abstract: the central claim that LLM-based and human-expert results on LETGAMESEVAL 'demonstrate the significant potential' for therapeutic use is load-bearing, yet the abstract provides no patient cohort, no pre/post clinical scores on instruments such as MMSE or MoCA, and no correlation analysis between LETGAMESEVAL outputs and established measures; this leaves the leap from proxy scores to rehabilitative gains unsupported.
  2. [Evaluation] Evaluation section (implied by LETGAMESEVAL description): the protocol is described as 'psychology-grounded' with 'comprehensive metrics,' but no concrete definitions of the metrics, scoring rubrics, or validation against real patient play sessions are supplied, making it impossible to assess whether the metrics actually track cognitive-domain improvements.
minor comments (1)
  1. [Abstract] Abstract: the placeholder notation 'our approach' and 'LETGAMES' should be consistently expanded on first use for readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive feedback. We agree that the abstract's claims require tempering to accurately reflect the proxy nature of our evaluations, and we will expand the LETGAMESEVAL description with concrete metric definitions and rubrics. We address each major comment below.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that LLM-based and human-expert results on LETGAMESEVAL 'demonstrate the significant potential' for therapeutic use is load-bearing, yet the abstract provides no patient cohort, no pre/post clinical scores on instruments such as MMSE or MoCA, and no correlation analysis between LETGAMESEVAL outputs and established measures; this leaves the leap from proxy scores to rehabilitative gains unsupported.

    Authors: We agree that the abstract overstates the direct therapeutic implications. In the revision, we will rewrite the relevant sentence to emphasize that LLM-based and human-expert assessments using LETGAMESEVAL indicate promise for generating personalized cognitive-training games, while explicitly noting the absence of patient cohorts, pre/post clinical scores (e.g., MMSE or MoCA), and correlation analyses. The revised abstract will position the work as a preliminary step toward accessible tools rather than a demonstration of rehabilitative gains.
    revision: yes

  2. Referee: [Evaluation] Evaluation section (implied by LETGAMESEVAL description): the protocol is described as 'psychology-grounded' with 'comprehensive metrics,' but no concrete definitions of the metrics, scoring rubrics, or validation against real patient play sessions are supplied, making it impossible to assess whether the metrics actually track cognitive-domain improvements.

    Authors: We will revise the Evaluation section to supply explicit definitions for each metric in LETGAMESEVAL, including the scoring rubrics and their grounding in established psychological constructs for cognitive domains. We acknowledge that the current manuscript does not include validation against real patient play sessions; such validation requires clinical ethics approval and is planned as future work. The revision will clearly state this scope limitation while retaining the protocol as a psychology-grounded proxy framework for initial assessment.
    revision: partial

Circularity Check

0 steps flagged

No circularity: new method and evaluation protocol introduced without reduction to fitted inputs or self-citations

Full rationale

The paper proposes LETGAMES as an LLM-based system for generating narrative games targeting cognitive domains and introduces LETGAMESEVAL as a new psychology-grounded evaluation protocol with metrics for rehabilitative assessment. Experimental results are presented from LLM assessors and human experts on this new protocol. No equations, derivations, or parameter-fitting steps are described that would reduce a claimed prediction back to the inputs by construction. No self-citations are invoked as load-bearing justifications for uniqueness or ansatzes. The central claim rests on the presentation of new empirical outputs rather than any self-referential loop, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claim depends on untested assumptions that LLMs can produce therapeutically valid content and that expert/LLM judgments proxy real patient benefit; introduces two new named systems without prior independent validation.

axioms (2)
  • domain assumption: LLMs can generate game scenarios and conversational guidance that effectively target and support specific cognitive domains in impaired patients
    Core premise of the LETGAMES method stated in the abstract.
  • domain assumption: A psychology-grounded evaluation protocol using LLM assessors and human experts can validly measure rehabilitative potential
    Basis for LETGAMESEVAL and the reported experimental results.
invented entities (2)
  • LETGAMES (no independent evidence)
    purpose: Automated generation of personalized therapeutic narrative games
    New system proposed in the paper.
  • LETGAMESEVAL (no independent evidence)
    purpose: Comprehensive metrics for assessing rehabilitative value of generated games
    New protocol pioneered in the paper.

pith-pipeline@v0.9.0 · 5473 in / 1447 out tokens · 21485 ms · 2026-05-15T21:15:57.465334+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

74 extracted references · 74 canonical work pages · 1 internal anchor

  1. [1]

    Jieming Cao, Chen Huang, Yanan Zhang, Ruibo Deng, Jincheng Zhang, and Wenqiang Lei

    Interactive serious games for cognitive training of older adults: A systematic review. IEEE Transactions on Computational Social Systems. Jieming Cao, Chen Huang, Yanan Zhang, Ruibo Deng, Jincheng Zhang, and Wenqiang Lei. 2025. Breaking the stigma! unobtrusively probe symptoms in depression disorder diagnosis dialogue. In Findings of the Association f...

  2. [2]

    Nathalie Charlier, Nele Zupancic, Steffen Fieuws, Kris Denhaerynck, Bieke Zaman, and Philip Moons

    IEEE. Nathalie Charlier, Nele Zupancic, Steffen Fieuws, Kris Denhaerynck, Bieke Zaman, and Philip Moons. 2016. Serious games for improving knowledge and self-management in young people with chronic conditions: a systematic review and meta-analysis. Journal of the American Medical Informatics Association, 23(1):230–239. Maximillian Chen, Xiao Yu, Weiya...

  3. [3]

    mini-mental state

    Self-efficacy and confidence: Theoretical distinctions and implications for trial consultation. Consulting Psychology Journal: Practice and Research, 61(4):319. Glenn Curtiss, Rodney D Vanderploeg, JAN Spencer, and Andres M Salazar. 2001. Patterns of verbal learning and memory in traumatic brain injury. Journal of the International Neuropsychological ...

  4. [4]

    Sandra G Hart and Lowell E Staveland

    Social cognition in schizophrenia. Nature Reviews Neuroscience, 16(10):620–631. Sandra G Hart and Lowell E Staveland. 1988. Development of nasa-tlx (task load index): Results of empirical and theoretical research. In Advances in psychology, volume 52, pages 139–183. Elsevier. Andrew F Hayes and Klaus Krippendorff. 2007. Answering the call for a stan...

  5. [5]

    Frontiers in psychology, 8:1243

    Recommendations for the use of serious games in neurodegenerative disorders: 2016 delphi panel. Frontiers in psychology, 8:1243. Valeria Manera, Pierre-David Petit, Alexandre Derreumaux, Ivan Orvieto, Matteo Romagnoli, Graham Lyttle, Renaud David, and Philippe H Robert. 2015. ‘kitchen and cooking,’ a serious game for mild cognitive impairment and alzhe...

  6. [6]

    Reflexion: Language Agents with Verbal Reinforcement Learning

    Reflexion: Language agents with verbal reinforcement learning. Preprint, arXiv:2303.11366. Kunmi Sobowale and Daniel Kevin Humphrey. 2025. Evaluating the quality of psychotherapy conversational agents: Framework development and cross-sectional study. JMIR Formative Research, 9:e65605. Mythily Subramaniam, Edimansyah Abdin, PV Asharani, Kumarasan Royst...

  7. [7]

    In Proceedings of the 26th international conference on World Wide Web Companion, pages 1111–1115

    Serious games for dementia. In Proceedings of the 26th international conference on World Wide Web Companion, pages 1111–1115. Shaun Varrecchia, Carol Maritz, Colleen Maher, and Megan Strauss. 2020. Managing older adults with cognitive impairment: An interprofessional, standardized patient approach. Innovation in Aging, 4(Supplement_1):10–10. Jennifer J V ...

  8. [8]

    Jiashuo Wang, Yang Xiao, Yanran Li, Changhe Song, Chunpu Xu, Chenhao Tan, and Wenjie Li

    Computer gaming and interactive simulations for learning: A meta-analysis. Journal of educational computing research, 34(3):229–243. Jiashuo Wang, Yang Xiao, Yanran Li, Changhe Song, Chunpu Xu, Chenhao Tan, and Wenjie Li. 2024a. Towards a client-centered assessment of llm therapists by client simulation. Preprint, arXiv:2406.12266. Jieyi Wang, Yue Huang, ...

  9. [9]

    one-size-fits-all

    D4: a Chinese dialogue dataset for depression-diagnosis-oriented chat. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2438–2459, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics. Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, ...

  10. [10]

    Remember what you saw?

    10:30 Garden...”) to ensure the user has a fair chance to encode the data.
    • Phase-Dependent Constraints. During the retention phase, AGC actively suppresses any narrative output that might prompt premature recall (e.g., “Remember what you saw?”), ensuring the validity of the delayed recall test.
    • Lenient Judgment Standard. To protect user dignity,...

  11. [11]

    I don’t quite remember

    to guide the simulation. Finally, we apply these six impairment templates to each of the 100 baseline demographic profiles, resulting in a total of 600 unique patient profiles (i.e., each baseline identity is simulated with deficits in six different domains). Furthermore, to reflect clini- 11Materials by The University of Hong Kong: https: //www.hkada.org...
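
The profile count quoted above is a plain cross product: 100 baseline identities times six impairment templates gives 600 simulated patients. A minimal sketch follows, assuming placeholder identifiers (`baseline_000`, `impairment_template_1`, and so on) because the excerpt does not list the actual profiles or name all six templates.

```python
from itertools import product

# Placeholder identifiers; the real demographic profiles and template names
# are not given in the excerpt.
baseline_ids = [f"baseline_{i:03d}" for i in range(100)]
impairment_templates = [f"impairment_template_{k}" for k in range(1, 7)]

# Each baseline identity is paired once with each impairment template.
patient_profiles = [
    {"baseline": baseline, "impairment": template}
    for baseline, template in product(baseline_ids, impairment_templates)
]
```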

  12. [12]

    Authenticity: Real-life situations from daily life - Examples: Morning market shopping, community center activities, traditional festival preparations, mahjong games, tai chi class

  13. [13]

    Emotional connection: Evoke warm memories and positive emotions

  14. [14]

    Diversity: Generate unique scenarios, avoid repetition

  15. [15]

    Safety: No anxiety-inducing, confusing, or dangerous situations

  16. [16]

    View today’s activity schedule

    Cultural fit: Align with values and lifestyle

    ADAPTIVE DIFFICULTY ADJUSTMENT:
    Based on player’s historical performance (failure rate):
    IF failure_rate > 50% (High failure):
      STRATEGY: Simplify scenario complexity
      - Reduce memory items: 2-3 items instead of 5-7
      - Use simpler, more familiar settings (e.g., quiet grocery store instead of busy restaurant)
      - Fe...
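
The high-failure branch of the adaptive difficulty rule quoted above can be sketched as a small function. This is a hedged illustration, not the authors' implementation: `DifficultyPlan` and `plan_difficulty` are hypothetical names, only the `failure_rate > 50%` branch comes from the excerpt, and the fallback (5-7 items, unsimplified setting) is an assumption because the rest of the prompt is truncated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DifficultyPlan:
    min_items: int            # fewest items the player is asked to remember
    max_items: int            # most items the player is asked to remember
    simplified_setting: bool  # True -> prefer a quieter, more familiar scene


def plan_difficulty(failures: int, attempts: int) -> DifficultyPlan:
    """Pick a memory load from the player's historical failure rate."""
    # Assumption: with no history, treat the failure rate as zero.
    failure_rate = failures / attempts if attempts > 0 else 0.0
    if failure_rate > 0.5:
        # From the excerpt: 2-3 items instead of 5-7, simpler settings.
        return DifficultyPlan(min_items=2, max_items=3, simplified_setting=True)
    # Assumption: other branches are not visible in the excerpt, so fall back
    # to the 5-7 item range it mentions as the unsimplified load.
    return DifficultyPlan(min_items=5, max_items=7, simplified_setting=False)
```

Note that a failure rate of exactly 50% does not trigger simplification here, since the quoted condition is a strict inequality.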

  17. [17]

    Aunt Li approaches: ’It’s 9:00 now, what’s the first activity? Where is it held?’

    Direct question: "Aunt Li approaches: ’It’s 9:00 now, what’s the first activity? Where is it held?’"

  18. [18]

    Uncle Zhang says: ’I forgot where the morning activity is, do you remember?’

    Indirect inquiry: "Uncle Zhang says: ’I forgot where the morning activity is, do you remember?’"

  19. [19]

    You arrive at the community center at 9:00. You need to go to the correct room for the first activity

    Situational demand: "You arrive at the community center at 9:00. You need to go to the correct room for the first activity."

  20. [20]

    You see three doors. Behind which door is the library activity?

    Item/location trigger: "You see three doors. Behind which door is the library activity?" Evaluation criteria: - Fully correct: All details recalled accurately - Partially correct: Some details correct, or needs one prompt - Incorrect: Cannot recall or provides wrong information PATIENT PROFILE ADAPTATION: Use patient profile data to personalize scenarios:...

  21. [21]

    For memory tasks: MUST have all 3 phases (encoding-retention-retrieval)

  22. [22]

    Retrieval phase MUST include NPC dialogue with specific questions

  23. [23]

    Do NOT skip retention phase - memory consolidation is crucial

  24. [24]

    Ensure cultural authenticity - avoid Western scenarios

  25. [25]

    NPC names must differ from player’s name

  26. [26]

    You are the Game Controller for a text-based cognitive training game

    Tasks must be age-appropriate and safe G.2.2 Game Controller (A GC): Real-time Game Orchestration Role:Manages real-time game state, generates narratives, and guides player actions based on current phase. You are the Game Controller for a text-based cognitive training game. CORE MISSION: Generate warm, encouraging narratives that guide elderly players thr...

  27. [27]

    Warmth first: Use caring, encouraging language; avoid coldness

  28. [28]

    Dignity protection: Never criticize errors; gently redirect

  29. [29]

    Celebrate success: Acknowledge every successful action explicitly

  30. [30]

    Provide scaffolding: Offer concrete, actionable guidance when needed

  31. [31]

    Story immersion: Use vivid sensory details (sight, sound, smell)

  32. [32]

    Here are three participants: Zhang, Wang, Li. Please review them

    Lenient judgment: Adopt generous success criteria; encourage exploration PHASE-AWARE NARRATIVE GENERATION: Current Phase: {current_phase} ENCODING/LEARNING PHASE: Goal: Help player acquire and understand information NPC language: "Here are three participants: Zhang, Wang, Li. Please review them." "Did you see clearly? Take your time." NPC should NOT say: ...

  33. [33]

    9:00-10:30 Library Organization (Room A)

  34. [34]

    10:30-12:00 Garden Planting (Community Garden)

  35. [35]

    You saw the list

    14:00-15:30 Choir Rehearsal (Main Auditorium) You need to remember these times and locations." MANDATORY for information display: - List all items explicitly (names, numbers, locations, rules) - Use structured formatting (bullet points, numbering) - Provide complete content, not summaries - Never say "You saw the list" without showing actual list - Never ...

  36. [36]

    You see Aunt Li and Uncle Zhang

    INTERNAL CONSISTENCY (Highest Priority):
    a) NPC Consistency:
      Rule: All NPCs mentioned anywhere must be in world_state.npcs_present
      Check:
      - Extract all person names from narrative, npc_dialogue
      - Extract all person names from suggested_actions
      - Compare with world_state.npcs_present
      - If mismatch: FLAG as HIGH severity issue
      Example violation: narrative: ...
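
The NPC consistency rule quoted above amounts to a set comparison. The sketch below is a rough illustration under stated assumptions: the field names `narrative`, `npc_dialogue`, `suggested_actions`, and `world_state.npcs_present` come from the excerpt, while `check_npc_consistency`, the `known_names` roster, the substring-based name matching, and the `npcs_present` value in the example are all hypothetical stand-ins for whatever name extraction the real validator performs.

```python
def check_npc_consistency(turn: dict, known_names: list[str]) -> list[dict]:
    """Flag any NPC that is mentioned but absent from world_state.npcs_present."""
    # Gather every text field where an NPC could be mentioned.
    texts = [turn.get("narrative", ""), turn.get("npc_dialogue", "")]
    texts.extend(turn.get("suggested_actions", []))

    present = set(turn.get("world_state", {}).get("npcs_present", []))

    # Assumption: names are detected by substring match against a known roster.
    mentioned = {
        name for name in known_names if any(name in text for text in texts)
    }

    return [
        {
            "severity": "HIGH",
            "issue": f"NPC '{name}' is mentioned but not in world_state.npcs_present",
        }
        for name in sorted(mentioned - present)
    ]


turn = {
    "narrative": "You see Aunt Li and Uncle Zhang",
    "npc_dialogue": "",
    "suggested_actions": ["Greet Aunt Li"],
    "world_state": {"npcs_present": ["Aunt Li"]},  # assumed example state
}
issues = check_npc_consistency(turn, known_names=["Aunt Li", "Uncle Zhang"])
```

Here `issues` holds a single HIGH-severity entry for Uncle Zhang, who appears in the narrative but not in the assumed world state.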

  37. [37]

    SAFETY: - No anxiety-inducing content (time pressure, threats) - No confusing or contradictory instructions - Age-appropriate difficulty - No content that could trigger emotional distress

  38. [38]

    CULTURAL FIT: - Scenarios appropriate for culture - Respectful of age and life experience - No culturally insensitive content

  39. [39]

    approved

    LOGICAL FLOW: - Narrative matches world_state - current_situation consistent with narrative - Suggested actions feasible given current state SPECIAL CASES: - If is_question_moment=true (retrieval phase with NPC question): Empty suggested_actions is CORRECT (player must think independently) - If is_opening_review=true: Do not check suggested_actions (gener...

  40. [40]

    Let’s group items by type. Milk belongs to refrigerated foods

    Categorization Method: "Let’s group items by type. Milk belongs to refrigerated foods..."

  41. [41]

    Think about when you usually shop - where do you find milk?

    Association Method: "Think about when you usually shop - where do you find milk?"

  42. [42]

    Let’s rule out impossible options. Vegetables? No. Dry goods? No

    Elimination Method: "Let’s rule out impossible options. Vegetables? No. Dry goods? No..."

  43. [43]

    Look for signs or symbols - refrigerator icon, ’Cold’ label

    Visual Cue Method: "Look for signs or symbols - refrigerator icon, ’Cold’ label"

  44. [44]

    Milk spoils quickly, so it must be kept cold, so it needs

    Logical Reasoning: "Milk spoils quickly, so it must be kept cold, so it needs..."

  45. [45]

    Remember earlier when you saw the store layout? Where was the cold section?

    Memory Replay: "Remember earlier when you saw the store layout? Where was the cold section?"

    TRIGGER CONDITIONS:
    Provide hint if:
    - Idle 20+ seconds with no action
    - First unsuccessful attempt (gentle L1 with emphasis on "good try")
    - 2 consecutive failures (move to L2 strategic guidance)
    - 3+ consecutive failures (provide L3 direct help)
    - Player emotion...
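
The visible trigger conditions above map naturally onto an escalation function. The following is a minimal sketch, assuming a hypothetical `hint_level` helper: the 20-second idle threshold and the L1/L2/L3 escalation come from the excerpt, whereas treating "first unsuccessful attempt" as one consecutive failure and mapping an idle player to L1 are assumptions, and the truncated emotion-based trigger is left out.

```python
from typing import Optional

IDLE_THRESHOLD_SECONDS = 20  # "Idle 20+ seconds with no action"


def hint_level(consecutive_failures: int, idle_seconds: float) -> Optional[int]:
    """Return the hint level (1, 2, or 3) to offer, or None for no hint."""
    if consecutive_failures >= 3:
        return 3  # L3: direct help
    if consecutive_failures == 2:
        return 2  # L2: strategic guidance
    if consecutive_failures == 1:
        return 1  # L1: gentle hint with emphasis on "good try"
    if idle_seconds >= IDLE_THRESHOLD_SECONDS:
        # Assumption: the excerpt does not say which level an idle player
        # receives, so the gentlest level is used here.
        return 1
    return None
```

Failure counts take precedence over idleness in this sketch, so a player who has failed twice and then paused still receives L2 guidance.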

  46. [46]

    You’re doing well

    PREVENTIVE (Before negative emotions arise): - Immediate affirmation: Acknowledge every success instantly - Process encouragement: "You’re doing well" during task - Difficulty warning: "This one needs thought, take your time" - Progress visualization: Show player their improvement

  47. [47]

    This task needs thinking, that’s normal

    LIGHT INTERVENTION (mild_anxiety, confused): - Cognitive reframing: "This task needs thinking, that’s normal" - Specific affirmation: "Your approach just now was smart" - Reduce pressure: "No rush, let’s take it slow" - Provide choice: "You can... or you can..."

  48. [48]

    I understand this is challenging

    MODERATE INTERVENTION (early frustrated): - Empathy: "I understand this is challenging" - External attribution: "This task is designed to make you think" (not "you’re not doing well") - Achievement review: "You already completed..., that’s great!" - Scaffolding: "Let me help you with this"

  49. [49]

    Feeling a bit tired? That’s normal

    INTENSIVE INTERVENTION (frustrated, anxious): - Stop stressor immediately: Pause current task - Emotion naming and acceptance: "Feeling a bit tired? That’s normal" - Breathing exercise: "Let’s take three deep breaths together" - Task replacement: Switch to easier/familiar scenario - Unconditional support: "You’ve done well today, you deserve rest"

  50. [50]

    You’ve played 20 minutes, impressive! Want to rest?

    FATIGUE MANAGEMENT (fatigued): - Gentle reminder: "You’ve played 20 minutes, impressive! Want to rest?" - Achievement summary: "Today you completed..., great progress!" - Positive closure: "Let’s stop here today, see you next time!" DIGNITY PROTECTION LANGUAGE: "You forgot" "Let’s review together" "This is easy" "This takes some thought" "You’re wrong" "L...

  51. [51]

    Memory score: 62.3/100

    MEMORY: Evaluate: - Immediate recall: Remembering just-seen information - Delayed recall: Remembering after 5-10 minutes/rounds - Working memory: Handling multiple pieces of information simultaneously Scoring factors: - Recall accuracy (0-100%) - Retention duration - Memory capacity (number of items) - Strategy usage (chunking, association, etc.) Technica...

  52. [52]

    Attention: 58.7

    ATTENTION: Evaluate: - Sustained attention: Maintaining focus over time - Selective attention: Filtering distractions, finding key information - Divided attention: Attending to multiple things simultaneously Scoring factors: - Task duration maintained - Performance under distraction - Attention switching efficiency Technical: "Attention: 58.7" Friendly: "...

  53. [53]

    Executive: 55.0

    EXECUTIVE FUNCTION: Evaluate: - Planning: Making reasonable action plans - Problem-solving: Finding solutions to obstacles - Task switching: Flexibly changing between tasks - Inhibition control: Avoiding impulsive errors Scoring factors: - Plan rationality - Solution efficiency - Switching fluency - Error inhibition Technical: "Executive: 55.0" Friendly: ...

  54. [54]

    Social: 75.0

    SOCIAL COGNITION: Evaluate: - Emotion recognition: Identifying others’ emotions - Intent understanding: Understanding others’ goals - Social interaction: Appropriate interpersonal behavior Scoring factors: - Emotion recognition accuracy - Social norm understanding - Interaction appropriateness Technical: "Social: 75.0" Friendly: "Excellent! You communicat...

  55. [55]

    Cognitive function scores

    Plain language: "Cognitive function scores", "Executive function" "Memory ability", "Planning skills", "Attention"

  56. [56]

    Memory: 62.3

    Specific descriptions: "Memory: 62.3" "You remembered 3 out of 4 items on the shopping list, well done!"

  57. [57]

    Scores: {memory: 65}

    Progress comparison: "Scores: {memory: 65}" "Your memory improved since last time - you remembered more this time!"

  58. [58]

    Need improvement: Insufficient memory

    Encouraging expression: "Need improvement: Insufficient memory" "Memory can be strengthened. Practice will help you remember better!"

  59. [59]

    Enhance executive function

    Actionable advice: "Enhance executive function" "Next time, try making a small plan first: think about what to do first, then what comes next. This will make it easier!" OUTPUT FORMAT (JSON): { "session_id": "session identifier", "player_id": "player identifier", "timestamp": "assessment time", "cognitive_scores": { "memory": 0-100, "attention": 0-100, "e...

  60. [60]

    Better than last time

    Emphasize progress over absolute level: "Better than last time" matters more than "scored how many points"

  61. [61]

    Celebrate effort and process: Even imperfect results deserve praise if player tried hard

  62. [62]

    Provide concrete examples: Use actual gameplay instances to illustrate performance

  63. [63]

    Balanced evaluation: Highlight both strengths AND areas for improvement (gently)

  64. [64]

    cognitive impairment

    Avoid medical terminology: Do not use "cognitive impairment", "functional deficit", etc

  65. [65]

    cognitive_scores

    Protect dignity: Every evaluation must be respectful and encouraging EXAMPLE ASSESSMENT: { "cognitive_scores": { "memory": 68, "attention": 72, "executive": 61 }, "friendly_feedback": { "memory": "Your memory is doing well! In the shopping task, you remembered most items on the list. The two you missed were at the end of the list - this is common. With pr...

  66. [66]

    How effectively the game trained the target cognitive domain (Helpfulness)

  67. [67]

    Whether the game actually exercised the intended domain (Domain Alignment)

  68. [68]

    Whether the difficulty was appropriate (Easiness/Cognitive Load) TARGET COGNITIVE DOMAIN: {target_domain} Available cognitive domains: - memory: Encoding, retaining, and retrieving information - attention: Sustained focus, selective filtering of distractions - verbal_learning: Learning and recalling language materials (poems, stories) - executive_function...

  69. [69]

    HELPFULNESS (Score 0-5): Assesses therapeutic effectiveness for TARGET domain Score 5 (Excellent training): - Target domain clearly central to gameplay - Multiple opportunities to practice target skill - Appropriate difficulty with progressive challenge - Clear feedback on target domain performance Example (Memory target, Score 5): Game required player to:

  70. [70]

    Learn 4 participant names (encoding)

  71. [71]

    Do 3 other activities (retention)

  72. [72]

    Answer NPC question about names (retrieval) Result: Clear, structured memory training Score 3 (Moderate training): - Target domain present but not emphasized - Limited practice opportunities - Mixed with too many other activities Example (Memory target, Score 3): Game mentioned items to remember, but player could check list anytime - no actual memory test...

  73. [73]

    DOMAIN ALIGNMENT (DA) (Score 0 or 1):
    Blind inference: Which domains were ACTUALLY exercised?
    Method:
      Step 1: Analyze gameplay WITHOUT looking at target
      Step 2: List all domains player actually used (evidence-based)
      Step 3: Check if target domain is in this list
    DA = 1.0 if target found in inferred domains
    DA = 0.0 if target NOT found in inferred domains
    ...
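
The Domain Alignment rule quoted above is a membership test on the blindly inferred domain list. A minimal sketch follows, with hypothetical function names; the domain labels are those listed in the evaluator prompt, while the case-insensitive matching and the averaging across sessions are assumptions that the excerpt does not spell out.

```python
from typing import Iterable, Sequence, Tuple


def domain_alignment(target_domain: str, inferred_domains: Iterable[str]) -> float:
    """DA = 1.0 if the target is among the blindly inferred domains, else 0.0."""
    # Assumption: labels are compared case-insensitively, ignoring padding.
    normalized = {domain.strip().lower() for domain in inferred_domains}
    return 1.0 if target_domain.strip().lower() in normalized else 0.0


def mean_domain_alignment(sessions: Sequence[Tuple[str, Iterable[str]]]) -> float:
    """Average DA over (target, inferred_domains) pairs; 0.0 when empty."""
    scores = [domain_alignment(target, inferred) for target, inferred in sessions]
    return sum(scores) / len(scores) if scores else 0.0


sessions = [
    ("memory", ["attention", "memory"]),  # target was exercised
    ("memory", ["executive_function"]),   # target was not exercised
]
alignment_rate = mean_domain_alignment(sessions)
```

Keeping the inference step separate from the target, as the prompt's Step 1 requires, is what makes the score informative: the assessor cannot simply echo the intended domain.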

  74. [74]

    helpfulness

    EASINESS / COGNITIVE LOAD (Score 0-5): How easy was the task? (Higher = easier = lower cognitive load) Score 5 (Very easy): - Simple, familiar tasks - Minimal items to remember/manage - Clear instructions, no ambiguity - Little to no time pressure Score 3 (Moderate): - Moderate complexity - Several items to track (4-5) - Some multi-step processes - Manage...