Pith · machine review for the scientific record

arXiv: 2604.10587 · v1 · submitted 2026-04-12 · 💻 cs.HC

CogInstrument: Modeling Cognitive Processes for Bidirectional Human-LLM Alignment in Planning Tasks

Pith reviewed 2026-05-10 15:52 UTC · model grok-4.3

classification 💻 cs.HC
keywords cognitive motifs · human-LLM alignment · planning tasks · bidirectional collaboration · graphical reasoning · causal dependencies · user agency · editable interfaces

The pith

CogInstrument turns implicit human reasoning into editable graphical motifs with causal links to improve alignment with LLMs during planning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that standard LLM chat interfaces hide the causal structure of human planning, so users cannot easily check or fix the logic behind outputs. CogInstrument extracts cognitive motifs from natural language, displays them as graphs of linked concepts, and lets users edit those graphs to negotiate changes with the model. A within-subjects study with twelve participants found that this approach supports more precise revisions and greater reuse than text-only dialogue. The system is presented as a way to give both sides a shared, inspectable model of the reasoning process. If the motifs accurately capture what matters, the result is higher user agency and trust in the collaboration.

Core claim

CogInstrument models user reasoning as cognitive motifs, which are compositional units of concepts joined by causal dependencies. These motifs are pulled from dialogue, shown as editable graphical structures, and used as the medium for iterative inspection and reconciliation between the human and the LLM. The paper states that this externalization converts opaque planning conversations into verifiable, revisable representations that both parties can negotiate directly.
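To fix intuition for the representation, here is a minimal sketch of a cognitive motif as a set of concept nodes joined by typed causal edges. The triple (C_μ, E_μ, φ_μ) and the three dependency types (enable, constraint, determine) come from the paper's Figure 7; the class name, field names, and the toy trip-planning example are editorial assumptions, not the paper's implementation.

```python
from dataclasses import dataclass

# The paper's Figure 7 names three dependency types among concepts.
DEPENDENCY_TYPES = {"enable", "constraint", "determine"}

@dataclass
class Motif:
    """Hypothetical encoding of a motif mu = (C_mu, E_mu, phi_mu)."""
    concepts: set[str]                 # C_mu: concept nodes in the motif
    edges: list[tuple[str, str, str]]  # E_mu: (source, dep_type, target)
    label: str = ""                    # phi_mu: abstract reasoning function

    def __post_init__(self):
        # Reject edges with unknown dependency types or dangling endpoints.
        for src, dep, dst in self.edges:
            if dep not in DEPENDENCY_TYPES:
                raise ValueError(f"unknown dependency type: {dep}")
            if src not in self.concepts or dst not in self.concepts:
                raise ValueError("edge endpoints must be concept nodes")

# A toy planning motif: budget constrains venue, venue determines date.
trip = Motif(
    concepts={"budget", "venue", "date"},
    edges=[("budget", "constraint", "venue"),
           ("venue", "determine", "date")],
    label="constrains",
)
print(len(trip.concepts), len(trip.edges))  # 3 2
```

The validation hook is one place where "revisable" becomes concrete: any edit that breaks the motif's internal consistency is rejected at construction time rather than silently passed to the model.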

What carries the argument

Cognitive motifs: revisable units of concepts connected by explicit causal dependencies that are extracted from natural language and rendered as editable graphs for bidirectional negotiation.

If this is right

  • Users can revise specific causal assumptions instead of restarting entire dialogues when an LLM output is misaligned.
  • Saved motifs become reusable templates that transfer across related planning problems.
  • The LLM receives explicit structural constraints rather than only surface-level text instructions.
  • Verification steps become possible at each causal link, raising the chance that flawed premises are caught before final plans are accepted.
  • The collaboration gains a persistent, inspectable record of the reasoning that both sides can reference later.
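The first and third bullets can be made concrete with a hedged sketch: a targeted revision replaces a single causal edge rather than rewording the whole dialogue, and the updated graph is re-serialized as explicit structural constraints for the model. The function names and the constraint format are illustrative assumptions, not the paper's API.

```python
# Hypothetical sketch: a motif graph as (source, dep_type, target) triples.

def revise_edge(edges, old, new):
    """Swap one causal edge instead of restarting the dialogue."""
    return [new if e == old else e for e in edges]

def serialize_constraints(edges):
    """Render edges as explicit constraint lines for a model prompt."""
    return "\n".join(f"- {s} [{dep}] {t}" for s, dep, t in edges)

edges = [("budget", "constraint", "venue"),
         ("venue", "determine", "date")]

# The user decides the date drives the venue, not the reverse: flip one edge.
edges = revise_edge(edges,
                    old=("venue", "determine", "date"),
                    new=("date", "determine", "venue"))

print(serialize_constraints(edges))
```

Feeding the serialized edges back to the LLM, rather than free text, is what gives the model "explicit structural constraints rather than only surface-level text instructions."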

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the motif representation proves stable across users, it could become a common intermediate layer for other planning tools that need to share causal structure with people.
  • The same graphical editing approach might extend to non-LLM systems where transparent reasoning chains are required, such as decision-support software.
  • Automated detection of motif conflicts could be added later to flag when user edits contradict earlier assumptions.
  • Longer-term use might reveal whether repeated motif editing leads to users developing more explicit mental models of their own planning habits.
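The conflict-flagging idea in the third bullet could start from something as simple as a cycle check over the causal graph: if a user edit makes two concepts each determine the other, the loop is worth surfacing. This is an editorial sketch under that assumption (the paper does not describe such a detector); it is a plain depth-first search over the directed edge set.

```python
# Hypothetical conflict check: flag directed cycles introduced by user
# edits to a motif graph, since mutual causal dependence usually signals
# contradictory assumptions.

def find_cycle(edges):
    """Return a list of concepts forming a directed cycle, or None."""
    graph = {}
    for src, _dep, dst in edges:
        graph.setdefault(src, []).append(dst)

    WHITE, GRAY, BLACK = 0, 1, 2
    color, stack = {}, []

    def dfs(node):
        color[node] = GRAY
        stack.append(node)
        for nxt in graph.get(node, []):
            if color.get(nxt, WHITE) == GRAY:   # back edge: cycle found
                return stack[stack.index(nxt):] + [nxt]
            if color.get(nxt, WHITE) == WHITE:
                found = dfs(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color.get(node, WHITE) == WHITE:
            found = dfs(node)
            if found:
                return found
    return None

# An edit that makes budget and venue determine each other gets flagged.
edges = [("budget", "determine", "venue"),
         ("venue", "determine", "budget")]
print(find_cycle(edges))  # ['budget', 'venue', 'budget']
```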

Load-bearing premise

Human planning reasoning can be decomposed into discrete, revisable cognitive motifs whose causal links are reliably extractable from natural language and usefully edited in graphical form.

What would settle it

A replication study in which participants using the motif graphs show no measurable gain in detecting or correcting logical errors in LLM plans compared with a matched text-only interface would undermine the claim that the graphical externalization improves alignment.

Figures

Figures reproduced from arXiv: 2604.10587 by Anqi Wang, Dongyijie Pan, Pan Hui, Xin Tong.

Figure 1: From intent-centric prompting to cognition-centric interaction. We introduce a pipeline that extracts and structures …
Figure 2: Representative clarification probes for different …
Figure 3: A motif is a reusable reasoning pattern with con…
Figure 4: CogInstrument interface. Panels (A–E) provide synchronized views of the underlying reasoning state, ranging from high-level dialogue planning (A) and structural reasoning mapping (B–D) to direct intervention and patch management (E).
Figure 5: Three interaction modes: user-driven revision, …
Figure 6: Paired participant trajectories and condition means.
Figure 7: CogInstrument consists of three dependency types among concepts: enable, constraint, and determine. A motif is a reusable cognitive dependency pattern including at least two concepts. Appendix A.4 gives the formal definition: a cognitive motif is μ = (C_μ, E_μ, φ_μ), where C_μ ⊆ C are the concept nodes in the motif, E_μ the causal edges within it, and φ_μ an abstract reasoning function (e.g., "constrai…").
Figure 8: System framework of CogInstrument.
Figure 9: Full system-log timelines for Participants P1–P4. These traces mainly illustrate lighter structural uptake and several …
Figure 10: Full system-log timelines for Participants P5–P8. Compared with Figure …
Figure 11: Full system-log timelines for Participants P9–P12. These traces highlight the broadest range of strategies in the …
Original abstract

Although Large Language Models (LLMs) demonstrate proficiency in knowledge-intensive tasks, current interfaces frequently precipitate cognitive misalignment by failing to externalize users' underlying reasoning structures. Existing tools typically represent intent as "flat lists," thereby disregarding the causal dependencies and revisable assumptions inherent in human decision-making. We introduce CogInstrument, a system that represents user reasoning through cognitive motifs: compositional, revisable units comprising concepts linked by causal dependencies. CogInstrument extracts these motifs from natural language interactions and renders them as editable graphical structures to facilitate bidirectional alignment. This structural externalization enables both the user and the LLM to inspect, negotiate, and reconcile reasoning processes iteratively. A within-subjects study (N=12) demonstrates that CogInstrument explicitly surfaces implicit reasoning structures, facilitating more targeted revision and reusability over conventional LLM-based dialogue interfaces. By enabling users to verify the logical grounding of LLM outputs, CogInstrument significantly enhances user agency, trust, and structural control over the collaboration. This work formalizes cognitive motifs as a fundamental unit for human-LLM alignment, providing a novel framework for achieving structured, reasoning-based human-AI collaboration.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 3 minor

Summary. The paper introduces CogInstrument, a system that extracts 'cognitive motifs'—compositional units of concepts linked by causal dependencies—from natural language interactions in planning tasks and renders them as editable graphical structures. This externalization is intended to enable bidirectional alignment by allowing users and LLMs to inspect, revise, and reconcile reasoning processes. A within-subjects study with N=12 participants is presented as demonstrating that the system supports more targeted revision and reusability than conventional flat LLM dialogue interfaces, thereby increasing user agency, trust, and structural control.

Significance. If the empirical results are robust, the work could contribute a structured alternative to current LLM interfaces by formalizing cognitive motifs as a unit for human-AI alignment. The graphical rendering approach has potential to improve inspectability of reasoning chains in collaborative planning. The conceptual framing is novel within HCI, though its impact hinges on stronger validation of the motif extraction and measurable benefits.

major comments (2)
  1. [§5 (User Study)] The within-subjects evaluation with N=12 reports qualitative benefits in revision, reusability, agency, and trust but provides no task descriptions, dependent variables, quantitative metrics, statistical tests, effect sizes, or controls for order effects and interface novelty. This directly undermines the abstract's claim that CogInstrument 'significantly enhances' these outcomes, as differences could stem from confounds rather than the motif structure itself.
  2. [§3.2 (Cognitive Motif Extraction)] The procedure for identifying causal dependencies and revisable assumptions from natural language is described at a high level without validation against human annotations, inter-rater reliability, or ablation showing that the graphical representation (vs. text alone) drives the reported gains. This is load-bearing for the central claim that motifs accurately decompose reasoning into editable units.
minor comments (3)
  1. [Abstract] The abstract and introduction could include one or two concrete examples of a cognitive motif (e.g., a planning scenario with extracted concepts and dependencies) to clarify the representation before the system description.
  2. [Figure 2] Figure captions and the system architecture diagram would benefit from explicit labels indicating which components handle extraction versus rendering versus LLM negotiation.
  3. [§2] Related work should reference prior HCI systems on externalizing reasoning (e.g., argument mapping or causal diagramming tools) to better position the novelty of cognitive motifs.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. The feedback identifies key areas where additional detail and precision are needed to support our claims. We respond to each major comment below, indicating the revisions we will incorporate in the next version of the manuscript.

Point-by-point responses
  1. Referee: [§5 (User Study)] The within-subjects evaluation with N=12 reports qualitative benefits in revision, reusability, agency, and trust but provides no task descriptions, dependent variables, quantitative metrics, statistical tests, effect sizes, or controls for order effects and interface novelty. This directly undermines the abstract's claim that CogInstrument 'significantly enhances' these outcomes, as differences could stem from confounds rather than the motif structure itself.

    Authors: We agree that §5 requires substantial expansion to provide a clearer account of the evaluation. The study was exploratory and relied primarily on qualitative data from semi-structured interviews and interaction logs to surface themes around targeted revision and perceived agency. In the revision we will: (1) add explicit task descriptions for the planning scenarios used, (2) define the dependent variables (e.g., revision count, reuse of motifs, Likert-scale measures of agency and trust), (3) report any quantitative observations collected (e.g., edit frequency, session duration), and (4) include a dedicated limitations subsection addressing order effects, interface novelty, and the absence of statistical testing given the small sample. We will also revise the abstract to replace the phrase 'significantly enhances' with 'supports greater' or 'facilitates improved' to align with the exploratory, qualitative nature of the evidence and to avoid implying statistical significance. revision: partial

  2. Referee: [§3.2 (Cognitive Motif Extraction)] The procedure for identifying causal dependencies and revisable assumptions from natural language is described at a high level without validation against human annotations, inter-rater reliability, or ablation showing that the graphical representation (vs. text alone) drives the reported gains. This is load-bearing for the central claim that motifs accurately decompose reasoning into editable units.

    Authors: We acknowledge that §3.2 currently presents the extraction process at a high level. We will expand the section with the full prompt template, decision rules for detecting causal links and revisable assumptions, and multiple concrete input-output examples from the study sessions. While the extraction is LLM-driven rather than manually annotated, we will add a small-scale validation subsection comparing a sample of automatically extracted motifs against independent human annotations (with agreement metrics). The main study already contrasts the full graphical motif interface against a standard text-only LLM dialogue baseline; we will strengthen the discussion to clarify how this comparison isolates the contribution of the structured, editable motif representation versus flat text. An explicit ablation of graphical versus textual motif rendering is beyond the current scope but will be noted as future work. revision: yes

Circularity Check

0 steps flagged

No significant circularity; claims rest on empirical user study

full rationale

The paper introduces CogInstrument as a novel interface for externalizing cognitive motifs and evaluates its benefits via a within-subjects user study (N=12). No mathematical derivations, equations, fitted parameters, or load-bearing self-citations appear in the abstract or described content. The central claims about improved revision, reusability, agency, trust, and structural control are presented as outcomes of the empirical demonstration rather than any self-referential definitions, constructed predictions, or reductions to prior inputs by construction. The formalization of cognitive motifs is introduced as a new framework without tautological loops or ansatz smuggling.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The approach rests on domain assumptions about human cognition rather than new mathematical constructs; no free parameters or invented physical entities are introduced.

axioms (1)
  • domain assumption Human decision-making in planning tasks involves causal dependencies and revisable assumptions that can be decomposed into compositional cognitive motifs.
    Invoked as the basis for extracting and rendering reasoning structures from natural language interactions.
invented entities (1)
  • cognitive motifs (no independent evidence)
    purpose: To serve as the fundamental unit for representing and externalizing user reasoning in a form editable by both humans and LLMs.
    Newly defined compositional units without independent empirical validation outside this work.

pith-pipeline@v0.9.0 · 5501 in / 1268 out tokens · 38935 ms · 2026-05-10T15:52:45.643022+00:00 · methodology


Reference graph

Works this paper leans on

80 extracted references · 55 canonical work pages

  1. [1]

    M., Basappa, R.,Bergsmann,S.,Bouneffouf,D.,Callaghan,P.,Cavazza, M., Chaminade, T.,

    Saleema Amershi, Maya Cakmak, W. Bradley Knox, and Todd Kulesza. 2014. Power to the People: The Role of Humans in In- teractive Machine Learning.AI Magazine35, 4 (2014), 105–120. arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1609/aimag.v35i4.2513 doi:10.1609/aimag.v35i4.2513

  2. [2]

    Chris Argyris. 2002. Teaching Smart People How to Learn.Reflections: The SoL Journal4, 2 (Dec. 2002), 4–15. doi:10.1162/152417302762251291

  3. [3]

    Michel Beaudouin-Lafon. 2000. Instrumental interaction: an interaction model for designing post-WIMP user interfaces. InProceedings of the SIGCHI Conference on Human Factors in Computing Systems(The Hague, The Netherlands)(CHI ’00). Association for Computing Machinery, New York, NY, USA, 446–453. doi:10.114 5/332040.332473

  4. [4]

    Bradshaw, Paul J

    Jeffrey M. Bradshaw, Paul J. Feltovich, Hyuckchul Jung, Shriniwas Kulkarni, William Taysom, and Andrzej Uszok. 2003. Dimensions of Adjustable Auton- omy and Mixed-Initiative Interaction. InInternational Workshop on Conceptual Autonomy. Springer, 17–39

  5. [5]

    Ulrik Brandes and Boris Köpf. 2002. Fast and Simple Horizontal Coordinate Assignment.Graph Drawing(2002), 31–44

  6. [6]

    Virginia Braun and Victoria Clarke. 2019. Reflecting on reflexive thematic analysis. Qualitative research in sport, exercise and health11, 4 (2019), 589–597

  7. [7]

    Ruth M. J. Byrne. 2005.The Rational Imagination: How People Create Alternatives to Reality. MIT Press, Cambridge, MA

  8. [9]

    DaEun Choi, Kihoon Son, Jaesang Yu, HyunJoon Jung, and Juho Kim. 2025. IdeaBlocks: Expressing and Reusing Exploratory Intents for Design Exploration with Generative AI. InAdjunct Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology. ACM, Busan Republic of Korea, 1–4. doi:10.1145/3746058.3759001

  9. [10]

    H., & Brennan, S

    Herbert H. Clark and Susan E. Brennan. 1991. Grounding in Communication. In Perspectives on Socially Shared Cognition. American Psychological Association, 127–149. doi:10.1037/10096-006

  10. [11]

    Adam J Coscia, Shunan Guo, Eunyee Koh, and Alex Endert. 2025. OnGoal: Tracking and Visualizing Conversational Goals in Multi-Turn Dialogue with Large Language Models. InProceedings of the 38th Annual ACM Symposium on User Interface Software and Technology. ACM, Busan Republic of Korea, 1–18. doi:10.1145/3746059.3747746

  11. [12]

    Gray, Erik Heishman, Fayin Li, Azriel Rosenfeld, Michael J

    Zoran Duric, Wayne D. Gray, Erik Heishman, Fayin Li, Azriel Rosenfeld, Michael J. Schoelles, Christian Schunn, and Harry Wechsler. 2002. Integrating Perceptual and Cognitive Modeling for Adaptive and Intelligent Human-Computer Interac- tion.Proc. IEEE90, 7 (jul 2002), 1272–1289. doi:10.1109/JPROC.2002.801449

  12. [13]

    Karin Ericsson and Herbert A. Simon. 1980. Verbal reports as data.Psychological Review87 (1980), 215–251. https://api.semanticscholar.org/CorpusID:144763091

  13. [14]

    Li Feng, Ryan Yen, Yuzhe You, Mingming Fan, Jian Zhao, and Zhicong Lu. 2024. CoPrompt: Supporting Prompt Sharing and Referring in Collaborative Natural Language Programming. InProceedings of the CHI Conference on Human Factors in Computing Systems. ACM, Honolulu HI USA, 1–21. doi:10.1145/3613904.3642212

  14. [15]

    1988.Knowledge in Flux: Modeling the Dynamics of Epistemic States

    Peter Gärdenfors. 1988.Knowledge in Flux: Modeling the Dynamics of Epistemic States. MIT Press, Cambridge, MA

  15. [16]

    Dedre Gentner. 1983. Structure-mapping: A theoretical framework for analogy. Cognitive Science7, 2 (1983), 155–170

  16. [17]

    Frederic Gmeiner, Nicolai Marquardt, Michael Bentley, Hugo Romat, Michel Pahud, David Brown, Asta Roseway, Nikolas Martelaro, Kenneth Holstein, Ken Hinckley, and Nathalie Riche. 2025. Intent Tagging: Exploring Micro-Prompting Interactions for Supporting Granular Human-GenAI Co-Creation Workflows. In Proceedings of the 2025 CHI Conference on Human Factors ...

  17. [18]

    Goodman, Joshua B

    Noah D. Goodman, Joshua B. Tenenbaum, and Tobias Gerstenberg. 2015.Concepts in a Probabilistic Language of Thought. The MIT Press, 623–654. http://www.js tor.org/stable/j.ctt17kk9nr.27

  18. [19]

    Sobel, Laura E

    Alison Gopnik, Clark Glymour, David M. Sobel, Laura E. Schulz, Tamar Kushnir, and David Danks. 2004. A Theory of Causal Learning in Children: Causal Maps and Bayes Nets.Psychological Review111, 1 (2004), 3–32. doi:10.1037/0033- 295X.111.1.3

  19. [20]

    Griffiths, Nick Chater, Charles Kemp, Amy Perfors, and Joshua B

    Thomas L. Griffiths, Nick Chater, Charles Kemp, Amy Perfors, and Joshua B. Tenenbaum. 2010. Probabilistic models of cognition: Exploring representations and inductive biases.Trends in Cognitive Sciences14, 8 (2010), 357–364

  20. [21]

    Alicia Guo, Shreya Sathyanarayanan, Leijie Wang, Jeffrey Heer, and Amy X. Zhang. 2025. From Pen to Prompt: How Creative Writers Integrate AI into their Writing Practice. InProceedings of the 2025 Conference on Creativity and Cognition. ACM, Virtual United Kingdom, 527–545. doi:10.1145/3698061.3726910

  21. [22]

    and Nilsson, Nils J

    Peter E. Hart, Nils J. Nilsson, and Bertram Raphael. 1968. A Formal Basis for the Heuristic Determination of Minimum Cost Paths.IEEE Transactions on Systems Science and Cybernetics4, 2 (1968), 100–107. doi:10.1109/TSSC.1968.300136

  22. [23]

    Jeffrey Heer. 2019. Agency plus Automation: Designing Artificial Intelligence into Interactive Systems.Proceedings of the National Academy of Sciences (PNAS) 116, 6 (2019), 1844–1850. doi:10.1073/pnas.1807184115

  23. [24]

    Eric Horvitz. 1999. Principles of Mixed-Initiative User Interfaces. InProceedings of the SIGCHI Conference on Human Factors in Computing Systems. 159–166. doi:10.1145/302979.303030

  24. [25]

    Ziheng Huang, Kexin Quan, Joel Chan, and Stephen MacNeil. 2023. CausalMap- per: Challenging designers to think in systems with Causal Maps and Large Language Model. InCreativity and Cognition. ACM, Virtual Event USA, 325–329. doi:10.1145/3591196.3596818 TLDR: CausalMapper is presented, a mixed- initiative system, that leverages a large language model (LLM...

  25. [26]

    1995.Cognition in the Wild

    Edwin Hutchins. 1995.Cognition in the Wild. The MIT Press. doi:10.7551/mitpre ss/1881.001.0001

  26. [27]

    Hutchins, James D

    Edwin L. Hutchins, James D. Hollan, and Donald A. Norman. 1985. Direct manipulation interfaces.Hum.-Comput. Interact.1, 4 (Dec. 1985), 311–338. doi:10 .1207/s15327051hci0104_2

  27. [28]

    Dirk Ifenthaler. 2011. Identifying cross-domain distinguishing features of cog- nitive structure.Educational Technology Research and Development59, 6 (Dec. Conference acronym ’XX, June 03–05, 2018, Woodstock, NY Anqi Wang, Dongyijie Pan, Xin Tong, and Pan Hui 2011), 817–840. doi:10.1007/s11423-011-9207-4

  28. [29]

    Schulz, and Joshua B

    Julian Jara-Ettinger, Laura E. Schulz, and Joshua B. Tenenbaum. 2020. The Naïve Utility Calculus as a unified, quantitative framework for action understanding. Cognitive Psychology123 (2020), 101334. doi:10.1016/j.cogpsych.2020.101334

  29. [30]

    Dae Hyun Kim, Daeheon Jeong, Shakhnozakhon Yadgarova, Hyungyu Shin, Jinho Son, Hariharan Subramonyam, and Juho Kim. 2025. PlanTogether: Facilitating AI Application Planning Using Information Graphs and Large Language Models. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. ACM, Yokohama Japan, 1–23. doi:10.1145/3706598.3714044

  30. [31]

    Tae Soo Kim, Yoonjoo Lee, Minsuk Chang, and Juho Kim. 2023. Cells, Gen- erators, and Lenses: Design Framework for Object-Oriented Interaction with Large Language Models. InProceedings of the 36th Annual ACM Symposium on User Interface Software and Technology. ACM, San Francisco CA USA, 1–18. doi:10.1145/3586183.3606833

  31. [32]

    Yoonsu Kim, Brandon Chin, Kihoon Son, Seoyoung Kim, and Juho Kim. 2025. IntentFlow: Interactive Support for Communicating Intent with LLMs in Writing Tasks. doi:10.48550/arXiv.2507.22134

  32. [33]

    Lake, Ruslan Salakhutdinov, and Joshua B

    Brenden M. Lake, Ruslan Salakhutdinov, and Joshua B. Tenenbaum. 2015. Human- level concept learning through probabilistic program induction.Science350, 6266 (2015), 1332–1338

  33. [34]

    Lake, Tomer D

    Brenden M. Lake, Tomer D. Ullman, Joshua B. Tenenbaum, and Samuel J. Gersh- man. 2017. Building machines that learn and think like people.Behavioral and Brain Sciences40 (2017), e253. doi:10.1017/S0140525X16001837

  34. [35]

    Chance Jiajie Li, Jiayi Wu, Zhenze Mo, Ao Qu, Yuhan Tang, Kaiya Ivy Zhao, Yulu Gan, Jie Fan, Jiangbo Yu, Jinhua Zhao, Paul Liang, Luis Alonso, and Kent Larson

  35. [36]

    Simulating society requires simulating thought,

    Simulating Society Requires Simulating Thought. arXiv:2506.06958 [cs] doi:10.48550/arXiv.2506.06958

  36. [37]

    Shuai Ma, Qiaoyi Chen, Xinru Wang, Chengbo Zheng, Zhenhui Peng, Ming Yin, and Xiaojuan Ma. 2025. Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making. InProceedings of the 2025 CHI Conference on Human Factors in Computing Systems. Article 261, 261:1–261:23 pages. doi:10.1145/3706598.3713423

  37. [38]

    Sara McNeil. 2015. Visualizing mental models: understanding cognitive change to support teaching and learning of multimedia design and development.Educa- tional Technology Research and Development63, 1 (Feb. 2015), 73–96. doi:10.1007/ s11423-014-9354-5

  38. [39]

    Yu Mei, Yuanxi Wang, Shiyi Wang, Qingyang Wan, Zhuojun Li, Chun Yu, Weinan Shi, and Yuanchun Shi. 2025. InterQuest: A Mixed-Initiative Framework for Dynamic User Interest Modeling in Conversational Search. InProceedings of the 38th Annual ACM Symposium on User Interface Software and Technology. ACM, Busan Republic of Korea, 1–23. doi:10.1145/3746059.3747753

  39. [40]

    Donald A. Norman. 1986. Cognitive Engineering. InUser Centered System Design (0 ed.). CRC Press, Boca Raton, 31–62. doi:10.1201/b15703-3

  40. [41]

    Donald A. Norman. 2013.The design of everyday things(rev. and expanded edition ed.). MIT press, Cambridge (Mass.)

  41. [42]

    Judith Reitman Olson and Gary M. Olson. 1995. The Growth of Cognitive Modeling in Human-Computer Interaction Since GOMS. InReadings in Human– Computer Interaction: Toward the Year 2000, Ronald M. Baecker, Jonathan Grudin, William A. S. Buxton, and Saul Greenberg (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 603–625. doi:10.1016/B978-0-0...

  42. [43]

    2009.Causality: Models, Reasoning, and Inference(2 ed.)

    Judea Pearl. 2009.Causality: Models, Reasoning, and Inference(2 ed.). Cambridge University Press, Cambridge, UK

  43. [44]

    2002.People and Technology: A Cognitive Approach to Contempo- rary Instruments

    Pierre Rabardel. 2002.People and Technology: A Cognitive Approach to Contempo- rary Instruments. Université Paris 8, Paris. Translated by Heidi Wood

  44. [45]

    Nathalie Riche, Anna Offenwanger, Frederic Gmeiner, David Brown, Hugo Romat, Michel Pahud, Nicolai Marquardt, Kori Inkpen, and Ken Hinckley. 2025. AI- Instruments: Embodying Prompts as Instruments to Abstract & Reflect Graphical Interface Commands as General-Purpose Tools. InProceedings of the 2025 CHI Conference on Human Factors in Computing Systems. ACM...

  45. [46]

    Rumelhart

    David E. Rumelhart. 1980. Schemata: The building blocks of cognition. In Theoretical Issues in Reading Comprehension, Rand J. Spiro, Bertram C. Bruce, and William F. Brewer (Eds.). Lawrence Erlbaum Associates, Hillsdale, NJ, 33–58

  46. [47]

    Gaver, Jacob Beaver, and Steve Benford

    Dario D. Salvucci and Frank J. Lee. 2003. Simple cognitive modeling in a complex cognitive architecture. InProceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, Ft. Lauderdale Florida USA, 265–272. doi:10.1145/ 642611.642658

  47. [48]

    Omar Shaikh, Hussein Mozannar, Gagan Bansal, Adam Fourney, and Eric Horvitz

  48. [49]

    Shaikh, H

    Navigating Rifts in Human-LLM Grounding: Study and Benchmark.arXiv preprint arXiv:2503.13975(2025). arXiv:2503.13975 [cs.CL] https://arxiv.org/abs/ 2503.13975

  49. [50]

    Bernstein

    Omar Shaikh, Shardul Sapkota, Shan Rizvi, Eric Horvitz, Joon Sung Park, Diyi Yang, and Michael S. Bernstein. 2025. Creating General User Models from Computer Use. InProceedings of the 38th Annual ACM Symposium on User Interface Software and Technology. ACM, Busan Republic of Korea, 1–23. doi:10.1145/3746059.3747722

  50. [51]

    Xinyu Shi, Yinghou Wang, Ryan Rossi, and Jian Zhao. 2025. Brickify: Enabling Expressive Design Intent Specification through Direct Manipulation on Design Tokens. InProceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI ’25). Association for Computing Machinery, New York, NY, USA, Article 424, 20 pages. doi:10.1145/3706598.3714087

  51. [52]

    Kihoon Son, DaEun Choi, Tae Soo Kim, Young-Ho Kim, Sangdoo Yun, and Juho Kim. 2025. ClearFairy: Capturing Creative Workflows through Decision Structur- ing, In-Situ Questioning, and Rationale Inference. doi:10.48550/arXiv.2509.14537 arXiv:2509.14537 [cs]

  52. [53]

    Hari Subramonyam, Roy Pea, Christopher Pondoc, Maneesh Agrawala, and Colleen Seifert. 2024. Bridging the Gulf of Envisioning: Cognitive Challenges in Prompt Based Interactions with LLMs. InProceedings of the CHI Conference on Human Factors in Computing Systems. ACM, Honolulu HI USA, 1–19. doi:10.114 5/3613904.3642754

  53. [54]

    Hari Subramonyam, Divy Thakkar, Andrew Ku, Juergen Dieber, and Anoop K. Sinha. 2025. Prototyping with Prompts: Emerging Approaches and Challenges in Generative AI Design for Collaborative Software Teams. InProceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI ’25). Association for Computing Machinery, New York, NY, USA, Article...

  54. [55]

    Kozo Sugiyama, Shojiro Tagawa, and Mitsuhiko Toda. 1981. Methods for Visual Understanding of Hierarchical System Structures.IEEE Transactions on Systems, Man, and Cybernetics11, 2 (1981), 109–125

  55. [56]

    Sangho Suh, Meng Chen, Bryan Min, Toby Jia-Jun Li, and Haijun Xia. 2024. Luminate: Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation. InProceedings of the CHI Conference on Human Factors in Computing Systems. ACM, Honolulu HI USA, 1–26. doi:10.1 145/3613904.3642400

  56. [57]

    Sangho Suh, Bryan Min, Srishti Palani, and Haijun Xia. 2023. Sensecape: En- abling Multilevel Exploration and Sensemaking with Large Language Models. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology. ACM, San Francisco CA USA, 1–18. doi:10.1145/3586183.3606756

  57. [58]

    John Sweller. 1988. Cognitive load during problem solving: Effects on learning. Cognitive Science12, 2 (1988), 257–285

  58. [59]

    Lev Tankelevitch, Viktor Kewenig, Auste Simkute, Ava Elizabeth Scott, Advait Sarkar, Abigail Sellen, and Sean Rintel. 2024. The Metacognitive Demands and Opportunities of Generative AI. InProceedings of the CHI Conference on Human Factors in Computing Systems. Article 680, 680:1–680:24 pages. doi:10.1145/3613 904.3642902

  59. [60]

    Robert E. Tarjan. 1972. Depth-First Search and Linear Graph Algorithms.SIAM J. Comput.1, 2 (1972), 146–160. doi:10.1137/0201010

  60. [61]

    Tauber and David Ackermann

    Michael J. Tauber and David Ackermann. 1991.Mental models and human- computer interaction 2. Number 7 in Human factors in information technology. North-Holland Distributors for the U.S.A. and Canada, Elsevier Science Pub. Co, Amsterdam New York New York, N.Y., U.S.A

  61. [62]

    Joshua B. Tenenbaum, Charles Kemp, Thomas L. Griffiths, and Noah D. Goodman. 2011. How to Grow a Mind: Statistics, Structure, and Abstraction. Science 331, 6022 (2011), 1279–1285. doi:10.1126/science.1192788

  65. [66]

    Priyan Vaithilingam, Munyeong Kim, Frida-Cecilia Acosta-Parenteau, Daniel Lee, Amine Mhedhbi, Elena L. Glassman, and Ian Arawjo. 2025. Semantic Commit: Helping Users Update Intent Specifications for AI Memory at Scale. In Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology. Article 137, 137:1–137:18 pages. doi:10.1145/374...

  66. [67]

    Priyan Vaithilingam, Munyeong Kim, Frida-Cecilia Acosta-Parenteau, Daniel Lee, Amine Mhedhbi, Elena L. Glassman, and Ian Arawjo. 2025. Semantic Commit: Helping Users Update Intent Specifications for AI Memory at Scale. doi:10.48550/arXiv.2504.09283 arXiv:2504.09283 [cs]

  67. [68]

    Anqi Wang, Zhengyi Li, Xin Tong, and Pan Hui. 2026. DesignerlyLoop: Forming Design Intent through Curated Reasoning for Human-LLM Alignment. arXiv:2511.15331 [cs.HC] https://arxiv.org/abs/2511.15331

  68. [69]

    Xingyi Wang, Xiaozheng Wang, Sunyup Park, and Yaxing Yao. 2025. Mental Models of Generative AI Chatbot Ecosystems. In Proceedings of the 30th International Conference on Intelligent User Interfaces. ACM, Cagliari Italy, 1016–1031. doi:10.1145/3708359.3712125

  69. [70]

    Justin D. Weisz, Jessica He, Michael Muller, Gabriela Hoefer, Rachel Miles, and Werner Geyer. 2024. Design Principles for Generative AI Applications. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’24). Association for Computing Machinery, New York, NY, USA, Article 378, 22 pages. doi:10.1145/36139...

  70. [71]

    Thomas Willemain. 2019. Visualization and The Process of Modeling: A Cognitive-theoretic View. doi:10.1287/9beec4ec-43ac-4cf9-912b-eb018324857f

  71. [72]

    Lionel Wong, Gabriel Grand, Alexander K. Lew, Noah D. Goodman, Vikash K. Mansinghka, Jacob Andreas, and Joshua B. Tenenbaum. 2023. From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought. arXiv:2306.12672 [cs.CL] https://arxiv.org/abs/2306.12672

  72. [73]

    Tongshuang Wu, Michael Terry, and Carrie Jun Cai. 2022. AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts. In CHI Conference on Human Factors in Computing Systems. ACM, New Orleans LA USA, 1–22. doi:10.1145/3491102.3517582

  73. [74]

    Qian Yang, Jina Suh, Nan-Chen Chen, and Gonzalo Ramos. 2018. Grounding interactive machine learning tool design in how non-experts actually build models. In Proceedings of the 2018 Designing Interactive Systems Conference. 573–584

  74. [75]

    Ryan Yen and Jian Zhao. 2024. Memolet: Reifying the Reuse of User-AI Conversational Memories. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology. Article 58, 58:1–58:22 pages. doi:10.1145/3654777.3676388

  75. [76]

    Yiwen Yin, Yu Mei, Chun Yu, Toby Jia-Jun Li, Aamir Khan Jadoon, Sixiang Cheng, Weinan Shi, Mohan Chen, and Yuanchun Shi. 2025. From Operation to Cognition: Automatic Modeling Cognitive Dependencies from User Demonstrations for GUI Task Automation. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. ACM, Yokohama Japan, 1–24. do...

  76. [77]

    Matej Zečević, Moritz Willig, Devendra Singh Dhami, and Kristian Kersting. 2023. Causal Parrots: Large Language Models May Talk Causality But Are Not Causal. arXiv:2308.13067 [cs.AI] https://arxiv.org/abs/2308.13067

  77. [78]

    Chao Zhang, Kexin Ju, Zhuolun Han, Yu-Chun Grace Yen, and Jeffrey M. Rzeszotarski. 2025. Synthia: Visually Interpreting and Synthesizing Feedback for Writing Revision. In Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology. ACM, Busan Republic of Korea, 1–16. doi:10.1145/3746059.3747703

  78. [79]

    Wenshuo Zhang, Leixian Shen, Shuchang Xu, Jindu Wang, Jian Zhao, Huamin Qu, and Linping Yuan. 2025. NeuroSync: Intent-Aware Code-Based Problem Solving via Direct LLM Understanding Modification. doi:10.1145/3746059.3747668

  79. [80]

    Zhongyi Zhou, Jing Jin, Vrushank Phadnis, Xiuxiu Yuan, Jun Jiang, Xun Qian, Kristen Wright, Mark Sherwood, Jason Mayes, Jingtao Zhou, Yiyi Huang, Zheng Xu, Yinda Zhang, Johnny Lee, Alex Olwal, David Kim, Ram Iyengar, Na Li, and Ruofei Du. 2025. InstructPipe: Generating Visual Blocks Pipelines with Human Instructions and LLMs. In Proceedings of the 2025 CHI...

  80. [81]

    John Zimmerman, Jodi Forlizzi, and Shelley Evenson. 2007. Research Through Design as a Method for Interaction Design Research in HCI. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’07). Association for Computing Machinery, New York, NY, USA, 493–502. doi:10.1145/1240624.1240704

A Framework: Representative Model

A.1 Typ...