arxiv: 2604.16354 · v1 · submitted 2026-03-17 · 💻 cs.HC · cs.SE

Hidden Technical Debt in Generative (GenUI) and Malleable User Interfaces

Pith reviewed 2026-05-15 09:55 UTC · model grok-4.3

classification 💻 cs.HC cs.SE
keywords malleable software · GenUI · generative user interfaces · technical debt · user customization · evaluation methods · human-computer interaction

The pith

Malleable user interfaces face hidden technical debt in data formats, security, and user skills.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper identifies key barriers preventing generative and malleable user interfaces from being adopted in practice by non-experts. These barriers consist of data formats that do not support easy adaptation, security protocols that are outdated, and users lacking the necessary cognitive and creative skills to design custom interfaces. Without addressing these, such systems remain limited in real-world use. The author therefore calls for new evaluation strategies and scientific methods focused on user studies and usage patterns to help bring malleable software into everyday practice.

Core claim

Malleable software can profoundly change how users interact with digital content by enabling non-experts to create customized tools. However, practical adoption of GenUI systems is hindered by a lack of adaptable data formats, old security protocols, and gaps in users' cognitive and creative skills. New evaluation strategies and scientific methods are advocated to measure impact through user studies, document usage patterns, and support practical adoption.

What carries the argument

Identification of hidden technical debt across data adaptability, security compatibility, and user skill requirements in generative user interfaces.

If this is right

  • Developers must create more flexible data formats to enable malleable interfaces.
  • Security protocols need updates to handle dynamic, user-generated interfaces safely.
  • Support systems are required to help users develop the skills for interface creation.
  • Evaluation of malleable software should prioritize real usage documentation over traditional metrics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Overcoming these barriers could allow more people to customize digital tools without coding expertise.
  • New evaluation methods might uncover additional hidden issues not covered in this review.
  • This points to the need for interdisciplinary work combining HCI with security and data standards research.

Load-bearing premise

The argument assumes that the listed barriers are the main obstacles to adoption and that new evaluation strategies will enable practical use of malleable software.

What would settle it

Demonstrating high adoption rates of GenUI systems despite the persistence of inflexible data formats, outdated security, and skill gaps would falsify the main claim.

Original abstract

Malleable software can profoundly change how users interact with digital content, enabling non-experts to create their own customized tools. However, the practical adoption of GenUI systems faces several barriers, which I unpack in this paper, including a lack of adaptable data formats, "old" security protocols, and gaps in users' cognitive and creative skills for building their own interfaces. I advocate new evaluation strategies and scientific methods to measure the impact of malleable software in user studies, document usage patterns, and ensure their practical adoption.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that practical adoption of Generative User Interfaces (GenUI) and malleable software is blocked by three specific barriers—a lack of adaptable data formats, legacy security protocols, and gaps in users' cognitive and creative skills—and that new evaluation strategies and scientific methods (user studies, usage pattern documentation) will enable their adoption.

Significance. If the enumerated barriers were shown to be primary and causal, and if the advocated evaluation methods were developed, the work could help prioritize research directions in HCI for end-user customization tools. As presented, however, the absence of any empirical grounding, literature synthesis, or comparative analysis leaves the significance speculative.

major comments (2)
  1. [Abstract / main argument] Abstract and main text: the central claim that the three listed factors constitute the primary barriers to GenUI/malleable-interface adoption is asserted without any supporting data, user studies, systematic literature review, or comparative argument against alternatives (e.g., performance, discoverability, or integration costs). No derivation, measurement, or falsifiable prediction is supplied to establish causality or dominance.
  2. [Abstract / closing paragraph] The advocacy for 'new evaluation strategies' and 'scientific methods' to measure impact is stated at a high level only, with no concrete protocols, metrics, study designs, or examples that would allow readers to assess feasibility or novelty.
minor comments (1)
  1. The manuscript would benefit from explicit section headings and a clearer distinction between asserted barriers and proposed remedies.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. The manuscript is a position paper that identifies hidden technical debt in GenUI and malleable interfaces rather than an empirical study; we address the concerns about grounding and specificity below and indicate planned revisions.

  1. Referee: [Abstract / main argument] Abstract and main text: the central claim that the three listed factors constitute the primary barriers to GenUI/malleable-interface adoption is asserted without any supporting data, user studies, systematic literature review, or comparative argument against alternatives (e.g., performance, discoverability, or integration costs). No derivation, measurement, or falsifiable prediction is supplied to establish causality or dominance.

    Authors: The manuscript is a conceptual position paper that synthesizes known challenges from the HCI and end-user programming literature rather than presenting new empirical measurements. The three barriers are drawn from documented issues in data formats for user-generated content, legacy security models in extensible systems, and cognitive load in customization tasks. We will revise the text to include a more explicit literature synthesis section with targeted citations and a short comparative paragraph addressing why these barriers are prioritized over alternatives such as raw performance or discoverability.

    Revision: partial

  2. Referee: [Abstract / closing paragraph] The advocacy for 'new evaluation strategies' and 'scientific methods' to measure impact is stated at a high level only, with no concrete protocols, metrics, study designs, or examples that would allow readers to assess feasibility or novelty.

    Authors: We agree that the closing discussion would benefit from greater concreteness. In the revised manuscript we will add specific examples of metrics (e.g., modification frequency per session, time-to-first-customization, and retention after initial use) and sketch a lightweight usage-pattern documentation protocol that could be implemented with existing instrumentation in GenUI prototypes.

    Revision: yes
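
The three metrics named in the second response can be made concrete with a small sketch. The code below is illustrative only and is not taken from the paper or the rebuttal: the `Event` record, its fields, the event kinds "open" and "modify", and the exact metric definitions are all assumptions about what instrumentation in a GenUI prototype might log.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    """One logged interaction (hypothetical schema, not from the paper)."""
    user: str         # pseudonymous user id
    session: int      # per-user session index, 0 = first use
    timestamp: float  # seconds since that user's first recorded event
    kind: str         # assumed event kinds: "open" or "modify"


def modification_frequency(events: List[Event]) -> float:
    """Mean number of 'modify' events per (user, session) pair."""
    sessions = {(e.user, e.session) for e in events}
    modifications = sum(1 for e in events if e.kind == "modify")
    return modifications / len(sessions) if sessions else 0.0


def time_to_first_customization(events: List[Event]) -> Dict[str, Optional[float]]:
    """Seconds from each user's first event to their first 'modify' event.

    Users who never modify anything map to None.
    """
    first_seen: Dict[str, float] = {}
    first_modify: Dict[str, float] = {}
    for e in events:
        first_seen[e.user] = min(first_seen.get(e.user, e.timestamp), e.timestamp)
        if e.kind == "modify":
            first_modify[e.user] = min(
                first_modify.get(e.user, e.timestamp), e.timestamp
            )
    return {
        user: (first_modify[user] - start if user in first_modify else None)
        for user, start in first_seen.items()
    }


def retention_rate(events: List[Event]) -> float:
    """Fraction of users who came back for at least one further session."""
    sessions_by_user = defaultdict(set)
    for e in events:
        sessions_by_user[e.user].add(e.session)
    if not sessions_by_user:
        return 0.0
    returned = sum(1 for s in sessions_by_user.values() if len(s) > 1)
    return returned / len(sessions_by_user)


# Made-up sample log: "ana" customizes and returns, "ben" only opens once.
EVENTS = [
    Event("ana", 0, 0.0, "open"),
    Event("ana", 0, 40.0, "modify"),
    Event("ana", 1, 86400.0, "open"),
    Event("ana", 1, 86460.0, "modify"),
    Event("ben", 0, 0.0, "open"),
]

print(modification_frequency(EVENTS))
print(time_to_first_customization(EVENTS))
print(retention_rate(EVENTS))
```

Other definitions are equally defensible, for example measuring retention over a fixed time window rather than by session count. The point is only that each metric reduces to a simple aggregation over an event log.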

Circularity Check

0 steps flagged

No circularity: conceptual position paper with no derivations or self-referential reductions


The manuscript is a short position paper that enumerates three asserted barriers to GenUI/malleable interface adoption and advocates new evaluation strategies. It contains no equations, no fitted parameters, no predictions, and no derivation chain. The central claims rest on enumeration rather than any reduction to prior results or self-citations. No load-bearing steps exist that could be circular by construction. This is the expected outcome for a purely advocacy text with no technical derivations.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No formal model, parameters, or new entities are introduced; the discussion rests on unstated domain assumptions about user capabilities and software interoperability.


