pith. machine review for the scientific record.

arxiv: 2604.13621 · v1 · submitted 2026-04-15 · 💻 cs.HC

Recognition: unknown

Nanomentoring: Investigating How Quickly People Can Help People Learn Feature-Rich Software

Ian Drosos, Jo Vermeulen, George Fitzmaurice, Justin Matejka

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 12:50 UTC · model grok-4.3

classification 💻 cs.HC
keywords online forums · expert mentoring · software learning · quick responses · nanoquestions · human-computer interaction · help systems

The pith

Experts can give helpful advice on more than half of short software questions in under 60 seconds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper explores whether experts on online forums can deliver rapid assistance to users struggling with complex software. It identifies a subset of brief questions that seem answerable in under a minute and tests this with actual experts. The study shows that for over half of these short questions, experts produced advice they judged useful within 60 seconds, in either text or audio. This finding matters because forum users often face long delays for help, so confirming fast expert responses could shorten learning times and reduce frustration. The authors also collected details on what makes a question quick to answer to support future rapid-help tools.

Core claim

The authors collected over 200 forum questions from two feature-rich applications and judged roughly a quarter short enough for sub-minute answers. A study with 28 recruited experts confirmed that for more than half of these nanoquestions, participants could supply advice they believed was helpful in under 60 seconds. Experts showed no clear preference between text and audio formats for these quick replies, and the work surfaces characteristics that make questions fast to handle.

What carries the argument

Nanoquestions: short forum questions that experts suspect can be answered helpfully in less than one minute. They serve as the unit of analysis for measuring the feasibility of ultra-rapid expert assistance.

If this is right

  • Help platforms could detect short questions automatically and route them to available experts for immediate responses.
  • Audio reply options become viable for quick exchanges since experts found them workable.
  • Forum interfaces might separate or prioritize nanoquestions to reduce overall response times.
  • Designers can use the reported traits of quick-to-answer questions to build better matching or suggestion tools.
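To make the first implication concrete, here is a minimal sketch of how a help platform might pre-screen posts for nanoquestion routing. The word-count cutoff, keyword list, and single-question-mark check are illustrative assumptions, not the authors' published criteria; the paper reports only that the screening looked for short, focused questions.

```python
import re

# Hypothetical nanoquestion screener (assumed thresholds, not the paper's).
MAX_WORDS = 40  # assumed cutoff for a "short" post
FEATURE_PATTERN = re.compile(
    r"\b(shortcut|menu|command|button|setting|toolbar)\b", re.IGNORECASE
)

def looks_like_nanoquestion(post: str) -> bool:
    """Return True if a post is short, asks one thing, and names a concrete feature."""
    if len(post.split()) > MAX_WORDS:
        return False
    # More than one question mark suggests a multi-part post, not a quick ask.
    if post.count("?") > 1:
        return False
    return bool(FEATURE_PATTERN.search(post))

posts = [
    "What's the keyboard shortcut to group objects?",
    "My whole document layout broke after the update, margins shifted, "
    "styles changed, and exporting to PDF fails. How do I fix this?",
]
print([looks_like_nanoquestion(p) for p in posts])  # → [True, False]
```

A deployed system would likely replace these hand-written heuristics with a classifier trained on the question traits the paper collects, but the routing logic around it would look the same.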

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Rapid expert input on simple issues could combine with automated systems to handle a larger share of forum traffic.
  • The same quick-response model may apply to other technical domains such as coding or device troubleshooting.
  • Over time this approach could move help forums toward near-real-time mentoring for routine questions.

Load-bearing premise

That an expert's self-judgment of helpfulness matches the actual benefit the question asker would receive, and that the sampled questions represent typical forum posts.

What would settle it

A controlled test in which real users receive the experts' quick answers and later report whether their problem was solved or their understanding improved compared with waiting for standard forum replies.

Figures

Figures reproduced from arXiv: 2604.13621 by George Fitzmaurice, Ian Drosos, Jo Vermeulen, Justin Matejka.

Figure 1. We conducted a study with 28 online forum participants to provide answers to …
Figure 2. Survey question UI that participants saw. Participants could answer the question based on their preferred modality.
Figure 3. Participants' self-rating of how much they believed they helped the learner vs. time taken on the question. The dotted line represents …
Figure 4. Participants' preferred answer modality: 12 preferred giving advice via text, 10 had no preference, 6 preferred audio.
Figure 5. Participants rated the difficulty of using each modality in the study from Very Easy to Very Difficult. The number of participants …
Figure 6. At the end, participants were asked "of the questions you saw today, what percentage would be significantly easier to provide …"
Original abstract

People frequently use online forums to get help from experts to answer questions about feature-rich software. However, they may have to wait minutes, hours, or even days to receive advice. We investigate the potential to leverage experts to provide quicker help. We collected over 200 questions from online forums for two feature-rich software applications and suspected a quarter were short enough to be answered in less than one minute (defined as nanoquestions). We then conducted a study with 28 experts recruited from help forums to confirm this assumption, and explore whether there was a preference between text and audio answers. For more than half of the nanoquestions participants saw, they could give advice that they believed was helpful in under 60 seconds. Finally, we collected feedback about what makes a question quick to answer to inspire the design of future tools for ultra rapid human-to-human help.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents an investigation into 'nanomentoring' for feature-rich software, where the authors analyze over 200 forum questions, identify a subset as 'nanoquestions' potentially answerable in under one minute, and conduct a study with 28 experts to test if they can provide helpful advice quickly. The key finding is that for more than half of the nanoquestions, experts believed they could give helpful advice in under 60 seconds. The work also examines preferences for text versus audio responses and gathers qualitative feedback on characteristics of quick-to-answer questions to inform future tool designs.

Significance. If the central findings hold under scrutiny, this work has significance for HCI by demonstrating the feasibility of ultra-rapid expert assistance in online forums, which could reduce user wait times and frustration when learning complex software. The empirical approach and collection of design insights are strengths. However, the absence of validation for self-assessed helpfulness against real-world outcomes tempers the practical implications for deploying such systems.

major comments (2)
  1. [Methods] Methods section: The nanoquestion selection process is described as 'suspecting a quarter' of over 200 posts without specifying the exact criteria applied or reporting inter-rater reliability for this classification step, which is load-bearing for assessing whether the tested questions are representative of typical forum posts.
  2. [Results] Results section: The central claim that experts could give advice they believed was helpful in under 60 seconds for more than half of the nanoquestions rests solely on self-reported belief without external validation against asker outcomes, resolution rates, or independent ratings, which directly weakens support for the motivation of enabling quicker help that actually benefits users.
minor comments (2)
  1. [Abstract] The abstract reports aggregate findings (e.g., 'more than half') without stating the exact number of nanoquestions evaluated or the distribution of response times, which would improve precision and reproducibility.
  2. [Results] The discussion of text versus audio preferences would benefit from reporting the statistical test used and effect sizes to allow readers to evaluate the strength of any observed differences.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback on our manuscript. We address each major comment below, indicating where we agree and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Methods] Methods section: The nanoquestion selection process is described as 'suspecting a quarter' of over 200 posts without specifying the exact criteria applied or reporting inter-rater reliability for this classification step, which is load-bearing for assessing whether the tested questions are representative of typical forum posts.

    Authors: We agree that greater transparency is needed here. The initial screening was performed by a single researcher who reviewed the 200+ posts and flagged those that appeared short and focused on a single software feature or command (e.g., brief queries about one menu item or shortcut). In the revised manuscript we will explicitly list these heuristics, report the exact proportion identified (approximately 25%), and note the single-rater nature of the step as a limitation. We will also add a sentence clarifying that the subsequent expert study was intended to test whether the selected items were indeed quick to answer. revision: yes

  2. Referee: [Results] Results section: The central claim that experts could give advice they believed was helpful in under 60 seconds for more than half of the nanoquestions rests solely on self-reported belief without external validation against asker outcomes, resolution rates, or independent ratings, which directly weakens support for the motivation of enabling quicker help that actually benefits users.

    Authors: We acknowledge the limitation. Our study measured experts' own assessments of whether they could formulate helpful advice quickly; we did not collect follow-up data from the original askers or obtain independent ratings of answer quality. In the revision we will add an explicit limitations subsection in the Discussion that states this reliance on self-report and calls for future work that validates against real asker outcomes and resolution rates. At the same time, we maintain that the current design provides a necessary first-step feasibility check from the expert perspective before investing in deployment studies. revision: partial

Circularity Check

0 steps flagged

Empirical observational study with no derivation chain

Full rationale

The paper is a purely empirical study: it collects forum questions, recruits experts to attempt quick answers, and reports direct measurements of self-reported time and perceived helpfulness. No equations, derivations, fitted parameters, or predictions appear. The central finding follows immediately from participant responses without reduction to any prior inputs or self-citations. No load-bearing steps match the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claim rests on the assumption that forum-collected questions can be reliably pre-classified as nanoquestions and that expert self-judgment of helpfulness is a sufficient proxy for actual utility.

axioms (2)
  • Domain assumption: a quarter of questions from online forums for feature-rich software are short enough to be answered in less than one minute. (Stated as a suspicion after collecting over 200 questions.)
  • Domain assumption: expert belief that advice is helpful is a valid indicator of actual helpfulness. (Used to determine the success rate in the study.)
invented entities (1)
  • Nanoquestions (no independent evidence)
    Purpose: to label short questions suspected to be answerable in under one minute. New term coined to scope the investigation.

pith-pipeline@v0.9.0 · 5454 in / 1365 out tokens · 57910 ms · 2026-05-10T12:50:52.241129+00:00 · methodology

discussion (0)

