A Taxonomy of Runtime Faults in Model Context Protocol Servers
Pith reviewed 2026-06-28 04:55 UTC · model grok-4.3
The pith
A taxonomy derived from 837 GitHub threads organizes runtime faults in Model Context Protocol servers into 11 categories.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Runtime faults in MCP servers fall into 11 top-level categories and 27 subcategories that together capture 73 distinct leaf fault types; these were obtained by open coding of 837 MCP-specific threads drawn from 473 GitHub repositories and were found to cover every category reported by 55 surveyed MCP server developers, who on average had experienced 20 of the 27 subcategories.
What carries the argument
The taxonomy of 11 top-level categories and 27 subcategories produced by bottom-up open coding of runtime fault threads.
If this is right
- The taxonomy covers recurrent failures across protocol interactions, tool invocations, schema enforcement, state management, model-provider integration, security validation, and timeouts or cancellations.
- No top-level category remained unobserved when 55 MCP server developers were asked about their experience.
- The taxonomy supplies a structured reference that can support maintenance and evolution tasks in AI software that uses the Model Context Protocol.
- Developers encounter an average of 20 of the 27 subcategories, indicating broad practical coverage.
Where Pith is reading between the lines
- Automated checkers or linters could be written to flag code patterns that match the 73 leaf fault types.
- The same bottom-up method could be applied to fault reports from other LLM-to-tool protocols to produce comparable taxonomies.
- Longitudinal studies of open-source MCP servers could measure how often each subcategory appears in production logs.
- Training materials for MCP server developers could be organized around the 11 categories to reduce common mistakes.
Load-bearing premise
The 837 selected threads from 473 repositories are representative of all runtime faults that actually occur in MCP servers.
What would settle it
Discovery of a large set of MCP server runtime faults that cannot be placed into any of the 11 categories would show the taxonomy is incomplete.
Figures
read the original abstract
MCP (Model Context Protocol) enables LLMs (Large Language Models) to interact with external tools and data sources via a standardized protocol. Its rapid adoption in tool-augmented Artificial Intelligence (AI) workflows has introduced new reliability challenges, such as configuration parameters that are accepted but not enforced at runtime, leading to unintended default behavior, whose runtime fault characteristics remain empirically unexamined. We present the first empirical taxonomy of runtime faults in MCP servers. We manually analyzed 837 MCP-specific runtime fault threads from 473 actively maintained MCP server GitHub repositories and derived a taxonomy using a bottom-up open coding procedure. The taxonomy comprises 11 top-level categories and 27 subcategories (73 leaf fault types), covering recurrent failures across protocol interactions, tool invocations, schema enforcement, state management, model-provider integration, security validation, and timeouts or explicit cancellations of in-progress operations. To assess the taxonomy's external validity, we surveyed 55 MCP server developers. Respondents reported experiencing an average of 20 of the 27 fault subcategories, and no category remained unobserved. These results indicate that the taxonomy reflects widely observed runtime failures in MCP-based systems and shall assist AI software maintenance and evolution in the future.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to present the first empirical taxonomy of runtime faults in Model Context Protocol (MCP) servers. It is based on bottom-up open coding of 837 manually selected MCP-specific runtime fault threads drawn from 473 actively maintained GitHub repositories, yielding 11 top-level categories, 27 subcategories, and 73 leaf fault types. External validity is assessed through a survey of 55 MCP server developers, who reported experiencing an average of 20 of the 27 subcategories with no category unobserved.
Significance. If the sampling frame and coding process prove representative and reliable, the taxonomy could provide a practical reference for diagnosing and mitigating runtime failures in tool-augmented LLM systems, supporting maintenance and evolution of MCP-based AI software.
major comments (3)
- [Abstract] Abstract: the central claim that the taxonomy 'reflects widely observed runtime failures' rests on the 837 threads being representative, yet the description provides no search strings, inclusion/exclusion criteria, temporal bounds, operational definition of 'MCP-specific' or 'runtime fault', or sampling frame. This selection process is load-bearing for all downstream claims.
- [Abstract] Abstract (open coding description): the bottom-up procedure is presented without any information on the number of coders, inter-rater reliability statistics, disagreement resolution protocol, or saturation criteria. These omissions directly affect the trustworthiness of the 11/27/73 category structure.
- [Abstract (validation survey)] Survey validation paragraph: while 55 respondents reported experience with an average of 20/27 subcategories, the paper supplies no recruitment method, response rate, or respondent demographics. This limits the survey's ability to compensate for potential upstream selection bias in the GitHub corpus.
minor comments (1)
- [Abstract] Abstract: the parenthetical '(73 leaf fault types)' appears late; moving the full category counts earlier would improve immediate readability of the contribution size.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on methodological transparency. We agree that the abstract omits key details due to length constraints and will revise it to include concise references to the sampling, coding, and survey procedures described in the methods section. We address each major comment below.
read point-by-point responses
-
Referee: [Abstract] Abstract: the central claim that the taxonomy 'reflects widely observed runtime failures' rests on the 837 threads being representative, yet the description provides no search strings, inclusion/exclusion criteria, temporal bounds, operational definition of 'MCP-specific' or 'runtime fault', or sampling frame. This selection process is load-bearing for all downstream claims.
Authors: We acknowledge the abstract does not detail these elements. The full manuscript (Section 3.1) specifies the sampling frame: 473 actively maintained GitHub repositories implementing MCP (filtered by stars, recent commits, and protocol relevance), with 837 threads selected via keyword searches combining 'MCP'/'model context protocol' with runtime fault terms (e.g., 'error', 'exception', 'fault', 'timeout'). Inclusion required explicit runtime fault reports in MCP server contexts; exclusion covered non-runtime issues, feature requests, and non-MCP projects. Temporal bounds cover threads from the protocol's initial public release through the collection date. We will revise the abstract to briefly summarize the sampling approach and direct readers to Section 3 for complete criteria. revision: yes
-
Referee: [Abstract] Abstract (open coding description): the bottom-up procedure is presented without any information on the number of coders, inter-rater reliability statistics, disagreement resolution protocol, or saturation criteria. These omissions directly affect the trustworthiness of the 11/27/73 category structure.
Authors: The abstract omits these details for brevity. In the full paper, open coding was conducted by two researchers using an iterative bottom-up process on the 837 threads, with weekly consensus meetings to resolve disagreements through discussion (no independent parallel coding was performed, so formal IRR metrics such as Cohen's kappa were not calculated). Saturation was reached when analysis of additional threads yielded no new categories or subcategories. We will expand the abstract with a brief methods summary and add an explicit subsection in Section 3.2 describing the coder count, resolution protocol, and saturation assessment. revision: yes
-
Referee: [Abstract (validation survey)] Survey validation paragraph: while 55 respondents reported experience with an average of 20/27 subcategories, the paper supplies no recruitment method, response rate, or respondent demographics. This limits the survey's ability to compensate for potential upstream selection bias in the GitHub corpus.
Authors: The abstract does not include these survey details. Recruitment occurred via posts in MCP-related GitHub discussions, Discord communities, and direct outreach to repository maintainers; the survey was open for four weeks. We will revise the abstract and methods section to report the recruitment channels, achieved response count, and available respondent characteristics (e.g., self-reported MCP experience levels). Note that exact response rate cannot be computed as the total population size of MCP developers is unknown. revision: partial
Circularity Check
No circularity; taxonomy derived bottom-up from primary data corpus
full rationale
The paper conducts an empirical classification study: 837 threads are manually selected from GitHub repositories and subjected to bottom-up open coding to produce the 11-category taxonomy. No equations, fitted parameters, predictions, or self-citations appear in the derivation chain; categories are stated to emerge from the threads rather than being presupposed or imported. The developer survey functions as an independent consistency check on the resulting taxonomy, not as an input that forces the categories. This matches the default expectation of a self-contained empirical taxonomy with no load-bearing reduction to its own inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Bottom-up open coding applied to GitHub issue threads produces a valid and useful taxonomy of runtime faults
Reference graph
Works this paper leans on
-
[1]
A Survey of Large Language Models
W. X. Zhao, K. Zhou, J. Li, T. Tang, X. Wang, Y. Hou, Y. Min, B. Zhang, J. Zhang, Z. Donget al., “A survey of large language models,”arXiv preprint arXiv:2303.18223, vol. 1, no. 2, pp. 1–124, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[2]
On the Opportunities and Risks of Foundation Models
R. Bommasani, D. A. Hudson, E. Adeli, R. Altman, S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosselut, E. Brunskillet al., “On the opportunities and risks of foundation models,”arXiv preprint arXiv:2108.07258, 2021
work page internal anchor Pith review Pith/arXiv arXiv 2021
-
[3]
A comprehensive survey on integrating large language models with knowledge- based methods,
W. Yang, L. Some, M. Bain, and B. Kang, “A comprehensive survey on integrating large language models with knowledge- based methods,”Knowledge-Based Systems, vol. 318, p. 113503, 2025
2025
-
[4]
A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models
S. Tonmoy, S. Zaman, V . Jain, A. Rani, V . Rawte, A. Chadha, and A. Das, “A comprehensive survey of hallucination mitigation tech- niques in large language models,”arXiv preprint arXiv:2401.01313, vol. 6, 2024
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[5]
Retrieval- augmented generation for knowledge-intensive nlp tasks,
P . Lewis, E. Perez, A. Piktus, F. Petroni, V . Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschelet al., “Retrieval- augmented generation for knowledge-intensive nlp tasks,”Ad- vances in neural information processing systems, vol. 33, pp. 9459– 9474, 2020
2020
-
[6]
Retrieval-Augmented Generation for Large Language Models: A Survey
Y. Gao, Y. Xiong, X. Gao, K. Jia, J. Pan, Y. Bi, Y. Dai, J. Sun, H. Wang, H. Wanget al., “Retrieval-augmented generation for large language models: A survey,”arXiv preprint arXiv:2312.10997, vol. 2, no. 1, p. 32, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[7]
React: Synergizing reasoning and acting in language models,
S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. R. Narasimhan, and Y. Cao, “React: Synergizing reasoning and acting in language models,” inThe eleventh international conference on learning repre- sentations, 2022
2022
-
[8]
Toolformer: Language models can teach themselves to use tools,
T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomeli, E. Ham- bro, L. Zettlemoyer, N. Cancedda, and T. Scialom, “Toolformer: Language models can teach themselves to use tools,”Advances in neural information processing systems, vol. 36, pp. 68 539–68 551, 2023
2023
-
[9]
Promises and challenges of microservices: an exploratory study,
Y. Wang, H. Kadiyala, and J. Rubin, “Promises and challenges of microservices: an exploratory study,”Empirical Software Engineer- ing, vol. 26, no. 4, p. 63, 2021
2021
-
[10]
Gray failure: The achilles’ heel of cloud-scale sys- tems,
P . Huang, C. Guo, L. Zhou, J. R. Lorch, Y. Dang, M. Chintalapati, and R. Yao, “Gray failure: The achilles’ heel of cloud-scale sys- tems,” inProceedings of the 16th Workshop on Hot Topics in Operating Systems, 2017, pp. 150–155
2017
-
[11]
Experience report: An empirical study of api failures in openstack cloud environments,
P . Musavi, B. Adams, and F. Khomh, “Experience report: An empirical study of api failures in openstack cloud environments,” in2016 IEEE 27th International Symposium on Software Reliability Engineering (ISSRE). IEEE, 2016, pp. 424–434
2016
-
[12]
What is the model context protocol (mcp)?
Model Context Protocol, “What is the model context protocol (mcp)?” https://modelcontextprotocol.io/docs/getting-started/i ntro, 2026, official documentation, accessed February 25, 2026
2026
-
[13]
Specification-model context protocol,
——, “Specification-model context protocol,” https://modelcon textprotocol.io/specification/2025-06-18, Jun. 2025, accessed: 2026-02-26
2025
-
[14]
M. M. Hasan, H. Li, E. Fallahzadeh, G. K. Rajbahadur, B. Adams, and A. E. Hassan, “Model context protocol (mcp) at first glance: Studying the security and maintainability of mcp servers,”arXiv preprint arXiv:2506.13538, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[15]
Systematization of knowledge: Security and safety in the model context protocol ecosystem,
S. Gaire, S. Gyawali, S. Mishra, S. Niroula, D. Thakur, and U. Yadav, “Systematization of knowledge: Security and safety in the model context protocol ecosystem,”arXiv preprint arXiv:2512.08290, 2025
-
[16]
A measurement study of model context protocol ecosystem,
H. Guo, Y. Hao, Y. Zhang, M. Xu, P . Lv, J. Chen, and X. Cheng, “A measurement study of model context protocol ecosystem,”arXiv preprint arXiv:2509.25292, 2025
-
[17]
MCPXKIT: The Unified Toolkit for Analyzing Model Context Protocol Security
Y. Guo, P . Liu, W. Ma, Z. Deng, X. Zhu, P . Di, X. Xiao, and S. Wen, “Systematic analysis of mcp security,”arXiv preprint arXiv:2508.12538, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[18]
W. Zhao, J. Liu, B. Ruan, S. Li, and Z. Liang, “When mcp servers attack: Taxonomy, feasibility, and mitigation,”arXiv preprint arXiv:2509.24272, 2025
-
[19]
Basic concepts and taxonomy of dependable and secure computing,
A. Avizienis, J.-C. Laprie, B. Randell, and C. Landwehr, “Basic concepts and taxonomy of dependable and secure computing,” IEEE transactions on dependable and secure computing, vol. 1, no. 1, pp. 11–33, 2004
2004
-
[20]
Understanding fault-tolerant distributed systems,
F. Cristian, “Understanding fault-tolerant distributed systems,” Communications of the ACM, vol. 34, no. 2, pp. 56–78, 1991
1991
-
[21]
Fault analysis and debugging of microservice systems: Industrial sur- vey, benchmark system, and empirical study,
X. Zhou, X. Peng, T. Xie, J. Sun, C. Ji, W. Li, and D. Ding, “Fault analysis and debugging of microservice systems: Industrial sur- vey, benchmark system, and empirical study,”IEEE Transactions on Software Engineering, vol. 47, no. 2, pp. 243–260, 2018
2018
-
[22]
Taxonomy of real faults in deep learning systems,
N. Humbatova, G. Jahangirova, G. Bavota, V . Riccio, A. Stocco, and P . Tonella, “Taxonomy of real faults in deep learning systems,” in Proceedings of the ACM/IEEE 42nd international conference on software engineering, 2020, pp. 1110–1121
2020
-
[23]
A taxonomy of real faults for hybrid quantum-classical software architectures,
A. Bensoussan, G. Jahangirova, and M. Mousavi, “A taxonomy of real faults for hybrid quantum-classical software architectures,” ACM Transactions on Software Engineering and Methodology, 2025
2025
-
[24]
Empirical validation of a web fault taxonomy and its usage for fault seeding,
A. Marchetto, F. Ricca, and P . Tonella, “Empirical validation of a web fault taxonomy and its usage for fault seeding,” in2007 9th IEEE International Workshop on Web Site Evolution. IEEE, 2007, pp. 31–38
2007
-
[25]
Are mutants a valid substitute for real faults in software testing?
R. Just, D. Jalali, L. Inozemtseva, M. D. Ernst, R. Holmes, and G. Fraser, “Are mutants a valid substitute for real faults in software testing?” inProceedings of the 22nd ACM SIGSOFT international symposium on foundations of software engineering, 2014, pp. 654–665
2014
-
[26]
Crowd- sourced knowledge on stack overflow: A systematic mapping study,
S. Meldrum, S. A. Licorish, and B. T. R. Savarimuthu, “Crowd- sourced knowledge on stack overflow: A systematic mapping study,” inProceedings of the 21st international conference on evaluation and assessment in software engineering, 2017, pp. 180–185
2017
-
[27]
Sampling methods, types & techniques,
W. Webster, “Sampling methods, types & techniques,” https:// www.qualtrics.com/articles/strategy-research/sampling-metho ds/, Jan. 2023, qualtrics. Accessed: 2026-06-01
2023
-
[28]
Practitioners’ expectations on log anomaly detection,
X. Ma, Y. Li, J. Keung, X. Yu, H. Zou, Z. Yang, F. Sarro, and E. T. Barr, “Practitioners’ expectations on log anomaly detection,”IEEE Transactions on Software Engineering, 2025
2025
-
[29]
Pull request governance in open source communities,
A. Alami, R. Pardo, M. L. Cohn, and A. W ˛ asowski, “Pull request governance in open source communities,”IEEE Transactions on Software Engineering, vol. 48, no. 12, pp. 4838–4856, 2021
2021
-
[30]
Function calling,
OpenAI, “Function calling,” https://platform.openai.com/docs /guides/function-calling, 2025, accessed: 2026-02-26
2025
-
[31]
Json-rpc 2.0 specification,
JSON-RPC Working Group, “Json-rpc 2.0 specification,” https:// www.jsonrpc.org/specification, 2013, accessed: 2026-02-26
2013
-
[32]
Experimental evaluation of the fail- silent behavior of a distributed real-time run-time support built from cots components,
P . Chevochot and I. Puaut, “Experimental evaluation of the fail- silent behavior of a distributed real-time run-time support built from cots components,” in2001 International Conference on Depend- able Systems and Networks. IEEE, 2001, pp. 304–313
2001
-
[33]
An empirical study on api-misuse bugs in open-source c programs,
Z. Gu, J. Wu, J. Liu, M. Zhou, and M. Gu, “An empirical study on api-misuse bugs in open-source c programs,” in2019 IEEE 43rd annual computer software and applications conference (COMPSAC), vol. 1. IEEE, 2019, pp. 11–20. 14
2019
-
[34]
Root cause analysis of anomalies of multitier services in public clouds,
J. Weng, J. H. Wang, J. Yang, and Y. Yang, “Root cause analysis of anomalies of multitier services in public clouds,”IEEE/ACM Transactions on Networking, vol. 26, no. 4, pp. 1646–1659, 2018
2018
-
[35]
Failure diagnosis in microservice systems: A comprehensive survey and analysis,
S. Zhang, S. Xia, W. Fan, B. Shi, X. Xiong, Z. Zhong, M. Ma, Y. Sun, and D. Pei, “Failure diagnosis in microservice systems: A comprehensive survey and analysis,”ACM Transactions on Software Engineering and Methodology, 2024
2024
-
[36]
An empirical study on tensorflow program bugs,
Y. Zhang, Y. Chen, S.-C. Cheung, Y. Xiong, and L. Zhang, “An empirical study on tensorflow program bugs,” inProceedings of the 27th ACM SIGSOFT international symposium on software testing and analysis, 2018, pp. 129–140
2018
-
[37]
An empirical study of issues in large language model training systems,
Y. Gao, R. Lu, H. Lin, and Y. Chen, “An empirical study of issues in large language model training systems,” inProceedings of the 33rd ACM International Conference on the Foundations of Software Engineering, 2025, pp. 122–133
2025
-
[38]
Defining and detecting the defects of large language model-based autonomous agents,
K. Ning, J. Chen, J. Zhang, W. Li, Z. Wang, Y. Feng, W. Zhang, and Z. Zheng, “Defining and detecting the defects of large language model-based autonomous agents,”IEEE Transactions on Software Engineering, 2026
2026
-
[39]
Model Context Protocol (MCP): Landscape, Security Threats, and Future Research Directions
X. Hou, Y. Zhao, S. Wang, and H. Wang, “Model context protocol (mcp): Landscape, security threats, and future research direc- tions,”arXiv preprint arXiv:2503.23278, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[40]
We urgently need privilege management in mcp: A measurement of api usage in mcp ecosystems,
Z. Li, K. Li, B. Ma, M. Xu, Y. Zhang, and X. Cheng, “We urgently need privilege management in mcp: A measurement of api usage in mcp ecosystems,”arXiv preprint arXiv:2507.06250, 2025
-
[41]
Parasites in the Toolchain: A Large-Scale Analysis of Attacks on the MCP Ecosystem
S. Zhao, Q. Hou, Z. Zhan, Y. Wang, Y. Xie, Y. Guo, L. Chen, S. Li, and Z. Xue, “Mind your server: A systematic study of parasitic toolchain attacks on the mcp ecosystem,”arXiv preprint arXiv:2509.06572, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[42]
On the use of agentic coding: An empirical study of pull requests on github,
M. Watanabe, H. Li, Y. Kashiwa, B. Reid, H. Iida, and A. E. Hassan, “On the use of agentic coding: An empirical study of pull requests on github,”arXiv preprint arXiv:2509.14745, 2025
-
[43]
muprl: A mutation testing pipeline for deep reinforcement learning based on real faults,
D.-G. Thomas, M. Biagiola, N. Humbatova, M. Wardat, G. Ja- hangirova, H. Rajan, and P . Tonella, “muprl: A mutation testing pipeline for deep reinforcement learning based on real faults,” arXiv preprint arXiv:2408.15150, 2024
-
[44]
Taxonomy of faults in attention-based neural networks,
S. Jahan, S. Singh Rajput, T. Sharma, and M. Masudur Rahman, “Taxonomy of faults in attention-based neural networks,”arXiv e-prints, pp. arXiv–2508, 2025
2025
-
[45]
Faults in deep reinforcement learning programs: a taxonomy and a detection approach,
A. Nikanjam, M. M. Morovati, F. Khomh, and H. Ben Braiek, “Faults in deep reinforcement learning programs: a taxonomy and a detection approach,”Automated software engineering, vol. 29, no. 1, p. 8, 2022
2022
-
[46]
An in-depth study of the promises and perils of mining github,
E. Kalliamvakou, G. Gousios, K. Blincoe, L. Singer, D. M. German, and D. Damian, “An in-depth study of the promises and perils of mining github,”Empirical Software Engineering, vol. 21, no. 5, pp. 2035–2071, 2016
2035
-
[47]
Curating github for engineered software projects,
N. Munaiah, S. Kroh, C. Cabrey, and M. Nagappan, “Curating github for engineered software projects,”Empirical Software Engi- neering, vol. 22, no. 6, pp. 3219–3253, 2017
2017
-
[48]
Data quality assessment in the wild: Find- ings from github,
I. Ustunboyacioglu, I. Kumara, D. Di Nucci, D. A. Tamburri, and W.-J. Van Den Heuvel, “Data quality assessment in the wild: Find- ings from github,” inProceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering, 2024, pp. 120– 129
2024
-
[49]
Chaos engineering in the wild: Findings from github,
J. Owotogbe, I. Kumara, D. Di Nucci, D. A. Tamburri, and W.- J. v. d. Heuvel, “Chaos engineering in the wild: Findings from github,”arXiv preprint arXiv:2505.13654, 2025
-
[50]
M. Khan, M. A. Akbar, and J. Kasurinen, “Integrating large language models in software engineering education: A pi- lot study through github repositories mining,”arXiv preprint arXiv:2509.04877, 2025
-
[51]
Towards detecting prompt knowledge gaps for improved llm-guided issue resolution,
R. Ehsani, S. Pathak, and P . Chatterjee, “Towards detecting prompt knowledge gaps for improved llm-guided issue resolution,” in 2025 IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR). IEEE, 2025, pp. 699–711
2025
-
[52]
A first look at the self-admitted technical debt in test code: Taxonomy and detection,
S. Islam, M. N. I. Opu, S. Wang, and S. Chowdhury, “A first look at the self-admitted technical debt in test code: Taxonomy and detection,”arXiv preprint arXiv:2510.22409, 2025
-
[53]
Bugs in machine learning-based systems: a faultload benchmark,
M. M. Morovati, A. Nikanjam, F. Khomh, and Z. M. Jiang, “Bugs in machine learning-based systems: a faultload benchmark,”Em- pirical Software Engineering, vol. 28, no. 3, p. 62, 2023
2023
-
[54]
Sampling in software engineering research: A critical review and guidelines,
S. Baltes and P . Ralph, “Sampling in software engineering research: A critical review and guidelines,”Empirical Software Engineering, vol. 27, no. 4, p. 94, 2022
2022
-
[55]
Bug taxonomies: Use them to generate better tests,
G. Vijayaraghavan and C. Kaner, “Bug taxonomies: Use them to generate better tests,”Star East, vol. 2003, pp. 1–40, 2003
2003
-
[56]
A comprehensive study on security bug characteristics,
Y. Wei, X. Sun, L. Bo, S. Cao, X. Xia, and B. Li, “A comprehensive study on security bug characteristics,”Journal of Software: Evolution and Process, vol. 33, no. 10, p. e2376, 2021
2021
-
[57]
What do users ask in open-source ai repositories? an empirical study of github issues,
Z. Yang, C. Wang, J. Shi, T. Hoang, P . Kochhar, Q. Lu, Z. Xing, and D. Lo, “What do users ask in open-source ai repositories? an empirical study of github issues,” in2023 IEEE/ACM 20th Inter- national Conference on Mining Software Repositories (MSR). IEEE, 2023, pp. 79–91
2023
-
[58]
Wohlin, P
C. Wohlin, P . Runeson, M. Höst, M. C. Ohlsson, B. Regnell, A. Wesslénet al.,Experimentation in software engineering. Karl- skrona, Sweden: Springer, 2012, vol. 236
2012
-
[59]
Are you still working on this? an empirical study on pull request abandon- ment,
Z. Li, Y. Yu, T. Wang, G. Yin, S. Li, and H. Wang, “Are you still working on this? an empirical study on pull request abandon- ment,”IEEE Transactions on Software Engineering, vol. 48, no. 6, pp. 2173–2188, 2021. ACKNOWLEDGMENTS This research was partially funded by the Safeguard project (grant No. 20506020) with Deloitte. Joshua Owotogbeis a Ph.D. candid...
2021
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.