MR-Coupler: Automated Metamorphic Test Generation via Functional Coupling Analysis
Pith reviewed 2026-05-10 15:50 UTC · model grok-4.3
The pith
Functional coupling between methods in source code lets large language models automatically generate valid metamorphic test cases.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MR-Coupler identifies functionally coupled method pairs via three coupling features, prompts large language models to instantiate metamorphic relations for those pairs, and validates the resulting metamorphic test cases with test amplification and mutation analysis, yielding valid cases for over 90 percent of evaluated tasks and detecting 44 percent of real-world bugs.
What carries the argument
MR-Coupler, the pipeline that selects method pairs by functional coupling features, delegates relation instantiation to large language models, and applies test amplification plus mutation analysis for validation.
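As a rough illustration of the pair-selection step, the sketch below filters candidate method pairs with three stand-in cues. The summary does not name the paper's three coupling features, so `same_class`, `types_chain`, `shared_name_suffix`, and the `Method` record are hypothetical placeholders, not MR-Coupler's actual features or data model.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Method:
    """Hypothetical summary of a method signature (illustration only)."""
    owner: str        # declaring class
    name: str
    param_type: str   # single parameter type, for simplicity
    return_type: str


def same_class(a: Method, b: Method) -> bool:
    return a.owner == b.owner


def types_chain(a: Method, b: Method) -> bool:
    # The output of one method can be fed to the other.
    return a.return_type == b.param_type or b.return_type == a.param_type


def shared_name_suffix(a: Method, b: Method) -> bool:
    # Crude lexical cue: different names with the same last four characters,
    # such as "encode" and "decode".
    return a.name != b.name and a.name[-4:] == b.name[-4:]


def coupled_pairs(methods, min_features=2):
    """Keep pairs that satisfy at least `min_features` of the placeholder cues."""
    selected = []
    for a, b in combinations(methods, 2):
        score = sum(f(a, b) for f in (same_class, types_chain, shared_name_suffix))
        if score >= min_features:
            selected.append((a.name, b.name))
    return selected


methods = [
    Method("Codec", "encode", "str", "bytes"),
    Method("Codec", "decode", "bytes", "str"),
    Method("Logger", "flush", "None", "None"),
]
pairs = coupled_pairs(methods)
```

The point of such a filter is to avoid enumerating every method pair: only pairs that clear the threshold would be passed on to the language model.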
Load-bearing premise
The three chosen features of functional coupling between methods reliably indicate pairs that possess useful metamorphic relations an LLM can formulate correctly.
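To make the premise concrete, a coupled pair such as an encoder and its decoder admits a round-trip relation that needs no expected-output oracle. The sketch uses Python's standard `base64` module as a stand-in subject; the pair and the relation are illustrative choices and are not drawn from the paper's benchmark.

```python
import base64


def check_round_trip(data: bytes) -> bool:
    """MR: decode(encode(x)) == x, checked without knowing encode(x) in advance."""
    return base64.b64decode(base64.b64encode(data)) == data


# Source inputs; no hand-written expected encodings are required.
inputs = [b"", b"a", b"metamorphic", bytes(range(256))]
results = [check_round_trip(x) for x in inputs]
```

If the premise fails for a given pair, that is, the two methods are coupled by the chosen features but share no such relation, a generated test of this shape would either be trivially true or raise a false alarm.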
What would settle it
Applying MR-Coupler to a fresh collection of 50 industrial programs and finding that the generated metamorphic test cases detect fewer than 25 percent of the injected or reported bugs would falsify the reported effectiveness.
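The arithmetic behind this threshold can be worked out directly. The sketch below assumes, purely for illustration, a replication set of 40 bugs; the text above specifies 50 programs but no bug count.

```python
def below_threshold(detected: int, total: int, threshold_percent: int = 25) -> bool:
    """True when detected/total is strictly below threshold_percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    # Integer cross-multiplication avoids floating-point rounding.
    return detected * 100 < threshold_percent * total


# Reported result: 44% of 50 real-world bugs.
reported_detected = 44 * 50 // 100

# Hypothetical replication outcomes on 40 bugs (the count is an assumption).
borderline = below_threshold(10, 40)   # exactly 25%
failing = below_threshold(9, 40)       # 22.5%
reported = below_threshold(reported_detected, 50)
```

Under this reading, the reported 22 of 50 detections sits well above the cutoff, and only a replication that lands strictly under one quarter would count against the claim.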
Original abstract
Metamorphic testing (MT) is a widely recognized technique for alleviating the oracle problem in software testing. However, its adoption is hindered by the difficulty of constructing effective metamorphic relations (MRs), which often require domain-specific or hard-to-obtain knowledge. In this work, we propose a novel approach that leverages the functional coupling between methods, which is readily available in source code, to automatically construct MRs and generate metamorphic test cases (MTCs). Our technique, MR-Coupler, identifies functionally coupled method pairs, employs large language models to generate candidate MTCs, and validates them through test amplification and mutation analysis. In particular, we leverage three functional coupling features to avoid expensive enumeration of possible method pairs, and a novel validation mechanism to reduce false alarms. Our evaluation of MR-Coupler on 100 human-written MTCs and 50 real-world bugs shows that it generates valid MTCs for over 90% of tasks, improves valid MTC generation by 64.90%, and reduces false alarms by 36.56% compared to baselines. Furthermore, the MTCs generated by MR-Coupler detect 44% of the real bugs. Our results highlight the effectiveness of leveraging functional coupling for automated MR construction and the potential of MR-Coupler to facilitate the adoption of MT in practice. We also released the tool and experimental data to support future research.
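The validation idea in the abstract can be sketched as follows, under the assumption that a candidate MTC is kept when it passes on the original implementation and fails on at least one mutant. The abstract does not spell out MR-Coupler's acceptance criteria, and the test amplification step is omitted here; every function and mutant below is hypothetical.

```python
def encode(xs):
    return [x + 1 for x in xs]


def decode(xs):
    return [x - 1 for x in xs]


# Hand-written mutants of decode, standing in for a mutation tool's output.
def mutant_decode_noop(xs):
    return list(xs)


def mutant_decode_off_by_two(xs):
    return [x - 2 for x in xs]


def round_trip_mtc(enc, dec, x):
    return dec(enc(x)) == x


def length_only_mtc(enc, dec, x):
    # Holds on the original, but too weak to expose either mutant.
    return len(dec(enc(x))) == len(x)


def wrong_mtc(enc, dec, x):
    # Encodes a relation that does not hold for the correct code.
    return dec(enc(x)) == enc(x)


def validate(mtc, inputs, original, mutants):
    """Classify a candidate MTC as 'false alarm', 'ineffective', or 'valid'."""
    enc, dec = original
    if not all(mtc(enc, dec, x) for x in inputs):
        return "false alarm"  # fails on the presumed-correct implementation
    killed = sum(
        any(not mtc(m_enc, m_dec, x) for x in inputs)
        for m_enc, m_dec in mutants
    )
    return "valid" if killed > 0 else "ineffective"


inputs = [[1, 2, 3], [0], [-5, 7]]
original = (encode, decode)
mutants = [(encode, mutant_decode_noop), (encode, mutant_decode_off_by_two)]

verdicts = {
    mtc.__name__: validate(mtc, inputs, original, mutants)
    for mtc in (round_trip_mtc, length_only_mtc, wrong_mtc)
}
```

Running several inputs through each candidate loosely mirrors the role of amplification: a relation that holds on one input by coincidence is more likely to be exposed when it is exercised on others.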
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces MR-Coupler, a technique for automated metamorphic test case (MTC) generation. It identifies functionally coupled method pairs using three code-derived features to avoid exhaustive enumeration, employs large language models to generate candidate MTCs based on metamorphic relations, and validates candidates via test amplification combined with mutation analysis to reduce false alarms. The evaluation uses 100 human-written MTCs and 50 real-world bugs. It reports valid MTCs for over 90% of tasks, a 64.90% improvement in valid MTC generation over baselines, a 36.56% reduction in false alarms, and detection of 44% of the real bugs.
Significance. If the results hold, the work addresses a longstanding barrier to metamorphic testing adoption by automating MR construction from readily available source-code features rather than domain expertise. The pipeline integrates static analysis, LLM generation, and dynamic validation in a manner that appears internally consistent and externally validated via mutation analysis and real faults. The explicit release of the tool and experimental data is a clear strength that supports reproducibility and follow-on research.
minor comments (2)
- The abstract and evaluation summary report concrete percentages (e.g., 64.90% improvement, 36.56% false-alarm reduction) but do not name the exact baseline techniques or statistical tests used; adding this detail would strengthen the comparison claims.
- The three functional coupling features are central to narrowing the search space, yet the manuscript would benefit from a brief justification or reference to prior work on why these particular features (rather than alternatives) were selected.
Simulated Author's Rebuttal
We thank the referee for the positive summary of our work on MR-Coupler and the recommendation for minor revision. We are encouraged that the approach's potential to address longstanding challenges in metamorphic testing adoption through functional coupling analysis is recognized, along with the strengths in reproducibility via tool and data release.
Circularity Check
No significant circularity identified
The paper describes an empirical tool (MR-Coupler) that identifies functionally coupled method pairs via three code-derived features, uses LLMs to propose MTCs, and validates them via test amplification plus mutation analysis. All reported performance numbers (valid MTCs for over 90% of tasks, 64.90% improvement, 36.56% false-alarm reduction, 44% bug detection) are obtained by direct measurement against external artifacts: 100 human-written MTCs and 50 real-world bugs. No equations, predictions, or uniqueness claims reduce by construction to quantities defined inside the paper; the validation pipeline is explicitly external and falsifiable. No self-citation chains or ansatzes are load-bearing for the central result.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Functional coupling between methods, detectable from source code, indicates the existence of useful metamorphic relations.
Proc. ACM Softw. Eng., Vol. 3, No. FSE, Article FSE206. Publication date: July 2026. Congying Xu, Hengcheng Zhu, Songqiang Chen, Jiarong Wu, Valerio Terragni, and Shing-Chi Cheung