pith. machine review for the scientific record.

arxiv: 2605.11790 · v1 · submitted 2026-05-12 · 💻 cs.SE

Recognition: no theorem link

An Extensive Replication Study of the ABLoTS Approach for Bug Localization

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 05:45 UTC · model grok-4.3

classification 💻 cs.SE
keywords bug localization · replication study · data leakage · information retrieval · ABLoTS · TraceScore · temporal split · empirical software engineering

The pith

A replication of ABLoTS bug localization cannot reproduce the original results because an incorrect cut-off date allowed test data to leak into training.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper replicates ABLoTS, an information-retrieval approach for recommending source-code files that contain the cause of a bug. Its core component, TraceScore, links feature requests and bug reports through traceability information to score code snippets. When run on the original Java dataset plus two larger extended datasets, TraceScore produces comparable or stronger results. Yet the replication fails to match the performance numbers published for ABLoTS. The discrepancy traces to a single overlooked choice: the original evaluation selected a cut-off date that placed later bug reports into the training set, letting test data leak forward and inflate measured accuracy.

Core claim

The central claim is that the promising results reported for ABLoTS cannot be reproduced because the original study chose a cut-off date that caused test data to leak into the training data, producing significantly inflated performance figures on the 11-project Java corpus of 8,494 bug reports.

What carries the argument

The temporal cut-off date used to split bug reports into training and test sets; an earlier date than the one implicitly used in the original work prevents future reports from contaminating the training corpus.
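The forward-only split this premise demands can be sketched in a few lines. The record layout and dates below are illustrative, not the paper's actual schema; only the ordering logic matters.

```python
from datetime import date

# Hypothetical bug reports; "created" stands in for whatever
# timestamp field the real dataset carries.
reports = [
    {"id": "BUG-1", "created": date(2015, 3, 1)},
    {"id": "BUG-2", "created": date(2016, 7, 9)},
    {"id": "BUG-3", "created": date(2017, 1, 15)},
    {"id": "BUG-4", "created": date(2018, 5, 2)},
]

def temporal_split(reports, cutoff):
    """Train on reports filed strictly before the cut-off; test on the rest.

    Choosing the cut-off too late silently moves future reports into
    training -- the leakage the replication diagnoses.
    """
    train = [r for r in reports if r["created"] < cutoff]
    test = [r for r in reports if r["created"] >= cutoff]
    return train, test

train, test = temporal_split(reports, cutoff=date(2017, 1, 1))
# Every training report now predates every test report.
assert max(r["created"] for r in train) < min(r["created"] for r in test)
```

The invariant checked on the last line is exactly what the original evaluation's cut-off date violated.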

If this is right

  • TraceScore produces competitive results when evaluated on larger, temporally clean datasets without leakage.
  • Bug-localization performance numbers obtained with non-strict temporal splits are unreliable and cannot be compared across studies.
  • Any IRBL technique that trains on version-history or traceability data must enforce a forward-only cut-off to avoid the same leakage artifact.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Many earlier IRBL papers may contain similar hidden temporal leakage and would show lower performance under a correct split.
  • Practitioners adopting bug-localization tools should treat published accuracy figures as upper bounds until the same methods are re-evaluated with strict time ordering.
  • Future replication or benchmark suites for bug localization should publish both the exact cut-off date and the script that enforces it.
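A minimal audit of the kind such a suite could publish, assuming only that each report carries a comparable timestamp (plain integers stand in for dates here):

```python
def audit_split(train_dates, test_dates):
    """Return the test-set dates that do not lie strictly after training.

    An empty result certifies a forward-only split; anything else is
    the leakage artifact described above.
    """
    if not train_dates or not test_dates:
        return []
    horizon = max(train_dates)
    return sorted(d for d in test_dates if d <= horizon)

# A leaky split: one "test" report predates the newest training report.
leaky = audit_split(train_dates=[2015, 2017], test_dates=[2016, 2018])
clean = audit_split(train_dates=[2015, 2016], test_dates=[2017, 2018])
assert leaky == [2016] and clean == []
```

Shipping this check alongside the cut-off date would let any reader re-verify the split without rebuilding the pipeline.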

Load-bearing premise

The replication faithfully recreated the original ABLoTS code and evaluation pipeline except for the identified cut-off date choice.

What would settle it

Re-run the original ABLoTS pipeline on the 11-project dataset while enforcing a strict cut-off date that keeps every test bug report strictly after the last training report and verify whether the reported performance metrics fall to the levels obtained in the replication.
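The metrics such a re-run would compare, MAP and MRR, reduce to a few lines each; the file names below are placeholders, not from the paper's dataset.

```python
def reciprocal_rank(ranking, buggy):
    """1/rank of the first truly buggy file in the ranking, 0 if absent."""
    for rank, f in enumerate(ranking, start=1):
        if f in buggy:
            return 1.0 / rank
    return 0.0

def average_precision(ranking, buggy):
    """Precision averaged over the ranks where buggy files appear."""
    hits, total = 0, 0.0
    for rank, f in enumerate(ranking, start=1):
        if f in buggy:
            hits += 1
            total += hits / rank
    return total / len(buggy) if buggy else 0.0

# One query: files ranked by the localizer vs. ground-truth buggy files.
ranking = ["Parser.java", "Lexer.java", "Cache.java"]
buggy = {"Parser.java", "Cache.java"}
assert reciprocal_rank(ranking, buggy) == 1.0
assert abs(average_precision(ranking, buggy) - (1/1 + 2/3) / 2) < 1e-9
```

MAP and MRR are the means of these per-query values over all bug reports, which is why a leaked split inflates both at once: every contaminated query shifts the average upward.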

read the original abstract

Bug localization is the task of recommending source code locations (typically files) that contain the cause of a bug and hence need to be changed to fix the bug. Along these lines, information retrieval-based bug localization (IRBL) approaches have been adopted, which identify the most bug-prone files from the source code space. In current practice, a series of state-of-the-art IRBL techniques leverage the combination of different components (e.g., similar reports, version history, and code structure) to achieve better performance. ABLoTS is a recently proposed approach with the core component, TraceScore, that utilizes requirements and traceability information between different issue reports (i.e., feature requests and bug reports) to identify buggy source code snippets with promising results. To evaluate the accuracy of these results and obtain additional insights into the practical applicability of ABLoTS, we conducted a replication study of this approach with the original dataset and also on two extended datasets (i.e., additional Java dataset and Python dataset). The original dataset consists of 11 open source Java projects with 8,494 bug reports. The extended Java dataset includes 16 more projects comprising 25,893 bug reports and corresponding source code commits. The extended Python dataset consists of 12 projects with 1,289 bug reports. While we find that the TraceScore component, which is the core of ABLoTS, produces comparable or even better results with the extended datasets, we also find that we cannot reproduce the ABLoTS results, as reported in its original paper, due to an overlooked side effect of incorrectly choosing a cut-off date that led to test data leaking into training data with significant effects on performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript reports an extensive replication of the ABLoTS bug localization approach, evaluating its TraceScore component on the original 11 Java projects (8,494 bug reports) and two extended datasets: 16 additional Java projects (25,893 bug reports) and 12 Python projects (1,289 bug reports). The key finding is that while TraceScore yields comparable or superior results on the extended datasets, the original reported performance cannot be reproduced because of test data leakage into training data caused by an incorrect cut-off date selection.

Significance. If the leakage diagnosis holds, the work is significant for exposing how temporal cut-off choices can invalidate IR-based bug localization results and for demonstrating that TraceScore remains viable on larger, more recent datasets. It provides a concrete, testable explanation for non-reproducibility and encourages stricter data-handling standards in the field.

major comments (1)
  1. [§5] (Results on Original Dataset): The central claim that non-reproducibility is caused solely by the incorrect cut-off date is not directly verified; the manuscript does not report a controlled re-run of the replication pipeline on the original 11-project dataset with the corrected cut-off date to confirm that MAP/MRR values return to the levels published in the ABLoTS paper.
minor comments (2)
  1. [Abstract] The quantitative magnitude of the performance drop attributable to leakage (e.g., the exact delta in MAP or MRR) is not stated; reporting it would strengthen the claim.
  2. [§3] The manuscript should clarify whether the extended datasets preserve the same temporal ordering and commit quality standards as the original 11 projects.

Simulated Author's Rebuttal

1 response · 0 unresolved

We appreciate the referee's thorough review and the recognition of the significance of our findings on data leakage in bug localization studies. We address the major comment point by point below.

read point-by-point responses
  1. Referee: [§5] (Results on Original Dataset): The central claim that non-reproducibility is caused solely by the incorrect cut-off date is not directly verified; the manuscript does not report a controlled re-run of the replication pipeline on the original 11-project dataset with the corrected cut-off date to confirm that MAP/MRR values return to the levels published in the ABLoTS paper.

    Authors: We acknowledge that the current manuscript relies on our diagnosis of the cut-off date error and the resulting data leakage without providing a direct controlled experiment that re-runs the pipeline with the corrected date to restore the original performance. This verification would indeed strengthen the causal claim. In the revised manuscript, we will include the results of such a controlled re-run on the original 11-project dataset, demonstrating that the MAP and MRR values are restored to levels comparable to the ABLoTS paper when the leakage is eliminated. This will be added to §5. revision: yes

Circularity Check

0 steps flagged

No circularity: external replication against independent prior work

full rationale

This is an empirical replication study that compares its recreated pipeline and results against the independent ABLoTS paper. No equations, parameters, or predictions are derived from the present paper's own outputs or fits; the central claim (non-reproducibility due to cut-off date leakage) rests on direct empirical mismatch with the external original results rather than any self-referential reduction. The extended datasets and TraceScore evaluations are presented as new observations, not as quantities defined by the paper's conclusions. No self-citation load-bearing, ansatz smuggling, or renaming of known results occurs in the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The replication rests on standard assumptions in empirical software engineering: that bug reports and commits can be temporally ordered without leakage, that file-level granularity is appropriate for localization, and that standard IR metrics (e.g., MAP, MRR) are comparable across studies. No new free parameters or invented entities are introduced.

axioms (2)
  • domain assumption Temporal ordering of bug reports and commits must be respected to avoid future information leaking into training data.
    Invoked when diagnosing the original cut-off date error.
  • domain assumption File-level bug localization is a valid proxy for the practical task of identifying code that must be changed.
    Standard assumption in IRBL literature that the replication inherits.

pith-pipeline@v0.9.0 · 5638 in / 1391 out tokens · 82738 ms · 2026-05-13T05:45:17.587080+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

80 extracted references · 80 canonical work pages
