
arxiv: 2505.01469 · v1 · submitted 2025-05-02 · 💻 cs.SE

Automatic techniques for issue report classification: A systematic mapping study

Pith reviewed 2026-05-22 17:36 UTC · model grok-4.3

classification 💻 cs.SE
keywords issue report classification · systematic mapping study · machine learning · software maintenance · bug triage · deep learning · large language models

The pith

Existing studies on automatic issue report classification overlook practitioner involvement and real-world adoption factors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper performs a systematic mapping study to survey the use of automatic techniques for classifying software issue reports into categories such as bugs or enhancements. It finds that the 46 identified studies apply machine learning, deep learning, and large language models but evaluate them almost exclusively on open-source archival data. The work highlights that these studies rarely involve practitioners and focus narrowly on prediction accuracy while ignoring other factors like explainability, scalability, and generalizability. A sympathetic reader would care because improved classification could help software teams assign resources more effectively, yet current methods may not transfer well to industrial settings.
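To make the task concrete, here is a minimal sketch of the kind of traditional machine learning classifier the surveyed studies apply: a multinomial naive Bayes model over word counts, written in plain Python. It is not taken from the paper or from any of the 46 studies. The class name `NaiveBayesIssueClassifier`, the tokenizer, and the four training titles are assumptions invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesIssueClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)      # documents per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy issue titles, invented for illustration only.
train_texts = [
    "App crashes with null pointer exception on startup",
    "Error thrown when saving file, crashes editor",
    "Add dark mode option to settings",
    "Support export to CSV, feature request",
]
train_labels = ["bug", "bug", "enhancement", "enhancement"]

clf = NaiveBayesIssueClassifier().fit(train_texts, train_labels)
prediction = clf.predict("Editor crashes when opening settings")
print(prediction)
```

A model like this reports only a label. It says nothing about why the label was chosen or how it behaves on another project's issues, which are the explainability and generalizability gaps the paper raises.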

Core claim

The literature applies various techniques for classifying issue reports, including traditional machine learning and deep learning-based techniques and more advanced large language models. These studies lack the involvement of practitioners, do not consider other potentially relevant adoption factors beyond prediction accuracy such as the explainability, scalability, and generalizability of the techniques, and mainly rely on archival data from open-source repositories only. Therefore, future research should focus on real industrial evaluations, consider other potentially relevant adoption factors, and actively involve practitioners.

What carries the argument

A systematic mapping study that identified and analyzed 46 studies on automatic techniques for issue report classification.

If this is right

  • Techniques must incorporate explainability features to increase trust and adoption by development teams.
  • Evaluations should expand beyond accuracy to include scalability and performance on proprietary industrial datasets.
  • Future work should prioritize partnerships with practitioners to ensure relevance to real triage workflows.
  • Large language models may require domain-specific fine-tuning for effective issue classification in practice.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Teams might prefer building custom internal classifiers trained on their own historical issues rather than adopting published open-source solutions.
  • Integrating issue classification with related tasks like effort estimation or duplicate detection could yield more practical tools.
  • Metrics focused on time saved in triaging or reduction in misassigned issues would better demonstrate value than accuracy alone.
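The last point can be sketched in code. The example below reports accuracy alongside the number of misassigned issues and a rough net time estimate. The function `triage_report`, both per-issue time costs, and the ten labels are hypothetical placeholders chosen for illustration. They are not measurements from the paper.

```python
def triage_report(true_labels, predicted_labels,
                  minutes_per_manual_triage=5.0,
                  minutes_per_misassignment=30.0):
    """Summarize a classifier by accuracy and estimated net triage time saved.

    Both per-issue time costs are hypothetical placeholders, not measured values.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    total = len(true_labels)
    misassigned = sum(t != p for t, p in zip(true_labels, predicted_labels))
    correct = total - misassigned
    # Each correct prediction avoids one manual triage; each wrong one costs rework.
    net_minutes_saved = (correct * minutes_per_manual_triage
                         - misassigned * minutes_per_misassignment)
    return {
        "accuracy": correct / total if total else 0.0,
        "misassigned": misassigned,
        "net_minutes_saved": net_minutes_saved,
    }


# Hypothetical labels for ten issues, invented for illustration only.
truth = ["bug"] * 6 + ["enhancement"] * 4
predicted = ["bug"] * 5 + ["enhancement"] + ["enhancement"] * 3 + ["bug"]
report = triage_report(truth, predicted)
print(report)
```

Under these assumed costs, a classifier that is 80% accurate still loses 20 minutes overall, because each misassignment costs more than a manual triage saves. Accuracy alone would hide that.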

Load-bearing premise

The search and selection process captured a representative sample of all relevant studies on automatic issue report classification without significant omissions.

What would settle it

Discovery of several high-impact studies featuring practitioner involvement and industrial evaluations of issue classifiers that were missed by the mapping would undermine the reported gaps.

read the original abstract

Several studies have evaluated automatic techniques for classifying software issue reports to assist practitioners in effectively assigning relevant resources based on the type of issue. Currently, no comprehensive overview of this area has been published. A comprehensive overview will help identify future research directions and provide an extensive collection of potentially relevant existing solutions. This study aims to provide a comprehensive overview of the use of automatic techniques to classify issue reports. We conducted a systematic mapping study and identified 46 studies on the topic. The study results indicate that the existing literature applies various techniques for classifying issue reports, including traditional machine learning and deep learning-based techniques and more advanced large language models. Furthermore, we observe that these studies (a) lack the involvement of practitioners, (b) do not consider other potentially relevant adoption factors beyond prediction accuracy, such as the explainability, scalability, and generalizability of the techniques, and (c) mainly rely on archival data from open-source repositories only. Therefore, future research should focus on real industrial evaluations, consider other potentially relevant adoption factors, and actively involve practitioners.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript reports a systematic mapping study of automatic techniques for classifying software issue reports. It identifies 46 primary studies applying traditional machine learning, deep learning, and large language models, and synthesizes three main observations: limited practitioner involvement, narrow focus on prediction accuracy without considering explainability, scalability or generalizability, and predominant use of open-source archival data. The authors conclude that future work should prioritize industrial evaluations and active practitioner engagement.

Significance. If the sample of 46 studies is representative, the mapping provides a useful consolidation of the literature and correctly flags adoption-relevant gaps that have received insufficient attention. Such overviews can help steer the field toward more practical, industry-aligned research on issue report classification.

major comments (1)
  1. [Methodology] Methodology section: The manuscript states that a systematic mapping study was performed and 46 studies were identified, yet provides no explicit search strings, databases, time bounds, inclusion/exclusion criteria, quality assessment, or snowballing procedure. Because the central claims about field-wide gaps (practitioner involvement, adoption factors, data sources) rest on the representativeness of this sample, the absence of a reproducible protocol leaves the synthesis vulnerable to selection bias and prevents readers from verifying completeness.
minor comments (2)
  1. [Results] A table or figure summarizing the 46 studies by technique category, publication year, and data source would improve readability and allow quick assessment of the distribution of open-source versus industrial data.
  2. [Abstract] The abstract lists the three observations but does not quantify them (e.g., how many of the 46 studies involved practitioners). Adding brief counts or percentages would strengthen the summary.
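The counts both minor comments ask for could be produced from a data-extraction sheet along the following lines. This is a sketch only. The record fields and the four placeholder studies are assumptions and do not reproduce the paper's extraction form or its results.

```python
from collections import Counter

# Hypothetical extraction records; the values are placeholders, not data from the paper.
studies = [
    {"id": "S1", "technique": "traditional ML", "data_source": "open source", "practitioners": False},
    {"id": "S2", "technique": "deep learning", "data_source": "open source", "practitioners": False},
    {"id": "S3", "technique": "LLM", "data_source": "open source", "practitioners": False},
    {"id": "S4", "technique": "traditional ML", "data_source": "industrial", "practitioners": True},
]


def share(records, predicate):
    """Return (count, percentage) of records for which predicate is true."""
    count = sum(1 for r in records if predicate(r))
    pct = 100.0 * count / len(records) if records else 0.0
    return count, pct


by_technique = Counter(r["technique"] for r in studies)
by_source = Counter(r["data_source"] for r in studies)
with_practitioners = share(studies, lambda r: r["practitioners"])

for technique, n in sorted(by_technique.items()):
    print(f"{technique}: {n}")
print(f"practitioner involvement: {with_practitioners[0]} of {len(studies)} "
      f"({with_practitioners[1]:.0f}%)")
```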

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed and constructive feedback on our systematic mapping study. We agree that greater methodological transparency is essential for establishing the representativeness of the 46 primary studies and the validity of our synthesized observations regarding practitioner involvement, adoption factors, and data sources. We address the major comment below and will revise the manuscript accordingly.

read point-by-point responses
  1. Referee: [Methodology] Methodology section: The manuscript states that a systematic mapping study was performed and 46 studies were identified, yet provides no explicit search strings, databases, time bounds, inclusion/exclusion criteria, quality assessment, or snowballing procedure. Because the central claims about field-wide gaps (practitioner involvement, adoption factors, data sources) rest on the representativeness of this sample, the absence of a reproducible protocol leaves the synthesis vulnerable to selection bias and prevents readers from verifying completeness.

    Authors: We agree that the current manuscript does not provide sufficient explicit detail on the search and selection protocol, which is necessary to demonstrate reproducibility and mitigate concerns about selection bias. In the revised manuscript we will expand the Methodology section to include the complete search strings, the specific databases and repositories queried, the time bounds, the full inclusion and exclusion criteria, any quality assessment criteria applied, and the snowballing procedure (if used). These additions will allow readers to verify the completeness of the sample and will strengthen the foundation for our observations on the identified gaps in the literature.

    revision: yes

Circularity Check

0 steps flagged

Literature review aggregates external studies; no self-referential reduction

full rationale

The paper is a systematic mapping study that searches for and analyzes 46 independently published external works on automatic issue report classification. Its central claims (lack of practitioner involvement, focus on accuracy over explainability/scalability, reliance on open-source archival data) are summaries and gap identifications drawn from those external papers. No equations, fitted parameters, self-definitions, or self-citation chains appear in the provided text; the derivation does not reduce any result to the paper's own inputs by construction. The search protocol and selection criteria are methodological choices whose validity is external to the reported findings.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

This is a secondary study that depends on the completeness of prior primary studies and on standard systematic mapping methodology rather than introducing new fitted parameters or invented entities.

axioms (1)
  • domain assumption: Standard systematic mapping study guidelines (e.g., search strategy, inclusion criteria, data extraction) produce a representative overview of the field.
    Invoked implicitly when the authors conclude that the 46 studies reflect the state of the literature and its gaps.



