pith. machine review for the scientific record.

arxiv: 2605.03670 · v1 · submitted 2026-05-05 · 💻 cs.SE · cs.CY · cs.SI


Geographic Variation in Stack Overflow Code Quality: Evidence from a Cross-Regional Study of Coding Practices


Pith reviewed 2026-05-07 15:54 UTC · model grok-4.3

classification 💻 cs.SE · cs.CY · cs.SI

keywords stack overflow · code quality · geographic variation · static analysis · software reuse · socio-economic factors · united states · programming languages

The pith

U.S. states with wider internet access, higher income, and more equal wealth show fewer code quality violations in Stack Overflow snippets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines code snippets posted in Stack Overflow answers from contributors across U.S. states in five programming languages. It measures violations in readability, reliability, performance, and security using automated tools, then compares violation rates against state-level measures of device access, internet use, income, and wealth distribution. The central finding is that areas with stronger digital infrastructure and more balanced economic conditions produce code with lower violation densities. If this pattern holds, it means that code shared in public developer forums carries traces of regional differences in technology access and training. Developers who reuse such snippets may encounter more basic errors from some regions and more intricate ones from others.

Core claim

States with broader access to computing devices, Internet subscriptions, higher income, and more equitable wealth distribution tend to show fewer code quality violations. Readability violations dominate across languages, while major technology hubs yield more parsable snippets without necessarily higher violation densities. Qualitative review of snippets from California, Utah, and North Dakota indicates that established technology regions produce more complex violations, whereas less mature regions display more fundamental errors.

What carries the argument

Violation density, computed as the count of issues flagged by language-specific static analysis tools divided by snippet size, then correlated with state-level socio-economic indicators.
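The measurement step described above can be sketched in a few lines. This is our illustration with invented numbers, not the paper's code or data; the per-state aggregates and income figures are hypothetical, and ties in ranks are ignored for simplicity.

```python
import numpy as np

# Hypothetical per-state aggregates (illustrative, not the paper's data):
# linter-flagged violations and total lines of code attributed to each state.
violations = np.array([120, 95, 210, 340])
loc = np.array([1500, 1100, 1800, 2000])
median_income = np.array([78, 74, 62, 55])  # $k, invented for the sketch

def violation_density(violations, loc):
    """Linter-flagged issues normalized by snippet size (here: lines of code)."""
    return np.asarray(violations) / np.asarray(loc)

def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of ranks
    (no tie handling; adequate for distinct illustrative values)."""
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return np.corrcoef(rx, ry)[0, 1]

density = violation_density(violations, loc)
rho = spearman(median_income, density)  # negative rho: richer states, fewer violations
```

The normalization is what makes small and large contributor pools comparable; raw violation counts would simply track how much code a state posts.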

If this is right

  • Readability issues such as improper whitespace and inconsistent formatting are the most common violations in all five languages studied.
  • Major technology hubs produce snippets that are more often parsable by analysis tools but do not necessarily exhibit higher violation densities.
  • Established technology regions tend to generate more complex violations while less mature regions generate more basic, fundamental errors.
  • Developers reusing Stack Overflow code should treat snippets as carrying regional socio-technical patterns rather than uniform quality.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Improving digital infrastructure and income equity in a region could gradually raise the baseline quality of code shared on public forums by that region's contributors.
  • Similar geographic patterns may appear on other code-sharing platforms or in other countries, offering a way to test whether the link is specific to Stack Overflow or more general.
  • Education or training programs focused on code style and security in lower-access regions could be evaluated for their effect on the types of violations observed.

Load-bearing premise

That the geographic location of Stack Overflow contributors can be reliably inferred from profile information or other metadata, and that the sampled snippets represent typical coding practices in those regions.

What would settle it

A state with high device access, internet subscriptions, and income but significantly higher violation density than low-access states, or the reverse pattern, after accounting for language and snippet length.

read the original abstract

Developers frequently reuse Stack Overflow code snippets, yet the quality of these snippets remains unevenly understood, particularly across programming languages and geographic contexts. This study investigates code quality in Stack Overflow answers from contributors located in the United States, focusing on SQL, JavaScript, Python, Ruby, and Java snippets. We evaluate four quality dimensions: reliability, readability, performance, and security. Using language-specific linting and static analysis tools, we quantify violations across states and cities, compute violation densities to enable fair regional comparison, and examine relationships between code quality and state-level diversity indicators. We further conduct inductive content analysis on code snippets from California, Utah, and North Dakota to identify qualitative patterns in code quality violations. Results show that readability violations are the most prevalent across all languages, followed by reliability, performance, and security. Common issues include improper whitespace, inconsistent formatting, program-flow errors, inefficient resource use, unsanitised inputs, and insecure dynamic evaluation. Regional analysis indicates that major technology hubs produce more parsable snippets but do not necessarily exhibit higher violation densities. States with broader access to computing devices, Internet subscriptions, higher income, and more equitable wealth distribution tend to show fewer code quality violations. Qualitative findings suggest that established technology regions often produce more complex violations, while less mature technology regions display more fundamental errors. These findings highlight the socio-technical nature of code quality in community question-answering platforms and suggest that developers should exercise caution when reusing online code snippets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript analyzes code quality in Stack Overflow answers from US contributors across SQL, JavaScript, Python, Ruby, and Java. It applies language-specific linting and static analysis tools to quantify violations along reliability, readability, performance, and security dimensions, normalizes them into violation densities for cross-regional comparison, correlates densities with state-level socio-economic indicators (device access, internet subscriptions, income, wealth equity), and performs inductive qualitative analysis on snippets from California, Utah, and North Dakota. The central claims are that readability violations predominate, major tech hubs produce more parsable snippets without necessarily lower densities, and states with stronger socio-economic indicators exhibit fewer violations overall.

Significance. If the geographic attributions and sample representativeness can be substantiated, the work would offer concrete evidence of socio-technical influences on code quality in community question-answering platforms. The mixed-methods design, multi-language scope, and use of density normalization to enable fair regional comparisons are positive features that could inform both developer practices around snippet reuse and broader research on infrastructure effects in software engineering.

major comments (3)
  1. [Data Collection / Geographic Attribution] The geographic attribution of contributors relies on self-reported profile or metadata locations without any described validation against external ground truth (e.g., GitHub, LinkedIn, or IP-based checks). Because many profiles are incomplete, stale, or absent, this directly undermines the state- and city-level correlations with socio-economic indicators reported in the results.
  2. [Sampling and Data Filtering] The sample is conditioned on answers that contain code parsable by the chosen linting tools. This selection criterion likely correlates with the very socio-economic variables under study (e.g., education, infrastructure, experience), introducing bias that prevents interpreting the violation-density differences as evidence of regional coding-practice variation.
  3. [Results / Regional Analysis] The correlation analysis between violation densities and state-level indicators (device access, income, wealth distribution) does not appear to include multivariate controls, robustness checks for multicollinearity, or adjustments for multiple testing. Without these, the reported relationships cannot be confidently attributed to the claimed socio-economic factors.
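The selection step that major comment 2 objects to is mechanically simple, which is part of why its bias is easy to overlook. A minimal sketch (our illustration in Python using the standard-library parser, not the authors' actual pipeline): everything the gate drops is invisible to every downstream density comparison.

```python
import ast

def is_parsable(snippet: str) -> bool:
    """True if the snippet is syntactically valid Python — the kind of gate
    an automated pipeline must apply before it can count violations at all."""
    try:
        ast.parse(snippet)
        return True
    except SyntaxError:
        return False

snippets = [
    "def add(a, b):\n    return a + b",  # valid: survives the filter
    "def broken(:\n    pass",            # syntax error: silently dropped
]
parsable = [s for s in snippets if is_parsable(s)]
```

If the dropped fraction varies systematically by region, as the referee suggests, the surviving sample is no longer a neutral window onto regional practice.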
minor comments (2)
  1. [Abstract and Results] The abstract claims that 'major technology hubs produce more parsable snippets but do not necessarily exhibit higher violation densities,' yet no table or figure directly quantifies parsable-snippet rates by state or hub; adding this would clarify the distinction.
  2. [Methods] The definition and exact formula for 'violation density' (e.g., violations per line, per token, or per snippet) should be stated explicitly in the methods to allow replication and to justify why it enables fair regional comparison.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive review. The comments highlight important methodological considerations, and we address each major point below with explanations and proposed revisions to strengthen the manuscript.

read point-by-point responses
  1. Referee: The geographic attribution of contributors relies on self-reported profile or metadata locations without any described validation against external ground truth (e.g., GitHub, LinkedIn, or IP-based checks). Because many profiles are incomplete, stale, or absent, this directly undermines the state- and city-level correlations with socio-economic indicators reported in the results.

    Authors: We acknowledge that geographic attribution is based on self-reported Stack Overflow profile locations, a standard practice in platform-based studies where verified location data is unavailable and privacy constraints apply. No external validation (e.g., via GitHub or IP) was performed because such cross-platform linkage is not feasible at scale without user consent and raises ethical issues. To address the concern, we have added an expanded Limitations section explicitly discussing potential inaccuracies in self-reported data, their possible impact on correlations, and why this approach remains the most practical for large-scale SO analyses. We believe this does not invalidate the exploratory findings but qualifies their interpretation. revision: partial

  2. Referee: The sample is conditioned on answers that contain code parsable by the chosen linting tools. This selection criterion likely correlates with the very socio-economic variables under study (e.g., education, infrastructure, experience), introducing bias that prevents interpreting the violation-density differences as evidence of regional coding-practice variation.

    Authors: We agree this introduces a potential selection bias, as parsability may proxy for developer skill or regional infrastructure. The filtering step is required to enable automated violation detection and density normalization across languages; without it, quantitative comparison would not be possible. In revision, we have added a dedicated subsection on sampling bias, including a sensitivity analysis on a random subset of non-parsable snippets (via manual review) to assess whether patterns hold directionally. We maintain that the results reflect meaningful differences among analyzable contributions, which constitute the relevant population for code-reuse concerns, while transparently noting the limitation. revision: partial

  3. Referee: The correlation analysis between violation densities and state-level indicators (device access, income, wealth distribution) does not appear to include multivariate controls, robustness checks for multicollinearity, or adjustments for multiple testing. Without these, the reported relationships cannot be confidently attributed to the claimed socio-economic factors.

    Authors: We appreciate this observation on statistical rigor. Our original analysis relied on bivariate Spearman correlations for exploratory purposes across multiple indicators. In the revised manuscript, we now include multivariate linear regression models with controls for population size, number of contributors, and other available state-level covariates; we report variance inflation factors to check multicollinearity and apply Bonferroni correction for multiple comparisons. These additions allow stronger attribution while preserving the original exploratory correlations for context. revision: yes
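The three statistical ingredients the authors promise in response 3 can be sketched with numpy alone. This is our illustration of the standard techniques, not the paper's code: OLS with controls, variance inflation factors for multicollinearity, and Bonferroni adjustment for multiple comparisons.

```python
import numpy as np

def ols_fit(X, y):
    """Least-squares coefficients [intercept, slopes...] for y ~ X."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return beta

def r_squared(X, y):
    """Coefficient of determination for y ~ X (with intercept)."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()

def vif(X):
    """Variance inflation factor per column: 1/(1 - R^2) from regressing
    that predictor on the rest; values near 1 indicate little collinearity."""
    return np.array([1.0 / (1.0 - r_squared(np.delete(X, j, axis=1), X[:, j]))
                     for j in range(X.shape[1])])

def bonferroni(p_values):
    """Bonferroni-adjusted p-values for len(p_values) tests, capped at 1."""
    p = np.asarray(p_values, dtype=float)
    return np.minimum(p * len(p), 1.0)
```

State-level socio-economic indicators (income, device access, internet subscriptions) are typically strongly intercorrelated, so the VIF check is not optional here; large values would mean the individual coefficients cannot be attributed to any one indicator.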

Circularity Check

0 steps flagged

No significant circularity in the derivation chain.

full rationale

The paper conducts an observational study: it applies external language-specific linting and static analysis tools to quantify code quality violations in Stack Overflow snippets, computes violation densities for regional comparison, and correlates these with publicly available state-level socio-economic indicators. No equations, fitted parameters, or derivations are described that reduce the reported relationships to self-defined inputs or predictions by construction. Location attribution from profiles is treated as an input assumption rather than a derived result. The analysis does not rely on self-citation chains, uniqueness theorems from prior author work, or renaming of known patterns as new findings. The central claims are grounded in external benchmarks (linting tools and census-style indicators) and remain independent of the paper's own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard assumptions of static analysis validity and accurate geolocation inference; no free parameters, invented entities, or ad-hoc axioms are introduced in the abstract.

axioms (2)
  • domain assumption Language-specific linting and static analysis tools provide valid and comparable measures of reliability, readability, performance, and security violations.
    The study quantifies violations using these tools without independent validation of their accuracy for the sampled snippets.
  • domain assumption Contributor locations can be accurately mapped to US states and cities from available metadata.
    Regional comparisons depend on this mapping being reliable.

pith-pipeline@v0.9.0 · 5579 in / 1407 out tokens · 71636 ms · 2026-05-07T15:54:14.387577+00:00 · methodology


Reference graph

Works this paper leans on

185 extracted references · 153 canonical work pages

  1. [1]

    Riemer, K., Schellhammer, S., & Meinert, M. (2018). Collaboration in the Digital Age: How Technology Enables Individuals, Teams and Businesses (K. Riemer, S. Schellhammer, & M. Meinert Eds.). Cham: Springer. doi:10.1007/978-3-319-94487-6

  2. [2]

    T., & Shah, C

    Le, L. T., & Shah, C. (2018). Retrieving people: Identifying potential answerers in Community Question- Answering. Journal of the Association for Information Science and Technology, 69 (10), 1246 -1258. doi:10.1002/asi.24042

  3. [3]

    W., & Hassan, A

    Barua, A., Thomas, S. W., & Hassan, A. E. (2014). What are developers talking about? An analysis of topics and trends in Stack Overflow. Empirical Software Engineering, 19 (3), 619 -654. doi:10.1007/s10664-012-9231-y

  4. [4]

    Ahmad, A., Feng, C., Ge, S., & Yousif, A. (2018). A survey on mining Stack Overflow: question and answering (Q&A) community. Data Technologies and Applications, 52(2), 190-247. doi:10.1108/DTA- 07-2017-0054

  5. [5]

    (2013, 18 -19 May 2013)

    Asaduzzaman, M., Mashiyat, A. S., Roy, C. K., & Schneider, K. A. (2013, 18-19 May 2013). Answering questions about unanswered questions of Stack Overflow. Paper presented at the 2013 10th Working Conference on Mining Software Repositories (MSR). doi:10.1109/MSR.2013.6624015

  6. [6]

    Trienes, J., & Balog, K. (2019). Identifying Unclear Questions in Community Question Answering Websites, Cham. doi:10.1007/978-3-030-15712-8_18

  7. [7]

    (2014, 29 Sept

    Ponzanelli, L., Mocci, A., Bacchelli, A., Lanza, M., & Fullerton, D. (2014, 29 Sept. -3 Oct. 2014). Improving Low Quality Stack Overflow Post Detection. Paper presented at the 2014 IEEE International Conference on Software Maintenance and Evolution. doi:10.1109/ICSME.2014.90

  8. [8]

    (2013, 18 -19 May 2013)

    Subramanian, S., & Holmes, R. (2013, 18 -19 May 2013). Making Sense of Online Code Snippets. Paper presented at the 2013 10th Working Conference on Mining Software Repositories (MSR). doi:10.1109/MSR.2013.6624012

  9. [9]

    C., Krinke, J., Oznacar, G., & White, R

    Bafatakis, N., Boecker, N., Boon, W., Salazar, M. C., Krinke, J., Oznacar, G., & White, R. (2019, 25 -31 May 2019). Python Coding Style Compliance on Stack Overflow. Paper presented at the 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR). doi:10.1109/MSR.2019.00042

  10. [10]

    F., Smethurst, G., Moraes, J

    Campos, U. F., Smethurst, G., Moraes, J. P., Bonifácio, R., & Pinto, G. (2019, 25-31 May 2019). Mining Rule Violations in JavaScript Code Snippets. Paper presented at the 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR). doi:10.1109/MSR.2019.00039

  11. [11]

    A., Owen, C

    Meldrum, S., Licorish, S. A., Owen, C. A., & Savarimuthu, B. T. R. (2020). Understanding stack overflow code quality: A recommendation of caution. Science of Computer Programming, 199 , 102516. doi:10.1016/j.scico.2020.102516

  12. [12]

    A., & Stanger, N

    Zolduoarrati, E., Licorish, S. A., & Stanger, N. (2024). Harmonising Contributions: Exploring Diversity in Software Engineering through CQA Mining on Stack Overflow. ACM Transactions on Software Engineering and Methodology. doi:10.1145/3672453

  13. [13]

    Anonymised. (2024). [Reserved for Contributions Metric work]. [Reserved for Contributions Metric work]. doi:10.1145/RESERVED

  14. [14]

    Tickoo, A., Chauhan, S., & Gupta, G. R. (2022). Friendliness Of Stack Overflow Towards Newbies. arXiv preprint arXiv:2208.10488. doi:10.48550/arXiv.2208.10488

  15. [15]

    (2017, 22 -26 May 2017)

    Fischer, F., Böttinger, K., Xiao, H., Stransky, C., Acar, Y., Backes, M., & Fahl, S. (2017, 22 -26 May 2017). Stack Overflow Considered Harmful? The Impact of Copy&Paste on Android Application Security. Paper presented at the 2017 IEEE Symposium on Security and Privacy (SP). doi:10.1109/SP.2017.31 48

  16. [16]

    (2017, 20 -24 Feb

    An, L., Mlouki, O., Khomh, F., & Antoniol, G. (2017, 20 -24 Feb. 2017). Stack Overflow: A code laundering platform? Paper presented at the 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER). doi:10.1109/SANER.2017.7884629

  17. [17]

    Atwood, J. (2008). None of Us is as Dumb as All of Us. Retrieved from https://blog.codinghorror.com/stack-overflow-none-of-us-is-as-dumb-as-all-of-us/

  18. [18]

    Yeh, A. (2022). Teaching English as a Foreign Language in the Philippines. In Philippine English (pp. 353-362): Routledge

  19. [19]

    Lutz, B. (2009). Linguistic challenges in global software development: lessons learned in an international SW development division. Paper presented at the 2009 Fourth IEEE International Conference on Global Software Engineering. doi:10.1109/ICGSE.2009.33

  20. [20]

    Haner, J., & Garcia, D. (2019). The artificial intelligence arms race: Trends and world leaders in autonomous weapons development. Global Policy, 10(3), 331-337. doi:10.1111/1758-5899.12713

  21. [21]

    Broughel, J., & Thierer, A. D. (2019). Technological innovation and economic growth: A brief report on the evidence. Mercatus Research Paper. doi:10.2139/ssrn.3346495

  22. [22]

    Zolduoarrati, E., & Licorish, S. A. (2021). On the value of encouraging gender tolerance and inclusiveness in software engineering communities. Information and Software Technology, 139, 106667. doi:10.1016/j.infsof.2021.106667

  23. [23]

    Ford, D., Harkins, A., & Parnin, C. (2017). Someone like me: How does peer parity influence participation of women on stack overflow? Paper presented at the 2017 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC). doi:10.1109/VLHCC.2017.8103473

  24. [24]

    A., & Abboud, A

    Abdulkareem, S. A., & Abboud, A. J. (2021). Evaluating Python, C++, JavaScript and Java Programming Languages Based on Software Complexity Calculator (Halstead Metrics). IOP Conference Series: Materials Science and Engineering, 1076(1), 012046. doi:10.1088/1757-899X/1076/1/012046

  25. [25]

    (2004, 8-9 Sept

    Brass, S., & Goldberg, C. (2004, 8-9 Sept. 2004). Semantic errors in SQL queries: a quite complete list. Paper presented at the Fourth International Conference onQuality Software, 2004. QSIC 2004. Proceedings. doi:10.1109/QSIC.2004.1357967

  26. [26]

    N., Kou, B., & Zhang, T

    Kabir, S., Udo -Imeh, D. N., Kou, B., & Zhang, T. (2024). Is Stack Overflow Obsolete? An Empirical Study of the Characteristics of ChatGPT Answers to Stack Overflow Questions . Paper presented at the Proceedings of the CHI Conference on Human Factors in Computing Systems. doi:10.1145/3613904.3642596

  27. [27]

    Anonymised. (2024). Does Location Influence Coding Practices? A Cross -Regional Study on Stack Overflow Code Quality [Data set]. doi:10.5281/zenodo.13622420

  28. [28]

    O., & Gustavsson, T

    Ahmad, M. O., & Gustavsson, T. (2024). The Pandora's box of social, process, and people debts in software engineering. Journal of Software: Evolution and Process, 36(2), e2516. doi:10.1002/smr.2516

  29. [29]

    -D., & Xie, X

    Luan, K., Ling, C. -D., & Xie, X. -Y. (2016). The nonlinear effects of educational diversity on team creativity. Asia Pacific Journal of Human Resources, 54(4), 465-480. doi:10.1111/1744-7941.12078

  30. [30]

    Olla, P., & Atkinson, C. (2004). Developing a wireless reference model for interpreting complexity in wireless projects. Industrial Management & Data Systems, 104 (3), 262 -272. doi:10.1108/02635570410525807

  31. [31]

    Kankanhalli, A., Tan, B. C. Y., & Wei, K.-K. (2006). Conflict and Performance in Global Virtual Teams. Journal of Management Information Systems, 23(3), 237-274. doi:10.2753/MIS0742-1222230309

  32. [32]

    Morin, J., & Ghosh, K. (2021). Linguistic Analysis of Stack Overflow Data: Native English vs Non-native English Speakers. Paper presented at the ECML PKDD 2021. Communications in Computer and Information Science, Cham. doi:10.1007/978-3-030-93733-1_9

  33. [33]

    Lai, J., & Widmar, N. O. (2021). Revisiting the Digital Divide in the COVID -19 Era. Applied Economic Perspectives and Policy, 43(1), 458-464. doi:10.1002/aepp.13104

  34. [34]

    A., & Stanger, N

    Zolduoarrati, E., Licorish, S. A., & Stanger, N. (2023). Secondary studies on human aspects in software engineering: A tertiary study. Journal of Systems and Software, 200 , 111654. doi:10.1016/j.jss.2023.111654

  35. [35]

    Blincoe, K., Springer, O., & Wrobel, M. R. (2019). Perceptions of Gender Diversity's Impact on Mood in Software Development Teams. IEEE Software, 36(5), 51-56. doi:10.1109/MS.2019.2917428

  36. [36]

    Hidellaarachchi, D., Grundy, J., Hoda, R., & Madampe, K. (2022). The Effects of Human Aspects on the Requirements Engineering Process: A Systematic Literature Review. IEEE Transactions on Software Engineering, 48(6), 2105-2127. doi:10.1109/TSE.2021.3051898

  37. [37]

    Rodríguez-Pérez, G., Nadri, R., & Nagappan, M. (2021). Perceived diversity in software engineering: a systematic literature review. Empirical Software Engineering, 26 (5), 102. doi:10.1007/s10664 -021- 09992-2

  38. [38]

    E., Seppänen, P., & Kuvaja, P

    Rodríguez, P., Mäntylä, M., Oivo, M., Lwakatare, L. E., Seppänen, P., & Kuvaja, P. (2019). Chapter Four - Advances in Using Agile and Lean Processes for Software Development. In A. M. Memon (Ed.), Advances in Computers (Vol. 113, pp. 135-224): Elsevier. doi:10.1016/bs.adcom.2018.03.014

  39. [39]

    Chatley, R., Donaldson, A., & Mycroft, A. (2019). The Next 7000 Programming Languages. In B. Steffen & G. Woeginger (Eds.), Computing and Software Science: State of the Art and Perspectives (pp. 250- 282). Cham: Springer International Publishing. doi:10.1007/978-3-319-91908-9_15 49

  40. [40]

    C., Kuzak, M., Alhamdoosh, M., Barker, M., Batut, B., Borg, M.,

    Jiménez, R. C., Kuzak, M., Alhamdoosh, M., Barker, M., Batut, B., Borg, M., . . . Crouch, S. (2017). Four simple recommendations to encourage best practices in research software. F1000Res, 6 . doi:10.12688/f1000research.11407.1

  41. [41]

    Haoues, M., Sellami, A., Ben -Abdallah, H., & Cheikhi, L. (2017). A guideline for software architecture selection based on ISO 25010 quality related characteristics. International Journal of System Assurance Engineering and Management, 8(2), 886-909. doi:10.1007/s13198-016-0546-8

  42. [42]

    (2005, 17 -18 Nov

    Al-Kilidar, H., Cox, K., & Kitchenham, B. (2005, 17 -18 Nov. 2005). The use and usefulness of the ISO/IEC 9126 quality standard. Paper presented at the 2005 International Symposium on Empirical Software Engineering, 2005. doi:10.1109/ISESE.2005.1541821

  43. [43]

    (2017, 9 -10 Nov

    Izurieta, C., Griffith, I., & Huvaere, C. (2017, 9 -10 Nov. 2017). An Industry Perspective to Comparing the SQALE and Quamoco Software Quality Models. Paper presented at the 2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). doi:10.1109/ESEM.2017.42

  44. [44]

    Nugroho, A., Visser, J., & Kuipers, T. (2011). An empirical model of technical debt and interest . Paper presented at the Proceedings of the 2nd Workshop on Managing Technical Debt, Waikiki, Honolulu, HI, USA. doi:10.1145/1985362.1985364

  45. [45]

    d., Rastogi, A., Bruntink, M., & Deursen, A

    Biase, M. d., Rastogi, A., Bruntink, M., & Deursen, A. v. (2019, 26 -26 May 2019). The Delta Maintainability Model: Measuring Maintainability of Fine -Grained Code Changes. Paper presented at the 2019 IEEE/ACM International Conference on Technical Debt (TechDebt). doi:10.1109/TechDebt.2019.00030

  46. [46]

    Scalabrino, S., Linares-Vásquez, M., Oliveto, R., & Poshyvanyk, D. (2018). A comprehensive model for code readability. Journal of Software: Evolution and Process, 30(6), e1958. doi:10.1002/smr.1958

  47. [47]

    Buse, R. P. L., & Weimer, W. R. (2010). Learning a Metric for Code Readability. IEEE Transactions on Software Engineering, 36(4), 546-558. doi:10.1109/TSE.2009.70

  48. [48]

    W., & Ouni, A

    AlOmar, E., Mkaouer, M. W., & Ouni, A. (2019, 28 -28 May 2019). Can Refactoring Be Self-Affirmed? An Exploratory Study on How Developers Document Their Refactoring Activities in Commit Messages. Paper presented at the 2019 IEEE/ACM 3rd International Workshop on Refactoring (IWoR). doi:10.1109/IWoR.2019.00017

  49. [49]

    A., Newman, C

    Peruma, A., Simmons, S., AlOmar, E. A., Newman, C. D., Mkaouer, M. W., & Ouni, A. (2021). How do I refactor this? An empirical study on refactoring trends and topics in Stack Overflow. Empirical Software Engineering, 27(1), 11. doi:10.1007/s10664-021-10045-x

  50. [50]

    H., & Hassan, A

    Zhang, H., Wang, S., Li, H., Chen, T. H., & Hassan, A. E. (2022). A Study of C/C++ Code Weaknesses on Stack Overflow. IEEE Transactions on Software Engineering, 48 (7), 2359 -2375. doi:10.1109/TSE.2021.3058985

  51. [51]

    Selvaraj, M., & Uddin, G. (2022). Does Collaborative Editing Help Mitigate Security Vulnerabilities in Crowd-Shared IoT Code Examples? Paper presented at the Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement, Helsinki, Finland. doi:10.1145/3544902.3546235

  52. [52]

    Firouzi, E., Sami, A., Khomh, F., & Uddin, G. (2020). On the use of C# Unsafe Code Context: An Empirical Study of Stack Overflow . Paper presented at the Proceedings of the 14th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), Bari, Italy. doi:10.1145/3382494.3422165

  53. [53]

    Treude, C., & Robillard, M. P. (2017, 17-22 Sept. 2017). Understanding Stack Overflow Code Fragments. Paper presented at the 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME). doi:10.1109/ICSME.2017.24

  54. [54]

    Van Rossum, G., Warsaw, B., & Coghlan, N. (2001). PEP 8 – Style Guide for Python Code. In Python.org (Vol. 1565, pp. 28)

  55. [55]

    Ahmad, M., & Cinnéide, M. Ó. (2019, 25 -31 May 2019). Impact of Stack Overflow Code Snippets on Software Cohesion: A Preliminary Study. Paper presented at the 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR). doi:10.1109/MSR.2019.00050

  56. [56]

    Counsell, S., Swift, S., & Crampton, J. (2006). The Interpretation and Utility of Three Cohesion Metrics for Object -Oriented Design. ACM Trans. Softw. Eng. Methodol., 15 (2), 123 –149. doi:10.1145/1131421.1131422

  57. [57]

    Zerouali, A., Mens, T., & De Roover, C. (2021). On the usage of JavaScript, Python and Ruby packages in Docker Hub images. Science of Computer Programming, 207 , 102653. doi:10.1016/j.scico.2021.102653

  58. [58]

    Geremia, S., Bavota, G., Oliveto, R., Lanza, M., & Penta, M. D. (2019, 30 Sept. -1 Oct. 2019). Characterizing Leveraged Stack Overflow Posts. Paper presented at the 2019 19th International Working Conference on Source Code Analysis and Manipulation (SCAM). doi:10.1109/SCAM.2019.00025

  59. [59]

    Ebert, C., Cain, J., Antoniol, G., Counsell, S., & Laplante, P. (2016). Cyclomatic Complexity. IEEE Software, 33(6), 27-29. doi:10.1109/MS.2016.147

  60. [60]

    (2018, 23 -29 Sept

    Pantiuchina, J., Lanza, M., & Bavota, G. (2018, 23 -29 Sept. 2018). Improving Code: The (Mis) Perception of Quality Metrics. Paper presented at the 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME). doi:10.1109/ICSME.2018.00017 50

  61. [61]

    A., Zumel, L

    González, C. A., Zumel, L. A., Acero, J. A., Lenarduzzi, V., Martínez-Fernández, S., & Rodríguez, S. R. (2021, 6-10 Dec. 2021). A preliminary investigation of developer profiles based on their activities and code quality: Who does what? Paper presented at the 2021 IEEE 21st International Conference on Software Quality, Reliability and Security (QRS). doi:...

  62. [62]

    T., Bandara, A., Levine, M., Nuseibeh, B., & Sharp, H

    Lopez, T., Tun, T. T., Bandara, A., Levine, M., Nuseibeh, B., & Sharp, H. (2018). An investigation of security conversations in stack overflow: perceptions of security and community involvement . Paper presented at the Proceedings of the 1st International Workshop on Security Awareness from Design to Deployment, Gothenburg, Sweden. doi:10.1145/3194707.3194713

  63. [63]

    Jones, C., & Bonsignour, O. (2011). The Economics of Software Quality. Boston, MA: Addison-Wesley Professional

  64. [64]

    (2009, 16 -17 May 2009)

    Boogerd, C., & Moonen, L. (2009, 16 -17 May 2009). Evaluating the relation between coding standard violations and faultswithin and across software versions. Paper presented at the 2009 6th IEEE International Working Conference on Mining Software Repositories. doi:10.1109/MSR.2009.5069479

  65. [65]

    Santos, J. A. M., Rocha-Junior, J. B., Prates, L. C. L., Nascimento, R. S. d., Freitas, M. F., & Mendonça, M. G. d. (2018). A systematic review on the code smell effect. Journal of Systems and Software, 144 , 450-477. doi:10.1016/j.jss.2018.07.035

  66. [66]

    Mitra, D., Arora, M., Rakhra, M., Kumar, C. R., Reddy, M. L., Reddy, S. P. K., . . . Shabaz, M. (2021). A Hybrid Framework to Control Software Architecture Erosion for Addressing Maintenance Issues. Annals of the Romanian Society for Cell Biology, 2974–2989. Retrieved from http://www.annalsofrscb.ro/index.php/journal/article/view/2839

  67. [67]

    Bhatia, S., & Malhotra, J. (2014). A survey on impact of lines of code on software complexity. Paper presented at the 2014 International Conference on Advances in Engineering & Technology Research (ICAETR - 2014). doi:10.1109/ICAETR.2014.7012875

  68. [68]

    Bartish, A., & Thevathayan, C. (2002). BDI Agents for Game Development . Paper presented at the Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 2, Bologna, Italy. doi:10.1145/544862.544901

  69. [69]

    Yang, D., Martins, P., Saini, V., & Lopes, C. (2017, 20-21 May 2017). Stack Overflow in GitHub: Any Snippets There? Paper presented at the 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR). doi:10.1109/MSR.2017.13

  70. [70]

    Abdalkareem, R., Shihab, E., & Rilling, J. (2017). On code reuse from StackOverflow: An exploratory study on Android apps. Information and Software Technology, 88, 148-158. doi:10.1016/j.infsof.2017.04.005

  71. [71]

    Ndukwe, I. G., Licorish, S. A., Tahir, A., & MacDonell, S. G. (2023). How have views on Software Quality differed over time? Research and practice viewpoints. Journal of Systems and Software, 195, 111524. doi:10.1016/j.jss.2022.111524

  72. [72]

    Sedano, T. (2016, 5-6 April 2016). Code Readability Testing, an Empirical Study. Paper presented at the 2016 IEEE 29th International Conference on Software Engineering Education and Training (CSEET). doi:10.1109/CSEET.2016.36

  73. [73]

    Bhat, T., & Nagappan, N. (2006). Building Scalable Failure-proneness Models Using Complexity Metrics for Large Scale Software Systems. Paper presented at the 2006 13th Asia Pacific Software Engineering Conference (APSEC'06). doi:10.1109/APSEC.2006.25

  74. [74]

    Ryoo, J., Kazman, R., & Anand, P. (2015). Architectural Analysis for Security. IEEE Security & Privacy, 13(6), 52-59. doi:10.1109/MSP.2015.126

  75. [75]

    Chan, J. C.-W., & Paelinckx, D. (2008). Evaluation of Random Forest and Adaboost tree-based ensemble classification and spectral band selection for ecotope mapping using airborne hyperspectral imagery. Remote Sensing of Environment, 112(6), 2999-3011. doi:10.1016/j.rse.2008.02.011

  76. [76]

    Warren, P., Gaskell, C., & Boldyreff, C. (2001, 10 Nov. 2001). Preparing the ground for Website metrics research. Paper presented at the Proceedings 3rd International Workshop on Web Site Evolution. WSE 2001. doi:10.1109/WSE.2001.988789

  77. [77]

    Fritchey, G. (2018). SQL Query Performance Tuning. In G. Fritchey (Ed.), SQL Server 2017 Query Performance Tuning: Troubleshoot and Optimize Query Performance (pp. 1-22). Berkeley, CA: Apress. doi:10.1007/978-1-4842-3888-2_1

  78. [78]

    Beasley, R. E. (2020). Database Design, SQL, and Data Binding. In R. E. Beasley (Ed.), Essential ASP.NET Web Forms Development: Full Stack Programming with C#, SQL, Ajax, and JavaScript (pp. 359-394). Berkeley, CA: Apress. doi:10.1007/978-1-4842-5784-5_20

  79. [79]

    Cherfi, A., Nouira, K., & Ferchichi, A. (2018). Very Fast C4.5 Decision Tree Algorithm. Applied Artificial Intelligence, 32(2), 119-137. doi:10.1080/08839514.2018.1447479

  80. [80]

    Palli, A. S., Jaafar, J., Hashmani, M. A., Gomes, H. M., & Gilal, A. R. (2022). A Hybrid Sampling Approach for Imbalanced Binary and Multi-Class Data Using Clustering Analysis. IEEE Access, 10, 118639-118653. doi:10.1109/ACCESS.2022.3218463
