pith. machine review for the scientific record.

arxiv: 2605.11844 · v1 · submitted 2026-05-12 · 💻 cs.SE

The Death Spiral of Open Source Projects: A Post-Mortem Analysis of Pull Request Workflow Dynamics

Kuljit Kaur Chahal, Mohit Kaushik

Pith reviewed 2026-05-13 05:43 UTC · model grok-4.3

classification 💻 cs.SE
keywords: open source software, pull requests, project mortality, github, workflow dynamics, death spiral, survival analysis

The pith

Open source projects survive based on value and popularity rather than pull request workflow efficiency.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines pull request workflows across 1,736 inactive GitHub repositories and 1.3 million PRs to identify what drives open source project failure. Comparative analysis shows that workflow friction, extended reviews, and negativity occur widely on the platform, in active and inactive projects alike. Evolutionary tracking uncovers a death spiral of falling innovation, expanding backlogs, and longer merge times that ends in disengagement, though labeling stays steady and toxicity does not rise. Explanatory models then establish that lifespan depends on inherent value and ecosystem factors, with popularity and innovation as positive predictors, while friction and rejections grow as byproducts of age. The work reframes OSS mortality as a socio-technical issue dominated by value and abandonment over workflow discipline.
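The evolutionary tracking described above can be illustrated with a minimal sketch. It assumes each project's PR history is split into four equal time quartiles between its first and last PR, and that merge time is measured in days; the paper's exact quartile definition is not given here, and the function names and toy data are invented for illustration.

```python
from datetime import datetime
from statistics import median

def quartile_index(created, first, last):
    """Return 0..3: the quarter of the project's PR history that `created` falls in."""
    span = (last - first).total_seconds()
    if span <= 0:
        return 0
    frac = (created - first).total_seconds() / span
    return min(int(frac * 4), 3)  # the final instant belongs to the last quartile

def median_merge_days_by_quartile(prs):
    """prs: list of (created, merged_or_None) datetime pairs for one project."""
    first = min(created for created, _ in prs)
    last = max(created for created, _ in prs)
    buckets = [[] for _ in range(4)]
    for created, merged in prs:
        if merged is None:
            continue  # unmerged PRs have no merge time
        days = (merged - created).total_seconds() / 86400
        buckets[quartile_index(created, first, last)].append(days)
    return [median(b) if b else None for b in buckets]

# Toy data (invented): merge times lengthen as the project ages.
toy_prs = [
    (datetime(2020, 1, 1), datetime(2020, 1, 2)),
    (datetime(2020, 4, 15), datetime(2020, 4, 18)),
    (datetime(2020, 7, 15), datetime(2020, 7, 22)),
    (datetime(2020, 10, 15), None),
    (datetime(2020, 12, 31), datetime(2021, 1, 30)),
]
print(median_merge_days_by_quartile(toy_prs))  # [1.0, 3.0, 7.0, 30.0]
```

A rising sequence across the four quartiles is the pattern the summary calls longer merge times; a flat sequence would not fit the death-spiral description.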

Core claim

Our explanatory modeling demonstrates that project lifespan is not determined by workflow efficiency but by inherent value and ecosystem dynamics. Popularity and innovation emerge as strong positive predictors of survival, while friction, rejection rates, labeling formalization, and negativity scale with longevity as byproducts rather than causes of failure.

What carries the argument

Explanatory modeling of project lifespan using PR workflow metrics such as merge latency, negativity scores, innovation rates, and rejection rates to separate causes of mortality from symptoms.

Load-bearing premise

The selected PR workflow metrics and definition of project inactivity accurately distinguish causes of mortality from correlated symptoms without substantial selection bias or omitted variables in the explanatory models.

What would settle it

A dataset showing that high-popularity projects with high friction and rejection rates outlast low-popularity projects with efficient workflows and low rejections would indicate value overrides workflow factors.
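As a rough sketch of the comparison this test implies, the snippet below splits projects into the two contrasting groups and compares median lifespans. The field names (`stars`, `merge_days`, `lifespan_days`), the cutoffs, and the toy records are all hypothetical; a real check would also need to handle confounders, which a comparison of medians does not.

```python
from statistics import median

def split_groups(projects, star_cut, latency_cut):
    """Return lifespans for (popular but high-friction, unpopular but efficient) projects."""
    popular_slow = [p["lifespan_days"] for p in projects
                    if p["stars"] >= star_cut and p["merge_days"] >= latency_cut]
    obscure_fast = [p["lifespan_days"] for p in projects
                    if p["stars"] < star_cut and p["merge_days"] < latency_cut]
    return popular_slow, obscure_fast

# Toy records (invented for illustration only).
toy_projects = [
    {"stars": 5000, "merge_days": 20, "lifespan_days": 2400},
    {"stars": 8000, "merge_days": 35, "lifespan_days": 3100},
    {"stars": 40, "merge_days": 1, "lifespan_days": 500},
    {"stars": 15, "merge_days": 2, "lifespan_days": 700},
    {"stars": 6000, "merge_days": 1, "lifespan_days": 2000},  # falls in neither group
]

slow, fast = split_groups(toy_projects, star_cut=1000, latency_cut=7)
print(median(slow), median(fast))  # 2750.0 600.0
```

In this toy data the popular, high-friction group outlasts the efficient, unpopular one, which is the pattern that would favor the value-over-workflow reading.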

Figures

Figures reproduced from arXiv: 2605.11844 by Kuljit Kaur Chahal, Mohit Kaushik.

Figure 1: Distribution of PRs by dominant sentiment category.
Figure 2: Rising median merge time across quartiles, indicating increasing fric…
Figure 3: Backlog growth across quartiles, showing exponential accumulation
Figure 4: Backlog and toxicity trends for three case study repositories (Simple-Gallery, Facebook Buck, ReactiveCocoa). The red line shows the yearly average…
Original abstract

Open Source Software (OSS) projects are central to modern technology, yet their survival rates remain low. Prior research has examined project mortality through macro-level indicators such as commit activity, developer abandonment, and ecosystem dependencies, but the micro-level dynamics of the Pull Request (PR) workflow have been largely overlooked. This study provides the first large-scale post-mortem analysis of PR workflows across 1,736 inactive GitHub repositories and 1.3 million human-driven PRs. Using a mixed-method quantitative design, we investigate three dimensions of mortality. First, our comparative descriptive analysis shows that workflow friction, extended review cycles, and negativity penalties are endemic properties of the entire GitHub platform across both active and inactive projects. Rejected PRs consistently attract higher discussion and negativity regardless of project health. Second, our evolutionary analysis identifies a universal "death spiral" marked by declining innovation rates, exponential backlog growth, and rising merge latency. The collapse was defined by silence and disengagement. Labeling formalization remained endemic throughout the lifecycle, while toxicity did not intensify. Finally, our explanatory modeling demonstrates that project lifespan is not determined by workflow efficiency but by inherent value and ecosystem dynamics. Popularity and innovation emerge as strong positive predictors of survival, while friction, rejection rates, labeling formalization, and negativity scale with longevity as byproducts rather than causes of failure. Robustness checks across alternative inactivity thresholds confirm these findings. Together, this work reframes OSS mortality as a socio-technical phenomenon in which abandonment and ecosystem value dominate survival outcomes, while PR-level workflow discipline plays a secondary role.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The paper conducts a large-scale post-mortem analysis of PR workflows in 1,736 inactive GitHub repositories comprising 1.3 million human-driven PRs. Using comparative descriptive, evolutionary, and explanatory modeling approaches, it identifies a universal 'death spiral' of declining innovation, backlog growth, and rising latency, while arguing that project lifespan is driven by inherent value and ecosystem dynamics (popularity, innovation) rather than workflow efficiency; friction, rejection rates, labeling, and negativity are presented as byproducts of longevity, with robustness checks only across alternative inactivity thresholds.

Significance. If the central claims hold after addressing modeling limitations, the work would meaningfully advance OSS sustainability research by providing the first large-scale micro-level PR workflow analysis and reframing mortality as socio-technical rather than process-driven. The scale of the dataset and mixed-method design are strengths, but the explanatory component's ability to support causal separation of symptoms from causes is currently limited by the observational nature of the data.

major comments (2)
  1. [Explanatory modeling] Explanatory modeling section: The headline claim that 'project lifespan is not determined by workflow efficiency but by inherent value and ecosystem dynamics' and that friction/rejection/negativity 'scale with longevity as byproducts rather than causes' rests on regression-style models. These models are fit to observational GitHub data without reported fixed effects, instrumental variables, or explicit controls for project-level confounders (domain, external funding, maintainer reputation, or selection into inactivity). This leaves reverse causality and omitted-variable bias unaddressed, directly undermining the causal interpretation that workflow metrics are non-determinative.
  2. [Abstract] Abstract and methods description: Robustness is reported only for alternative inactivity thresholds, yet no details are provided on exact model specifications (e.g., regression type, variable operationalization for 'innovation rates,' 'negativity scores,' or merge latency), handling of multicollinearity, or tests for endogeneity. These omissions make it impossible to assess whether the models can distinguish causes from correlated symptoms, which is load-bearing for the central reframing of OSS mortality.
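To make the multicollinearity concern in the second major comment concrete, here is a minimal sketch of a variance inflation factor for the special case of exactly two predictors, where VIF reduces to 1 / (1 - r^2) with r the Pearson correlation between them. The predictor names and values are invented; a full model with more predictors would need a regression of each predictor on all the others, which this sketch does not attempt.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x, y):
    """VIF shared by both predictors when the model has only these two."""
    r2 = pearson_r(x, y) ** 2
    return float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)

# Toy predictors (invented): project age in years and median merge latency in days.
project_age = [1, 2, 3, 4, 5]
merge_latency = [2, 1, 4, 3, 5]

print(round(pearson_r(project_age, merge_latency), 3))           # 0.8
print(round(vif_two_predictors(project_age, merge_latency), 3))  # 2.778
```

If age and friction metrics are strongly correlated, as the byproduct-of-longevity claim itself suggests, inflated VIFs would make their separate coefficients hard to interpret.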
minor comments (3)
  1. [Evolutionary analysis] The evolutionary analysis identifies a 'death spiral' but does not quantify the timing or thresholds at which innovation decline transitions to disengagement; providing survival curves or hazard ratios would strengthen the descriptive claims.
  2. [Comparative descriptive analysis] Comparative analysis states that rejected PRs attract higher discussion and negativity 'regardless of project health,' but the operationalization of 'project health' and the statistical test used to establish this invariance should be clarified.
  3. The manuscript would benefit from a dedicated limitations subsection discussing potential selection bias in the sample of inactive repositories and the generalizability beyond GitHub.
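The survival curves requested in the first minor comment could be produced with a Kaplan-Meier estimator. The sketch below is a plain-Python version under stated assumptions: durations are months until inactivity, and the `observed` flag is False for projects still active at the end of observation (censored). The data are invented, and nothing here reflects the paper's actual estimates.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] for each time t at which at least one event occurred.

    durations: time until inactivity, or until observation ended
    observed:  True if the project became inactive, False if censored
    """
    pairs = sorted(zip(durations, observed))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same time t.
        while i < len(pairs) and pairs[i][0] == t:
            if pairs[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored subjects both leave the risk set
    return curve

# Toy data (invented): months observed, and whether inactivity was seen.
months = [6, 12, 12, 24, 36]
became_inactive = [True, True, False, True, False]

for t, s in kaplan_meier(months, became_inactive):
    print(t, round(s, 3))  # 6 0.8, then 12 0.6, then 24 0.3
```

Plotting such curves separately for high- and low-popularity projects would show when the groups diverge, which is the timing information the comment asks for.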

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments. We address the concerns regarding the explanatory modeling and methodological details below, and we will make revisions to clarify the scope of our claims and provide additional transparency in the methods section.

Point-by-point responses
  1. Referee: [Explanatory modeling] Explanatory modeling section: The headline claim that 'project lifespan is not determined by workflow efficiency but by inherent value and ecosystem dynamics' and that friction/rejection/negativity 'scale with longevity as byproducts rather than causes' rests on regression-style models. These models are fit to observational GitHub data without reported fixed effects, instrumental variables, or explicit controls for project-level confounders (domain, external funding, maintainer reputation, or selection into inactivity). This leaves reverse causality and omitted-variable bias unaddressed, directly undermining the causal interpretation that workflow metrics are non-determinative.

    Authors: We agree that our analysis is observational and does not establish strict causality. The models include controls for project size, age, and popularity as proxies for value and ecosystem dynamics. However, we did not include project fixed effects or instrumental variables due to the nature of the data. We will revise the manuscript to explicitly state that the findings represent associations and predictive relationships rather than causal determinations. We will add a dedicated limitations subsection discussing potential omitted variable bias and reverse causality, and rephrase the abstract and conclusions to avoid causal language such as 'determined by' and 'byproducts rather than causes'. revision: partial

  2. Referee: [Abstract] Abstract and methods description: Robustness is reported only for alternative inactivity thresholds, yet no details are provided on exact model specifications (e.g., regression type, variable operationalization for 'innovation rates,' 'negativity scores,' or merge latency), handling of multicollinearity, or tests for endogeneity. These omissions make it impossible to assess whether the models can distinguish causes from correlated symptoms, which is load-bearing for the central reframing of OSS mortality.

    Authors: We will expand the Methods section to provide full details on the explanatory models, including the specific regression techniques used (e.g., survival analysis or logistic regression for lifespan), precise definitions and operationalizations of variables such as innovation rates (measured as the proportion of PRs introducing new features), negativity scores (derived from sentiment analysis tools), and merge latency (average time from PR creation to merge or close). We will report variance inflation factors to address multicollinearity and include a discussion of endogeneity concerns. While we cannot perform formal endogeneity tests like Hausman without additional instruments, we will acknowledge this limitation. revision: yes
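The operationalizations named in this response can be sketched directly. The snippet assumes a hypothetical PR record with `created_at`, `closed_at`, `merged`, and `is_feature` fields; how a PR is flagged as a new feature, and whether latency is averaged or taken as a median, are assumptions here and not details confirmed by the paper.

```python
from datetime import datetime

def innovation_rate(prs):
    """Proportion of PRs flagged as introducing a new feature."""
    return sum(1 for p in prs if p["is_feature"]) / len(prs)

def rejection_rate(prs):
    """Proportion of resolved PRs that were closed without being merged."""
    resolved = [p for p in prs if p["closed_at"] is not None]
    return sum(1 for p in resolved if not p["merged"]) / len(resolved)

def mean_latency_days(prs):
    """Average days from PR creation to merge or close, over resolved PRs."""
    days = [(p["closed_at"] - p["created_at"]).total_seconds() / 86400
            for p in prs if p["closed_at"] is not None]
    return sum(days) / len(days)

# Toy PR records (invented); the last PR is still open.
toy_prs = [
    {"created_at": datetime(2024, 1, 1), "closed_at": datetime(2024, 1, 3),
     "merged": True, "is_feature": True},
    {"created_at": datetime(2024, 1, 5), "closed_at": datetime(2024, 1, 9),
     "merged": False, "is_feature": False},
    {"created_at": datetime(2024, 2, 1), "closed_at": datetime(2024, 2, 7),
     "merged": True, "is_feature": True},
    {"created_at": datetime(2024, 3, 1), "closed_at": None,
     "merged": False, "is_feature": False},
]

print(innovation_rate(toy_prs))           # 0.5
print(round(rejection_rate(toy_prs), 3))  # 0.333
print(mean_latency_days(toy_prs))         # 4.0
```

Note that open PRs are excluded from latency and rejection rate in this sketch; in a dying project with a growing backlog, that exclusion could understate friction, so the handling of unresolved PRs is worth stating explicitly.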

Circularity Check

0 steps flagged

No circularity in empirical derivation chain

full rationale

The paper relies on large-scale observational analysis of external GitHub data (1,736 repositories, 1.3M PRs) using descriptive statistics, evolutionary trend tracking, and standard explanatory regression modeling. No equations, fitted parameters, or self-citations are presented as reducing any prediction or central claim to its own inputs by construction. Claims about predictors of lifespan versus byproducts rest on statistical associations from independent data rather than definitional loops or ansatz smuggling. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The analysis rests on standard empirical software engineering assumptions about data representativeness and metric validity. No explicit free parameters or invented entities are described in the abstract; any model-specific parameters or thresholds for inactivity are not detailed.

free parameters (1)
  • inactivity threshold
    Definition of inactive projects requires a time-based cutoff whose exact value and sensitivity are not specified in the abstract but affect all downstream comparisons.
axioms (1)
  • domain assumption: GitHub PR metadata and derived metrics reliably proxy for contributor engagement, project health, and socio-technical dynamics
    Invoked throughout the comparative, evolutionary, and explanatory sections to interpret workflow friction and negativity as indicators of mortality.
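The sensitivity of the sample to the inactivity threshold, the one free parameter in this ledger, can be illustrated with a short sketch. The cutoff values (180, 365, and 730 days), the repository names, and the dates are invented; the abstract does not state which thresholds the paper used.

```python
from datetime import datetime, timedelta

def is_inactive(last_activity, observed_at, threshold_days):
    """A repository counts as inactive if it has been silent longer than the cutoff."""
    return observed_at - last_activity > timedelta(days=threshold_days)

def inactive_counts(last_activity_by_repo, observed_at, thresholds):
    """Number of repositories classified inactive under each candidate threshold."""
    return {
        days: sum(is_inactive(t, observed_at, days)
                  for t in last_activity_by_repo.values())
        for days in thresholds
    }

# Toy data (invented): date of last activity per hypothetical repository.
last_seen = {
    "repo-a": datetime(2024, 10, 1),
    "repo-b": datetime(2024, 3, 1),
    "repo-c": datetime(2023, 6, 1),
    "repo-d": datetime(2022, 1, 1),
}

counts = inactive_counts(last_seen, datetime(2025, 1, 1), [180, 365, 730])
print(counts)  # {180: 3, 365: 2, 730: 1}
```

Because the set of "dead" projects changes with the cutoff, every downstream comparison inherits this choice, which is why reporting results under several thresholds matters.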

pith-pipeline@v0.9.0 · 5586 in / 1374 out tokens · 128033 ms · 2026-05-13T05:43:10.700512+00:00 · methodology


