arxiv: 2605.11844 · v1 · submitted 2026-05-12 · 💻 cs.SE


The Death Spiral of Open Source Projects: A Post-Mortem Analysis of Pull Request Workflow Dynamics

Kuljit Kaur Chahal, Mohit Kaushik

Pith reviewed 2026-05-13 05:43 UTC · model grok-4.3

classification 💻 cs.SE

keywords: open source software, pull requests, project mortality, GitHub, workflow dynamics, death spiral, survival analysis

The pith

Open source projects survive based on value and popularity rather than pull request workflow efficiency.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines pull request workflows across 1,736 inactive GitHub repositories and 1.3 million PRs to identify what drives open source project failure. Comparative analysis shows workflow friction, extended reviews, and negativity occur widely on the platform, appearing in both active and inactive projects alike. Evolutionary tracking uncovers a death spiral of falling innovation, expanding backlogs, and longer merge times that ends in disengagement, though labeling stays steady and toxicity does not rise. Explanatory models then establish that lifespan depends on inherent value and ecosystem factors, with popularity and innovation as positive predictors while friction and rejections grow as byproducts of age. The work reframes OSS mortality as a socio-technical issue dominated by value and abandonment over workflow discipline.

Core claim

Our explanatory modeling demonstrates that project lifespan is not determined by workflow efficiency but by inherent value and ecosystem dynamics. Popularity and innovation emerge as strong positive predictors of survival, while friction, rejection rates, labeling formalization, and negativity scale with longevity as byproducts rather than causes of failure.

What carries the argument

Explanatory modeling of project lifespan using PR workflow metrics such as merge latency, negativity scores, innovation rates, and rejection rates to separate causes of mortality from symptoms.

Load-bearing premise

The selected PR workflow metrics and definition of project inactivity accurately distinguish causes of mortality from correlated symptoms without substantial selection bias or omitted variables in the explanatory models.

What would settle it

A dataset showing that high-popularity projects with high friction and rejection rates outlast low-popularity projects with efficient workflows and low rejections would indicate value overrides workflow factors.

Figures

Figures reproduced from arXiv: 2605.11844 by Kuljit Kaur Chahal, Mohit Kaushik.

**Figure 2.** Rising median merge time across quartiles, indicating increasing fric…

**Figure 3.** Backlog growth across quartiles, showing exponential accumulation

**Figure 4.** Backlog and toxicity trends for three case study repositories (Simple-Gallery, Facebook Buck, ReactiveCocoa). The red line shows the yearly average…
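The quartile trends in Figures 2 and 3 can be illustrated with a minimal sketch. It assumes each PR is a hypothetical `(created_day, merged_day)` pair measured in days since the project started, with `None` for unmerged PRs. The function names and the equal-width binning are assumptions for illustration; the paper's actual binning procedure is not described on this page.

```python
from statistics import median

def quartile_of(t, start, end):
    """Map a day offset to a lifecycle quartile (1..4) of the span [start, end]."""
    if end <= start:
        return 1
    frac = (t - start) / (end - start)
    return min(4, int(frac * 4) + 1)

def median_merge_time_by_quartile(prs):
    """prs: list of (created_day, merged_day or None).
    Returns {quartile: median days from creation to merge}, None if no merges."""
    created_days = [created for created, _ in prs]
    start, end = min(created_days), max(created_days)
    buckets = {1: [], 2: [], 3: [], 4: []}
    for created, merged in prs:
        if merged is not None:
            buckets[quartile_of(created, start, end)].append(merged - created)
    return {q: (median(v) if v else None) for q, v in buckets.items()}

def open_backlog_at(prs, day):
    """Count PRs created on or before `day` that are still unmerged at `day`."""
    return sum(1 for created, merged in prs
               if created <= day and (merged is None or merged > day))

# Hypothetical toy data, not taken from the paper.
prs = [(0, 1), (10, 12), (30, 34), (40, 45),
       (60, 70), (70, 82), (90, 120), (100, None)]
print(median_merge_time_by_quartile(prs))
print(open_backlog_at(prs, 100))
```

In this toy data the median merge time rises in every quartile, which is the shape Figure 2 reports. PRs closed without merging are ignored here for brevity.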

Original abstract

Open Source Software (OSS) projects are central to modern technology, yet their survival rates remain low. Prior research has examined project mortality through macro-level indicators such as commit activity, developer abandonment, and ecosystem dependencies, but the micro-level dynamics of the Pull Request (PR) workflow have been largely overlooked. This study provides the first large-scale post-mortem analysis of PR workflows across 1,736 inactive GitHub repositories and 1.3 million human-driven PRs. Using a mixed-method quantitative design, we investigate three dimensions of mortality. First, our comparative descriptive analysis shows that workflow friction, extended review cycles, and negativity penalties are endemic properties of the entire GitHub platform across both active and inactive projects. Rejected PRs consistently attract higher discussion and negativity regardless of project health. Second, our evolutionary analysis identifies a universal "death spiral" marked by declining innovation rates, exponential backlog growth, and rising merge latency. The collapse was defined by silence and disengagement. Labeling formalization remained endemic throughout the lifecycle, while toxicity did not intensify. Finally, our explanatory modeling demonstrates that project lifespan is not determined by workflow efficiency but by inherent value and ecosystem dynamics. Popularity and innovation emerge as strong positive predictors of survival, while friction, rejection rates, labeling formalization, and negativity scale with longevity as byproducts rather than causes of failure. Robustness checks across alternative inactivity thresholds confirm these findings. Together, this work reframes OSS mortality as a socio-technical phenomenon in which abandonment and ecosystem value dominate survival outcomes, while PR-level workflow discipline plays a secondary role.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Big dataset on PR patterns in dying OSS projects is a solid descriptive addition, but the claim that workflow friction is just a byproduct rather than a driver rests on under-specified observational models.

This paper's core point is that open source projects die because they lack inherent value or ecosystem fit, not because of messy pull request processes. Workflow friction, negativity, and rejection rates show up across active and inactive repos alike, so they are treated as symptoms that scale with project age rather than root causes. Popularity and innovation rates come out as the real predictors of survival in their models. That is the reframing they push.

The work is new in its scale and focus. They pull 1.3 million human PRs from 1,736 inactive GitHub repositories and track micro-level metrics like merge latency, discussion volume, negativity scores, and innovation rates over the project lifecycle. The comparative section shows rejected PRs draw more negativity everywhere, not just in failing projects. The evolutionary analysis maps a consistent decline in innovation, backlog growth, and eventual silence before collapse. Labeling stays formal throughout, but toxicity does not spike. Those patterns are useful to see at this volume. The robustness checks across different inactivity thresholds are a plus and give the descriptive results some stability.

The explanatory modeling is the weaker part. They conclude workflow efficiency does not determine lifespan, yet the regressions are fit on observational data where project health can easily affect both the PR metrics and the survival outcome. The abstract gives no model equations, variable definitions, or controls for obvious confounders such as project domain, external funding, or maintainer reputation. Without fixed effects, instruments, or explicit discussion of endogeneity, the direction of the claims is hard to trust.

Readers who want large-scale empirical patterns on GitHub workflows will get something concrete here. People who need defensible causal statements about what actually kills projects should treat the modeling conclusions as preliminary. The data effort is large enough that a serious referee should see it, mainly to press on the identification strategy and demand clearer reporting of the regressions.

Referee Report

2 major / 3 minor

Summary. The paper conducts a large-scale post-mortem analysis of PR workflows in 1,736 inactive GitHub repositories comprising 1.3 million human-driven PRs. Using comparative descriptive, evolutionary, and explanatory modeling approaches, it identifies a universal 'death spiral' of declining innovation, backlog growth, and rising latency, while arguing that project lifespan is driven by inherent value and ecosystem dynamics (popularity, innovation) rather than workflow efficiency; friction, rejection rates, labeling, and negativity are presented as byproducts of longevity, with robustness checks only across alternative inactivity thresholds.

Significance. If the central claims hold after addressing modeling limitations, the work would meaningfully advance OSS sustainability research by providing the first large-scale micro-level PR workflow analysis and reframing mortality as socio-technical rather than process-driven. The scale of the dataset and mixed-method design are strengths, but the explanatory component's ability to support causal separation of symptoms from causes is currently limited by the observational nature of the data.

major comments (2)

[Explanatory modeling] Explanatory modeling section: The headline claim that 'project lifespan is not determined by workflow efficiency but by inherent value and ecosystem dynamics' and that friction/rejection/negativity 'scale with longevity as byproducts rather than causes' rests on regression-style models. These models are fit to observational GitHub data without reported fixed effects, instrumental variables, or explicit controls for project-level confounders (domain, external funding, maintainer reputation, or selection into inactivity). This leaves reverse causality and omitted-variable bias unaddressed, directly undermining the causal interpretation that workflow metrics are non-determinative.
[Abstract] Abstract and methods description: Robustness is reported only for alternative inactivity thresholds, yet no details are provided on exact model specifications (e.g., regression type, variable operationalization for 'innovation rates,' 'negativity scores,' or merge latency), handling of multicollinearity, or tests for endogeneity. These omissions make it impossible to assess whether the models can distinguish causes from correlated symptoms, which is load-bearing for the central reframing of OSS mortality.
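The identification concern in the first major comment can be made concrete with a toy simulation. This is a sketch with invented coefficients, not the paper's data or model: an unobserved `value` drives lifespan, and friction simply accumulates with lifespan. Friction has no causal effect on survival here, yet it correlates strongly with lifespan, so an observational coefficient alone cannot tell a cause from a byproduct.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n=5000, seed=0):
    """Return corr(friction, lifespan) in a world where friction causes nothing."""
    rng = random.Random(seed)
    lifespan, friction = [], []
    for _ in range(n):
        value = rng.gauss(0, 1)                 # unobserved project value (assumed)
        life = 5 + 2 * value + rng.gauss(0, 1)  # lifespan depends on value only
        fric = 0.5 * life + rng.gauss(0, 1)     # friction grows with age
        lifespan.append(life)
        friction.append(fric)
    return pearson(friction, lifespan)

r = simulate()
print(f"corr(friction, lifespan) = {r:.2f}")
```

The same correlation would appear if friction did shorten or lengthen project life, which is why the referee asks for fixed effects, instruments, or an explicit endogeneity discussion.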

minor comments (3)

[Evolutionary analysis] The evolutionary analysis identifies a 'death spiral' but does not quantify the timing or thresholds at which innovation decline transitions to disengagement; providing survival curves or hazard ratios would strengthen the descriptive claims.
[Comparative descriptive analysis] Comparative analysis states that rejected PRs attract higher discussion and negativity 'regardless of project health,' but the operationalization of 'project health' and the statistical test used to establish this invariance should be clarified.
The manuscript would benefit from a dedicated limitations subsection discussing potential selection bias in the sample of inactive repositories and the generalizability beyond GitHub.
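The survival curves requested in the first minor comment are usually built with the Kaplan-Meier estimator. The following is a minimal pure-Python sketch; the durations and the `observed` flags are hypothetical, and the paper's own survival method is not specified on this page.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival estimate.

    durations: time until the project went inactive or was last seen.
    observed:  True if inactivity was observed, False if censored (still active).
    Returns [(t, S(t))] at each time where at least one event occurred.
    """
    pairs = sorted(zip(durations, observed))
    at_risk = len(pairs)
    surv = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        removed = 0
        # Group all records that share the same time.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += int(pairs[i][1])
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

# Hypothetical lifespans in months; False marks projects still active at cutoff.
durations = [12, 24, 24, 36, 48, 60]
observed = [True, True, False, True, False, True]
for t, s in kaplan_meier(durations, observed):
    print(t, round(s, 3))
```

Censoring matters here because a sample of only inactive repositories has no censored cases, which is one route by which the selection bias raised in the last minor comment could enter.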

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments. We address the concerns regarding the explanatory modeling and methodological details below, and we will make revisions to clarify the scope of our claims and provide additional transparency in the methods section.

Referee: [Explanatory modeling] Explanatory modeling section: The headline claim that 'project lifespan is not determined by workflow efficiency but by inherent value and ecosystem dynamics' and that friction/rejection/negativity 'scale with longevity as byproducts rather than causes' rests on regression-style models. These models are fit to observational GitHub data without reported fixed effects, instrumental variables, or explicit controls for project-level confounders (domain, external funding, maintainer reputation, or selection into inactivity). This leaves reverse causality and omitted-variable bias unaddressed, directly undermining the causal interpretation that workflow metrics are non-determinative.

Authors: We agree that our analysis is observational and does not establish strict causality. The models include controls for project size, age, and popularity as proxies for value and ecosystem dynamics. However, we did not include project fixed effects or instrumental variables due to the nature of the data. We will revise the manuscript to explicitly state that the findings represent associations and predictive relationships rather than causal determinations. We will add a dedicated limitations subsection discussing potential omitted variable bias and reverse causality, and rephrase the abstract and conclusions to avoid causal language such as 'determined by' and 'byproducts rather than causes'.

revision: partial

Referee: [Abstract] Abstract and methods description: Robustness is reported only for alternative inactivity thresholds, yet no details are provided on exact model specifications (e.g., regression type, variable operationalization for 'innovation rates,' 'negativity scores,' or merge latency), handling of multicollinearity, or tests for endogeneity. These omissions make it impossible to assess whether the models can distinguish causes from correlated symptoms, which is load-bearing for the central reframing of OSS mortality.

Authors: We will expand the Methods section to provide full details on the explanatory models, including the specific regression techniques used (e.g., survival analysis or logistic regression for lifespan), precise definitions and operationalizations of variables such as innovation rates (measured as the proportion of PRs introducing new features), negativity scores (derived from sentiment analysis tools), and merge latency (average time from PR creation to merge or close). We will report variance inflation factors to address multicollinearity and include a discussion of endogeneity concerns. While we cannot perform formal endogeneity tests like Hausman without additional instruments, we will acknowledge this limitation.

revision: yes
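The operationalizations named in this response can be sketched in a few lines. The record layout (`is_feature`, `created_at`, `closed_at`) is a hypothetical schema, and how a PR gets flagged as a feature is not described on this page. The variance inflation factor is shown only for the two-predictor case, where it reduces to 1 / (1 - r²); a full model would regress each predictor on all the others.

```python
from datetime import datetime

def innovation_rate(prs):
    """Proportion of PRs flagged as introducing a new feature."""
    return sum(1 for p in prs if p["is_feature"]) / len(prs)

def mean_latency_days(prs):
    """Average days from PR creation to merge or close, over resolved PRs."""
    spans = [(p["closed_at"] - p["created_at"]).total_seconds() / 86400
             for p in prs if p["closed_at"] is not None]
    return sum(spans) / len(spans)

def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    v1 = sum((a - m1) ** 2 for a in x1)
    v2 = sum((b - m2) ** 2 for b in x2)
    r = cov / (v1 * v2) ** 0.5
    return 1 / (1 - r * r)

# Hypothetical PR records, not taken from the paper's dataset.
prs = [
    {"is_feature": True,  "created_at": datetime(2020, 1, 1),  "closed_at": datetime(2020, 1, 3)},
    {"is_feature": False, "created_at": datetime(2020, 1, 5),  "closed_at": datetime(2020, 1, 9)},
    {"is_feature": False, "created_at": datetime(2020, 2, 1),  "closed_at": None},
    {"is_feature": True,  "created_at": datetime(2020, 2, 10), "closed_at": datetime(2020, 2, 16)},
]
print(innovation_rate(prs), mean_latency_days(prs))
print(vif_two_predictors([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

Sentiment-based negativity scores are left out because the tool used is not named here.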

Circularity Check

0 steps flagged

No circularity in empirical derivation chain

The paper relies on large-scale observational analysis of external GitHub data (1,736 repositories, 1.3M PRs) using descriptive statistics, evolutionary trend tracking, and standard explanatory regression modeling. No equations, fitted parameters, or self-citations are presented as reducing any prediction or central claim to its own inputs by construction. Claims about predictors of lifespan versus byproducts rest on statistical associations from independent data rather than definitional loops or ansatz smuggling. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The analysis rests on standard empirical software engineering assumptions about data representativeness and metric validity. No explicit free parameters or invented entities are described in the abstract; any model-specific parameters or thresholds for inactivity are not detailed.

free parameters (1)

inactivity threshold
Definition of inactive projects requires a time-based cutoff whose exact value and sensitivity are not specified in the abstract but affect all downstream comparisons.
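A small sketch shows why this cutoff is a free parameter. The thresholds (180, 365, 730 days), the repository names, and the dates below are all hypothetical, since the paper's actual values are not stated on this page. The point is only that the set of repositories labeled inactive changes with the cutoff, so every downstream comparison inherits that choice.

```python
from datetime import date

def is_inactive(last_activity, as_of, threshold_days):
    """A repo counts as inactive if it has been silent for at least threshold_days."""
    return (as_of - last_activity).days >= threshold_days

def inactive_counts(last_activity_by_repo, as_of, thresholds):
    """Number of repos classified inactive under each candidate threshold."""
    return {
        t: sum(is_inactive(d, as_of, t) for d in last_activity_by_repo.values())
        for t in thresholds
    }

# Hypothetical last-activity dates.
repos = {
    "repo-a": date(2024, 10, 1),
    "repo-b": date(2024, 3, 1),
    "repo-c": date(2023, 6, 1),
    "repo-d": date(2022, 1, 1),
}
print(inactive_counts(repos, date(2025, 1, 1), [180, 365, 730]))
# → {180: 3, 365: 2, 730: 1}
```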

axioms (1)

domain assumption: GitHub PR metadata and derived metrics reliably proxy for contributor engagement, project health, and socio-technical dynamics.
Invoked throughout the comparative, evolutionary, and explanatory sections to interpret workflow friction and negativity as indicators of mortality.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear
explanatory modeling demonstrates that project lifespan is not determined by workflow efficiency but by inherent value and ecosystem dynamics
