Beyond the Grave: An Empirical Study of Dormancy and Revival in Scientific Open-Source Software
Pith reviewed 2026-06-26 15:57 UTC · model grok-4.3
The pith
A fixed inactivity threshold cannot reliably classify scientific open-source software as abandoned.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The study shows that dormancy cause remains unresolvable from repository evidence for 52.5 percent of projects. Among the rest, feature or milestone freezes outnumber research-output completion by 5.4 to 1. Non-sustained recovery outnumbers sustained recovery by 2.14 to 1, and 11.5 percent of apparent revivals are bot-only or single-spike artifacts. Lifecycle archetype correlates more strongly with revival sustainability than either revival mechanism or work type on the structurally independent subset. Therefore a fixed inactivity threshold is insufficient, and gap duration, lifecycle archetype, and contributor continuity together supply more discriminating information.
What carries the argument
The rule-based classifier that maps coded repository evidence onto five dimensions: dormancy cause (T1), revival mechanism (T2), nature of revival work (T3), revival sustainability (T4), and lifecycle archetype (T5).
If this is right
- Many projects counted as abandoned under common thresholds are actually dormant and may revive.
- Sustainability of recovery depends more on the project's lifecycle archetype than on the specific revival trigger or work performed.
- Contributor continuity supplies a stronger signal for lasting recovery than the type of revival work.
- 11.5 percent of projects that appear revived are in fact bot-driven or single-spike artifacts.
- Changing the inactivity cutoff produces large swings in the number of projects labeled abandoned.
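The threshold sensitivity in the last point can be illustrated with a minimal sketch. The commit dates, the reference date, and the helper names `months_inactive` and `abandoned_count` are hypothetical and are not taken from the paper or from SciCat. The whole-calendar-month rule is also an assumption, since the paper's exact inactivity definition is not given here.

```python
from datetime import date

def months_inactive(last_commit: date, as_of: date) -> int:
    """Whole calendar months elapsed between the last commit and the reference date."""
    months = (as_of.year - last_commit.year) * 12 + (as_of.month - last_commit.month)
    if as_of.day < last_commit.day:
        months -= 1  # the current month has not fully elapsed yet
    return max(months, 0)

def abandoned_count(last_commits, as_of, cutoff_months):
    """Count projects whose inactivity meets or exceeds the cutoff."""
    return sum(1 for d in last_commits if months_inactive(d, as_of) >= cutoff_months)

# Hypothetical last-commit dates for five toy projects (not real data).
last_commits = [
    date(2025, 12, 1),
    date(2025, 6, 15),
    date(2024, 1, 10),
    date(2022, 3, 5),
    date(2020, 7, 20),
]
as_of = date(2026, 1, 1)

counts = {cutoff: abandoned_count(last_commits, as_of, cutoff) for cutoff in (1, 12, 36)}
# counts == {1: 5, 12: 3, 36: 2}
```

Even on five toy projects the label "abandoned" covers all of them at a 1-month cutoff and only two at 36 months, which is the kind of swing the study reports at corpus scale.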
Where Pith is reading between the lines
- Abandonment detectors for scientific software should combine gap length with archetype labels rather than rely on a single cutoff.
- Funding and archival policies could use archetype information to decide whether to treat a project as dormant rather than terminated.
- The same multi-dimensional coding approach could be applied to non-scientific OSS to test whether the same patterns appear outside the scientific domain.
Load-bearing premise
The manual coding of the stratified sample of 750 projects accurately reflects dormancy and revival characteristics across the full set of scientific OSS without significant selection or interpretation bias.
What would settle it
Re-coding an independent sample of projects from the same corpus and testing whether the reported association between lifecycle archetype and sustainability still holds at the same strength.
Original abstract
Background. Inactivity thresholds classify scientific open-source software (OSS) as abandoned but cannot distinguish permanent abandonment from temporary dormancy; moving the cutoff from 1 to 36 months changes the abandoned count in the SciCat corpus from 18,030 to 8,010. Aims. We characterize dormancy causes, revival mechanisms, recovery durability, and lifecycle archetypes in dormant-revived scientific OSS. Method. From 18,247 SciCat repositories we identify 2,984 dormant-revived candidates and field-code a stratified sample of 750 projects with 75 analyst-coders under a two-phase adjudication protocol (post-adjudication kappa 0.779-0.857). A rule-based classifier produces five dimensions: dormancy cause (T1), revival mechanism (T2), nature of revival work (T3), revival sustainability (T4), and lifecycle archetype (T5). Results. Dormancy cause is unresolvable from repository evidence for 52.5% of projects; among resolvable cases, feature/milestone freeze outnumbers research-output completion 5.4:1. Non-sustained recovery outnumbers sustained 2.14:1; 11.5% of apparent revivals are bot-only or single-spike artifacts. Lifecycle archetype is more strongly associated with sustainability than revival mechanism or work type (medium effect on the structurally-independent subset). Conclusions. A fixed inactivity threshold is insufficient to reliably classify scientific OSS abandonment. Gap duration, lifecycle archetype, and contributor continuity together provide more discriminating information than any single threshold.
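The abstract does not spell out how dormant-revived candidates are identified. As a rough sketch only, one plausible approach is to look for long gaps between consecutive commits that are followed by later activity. The function name `find_dormancy_gaps`, the 365-day default, and the sample timestamps are all assumptions for illustration, not the authors' criteria.

```python
from datetime import datetime, timedelta

def find_dormancy_gaps(commit_times, min_gap_days=365):
    """Return (gap_start, gap_end) pairs where consecutive commits
    are at least min_gap_days apart. gap_end is the first commit
    after the gap, i.e. the apparent revival point."""
    ordered = sorted(commit_times)
    min_gap = timedelta(days=min_gap_days)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a >= min_gap]

# Hypothetical commit history for one toy repository.
commits = [
    datetime(2018, 1, 1),
    datetime(2018, 3, 1),
    datetime(2020, 6, 1),
    datetime(2020, 6, 15),
]

gaps = find_dormancy_gaps(commits)
# One gap is found: from 2018-03-01 to 2020-06-01.
```

A gap found this way says nothing about whether the revival is sustained, bot-only, or a single spike. Those distinctions are what the coded dimensions T2 to T5 are meant to capture.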
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that fixed inactivity thresholds are insufficient to classify scientific OSS abandonment, as shifting the cutoff from 1 to 36 months changes the abandoned count in SciCat from 18,030 to 8,010. From 18,247 repositories it identifies 2,984 dormant-revived candidates, codes a stratified sample of 750 projects via 75 coders under two-phase adjudication (kappa 0.779-0.857), and derives five dimensions (T1 dormancy cause, T2 revival mechanism, T3 work nature, T4 sustainability, T5 archetype). Among resolvable cases feature/milestone freeze dominates; non-sustained recoveries outnumber sustained 2.14:1; archetype associates more strongly with sustainability than other factors. It concludes that gap duration, archetype, and contributor continuity together discriminate better than any single threshold.
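For readers unfamiliar with the agreement statistic cited above, the following is a minimal sketch of Cohen's kappa for two coders. It assumes the two-coder Cohen variant; the paper may use a different kappa formulation for its 75 coders. The labels and the function name `cohens_kappa` are hypothetical.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    observed = sum(x == y for x, y in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability both coders pick the same label independently.
    expected = sum(freq_a[lab] * freq_b[lab] for lab in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical labels for four projects from two coders.
a = ["freeze", "freeze", "completion", "completion"]
b = ["freeze", "freeze", "completion", "freeze"]

kappa = cohens_kappa(a, b)
# observed = 0.75, expected = 0.5, so kappa = 0.5
```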
Significance. If the central empirical patterns hold after addressing sampling limitations, the work supplies concrete evidence on the prevalence of unresolvable dormancy cases (52.5%), revival artifacts (11.5%), and the relative strength of lifecycle archetype for predicting sustained recovery. The threshold-sensitivity result stands independently and directly challenges current practice in scientific OSS health assessment.
major comments (1)
- [Abstract / Results] Abstract and Results (T4/T5 associations): the claim that gap duration, lifecycle archetype (T5), and contributor continuity supply more discriminating information than any fixed threshold for distinguishing temporary dormancy from permanent abandonment is estimated only on the 2,984 revived candidates (and the 750 coded subset). No parallel coding or comparison exists for projects that experienced comparable gaps but never revived, so the superior-discrimination assertion rests on an incomplete contrast and cannot be directly supported by the reported associations (even the medium effect on the independent subset).
minor comments (2)
- [Methods] Methods: the rule-based classifier for T1–T5 is described at high level; an explicit validation step against the adjudicated sample (e.g., precision/recall per dimension) would strengthen reproducibility.
- [Results] The 52.5% unresolvable rate is acknowledged but its impact on the generalizability of the resolvable-case ratios (5.4:1, 2.14:1) could be quantified with sensitivity bounds.
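The validation step suggested in the first minor comment could look roughly like the sketch below, which computes precision and recall per label for one dimension by comparing classifier output against adjudicated labels. The label names, the sample data, and the function name `per_label_precision_recall` are hypothetical; the paper's actual label set for T4 is not reproduced here.

```python
from collections import Counter

def per_label_precision_recall(gold, predicted):
    """Return {label: (precision, recall)}; a value is None when undefined."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong for this item
            fn[g] += 1  # gold label g was missed for this item
    result = {}
    for label in sorted(set(gold) | set(predicted)):
        predicted_total = tp[label] + fp[label]
        gold_total = tp[label] + fn[label]
        precision = tp[label] / predicted_total if predicted_total else None
        recall = tp[label] / gold_total if gold_total else None
        result[label] = (precision, recall)
    return result

# Hypothetical adjudicated labels vs. classifier output for one dimension.
gold = ["sustained", "non_sustained", "non_sustained", "artifact"]
pred = ["sustained", "sustained", "non_sustained", "artifact"]

scores = per_label_precision_recall(gold, pred)
# scores["sustained"] == (0.5, 1.0)
# scores["non_sustained"] == (1.0, 0.5)
# scores["artifact"] == (1.0, 1.0)
```

Running this once per dimension (T1 through T5) would give the per-dimension table the referee asks for.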
Simulated Author's Rebuttal
We thank the referee for highlighting an important scope limitation in our discrimination analysis. We address the comment directly below and indicate where the manuscript will be revised for precision.
Point-by-point responses
Referee: [Abstract / Results] Abstract and Results (T4/T5 associations): the claim that gap duration, lifecycle archetype (T5), and contributor continuity supply more discriminating information than any fixed threshold for distinguishing temporary dormancy from permanent abandonment is estimated only on the 2,984 revived candidates (and the 750 coded subset). No parallel coding or comparison exists for projects that experienced comparable gaps but never revived, so the superior-discrimination assertion rests on an incomplete contrast and cannot be directly supported by the reported associations (even the medium effect on the independent subset).
Authors: We agree that the associations between gap duration, T5 archetype, contributor continuity, and T4 sustainability are measured exclusively within the 2,984 revived candidates (and the coded subset). No matched sample of non-revived projects with comparable gaps was coded, so a direct head-to-head contrast between revived and permanently abandoned cases is not available. The manuscript's claim that these factors supply 'more discriminating information than any single threshold' therefore rests on (a) the independent threshold-sensitivity result (abandoned count falling from 18,030 to 8,010) and (b) the relative strength of archetype versus other predictors inside the revived set. We will revise the abstract and Results section to qualify the discrimination statement as applying to the prediction of sustained recovery among projects that have already shown revival activity, and to note the absence of a non-revived contrast group as a limitation. Revision: partial.
- Direct empirical comparison of revived versus non-revived projects with matched gap lengths would require new sampling and coding outside the current study design.
Circularity Check
No circularity: purely observational empirical study with direct data coding and associations
Full rationale
The paper performs no derivations, equations, parameter fitting, or model-based predictions. It identifies 2,984 candidates via explicit gap-and-revival criteria in repository data, manually codes a stratified sample of 750 under an adjudication protocol, applies a rule-based classifier to produce categorical dimensions (T1-T5), and reports observed frequencies plus associations (e.g., archetype with sustainability). All reported results are direct tabulations or statistical associations computed from the coded observations; none reduce by construction to the inputs via self-definition, renaming, or self-citation chains. The sample conditioning on revival is a methodological scope limitation affecting generalizability, not a circular reduction of the reported statistics to their own selection criteria. No load-bearing self-citations or ansatzes are invoked to justify the central empirical claims.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The coding protocol and adjudication process accurately capture the true dormancy causes and revival mechanisms from repository data.