arxiv: 2605.06817 · v1 · submitted 2026-05-07 · 💻 cs.SE

Analyzing the Adoption of Database Management Systems Throughout the History of Open Source Projects

Camila A. Paiva, Raquel Maximino, Frederico Paiva, Rafael Accetta Vieira, Nicole Espanha, João Felipe Pimentel, Igor Wiese, Marco Aurélio Gerosa, Igor Steinmacher, Leonardo Murta, Vanessa Braganholo

Pith reviewed 2026-05-11 01:02 UTC · model grok-4.3

keywords: database management systems · open source projects · Java · polyglot persistence · ORM frameworks · adoption patterns · GitHub repositories

The pith

MySQL and PostgreSQL lead DBMS adoption in open-source Java projects, with Redis and MongoDB showing long-term stability and frequent multi-DB use.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines the history of database choices across 362 popular open-source Java projects on GitHub. It tracks when projects first adopt, keep, replace, or combine different relational and non-relational DBMSs by scanning source code over time. The results highlight clear popularity leaders and stable patterns such as polyglot persistence, where multiple systems run together. These observations matter because they show what actually happens in real, evolving codebases rather than what surveys or vendor claims suggest.

Core claim

Using source-code heuristics on the full commit histories of the 362 projects, the study finds MySQL and PostgreSQL to be the most widely adopted relational DBMSs while Redis and MongoDB rank highest among non-relational systems and remain in place once introduced. Projects commonly run several DBMSs at once to address different data requirements, and Object-Relational Mapping frameworks appear routinely as the bridge between application code and the chosen storage systems.
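The co-use part of this claim can be made concrete with a small counting sketch. It assumes each project has already been reduced to the set of DBMSs detected in it. The helper name `pair_support` and the example projects are hypothetical and made up for illustration; they are not the paper's tooling or data.

```python
from collections import Counter
from itertools import combinations


def pair_support(projects: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Fraction of projects in which each unordered DBMS pair appears together."""
    counts: Counter = Counter()
    for systems in projects.values():
        # Sorting makes ("MySQL", "Redis") and ("Redis", "MySQL") the same key.
        for pair in combinations(sorted(systems), 2):
            counts[pair] += 1
    total = len(projects)
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


# Made-up example corpus (NOT data from the paper).
example_projects = {
    "project-a": {"MySQL", "Redis"},
    "project-b": {"MySQL", "Redis", "MongoDB"},
    "project-c": {"PostgreSQL"},
    "project-d": {"MySQL"},
}
example_support = pair_support(example_projects)
```

Pairs with high support are candidates for a polyglot persistence pattern, although co-occurrence alone does not show that the systems serve different data needs.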

What carries the argument

Longitudinal source-code heuristics that scan Git commit histories to detect DBMS adoption, replacement, co-occurrence, and ORM usage events across the 362 Java projects.
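As a rough illustration of what such a heuristic could look like, the sketch below matches driver package names and connection-URL prefixes in file contents. The pattern table and the function names `detect_dbms` and `detect_in_snapshot` are assumptions made for this example; the paper's actual rules are not reproduced on this page.

```python
import re

# Hypothetical pattern table: illustrative assumptions, not the paper's rules.
DBMS_PATTERNS = {
    "MySQL": [r"jdbc:mysql:", r"\bcom\.mysql\b"],
    "PostgreSQL": [r"jdbc:postgresql:", r"\borg\.postgresql\b"],
    "Redis": [r"\bredis\.clients\.jedis\b", r"\bio\.lettuce\b"],
    "MongoDB": [r"\bcom\.mongodb\b", r"mongodb://"],
}

COMPILED = {
    name: [re.compile(pattern) for pattern in patterns]
    for name, patterns in DBMS_PATTERNS.items()
}


def detect_dbms(source_text: str) -> set[str]:
    """Return the DBMS names whose patterns match anywhere in the text."""
    return {
        name
        for name, regexes in COMPILED.items()
        if any(regex.search(source_text) for regex in regexes)
    }


def detect_in_snapshot(files: dict[str, str]) -> set[str]:
    """Union the detections over all files (path -> content) of one snapshot."""
    found: set[str] = set()
    for content in files.values():
        found |= detect_dbms(content)
    return found
```

Running `detect_in_snapshot` on the tree of each analyzed commit would yield one set of systems per point in history. A purely textual match like this is exactly the kind of rule that can be fooled by unused imports or miss dynamically loaded drivers.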

If this is right

Non-relational DBMSs such as Redis and MongoDB tend to stay in projects after adoption, unlike some relational systems that are replaced.
Projects routinely combine multiple DBMS types, pointing to deliberate polyglot persistence strategies.
ORM frameworks serve as the primary layer for application-DBMS interaction in the majority of cases.
Replacement events are more common for certain relational systems as projects mature.
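The stability and replacement statements above presuppose some way of turning per-snapshot detections into events. One possible sketch follows. The definition of a replacement used here (exactly one system disappears and exactly one appears between consecutive snapshots) is an assumption made for illustration, as are the `Event` type and the `derive_events` helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    snapshot: int                      # index of the snapshot where the change is seen
    kind: str                          # "adopt", "drop", or "replace"
    dbms: str                          # system that was adopted, dropped, or replaced
    replaced_by: Optional[str] = None  # only set for "replace" events


def derive_events(timeline: list[set[str]]) -> list[Event]:
    """Compare consecutive snapshots and emit adoption/drop/replacement events.

    Assumption: a "replace" is one system leaving and one arriving in the
    same step. Anything else is recorded as separate adopt/drop events.
    """
    events: list[Event] = []
    previous: set[str] = set()
    for index, current in enumerate(timeline):
        added = sorted(current - previous)
        removed = sorted(previous - current)
        if len(added) == 1 and len(removed) == 1:
            events.append(Event(index, "replace", removed[0], added[0]))
        else:
            events.extend(Event(index, "adopt", name) for name in added)
            events.extend(Event(index, "drop", name) for name in removed)
        previous = current
    return events


# Made-up timeline for illustration (NOT data from the paper).
example_timeline = [
    {"HyperSQL"},
    {"HyperSQL", "Redis"},
    {"PostgreSQL", "Redis"},
    {"PostgreSQL", "Redis"},
]
example_events = derive_events(example_timeline)
```

Under this definition, a system counts as stable in a project if it has an adopt event and no later drop or replace event.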

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Tool builders could create migration assistants that prioritize the stable non-relational options observed in the data.
Curriculum designers might emphasize training on multi-DB architectures and ORM patterns rather than single-DB approaches.
DBMS vendors could test interoperability features against the common co-use combinations found in the projects.

Load-bearing premise

The source-code heuristics accurately detect actual developer intent and usage without substantial false positives or missed cases.

What would settle it

A manual review of a random sample of projects that cross-checks the heuristic-detected DBMS events against commit messages, issue trackers, and runtime configuration files for mismatches.

Figures

Figures reproduced from arXiv: 2605.06817 by Camila A. Paiva, Frederico Paiva, Igor Steinmacher, Igor Wiese, João Felipe Pimentel, Leonardo Murta, Marco Aurélio Gerosa, Nicole Espanha, Rafael Accetta Vieira, Raquel Maximino, Vanessa Braganholo.

Original abstract

Database Management Systems (DBMSs) are widely used to store, retrieve, and manage the data handled by modern applications. Although prior work has studied the co-evolution of DBMSs and application source code, less is known about DBMS adoption, co-use, and replacement in real systems. This paper presents a historical study of DBMS usage in 362 popular open-source Java projects hosted on GitHub. We investigated the adoption of the top DBMSs ranked by DB-Engines, covering relational and non-relational systems. Using source-code heuristics, we analyzed DBMS popularity, stability, migration patterns, co-occurrence, and the role of Object-Relational Mappers (ORMs). Our findings show that MySQL and PostgreSQL are the most popular DBMSs in our corpus. Among non-relational DBMSs, Redis and MongoDB are the most frequently used and tend to remain stable after adoption. In contrast, systems such as HyperSQL are more often replaced as projects evolve. We also observed frequent co-use of multiple DBMSs, suggesting patterns of polyglot persistence in which projects combine systems to handle different data needs. Finally, we found that ORM frameworks are commonly used to mediate interactions between applications and DBMSs. Overall, our study provides empirical evidence on how DBMSs are adopted, combined, and replaced over time, offering guidance for developers, architects, educators, and DBMS vendors.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper supplies new empirical counts on DBMS popularity, stability, and polyglot patterns in a Java GitHub corpus, but all claims rest on unvalidated source-code heuristics. The authors scanned 362 popular projects for adoption timing, replacements, co-use, and ORM involvement. MySQL and PostgreSQL rank highest overall, Redis and MongoDB hold steady once picked among the non-relational options, and many projects run several systems together. Those specific rankings and the replacement frequency for things like HyperSQL are not in the earlier co-evolution papers they cite. The work also notes that ORMs appear often as the bridge layer. That gives practitioners and educators some concrete longitudinal snapshots they can reference when talking about real choices in open source. The corpus size and the focus on history rather than just snapshots are the parts that add something beyond prior studies.

The detection method is the clear soft spot. The abstract describes source-code heuristics for spotting adoption and migration events, yet gives no precision or recall numbers, no manual validation on samples, and no discussion of false positives from unused imports or test code. Without those checks the popularity and stability numbers could shift if the heuristics miss dynamic loading or flag dead references. Project selection is also restricted to popular Java repos on GitHub, so broader claims about open-source practice carry that limit.

Readers who work on or teach data-layer decisions in open source will get the most from the numbers and patterns. The paper does not change core database theory, but the observations are grounded enough in external data to be worth referee time. I would send it to peer review and ask the authors to add validation details and bias checks in revision.

Referee Report

2 major / 1 minor

Summary. This paper presents a historical empirical study of DBMS adoption, usage, stability, replacement, and co-occurrence in 362 popular open-source Java projects hosted on GitHub. Using source-code heuristics, it examines the top relational and non-relational DBMSs ranked by DB-Engines, along with the role of ORM frameworks. The central claims are that MySQL and PostgreSQL are the most popular, Redis and MongoDB are the most stable non-relational systems after adoption, projects frequently co-use multiple DBMSs in polyglot persistence patterns, and ORMs commonly mediate application-DBMS interactions.

Significance. If the source-code heuristics are shown to be accurate, the study would provide useful longitudinal data on real-world DBMS adoption trends in open-source Java projects, offering practical guidance to developers, architects, educators, and vendors. The large corpus size and focus on historical evolution are strengths that distinguish it from smaller or cross-sectional studies. The work contributes to empirical software engineering by extracting observable patterns from public GitHub data rather than relying on surveys alone.

major comments (2)

[Methodology] Methodology section: The source-code heuristics for identifying DBMS adoption, usage, replacement events, and co-occurrence are described but receive no validation (e.g., no precision/recall on a manually labeled sample of files, no comparison against runtime traces or configuration files, and no inter-rater agreement metrics). This is load-bearing because every reported finding—MySQL/PostgreSQL dominance, Redis/MongoDB stability, polyglot co-use frequencies, and ORM mediation—rests directly on the output of these unverified pattern matches, which are vulnerable to false positives (unused imports, test code) and false negatives (dynamic loading, external wrappers).
[Corpus and Results] Corpus and Results sections: The selection criteria for the 362 projects are stated but potential selection biases (GitHub popularity filter, Java-only focus, project age distribution) are not quantified or tested for impact on the observed DBMS frequencies and stability claims. Without this, it is unclear whether the headline popularity rankings generalize beyond the sampled corpus.
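The two false-positive sources named in the Methodology comment (unused imports and test code) can be screened for with fairly crude textual checks. The sketch below is one hypothetical way to do so; the `src/test/` path convention and both helper names are assumptions, and nothing here is taken from the paper's tooling.

```python
import re

# Matches single-type and static imports; wildcard imports (ending in .*) do not match.
IMPORT_RE = re.compile(r"^\s*import\s+(?:static\s+)?([\w.]+)\s*;", re.MULTILINE)


def is_test_path(path: str) -> bool:
    """Assumed convention: Maven/Gradle test sources live under src/test/."""
    normalized = "/" + path.replace("\\", "/")
    return "/src/test/" in normalized


def unused_imports(java_source: str) -> list[str]:
    """Imports whose simple name never appears outside the import lines.

    Crude textual check: wildcard imports are skipped, and a name that is
    mentioned only in a comment or string literal still counts as used.
    """
    body = IMPORT_RE.sub("", java_source)
    unused = []
    for qualified in IMPORT_RE.findall(java_source):
        simple = qualified.rsplit(".", 1)[-1]
        if not re.search(rf"\b{re.escape(simple)}\b", body):
            unused.append(qualified)
    return unused
```

A detection that comes only from a test path or from an unused import could then be reported separately instead of being counted as production usage.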

minor comments (1)

[Abstract] The abstract and methods could more explicitly define the temporal window of the Git history analyzed and the exact string patterns or import rules used in the heuristics.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We agree that the lack of explicit validation for the heuristics and discussion of corpus biases represent areas for improvement. We address each major comment below and describe the revisions we will incorporate.

Referee: [Methodology] Methodology section: The source-code heuristics for identifying DBMS adoption, usage, replacement events, and co-occurrence are described but receive no validation (e.g., no precision/recall on a manually labeled sample of files, no comparison against runtime traces or configuration files, and no inter-rater agreement metrics). This is load-bearing because every reported finding—MySQL/PostgreSQL dominance, Redis/MongoDB stability, polyglot co-use frequencies, and ORM mediation—rests directly on the output of these unverified pattern matches, which are vulnerable to false positives (unused imports, test code) and false negatives (dynamic loading, external wrappers).

Authors: We acknowledge that the original manuscript did not include quantitative validation of the heuristics. In the revised version we will add a dedicated validation subsection in the Methodology. This will report precision and recall computed on a manually labeled random sample of 200 source files drawn from the corpus, with two authors independently labeling to compute inter-rater agreement. We will also explicitly discuss known limitations, including false positives from unused imports or test code and the inability to detect dynamic loading or wrapper libraries. These additions will directly support the reliability of the reported adoption, stability, and co-occurrence findings.
Revision: yes
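The validation promised in this response comes down to three standard quantities. A minimal sketch follows, assuming binary labels per file (does the file use a given DBMS or not). The function names are illustrative, and the formulas are the textbook definitions of precision, recall, and Cohen's kappa rather than anything specific to the paper.

```python
def precision_recall(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Precision and recall of heuristic output against manual labels."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(p and a for p, a in pairs)
    false_pos = sum(p and not a for p, a in pairs)
    false_neg = sum(a and not p for p, a in pairs)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


def cohen_kappa(rater_a: list[bool], rater_b: list[bool]) -> float:
    """Cohen's kappa for two raters assigning binary labels to the same items."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("raters must label the same non-empty set of items")
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    share_a = sum(rater_a) / n
    share_b = sum(rater_b) / n
    # Agreement expected by chance if both raters labeled independently.
    expected = share_a * share_b + (1 - share_a) * (1 - share_b)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

With two labelers, kappa would be computed on their independent labels first; precision and recall would then be computed against the reconciled labels.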
Referee: [Corpus and Results] Corpus and Results sections: The selection criteria for the 362 projects are stated but potential selection biases (GitHub popularity filter, Java-only focus, project age distribution) are not quantified or tested for impact on the observed DBMS frequencies and stability claims. Without this, it is unclear whether the headline popularity rankings generalize beyond the sampled corpus.

Authors: The corpus was deliberately restricted to popular Java projects on GitHub to enable longitudinal analysis of widely used systems (Section 3.1). We agree that potential biases merit more explicit treatment. In the revision we will expand the Corpus section with a new paragraph that reports the distribution of project ages and star counts, discusses the implications of the Java-only and popularity filters, and notes that results may not generalize to other languages or smaller projects. We will also add a brief sensitivity note comparing DBMS frequencies in the top 100 versus the full 362 projects. A full cross-language replication or exhaustive bias quantification, however, lies beyond the scope of the current study.
Revision: partial
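The sensitivity note described here could be as simple as comparing per-DBMS shares in the most-starred subset against the full corpus. The sketch below assumes each project is a record with hypothetical `stars` and `dbms` fields. The helper names and the example records are made up for illustration and are not data from the paper.

```python
from collections import Counter


def dbms_share(projects: list[dict]) -> dict[str, float]:
    """Share of projects in which each DBMS was detected."""
    total = len(projects)
    if total == 0:
        return {}
    counts = Counter(name for project in projects for name in project["dbms"])
    return {name: count / total for name, count in counts.items()}


def top_k_vs_full(projects: list[dict], k: int) -> dict[str, float]:
    """Per-DBMS difference in share: top-k projects by stars minus full corpus."""
    ranked = sorted(projects, key=lambda project: project["stars"], reverse=True)
    top = dbms_share(ranked[:k])
    full = dbms_share(projects)
    return {name: top.get(name, 0.0) - share for name, share in full.items()}


# Made-up records for illustration (NOT data from the paper).
example_corpus = [
    {"name": "a", "stars": 900, "dbms": {"MySQL", "Redis"}},
    {"name": "b", "stars": 800, "dbms": {"PostgreSQL"}},
    {"name": "c", "stars": 100, "dbms": {"MySQL"}},
    {"name": "d", "stars": 50, "dbms": {"MySQL"}},
]
example_delta = top_k_vs_full(example_corpus, k=2)
```

Large positive or negative differences would suggest that the popularity filter itself shapes the headline rankings; for the study, `k` would be 100 against the full 362 projects.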

Circularity Check

0 steps flagged

No circularity: purely observational empirical analysis with no derivations or self-referential predictions

This paper conducts a historical study of DBMS adoption by applying source-code heuristics to 362 GitHub Java projects and extracting observational statistics on popularity, stability, co-use, and ORM mediation. No equations, fitted parameters, predictions, or first-principles derivations exist; all results are direct extractions from external repository data. The analysis contains no self-definitional steps, no fitted inputs renamed as predictions, and no load-bearing self-citations that reduce the central claims to prior author work. The study is self-contained against external benchmarks and does not invoke uniqueness theorems or ansatzes.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The study depends on the assumption that static code heuristics can reliably identify DBMS adoption events and that the selected 362 projects are representative of broader open-source Java practice.

axioms (1)

Domain assumption: source-code heuristics can reliably identify DBMS usage, adoption timing, and replacement events in Java projects. All quantitative findings rest on these detection rules applied to repository histories.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear
Paper passage: "Using source-code heuristics, we analyzed DBMS popularity, stability, migration patterns, co-occurrence, and the role of Object-Relational Mappers (ORMs)."

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
Paper passage: "We applied heuristics to detect DBMS presence, tracked usage trends over time, and analyzed the coexistence and replacement of different systems."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
