Smaller, Younger, and More Impactful: How AI-Assisted Writing Transforms Research Teams

Haoyang Wang; Meijun Liu; Mingze Zhang; Star Xing Zhao; Yi Bu

arxiv: 2605.27404 · v1 · pith:64556LJLnew · submitted 2026-04-24 · 💻 cs.CY · cs.AI

Pith reviewed 2026-07-04 17:47 UTC · model glm-5.2

keywords: research · teams · writing · ai-assisted · regression · team · impactful · large

The pith

AI-Assisted Writing Shrinks Research Teams and Raises Their Impact

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper argues that the adoption of AI-assisted writing in scientific research is reshaping the structure of research teams: teams that use AI tend to be smaller and younger, yet they produce research with higher scientific impact, not lower. Drawing on 147,074 full-text publications from the PLoS family and the Nature portfolio published since 2020, the authors classify papers into AI-assisted and human-written groups using a text-based detection algorithm that estimates the proportion of AI-modified content in each paper. They then compare team age (average career seniority of authors), team size (number of co-authors), and scientific impact (field-weighted citation impact, or FWCI) across these groups using OLS, Poisson, logistic, and quantile regression, plus propensity score matching. The central finding is a triple association: higher AI usage scores correlate with younger teams, smaller teams, and a higher probability of landing in the top 5% of citation impact. The authors interpret this as evidence that AI is partially reversing the decades-long trend toward ever-larger research teams in science, by absorbing writing and editing labor that previously required senior co-authors, while simultaneously freeing researchers to concentrate on the intellectual substance of their work.

Core claim

The paper's central claim is that AI-assisted writing is associated with a structural shift in research teams — toward fewer authors and lower average seniority — and that this shift is accompanied by an increase, not a decrease, in the probability of producing highly cited work. In the PLoS dataset, AI-assisted teams had a 3.0 percentage-point higher probability of producing a top-5% FWCI publication than matched controls; in the Nature dataset, the advantage was 2.7 percentage points. The team-age effect was strongest among the youngest teams (25th percentile), suggesting AI acts as an equalizer for junior researchers who lack the writing experience and academic-discourse mastery of their,

What carries the argument

The paper relies on an AI Usage Score (0–1) derived from a text-based detection algorithm (Liang et al., 2025) that estimates the proportion of AI-modified content in each paper's full text. Papers scoring above the 95th percentile of the pre-ChatGPT distribution for their journal are classified as 'AI-assisted.' The dependent variables are Team Age (average career age of co-authors, based on years since first publication), Team Size (author count), and Top 5% FWCI (a binary indicator for whether a paper falls in the top 5% of field-weighted citation impact). Controls include first-author and corresponding-author prior productivity, collaborator counts, prior team-size and team-age habits,FW

If this is right

If AI-assisted writing genuinely enables smaller, younger teams to produce high-impact research, funding agencies may need to reconsider evaluation criteria that implicitly reward large, senior-heavy teams.
The reversal of the decades-long trend toward larger teams could change the structure of scientific labor markets, reducing demand for senior co-authors whose primary contribution is manuscript polishing and editing.
Junior researchers and non-native English speakers may gain disproportionate advantage from AI writing tools, potentially democratizing access to elite publication venues.
If the pattern generalizes beyond PLoS and Nature, it could signal a broad reorganization of how scientific collaboration is structured, with AI absorbing coordination and communication overhead that previously required larger teams.

Load-bearing premise

The entire analysis rests on the accuracy of a single AI-detection algorithm (from Liang et al., 2025) that estimates how much of each paper's text was modified by AI. If this detector systematically misclassifies certain writing styles, disciplines, or author demographics as 'AI-assisted' or 'human-written,' every downstream association — team age, team size, impact — could be biased. The paper does not independently validate the detector on its specific PLoS and Nature full

What would settle it

If an independent, validated AI-detection method applied to the same corpus produced substantially different group classifications — for instance, if many papers currently labeled 'AI-assisted' were reclassified as human-written — the observed associations with team age, size, and impact could weaken or disappear.

Original abstract

The era of Big Science has long been defined by increasingly large and specialized research teams pushing the frontiers of knowledge. However, recent advances in artificial intelligence (AI), particularly large language models (LLMs), are beginning to reshape academic writing and scientific research, potentially disrupting the longstanding trend toward ever-larger teams and transforming other dimensions of research team structure. Drawing on 147,074 full-text publications from the PLoS family and the Nature portfolio since 2020, we examined whether and how AI-assisted writing influences team structure and team outcomes in science. Using multiple methods, including ordinary least square, quantile regression, Poisson regression, logistic regression and propensity score matching, we found that research teams using AI-assisted writing tend to be younger and smaller. Importantly, this shift toward more compact, junior-leaning teams does not come at the expense of scientific impact. On the contrary, we observed a higher probability of research teams that employed AI-assisted writing producing highly impactful publications. These results highlight the significant role of AI-assisted writing in reshaping not only how research is produced, but also how research teams are formed and assembled. Our findings call for policy improvements in research evaluation, funding, and training to address this emerging trend.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Large-scale correlational study linking AI-assisted writing to younger, smaller, higher-impact research teams. The association is real and consistently observed, but the measurement rests entirely on an external AI detector whose systematic biases on this specific corpus are unexamined.

The headline finding is that papers flagged as AI-assisted tend to come from younger and smaller teams, and these papers are also more likely to be highly cited. The dataset is large (147K publications across PLoS and Nature), the methods are thorough (OLS, Poisson, logistic, quantile regression, PSM with exact discipline matching), and the authors are transparent about the observational design. This is a genuine new contribution at the team level: prior work on AI in science has focused on individual productivity or macro-level trends, not on how team composition correlates with AI usage. Credit is earned for the multi-method approach and for running the analysis on two independent corpora with consistent results on team age and impact. The PLoS-Nature comparison is useful: the team-age effect shows up in both, while the team-size effect is significant in PLoS but not in Nature PSM (p=0.384), which the authors report honestly.

The stress-test concern about the AI detector is the right thing to worry about, and it lands. The Liang et al. detector is the single load-bearing measurement. If it systematically flags formulaic or non-native English prose as AI-modified, and those styles correlate with junior authors or smaller teams, the headline associations could be partly or wholly artifactual. The paper does not validate the detector on its own corpora, nor does it examine what characteristics the pre-GPT false positives share. This is the central weakness, and it is not minor: it is the measurement on which everything else depends. That said, the authors did not build the detector themselves, it comes from a published Nature Human Behaviour paper, and the thresholding strategy (95th percentile of pre-GPT distribution per journal) is reasonable as far as it goes.

The career-age measurement is a secondary concern: it is computed only within PLoS and Nature publications, so authors with long histories in other venues will appear artificially junior. This could bias the analysis toward finding younger teams among AI-assisted papers, though the direction and magnitude of the bias are hard to gauge without external validation. The causal framing in the prose occasionally overreaches: phrases like 'AI is pushing back against a long-standing trend' imply more than the observational design supports. The authors acknowledge this in their limitations section, but the framing in the introduction and discussion could be tighter.

This paper is for scientometrics researchers and science-policy scholars interested in how AI is changing team structures. It raises a question worth pursuing seriously, even if the current evidence is correlational and measurement-sensitive. It deserves a serious referee who can push hard on the detector-validation question and the career-age measurement. I would accept it for peer review.

Referee Report

3 major / 6 minor

Summary. This paper examines whether AI-assisted writing is associated with changes in research team structure (age and size) and scientific impact, using 147,074 publications from the PLoS family and Nature portfolio (2020–2025). AI usage is measured via the Liang et al. (2025) text-based detection algorithm applied to full-text articles. The authors find that teams with higher AI usage scores tend to be younger and smaller, and that AI-assisted publications are more likely to achieve top-5% FWCI. The analysis employs OLS, quantile regression, Poisson regression, logistic regression, and propensity score matching, with extensive controls and fixed effects.

Significance. The paper addresses a timely and important question at the intersection of AI adoption and scientific team dynamics. Its strengths include the use of full-text (rather than abstract-only) analysis across two distinct publisher portfolios, a multi-method analytical strategy with consistent specifications, and a falsifiable set of predictions that are tested across multiple estimators. The finding that AI-assisted teams are smaller and younger without apparent loss of impact, if robust, has genuine policy relevance. The use of an externally published detector (Liang et al. 2025, Nature Human Behaviour) rather than a home-grown measure is a reasonable methodological choice, though it introduces dependencies discussed below.

major comments (3)

§3.2.1: The AI usage score from Liang et al. (2025) is the sole basis for the key independent variable, and the paper acknowledges that 5% of pre-GPT papers exceed the 95th-percentile threshold; these are necessarily false positives since ChatGPT did not exist. The paper does not examine what characteristics these pre-GPT false-positive papers share. If the detector systematically flags certain human writing styles (e.g., formulaic prose, non-native English patterns) that correlate with author career stage or team composition, the headline findings could be partially or wholly artifactual. The authors should at minimum: (a) analyze the pre-GPT false-positive papers for systematic correlates (discipline, author seniority, team size, journal), and (b) discuss the detector's known sensitivity to writing-style confounds. The discipline and journal fixed effects in the regressions partially, but not
§3.2.2, Eq. (1): Career age is defined as years since an author's first publication found in the PLoS family and Nature Portfolio specifically. A researcher who published extensively in Science, Cell, Lancet, or other venues before appearing in PLoS/Nature would have their career age systematically underestimated. This measurement choice is load-bearing for RQ1 (team age) and could bias results if, for example, AI-assisted writing is more common among researchers who are newer to PLoS/Nature specifically but not necessarily junior in their careers. The authors should either justify why this measurement is adequate given that OpenAlex (which they already use) provides broader publication histories, or acknowledge this as a substantive limitation rather than a minor one.
Table 2, Panel B (Nature): The PSM result for team size in the Nature portfolio is not statistically significant (ATT = -0.131, p = 0.384). Yet the headline claim in the abstract and §4.2 states that 'teams using AI-assisted writing tend to be smaller' as a general finding across both datasets. The Poisson regression (Table 3) does show a significant coefficient for Nature, but the PSM—the authors' preferred quasi-causal method—does not corroborate this for Nature. The paper should more carefully distinguish which findings are robust across methods and datasets and which are not, particularly in the abstract and conclusion.
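
The career-age concern in the second major comment can be made concrete with a toy calculation. The sketch below is illustrative only: the author histories, the venue labels, and the simple "focal year minus first publication year" formula are assumptions made here, not the paper's Eq. (1) or its data. It shows how restricting the publication history to in-scope venues can lower a team's measured age.

```python
# Hypothetical venue labels standing in for the PLoS family and Nature portfolio.
IN_SCOPE = {"PLoS", "Nature"}

def first_year(history, venues=None):
    """Earliest publication year, optionally restricted to a set of venues."""
    return min(year for year, venue in history if venues is None or venue in venues)

def team_age(histories, focal_year, venues=None):
    """Mean career age of a team, with career age = focal year - first publication year.

    The exact form of the paper's Eq. (1) is not reproduced here; this
    difference-in-years definition is an assumption of the sketch.
    """
    ages = [focal_year - first_year(h, venues) for h in histories]
    return sum(ages) / len(ages)

# Two made-up authors of a 2024 PLoS paper, as (year, venue) records.
team = [
    [(2005, "Science"), (2019, "PLoS"), (2024, "PLoS")],  # senior, but new to PLoS
    [(2021, "PLoS"), (2024, "PLoS")],                     # genuinely junior
]

restricted = team_age(team, 2024, IN_SCOPE)  # first appearance in PLoS/Nature only
broad = team_age(team, 2024)                 # first appearance in any venue

print(restricted, broad)
```

In this toy case the venue-restricted measure makes the team look markedly younger than the all-venue measure, which is the direction of error the referee describes.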

minor comments (6)

§3.2.3 is labeled '3.3.3 Team size' in the text, breaking the section numbering sequence.
Table 3: The variable 'First Author Co-author Avg. Career Age (ln)' appears in the Poisson regression but is not described in the variable definitions section. It is unclear how this differs from the 'First Author Prior Team Age (ln)' used in Table 1.
§4.1.1: The text states that for Nature, 'the differences in team age distributions across the three groups are not statistically significant,' yet the OLS regression (Table 1) and PSM (Table 2) show significant results. This apparent tension between the descriptive and inferential results should be clarified.
Abstract: The phrase 'higher probability of research teams that employed AI-assisted writing producing highly impactful publications' is grammatically awkward; consider revision.
The paper would benefit from reporting the actual AI usage score distribution statistics (mean, median, SD) for each of the three groups, beyond the visual distribution in Fig. 1.
Several references have 2026 publication dates (e.g., Hao et al., 2026; He & Bu, 2026). If these are forthcoming or in-press, this should be noted.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for a careful and constructive report. All three major comments identify legitimate issues. We will (1) analyze pre-GPT false-positive papers for systematic correlates and add a discussion of detector writing-style confounds, (2) recompute career age using OpenAlex's broader publication histories and treat the current measurement as a robustness check, and (3) revise the abstract, §4.2, and conclusion to accurately reflect that the team-size finding is robust across methods for PLoS but that PSM does not corroborate the Poisson result for Nature. We view all three revisions as strengthening the paper.

Point-by-point responses

Referee: §3.2.1: The AI usage score from Liang et al. (2025) is the sole basis for the key independent variable, and the paper acknowledges that 5% of pre-GPT papers exceed the 95th-percentile threshold—these are necessarily false positives since ChatGPT did not exist. The paper does not examine what characteristics these pre-GPT false-positive papers share. If the detector systematically flags certain human writing styles (e.g., formulaic prose, non-native English patterns) that correlate with author career stage or team composition, the headline findings could be partially or wholly artifactual. The authors should at minimum: (a) analyze the pre-GPT false-positive papers for systematic correlates (discipline, author seniority, team size, journal), and (b) discuss the detector's known sensitivity to writing-style confounds.

Authors: The referee raises a valid and important concern. We agree that if the detector systematically flags particular human writing styles that correlate with team composition, our findings could be confounded. We will address this in two ways. First, we will conduct the requested analysis of pre-GPT false-positive papers: we will compare the 5% of pre-GPT papers exceeding the threshold to other pre-GPT papers on discipline, author seniority (career age), team size, and journal. If the false positives are systematically associated with younger or smaller teams, this would suggest a writing-style confound that inflates our headline findings; if not, the concern is substantially mitigated. Second, we will add a dedicated paragraph in §3.2.1 discussing the detector's known sensitivity to formulaic prose and non-native English writing patterns, drawing on the validation evidence reported in Liang et al. (2025) and related literature. We will also note that our regression specifications include discipline and journal fixed effects, which partially absorb systematic differences in writing conventions across fields and venues, though we agree these do not fully eliminate individual-level confounds within a discipline-journal cell. We acknowledge that this is a genuine limitation of relying on any text-based detector and will state this explicitly. revision: yes
Referee: §3.2.2, Eq. (1): Career age is defined as years since an author's first publication found in the PLoS family and Nature Portfolio specifically. A researcher who published extensively in Science, Cell, Lancet, or other venues before appearing in PLoS/Nature would have their career age systematically underestimated. This measurement choice is load-bearing for RQ1 (team age) and could bias results if, for example, AI-assisted writing is more common among researchers who are newer to PLoS/Nature specifically but not necessarily junior in their careers. The authors should either justify why this measurement is adequate given that OpenAlex (which they already use) provides broader publication histories, or acknowledge this as a substantive limitation rather than a minor one.

Authors: The referee is correct that restricting career age to first appearance in PLoS/Nature systematically underestimates seniority for researchers whose first publications appeared in other venues. This is a substantive measurement concern, not a minor one. We already use OpenAlex for author disambiguation and metadata, and OpenAlex does index broader publication histories across venues. We will recompute career age using each author's first publication year as recorded in OpenAlex across all indexed venues, which provides a far more accurate measure of academic seniority. We will then re-estimate all team-age analyses (OLS, quantile regression, PSM) with the revised measure. We will retain the original PLoS/Nature-based measure as a robustness check and report both sets of results. If the pattern holds under the broader measure, this strengthens our findings; if it weakens, we will report that honestly. We will also move this measurement issue from its current implicit treatment to an explicit discussion in the limitations section, acknowledging that even OpenAlex coverage is not complete, particularly for researchers in non-English-dominant scientific communities. revision: yes
Referee: Table 2, Panel B (Nature): The PSM result for team size in the Nature portfolio is not statistically significant (ATT = -0.131, p = 0.384). Yet the headline claim in the abstract and §4.2 states that 'teams using AI-assisted writing tend to be smaller' as a general finding across both datasets. The Poisson regression (Table 3) does show a significant coefficient for Nature, but the PSM—the authors' preferred quasi-causal method—does not corroborate this for Nature. The paper should more carefully distinguish which findings are robust across methods and which are not, particularly in the abstract and conclusion.

Authors: The referee is correct. The PSM result for team size in the Nature portfolio is not statistically significant, and our current framing in the abstract and §4.2 overstates the generality of the team-size finding. We will revise the manuscript as follows. First, in §4.2, we will explicitly state that the team-size finding is robust across PSM and Poisson regression for PLoS, but that for Nature, the Poisson coefficient is significant while the PSM ATT is not (p = 0.384). We will offer a substantive interpretation: the Nature sample has far fewer AI-assisted papers (1,581 vs. 11,914 in PLoS), reducing PSM statistical power, and Nature teams are on average larger (mean ~8.5 vs. ~6 in PLoS), meaning the absolute difference of 0.131 authors may be too small to detect reliably. Second, we will revise the abstract to state that AI-assisted writing is associated with smaller teams in PLoS across multiple methods, and with a significant negative coefficient in Poisson regression for Nature, while noting that the PSM result for Nature does not reach significance. Third, the conclusion will be adjusted to reflect this nuance rather than presenting the team-size finding as uniformly robust across both datasets. We agree that precision here is essential for the paper's credibility. revision: yes
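
The false-positive audit promised in the first response could start from a comparison as simple as the one sketched below. This is a minimal illustration under stated assumptions: the records are made-up toy data, the field names are hypothetical, and a plain difference in means stands in for whatever statistical test the authors would actually run.

```python
from statistics import mean

def audit_false_positives(pre_gpt_papers, threshold, fields=("team_age", "team_size")):
    """Compare pre-GPT papers above the detector threshold with the rest.

    Returns, for each field, mean(flagged) - mean(unflagged). A clearly
    negative value for team_age or team_size would hint at a writing-style
    confound, since pre-GPT papers cannot have used ChatGPT.
    """
    flagged = [p for p in pre_gpt_papers if p["score"] > threshold]
    rest = [p for p in pre_gpt_papers if p["score"] <= threshold]
    return {f: mean(p[f] for p in flagged) - mean(p[f] for p in rest) for f in fields}

# Toy pre-GPT records (hypothetical values, not from the paper).
toy = [
    {"score": 0.12, "team_age": 6.0, "team_size": 4},
    {"score": 0.10, "team_age": 8.0, "team_size": 5},
    {"score": 0.03, "team_age": 12.0, "team_size": 7},
    {"score": 0.01, "team_age": 10.0, "team_size": 6},
    {"score": 0.05, "team_age": 11.0, "team_size": 8},
]

gaps = audit_false_positives(toy, threshold=0.092)  # 0.092 is the PLoS threshold
print(gaps)
```

A real audit would also break the comparison down by discipline and journal, as the referee requests, and would need uncertainty estimates before drawing any conclusion.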

Circularity Check

0 steps flagged

No circularity found: the paper's central claims are derived from external data sources and an externally-authored AI detection method, with no self-definitional or fitted-input-as-prediction patterns.

full rationale

The paper's central claims rest on three independent legs: (1) the AI usage score from Liang et al. (2025), an external benchmark authored by a different research group (no author overlap with the present paper); (2) bibliometric data from OpenAlex (author career age, team size, citations) computed via standard scientometric definitions (Eq. 1 for CareerAge, author counts for team size, FWCI normalization by year/discipline/document type); and (3) standard econometric methods (OLS, Poisson, logistic regression, PSM) applied to these variables. The regression equations (Eqs. 2-5) specify AIUsageScore as the independent variable and team age, team size, and Top5% FWCI as dependent variables—none of these are defined in terms of each other. The control variables (prior productivity, prior team structure, prior impact) are computed from historical data that is temporally prior to and distinct from the dependent variables. The 95th-percentile threshold for AI classification is derived from the pre-GPT distribution of the same external detector, not from a parameter the authors fit to their outcome data. No author of the present paper appears in the author list of Liang et al. (2025), so the load-bearing measurement is not a self-citation. The Nature dataset is noted as 'provided by Liang et al. (2025),' but this is data sharing, not a circular definitional dependency. The derivation chain is self-contained against external benchmarks, and no 'prediction' reduces by construction to a fitted input. The concerns raised by the skeptic (detector false positives, writing-style confounds) are correctness and validity risks, not circularity.

Axiom & Free-Parameter Ledger

3 free parameters · 4 axioms · 0 invented entities

The paper introduces no new entities, particles, forces, or theoretical constructs. It uses existing methods (AI detection from Liang et al., regression models, PSM) and existing datasets (PLoS, Nature, OpenAlex). The free parameters are detection thresholds derived from data distributions, and the axioms are measurement assumptions inherited from prior literature.

free parameters (3)

AI detection threshold (PLoS) = 0.092
Set as the 95th percentile of the pre-GPT AI usage score distribution for PLoS publications. This threshold determines which papers are classified as AI-assisted vs. human-written (post-GPT). A different percentile would change group sizes and downstream associations.
AI detection threshold (Nature) = 0.026
Set as the 95th percentile of the pre-GPT AI usage score distribution for Nature publications. Same rationale as above; the much lower threshold reflects different baseline score distributions in Nature vs. PLoS.
Career age classification boundaries = 5, 15, 30 years
Authors are classified into Junior (<5), Early-career (5-15), Mid-career (15-30), and Late-career (>=30) based on externally defined categories (Robinson-Garcia et al., 2020). These are not fitted to data but are imported categorical boundaries.
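
The three parameters above can be read as a small classification procedure. The sketch below is one possible rendering, not the authors' code: the function names, the percentile interpolation method, the handling of scores exactly at a threshold, and the treatment of the 5- and 15-year boundaries are all assumptions made for illustration.

```python
from datetime import date
from statistics import quantiles

# Date used as the pre/post-GPT boundary in the paper's group definitions.
CHATGPT_RELEASE = date(2022, 11, 30)

def pre_gpt_threshold(pre_gpt_scores, pct=95):
    """pct-th percentile of pre-ChatGPT AI usage scores for one corpus."""
    # quantiles(..., n=100) returns the 99 cut points P1..P99.
    # The "inclusive" interpolation method is an assumption of this sketch.
    return quantiles(pre_gpt_scores, n=100, method="inclusive")[pct - 1]

def classify_paper(score, published, threshold):
    """Assign a paper to one of the three comparison groups."""
    if published < CHATGPT_RELEASE:
        return "pre-GPT"
    # Treating a score exactly at the threshold as human-written is an assumption.
    return "AI-assisted" if score > threshold else "human-written (post-GPT)"

def career_stage(career_age):
    """Bin career age in years using the 5 / 15 / 30 boundaries listed above."""
    # Which bin the 5- and 15-year boundaries belong to is an assumption.
    if career_age < 5:
        return "Junior"
    if career_age < 15:
        return "Early-career"
    if career_age < 30:
        return "Mid-career"
    return "Late-career"

toy_pre_gpt = [i / 1000 for i in range(101)]  # hypothetical scores 0.000 .. 0.100
print(pre_gpt_threshold(toy_pre_gpt))
print(classify_paper(0.20, date(2023, 5, 1), 0.092))  # 0.092: PLoS threshold above
print(career_stage(12))
```

Because the threshold is a percentile of the pre-GPT distribution, roughly 5% of pre-GPT papers exceed it by construction, which is why those papers serve as a natural false-positive sample.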

axioms (4)

domain assumption: The AI usage score from Liang et al. (2025) accurately estimates the proportion of AI-modified content in academic text.
This is the foundational measurement assumption. The entire classification of papers into AI-assisted vs. human-written groups depends on this detector being valid and reliable on the PLoS and Nature corpora. No independent validation is performed in this paper.
domain assumption: Career age measured within PLoS and Nature publications is a valid proxy for author seniority.
Equation 1 defines career age as years since first publication in PLoS/Nature. Authors with prior publications in other venues (e.g., IEEE, ACS, Springer) would have their seniority undercounted. This is acknowledged indirectly but not addressed.
domain assumption: Field-weighted citation impact (FWCI) within a 3-year window is a valid measure of scientific impact.
Section 3.2.4 defines team outcome as FWCI. For papers published after 2022, the citation window is less than 3 years, introducing right-censoring bias. The authors acknowledge this in limitations but it remains a load-bearing assumption for the impact claim.
standard math: Article teams (co-authors on a single paper) are a valid proxy for research teams.
Section 3.2 states 'research teams are often measured by article teams' following Leahey (2016) and Liu et al. (2022). This is a standard scientometric convention but conflates one-time collaborations with persistent research teams.
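
The FWCI assumption is easier to assess with the normalization written out. The following sketch assumes the description given in the paper (citation count divided by the mean citation count of papers sharing the same year, discipline, and document type); the field names, the zero-baseline handling, and the rank-based top-share rule are illustrative choices made here, not details confirmed by the source.

```python
from collections import defaultdict
from statistics import mean

def fwci(papers):
    """Citations divided by the mean citations of the same (year, discipline, type) cell."""
    cells = defaultdict(list)
    for p in papers:
        cells[(p["year"], p["discipline"], p["doc_type"])].append(p["citations"])
    baseline = {cell: mean(counts) for cell, counts in cells.items()}
    scores = []
    for p in papers:
        b = baseline[(p["year"], p["discipline"], p["doc_type"])]
        # Returning 0.0 for a cell with no citations at all is an assumption.
        scores.append(p["citations"] / b if b > 0 else 0.0)
    return scores

def top_share_flags(values, share=0.05):
    """Binary indicator for the top `share` of values (ties at the cutoff are all flagged)."""
    k = max(1, round(len(values) * share))
    cutoff = sorted(values, reverse=True)[k - 1]
    return [v >= cutoff for v in values]

# Toy records (hypothetical): two biology articles from 2021 and one physics review.
toy = [
    {"year": 2021, "discipline": "biology", "doc_type": "article", "citations": 10},
    {"year": 2021, "discipline": "biology", "doc_type": "article", "citations": 30},
    {"year": 2021, "discipline": "physics", "doc_type": "review", "citations": 4},
]
scores = fwci(toy)
flags = top_share_flags([float(v) for v in range(1, 21)])
print(scores, sum(flags))
```

The right-censoring issue noted above shows up directly in this formula: for recent papers both the numerator and the cell baseline are built from short citation windows, so the ratio is noisier than for older cohorts.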

Reference graph

Works this paper leans on

24 extracted references · 24 canonical work pages · 2 internal anchors

[1]

Our findings call for policy improvements in research evaluation, funding, and training to address this emerging trend

only how research is produced, but also how research teams are formed and assembled. Our findings call for policy improvements in research evaluation, funding, and training to address this emerging trend. Keywords: Large language models; scientific collaboration; academic writing; team structure; scientific impact; artificial intelligence 1 INTRODUCTION...

[2]

and funding acquiring (Qian et al., 2026). While a growing body of empirical studies have documented the profound impact of AI on science, it has largely done so at either the micro level (i.e., on individual researchers or publications) (Bianchini et al.,

[3]

By contrast, despite a few exceptions (Slade et al., 2026), the meso level, particularly the level of the research team, remains underexplored

or the macro level (i.e., on research system as a whole) (Hao et al., 2026). By contrast, despite a few exceptions (Slade et al., 2026), the meso level, particularly the level of the research team, remains underexplored. AI has evolved beyond its initial role as a methodological tool to function as a general-purpose collaborator, capable of managing high-...

[4]

This study offers several important theoretical and practical contributions

to the full text of publications in the two datasets, to quantify AI adoption in academic writing of each publication, and explore whether and how it is associated with team size, team age, and team outcomes. This study offers several important theoretical and practical contributions. First, it enhances our understanding of how emerging technologies are i...

[5]

Our finding suggests that AI is pushing back against a long‑standing trend in the era of big science: the steady growth of team size

and the increasing complexity of scientific problems, which necessitate larger collaborative efforts. Our finding suggests that AI is pushing back against a long‑standing trend in the era of big science: the steady growth of team size. Second, this research provides critical insight into how AI shapes science at the meso level, specifically within researc...

[6]

general method of invention

help LLMs break down complex problems, mimic human reasoning, and learn from expert input. This lets them adapt quickly to new scientific challenges. AI can also process huge amounts of data automatically, spotting relationships and anomalies that human researchers might miss (Bianchini et al., 2025). As a result, AI is increasingly being integrated into ...

[7]

Nature portfolio dataset consists of 41,080 publications published from January 2020 to September

[8]

It indexes over 200 million publications from at least 109,000 global institutions and about 124,000 venues

OpenAlex is an open-access and comprehensive bibliographic database (Priem et al., 2022). It indexes over 200 million publications from at least 109,000 global institutions and about 124,000 venues. Due to its comprehensive coverage and open-access nature, it has been widely adopted in relevant research (Culbert et al., 2025; Yang et al., 2026; Zhang et a...

[9]

article team

3.2 Variables In this study, the unit of analysis is a research team denoted by 𝑖, which is proxied by an “article team”. Research teams are defined as research groups or project groups who work together (Guzzo & Dickson, 1996). In scientometrics studies or science of science studies, research teams are often measured by “article teams” (Leahey, 2016; Liu...

[10]

Fig.1 provides the distribution of AI usage scores for both PLoS and Nature

That score reflects the share of content altered beyond basic orthographic and grammatical corrections. Fig.1 provides the distribution of AI usage scores for both PLoS and Nature. It shows that most publications demonstrate very low AI usage scores and only a small portion exhibits high AI usage scores. In addition, a higher mean AI usage score is found...

[11]

Since these papers were written prior to the availability of ChatGPT, they are unlikely to apply AI-assisted writing and serve as the pre-treatment baseline. Group 2: Human-written (post-GPT) group: This group consists of publications from November 30, 2022 onward whose AI usage score is below the percentile of the pre-GPT score distribution for the same ...

[12]

3.2.2 Team age This variable is applied to measure the average seniority of authors in a research team

The distribution of AI usage scores in the PLoS family (a) and Nature portfolio (b). 3.2.2 Team age This variable is applied to measure the average seniority of authors in a research team. Following established practice (Xu et al., 2022), we defined the career age of authors as the number of years elapsed between their first publications in the PLoS famil...

[13]

Late-career: Authors with a career age ≥ 30 years. 3.2.3 Team size Following the traditional measurement (Lee et al., 2015; Liu, Jaiswal, et al., 2022), we quantified the team size of 𝑖 by counting the total number of authors listed in a publication’s byline. 3.2.4 Team outcome Team outcome is operationalized as the normalized citation count of a research tea...

work page 2015
[14]

We denoted this variable by $FWCI_i$

To enable cross‑publication comparison, we normalized each publication’s citation count by the mean citation count of all papers sharing the same publication year, discipline, and document type (e.g., article or review). We denoted this variable by $FWCI_i$. This normalized measure captures a publication’s relative citation performance against its peers. An ...
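The normalization described above can be illustrated with a short sketch. The record layout (`citations`, `year`, `discipline`, `doc_type`) and the function name are assumptions made for illustration; the paper's own data pipeline is not shown in the excerpt.

```python
from collections import defaultdict
from statistics import mean


def fwci(papers):
    """Return the field-weighted citation impact for each paper, in input order.

    Each paper is a dict with keys 'citations', 'year', 'discipline' and
    'doc_type' (hypothetical field names). A paper's FWCI is its citation
    count divided by the mean citation count of all papers sharing the same
    year, discipline, and document type.
    """
    groups = defaultdict(list)
    for p in papers:
        groups[(p["year"], p["discipline"], p["doc_type"])].append(p["citations"])

    scores = []
    for p in papers:
        baseline = mean(groups[(p["year"], p["discipline"], p["doc_type"])])
        # Assumption: if every peer has zero citations, report 0.0 rather than divide by zero.
        scores.append(p["citations"] / baseline if baseline > 0 else 0.0)
    return scores


# Illustrative records (synthetic, not data from the paper).
sample = [
    {"citations": 10, "year": 2021, "discipline": "Biology", "doc_type": "article"},
    {"citations": 30, "year": 2021, "discipline": "Biology", "doc_type": "article"},
    {"citations": 4, "year": 2021, "discipline": "Biology", "doc_type": "review"},
]
sample_fwci = fwci(sample)
```

A value above 1 means the paper is cited more than the average of its peers; a value below 1 means it is cited less.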

work page 1947
[15]

25th, 50th, and 75th

$TeamAge_i = \beta_0 + \beta_1 AIUsageScore_i + \Gamma X_i + Time_i + Discipline_i + Journal_i + \epsilon_i$ (2) where $i$ denotes a research team; $AIUsageScore_i$ represents the key independent variable, the estimated proportion of AI-assisted writing for a publication produced by $i$; $TeamAge_i$ is the dependent variable, defined as the average career age of all authors for the publication produced by $i$. $X_i$...
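To make the specification in Eq. (2) concrete, the sketch below fits an ordinary least squares model with an intercept, the AI usage score, control variables, and dummy-coded fixed effects. It is a stdlib-only illustration under stated assumptions: the function names are hypothetical, the data are synthetic, and no standard errors are computed. The authors presumably used a statistical package, which the excerpt does not name.

```python
def one_hot(labels):
    """Dummy-encode a categorical column, dropping the first level as baseline."""
    levels = sorted(set(labels))[1:]
    return [[1.0 if lab == lev else 0.0 for lev in levels] for lab in labels]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(y, ai_score, controls, fixed_effects):
    """Return OLS coefficients [beta_0, beta_1, ...] via the normal equations.

    beta_1 is the coefficient on the AI usage score. `controls` holds one list
    of control values per observation; `fixed_effects` holds one list of labels
    per categorical effect (e.g. time, discipline, journal).
    """
    dummies = [one_hot(col) for col in fixed_effects]
    rows = []
    for i in range(len(y)):
        row = [1.0, ai_score[i]] + list(controls[i])
        for d in dummies:
            row += d[i]
        rows.append(row)
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    return solve(xtx, xty)


# Synthetic example: team age falls by 4 years per unit of AI usage score.
ai = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
ctrl = [[1.0], [3.0], [2.0], [5.0], [4.0], [6.0]]
journal = ["A", "B", "A", "B", "A", "B"]
age = [10 - 4 * s + 0.5 * c[0] + (2 if j == "B" else 0)
       for s, c, j in zip(ai, ctrl, journal)]
beta = ols(age, ai, ctrl, [journal])
```

A negative `beta[1]` corresponds to the direction the paper reports: higher AI usage scores are associated with younger teams.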

work page 2022
[16]

The summary statistics of variables employed in this study are shown in Table S1 in the Appendix. 3.3.3 Propensity score matching Publications may differ systematically in key confounding variables that are correlated with both the AI usage score and the three dependent variables. This raises concerns that any observed relationship between AI usage and the dep...

work page 2008
[17]

When the dependent variable was Top5% FWCI, we excluded FWCI from the covariates

When the dependent variable was team size or team age, the covariates included FWCI and ten author-level controls. When the dependent variable was Top5% FWCI, we excluded FWCI from the covariates. We used 1:1 nearest-neighbor matching without replacement, imposing a caliper of 0.2 standard deviations of the propensity score to enforce strict simila...
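The matching step can be sketched as a greedy procedure over propensity scores that have already been estimated. Several details are assumptions rather than the authors' stated choices: the processing order of treated units, computing the standard deviation over the pooled treated and control scores, and applying the caliper to the raw score rather than its logit. The function name is hypothetical.

```python
from statistics import stdev


def match_1to1(treated, control, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    `treated` and `control` map unit ids to propensity scores. A treated unit
    is paired with the closest unused control only if their score distance is
    within `caliper_sd` standard deviations of the pooled propensity scores.
    """
    pooled = list(treated.values()) + list(control.values())
    caliper = caliper_sd * stdev(pooled)
    available = dict(control)  # controls not yet used
    pairs = []
    # Assumption: treated units are processed in ascending score order.
    for t_id, t_ps in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_ps))
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # without replacement
    return pairs


# Synthetic scores (not data from the paper).
ai_assisted = {"t1": 0.30, "t2": 0.80}
human_written = {"c1": 0.31, "c2": 0.32, "c3": 0.10}
matched = match_1to1(ai_assisted, human_written)
```

Treated units with no control inside the caliper are dropped, which is how the caliper enforces similarity at the cost of sample size.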

work page 2011
[18]

publish or perish

5 DISCUSSION AND CONCLUSION Based on 147,074 full-text publications from the PLoS family and the Nature portfolio since 2020, we examined whether and how AI-assisted writing shapes research teams, focusing on team age, team size, and team outcomes. Our findings show that teams using AI-assisted writing tend to be younger and smaller. Importantly, this shi...

work page 2020
[19]

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

to ensure transparent and accountable use of these tools, preserving the integrity of scholarly communication. This study has several limitations. First, it only analyzes publications from two sources: the PLoS family and the Nature portfolio. The findings therefore apply directly only to these two open‑access publishers. Whether they generalize to other ...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1016/j.respol.2005.01.014 2022
[20]

promising technology

https://doi.org/10.1007/s10462-024-10902-3
Bouschery, S. G., Blazevic, V., & Piller, F. T. (2023). Augmenting human innovation teams with artificial intelligence: Exploring transformer-based language models. Journal of Product Innovation Management, 40(2), 139-153. https://doi.org/10.1111/jpim.12656
Brynjolfsson, E., Li, D., & Raymond, L. ...

work page doi:10.1007/s10462-024-10902-3 2023
[21]

R., Zadeh, M

https://doi.org/10.1057/s41599-026-06956-z
Jaiswal, A., Babu, A. R., Zadeh, M. Z., Banerjee, D., & Makedon, F. (2021). A Survey on Contrastive Self-Supervised Learning. Technologies, 9(1),

work page doi:10.1057/s41599-026-06956-z 2021
[22]

Death of the Renaissance Man

https://www.mdpi.com/2227-7080/9/1/2
Jones, B. F. (2009). The Burden of Knowledge and the “Death of the Renaissance Man”: Is Innovation Getting Harder? The Review of Economic Studies, 76(1), 283-317. https://doi.org/10.1111/j.1467-937X.2008.00531.x
Jones, B. F., Wuchty, S., & Uzzi, B. (2008). Multi-university research teams: Shifting impact, geography, an...

work page doi:10.1111/j.1467-937x.2008.00531.x 2009
[23]

Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews

Journal of the Association for Information Science and Technology, 66(7), 1323-1332. https://doi.org/10.1002/asi.23266
Lee, Y.-N., Walsh, J. P., & Wang, J. (2015). Creativity in scientific teams: Unpacking novelty and impact. Research Policy, 44(3), 684-697. https://doi.org/10.1016/j.respol.2014.10.007
Liang, W., Izzo, Z...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.1002/asi.23266 2015
[24]

https://doi.org/10.1057/s41599-023-02392-5
Liu, Y., Xie, Y., Shen, X., & Wu, D. (2026). Artificial intelligence use and scientific innovation. Journal of the Association for Information Science and Technology, 77(5), 682-698. https://doi.org/10.1002/asi.70043
Lund, B. D., Wang, T., Mannuru, N. R., Nie, B., Shimray, S., & Wang, Z. (2023). C...

work page doi:10.1057/s41599-023-02392-5 2026

[1] [1]

Our findings call for policy improvements in research evaluation, funding, and training to address this emerging trend

only how research is produced, but also how research teams are formed and assembled. Our findings call for policy improvements in research evaluation, funding, and training to address this emerging trend. Keywords: Large language models; scientific collaboration; academic writing; team structure; scientific impact; artificial intelligence 1 INTRODUCTION...

work page 2007

[2] [2]

and funding acquisition (Qian et al., 2026). While a growing body of empirical studies has documented the profound impact of AI on science, it has largely done so at either the micro level (i.e., on individual researchers or publications) (Bianchini et al.,

work page 2026

[3] [3]

By contrast, despite a few exceptions (Slade et al., 2026), the meso level, particularly the level of the research team, remains underexplored

or the macro level (i.e., on the research system as a whole) (Hao et al., 2026). By contrast, despite a few exceptions (Slade et al., 2026), the meso level, particularly the level of the research team, remains underexplored. AI has evolved beyond its initial role as a methodological tool to function as a general-purpose collaborator, capable of managing high-...

work page 2026

[4] [4]

This study offers several important theoretical and practical contributions

to the full text of publications in the two datasets, to quantify AI adoption in the academic writing of each publication and to explore whether and how it is associated with team size, team age, and team outcomes. This study offers several important theoretical and practical contributions. First, it enhances our understanding of how emerging technologies are i...

work page 2007

[5] [5]

Our finding suggests that AI is pushing back against a long‑standing trend in the era of big science: the steady growth of team size

and the increasing complexity of scientific problems, which necessitate larger collaborative efforts. Our finding suggests that AI is pushing back against a long‑standing trend in the era of big science: the steady growth of team size. Second, this research provides critical insight into how AI shapes science at the meso level, specifically within researc...

work page 2024

[6] [6]

general method of invention

help LLMs break down complex problems, mimic human reasoning, and learn from expert input. This lets them adapt quickly to new scientific challenges. AI can also process huge amounts of data automatically, spotting relationships and anomalies that human researchers might miss (Bianchini et al., 2025). As a result, AI is increasingly being integrated into ...

work page 2025

[7] [7]

Nature portfolio dataset consists of 41,080 publications published from January 2020 to September

work page 2020
