Unintended Negative Impacts of Promotional Language in Patent Evaluation
Pith reviewed 2026-05-08 16:28 UTC · model grok-4.3
The pith
Higher promotional language in patent applications links to lower grant, transfer, and appeal success rates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A higher frequency of words from a validated 135-term promotional lexicon is negatively associated with the probability that a patent application is granted, ownership is transferred, or a rejection is successfully appealed. The association persists after matching and regression controls for confounders and holds across technological areas. In matched samples the success gap between lowest and highest promotional-density quintiles is 5.5, 5.9, and 5.3 percentage points for the three outcomes. Promotional language nevertheless correlates positively with combinatorial novelty and forward citation impact, indicating it signals rather than conceals technological strength. Tolerance for the same framing varies with examiner characteristics: men and experienced examiners accept promotional narratives more readily than women and novice examiners.
What carries the argument
A 135-word promotional lexicon combined with large-scale matching and regression on 2.7 million USPTO applications to isolate language effects on grant, transfer, and appeal outcomes.
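The density measure at the heart of this design can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the four-word `LEXICON` is a hypothetical stand-in for the 135-term list (which is not reproduced here), and the tokenizer, the per-token normalization, and the rank-based quintile rule are all assumptions.

```python
import re

# Hypothetical stand-in lexicon; the paper's actual 135 terms are not listed here.
LEXICON = {"novel", "unique", "remarkable", "groundbreaking"}


def promotional_density(text, lexicon=LEXICON):
    """Share of word tokens in `text` that appear in `lexicon` (assumed definition)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return hits / len(tokens)


def quintile_labels(values):
    """Assign each value a quintile from 1 (lowest) to 5 (highest) by rank.

    Ties are broken by input position because Python's sort is stable.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, index in enumerate(order):
        labels[index] = rank * 5 // n + 1
    return labels
```

For example, `promotional_density("A novel and unique device")` counts two lexicon hits among five tokens. Real patent text would likely need more careful tokenization (hyphenated terms, claim numbering), which this sketch ignores.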
Load-bearing premise
The 135-word lexicon captures promotional framing without systematic bias or omission in the legal-technical language of patents, and the matching plus regression setup accounts for all relevant confounders.
What would settle it
Re-running the analysis with an alternative promotional word list or additional controls for unmeasured technology quality that eliminates or reverses the negative association with grant probability.
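A rough sketch of what such a re-run could look like on raw (unmatched) data is given below. Everything here is an assumption for illustration: the function names are hypothetical, the density definition is a simple per-token share, and the gap is an unadjusted difference in success rates with no matching or regression controls, so it is much cruder than the paper's design.

```python
import re


def density(text, lexicon):
    """Share of tokens in `text` found in `lexicon` (assumed definition)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(t in lexicon for t in tokens) / len(tokens) if tokens else 0.0


def quintile_gap(densities, outcomes):
    """Success rate in the lowest-density fifth minus that in the highest fifth.

    `outcomes` holds 1 for success (e.g. granted) and 0 otherwise.
    """
    order = sorted(range(len(densities)), key=lambda i: densities[i])
    k = len(order) // 5
    if k == 0:
        raise ValueError("need at least 5 observations")
    low = [outcomes[i] for i in order[:k]]
    high = [outcomes[i] for i in order[-k:]]
    return sum(low) / k - sum(high) / k


def gap_by_lexicon(texts, outcomes, lexicons):
    """Recompute the raw quintile gap once per candidate lexicon."""
    return {
        name: quintile_gap([density(t, lex) for t in texts], outcomes)
        for name, lex in lexicons.items()
    }
```

Passing `lexicons={"full": full_list, "ablated": full_list - {"novel", "improved"}}` would show whether the sign of the gap survives removing terms that also serve a statutory function. A positive gap means the low-density group succeeds more often.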
Original abstract
Promotional language has been increasingly used to aid the communication of innovative ideas in science. Yet, less is known about its role in the context of technological innovation. Here, we use a validated and domain-diagnosed lexicon of 135 promotional words to study the association between promotional language and patent evaluation outcomes among 2.7 million USPTO patent applications. Our large-scale study reveals three unexpected findings. First, in contrast to scientific evaluation, we find that a higher frequency of promotional words is negatively associated with the probability of an application being (i) granted a patent, (ii) transferred ownership, and (iii) successfully appealed. This promotional penalty holds even after accounting for a range of confounding factors and is largely robust across different technological areas. Among matched samples, the difference in the success rate between the lowest and highest promotional density quintile is 5.5, 5.9, and 5.3 percentage points for patentability, transferability, and rejection reversal. Second, contrary to institutional skepticism, we show that promotional language is not a mask of weak technology, but objectively reflects the degree of combinatorial novelty and future citation impact. Third, digging into the mechanisms, we find that the tolerance to promotional framing is strongly moderated by human factors, with men and experienced examiners showing a higher acceptance of promotional narratives than women and novice examiners. By revealing an emerging paradox in the patent system, our study offers theoretical and practical implications for improving patent evaluation through more objective scrutiny of linguistic patterns in patent filings.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript analyzes 2.7 million USPTO patent applications using a 135-word promotional lexicon (originally validated in science and described as domain-diagnosed) to examine associations with evaluation outcomes. It reports that higher promotional density is negatively linked to grant probability, ownership transfer, and successful appeal reversal (differences of 5.5, 5.9, and 5.3 percentage points across quintiles in matched samples), even after controls and matching; that promotional language correlates with combinatorial novelty and future citations rather than masking weakness; and that effects are moderated by examiner gender and experience, with robustness across technological fields.
Significance. If the associations hold after addressing measurement concerns, the study identifies an unintended penalty in the patent system against language that objectively signals innovation, with implications for examiner training and evaluation design. The large sample, matched-sample design, and cross-field robustness checks provide a solid empirical base for these claims.
major comments (3)
- [Methods] Methods: The 135-word lexicon is applied to patent texts without patent-specific validation (e.g., no expert annotation of context, inter-rater reliability on excerpts, or ablation of terms like 'novel' or 'improved' that fulfill statutory requirements). This risks conflating promotional tone with claim breadth or legal phrasing, which could bias the density measure and undermine the reported negative associations with grant/transfer/appeal outcomes.
- [Results] Results: Exact regression specifications (full controls, fixed effects, quintile boundary definitions, and sensitivity to alternative word lists) are not detailed, despite claims of robustness; this makes it difficult to evaluate whether the 5.5–5.9 pp differences in matched samples are sensitive to specification choices or measurement error correlated with application quality.
- [Results] Results/Discussion: The claim that promotional language 'objectively reflects' novelty and citation impact requires clearer separation of these measures from the lexicon itself, plus checks on whether associations persist after additional controls for technological complexity or filing ambition.
minor comments (2)
- [Abstract] Abstract and Methods: Clarify the precise 'domain-diagnosis' process applied to patents, including any adjustments to the original science-validated list.
- [Tables] Tables/Figures: Report standard errors or confidence intervals alongside the quintile success-rate differences and ensure all robustness checks include error bars.
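The uncertainty reporting requested here could, in its simplest form, use a Wald interval for a difference in two proportions. The sketch below assumes two independent groups; the function name and the inputs are hypothetical, and a matched-sample design like the paper's may call for a paired or cluster-robust estimator instead.

```python
import math


def diff_in_proportions_ci(successes_a, n_a, successes_b, n_b, z=1.96):
    """Difference p_a - p_b with its standard error and a Wald interval.

    Assumes independent groups; z=1.96 gives an approximate 95% interval.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, se, (diff - z * se, diff + z * se)
```

Calling it with the success counts and sizes of the lowest and highest density quintiles yields the percentage-point gap (multiply by 100) together with an interval that can be drawn as an error bar.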
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. We address each major comment point by point below, indicating the revisions we will incorporate to strengthen the manuscript.
- Referee: [Methods] Methods: The 135-word lexicon is applied to patent texts without patent-specific validation (e.g., no expert annotation of context, inter-rater reliability on excerpts, or ablation of terms like 'novel' or 'improved' that fulfill statutory requirements). This risks conflating promotional tone with claim breadth or legal phrasing, which could bias the density measure and undermine the reported negative associations with grant/transfer/appeal outcomes.
  Authors: We acknowledge the importance of domain-specific validation. The lexicon originates from prior work validated in scientific contexts and applied here as a measure of promotional tone. In the revision, we will add an ablation analysis excluding terms such as 'novel' and 'improved' to test sensitivity of the main results. We will also report inter-rater reliability from manual annotation of a random sample of 500 patent excerpts by two independent coders, focusing on whether promotional usage aligns with or diverges from legal phrasing. These steps will directly address potential conflation with claim breadth.
  Revision: yes
- Referee: [Results] Results: Exact regression specifications (full controls, fixed effects, quintile boundary definitions, and sensitivity to alternative word lists) are not detailed, despite claims of robustness; this makes it difficult to evaluate whether the 5.5–5.9 pp differences in matched samples are sensitive to specification choices or measurement error correlated with application quality.
  Authors: We agree that full transparency on specifications is essential for evaluating robustness. The revised manuscript will include a new appendix section providing the complete regression equations, all control variables and fixed effects (year, technology class, examiner), exact quintile boundaries derived from the promotional density distribution, and results from alternative word lists. This documentation will allow readers to confirm that the reported percentage-point differences in matched samples are not driven by specification choices or correlated measurement error.
  Revision: yes
- Referee: [Results] Results/Discussion: The claim that promotional language 'objectively reflects' novelty and citation impact requires clearer separation of these measures from the lexicon itself, plus checks on whether associations persist after additional controls for technological complexity or filing ambition.
  Authors: We will clarify the independence of measures in the revision: combinatorial novelty is computed from patent classification co-occurrence patterns and citation impact from forward citations, neither of which depends on the promotional lexicon. We will also add explicit robustness checks controlling for technological complexity (number of claims and IPC subclasses) and filing ambition (inventor team size and priority claims). These additions will demonstrate that the positive associations with novelty and citations, as well as the negative associations with evaluation outcomes, persist after accounting for these factors.
  Revision: yes
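The inter-rater reliability check promised in the first response is commonly summarized with Cohen's kappa. The rebuttal does not say which statistic the authors will use, so the choice of kappa, the function name, and the binary labels (1 for promotional usage, 0 for legal or technical usage) are assumptions made for this sketch.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items where both coders chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

With 500 annotated excerpts, each coder's labels would be passed as one list. Values near 1 indicate agreement well beyond chance, and values near 0 indicate agreement no better than chance.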
Circularity Check
Empirical regression analysis on external data with a pre-validated lexicon; no derivation reduces to its inputs by construction.
The paper reports observational associations between word counts from a pre-existing 135-word lexicon and USPTO outcomes (grant, transfer, appeal) via matching and regression on 2.7 million applications. No equations, fitted parameters, or predictions are presented as independent results; the lexicon is described as already validated and domain-diagnosed prior to this study. Claims about novelty and citations are treated as separate objective measures rather than derived from the promotional counts. No self-citation chain or self-definitional step is load-bearing for the central findings.
Axiom & Free-Parameter Ledger
free parameters (1)
- promotional density quintile boundaries
axioms (2)
- domain assumption: The pre-validated 135-word lexicon captures promotional language without domain-specific bias when applied to patent text.
- domain assumption: All relevant confounders for patent success are captured by the included controls and matching procedure.