pith. machine review for the scientific record.

arxiv: 2604.20477 · v1 · submitted 2026-04-22 · 🧬 q-bio.PE

Recognition: unknown

Emergence biases in molecular evolution

Nikolaos Vakirlis, Timothy Fuqua


Pith reviewed 2026-05-09 22:55 UTC · model grok-4.3

classification 🧬 q-bio.PE
keywords: emergence bias, molecular evolution, de novo proteins, promoters, enhancers, evolutionary innovation, mutation bias, new functions

The pith

Genetic sequences carry an inherent bias that makes mutations more likely to produce new functions like promoters or proteins.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This perspective formalizes emergence bias as a molecular predisposition in genetic sequences that, when mutated, favors or disfavors the acquisition of new functions or phenotypes. Drawing from existing research on how promoters, enhancers, and de novo proteins arise, the authors argue these cases share a common underlying bias rather than being isolated phenomena. If correct, this bias would mean that evolutionary innovations are not entirely random but guided by sequence-intrinsic properties, making certain evolutionary paths more probable. This matters because it offers a new lens for understanding how biological novelty originates at the molecular level, potentially explaining patterns of repeated evolution.

Core claim

Biases in molecular evolution can influence evolutionary trajectories. They have been described in contexts such as development and mutation, but they have not previously been formalized for the acquisition of new functions. The authors define emergence bias as the molecular predisposition that, upon mutation, biases a genetic sequence towards or against gaining new functions or phenotypes. Such biases have been observed in earlier studies of promoters, enhancers, and de novo proteins. The authors synthesize these findings to support the concept and speculate on its molecular underpinnings, suggesting that emergence biases may play an important role in evolutionary innovations.

What carries the argument

emergence bias: the molecular predisposition in a genetic sequence that biases it toward or against acquiring new functions or phenotypes upon mutation
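
One way to make this definition concrete, offered as an illustrative sketch and not as the authors' method, is to count how many single-nucleotide substitutions of a sequence create a given motif. The function names, the example sequences, and the use of exact motif matching as a stand-in for "gaining a function" are all assumptions made for illustration. TATAAT is the textbook consensus of the bacterial -10 promoter element.

```python
BASES = "ACGT"


def single_mutants(seq):
    """Yield every sequence that differs from seq by one substitution."""
    for i, original in enumerate(seq):
        for base in BASES:
            if base != original:
                yield seq[:i] + base + seq[i + 1:]


def emergence_fraction(seq, motif):
    """Fraction of single-substitution mutants of seq that contain motif.

    Assumption: seq itself does not already contain the motif, so every
    hit counts as a newly gained match. Exact string matching is a crude,
    hypothetical proxy for acquiring a function.
    """
    mutants = list(single_mutants(seq))
    hits = sum(1 for mutant in mutants if motif in mutant)
    return hits / len(mutants)


# Hypothetical example sequences: one sits a single substitution away
# from the motif, the other is far from it in sequence space.
near = "GGTATACTGG"
far = "GGCCGGCCGG"
print(emergence_fraction(near, "TATAAT"))
print(emergence_fraction(far, "TATAAT"))
```

Under this toy measure, a sequence with a higher fraction would be read as more predisposed towards gaining the motif. A realistic analysis would replace exact matching with a measured or modelled activity threshold.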

If this is right

  • The emergence of new regulatory elements such as promoters and enhancers occurs more readily due to inherent sequence properties rather than mutation and selection alone.
  • De novo protein birth is facilitated in certain genetic backgrounds by these predispositions.
  • Evolutionary innovations are channeled toward specific outcomes by molecular biases, affecting the likelihood of functional novelty.
  • Models of molecular evolution should account for emergence biases in addition to mutation rates and selective pressures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • If general, emergence biases could help explain convergent evolution where similar novelties arise repeatedly across independent lineages.
  • Synthetic biology experiments could deliberately introduce or remove biased sequences to accelerate or suppress the evolution of new functions.
  • The idea suggests testable predictions for whether analogous biases influence other molecular innovations, such as the origin of new metabolic enzymes.
  • Directed evolution protocols might be optimized by selecting starting sequences that exhibit high emergence bias for the target function.

Load-bearing premise

That the patterns observed in studies of promoters, enhancers, and de novo proteins reflect a general molecular predisposition rather than case-specific or selection-driven effects.

What would settle it

A controlled experiment measuring the frequency of new functional emergence after random mutations across sequence classes, finding no consistent difference between those predicted to have high versus low emergence bias.
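
The comparison described above could be analysed, for instance, with a two-sided Fisher exact test on a 2x2 table of emergence counts. The sketch below is an assumption about how such data might be tested and is not a protocol from the paper. The function name and the counts in the usage lines are hypothetical, and only the Python standard library is used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Hypothetical layout:
      a = predicted high-bias sequences that gained the function
      b = predicted high-bias sequences that did not
      c = predicted low-bias sequences that gained the function
      d = predicted low-bias sequences that did not
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x gains in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed
    # one; the small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Made-up counts for illustration only, not data from any study.
p_value = fisher_exact_two_sided(18, 82, 6, 94)
print(p_value)
```

A large p-value across many sequence classes would be in line with the "no consistent difference" outcome described above. A small one would only show a difference in emergence frequency; it would not by itself separate a sequence-intrinsic bias from selection.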

Figures

Figures reproduced from arXiv: 2604.20477 by Nikolaos Vakirlis, Timothy Fuqua.

Figure 4: Information theory and sequence space influence emergence. (A) A hypothetical representation of different functions (dashed gray shapes labeled I-V) in genotypic space (x and y-axes). The colored circles correspond to the three different DNA sequences in the genotypic space. The arrows show which functions these sequences are biased towards based on their proximity in the space. (B) The information content…
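
The caption of panel (B) is truncated here, so the following is only a generic sketch of how the information content of a DNA motif is commonly computed: 2 bits minus the Shannon entropy at each position. It assumes a uniform background and omits small-sample corrections. The function name and the example columns are hypothetical and are not taken from the figure.

```python
from math import log2


def information_content(columns):
    """Per-position information in bits for a DNA motif.

    columns: list of dicts mapping a base to its frequency at that
    position; each dict is assumed to sum to 1.
    """
    bits = []
    for column in columns:
        # Shannon entropy of the base frequencies, skipping zero entries.
        entropy = -sum(p * log2(p) for p in column.values() if p > 0)
        # A uniform four-letter background carries log2(4) = 2 bits.
        bits.append(2.0 - entropy)
    return bits


# Hypothetical three-position motif: fixed, two-way, and unconstrained.
motif = [
    {"A": 1.0},
    {"A": 0.5, "T": 0.5},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]
print(information_content(motif))
print(sum(information_content(motif)))
```

Motifs with lower total information are matched by a larger share of random sequences, which is one intuitive reason that a sequence could lie closer to some functions than to others.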
Original abstract

Biases in molecular evolution can significantly influence evolutionary trajectories. They have been described in a variety of contexts such as development and mutation, but not for acquiring new functions (i.e. emergence). Here, we formalize the term, emergence bias, as the molecular predisposition that, upon mutation, biases a genetic sequence towards or against gaining new functions or causing new phenotypes. These biases have been observed in previous studies for the emergence of promoters, enhancers, and de novo proteins, but never formally characterized as such. In this Perspective piece, we describe these studies and synthesize their findings through the prism of a unifying term, emergence bias, to provide support for this new concept, and speculate on its molecular underpinnings. We believe that emergence biases may play an important role in evolutionary innovations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript is a Perspective article that formalizes the concept of 'emergence bias' as a molecular predisposition biasing mutational outcomes towards or against gaining new functions or phenotypes. It synthesizes findings from prior studies on the emergence of promoters, enhancers, and de novo proteins under this term, speculates on molecular underpinnings, and suggests these biases may play an important role in evolutionary innovations.

Significance. If the proposed unifying concept holds, it offers a valuable conceptual framework for integrating observations across molecular systems in evolutionary biology, potentially guiding research into common mechanisms of functional innovation beyond known mutational and developmental biases. The manuscript is credited for its clear synthesis of disparate literature under a single term, which provides a useful lens even without new data or models.

major comments (1)
  1. [Synthesis of prior studies] The synthesis of studies on promoters, enhancers, and de novo proteins does not address or distinguish whether the reported patterns reflect a general molecular predisposition (emergence bias) versus case-specific selection effects in the experimental or natural contexts of those studies; this distinction is load-bearing for the unifying claim.
minor comments (1)
  1. [Abstract] The abstract phrasing 'never formally characterized as such' could be revised for precision, as the work is a perspective synthesis rather than a primary formalization.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their positive evaluation of the manuscript as a Perspective and for the constructive comment on distinguishing general molecular predispositions from selection effects. We address this point below and will revise accordingly.

Point-by-point responses
  1. Referee: The synthesis of studies on promoters, enhancers, and de novo proteins does not address or distinguish whether the reported patterns reflect a general molecular predisposition (emergence bias) versus case-specific selection effects in the experimental or natural contexts of those studies; this distinction is load-bearing for the unifying claim.

    Authors: We agree that this distinction is important for the strength of the unifying concept. The Perspective synthesizes patterns observed across independent systems (promoters in random libraries, enhancers in reporter assays, and de novo proteins in both experimental and natural contexts), which were not designed to test emergence bias a priori. While we cannot retroactively eliminate all selection effects from the cited studies, the consistency of mutational biases toward functional emergence in controlled experimental setups (e.g., unselected random sequence libraries) provides evidence for a molecular-level predisposition. In the revised manuscript, we will add an explicit discussion section addressing this point, clarifying the limitations of the existing data, and outlining how future experiments could better isolate emergence bias from selection. This will refine rather than overstate the unifying claim.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity

Full rationale

This is a perspective piece that formalizes the term 'emergence bias' as a conceptual label for molecular predispositions observed in prior studies on promoters, enhancers, and de novo proteins. No equations, quantitative models, predictions, or derivations are present. The argument synthesizes external observations under a new unifying term without any internal fitting, self-referential definitions, or load-bearing self-citations that reduce the central claim to its own inputs. The claim is explicitly hedged and offered as a lens rather than a demonstrated mechanism, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the assumption that prior case studies reflect a general molecular mechanism; no free parameters or new entities with independent evidence are introduced beyond the conceptual label itself.

axioms (1)
  • domain assumption: Biases toward new function acquisition have been observed in studies of promoters, enhancers, and de novo proteins.
    Invoked in the abstract as the basis for synthesis without re-examination of the original data.
invented entities (1)
  • emergence bias (no independent evidence)
    purpose: To unify observations of molecular predispositions that bias sequences toward or against gaining new functions upon mutation.
    Newly coined term without new supporting measurements or falsifiable predictions in this manuscript.

pith-pipeline@v0.9.0 · 5419 in / 1205 out tokens · 58542 ms · 2026-05-09T22:55:07.495787+00:00 · methodology

