The origin of the genetic code is encrypted in the structure of present-day transfer RNAs
Pith reviewed 2026-05-10 14:52 UTC · model grok-4.3
The pith
A genealogical tree built from Bacillus subtilis tRNA sequences is claimed to reflect the order in which amino acids were added to the genetic code.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Comparison of tRNA primary structures from Bacillus subtilis produces a genealogical tree whose branching order corresponds to the chronological entry of amino acids into the Universal Codon Table, indicating that present-day tRNA sequences preserve information about the code's origin.
What carries the argument
The genealogical tree constructed by comparing tRNA primary sequences, which serves as a record of the temporal order of amino acid incorporation into the codon assignments.
If this is right
- The genetic code assembled gradually through successive addition of amino acids carried by specific tRNAs.
- tRNA molecules were present and functional at the earliest stages of code formation.
- The order of amino acid addition follows a pattern visible in the current code structure.
- Early life used a simpler set of amino acids that expanded over time according to this sequence.
- This provides a scenario for how the 20 amino acids colonized the codon table.
Where Pith is reading between the lines
- If the tree order holds across other species, it would strengthen the case for a universal historical sequence.
- Experiments could test whether a reduced tRNA set covering fewer amino acids can still support basic translation.
- The proposed poly-tRNA theory could be explored by modeling how multiple tRNAs interacted under prebiotic conditions.
- This approach might apply to reconstructing the code's evolution in other genetic systems.
Load-bearing premise
The primary sequences of modern tRNAs have remained sufficiently unchanged since their origin to allow the constructed tree to accurately reflect the historical order of amino acid entry without major disruptions from mutations or gene transfers.
What would settle it
Demonstrating that a phylogenetic tree from tRNAs of another bacterium yields a different branching order that does not align with the same amino acid addition sequence, or finding that the tree order contradicts known codon assignments in a systematic way.
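One way to make this test concrete is to score how closely two proposed entry orders agree. The sketch below computes a Kendall rank correlation between two orderings of the same items. The function name `kendall_tau` is an illustrative assumption, and any orderings passed to it would have to come from real trees; nothing here is taken from the paper.

```python
from itertools import combinations


def kendall_tau(order_a, order_b):
    """Kendall rank correlation between two orderings of the same items.

    Both arguments are sequences listing the same labels (for example,
    amino acids) from earliest to latest, with no ties or repeats.
    Returns a value in [-1, 1]: 1 for identical order, -1 for reversed.
    """
    if len(order_a) < 2:
        raise ValueError("need at least two items to compare")
    if len(set(order_a)) != len(order_a) or len(set(order_b)) != len(order_b):
        raise ValueError("orderings must not contain repeated items")
    if set(order_a) != set(order_b):
        raise ValueError("orderings must contain the same items")

    # Position of each item in the second ordering.
    rank_b = {item: position for position, item in enumerate(order_b)}

    concordant = 0
    discordant = 0
    # combinations() yields pairs (x, y) where x precedes y in order_a.
    for x, y in combinations(order_a, 2):
        if rank_b[x] < rank_b[y]:
            concordant += 1
        else:
            discordant += 1

    return (concordant - discordant) / (concordant + discordant)
```

A value near 1 for orders derived from two different bacteria would count in favor of a shared historical sequence; a value near 0 or below would count against it. Deciding what threshold counts as "different" is left open here, as it is in the text above.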
Original abstract
Background/Objectives: Resolving the origin of the genetic code is fundamental to understanding how life began its journey out of the chemical world. Since its deciphering some 60 years ago, there is still no general theory of the emergence of the genetic code. My objectives are to bring some unique data that might provide some insight into this particular issue. Methods: Because tRNA (transfer RNA) constitutes a crucial piece of the present translational system, having unique structural characteristics, I hypothesized that they might constitute the key elements at the origin of the genetic code and thus decided to compare the primary structure of the tRNAs from a bacterium, Bacillus subtilis. Results: The comparison of the primary structure of the tRNAs from Bacillus subtilis generated a genealogical tree, meaning that the tRNAs were all related and appeared gradually in a precise time sequence. Remarkably, analysis of the various characteristics of this tRNA tree showed that it very likely reflects the time of entry of amino acids into the Universal Codon Table. Conclusions: These results strongly suggest that the tRNA entity was indeed a major component in the formation of the genetic code and, further, provide a likely scenario for the time sequence of codon colonization of the Universal Codon Table by the various amino acids at the very beginning of life. Also, these data are interpreted in terms of a general theory of the origin of the genetic code I propose, the poly-tRNA theory.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that a genealogical tree constructed by comparing the primary sequences of tRNAs from Bacillus subtilis encodes the historical sequence in which amino acids entered the universal codon table. Analysis of tree characteristics is presented as evidence for this chronology, leading to the proposal of a 'poly-tRNA theory' for the origin of the genetic code.
Significance. If the central result holds after rigorous validation, it would provide a novel, sequence-based timeline for genetic code evolution derived from contemporary molecules, potentially offering a falsifiable framework for the order of amino acid incorporation and emphasizing tRNA's role in early translation. This could stimulate new experimental tests in evolutionary biology.
major comments (3)
- [Results (genealogical tree generation)] The Results section on tree construction provides no alignment protocol, substitution model, distance metric, tree-building algorithm, or outgroup rooting for the ~76-nt tRNA sequences. This omission prevents assessment of whether the reported branching order is robust or an artifact of saturation, long-branch attraction, or functional convergence in the anticodon loop.
- [Conclusions] The claim in the Conclusions that tree characteristics 'very likely reflect the time of entry of amino acids into the Universal Codon Table' lacks any independent dating method, error estimates, or external benchmark validation. The mapping relies on interpretive matching to the existing codon table, rendering the interpretation circular without a falsifiable prediction independent of known amino-acid assignments.
- [Discussion / poly-tRNA theory] The poly-tRNA theory is introduced without quantitative comparison to prior models of code evolution or discussion of how the approach accounts for post-duplication sequence changes, horizontal transfers, or base modifications that could overwrite the original chronological signal in modern tRNA primary structures.
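To illustrate why the choice of substitution model and the risk of saturation matter for sequences of about 76 nt, here is a minimal sketch of one standard corrected distance, the Kimura two-parameter (K2P) distance. It is an illustrative choice, not a method the manuscript reports using, and the function name `k2p_distance` is hypothetical. The inputs are assumed to be two already-aligned sequences of equal length.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned RNA/DNA sequences.

    Columns containing gaps or ambiguity codes are skipped.
    Returns math.inf when the correction is undefined, which happens
    when the sequences are too divergent (saturated) for this model.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    # Treat DNA 'T' and RNA 'U' as the same base.
    seq_a = seq_a.upper().replace("T", "U")
    seq_b = seq_b.upper().replace("T", "U")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a, seq_b):
        if a not in BASES or b not in BASES:
            continue  # gap or ambiguity code: ignore this column
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine

    if compared == 0:
        raise ValueError("no comparable columns")

    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return math.inf           # saturated: log of a non-positive number
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

With so few sites per sequence, a handful of differences moves the estimate a great deal, and heavily diverged pairs return infinity. Both effects bear on whether a reported branching order can be trusted.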
minor comments (2)
- [Results] Include the full tRNA alignment, the resulting tree topology (with branch lengths and support values), and an explicit table mapping tree nodes to amino acids and entry times to allow independent verification.
- [Abstract and Results] Clarify the precise 'various characteristics' of the tree used for the chronological mapping, as this step is load-bearing for the central claim.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which have helped us identify areas for improvement in clarity and rigor. We address each major comment below and indicate the revisions we will make to the manuscript.
Point-by-point responses
- Referee: [Results (genealogical tree generation)] The Results section on tree construction provides no alignment protocol, substitution model, distance metric, tree-building algorithm, or outgroup rooting for the ~76-nt tRNA sequences. This omission prevents assessment of whether the reported branching order is robust or an artifact of saturation, long-branch attraction, or functional convergence in the anticodon loop.
  Authors: We agree that the original manuscript omitted essential methodological details for the tRNA sequence comparison and tree construction. In the revised version, we will add a dedicated Methods section specifying the alignment protocol (Clustal Omega with RNA-specific parameters), substitution model (Kimura two-parameter), distance metric, tree-building algorithm (neighbor-joining), and rooting approach. This will allow readers to assess robustness against the mentioned artifacts.
  Revision: yes
- Referee: [Conclusions] The claim in the Conclusions that tree characteristics 'very likely reflect the time of entry of amino acids into the Universal Codon Table' lacks any independent dating method, error estimates, or external benchmark validation. The mapping relies on interpretive matching to the existing codon table, rendering the interpretation circular without a falsifiable prediction independent of known amino-acid assignments.
  Authors: The referee is correct that no independent dating or error estimates are provided, as the analysis relies on the post-hoc correlation between the sequence-derived tree and known codon assignments. We will revise the Conclusions to explicitly state that the tree is built solely from primary sequence data without using amino-acid identities as input, and we will add a section proposing falsifiable predictions (e.g., consistency checks with tRNA sets from additional species). Independent chronological calibration remains difficult for such ancient events, but the sequence-first approach reduces circularity.
  Revision: partial
- Referee: [Discussion / poly-tRNA theory] The poly-tRNA theory is introduced without quantitative comparison to prior models of code evolution or discussion of how the approach accounts for post-duplication sequence changes, horizontal transfers, or base modifications that could overwrite the original chronological signal in modern tRNA primary structures.
  Authors: We accept that the poly-tRNA theory section requires expansion for context and limitations. The revised Discussion will include qualitative comparisons to prior models (coevolution, stereochemical, and frozen-accident hypotheses) and a new subsection addressing post-duplication divergence, horizontal transfers, and base modifications, explaining why conserved tRNA structural motifs may still preserve chronological information. We will also note these as areas for future quantitative modeling.
  Revision: yes
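The first response names neighbor-joining as the tree-building algorithm. For readers unfamiliar with it, the following is a minimal sketch of the standard neighbor-joining procedure on a precomputed distance matrix. The function name `neighbor_joining`, the `nodeN` labels for internal nodes, and the edge-list output are illustrative assumptions; this is not the authors' pipeline, and it omits rooting and support values.

```python
def neighbor_joining(names, dist):
    """Build an unrooted tree by neighbor-joining.

    names: list of leaf labels (at least two).
    dist:  nested dict with dist[a][b] == dist[b][a] for all a != b.
    Returns a list of (node, neighbor, branch_length) edges.
    Internal nodes are labeled 'node0', 'node1', ... (hypothetical labels).
    """
    if len(names) < 2:
        raise ValueError("need at least two taxa")

    d = {a: dict(dist[a]) for a in names}  # working copy, extended below
    nodes = list(names)
    edges = []
    next_id = 0

    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each active node from all others.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}

        # Choose the pair minimizing Q(a, b) = (n - 2) d(a, b) - r(a) - r(b).
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best

        # Create the new internal node and its two branch lengths.
        u = f"node{next_id}"
        next_id += 1
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges.append((a, u, len_a))
        edges.append((b, u, len_b))

        # Distances from the new node to every remaining node.
        d[u] = {}
        for k in nodes:
            if k == a or k == b:
                continue
            d[u][k] = d[k][u] = 0.5 * (d[a][k] + d[b][k] - d[a][b])

        nodes = [k for k in nodes if k != a and k != b] + [u]

    # Two nodes remain: connect them with their remaining distance.
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges
```

Ties in the Q criterion are broken here by input order, so different orderings of the same taxa can yield different topologies when the signal is weak. That sensitivity is one reason the referee's request for support values matters.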
Circularity Check
No significant circularity; phylogenetic tree interpreted as historical signal without definitional reduction
full rationale
The paper constructs a genealogical tree directly from primary sequence comparisons of Bacillus subtilis tRNAs and interprets its branching order as reflecting the chronological entry of amino acids into the codon table. This is an empirical hypothesis resting on the assumption that modern sequences retain detectable historical signal, not a derivation that reduces the output to the input by construction. No equations, fitted parameters, or self-citations are invoked to force the claimed order; the result is presented as a scenario derived from the tree rather than tautologically equivalent to the sequence data or the known codon table. The analysis can therefore be tested against external benchmarks, such as independent phylogenetic validation or sequence saturation tests.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Present-day tRNA primary structures have not been substantially altered by sequence evolution or horizontal transfer since the origin of the genetic code.
- ad hoc to paper: Characteristics of the tRNA tree can be mapped onto the order of amino acid entry without circular reference to the current codon table.
invented entities (1)
- poly-tRNA theory: no independent evidence