arxiv: q-bio/0412037 · v1 · submitted 2004-12-18 · 🧬 q-bio.GN

Divergence and Shannon information in genomes

classification 🧬 q-bio.GN
keywords sequence · divergence · genomes · length · composition · genome · genomic · greater

Shannon information (SI) and its special case, divergence, are defined for a DNA sequence in terms of the probabilities of chemical words in the sequence, and are computed for a set of complete genomes that are highly diverse in length and composition. We find the following: SI (but not divergence) is inversely proportional to sequence length for a random sequence but is length-independent for genomes; the genomic SI is always greater than the SI of a random sequence whose length and composition match those of the genome and, for shorter words and longer sequences, is hundreds to thousands of times greater; genomic SIs appear to have word-length-dependent universal values. The universality is inferred to be an evolutionary footprint of a universal mode of genome growth.
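
The abstract does not give the formulas for SI or divergence, so the following is only an illustrative sketch of what "probabilities of chemical words" can mean in practice. It counts overlapping k-letter words in a sequence and, as an assumed stand-in for the paper's measure, computes the Kullback-Leibler divergence of the observed word distribution from a uniform one. The names `word_probabilities`, `kl_from_uniform`, and `random_sequence` are hypothetical and do not come from the paper, and the uniform baseline ignores the composition matching that the abstract mentions.

```python
from collections import Counter
import math
import random


def word_probabilities(seq, k):
    """Relative frequencies of the overlapping k-letter words in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def kl_from_uniform(seq, k, alphabet_size=4):
    """KL divergence, in bits, of the observed k-word distribution
    from the uniform distribution over alphabet_size**k words.
    Assumed stand-in, not the paper's own definition of SI."""
    q = 1.0 / alphabet_size ** k
    # Words that never occur contribute zero, so only observed words are summed.
    return sum(p * math.log2(p / q)
               for p in word_probabilities(seq, k).values())


def random_sequence(length, rng, alphabet="ACGT"):
    """Sequence of independent, equally likely letters (hypothetical helper)."""
    return "".join(rng.choice(alphabet) for _ in range(length))


rng = random.Random(0)  # fixed seed so the run is repeatable
short_seq = random_sequence(1_000, rng)
long_seq = random_sequence(100_000, rng)

# For random sequences this stand-in measure shrinks as the sequence grows.
print(kl_from_uniform(short_seq, 2), kl_from_uniform(long_seq, 2))
```

For random sequences this stand-in shrinks as the sequence gets longer, which is at least qualitatively in line with the length dependence the abstract reports for SI. Whether it coincides with the paper's SI or with its divergence is an assumption that would need to be checked against the full text.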
