
arXiv: 1511.04944 · v2 · submitted 2015-11-16 · q-bio.GN · cs.IT · math.IT


NASCUP: Nucleic Acid Sequence Classification by Universal Probability

keywords: classification, nascup, nucleotide, probability, sequence, sequences, universal, accuracy

Motivated by the need for fast and accurate large-scale classification of unlabeled nucleotide sequences, we developed NASCUP, a new classification method that captures the statistical structure of nucleotide sequences using compact context-tree models and universal probability from information theory. NASCUP consistently achieved BLAST-like classification accuracy on several large-scale databases with runtime reduced by orders of magnitude, and was also applied to other bioinformatics tasks such as outlier detection and synthetic sequence generation.
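The general idea of classifying a sequence by the probability a per-class context model assigns to it can be illustrated with a small sketch. This is not the NASCUP algorithm: the abstract does not give its details, so the code below swaps in a simplified fixed-order context model scored with the Krichevsky-Trofimov (add-1/2) estimator, a standard universal probability assignment. The function names `train_counts`, `log_prob`, and `classify`, and the `depth` parameter, are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: plain DNA alphabet; other symbols are skipped


def train_counts(sequences, depth=3):
    """Count next-symbol occurrences after every length-`depth` context."""
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for seq in sequences:
        for i in range(depth, len(seq)):
            ctx, sym = seq[i - depth:i], seq[i]
            if sym in ALPHABET:
                counts[ctx][ALPHABET.index(sym)] += 1
    return counts


def log_prob(seq, counts, depth=3):
    """Log2 probability of `seq` under one class model.

    Each symbol is scored with the KT estimator for a 4-letter alphabet,
    (n_a + 1/2) / (n + 2), using the training counts held fixed.
    """
    total = 0.0
    for i in range(depth, len(seq)):
        ctx, sym = seq[i - depth:i], seq[i]
        if sym not in ALPHABET:
            continue
        c = counts.get(ctx, (0, 0, 0, 0))  # unseen context: uniform 1/4
        total += math.log2((c[ALPHABET.index(sym)] + 0.5) / (sum(c) + 2.0))
    return total


def classify(seq, models, depth=3):
    """Return the label whose model assigns `seq` the highest probability."""
    return max(models, key=lambda label: log_prob(seq, models[label], depth))
```

In this sketch, `models` is a dictionary mapping each class label to the counts returned by `train_counts` for that class's training sequences. A fixed context depth keeps the example short; a context-tree model, as named in the abstract, would instead adapt the context length to the data.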
