
arxiv: 1604.02598 · v1 · pith:TKRLUOH3new · submitted 2016-04-09 · 📊 stat.ME

Species richness estimation with high diversity but spurious singletons

classification 📊 stat.ME
keywords richness, breakaway, estimate, taxa, counts, estimation, frequency, nof1

The presence of uncommon taxa in high-throughput sequenced ecological samples poses challenges to the microbial ecologist, bioinformatician and statistician. It is rarely certain whether these taxa are truly present in the sample or the result of sequencing errors. Unfortunately, alpha-diversity quantification relies on accurate frequency counts, which can rarely be guaranteed. We present a species richness estimation tool that predicts both the number of unobserved taxa and the number of true singletons based on the non-singleton frequency counts. This method can be treated as either inferential (for formally estimating richness) or exploratory (for assessing the robustness of the richness estimate to the singleton count). If the estimate, called breakaway_nof1, is comparable to other richness estimators, this provides evidence that the richness estimate is robust to the level of quality control (e.g., chimera-checking) employed in pre-processing. The function breakaway_nof1 is freely available from CRAN via the R package breakaway.
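
To make the phrase "non-singleton frequency counts" concrete, the following is a minimal Python sketch of how such a table might be assembled from per-taxon abundances. It is not the breakaway package's API and does not implement the breakaway_nof1 estimator; the function names `frequency_counts` and `drop_singletons` and the abundance values are hypothetical, chosen only for illustration.

```python
from collections import Counter


def frequency_counts(abundances):
    """Return {j: f_j}, where f_j is the number of taxa observed exactly j times.

    Hypothetical helper for illustration; taxa with zero reads are unobserved
    and therefore excluded from the table.
    """
    counts = Counter(a for a in abundances if a > 0)
    return dict(sorted(counts.items()))


def drop_singletons(freq_table):
    """Keep only frequencies j >= 2, i.e. the non-singleton frequency counts."""
    return {j: f for j, f in freq_table.items() if j >= 2}


# Made-up read counts per taxon (illustrative data, not from the paper).
abundances = [1, 1, 1, 2, 2, 3, 5, 5, 8, 0]

table = frequency_counts(abundances)
non_singleton = drop_singletons(table)

print(table)          # full frequency count table, including singletons
print(non_singleton)  # the part of the table that does not depend on f_1
```

In this sketch the singleton count (the entry for frequency 1) is simply discarded, so anything estimated from `non_singleton` cannot be affected by spurious singletons. That is the property the abstract relies on when it describes predicting the number of true singletons from the remaining counts.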
