
Data released on December 20, 2017

Supporting data for "Hybrid-denovo: A de novo OTU-picking pipeline integrating single-end and paired-end 16S sequence tags"

Chen, X; Johnson, S; Jeraldo, P; Wang, J; Chia, N; Kocher, J, A; Chen, J (2017): Supporting data for "Hybrid-denovo: A de novo OTU-picking pipeline integrating single-end and paired-end 16S sequence tags". GigaScience Database.

Illumina paired-end sequencing has become increasingly popular for 16S rRNA gene-based microbiota profiling. It provides higher phylogenetic resolution than single-end reads due to a longer read length. However, the reverse read (R2) often has significantly lower base quality, and a large proportion of R2s will be discarded after quality control, resulting in a mixture of paired-end and single-end reads. A typical 16S analysis pipeline usually processes either paired-end or single-end reads but not a mixture. Thus, the quantification accuracy and statistical power will be reduced due to the loss of a large number of reads. As a result, rare taxa may not be detectable with the paired-end approach, or low taxonomic resolution will result with the single-end approach.
To obtain both the higher phylogenetic resolution provided by paired-end reads and the higher sequence coverage provided by single-end reads, we propose a novel de novo OTU-picking pipeline, hybrid-denovo, that can process a hybrid of single-end and paired-end reads. Using high quality paired-end reads as a “gold standard”, we show that hybrid-denovo achieved the highest correlation with the “gold standard” and performed better than the approaches based on paired-end or single-end reads in terms of quantifying the microbial diversity and taxonomic abundances. By applying our method to a rheumatoid arthritis (RA) data set, we demonstrated that hybrid-denovo captured more microbial diversity and identified more RA-associated taxa than the paired-end or single-end approach. Hybrid-denovo is more powerful than de novo OTU picking approaches based on paired-end or single-end 16S sequence tags, and is recommended for 16S rRNA gene targeted paired-end sequencing data.
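The mixture of paired-end and single-end reads described above can be illustrated with a minimal sketch. It is not the hybrid-denovo implementation: the function names, the mean-quality criterion, and the threshold of 20 are assumptions chosen only to show how quality control on R2 splits a run into two pools. Quality strings are assumed to use the common Phred+33 encoding.

```python
from statistics import mean


def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def partition_reads(pairs, min_mean_quality=20):
    """Split read pairs into a paired-end pool and a single-end (R1-only) pool.

    `pairs` is an iterable of (read_id, r1_quality, r2_quality) tuples.
    The mean-quality rule and the threshold are hypothetical stand-ins for
    whatever quality control a real pipeline applies.
    """
    paired, single = [], []
    for read_id, r1_qual, r2_qual in pairs:
        if mean(phred_scores(r1_qual)) < min_mean_quality:
            continue  # R1 fails quality control, so the whole pair is dropped
        if mean(phred_scores(r2_qual)) >= min_mean_quality:
            paired.append(read_id)  # both mates pass: usable as a paired-end read
        else:
            single.append(read_id)  # only R1 passes: usable as a single-end read
    return paired, single


# Toy input: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
example = [
    ("read1", "IIII", "IIII"),
    ("read2", "IIII", "####"),
    ("read3", "####", "IIII"),
]
paired, single = partition_reads(example)
print("paired-end:", paired)
print("single-end:", single)
```

A paired-end-only pipeline would keep just the first pool and a single-end-only pipeline would treat every surviving R1 alike; a hybrid approach aims to use both pools.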


Read the peer-reviewed publication(s):

Chen, X., Johnson, S., Jeraldo, P., Wang, J., Chia, N., Kocher, J.-P. A., & Chen, J. (2017). Hybrid-denovo: a de novo OTU-picking pipeline integrating single-end and paired-end 16S sequence tags. GigaScience, 7(3). doi:10.1093/gigascience/gix129

Additional information:

Accessions (data not in GigaDB):

BioProject: PRJNA317370
BioProject: PRJEB13940


Keywords: microbiome, OTU picking, 16S rRNA




  • Funding body - Mayo Clinic
  • Comment - Center for Individualized Medicine
  • Awardee - Xianfeng Chen

Files: (FTP site)

Data Type   File Format   Size      Release Date
Readme      TEXT          2.6 KB    2017-11-28
Software    TAR           1.01 GB   2017-11-28
Software    TAR           5.59 GB   2017-11-28


