Supporting data for "CNVcaller: High efficient and Widely Applicable Software for Detecting Copy Number Variations in large Populations"

Dataset type: Software
Data released on December 20, 2017

Wang X; Zheng Z; Cai Y; Chen T; Li C; Fu W; Jiang Y (2017): Supporting data for "CNVcaller: High efficient and Widely Applicable Software for Detecting Copy Number Variations in large Populations" GigaScience Database. http://dx.doi.org/10.5524/100380

DOI: 10.5524/100380

The increasing amount of sequencing data available for a wide variety of species can theoretically be used for detecting copy number variations (CNVs) at the population level. However, growing sample sizes and the divergent complexity of non-human genomes challenge the efficiency and robustness of current human-oriented CNV detection methods.
Here, we present CNVcaller, a read-depth method for discovering CNVs in population sequencing data. CNVcaller ran 1-2 orders of magnitude faster than CNVnator and Genome STRiP on complex genomes with thousands of unmapped scaffolds. CNV detection in 232 goats required only 1.4 days on a single compute node. Additionally, the Mendelian consistency of sheep trios indicated that CNVcaller mitigated the influence of high proportions of gaps and misassembled duplications in the non-human reference genome assembly. Furthermore, multiple evaluations using real sheep and human data indicated that CNVcaller achieved the best accuracy and sensitivity for detecting duplications.
The fast, generalized detection algorithms included in CNVcaller overcome prior computational barriers for detecting CNVs in large-scale sequencing data with complex genomic structures. Therefore, CNVcaller promotes population genetic analyses of functional CNVs in more species.
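As a rough illustration of what a read-depth method does in general, the sketch below scales per-window read counts by the sample median to obtain a copy number estimate. It is not CNVcaller's algorithm: the function name, the median baseline, and the default ploidy of 2 are assumptions made for this example only.

```python
from statistics import median


def estimate_copy_number(window_counts, ploidy=2):
    """Estimate copy number per window from raw read counts.

    Hypothetical helper, not part of CNVcaller. It assumes most windows
    are at the normal copy number, so the median count is the baseline.
    """
    baseline = median(window_counts)
    if baseline <= 0:
        raise ValueError("median read count must be positive")
    # A window with 1.5x the baseline depth maps to 3 copies when ploidy is 2.
    return [ploidy * count / baseline for count in window_counts]


# Five windows: the third looks duplicated, the fourth looks deleted.
counts = [100, 100, 150, 50, 100]
copies = estimate_copy_number(counts)
print(copies)
```

Real tools additionally correct for GC content, mappability, and repeated regions before calling, which this sketch leaves out.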

Additional details

Read the peer-reviewed publication(s):

Wang, X., Zheng, Z., Cai, Y., Chen, T., Li, C., Fu, W., & Jiang, Y. (2017). CNVcaller: highly efficient and widely applicable software for detecting copy number variations in large populations. GigaScience, 6(12). doi:10.1093/gigascience/gix115

Additional information:

https://github.com/JiangYuLab/CNVcaller

http://animal.nwsuaf.edu.cn/software





File Name  Sample ID  Data Type       File Format  Size      Release Date
                      Readme          TEXT         2.34 KB   2017-12-20
                      GitHub archive  archive      36.85 KB  2017-11-17
                      mixed archive   TAR          3.96 GB   2017-11-17
Funding body                                  Awardee  Award ID  Comments
National Natural Science Foundation of China  Y Jiang  31572381
Ministry of Science and Technology (MOST)     Y Jiang            National Thousand Youth Talents Plan
Date               Action
December 20, 2017  Dataset published
December 20, 2017  Description updated
January 9, 2018    Manuscript link added: 10.1093/gigascience/gix115