Data released on December 03, 2015
Comprehensive characterization of genomic variation in a human individual is important for understanding disease and for developing personalized approaches to treatment. Many tools exist for identifying single nucleotide polymorphisms (SNPs), small indels and large deletions based on a DNA re-sequencing strategy. However, those approaches consistently display significant bias in the recovery of complex structural variants and novel sequence in individual genomes, and they lack sequence interpretation such as ancestral state and formation mechanism. Here we present a novel approach, implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variants and novel sequence in population-scale de novo assemblies at single-nucleotide resolution. Our approach scales well, making it applicable to large population studies of species with complex genomes, such as Homo sapiens. Applying AsmVar to several human de novo assemblies captures a wide spectrum of the structural variants and novel sequences present in the human population, with high sensitivity and specificity. Our method provides a direct way to investigate structural variation and novel sequence from de novo assemblies, which is important for constructing population-scale pan-genomes. Our study also suggests the advantages of the de novo assembly strategy for defining genome structure.
This software has been released under the MIT License. Copyright 2014-2015.