Data released on July 31, 2016
Domesticated apple (Malus domestica Borkh.) is a popular temperate fruit with high levels of nutrients and a diversity of flavors. In 2012, global apple production accounted for at least one-tenth of all harvested fruits. A high-quality apple genome assembly is crucial for the selection and breeding of new cultivars. Currently, only a single reference genome is available for apple, assembled from 16.9× genome coverage of short reads generated with Sanger and 454 sequencing technologies. Although it is a useful resource, this assembly covers only ~89% of the non-repetitive portion of the genome and has a relatively short contig N50 length (16.7 kb). These shortcomings make it difficult to apply this reference in transcriptomic or whole-genome re-sequencing analyses.
Here we present an improved hybrid de novo genome assembly of apple (Golden Delicious), obtained from 76 Gb (~102× genome coverage) of Illumina HiSeq data and 21.7 Gb (~29× genome coverage) of PacBio data. The final draft genome is approximately 632.4 Mb, representing ~90% of the estimated genome size. The contig N50 size is 111,619 bp, an approximately 7-fold improvement over the existing reference (16.7 kb). Further annotation analyses predicted 53,922 protein-coding genes and 2,765 non-coding RNA genes.
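For readers unfamiliar with the metric, the contig N50 quoted above can be illustrated with a minimal sketch. It assumes the common definition (the length of the contig at which the running total of lengths, sorted from longest to shortest, first reaches half of the total assembly length). The function name `contig_n50` and the toy lengths are hypothetical and are not taken from the assembly pipeline used for this dataset. Only the two N50 values in the fold-change calculation come from the text.

```python
def contig_n50(lengths):
    """Return the N50 of a list of contig lengths.

    Assumed definition: sort contigs from longest to shortest and return the
    length of the contig at which the cumulative sum first reaches at least
    half of the total assembled length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths (made up for illustration, not the apple assembly).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = contig_n50(toy_lengths)

# Fold improvement implied by the two contig N50 values stated in the text.
previous_n50_bp = 16_700   # 16.7 kb, previous reference
new_n50_bp = 111_619       # this assembly
fold_improvement = new_n50_bp / previous_n50_bp

print(toy_n50)
print(round(fold_improvement, 1))
```

With the stated values, the ratio comes out slightly below 7, consistent with the rounded 7-fold figure.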
The new apple genome assembly will serve as a valuable resource for investigating complex apple traits at the genomic level. It is not only suitable for genome editing and gene cloning, but also for RNA-seq and whole-genome resequencing studies.