Supporting data for the near-complete assembly of the hexaploid bread wheat genome, Triticum aestivum

Dataset type: Genomic
Data released on September 29, 2017

Zimin AV; Puiu D; Hall R; Kingan S; Clavijo B; Salzberg SL (2017): Supporting data for the near-complete assembly of the hexaploid bread wheat genome, Triticum aestivum. GigaScience Database. http://dx.doi.org/10.5524/100356

DOI: 10.5524/100356

Common bread wheat, Triticum aestivum, has one of the most complex genomes known to science, with 6 copies of each chromosome, enormous numbers of near-identical sequences scattered throughout, and an overall haploid size of more than 15 billion bases. Multiple past attempts to assemble the genome have produced assemblies that were well short of the estimated genome size. Here we report the first near-complete assembly of T. aestivum, using deep sequencing coverage from a combination of short Illumina reads and very long Pacific Biosciences reads. The final assembly contains 15,344,693,583 bases and has a weighted average (N50) contig size of 232,659 bases. This represents by far the most complete and contiguous assembly of the wheat genome to date, providing a strong foundation for future genetic studies of this important food crop. We also report how we used the recently published genome of Aegilops tauschii, the diploid ancestor of the wheat D genome, to identify 4,179,762,575 bp of T. aestivum that correspond to its D genome components.
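The N50 contig size quoted above is the contig length at which contigs of that length or longer account for at least half of the assembled bases. A minimal sketch of that calculation follows, assuming only a plain list of contig lengths; the function name `n50` and the toy lengths are illustrative and are not taken from the dataset or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest, and the function returns
    the length of the contig at which the running total first reaches
    half of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Toy lengths (hypothetical, not from the wheat assembly): total is 20,
# and the two longest contigs (6 + 5 = 11) already cover half of it.
print(n50([2, 3, 4, 5, 6]))
```

For a real assembly, the lengths would be read from the FASTA records rather than typed in by hand.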

Additional details

Read the peer-reviewed publication(s):

Zimin, A. V., Puiu, D., Hall, R., Kingan, S., Clavijo, B. J., & Salzberg, S. L. (2017). The first near-complete assembly of the hexaploid bread wheat genome, Triticum aestivum. GigaScience, 6(11), 1–7. doi:10.1093/gigascience/gix097

Accessions (data included in GigaDB):

BioProject: PRJNA392179
GenBank: NMPL00000000





Sample ID: Triticum aestivum WGS
Taxonomic ID: 4565
Common Name: Canadian hard winter wheat
Genbank Name: bread wheat
Scientific Name: Triticum aestivum
Sample Attributes:
  Description: Genomic DNA extracted from leaves of t...
  Analyte type: DNA
  Alternative accession-BioSample: SAMN07284949
  ...




Data Type          File Format  Size       Release Date
Sequence assembly  FASTA        3.31 GB    2017-09-26
Mixed archive      TAR          1.15 MB    2017-09-26
Readme             TEXT         4.47 KB    2017-09-26
Other              TAR          125.55 MB  2017-09-26
Other              TAR          142.91 MB  2017-09-26
Genome sequence    FASTA        3.89 GB    2017-09-26
Sequence assembly  FASTA        3.92 GB    2017-09-26
Sequence assembly  FASTA        3.93 GB    2017-09-26
Sequence assembly  FASTA        4.31 GB    2017-09-26
Sequence assembly  FASTA        3.71 GB    2017-09-26
Displaying 1-10 of 13 File(s).
Funding body                              Awardee      Award ID     Comments
National Human Genome Research Institute  SL Salzberg  R01HG006677
National Science Foundation               AV Zimin     IOS-1444893  Directorate for Biological Sciences
National Science Foundation               SL Salzberg  IOS-1238231  Directorate for Biological Sciences
National Science Foundation               J Dvorak     IOS-1238231  Directorate for Biological Sciences
Date                Action
September 28, 2017  Dataset publish
November 13, 2017   Manuscript Link added: 10.1093/gigascience/gix097