Genomic data of the diploid cotton (Gossypium raimondii).

Dataset type: Genomic
Data released on March 07, 2014

Wang K; Wang Z; Li F; Ye W; Wang J; Song G; Yue Z; Cong L; Shang H; Zhu S; Zou C; Li Q; Yuan Y; Lu C; Wei H; Gou C; Zheng Z; Yin Y; Zhang X; Liu K; Wang B; Song C; Shi N; Kohel RJ; Percy RG; Yu JZ; Zhu Y; Wang J; Yu S (2014): Genomic data of the diploid cotton (Gossypium raimondii). GigaScience Database.


Cotton is one of the most economically important crop plants worldwide. Its fiber, commonly known as cotton lint, is the principal natural source for the textile industry.
We have sequenced and assembled a draft genome of G. raimondii, whose progenitor is the putative contributor of the D subgenome to the economically important fiber-producing cotton species Gossypium hirsutum and Gossypium barbadense.
We sequenced the 0.78 Gb genome to a depth of approximately 103× using short reads from a series of libraries with various insert sizes (170 bp, 250 bp, 500 bp, 800 bp, 2 kb, 5 kb, 10 kb, 20 kb and 40 kb) on a HiSeq 2000 sequencer.
The high-quality sequence data total 78.7 Gb, and the assembly has contig and scaffold N50 values of 44.9 kb and 2.3 Mb, respectively. We identified 40,976 protein-coding genes with a mean length of 1,104 bp.
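
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the combined length. The function name `n50` and the toy lengths are illustrative assumptions, not values or code from this dataset.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Made-up lengths for illustration only (not taken from this dataset).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))  # prints 8: 10 + 9 + 8 = 27, half of the total 54
```

Applied to the contig lengths or the scaffold lengths of an assembly, the same calculation yields the contig N50 or scaffold N50, respectively.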

Additional details

Read the peer-reviewed publication(s):

Wang, K., Wang, Z., Li, F., Ye, W., Wang, J., Song, G., … Yu, S. (2012). The draft genome of a diploid cotton Gossypium raimondii. Nature Genetics, 44(10), 1098–1103. doi:10.1038/ng.2371 (PubMed: 22922876)

Accessions (data generated as part of this study):

BioProject: PRJNA82769

| Sample ID | Taxonomic ID | Common Name | Genbank Name | Scientific Name | Sample Attributes |
| Diploid Cotton CMD10 | 29730 | | Gossypium raimondii | Gossypium raimondii | Cultivar:CMD10; Geographic location (country and/or sea,region):Ch...; Geographic location (latitude and longitude):not r... |

| File Name | Sample ID | Data Type | File Format | Size | Release Date |
| | Diploid Cotton CMD10 | Coding sequence | FASTA | 46.35 MB | 2014-03-07 |
| | Diploid Cotton CMD10 | Sequence assembly | FASTA | 739.33 MB | 2014-03-07 |
| | Diploid Cotton CMD10 | Annotation | GFF | 17.94 MB | 2014-03-07 |
| | Diploid Cotton CMD10 | Protein sequence | FASTA | 18.06 MB | 2014-03-07 |
| | | Readme | UNKNOWN | 0.97 KB | 2014-03-07 |
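
The sequence assembly, coding sequence and protein sequence files listed above are in FASTA format. As a minimal sketch of how such a file might be inspected, the snippet below reads FASTA text and reports the length of each record using only the Python standard library. The function name `fasta_lengths`, the record names and the in-memory demo text are hypothetical; the actual file names are not given on this page, so no path is assumed.

```python
import io


def fasta_lengths(lines):
    """Yield (record_id, sequence_length) for each record in FASTA text.

    `lines` can be any iterable of text lines, such as an open file handle.
    """
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, length
            parts = line[1:].split()
            name = parts[0] if parts else ""
            length = 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


# Hypothetical two-record example; a real file would be passed as an
# open text handle instead of this in-memory string.
demo = io.StringIO(">scaffold_a demo\nACGT\nAC\n>scaffold_b\nGGG\n")
lengths = dict(fasta_lengths(demo))
print(lengths)  # {'scaffold_a': 6, 'scaffold_b': 3}
```

Summing the reported lengths gives the total number of bases in the file, and the individual lengths can be passed to any assembly statistic of interest.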