Genome data from sweet and grain sorghum (Sorghum bicolor).

Dataset type: Genomic
Data released on November 12, 2011

Zheng LY; Guo XS; He B; Sun LJ; Peng Y; Dong S; Liu TF; Jiang S; Ramachandran S; Liu CM; Jing HC (2011): Genome data from sweet and grain sorghum (Sorghum bicolor). GigaScience.


Sorghum is produced globally as a source of food, feed, fiber, and fuel. Grain and sweet sorghums differ in a number of important traits including stem sugar and juice accumulation, plant height, and the production of grain and biomass. The first sorghum whole-genome sequences are now available for analysis, but additional genomic sequences will be required to study genome-wide and intraspecific variation for dissecting the genetic basis of these important traits and for tailor-designed breeding of this important C4 crop. In a joint effort with scientists from the Institute of Botany of the Chinese Academy of Sciences (Beijing) and Temasek Life Sciences Laboratory (Singapore), BGI resequenced two sweet sorghum inbred lines (E-Tian and Keller) and one grain sorghum inbred line (Ji2731). E-Tian (literally "Russian Sweet" in Chinese) is a sweet sorghum line introduced into China in the early 1970s. Ji2731 is a Chinese kaoliang grain sorghum that is well adapted to Northeast China. Keller is an American-bred elite sweet sorghum line shown to perform well across a wide range of environmental conditions. Using the resequencing data, a set of nearly 1,500 genes differentiating sweet and grain sorghum was identified. These genes fall into 10 major metabolic pathways involved in sugar and starch metabolism, lignin and coumarin biosynthesis, nucleic acid metabolism, stress responses and DNA damage repair. In addition, 1,057,018 SNPs, 99,948 indels of 1-10 bp in length and 16,487 presence/absence variations were uncovered, and 17,111 CNVs were detected. The majority of the SNPs, large-effect SNPs, indels and presence/absence variations resided in genes containing leucine-rich repeats, PPR repeats and disease resistance R genes possessing diverse biological functions or under diversifying selection, but were absent in genes that are essential for life. These are the first publicly available data that allow the identification of genome-wide patterns of genetic variation in sorghum.
The high-density SNP and indel markers presented here will be a valuable resource for future genotype and phenotype studies and for the molecular breeding of this important crop and related species.

Additional details

Read the peer-reviewed publication(s):

(PubMed: 22104744)

Accessions (data generated as part of this study):

dbVar: nstd63
SRA: SRP008750


Sample ID  Taxonomic ID  Common Name      Genbank Name  Scientific Name  Sample Attributes
E-Tian     4558          Sorghum bicolor  sorghum       Sorghum bicolor
Ji2731     4558          Sorghum bicolor  sorghum       Sorghum bicolor
Keller     4558          Sorghum bicolor  sorghum       Sorghum bicolor
Displaying 1-3 of 3 Sample(s).

File Name  Sample ID  Data Type        File Format  Size       Release Date
           E-Tian     Genome sequence  FASTQ        593.63 MB  2011-11-12
           E-Tian     Genome sequence  FASTQ        610.46 MB  2011-11-12
           E-Tian     Genome sequence  FASTQ        656.67 MB  2011-11-12
           E-Tian     Genome sequence  FASTQ        669.98 MB  2011-11-12
           E-Tian     Genome sequence  FASTQ        580.15 MB  2011-11-12
           E-Tian     Genome sequence  FASTQ        594.85 MB  2011-11-12
           E-Tian     Genome sequence  FASTQ        551.19 MB  2011-11-12
           E-Tian     Genome sequence  FASTQ        561.62 MB  2011-11-12
           E-Tian     Genome sequence  FASTQ        654.79 MB  2011-11-12
           E-Tian     Genome sequence  FASTQ        666.55 MB  2011-11-12
Displaying 1-10 of 79 File(s).
Date                Action
September 9, 2015   File readme.txt updated
October 15, 2015    File E-Tian_CNV.txt updated
October 15, 2015    File E-Tian_CNV.gff updated