Supporting data for "Genome sequencing of the sweetpotato whitefly Bemisia tabaci MED/Q"

Dataset type: Genomic
Data released on April 19, 2017

Xie W; Chen C; Yang Z; Guo L; Yang X; Wang D; Chen M; Huang J; Wen Y; Zeng Y; Liu Y; Xia J; Tian L; Cui H; Wu Q; Wang S; Xu B; Li X; Tan X; Ghanim M; Qiu B; Pan H; Chu D; Delatte H; Maruthi MN; Ge F; Zhou X; Wang X; Wan F; Du Y; Luo C; Yan F; Preisser EL; Jiao X; Coates BS; Zhao J; Gao Q; Xia J; Yin Y; Liu Y; Brown JK; Zhou XJ; Zhang Y (2017): Supporting data for "Genome sequencing of the sweetpotato whitefly Bemisia tabaci MED/Q". GigaScience Database.


The invasive whitefly Bemisia tabaci is a highly destructive pest of agricultural and ornamental crops. As a group, B. tabaci damages host plants through phloem feeding and by vectoring plant pathogens. Introductions of B. tabaci are difficult to quarantine and eradicate because of its high reproductive rate, broad host plant range, and resistance to chemical insecticides. A 658 Mb draft genome for the Q-type B. tabaci (MED/Q) was assembled and annotated with 20,786 protein-coding genes. Metabolic pathways show an expansion in the number of gene family members, in particular the cytochrome P450 monooxygenases. Additionally, amino acid biosynthesis pathways are partitioned among host and endosymbiont genomes in a manner that is distinct from other hemipteran systems, and horizontal gene transfer to the host genome likely forms the basis of these obligate relationships. Putative loss of function of the immune deficiency (IMD) signaling pathway due to gene loss is a shared ancestral trait of hemipteran insects that show competency for hosting endosymbiotic bacteria. The expansion of P450 gene family members may influence the well-noted capacity of MED/Q to adapt to repeated exposure to chemical insecticides and, furthermore, may be related to its invasiveness in monoculture cropping systems where such applications are prevalent.
This sequencing project was a collaborative effort between BGI-Shenzhen and a consortium of international whitefly researchers. Members corresponded extensively by e-mail and phone. Many aspects of the genome sequencing project were discussed, including which B. tabaci cryptic species to propose. The researchers collectively decided that, given the many resources already developed, its global invasion status, and the large number of scientists studying it, the Q-type B. tabaci (MED/Q) would be the best choice. We expect that fairly extensive studies can be undertaken on other cryptic species of B. tabaci through the use of heterologous sequences once the B. tabaci Q genome sequence is available.


Additional details

Read the peer-reviewed publication(s):

Xie, W., Chen, C., Yang, Z., Guo, L., Yang, X., Wang, D., … Zhang, Y. (2017). Genome sequencing of the sweetpotato whitefly Bemisia tabaci MED/Q. GigaScience, 6(5). doi:10.1093/gigascience/gix018

Accessions (data generated as part of this study):

BioProject: PRJNA276952
BioProject: PRJNA299727
BioProject: PRJNA299729
SRA: SRA307591
SRA: SRA307569
SRA: SRA304343

Accessions (data referenced by this study):

BioProject: PRJNA299728
SRA: SRA307586

Samples:

Sample ID | Taxonomic ID | Common Name / Genbank Name / Scientific Name | Sample Attributes
PRJNA276952 | 7038 | sweet potato whitefly; Bemisia tabaci | Locus tag: VD02; Organism: Bemisia tabaci
PRJNA299727 | 568987 | Candidatus Hamiltonella; Candidatus Hamiltonella | Alternative names: Candidatus Hamiltonella defensa; Host common name: Bemisia tabaci Q; Collection date: Mon Oct 01 00:00:00 HKT 2012
PRJNA299728 | 672794 | Candidatus Cardinium; Candidatus Cardinium | Alternative names: Candidatus Cardinium endosymbion...; Host common name: Bemisia tabaci Q; Collection date: Mon Oct 01 00:00:00 HKT 2012
PRJNA299729 | 1232201 | Candidatus Portiera; Candidatus Portiera | Alternative names: Candidatus Portiera aleyrodidaru...; Collection date: Mon Oct 01 00:00:00 HKT 2012; Geographic location (country and/or sea,region): Ch...

Files:

Data Type | File Format | Size | Release Date
Genome sequence | FASTA | 640.41 MB | 2017-03-02
annotation | GFF | 9.23 MB | 2017-03-02
annotation | GFF | 480.58 KB | 2017-03-02
Coding sequence | FASTA | 31.35 MB | 2017-03-02
protein sequence | FASTA | 11.63 MB | 2017-03-02
tabular data | EXCEL | 1.25 KB | 2017-03-02
transcriptome sequence | FASTA | 66.73 MB | 2017-03-02
Genome sequence | FASTA | 1.72 MB | 2017-03-02
Genome sequence | FASTA | 372.91 KB | 2017-03-02
Readme | TEXT | 0.99 KB | 2017-03-02
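
A downloaded copy of the genome FASTA and annotation GFF files can be sanity-checked against the figures quoted in the description (a draft assembly of about 658 Mb and 20,786 protein-coding genes). The following is a minimal sketch in Python using only the standard library. The function names, the in-memory demo records, and the assumption that gene models are labelled "gene" in column 3 of the GFF are illustrative and are not taken from the dataset itself.

```python
import io
from collections import Counter


def fasta_stats(handle):
    """Return (number of sequences, total residues) for a FASTA stream."""
    n_seqs = 0
    total = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            n_seqs += 1          # each header line starts a new sequence
        else:
            total += len(line)   # sequence lines contribute their length
    return n_seqs, total


def gff_feature_counts(handle):
    """Count GFF records by feature type (column 3), skipping comments."""
    counts = Counter()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


# Tiny in-memory stand-ins so the sketch runs without the real files.
demo_fasta = io.StringIO(">scaffold1\nACGTAC\nGT\n>scaffold2\nNNNNAC\n")
demo_gff = io.StringIO(
    "##gff-version 3\n"
    "scaffold1\tdemo\tgene\t1\t8\t.\t+\t.\tID=g1\n"
    "scaffold1\tdemo\tmRNA\t1\t8\t.\t+\t.\tID=m1;Parent=g1\n"
    "scaffold2\tdemo\tgene\t1\t6\t.\t-\t.\tID=g2\n"
)

n_seqs, total = fasta_stats(demo_fasta)
features = gff_feature_counts(demo_gff)
print(n_seqs, total, dict(features))
```

To run this on the real data, replace the demo streams with open file handles, for example `open("genome.fa")` and `open("annotation.gff")`, where both paths are hypothetical placeholders for the downloaded files. The total residue count can then be compared with the reported assembly size, and the count for the gene feature type with the reported gene number.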
History:

Date | Action
April 19, 2017 | Dataset published
June 27, 2017 | Manuscript link added: 10.1093/gigascience/gix018