The haplotype-resolved chromosome pairs and transcriptome data of a heterozygous diploid African cassava cultivar

Dataset type: Genomic, Transcriptomic
Data released on February 11, 2022

Qi W; Lim Y; Patrignani A; Schläpfer P; Bratus-Neuenschwander A; Grüter S; Chanez C; Rodde N; Prat E; Vautrin S; Fustier M; Pratas D; Schlapbach R; Gruissem W (2022): The haplotype-resolved chromosome pairs and transcriptome data of a heterozygous diploid African cassava cultivar. GigaScience Database. http://dx.doi.org/10.5524/102193

DOI: 10.5524/102193

Cassava (Manihot esculenta) is an important clonally propagated food crop in tropical and sub-tropical regions worldwide. Genetic gain by molecular breeding has been limited, partly because cassava is a highly heterozygous crop with a repetitive and difficult-to-assemble genome.
Here we demonstrate that Pacific Biosciences high-fidelity (HiFi) sequencing reads, in combination with the assembler Hifiasm, produced genome assemblies at near-complete haplotype resolution with higher continuity and accuracy compared to conventional long sequencing reads. We present two chromosome-scale haploid genomes phased with Hi-C technology for the diploid African cassava variety TME204. With consensus accuracy above QV46, contig N50 above 18 Mbp, BUSCO completeness of 99%, and 35 K phased gene loci, it is the most accurate, continuous, complete and haplotype-resolved cassava genome assembly so far. Ab initio gene prediction with RNA-seq data and Iso-Seq transcripts identified abundant novel gene loci, with enriched functionality related to chromatin organization, meristem development and cell responses. During tissue development, differentially expressed transcripts of different haplotype origins were enriched for different functionality. In each tissue, 20-30% of transcripts showed allele-specific expression (ASE) differences. ASE bias was often tissue-specific and inconsistent across different tissues. Direction-shifting was observed in less than 2% of the ASE transcripts. Despite high gene synteny, the HiFi genome assembly revealed extensive chromosome re-arrangements and abundant intra-genomic and inter-genomic divergent sequences, with large structural variations mostly related to LTR-retrotransposons. We use the reference-quality assemblies to build a cassava pan-genome and demonstrate its importance in representing the genetic diversity of cassava for downstream reference-guided omics analysis and breeding.
The phased and annotated chromosome pairs allow a systematic view of the heterozygous diploid genome organization in cassava with improved accuracy, completeness and haplotype resolution. They will be a valuable resource for cassava breeding and research. Our study may also provide insights into developing cost-effective and efficient strategies for resolving complex genomes with high resolution, accuracy and continuity.
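The two headline assembly metrics in the abstract (consensus QV and contig N50) can be made concrete with a small sketch. It assumes that QV is the usual Phred-scaled value, so that the per-base error probability is 10^(-QV/10), and it uses the common definition of N50 as the length of the contig at which the running total of length-sorted contigs first reaches half of the assembly size. The function names and the toy contig lengths are hypothetical and are not taken from the dataset.

```python
from typing import Iterable


def phred_qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest; the N50 is the length of
    the contig at which the cumulative sum first reaches half the total.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths


# Hypothetical toy assembly (lengths in arbitrary units, not from the dataset).
toy_contigs = [40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Under the Phred assumption, QV46 is about 2.5e-5 errors per base,
# i.e. roughly one error per 40,000 bases.
qv46_error_rate = phred_qv_to_error_rate(46)
```

Under these assumptions, "above QV46" corresponds to fewer than about one consensus error per 40,000 bases, and "contig N50 above 18 Mbp" means that at least half of the assembled sequence lies in contigs of 18 Mbp or longer.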

Additional details

Read the peer-reviewed publication(s):

(PubMed: 35333302)

Accessions (data generated as part of this study):

BioProject: PRJEB43673
GenBank: MZ959795.1
GenBank: MZ959796
GenBank: MZ959797
GenBank: MZ959798
BioProject: PRJNA758615
BioProject: PRJNA758616

Accessions (data referenced by this study):

BioProject: PRJNA324539





| Sample ID | Taxonomic ID | Common Name | GenBank Name | Scientific Name | Sample Attributes |
| --- | --- | --- | --- | --- | --- |
| Fibrous Root rep1 | 3983 | manioc | cassava | Manihot esculenta | Description: RNA extracted from Fibrous Root(root h...; Analyte type: RNA; Replicate: 1; ... |
| Fibrous Root rep2 | 3983 | manioc | cassava | Manihot esculenta | Description: RNA extracted from Fibrous Root(root h...; Analyte type: RNA; Replicate: 2; ... |
| Fibrous Root rep3 | 3983 | manioc | cassava | Manihot esculenta | Description: RNA extracted from Fibrous Root(root h...; Analyte type: RNA; Replicate: 3; ... |
| Lateral Bud rep1 | 3983 | manioc | cassava | Manihot esculenta | Description: RNA extracted from Lateral Bud tissue ...; Analyte type: RNA; Replicate: 1; ... |
| Lateral Bud rep2 | 3983 | manioc | cassava | Manihot esculenta | Description: RNA extracted from Lateral Bud tissue ...; Analyte type: RNA; Replicate: 2; ... |
| Lateral Bud rep3 | 3983 | manioc | cassava | Manihot esculenta | Description: RNA extracted from Lateral Bud tissue ...; Analyte type: RNA; Replicate: 3; ... |
| Leaf rep1 | 3983 | manioc | cassava | Manihot esculenta | Description: RNA extracted from leaves of Manihot e...; Analyte type: RNA; Replicate: 1; ... |
| Leaf rep2 | 3983 | manioc | cassava | Manihot esculenta | Description: RNA extracted from leaves of Manihot e...; Analyte type: RNA; Replicate: 2; ... |
| Leaf rep3 | 3983 | manioc | cassava | Manihot esculenta | Description: RNA extracted from leaves of Manihot e...; Analyte type: RNA; Replicate: 3; ... |
| Midvein rep1 | 3983 | manioc | cassava | Manihot esculenta | Description: RNA extracted from Midvein tissue of M...; Analyte type: RNA; Replicate: 1; ... |
Displaying 1-10 of 27 Sample(s).




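| File Name | Sample ID | Data Type | File Format | Size | Release Date |
| --- | --- | --- | --- | --- | --- |
| | | Script | UNKNOWN | 0.59 KB | 2022-01-28 |
| | | Script | UNKNOWN | 0.45 KB | 2022-01-28 |
| | | Script | UNKNOWN | 1.62 KB | 2022-01-28 |
| | | Script | UNKNOWN | 1.16 KB | 2022-01-28 |
| | | Script | UNKNOWN | 1.96 KB | 2022-01-28 |
| | | Script | UNKNOWN | 0.66 KB | 2022-01-28 |
| | | Script | UNKNOWN | 13.11 KB | 2022-01-28 |
| | | Script | UNKNOWN | 0.88 KB | 2022-01-28 |
| | | Script | UNKNOWN | 2.66 KB | 2022-01-28 |
| | | Text | UNKNOWN | 9.46 MB | 2022-01-28 |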
Displaying 1-10 of 86 File(s).
Funding body Awardee Award ID Comments
Bill & Melinda Gates Foundation W Gruissem INV-008213
Ministry of Education Taiwan W Gruissem Yushan Scholarship
Fundação para a Ciência e a Tecnologia D Pratas CEECINST/00026/2018 Institutional Call to Scientific Employment Stimulus
Date Action
February 11, 2022 Dataset publish
March 7, 2022 Manuscript Link added: 10.1093/gigascience/giac028
October 7, 2022 Manuscript Link updated: 10.1093/gigascience/giac028