Updated genome assembly of Ginkgo biloba

Dataset type: Genomic
Data released on June 04, 2019

Guan R; Zhao Y; Zhang H; Fan G; Liu X; Zhou W; Shi C; Wang J; Liu W; Liang X; Fu Y; Ma K; Zhao L; Zhang F; Lu Z; Lee SM; Xu X; Wang J; Yang H; Fu C; Ge S; Chen W (2019): Updated genome assembly of Ginkgo biloba. GigaScience Database. http://dx.doi.org/10.5524/100613


Ginkgo biloba is one of the world’s most ancient plants, a living fossil whose gross morphology has remained essentially unchanged for more than 200 million years. Representing one of the four extant gymnosperm lineages and having no living relatives, it possesses a suite of fascinating characteristics, including a large genome, outstanding resistance/tolerance to abiotic and biotic stresses, and dioecious reproduction, making it an ideal model species for biological studies.
Here we present an updated chromosome-level genome assembly, built with Hi-C technology, as a major improvement on the ginkgo draft assembly. A chromosome-level reference is a valuable resource for studies of biological diversity, evolutionary history, and population genetics. Taking advantage of technological advances, we upgraded the existing draft assembly to chromosome level using Hi-C, which has proven to be a fast, inexpensive, and accurate technology applicable to many species. Fresh leaves from a two-year-old seedling (TM301S) were crosslinked with 1% formaldehyde. To break down the cell walls, the formaldehyde-fixed powder was added to a buffer solution. The restriction endonuclease MboI was used to digest the DNA, followed by labeling with biotinylated residues. The Hi-C library was then sequenced on the BGISEQ-500 platform with 50 bp paired-end sequencing. The HiC-Pro pipeline (v2.11.1) was used for quality control. Of the 653,202,535 raw paired-end reads, 32% (207,324,555) were valid paired Hi-C reads suitable for downstream analysis. Based on these valid Hi-C reads, we used Juicer (v1.6.2) and the Aiden lab’s Hi-C assembly pipeline (v180922) to assemble the genome with the main parameters "-m haploid -s 4 -c 12", generating 12 chromosomes spanning 9.03 Gb (~94% of the whole genome).
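
The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are made up for this example, and the implied total size is only as precise as the rounded "~94%" figure, so it should be read as an approximation rather than a reported assembly statistic.

```python
# Figures quoted in the dataset description.
raw_read_pairs = 653_202_535     # raw paired-end Hi-C reads
valid_hic_pairs = 207_324_555    # pairs retained after quality control
anchored_gb = 9.03               # total length of the 12 chromosomes, in Gb
anchored_fraction = 0.94         # approximate share of the genome they cover

# Share of raw pairs that passed quality control.
valid_fraction = valid_hic_pairs / raw_read_pairs

# Total size implied by "9.03 Gb is ~94% of the whole genome" (approximate).
implied_total_gb = anchored_gb / anchored_fraction
implied_unanchored_gb = implied_total_gb - anchored_gb

print(f"valid Hi-C pairs: {valid_fraction:.1%}")
print(f"implied total size: {implied_total_gb:.2f} Gb")
print(f"implied unanchored sequence: {implied_unanchored_gb:.2f} Gb")
```

The valid fraction works out to just under 32%, consistent with the rounded value in the text.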

Additional details

Related datasets:

doi:10.5524/100613 IsNewVersionOf doi:10.5524/100209
doi:10.5524/100613 IsCitedBy doi:10.5524/100773

Accessions (data generated as part of this study):

BioSample: SAMN11919598

Accessions (data referenced by this study):

BioProject: PRJNA307642

Sample ID: TM301S
Taxonomic ID: 3311
Common Name: ginkgo
Genbank Name: maidenhair tree
Scientific Name: Ginkgo biloba
Sample Attributes:
Description: DNA extracted from the leaf of seedlin...
Molecular data type: DNA
Alternative accession-BioSample: SAMN11919598

Data Type           File Format   Size       Release Date
other               archive       8.7 MB     2019-06-04
Coding Sequence     FASTA         47.58 MB   2019-06-04
Genome sequence     AGP           5.46 MB    2019-06-04
annotation          GFF           12.53 MB   2019-06-04
Genome sequence     FASTA         2.53 GB    2019-06-04
protein sequence    FASTA         17.25 MB   2019-06-04
readme              TEXT          0.5 KB     2019-06-04
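
One of the files listed above is in AGP format, which describes how each assembled object (here, presumably a chromosome) is built from components and gaps. As a minimal sketch, assuming the file follows the standard tab-separated, nine-column AGP layout, per-object lengths and gap totals could be summarized as below. The records in EXAMPLE_AGP and the function name summarize_agp are hypothetical and are not taken from the dataset.

```python
import io
from collections import defaultdict

# Hypothetical AGP records (tab-separated), NOT taken from the dataset.
EXAMPLE_AGP = (
    "# illustrative records only\n"
    "chr1\t1\t1000\t1\tW\tscaffold_a\t1\t1000\t+\n"
    "chr1\t1001\t1100\t2\tN\t100\tscaffold\tyes\tproximity_ligation\n"
    "chr1\t1101\t1600\t3\tW\tscaffold_b\t1\t500\t-\n"
    "chr2\t1\t750\t1\tW\tscaffold_c\t1\t750\t+\n"
)


def summarize_agp(handle):
    """Return {object: {"length", "components", "gap_bp"}} for an AGP stream."""
    stats = defaultdict(lambda: {"length": 0, "components": 0, "gap_bp": 0})
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        obj = fields[0]
        start, end = int(fields[1]), int(fields[2])
        comp_type = fields[4]
        entry = stats[obj]
        # Object coordinates are 1-based, so the largest end is the length.
        entry["length"] = max(entry["length"], end)
        if comp_type in ("N", "U"):
            # Gap line: count the bases it spans in the object.
            entry["gap_bp"] += end - start + 1
        else:
            entry["components"] += 1
    return dict(stats)


summary = summarize_agp(io.StringIO(EXAMPLE_AGP))
for name, s in summary.items():
    print(name, s["length"], s["components"], s["gap_bp"])
```

To run this on a real file, the StringIO object would be replaced with an open file handle; summing the "length" values over all objects gives the total anchored length.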

Funding body                          Award ID                    Comments
Shenzhen Municipal Government         NO.JSGG20130918102805062    Technology Innovation Program
Shenzhen Municipal Government         NO.JCYJ20120618172523025    Basic Research Program Support
Shenzhen Municipal Government         NO.JCYJ20150529150505656    Basic Research Program Support
Zhejiang Province Government          2014C32107                  Public Technology Research Project of Zhejiang Province
Ministry of Science and Technology    31000102                    National Natural Science Foundation of China

Date            Action
June 4, 2019    Dataset publish
June 6, 2021    File readme_100613.txt updated