Updated genome assembly of YH: the first diploid genome sequence of a Han Chinese individual (version 2, 07/2012)

Dataset type: Genomic
Data released on December 12, 2012

Wang J; Li Y; Luo R; Liu B; Xie Y; Li Z; Fang X; Zheng H; Qin J; Yang B; Yu C; Ni P; Li N; Guo G; Ye J; Fang L; Su Y; Asan; Zheng H; Kristiansen K; Wong GK; Nielsen R; Durbin R; Bolund L; Zhang X; Li S; Yang H; Wang J (2012): Updated genome assembly of YH: the first diploid genome sequence of a Han Chinese individual (version 2, 07/2012). GigaScience Database. http://dx.doi.org/10.5524/100038


Updated genomic data from the YH (Homo sapiens) diploid genome, the first sequenced Han Chinese individual and a representative of the Asian population. The genomic DNA used in this study came from an anonymous male Han Chinese individual with no known genetic diseases. The original version of the YH genome was assembled from 3.3 billion reads generated on the Illumina Genome Analyzer (see dataset doi:10.5524/100015). This latest (as of 07/2012) and improved version was assembled from 2.1 billion reads generated on the Illumina HiSeq2000. A total of 202 G nucleotides of data were obtained from 100 bp paired-end reads with insert sizes ranging from 180 bp to 40 kbp, giving 67.5-fold average coverage of the genome. The latest version of SOAPdenovo2 was used to reassemble, improve and update the previously assembled genome (tools and pipelines are available at doi:10.5524/100044). When the short reads were aligned with SOAP, 177 G nucleotides mapped onto the NCBI reference genome and 99.99% of the genome was covered. The raw sequences, assemblies and relevant tools are released for public use under a CC0 license. More information about the YH genome is available at: http://yh.genomics.org.cn/
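
The sequencing figures quoted in the description can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the dataset or its pipelines. It assumes the usual definition of average coverage (total sequenced bases divided by genome length), and the function names are hypothetical. It derives the genome length implied by the stated yield and coverage, the share of sequenced bases that mapped to the reference, and the nominal yield of 2.1 billion reads at 100 bp each.

```python
# Figures quoted in the dataset description (not independently measured).
TOTAL_BASES = 202e9     # "202G nucleotides" of sequence data
MAPPED_BASES = 177e9    # "177G nucleotides" mapped onto the reference
MEAN_COVERAGE = 67.5    # "67.5-fold average coverage"
READ_COUNT = 2.1e9      # "2.1 billion reads"
READ_LENGTH = 100       # read length in bp


def implied_genome_size(total_bases: float, coverage: float) -> float:
    """Genome length implied by coverage = total_bases / genome_length."""
    return total_bases / coverage


def mapped_fraction(mapped_bases: float, total_bases: float) -> float:
    """Share of sequenced bases that mapped onto the reference."""
    return mapped_bases / total_bases


def nominal_yield(read_count: float, read_length: int) -> float:
    """Bases expected if every read has the full nominal length."""
    return read_count * read_length


if __name__ == "__main__":
    size = implied_genome_size(TOTAL_BASES, MEAN_COVERAGE)
    frac = mapped_fraction(MAPPED_BASES, TOTAL_BASES)
    nominal = nominal_yield(READ_COUNT, READ_LENGTH)
    print(f"Implied genome length: {size / 1e9:.2f} G nucleotides")
    print(f"Mapped fraction of sequenced bases: {frac:.1%}")
    print(f"Nominal yield from read count: {nominal / 1e9:.0f} G nucleotides")
```

The nominal yield from the rounded read count is slightly higher than the 202 G nucleotides reported. The description does not explain the difference, so it should not be over-interpreted.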

Additional details

Read the peer-reviewed publication(s):

Wang, J., Wang, W., Li, R., Li, Y., Tian, G., Goodman, L., … Zhang, J. (2008). The diploid genome sequence of an Asian individual. Nature, 456(7218), 60–65. doi:10.1038/nature07484 (PubMed: 18987735)

Related datasets:

doi:10.5524/100038 IsNewVersionOf doi:10.5524/100015
doi:10.5524/100038 IsCompiledBy doi:10.5524/100044
doi:10.5524/100038 IsSupplementedBy doi:10.5524/100097
doi:10.5524/100038 IsSupplementedBy doi:10.5524/100096

Accessions (data generated as part of this study):

ENA: ERP001652


Sample ID | Taxonomic ID | Common Name | Genbank Name | Scientific Name | Sample Attributes
YH | 9606 | Human | human | Homo sapiens |

File Name | Sample ID | Data Type | File Format | Size | Release Date
 | YH | Genome sequence | FASTQ | 5.96 GB | 2012-12-12
 | YH | Genome sequence | FASTQ | 5.32 GB | 2012-12-12
 | YH | Genome sequence | FASTQ | 5.16 GB | 2012-12-12
 | YH | Genome sequence | FASTQ | 5.06 GB | 2012-12-12
 | YH | Genome sequence | FASTQ | 4.86 GB | 2012-12-12
 | YH | Genome sequence | FASTQ | 6.49 GB | 2012-12-12
 | YH | Genome sequence | FASTQ | 6.14 GB | 2012-12-12
 | YH | Genome sequence | FASTQ | 6.48 GB | 2012-12-12
 | YH | Genome sequence | FASTQ | 6.11 GB | 2012-12-12
 | YH | Genome sequence | FASTQ | 5.15 GB | 2012-12-12
Displaying 1-10 of 15 File(s).

Date | Action
August 18, 2016 | File 110112_I199_FC819BBABXX_L3_HUMiqvDBTDWAAPE_2.fq.gz removed