
Data released on March 22, 2017

An updated reference human genome dataset of the BGISEQ-500 sequencer

Chen, H; Gao, S; Geng, C; Huang, J; Jiang, H; Li, Y; Liang, X; Liu, X; Lu, H; Mei, X; Mu, F; Qu, S; Sun, N; Xuan, Y; Yang, Z; Yu, T (2017): An updated reference human genome dataset of the BGISEQ-500 sequencer. GigaScience Database.

The BGISEQ-500 is a new desktop sequencer developed by BGI. Using DNA nanoballs (DNB) and combinational probe-anchor synthesis (cPAS), both developed from Complete Genomics™ sequencing technology, it generates short reads at a large scale, which can help meet the growing demand for sequencing. Here, we present the first human whole-genome sequencing dataset from the BGISEQ-500. The dataset was generated by sequencing the widely used Genome in a Bottle Consortium cell line, HG001 (NA12878). We previously released the paired-end 50 bp (PE50) sequences (DOI:10.5524/100252); here we present the PE100 reads from the same sample, together with the assembled genome. We also include examples of the raw images from the sequencer for reference. Finally, we carried out variant calling on the dataset and compared the results with a similar amount of publicly available HiSeq2500 data and with the previously identified high-confidence variants for this genome.
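To make the comparison of call sets concrete, the sketch below shows one simple way to measure agreement between a VCF call set and a reference set of high-confidence variants. It is an illustration only, not the procedure used by the authors: it assumes uncompressed or gzipped VCF text, matches variants by exact (chromosome, position, REF, ALT) keys, and does not normalize variant representation. The function names are hypothetical.

```python
import gzip
from typing import Dict, Iterable, Set, Tuple

# A variant is identified here by (chromosome, position, REF allele, ALT allele).
Variant = Tuple[str, int, str, str]


def read_variants(lines: Iterable[str]) -> Set[Variant]:
    """Collect variant keys from VCF text lines, one key per ALT allele."""
    variants: Set[Variant] = set()
    for line in lines:
        # Skip blank lines, meta-information lines (##) and the header line (#CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # Multi-allelic records list several ALT alleles separated by commas.
        for alt in alts.split(","):
            variants.add((chrom, pos, ref, alt))
    return variants


def load_vcf(path: str) -> Set[Variant]:
    """Read a plain-text or gzip-compressed VCF file into a set of variant keys."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return read_variants(handle)


def concordance(calls: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Summarize how many calls are shared with the reference ("truth") set."""
    shared = len(calls & truth)
    return {
        "shared": shared,
        # Fraction of our calls that also appear in the reference set.
        "precision": shared / len(calls) if calls else 0.0,
        # Fraction of the reference set that our calls recover.
        "sensitivity": shared / len(truth) if truth else 0.0,
    }
```

Exact-key matching undercounts agreement when the same variant is written differently in two files (for example, indels with different left alignment), so a real benchmark would normalize the records first or use a dedicated comparison tool.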


Read the peer-reviewed publication(s):

Huang, J., Liang, X., Xuan, Y., Geng, C., Li, Y., Lu, H., … Gao, S. (2017). A reference human genome dataset of the BGISEQ-500 sequencer. GigaScience, 6(5), 1–9. doi:10.1093/gigascience/gix024

Related datasets:

doi:10.5524/100274 IsNewVersionOf doi:10.5524/100252
doi:10.5524/100274 IsPreviousVersionOf doi:10.5524/100449 (It is a more recent version of this dataset)

There is a new version of this dataset available at: DOI: 10.5524/100449

Accessions (data included in GigaDB):

BioProject: PRJEB15427



Files:

| File Name | Sample ID | Data Type | File Format | Size | Release Date |
|---|---|---|---|---|---|
|  |  | Sequence variants | VCF | 204.79 MB | 2017-03-21 |
|  |  | Sequence variants | VCF | 191.77 MB | 2017-03-21 |
|  | NA12878 | Genome sequence | FASTQ | 20.78 GB | 2017-01-23 |
|  | NA12878 | Genome sequence | FASTQ | 23.47 GB | 2017-01-23 |
|  | NA12878 | Genome sequence | FASTQ | 22.19 GB | 2017-01-23 |
|  | NA12878 | Genome sequence | FASTQ | 24.2 GB | 2017-01-23 |
|  |  | Sequence variants | VCF | 199.98 MB | 2017-03-21 |
|  |  | md5sum values | TEXT | 0.25 KB | 2017-01-23 |
|  |  | Readme | TEXT | 0.67 KB | 2017-01-23 |
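Because the file list includes an md5sum values file, downloads of the large FASTQ files can be checked for corruption. The sketch below is one way to do that in Python. It assumes the checksum file uses the usual `md5sum` layout (one `<hash>  <file name>` pair per line), which has not been confirmed for this dataset. The function names and the paths in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict


def md5_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reading keeps memory use flat even for multi-gigabyte FASTQ files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_list(text: str) -> Dict[str, str]:
    """Parse md5sum-style lines into {file name: expected checksum}.

    Assumed layout (not confirmed for this dataset): "<hash>  <file name>".
    """
    expected: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        checksum, name = parts
        # md5sum marks binary-mode entries with a leading "*" before the name.
        expected[name.strip().lstrip("*")] = checksum.lower()
    return expected


def verify_downloads(checksum_file, data_dir) -> Dict[str, bool]:
    """Return {file name: True if present and the checksum matches}."""
    expected = parse_checksum_list(Path(checksum_file).read_text())
    results: Dict[str, bool] = {}
    for name, checksum in expected.items():
        target = Path(data_dir) / name
        results[name] = target.exists() and md5_of_file(target) == checksum
    return results
```

For example, `verify_downloads("md5.txt", "downloads")` would return a dictionary mapping each listed file name to `True` or `False`; both paths here are placeholders for wherever the files were saved.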


