
Data released on October 31, 2016

BGISEQ-500 sequencer first reference dataset

Chen, H; Chen, Y; Geng, C; Huang, J; Jiang, H; Li, Y; Liang, X; Liao, S; Lu, H; Mei, X; Qu, S; Rao, J; Sun, N; Wang, J; Xuan, Y; Yu, T; Zhang, W; Liu, X; Yang, Z; Mu, F; Gao, S (2016): BGISEQ-500 sequencer first reference dataset. GigaScience Database.

The BGISEQ-500 is a new desktop sequencer developed by BGI. Using DNA nanoballs (DNB) and combinatorial probe-anchor synthesis (cPAS), both developed from Complete Genomics(TM) sequencing technology, it generates short reads at a large scale, which can help meet the growing demand for sequencing. Here, we present the first human whole-genome sequencing dataset from the BGISEQ-500. The dataset was generated by sequencing the widely used Genome in a Bottle Consortium cell line HG001 (NA12878) in one sequencing run. The sequencing data comprise ~1,000 million paired sequences with a length of 50 bp (PE50). We also include examples of the raw images from the sequencer for reference. Finally, we carried out variant calling on the dataset and compared the results with variants identified from a similar amount of publicly available HiSeq2500 data and with the previously identified high-confidence variants.
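As a rough illustration of what the figures above imply, the following sketch estimates the average sequencing depth. It assumes that "~1,000 million paired sequences" means about 1e9 read pairs (two 50 bp reads each) and that the human genome is about 3.1 Gb; both values are assumptions made for this example, not numbers stated by the dataset authors. If the figure instead counts individual reads, the estimate halves.

```python
def estimated_depth(read_pairs, read_length_bp, genome_size_bp):
    """Average depth = total sequenced bases / genome size."""
    total_bases = read_pairs * 2 * read_length_bp  # two reads per pair
    return total_bases / genome_size_bp


# Assumed inputs (see the note above): 1e9 pairs, 50 bp reads, 3.1 Gb genome.
READ_PAIRS = 1_000_000_000
READ_LENGTH_BP = 50
GENOME_SIZE_BP = 3_100_000_000

depth = estimated_depth(READ_PAIRS, READ_LENGTH_BP, GENOME_SIZE_BP)
print(f"Estimated average depth: {depth:.1f}x")
```

This is only a raw-yield estimate; the usable depth after filtering, alignment, and duplicate removal would be lower.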


Read the peer-reviewed publication(s):

Huang, J., Liang, X., Xuan, Y., Geng, C., Li, Y., Lu, H., … Gao, S. (2017). A reference human genome dataset of the BGISEQ-500 sequencer. GigaScience, 6(5), 1–9. doi:10.1093/gigascience/gix024

Related datasets:

doi:10.5524/100252 IsPreviousVersionOf doi:10.5524/100274 (doi:10.5524/100274 is a more recent version of this dataset)

There is a new version of this dataset available at: DOI: 10.5524/100274



Samples:

Sample ID   Taxonomic ID   Common Name   Genbank Name   Scientific Name   Sample Attributes
NA12878     9606           Human         human          Homo sapiens      Description: NA12878 cell line (RRID: CVCL_7526) ge...; Analyte type: DNA

Files:

Data Type         File Format   Size        Release Date
Genome sequence   FASTQ         29.06 GB    2016-10-31
Genome sequence   FASTQ         31.88 GB    2016-10-31
Genome sequence   FASTQ         27.4 GB     2016-10-31
Genome sequence   FASTQ         29.83 GB    2016-10-31
Readme            TEXT          0.2 KB      2016-10-31
MD5sum            TEXT          0.12 KB     2016-10-31
MD5sum            TEXT          0.12 KB     2016-10-31
image             TAR           588.62 MB   2016-10-31
MD5sum            TEXT          0.05 KB     2016-10-31
Readme            TEXT          2.47 KB     2016-10-31
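
Because the FASTQ files are tens of gigabytes each, it is worth checking downloads against the MD5sum files listed above. The sketch below shows one way to do that in Python. It assumes the MD5sum files use the common md5sum layout (a hex digest, whitespace, then a file name per line); the page does not show their contents, so that layout is an assumption, and the function names and example file name are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sum_lines(text):
    """Parse md5sum-style lines ("<hex digest>  <file name>") into a dict.

    Assumed layout; adjust if the downloaded MD5sum files differ.
    """
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        checksum, name = line.split(maxsplit=1)
        # md5sum marks binary mode with a leading "*" on the file name.
        expected[name.lstrip("*")] = checksum.lower()
    return expected


def verify(md5_file, data_dir="."):
    """Return {file name: True/False} for each entry listed in md5_file."""
    expected = parse_md5sum_lines(Path(md5_file).read_text())
    return {
        name: md5_of_file(Path(data_dir) / name) == checksum
        for name, checksum in expected.items()
    }
```

For example, `verify("md5.txt", data_dir="downloads")` (with a hypothetical `md5.txt`) would return a dictionary mapping each listed file to whether its checksum matched. Reading in chunks keeps memory use constant regardless of file size.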


