Data released on April 10, 2018

Supporting data for "Chiron: Translating nanopore raw signal directly into nucleotide sequence using deep learning"

Teng, H; Cao, MD; Hall, MB; Duarte, T; Wang, S; Coin, LJ (2018): Supporting data for "Chiron: Translating nanopore raw signal directly into nucleotide sequence using deep learning" GigaScience Database. http://dx.doi.org/10.5524/100425

Sequencing by translocating DNA fragments through an array of nanopores is a rapidly maturing technology which offers faster and cheaper sequencing than other approaches. However, accurately deciphering the DNA sequence from the noisy and complex electrical signal is challenging. Here, we report Chiron, the first deep learning model to achieve end-to-end basecalling: directly translating the raw signal to DNA sequence without the error-prone segmentation step. Trained with only a small set of 4000 reads, we show that our model provides state-of-the-art basecalling accuracy even on previously unseen species. Chiron achieves basecalling speeds of over 2000 bases per second using desktop computer graphics processing units, making it competitive with other deep-learning basecalling algorithms.

Additional information:

https://github.com/haotianteng/chiron

https://pypi.python.org/pypi/chiron

https://github.com/nanopore-wgs-consortium/NA12878

Accessions (data included in GigaDB):

BioProject: PRJNA386696
SRA: SRP136964

Keywords:

ONT, nanopore sequencing, deep learning, artificial neural network, comparative performance

Dataset type:

Software

Funding:

  • Funding body - National Health and Medical Research Council
  • Award ID - GNT1130084
  • Awardee - LJM Coin
  • Funding body - Australian Research Council
  • Award ID - DP170102626
  • Awardee - LJM Coin
  • Funding body -
  • Comment - Westpac Future Leaders Scholarship
  • Awardee - MB Hall

Samples:

  • Sample ID - MT20823
  • Taxonomic ID - 1773
  • Scientific Name - Mycobacterium tuberculosis
  • Description - Oxford Nanopore MinION and Illumina se...
  • Collected by - CPHL
  • Collection date - 30-Jun-2013
  • ...

  • Sample ID - NA12878
  • Taxonomic ID - 9606
  • Common Name - Human
  • Genbank Name - human
  • Scientific Name - Homo sapiens
  • Description -
  • Age - not provided
  • Source material identifiers - Coriell:NA12878
  • ...

Displaying 1-2 of 2 Sample(s).

Files:

Data Type        File Format   Size        Release Date
Mixed archive    GZIP          519.48 MB   2018-03-19
GitHub archive   archive       81.55 MB    2018-03-08
mixed archive    TAR           457.81 MB   2018-03-08
GitHub archive   archive       2.58 MB     2018-03-08
Mixed archive    GZIP          24.71 GB    2018-03-19
Readme           TEXT          2.04 KB     2018-03-08
mixed archive    TAR           447.71 MB   2018-03-08

Displaying 1-7 of 7 File(s).
