Supporting data for "Filling reference gaps via assembling DNA barcodes using high-throughput sequencing - moving toward barcoding the world"

Dataset type: Genomic, Metabarcoding
Data released on November 03, 2017

Liu S; Yang C; Zhou C; Zhou X (2017): Supporting data for "Filling reference gaps via assembling DNA barcodes using high-throughput sequencing - moving toward barcoding the world". GigaScience Database. http://dx.doi.org/10.5524/100363

DOI: 10.5524/100363

Over the past decade, biodiversity scientists have dedicated tremendous effort to constructing DNA reference barcodes for rapid species registration and identification. Although the analytical cost of standard DNA barcoding has been significantly reduced since the early 2000s, further dramatic reduction in barcoding costs is unlikely because Sanger sequencing is approaching its limits in throughput and chemistry cost. Constraints in barcoding cost have not only led to unbalanced barcoding efforts around the globe, but have also prevented High-Throughput-Sequencing (HTS) based taxonomic identification from applying binomial species names, which provide crucial linkages to biological knowledge. We developed an Illumina-based pipeline, HIFI-Barcode, to produce full-length COI barcodes from pooled PCR amplicons generated from individual specimens. The new pipeline generated accurate barcode sequences that were comparable to Sanger standards, even for different haplotypes of the same species that differed from each other by only a few nucleotides. Additionally, the new pipeline was much more sensitive in recovering amplicons at low quantity. The HIFI-Barcode pipeline successfully recovered barcodes from over 78% of the PCR reactions that did not show clear bands on the electrophoresis gel. Moreover, sequencing results based on the single-molecule sequencing platform, PacBio, confirmed the accuracy of the HIFI-Barcode results. Altogether, the new pipeline can provide an improved solution for producing full-length reference barcodes at about 1/10 of the current cost, enabling construction of comprehensive barcode libraries for local fauna and offering a feasible direction for DNA barcoding of global biomes.
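The comparison against Sanger standards described above comes down to counting nucleotide differences between two barcode sequences for the same specimen. The following is a minimal sketch of that idea, not the HIFI-Barcode pipeline's own code. It assumes the two sequences are already aligned to the same length, and the record names (`assembled_toy`, `sanger_toy`), the helper functions, and the toy sequences are hypothetical and invented purely for illustration.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {record_name: sequence} dict."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def count_mismatches(seq_a, seq_b):
    """Count positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Toy data only: these are not sequences from the dataset.
TOY_FASTA = """>assembled_toy
ACGTACGTAC
GTACGTACGT
>sanger_toy
ACGTACGTAC
GTACGAACGT
"""

records = read_fasta(TOY_FASTA)
mismatches = count_mismatches(records["assembled_toy"], records["sanger_toy"])
print(mismatches)
```

With real data, the same two functions could be applied to the contents of any of the FASTA files listed for this dataset, provided the sequences being compared have first been aligned.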

Additional details

Read the peer-reviewed publication(s):

Liu, S., Yang, C., Zhou, C., & Zhou, X. (2017). Filling reference gaps via assembling DNA barcodes using high-throughput sequencing—moving toward barcoding the world. GigaScience, 6(12), 1–8. doi:10.1093/gigascience/gix104

Additional information:

https://github.com/comery/HIFI-barcode-hiseq

https://github.com/comery/HIFI-barcode-pacbio

dx.doi.org/10.17504/protocols.io.ka9csh6

Accessions (data included in GigaDB):

BioProject: PRJNA414137





Sample ID | Taxonomic ID | Common Name | Genbank Name | Scientific Name | Sample Attributes
hifi01-F01 | 1572514 | | | Brahmaea hearseyi | Description: COI gene amplified from genomic DNA ex...; Species-a: Brahmaea hearseyi; Sample collection device or method: Malaise trap; ...
hifi01-F02 | 1137966 | | | Calyptra lata | Description: COI gene amplified from genomic DNA ex...; Species-a: Oraesia lata; Sample collection device or method: Malaise trap; ...
hifi01-F03 | 1430718 | | | Obeidia gigantearia | Description: COI gene amplified from genomic DNA ex...; Species-a: Obeidia gigantearia; Sample collection device or method: Malaise trap; ...
hifi01-F04 | 704655 | | | Amblychia sp. | Description: COI gene amplified from genomic DNA ex...; Species-a: Amblychia moltrechti; Sample collection device or method: Malaise trap; ...
hifi01-F05 | 351166 | | | Papilio dialis | Description: COI gene amplified from genomic DNA ex...; Species-a: Papilio (Princeps) dialis; Sample collection device or method: Malaise trap; ...
hifi01-F06 | 934919 | | | Asthena sp. | Description: COI gene amplified from genomic DNA ex...; Species-a: Asthena undulata; Sample collection device or method: Malaise trap; ...
hifi01-F07 | 72256 | | grass yellows | Eurema | Description: COI gene amplified from genomic DNA ex...; Species-a: Eurema hecabe; Sample collection device or method: Malaise trap; ...
hifi01-F08 | 76216 | | treebrowns | Lethe | Description: COI gene amplified from genomic DNA ex...; Species-a: Lethe andersoni; Sample collection device or method: Malaise trap; ...
hifi01-F09 | 311074 | | | Pseudergolis wedah | Description: COI gene amplified from genomic DNA ex...; Species-a: Pseudergolis wedah; Sample collection device or method: Malaise trap; ...
hifi01-F10 | 423314 | | | Catocala patala | Description: COI gene amplified from genomic DNA ex...; Species-a: Catocala patala; Sample collection device or method: Malaise trap; ...
Displaying 61-70 of 192 Sample(s).




File Name | Sample ID | Data Type | File Format | Size | Release Date
 | | Github archive | archive | 12.55 KB | 2017-10-16
 | | Github archive | archive | 2.64 MB | 2017-10-16
 | | Other | FASTA | 7.01 MB | 2017-10-16
 | | Amplicon sequence | FASTA | 53.17 KB | 2017-10-16
 | | Alignments | FASTA | 13.03 MB | 2017-10-16
 | | Sequence assembly | FASTA | 62.65 KB | 2017-10-16
 | | Alignments | FASTA | 5.5 MB | 2017-10-16
 | | Sequence assembly | FASTA | 68.34 KB | 2017-10-16
 | | Amplicon sequence | FASTA | 53.17 KB | 2017-10-16
 | | Amplicon sequence | FASTQ | 351.5 MB | 2017-10-16
Displaying 1-10 of 23 File(s).
Funding body | Awardee | Award ID | Comments
Chinese Universities Scientific Fund | X Zhou | 2017QC114 |
Date | Action
November 3, 2017 | Dataset publish
November 13, 2017 | Manuscript Link added: 10.1093/gigascience/gix104