Data released on April 12, 2017
The lined seahorse, Hippocampus erectus, is an Atlantic species that mainly inhabits shallow sea-beds and coral reefs. It has become very popular in China because of its wide use in traditional Chinese medicine. To improve the aquaculture yield of this valuable fish species, we are developing genomic resources to support assisted selection in genetic breeding. Here, we provide whole genome sequencing, assembly and gene annotation of the lined seahorse, which enrich the genomic resources available for this species and can be applied to its molecular breeding.
A total of 174.6 Gb (gigabases) of raw DNA sequence was generated on the Illumina HiSeq 2500 platform. The final assembly of the lined seahorse genome is around 458 Mb, representing 94% of the estimated genome size (489 Mb by k-mer analysis). The contig N50 and scaffold N50 reached 14.57 kb and 1.97 Mb, respectively. The quality of the assembled genome was assessed with BUSCO, which detected 85% of the known vertebrate genes, and by mapping de novo assembled RNA-seq transcripts to the assembly, which gave a high mapping ratio (more than 99% of transcripts could be mapped). Using homology-based, de novo and transcriptome-based prediction methods, we predicted 20,788 protein-coding genes in the assembly, a number similar to the gene number (23,458) we previously reported for the tiger tail seahorse (H. comes).
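As a rough illustration of the statistics quoted above, the following Python sketch shows how an N50 value is conventionally computed (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly) and re-derives the reported 94% figure from the two sizes given. The function name `n50` and the toy length list are hypothetical and are not taken from the dataset; the real contig and scaffold lengths would have to be read from the released assembly files.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    they cover at least half of the total length; the length of the
    sequence that crosses that threshold is the N50.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example (hypothetical lengths, not from the seahorse assembly):
# the total is 32, and the two longest sequences (8 + 8) reach half of it.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)  # 8

# Figures quoted in the text above.
ASSEMBLY_MB = 458          # final assembly size
ESTIMATED_GENOME_MB = 489  # k-mer based genome size estimate
RAW_DATA_GB = 174.6        # raw sequence generated

# Fraction of the estimated genome represented by the assembly (about 94%).
assembled_fraction = ASSEMBLY_MB / ESTIMATED_GENOME_MB

# Raw data volume relative to the estimated genome size. This ratio is
# derived here for illustration only; it is not reported in the source.
raw_fold = (RAW_DATA_GB * 1000) / ESTIMATED_GENOME_MB

print(f"toy N50: {toy_n50}")
print(f"assembled fraction: {assembled_fraction:.1%}")
print(f"raw data / genome size: {raw_fold:.0f}x")
```

Comparing `2 * running` against `total` keeps the threshold test in integer arithmetic, so the result does not depend on how a half-total would be rounded.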
We report a draft genome of the lined seahorse. These genomic data will enrich the genomic resources for this economically important fish and provide insights into the genetic mechanisms underlying its iconic morphology and male pregnancy behavior.