Supporting data for "The genome of an underwater architect, the caddisfly Stenopsyche tienmushanensis Hwang (Insecta: Trichoptera)"

Dataset type: Genomic, Transcriptomic
Data released on November 22, 2018

Luo S; Tang M; Frandsen PB; Stewart RJ; Zhou X (2018): Supporting data for "The genome of an underwater architect, the caddisfly Stenopsyche tienmushanensis Hwang (Insecta: Trichoptera)" GigaScience Database. http://dx.doi.org/10.5524/100538

DOI: 10.5524/100538

Caddisflies (Insecta: Trichoptera) are a highly adapted group of freshwater insects that diverged from a common ancestor shared with Lepidoptera. They are the most diverse (>16,000 species) of the strictly aquatic insect orders and are widely employed as bioindicators in water quality assessment and monitoring. Among their numerous adaptations to aquatic habitats, caddisfly larvae use silk and materials from the environment (stones, sticks, leaf matter, etc.) to build composite structures such as fixed retreats and portable cases. Understanding how caddisflies have adapted to aquatic habitats will help explain the evolution and subsequent diversification of the group.
We sequenced a retreat-building caddisfly, Stenopsyche tienmushanensis Hwang, and assembled a high-quality genome from both Illumina and PacBio sequencing data. In total, 601.2 M Illumina reads (90.2 Gb) and 16.9 M PacBio subreads (89.0 Gb) were generated. The 451.5 Mb assembled genome has a contig N50 of 1.29 Mb, a longest contig of 4.76 Mb, and covers 97.65% of the 1,658 insect single-copy genes as assessed by Benchmarking Universal Single-Copy Orthologs (BUSCO). The genome comprises 36.76% repetitive elements. A total of 14,672 predicted protein-coding genes were identified. The genome revealed gene expansions in specific groups of the cytochrome P450 family and olfactory binding proteins, suggesting potential genomic features associated with pollutant tolerance and mate finding. In addition, the complete gene complex of the highly repetitive H-fibroin, the major protein component of caddisfly larval silk, was assembled.
We report the draft genome of Stenopsyche tienmushanensis, the highest-quality caddisfly genome to date. This genome will be an important resource for the study of caddisflies and may shed light on the evolution of aquatic insects.
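The contig N50 and read totals quoted above can be related with a little arithmetic. The following is a minimal sketch, not the authors' pipeline: the function names and the toy contig lengths are hypothetical, "Gb" is assumed to mean 10^9 bases, and the assembly size is used as a stand-in for genome size when estimating raw sequencing depth.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def mean_depth(total_bases, assembly_size):
    """Raw sequencing depth: total sequenced bases divided by assembly size."""
    return total_bases / assembly_size


# Hypothetical toy contig lengths, for illustration only (not from the dataset).
toy_contigs = [2, 3, 4, 5, 6, 10]
toy_n50 = n50(toy_contigs)

# Figures taken from the abstract; Gb is assumed to mean 1e9 bases.
assembly_size = 451.5e6
illumina_depth = mean_depth(90.2e9, assembly_size)
pacbio_depth = mean_depth(89.0e9, assembly_size)

print(f"toy N50: {toy_n50}")
print(f"Illumina depth: {illumina_depth:.1f}x, PacBio depth: {pacbio_depth:.1f}x")
```

Under these assumptions, each platform contributes a little under 200x raw depth relative to the 451.5 Mb assembly. Applying `n50` to the contig lengths of the released genome FASTA would be the way to check the reported 1.29 Mb figure.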

Additional details

Read the peer-reviewed publication(s):

Luo, S., Tang, M., Frandsen, P. B., Stewart, R. J., & Zhou, X. (2018). The genome of an underwater architect, the caddisfly Stenopsyche tienmushanensis Hwang (Insecta: Trichoptera). GigaScience, 7(12). doi:10.1093/gigascience/giy143

Accessions (data generated as part of this study):

BioProject: PRJNA436868





Sample ID | Taxonomic ID | Common Name | Genbank Name | Scientific Name | Sample Attributes
Stie1 | 1560151 | | | Stenopsyche tienmushanensis | Description: DNA extracted from the whole body (exc...; Alternative accession-BioProject: PRJNA436868; Sex: female; ...
Stie1+Stie2 | 1560151 | | | Stenopsyche tienmushanensis | Description: Combined DNA extracts from the whole b...; Alternative accession-BioProject: PRJNA436868; Sex: female; ...
Stie2 | 1560151 | | | Stenopsyche tienmushanensis | Description: DNA extracted from the whole body (exc...; Alternative accession-BioProject: PRJNA436868; Sex: female; ...
Stie3 | 1560151 | | | Stenopsyche tienmushanensis | Description: RNA extracted from the whole body (exc...; Alternative accession-BioProject: PRJNA436868; Sex: female; ...




File Name | Sample ID | Data Type | File Format | Size | Release Date
 | | Genome sequence | TAR | 41.34 GB | 2018-11-15
 | | FASTA | FASTA | 542.63 KB | 2018-11-21
 | | FASTA | FASTA | 289.83 KB | 2018-11-21
 | | readme | TEXT | 3.7 KB | 2018-11-15
 | | Coding Sequence | GFF | 7.65 MB | 2018-11-15
 | | Tabular data | TEXT | 28.44 KB | 2018-11-21
 | | Tabular data | TEXT | 92.25 KB | 2018-11-21
 | | protein sequence | FASTA | 23.88 MB | 2018-11-15
 | | Coding Sequence | GFF | 5.09 MB | 2018-11-15
 | | Genome sequence | FASTA | 435.97 MB | 2018-11-15
Displaying 1-10 of 26 File(s).
Funding body | Awardee | Award ID | Comments
National Natural Science Foundation of China | X Zhou | 31772493 |
Chinese Universities Scientific Fund | X Zhou | 2017QC114 |
Chinese Universities Scientific Fund | X Zhou | 2018QC133 |
Date | Action
November 21, 2018 | Dataset publish