Supporting data and materials for "De Novo assembly of the chimpanzee transcriptome from NextGen mRNA sequences".

Dataset type: Transcriptomic
Data released on April 21, 2015

Maudhoo MD; Madison JD; Norgren Jr RB (2015): Supporting data and materials for "De Novo assembly of the chimpanzee transcriptome from NextGen mRNA sequences". GigaScience Database. http://dx.doi.org/10.5524/100137

DOI: 10.5524/100137

Common chimpanzees (Pan troglodytes) and bonobos (Pan paniscus) are the species most closely related to humans. For this reason, it is especially important to have complete and accurate chimpanzee nucleotide and protein sequences to understand how humans evolved their unique capabilities. We provide transcriptome data from four untransformed cell types derived from the reference Pan troglodytes “Clint”, to better annotate the chimpanzee genome and provide empirical validation for proposed gene models of this important species. RNA was extracted from primary cells cultured from four tissues: skin, adipose stroma, vascular smooth muscle, and skeletal muscle. These four RNA samples were sequenced on the Illumina HiSeq 2000 platform. Sequences were deposited in the Sequence Read Archive (SRA). Transcripts were assembled, annotated and deposited in the INSDC Transcriptome Shotgun Assembly (TSA) database. We have provided a high-quality annotation of 44,275 transcripts with full-length coding sequence (CDS). This set represented a total of 10,110 unique genes, thus providing empirical support for their existence. This dataset can be used to improve annotation of the Pan troglodytes genome.

Additional details

Accessions (data generated as part of this study):

BioProject: PRJNA173089
ENA: GABD01000000
ENA: GABC01000000
ENA: GABF01000000
ENA: GABE01000000
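
The four TSA master accessions above can be matched to their source tissues with a short script. What follows is a minimal sketch, not part of the dataset: it assumes the accession layout seen on this page (a four-letter prefix, a two-digit version, and a six-digit serial number, with serial 0 denoting the master record), and the prefix-to-tissue mapping is copied from the sample table on this page. The names `parse_tsa_accession`, `tissue_for` and `TISSUE_BY_PREFIX` are hypothetical helpers introduced for illustration.

```python
import re
from typing import NamedTuple, Optional

# Prefix-to-tissue mapping copied from the sample table on this page.
TISSUE_BY_PREFIX = {
    "GABC": "Adipose stroma",
    "GABF": "Vascular smooth muscle",
    "GABD": "Skin",
    "GABE": "Skeletal muscle",
}


class TsaAccession(NamedTuple):
    prefix: str   # four-letter project prefix, e.g. "GABD"
    version: int  # two-digit assembly version, e.g. 1
    serial: int   # six-digit serial; 0 is assumed to be the master record


# Assumed layout: 4 letters + 2 digits + 6 digits (as in "GABD01000000").
_PATTERN = re.compile(r"^([A-Z]{4})(\d{2})(\d{6})$")


def parse_tsa_accession(accession: str) -> TsaAccession:
    """Split an accession such as 'GABC01000001' into its three parts."""
    match = _PATTERN.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"unexpected accession layout: {accession!r}")
    prefix, version, serial = match.groups()
    return TsaAccession(prefix, int(version), int(serial))


def tissue_for(accession: str) -> Optional[str]:
    """Return the source tissue for an accession, or None if the prefix is unknown."""
    return TISSUE_BY_PREFIX.get(parse_tsa_accession(accession).prefix)


if __name__ == "__main__":
    for acc in ("GABD01000000", "GABC01000000", "GABF01000000", "GABE01000000"):
        print(acc, "->", tissue_for(acc))
```

Keeping the mapping in a plain dictionary makes it easy to check against the sample table; accessions with a prefix outside this dataset return `None` rather than raising an error.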





Sample ID: SRX179264
Taxonomic ID: 9598
Common Name: Chimpanzee
Genbank Name: chimpanzee
Scientific Name: Pan troglodytes
Sample Attributes: Cell type: Stem cells; Tissue: Adipose stroma; Alternative accession-INSDC: GABC01000001–GA...

Sample ID: SRX179266
Taxonomic ID: 9598
Common Name: Chimpanzee
Genbank Name: chimpanzee
Scientific Name: Pan troglodytes
Sample Attributes: Cell type: Endothelial cells; Tissue: Vascular smooth muscle; Alternative accession-INSDC: GABF01000001–GA...

Sample ID: SRX179267
Taxonomic ID: 9598
Common Name: Chimpanzee
Genbank Name: chimpanzee
Scientific Name: Pan troglodytes
Sample Attributes: Cell type: Fibroblasts; Tissue: Skin; Alternative accession-INSDC: GABD01000001–GA...

Sample ID: SRX179271
Taxonomic ID: 9598
Common Name: Chimpanzee
Genbank Name: chimpanzee
Scientific Name: Pan troglodytes
Sample Attributes: Cell type: Myoblasts; Tissue: Skeletal muscle; Alternative accession-INSDC: GABE01000001–GA...

Displaying 1-4 of 4 Sample(s).




Data Type: ISA-Tab
File Format: zip
Size: 4.71 KB
Release Date: 2015-05-28

Data Type: Readme
File Format: TEXT
Size: 1.62 KB
Release Date: 2015-04-09

Displaying 1-2 of 2 File(s).
Date Action
April 27, 2015 Dataset publish
April 27, 2015 Description updated
May 28, 2015 Additional file chimpanzee_transcriptome_ISA.zip added