Data released on July 13, 2017

Supporting data for "Population-wide Sampling of Retrotransposon Insertion Polymorphisms Using Deep Sequencing and Efficient Detection"

Yu, Q; Zeng, Y; Zhang, W; Zhang, X; Wang, Y; Wang, Y; Xu, L; Huang, X; Li, N; Zhou, X; Lu, J; Guo, X; Li, G; Hou, Y; Liu, S; Li, B (2017): Supporting data for "Population-wide Sampling of Retrotransposon Insertion Polymorphisms Using Deep Sequencing and Efficient Detection" GigaScience Database. http://dx.doi.org/10.5524/100318

Active retrotransposons play important roles during evolution and continue to shape our genomes today, especially in genetic polymorphisms underlying a diverse set of diseases. However, studies of human retrotransposon insertion polymorphisms (RIPs) based on whole-genome deep sequencing at the population level have not been sufficiently undertaken, despite the obvious need for a thorough characterization of RIPs in the general population.
Herein, we present a novel and efficient computational tool named Specific Insertions Detector (SID) for the detection of non-reference RIPs. We demonstrate that SID is suitable for high depth whole-genome sequencing (WGS) data using paired-end reads obtained from simulated and real datasets. We construct a comprehensive RIP database using a large population of 90 Han Chinese individuals with a mean 68× depth per individual. In total, we identify 9342 recent RIPs, and 8433 of these RIPs are novel compared with dbRIP, including 5826 Alu, 2169 long interspersed nuclear element 1 (L1), 383 SVA, and 55 long terminal repeats (LTR). Among the 9342 RIPs, 4828 were located in gene regions and five were located in protein-coding regions. We demonstrate that RIPs can, in principle, be an informative resource to perform population evolution and phylogenetic analyses. Taking the demographic effects into account, we identify a weak negative selection on SVA and L1 but approximately neutral selection for Alu elements based on the frequency spectrum of RIPs.
SID is a powerful open-source program for the detection of non-reference RIPs. We built a non-reference RIP dataset that greatly enhances the diversity of RIPs detected in the general population and should be invaluable to researchers interested in many aspects of human evolution, genetics, and disease. As a proof of concept, we demonstrate that RIPs can be used as biomarkers in a similar way to single nucleotide polymorphisms (SNPs).
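The counts quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the numbers are copied from the abstract, while the variable names are assumptions introduced here and are not part of SID or of the dataset.

```python
# Counts copied from the abstract; all variable names are illustrative only.
total_rips = 9342
novel_by_family = {"Alu": 5826, "L1": 2169, "SVA": 383, "LTR": 55}
in_gene_regions = 4828

# The per-family counts of novel RIPs should add up to the reported 8433.
novel_total = sum(novel_by_family.values())
novel_fraction = novel_total / total_rips
gene_fraction = in_gene_regions / total_rips

print(f"novel RIPs: {novel_total} ({novel_fraction:.1%} of {total_rips})")
for family, count in novel_by_family.items():
    print(f"  {family}: {count} ({count / novel_total:.1%} of novel RIPs)")
print(f"in gene regions: {in_gene_regions} ({gene_fraction:.1%} of {total_rips})")
```

The per-family counts sum to the 8433 novel RIPs stated in the abstract, which is roughly nine in ten of the 9342 RIPs identified.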

Related datasets:

doi:10.5524/100318 Cites doi:10.5524/100096

Additional information:

https://github.com/Jonathanyu2014/SID

Accessions (data not in GigaDB):

BioProject: PRJEB11005

Keywords:

Transposable element, retrotransposon insertion polymorphism, next-generation sequencing, whole-genome sequencing

Genomic, Software

Funding:

  • Funding body - Shenzhen Municipal Government of China
  • Location - China
  • Award ID - JSGG20140702161347218
  • Awardee - Yong Hou
  • Funding body - Shenzhen Municipal Government of China
  • Location - China
  • Award ID - KQCX20150330171652450
  • Awardee - Guanglei Li

Samples:

Sample ID | Taxonomic ID | Common Name | Genbank Name | Scientific Name | Sample Attributes
HG00683 | 9606 | Human | human | Homo sapiens | Isolation source: peripheral vein; Cell type: B-Lymphocyte; Tissue: Blood; ...
HG00684 | 9606 | Human | human | Homo sapiens | Isolation source: peripheral vein; Cell type: B-Lymphocyte; Tissue: Blood; ...
HG00690 | 9606 | Human | human | Homo sapiens | Isolation source: peripheral vein; Cell type: B-Lymphocyte; Tissue: Blood; ...
HG00698 | 9606 | Human | human | Homo sapiens | Isolation source: peripheral vein; Cell type: B-Lymphocyte; Tissue: Blood; ...
HG00699 | 9606 | Human | human | Homo sapiens | Isolation source: peripheral vein; Cell type: B-Lymphocyte; Tissue: Blood; ...
NA12878 | 9606 | Human | human | Homo sapiens | Isolation source: peripheral vein; Cell type: B-Lymphocyte; Tissue: Blood; ...
NA12891 | 9606 | Human | human | Homo sapiens | Isolation source: peripheral vein; Cell type: B-Lymphocyte; Tissue: Blood; ...
NA12892 | 9606 | Human | human | Homo sapiens | Isolation source: peripheral vein; Cell type: B-Lymphocyte; Tissue: Blood; ...
NA18524 | 9606 | Human | human | Homo sapiens | Isolation source: peripheral vein; Cell type: B-Lymphocyte; Tissue: Blood; ...
NA18526 | 9606 | Human | human | Homo sapiens | Isolation source: peripheral vein; Cell type: B-Lymphocyte; Tissue: Blood; ...
Displaying 41-50 of 95 Sample(s).

Files:

File Name | Sample ID | File Type | File Format | Size | Release Date
Simulated_data | | Directory | UNKNOWN | 122 GB | 2017-06-27
 | | Text | TAR | 116.65 MB | 2017-06-27
 | | Mixed archive | zip | 17.73 MB | 2017-07-17
 | | image | TAR | 208.7 KB | 2017-06-27
 | | image | TAR | 1.58 MB | 2017-06-27
 | | Readme | TEXT | 0.5 KB | 2017-06-27
 | | Text | TAR | 15.65 MB | 2017-06-27
 | | GitHub archive | archive | 374.52 KB | 2017-06-27
 | | Phylogenetic tree | UNKNOWN | 40.32 KB | 2017-06-27
 | | Text | UNKNOWN | 3.86 KB | 2017-06-27
Displaying 1-10 of 14 File(s).
