Supporting data for "Synonymous variants that disrupt mRNA structure are significantly constrained in the human population"

Dataset type: Genomic, Software, Bioinformatics
Data released on March 09, 2021

Gaither JBS; Lammi GE; Li JL; Gordon DM; Kuck HC; Kelly BJ; Fitch JR; White P (2021): Supporting data for "Synonymous variants that disrupt mRNA structure are significantly constrained in the human population". GigaScience Database. http://dx.doi.org/10.5524/100878

DOI: 10.5524/100878

The role of synonymous single nucleotide variants in human health and disease is poorly understood, yet there is a growing body of evidence to suggest that this class of “silent” genetic variation plays multiple regulatory roles in both transcription and translation. One mechanism by which synonymous codons direct and modulate the translational process is through alteration of the elaborate structure formed by single-stranded mRNA molecules. While tools to computationally predict the impact of non-synonymous variants on protein structure are plentiful, analogous tools to systematically assess how synonymous variants might disrupt mRNA structure are lacking.

To address this need, we developed novel software using a parallel processing framework for large-scale generation of secondary RNA structures and folding statistics for the transcriptome of any species. Focusing our analysis on the human transcriptome, we calculated 5 billion RNA folding statistics for 469 million single nucleotide variants in 45,800 transcripts. By considering the impact of all possible synonymous variants globally, we discover that synonymous variants predicted to disrupt mRNA structure have significantly lower rates of incidence in the human population.

These findings support the hypothesis that synonymous variants may play a role in genetic disorders due to their effects on mRNA structure. Given that the community lacks tools to evaluate the potential pathogenic impact of synonymous variants, we provide RNA stability, edge distance and diversity metrics for every nucleotide in the human transcriptome and introduce a “Structural Predictivity Index” (SPI) to quantify structural constraint operating on any synonymous variant. Because no single RNA-folding metric can capture the diversity of mechanisms by which a variant could alter secondary mRNA structure, we generated a SUmmarized RNA Folding (SURF) metric to provide a single measurement to predict the impact of variants that alter secondary structure in human genetic studies.

To access the unique list of genomic coordinates and their associated scores, download RNAStability_v10.5.1_hg38_distinct_SURF_SPI_Phred_GitHub_Export.tsv.gz
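
Once the file has been downloaded, a quick way to check its contents is to stream the first few rows without decompressing the whole archive. The following is a minimal sketch, not part of the dataset's own tooling. It assumes the file is tab-separated (as the .tsv extension suggests) and that its first line is a header; the column names are not listed here, so the sketch simply returns whatever the first line contains. The function name peek_tsv_gz is hypothetical.

```python
import csv
import gzip
from itertools import islice


def peek_tsv_gz(path, n_rows=5):
    """Return (header, rows) for the first n_rows data rows of a gzipped TSV.

    Assumption: the first line of the file is a header row. If the file has
    no header, the first data row is returned in its place.
    """
    # Text mode ("rt") decompresses as a stream, so a multi-gigabyte file
    # is never loaded into memory in full.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])
        rows = list(islice(reader, n_rows))
    return header, rows
```

Calling peek_tsv_gz("RNAStability_v10.5.1_hg38_distinct_SURF_SPI_Phred_GitHub_Export.tsv.gz") from the download directory would return the header fields and the first five rows as lists of strings, which is usually enough to confirm the column layout before loading the file into a larger analysis.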

Additional details

Read the peer-reviewed publication(s):

(PubMed: 33822938)

Additional information:

https://bio.tools/rna-stability

https://scicrunch.org/resolver/RRID:SCR_019259

Github links:

https://github.com/nch-igm/rna-stability





File Name  Sample ID  Data Type     File Format  Size       Release Date
                      Tabular data  GZIP         1.62 GB    2021-03-07
                      Tabular data  GZIP         1.63 GB    2021-03-07
                      Tabular data  GZIP         1.61 GB    2021-03-07
                      Tabular data  GZIP         1.54 GB    2021-03-07
                      Tabular data  GZIP         1.58 GB    2021-03-07
                      Tabular data  GZIP         242.37 MB  2021-03-07
                      Tabular data  GZIP         247.1 MB   2021-03-07
                      Tabular data  GZIP         1.37 GB    2021-03-07
                      Software      UNKNOWN      1.45 MB    2021-03-07
                      Software      Python       17.04 KB   2021-03-07
Displaying 1-10 of 33 File(s).
Funding body                   Awardee  Award ID     Comments
National Institutes of Health  P White  R01HL109758  National Heart, Lung, and Blood Institute
Date Action
March 9, 2021 Dataset publish
March 10, 2021 Funder added: National Institutes of Health
March 10, 2021 Description updated
March 15, 2021 Manuscript Link added : 10.1093/gigascience/giab023
November 29, 2021 Manuscript Link updated : 10.1093/gigascience/giab023