Supporting data for "The case for using Mapped Exonic Non-Duplicate (MEND) read counts in RNA-Seq experiments: examples from pediatric cancer datasets"
Dataset type: Software, Transcriptomic
Data released on January 26, 2021
Beale HC; Roger JM; Cattle MA; McKay LT; Thompson DKA; Learned K; Lyle AG; Kephart ET; Currie R; Lam DL; Sanders L; Pfeil J; Vivian J; Bjork I; Salama SR; Haussler D; Vaske OM (2021): Supporting data for "The case for using Mapped Exonic Non-Duplicate (MEND) read counts in RNA-Seq experiments: examples from pediatric cancer datasets" GigaScience Database. http://dx.doi.org/10.5524/100859
The reproducibility of gene expression measured by RNA sequencing (RNA-Seq) is dependent on the sequencing depth. While unmapped or non-exonic reads do not contribute to gene expression quantification, duplicate reads contribute to the quantification but are not informative for reproducibility. We show that Mapped, Exonic, Non-duplicate (MEND) reads are a useful measure of reproducibility of RNA-Seq datasets utilized for gene expression analysis. In bulk RNA-Seq datasets from 2179 tumors in 48 cohorts, the fraction of reads that contribute to the reproducibility of gene expression analysis varies greatly. Unmapped reads constitute 1-77% of all reads (med. 3%; IQR 3%); duplicate reads constitute 3-100% of mapped reads (med. 27%; IQR 30%); and non-exonic reads constitute 4-97% of mapped, non-duplicate reads (med. 25%; IQR 21%). MEND reads constitute 0-79% of total reads (med. 50%; IQR 31%). Since not all reads in an RNA-Seq dataset are informative for reproducibility of gene expression measurements, and the fraction of reads that are informative varies, we propose reporting a dataset's sequencing depth in MEND reads, which definitively inform the reproducibility of gene expression, rather than total, mapped or exonic reads. We provide a Docker image containing 1) the existing required tools (RSeQC, sambamba and samblaster) and 2) a custom script. We recommend that all RNA-Seq gene expression experiments, sensitivity studies and depth recommendations use MEND units for sequencing depth.
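The four percentages reported above each use a different denominator (all reads, mapped reads, mapped non-duplicate reads, and total reads). The following Python sketch shows that arithmetic for a single sample. It is not the custom script shipped in the Docker image, and it does not call RSeQC, sambamba or samblaster; the class `ReadCounts`, the function `read_fractions` and all field names are hypothetical, and the four input counts are assumed to have been obtained already.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ReadCounts:
    """Hypothetical container for per-sample read counts (names are illustrative)."""
    total: int                 # all sequenced reads
    mapped: int                # reads that aligned to the reference
    mapped_non_duplicate: int  # mapped reads left after duplicate removal
    mend: int                  # Mapped, Exonic, Non-Duplicate reads

    def __post_init__(self) -> None:
        # Each category is a subset of the one before it.
        if not (0 <= self.mend <= self.mapped_non_duplicate
                <= self.mapped <= self.total):
            raise ValueError(
                "expected 0 <= mend <= mapped_non_duplicate <= mapped <= total"
            )


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def read_fractions(c: ReadCounts) -> Dict[str, Optional[float]]:
    """Compute the four fractions, each against the denominator named in its key."""
    return {
        "unmapped_of_total": _ratio(c.total - c.mapped, c.total),
        "duplicate_of_mapped": _ratio(c.mapped - c.mapped_non_duplicate, c.mapped),
        "non_exonic_of_mapped_non_duplicate": _ratio(
            c.mapped_non_duplicate - c.mend, c.mapped_non_duplicate
        ),
        "mend_of_total": _ratio(c.mend, c.total),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; not taken from the dataset.
    sample = ReadCounts(total=100, mapped=97, mapped_non_duplicate=70, mend=50)
    for name, value in read_fractions(sample).items():
        print(name, value)
```

Because the denominators differ, the first three fractions do not sum with the MEND fraction to 100%; only the MEND fraction is expressed relative to total reads.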
Additional details
Read the peer-reviewed publication(s):
(PubMed: 33712853)
Additional information:
https://cavatica.squarespace.com/






