Supporting data for "SL-quant: A fast and flexible pipeline to quantify spliced leader trans-splicing events from RNA-seq data"
Dataset type: Software, Transcriptomic
Data released on July 03, 2018
The spliceosomal transfer of a short spliced leader (SL) RNA to an independent pre-mRNA molecule is called SL trans-splicing and is widespread in the nematode C. elegans. While RNA-seq data contain information on such events, properly documented methods to extract them are lacking. To address this, we developed SL-quant, a fast and flexible pipeline that adapts to paired-end and single-end RNA-seq data and accurately quantifies SL trans-splicing events. It is designed to work downstream of read mapping and uses the reads left unmapped as primary input. Briefly, the SL sequences are identified with high specificity and trimmed from the input reads, which are then re-mapped to the reference genome and quantified at the nucleotide position level (SL trans-splice sites) or at the gene level. SL-quant completes within 10 minutes on a basic desktop computer for typical RNA-seq datasets. Validating the method, the SL trans-splice sites identified display the expected consensus sequence, and the results of the gene-level quantification are predictive of the gene position within operons. We also compared SL-quant to a recently published strategy for identifying SL-containing reads, which proved to be more sensitive but less specific than SL-quant. Both methods are implemented as bash scripts available under the MIT licence at https://github.com/cyaguesa/SL-quant. Full instructions for installation, usage, and adaptation to other organisms are provided.
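The identify-and-trim step described above can be illustrated with a minimal Python sketch. It is not the SL-quant implementation, which is a bash pipeline; it substitutes plain exact matching for whatever alignment the pipeline performs. The function name `trim_sl`, the `min_overlap` threshold, and the example read are hypothetical. The `SL1` constant is the commonly cited 22-nt C. elegans SL1 sequence and should be checked against the repository before use.

```python
# Illustrative only: exact-match detection of a spliced leader (SL) remnant
# at the 5' end of a read that failed to map, followed by trimming.
# Assumption: SL1 below is the commonly cited C. elegans SL1 sequence.
SL1 = "GGTTTAATTACCCAAGTTTGAG"


def trim_sl(read, sl=SL1, min_overlap=8):
    """Trim an SL remnant from the 5' end of a read.

    A read spanning a trans-splice site starts with the 3' end of the SL,
    so we look for the longest SL suffix (at least min_overlap nt, a
    hypothetical threshold) that the read begins with.
    Returns (trimmed_read, overlap_length), or None if no SL is found.
    """
    longest = min(len(sl), len(read))
    for k in range(longest, min_overlap - 1, -1):
        if read.startswith(sl[-k:]):
            return read[k:], k
    return None


# Hypothetical read: last 11 nt of SL1 followed by transcript sequence.
example_read = SL1[-11:] + "ATGGCTAGC"
result = trim_sl(example_read)
```

In this sketch the trimmed read (`result[0]`) is what would be passed on to re-mapping; the position where it aligns then marks a candidate SL trans-splice site. Exact matching is used here only for brevity and does not tolerate sequencing errors.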