Data released on October 20, 2015
Next-generation sequencing of cellular RNA (RNA-seq) is rapidly becoming the cornerstone of transcriptomic analysis. However, sequencing errors in the already short RNA-seq reads complicate bioinformatics analyses, in particular alignment and assembly. Error correction methods have been highly effective for whole genome sequencing (WGS) reads, but they are unsuitable for RNA-seq reads because gene expression levels vary widely and alternative splicing produces multiple transcript variants.
We developed a k-mer-based method, Rcorrector, to correct random sequencing errors in Illumina RNA-seq reads. Rcorrector uses a de Bruijn graph to compactly represent all trusted k-mers in the input reads. Unlike WGS read correctors, which employ a single global threshold to determine trusted k-mers, Rcorrector computes a local threshold at every position in a read.
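To illustrate why a local threshold matters for RNA-seq, the following is a minimal Python sketch, not Rcorrector's actual implementation. It assumes a simplified rule in which a k-mer is trusted if its count is at least a fixed fraction of the most frequent k-mer sharing the same (k-1)-base prefix. The function names, the `fraction` parameter, and the toy reads are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in the input reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def trusted_global(counts, kmer, threshold):
    """WGS-style rule: one count cutoff applied to every k-mer."""
    return counts[kmer] >= threshold


def trusted_local(counts, kmer, fraction=0.5):
    """Illustrative local rule (an assumption, not Rcorrector's formula):
    compare the k-mer with the k-mers that share its (k-1)-base prefix
    and differ only in the final base."""
    prefix = kmer[:-1]
    best = max(counts[prefix + base] for base in "ACGT")
    return counts[kmer] > 0 and counts[kmer] >= fraction * best


# Toy data: a highly expressed transcript, a rare error in it,
# and a genuinely low-expressed transcript.
reads = ["AAACGT"] * 50 + ["AAACGA"] * 2 + ["TTTGCA"] * 2
counts = count_kmers(reads, k=4)

# A global cutoff of 3 rejects both the error k-mer and the genuine
# low-expression k-mer, because both occur only twice.
global_error = trusted_global(counts, "ACGA", threshold=3)
global_low_expr = trusted_global(counts, "TGCA", threshold=3)

# The local rule rejects the error (2 versus 50 for the competing "ACGT")
# but keeps the low-expression k-mer, which has no competing alternative.
local_error = trusted_local(counts, "ACGA")
local_low_expr = trusted_local(counts, "TGCA")
```

In this toy case the global cutoff cannot separate the sequencing error from the low-expressed transcript, while the position-specific comparison can. The real tool derives its threshold differently and works on a de Bruijn graph rather than a plain counter.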
The software as published is available directly from here, but for the most up-to-date version please see the project's GitHub repository: https://github.com/mourisl/Rcorrector/