Supporting data for "SoupX removes ambient RNA contamination from droplet-based single-cell RNA sequencing data"

Dataset type: Software, Transcriptomic
Data released on November 20, 2020

Young MD; Behjati S (2020): Supporting data for "SoupX removes ambient RNA contamination from droplet-based single-cell RNA sequencing data" GigaScience Database. http://dx.doi.org/10.5524/100836

DOI: 10.5524/100836

Droplet-based single-cell RNA sequence analyses assume that all acquired RNAs are endogenous to cells. However, any cell-free RNAs contained within the input solution are also captured by these assays. This sequencing of cell-free RNA constitutes a background contamination that confounds the biological interpretation of single-cell transcriptomic data. We demonstrate that contamination from this ‘soup’ of cell-free RNAs is ubiquitous, with experiment-specific variations in composition and magnitude. We present a method, SoupX, for quantifying the extent of the contamination and estimating ‘background-corrected’ cell expression profiles that integrate seamlessly with existing downstream analysis tools. Applying this method to several datasets generated with multiple droplet sequencing technologies, we demonstrate that it improves the biological interpretation of otherwise misleading data and also improves quality control metrics. SoupX is a tool for removing ambient RNA contamination from droplet-based single-cell RNA sequencing experiments. It has broad applicability, and its use can improve the biological utility of existing and future datasets.
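
The idea of quantifying the contamination and estimating background-corrected profiles can be illustrated with a minimal sketch. This is not the SoupX implementation or its API (SoupX is distributed as an R package). The function names `soup_profile` and `correct_cell` are hypothetical, the toy counts are made up for illustration, and the contamination fraction `rho` is assumed to be known already rather than estimated from the data.

```python
def soup_profile(empty_droplets):
    """Estimate the ambient profile as each gene's share of all counts
    observed in droplets assumed to contain no cell (illustrative only)."""
    n_genes = len(empty_droplets[0])
    totals = [sum(droplet[g] for droplet in empty_droplets) for g in range(n_genes)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def correct_cell(counts, profile, rho):
    """Subtract the expected ambient counts from one cell.

    rho is the assumed fraction of the cell's counts that come from the
    ambient 'soup'; corrected values are floored at zero.
    """
    n_counts = sum(counts)
    return [max(c - rho * n_counts * p, 0.0) for c, p in zip(counts, profile)]


# Toy data: rows are droplets, columns are three hypothetical genes.
empty = [[8, 1, 1], [6, 2, 2]]
profile = soup_profile(empty)

cell = [20, 70, 10]
corrected = correct_cell(cell, profile, rho=0.1)
print(profile)
print(corrected)
```

In this toy case the first gene dominates the ambient profile, so it loses the most counts after correction, while the gene that is mostly endogenous to the cell is barely changed.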

File Name  Sample ID  Data Type       File Format  Size       Release Date
                      GitHub archive  zip          19.38 KB   2020-11-18
                      mixed archive   GZIP         385.56 MB  2020-11-18
                      mixed archive   GZIP         4.36 GB    2020-11-18
                      mixed archive   GZIP         33.45 MB   2020-11-18
                      mixed archive   GZIP         47.75 MB   2020-11-18
                      readme          TEXT         3.08 KB    2020-11-18
                      GitHub archive  zip          56.78 MB   2020-11-18
                      mixed archive   GZIP         723.08 MB  2020-11-18
Funding body    Awardee    Award ID  Comments
Wellcome Trust  S Behjati
Date               Action
November 20, 2020  Dataset published
November 20, 2020  Description updated
December 14, 2020  Manuscript link added: 10.1093/gigascience/giaa151