
Data released on May 21, 2018

Supporting data for "MetaMap: An atlas of metatranscriptomic reads in human disease-related RNA-seq data"

Simon, L, M; Karg, S; Westermann, A; Engel, M; Elbehery, A; Hense, B; Heinig, M; Deng, L; Theis, F (2018): Supporting data for "MetaMap: An atlas of metatranscriptomic reads in human disease-related RNA-seq data" GigaScience Database.

With the advent of the age of big data in bioinformatics, large volumes of data and high performance computing power enable researchers to perform reanalyses of publicly available datasets at an unprecedented scale. Ever more studies implicate the microbiome in both normal human physiology and a wide range of diseases. RNA sequencing technology (RNA-seq) is commonly used to infer global eukaryotic gene expression patterns under defined conditions, including human disease-related contexts, but its generic nature also enables the detection of microbial and viral transcripts.
We developed a bioinformatic pipeline to screen existing human RNA-seq datasets for the presence of microbial and viral reads by re-inspecting the non-human-mapping read fraction. We validated this approach by recapitulating outcomes from 6 independent controlled infection experiments of cell line models and by comparison with an alternative metatranscriptomic mapping strategy. We then applied the pipeline to close to 150 terabytes of publicly available raw RNA-seq data from >17,000 samples from >400 studies relevant to human disease using state-of-the-art high performance computing systems. The resulting data of this large-scale re-analysis are made available in the presented MetaMap resource.
Our results demonstrate that common human RNA-seq data, including those archived in public repositories, might contain valuable information to correlate microbial and viral detection patterns with diverse diseases. The presented MetaMap database thus provides a rich resource for hypothesis generation towards the role of the microbiome in human disease.
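The screening idea described above (set aside reads that map to the human reference, then assign the remainder to microbial or viral taxa) can be illustrated with a minimal Python sketch. It is not the MetaMap pipeline itself: the read tuple format, the `screen_non_human_reads` function, the `toy_classifier` and its sequence motifs are all hypothetical stand-ins, and a real analysis would use an aligner and a metagenomic classifier in their place.

```python
from collections import Counter


def screen_non_human_reads(reads, classify):
    """Count taxa among reads that did not map to the human reference.

    reads: iterable of (read_id, sequence, mapped_to_human) tuples
           (a hypothetical format chosen for this illustration).
    classify: callable that takes a sequence and returns a taxon label,
              or None when the read cannot be assigned.
    """
    counts = Counter()
    for read_id, sequence, mapped_to_human in reads:
        if mapped_to_human:
            continue  # only the non-human-mapping fraction is re-inspected
        taxon = classify(sequence)
        if taxon is not None:
            counts[taxon] += 1
    return counts


def toy_classifier(sequence):
    """Hypothetical motif lookup standing in for a real classifier."""
    signatures = {
        "ACGTACGT": "Escherichia coli",      # made-up motif
        "TTGACCAA": "Human herpesvirus 4",   # made-up motif
    }
    for motif, taxon in signatures.items():
        if motif in sequence:
            return taxon
    return None


example_reads = [
    ("r1", "GGGACGTACGTCC", False),  # non-human read, matches first motif
    ("r2", "ACGTACGTAAAA", True),    # human-mapped read, skipped
    ("r3", "CCTTGACCAAGG", False),   # non-human read, matches second motif
    ("r4", "AAAAAAAAAA", False),     # non-human read, left unclassified
]

counts = screen_non_human_reads(example_reads, toy_classifier)
for taxon, n in sorted(counts.items()):
    print(taxon, n)
```

Per-sample counts of this kind are what would then be correlated with disease annotations across studies.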

rna-seq metatranscriptomics microbiome virome human disease 

Metagenomic, Software


  • Funding body - H2020 Marie Skłodowska-Curie Actions
  • Award ID - 753039
  • Awardee - Lukas Simon

Files:

Data Type      File Format  Size       Release Date
Rdata          R            2.18 KB    2018-05-18
Readme         TEXT         2.77 KB    2018-05-18
Tabular data   TSV          325.28 KB  2018-05-18
Tabular Data   CSV          675.29 KB  2018-05-18
mixed archive  archive      11.16 MB   2018-05-18
Rdata          R            11.4 MB    2018-05-18
Tabular data   TSV          45.81 MB   2018-05-18
Text           TEXT         118.65 MB  2018-05-18


