Data released on January 20, 2016

Software and supporting data for Colib'read on Galaxy.

Aabidine, A, Z; Alves-Carvalho, S; Andrieux, A; Bras, Y, L; Cazaux, B; Collin, O; Lacroix, V; Lemaitre, C; Marchet, C; Miele, V; Monjeaud, C; Peterlongo, P; Rivals, E; Sacomoto, G; Salmela, L; Uricaru, R (2016): Software and supporting data for Colib'read on Galaxy. GigaScience Database. http://dx.doi.org/10.5524/100170

With NGS technologies, the life sciences face a deluge of raw data. Classical analysis of such data often begins with an assembly step, which needs large amounts of computing resources and can remove or modify parts of the biological information contained in the data. Our approach instead focuses directly on biological questions by working on raw, unassembled NGS data through a suite of six command-line tools. Dedicated to "whole genome assembly-free" treatments, the Colib'read tools suite uses optimized algorithms for various analyses of NGS datasets, such as variant calling or read set comparisons. Because they are based on de Bruijn graphs and Bloom filters, such analyses can be performed in a few hours using small amounts of memory. Applications on real data demonstrate the good accuracy of these tools compared to classical approaches. To facilitate data analysis and tool dissemination, we developed Galaxy tools and tool shed repositories. With the Colib'read Galaxy tools suite, we enable a broad range of life scientists to analyze raw NGS data. More importantly, our approach retains as much of the biological information in the data as possible while keeping a very low memory footprint.
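The abstract attributes the low memory use to de Bruijn graphs combined with Bloom filters. The following is a minimal sketch of that general idea, not the Colib'read implementation: k-mers from reads are stored in a Bloom filter, and graph edges are recovered implicitly by querying the four possible one-base extensions. The class and function names (`BloomFilter`, `kmers`, `build_filter`, `successors`), the filter size, and the hash count are all assumptions chosen for illustration, and reverse complements are ignored for simplicity.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array; may give false positives, never false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive num_hashes bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(read, k):
    """Yield every length-k substring of a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def build_filter(reads, k, num_bits=1 << 16, num_hashes=3):
    """Insert all k-mers of all reads into a Bloom filter (illustrative sizes)."""
    bf = BloomFilter(num_bits, num_hashes)
    for read in reads:
        for kmer in kmers(read, k):
            bf.add(kmer)
    return bf


def successors(bf, kmer):
    """Implicit de Bruijn edges: extensions overlapping by k-1 that the filter reports."""
    suffix = kmer[1:]
    return [suffix + base for base in "ACGT" if suffix + base in bf]


if __name__ == "__main__":
    bf = build_filter(["ACGTAC"], k=3)
    print(successors(bf, "ACG"))  # includes "CGT"; false positives are possible
```

Because the filter stores only bits rather than the k-mers themselves, memory stays fixed regardless of read length, at the cost of occasional false-positive edges that a real tool would have to handle.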

Related manuscripts:

doi:10.1186/s13742-015-0105-2

Additional information:

https://colibread.inria.fr/

https://github.com/genouest/tools-colibread

http://files.pacb.com/datasets/primary-analysis/e-coli-k12/1.3.0/e-coli-k12-mg1655-raw-reads-1.3.0.tgz

ftp://webdata:webdata@ussd-ftp.illumina.com/Data/SequencingRuns/MG1655/MiSeq_Ecoli_MG1655_110721_PF.bam

https://github.com/PacificBiosciences/DevNet/wiki/Saccharomyces-cerevisiae-W303-Assembly-Contigs

Accessions (data not in GigaDB):

ENA: ERP000546
SRA: SRR567755

Keywords:

NGS, de Bruijn graph, bloom filter, DiscoSNP, LoRDEC, Commet, Mapsembler2, TakeABreak, KisSplice, Galaxy

Software

Funding:

  • Funding body - Agence Nationale de la Recherche
  • Award ID - ANR-12-BS02-0008
  • Funding body - European Research Council
  • Award ID - [247073]10
  • Comment - Gustavo Sacomoto
  • Funding body - Academy of Finland
  • Award ID - 267591
  • Comment - Leena Salmela

Samples:

| Sample ID | Taxonomic ID | Common Name | Genbank Name | Scientific Name | Sample Attributes |
|---|---|---|---|---|---|
| brain | 9606 | Human | human | Homo sapiens | Description: Used as KisSplice example data, downloaded from SRA, part of project ERP000546. Alternative accession-SRA_file: ERR030882 ERR030890 |
| E.coli | 562 | E. coli | | Escherichia coli | Description: Used as LoRDEC example data, downloaded from PacBio and Illumina websites. Relevant electronic resources: http://files.pacb.com/datasets/primary-analysis/e-coli-k12/1.3.0/e-coli-k12-mg1655-raw-reads-1.3.0.tgz ftp://webdata:webdata@ussd-ftp.illumina.com/Data/SequencingRuns/MG1655/MiSeq_Ecoli_MG1655_110721_PF.bam |
| F1 | 410658 | Soil Metagenome | | soil metagenome | Relevant electronic resources: ftp://ftp-adn.ec-lyon.fr/Metasoil-datasets/METASOIL-F1_Rothamsted_2009_February_0-21cmDirect_MPBIO1O1.fna |
| F2a | 410658 | Soil Metagenome | | soil metagenome | Relevant electronic resources: ftp://ftp-adn.ec-lyon.fr/Metasoil-datasets/METASOIL-F2a_Rothamsted_2009_February_0-21cm_Indirect_MPBIO1O1.fna |
| F2b | 410658 | Soil Metagenome | | soil metagenome | Relevant electronic resources: ftp://ftp-adn.ec-lyon.fr/Metasoil-datasets/METASOIL-F2b_Rothamsted_2009_February_0-21cm_Indirect_MPBIO1O1.fna |
| F3 | 410658 | Soil Metagenome | | soil metagenome | Relevant electronic resources: ftp://ftp-adn.ec-lyon.fr/Metasoil-datasets/METASOIL-F3_Rothamsted_2009_February_0-10cm_Indirect_in_plug.fna |
| F4 | 410658 | Soil Metagenome | | soil metagenome | Relevant electronic resources: ftp://ftp-adn.ec-lyon.fr/Metasoil-datasets/METASOIL-F4_Rothamsted_2009_February_0-10cm_Indirect_DNA_Tissue.fna |
| F5 | 410658 | Soil Metagenome | | soil metagenome | Relevant electronic resources: ftp://ftp-adn.ec-lyon.fr/Metasoil-datasets/METASOIL-F5_Rothamsted_2009_February_0-10cm_Indirect_Gram_positive.fna |
| F6 | 410658 | Soil Metagenome | | soil metagenome | Relevant electronic resources: ftp://ftp-adn.ec-lyon.fr/Metasoil-datasets/METASOIL-F6_Rothamsted_2009_February_11-21cm_Indirect_in_plug.fna |
| J1 | 410658 | Soil Metagenome | | soil metagenome | Relevant electronic resources: ftp://ftp-adn.ec-lyon.fr/Metasoil-datasets/METASOIL-J1_Rothamsted_2009_July_0-21cm_Direct_MPBIO1O1.fna |

Displaying 1-10 of 20 Sample(s).

Files:

| File Name | Sample ID | File Type | File Format | Size | Release Date |
|---|---|---|---|---|---|
| | | Other | zip | 1.97 GB | 2016-01-20 |
| | | Other | EXCEL | 9.01 KB | 2016-01-20 |
| | F1 | Genome sequence | FASTA | 418.67 MB | 2016-01-04 |
| | F2a | Genome sequence | FASTA | 536.05 MB | 2016-01-04 |
| | F2b | Genome sequence | FASTA | 376.53 MB | 2016-01-04 |
| | F3 | Genome sequence | FASTA | 356.96 MB | 2016-01-04 |
| | F4 | Genome sequence | FASTA | 448.54 MB | 2016-01-04 |
| | F5 | Genome sequence | FASTA | 303.33 MB | 2016-01-04 |
| | F6 | Genome sequence | FASTA | 353.33 MB | 2016-01-04 |
| | J1a | Genome sequence | FASTA | 502.99 MB | 2016-01-04 |

Displaying 1-10 of 20 File(s).
