Data released on March 09, 2018

Supporting data for "Improving draft genome contiguity with reference-derived in silico mate-pair libraries"

Grau, J, H; Hackl, T; Koepfli, K; Hofreiter, M (2018): Supporting data for "Improving draft genome contiguity with reference-derived in silico mate-pair libraries" GigaScience Database. http://dx.doi.org/10.5524/100394

Contiguous genome assemblies are a highly valued biological resource because of the higher number of completely annotated genes and genomic elements that are usable compared to fragmented draft genomes. Nonetheless, contiguity is difficult to obtain if only low coverage data and/or only distantly related reference genome assemblies are available. In order to improve genome contiguity, we have developed Cross-Species Scaffolding - a new pipeline which imports long-range distance information directly into the de novo assembly process by constructing mate-pair libraries in silico. We show how genome assembly metrics and gene prediction dramatically improve with our pipeline by assembling two primate genomes solely based on ~30x coverage of shotgun sequencing data.
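
The notion of a mate-pair library built in silico can be illustrated with a minimal sketch. It is not the Cross-Species Scaffolding pipeline (the repository listed under Additional information holds the actual code); it only shows, under simplifying assumptions, how pairs of short reads separated by a fixed distance might be cut from an already available sequence. The function names, the parameters `read_len`, `insert_size` and `step`, and the forward/reverse read orientation are all hypothetical choices made for this example, and a real assembler may expect a different orientation or file format.

```python
"""Toy illustration of cutting mate-pairs from a sequence in silico.

All names and parameters are hypothetical and chosen for illustration;
this is not code from the Cross-Species Scaffolding pipeline.
"""

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def in_silico_mate_pairs(seq, read_len, insert_size, step):
    """Return (read1, read2) tuples whose outer ends span insert_size bases.

    read1 is the first read_len bases of each window; read2 is the reverse
    complement of the last read_len bases (an assumed orientation).
    Windows are taken every `step` bases along `seq`.
    """
    if read_len <= 0 or step <= 0 or insert_size < 2 * read_len:
        raise ValueError("need read_len > 0, step > 0, insert_size >= 2 * read_len")
    pairs = []
    for start in range(0, len(seq) - insert_size + 1, step):
        fragment = seq[start:start + insert_size]
        read1 = fragment[:read_len]
        read2 = reverse_complement(fragment[-read_len:])
        # Skip pairs that contain unresolved bases.
        if "N" in (read1 + read2).upper():
            continue
        pairs.append((read1, read2))
    return pairs


if __name__ == "__main__":
    toy_sequence = "ACGTACGTAC"  # stand-in for a reference-derived sequence
    for r1, r2 in in_silico_mate_pairs(toy_sequence, read_len=3, insert_size=8, step=1):
        print(r1, r2)
```

Calling the function several times with different `insert_size` values would yield several libraries with different spans, which is the kind of long-range distance information a scaffolder can use.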

Additional information:

https://github.com/thackl/cross-species-scaffolding

Accessions (data included in GigaDB):

BioProject: PRJNA74997

Accessions (data not in GigaDB):

BioSample: SAMN00857914
BioProject: PRJNA170813
GENBANK: GCF_000001405
GENBANK: GCF_000165445
GENBANK: GCA_000241425
GENBANK: GCA_001693075.2
GENBANK: GCA_001693035.2
GENBANK: GCA_001923025.1
GENBANK: GCA_001870725.1
GENBANK: AEWM00000000.1
GitHub: jstjohn/SeqPrep
GitHub: mahajrod/KrATER

Keywords:

genome assembly, mate-pairs, in silico, scaffolding, shotgun sequencing

Software, Genomic

Funding:

  • Funding body - European Research Council
  • Award ID - 310763
  • Awardee - M Hofreiter

Samples:

Sample ID | Taxonomic ID | Common Name / Genbank Name | Scientific Name | Sample Attributes
SAMN00690380 | 31869 | aye-aye | Daubentonia madagascariensis | Description: Genomic DNA isolated from male Daubent...; Alternative names: Dm6514m, Goblin, aye-aye; Tissue: liver; ...
SAMN00857914 | 9598 | chimpanzee | Pan troglodytes | Description: Genomic DNA isolated from chimpanzee; Alternative names: Clint, chimpanzee; Sex: male; ...
CLIB324 | 929629 | Saccharomyces cerevisiae CLIB324 | Saccharomyces cerevisiae CLIB324 | Description: Genomic DNA isolated CLIB324 is a Viet...; Alternative names: baker's yeast; Sex: not applicable; ...
SAMN01090682 | 6204 | pig tapeworm / pork tapeworm | Taenia solium | Description: Genomic DNA isolated from Taenia soliu...; Alternative names: TsUNAM cysticerci, pork tapeworm; Sex: not applicable; ...
Displaying 1-4 of 4 Sample(s).

Files:

Data Type | File Format | Size | Release Date
Sequence assembly | TAR | 1.67 GB | 2018-02-17
other | TAR | 39.69 MB | 2018-02-17
GitHub archive | archive | 6.42 KB | 2018-02-17
other | TAR | 8.26 GB | 2018-02-17
Readme | TEXT | 1.87 KB | 2018-02-17
Displaying 1-5 of 5 File(s).
