Supporting data for "G-Anchor: a novel approach for whole-genome comparative mapping utilising evolutionarily conserved DNA sequences"

Dataset type: Genome-Mapping, Metabolomic, Software, Workflow
Data released on February 15, 2018

Lenis VPE; Swain M; Larkin DM (2018): Supporting data for "G-Anchor: a novel approach for whole-genome comparative mapping utilising evolutionarily conserved DNA sequences" GigaScience Database. http://dx.doi.org/10.5524/100415

DOI: 10.5524/100415

Cross-species whole-genome sequence alignment is a critical first step for comparative genome analyses ranging from the detection of sequence variants to studies of chromosome evolution. Animal genomes are large and complex, and whole-genome alignment is a computationally intensive process that requires expensive high-performance computing systems because extensive local alignments must be explored. With hundreds of sequenced animal genomes now available from multiple projects, there is an increasing demand for comparative genome analyses.
Here we introduce G-Anchor, a new, fast, and efficient pipeline that uses a strictly limited but highly effective set of local sequence alignments to anchor (or map) an animal genome to another species’ reference genome. G-Anchor makes novel use of a databank of highly conserved DNA sequence elements. We demonstrate how these elements may be aligned to a pair of genomes, creating anchors. These anchors enable the rapid mapping of scaffolds from a de novo assembled genome to chromosome assemblies of a reference species. Our results demonstrate that G-Anchor can successfully anchor a vertebrate genome onto the genome of a phylogenetically related reference species using a desktop or laptop computer within a few hours, with accuracy comparable to that of a highly accurate whole-genome alignment tool such as LASTZ. G-Anchor thus makes whole-genome comparisons accessible to researchers with limited computational resources.
G-Anchor is a ready-to-use tool for anchoring a pair of vertebrate genomes. It may be used with large genomes that contain a significant fraction of evolutionarily conserved DNA sequences and that are not highly repetitive, polyploid or excessively fragmented. G-Anchor is not a substitute for whole-genome alignment software but can be used for fast and accurate initial genome comparisons. G-Anchor is freely available from its GitHub repository.
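
The anchoring idea described above can be illustrated with a minimal sketch. This is not G-Anchor's implementation: the function names, the in-memory data layout, and the `min_anchors` threshold are all hypothetical, chosen only to show how conserved elements located in both genomes could act as anchors that assign each scaffold to a reference chromosome by majority vote.

```python
from collections import Counter, defaultdict


def build_anchors(ref_hits, target_hits):
    """Pair up conserved elements that were located in both genomes.

    ref_hits:    element_id -> (reference chromosome, position)
    target_hits: element_id -> (target scaffold, position)
    Both layouts are assumptions made for this illustration.
    """
    anchors = []
    for element_id, (chrom, ref_pos) in ref_hits.items():
        if element_id in target_hits:
            scaffold, tgt_pos = target_hits[element_id]
            anchors.append((scaffold, tgt_pos, chrom, ref_pos))
    return anchors


def assign_scaffolds(anchors, min_anchors=2):
    """Assign each scaffold to the chromosome supported by most anchors.

    min_anchors is a hypothetical threshold: scaffolds whose best
    chromosome has fewer supporting anchors are left unassigned.
    """
    votes = defaultdict(Counter)
    for scaffold, _tgt_pos, chrom, _ref_pos in anchors:
        votes[scaffold][chrom] += 1

    assignment = {}
    for scaffold, counts in votes.items():
        chrom, support = counts.most_common(1)[0]
        if support >= min_anchors:
            assignment[scaffold] = chrom
    return assignment


# Toy data: element e4 is missing from the target, e5 from the reference.
ref_hits = {
    "e1": ("chr1", 1000),
    "e2": ("chr1", 5000),
    "e3": ("chr2", 300),
    "e4": ("chr2", 900),
}
target_hits = {
    "e1": ("scaf_7", 50),
    "e2": ("scaf_7", 4100),
    "e3": ("scaf_9", 20),
    "e5": ("scaf_9", 700),
}

anchors = build_anchors(ref_hits, target_hits)
assignment = assign_scaffolds(anchors)
print(assignment)
```

In this toy example only elements found in both genomes become anchors, and a scaffold supported by a single anchor stays unassigned under the assumed threshold. A real pipeline would also have to obtain the element positions by sequence alignment and handle orientation and conflicting anchors, which this sketch leaves out.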

Additional details

Read the peer-reviewed publication(s):

Lenis, V. P. E., Swain, M., & Larkin, D. M. (2018). G-Anchor: a novel approach for whole-genome comparative mapping utilizing evolutionary conserved DNA sequences. GigaScience, 7(5). doi:10.1093/gigascience/giy017

Additional information:

https://github.com/vasilislenis/G-Anchor

https://scicrunch.org/resolver/RRID:SCR_016046





File Name | Sample ID | Data Type | File Format | Size | Release Date
| | mixed archive | archive | 89.95 MB | 2018-02-14
| | annotation | TAR | 474.85 MB | 2018-02-14
| | mixed archive | TAR | 2.63 GB | 2018-02-14
| | MD5sum | TEXT | 0.15 KB | 2018-02-14
| | Readme | TEXT | 3.25 KB | 2018-02-14
Funding body | Awardee | Award ID | Comments
Biotechnology and Biological Sciences Research Council | Denis M Larkin | BB/J010170/1 |
Date | Action
February 15, 2018 | Dataset publish
March 6, 2018 | Description updated
July 4, 2018 | Manuscript link added: 10.1093/gigascience/giy017