De novo high-coverage sequencing and annotated assemblies of the budgerigar genome

Dataset type: Genomic, Transcriptomic, Software, Genome-Mapping
Data released on July 22, 2013

Aboukhalil R; Bukovnik L; Fedrigo O; Ganapathy G; Howard JT; Jarvis ED; Knight JR; Koren S; Li B; Li J; Phillippy A; Rasolonjatovo I; Schatz M; Schwartz D; Wang T; Ward JM; Warren W; Winer R; Wray G; Zhang G; Zhou S (2013): De novo high-coverage sequencing and annotated assemblies of the budgerigar genome. GigaScience Database. http://dx.doi.org/10.5524/100059

DOI: 10.5524/100059

Background: Parrots are considered one of the most behaviorally advanced vertebrate groups, with an advanced capacity for vocal learning. They can imitate human speech, synchronize their body movements to a rhythmic beat, and understand complex concepts of referential meaning attached to sounds. However, very little is known about the genetics of these traits. Understanding their molecular and genetic basis requires whole-genome sequencing and a robust assembly of the coding and noncoding regions of a parrot genome, including regulatory regions and repetitive elements.
Findings: Here we present a genomic resource for the budgerigar (Melopsittacus undulatus), an Australian parakeet that is the most widely used parrot species for studying vocal learning. Specifically, we present the genomic reads, four high-quality annotated assemblies, and optical maps. These sequence reads were used in part for the Assemblathon 2 competition (see dataset doi:10.5524/100060). The sequence data presented here include over 300X raw read coverage from multiple sequencing technologies (454 Titanium, 454 Flexplus, Illumina and Pacific Biosciences) and chromosome optical maps from a single male animal. The reads and optical maps were used to create hybrid assemblies representing some of the largest genome scaffolds to date for a bird genome using next-generation sequencing technology. Annotation of these assemblies was generated using brain transcriptome sequence assemblies.
Conclusions: Along several quality metrics, these assemblies are comparable to or better than the chicken and zebra finch genome assemblies, which were built from traditional Sanger sequencing reads. They are sufficient for analyzing regions that are difficult to sequence and assemble, including regions not yet assembled in the zebra finch genome and promoter regions of genes differentially regulated in vocal learning brain regions.

Additional details

Read the peer-reviewed publication(s):

Koren, S., Schatz, M. C., Walenz, B. P., Martin, J., Howard, J. T., Ganapathy, G., … Phillippy, A. M. (2012). Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nature Biotechnology, 30(7), 693–700. doi:10.1038/nbt.2280 (PubMed: 22750884)
Zhang, G., Li, B., Li, C., Gilbert, M. T. P., Jarvis, E. D., & Wang, J. (2014). Comparative genomic data of the Avian Phylogenomics Project. GigaScience, 3(1). doi:10.1186/2047-217x-3-26
Zhang, G., Li, C., Li, Q., Li, B., Larkin, D. M., Lee, C., … Meredith, R. W. (2014). Comparative genomics reveals insights into avian genome evolution and adaptation. Science, 346(6215), 1311–1320. doi:10.1126/science.1251385
Jarvis, E. D., Mirarab, S., Aberer, A. J., Li, B., Houde, P., Li, C., … Howard, J. T. (2014). Whole-genome analyses resolve early branches in the tree of life of modern birds. Science, 346(6215), 1320–1331. doi:10.1126/science.1253451
Ganapathy, G., Howard, J. T., Ward, J. M., Li, J., Li, B., Li, Y., … Jarvis, E. D. (2014). High-coverage sequencing and annotated assemblies of the budgerigar genome. GigaScience, 3(1). doi:10.1186/2047-217x-3-11

Genome browser:

http://avian.genomics.cn/en/index.html

Accessions (data included in GigaDB):

ENA: ERP002324
ENA: ERS222880





Samples:

Sample ID | Taxonomic ID | Common Name | Genbank Name | Scientific Name | Sample Attributes
0004619 | 13146 | Melopsittacus undulatus | budgerigar | Melopsittacus undulatus | Environment (biome): anthropogenic terrestrial biom...; Alternative accession-BioSample: SAMEA836052; Alternative accession-BioSample: SAMN00004619; ...
1985454 | 13146 | Melopsittacus undulatus | budgerigar | Melopsittacus undulatus | Alternative accession-BioSample: SAMEA1713774; Alternative accession-BioSample: 1985454; Cell type: blood; ...

Displaying 1-2 of 2 Sample(s).




Files:

File Name | Sample ID | Data Type | File Format | Size | Release Date
 | SAMPLE:1985454 | Other | PDF | 7.78 KB | 2013-02-17
 | SAMPLE:1985454 | Other | PDF | 5.9 KB | 2013-02-17
 | SAMPLE:1985454 | Other | PDF | 13.98 KB | 2013-02-17
 | SAMPLE:1985454 | Other | PDF | 4.61 KB | 2013-02-17
 | SAMPLE:0004619 | Transcriptome sequence | SFF | 2.04 GB | 2013-06-26
 | SAMPLE:1985454 | Other | PDF | 8.04 KB | 2013-02-17
 | SAMPLE:1985454 | Other | PDF | 5.33 KB | 2013-02-17
 | | Tabular data | EXCEL | 22.41 KB | 2014-03-26
 | SAMPLE:1985454 | Sequence assembly | FASTA | 1.06 GB | 2013-02-26
 | SAMPLE:1985454 | Annotation | GFF | 10.61 MB | 2013-02-26

Displaying 1-10 of 133 File(s).

Date | Action
October 15, 2015 | File umd.mega.fa.gff updated
October 15, 2015 | File Melopsittacus_undulatus.gene.cds updated
October 29, 2015 | File Melopsittacus_undulatus.fa.gz updated
October 29, 2015 | File Melopsittacus_undulatus.gene.cds updated
October 29, 2015 | File Melopsittacus_undulatus.gene.gff updated
October 29, 2015 | File Melopsittacus_undulatus.gff.gz updated
November 5, 2015 | File Melopsittacus_undulatus.cds.gz updated
November 5, 2015 | File Melopsittacus_undulatus.fa.gz updated
November 5, 2015 | File Melopsittacus_undulatus.gff.gz updated
November 5, 2015 | File Melopsittacus_undulatus.pep.gz updated
November 13, 2017 | External Link updated: http://avian.genomics.cn/en/index.html