
Data released on July 22, 2013

De novo high-coverage sequencing and annotated assemblies of the budgerigar genome

Aboukhalil, R; Ganapathy, G; Howard, JT; Koren, S; Li, J; Phillippy, A; Schatz, M; Schwartz, D; Ward, JM; Zhou, S; Bukovnik, L; Fedrigo, O; Li, B; Wang, T; Wray, G; Knight, JR; Rasolonjatovo, I; Warren, W; Winer, R; Zhang, G; Jarvis, ED (2013): De novo high-coverage sequencing and annotated assemblies of the budgerigar genome. GigaScience Database.

Background: Parrots are considered one of the most behaviorally advanced vertebrate groups and have an advanced capacity for vocal learning. They can imitate human speech, synchronize their body movements to a rhythmic beat, and understand complex concepts of referential meaning in sounds. However, very little is known about the genetics of these traits. Understanding their molecular and genetic basis requires whole-genome sequencing and a robust assembly of the coding and noncoding regions of a parrot genome, including regulatory regions and repetitive elements.
Findings: Here we present a genomic resource for the budgerigar (Melopsittacus undulatus), an Australian parakeet that is the most widely used parrot species for studying vocal learning. Specifically, we present the genomic reads, four high-quality annotated assemblies, and optical maps. These sequence reads were used in part for the Assemblathon 2 competition (see dataset doi:10.5524/100060). The sequence data presented here include over 300X raw read coverage from multiple sequencing technologies (454 Titanium, 454 Flexplus, Illumina and Pacific Biosciences) and chromosome optical maps, all from a single male animal. The reads and optical maps were used to create hybrid assemblies that contain some of the largest scaffolds produced to date for a bird genome using next-generation sequencing technology. Annotation of these assemblies was generated using brain transcriptome sequence assemblies.
Conclusions: Along several quality metric dimensions, these assemblies are comparable to or better than the chicken and zebra finch genome assemblies that were built from traditional Sanger sequencing reads. These assemblies are sufficient to analyze regions that are difficult to sequence and assemble, including those not yet assembled in the finch genome, and promoter regions of genes differentially regulated in vocal learning brain regions.
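
The "over 300X raw read coverage" figure in the Findings is a sequencing depth: total sequenced bases divided by genome size. The sketch below illustrates that arithmetic only. The genome size and total base count are hypothetical round numbers chosen to give 300X; they are not taken from this dataset, and the function name `fold_coverage` is made up for illustration.

```python
def fold_coverage(total_bases: int, genome_size: int) -> float:
    """Average sequencing depth: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Hypothetical values for illustration only; NOT from the dataset record.
ASSUMED_GENOME_SIZE = 1_200_000_000      # assumed ~1.2 Gb genome
ASSUMED_TOTAL_BASES = 360_000_000_000    # assumed 360 Gb of raw reads

depth = fold_coverage(ASSUMED_TOTAL_BASES, ASSUMED_GENOME_SIZE)
print(f"{depth:.0f}X")  # prints "300X"
```

With reads from several technologies, the per-technology base counts would simply be summed before dividing.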


Related manuscripts:

doi:10.1038/nbt.2280 (PubMed: 22750884)


Accessions (data included in GigaDB):

ENA: ERP002324
ENA: ERS222880

Genomic, Transcriptomic, Software, Genome-Mapping


Samples:

Sample ID | Taxonomic ID | Common Name | Genbank Name | Scientific Name | Sample Attributes
0004619 | 13146 | Melopsittacus undulatus | budgerigar | Melopsittacus undulatus | Environment (biome): anthropogenic terrestrial biom...; Alternative accession-BioSample: SAMEA836052; Alternative accession-BioSample: SAMN00004619
1985454 | 13146 | Melopsittacus undulatus | budgerigar | Melopsittacus undulatus | Alternative accession-BioSample: SAMEA1713774; Alternative accession-BioSample: 1985454; Cell type: blood
Displaying 1-2 of 2 Sample(s).

Files: (FTP site)

File Name | Sample ID | File Type | File Format | Size | Release Date
 | SAMPLE:1985454 | Other | PDF | 7.78 KB | 2013-02-17
 | SAMPLE:1985454 | Other | PDF | 5.9 KB | 2013-02-17
 | SAMPLE:1985454 | Other | PDF | 13.98 KB | 2013-02-17
 | SAMPLE:1985454 | Other | PDF | 4.61 KB | 2013-02-17
 | SAMPLE:0004619 | Transcriptome sequence | SFF | 2.04 GB | 2013-06-26
 | SAMPLE:1985454 | Other | PDF | 8.04 KB | 2013-02-17
 | SAMPLE:1985454 | Other | PDF | 5.33 KB | 2013-02-17
 |  | Tabular data | EXCEL | 22.41 KB | 2014-03-26
 | SAMPLE:1985454 | Sequence assembly | FASTA | 1.06 GB | 2013-02-26
 | SAMPLE:1985454 | Annotation | GFF | 10.61 MB | 2013-02-26
Displaying 1-10 of 133 File(s).
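
The file list includes a sequence assembly in FASTA format, and the abstract compares assemblies on quality metrics without naming them. One widely used contiguity metric is scaffold N50, and the sketch below shows how it could be computed from a FASTA file. This is an illustration, not the authors' method; the helper names `fasta_lengths` and `n50` and the tiny in-memory demo sequences are made up for this example.

```python
import io
from typing import Iterable, Iterator, TextIO


def fasta_lengths(handle: TextIO) -> Iterator[int]:
    """Yield the sequence length of each record in a FASTA stream."""
    length = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for n in ordered:
        running += n
        if 2 * running >= total:
            return n
    return 0  # empty input


# Made-up demo records standing in for a real assembly file.
demo = io.StringIO(">s1\nACGTACGTAC\n>s2\nACGTAC\n>s3\nACGT\n")
lengths = list(fasta_lengths(demo))
print(lengths, n50(lengths))  # prints "[10, 6, 4] 10"
```

For a real assembly, the `io.StringIO` demo would be replaced by an opened file handle; streaming line by line keeps memory use low for a file of around 1 GB.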


