Data released on July 22, 2013
Background: Parrots are considered one of the most behaviorally advanced vertebrate groups, with an advanced capacity for vocal learning. They can imitate human speech, synchronize their body movements to a rhythmic beat, and understand complex concepts of referential meaning assigned to sounds. However, very little is known about the genetics of these traits. Understanding their molecular and genetic basis requires whole-genome sequencing and a robust assembly of the coding and noncoding regions of a parrot genome, including regulatory regions and repetitive elements.
Findings: Here we present a genomic resource for the budgerigar (Melopsittacus undulatus), an Australian parakeet that is the most widely used parrot species for studying vocal learning. Specifically, we present the genomic reads, four high-quality annotated assemblies, and optical maps. These sequence reads were used in part for the Assemblathon 2 competition (see dataset doi:10.5524/100060). The sequence data presented here include over 300X raw read coverage from multiple sequencing technologies (454 Titanium, 454 Flexplus, Illumina and Pacific Biosciences) and chromosome optical maps from a single male animal. The reads and optical maps were used to create hybrid assemblies representing some of the largest genome scaffolds to date for a bird genome built with next-generation sequencing technology. The assemblies were annotated using brain transcriptome sequence assemblies.
Conclusions: Along several quality metrics, these assemblies are comparable to or better than the chicken and zebra finch genome assemblies, which were built from traditional Sanger sequencing reads. The assemblies are sufficient for analyzing regions that are difficult to sequence and assemble, including regions not yet assembled in the zebra finch genome and promoter regions of genes differentially regulated in vocal learning brain regions.