Genomic data of the Bald Eagle (Haliaeetus leucocephalus).

Dataset type: Genomic
Data released on May 22, 2014

Gilbert MTP; Howard JT; Jarvis ED; The Avian Genome Consortium; Warren W; Wilson RK; Zhang G (2014): Genomic data of the Bald Eagle (Haliaeetus leucocephalus). GigaScience Database.


The Bald Eagle (Haliaeetus leucocephalus (Linnaeus, 1766)) is the official emblem of the United States and belongs to a group of birds known as fish eagles. It is a large bird, with adults weighing up to 4 kg.
These data were produced as part of the Avian Phylogenomics Project. DNA was collected from a male bird bred at the NC Raptor Center, Huntersville, North Carolina, USA. Sequencing was conducted at WUSTL. We sequenced the 1.4 Gb genome to a depth of approximately 88X with short reads from a series of libraries with various insert sizes (170 bp, 500 bp, 800 bp, 2 kb, 5 kb, 10 kb and 20 kb).
The assembled scaffolds of high-quality sequence total 1.26 Gb, with contig and scaffold N50 values of 10 kb and 670 kb, respectively. We identified 16,526 protein-coding genes with a mean length of 19 kb.
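
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the Bald Eagle assembly or from the pipeline used to produce it.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy lengths, made up for illustration only (not real assembly data).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)
```

Applied to the lengths of the records in a genome FASTA file, the same function would give the scaffold N50; applied to scaffolds split at runs of N, it would give the contig N50.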

Zhang, G., Li, B., Li, C., Gilbert, M. T. P., Jarvis, E. D., & Wang, J. (2014). Comparative genomic data of the Avian Phylogenomics Project. GigaScience, 3(1). doi:10.1186/2047-217x-3-26
Zhang, G., Li, C., Li, Q., Li, B., Larkin, D. M., Lee, C., … Meredith, R. W. (2014). Comparative genomics reveals insights into avian genome evolution and adaptation. Science, 346(6215), 1311–1320. doi:10.1126/science.1251385
Jarvis, E. D., Mirarab, S., Aberer, A. J., Li, B., Houde, P., Li, C., … Howard, J. T. (2014). Whole-genome analyses resolve early branches in the tree of life of modern birds. Science, 346(6215), 1320–1331. doi:10.1126/science.1253451

BioProject: PRJNA237821
SRA: SRP038924


Sample ID: Bald Eagle
Taxonomic ID: 52644
Common Name / Genbank Name: bald eagle
Scientific Name: Haliaeetus leucocephalus
Sample Attributes:
Source material identifiers: Erich Jarvis
Estimated genome size: 1.4
Funding source: NIH/Wash U

Data Type          File Format   Size      Release Date
Coding sequence    FASTA         6.81 MB   2014-05-12
Genome sequence    FASTA         362 MB    2014-05-12
Annotation         GFF           1.69 MB   2014-05-12
Protein sequence   FASTA         4.43 MB   2014-05-12
Repeat sequence    UNKNOWN       8.16 MB   2014-05-12
