
Data released on March 28, 2014

Genomic data of the Chinese alligator (Alligator sinensis).

Cao, C; Chen, H; Chen, Y; Dong, W; Fang, L; Fang, S; Gao, S; Ge, Y; He, G; He, J; Hou, H; Hu, L; Li, P; Liao, L; Liao, S; Ni, X; Pan, S; Wan, Q; Wang, M; Xia, J; Xu, P; Yang, H; Zhang, S; Zhu, Y (2014): Genomic data of the Chinese alligator (Alligator sinensis). GigaScience Database.

The Chinese alligator (Alligator sinensis), a freshwater crocodilian endemic to China, is one of the most endangered crocodilian species. Currently, there are ~100 Chinese alligators in the wild and ~10,000 captive individuals in Zhejiang and Anhui Provinces. We chose the Chinese alligator for genome sequencing in the hope of providing information that could help design scientific captive-breeding programs for the population recovery of this endangered species.
DNA from the Chinese alligator was collected in Zhejiang Province, China. We sequenced the 2.3Gb genome with short reads from a series of libraries with various insert sizes (170bp, 500bp, 800bp, 2kb, 5kb, 10kb and 20kb) on a HiSeq 2000 sequencer.
The high-quality sequence data total 314Gb, and the assembly has contig and scaffold N50 values of 23.4kb and 2.2Mb, respectively. We identified 22,200 protein-coding genes with a mean length of 1403bp.
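
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed from a list of contig or scaffold lengths. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from this dataset or from the pipeline used for the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example with made-up lengths (NOT taken from the alligator assembly).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # prints 70: 80 + 70 = 150, half of the 300 total
```

The same function applies to contigs and to scaffolds; only the input list of lengths differs.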


Related manuscripts:

doi:10.1038/cr.2013.104 (PubMed: 24165891)

Accessions (data included in GigaDB):

BioProject: PRJNA215016



Samples:

Sample ID: SRS470742
Taxonomic ID: 38654
Common Name: Alligator sinensis
Genbank Name: chinese alligator
Scientific Name: Alligator sinensis
Sample Attributes:
  Geographic location (country and/or sea,region): Ch...
  Geographic location (latitude and longitude): not r...
  IUCN Red List: Critically endangered
Displaying 1-1 of 1 Sample(s).

Files: (FTP site)

Sample ID   File Type          File Format  Size      Release Date
SRS470742   Annotation         GFF          18.74 MB  2014-03-28
SRS470742   Coding sequence    FASTA        32.52 MB  2014-03-28
SRS470742   Sequence assembly  FASTA        2.12 GB   2014-03-28
SRS470742   Protein sequence   FASTA        12.05 MB  2014-03-28
-           Readme             UNKNOWN      0.5 KB    2014-03-28
Displaying 1-5 of 5 File(s).
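
As a hedged illustration of how the FASTA files listed above might be summarised, the sketch below reads FASTA-formatted lines and reports the length of each record, from which a mean sequence length can be derived. The function name `fasta_lengths`, the toy records and the file name in the comment are hypothetical; the actual file names are not given here.

```python
def fasta_lengths(lines):
    """Yield (record_id, sequence_length) for each record in FASTA-formatted lines."""
    record_id, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if record_id is not None:
                yield record_id, length
            parts = line[1:].split()
            record_id, length = (parts[0] if parts else ""), 0
        else:
            length += len(line)
    if record_id is not None:
        yield record_id, length


# Toy records (made up for illustration, not from the dataset).
toy = [">gene1 demo", "ATGC", "ATG", ">gene2", "ATGAAA"]
lengths = [n for _, n in fasta_lengths(toy)]
print(sum(lengths) / len(lengths))  # mean record length of the toy input

# With a downloaded file, the same function accepts an open handle, e.g.:
# with open("coding_sequences.fa") as handle:  # hypothetical file name
#     lengths = [n for _, n in fasta_lengths(handle)]
```

Because the function consumes lines lazily, it can be applied to the 2.12 GB assembly file without loading the whole file into memory.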
