Supporting data and materials for "De Novo assembly of the chimpanzee transcriptome from NextGen mRNA sequences".

Dataset type: Transcriptomic
Data released on April 21, 2015

Maudhoo MD; Madison JD; Norgren Jr RB (2015): Supporting data and materials for "De Novo assembly of the chimpanzee transcriptome from NextGen mRNA sequences". GigaScience Database. https://doi.org/10.5524/100137

DOI: 10.5524/100137

Common chimpanzees (Pan troglodytes) and bonobos (Pan paniscus) are the species most closely related to humans. For this reason, it is especially important to have complete and accurate chimpanzee nucleotide and protein sequences to understand how humans evolved their unique capabilities. We provide transcriptome data from four untransformed cell types derived from the reference Pan troglodytes “Clint”, to better annotate the chimpanzee genome and provide empirical validation for proposed gene models of this important species. RNA was extracted from primary cells cultured from four tissues: skin, adipose stroma, vascular smooth muscle, and skeletal muscle. These four RNA samples were sequenced on the Illumina HiSeq 2000 platform. Sequences were deposited in the Sequence Read Archive (SRA). Transcripts were assembled, annotated and deposited in the INSDC Transcriptome Shotgun Assembly (TSA) database. We have provided a high-quality annotation of 44,275 transcripts with full-length coding sequence (CDS). This set represented a total of 10,110 unique genes, thus providing empirical support for their existence. This dataset can be used to improve annotation of the Pan troglodytes genome.

Additional details

Read the peer-reviewed publication(s):

  • Maudhoo, M. D., Madison, J. D., & Norgren, R. B. (2015). De novo assembly of the chimpanzee transcriptome from NextGen mRNA sequences. GigaScience, 4(1). https://doi.org/10.1186/s13742-015-0061-x

Accessions (data included in GigaDB):

BioProject: PRJNA173089
ENA: GABD01000000
ENA: GABC01000000
ENA: GABF01000000
ENA: GABE01000000

Sample ID: SRX179267
Common Name: Chimpanzee
Scientific Name: Pan troglodytes
Sample Attributes: Tissue: Skin; Cell type: Fibroblasts; Alternative accession-INSDC: GABD01000001–GA... ...
Taxonomic ID: 9598
Genbank Name: chimpanzee

Sample ID: SRX179264
Common Name: Chimpanzee
Scientific Name: Pan troglodytes
Sample Attributes: Description: untransformed stem cells from adipose ...; Tissue: Adipose stroma; Cell type: Stem cells; ...
Taxonomic ID: 9598
Genbank Name: chimpanzee

Sample ID: SRX179266
Common Name: Chimpanzee
Scientific Name: Pan troglodytes
Sample Attributes: Cell type: Endothelial cells; Description: untransformed endothelial cells from ...; Tissue: Vascular smooth muscle; ...
Taxonomic ID: 9598
Genbank Name: chimpanzee

Sample ID: SRX179271
Common Name: Chimpanzee
Scientific Name: Pan troglodytes
Sample Attributes: Description: untransformed myoblasts from skeletal ...; Cell type: Myoblasts; Tissue: Skeletal muscle; ...
Taxonomic ID: 9598
Genbank Name: chimpanzee

Data Type: Readme
File Format: TEXT
Size: 1.66 kB
Release Date: 2015-04-09
File Attributes: MD5 checksum: 445ad42882cf90b4c4217b631d493ae9

Description: The ISA-Tab archive of metadata associated with the chimpanzee transcriptome sequence
Data Type: ISA-Tab
File Format: zip
Size: 4.83 kB
Release Date: 2015-05-28
File Attributes: MD5 checksum: 346aecc6151a6292bc2be1e67480eaf8
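The MD5 checksums listed for the two files can be used to confirm that a download arrived intact. The following is a minimal Python sketch of such a check, not part of the dataset itself. The dictionary keys and the function names `md5_of_file` and `verify` are hypothetical, chosen for illustration only; the two checksum strings are copied from the file listing.

```python
import hashlib

# MD5 checksums copied from the file listing on this page.
# The keys are illustrative labels, not official file names.
EXPECTED_MD5 = {
    "readme": "445ad42882cf90b4c4217b631d493ae9",
    "isa_tab_zip": "346aecc6151a6292bc2be1e67480eaf8",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so large files do not need to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's MD5 digest matches `expected_md5`."""
    return md5_of_file(path) == expected_md5.lower()
```

Assuming the ISA-Tab archive was saved locally under some path of your choosing, `verify(local_path, EXPECTED_MD5["isa_tab_zip"])` should return `True` for an uncorrupted copy. MD5 is adequate for detecting accidental corruption in transit, but it is not a safeguard against deliberate tampering.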
Date Action
April 27, 2015 Dataset publish
April 27, 2015 Description updated
May 28, 2015 Additional file chimpanzee_transcriptome_ISA.zip added