Supporting data and materials for "De Novo assembly of the chimpanzee transcriptome from NextGen mRNA sequences".

Dataset type: Transcriptomic
Data released on April 21, 2015

Maudhoo MD; Madison JD; Norgren Jr RB (2015): Supporting data and materials for "De Novo assembly of the chimpanzee transcriptome from NextGen mRNA sequences". GigaScience Database. https://doi.org/10.5524/100137

DOI: 10.5524/100137

Common chimpanzees (Pan troglodytes) and bonobos (Pan paniscus) are the species most closely related to humans. For this reason, it is especially important to have complete and accurate chimpanzee nucleotide and protein sequences to understand how humans evolved their unique capabilities. We provide transcriptome data from four untransformed cell types derived from the reference Pan troglodytes “Clint”, to better annotate the chimpanzee genome and provide empirical validation for proposed gene models of this important species. RNA was extracted from primary cells cultured from four tissues: skin, adipose stroma, vascular smooth muscle, and skeletal muscle. These four RNA samples were sequenced on the Illumina HiSeq 2000 platform. Sequences were deposited in the Sequence Read Archive (SRA). Transcripts were assembled, annotated and deposited in the INSDC Transcriptome Shotgun Assembly (TSA) database. We have provided a high-quality annotation of 44,275 transcripts with full-length coding sequence (CDS). This set represented a total of 10,110 unique genes, thus providing empirical support for their existence. This dataset can be used to improve annotation of the Pan troglodytes genome.

Additional details

Read the peer-reviewed publication(s):

Maudhoo, M. D., Madison, J. D., & Norgren, R. B. (2015). De novo assembly of the chimpanzee transcriptome from NextGen mRNA sequences. GigaScience, 4(1). https://doi.org/10.1186/s13742-015-0061-x

Accessions (data included in GigaDB):

BioProject: PRJNA173089
ENA: GABD01000000
ENA: GABC01000000
ENA: GABF01000000
ENA: GABE01000000

Sample ID	Common Name	Scientific Name	Sample Attributes	Taxonomic ID	Genbank Name
SRX179267	Chimpanzee	Pan troglodytes	Tissue:Skin Cell type:Fibroblasts Alternative accession-INSDC:GABD01000001–GA... ...	9598	chimpanzee
SRX179264	Chimpanzee	Pan troglodytes	Description:untransformed stem cells from adipose ... Tissue:Adipose stroma Cell type:Stem cells ...	9598	chimpanzee
SRX179266	Chimpanzee	Pan troglodytes	Cell type:Endothelial cells Description:untransformed endothelial cells from ... Tissue:Vascular smooth muscle ...	9598	chimpanzee
SRX179271	Chimpanzee	Pan troglodytes	Description:untransformed myoblasts from skeletal ... Cell type:Myoblasts Tissue:Skeletal muscle ...	9598	chimpanzee

File Name	Description	Sample ID	Data Type	File Format	Size	Release Date	File Attributes
readme.txt			Readme	TEXT	1.66 kB	2015-04-09	MD5 checksum: 445ad42882cf90b4c4217b631d493ae9
chimpanzee_transcriptome_ISA.zip	the ISA-tab archive of metadata associated with the Chimpanzee transcriptome sequence		ISA-Tab	zip	4.83 kB	2015-05-28	MD5 checksum: 346aecc6151a6292bc2be1e67480eaf8

Date	Action
April 27, 2015	Dataset published
April 27, 2015	Description updated
May 28, 2015	Additional file chimpanzee_transcriptome_ISA.zip added