Opinion Volume 1 Issue 5
1Department of Biological Sciences, Florida Atlantic University, USA
2Department of Human Genetics, University of Leuven, Belgium
Correspondence: Ramaswamy Narayanan, Department of Biological Sciences, Charles E. Schmidt College of Science, Florida Atlantic University, Boca Raton, FL 33431 USA, Tel 5612972247, Fax 5612973859
Received: September 07, 2014 | Published: September 29, 2014
Citation: Narayanan R, Van De Ven WJM. Transcriptome and proteome analysis: a perspective on correlation. MOJ Proteomics Bioinform. 2014;1(5):119-120. DOI: 10.15406/mojpb.2014.01.00027
Druggable proteins (such as enzymes, receptors, transporters and channel proteins, signal transduction proteins, oncogenes and transcription factors) are powerful targets for the treatment of diverse diseases.1–4 Among the approximately 22,000 human proteins, novel drug targets are likely to emerge.5 Numerous bioinformatics tools are used to monitor gene expression at the mRNA level, including UniGene and Expressed Sequence Tag (EST),6,7 Serial Analysis of Gene Expression (SAGE),8 Digital Differential Display (DDD),9,10 and the Cancer Genome Anatomy Project X-Profiler and Digital Gene Expression Displayer.11,12 High-throughput transcriptome analysis is often performed using microarrays13–15 and next-generation RNA sequencing (NGS RNA-seq).16,17
In recent years, proteome analysis has been greatly aided by various protein expression resources, including the Allen Brain Atlas,18,19 Human Protein Atlas,20 Multi-Omics Profiling Expression Database (MOPED),21 the Human Protein Reference Database (HPRD),22,23 and the recently described Human Proteome Map24 and ProteomicsDB. The last two studies together account for approximately 84% of the total annotated protein-coding genes in humans. They identified >18,000 proteins encoded by both known genes and uncharacterized open reading frames. The expression of over 20,000 protein isoforms was also characterized.25,26
Numerous normal tissues, body fluids and cell lines were used in these two studies. These recently described protein expression databases greatly expand our ability to establish correlative evidence for mRNA and protein expression for most of the human proteome.
Interpretations about functional relevance, pathways, interacting proteins and the potential for drug therapy are often based on mRNA expression measured by high-throughput transcriptome analysis. Protein expression is often inferred, but a correlation of mRNA with protein is frequently missing from such studies. Protein expression is regulated at many levels: noncoding RNAs, DNA methylation and other epigenetic changes, gene amplifications, copy number variations, mutations, post-translational modifications (such as acetylation, amidation, myristoylation, phosphorylation, sumoylation and ubiquitination), protein stability and degradation, and interacting proteins. All of these complicate transcriptome-based interpretations.27,28
Knowledge of the correlation between mRNA and protein is critical for identifying candidate driver genes for therapeutic target discovery and functional elucidation. The vast amount of data generated by microarray and NGS RNA-seq technology can be reduced to a manageable number of putative targets by requiring agreement between the mRNA and protein levels. In a recent study using The Cancer Genome Atlas (TCGA) tissues (n=87), Zhang et al.29 demonstrated that mRNA transcript abundance and gene amplification often did not correlate with protein abundance. Among 3,764 genes analyzed, only 32% showed a statistically significant correlation. Further, copy number alterations had no significant effects on the protein levels. Our own results as well as several other reports support such a lack of correlation.30–34
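The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by Zhang et al.: it computes a per-gene Spearman rank correlation between mRNA and protein abundance across matched samples and keeps genes above a cutoff. The gene names, the abundance values and the `min_rho=0.5` cutoff are all hypothetical; a real analysis would use a significance test with multiple-testing correction rather than a fixed threshold.

```python
from math import sqrt
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; NaN if either input has zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def correlated_genes(mrna, protein, min_rho=0.5):
    """Keep genes whose mRNA and protein levels track each other.

    mrna and protein map gene name -> abundance per sample, with samples
    in the same order. min_rho is an arbitrary illustrative cutoff.
    """
    kept = {}
    for gene in mrna.keys() & protein.keys():
        rho = spearman(mrna[gene], protein[gene])
        if rho >= min_rho:  # NaN compares False, so flat genes are dropped
            kept[gene] = rho
    return kept


# Hypothetical abundances for three genes across five matched samples.
mrna = {
    "GENE_A": [1, 2, 3, 4, 5],
    "GENE_B": [5, 3, 8, 1, 9],
    "GENE_C": [1, 2, 3, 4, 5],
}
protein = {
    "GENE_A": [10, 20, 25, 40, 80],  # rises with mRNA
    "GENE_B": [2, 2, 2, 2, 2],       # flat, correlation undefined
    "GENE_C": [9, 7, 8, 2, 1],       # falls as mRNA rises
}

candidates = correlated_genes(mrna, protein)
```

Rank correlation is used here because mRNA and protein abundances are measured on different scales and their relationship need not be linear. Only genes that pass the filter would be carried forward as putative targets.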
During the early stages of microarray technology, vast data sets were generated with no standards. The establishment of Minimum Information About a Microarray Experiment (MIAME)35 greatly reduced the noise in subsequent array-based datasets. Similar guidelines for RNA and protein expression are needed to develop meaningful interpretations. Where possible, investigators and authors should be expected to demonstrate such a correlation using the protein expression tools, which are becoming increasingly available. This would greatly help reduce the noise in the published literature, verify functional roles for the proteins and improve the chances of identifying druggable candidate driver genes.
Further, the development of single-cell transcriptome and proteome analysis capability (in contrast to the whole-tissue analysis currently done) would allow mRNA and protein to be correlated at the level of the single cell. This would make it possible to account for cellular heterogeneity when profiling expression in a tissue environment. An offshoot of the protein expression databases, such as a separate human proteome database showing correlation with the transcriptome, would greatly aid therapeutic target discovery.
We thank Jeanine Narayanan for editorial assistance.
The authors declare no conflict of interest.
©2014 Narayanan, et al. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work non-commercially.