Review Article Volume 5 Issue 6
Department of Bioinformatics, Sathyabama University, India
Correspondence: Harishchander Anandaram, Department of Bioinformatics, Sathyabama University, Chennai, India
Received: April 04, 2017 | Published: July 13, 2017
Citation: Anandaram H. A review on application of biomarkers in the field of bioinformatics & nanotechnology for individualized cancer treatment. MOJ Proteomics Bioinform. 2017;5(6):179-184. DOI: 10.15406/mojpb.2017.05.00179
In the current post-genomic era, biomarker discovery research that applies the principles of nanotechnology and bioinformatics has created opportunities in personalized medicine, in which diseases are detected and diagnosed, and therapeutic regimens are tailored and followed up, on the basis of each individual's molecular profile. In predictive medicine, genetic and molecular information plays a vital role in predicting disease development, progression and clinical outcome. In this review, we discuss recent advances in tools built on the principles of bioinformatics to accelerate cancer biomarker discovery, as well as the use of multiplexed nanoparticle probes for profiling cancer biomarkers. Finally, with respect to future perspectives and remaining challenges in cancer biomarker discovery, we correlate biomolecular signatures with clinical outcome. The term Bio–Nano–Info thus holds promise for individualized therapy based on the molecular diagnosis of cancer, and the same principle can also be applied to other human diseases.
In the current post-genomic era, the exponential growth of genomic and proteomic data has greatly advanced our understanding of the molecular mechanisms of complex human diseases. The pace of developing new technologies to diagnose the molecular signatures of complex diseases and to initiate therapy according to the principles of personalized medicine has also increased; this new era of molecular medicine has laid the path for disease detection, diagnosis of complex disorders and treatment based on the molecular profile of each individual.1–4 The revolution laid the foundation for the availability and application of novel biomarkers that predict disease behavior, together with advanced technologies for rapid detection and diagnosis. Modern cell-targeted therapies have evolved on the principles of biocomputing. A major challenge for molecular profiling and diagnostics lies in the characterization of histologic lesions in complex disorders, because these lesions are heterogeneous at the cellular and molecular levels. In cancerous tumors, malignant cells are intermixed with blood vessels, stroma and inflammatory cells.5–8 Current technologies such as gene microarrays and real-time polymerase chain reaction (RT-PCR) were not designed to handle the heterogeneous nature of cancer lesions; the development of nanotechnology therefore provides a new opportunity to integrate the morphological profile of the diseased condition with molecular signatures, and to correlate the observed cellular and molecular changes with the behavior of the pathological condition in complex disorders.9–11 In particular, bioconjugated quantum dots (QDs)12–15 quantify the presence of multiple biomarkers in intact tissue specimens and cancer cells, allowing a direct comparison between traditional histopathology and the molecular signatures of the same tissue.16–20 Nanotechnology is being used in molecular imaging and therapy, and it can also improve the toxicity and efficacy profiles of chemotherapeutic agents, because these agents can be covalently attached to or encapsulated within nanoparticles.21–23
A major current task in biomedical nanotechnology is to understand how nanoparticles interact with cells, organs and blood under physiological conditions in vivo, and to overcome the limitations in delivering them to organs or diseased target sites.24–26 The next major challenge is to generate a series of critical studies that establish clear links between biomarkers and disease behaviors, such as the rate of tumor progression and differing responses to radiation, drug therapy or surgery.27 In this context, we discuss how, and at what level, biomarkers and biocomputing can be integrated with nanotechnology to perform high-throughput analysis of gene expression data. We also discuss the application of web-based bioinformatics tools for the discovery, optimization and clinical validation of biomarkers.
Biomarkers, or biomolecular markers, include mutant genes, proteins, RNA, carbohydrates, lipids and small metabolites whose altered expression states can be correlated with a clinical outcome or biological behavior.28–31 Most biomarkers have been discovered through molecular profiling studies based on correlation or association between a disease behavior and a molecular signature. The first study on molecular profiling of complex diseases was reported by Golub et al.,32 whose results helped identify gene expression patterns capable of classifying tumors and served as a basis for new insights into tumor pathology, such as grade, stage, response to treatment and subsequent clinical course. Gene expression studies further revealed that the molecular signature of each tumor reflects the combined stromal, tumoral and inflammatory factors of the original heterogeneous lesion.33 The first correlation of gene expression patterns with clinical outcome was reported for diffuse large B-cell lymphoma,34 a clinically heterogeneous disease in which some patients respond well to therapy and have prolonged survival while others do not; this variability in disease progression can be correlated with distinct patterns of gene expression. The concept of identifying a specific molecular portrait for each patient's tumor was later validated by Bittner, Perou and their coworkers using arrays of clinical samples.35,36 Recent work from several groups has identified unique gene expression patterns that correlate with the clinical outcomes of various tumors, including lung, breast, prostate and liver cancers.37–41
Biomarkers are divided into three categories: predictive, prognostic and pharmacodynamic (therapeutic response). Predictive biomarkers estimate the probability that a patient will benefit from a particular treatment. For example, breast cancer patients whose tumors overexpress the HER2 (ERBB2) gene, which encodes a receptor tyrosine kinase, are expected to benefit from trastuzumab (Herceptin) treatment.42 Prognostic biomarkers allow prediction of the natural course of an individual cancer and can distinguish aggressive tumors from indolent ones. If the gene encoding the estrogen receptor is overexpressed by the tumor, the patient may be a better candidate for tamoxifen treatment.43 Pharmacodynamic biomarkers measure the short-term effects of a drug on the tumor and are used to guide dose selection during the early stages of clinical development of new drugs from lead molecules.
In most cases, single biomarkers fail to provide the required sensitivity and specificity, given the substantial heterogeneity of the various types of cancer. It is not realistic to expect a single biomarker to provide information about tissue type and malignant transformation across the various stages of tumor development and progression. Hence, panels of biomarkers are required for diagnosis. Biomarkers must pass through several key steps of discovery and validation before they can be applied in clinical practice. The initial step involves experimental design and acquisition of molecular data, i.e. large amounts of proteomic or genomic expression data together with patient case histories. These data need to be properly annotated and organized using available web-based tools and databases, and they can be strengthened by combining multiple datasets to increase statistical significance. In the second stage, data processing applies feature extraction and classification principles to identify relevant, differentially expressed biomarkers. Prior to clinical application, the functional relevance of these biomarkers is validated by determining their expression levels using RT-PCR (for nucleic acids) or multiplexed nanotechnology (for proteins). Below, we elaborate on web-based bioinformatics tools for analyzing microarray data, initiating biomarker discovery and assessing validation in clinical studies.
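As a rough illustration of the feature-extraction step described above, the sketch below ranks differentially expressed genes with a per-gene t-test followed by false discovery rate control. The expression matrix, group labels and thresholds are simulated assumptions, not data from any study cited in this review.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
expression = rng.normal(size=(200, 40))          # 200 genes x 40 samples (simulated)
labels = np.array([0] * 20 + [1] * 20)           # two clinical groups
expression[:20, labels == 1] += 1.5              # pretend the first 20 genes are truly differential

# Welch's t-test per gene between the two groups
_, p_values = stats.ttest_ind(
    expression[:, labels == 0], expression[:, labels == 1], axis=1, equal_var=False
)

# Benjamini-Hochberg correction before nominating candidate biomarkers
rejected, q_values, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")
print(f"{rejected.sum()} candidate genes at FDR < 0.05")
```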
In the late pre-genomic era, microarray data analysis with bioinformatics tools focused on unsupervised clustering techniques from machine learning (e.g., Kohonen self-organizing maps and hierarchical clustering); the aim was to explore the structure of the expression data and discover new properties within it for potential clinical application. For example, Eisen et al.44 developed a software application that combines several types of unsupervised clustering methods. A more recent development combined clustering algorithms and visualization tools into a web-based application45 focused on unsupervised clustering. Similar methods have been applied to analyze high-throughput gene expression data from different clinical scenarios, and significant findings from these tools have led to the identification of cancer subtypes.46,47 Applications built on unsupervised clustering are still used for the visualization of expression data and for biomarker discovery.
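The following minimal sketch illustrates unsupervised clustering of expression profiles in the spirit of the tools above; it is not the code of Eisen et al. or of any cited application. The data are simulated, and the choices of correlation distance, average linkage and three clusters are assumptions made only for demonstration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)
samples = rng.normal(size=(30, 500))          # 30 tumor samples x 500 genes (simulated)

# Correlation distance is a common choice for expression profiles
distances = pdist(samples, metric="correlation")
tree = linkage(distances, method="average")   # average-linkage hierarchical clustering

# Cut the dendrogram into a fixed number of putative subtypes
subtype_labels = fcluster(tree, t=3, criterion="maxclust")
print(subtype_labels)
```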
Recently, the major focus of microarray data analysis has shifted from unsupervised clustering to supervised analysis (for example, back-propagation neural networks and support vector machines). Consequently, web-based bioinformatics applications have shifted toward new tools for analyzing genes that are differentially expressed under known conditions. Some of these tools are specific to particular microarray platforms; for example, ILOOP (Interwoven Loop) and MAGMA are web-based applications designed for analyzing two-channel microarrays.48,49 ILOOP provides an interface to assist the experimental design of two-channel microarrays, and MAGMA incorporates standard normalization procedures and statistical methods into an application that emphasizes usability and reproducibility. Most of these web-based applications implement several common steps of the data-analysis pipeline. GEPAS (Gene Expression Profile Analysis Suite) includes data normalization, feature selection, class prediction, and unsupervised/supervised clustering.50
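As a hedged illustration of the supervised approach mentioned above, the sketch below trains a linear support vector machine on simulated expression profiles with known class labels and estimates accuracy by cross-validation; it does not reproduce any of the cited tools, and the data and class labels are placeholders.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 1000))                  # 60 samples x 1000 genes (simulated)
y = np.array([0] * 30 + [1] * 30)                # known classes, e.g. responder vs non-responder
X[y == 1, :50] += 1.0                            # give the second class a weak signature

# Standardize each gene, then fit a linear-kernel SVM; report 5-fold cross-validated accuracy
model = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
scores = cross_val_score(model, X, y, cv=5)
print("cross-validated accuracy:", scores.mean().round(2))
```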
CARMAweb (comprehensive R- and Bioconductor-based web service) is another recently developed tool for microarray data analysis;51 it uses several modules from Bioconductor, an open-source bioinformatics software package built on the R programming language (https://carmaweb.genome.tugraz.at). The microarray data-analysis functions available in Bioconductor include background correction, quality control, normalization, differential gene detection, clustering, dimensionality reduction and visualization.53 The main contribution of CARMAweb is to integrate these numerous tools into a user-friendly web interface. GenePattern53 is another tool that brings together different gene expression analysis tools to support reproducible, integrated analysis, and it is part of the cancer Biomedical Informatics Grid (caBIG), an initiative of the National Cancer Institute (NCI) to create standards for bioinformatics software.54
It is well established that the candidate biomarkers obtained from microarray data analysis depend on the available samples and on the selection algorithm.55 In fact, these biomarkers can be highly unstable and often vary from sample to sample. Furthermore, high-throughput assay platforms can handle tens of thousands of genes, most of whose functions are not completely understood, so interpreting the results requires improved statistical methods. By associating each candidate gene with a biological function, one may better understand the underlying disease mechanisms and the biological relevance of the genes chosen by the feature-selection algorithm. Databases such as the Gene Ontology (GO) were designed to facilitate large-scale interpretation of gene function.56 A diverse range of GO tools is available to extract statistically significant conclusions from analysis of the GO database. GO analysis modules are available either as web-based services or as downloadable packages, and include GoMiner,57,58 GOstat,59 AmiGO,60 BiNGO61 and GOEAST.62 Similar applications mine the literature: CoPub links lists of candidate genes to keywords obtained by searching Medline abstracts and visualizes over-represented keywords as a network (http://services.nbic.nl/cgi–bin/copub/CoPub.pl).63
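Many GO enrichment tools rest on an over-representation statistic. The snippet below shows one common choice, a hypergeometric test, with invented counts used purely for illustration; the cited tools may use different or additional statistics.

```python
from scipy.stats import hypergeom

total_genes = 20000        # genes on the array (background)
term_genes = 300           # background genes annotated with the GO term
candidates = 150           # candidate biomarker genes from the analysis
term_in_candidates = 12    # candidates annotated with the GO term

# Probability of observing at least this many annotated genes by chance
p_value = hypergeom.sf(term_in_candidates - 1, total_genes, term_genes, candidates)
print(f"enrichment p-value: {p_value:.3g}")
```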
With the growing accumulation of gene expression data, applications have emerged whose objective is to organize and integrate heterogeneous datasets from various sources in an effective manner. As mentioned previously, increasing the sample size can improve the reproducibility of predictive models, which creates a demand for solutions that allow data sharing. Gene Expression Omnibus (GEO)64 and ArrayExpress65 are examples of large repositories that enforce community data standards such as MIAME (Minimum Information About a Microarray Experiment).66 An alternative solution, ArrayWiki, allows a community of users to annotate gene expression metadata.67 Another initiative, caArray, is part of the caBIG effort to provide an interoperable standard for microarray storage in caBIG applications.54 The analytical methods used for gene expression analysis and gene-interpretation software overlap with those used for other high-throughput data deposited in these repositories. Furthermore, a web-based application called Microarray Retriever68 was developed to retrieve gene expression data from the GEO and ArrayExpress repositories, maximizing the potential of large-sample microarray studies. Similarly, GEOmetadb improves the querying capabilities of the GEO repository.69 Although this application is available only for GEO, it is anticipated that similar approaches can be applied to meta-analysis applications to increase the usefulness of repositories.
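For readers who want programmatic access to GEO of the kind Microarray Retriever provides through the web, a minimal sketch is shown below. It assumes the third-party Python package GEOparse, which is not a tool discussed in this review, and the accession number is only an arbitrary public example.

```python
import GEOparse  # third-party package: pip install GEOparse

# Download a series record and reshape it into a probes x samples expression matrix
gse = GEOparse.get_GEO(geo="GSE2034", destdir="./geo_cache")
expression = gse.pivot_samples("VALUE")
print(expression.shape)
print(gse.metadata.get("title"))
```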
Despite the availability of these software packages, it is still difficult to pass the output of normalization and quality-control steps directly into subsequent clustering or feature-selection applications.70 Furthermore, the gene identifiers produced by feature selection often need to be translated into gene symbols before interpretation with a GO application. Workflow applications such as Taverna and GeneTrailExpress address these issues in various ways. GeneTrailExpress is a web-based portal that implements its own normalization, statistical analysis, interpretation and visualization modules based on common methods.71 Taverna is more general and builds workflows from web services, including those certified by caBIG.72
omniBioMarker, a web-based bioinformatics resource for biomarker identification, was developed by Phan et al.73 In this software, biomarkers are identified through several steps, including normalization, quality control, feature selection, biological interpretation, GO validation and clinical prediction. Since no single analysis pipeline performs well for all possible datasets,74 the analysis parameters must be tuned for each clinical problem. The computational layout of omniBioMarker allows each step of the pipeline to be fine-tuned for a particular dataset or clinical problem on the basis of prior biological knowledge.73 Biological knowledge helps overcome the "curse of dimensionality" and stabilizes the results by increasing the reproducibility of clinical prediction.
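A toy sketch of this knowledge-guided idea is given below: data-driven candidates are intersected with a prior-knowledge gene list before further modeling. The gene names and knowledge list are invented placeholders, and this is not the omniBioMarker algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(5)
genes = np.array([f"GENE{i}" for i in range(1000)])
scores = rng.random(1000)                                   # hypothetical per-gene statistics

top_by_statistic = set(genes[np.argsort(scores)[-100:]])    # top 100 purely data-driven candidates
prior_knowledge = {"GENE10", "GENE42", "GENE77", "GENE480", "GENE901"}  # e.g., a pathway gene list

# Keep only data-driven candidates that are also supported by prior biological knowledge
stable_candidates = sorted(top_by_statistic & prior_knowledge)
print("knowledge-supported candidates:", stable_candidates)
```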
The initial step in the biomarker identification pipeline is quality control. Because of the stochastic nature (randomness) of high-throughput data, it is important to assess data quality prior to further analysis. Moreover, the large quantity of high-throughput data requires specialized software. Several existing applications assess the quality of microarray data within a population of samples. These applications vary in model complexity and usability, ranging from downloadable software packages such as RMAExpress75 and dChip76 to web-based portals such as caCorrect.77 Although gene expression assays are generally reproducible,78 statistical artifacts can arise in smaller datasets, so data need to be checked and corrected before further analysis.
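One simple array-level quality check of the kind such packages perform is to flag samples whose profile correlates poorly with the rest of the cohort. The sketch below illustrates this with simulated data and an arbitrary threshold; it is not the method used by RMAExpress, dChip or caCorrect.

```python
import numpy as np

rng = np.random.default_rng(3)
gene_effect = rng.normal(size=(5000, 1))                      # shared per-probe signal across arrays
expression = gene_effect + rng.normal(scale=0.5, size=(5000, 25))  # 5000 probes x 25 arrays
expression[:, 7] += rng.normal(scale=3.0, size=5000)          # corrupt one array to mimic a QC failure

# Mean correlation of each array with every other array
corr = np.corrcoef(expression.T)
mean_corr = (corr.sum(axis=1) - 1.0) / (corr.shape[0] - 1)

# Flag arrays that fall well below the cohort average (threshold is arbitrary)
threshold = mean_corr.mean() - 2 * mean_corr.std()
print("arrays flagged for inspection:", np.where(mean_corr < threshold)[0])
```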
For large datasets (say >100,000 genes and proteins), computational tools should be used to select and optimize a small panel of biomarkers that predicts patient outcome or therapeutic response. Antibody-conjugated nanoparticles can then be designed for targeted therapy and molecular diagnosis. With multiplexed QD probes, a selected biomarker panel can be measured in clinical specimens such as needle biopsies and tissue microarrays. The use of as few as five to ten protein biomarkers could have a significant impact on disease diagnosis and personalized treatment. Toward these goals, Xing et al.18 obtained promising results for the molecular profiling of formalin-fixed, paraffin-embedded (FFPE) clinical specimens of prostate cancer. In this study, four QD–antibody conjugates were used to recognize and detect four antigens implicated in tumorigenesis (the tumor-suppressor p53, the E3 ubiquitin ligase mdm-2, the zinc-finger transcription factor EGR-1 and the cyclin-dependent kinase inhibitor p21/CDKN1A). These markers are known to be vital in the diagnosis of prostate cancer and have also been correlated with tumor behavior.79,80 The QD molecular-profiling results were consistent with those obtained by fluorescence in situ hybridization (FISH) and traditional immunohistochemistry (IHC) using human breast cancer cells.19 Finally, it is important to note that tumor classification based on antigens expressed at low levels can be subjective, requires an experienced observer and can contribute considerable variation in clinical studies. In contrast, quantitative QD measurements allow accurate, user-independent determination of tumor antigens for genes/proteins expressed at low levels. Thus, quantitative QD molecular profiling can standardize the categorization of antigens in specimens. This is fundamental to breast cancer management because the benefit of hormonal therapies and of the drug trastuzumab depends not only on the presence but also on the quantity of hormone or HER2 receptors.81–90
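Returning to the panel-selection point made at the start of this section, the sketch below reduces a few thousand simulated measurements to a ten-marker panel using recursive feature elimination around a linear support vector machine. The data, outcome labels and panel size are assumptions for illustration only, not a recommended protocol.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import RFE

rng = np.random.default_rng(4)
X = rng.normal(size=(80, 2000))                  # 80 patients x 2000 candidate markers (simulated)
y = rng.integers(0, 2, size=80)                  # hypothetical outcome labels

# Recursively drop the 20% least informative markers until ten remain
selector = RFE(SVC(kernel="linear"), n_features_to_select=10, step=0.2)
selector.fit(X, y)
print("selected marker indices:", np.where(selector.support_)[0])
```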
In the near future, several research directions hold particular promise for biomedical applications, although additional concentrated effort will be required to achieve success. The first direction involves the design and development of nanoparticles with single or multiple functionalities.
For applications in cancer and other medical conditions, nanoparticle functions include imaging (in single or dual modality) and therapy, via drug delivery or the combination of several drugs with targeting ligands.
By adding certain functions, nanoparticles can be designed with novel properties for novel applications. For example, a binary nanoparticle with dual functions can be used for targeted therapy and molecular imaging.
Bioconjugated QDs with drug-targeting and imaging functions can be used in applications involving molecular profiling.
In contrast, ternary nanoparticles that combine three functions can be designed for simultaneous imaging and targeted therapy.
The next stage of research must address the optimization of biomarker panels on the basis of quantitative molecular profiling, combining bioinformatics with nanotechnology. For example, bioconjugated nanoparticle probes could predict treatment response and the clinical outcome of tumor behavior in personalized therapy.
The most important direction for future research in personalized medicine is to further investigate the distribution, excretion, metabolism and pharmacodynamics of nanoparticles in in-vivo animal studies. These studies will play a vital role in developing nanoparticles for clinical applications in cancer treatment.
None.
The author declares no conflict of interest.
©2017 Anandaram. This is an open access article distributed under the terms of the Creative Commons license, which permits unrestricted use, distribution, and building upon the work non-commercially.