Relative influence of submicroscopic genomic variants in the etiology of diseases and personalized medicine: time has come to move forward!

doi:10.15406/jig.2014.01.00007

Journal of Investigative Genomics

eISSN: 2373-4469

Short Communication Volume 1 Issue 2

Farid Menaa

Department of Oncology, Stem cells and Nanomedicine, Fluorotronics Inc., USA

Correspondence: Farid Menaa, Department of Oncology, Stem cells and Nanomedicine, Fluorotronics Inc., 2453 Cades Way, Vista, CA 92081, USA

Received: June 15, 2014 | Published: July 15, 2014

Citation: Menaa F. Relative influence of submicroscopic genomic variants in the etiology of diseases and personalized medicine: time has come to move forward! J Investig Genomics. 2014;1(2):33-36. DOI: 10.15406/jig.2014.01.00007

Abstract

For five decades, the discovery, mapping and analysis of genomic markers in coding, non-coding and/or intergenic regions have provided valuable information that can be used for translational and personalized medicine. Indeed, submicroscopic genomic variations (e.g. point mutations, microsatellites, single nucleotide polymorphisms (SNPs), copy number variations (CNVs)) have been associated with changes in gene expression and clinical phenotypes (e.g. pathologies, population diversity, genetic adaptation and/or evolution). Emerging findings, including my own work, have highlighted the important role of SNPs and CNVs in sickle cell anemia (SCA) patients. In particular, I reported that many of them occurred in genes involved in inflammation, auto-immunity, lipid metabolism and cell adhesion when adult SCA patients with stroke complications were compared to stratified controls (e.g. groups of SCA patients without stroke). The dynamism of the genome, with a possible combined role of sub-microscopic genomic alterations in complex diseases such as SCA, strongly suggests a need for elaborate multi-disciplinary approaches to treat patients in a personalized fashion. In this manuscript, I provide a brief critical overview for personalized medicine by first describing the major genomic variants before focusing on the role of SNPs and CNVs in human pathology, using SCA, the first reported genetic disease, as a key example.

Keywords: genomics, structural submicroscopic variants, theranostics, personalized medicine, translational medicine, sickle cell anemia

Abbreviations

NHEJ, non-homologous end joining; MAS, marker-assisted selection; SSRs, simple sequence repeats; STRs, short tandem repeats; SNPs, single nucleotide polymorphisms; CNVs, copy number variations; SCA, sickle cell anemia; MAF, minor allele frequency; LCRs, low-copy repeats; NGS, next-generation sequencing; GWAS, genome-wide association studies; gDNA, genomic deoxyribonucleic acid; HbF, fetal hemoglobin; MMBIR, microhomology-mediated break-induced replication

Sub-microscopic genomic variants

Mutations are defined as a change of the nucleotide sequence resulting from:^1‒4 (i) unrepaired damage to DNA (e.g. errors in non-homologous end joining (NHEJ)); (ii) errors in the process of replication (e.g. error-prone translesion synthesis); (iii) a point mutation (e.g. nonsense, missense); (iv) a frameshift mutation such as an insertion (e.g. duplication) or deletion of DNA segments by mobile genetic elements (e.g. transposons/"jumping genes") through genetic recombination. Mutations (e.g. loss-of-function, gain-of-function, dominant negative/antimorphic, lethal, back/reversion, conditional, neutral or silent) in somatic and/or germinal cells can be spontaneous (e.g. tautomerism, depurination, deamination, slipped strand mispairing) and naturally occurring (e.g. during inheritance), or induced (e.g. by mutagens such as radiation or alkylating agents, or by experimental mutagenesis). Mutations are essential for species evolution⁵ and thus for genetic diversity (natural selection). They can be harmful/deleterious, beneficial/advantageous or neutral/nearly neutral.⁶

Microsatellites are simple, short or variable sequence repeats of DNA (e.g. (CA)_n), and so they are often referred to as SSRs (simple sequence repeats), STRs (short tandem repeats) or VNTRs (variable number tandem repeats).⁷ A large proportion of microsatellites are present in transposons.⁸ Interestingly, microsatellites contribute to studies that investigate frameshift mutations (i.e. insertions or deletions), to marker-assisted selection (MAS), to fingerprinting and/or to a better understanding of the regulation of gene expression. Importantly, microsatellites are considered good genomic markers when the number of sequence repetitions is greater than 10, because the level of inter- or intra-specific polymorphism then becomes higher.⁹ Such length changes usually occur when the potential for replication slippage during meiosis is relatively high.^10,11

SNPs are common structural genomic variations (e.g. C>T) within a population or between populations (i.e. about 63 million SNPs in humans, according to NCBI). Interestingly, almost all SNPs are bi-allelic, which facilitates research investigations (e.g. of the minor allele frequency (MAF) at a locus).¹² SNPs, which affect only a single nucleotide base, are more frequent in non-coding regions than in coding regions, where two types of SNPs are found (i.e. synonymous and non-synonymous).¹³ Unlike synonymous SNPs, non-synonymous SNPs (i.e. missense or nonsense) affect the protein sequence, and SNPs located in non-coding genomic regions can alter several cellular processes (e.g. pre-mRNA splicing, mRNA stability, gene expression). Genetic recombination, mutation rate and/or AT microsatellites can determine SNP density (e.g. high (AT)_n is linked to low SNP density).^14,15 SNPs are often associated with susceptibility to certain diseases (e.g. SCA, cancers and neurodegenerative disorders).

CNVs are a group of structural rearrangements of the genome ranging from 1 kb to several megabases (e.g. deletions, duplications, inversions, translocations) that may contribute to phenotypic diversity in humans as well as to the etiology of complex pathologies such as cardio- and neurovascular diseases.^16‒19 CNVs originate from inheritance, de novo mutation, low-copy repeats (LCRs) (http://en.wikipedia.org/wiki/Low_copy_repeats), segmental duplications, or alterations in the replication process (e.g. microhomology-mediated break-induced replication (MMBIR)).^19‒22
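As an illustration of the minor allele frequency (MAF) mentioned above for bi-allelic SNPs, the following minimal Python sketch counts alleles across diploid genotypes and reports the rarer one. The function name `minor_allele_frequency` and the five genotypes are hypothetical and chosen only for illustration; they are not data from this article.

```python
from collections import Counter

def minor_allele_frequency(genotypes):
    """Return (minor_allele, frequency) for a bi-allelic SNP.

    genotypes: iterable of two-character strings such as "CT",
    one per diploid individual (hypothetical input format).
    """
    # Count every allele carried by every individual.
    counts = Counter(allele for g in genotypes for allele in g)
    if not counts:
        raise ValueError("no genotypes supplied")
    if len(counts) > 2:
        raise ValueError("locus is not bi-allelic")
    if len(counts) == 1:
        # Monomorphic locus: there is no minor allele.
        return None, 0.0
    total = sum(counts.values())
    minor, n = min(counts.items(), key=lambda kv: kv[1])
    return minor, n / total

# Illustrative genotypes at a C>T SNP (made-up values).
genotypes = ["CC", "CT", "CC", "TT", "CT"]
minor, maf = minor_allele_frequency(genotypes)
print(minor, maf)
```

Here the T allele appears on 4 of the 10 chromosomes, so it is the minor allele with a frequency of 0.4.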

Role of SNPs and CNVs in etiology of diseases and personalized medicine

The sub-microscopic genomic alterations described above, in particular mutations, SNPs and CNVs, are known to contribute substantially to the etiology of most human pathologies, depending on their respective frequency (i.e. relative rate), function (e.g. gain or loss of function), molecular interaction (possible additive or synergistic effects), genomic location (e.g. exon, intron, promoter, junction) and microenvironment (e.g. epigenetic considerations such as methylation or acetylation) (see the "Bayesian Network" model).^23,24 Therefore, one should keep in mind that CNVs, SNPs and/or mutations (e.g. silent) are not always directly associated with a disease or a clinical phenotype at a given time point or period.

Tremendous advances have been made in the development of state-of-the-art genomic technologies (e.g. microarrays, next-generation sequencing (NGS) platforms)²⁵ and in the constitution of large databases derived from many international research investigations (e.g. the Human Genome Project, the International HapMap Project involving genome-wide association studies (GWAS)) that aimed to genotype and map millions of variants^25,26 for a better understanding of the complexity of particular diseases. Nevertheless, many gaps still need to be filled in order to obtain a reliable big picture of functional gene/genomic dynamics in a spatio-temporal and clinical context (e.g. chemotherapy and/or radiotherapy, which can induce new genomic alterations).^27‒29 Further, for personalized theranostic and prognostic medicine, the genomic deoxyribonucleic acid (gDNA) used for sequencing the human genome, the current meta-analyses at the different biosystem levels (i.e. RNA and proteins) and the use of different technological platforms introduce a number of biases (i.e. intra- and inter-errors).^27,30,31 To minimize such effects, in-depth analyses, interpretations and validations are required. In this context, systems biology matters greatly when OMICS are involved. Interestingly, one study on genetic variation between different species of Drosophila suggests that, if a mutation changes a protein produced by a gene, the result is likely to be harmful, with an estimated 70% of amino acid polymorphisms having damaging effects and the remainder being either neutral or weakly beneficial.⁶ Nowadays, several databases that describe the characteristics of variants in humans (e.g. their frequency, location, and association with one another and with diseases) are available online (e.g. NCBI, OMIM, SNPedia, Human Gene Mutation Database, GWAS Central, GenBank).^25,26,32‒36

Until recently, GWAS have mainly focused on associating SNPs with a particular clinical phenotype, which undoubtedly helps personalized medicine a great deal.³⁷ Indeed, significant genetic variants of major effect, or "modifiers", identified in complex diseases can be used as markers for a specific disease such as age-related macular degeneration, diabetes, obesity, cancers, and cardio- and neurovascular diseases (e.g. stroke).^28,38 For instance, a point mutation in the APOE (apolipoprotein E) gene was associated with a high susceptibility to developing Alzheimer disease.³⁹ In more complex diseases, SNPs rather work in coordination, as seen in the case of osteoporosis (i.e. SNP-SNP interaction within the APOE gene).⁴⁰ Besides, SNPs are relevant pharmacogenomic targets for drug therapy,⁴¹ and represent stable inherited markers that are useful for studies of species evolution or adaptation independently of any observable phenotypic impact.⁴²

In addition to SNPs, rare or common CNVs are distributed throughout the human genome, and each CNV ranges from about 1 kb to several megabases in size.^34‒36,43 Indeed, the HapMap data analysis estimated the SNP frequency at 83.6%, whereas CNVs represented as little as 17.7%, with little overlap (1.3%) between SNP and CNV signals.³⁴ This is roughly in line with a more recent study which reported that CNVs accounted for about 12% of the human genome.⁴³ Interestingly, since about 0.4% of the genome of unrelated people typically differs with respect to CNVs⁴⁴ and de novo CNVs,⁴⁵ CNVs can be used as markers for population studies and for differentiating twins or individuals. Remarkable studies showed that the patterns of both SNPs and CNVs, combined with environmental factors, are required to produce the disease phenotype.^16,36,46 Further, like SNPs and possibly in combination with them, CNVs can affect an individual's drug response and so the subsequent susceptibility to health complications (e.g. disease resistance, adverse effects).¹⁶ CNVs have thereby been associated with several complex health conditions (e.g. cancers, infections, auto-immune diseases, autism, schizophrenia, idiopathic learning disability).^47‒54 Indeed, a higher than normal EGFR copy number has been associated with non-small cell lung cancer.⁴⁷ However, and importantly, a higher copy number of a particular gene (e.g. CCL3L1) is not always associated with a poor prognosis (e.g. in HIV infection),⁴⁸ while a low copy number of certain other genes (e.g. CD16) can increase the risk of developing a complex disease (e.g. systemic lupus erythematosus).⁴⁹ Further, rare CNVs play a crucial role in causing disease, which is rarely observed with most common CNVs.⁵⁵ It is also worth noting that common CNVs of certain genes (e.g. AMY1) can even be beneficial, which could be explained by their frequency having been favored during evolution for adaptation.^56,57 These observations are in line with my ongoing experiments in adult SCA patients, which confirmed that rare CNVs can be reliable and targetable causative markers of diseases or disease complications.^58,59

Etiology of sickle cell anemia: important influence of SNPs and CNVs

The discovery of a pathological hemoglobin S (HbS) by Pauling and colleagues in 1949 was the first demonstration that the production of an abnormal protein could be the cause of a genetic disorder.⁶⁰ SCA is therefore a particularly interesting example because:^28,38,58‒72 (i) it is the first diagnosed molecular disease, which is caused by a unique mutation (i.e. a single nucleotide substitution (β6Glu (GAG)→Val (GTG)) in the normal β-globin gene (HBB)) and inherited following an autosomal recessive Mendelian pattern; (ii) it is the most common hemoglobinopathy, which induces a structural transformation of the normal (i.e. "donut-like shape") red blood cells (RBCs) into intravascular sickle RBCs (i.e. "croissant-like shape"); (iii) the homozygous form of SCA (HbSS), which is the symptomatic form, is associated with numerous complications, including stroke, a major public health concern worldwide, manifested by vaso-occlusive events and episodic hemolysis; (iv) its large panel of subsequent complications was found to be associated with SNPs and/or CNVs (e.g. SNPs in the CYBA, ANG1, TGFBR3, SELP, IL4R, ADRB2, VCAM1, LDL-R, AGT, ANXA2 or TEK genes; CNVs in the UGT2B28 gene). Conversely, it is important to keep in mind that certain genomic variants are not always associated with SCA severity. For instance, SNPs in the ADCY9 or BCL11A genes were associated with decreased stroke risk in SCA patients owing to their participation in up-regulating fetal hemoglobin (HbF) production. Overall, these findings suggest a need for multi-disciplinary approaches to manage SCA complexity with more confidence (e.g. over-expression of HbF, but also a large panel of genomic variants which can be used as reliable biomarkers of SCA disease).²⁸ The great advancement of the OMICS era should soon provide more insights into the etiology of SCA complications and contribute to enriching the HapMap data in order to develop more efficient and safer theranostics.
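The HbS substitution described above (β6 GAG→GTG, Glu→Val) is a textbook case of a non-synonymous (missense) change. The following minimal Python sketch, given only as an illustration, classifies a single-base substitution within a codon. The function name `classify_substitution` is hypothetical, and the codon table is deliberately partial: it contains only the four codons needed here (taken from the standard genetic code), not the full table.

```python
# Partial codon table: only the codons needed for this illustration
# (standard genetic code; a real analysis needs all 64 codons).
CODON_TABLE = {"GAG": "Glu", "GAA": "Glu", "GTG": "Val", "TAG": "Stop"}

def classify_substitution(ref_codon, position, alt_base):
    """Classify a single-base change at a 0-based position in a codon."""
    alt_codon = ref_codon[:position] + alt_base + ref_codon[position + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]  # KeyError if codon is not in the partial table
    if alt_aa == ref_aa:
        kind = "synonymous"
    elif alt_aa == "Stop":
        kind = "nonsense"
    else:
        kind = "missense"
    return alt_codon, ref_aa, alt_aa, kind

# A>T at the second base of codon 6 of HBB: the sickle cell substitution.
print(classify_substitution("GAG", 1, "T"))
# G>A at the third base: the amino acid is unchanged.
print(classify_substitution("GAG", 2, "A"))
# G>T at the first base: creates a stop codon.
print(classify_substitution("GAG", 0, "T"))
```

The first call reproduces the Glu→Val change of HbS, while the other two show, for contrast, a synonymous and a nonsense change of the same reference codon.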

Conclusions and perspectives

Although GWAS are fast emerging, it is clear that the set of markers analyzed to date does not yet cover the variability of the entire genome. Therefore, continued data collection is still needed. Owing to this consideration, CNVs can complement SNPs for the diagnosis and prevention of a given disease (whether monogenic or polygenic). Ultimately, variants can be considered valuable biomarkers of disease, and should permit early diagnosis and more efficient treatment of a given patient.