Mini Review Volume 6 Issue 3
1Department of Biotechnology and Genetic Engineering, Philadelphia University, Jordan
2Center for Research on Health and Aging, University of Illinois, USA
Correspondence: Tawfiq Froukh, Department of Biotechnology and Genetic Engineering Philadelphia University - Jerash Road, Amman (11118) Jordan, Tel 962 6 4799000, Fax 962 6 4799040
Received: September 18, 2017 | Published: October 23, 2017
Citation: Froukh T, Froukh SQ. The importance of population specific sequence variants as control to investigate the causality of rare sequence variants in human diseases in Jordans. MOJ Proteomics Bioinform. 2017;6(3):283-285. DOI: 10.15406/mojpb.2017.06.00194
The majority of Mendelian diseases are caused by mutations affecting the coding segments of a gene. Therefore, whole exome sequencing (WES) based on next generation sequencing (NGS) technology is performed for patients affected by rare Mendelian diseases. The sequence file generated by NGS harbors a huge number of variants relative to the reference genome. To identify the best candidate causative variant (mutation), these variants must be filtered. Because Mendelian diseases are rare, their causative variants are also rare, so variants are filtered primarily by their frequency in controls. The current publicly available controls are stored in gnomAD, ExAC, EVS, and the 1000 Genomes Project. None of these repositories contains genetic data for Middle Eastern populations. This hinders the differentiation between real disease-causing variants and locally polymorphic rare variants, especially in populations with high consanguinity such as Jordan. Population-specific catalogs of genetic variation are mandatory to identify the rare variants causing rare diseases, and for drug pharmacokinetics, treatment efficacy, and adverse drug reactions. This review highlights the importance of a Jordanian-specific control cohort of genetic variants for achieving the goal of variant identification.
Keywords: population genetics, consanguineous, MAF, OR, population stratification
Human diseases are caused by environmental factors (infections, malnutrition, poisons, or injuries) or by genetic factors, including single gene/Mendelian diseases, complex diseases (defects in multiple genes), and genomic diseases (chromosomal abnormalities). The focus here is on Mendelian diseases, which are characterized by: (1) rare alleles (<0.5% MAF, minor allele frequency) or very rare alleles (<0.1% MAF), and (2) high effect size (OR > 3; odds ratio of the genetic variant for disease expression).1 Much of what is known about the relationship between gene function and phenotype is based on the identification of rare variants causing Mendelian diseases. Such identifications have led to new diagnostic, therapeutic, and preventive strategies.2
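The two criteria above can be illustrated with a minimal Python sketch. The thresholds (0.5%, 0.1%, OR > 3) come from the text; the function names and all counts are hypothetical and serve only as an illustration, assuming a biallelic autosomal site in diploid individuals.

```python
def minor_allele_frequency(alt_allele_count: int, n_individuals: int) -> float:
    """MAF at a biallelic autosomal site; each diploid individual carries 2 alleles."""
    total_alleles = 2 * n_individuals
    alt_freq = alt_allele_count / total_alleles
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)


def classify_by_maf(maf: float) -> str:
    """Apply the thresholds quoted in the text: rare < 0.5%, very rare < 0.1%."""
    if maf < 0.001:
        return "very rare"
    if maf < 0.005:
        return "rare"
    return "not rare"


def odds_ratio(case_carriers: int, case_noncarriers: int,
               control_carriers: int, control_noncarriers: int) -> float:
    """Odds of carrying the variant among cases divided by the odds among controls."""
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)


# Hypothetical counts, for illustration only.
maf = minor_allele_frequency(alt_allele_count=3, n_individuals=2000)
label = classify_by_maf(maf)
effect = odds_ratio(case_carriers=30, case_noncarriers=70,
                    control_carriers=10, control_noncarriers=90)
print(f"MAF = {maf:.5f} ({label}); OR = {effect:.2f}; high effect size: {effect > 3}")
```

In practice the odds ratio is reported with a confidence interval, and a zero cell in the 2x2 table needs a continuity correction; both are omitted here for brevity.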
Diagnosing Mendelian diseases by phenotypic features and conventional diagnostic testing is challenging in most cases.3 According to the National Institutes of Health Undiagnosed Diseases Program, the diagnostic rate of a general clinical geneticist is ~34% for adults and ~11% for children.4 Moreover, the time to diagnosis is prolonged. For example, in a survey of the time needed to diagnose 8 rare diseases, including fragile X syndrome and cystic fibrosis, 25% of the families waited between 5 and 30 years for a diagnosis, and 40% of the families initially received a wrong diagnosis.5
NGS is used to obtain a whole genome sequence (WGS) or a whole exome sequence (WES). Human WGS and WES are useful for detecting DNA variants in patients with rare disorders.6 WES has been particularly useful for identifying variants that cause Mendelian diseases, because the majority of these diseases are caused by variants in the coding regions of genes, which make up around 1% of the human genome (~60 megabases).7 The availability of clinical WES testing promises a better diagnostic yield and, importantly, studies of the diagnostic efficacy of clinical WES show that diagnostic success depends on the discovery of disease-causing genes.8 This highlights the value of continued research into the genetic basis of Mendelian diseases. In addition, diagnostic rates will continue to increase as work continues toward a more complete catalog of disease-causing genes and disease-causing variants.
One powerful approach to discovering disease-causing genes and variants is to study diseases in populations with high rates of parental consanguinity, where recessive forms of diseases are enriched. In Jordan, 39.7% of marriages are consanguineous. The lowest percentage is in the capital, Amman, where 25.5% of marriages are consanguineous, and the highest is in Irbid, in the north east, with 52.1% consanguineous marriages (www.consang.net). In Amman before 1980, first-cousin marriages comprised ~30% of all marriages.9 Such high rates of consanguinity in Jordan result in an increased risk of recessive Mendelian diseases. Among children genetically diagnosed with a recessive genetic disease in Amman, 69% were the offspring of first-cousin couples, compared with 14% from non-consanguineous marriages.10
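Why consanguinity enriches recessive disease can be made concrete with the standard population-genetics expression for the probability of being homozygous for a disease allele, P = q^2 + F·q·(1 − q), where q is the allele frequency and F is the inbreeding coefficient (F = 1/16 for the child of first cousins). The sketch below is illustrative only: the allele frequency is a hypothetical value, and the model assumes a single biallelic locus with full penetrance.

```python
FIRST_COUSIN_F = 1 / 16  # inbreeding coefficient of a child of first cousins


def affected_probability(q: float, f: float = 0.0) -> float:
    """Probability of being homozygous for a recessive allele of frequency q.

    f = 0 corresponds to random mating (plain Hardy-Weinberg, q^2).
    """
    return q * q + f * q * (1.0 - q)


q = 0.001  # hypothetical frequency of a rare recessive disease allele
random_mating = affected_probability(q)
first_cousins = affected_probability(q, FIRST_COUSIN_F)
relative_risk = first_cousins / random_mating

print(f"random mating:  {random_mating:.2e}")
print(f"first cousins:  {first_cousins:.2e}")
print(f"relative risk:  {relative_risk:.1f}x")
```

The rarer the allele, the larger the relative increase, which is why very rare recessive variants are found disproportionately in consanguineous families.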
Recently, NGS was used in Jordan to identify the variants causing intellectual disability (ID) and other forms of neurodevelopmental disorders in consanguineous families. These studies were carried out within the framework of joint projects as follows:
Interpretation of variants obtained by NGS requires large-scale reference datasets of human genetic variation. EVS16 and the 1000 Genomes Project17 are publicly available datasets that contain DNA data for 6,503 exomes and 2,504 individuals, respectively. ExAC is a larger dataset that harbors data for 60,706 individuals.18 The populations represented in the ExAC dataset are African/African American (5,203), Latino (5,789), East Asian (4,327), Finnish (3,307), Non-Finnish European (33,370), South Asian (8,256), and other (454).18,19 The largest publicly available dataset of DNA variants is gnomAD, which spans 123,136 exome sequences and 15,496 whole-genome sequences from unrelated individuals sequenced as part of various disease-specific and population genetic studies.18 The populations represented in the gnomAD dataset are African/African American (7,652 exomes and 4,368 genomes), Latino (16,791 exomes and 419 genomes), Ashkenazi Jewish (4,925 exomes and 151 genomes), East Asian (8,624 exomes and 811 genomes), Finnish (11,150 exomes and 1,747 genomes), Non-Finnish European (55,860 exomes and 7,509 genomes), South Asian (15,391 exomes and 0 genomes), and other (2,743 exomes and 491 genomes). The limitation of these large datasets is the lack of Middle Eastern populations.18
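As a quick consistency check, the per-population counts quoted above can be tallied in a few lines of Python. The counts are copied from the text; the dictionary layout, the helper function, and the keyword search for a Middle Eastern label are illustrative assumptions, not part of any dataset's API.

```python
# Per-population sample counts as quoted in the text.
EXAC = {
    "African/African American": 5203, "Latino": 5789, "East Asian": 4327,
    "Finnish": 3307, "Non-Finnish European": 33370, "South Asian": 8256,
    "Other": 454,
}

# gnomAD: population -> (exomes, genomes), as quoted in the text.
GNOMAD = {
    "African/African American": (7652, 4368), "Latino": (16791, 419),
    "Ashkenazi Jewish": (4925, 151), "East Asian": (8624, 811),
    "Finnish": (11150, 1747), "Non-Finnish European": (55860, 7509),
    "South Asian": (15391, 0), "Other": (2743, 491),
}

exac_total = sum(EXAC.values())
gnomad_exomes = sum(exomes for exomes, _ in GNOMAD.values())
gnomad_genomes = sum(genomes for _, genomes in GNOMAD.values())

# No population label in either dataset refers to the Middle East.
has_middle_eastern = any(
    "middle east" in name.lower() for name in list(EXAC) + list(GNOMAD)
)


def smallest_observable_frequency(n_individuals: int) -> float:
    """A single allele copy among n diploid individuals has frequency 1/(2n)."""
    return 1.0 / (2 * n_individuals)


print(exac_total, gnomad_exomes, gnomad_genomes, has_middle_eastern)
print(f"{smallest_observable_frequency(gnomad_exomes):.2e}")
```

The last function shows why cohort size matters: a control set of n individuals cannot resolve allele frequencies below 1/(2n), and it says nothing about frequencies in a population it does not sample.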
The controls used in the projects described above were the public datasets (EVS, the 1000 Genomes Project, ExAC, and gnomAD) and internal control cohorts in Germany (Erlangen and Tuebingen) or in the U.S. (New York). No controls are available from Jordan or elsewhere in the Middle East. Because rare variants are subject to population stratification and show stronger geographic clustering than common variants, control datasets should closely match the patient’s ancestry.19 Therefore, population-specific catalogs of genetic variation are mandatory not only to identify the rare variants causing rare diseases, but also for drug pharmacokinetics, treatment efficacy, and adverse drug reactions. The lack of population-specific data on genetic variation hinders the differentiation between real disease-causing variants and locally polymorphic rare variants.20 Establishing population-specific DNA variant datasets in Jordan is therefore essential for identifying the rare genetic variants that cause rare Mendelian diseases in Jordan.
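The effect of a missing local control cohort on frequency-based filtering can be sketched as follows. This is a minimal illustration under stated assumptions: the `Variant` record, the variant names, and all frequencies are hypothetical, and the 0.5% threshold is the rare-allele cutoff quoted earlier in this review.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    name: str
    public_maf: float           # frequency in public datasets (hypothetical)
    local_maf: Optional[float]  # frequency in a local control cohort, if one exists


def passes_rarity_filter(variant: Variant, threshold: float = 0.005,
                         use_local: bool = True) -> bool:
    """Keep a variant only if it is rare in every control set consulted."""
    frequencies = [variant.public_maf]
    if use_local and variant.local_maf is not None:
        frequencies.append(variant.local_maf)
    return max(frequencies) < threshold


# Hypothetical candidates from a patient exome.
candidates: List[Variant] = [
    Variant("variant_A", public_maf=0.0, local_maf=0.0),   # rare everywhere
    Variant("variant_B", public_maf=0.0, local_maf=0.03),  # local polymorphism
]

public_only = [v.name for v in candidates if passes_rarity_filter(v, use_local=False)]
with_local = [v.name for v in candidates if passes_rarity_filter(v, use_local=True)]

print("public controls only:", public_only)
print("with local controls: ", with_local)
```

With public controls alone, the locally common `variant_B` survives filtering and remains a false candidate; a matched local cohort removes it.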
The authors would like to thank the Jordanian families willing to participate as controls.
The authors declare no conflict of interest.
©2017 Froukh, et al. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work non-commercially.