Gene-Gene and Gene-Environment Interactions Underlying Complex Traits and their Detection


Introduction
No gene or environmental factor acts in isolation from the interactive genomic and epigenomic networks that shape a biological phenotype [1][2][3]. Nonintuitivity and nonlinearity are natural properties of the network's architecture [4] (see also the illustrative example in Box 1). Consequently, interactions among genes, called gene-gene (or epistatic) interactions, and between genes and environmental factors (broadly defined as all non-genetic exposures), called gene-environment (GE) interactions, are the norm rather than the exception [5][6][7][8]. Several converging lines of evidence point to the dominant role of interactions in inherited traits [6][7][8][9]; in particular, epistatic and GE interactions are considered among the primary culprits for missing heritability [10,11], that is, the majority of the genetic variation that has not yet been identified by more than a decade of genome-wide association studies [12][13][14]. Identification of background-specific factors among genes in combination with lifestyles and environmental exposures is an important scientific topic in genetics, breeding, and genetic epidemiology.
A high degree of context dependence in genetic architecture likely results in relatively weak marginal genotype-phenotype correlations for complex traits, making traditional univariate approaches that test for association one factor at a time futile [5,11]. Multifactorial strategies are thus critical in hunting for the highly mutually dependent factors underlying a trait. However, such a search faces a significant obstacle called "the curse of dimensionality", a problem caused by the exponential increase in the number of possible interactions with the number of factors considered [15]. Conventional regression methods, developed as extensions of single-factor-based approaches, are hardly appropriate for tackling ubiquitous yet elusive interactions because of several problems: heavy computational burden (often computationally intractable), increased Type I and Type II errors, and reduced robustness and potential bias as a result of highly sparse data in a multifactorial model [16].
Among the methods that have emerged recently, data reduction approaches (a constructive induction strategy) such as the multifactor dimensionality reduction (MDR) method [36,37], the combinatorial partitioning method [38], and the restricted partition method [39] are promising for addressing the multidimensionality problem. Rather than modeling the interaction term per se as regression methods do, a data reduction strategy seeks a pattern in a combination of factors/attributes of interest that maximizes the phenotypic variation it explains. It treats the joint action as a whole, in line with the original concept of epistasis coined by Bateson [40], and thus avoids the decomposition used in regression methods, where the number of interaction parameters grows exponentially as each new variable is added. It also corresponds directly to the concept of the phenotypic landscape that unifies biological, statistical genetic, and evolutionary theories [41][42][43][44][45]. Notably, the pioneering MDR method has sustained its popularity in the detection of interactions since its launch [46]. Several extensions of MDR have been made for analyzing different traits (e.g., binary, count, continuous, polytomous, ordinal, time-to-onset, multivariate, and others, as well as combinations of these) and for accommodating various study designs, including homogeneous and admixed unrelated-subject and family designs as well as mixtures of them [47]. Such extensions include the inclusion of covariates [48,49] and extensions to continuous traits [49], survival data [50,51], multivariate phenotypes [52,53], multi-categorical or ordinal phenotypes [47,54], case-control studies in structured populations [55,56], family studies [57,58], and unified analysis of both unrelated and related samples [59]. With these extensions, MDR-type methods offer a powerful tool for handling the breadth of data types and addressing statistical issues associated with study design and sampling scheme.
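As a rough illustration of the data reduction idea behind MDR [36,37], the sketch below pools the multi-locus genotype cells of a candidate locus set into a single high-risk/low-risk attribute and scores it by balanced accuracy. It is a minimal sketch under stated assumptions, not the published implementation: the function names (`mdr_score`, `mdr_search`), the binary carrier coding, the tie rule for cells, and the toy data are all hypothetical, and the cross-validation step used in practice for model selection is omitted.

```python
from collections import Counter
from itertools import combinations


def mdr_score(genotypes, status, loci):
    """Balanced accuracy of the pooled high/low-risk rule for one locus set.

    genotypes: list of tuples of genotype codes, one tuple per subject.
    status:    list of 0 (control) / 1 (case), one entry per subject.
    loci:      tuple of column indices forming the candidate combination.
    """
    n_cases = sum(status)
    n_controls = len(status) - n_cases
    threshold = n_cases / n_controls  # overall case:control ratio

    # Count cases and controls in every multi-locus genotype cell.
    cases, controls = Counter(), Counter()
    for row, s in zip(genotypes, status):
        cell = tuple(row[j] for j in loci)
        (cases if s == 1 else controls)[cell] += 1

    # Pool cells: a cell is high-risk when its case:control ratio exceeds the
    # overall ratio (assumption of this sketch: ties count as low-risk).
    tp = tn = 0
    for cell in set(cases) | set(controls):
        if cases[cell] > threshold * controls[cell]:
            tp += cases[cell]      # cases correctly placed in a high-risk cell
        else:
            tn += controls[cell]   # controls correctly placed in a low-risk cell

    sensitivity = tp / n_cases
    specificity = tn / n_controls
    return (sensitivity + specificity) / 2


def mdr_search(genotypes, status, order):
    """Exhaustively score every locus combination of the given order."""
    n_loci = len(genotypes[0])
    return max(
        (mdr_score(genotypes, status, loci), loci)
        for loci in combinations(range(n_loci), order)
    )


# Hypothetical toy data: three loci coded as carrier status (0/1); disease
# status is the XOR of loci 0 and 1, so no single locus is informative.
genotypes = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
status = [a ^ b for a, b, _ in genotypes]

print(mdr_search(genotypes, status, order=1))  # best single locus
print(mdr_search(genotypes, status, order=2))  # best pair of loci
```

In this toy example every single locus scores no better than chance, whereas the pair (0, 1) separates cases from controls, which is the kind of joint pattern a one-factor-at-a-time test would miss.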
Despite the methodological progress in the detection of multifactor interactions, difficult computational challenges and multiple hypothesis testing problems remain in practice, especially for high-order interactions in large-scale data such as whole-genome data. Both the computational time and the number of hypotheses to test increase exponentially with the number of factors considered. The implementation may quickly become prohibitively costly when more than 15 factors are considered simultaneously. Further theoretical and computational work is required for the effective identification of interacting factors underlying complex traits. Specifically, it will be worth exploring the application of sophisticated, efficient algorithms such as the branch-and-bound algorithm [60][61][62] and the depth-first search algorithm [63] to this field. Heuristic searches of the huge combinatorial search space, such as tabu search [64,65], are also encouraged because they greatly reduce the computational burden while yielding an approximate solution that is good enough for practical purposes. On the other hand, effective correction procedures for multiple testing, including those that control the false discovery rate [66][67][68] rather than conservative Bonferroni-type corrections, will play a pivotal role in avoiding both a flood of false-positive claims and missed true hits.
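To make the scale of the testing problem concrete, the following sketch counts the candidate interaction tests for a given number of factors and contrasts a Bonferroni correction with the Benjamini-Hochberg step-up procedure, one common way of controlling the false discovery rate. The function names and the p-values are illustrative assumptions, not taken from the cited works.

```python
from math import comb


def n_tests_of_order(n_factors, order):
    """Number of distinct interaction tests of one order among n_factors."""
    return comb(n_factors, order)


def n_tests_all_orders(n_factors):
    """Number of factor subsets of size two or more: 2^n - n - 1."""
    return 2 ** n_factors - n_factors - 1


def bonferroni_reject(p_values, alpha=0.05):
    """Reject hypothesis i when p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * alpha / m.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject


print(n_tests_of_order(15, 3))     # three-way tests among 15 factors
print(n_tests_of_order(1000, 3))   # three-way tests among 1000 factors
print(n_tests_all_orders(15))      # all interaction orders among 15 factors

# Hypothetical p-values from eight interaction tests.
p_values = [0.001, 0.008, 0.012, 0.02, 0.2, 0.5, 0.7, 0.9]
print(sum(bonferroni_reject(p_values)))          # rejections under Bonferroni
print(sum(benjamini_hochberg_reject(p_values)))  # rejections under BH
```

With these made-up p-values the Bonferroni rule rejects fewer hypotheses than the step-up rule, which is the trade-off between false positives and missed true hits described above.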

Box 1: An illustrative example of nonintuitivity in a network system
The following real-life experiment on a simple series electric circuit, shown in Figure 1, demonstrates this natural property of a network or pathway. The series circuit contains a light bulb, a pencil, and a battery. As shown in Figure 1A, half of the pencil is shaved off lengthwise so that the graphite core is exposed along most of the pencil's length. Two wire ends are connected to the graphite, which functions as a resistor. One of the ends can slide along the graphite, changing the resistance and, correspondingly, the brightness of the light bulb. Assume the battery has a voltage of 120 volts and an internal resistance of 0 ohm. Suppose there are two light bulbs, rated 20 watt/120 volt (i.e., having a resistance of 720 ohm) and 40 watt/120 volt (i.e., having a resistance of 360 ohm), respectively.

Consider two scenarios:
1. The sliding end is moved until the two ends coincide, so that the resistance of the resistor is 0 ohm.
2. The sliding end is moved to a position where the resistance of the resistor is 14,400 ohm.
In the first scenario, the 40-watt bulb will be brighter than the 20-watt bulb, and the output power of the former (40 watt) will be twice that of the latter (20 watt). In the second scenario, however, the 40-watt bulb will be dimmer than the 20-watt bulb, and the output power of the former (40/41² watt, about 0.024 watt) is nearly half that of the latter (20/21² watt, about 0.045 watt). This illustrative experiment shows that context-specific effects arise even in a simple electric pathway.
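The arithmetic behind the two scenarios can be checked with a short calculation. The sketch below assumes, as the box does, an ideal 120-volt source and bulbs that behave as fixed resistors derived from their ratings; the function name `bulb_power` is hypothetical.

```python
def bulb_power(rated_watt, rated_volt, series_ohm, source_volt=120.0):
    """Power dissipated by a bulb in series with a resistor.

    Assumes the bulb is a constant resistor R = V_rated^2 / P_rated.
    """
    bulb_ohm = rated_volt ** 2 / rated_watt           # 720 or 360 ohm here
    current = source_volt / (bulb_ohm + series_ohm)   # I = V / (R_bulb + R)
    return current ** 2 * bulb_ohm                    # P = I^2 * R_bulb


for series_ohm in (0.0, 14400.0):
    p40 = bulb_power(40.0, 120.0, series_ohm)
    p20 = bulb_power(20.0, 120.0, series_ohm)
    print(f"R = {series_ohm:>7.0f} ohm: "
          f"P40 = {p40:.4f} W, P20 = {p20:.4f} W, ratio = {p40 / p20:.3f}")
```

With no series resistance the 40-watt bulb delivers twice the power of the 20-watt bulb; with 14,400 ohm in series the ratio drops to roughly one half, so the ranking of the two bulbs reverses with the context of the circuit.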