Modern molecular approaches for walnut: a review

Introduction
Walnut (Juglans regia L.) is an important nut crop belonging to the family Juglandaceae. In India it is commonly known as 'Akhrot', and almost all parts of the tree are used in one way or another. Juglans regia, the Persian or English walnut, is indigenous to Eurasia and is cultivated throughout the temperate regions of the world for its high-quality wood and edible nuts. Persian walnut is monoecious and heterodichogamous, with a chromosome number of 2n = 32. It is an economically important species in Europe, Asia and North America. Worldwide walnut production has been increasing rapidly in recent years, with the largest increase coming from Asia. Persian walnuts are the most commonly grown type, and their nutrient density and profile differ significantly from those of black walnuts. Unlike most nuts, which are high in monounsaturated fatty acids, walnut oil is composed largely of polyunsaturated fatty acids, particularly alpha-linolenic acid and linoleic acid. Walnuts also contain triglycerides reported to be effective in reducing the risk of cardiovascular diseases. Compared with certain other nuts, such as almonds, peanuts and hazelnuts, walnuts contain the broadest spectrum of antioxidants, including free antioxidants and antioxidants bound to fiber. There are no standard walnut cultivars under cultivation in India; most of the walnut produce comes from trees of seedling origin grown in a semi-wild state. As elsewhere in India, walnut plantations in Himachal Pradesh comprise genetically diverse seedling trees of unknown origin, thereby constituting a vast gene pool. Accurate estimation of the genetic distances between genotypes in this germplasm can assist breeders in crop improvement programmes. For the genetic characterization of a plant species, molecular markers are considered superior to morphological and biochemical traits.

Genetic diversity analysis
Various molecular markers are used to study genetic diversity in walnut. They have been applied to Juglans populations [1-4], for example to determine the impact of different timber-harvest scenarios on residual levels of genetic diversity [3]. Pijut et al. [5] reviewed technological applications of molecular markers in several temperate hardwood tree species. Modern genomics-based tools are currently being used in several walnut species. Erturk et al. (2011) used RAPD analysis to characterize and group walnut genotypes; of the 45 primers tested, 37 were polymorphic. Vischi et al. [6] studied the genetic diversity of walnut (Juglans regia L.) in the eastern Italian Alps: 215 wild accessions native to the area were sampled, georeferenced, and genotyped with 20 microsatellite loci selected from the literature. Mahmoodi et al. [7] assayed genetic diversity among 16 accessions and 5 cultivars of Juglans regia using nine SSR markers together with morphological traits, and concluded that the high diversity among the genotypes could be exploited in breeding programmes. Najafi et al. [8] isolated and characterized 13 microsatellite markers of Juglans regia L. All 13 primer pairs amplified in 36 genotypes of Juglans regia. The number of polymorphic alleles ranged from 2 to 4 (with an average of 4.35), and the polymorphic information content (PIC) values ranged from 0.47 to 0.88 (with an average of 0.69). TC/AG and GAA/CTT were the most abundant di-nucleotide and tri-nucleotide repeat classes, respectively, with the TC/AG di-nucleotide the more abundant of the two.
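The polymorphic information content values quoted above are computed per locus from allele frequencies. The sketch below uses the widely cited formula of Botstein et al. (1980), PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 p_i^2 p_j^2. The text does not state which formula or software the cited studies used, so this is an assumption, and the function name `pic` and the example frequencies are purely illustrative.

```python
def pic(allele_freqs):
    """Polymorphic information content of one locus.

    allele_freqs: frequencies of the alleles at the locus (must sum to 1).
    Formula (Botstein et al. 1980, assumed here):
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")

    # Probability that two randomly drawn alleles are identical
    homozygosity = sum(p * p for p in allele_freqs)

    # Correction for matings that are uninformative despite heterozygosity
    cross = 0.0
    n = len(allele_freqs)
    for i in range(n - 1):
        for j in range(i + 1, n):
            cross += 2 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2

    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical locus with four equally frequent alleles
    print(round(pic([0.25, 0.25, 0.25, 0.25]), 4))
```

A locus with a single fixed allele gives a PIC of 0, and the value approaches 1 as the number of equally frequent alleles grows, which is why loci with PIC above about 0.5 are usually regarded as highly informative.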
Salieh et al. [9] assessed the genetic relationships among 12 walnut genotypes using ten RAPD and nine SSR primers. The RAPD primers produced 85 bands, of which 36 (42.35%) were polymorphic, an average of 3.6 polymorphic bands per primer. In the SSR analysis, nine primers produced 26 bands, of which 23 (88.16%) were reported as polymorphic, with an average of 2.3 bands per primer. Genetic similarities ranged from 0.4 to 0.93 for the RAPD data and from 0.27 to 1.00 for the SSR data. Cluster analysis with both marker types revealed clear diversity between genotypes, and the dendrogram constructed from the RAPD and SSR markers grouped the 12 genotypes into four major clusters. The study demonstrated that RAPD and SSR analysis can be used for the characterization and grouping of walnut genotypes. Zhang et al. [10] used the publicly available walnut (Juglans regia) EST database to develop SSR markers and applied them to the genetic analysis of the widespread Juglans nigra and Carya cathayensis and of the endangered species Annamocarya sinensis. A total of 7262 unigenes, comprising 1911 contigs and 5351 singletons, were obtained from 13,559 ESTs retrieved from the NCBI database. Among these unigenes, 706 EST-SSR sequences containing 805 SSR loci were identified, and primers were designed for 309 randomly chosen EST-SSRs. Of these, 77 primers were transferable among the five species J. regia, J. nigra, C. cathayensis, Carya dabieshanensis and A. sinensis, and 13 highly polymorphic EST-SSRs were then used for genetic analysis in these species. Ahmed et al. [11] reported the genetic relatedness of 82 walnut genotypes using 13 SSR and 20 RAPD primers. A high level of genetic diversity was observed within populations, with the number of alleles per locus ranging from one to five for SSR primers and from two to six for RAPD primers. The proportion of polymorphic loci was 100% and similarity ranged from 12% to 79%, with an average of 49%. The dendrogram showed that all the accessions formed four main clusters, with various degrees of sub-clustering within the clusters.
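The similarity coefficients and dendrograms reported in these studies are typically derived from a binary band matrix (1 = band present, 0 = absent). The following is a minimal sketch assuming the Jaccard coefficient and UPGMA (average-linkage) clustering; the cited papers may have used other coefficients (for example Dice or simple matching) or dedicated software, and the genotype names and band profiles below are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two 0/1 band profiles of equal length."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 1.0


def upgma(profiles):
    """Cluster genotypes by UPGMA; returns the tree as nested tuples.

    profiles: dict mapping genotype name (str) -> list of 0/1 band scores.
    """
    # Pairwise distances between the original genotypes (1 - similarity)
    dist = {
        frozenset((g1, g2)): 1.0 - jaccard(profiles[g1], profiles[g2])
        for g1, g2 in combinations(profiles, 2)
    }
    # Each cluster stores (subtree, list of member genotypes)
    clusters = {name: (name, [name]) for name in profiles}

    def avg_dist(m1, m2):
        # UPGMA: unweighted mean of all between-cluster pairwise distances
        total = sum(dist[frozenset((x, y))] for x in m1 for y in m2)
        return total / (len(m1) * len(m2))

    while len(clusters) > 1:
        k1, k2 = min(
            combinations(clusters, 2),
            key=lambda p: avg_dist(clusters[p[0]][1], clusters[p[1]][1]),
        )
        t1, m1 = clusters.pop(k1)
        t2, m2 = clusters.pop(k2)
        clusters[k1 + "+" + k2] = ((t1, t2), m1 + m2)

    return next(iter(clusters.values()))[0]


if __name__ == "__main__":
    # Hypothetical band scores for three genotypes at four marker bands
    bands = {"G1": [1, 1, 0, 1], "G2": [1, 1, 0, 0], "G3": [0, 0, 1, 1]}
    print(upgma(bands))
```

The nested tuples give the merge order only; branch heights are omitted to keep the sketch short.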
Cosmulescu & Botu [12] evaluated Juglans regia L. germplasm from the Oltenia region, in the south-western part of Romania, and determined its variability. Nut weight varied from 6.8 to 18.4 g, kernel weight from 1.7 to 8.79 g, and the kernel-to-nut weight ratio from 23.6 to 71.7%. Nut size, bud-breaking time, nut maturity time and phenological characteristics were also evaluated. The data indicated that walnut trees in this region vary widely in fruit characteristics, which suggests good potential for selecting new genotypes from the material studied. In another study, a total of 5,025 walnut ESTs (covering 16.41 Mb) were retrieved from the National Center for Biotechnology Information database [10] and the SSR motifs were analyzed with SSR Hunter software. Of the 398 SSRs obtained, dinucleotide repeat motifs accounted for 69.85% and trinucleotide motifs for 27.64%, while tetranucleotide to hexanucleotide motifs occurred at low frequency (2.51%). Subsequently, 123 primer pairs were designed from the non-redundant SSR-containing unigenes, with the selection threshold for SSR length set to 10 bp or more. Marker efficiency was examined on seven DNA pools of walnut collected from geographically different accessions, and 41 SSR primer sets produced highly polymorphic amplification products. Pollegioni et al. [13] characterized Juglans species using SSR markers. Natural hybrids between Juglans regia and Juglans nigra, known as Juglans × intermedia (Carr.), are valued for timber production. The authors tested ten nuclear microsatellite markers to identify new J. × intermedia hybrids, characterized the parental species, and sought Juglans nigra genotypes with a spontaneous crossing ability with Juglans regia in a mixed Italian population. They also studied the transferability of ten black walnut SSR loci to Persian walnut. All ten microsatellites amplified in both species, producing fragments of variable sizes, and the indices of genetic diversity revealed a high level of variability. A total of 112 alleles were detected, and the samples divided into three main groups: Juglans nigra, Juglans regia and J. × intermedia hybrids. Microsatellite fingerprinting identified a triploid hybrid plant carrying two genome parts from Juglans nigra and one from Juglans regia. Sequencing of the amplified fragments confirmed the cross-species amplification of the SSRs. The similarity coefficient based on the 112 alleles divided the studied genotypes into four distinct groups.
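The EST-SSR mining step described above (finding di- to hexanucleotide repeats with a total length of at least 10 bp) can be illustrated with a short regular-expression scan. This is only a sketch of the general idea: it is not the SSR Hunter algorithm, it detects perfect repeats only, and the function name `find_ssrs` and the test sequences are hypothetical.

```python
import re


def find_ssrs(seq, min_len=10, motif_sizes=(2, 3, 4, 5, 6)):
    """Return perfect SSRs as (start, motif, repeat_count) tuples.

    seq: DNA sequence (A/C/G/T); start positions are 0-based.
    min_len: minimum total SSR length in bp (10 bp, as in the text).
    """
    seq = seq.upper()
    hits = []
    for k in motif_sizes:
        # A k-mer followed by one or more exact copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" is a homopolymer, "TCTC" is really "TC" x 2)
            if any(k % d == 0 and motif == motif[:d] * (k // d)
                   for d in range(1, k)):
                continue
            if len(m.group(0)) >= min_len:
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical EST fragment containing a (TC)5 repeat
    print(find_ssrs("GGTCTCTCTCTCAA"))
```

Counting the hits by motif length across all unigenes would give the kind of dinucleotide versus trinucleotide breakdown reported in the text.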
Foroni et al. [14] evaluated the genetic diversity among 16 'Sorrento' plants grown in Caserta (ten originating from seeds and six from grafts) and 26 grafted 'Sorrento' clones grown in the Sorrento peninsula, comparing their genotypes with six other walnut cultivars using 12 SSR markers. A total of 66 putative alleles were detected, 16 of which were unique to one individual. Two loci, WGA9 and WGA71, were useful for distinguishing Caserta samples from Sorrento peninsula clones. The phylogenetic and structure analyses highlighted the genetic distance between the Sorrento peninsula and Caserta groups: the samples were grouped into two clusters (or populations) corresponding closely, but not perfectly, to each sample's geographic origin. Ruiz et al. [15] characterized a collection of 57 common walnut cultivars using 32 microsatellite markers. Nineteen selected markers could discriminate the studied cultivars, with a total of 97 alleles and an average of five alleles per locus, confirming that these markers are more suitable tools for walnut identification than other molecular markers studied previously. The genetic similarity estimated from the molecular data clearly separated the Spanish walnuts from the Californian genotypes, and UPGMA analysis based on the similarity coefficient grouped the 57 cultivars into two clusters.
Foroni et al. [14] also analyzed ten plants that originated from 'Sorrento' seeds and six grafted 'Sorrento' clones, and compared them with six other walnut cultivars using SSR markers. The primers, which were derived from Juglans nigra, amplified alleles at six SSR loci in Persian walnut. A total of 33 putative alleles were detected, nine of which were unique to one genotype. Two loci, WGA5 and WGA27, were useful for distinguishing walnut varieties. Cluster analysis revealed a relatively large genetic distance between most of the 'Sorrento' plants and some genotypes labeled as 'Sorrento', and genetic diversity was found among both seed-propagated and vegetatively propagated plants. Jia et al. [16] studied 77 samples grown in China, including 14 introduced cultivars, 12 domestic seedling breeding cultivars and 49 fine pecan plants, together with Carya cathayensis and Juglans nigra. A total of 77 ISSR and 19 SSR primers were prescreened, from which ten ISSR and eight SSR primers were selected for further study. The ISSR markers yielded 94 amplified bands (100% polymorphic) in the range of 140-950 bp, while the SSR markers amplified 70 bands (100% polymorphic) in the range of 50-350 bp. The analysis indicated that Chinese-grown pecan cultivars and fine plants had significant diversity at the DNA level. The dendrograms constructed from ISSR, SSR or combined data were very similar but showed only a very weak association with morphological characters; progeny, however, grouped with their parents. The study demonstrated that both ISSR and SSR techniques are suitable for genetic diversity analysis and for the identification of pecan cultivars.

Fingerprinting
DNA fingerprinting was first applied to plant genome analysis in 1988, when Ryskov et al. [17] reported the first DNA fingerprint in plants. They described differences in DNA fragment patterns between two varieties of barley (Hordeum vulgare) following Southern blot hybridization of Hae III-digested DNA samples with the M13 probe. He et al. identified walnut cultivars by AFLP (Amplified Fragment Length Polymorphism) fingerprinting. Using AFLP markers on five Juglans regia L. cultivars, they found that five pairs of E+3/M+3 primers produced 28 specific bands, and the five cultivars were discriminated successfully from their DNA fingerprints. DNA fingerprinting therefore provides an effective tool for cultivar identification. Dvorak et al. [18] fingerprinted individuals at eighteen microsatellite loci to confirm hybridity and parentage. One hundred thirty-five F1 individuals from a 'Chandler' x 'Idaho' cross were fingerprinted, which identified six out-crossed individuals possessing alleles not present in either parent. The results were also analyzed to verify the pattern of inheritance of the molecular markers. About half of this F1 population is already at the bearing stage, and phenological data, bearing habit, and fruit and nut traits are being collected; the remaining half will come into bearing in the coming years.

Next generation sequencing
The scientific revolution that started with the human genome sequencing project, carried out with first generation sequencing technology, has initiated other sequencing projects, including those for plant species. High-throughput genome sequencing strategies are used to sequence the whole genome of an organism. Two basic methods are used to sequence large genomes:
1. Whole genome shotgun (WGS) method. This efficient method is mainly used for sequencing prokaryotic genomes. Genomes that are less repetitive in nature can be easily sequenced with the WGS method and assembled using computational tools (Figure 1).
2. Hierarchical sequencing method. This method is also known as the clone-by-clone approach to sequencing large genomes. Construction of a genetic linkage map of the organism is a prerequisite, so the availability of genomic resources in the form of DNA markers and BAC libraries is very important for constructing maps of the plant (Figure 2).
Sequencing means determining the number and order of the nucleotides that make up a given DNA molecule, and the technologies developed with the second and third generation sequencing platforms are collectively called next generation sequencing (NGS). Different techniques have been used to sequence the genomes of different crops (Figure 3). The Illumina Genome Analyzer is a groundbreaking platform for genetic analysis and functional genomics. The Illumina system can be used for de novo genome sequencing, resequencing and transcript profiling (Figure 4). The walnut genome sequence was reported by Martínez-García et al. [19], who used Illumina sequencing. Paired-end reads from short- and large-fragment libraries resulted in 500 million reads and 120X genome coverage.
Although developments in sequencing technology make it possible to obtain large-scale sequence data in a short time, the assembly and analysis of sequences remain a challenging task. The emphasis in NGS diagnostics has therefore shifted towards data analysis rather than the technical component. NGS infrastructures must include appropriate expertise and computational hardware, and the unprecedented amounts of medical data and the variety of processing algorithms necessitate adequate tools for data management. Many new de novo and resequenced plant genomes are expected in the near future, for plants in general and crop species in particular, using second and, increasingly, third generation sequencing platforms.
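Genome coverage figures such as the 120X quoted for the walnut genome follow from a simple relation: depth = (number of reads × read length) / genome size. The sketch below shows the arithmetic. The read length and genome size are not given in the text, so the values used are placeholders chosen only to make the calculation easy to follow and are not taken from the cited study.

```python
import math


def sequencing_depth(n_reads, read_length_bp, genome_size_bp):
    """Average depth of coverage: total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_depth, read_length_bp, genome_size_bp):
    """Number of reads required to reach a target average depth."""
    return math.ceil(target_depth * genome_size_bp / read_length_bp)


if __name__ == "__main__":
    # Placeholder inputs (assumptions, not data from the cited study)
    n_reads = 500_000_000        # read count as quoted in the text
    read_length = 150            # hypothetical read length in bp
    genome_size = 625_000_000    # hypothetical genome size in bp

    print(sequencing_depth(n_reads, read_length, genome_size))
    print(reads_needed(120, read_length, genome_size))
```

This is an average only; real coverage is uneven, and repetitive regions in particular may be over- or under-represented after assembly.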

Genome editing
Genome editing, as the term suggests, means making targeted edits to the genome. It is a type of genetic engineering in which DNA is inserted, deleted or replaced in the genome of an organism using engineered nucleases. These nucleases create site-specific double-strand breaks at desired locations in the genome. Four families of engineered nucleases are currently in use:
1. Meganucleases.
2. Zinc finger nucleases (ZFNs).
3. Transcription activator-like effector nucleases (TALENs).
4. The CRISPR-Cas system.
Currently, no successful reports of genome editing in tree fruits are available, although efforts in this direction are being undertaken.

Conclusion
Molecular markers are powerful tools in the field of genomics. Their use can reduce costs and thereby facilitate cultivar identification, genetic distance assessment, gene mapping and, potentially, marker-assisted selection. Fruit and nut crops have received considerable attention in genomics, and genomic selection in fruit tree crops is expected to enhance breeding efficiency. These tools complement, but do not replace, classical plant breeding. The variation in agronomic traits among walnut populations is of great interest for future cultivar improvement, and combining genomics with breeding can be expected to improve walnut fruit.