Research Article Volume 2 Issue 4
1College of Hydraulic & Environmental Engineering, China Three Gorges University Yichang, China
2International Water Management Institute (IWMI) Lahore, Pakistan
3Center of Excellence in Water Resources Engineering (CEWRE), UET, Lahore, Pakistan
4Centre for Integrated Mountain Research (CIMR), University of the Punjab, Pakistan
Correspondence: Muhammad Imran Azam, College of Hydraulic & Environmental Engineering, China Three Gorges University Yichang, China, Tel 861 5672489878
Received: July 30, 2018 | Published: August 30, 2018
Citation: Azam MI, Bhatti MT, Xiaotao S, et al. Flood occurrence exploration for ungauged river catchment at Jhelum river basin of Pakistan. Int J Hydro. 2018;2(4):520-526. DOI: 10.15406/ijh.2018.02.00120
The Jhelum River catchment includes gauged and un-gauged sub-catchments with hydrological similarities. In this study, Canonical Correlation Analysis (CCA) was used to explore the correlation between un-gauged and gauged sub-catchments. Flow data from fifteen gauging stations were obtained for detailed analysis. Linear regression was applied to transfer flood data from gauged sub-catchments to the un-gauged sub-catchments that showed high correlation based on CCA. The flood series of the un-gauged sub-catchments were analyzed using different flood frequency methods, and floods of different return periods were compared with floods of identical return periods estimated through the graphical method. The results of CCA showed that the two main characteristics of the gauged and un-gauged sub-catchments, i.e. catchment area and main channel slope, possess different levels of correlation. High correlation (R2=0.95) was observed for catchment area, while main channel slope showed limited correlation (R2=0.58). Furthermore, the relationship between dependent and independent variables showed high correlation between length of channel and catchment area (R2=0.94) and between rainfall and main channel slope (R2=0.80). In general, the results of CCA based on selected multivariate tests of significance showed that four un-gauged sub-catchments are fairly well correlated with the gauged catchments, and that flow data from these gauged sub-catchments can be transferred to the un-gauged sub-catchments through linear regression. The flood frequency analysis for the un-gauged sub-catchments showed that the CCA method is better suited to estimating floods of different return periods than the graphical method.
Keywords: Jhelum, catchment, flood, canonical correlation analysis, regression
In developing countries like Pakistan, stream flow recording stations are few, poorly managed and of low standard. Many of the available flow time series are either too short to allow a reliable estimation of extreme events, or no flow record is available at the site of concern.1 The planning and design of hydraulic structures (dams, barrages, headworks etc.) require flow records of long duration. However, in many cases where limited stream flow data are available, the flow at the desired location is generated using crude and simple techniques such as the area reduction method. The area reduction method considers the catchment area as the only parameter for flow estimation at the un-gauged site, or uses a coefficient representing the catchment characteristics. Therefore, this technique does not establish a direct relationship between the characteristics of gauged and un-gauged catchments. The Jhelum River is one of the western rivers allocated to Pakistan under the Indus Waters Treaty of 1960. It is an eastern tributary of the Indus Basin River System (IBRS), with a catchment area of 33,000 square kilometers and a length of 500 kilometers up to the Mangla Dam lying in the disputed territory of Kashmir.2 In regional frequency analysis, a region that is hydrologically homogeneous from a statistical point of view is considered. Long-term data from neighboring catchments are tested for homogeneity, and a group of hydrological stations is formed to establish a region. Data from all hydrological stations of this region are pooled and analyzed as a group to find the frequency characteristics of the region.3 Dalrymple4 discussed a test to determine flood frequency curves in a region treated as homogeneous. Ouarda et al.5 used Canonical Correlation Analysis to estimate the flood characteristics of un-gauged basins in Ontario (Canada).
This method emphasizes graphical and quantitative analysis of the relationship between the flood variables before the data of the gauged basins are used to estimate the flood variables at un-gauged sites. Ribeiro-Correa et al.6 presented a theoretical framework for determining the hydrological neighborhood of a drainage basin based on CCA. Kjeldsen et al.7 examined the use of the index flood method for estimating index flood parameters at un-gauged sites. Crochet8 used regional flood frequency analysis to estimate the T-year flood peak discharge of fixed duration for poorly gauged and un-gauged catchments; the study scaled a regional flood frequency distribution by the so-called index flood of the catchment (Index Flood Method). Zakaria et al.9 used a support vector machine (SVM) model for river flow forecasting at un-gauged sites and compared its performance with that of multiple linear regression (MLR). Burn10 used the region of influence (ROI) framework and derived information from flood magnitudes to examine the homogeneity of flood regions. Chavoshi & Soleiman11 applied conventional cluster analysis as well as fuzzy logic theory to the regionalization of 70 catchments in the north of Iran. Grandry et al.12 worked on low flow calculations in an un-gauged catchment. Mirghani et al.13 worked on the regionalization of low flow frequency estimates for the Nile water resources. Ouarda et al.14 presented an adaptation of some regional assessment approaches and compared their performance on data from the Balsas, Lerma and Pánuco River Basins in Mexico. Four approaches were used in that study for the delineation of homogeneous regions.
A data set of 29 stations from numerous Mexican river catchments of the Balsas region was used. Results demonstrated that the CCA-based methods led to the best performance: although hierarchical clustering generally seemed to give less biased quantile estimates, the lowest root mean square error values were almost consistently obtained with the CCA-based methods. The canonical kriging method did not seem to be more sensitive to database quality than the other two CCA-based methods. Badyalina & Shabri15 studied a model combining canonical correlation analysis (CCA) and the group method of data handling (GMDH) to obtain better estimates of flood magnitudes at ungauged catchments. CCA was used to form a canonical physiographical space by relating the site characteristics of gauged stations. Chebana et al.16 accounted for nonlinearity by introducing the generalized additive model (GAM) in the estimation step of regional frequency analysis (RFA).
A neighborhood approach using canonical correlation analysis (CCA) was used to delineate homogeneous regions. GAMs offered a number of advantages, such as flexibility in the shapes of the relationships as well as in the distribution of the output variable. The regional model was applied to a dataset of 151 hydrometric stations located in the province of Québec, Canada. A stepwise procedure was employed to select the appropriate physio-meteorological variables, and a comparison was performed based on different elements (regional model, variable selection, and delineation). In the study of Badyalina & Shabri,15 the GMDH model was used to identify the functional relationship between flood quantiles and the physiographic variables in the CCA space. The proposed model was applied to 70 catchments in Peninsular Malaysia, and the jackknife procedure was used to evaluate its results. The results were compared with those of the traditional CCA model, the linear regression (LR) model and the GMDH model, and indicated that the proposed CCA-GMDH model delivered the best performance among all models in terms of extrapolation precision. Komi et al.17 noted that flood frequency estimates are important for disaster risk management. Their study aimed to improve knowledge of flood frequencies in the Volta River Basin through regional frequency analysis based on L-moments. Three homogeneous groups were identified based on cluster analysis and a homogeneity test. Using L-moment diagrams and goodness-of-fit tests, the generalized extreme value and generalized Pareto distributions were found suitable to yield accurate flood quantiles in the Volta River Basin. Finally, regression models of the mean annual flood with the size of the drainage area, mean basin slope and mean annual rainfall were proposed to enable flood frequency estimation at ungauged sites within the study area.
Hailegeorgis & Alfredsen18 performed regional flood frequency analysis (RFFA) using the L-moments method and annual maximum series (AMS). They used similarity in at-site and regional parameters of distributions, high flow regime and seasonality, and runoff response from rainfall-runoff models to identify homogeneous catchments, bootstrap resampling to estimate uncertainty, and regression methods for prediction in ungauged basins (PUB). The rigorous similarity criteria were useful for the identification of catchments, while resemblance in runoff response had the least identification power. For the PUB, a linear regression between index flood and catchment area (R2=0.95) performed better than a power law (R2=0.80) and a linear regression between at-site quantiles and catchment area (e.g. R2=0.88 for a 200-year flood). There was considerable uncertainty in the regional growth curves (e.g. −6.7% to −13.5% and +5.7% to +24.7%, respectively, for the 95% lower and upper confidence limits (CL) for return periods of 2–1000 years). The peaks of the hourly AMS were 2–47% higher than those of the daily series. Quantile estimates from at-site flood frequency analysis (ASFFA) for some catchments were outside the 95% CL. Uncertainty estimation, sampling of flood events from instantaneous observations and comparative evaluation of RFFA with ASFFA are important. Similarly, many other researchers have worked on canonical correlation and flood frequency analysis to produce better hydrological analyses of an area. The present study was conducted to identify hydrological similarities between un-gauged and gauged sites for flood frequency analysis using Canonical Correlation Analysis.
The Jhelum River catchment was selected for detailed analysis. The catchment area of the Jhelum River and its tributaries is 33,000 km2. Fourteen gauging stations (Sopore, Chinari, Domel, Dudhnial, Nosheri, Muzaffarabad, Naran, Garihabibullah, Dollai, Kohala, Azad-Pattan, Palote, Kotli and Mangla) were selected for detailed analysis, of which eight are located on the Jhelum River, three on the Neelum River, two on the Kunhar River and one each on the Poonch and Kanshi Rivers. The location of these sub-catchments within the Jhelum River catchment is shown in Figure 1.
Data collection
Flow records: For the present study, the mean daily flow records of the selected stations were collected from the Surface Water Hydrology Project (SWHP) of WAPDA. The characteristics of the selected gauging sites are given in Table 1. Jhelum at Azad Pattan has the largest sub-catchment among the selected sub-catchments, with an area of 26485 km2. On the other hand, Jhelum at Sopore, lying in Indian occupied Kashmir, is the smallest, with a catchment area of only 4905 km2. The lengths of the flow records at the stream gauging stations of the Jhelum River catchment are also shown in Table 1. The longest flow records are available at Kotli (1961-2009). Domel, Dollai, Dudhnial and Sopore have short records.
Sr. no. | Station | Catchment area (km2) | Records | No. of years | River
1 | Chinari | 13598 | 1970-2012 | 42 | Jhelum
2 | Azad Pattan | 26485 | 1978-2012 | 35 | Jhelum
3 | Mangla | 33411 | 1967-2012 | 46 | Jhelum
4 | Kohala | 24890 | | | Jhelum
5 | Nosheri | 6809 | 1980-2009 | 29 | Jhelum
6 | Muzaffarabad | 7278 | 1963-2012 | 50 | Neelum
7 | Garihabibullah | 2382 | 1961-2012 | 52 | Kunhar
8 | Palote | 1111 | 1971-2012 | 42 | Kanshi
9 | Kotli | 3238 | 1961-2012 | 52 | Poonch
10 | Naran | 1036 | 1970-2012 | 42 | Kunhar
11 | Domel | 14504 | 1976-1977, 1984-2001 | 19 | Jhelum
12 | Dollai | 24406 | 1990-1994, 1996 | 5 | Jhelum
13 | Sopore | 4905 | 1970-1988 | 18 | Jhelum
14 | Dudhnial | 6500 | 1982-1992 | 11 | Jhelum
Table 1 List of the gauged & un-gauged stations at Jhelum river catchment
Catchment characteristics data
The catchment characteristics are important in regional studies. The selected catchment characteristics (catchment area, main channel length, channel slope, mean elevation of the catchment and mean annual precipitation) were obtained from topographic maps prepared by the Soil Survey of Pakistan and by processing digital elevation models in ArcGIS software. Standard 1:50,000 scale topographic maps were used to derive the catchment characteristics.
Canonical correlation analysis (CCA)
CCA was used to establish hydrological similarities between gauged and un-gauged catchments of the Jhelum River. Canonical correlation is considered to be the general model on which many other multivariate techniques are based, because it can use both metric and non-metric data for either the dependent or the independent variables. The general form of canonical analysis is given in equation 1:
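$$\underbrace{Y_1 + Y_2 + \cdots + Y_p}_{\text{dependent variables}} = \underbrace{X_1 + X_2 + \cdots + X_q}_{\text{independent variables}} \tag{1}$$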
The details of CCA for the analysis of un-gauged catchments can be found in Ouarda et al.5 Statistical Product and Service Solutions (SPSS) software was used to determine the correlation and significance level among gauged and un-gauged catchments. All the selected catchments (gauged and un-gauged) were analyzed in terms of dependent and independent variables (Table 2) to establish hydrological similarities between the sub-catchments.
Sr. no. | Variable name | Nature | Catchment
1 | Latitude (dd) | Independent | Gauged
2 | Longitude | Independent | Gauged
3 | Elevation (m) | Independent | Gauged
4 | Length of the channel (km) | Independent | Gauged
5 | Catchment area (km2) | Dependent | Un-gauged
6 | Main channel slope (m/km) | Dependent | Un-gauged
7 | Mean annual rainfall (mm) | Independent | Gauged
Table 2 Dependent and independent variables used for canonical correlation analysis
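The study ran CCA in SPSS. As a rough illustration of what the technique computes, the sketch below obtains canonical correlations between a block of independent variables and a block of dependent variables with NumPy. The function name `canonical_correlations` and the random input data are hypothetical and are not taken from the study; the QR/SVD route is one standard way of computing CCA and is not necessarily what SPSS does internally.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between column sets X and Y (rows = catchments).

    Assumes both matrices have full column rank and more rows than columns.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Centre every variable so the result does not depend on the means.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases of the two column spaces.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations,
    # returned in descending order.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Hypothetical data: 14 catchments, 5 independent and 2 dependent variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(14, 5))
Y = rng.normal(size=(14, 2))
r = canonical_correlations(X, Y)
print(r)
```

With two dependent variables, as in Table 2, at most two canonical correlations are returned.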
Linear regression
The objective of a linear regression model is to provide a means of predicting or estimating one variable (the dependent variable) from information on a second variable (the independent variable). The general form of linear regression is given in equation 2:
$$Y = \alpha + \beta X \tag{2}$$
where α and β are constants, X is the independent variable and Y is the dependent variable.
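A minimal sketch of how the two constants can be estimated by ordinary least squares, assuming the usual form Y = α + βX with α as intercept and β as slope. The function names and the paired values are hypothetical and serve only to illustrate the calculation.

```python
from statistics import fmean

def fit_line(x, y):
    """Least-squares estimates of alpha (intercept) and beta (slope)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    return alpha, beta

def r_squared(x, y, alpha, beta):
    """Coefficient of determination of the fitted line."""
    my = fmean(y)
    ss_res = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical paired values (independent X, dependent Y), not study data.
x = [100.0, 200.0, 300.0, 400.0]
y = [55.0, 105.0, 155.0, 205.0]
alpha, beta = fit_line(x, y)
r2 = r_squared(x, y, alpha, beta)
print(alpha, beta, r2)
```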
Analytical frequency analysis
Two probability distributions, Gumbel and Log Pearson Type-III, were used to determine the flood magnitudes at the un-gauged sites as well as the gauged sites.
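The following sketch shows one common way of computing T-year floods with the Gumbel and Log Pearson Type-III distributions using frequency factors. It assumes method-of-moments fitting, the large-sample Gumbel frequency factor and the Wilson-Hilferty approximation for the Pearson Type-III factor; the source does not state which fitting procedure was used, and the annual peak values are hypothetical.

```python
import math
from statistics import NormalDist, fmean, stdev

def gumbel_frequency_factor(T):
    """Large-sample Gumbel frequency factor for return period T (years, T > 1)."""
    return -(math.sqrt(6.0) / math.pi) * (0.5772 + math.log(math.log(T / (T - 1.0))))

def gumbel_quantile(peaks, T):
    """T-year flood from annual peaks: mean + K_T * standard deviation."""
    return fmean(peaks) + gumbel_frequency_factor(T) * stdev(peaks)

def sample_skew(values):
    """Sample skewness coefficient."""
    n = len(values)
    m, s = fmean(values), stdev(values)
    return n * sum((v - m) ** 3 for v in values) / ((n - 1) * (n - 2) * s ** 3)

def lp3_frequency_factor(T, skew):
    """Pearson Type-III frequency factor via the Wilson-Hilferty approximation."""
    z = NormalDist().inv_cdf(1.0 - 1.0 / T)
    if abs(skew) < 1e-8:
        return z  # zero skew reduces to the normal deviate
    k = skew / 6.0
    return (2.0 / skew) * ((1.0 + k * z - k * k) ** 3 - 1.0)

def lp3_quantile(peaks, T):
    """T-year flood from the log10-transformed annual peaks."""
    logs = [math.log10(q) for q in peaks]
    K = lp3_frequency_factor(T, sample_skew(logs))
    return 10.0 ** (fmean(logs) + K * stdev(logs))

# Hypothetical annual peak flows in cumec, for illustration only.
peaks = [620.0, 710.0, 540.0, 880.0, 760.0, 690.0, 930.0, 600.0]
for T in (2, 10, 100):
    print(T, round(gumbel_quantile(peaks, T)), round(lp3_quantile(peaks, T)))
```

The Gumbel factor is close to zero at T = 2.33 years, which is why the 2.33-year flood is taken as the mean annual flood.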
Regional flood frequency analysis
Regional flood frequency analysis is a commonly used method to overcome the problems associated with un-gauged catchments. The method consists of developing two curves from the flood data of the gauged sites in the region. The first curve shows the mean annual peak flood versus the catchment area. The second shows the ratio of the peak flood to the mean annual flood versus the return period. Once these two curves are developed for the region, a flood frequency curve for any un-gauged catchment in the same region can be constructed. The procedure to develop the two curves can be found in standard hydrology textbooks.
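The two-curve procedure can be sketched numerically as follows. The sketch assumes a straight-line relation between catchment area and mean annual flood (Q2.33) and a regional curve built from the median ratio Q_T/Q2.33 across gauged stations. All station names, areas and flows are hypothetical, and the function names are illustrative.

```python
from statistics import fmean, median

def fit_line(x, y):
    """Least-squares intercept and slope of y = alpha + beta * x."""
    mx, my = fmean(x), fmean(y)
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - beta * mx, beta

def regional_growth_curve(quantiles, q233):
    """Median of Q_T / Q2.33 over the gauged stations for each return period."""
    periods = sorted(next(iter(quantiles.values())))
    return {T: median(quantiles[s][T] / q233[s] for s in quantiles) for T in periods}

def ungauged_floods(area, alpha, beta, growth):
    """Scale the regional ratios by the mean annual flood estimated from area."""
    q233_site = alpha + beta * area
    return {T: ratio * q233_site for T, ratio in growth.items()}

# Hypothetical gauged stations (area in km2, flows in cumec).
area = {"A": 1000.0, "B": 2000.0, "C": 3000.0}
q233 = {"A": 300.0, "B": 500.0, "C": 700.0}
quantiles = {
    "A": {2: 270.0, 10: 450.0, 100: 720.0},
    "B": {2: 500.0, 10: 800.0, 100: 1100.0},
    "C": {2: 665.0, 10: 980.0, 100: 1820.0},
}

stations = sorted(area)
alpha, beta = fit_line([area[s] for s in stations], [q233[s] for s in stations])
growth = regional_growth_curve(quantiles, q233)
estimate = ungauged_floods(1500.0, alpha, beta, growth)
print(growth, estimate)
```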
Canonical correlation analysis
The relationship between the independent variables and the dependent variables, catchment area (CA) and main channel slope (MCS), is given in Table 3. Positive values show a direct relation between dependent and independent variables, whereas negative values denote an inverse relationship. Length of the channel (LC) is directly correlated with CA, with a value of 0.94, and rainfall is directly related to MCS, with a value of 0.80. To test the goodness of fit between dependent and independent variables, the confidence level was checked in SPSS. These strong relationships have a direct bearing on the hydrological homogeneity of the catchment. The significance test of the canonical correlations is straightforward in principle: the canonical correlations were tested one by one, beginning with the largest. Table 4 gives the multivariate tests of significance of the canonical correlation; the acceptable significance level for interpretation is 0.05. The F test was applied in SPSS to find the significance level, and all the tests in Table 4 showed a high level of significance. Among the selected tests, the Wilks test showed the best result, with the minimum test value of 0.016. Once the correlation between the dependent variables of un-gauged and gauged catchments was established, the next step was to transfer data from the neighboring gauged catchments to the un-gauged catchments using linear regression.
Covariate | CA | MCS
Latitude | -0.40 | 0.30
Longitude | -0.18 | -0.20
Elevation | -0.51 | 0.41
Length of Channel | 0.94 | -0.26
Rainfall | 0.27 | 0.80
Table 3 Correlations between dependent and independent variables
Sr. no. | Test name | Value
1 | Pillais | 1.52
2 | Hotellings | 26.76
3 | Wilks | 0.016
Table 4 Multivariate tests of significance
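For orientation, the three statistics named in Table 4 are conventionally defined in terms of the canonical correlations: Wilks' lambda is the product of (1 − r²), Pillai's trace is the sum of r², and the Hotelling-Lawley trace is the sum of r²/(1 − r²). The sketch below applies these textbook definitions to hypothetical canonical correlations; it is not a reproduction of the SPSS output, and the source does not state how the tabulated values were derived.

```python
import math

def multivariate_tests(canonical_corrs):
    """Pillai, Hotelling-Lawley and Wilks statistics from canonical correlations."""
    r2 = [r * r for r in canonical_corrs]
    return {
        "Pillais": sum(r2),
        "Hotellings": sum(v / (1.0 - v) for v in r2),
        "Wilks": math.prod(1.0 - v for v in r2),
    }

# Hypothetical canonical correlations, for illustration only.
stats = multivariate_tests([0.9, 0.6])
print(stats)
```

A Wilks value near zero indicates that the independent variables explain most of the variation in the dependent set, which is why a small value is read as a strong result.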
Linear regression analysis
To transfer data from gauged to un-gauged sub-catchments, linear regression analysis was performed between pairs of stations over records of similar length, and the resulting regression equation was used to find the missing data. For Dudhnial station (un-gauged), data were transferred from the Nosheri site; Figure 2(A) shows the regression between Nosheri and Dudhnial, with R2=0.91 indicating a strong correlation. Figure 2(B) shows the regression between Sopore and Chinari, used to find the missing data at Sopore; R2 is 0.91, indicating a strong relation. Figure 2(C) shows the regression between Chinari and Domel, used to find the missing data at Domel; R2 is 0.98. Figure 2(D) shows the regression between Kohala and Dollai, used to find the missing data at Dollai; R2 is 0.98, again indicating a strong relation.
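The data transfer step can be sketched as a record extension: fit the regression on the years both stations share, then apply it to the donor station's remaining years. The function names, station records and years below are hypothetical and do not reproduce the regressions of Figure 2.

```python
from statistics import fmean

def fit_line(x, y):
    """Least-squares intercept and slope of y = alpha + beta * x."""
    mx, my = fmean(x), fmean(y)
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - beta * mx, beta

def extend_record(donor, target):
    """Fill years missing at the target station from the donor station.

    donor, target: dicts mapping year -> flow. Observed target values are kept.
    """
    common = sorted(set(donor) & set(target))
    alpha, beta = fit_line([donor[y] for y in common], [target[y] for y in common])
    filled = dict(target)
    for year, flow in donor.items():
        if year not in filled:
            filled[year] = alpha + beta * flow
    return filled, alpha, beta

# Hypothetical records in cumec: the target station lacks 1993 and 1994.
donor = {1990: 400.0, 1991: 520.0, 1992: 610.0, 1993: 450.0, 1994: 700.0}
target = {1990: 210.0, 1991: 270.0, 1992: 315.0}
filled, alpha, beta = extend_record(donor, target)
print(filled)
```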
Flood frequency analysis
Un-gauged sub-catchments
Flood frequency analysis was performed using past records of peak flow to provide guidance on the probable behavior of future flooding. The analysis provided information about the possible flood magnitude at different return periods and the frequency with which a given flood occurred. The Gumbel and Log Pearson Type-III distributions were applied to the historical flow records of Sopore, Dudhnial, Dollai and Domel. The chi-square test was performed to test the goodness of fit of the selected distributions. The chi-square value should be less than or equal to 12, and Table 5 shows that the values for all four stations meet this limit.
Sr. no. | Un-gauged station | Chi-square (Gumbel) | Chi-square (Log Pearson III) | Selected distribution
1 | Sopore | 5 | 9 | Gumbel
2 | Domel | 9 | 12 | Gumbel
3 | Dollai | 6 | 9 | Gumbel
4 | Dudhnial | 4 | 4.1 | Gumbel
Table 5 Chi-square test values of Gumbel and Log Pearson Type-III at un-gauged stations
Gauged sub-catchments
Flood frequency analysis was likewise performed using past records of peak flow to provide guidance on the probable behavior of future flooding at the gauged sub-catchments. Table 6 presents the summary of chi-square test results for the Gumbel and Log Pearson Type-III distributions applied at the gauged stations.
The chi-square statistic is computed as
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where O = observed values and E = expected values.
Sr. no. | Gauged station | Chi-square (Gumbel) | Chi-square (Log Pearson III) | Selected distribution
1 | Chinari | 26 | 6 | Log Pearson III
2 | Azad Pattan | 9.7 | 6.09 | Log Pearson III
3 | Nosheri | 9 | 12 | Gumbel
4 | Muzaffarabad | 10.2 | 14.2 | Gumbel
5 | Ghari-Habibullah | 2 | 4 | Gumbel
6 | Palote | 11.5 | 9.6 | Log Pearson III
7 | Kotli | 9.6 | 7.1 | Log Pearson III
8 | Kohala | 12 | 10 | Log Pearson III
9 | Mangla | 18 | 10.5 | Log Pearson III
10 | Naran | 4 | 5.4 | Gumbel
Table 6 Chi-square test values at gauged stations
The chi-square test value helps in selecting an appropriate frequency distribution, and it should be less than 12 for a distribution to be acceptable.19 If the values for several distributions remain below 12, the distribution with the lower value is selected. In our analysis, the chi-square values for the selected distributions were mostly below 12, the exceptions being Gumbel at Chinari and Mangla and Log Pearson Type-III at Muzaffarabad.
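The selection rule can be expressed in a few lines. The sketch below computes the chi-square statistic from observed and expected class frequencies and picks the distribution with the lowest value among those within the limit of 12. The function names are illustrative; the example values for Chinari and Muzaffarabad are taken from Table 6.

```python
def chi_square(observed, expected):
    """Chi-square statistic: sum of (O - E)^2 / E over the classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def select_distribution(chi_values, limit=12.0):
    """Return the distribution with the lowest chi-square value within the limit.

    chi_values: dict mapping distribution name -> chi-square value.
    Returns None when no distribution meets the limit.
    """
    acceptable = {name: v for name, v in chi_values.items() if v <= limit}
    if not acceptable:
        return None
    return min(acceptable, key=acceptable.get)

# Values from Table 6.
chinari = select_distribution({"Gumbel": 26, "Log Pearson III": 6})
muzaffarabad = select_distribution({"Gumbel": 10.2, "Log Pearson III": 14.2})
print(chinari, muzaffarabad)
```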
Graphical method for regional flood frequency analysis
Figure 3 shows the catchment area-mean annual flood curve. The mean annual flood is the flood with a return period of 2.33 years at the gauged sites of the Jhelum River basin. The relation between catchment area and mean annual flood is assumed to be a straight line. As the next step, peak flood ratios were calculated: the peak flood Q at different return periods (i.e. 1.25, 2, 5, 10, 20, 50, 100, 200 and 1000 years) was divided by the mean annual flood (Q2.33). The regional flood frequency curve was then plotted between the median peak flood ratio and the selected return periods, as shown in Figure 4. The catchment areas of the un-gauged sites were measured from topographic maps as well as from the DEM using GIS software. For the catchment area of each un-gauged site, the mean annual flood (Q2.33) was read from Figure 3, and the median ratio Q/Q2.33 at each return period was read from Figure 4. The ratio was multiplied by Q2.33 to obtain the flood magnitude at different return periods for the selected un-gauged sub-catchments. The results are presented in Table 7.
The CCA method showed a slightly increasing trend compared with the graphical method. Figure 5(C) compares the flood magnitudes at different return periods at Sopore obtained with the graphical method and the CCA method. The two methods gave the same value at the 4-year return period, with a flood magnitude of 700 cumec; beyond the 4-year return period the flood magnitude was overestimated, and the difference between the two methods was large at higher return periods. Figure 5(D) presents the same comparison for Dudhnial, where the graphical method overestimated the flood magnitudes.
Return period (years) | Domel | Dollai | Sopore | Dudhnial
1.25 | 828 | 864 | 720 | 852
2 | 952.2 | 993.6 | 828 | 979.8
10 | 1110.9 | 1159.2 | 966 | 1143.1
20 | 1131.6 | 1180.8 | 984 | 1164.4
50 | 1621.5 | 1692 | 1410 | 1668.5
100 | 1518 | 1584 | 1320 | 1562
200 | 1104 | 1152 | 960 | 1136
1000 | 1035 | 1080 | 900 | 1065
Table 7 Flood peaks at un-gauged sub-catchments
The graphical method estimates lower flood peaks than the canonical correlation method, especially for Dollai, where a substantial difference (148-295%) exists between the floods estimated by the two methods at all return periods. For Domel, Sopore and Dudhnial, the graphical method gave floods closer to those estimated by the canonical correlation method at smaller return periods; at higher return periods, however, its estimates are again much lower than those of the CCA (Canonical Correlation Analysis) method. The comparison of the floods estimated by the two methods suggests that the graphical method may have underestimated the flood peaks. The CCA method appears to be the better approach to flood estimation because it takes into consideration the hydrological similarities of the un-gauged and gauged sub-catchments. Moreover, the frequency analysis is performed on the transferred data using the most appropriate frequency distribution. Therefore, it can be concluded that the canonical correlation method should be preferred over the graphical method.
This research was supported by the National Natural Science Foundation of China. The authors would also like to acknowledge the Pakistan Water and Power Development Authority (WAPDA), & Pakistan Meteorological Department (PMD) for providing data for the study.
None.
©2018 Azam, et al. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work non-commercially.