Research Article Volume 7 Issue 3
Department of Statistics, Eritrea Institute of Technology, Eritrea
Correspondence: Rama Shanker, Department of Statistics, Eritrea Institute of Technology, Eritrea
Received: May 03, 2018 | Published: June 13, 2018
Citation: Eyob T, Shanker R. A two–parameter weighted Garima distribution with properties and application. Biom Biostat Int J. 2018;7(3):234-242. DOI: 10.15406/bbij.2018.07.00214
In this paper a two–parameter weighted Garima distribution which includes one–parameter Garima distribution has been proposed for modeling real lifetime data. Statistical properties of the distribution including shapes of probability density function, moments and moment related measures, hazard rate function, mean residual life function and stochastic orderings have been discussed. The estimation of its parameters has been discussed using the method of maximum likelihood. Application of the proposed distribution has been discussed.
Keywords: Garima distribution, moments, hazard rate function, mean residual life function, stochastic ordering, maximum likelihood estimation, goodness of fit
Fisher1 first introduced the idea of weighted distributions to model ascertainment bias; Rao2 later reformulated it in a unifying theory for problems where the observations are non–experimental, non–replicated and non–random. When a researcher collects observations in nature according to a certain stochastic model, the collected observations will not follow the original distribution unless every observation has been given an equal chance of being included. Suppose the original observation $x_0$ comes from a population having probability density function (pdf) $f_0(x;\theta_1)$, where $\theta_1$ may be a parameter vector, and the observation $x$ is recorded according to a probability re–weighted by a weight function $w(x;\theta_2)>0$, $\theta_2$ being a new parameter vector. Then $x$ comes from a population having pdf
$$f(x;\theta_1,\theta_2)=k\,w(x;\theta_2)\,f_0(x;\theta_1) \tag{1.1}$$
where $k$ is a normalizing constant. Such distributions are known as weighted distributions. A weighted distribution with weight function $w(x;\theta_2)=x$ is called a length–biased distribution. General probability models leading to weighted probability distributions, their applications, and the occurrence of $w(x;\theta_2)=x$ in sampling problems have been discussed extensively by Patil & Rao.3,4
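The construction in (1.1) can be sketched numerically. The following is a minimal illustration, assuming an exponential base density and the length–biased weight $w(x)=x$; the helper names `simpson` and `weighted_pdf` and the truncation point `upper` are choices made for this sketch, not part of the original paper.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def weighted_pdf(f0, w, upper):
    """Return x -> k * w(x) * f0(x), with k obtained by numerical integration.

    `upper` truncates the support [0, inf); treating the mass beyond it
    as negligible is an assumption of this sketch.
    """
    k = 1.0 / simpson(lambda x: w(x) * f0(x), 0.0, upper)
    return lambda x: k * w(x) * f0(x)

theta = 2.0
f0 = lambda x: theta * math.exp(-theta * x)        # exponential base pdf
f_lb = weighted_pdf(f0, lambda x: x, upper=20.0)   # length-biased version

# The length-biased exponential is the gamma(2, theta) density theta^2 x e^{-theta x}.
exact_at_1 = theta ** 2 * 1.0 * math.exp(-theta * 1.0)
```

Here the normalizing constant $k$ equals $1/E(X)=\theta$, so `f_lb(1.0)` should agree with `exact_at_1` up to integration error.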
Shanker5 proposed a lifetime distribution, named the Garima distribution, for modeling behavioral science data. It is defined by its pdf and cumulative distribution function (cdf)
$$f_1(x;\theta)=\frac{\theta}{\theta+2}\,(1+\theta+\theta x)\,e^{-\theta x};\quad x>0,\ \theta>0 \tag{1.2}$$
$$F_1(x;\theta)=1-\left[1+\frac{\theta x}{\theta+2}\right]e^{-\theta x};\quad x>0,\ \theta>0 \tag{1.3}$$
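The pdf (1.2) and cdf (1.3) can be checked against each other numerically. This is a small sketch; the function names and the Simpson integrator are illustrative choices, not taken from the paper.

```python
import math

def garima_pdf(x, theta):
    """Garima pdf, equation (1.2)."""
    return theta / (theta + 2) * (1 + theta + theta * x) * math.exp(-theta * x)

def garima_cdf(x, theta):
    """Garima cdf, equation (1.3)."""
    return 1 - (1 + theta * x / (theta + 2)) * math.exp(-theta * x)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

theta, x = 2.0, 1.0
# Integrating the pdf over [0, x] should reproduce the cdf at x.
area = simpson(lambda t: garima_pdf(t, theta), 0.0, x)
```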
The first four raw moments (moments about origin) of Garima distribution obtained by Shanker5 are
$$\mu_1'=\frac{\theta+3}{\theta(\theta+2)},\quad \mu_2'=\frac{2(\theta+4)}{\theta^2(\theta+2)},\quad \mu_3'=\frac{6(\theta+5)}{\theta^3(\theta+2)},\quad \mu_4'=\frac{24(\theta+6)}{\theta^4(\theta+2)}$$
The central moments (moments about the mean) of Garima distribution obtained by Shanker5 are given by
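$$\mu_2=\frac{\theta^2+6\theta+7}{\theta^2(\theta+2)^2},\quad \mu_3=\frac{2(\theta^3+9\theta^2+21\theta+15)}{\theta^3(\theta+2)^3},\quad \mu_4=\frac{3(3\theta^4+36\theta^3+134\theta^2+204\theta+111)}{\theta^4(\theta+2)^4}$$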
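As a worked check of the variance of the Garima distribution (a verification added here, using only the raw moments listed above), substituting $\mu_1'$ and $\mu_2'$ into $\mu_2=\mu_2'-(\mu_1')^2$ gives

$$\mu_2=\frac{2(\theta+4)}{\theta^2(\theta+2)}-\frac{(\theta+3)^2}{\theta^2(\theta+2)^2}=\frac{2(\theta+4)(\theta+2)-(\theta+3)^2}{\theta^2(\theta+2)^2}=\frac{\theta^2+6\theta+7}{\theta^2(\theta+2)^2}$$

since $2(\theta+4)(\theta+2)=2\theta^2+12\theta+16$ and $(\theta+3)^2=\theta^2+6\theta+9$.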
Shanker5 discussed the statistical properties of the Garima distribution, including the shapes of its pdf for different values of the parameter, hazard rate function, mean residual life function, stochastic ordering, mean deviations, order statistics, Bonferroni and Lorenz curves, Renyi entropy and stress–strength reliability. The same paper covers estimation of the parameter by both the method of moments and the method of maximum likelihood, applications of the distribution, and its suitability and superiority for modeling behavioral science data over the exponential distribution and the Lindley distribution introduced by Lindley.6 Ghitany et al.7 studied the statistical properties, parameter estimation and application of the Lindley distribution in detail, modeled waiting times in a bank, and established that the Lindley distribution gives a much closer fit than the exponential distribution. Further, Ghitany et al.8 introduced a two–parameter weighted Lindley distribution (WLD) defined by its pdf and cdf
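$$f_2(x;\theta,\beta)=\frac{\theta^{\beta+1}}{(\theta+\beta)\,\Gamma(\beta)}\,x^{\beta-1}(1+x)\,e^{-\theta x};\quad x>0,\ \theta>0,\ \beta>0 \tag{1.4}$$
$$F_2(x;\theta,\beta)=1-\frac{(\theta+\beta)\,\Gamma(\beta,\theta x)+(\theta x)^{\beta}e^{-\theta x}}{(\theta+\beta)\,\Gamma(\beta)};\quad x>0,\ \theta>0,\ \beta>0 \tag{1.5}$$
where $\Gamma(\beta,\theta x)$ is the upper incomplete gamma function defined as
$$\Gamma(\beta,z)=\int_z^{\infty}e^{-y}y^{\beta-1}\,dy;\quad z\ge 0,\ \beta>0 \tag{1.6}$$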
It can be easily shown that Lindley distribution is a particular case of WLD at β=1. Shanker et al.,9 proposed an extension of WLD named a three–parameter weighted Lindley distribution which includes Lindley distribution and WLD as particular cases.
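The special case $\beta=1$ can be confirmed pointwise. The sketch below assumes the standard one-parameter Lindley pdf $\frac{\theta^2}{\theta+1}(1+x)e^{-\theta x}$, which is not written out in this paper; the function names are illustrative.

```python
import math

def wld_pdf(x, theta, beta):
    """Weighted Lindley pdf, equation (1.4)."""
    return (theta ** (beta + 1) / ((theta + beta) * math.gamma(beta))
            * x ** (beta - 1) * (1 + x) * math.exp(-theta * x))

def lindley_pdf(x, theta):
    """One-parameter Lindley pdf (assumed standard form)."""
    return theta ** 2 / (theta + 1) * (1 + x) * math.exp(-theta * x)

# Compare the two densities at a few points with beta = 1.
checks = [(x, wld_pdf(x, 1.5, 1.0), lindley_pdf(x, 1.5)) for x in (0.2, 1.0, 3.0)]
```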
Shanker10 has obtained discrete Poisson–Garima distribution (PGD), discussed its statistical properties, estimation of parameter and applications for count data from biological sciences. Shanker & Shukla11,12 have also proposed size–biased Poisson–Garima distribution (SBPGD) and Zero–truncated Poisson–Garima distribution (ZTPGD), along with estimation of their parameter and applications for data which structurally excludes zero counts.
The rest of the paper is organized as follows. Section 2 introduces the two–parameter weighted Garima distribution (WGD), which includes the one–parameter Garima distribution proposed by Shanker5, along with the shapes of its pdf and cdf. Section 3 deals with statistical constants and associated measures of WGD, including the coefficient of variation, skewness, kurtosis and index of dispersion. Section 4 deals with the reliability properties of WGD, including the hazard rate function, mean residual life function and stochastic ordering. Section 5 deals with the maximum likelihood method for estimating the parameters of the distribution. Section 6 compares the goodness of fit of the proposed distribution on a real lifetime dataset with other one–parameter and two–parameter lifetime distributions. Finally, the conclusions of the paper are presented in Section 7.
The pdf of the weighted Garima distribution (WGD) can be expressed as
$$f_3(x;\theta,\beta)=K\,x^{\beta-1}f_0(x;\theta);\quad x>0,\ \theta>0,\ \beta>0 \tag{2.1}$$
where, K is the normalizing constant and fo(x;θ) is the pdf of Garima distribution given in (1.2). Thus the pdf of WGD can be obtained as
$$f_3(x;\theta,\beta)=\frac{\theta^{\beta}}{\theta+\beta+1}\,\frac{x^{\beta-1}}{\Gamma(\beta)}\,(1+\theta+\theta x)\,e^{-\theta x};\quad x>0,\ \theta>0,\ \beta>0 \tag{2.2}$$
where
$$\Gamma(\beta)=\int_0^{\infty}e^{-y}y^{\beta-1}\,dy;\quad \beta>0$$ is the complete gamma function.
We say that $X$ follows WGD with parameters $\theta$ and $\beta$ if its pdf is given by (2.2), and we denote it by $X\sim\text{WGD}(\theta,\beta)$. It can be easily verified that the Garima distribution and the size–biased Garima distribution (SBGD) are particular cases of WGD at $\beta=1$ and $\beta=2$, respectively. The behavior of the pdf of WGD for different combinations of the parameters $\theta$ and $\beta$ is shown in Figure 1.
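The two special cases and the normalization of (2.2) can be checked with a short sketch. The size–biased Garima pdf is taken here as $x f_1(x;\theta)/\mu_1'$ with the Garima mean $\mu_1'=(\theta+3)/(\theta(\theta+2))$; the function names and the integration cut-off are assumptions of the sketch.

```python
import math

def wgd_pdf(x, theta, beta):
    """WGD pdf, equation (2.2)."""
    return (theta ** beta / ((theta + beta + 1) * math.gamma(beta))
            * x ** (beta - 1) * (1 + theta + theta * x) * math.exp(-theta * x))

def garima_pdf(x, theta):
    """Garima pdf, equation (1.2)."""
    return theta / (theta + 2) * (1 + theta + theta * x) * math.exp(-theta * x)

def size_biased_garima_pdf(x, theta):
    """x * f1(x) / mean, using the Garima mean (theta + 3) / (theta (theta + 2))."""
    mean = (theta + 3) / (theta * (theta + 2))
    return x * garima_pdf(x, theta) / mean

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Total probability mass for one parameter choice; [0, 40] is an assumed cut-off.
total_mass = simpson(lambda t: wgd_pdf(t, 1.5, 2.5), 0.0, 40.0)
```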
The cdf of the WGD can be expressed as
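$$F_3(x;\theta,\beta)=1-\frac{(\theta x)^{\beta}e^{-\theta x}+(\theta+\beta+1)\,\Gamma(\beta,\theta x)}{(\theta+\beta+1)\,\Gamma(\beta)};\quad x>0,\ \theta>0,\ \beta>0 \tag{2.3}$$
where $\Gamma(\beta,\theta x)$ is the upper incomplete gamma function defined in (1.6). The behavior of the cdf of WGD for different combinations of the parameters $\theta$ and $\beta$ is shown in Figure 2.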
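A minimal sketch of evaluating the cdf (2.3) follows. The Python standard library has no incomplete gamma function, so `upper_gamma` approximates (1.6) by Simpson's rule on a truncated range; that helper, its `width` parameter and the function names are assumptions of this sketch (a dedicated special-function routine would normally be preferred).

```python
import math

def upper_gamma(beta, z, width=60.0, n=20000):
    """Approximate Gamma(beta, z) for z > 0 by Simpson's rule on [z, z + width].

    Dropping the tail beyond z + width is an assumption of this sketch.
    """
    h = width / n
    f = lambda y: math.exp(-y) * y ** (beta - 1)
    total = f(z) + f(z + width)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(z + i * h)
    return total * h / 3

def wgd_cdf(x, theta, beta):
    """WGD cdf, equation (2.3)."""
    z = theta * x
    num = z ** beta * math.exp(-z) + (theta + beta + 1) * upper_gamma(beta, z)
    return 1 - num / ((theta + beta + 1) * math.gamma(beta))

def garima_cdf(x, theta):
    """Garima cdf, equation (1.3), the beta = 1 special case."""
    return 1 - (1 + theta * x / (theta + 2)) * math.exp(-theta * x)
```

At $\beta=1$ the two cdfs should coincide, which gives a simple check of the implementation.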
The rth raw moments (moment about origin), μr′,of WGD (2.2) can be derived as
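$$\mu_r'=E(X^r)=\int_0^{\infty}x^r f_3(x;\theta,\beta)\,dx=\frac{\theta^{\beta}}{(\theta+\beta+1)\,\Gamma(\beta)}\int_0^{\infty}x^{\beta+r-1}(1+\theta+\theta x)\,e^{-\theta x}\,dx$$
$$=\frac{(\theta+\beta+r+1)\,\Gamma(\beta+r)}{\theta^{r}(\theta+\beta+1)\,\Gamma(\beta)};\quad r=1,2,3,\ldots \tag{3.1}$$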
Thus the first four raw moments of WGD are obtained as
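$$\mu_1'=\frac{\beta(\theta+\beta+2)}{\theta(\theta+\beta+1)}$$
$$\mu_2'=\frac{\beta(\beta+1)(\theta+\beta+3)}{\theta^2(\theta+\beta+1)}$$
$$\mu_3'=\frac{\beta(\beta+1)(\beta+2)(\theta+\beta+4)}{\theta^3(\theta+\beta+1)}$$
$$\mu_4'=\frac{\beta(\beta+1)(\beta+2)(\beta+3)(\theta+\beta+5)}{\theta^4(\theta+\beta+1)}$$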
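The general formula (3.1) can be evaluated directly and compared with the explicit expressions for the first two raw moments. This is an illustrative sketch; the function name and the parameter values are arbitrary choices.

```python
import math

def wgd_raw_moment(r, theta, beta):
    """r-th raw moment of WGD, equation (3.1)."""
    return ((theta + beta + r + 1) * math.gamma(beta + r)
            / (theta ** r * (theta + beta + 1) * math.gamma(beta)))

theta, beta = 2.0, 3.0
m1 = wgd_raw_moment(1, theta, beta)
m2 = wgd_raw_moment(2, theta, beta)

# Explicit forms of the first two raw moments for comparison.
m1_explicit = beta * (theta + beta + 2) / (theta * (theta + beta + 1))
m2_explicit = beta * (beta + 1) * (theta + beta + 3) / (theta ** 2 * (theta + beta + 1))
```

For $\theta=2$ and $\beta=3$ both routes give $\mu_1'=1.75$ and $\mu_2'=4$.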
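Using the relationship $\mu_r=E(X-\mu_1')^r=\sum_{k=0}^{r}\binom{r}{k}\mu_k'\,(-\mu_1')^{r-k}$ between central moments and raw moments, the central moments of WGD are
$$\mu_2=\frac{\beta\{\theta^2+(2\beta+4)\theta+(\beta^2+3\beta+3)\}}{\theta^2(\theta+\beta+1)^2}$$
$$\mu_3=\frac{2\beta\{\theta^3+(3\beta+6)\theta^2+(3\beta^2+9\beta+9)\theta+(\beta^3+4\beta^2+6\beta+4)\}}{\theta^3(\theta+\beta+1)^3}$$
$$\mu_4=\frac{3\beta\{(\beta+2)\theta^4+(4\beta^2+16\beta+16)\theta^3+(6\beta^3+34\beta^2+58\beta+36)\theta^2+(4\beta^4+28\beta^3+68\beta^2+72\beta+32)\theta+(\beta^5+8\beta^4+25\beta^3+38\beta^2+29\beta+10)\}}{\theta^4(\theta+\beta+1)^4}$$
It can be verified that at $\beta=1$, the moments about the origin and the moments about the mean of WGD reduce to the corresponding moments of the Garima distribution.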
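The binomial relationship between raw and central moments can be applied mechanically and compared with the closed forms for $\mu_2$, $\mu_3$ and $\mu_4$. The sketch below is illustrative; function names and test parameter values are choices made here.

```python
import math

def wgd_raw_moment(r, theta, beta):
    """r-th raw moment of WGD, equation (3.1); r = 0 gives 1."""
    return ((theta + beta + r + 1) * math.gamma(beta + r)
            / (theta ** r * (theta + beta + 1) * math.gamma(beta)))

def central_moment(r, theta, beta):
    """mu_r = sum_k C(r, k) * mu'_k * (-mu'_1)^(r - k)."""
    m1 = wgd_raw_moment(1, theta, beta)
    return sum(math.comb(r, k) * wgd_raw_moment(k, theta, beta) * (-m1) ** (r - k)
               for k in range(r + 1))

def mu2_closed(t, b):
    return b * (t**2 + (2*b + 4)*t + (b**2 + 3*b + 3)) / (t**2 * (t + b + 1)**2)

def mu3_closed(t, b):
    poly = t**3 + (3*b + 6)*t**2 + (3*b**2 + 9*b + 9)*t + (b**3 + 4*b**2 + 6*b + 4)
    return 2 * b * poly / (t**3 * (t + b + 1)**3)

def mu4_closed(t, b):
    poly = ((b + 2)*t**4 + (4*b**2 + 16*b + 16)*t**3
            + (6*b**3 + 34*b**2 + 58*b + 36)*t**2
            + (4*b**4 + 28*b**3 + 68*b**2 + 72*b + 32)*t
            + (b**5 + 8*b**4 + 25*b**3 + 38*b**2 + 29*b + 10))
    return 3 * b * poly / (t**4 * (t + b + 1)**4)
```

For example, at $\theta=2$, $\beta=3$ both routes give $\mu_2=0.9375$ and $\mu_3=0.96875$.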
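The coefficient of variation (C.V.), coefficient of skewness ($\sqrt{\beta_1}$), coefficient of kurtosis ($\beta_2$) and index of dispersion ($\gamma$) of WGD are thus given as
$$C.V.=\frac{\sigma}{\mu_1'}=\frac{\sqrt{\theta^2+(2\beta+4)\theta+(\beta^2+3\beta+3)}}{\sqrt{\beta}\,(\theta+\beta+2)}$$
$$\sqrt{\beta_1}=\frac{\mu_3}{\mu_2^{3/2}}=\frac{2\{\theta^3+(3\beta+6)\theta^2+(3\beta^2+9\beta+9)\theta+(\beta^3+4\beta^2+6\beta+4)\}}{\sqrt{\beta}\,\{\theta^2+(2\beta+4)\theta+(\beta^2+3\beta+3)\}^{3/2}}$$
$$\beta_2=\frac{\mu_4}{\mu_2^{2}}=\frac{3\{(\beta+2)\theta^4+(4\beta^2+16\beta+16)\theta^3+(6\beta^3+34\beta^2+58\beta+36)\theta^2+(4\beta^4+28\beta^3+68\beta^2+72\beta+32)\theta+(\beta^5+8\beta^4+25\beta^3+38\beta^2+29\beta+10)\}}{\beta\,\{\theta^2+(2\beta+4)\theta+(\beta^2+3\beta+3)\}^{2}}$$
$$\gamma=\frac{\sigma^2}{\mu_1'}=\frac{\theta^2+(2\beta+4)\theta+(\beta^2+3\beta+3)}{\theta(\theta+\beta+1)(\theta+\beta+2)}$$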
The behavior of the coefficient of variation (C.V.), coefficient of skewness ($\sqrt{\beta_1}$), coefficient of kurtosis ($\beta_2$) and index of dispersion ($\gamma$) of WGD for different values of $\theta$ and $\beta$ is presented in Tables 1, 2, 3 and 4.
Table 1 shows that for a given $\beta$, the C.V. increases as $\theta$ increases, whereas for a given $\theta$, the C.V. decreases as $\beta$ increases. Table 2 shows that for a given $\theta$ ($\beta$), $\sqrt{\beta_1}$ decreases (increases) as $\beta$ ($\theta$) increases. Table 3 shows that for a given $\theta$ ($\beta$), the coefficient of kurtosis $\beta_2$ decreases (increases) as $\beta$ ($\theta$) increases.
Table 4 shows that for a given $\beta$, the index of dispersion decreases as $\theta$ increases. Similarly, for a given $\theta$, the index of dispersion decreases as $\beta$ increases.
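The tabulated values can be reproduced from the closed-form expressions for the four measures. The sketch below is an illustration; the function name `wgd_measures` is a choice made here, and the check uses the table entries at $\theta=\beta=0.5$.

```python
import math

def wgd_measures(theta, beta):
    """Return (C.V., skewness, kurtosis, index of dispersion) of WGD."""
    t, b = theta, beta
    q = t**2 + (2*b + 4)*t + (b**2 + 3*b + 3)                 # quadratic in mu_2
    c = t**3 + (3*b + 6)*t**2 + (3*b**2 + 9*b + 9)*t + (b**3 + 4*b**2 + 6*b + 4)
    k = ((b + 2)*t**4 + (4*b**2 + 16*b + 16)*t**3
         + (6*b**3 + 34*b**2 + 58*b + 36)*t**2
         + (4*b**4 + 28*b**3 + 68*b**2 + 72*b + 32)*t
         + (b**5 + 8*b**4 + 25*b**3 + 38*b**2 + 29*b + 10))
    cv = math.sqrt(q) / (math.sqrt(b) * (t + b + 2))
    skew = 2 * c / (math.sqrt(b) * q ** 1.5)
    kurt = 3 * k / (b * q ** 2)
    disp = q / (t * (t + b + 1) * (t + b + 2))
    return cv, skew, kurt, disp

cv, skew, kurt, disp = wgd_measures(0.5, 0.5)
```

At $\theta=\beta=0.5$ this gives approximately 1.29099, 2.375430, 11.08 and 2.5, matching the first entries of Tables 1 to 4.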
| β \ θ | 0.5 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0.5 | 1.29099 | 1.32480 | 1.36083 | 1.37870 | 1.38888 | 1.39523 |
| 1 | 0.91473 | 0.93541 | 0.95917 | 0.97183 | 0.97938 | 0.98425 |
| 2 | 0.65263 | 0.66332 | 0.67700 | 0.68512 | 0.69034 | 0.69389 |
| 3 | 0.53783 | 0.54433 | 0.55328 | 0.55902 | 0.56291 | 0.56569 |
| 4 | 0.46948 | 0.47380 | 0.48007 | 0.48432 | 0.48734 | 0.48956 |
| 5 | 0.42269 | 0.42573 | 0.43033 | 0.43359 | 0.43598 | 0.43780 |

Table 1 Behavior of C.V. of WGD for varying values of parameters θ and β
| β \ θ | 0.5 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0.5 | 2.375430 | 2.477646 | 2.599725 | 2.667337 | 2.708762 | 2.736002 |
| 1 | 1.698866 | 1.756288 | 1.831301 | 1.876396 | 1.905555 | 1.925486 |
| 2 | 1.236173 | 1.260866 | 1.298056 | 1.323613 | 1.341710 | 1.354931 |
| 3 | 1.033503 | 1.046136 | 1.067222 | 1.083251 | 1.095447 | 1.104854 |
| 4 | 0.911052 | 0.918239 | 0.931182 | 0.941812 | 0.950380 | 0.957291 |
| 5 | 0.825794 | 0.830208 | 0.838630 | 0.845981 | 0.852189 | 0.857388 |

Table 2 Behavior of skewness (√β1) of WGD for varying values of parameters θ and β
| β \ θ | 0.5 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0.5 | 11.08000 | 11.84586 | 12.8368 | 13.42662 | 13.80492 | 14.06174 |
| 1 | 7.172516 | 7.469388 | 7.888469 | 8.159170 | 8.342689 | 8.472425 |
| 2 | 5.243856 | 5.330579 | 5.471074 | 5.574669 | 5.651707 | 5.710059 |
| 3 | 4.582041 | 4.617188 | 4.680000 | 4.731111 | 4.771968 | 4.804688 |
| 4 | 4.235147 | 4.252066 | 4.284545 | 4.313019 | 4.337119 | 4.357313 |
| 5 | 4.017509 | 4.026635 | 4.045120 | 4.062291 | 4.077505 | 4.090737 |

Table 3 Behavior of kurtosis (β2) of WGD for varying values of parameters θ and β
| β \ θ | 0.5 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0.5 | 2.500000 | 1.228571 | 0.595238 | 0.387205 | 0.284965 | 0.224615 |
| 1 | 2.342857 | 1.166667 | 0.575000 | 0.377778 | 0.279762 | 0.221429 |
| 2 | 2.190476 | 1.100000 | 0.550000 | 0.365079 | 0.272321 | 0.216667 |
| 3 | 2.121212 | 1.066667 | 0.535714 | 0.357143 | 0.267361 | 0.213333 |
| 4 | 2.083916 | 1.047619 | 0.526786 | 0.351852 | 0.263889 | 0.210909 |
| 5 | 2.061538 | 1.035714 | 0.520833 | 0.348148 | 0.261364 | 0.209091 |

Table 4 Behavior of index of dispersion (γ) of WGD for varying values of parameters θ and β
In this section, three important reliability properties of WGD, namely the hazard rate function, mean residual life function and stochastic ordering, are discussed.
a. Hazard rate function
The survival (reliability) function of WGD can be expressed as
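$$S(x;\theta,\beta)=1-F_3(x;\theta,\beta)=\frac{(\theta x)^{\beta}e^{-\theta x}+(\theta+\beta+1)\,\Gamma(\beta,\theta x)}{(\theta+\beta+1)\,\Gamma(\beta)} \tag{4.1.1}$$
The hazard (or failure) rate function $h(x)$ of WGD is thus expressed as
$$h(x)=\frac{f_3(x;\theta,\beta)}{S(x;\theta,\beta)}=\frac{\theta^{\beta}x^{\beta-1}(1+\theta+\theta x)\,e^{-\theta x}}{(\theta x)^{\beta}e^{-\theta x}+(\theta+\beta+1)\,\Gamma(\beta,\theta x)};\quad x>0,\ \theta>0,\ \beta>0 \tag{4.2}$$
The behavior of $h(x)$ of WGD for different combinations of $\theta$ and $\beta$ is shown in Figure 3.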
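As a quick consistency check (added here; it uses only $\Gamma(1,z)=e^{-z}$), setting $\beta=1$ in the hazard rate function of WGD gives

$$h(x)\Big|_{\beta=1}=\frac{\theta(1+\theta+\theta x)\,e^{-\theta x}}{\theta x\,e^{-\theta x}+(\theta+2)\,e^{-\theta x}}=\frac{\theta(1+\theta+\theta x)}{\theta+2+\theta x}$$

which is the ratio $f_1(x;\theta)/\{1-F_1(x;\theta)\}$ obtained from (1.2) and (1.3). In this special case the hazard rate increases from $\theta(1+\theta)/(\theta+2)$ at $x=0$ toward $\theta$ as $x\to\infty$.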
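b. Mean residual life function
The mean residual life function $\mu(x)=E(X-x\mid X>x)$ of WGD can be derived as
$$\mu(x)=\frac{1}{S(x;\theta,\beta)}\int_x^{\infty}t\,f_3(t;\theta,\beta)\,dt-x$$
$$=\frac{(\theta x)^{\beta}(\theta+\beta+2)\,e^{-\theta x}+\{\beta^2+\beta\theta+2\beta-\theta x(\theta+\beta+1)\}\,\Gamma(\beta,\theta x)}{\theta\,\{(\theta x)^{\beta}e^{-\theta x}+(\theta+\beta+1)\,\Gamma(\beta,\theta x)\}}$$
Clearly $\mu(0)=\frac{\beta(\theta+\beta+2)}{\theta(\theta+\beta+1)}=\mu_1'$. The behavior of $\mu(x)$ of WGD for different combinations of $\theta$ and $\beta$ is shown in Figure 4.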
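The closed form of the mean residual life function can be evaluated with a short sketch. As the standard library lacks an incomplete gamma function, `upper_gamma` below approximates (1.6) by Simpson's rule on a truncated range; that helper, its defaults and the function names are assumptions of this sketch.

```python
import math

def upper_gamma(beta, z, width=60.0, n=20000):
    """Approximate Gamma(beta, z) by Simpson's rule on [z, z + width].

    Valid for z > 0, and for z = 0 when beta >= 1; the truncation of the
    upper limit is an assumption of this sketch.
    """
    h = width / n
    f = lambda y: math.exp(-y) * y ** (beta - 1)
    total = f(z) + f(z + width)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(z + i * h)
    return total * h / 3

def wgd_mrl(x, theta, beta):
    """Mean residual life of WGD from the closed-form expression."""
    z = theta * x
    g = upper_gamma(beta, z)
    t = z ** beta * math.exp(-z)
    num = t * (theta + beta + 2) + (beta**2 + beta*theta + 2*beta - z*(theta + beta + 1)) * g
    den = theta * (t + (theta + beta + 1) * g)
    return num / den

mean_check = wgd_mrl(0.0, 2.0, 2.0)    # should equal the mean, 2*6/(2*5) = 1.2
garima_case = wgd_mrl(1.0, 2.0, 1.0)   # beta = 1 gives 7/12 for theta = 2, x = 1
```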
c. Stochastic ordering
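A random variable $X$ is said to be smaller than a random variable $Y$ in the stochastic order ($X\le_{st}Y$) if $F_X(x)\ge F_Y(x)$ for all $x$; in the hazard rate order ($X\le_{hr}Y$) if $h_X(x)\ge h_Y(x)$ for all $x$; in the mean residual life order ($X\le_{mrl}Y$) if $m_X(x)\le m_Y(x)$ for all $x$; and in the likelihood ratio order ($X\le_{lr}Y$) if $f_X(x)/f_Y(x)$ decreases in $x$.
The following implications due to Shaked & Shanthikumar13 are well known for establishing stochastic ordering of distributions.
$$X\le_{lr}Y\ \Rightarrow\ X\le_{hr}Y\ \Rightarrow\ X\le_{mrl}Y,\qquad X\le_{hr}Y\ \Rightarrow\ X\le_{st}Y$$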
The WGD is ordered with respect to the strongest ‘likelihood ratio’ ordering as established in the following theorem.
Theorem
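Let $X\sim\text{WGD}(\theta_1,\beta_1)$ and $Y\sim\text{WGD}(\theta_2,\beta_2)$. If $\theta_1>\theta_2$ and $\beta_1=\beta_2$ (or $\beta_1<\beta_2$ and $\theta_1=\theta_2$), then $X\le_{lr}Y$ and thus $X\le_{hr}Y$, $X\le_{mrl}Y$ and $X\le_{st}Y$.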
Proof
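We have
$$\frac{f_X(x;\theta_1,\beta_1)}{f_Y(x;\theta_2,\beta_2)}=\frac{\theta_1^{\beta_1}(\theta_2+\beta_2+1)\,\Gamma(\beta_2)}{\theta_2^{\beta_2}(\theta_1+\beta_1+1)\,\Gamma(\beta_1)}\;x^{\beta_1-\beta_2}\left(\frac{1+\theta_1+\theta_1x}{1+\theta_2+\theta_2x}\right)e^{-(\theta_1-\theta_2)x}$$
Now $$\ln\frac{f_X(x;\theta_1,\beta_1)}{f_Y(x;\theta_2,\beta_2)}=\ln\left(\frac{\theta_1^{\beta_1}(\theta_2+\beta_2+1)\,\Gamma(\beta_2)}{\theta_2^{\beta_2}(\theta_1+\beta_1+1)\,\Gamma(\beta_1)}\right)+(\beta_1-\beta_2)\ln x+\ln\left(\frac{1+\theta_1+\theta_1x}{1+\theta_2+\theta_2x}\right)-(\theta_1-\theta_2)x$$
This gives
$$\frac{d}{dx}\ln\frac{f_X(x;\theta_1,\beta_1)}{f_Y(x;\theta_2,\beta_2)}=\frac{\beta_1-\beta_2}{x}+\frac{\theta_1-\theta_2}{(1+\theta_1+\theta_1x)(1+\theta_2+\theta_2x)}-(\theta_1-\theta_2).$$
Since $(1+\theta_1+\theta_1x)(1+\theta_2+\theta_2x)>1$ for $x>0$, the last two terms sum to a negative quantity whenever $\theta_1>\theta_2$. Thus, for $\theta_1>\theta_2$ and $\beta_1=\beta_2$ (or $\beta_1<\beta_2$ and $\theta_1=\theta_2$), $\frac{d}{dx}\ln\frac{f_X(x;\theta_1,\beta_1)}{f_Y(x;\theta_2,\beta_2)}<0$. This shows that $X\le_{lr}Y$ and thus $X\le_{hr}Y$, $X\le_{mrl}Y$ and $X\le_{st}Y$. This shows the flexibility of WGD over the Garima distribution.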
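The sign of the derivative of the log likelihood ratio can be spot-checked numerically on a grid. This is only an illustration of the theorem for two arbitrary parameter choices, not a substitute for the proof; the function name and grid are assumptions of the sketch.

```python
def dlog_ratio(x, t1, b1, t2, b2):
    """d/dx of ln(f_X / f_Y) for X ~ WGD(t1, b1), Y ~ WGD(t2, b2)."""
    return ((b1 - b2) / x
            + (t1 - t2) / ((1 + t1 + t1 * x) * (1 + t2 + t2 * x))
            - (t1 - t2))

grid = [0.01 * i for i in range(1, 2001)]   # x in (0, 20]

# Case 1: theta1 > theta2 with beta1 = beta2.
case_theta = all(dlog_ratio(x, 3.0, 2.0, 1.0, 2.0) < 0 for x in grid)
# Case 2: beta1 < beta2 with theta1 = theta2.
case_beta = all(dlog_ratio(x, 1.5, 1.0, 1.5, 4.0) < 0 for x in grid)
```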
Let $(x_1,x_2,\ldots,x_n)$ be a random sample of size $n$ from WGD (2.2). The natural log likelihood function can be expressed as
$$\ln L=n\left[\beta\ln\theta-\ln(\theta+\beta+1)-\ln\Gamma(\beta)\right]+(\beta-1)\sum_{i=1}^{n}\ln x_i+\sum_{i=1}^{n}\ln(1+\theta+\theta x_i)-n\,\theta\,\bar{x}$$
The maximum likelihood estimates $(\hat\theta,\hat\beta)$ of $(\theta,\beta)$ are the solutions of the following log likelihood equations.
$$\frac{\partial\ln L}{\partial\theta}=\frac{n\beta}{\theta}-\frac{n}{\theta+\beta+1}+\sum_{i=1}^{n}\frac{1+x_i}{1+\theta+\theta x_i}-n\,\bar{x}=0$$
$$\frac{\partial\ln L}{\partial\beta}=n\ln\theta-\frac{n}{\theta+\beta+1}-n\,\psi(\beta)+\sum_{i=1}^{n}\ln x_i=0$$
where $\bar{x}$ is the sample mean and $\psi(\beta)=\frac{d}{d\beta}\ln\Gamma(\beta)$ is the digamma function. Since these two log likelihood equations do not have closed-form solutions, they cannot be solved analytically. However, the MLEs $(\hat\theta,\hat\beta)$ of $(\theta,\beta)$ can be computed by solving the log likelihood equations numerically using the Newton–Raphson method available in R–software, iterating until sufficiently close estimates of $\hat\theta$ and $\hat\beta$ are obtained. In this paper, the starting values of the parameters are $\theta=0.5$ and $\beta=1.5$.
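The log likelihood and its two score equations translate directly into code. The sketch below is in Python rather than the R software used in the paper, and it stops short of the Newton–Raphson iteration: it only provides the objective and gradient that such a solver would use. The digamma function is approximated by a central difference of `math.lgamma`, and the sample `xs` is hypothetical; both are assumptions of this sketch.

```python
import math

def wgd_loglik(theta, beta, xs):
    """Natural log likelihood of WGD for the sample xs."""
    n = len(xs)
    return (n * (beta * math.log(theta) - math.log(theta + beta + 1) - math.lgamma(beta))
            + (beta - 1) * sum(math.log(x) for x in xs)
            + sum(math.log(1 + theta + theta * x) for x in xs)
            - theta * sum(xs))          # n * theta * xbar

def digamma_fd(b, h=1e-5):
    """Finite-difference stand-in for psi(b) = d/db ln Gamma(b)."""
    return (math.lgamma(b + h) - math.lgamma(b - h)) / (2 * h)

def wgd_score(theta, beta, xs):
    """Partial derivatives of ln L with respect to theta and beta."""
    n = len(xs)
    d_theta = (n * beta / theta - n / (theta + beta + 1)
               + sum((1 + x) / (1 + theta + theta * x) for x in xs)
               - sum(xs))
    d_beta = (n * math.log(theta) - n / (theta + beta + 1)
              - n * digamma_fd(beta)
              + sum(math.log(x) for x in xs))
    return d_theta, d_beta

xs = [1.3, 2.1, 2.4, 2.9, 3.5]          # hypothetical sample for illustration
score_at_start = wgd_score(0.5, 1.5, xs)
```

A root-finder or optimizer applied to `wgd_score` (or to the negative of `wgd_loglik`) from the starting values $\theta=0.5$, $\beta=1.5$ would then play the role of the Newton–Raphson step described above.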
In this section, a real lifetime dataset is considered to assess the goodness of fit of WGD relative to one–parameter and two–parameter lifetime distributions. The data are the tensile strengths, measured in GPa, of 69 carbon fibers tested under tension at gauge lengths of 20 mm, reported by Bader & Priest.14
1.312 1.314 1.479 1.552 1.700 1.803 1.861 1.865 1.944 1.958 1.966 1.997 2.006 2.021 2.027 2.055 2.063 2.098 2.140 2.179 2.224 2.240 2.253 2.270 2.272 2.274 2.301 2.301 2.359 2.382 2.382 2.426 2.434 2.435 2.478 2.490 2.511 2.514 2.535 2.554 2.566 2.570 2.586 2.629 2.633 2.642 2.648 2.684 2.697 2.726 2.770 2.773 2.800 2.809 2.818 2.821 2.848 2.880 2.954 3.012 3.067 3.084 3.090 3.096 3.128 3.233 3.433 3.585 3.585
The goodness of fit of WGD has been compared with the one–parameter exponential, Lindley and Garima distributions and with the two–parameter Gompertz distribution, generalized exponential distribution (GED) introduced by Gupta & Kundu15, lognormal distribution and WLD. The pdf and cdf of the WLD, lognormal, GED and Gompertz distributions are presented in Table 5. The ML estimates of the parameters, $-2\ln L$, Akaike Information Criterion (AIC), K–S statistic and p–value of the distributions for the considered dataset are presented in Table 6. The AIC and K–S statistic are calculated using the formulae $AIC=-2\ln L+2k$ and $K\text{-}S=\sup_x\left|F_n(x)-F_0(x)\right|$, where $k$ is the number of parameters, $n$ is the sample size, $F_n(x)$ is the empirical (sample) cumulative distribution function, and $F_0(x)$ is the theoretical cumulative distribution function. The best distribution is the one with the lowest values of $-2\ln L$, AIC and K–S statistic.
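The two criteria can be computed as follows. This is a minimal sketch of the stated formulae; the function names are illustrative, and the K–S example uses a small made-up sample with a uniform cdf rather than the carbon fiber data.

```python
def aic(neg2_loglik, k):
    """AIC = -2 ln L + 2k, where k is the number of parameters."""
    return neg2_loglik + 2 * k

def ks_statistic(sample, cdf):
    """sup_x |F_n(x) - F_0(x)| for a continuous theoretical cdf F_0.

    The supremum is attained at a data point, just before or just after
    the jump of the empirical cdf, so both sides are checked.
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d

wgd_aic = aic(101.94, 2)                                   # -2 ln L of WGD in Table 6
d_uniform = ks_statistic([0.7, 0.1, 0.4], lambda x: x)     # toy example on [0, 1]
```

With $-2\ln L=101.94$ and $k=2$ the formula gives an AIC of 105.94, the value reported for WGD in Table 6.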
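Table 6 shows that WGD attains the lowest values of $-2\ln L$ (101.94), AIC (105.94) and K–S statistic (0.057) among the fitted distributions, marginally ahead of WLD, and therefore competes well with the two–parameter lifetime distributions and gives a quite satisfactory fit.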
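| Distribution | pdf | cdf |
|---|---|---|
| WLD | $f(x;\theta,\beta)=\frac{\theta^{\beta+1}}{\theta+\beta}\,\frac{x^{\beta-1}}{\Gamma(\beta)}\,(1+x)\,e^{-\theta x}$ | $F(x;\theta,\beta)=1-\frac{(\theta+\beta)\,\Gamma(\beta,\theta x)+(\theta x)^{\beta}e^{-\theta x}}{(\theta+\beta)\,\Gamma(\beta)}$ |
| Lognormal | $f(x;\theta,\beta)=\frac{1}{\sqrt{2\pi}\,\beta x}\,e^{-\frac{1}{2}\left(\frac{\log x-\theta}{\beta}\right)^{2}}$ | $F(x;\theta,\beta)=\Phi\left(\frac{\log x-\theta}{\beta}\right)$ |
| GED | $f(x;\theta,\beta)=\theta\beta\left(1-e^{-\theta x}\right)^{\beta-1}e^{-\theta x}$ | $F(x;\theta,\beta)=\left(1-e^{-\theta x}\right)^{\beta}$ |
| Gompertz | $f(x;\theta,\beta)=\theta\,e^{\beta x-\frac{\theta}{\beta}\left(e^{\beta x}-1\right)}$ | $F(x;\theta,\beta)=1-e^{-\frac{\theta}{\beta}\left(e^{\beta x}-1\right)}$ |

Table 5 The pdf and the cdf of fitted distributions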
| Distribution | ML estimate $\hat\theta$ | ML estimate $\hat\beta$ | −2lnL | AIC | K-S | p-value |
|---|---|---|---|---|---|---|
| WGD | 9.3798 | 22.3473 | 101.94 | 105.94 | 0.057 | 0.979 |
| WLD | 9.6265 | 22.8938 | 101.95 | 105.95 | 0.059 | 0.973 |
| Lognormal | 0.8751 | 0.2124 | 102.72 | 106.73 | 0.103 | 0.713 |
| GED | 2.0331 | 87.2847 | 109.24 | 113.24 | 0.087 | 0.613 |
| Gompertz | 0.0080 | 2.0420 | 107.25 | 111.250 | 0.085 | 0.673 |
| Garima | 0.5863 | - | 251.33 | 253.33 | 0.4381 | 0.000 |
| Lindley | 0.0702 | - | 238.38 | 240.38 | 0.401 | 0.000 |
| Exponential | 0.4079 | - | 261.73 | 263.73 | 0.448 | 0.000 |

Table 6 MLEs, −2lnL, AIC, K-S statistics and p-values of the fitted distributions
A two–parameter weighted Garima distribution (WGD), which includes the one–parameter Garima distribution proposed by Shanker5, has been suggested for modeling lifetime data from engineering. Its statistical properties, including the shapes of the probability density function for different combinations of parameters, coefficient of variation, skewness, kurtosis and index of dispersion, have been explained. Reliability measures, including the hazard rate function, mean residual life function and stochastic ordering, have been studied. Estimation of the parameters has been discussed using the method of maximum likelihood. The goodness of fit of WGD has been examined on a real lifetime dataset, and the fit has been found to be quite satisfactory compared with the exponential, Lindley, Garima, Gompertz, generalized exponential, lognormal and weighted Lindley distributions.
The authors are grateful to the editor–in–chief of the journal and the anonymous reviewer for fruitful comments on the paper.
The authors declare that there is no conflict of interest regarding this manuscript.
©2018 Eyob, et al. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work non-commercially.