eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 7 Issue 3

A two–parameter weighted Garima distribution with properties and application

Tesfalem Eyob, Rama Shanker

Department of Statistics, Eritrea Institute of Technology, Eritrea

Correspondence: Rama Shanker, Department of Statistics, Eritrea Institute of Technology, Eritrea

Received: May 03, 2018 | Published: June 13, 2018

Citation: Eyob T, Shanker R. A two–parameter weighted Garima distribution with properties and application. Biom Biostat Int J. 2018;7(3):234-242. DOI: 10.15406/bbij.2018.07.00214


Abstract

In this paper, a two–parameter weighted Garima distribution, which includes the one–parameter Garima distribution as a particular case, is proposed for modeling real lifetime data. Its statistical properties, including the shapes of the probability density function, moments and moment–related measures, hazard rate function, mean residual life function and stochastic orderings, are discussed. The parameters are estimated by the method of maximum likelihood, and an application of the proposed distribution to a real dataset is presented.

Keywords: Garima distribution, moments, hazard rate function, mean residual life function, stochastic ordering, maximum likelihood estimation, goodness of fit

Introduction

Fisher1 first introduced the idea of weighted distributions to model ascertainment biases; Rao2 later reformulated it in a unifying theory for problems where the observations fall in non–experimental, non–replicated and non–random categories. When a researcher collects observations in nature according to a certain stochastic model, the collected observations will not follow the original distribution unless every observation has an equal chance of being included. Suppose the original observation $x_0$ comes from a population having probability density function (pdf) $f_o(x;\theta_1)$, where $\theta_1$ may be a parameter vector, and the observation $x$ is collected according to a probability re–weighted with a weight function $w(x;\theta_2)>0$, $\theta_2$ being a new parameter vector. Then $x$ comes from a population having pdf

$$f(x;\theta_1,\theta_2)=k\,w(x;\theta_2)\,f_o(x;\theta_1) \qquad (1.1)$$

where $k$ is a normalizing constant. Such distributions are known as weighted distributions. A weighted distribution having weight function $w(x;\theta_2)=x$ is called a length–biased distribution. General probability models leading to weighted probability distributions, their applications, and the occurrence of $\theta$ in sampling problems have been discussed extensively by Patil & Rao.3,4

Shanker5 proposed a lifetime distribution, named the Garima distribution, for modeling behavioral science data. It is defined by its pdf and cumulative distribution function (cdf)

$$f_1(x;\theta)=\frac{\theta}{\theta+2}\,(1+\theta+\theta x)\,e^{-\theta x};\quad x>0,\ \theta>0 \qquad (1.2)$$

$$F_1(x;\theta)=1-\left[1+\frac{\theta x}{\theta+2}\right]e^{-\theta x};\quad x>0,\ \theta>0 \qquad (1.3)$$

The first four raw moments (moments about the origin) of the Garima distribution obtained by Shanker5 are

$$\mu_1'=\frac{\theta+3}{\theta(\theta+2)},\quad \mu_2'=\frac{2(\theta+4)}{\theta^2(\theta+2)},\quad \mu_3'=\frac{6(\theta+5)}{\theta^3(\theta+2)},\quad \mu_4'=\frac{24(\theta+6)}{\theta^4(\theta+2)}$$

The central moments (moments about the mean) of the Garima distribution obtained by Shanker5 are given by

$$\mu_2=\frac{\theta^2+6\theta+7}{\theta^2(\theta+2)^2}$$

$$\mu_3=\frac{2(\theta^3+9\theta^2+21\theta+15)}{\theta^3(\theta+2)^3}$$

$$\mu_4=\frac{3(3\theta^4+36\theta^3+134\theta^2+204\theta+111)}{\theta^4(\theta+2)^4}$$

Statistical properties of the Garima distribution, including its shapes for different values of the parameter, hazard rate function, mean residual life function, stochastic ordering, mean deviations, order statistics, Bonferroni and Lorenz curves, Renyi entropy measure and stress–strength reliability, have been discussed in Shanker.5 Estimation of the parameter using both the method of moments and the method of maximum likelihood, along with an application of the Garima distribution, has been explained in Shanker.5 The suitability and superiority of the Garima distribution for modeling behavioral science data over the exponential distribution and the Lindley distribution, introduced by Lindley,6 have also been discussed by Shanker.5 Ghitany et al.,7 studied in detail the statistical properties, estimation of the parameter and application of the Lindley distribution to model waiting times in a bank, and established that the Lindley distribution gives a much closer fit than the exponential distribution. Further, Ghitany et al.,8 introduced a two–parameter weighted Lindley distribution (WLD) defined by its pdf and cdf

$$f_2(x;\theta,\beta)=\frac{\theta^{\beta+1}}{(\theta+\beta)\,\Gamma(\beta)}\,x^{\beta-1}(1+x)\,e^{-\theta x};\quad x>0,\ \theta>0,\ \beta>0 \qquad (1.4)$$

$$F_2(x;\theta,\beta)=1-\frac{(\theta+\beta)\,\Gamma(\beta,\theta x)+(\theta x)^{\beta}e^{-\theta x}}{(\theta+\beta)\,\Gamma(\beta)};\quad x>0,\ \theta>0,\ \beta>0 \qquad (1.5)$$

where $\Gamma(\beta,\theta x)$ is the upper incomplete gamma function defined as

$$\Gamma(\beta,z)=\int_z^{\infty}e^{-y}y^{\beta-1}\,dy;\quad y\ge 0,\ \beta>0 \qquad (1.6)$$

It can be easily shown that the Lindley distribution is a particular case of WLD at $\beta=1$. Shanker et al.,9 proposed an extension of WLD, a three–parameter weighted Lindley distribution, which includes the Lindley distribution and WLD as particular cases.

Shanker10 obtained the discrete Poisson–Garima distribution (PGD) and discussed its statistical properties, estimation of its parameter and applications to count data from the biological sciences. Shanker & Shukla11,12 also proposed the size–biased Poisson–Garima distribution (SBPGD) and the zero–truncated Poisson–Garima distribution (ZTPGD), along with estimation of their parameters and applications to data which structurally exclude zero counts.

The rest of the paper is organized as follows. Section 2 introduces the two–parameter weighted Garima distribution (WGD), which includes the one–parameter Garima distribution proposed by Shanker5, along with the shapes of its pdf and cdf. Section 3 deals with the statistical constants and associated measures of WGD, including the coefficient of variation, skewness, kurtosis and index of dispersion. Section 4 deals with the reliability properties of WGD, including the hazard rate function, mean residual life function and stochastic ordering. Section 5 deals with the maximum likelihood method for estimating the parameters of the distribution. Section 6 compares the goodness of fit of the proposed distribution on a real lifetime dataset with that of other one–parameter and two–parameter lifetime distributions. Finally, the conclusions of the paper are presented in Section 7.

Weighted Garima distribution

The pdf of the weighted Garima distribution (WGD) can be expressed as

$$f_3(x;\theta,\beta)=K\,x^{\beta-1}f_o(x;\theta);\quad x>0,\ \theta>0,\ \beta>0 \qquad (2.1)$$

where $K$ is the normalizing constant and $f_o(x;\theta)$ is the pdf of the Garima distribution given in (1.2). Thus the pdf of WGD can be obtained as

$$f_3(x;\theta,\beta)=\frac{\theta^{\beta}}{\theta+\beta+1}\,\frac{x^{\beta-1}}{\Gamma(\beta)}\,(1+\theta+\theta x)\,e^{-\theta x};\quad x>0,\ \theta>0,\ \beta>0 \qquad (2.2)$$

where

$$\Gamma(\beta)=\int_0^{\infty}e^{-y}y^{\beta-1}\,dy;\quad y>0,\ \beta>0$$

is the complete gamma function.

We say that $X$ follows WGD with parameters $\theta$ and $\beta$ if its pdf is given by (2.2), and we denote it by $X\sim\text{WGD}(\theta,\beta)$. It can be easily verified that the Garima distribution and the size–biased Garima distribution (SBGD) are particular cases of WGD at $\beta=1$ and $\beta=2$, respectively. The behavior of the pdf of WGD for different combinations of the parameters $\theta$ and $\beta$ is shown in Figure 1.

Figure 1 Behavior of the pdf of WGD for different combinations of the parameters θ and β.
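As an illustrative check of (2.1) and (2.2), the following Python sketch weights the Garima pdf by $x^{\beta-1}$, recovers the normalizing constant $K$ by numerical integration, and compares it with the closed form obtained by matching (2.1) with (2.2). The function names, the midpoint rule and the parameter values are assumptions made for illustration only; they are not taken from the paper.

```python
import math

def garima_pdf(x, theta):
    """Garima density (1.2)."""
    return theta / (theta + 2.0) * (1.0 + theta + theta * x) * math.exp(-theta * x)

def wgd_pdf(x, theta, beta):
    """Weighted Garima density (2.2) for x > 0, with the constant on the log scale."""
    log_const = beta * math.log(theta) - math.log(theta + beta + 1.0) - math.lgamma(beta)
    kernel = math.exp(log_const + (beta - 1.0) * math.log(x) - theta * x)
    return kernel * (1.0 + theta + theta * x)

def midpoint_integral(f, a, b, n=40000):
    """Composite midpoint rule; crude, but adequate for a sanity check."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

theta, beta = 2.0, 3.0  # illustrative values, not estimates from the paper

# K in (2.1) is the reciprocal of the integral of x^(beta-1) times the Garima pdf.
k_numeric = 1.0 / midpoint_integral(
    lambda x: x ** (beta - 1.0) * garima_pdf(x, theta), 0.0, 40.0)

# Closed form implied by comparing (2.1) with (2.2).
k_closed = theta ** (beta - 1.0) * (theta + 2.0) / ((theta + beta + 1.0) * math.gamma(beta))

# The WGD density should integrate to one.
total_mass = midpoint_integral(lambda x: wgd_pdf(x, theta, beta), 0.0, 40.0)

print(k_numeric, k_closed, total_mass)
```

Setting `beta = 1.0` in `wgd_pdf` gives the same values as `garima_pdf`, which mirrors the particular case noted above.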

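The cdf of the WGD can be expressed as

$$F_3(x;\theta,\beta)=1-\frac{(\theta x)^{\beta}e^{-\theta x}+(\theta+\beta+1)\,\Gamma(\beta,\theta x)}{(\theta+\beta+1)\,\Gamma(\beta)};\quad x>0,\ \theta>0,\ \beta>0 \qquad (2.3)$$

where $\Gamma(\beta,\theta x)$ is the upper incomplete gamma function defined in (1.6). The behavior of the cdf of the WGD for different combinations of the parameters $\theta$ and $\beta$ is shown in Figure 2.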

Figure 2 Behavior of the cdf of WGD for different combinations of the parameters θ and β.
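The cdf (2.3) can be evaluated with only the Python standard library if the incomplete gamma function is computed from a power series. The sketch below does this and compares the result with a numerical integral of the pdf (2.2). The helper names, the series-based approximation (suitable for moderate values of $\theta x$ only) and the parameter values are assumptions made for illustration.

```python
import math

def reg_lower_gamma(a, z, tol=1e-16, max_terms=10000):
    """Regularized lower incomplete gamma P(a, z) from its power series.

    P(a, z) = z^a e^(-z) * sum_{n>=0} z^n / Gamma(a + n + 1); moderate z only.
    """
    if z <= 0.0:
        return 0.0
    term = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    total = term
    for n in range(1, max_terms):
        term *= z / (a + n)
        total += term
        if term < tol * total:
            break
    return min(total, 1.0)

def wgd_cdf(x, theta, beta):
    """cdf (2.3), using Gamma(beta, z) / Gamma(beta) = 1 - P(beta, z)."""
    if x <= 0.0:
        return 0.0
    z = theta * x
    edge = math.exp(beta * math.log(z) - z - math.lgamma(beta)) / (theta + beta + 1.0)
    return reg_lower_gamma(beta, z) - edge

def wgd_pdf(x, theta, beta):
    """pdf (2.2), used here only to cross-check the cdf."""
    log_const = beta * math.log(theta) - math.log(theta + beta + 1.0) - math.lgamma(beta)
    kernel = math.exp(log_const + (beta - 1.0) * math.log(x) - theta * x)
    return kernel * (1.0 + theta + theta * x)

def garima_cdf(x, theta):
    """Garima cdf (1.3)."""
    return 1.0 - (1.0 + theta * x / (theta + 2.0)) * math.exp(-theta * x)

theta, beta, x0 = 2.0, 3.0, 1.2  # illustrative values
n = 12000
h = x0 / n
integral = h * sum(wgd_pdf((i + 0.5) * h, theta, beta) for i in range(n))
print(wgd_cdf(x0, theta, beta), integral)
```

At `beta = 1.0` the function `wgd_cdf` agrees with `garima_cdf`, as expected from the particular case of WGD.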

Statistical constants and related measures

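The $r$th raw moment (moment about the origin), $\mu_r'$, of WGD (2.2) can be derived as

$$\mu_r'=E(X^r)=\int_0^{\infty}x^r f_3(x;\theta,\beta)\,dx=\int_0^{\infty}x^r\,\frac{\theta^{\beta}}{\theta+\beta+1}\,\frac{x^{\beta-1}}{\Gamma(\beta)}\,(1+\theta+\theta x)\,e^{-\theta x}\,dx$$

$$=\frac{(\theta+\beta+r+1)\,\Gamma(\beta+r)}{\theta^{r}(\theta+\beta+1)\,\Gamma(\beta)};\quad r=1,2,3,\ldots \qquad (3.1)$$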

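Thus the first four raw moments of WGD are obtained as

$$\mu_1'=\frac{\beta(\theta+\beta+2)}{\theta(\theta+\beta+1)}$$

$$\mu_2'=\frac{\beta(\beta+1)(\theta+\beta+3)}{\theta^2(\theta+\beta+1)}$$

$$\mu_3'=\frac{\beta(\beta+1)(\beta+2)(\theta+\beta+4)}{\theta^3(\theta+\beta+1)}$$

$$\mu_4'=\frac{\beta(\beta+1)(\beta+2)(\beta+3)(\theta+\beta+5)}{\theta^4(\theta+\beta+1)}$$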
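The closed form (3.1) can be checked numerically. The sketch below, which assumes illustrative parameter values and uses hypothetical function names, compares the formula with a midpoint-rule integral of $x^r f_3(x;\theta,\beta)$ for $r=1,\ldots,4$.

```python
import math

def wgd_pdf(x, theta, beta):
    """pdf (2.2) of the weighted Garima distribution."""
    log_const = beta * math.log(theta) - math.log(theta + beta + 1.0) - math.lgamma(beta)
    kernel = math.exp(log_const + (beta - 1.0) * math.log(x) - theta * x)
    return kernel * (1.0 + theta + theta * x)

def wgd_raw_moment(r, theta, beta):
    """r-th raw moment from the closed form (3.1)."""
    gamma_ratio = math.exp(math.lgamma(beta + r) - math.lgamma(beta))
    return (theta + beta + r + 1.0) * gamma_ratio / (theta ** r * (theta + beta + 1.0))

def numeric_raw_moment(r, theta, beta, upper=60.0, n=60000):
    """Midpoint-rule approximation of the integral of x^r * pdf over (0, upper)."""
    h = upper / n
    return h * sum(((i + 0.5) * h) ** r * wgd_pdf((i + 0.5) * h, theta, beta)
                   for i in range(n))

theta, beta = 2.0, 3.0  # illustrative values only
for r in (1, 2, 3, 4):
    print(r, wgd_raw_moment(r, theta, beta), numeric_raw_moment(r, theta, beta))
```

For these values the first moment is $\beta(\theta+\beta+2)/\{\theta(\theta+\beta+1)\}=21/12=1.75$.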

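Using the relationship $\mu_r=E(X-\mu_1')^r=\sum_{k=0}^{r}\binom{r}{k}\mu_k'(-\mu_1')^{r-k}$ between central moments and raw moments, the central moments of WGD are

$$\mu_2=\frac{\beta\{\theta^2+(2\beta+4)\theta+(\beta^2+3\beta+3)\}}{\theta^2(\theta+\beta+1)^2}$$

$$\mu_3=\frac{2\beta\{\theta^3+(3\beta+6)\theta^2+(3\beta^2+9\beta+9)\theta+(\beta^3+4\beta^2+6\beta+4)\}}{\theta^3(\theta+\beta+1)^3}$$

$$\mu_4=\frac{3\beta\{(\beta+2)\theta^4+(4\beta^2+16\beta+16)\theta^3+(6\beta^3+34\beta^2+58\beta+36)\theta^2+(4\beta^4+28\beta^3+68\beta^2+72\beta+32)\theta+(\beta^5+8\beta^4+25\beta^3+38\beta^2+29\beta+10)\}}{\theta^4(\theta+\beta+1)^4}$$

It can be verified that at $\beta=1$, the moments about the origin and the moments about the mean of WGD reduce to the corresponding moments of the Garima distribution.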

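The coefficient of variation (C.V.), coefficient of skewness ($\sqrt{\beta_1}$), coefficient of kurtosis ($\beta_2$) and index of dispersion ($\gamma$) of WGD are thus given as

$$\text{C.V.}=\frac{\sigma}{\mu_1'}=\frac{\sqrt{\theta^2+(2\beta+4)\theta+(\beta^2+3\beta+3)}}{\sqrt{\beta}\,(\theta+\beta+2)}$$

$$\sqrt{\beta_1}=\frac{\mu_3}{\mu_2^{3/2}}=\frac{2\{\theta^3+(3\beta+6)\theta^2+(3\beta^2+9\beta+9)\theta+(\beta^3+4\beta^2+6\beta+4)\}}{\sqrt{\beta}\,\{\theta^2+(2\beta+4)\theta+(\beta^2+3\beta+3)\}^{3/2}}$$

$$\beta_2=\frac{\mu_4}{\mu_2^{2}}=\frac{3\{(\beta+2)\theta^4+(4\beta^2+16\beta+16)\theta^3+(6\beta^3+34\beta^2+58\beta+36)\theta^2+(4\beta^4+28\beta^3+68\beta^2+72\beta+32)\theta+(\beta^5+8\beta^4+25\beta^3+38\beta^2+29\beta+10)\}}{\beta\,\{\theta^2+(2\beta+4)\theta+(\beta^2+3\beta+3)\}^{2}}$$

$$\gamma=\frac{\sigma^2}{\mu_1'}=\frac{\theta^2+(2\beta+4)\theta+(\beta^2+3\beta+3)}{\theta(\theta+\beta+1)(\theta+\beta+2)}$$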

The behaviors of the coefficient of variation (C.V.), coefficient of skewness ($\sqrt{\beta_1}$), coefficient of kurtosis ($\beta_2$) and index of dispersion ($\gamma$) of WGD for different values of $\theta$ and $\beta$ are presented in Tables 1, 2, 3 and 4.

Table 1 shows that for a given $\beta$, the C.V. increases as $\theta$ increases, whereas for a given $\theta$, the C.V. decreases as $\beta$ increases. Table 2 shows that for a given $\theta$ ($\beta$), $\sqrt{\beta_1}$ decreases (increases) as $\beta$ ($\theta$) increases. Table 3 shows that for a given $\theta$ ($\beta$), the coefficient of kurtosis ($\beta_2$) decreases (increases) as $\beta$ ($\theta$) increases.

Table 4 shows that for a given $\beta$, the index of dispersion decreases as $\theta$ increases. Similarly, for a given $\theta$, the index of dispersion decreases as $\beta$ increases.

      
| β \ θ | 0.5 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0.5 | 1.29099 | 1.32480 | 1.36083 | 1.37870 | 1.38888 | 1.39523 |
| 1 | 0.91473 | 0.93541 | 0.95917 | 0.97183 | 0.97938 | 0.98425 |
| 2 | 0.65263 | 0.66332 | 0.67700 | 0.68512 | 0.69034 | 0.69389 |
| 3 | 0.53783 | 0.54433 | 0.55328 | 0.55902 | 0.56291 | 0.56569 |
| 4 | 0.46948 | 0.47380 | 0.48007 | 0.48432 | 0.48734 | 0.48956 |
| 5 | 0.42269 | 0.42573 | 0.43033 | 0.43359 | 0.43598 | 0.43780 |

Table 1 Behavior of C.V. of WGD for varying values of parameters θ and β (rows: β, columns: θ)

| β \ θ | 0.5 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0.5 | 2.375430 | 2.477646 | 2.599725 | 2.667337 | 2.708762 | 2.736002 |
| 1 | 1.698866 | 1.756288 | 1.831301 | 1.876396 | 1.905555 | 1.925486 |
| 2 | 1.236173 | 1.260866 | 1.298056 | 1.323613 | 1.341710 | 1.354931 |
| 3 | 1.033503 | 1.046136 | 1.067222 | 1.083251 | 1.095447 | 1.104854 |
| 4 | 0.911052 | 0.918239 | 0.931182 | 0.941812 | 0.950380 | 0.957291 |
| 5 | 0.825794 | 0.830208 | 0.838630 | 0.845981 | 0.852189 | 0.857388 |

Table 2 Behavior of skewness ($\sqrt{\beta_1}$) of WGD for varying values of parameters θ and β (rows: β, columns: θ)

| β \ θ | 0.5 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0.5 | 11.08000 | 11.84586 | 12.8368 | 13.42662 | 13.80492 | 14.06174 |
| 1 | 7.172516 | 7.469388 | 7.888469 | 8.159170 | 8.342689 | 8.472425 |
| 2 | 5.243856 | 5.330579 | 5.471074 | 5.574669 | 5.651707 | 5.710059 |
| 3 | 4.582041 | 4.617188 | 4.680000 | 4.731111 | 4.771968 | 4.804688 |
| 4 | 4.235147 | 4.252066 | 4.284545 | 4.313019 | 4.337119 | 4.357313 |
| 5 | 4.017509 | 4.026635 | 4.045120 | 4.062291 | 4.077505 | 4.090737 |

Table 3 Behavior of kurtosis ($\beta_2$) of WGD for varying values of parameters θ and β (rows: β, columns: θ)

| β \ θ | 0.5 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0.5 | 2.500000 | 1.228571 | 0.595238 | 0.387205 | 0.284965 | 0.224615 |
| 1 | 2.342857 | 1.166667 | 0.575000 | 0.377778 | 0.279762 | 0.221429 |
| 2 | 2.190476 | 1.100000 | 0.550000 | 0.365079 | 0.272321 | 0.216667 |
| 3 | 2.121212 | 1.066667 | 0.535714 | 0.357143 | 0.267361 | 0.213333 |
| 4 | 2.083916 | 1.047619 | 0.526786 | 0.351852 | 0.263889 | 0.210909 |
| 5 | 2.061538 | 1.035714 | 0.520833 | 0.348148 | 0.261364 | 0.209091 |

Table 4 Behavior of γ of WGD for varying values of parameters θ and β (rows: β, columns: θ)
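The entries of Tables 1 to 4 can be reproduced from the raw moments (3.1) alone, by applying the binomial relationship between central and raw moments. The following Python sketch does this; the function names are hypothetical, and reading the tables with β in rows and θ in columns is an interpretation that the computed values support.

```python
import math

def raw_moment(r, theta, beta):
    """r-th raw moment of WGD from (3.1); equals 1 for r = 0."""
    gamma_ratio = math.exp(math.lgamma(beta + r) - math.lgamma(beta))
    return (theta + beta + r + 1.0) * gamma_ratio / (theta ** r * (theta + beta + 1.0))

def central_moment(r, theta, beta):
    """r-th central moment via sum_k C(r, k) * mu'_k * (-mu'_1)^(r - k)."""
    m1 = raw_moment(1, theta, beta)
    return sum(math.comb(r, k) * raw_moment(k, theta, beta) * (-m1) ** (r - k)
               for k in range(r + 1))

def wgd_measures(theta, beta):
    """Coefficient of variation, skewness, kurtosis and index of dispersion."""
    m1 = raw_moment(1, theta, beta)
    mu2 = central_moment(2, theta, beta)
    mu3 = central_moment(3, theta, beta)
    mu4 = central_moment(4, theta, beta)
    return {
        "cv": math.sqrt(mu2) / m1,
        "skewness": mu3 / mu2 ** 1.5,
        "kurtosis": mu4 / mu2 ** 2,
        "dispersion": mu2 / m1,
    }

# First cell of each table: theta = 0.5, beta = 0.5.
for name, value in wgd_measures(0.5, 0.5).items():
    print(name, round(value, 6))
```

Because the central moments are formed by subtraction, this route can lose some precision for extreme parameter values; the closed-form expressions above avoid that.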

Reliability properties

In this section, three important reliability properties of WGD, namely the hazard rate function, the mean residual life function and stochastic ordering, are discussed.

a. Hazard rate function

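The survival (reliability) function of WGD can be expressed as

$$S(x;\theta,\beta)=1-F_3(x;\theta,\beta)=\frac{(\theta x)^{\beta}e^{-\theta x}+(\theta+\beta+1)\,\Gamma(\beta,\theta x)}{(\theta+\beta+1)\,\Gamma(\beta)} \qquad (4.1.1)$$

The hazard (or failure) rate function, $h(x)$, of WGD is thus expressed as

$$h(x)=\frac{f_3(x;\theta,\beta)}{S(x;\theta,\beta)}=\frac{x^{\beta-1}\theta^{\beta}(1+\theta+\theta x)\,e^{-\theta x}}{(\theta x)^{\beta}e^{-\theta x}+(\theta+\beta+1)\,\Gamma(\beta,\theta x)};\quad x>0,\ \theta>0,\ \beta>0 \qquad (4.2)$$

The behavior of $h(x)$ of WGD for different combinations of $\theta$ and $\beta$ is shown in Figure 3.

Figure 3 Behavior of the hazard function of WGD for different combinations of θ and β.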

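b. Mean residual life function

The mean residual life function $\mu(x)=E(X-x\mid X>x)$ of the WGD can be derived as

$$\mu(x)=\frac{1}{S(x;\theta,\beta)}\int_x^{\infty}t\,f_3(t;\theta,\beta)\,dt-x$$

$$=\frac{(\theta x)^{\beta}(\theta+\beta+2)\,e^{-\theta x}+\{\beta^2+\beta\theta+2\beta-\theta x(\theta+\beta+1)\}\,\Gamma(\beta,\theta x)}{\theta\{(\theta x)^{\beta}e^{-\theta x}+(\theta+\beta+1)\,\Gamma(\beta,\theta x)\}}$$

Clearly $\mu(0)=\dfrac{\beta(\theta+\beta+2)}{\theta(\theta+\beta+1)}=\mu_1'$. The behavior of $\mu(x)$ of the WGD for different combinations of $\theta$ and $\beta$ is shown in Figure 4.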

Figure 4 Behavior of μ(x) of the WGD for different combinations of θ and β.

Figure 5 Fitted pdf plots of considered distribution for the given dataset.
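The hazard rate and mean residual life functions of WGD can be evaluated numerically once the upper incomplete gamma function (1.6) is available. The Python sketch below computes it from a power series for the lower incomplete gamma function, which is an assumption of convenience that is reasonable only for moderate values of $\theta x$; the function names and test values are likewise illustrative.

```python
import math

def upper_gamma(a, z):
    """Upper incomplete gamma function (1.6), via the series for its lower part."""
    if z <= 0.0:
        return math.gamma(a)
    term = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    lower_reg = term
    n = 1
    while term > 1e-17 * lower_reg and n < 10000:
        term *= z / (a + n)
        lower_reg += term
        n += 1
    return math.gamma(a) * max(0.0, 1.0 - lower_reg)

def wgd_hazard(x, theta, beta):
    """Hazard rate of WGD: pdf divided by survival function, for x > 0."""
    z = theta * x
    edge = z ** beta * math.exp(-z)
    num = x ** (beta - 1.0) * theta ** beta * (1.0 + theta + theta * x) * math.exp(-z)
    return num / (edge + (theta + beta + 1.0) * upper_gamma(beta, z))

def wgd_mrl(x, theta, beta):
    """Mean residual life of WGD, for x >= 0."""
    z = theta * x
    edge = z ** beta * math.exp(-z)
    ug = upper_gamma(beta, z)
    num = (edge * (theta + beta + 2.0)
           + (beta * beta + beta * theta + 2.0 * beta - z * (theta + beta + 1.0)) * ug)
    return num / (theta * (edge + (theta + beta + 1.0) * ug))

theta, beta = 2.0, 3.0  # illustrative values only
print(wgd_hazard(1.0, theta, beta), wgd_mrl(1.0, theta, beta))
print(wgd_mrl(0.0, theta, beta))  # should equal the mean of WGD
```

At $\beta=1$ these functions reduce to $\theta(1+\theta+\theta x)/(\theta+2+\theta x)$ and $(\theta x+\theta+3)/\{\theta(\theta+2+\theta x)\}$, which follow from the Garima pdf (1.2) and cdf (1.3).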

c. Stochastic ordering

A random variable $X$ is said to be smaller than a random variable $Y$ in the

  1. Stochastic order ($X\le_{st}Y$) if $F_X(x)\ge F_Y(x)$ for all $x$
  2. Hazard rate order ($X\le_{hr}Y$) if $h_X(x)\ge h_Y(x)$ for all $x$
  3. Mean residual life order ($X\le_{mrl}Y$) if $m_X(x)\le m_Y(x)$ for all $x$
  4. Likelihood ratio order ($X\le_{lr}Y$) if $\dfrac{f_X(x)}{f_Y(x)}$ decreases in $x$.

The following important results due to Shaked & Shanthikumar13 are well known for establishing stochastic ordering of distributions.

$$X\le_{lr}Y\ \Rightarrow\ X\le_{hr}Y\ \Rightarrow\ X\le_{mrl}Y,\qquad X\le_{hr}Y\ \Rightarrow\ X\le_{st}Y$$

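The WGD is ordered with respect to the strongest 'likelihood ratio' ordering, as established in the following theorem.

Theorem

Let $X\sim\text{WGD}(\theta_1,\beta_1)$ and $Y\sim\text{WGD}(\theta_2,\beta_2)$. If $\theta_1>\theta_2$ and $\beta_1=\beta_2$ (or $\beta_1<\beta_2$ and $\theta_1=\theta_2$), then $X\le_{lr}Y$ and thus $X\le_{hr}Y$, $X\le_{mrl}Y$ and $X\le_{st}Y$.

Proof

We have

$$\frac{f_X(x;\theta_1,\beta_1)}{f_Y(x;\theta_2,\beta_2)}=\left(\frac{\theta_1^{\beta_1}(\theta_2+\beta_2+1)\,\Gamma(\beta_2)}{\theta_2^{\beta_2}(\theta_1+\beta_1+1)\,\Gamma(\beta_1)}\right)x^{\beta_1-\beta_2}\left(\frac{1+\theta_1+\theta_1x}{1+\theta_2+\theta_2x}\right)e^{-(\theta_1-\theta_2)x}$$

Now

$$\ln\frac{f_X(x;\theta_1,\beta_1)}{f_Y(x;\theta_2,\beta_2)}=\ln\left(\frac{\theta_1^{\beta_1}(\theta_2+\beta_2+1)\,\Gamma(\beta_2)}{\theta_2^{\beta_2}(\theta_1+\beta_1+1)\,\Gamma(\beta_1)}\right)+(\beta_1-\beta_2)\ln x+\ln\left(\frac{1+\theta_1+\theta_1x}{1+\theta_2+\theta_2x}\right)-(\theta_1-\theta_2)x$$

This gives

$$\frac{d}{dx}\ln\left(\frac{f_X(x;\theta_1,\beta_1)}{f_Y(x;\theta_2,\beta_2)}\right)=\frac{\beta_1-\beta_2}{x}+\frac{\theta_1-\theta_2}{(1+\theta_1+\theta_1x)(1+\theta_2+\theta_2x)}-(\theta_1-\theta_2).$$

Thus, for $\beta_1=\beta_2$ and $\theta_1>\theta_2$ (or $\beta_1<\beta_2$ and $\theta_1=\theta_2$), $\dfrac{d}{dx}\ln\left(\dfrac{f_X(x;\theta_1,\beta_1)}{f_Y(x;\theta_2,\beta_2)}\right)<0$. In the first case the first term vanishes and the remaining terms are negative because $(1+\theta_1+\theta_1x)(1+\theta_2+\theta_2x)>1$ for $x>0$; in the second case only the negative term $(\beta_1-\beta_2)/x$ remains. This shows that $X\le_{lr}Y$ and thus $X\le_{hr}Y$, $X\le_{mrl}Y$ and $X\le_{st}Y$. This shows the flexibility of WGD over the Garima distribution.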
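The likelihood ratio ordering can also be illustrated numerically by checking that the log of the density ratio decreases along a grid of $x$ values. The sketch below is only a finite-grid illustration under assumed parameter values, not a substitute for the proof; the function names and the grid are hypothetical.

```python
import math

def wgd_log_pdf(x, theta, beta):
    """Logarithm of the WGD pdf (2.2), for x > 0."""
    return (beta * math.log(theta) - math.log(theta + beta + 1.0) - math.lgamma(beta)
            + (beta - 1.0) * math.log(x) + math.log(1.0 + theta + theta * x) - theta * x)

def log_ratio_is_decreasing(theta1, beta1, theta2, beta2, grid):
    """True if ln(f_X / f_Y) is strictly decreasing along the given grid."""
    values = [wgd_log_pdf(x, theta1, beta1) - wgd_log_pdf(x, theta2, beta2) for x in grid]
    return all(later < earlier for earlier, later in zip(values, values[1:]))

grid = [0.05 * i for i in range(1, 401)]  # x from 0.05 to 20

# Case 1 of the theorem: theta1 > theta2 with beta1 = beta2.
case_theta = log_ratio_is_decreasing(3.0, 2.0, 1.0, 2.0, grid)
# Case 2 of the theorem: beta1 < beta2 with theta1 = theta2.
case_beta = log_ratio_is_decreasing(2.0, 1.5, 2.0, 3.0, grid)
# Reversing the roles of X and Y in case 1 should break the ordering.
case_reversed = log_ratio_is_decreasing(1.0, 2.0, 3.0, 2.0, grid)

print(case_theta, case_beta, case_reversed)
```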

Estimation of parameters

Let $(x_1,x_2,\ldots,x_n)$ be a random sample of size $n$ from WGD (2.2). The natural log likelihood function can be expressed as

$$\ln L=n\left[\beta\ln\theta-\ln(\theta+\beta+1)-\ln\Gamma(\beta)\right]+(\beta-1)\sum_{i=1}^{n}\ln(x_i)+\sum_{i=1}^{n}\ln(1+\theta+\theta x_i)-n\theta\bar{x}$$

The maximum likelihood estimates $(\hat\theta,\hat\beta)$ of $(\theta,\beta)$ are the solutions of the following log likelihood equations.

$$\frac{\partial\ln L}{\partial\theta}=\frac{n\beta}{\theta}-\frac{n}{\theta+\beta+1}+\sum_{i=1}^{n}\frac{1+x_i}{1+\theta+\theta x_i}-n\bar{x}=0$$

$$\frac{\partial\ln L}{\partial\beta}=n\ln\theta-\frac{n}{\theta+\beta+1}-n\,\psi(\beta)+\sum_{i=1}^{n}\ln(x_i)=0$$

where $\bar{x}$ is the sample mean and $\psi(\beta)$ is the digamma function defined as $\psi(\beta)=\dfrac{d}{d\beta}\ln\Gamma(\beta)$. Since these two log likelihood equations do not have closed–form solutions, they cannot be solved analytically. However, the MLEs $(\hat\theta,\hat\beta)$ of $(\theta,\beta)$ can be computed by solving the log likelihood equations with the Newton–Raphson method available in R software, iterating until sufficiently close estimates of $\theta$ and $\beta$ are obtained. In this paper the starting values of the parameters are $\theta=0.5$ and $\beta=1.5$.
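A minimal Python sketch of the maximization is given below. It does not reproduce the Newton–Raphson routine in R used in the paper; instead, as a deliberately simple stand-in, it solves the $\theta$ score equation by bisection for each fixed $\beta$ and then maximizes the resulting profile log likelihood over $\beta$ by golden-section search. The function names, search intervals and iteration counts are assumptions, and the data are the carbon fiber tensile strengths listed in the data analysis section.

```python
import math

# Tensile strengths (GPa) of 69 carbon fibers, as listed in the data analysis section.
DATA = [
    1.312, 1.314, 1.479, 1.552, 1.700, 1.803, 1.861, 1.865, 1.944, 1.958,
    1.966, 1.997, 2.006, 2.021, 2.027, 2.055, 2.063, 2.098, 2.140, 2.179,
    2.224, 2.240, 2.253, 2.270, 2.272, 2.274, 2.301, 2.301, 2.359, 2.382,
    2.382, 2.426, 2.434, 2.435, 2.478, 2.490, 2.511, 2.514, 2.535, 2.554,
    2.566, 2.570, 2.586, 2.629, 2.633, 2.642, 2.648, 2.684, 2.697, 2.726,
    2.770, 2.773, 2.800, 2.809, 2.818, 2.821, 2.848, 2.880, 2.954, 3.012,
    3.067, 3.084, 3.090, 3.096, 3.128, 3.233, 3.433, 3.585, 3.585,
]

def log_lik(theta, beta, xs):
    """Log likelihood of WGD for the sample xs."""
    n = len(xs)
    return (n * (beta * math.log(theta) - math.log(theta + beta + 1.0) - math.lgamma(beta))
            + (beta - 1.0) * sum(math.log(x) for x in xs)
            + sum(math.log(1.0 + theta + theta * x) for x in xs)
            - theta * sum(xs))

def score_theta(theta, beta, xs):
    """Partial derivative of the log likelihood with respect to theta."""
    n = len(xs)
    return (n * beta / theta - n / (theta + beta + 1.0)
            + sum((1.0 + x) / (1.0 + theta + theta * x) for x in xs) - sum(xs))

def theta_given_beta(beta, xs, lo=1e-8, hi=1e4, steps=100):
    """Root of the theta score equation for fixed beta, found by bisection."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if score_theta(mid, beta, xs) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def fit_wgd(xs, beta_lo=0.05, beta_hi=200.0, iters=80):
    """Golden-section search on the profile log likelihood in beta."""
    def profile(beta):
        return log_lik(theta_given_beta(beta, xs), beta, xs)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = beta_lo, beta_hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = profile(c), profile(d)
    for _ in range(iters):
        if fc > fd:  # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = profile(c)
        else:        # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = profile(d)
    beta_hat = 0.5 * (a + b)
    return theta_given_beta(beta_hat, xs), beta_hat

theta_hat, beta_hat = fit_wgd(DATA)
print(theta_hat, beta_hat, -2.0 * log_lik(theta_hat, beta_hat, DATA))
```

The printed estimates and $-2\ln L$ can be compared with the WGD row of Table 6. The search assumes that the profile log likelihood has a single maximum in $\beta$ on the chosen interval.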

Data analysis

In this section, a real lifetime dataset is considered to compare the goodness of fit of WGD with that of one–parameter and two–parameter lifetime distributions. The data are the tensile strengths, measured in GPa, of 69 carbon fibers tested under tension at gauge lengths of 20 mm, available in Bader & Priest.14

1.312     1.314      1.479      1.552      1.700      1.803      1.861      1.865      1.944      1.958      1.966      1.997      2.006      2.021      2.027      2.055      2.063      2.098      2.140      2.179      2.224      2.240      2.253      2.270      2.272      2.274      2.301      2.301      2.359      2.382      2.382      2.426      2.434      2.435      2.478      2.490      2.511      2.514      2.535      2.554      2.566      2.570      2.586      2.629      2.633      2.642      2.648      2.684      2.697      2.726      2.770      2.773      2.800      2.809      2.818      2.821      2.848      2.880      2.954      3.012      3.067      3.084      3.090      3.096      3.128      3.233      3.433      3.585      3.585

The goodness of fit of WGD has been compared with the one–parameter exponential, Lindley and Garima distributions and with the two–parameter Gompertz distribution, the generalized exponential distribution (GED) introduced by Gupta & Kundu15, the lognormal distribution and WLD. The pdf and cdf of the WLD, lognormal, GED and Gompertz distributions are presented in Table 5. The ML estimates of the parameters, $-2\ln L$, Akaike Information Criterion (AIC), K–S statistic and p–value of the distributions for the considered dataset are presented in Table 6. The AIC and K–S statistic are calculated using the formulae $\text{AIC}=-2\ln L+2k$ and $\text{K-S}=\sup_x|F_n(x)-F_0(x)|$, where $k$ is the number of parameters, $n$ is the sample size, $F_n(x)$ is the empirical (sample) cumulative distribution function, and $F_0(x)$ is the theoretical cumulative distribution function. The best distribution is the one with the lowest values of $-2\ln L$, AIC and K–S statistic.

Table 6 shows that WGD competes well with the two–parameter lifetime distributions and gives a quite satisfactory fit.
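The AIC and K–S formulae can be illustrated with the exponential distribution, whose MLE and cdf are available in closed form. The Python sketch below uses the 69 observations listed above; the function names are hypothetical, and the exponential model is chosen only because it needs no numerical optimization.

```python
import math

# Tensile strengths (GPa) of 69 carbon fibers, as listed above.
DATA = [
    1.312, 1.314, 1.479, 1.552, 1.700, 1.803, 1.861, 1.865, 1.944, 1.958,
    1.966, 1.997, 2.006, 2.021, 2.027, 2.055, 2.063, 2.098, 2.140, 2.179,
    2.224, 2.240, 2.253, 2.270, 2.272, 2.274, 2.301, 2.301, 2.359, 2.382,
    2.382, 2.426, 2.434, 2.435, 2.478, 2.490, 2.511, 2.514, 2.535, 2.554,
    2.566, 2.570, 2.586, 2.629, 2.633, 2.642, 2.648, 2.684, 2.697, 2.726,
    2.770, 2.773, 2.800, 2.809, 2.818, 2.821, 2.848, 2.880, 2.954, 3.012,
    3.067, 3.084, 3.090, 3.096, 3.128, 3.233, 3.433, 3.585, 3.585,
]

def ks_statistic(xs, cdf):
    """sup_x |F_n(x) - F_0(x)|, checked just before and at each ordered observation."""
    ordered = sorted(xs)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered):
        f0 = cdf(x)
        d = max(d, f0 - i / n, (i + 1) / n - f0)
    return d

n = len(DATA)
theta_hat = n / sum(DATA)  # closed-form MLE of the exponential rate
neg2_log_lik = -2.0 * (n * math.log(theta_hat) - theta_hat * sum(DATA))
aic = neg2_log_lik + 2 * 1  # k = 1 parameter
ks = ks_statistic(DATA, lambda x: 1.0 - math.exp(-theta_hat * x))

print(round(theta_hat, 4), round(neg2_log_lik, 2), round(aic, 2), round(ks, 3))
```

The results can be compared with the exponential row of Table 6. The same `ks_statistic` function applies to any fitted cdf, including that of WGD.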

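| Distributions | pdf | cdf |
|---|---|---|
| WLD | $f(x;\theta,\beta)=\dfrac{\theta^{\beta+1}}{\theta+\beta}\,\dfrac{x^{\beta-1}}{\Gamma(\beta)}\,(1+x)\,e^{-\theta x}$ | $F(x;\theta,\beta)=1-\dfrac{(\theta+\beta)\,\Gamma(\beta,\theta x)+(\theta x)^{\beta}e^{-\theta x}}{(\theta+\beta)\,\Gamma(\beta)}$ |
| Lognormal | $f(x;\theta,\beta)=\dfrac{1}{\sqrt{2\pi}\,\beta x}\,e^{-\frac{1}{2}\left(\frac{\log x-\theta}{\beta}\right)^2}$ | $F(x;\theta,\beta)=\phi\left(\dfrac{\log x-\theta}{\beta}\right)$ |
| GED | $f(x;\theta,\beta)=\theta\beta\left(1-e^{-\theta x}\right)^{\beta-1}e^{-\theta x}$ | $F(x;\theta,\beta)=\left(1-e^{-\theta x}\right)^{\beta}$ |
| Gompertz | $f(x;\theta,\beta)=\theta\,e^{\beta x-\frac{\theta}{\beta}\left(e^{\beta x}-1\right)}$ | $F(x;\theta,\beta)=1-e^{-\frac{\theta}{\beta}\left(e^{\beta x}-1\right)}$ |

Table 5 The pdf and the cdf of fitted distributions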

| Distributions | ML estimate $\hat\theta$ | ML estimate $\hat\beta$ | $-2\ln L$ | AIC | K-S | P-value |
|---|---|---|---|---|---|---|
| WGD | 9.3798 | 22.3473 | 101.94 | 105.94 | 0.057 | 0.979 |
| WLD | 9.6265 | 22.8938 | 101.95 | 105.95 | 0.059 | 0.973 |
| Lognormal | 0.8751 | 0.2124 | 102.72 | 106.73 | 0.103 | 0.713 |
| GED | 2.0331 | 87.2847 | 109.24 | 113.24 | 0.087 | 0.613 |
| Gompertz | 0.0080 | 2.0420 | 107.25 | 111.250 | 0.085 | 0.673 |
| Garima | 0.5863 | - | 251.33 | 253.33 | 0.4381 | 0.000 |
| Lindley | 0.0702 | - | 238.38 | 240.38 | 0.401 | 0.000 |
| Exponential | 0.4079 | - | 261.73 | 263.73 | 0.448 | 0.000 |

Table 6 MLEs, $-2\ln L$, AIC, K-S statistics and p-values of the fitted distributions

Conclusion

A two–parameter weighted Garima distribution (WGD), which includes the one–parameter Garima distribution proposed by Shanker5, has been suggested for modeling lifetime data from engineering. Its statistical properties, including the shapes of the probability density function for different combinations of parameters and the coefficients of variation, skewness, kurtosis and index of dispersion, have been explained. Reliability measures, including the hazard rate function, mean residual life function and stochastic ordering, have been studied. Estimation of the parameters has been discussed using maximum likelihood. The goodness of fit of WGD has been illustrated with a real lifetime dataset, and the fit has been found to be quite satisfactory compared with the exponential, Lindley, Garima, Gompertz, generalized exponential, lognormal and weighted Lindley distributions.

Acknowledgements

The authors are grateful to the editor–in–chief of the journal and the anonymous reviewer for their fruitful comments on the paper.

Conflict of interest

The authors declare that there is no conflict of interest regarding this manuscript.

References

  1. Fisher RA. The effects of methods of ascertainment upon the estimation of frequencies. Ann Eugenics. 1934;6(1):13–25.
  2. Rao CR. On discrete distributions arising out of methods of ascertainment. In: Patil GP, editor. Classical and Contagious Discrete Distributions. Calcutta: Statistical Publishing Society; 1965. p. 320–332.
  3. Patil GP, Rao CR. The weighted distributions: A survey and their applications. In: Krishnaiah PR, editor. Applications of Statistics. Amsterdam: North Holland Publications Co; 1977. p. 383–405.
  4. Patil GP, Rao CR. Weighted distributions and size–biased sampling with applications to wild–life populations and human families. Biometrics. 1978;34:179–189.
  5. Shanker R. Garima distribution and its application to model behavioral science data. Biometrics & Biostatistics International Journal. 2016;4(7):1–9.
  6. Lindley DV. Fiducial distributions and Bayes' theorem. Journal of the Royal Statistical Society, Series B. 1958;20(1):102–107.
  7. Ghitany ME, Atieh B, Nadarajah S. Lindley distribution and its application. Mathematics and Computers in Simulation. 2008;78(4):493–506.
  8. Ghitany ME, Alqallaf F, Al–Mutairi DK, et al. A two–parameter weighted Lindley distribution and its applications to survival data. Mathematics and Computers in Simulation. 2011;81(6):1190–1201.
  9. Shanker R, Shukla KK, Mishra A. A three–parameter weighted Lindley distribution. Statistics in Transition new series. 2017;18(2):291–300.
  10. Shanker R. The discrete Poisson–Garima distribution. Biometrics & Biostatistics International Journal. 2017;5(2):1–7.
  11. Shanker R, Shukla KK. Size–biased Poisson–Garima distribution with applications. Biometrics & Biostatistics International Journal. 2017;6(3):1–6.
  12. Shanker R, Shukla KK. Zero–truncated Poisson–Garima distribution with applications. Biometrics and Biostatistics Open Access Journal. 2017;3(1):1–6.
  13. Shaked M, Shanthikumar JG. Stochastic Orders and Their Applications. New York: Academic Press; 1994.
  14. Bader MG, Priest AM. Statistical aspects of fiber and bundle strength in hybrid composites. In: Hayashi T, Kawata K, Umekawa S, editors. ICCM–IV, Tokyo: Progress in Science in Engineering Composites; 1982. p. 1129–1136.
  15. Gupta RD, Kundu D. Generalized exponential distribution. Australian & New Zealand Journal of Statistics. 1999;41(2):173–188.

©2018 Eyob, et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.