This paper discusses several important mathematical properties of the two-parameter Lindley distribution (TPLD) of Shanker & Mishra [1], including its moment generating function, mean deviations, order statistics, Bonferroni and Lorenz curves, Renyi entropy and stress-strength reliability. Its goodness of fit relative to the exponential and Lindley distributions is illustrated with five real lifetime data-sets, for which the TPLD is found to be preferable to both distributions.
Keywords: mean deviations, order statistics, Bonferroni and Lorenz curves, entropy, stress-strength reliability, goodness of fit
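The probability density function (p.d.f.) and the cumulative distribution function (c.d.f.) of the Lindley distribution, introduced in the context of Bayesian analysis as a counter example of fiducial statistics, are given by
$$f(x;\theta)=\frac{\theta^{2}}{\theta+1}\,(1+x)\,e^{-\theta x};\quad x>0,\ \theta>0 \tag{1.1}$$
$$F(x;\theta)=1-\frac{\theta+1+\theta x}{\theta+1}\,e^{-\theta x};\quad x>0,\ \theta>0 \tag{1.2}$$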
A detailed study of its mathematical properties, parameter estimation and application, showing the superiority of the Lindley distribution over the exponential distribution for the waiting times before service of bank customers, was carried out by Ghitany et al. [2]. The Lindley distribution has been generalized, extended and modified by many researchers, including [1, 3-19].
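The probability density function (p.d.f.) and cumulative distribution function (c.d.f.) of the two-parameter Lindley distribution (TPLD) of Shanker & Mishra [1] are given by
$$f(x;\alpha,\theta)=\frac{\theta^{2}}{\alpha\theta+1}\,(\alpha+x)\,e^{-\theta x};\quad x>0,\ \theta>0,\ \alpha\theta>-1 \tag{1.3}$$
$$F(x;\alpha,\theta)=1-\frac{1+\alpha\theta+\theta x}{\alpha\theta+1}\,e^{-\theta x};\quad x>0,\ \theta>0,\ \alpha\theta>-1 \tag{1.4}$$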
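At $\alpha=1$, both (1.3) and (1.4) reduce to the corresponding expressions (1.1) and (1.2) of the Lindley distribution. The first two moments about origin and the variance of the TPLD of Shanker & Mishra [1] are given by
$$\mu_1'=\frac{\alpha\theta+2}{\theta(\alpha\theta+1)} \tag{1.5}$$
$$\mu_2'=\frac{2(\alpha\theta+3)}{\theta^{2}(\alpha\theta+1)}$$
$$\mu_2=\mu_2'-(\mu_1')^{2}=\frac{\alpha^{2}\theta^{2}+4\alpha\theta+2}{\theta^{2}(\alpha\theta+1)^{2}}$$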
At $\alpha=1$, these moments reduce to the corresponding moments of the Lindley distribution. Shanker & Mishra [1] derived and discussed some of its mathematical properties, such as shape, moments, coefficient of variation, coefficients of skewness and kurtosis, hazard rate function, mean residual life function and stochastic orderings. They also discussed the estimation of its parameters by maximum likelihood and by the method of moments, and its goodness of fit relative to the Lindley distribution. However, many important mathematical properties of this distribution have not yet been studied.
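As a quick numerical check of these expressions, the following Python sketch codes the TPLD density and distribution function in the form that also appears in the order-statistics section, $f(x)=\theta^{2}(\alpha+x)e^{-\theta x}/(\alpha\theta+1)$ and $F(x)=1-\frac{1+\alpha\theta+\theta x}{\alpha\theta+1}e^{-\theta x}$, and compares the mean (1.5) with a numerical integral. The function names, the Simpson-rule helper and the parameter values $\alpha=2$, $\theta=0.5$ are illustrative assumptions and do not come from the paper.

```python
import math


def tpld_pdf(x, alpha, theta):
    """TPLD density: theta^2 (alpha + x) exp(-theta x) / (alpha*theta + 1), x > 0."""
    return theta ** 2 * (alpha + x) * math.exp(-theta * x) / (alpha * theta + 1.0)


def tpld_cdf(x, alpha, theta):
    """TPLD distribution function: 1 - (1 + alpha*theta + theta*x) exp(-theta x) / (alpha*theta + 1)."""
    return 1.0 - (1.0 + alpha * theta + theta * x) * math.exp(-theta * x) / (alpha * theta + 1.0)


def tpld_mean(alpha, theta):
    """Mean from equation (1.5)."""
    return (alpha * theta + 2.0) / (theta * (alpha * theta + 1.0))


def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


# Illustrative parameter values (assumed, not from the paper).
alpha, theta = 2.0, 0.5
upper = 80.0 / theta  # truncation point; the neglected tail is of order exp(-80)

area = simpson(lambda x: tpld_pdf(x, alpha, theta), 0.0, upper)
mean_numeric = simpson(lambda x: x * tpld_pdf(x, alpha, theta), 0.0, upper)

print("integral of pdf :", area)
print("mean, numeric   :", mean_numeric)
print("mean, eq. (1.5) :", tpld_mean(alpha, theta))
```

With $\alpha=1$ the two functions coincide with the Lindley p.d.f. and c.d.f., which gives a simple way to test any code built on them.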
In the present paper, some of these properties of the TPLD of Shanker & Mishra [1], including the moment generating function, mean deviations, order statistics, Bonferroni and Lorenz curves, Renyi entropy and stress-strength reliability, are derived and discussed. Its goodness of fit relative to the exponential and Lindley distributions is illustrated with five real lifetime data-sets, for which the TPLD gives a better fit than both distributions.
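The moment generating function $M_X(t)$ of the TPLD (1.3) can be obtained as
$$M_X(t)=\frac{\theta^{2}}{\alpha\theta+1}\int_0^\infty e^{-(\theta-t)x}(\alpha+x)\,dx=\frac{\theta^{2}}{\alpha\theta+1}\left[\frac{\alpha}{\theta-t}+\frac{1}{(\theta-t)^{2}}\right]=\sum_{k=0}^{\infty}\frac{\alpha\theta+k+1}{\alpha\theta+1}\left(\frac{t}{\theta}\right)^{k},\quad |t|<\theta$$
It can be easily seen that the expression for $\mu_r'$, obtained as the coefficient of $\frac{t^{r}}{r!}$, is given as
$$\mu_r'=\frac{r!\,(\alpha\theta+r+1)}{\theta^{r}(\alpha\theta+1)};\quad r=1,2,3,\ldots$$
For $\alpha=1$, $\mu_r'$ reduces to the corresponding $\mu_r'$ of the Lindley distribution.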
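The moment generating function and the raw moments can be cross-checked numerically. Assuming the density $f(x)=\theta^{2}(\alpha+x)e^{-\theta x}/(\alpha\theta+1)$, direct integration gives $M_X(t)=\frac{\theta^{2}}{\alpha\theta+1}\left[\frac{\alpha}{\theta-t}+\frac{1}{(\theta-t)^{2}}\right]$ for $t<\theta$ and $\mu_r'=r!\,(\alpha\theta+r+1)/\{\theta^{r}(\alpha\theta+1)\}$. The sketch below sums the series $\sum_r \mu_r' t^{r}/r!$ and compares it with the closed form; the function names and the parameter values are illustrative assumptions.

```python
import math


def tpld_mgf(t, alpha, theta):
    """Closed-form moment generating function, valid for t < theta."""
    if t >= theta:
        raise ValueError("the m.g.f. exists only for t < theta")
    return theta ** 2 / (alpha * theta + 1.0) * (alpha / (theta - t) + 1.0 / (theta - t) ** 2)


def tpld_raw_moment(r, alpha, theta):
    """r-th moment about origin: r! (alpha*theta + r + 1) / (theta^r (alpha*theta + 1))."""
    return math.factorial(r) * (alpha * theta + r + 1.0) / (theta ** r * (alpha * theta + 1.0))


def mgf_from_moments(t, alpha, theta, terms=80):
    """Partial sum of sum_r mu_r' t^r / r!; converges for |t| < theta."""
    return sum(tpld_raw_moment(r, alpha, theta) * t ** r / math.factorial(r)
               for r in range(terms))


# Illustrative parameter values (assumed, not from the paper).
alpha, theta, t = 2.0, 0.5, 0.1

print("closed form :", tpld_mgf(t, alpha, theta))
print("series      :", mgf_from_moments(t, alpha, theta))
print("mu_1', mu_2':", tpld_raw_moment(1, alpha, theta), tpld_raw_moment(2, alpha, theta))
```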
The amount of scatter in a population is measured to some extent by the totality of deviations, usually from the mean and the median. These are known as the mean deviation about the mean and the mean deviation about the median, defined by $\delta_1(X)=\int_0^\infty |x-\mu|\,f(x)\,dx$ and $\delta_2(X)=\int_0^\infty |x-M|\,f(x)\,dx$, respectively, where $\mu=E(X)$ and $M=\text{Median}(X)$. The measures $\delta_1(X)$ and $\delta_2(X)$ can be calculated using the relationships
$$\delta_1(X)=2\mu F(\mu)-2\int_0^{\mu}x\,f(x)\,dx \tag{3.1}$$
and
$$\delta_2(X)=\mu-2\int_0^{M}x\,f(x)\,dx \tag{3.2}$$
Using the p.d.f. (1.3) and the expression for the mean of the two-parameter Lindley distribution, we have
$$\int_0^{\mu}x\,f(x)\,dx=\mu-\frac{\left\{\theta^{2}(\mu^{2}+\alpha\mu)+2\theta\mu+(\alpha\theta+2)\right\}e^{-\theta\mu}}{\theta(\alpha\theta+1)} \tag{3.3}$$
Using expressions (3.1), (3.2) and (3.3), and a little algebraic simplification, the mean deviation about the mean, $\delta_1(X)$, and the mean deviation about the median, $\delta_2(X)$, of the TPLD (1.3) are obtained as
$$\delta_1(X)=\frac{2(\theta\mu+\alpha\theta+2)\,e^{-\theta\mu}}{\theta(\alpha\theta+1)} \tag{3.4}$$
and
$$\delta_2(X)=\frac{2\left\{\theta^{2}(M^{2}+\alpha M)+2\theta M+(\alpha\theta+2)\right\}e^{-\theta M}}{\theta(\alpha\theta+1)}-\mu \tag{3.5}$$
It can be easily seen that expressions (3.4) and (3.5) of the TPLD (1.3) reduce to the corresponding expressions of the Lindley distribution at $\alpha=1$.
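The closed forms (3.4) and (3.5) can be checked against direct numerical integration of $|x-\mu|f(x)$ and $|x-M|f(x)$. The sketch below does this in plain Python. The median is found by bisection on the c.d.f., and the helper names, the Simpson rule and the parameter values are illustrative assumptions rather than anything prescribed in the paper.

```python
import math


def pdf(x, alpha, theta):
    return theta ** 2 * (alpha + x) * math.exp(-theta * x) / (alpha * theta + 1.0)


def cdf(x, alpha, theta):
    return 1.0 - (1.0 + alpha * theta + theta * x) * math.exp(-theta * x) / (alpha * theta + 1.0)


def mean(alpha, theta):
    return (alpha * theta + 2.0) / (theta * (alpha * theta + 1.0))


def median(alpha, theta):
    """Solve F(M) = 1/2 by bisection."""
    lo, hi = 0.0, 1.0
    while cdf(hi, alpha, theta) < 0.5:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf(mid, alpha, theta) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def upper_partial_mean(c, alpha, theta):
    """Integral of x f(x) over (c, infinity), the bracketed term in (3.3) and (3.5)."""
    num = theta ** 2 * (c * c + alpha * c) + 2.0 * theta * c + (alpha * theta + 2.0)
    return num * math.exp(-theta * c) / (theta * (alpha * theta + 1.0))


def delta1(alpha, theta):
    """Mean deviation about the mean, equation (3.4)."""
    mu = mean(alpha, theta)
    return 2.0 * (theta * mu + alpha * theta + 2.0) * math.exp(-theta * mu) / (theta * (alpha * theta + 1.0))


def delta2(alpha, theta):
    """Mean deviation about the median, equation (3.5)."""
    return 2.0 * upper_partial_mean(median(alpha, theta), alpha, theta) - mean(alpha, theta)


def simpson(func, a, b, n=20000):
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def abs_deviation_numeric(center, alpha, theta):
    """Integrate |x - center| f(x), split at the kink x = center."""
    left = simpson(lambda x: (center - x) * pdf(x, alpha, theta), 0.0, center)
    right = simpson(lambda x: (x - center) * pdf(x, alpha, theta), center, center + 80.0 / theta)
    return left + right


alpha, theta = 2.0, 0.5  # illustrative values (assumed)
print("delta1: closed", delta1(alpha, theta),
      "numeric", abs_deviation_numeric(mean(alpha, theta), alpha, theta))
print("delta2: closed", delta2(alpha, theta),
      "numeric", abs_deviation_numeric(median(alpha, theta), alpha, theta))
```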
Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from the two-parameter Lindley distribution (1.3). Let $X_{(1)}<X_{(2)}<\cdots<X_{(n)}$ denote the corresponding order statistics. The p.d.f. and the c.d.f. of the $k$th order statistic, say $Y=X_{(k)}$, are given by
$$f_Y(y)=\frac{n!}{(k-1)!\,(n-k)!}\,F^{k-1}(y)\{1-F(y)\}^{n-k}f(y)=\frac{n!}{(k-1)!\,(n-k)!}\sum_{l=0}^{n-k}\binom{n-k}{l}(-1)^{l}F^{k+l-1}(y)\,f(y)$$
and
$$F_Y(y)=\sum_{j=k}^{n}\binom{n}{j}F^{j}(y)\{1-F(y)\}^{n-j}=\sum_{j=k}^{n}\sum_{l=0}^{n-j}\binom{n}{j}\binom{n-j}{l}(-1)^{l}F^{j+l}(y)$$
respectively, for $k=1,2,3,\ldots,n$.
Thus, the p.d.f. and the c.d.f. of the $k$th order statistic of the TPLD (1.3) are obtained as
$$f_Y(y)=\frac{n!\,\theta^{2}(\alpha+y)\,e^{-\theta y}}{(\alpha\theta+1)\,(k-1)!\,(n-k)!}\sum_{l=0}^{n-k}\binom{n-k}{l}(-1)^{l}\left[1-\frac{1+\alpha\theta+\theta y}{\alpha\theta+1}\,e^{-\theta y}\right]^{k+l-1}$$
and
$$F_Y(y)=\sum_{j=k}^{n}\sum_{l=0}^{n-j}\binom{n}{j}\binom{n-j}{l}(-1)^{l}\left[1-\frac{1+\alpha\theta+\theta y}{\alpha\theta+1}\,e^{-\theta y}\right]^{j+l}$$
It can be easily verified that these expressions for the p.d.f. and c.d.f. of the $k$th order statistic of the TPLD (1.3) reduce to the corresponding expressions for the Lindley distribution at $\alpha=1$.
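The order-statistic formulas lend themselves to a direct numerical check. The sketch below evaluates the p.d.f. of the $k$th order statistic both in the compact form $\frac{n!}{(k-1)!(n-k)!}F^{k-1}(1-F)^{n-k}f$ and in the expanded alternating-sum form, and evaluates the c.d.f. as a binomial tail sum. Function names and the values $n=5$, $k=2$, $\alpha=2$, $\theta=0.5$ are illustrative assumptions.

```python
import math


def pdf(x, alpha, theta):
    return theta ** 2 * (alpha + x) * math.exp(-theta * x) / (alpha * theta + 1.0)


def cdf(x, alpha, theta):
    return 1.0 - (1.0 + alpha * theta + theta * x) * math.exp(-theta * x) / (alpha * theta + 1.0)


def order_stat_pdf(y, k, n, alpha, theta):
    """Compact form of the density of the k-th order statistic."""
    F = cdf(y, alpha, theta)
    c = math.factorial(n) // (math.factorial(k - 1) * math.factorial(n - k))
    return c * F ** (k - 1) * (1.0 - F) ** (n - k) * pdf(y, alpha, theta)


def order_stat_pdf_series(y, k, n, alpha, theta):
    """Same density after expanding (1 - F)^(n-k) by the binomial theorem."""
    F = cdf(y, alpha, theta)
    c = math.factorial(n) // (math.factorial(k - 1) * math.factorial(n - k))
    s = sum(math.comb(n - k, l) * (-1) ** l * F ** (k + l - 1) for l in range(n - k + 1))
    return c * pdf(y, alpha, theta) * s


def order_stat_cdf(y, k, n, alpha, theta):
    """P(X_(k) <= y) = sum_{j=k}^{n} C(n, j) F^j (1 - F)^(n - j)."""
    F = cdf(y, alpha, theta)
    return sum(math.comb(n, j) * F ** j * (1.0 - F) ** (n - j) for j in range(k, n + 1))


n, k, alpha, theta = 5, 2, 2.0, 0.5  # illustrative values (assumed)
for y in (0.5, 2.0, 6.0):
    print(y,
          order_stat_pdf(y, k, n, alpha, theta),
          order_stat_pdf_series(y, k, n, alpha, theta),
          order_stat_cdf(y, k, n, alpha, theta))
```

For large $n$ the alternating sum can lose accuracy through cancellation, so the compact form is usually the safer one to evaluate.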
The Bonferroni and Lorenz curves [20] and the Bonferroni and Gini indices have applications not only in economics, to study income and poverty, but also in other fields such as reliability, demography, insurance and medicine. The Bonferroni and Lorenz curves are defined as
$$B(p)=\frac{1}{p\mu}\int_0^{q}x\,f(x)\,dx=\frac{1}{p\mu}\left[\int_0^{\infty}x\,f(x)\,dx-\int_q^{\infty}x\,f(x)\,dx\right]=\frac{1}{p\mu}\left[\mu-\int_q^{\infty}x\,f(x)\,dx\right] \tag{5.1}$$
and
$$L(p)=\frac{1}{\mu}\int_0^{q}x\,f(x)\,dx=\frac{1}{\mu}\left[\int_0^{\infty}x\,f(x)\,dx-\int_q^{\infty}x\,f(x)\,dx\right]=\frac{1}{\mu}\left[\mu-\int_q^{\infty}x\,f(x)\,dx\right] \tag{5.2}$$
respectively, or equivalently
$$B(p)=\frac{1}{p\mu}\int_0^{p}F^{-1}(x)\,dx \tag{5.3}$$
and
$$L(p)=\frac{1}{\mu}\int_0^{p}F^{-1}(x)\,dx \tag{5.4}$$
respectively, where $\mu=E(X)$ and $q=F^{-1}(p)$.
The Bonferroni and Gini indices are thus defined as
$$B=1-\int_0^{1}B(p)\,dp \tag{5.5}$$
and
$$G=1-2\int_0^{1}L(p)\,dp \tag{5.6}$$
respectively.
Using the p.d.f. (1.3), we get
$$\int_q^{\infty}x\,f(x)\,dx=\frac{\left\{\theta^{2}(q^{2}+\alpha q)+2\theta q+(\alpha\theta+2)\right\}e^{-\theta q}}{\theta(\alpha\theta+1)} \tag{5.7}$$
Now using equation (5.7) in (5.1) and (5.2), we get
$$B(p)=\frac{1}{p}\left[1-\frac{\left\{\theta^{2}(q^{2}+\alpha q)+2\theta q+(\alpha\theta+2)\right\}e^{-\theta q}}{\alpha\theta+2}\right] \tag{5.8}$$
and
$$L(p)=1-\frac{\left\{\theta^{2}(q^{2}+\alpha q)+2\theta q+(\alpha\theta+2)\right\}e^{-\theta q}}{\alpha\theta+2} \tag{5.9}$$
Now using equations (5.8) and (5.9) in (5.5) and (5.6), the Bonferroni and Gini indices of the TPLD (1.3) are obtained as
$$B=1-\frac{\left\{\theta^{2}(q^{2}+\alpha q)+2\theta q+(\alpha\theta+2)\right\}e^{-\theta q}}{\alpha\theta+2} \tag{5.10}$$
$$G=-1+\frac{2\left\{\theta^{2}(q^{2}+\alpha q)+2\theta q+(\alpha\theta+2)\right\}e^{-\theta q}}{\alpha\theta+2} \tag{5.11}$$
The Bonferroni and Gini indices of the Lindley distribution are particular cases of the Bonferroni and Gini indices (5.10) and (5.11) of the TPLD (1.3) for $\alpha=1$.
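The curves (5.8) and (5.9) are straightforward to evaluate once the quantile $q=F^{-1}(p)$ is available. The sketch below obtains $q$ by bisection on the TPLD c.d.f. and compares the closed-form Lorenz curve (5.9) with the defining integral $\frac{1}{\mu}\int_0^{q}x f(x)\,dx$ from (5.2). The helper names, the bisection scheme and the parameter values are illustrative assumptions.

```python
import math


def pdf(x, alpha, theta):
    return theta ** 2 * (alpha + x) * math.exp(-theta * x) / (alpha * theta + 1.0)


def cdf(x, alpha, theta):
    return 1.0 - (1.0 + alpha * theta + theta * x) * math.exp(-theta * x) / (alpha * theta + 1.0)


def mean(alpha, theta):
    return (alpha * theta + 2.0) / (theta * (alpha * theta + 1.0))


def quantile(p, alpha, theta):
    """q = F^{-1}(p) by bisection, for 0 < p < 1."""
    lo, hi = 0.0, 1.0
    while cdf(hi, alpha, theta) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf(mid, alpha, theta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lorenz(p, alpha, theta):
    """Lorenz curve from equation (5.9)."""
    q = quantile(p, alpha, theta)
    num = theta ** 2 * (q * q + alpha * q) + 2.0 * theta * q + (alpha * theta + 2.0)
    return 1.0 - num * math.exp(-theta * q) / (alpha * theta + 2.0)


def bonferroni(p, alpha, theta):
    """Bonferroni curve from equation (5.8), i.e. L(p) / p."""
    return lorenz(p, alpha, theta) / p


def simpson(func, a, b, n=2000):
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def lorenz_numeric(p, alpha, theta):
    """Lorenz curve from its definition (5.2), by numerical integration."""
    q = quantile(p, alpha, theta)
    return simpson(lambda x: x * pdf(x, alpha, theta), 0.0, q) / mean(alpha, theta)


alpha, theta = 2.0, 0.5  # illustrative values (assumed)
for p in (0.1, 0.5, 0.9):
    print(p, lorenz(p, alpha, theta), lorenz_numeric(p, alpha, theta), bonferroni(p, alpha, theta))
```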
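The entropy of a random variable is a measure of the variation of uncertainty. A popular entropy measure is the Renyi entropy [21]. If $X$ is a continuous random variable having probability density function $f(\cdot)$, then the Renyi entropy is defined as
$$T_R(\gamma)=\frac{1}{1-\gamma}\log\left\{\int f^{\gamma}(x)\,dx\right\}$$
where $\gamma>0$ and $\gamma\neq 1$.
Thus, substituting the p.d.f. (1.3), the Renyi entropy for the TPLD (1.3) can be obtained as
$$T_R(\gamma)=\frac{1}{1-\gamma}\log\left[\frac{\theta^{2\gamma}}{(\alpha\theta+1)^{\gamma}}\int_0^{\infty}(\alpha+x)^{\gamma}\,e^{-\theta\gamma x}\,dx\right]$$
The Renyi entropy of the Lindley distribution is a particular case of the Renyi entropy of the TPLD at $\alpha=1$.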
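The Renyi entropy can also be evaluated directly from its definition by numerical integration of $f^{\gamma}$. The sketch below does this for the density $f(x)=\theta^{2}(\alpha+x)e^{-\theta x}/(\alpha\theta+1)$. As a sanity check, for $\gamma=2$ the integral has the elementary value $\int f^{2}\,dx=\frac{\theta^{4}}{(\alpha\theta+1)^{2}}\left[\frac{\alpha^{2}}{2\theta}+\frac{\alpha}{2\theta^{2}}+\frac{1}{4\theta^{3}}\right]$, obtained by expanding $(\alpha+x)^{2}$. The function names, the Simpson rule, the truncation point and the parameter values are illustrative assumptions.

```python
import math


def pdf(x, alpha, theta):
    return theta ** 2 * (alpha + x) * math.exp(-theta * x) / (alpha * theta + 1.0)


def simpson(func, a, b, n=20000):
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def renyi_entropy(gamma, alpha, theta):
    """T_R(gamma) = log( integral of f^gamma ) / (1 - gamma), gamma > 0, gamma != 1."""
    if gamma <= 0.0 or gamma == 1.0:
        raise ValueError("gamma must be positive and different from 1")
    upper = 80.0 / (theta * gamma)  # f^gamma decays like exp(-theta * gamma * x)
    integral = simpson(lambda x: pdf(x, alpha, theta) ** gamma, 0.0, upper)
    return math.log(integral) / (1.0 - gamma)


def renyi_entropy_gamma2(alpha, theta):
    """Elementary closed form for gamma = 2, used only as a check."""
    integral = theta ** 4 / (alpha * theta + 1.0) ** 2 * (
        alpha ** 2 / (2.0 * theta) + alpha / (2.0 * theta ** 2) + 1.0 / (4.0 * theta ** 3))
    return -math.log(integral)


alpha, theta = 2.0, 0.5  # illustrative values (assumed)
print("gamma = 2, numeric    :", renyi_entropy(2.0, alpha, theta))
print("gamma = 2, closed form:", renyi_entropy_gamma2(alpha, theta))
print("gamma = 0.5, numeric  :", renyi_entropy(0.5, alpha, theta))
```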
The stress-strength reliability describes the life of a component which has a random strength $X$ and is subjected to a random stress $Y$. When the stress applied to it exceeds the strength, the component fails instantly, and the component functions satisfactorily as long as $X>Y$. Therefore, $R=P(Y<X)$ is a measure of component reliability, known in the statistical literature as the stress-strength parameter. It has wide applications in almost all areas of knowledge, especially in engineering, such as structures, deterioration of rocket motors, static fatigue of ceramic components, aging of concrete pressure vessels, etc.
Let $X$ and $Y$ be independent strength and stress random variables having the TPLD (1.3) with parameters $(\alpha_1,\theta_1)$ and $(\alpha_2,\theta_2)$ respectively. Then the stress-strength reliability $R$ is obtained as
$$R=P(Y<X)=\int_0^{\infty}P(Y<X\mid X=x)\,f_X(x)\,dx=\int_0^{\infty}f(x;\alpha_1,\theta_1)\,F(x;\alpha_2,\theta_2)\,dx$$
$$=1-\frac{\theta_1^{2}\left[2\theta_2+(\alpha_1\theta_2+\alpha_2\theta_2+1)(\theta_1+\theta_2)+\alpha_1(\alpha_2\theta_2+1)(\theta_1+\theta_2)^{2}\right]}{(\alpha_1\theta_1+1)(\alpha_2\theta_2+1)(\theta_1+\theta_2)^{3}}$$
The expression for the stress-strength reliability of the Lindley distribution is a particular case of the expression for the stress-strength reliability of the TPLD (1.3) at $\alpha_1=\alpha_2=1$.
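The stress-strength reliability can be verified numerically by integrating $f(x;\alpha_1,\theta_1)F(x;\alpha_2,\theta_2)$ over $(0,\infty)$. Carrying out that integral term by term gives the closed form with $(\theta_1+\theta_2)^{3}$ in the denominator, which is the version coded below; a useful check is that $R=1/2$ when stress and strength have identical parameters. Function names, the Simpson rule and the parameter values are illustrative assumptions.

```python
import math


def pdf(x, alpha, theta):
    return theta ** 2 * (alpha + x) * math.exp(-theta * x) / (alpha * theta + 1.0)


def cdf(x, alpha, theta):
    return 1.0 - (1.0 + alpha * theta + theta * x) * math.exp(-theta * x) / (alpha * theta + 1.0)


def reliability_closed(a1, t1, a2, t2):
    """R = P(Y < X) for strength X ~ TPLD(a1, t1) and stress Y ~ TPLD(a2, t2)."""
    s = t1 + t2
    num = t1 ** 2 * (2.0 * t2 + (a1 * t2 + a2 * t2 + 1.0) * s + a1 * (a2 * t2 + 1.0) * s ** 2)
    den = (a1 * t1 + 1.0) * (a2 * t2 + 1.0) * s ** 3
    return 1.0 - num / den


def simpson(func, a, b, n=20000):
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def reliability_numeric(a1, t1, a2, t2):
    """Direct numerical evaluation of the integral of f_X(x) F_Y(x)."""
    upper = 80.0 / t1  # truncation point; f_X decays like exp(-t1 * x)
    return simpson(lambda x: pdf(x, a1, t1) * cdf(x, a2, t2), 0.0, upper)


# Illustrative parameter values (assumed, not from the paper).
print("closed :", reliability_closed(2.0, 0.5, 1.0, 1.5))
print("numeric:", reliability_numeric(2.0, 0.5, 1.0, 1.5))
print("identical parameters:", reliability_closed(1.0, 0.7, 1.0, 0.7))
```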
The TPLD (1.3) has two parameters to be estimated and so the first two moments about origin are required to estimate parameters. Using the first two moments about origin, we have
$$\frac{\mu_2'}{(\mu_1')^{2}}=k\ (\text{say})=\frac{2(\alpha\theta+3)(\alpha\theta+1)}{(\alpha\theta+2)^{2}} \tag{8.1.1}$$
Taking $b=\alpha\theta$, we get
$$\frac{\mu_2'}{(\mu_1')^{2}}=\frac{2(b+3)(b+1)}{(b+2)^{2}}=\frac{2b^{2}+8b+6}{b^{2}+4b+4}=k$$
This gives a quadratic equation in $b$ as
$$(2-k)\,b^{2}+4(2-k)\,b+2(3-2k)=0 \tag{8.1.2}$$
Replacing the first and second moments about origin, $\mu_1'$ and $\mu_2'$, by their respective sample moments, an estimate of $k$ can be obtained, and substituting the value of $k$ in equation (8.1.2), an estimate of $b$ can be obtained. Substituting this estimate of $b$ in the expression for the mean of the TPLD (1.3), the moment estimate $\tilde\theta$ of $\theta$ can be obtained as
$$\tilde\theta=\left(\frac{b+2}{b+1}\right)\frac{1}{\bar x} \tag{8.1.3}$$
Finally, the moment estimate $\tilde\alpha$ of $\alpha$ can be obtained as
$$\tilde\alpha=\frac{b}{\tilde\theta}$$
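The method-of-moments steps can be sketched in a few lines of Python: form $k$ from the first two sample moments, solve the quadratic (8.1.2) for $b=\alpha\theta$, then recover $\tilde\theta$ from (8.1.3) and $\tilde\alpha=b/\tilde\theta$. The function names are illustrative, and two choices in the sketch are assumptions not spelled out in the text: the larger root of the quadratic is taken (it is the one with $b>-1$, so that $\alpha\theta+1>0$), and an error is raised when $k\geq 2$, where the quadratic has no usable root.

```python
import math


def estimates_from_moments(m1, m2):
    """Return (alpha, theta) from the first two moments about origin."""
    k = m2 / m1 ** 2
    if k >= 2.0:
        raise ValueError("moment estimates do not exist when m2 / m1^2 >= 2")
    # Quadratic (8.1.2): (2 - k) b^2 + 4 (2 - k) b + 2 (3 - 2k) = 0
    a_coef = 2.0 - k
    b_coef = 4.0 * (2.0 - k)
    c_coef = 2.0 * (3.0 - 2.0 * k)
    disc = b_coef ** 2 - 4.0 * a_coef * c_coef  # equals 8 (2 - k) > 0 here
    b = (-b_coef + math.sqrt(disc)) / (2.0 * a_coef)  # larger root, b > -1
    theta = (b + 2.0) / ((b + 1.0) * m1)  # equation (8.1.3)
    alpha = b / theta
    return alpha, theta


def moment_estimates(xs):
    """Moment estimates of (alpha, theta) from a sample."""
    n = len(xs)
    m1 = sum(xs) / n
    m2 = sum(x * x for x in xs) / n
    return estimates_from_moments(m1, m2)


# Population moments for alpha = 2, theta = 0.5 are mu_1' = 3 and mu_2' = 16,
# so the estimator should return those parameter values exactly.
print(estimates_from_moments(3.0, 16.0))

# Relief-time data (data set 1 of the paper).
data1 = [1.1, 1.4, 1.3, 1.7, 1.9, 1.8, 1.6, 2.2, 1.7, 2.7,
         4.1, 1.8, 1.5, 1.2, 1.4, 3.0, 1.7, 2.3, 1.6, 2.0]
print(moment_estimates(data1))
```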
b. Maximum likelihood estimate of parameters
Let $(x_1, x_2, x_3, \ldots, x_n)$ be a random sample from the TPLD (1.3). Let $f_x$ be the observed frequency in the sample corresponding to $X=x$ $(x=1,2,3,\ldots,k)$ such that $\sum_{x=1}^{k}f_x=n$, where $k$ is the largest observed value having non-zero frequency. The likelihood function $L$ of the TPLD (1.3) is given by
$$L=\left(\frac{\theta^{2}}{\alpha\theta+1}\right)^{n}\prod_{x=1}^{k}(\alpha+x)^{f_x}\,e^{-n\theta\bar x} \tag{8.2.1}$$
The log likelihood function is thus obtained as
$$\log L=n\log\theta^{2}-n\log(\alpha\theta+1)+\sum_{x=1}^{k}f_x\log(\alpha+x)-n\theta\bar x \tag{8.2.2}$$
where $\bar x$ is the sample mean.
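The two log likelihood equations are obtained as
$$\frac{\partial\log L}{\partial\theta}=\frac{2n}{\theta}-\frac{n\alpha}{\alpha\theta+1}-n\bar x=0 \tag{8.2.3}$$
$$\frac{\partial\log L}{\partial\alpha}=-\frac{n\theta}{\alpha\theta+1}+\sum_{x=1}^{k}\frac{f_x}{\alpha+x}=0 \tag{8.2.4}$$
It can be easily seen that equation (8.2.3) gives $\bar x=\frac{\alpha\theta+2}{\theta(\alpha\theta+1)}=\mu_1'$, the mean of the TPLD. Equations (8.2.3) and (8.2.4) cannot be solved directly. However, Fisher's scoring method can be applied to solve these equations iteratively. We have
$$\frac{\partial^{2}\log L}{\partial\theta^{2}}=-\frac{2n}{\theta^{2}}+\frac{n\alpha^{2}}{(\alpha\theta+1)^{2}} \tag{8.2.5}$$
$$\frac{\partial^{2}\log L}{\partial\theta\,\partial\alpha}=-\frac{n}{(\alpha\theta+1)^{2}} \tag{8.2.6}$$
$$\frac{\partial^{2}\log L}{\partial\alpha^{2}}=\frac{n\theta^{2}}{(\alpha\theta+1)^{2}}-\sum_{x=1}^{k}\frac{f_x}{(\alpha+x)^{2}} \tag{8.2.7}$$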
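The maximum likelihood estimates of the parameters are the solution of the following equations
$$\begin{bmatrix}\dfrac{\partial^{2}\log L}{\partial\theta^{2}} & \dfrac{\partial^{2}\log L}{\partial\theta\,\partial\alpha}\\[2ex] \dfrac{\partial^{2}\log L}{\partial\theta\,\partial\alpha} & \dfrac{\partial^{2}\log L}{\partial\alpha^{2}}\end{bmatrix}_{\hat\theta=\theta_0,\ \hat\alpha=\alpha_0}\begin{bmatrix}\hat\theta-\theta_0\\ \hat\alpha-\alpha_0\end{bmatrix}=-\begin{bmatrix}\dfrac{\partial\log L}{\partial\theta}\\[2ex] \dfrac{\partial\log L}{\partial\alpha}\end{bmatrix}_{\hat\theta=\theta_0,\ \hat\alpha=\alpha_0}$$
where $\theta_0$ and $\alpha_0$ are initial values of $\theta$ and $\alpha$ as given by the method of moments. These equations are solved iteratively until sufficiently close estimates of $\hat\theta$ and $\hat\alpha$ are obtained.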
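The iterative scheme can be sketched as a damped Newton iteration on the log likelihood, using the first derivatives (8.2.3) and (8.2.4) and the second derivatives (8.2.5) to (8.2.7), written here as sums over the individual observations rather than over frequencies. This is a minimal sketch under stated assumptions and not the authors' implementation: the step solves $H\,\Delta=-g$ with the observed second-derivative matrix $H$, a step-halving safeguard (an addition of this sketch) keeps the parameters admissible and prevents the log likelihood from decreasing, and the starting values are rough moment-type values for the relief-time data. All function names are hypothetical.

```python
import math


def loglik(theta, alpha, xs):
    n = len(xs)
    return (2.0 * n * math.log(theta) - n * math.log(alpha * theta + 1.0)
            + sum(math.log(alpha + x) for x in xs) - theta * sum(xs))


def gradient(theta, alpha, xs):
    """First derivatives, equations (8.2.3) and (8.2.4)."""
    n, d = len(xs), alpha * theta + 1.0
    g_theta = 2.0 * n / theta - n * alpha / d - sum(xs)
    g_alpha = -n * theta / d + sum(1.0 / (alpha + x) for x in xs)
    return g_theta, g_alpha


def hessian(theta, alpha, xs):
    """Second derivatives, equations (8.2.5) to (8.2.7)."""
    n, d = len(xs), alpha * theta + 1.0
    h_tt = -2.0 * n / theta ** 2 + n * alpha ** 2 / d ** 2
    h_ta = -n / d ** 2
    h_aa = n * theta ** 2 / d ** 2 - sum(1.0 / (alpha + x) ** 2 for x in xs)
    return h_tt, h_ta, h_aa


def admissible(theta, alpha, xs):
    return theta > 0.0 and alpha * theta + 1.0 > 0.0 and alpha + min(xs) > 0.0


def newton_mle(xs, theta0, alpha0, tol=1e-10, max_iter=200):
    """Damped Newton iteration; returns the last accepted (theta, alpha)."""
    theta, alpha = theta0, alpha0
    for _ in range(max_iter):
        g1, g2 = gradient(theta, alpha, xs)
        a, b, c = hessian(theta, alpha, xs)
        det = a * c - b * b
        if det == 0.0:
            break
        # Solve H * step = -g for the 2 x 2 symmetric matrix H = [[a, b], [b, c]].
        s1 = -(c * g1 - b * g2) / det
        s2 = -(a * g2 - b * g1) / det
        base, t = loglik(theta, alpha, xs), 1.0
        while t > 1e-8:
            th, al = theta + t * s1, alpha + t * s2
            if admissible(th, al, xs) and loglik(th, al, xs) >= base:
                break
            t *= 0.5  # halve the step until it is admissible and not worse
        else:
            break  # no acceptable step found; keep the current values
        theta, alpha = th, al
        if abs(t * s1) + abs(t * s2) < tol:
            break
    return theta, alpha


# Relief-time data (data set 1 of the paper); starting values are assumed.
data1 = [1.1, 1.4, 1.3, 1.7, 1.9, 1.8, 1.6, 2.2, 1.7, 2.7,
         4.1, 1.8, 1.5, 1.2, 1.4, 3.0, 1.7, 2.3, 1.6, 2.0]
theta_start, alpha_start = 1.545, -0.313
theta_hat, alpha_hat = newton_mle(data1, theta_start, alpha_start)
print(theta_hat, alpha_hat, loglik(theta_hat, alpha_hat, data1))
```

Whether the iteration converges to an interior maximum depends on the data and the starting values, so the returned pair should always be checked, for example by inspecting the gradient at the final point.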
The two-parameter Lindley distribution (TPLD) has been fitted to a number of lifetime data-sets. In this section, we present the fitting of the two-parameter Lindley distribution to five real lifetime data-sets (data sets 1-5) and compare its goodness of fit with that of the one-parameter exponential and Lindley distributions.
Data set 1: Relief times (in minutes) of 20 patients receiving an analgesic, reported by Gross et al. [22]
1.1 | 1.4 | 1.3 | 1.7 | 1.9 | 1.8 | 1.6 | 2.2 | 1.7 | 2.7
4.1 | 1.8 | 1.5 | 1.2 | 1.4 | 3 | 1.7 | 2.3 | 1.6 | 2
Data set 2: Strength data of glass of the aircraft window, reported by Fuller et al. [23]
18.83 | 20.8 | 21.657 | 23.03 | 23.23 | 24.05 | 24.321 | 25.5 | 25.52 | 25.8
26.69 | 26.77 | 26.78 | 27.05 | 27.67 | 29.9 | 31.11 | 33.2 | 33.73 | 33.76
33.89 | 34.76 | 35.75 | 35.91 | 36.98 | 37.08 | 37.09 | 39.58 | 44.045 | 45.29
45.381
Data set 3:
0.8 | 0.8 | 1.3 | 1.5 | 1.8 | 1.9 | 1.9 | 2.1 | 2.6 | 2.7
2.9 | 3.1 | 3.2 | 3.3 | 3.5 | 3.6 | 4 | 4.1 | 4.2 | 4.2
4.3 | 4.3 | 4.4 | 4.4 | 4.6 | 4.7 | 4.7 | 4.8 | 4.9 | 4.9
5 | 5.3 | 5.5 | 5.7 | 5.7 | 6.1 | 6.2 | 6.2 | 6.2 | 6.3
6.7 | 6.9 | 7.1 | 7.1 | 7.1 | 7.1 | 7.4 | 7.6 | 7.7 | 8
8.2 | 8.6 | 8.6 | 8.6 | 8.8 | 8.8 | 8.9 | 8.9 | 9.5 | 9.6
9.7 | 9.8 | 10.7 | 10.9 | 11 | 11 | 11.1 | 11.2 | 11.2 | 11.5
11.9 | 12.4 | 12.5 | 12.9 | 13 | 13.1 | 13.3 | 13.6 | 13.7 | 13.9
14.1 | 15.4 | 15.4 | 17.3 | 17.3 | 18.1 | 18.2 | 18.4 | 18.9 | 19
19.9 | 20.6 | 21.3 | 21.4 | 21.9 | 23 | 27 | 31.6 | 33.1 | 38.5
Data set 4: Strength of 1.5 cm glass fibers measured at the National Physical Laboratory, England, taken from Smith & Naylor [25]. The units of measurement are not given in that paper.
0.55 | 0.74 | 0.77 | 0.81 | 0.84
0.93 | 1.04 | 1.11 | 1.13 | 1.24
1.25 | 1.27 | 1.28 | 1.29 | 1.3
1.36 | 1.39 | 1.42 | 1.48 | 1.48
1.49 | 1.49 | 1.5 | 1.5 | 1.51
1.52 | 1.53 | 1.54 | 1.55 | 1.55
1.58 | 1.59 | 1.6 | 1.61 | 1.61
1.61 | 1.61 | 1.62 | 1.62 | 1.63
1.64 | 1.66 | 1.66 | 1.66 | 1.67
1.68 | 1.68 | 1.69 | 1.7 | 1.7
1.73 | 1.76 | 1.76 | 1.77 | 1.78
1.81 | 1.82 | 1.84 | 1.84 | 1.89
2 | 2.01 | 2.24
Data set 5: The data set is from Lawless [26]. The data arose in tests on the endurance of deep groove ball bearings. They are the number of million revolutions before failure for each of the 23 ball bearings in the life tests:
17.88 | 28.92 | 33 | 41.52 | 42.12 | 45.6 | 48.8 | 51.84 | 51.96 | 54.12
55.56 | 67.8 | 68.44 | 68.64 | 68.88 | 84.12 | 93.12 | 98.64 | 105.12 | 105.84
127.92 | 128.04 | 173.4
In order to compare the distributions, $-2\ln L$, AIC (Akaike Information Criterion), AICC (Akaike Information Criterion Corrected), BIC (Bayesian Information Criterion) and the K-S (Kolmogorov-Smirnov) statistic have been computed for the five real data-sets (Table 1). The formulae for computing AIC, AICC, BIC and the K-S statistic are given below.
Data set | Model | $\hat\theta$ | $\hat\alpha$ | $-2\ln L$ | AIC | AICC | BIC | K-S
Data 1 | Lindley | 0.816118 | | 60.50 | 62.50 | 62.72 | 63.49 | 0.341
 | Exponential | 0.526316 | -0.31285 | 65.67 | 67.67 | 67.90 | 68.67 | 0.389
Data 2 | Lindley | 0.062988 | | 253.99 | 255.99 | 256.13 | 257.42 | 0.333
 | Exponential | 0.032455 | -5.25330 | 274.53 | 276.53 | 276.67 | 277.96 | 0.426
Data 3 | Lindley | 0.186571 | | 638.07 | 640.07 | 640.12 | 642.68 | 0.058
 | Exponential | 0.101245 | 0.337078 | 658.04 | 660.04 | 660.08 | 662.65 | 0.163
Data 4 | Lindley | 0.996116 | | 162.56 | 164.56 | 164.62 | 166.70 | 0.371
 | Exponential | 0.663647 | 0.257373 | 177.66 | 179.66 | 179.73 | 181.80 | 0.402
Data 5 | Lindley | 0.027321 | | 231.47 | 233.47 | 233.66 | 234.61 | 0.149
 | Exponential | 0.013845 | 10.12355 | 242.87 | 244.87 | 245.06 | 246.01 | 0.263
Table 1 MLEs, $-2\ln L$, AIC, AICC, BIC and K-S statistics of the fitted distributions for data sets 1-5
$$\text{AIC}=-2\ln L+2k,\qquad \text{AICC}=\text{AIC}+\frac{2k(k+1)}{n-k-1},\qquad \text{BIC}=-2\ln L+k\ln n$$
and
$$D=\sup_x\left|F_n(x)-F_0(x)\right|$$
where $k$ is the number of parameters, $n$ is the sample size, $F_n(x)$ is the empirical distribution function and $F_0(x)$ is the c.d.f. of the fitted distribution.
The best distribution is the one with the lowest values of $-2\ln L$, AIC, AICC, BIC and the K-S statistic.
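The criteria above are simple to compute once $-2\ln L$ is known. The Python sketch below implements AIC, AICC and BIC, together with the K-S distance $D$ between the empirical distribution function of a sample and a fitted c.d.f. The function names are illustrative assumptions. The example call uses the Lindley fit to data set 1 as reported in Table 1 ($-2\ln L=60.50$, $k=1$, $n=20$).

```python
import math


def information_criteria(neg2_loglik, k, n):
    """Return (AIC, AICC, BIC) for k parameters and sample size n."""
    aic = neg2_loglik + 2.0 * k
    aicc = aic + 2.0 * k * (k + 1.0) / (n - k - 1.0)
    bic = neg2_loglik + k * math.log(n)
    return aic, aicc, bic


def ks_statistic(xs, fitted_cdf):
    """D = sup_x |F_n(x) - F_0(x)|, evaluated at the jumps of F_n."""
    ordered = sorted(xs)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f0 = fitted_cdf(x)
        # F_n jumps from (i - 1)/n to i/n at the i-th ordered observation.
        d = max(d, i / n - f0, f0 - (i - 1) / n)
    return d


def lindley_cdf(x, theta):
    """Lindley c.d.f., i.e. the TPLD c.d.f. with alpha = 1."""
    return 1.0 - (theta + 1.0 + theta * x) / (theta + 1.0) * math.exp(-theta * x)


aic, aicc, bic = information_criteria(60.50, 1, 20)
print(round(aic, 2), round(aicc, 2), round(bic, 2))

# K-S distance of a small hypothetical sample from a Lindley(1) fit.
sample = [0.3, 0.9, 1.4, 2.2, 3.1]
print(ks_statistic(sample, lambda x: lindley_cdf(x, 1.0)))
```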