Research Article Volume 10 Issue 4
The seven-parameter Lindley distribution
Abdul Hadi N Ahmed,1
Zohdy M Nofal,2 Rania MA Osman2
1Department of Mathematical Statistics, Faculty of Graduate Studies for Statistical Research, Cairo University, Egypt
2Department of Statistics, Mathematics and Insurance Benha University, Egypt
Correspondence: Rania M. A. Osman, Department of Statistics, Mathematics and Insurance, Benha University, Benha 13518, Egypt
Received: September 10, 2021 | Published: November 25, 2021
Citation: Ahmed AHN, Nofal ZM, Osman RMA. The seven-parameter Lindley distribution. Biom Biostat Int J. 2021;10(4):166-174. DOI: 10.15406/bbij.2021.10.00344
Abstract
In this paper, a seven parameter Lindley (SPL) distribution is proposed as a new generalization of the basic Lindley distribution. The structural properties of the new distribution are investigated. These include the compounding representation of the distribution, reliability analysis and statistical measures. Expressions for the Lorenz and Bonferroni curves and for the Renyi entropy, as a measure of uncertainty reduction, are derived. Maximum likelihood estimation is used to estimate the parameters. The new model contains twelve lifetime distributions as special cases, such as the Lindley, quasi Lindley, gamma, and exponential distributions, among others. This model has the advantage of being capable of modeling various shapes of aging and failure criteria. Finally, the usefulness of the new model for modeling reliability data is illustrated using a real data set.
Keywords: Lindley distribution, mixture, reliability analysis, moment generating function, order statistics, maximum likelihood estimation
Introduction
In many applied sciences such as medicine, engineering, and finance, amongst others, modeling and analyzing lifetime data are crucial. Several lifetime distributions have been used to model such kinds of data. The quality of the procedures used in a statistical analysis depends heavily on the assumed probability model or distribution along with relevant statistical methodologies. However, there remain many important problems where the real data does not follow any of the classical or standard probability models.
The one parameter family of distributions with density function
$f(x;\theta) = \frac{\theta^2}{\theta+1}(1+x)e^{-\theta x}, \quad x > 0,\ \theta > 0$ (1)
was used by Lindley1 to illustrate the difference between the fiducial distribution and the posterior distribution. Sankaran2 introduced the discrete Poisson-Lindley distribution by combining the Poisson and Lindley distributions. Ghitany et al.3 have discussed various properties of (1). Another discrete version of this distribution has been suggested by Deniz and Ojeda4 with applications to count data related to insurance. Ghitany et al.5 obtained size-biased and zero-truncated versions of the Poisson-Lindley distribution with various properties and applications.
The aim of this paper is to introduce a new generalization of the Lindley1 distribution. This generalization is flexible enough to model different types of lifetime data having different forms of failure rate. The new distribution can accommodate both decreasing and increasing failure rates, as its predecessors do, as well as unimodal and bathtub shaped failure rates. The Lindley distribution is generalized by mixing. Several authors have derived new versions of the usual density functions by following this idea. For instance, Zakerzadeh and Dolati6 considered a generalized Lindley distribution as a generalization of the usual Lindley distribution, Rama and Mishra7 studied the quasi-Lindley distribution, Rama et al.8 introduced a new distribution generalizing the Lindley model, which is called the Janardan distribution, and Elbatal et al.9 have suggested a new generalized Lindley distribution.
The introduced model will be named the seven parameter Lindley (SPL) distribution. The basic idea behind this generalization is to use a unified approach that accommodates all the preceding generalizations of the Lindley distribution mentioned above, and to add new models that could offer a better fit to lifetime data. The proposed distribution includes nine models as special cases plus three new models (all special cases are shown in Section 2). The procedure used here is based on certain mixtures of two gamma distributions with various weights. The paper examines various properties of the new distribution.
The rest of the paper is organized as follows: Section 2 introduces the definition of the probability density function (pdf) of the SPL distribution, including its cumulative distribution function (cdf) and the sub-models of the new suggested model. The reliability analysis, including the survival function, the hazard (or failure) rate function, the reversed hazard rate function, the cumulative hazard rate function, and the mean residual lifetime, is explored in Section 3. The statistical properties of the new distribution, such as the moments, the moment generating function, and the distribution of order statistics, are investigated in Section 4, together with a proposed algorithm for generating random data from the new distribution. Section 5 introduces the Lorenz and Bonferroni curves and the Renyi entropy as measures of inequality and uncertainty, respectively. Section 6 discusses the estimation of the parameters by maximum likelihood. Finally, Section 7 provides an application to a real data set to illustrate the performance of the new distribution.
Generalization and related sub-models
In this section, we introduce the pdf and the cdf of the seven parameter Lindley distribution, and then the special cases of the SPL distribution are listed.
Generalization
Let
(2)
and
(3)
be gamma
and gamma
densities, respectively.
We define a new seven parameter Lindley distribution as a mixture of (2) and (3) with probabilities $p$
and $(1-p)$, respectively, as follows:
Therefore, the pdf of the seven parameter Lindley (SPL) distribution is defined as
We note that
incorporates seven parameters namely
and subject to
and
are not allowed to be zero simultaneously. The corresponding cumulative distribution function (cdf) of the SPLD is
where
is known as the lower incomplete gamma function ratio. Also, the upper incomplete gamma function ratio is given by
Figures 1 and 2 illustrate some of the possible shapes of the pdf and the cdf, respectively, of the SPLD for different values of the parameters
and
chosen from the ranges specified in Equation (4).
Figure 1 Different shapes of the pdf for the SPLD.
Figure 2 The distribution function (cdf) of the SPLD.
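The mixture construction above can be sketched numerically. The following is a minimal illustration, not the paper's exact parameterization: it assumes a generic two-component gamma mixture with hypothetical argument names (`p`, `shape1`, `rate1`, `shape2`, `rate2`), computes the lower incomplete gamma function ratio by its power series (adequate for moderate arguments), and checks the known special case in which the Lindley distribution is a mixture of Gamma(1, θ) and Gamma(2, θ) with weight θ/(θ+1).

```python
import math

def gamma_pdf(x, shape, rate):
    """Gamma density with the given shape and rate, valid for x > 0."""
    log_pdf = (shape * math.log(rate) + (shape - 1) * math.log(x)
               - rate * x - math.lgamma(shape))
    return math.exp(log_pdf)

def reg_lower_gamma(a, x, terms=400):
    """Lower incomplete gamma function ratio P(a, x) via its power series."""
    if x <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= x / (a + n)
        total += term
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def mixture_pdf(x, p, shape1, rate1, shape2, rate2):
    """pdf of the mixture p * Gamma(shape1, rate1) + (1 - p) * Gamma(shape2, rate2)."""
    return p * gamma_pdf(x, shape1, rate1) + (1 - p) * gamma_pdf(x, shape2, rate2)

def mixture_cdf(x, p, shape1, rate1, shape2, rate2):
    """cdf of the same mixture, written with incomplete gamma function ratios."""
    return (p * reg_lower_gamma(shape1, rate1 * x)
            + (1 - p) * reg_lower_gamma(shape2, rate2 * x))

def lindley_pdf(x, theta):
    return theta**2 / (theta + 1) * (1 + x) * math.exp(-theta * x)

def lindley_cdf(x, theta):
    return 1 - (1 + theta + theta * x) / (1 + theta) * math.exp(-theta * x)

theta = 1.5
p = theta / (theta + 1)  # mixing weight that recovers the Lindley distribution
for x in (0.5, 1.0, 3.0):
    print(x, mixture_pdf(x, p, 1, theta, 2, theta), lindley_pdf(x, theta))
```

The two printed columns should agree, which illustrates how fixing some of the mixture parameters collapses the general model to a known sub-model.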
Sub-models of the SPLD
It is clear that the seven parameter Lindley distribution is very flexible. Assigning particular numerical values to some subsets of the parameters yields several special generalizations of the Lindley distribution. The special cases include nine distributions, namely: the new generalized Lindley distribution (NGLD) introduced by Elbatal et al.,9 the generalized Lindley distribution (GLD) introduced by Zakerzadeh and Dolati,6 the quasi Lindley distribution (QLD) introduced by Rama and Mishra,7 the Lindley distribution (LD) by Lindley,1 the Erlang distribution, the Janardan distribution introduced by Rama et al.,8 the gamma distribution, the exponential distribution (ED), and the Chi-square distribution. In addition to yielding all the previous distributions, our generalized model allows us to create three new distributions, namely the 4-parameter Lindley type I (4-p L type I) distribution, the 4-parameter Lindley type II (4-p L type II) distribution and the 2-parameter Lindley (2-p L) distribution.
Reliability analysis
In this section, we present the survival function, the hazard rate function, the reversed hazard rate function, the cumulative hazard rate function and the mean residual lifetime for the seven parameter Lindley distribution.
The survival function
The survival function $S(x)$, which is the probability of an item not failing prior to some time $x$, is defined by $S(x) = 1 - F(x)$. Therefore, the survival function of the SPL distribution is given by
(6)
The hazard rate function
The other characteristic of interest of a random variable is the hazard rate function, $h(x) = f(x)/S(x)$, where $S(x) = 1 - F(x)$ is the survival function. The hazard rate function of the SPLD is given by
(7)
We note that
might be constant, increasing, decreasing, or bathtub shaped depending on the values of the parameters involved. For example, if
and
then
, a constant, while for
and
it will be increasing,
and it is going to be decreasing if , and the bathtub-type curve appears for
and
.
The next result describes some particular cases of the hazard rate function arising from the seven parameter Lindley distribution by assigning relevant values to the parameters.
Theorem 1:
The hazard rate functions of the particular cases of the seven parameter Lindley distribution are given by
- If
and
the failure rate is same as the
.
- If
,
and the failure rate is same as the
- If
, and
the failure rate is same as the
.
- if
, and
the failure rate is same as the
.
- If
the failure rate is same as the
.
- If
the failure rate is same as the Gamma
.
Proof:
(i)If
, and
the failure rate is same as the
(ii) If
, and
the failure rate is same as the
(iii) If
, and
the failure rate is same as the
/
.
(iv) If
, and
the failure rate is same as the
.
(v) if
the failure rate is same as the
.
The reversed hazard rate function
The reversed hazard rate function, $r(x) = f(x)/F(x)$, of a random variable distributed according to the SPL distribution is, after some simplification, given by
(8)
The cumulative hazard rate function
Many generalized models have been proposed in the reliability literature through the relationship between the reliability function $S(x)$ and its cumulative hazard rate function $H(x)$, given by $H(x) = -\ln S(x)$. The cumulative hazard rate function of the SPL distribution is given by
(9)
where $H(x)$ is the total number of failures or deaths over an interval of time, and $H(x)$ is a non-decreasing function of $x$ satisfying $H(0) = 0$ and $\lim_{x \to \infty} H(x) = \infty$.
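The reliability functions of this section are all simple transformations of the pdf and the survival function. As a hedged illustration, the sketch below evaluates them for the Lindley sub-model only (not for the full SPL distribution, whose closed forms are given in Equations (6) to (9)); the function names are illustrative, and the closed-form Lindley hazard $\theta^2(1+x)/(1+\theta+\theta x)$ is used as a cross-check.

```python
import math

def lindley_pdf(x, theta):
    return theta**2 / (theta + 1) * (1 + x) * math.exp(-theta * x)

def lindley_sf(x, theta):
    """Survival function S(x) = 1 - F(x) of the Lindley distribution."""
    return (1 + theta + theta * x) / (1 + theta) * math.exp(-theta * x)

def hazard(x, theta):
    """h(x) = f(x) / S(x)."""
    return lindley_pdf(x, theta) / lindley_sf(x, theta)

def reversed_hazard(x, theta):
    """r(x) = f(x) / F(x), defined for x > 0."""
    return lindley_pdf(x, theta) / (1 - lindley_sf(x, theta))

def cumulative_hazard(x, theta):
    """H(x) = -ln S(x)."""
    return -math.log(lindley_sf(x, theta))

theta = 0.8
for x in (0.1, 1.0, 5.0, 20.0):
    closed_form = theta**2 * (1 + x) / (1 + theta + theta * x)
    print(x, hazard(x, theta), closed_form,
          reversed_hazard(x, theta), cumulative_hazard(x, theta))
```

For this sub-model the hazard rate increases with $x$ and approaches $\theta$, which is one of the shapes the general model is meant to accommodate.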
The mean residual lifetime
The additional lifetime given that the component has survived up to time $t$ is called the residual life function of the component. The expectation of the random variable that represents the remaining lifetime is called the mean residual lifetime (MRL) and is given by $m(t) = E[X - t \mid X > t]$,
or equivalently $m(t) = \frac{1}{S(t)} \int_t^{\infty} S(u)\,du$, where $S(t) = 1 - F(t)$ is the survival function.
While the hazard rate function provides information about a small interval after time $t$ (just after $t$), the MRL considers information about the whole interval after $t$ (all after $t$). The MRL, as well as the hazard rate function or the reliability function, is very important, as each of them can be used to characterize a unique corresponding lifetime distribution.
The MRL function for SPL random variable is given by
(10)
The MRL function given in Equation (10) satisfies the following properties.
where is the first non-central moment of the SPL Distribution (the mean of the distribution).
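The integral form of the MRL can be checked numerically. The sketch below is an illustration for the Lindley sub-model only, under the assumption that the trapezoidal rule on a truncated interval is accurate enough; it compares the numerical value with the known closed form $m(t) = (2+\theta+\theta t)/(\theta(1+\theta+\theta t))$, whose value at $t = 0$ is the mean of the distribution.

```python
import math

def lindley_sf(t, theta):
    return (1 + theta + theta * t) / (1 + theta) * math.exp(-theta * t)

def mrl_numeric(t, theta, width=60.0, steps=20000):
    """m(t) = (1 / S(t)) * integral of S(u) over (t, infinity), trapezoidal rule.

    The upper limit is truncated at t + width / theta, where S(u) is negligible.
    """
    upper = t + width / theta
    h = (upper - t) / steps
    total = 0.5 * (lindley_sf(t, theta) + lindley_sf(upper, theta))
    for i in range(1, steps):
        total += lindley_sf(t + i * h, theta)
    return total * h / lindley_sf(t, theta)

def mrl_closed_form(t, theta):
    return (2 + theta + theta * t) / (theta * (1 + theta + theta * t))

theta = 1.2
for t in (0.0, 1.0, 3.0):
    print(t, mrl_numeric(t, theta), mrl_closed_form(t, theta))
```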
Statistical Properties
This section investigates the statistical properties of the SPL distribution, such as the moments (non-central and central), the moment generating function, and an algorithm for random number generation.
The moment generating function
The following theorem gives the moment generating function (mgf) of the SPL distribution.
Theorem 2:
If $X$ has the SPL distribution, then the mgf of $X$, say $M_X(t)$, is given as follows
(11)
Proof:
Using the expansion $e^{tx} = \sum_{r=0}^{\infty} \frac{(tx)^r}{r!}$, one has
This completes the proof.
Depending on the previous theorem, we can conclude the basic statistical properties as follows:
(i) The non-central moments $\mu'_r$ are the coefficients of $t^r/r!$ in Equation (11), for $r = 1, 2, \ldots$. Therefore, the mean and the variance of the SPL random variable are, respectively, given by
(12)
and
(13)
where $\mu'_2$ is the second non-central moment, which is given by
(14)
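(ii) The central moments $\mu_r$ can be obtained easily from the non-central moments $\mu'_r$ through the relation $\mu_r = E[(X-\mu)^r] = \sum_{j=0}^{r} \binom{r}{j} (-\mu)^{r-j} \mu'_j$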
Where
Then the central moments of the SPL distribution are given by
(15)
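(iii) Finally, the coefficient of variation $CV$, the coefficient of skewness $\gamma_1$, and the coefficient of kurtosis $\gamma_2$ of the SPL distribution are, respectively, obtained according to the following relations: $CV = \sqrt{\mu_2}/\mu$, $\gamma_1 = \mu_3/\mu_2^{3/2}$, and $\gamma_2 = \mu_4/\mu_2^{2}$, where $\mu$ is the mean and $\mu_2$, $\mu_3$, $\mu_4$ are the second, third and fourth central moments.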
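As a worked illustration of items (i) to (iii), the sketch below computes raw moments, central moments and the three coefficients for a generic two-component gamma mixture. The parameter names are hypothetical and do not reproduce the paper's parameterization; the only facts used are that the $r$th raw moment of a Gamma(shape, rate) variable is $\Gamma(\text{shape}+r)/(\Gamma(\text{shape})\,\text{rate}^r)$ and that raw moments of a mixture are the weighted averages of the component moments.

```python
import math

def gamma_raw_moment(r, shape, rate):
    """E[X^r] for X ~ Gamma(shape, rate)."""
    return math.exp(math.lgamma(shape + r) - math.lgamma(shape)) / rate**r

def mixture_raw_moment(r, p, shape1, rate1, shape2, rate2):
    """E[X^r] for the mixture p * Gamma(shape1, rate1) + (1 - p) * Gamma(shape2, rate2)."""
    return (p * gamma_raw_moment(r, shape1, rate1)
            + (1 - p) * gamma_raw_moment(r, shape2, rate2))

def summary(p, shape1, rate1, shape2, rate2):
    m1, m2, m3, m4 = (mixture_raw_moment(r, p, shape1, rate1, shape2, rate2)
                      for r in (1, 2, 3, 4))
    var = m2 - m1**2                                   # second central moment
    mu3 = m3 - 3 * m1 * m2 + 2 * m1**3                 # third central moment
    mu4 = m4 - 4 * m1 * m3 + 6 * m1**2 * m2 - 3 * m1**4  # fourth central moment
    return {
        "mean": m1,
        "variance": var,
        "cv": math.sqrt(var) / m1,
        "skewness": mu3 / var**1.5,
        "kurtosis": mu4 / var**2,
    }

# Lindley sub-model with theta = 1: weights 1/2 on Gamma(1, 1) and Gamma(2, 1).
print(summary(0.5, 1, 1.0, 2, 1.0))
# Exponential sub-model with rate 2 (all weight on the first component).
print(summary(1.0, 1, 2.0, 1, 2.0))
```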
Distribution of order statistics
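Let $X_1, X_2, \ldots, X_n$ denote $n$ independent random variables from a distribution function $F(x)$ with pdf $f(x)$; then the pdf of $X_{(j)}$ (the $j$th order statistic of the sample) is given by
$f_{j:n}(x) = \frac{n!}{(j-1)!\,(n-j)!}\, f(x)\, [F(x)]^{j-1}\, [1-F(x)]^{n-j}$ (19)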
Substituting Equations (4) and (5) into Equation (19), the pdf of $X_{(j)}$ according to the SPL distribution is given by
(20)
Hence, the pdf of the largest order statistic and the smallest order statistic are, respectively, given by
(21)
and
(22)
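The order statistic density depends on the parent distribution only through its pdf and cdf, so it can be sketched generically. In the illustration below the Lindley sub-model stands in for the parent distribution (an assumption made for brevity), and the helper name `order_stat_pdf` is hypothetical.

```python
import math

def order_stat_pdf(x, j, n, pdf, cdf):
    """Density of the j-th order statistic in a sample of size n."""
    coef = j * math.comb(n, j)  # equals n! / ((j - 1)! (n - j)!)
    big_f = cdf(x)
    return coef * pdf(x) * big_f ** (j - 1) * (1 - big_f) ** (n - j)

def lindley_pdf(x, theta):
    return theta**2 / (theta + 1) * (1 + x) * math.exp(-theta * x)

def lindley_cdf(x, theta):
    return 1 - (1 + theta + theta * x) / (1 + theta) * math.exp(-theta * x)

theta = 1.0
pdf = lambda x: lindley_pdf(x, theta)
cdf = lambda x: lindley_cdf(x, theta)

n = 5
for x in (0.5, 1.5, 3.0):
    smallest = order_stat_pdf(x, 1, n, pdf, cdf)  # n f(x) [1 - F(x)]^(n-1)
    largest = order_stat_pdf(x, n, n, pdf, cdf)   # n f(x) [F(x)]^(n-1)
    print(x, smallest, largest)
```

Setting `j = 1` and `j = n` gives the densities of the smallest and largest order statistics, respectively.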
Random variates generation
The probability density function of the SPL distribution can be expressed in terms of the gamma density function as follows
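To generate random variates $x_i$, for $i = 1, \ldots, n$, from the SPL distribution, we can use the following algorithm:
- Generate $u_i$ from the Uniform(0, 1) distribution
- Generate $v_i$ from the first gamma density (2)
- Generate $w_i$ from the second gamma density (3)
- If $u_i \le p$, then set $x_i = v_i$; otherwise, set $x_i = w_i$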
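The algorithm above can be sketched with the standard library alone. This is an illustration under assumptions: the argument names are hypothetical, the two gamma components are specified by shape and rate, and only the selected component is drawn for each variate, which gives the same distribution as drawing both and keeping one. Note that `random.gammavariate` takes a scale parameter, hence the reciprocal of the rate.

```python
import random

def rmixture(n, p, shape1, rate1, shape2, rate2, seed=None):
    """Draw n variates from p * Gamma(shape1, rate1) + (1 - p) * Gamma(shape2, rate2)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        u = rng.random()  # uniform on [0, 1)
        if u <= p:
            sample.append(rng.gammavariate(shape1, 1.0 / rate1))
        else:
            sample.append(rng.gammavariate(shape2, 1.0 / rate2))
    return sample

# Lindley sub-model with theta = 1: p = 1/2, components Gamma(1, 1) and Gamma(2, 1).
draws = rmixture(100000, 0.5, 1, 1.0, 2, 1.0, seed=1)
print(sum(draws) / len(draws))  # should be close to the theoretical mean 1.5
```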
Measures of inequality and uncertainty
In this section Lorenz and Bonferroni curves are introduced as measures of inequality. Also, Renyi entropy will be mentioned as an important measure of uncertainty.
Lorenz and Bonferroni curves
Lorenz and Bonferroni curves are the most widely used inequality measures in income and wealth distribution.10
In fact, the Lorenz and Bonferroni curves depend on the length-biased distribution with pdf defined by
(23)
Where is the pdf of the base distribution with mean
Accordingly, the Lorenz and Bonferroni curves are, respectively, defined by
(24)
where is the cdf of the length-biased distribution. Now, we shall derive the expressions of and based on and for SPLD.
It is easily shown that the pdf of the length-biased distribution can be obtained as follows
(25)
With cdf defined by
(26)
It follows from (12), (24), and (26) that and are
(27)
and
(28)
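The two curves can be evaluated numerically for any lifetime model from the incomplete first moment. The sketch below is an illustration under assumptions: it uses the common definitions $L(p) = \frac{1}{\mu}\int_0^q x f(x)\,dx$ and $B(p) = L(p)/p$ with $p = F(q)$, a trapezoidal rule, and the exponential sub-model as a test case, for which $L(p) = p + (1-p)\ln(1-p)$ is known in closed form. The function name is hypothetical.

```python
import math

def lorenz_bonferroni(q, pdf, cdf, mean, steps=20000):
    """Return (p, L(p), B(p)) at p = F(q), integrating x f(x) over (0, q).

    Assumes x * f(x) tends to 0 as x tends to 0, so the left endpoint contributes nothing.
    """
    h = q / steps
    total = 0.5 * q * pdf(q)
    for i in range(1, steps):
        x = i * h
        total += x * pdf(x)
    p = cdf(q)
    lorenz = total * h / mean
    return p, lorenz, lorenz / p

rate = 2.0
pdf = lambda x: rate * math.exp(-rate * x)
cdf = lambda x: 1 - math.exp(-rate * x)

for q in (0.2, 0.7, 1.5):
    p, lorenz, bonferroni = lorenz_bonferroni(q, pdf, cdf, 1 / rate)
    closed_form = p + (1 - p) * math.log(1 - p)
    print(p, lorenz, closed_form, bonferroni)
```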
Renyi entropy
If $X$ is a random variable having an absolutely continuous cdf $F(x)$ and pdf $f(x)$, then the basic uncertainty measure for the distribution $F$ (called the entropy of $F$) is defined as $H(X) = -\int f(x) \ln f(x)\,dx$. Statistical entropy is a probabilistic measure of uncertainty or ignorance about the outcome of a random experiment and is a measure of a reduction in that uncertainty. Numerous entropy and information indices, among them the Renyi entropy, have been developed and used in various disciplines and contexts. Information theoretic principles and methods have become integral parts of probability and statistics and have been applied in various branches of statistics and related fields.
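Renyi entropy is an extension of Shannon entropy. The Renyi entropy of the SPLD is defined to be
$I_R(\gamma) = \frac{1}{1-\gamma} \log\left( \int_0^{\infty} f^{\gamma}(x)\,dx \right)$ (29)
where $\gamma > 0$, $\gamma \neq 1$, and the Renyi entropy tends to the Shannon entropy as $\gamma \to 1$. Now,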
(30)
Using then one has
(31)
Using the expansion: one can have
(32)
(33)
Using the gamma function to evaluate the integral in (33), collecting all of the above results and substituting into (29), the Renyi entropy of the SPLD can be written as
Where is a constant as
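The definition of the Renyi entropy can be checked by direct numerical integration. The sketch below is an illustration under assumptions: it uses the standard definition $\frac{1}{1-\gamma}\log\int f^{\gamma}(x)\,dx$, a midpoint rule on a truncated interval, and the exponential sub-model as a test case, for which the entropy equals $-\ln\lambda + \ln\gamma/(\gamma-1)$. The argument name `order` stands for $\gamma$.

```python
import math

def renyi_entropy(pdf, order, upper, steps=200000):
    """Renyi entropy of the given order, integrating pdf**order over (0, upper)."""
    if order <= 0 or order == 1:
        raise ValueError("order must be positive and different from 1")
    h = upper / steps
    integral = h * sum(pdf((i + 0.5) * h) ** order for i in range(steps))
    return math.log(integral) / (1 - order)

rate = 2.0
exp_pdf = lambda x: rate * math.exp(-rate * x)

theta = 1.0
lindley_pdf = lambda x: theta**2 / (theta + 1) * (1 + x) * math.exp(-theta * x)

for order in (0.5, 2.0):
    closed_form = -math.log(rate) + math.log(order) / (order - 1)
    print(order, renyi_entropy(exp_pdf, order, 60.0), closed_form,
          renyi_entropy(lindley_pdf, order, 120.0))
```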
Estimation of the parameters
In this section, we use the method of maximum likelihood to estimate the parameters involved and use the estimates to create confidence intervals for the unknown parameters.
Let $x_1, x_2, \ldots, x_n$ be a random sample of size $n$ from the SPL distribution. Then the likelihood function is given by
Then,
(35)
Hence, the log-likelihood function becomes
(36)
Therefore, the maximum likelihood estimators (MLEs) of and are derived from the derivatives of .
They should satisfy the following equations
(37)
(38)
(39)
(40)
(41)
(42)
(43)
where $\psi(\cdot)$ is the digamma function, and it is defined as $\psi(x) = \frac{d}{dx} \ln \Gamma(x) = \frac{\Gamma'(x)}{\Gamma(x)}$
To solve Equations (37) through (43), it is usually more convenient to use nonlinear optimization algorithms, such as a quasi-Newton algorithm, to numerically maximize the log-likelihood function. In order to compute the standard errors and asymptotic confidence intervals, we use the usual large sample approximation, in which the MLEs can be treated as being approximately multivariate normal.
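Numerical maximization of a log-likelihood can be illustrated on a one-parameter sub-model. The sketch below is not the quasi-Newton procedure used for the full model: it swaps in a golden-section search, applied to the Lindley log-likelihood $\ell(\theta) = 2n\ln\theta - n\ln(1+\theta) + \sum\ln(1+x_i) - \theta\sum x_i$, and compares the result with the known closed-form Lindley MLE. The sample is the first eleven observations of the carbon fiber data listed in the Application section, used here purely for illustration.

```python
import math

def lindley_loglik(theta, data):
    n = len(data)
    return (2 * n * math.log(theta) - n * math.log(1 + theta)
            + sum(math.log(1 + x) for x in data) - theta * sum(data))

def maximize_1d(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximizer of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2

def lindley_mle_closed_form(data):
    """Positive root of xbar * theta^2 + (xbar - 1) * theta - 2 = 0."""
    xbar = sum(data) / len(data)
    return (-(xbar - 1) + math.sqrt((xbar - 1) ** 2 + 8 * xbar)) / (2 * xbar)

sample = [3.70, 2.74, 2.73, 2.50, 3.60, 3.11, 3.27, 2.87, 1.47, 3.11, 3.56]
theta_hat = maximize_1d(lambda t: lindley_loglik(t, sample), 1e-6, 10.0)
print(theta_hat, lindley_mle_closed_form(sample))
```

For the seven-parameter model the same idea applies with a multivariate optimizer in place of the one-dimensional search.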
Hence, as $n \to \infty$, the asymptotic distribution of the MLE is given by, see Zaindin et al.6
Where ( ), and
is the approximate variance-covariance matrix with its elements obtained from
By solving this inverse dispersion matrix, these solutions will yield the asymptotic variances and covariances of these MLEs for and .
Approximate confidence intervals for and can be determined as
, , ,
where $z_{\alpha/2}$ is the upper $\alpha/2$ percentile of the standard normal distribution.
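The interval construction can be illustrated on the simplest sub-model. For the exponential distribution with rate $\lambda$, the log-likelihood is $n\ln\lambda - \lambda\sum x_i$, so the observed information is $n/\lambda^2$ and the asymptotic standard error of the MLE is $\hat\lambda/\sqrt{n}$. The sketch below uses the exponential estimate reported in Table 2 ($\hat\lambda = 0.362379$, $n = 66$); the function name `wald_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def wald_interval(estimate, std_error, level=0.95):
    """Large-sample interval: estimate -/+ z * std_error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # upper (1 - level) / 2 quantile
    return estimate - z * std_error, estimate + z * std_error

n = 66
rate_hat = 0.362379                   # exponential MLE from Table 2
std_error = rate_hat / math.sqrt(n)   # square root of the inverse observed information
print(std_error)                      # compare with the std.e column of Table 2
print(wald_interval(rate_hat, std_error))
```

For the full model the standard errors come from the diagonal of the inverted information matrix instead of this one-parameter formula.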
Application
In this section, we use a real data set to compare the fit of the SPL distribution with those of four sub-models (Table 2). In each case, the parameters are estimated by maximum likelihood as described in Section 6, using the R software.
The data set consists of uncensored data from Nichols and Padgett21 on the breaking stress of carbon fibers (in GPa). The data are given below:
3.70, 2.74, 2.73, 2.50, 3.60, 3.11, 3.27, 2.87, 1.47, 3.11, 3.56,
4.42, 2.41, 3.19, 3.22, 1.69, 3.28, 3.09, 1.87, 3.15, 4.90, 1.57,
2.67, 2.93, 3.22, 3.39, 2.81, 4.20, 3.33, 2.55, 3.31, 3.31, 2.85,
1.25, 4.38, 1.84, 0.39, 3.68, 2.48, 0.85, 1.61, 2.79, 4.70, 2.03,
1.89, 2.88, 2.82, 2.05, 3.65, 3.75, 2.43, 2.95, 2.97, 3.39, 2.96,
2.35, 2.55, 2.59, 2.03, 1.61, 2.12, 3.15, 1.08, 2.56, 1.80, 2.53.
The summary of the above data is given by
Units | Minimum | 1st Qu. | Median | Mean | 3rd Qu. | Maximum
66 | 0.390 | 2.178 | 2.835 | 2.760 | 3.278 | 4.900
In order to compare the distribution models, we consider criteria like KS (Kolmogorov-Smirnov), $-2\ell$, AIC (Akaike information criterion), AICC (corrected Akaike information criterion), and BIC (Bayesian information criterion) for the data set. The better distribution corresponds to smaller KS, $-2\ell$, AIC and AICC values:
$\mathrm{AIC} = -2\ell + 2k$, $\mathrm{AICC} = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1}$,
and
$\mathrm{BIC} = -2\ell + k \ln(n)$,
where $\ell$ denotes the log-likelihood function evaluated at the maximum likelihood estimates, $k$ is the number of parameters, and $n$ is the sample size.
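As a worked check of these criteria, the sketch below fits the exponential sub-model to the carbon fiber data by maximum likelihood (for which the MLE of the rate is simply $n/\sum x_i$) and evaluates the criteria. It assumes the standard definitions of AIC, AICC and BIC, plus the Hannan-Quinn criterion $\mathrm{HQIC} = -2\ell + 2k\ln(\ln n)$ for the HQIC row of Table 2; the function names are illustrative.

```python
import math

data = [
    3.70, 2.74, 2.73, 2.50, 3.60, 3.11, 3.27, 2.87, 1.47, 3.11, 3.56,
    4.42, 2.41, 3.19, 3.22, 1.69, 3.28, 3.09, 1.87, 3.15, 4.90, 1.57,
    2.67, 2.93, 3.22, 3.39, 2.81, 4.20, 3.33, 2.55, 3.31, 3.31, 2.85,
    1.25, 4.38, 1.84, 0.39, 3.68, 2.48, 0.85, 1.61, 2.79, 4.70, 2.03,
    1.89, 2.88, 2.82, 2.05, 3.65, 3.75, 2.43, 2.95, 2.97, 3.39, 2.96,
    2.35, 2.55, 2.59, 2.03, 1.61, 2.12, 3.15, 1.08, 2.56, 1.80, 2.53,
]

def exponential_loglik(rate, xs):
    return len(xs) * math.log(rate) - rate * sum(xs)

def information_criteria(loglik, k, n):
    """AIC, AICC, BIC and HQIC for a model with k parameters fitted to n observations."""
    aic = -2 * loglik + 2 * k
    return {
        "AIC": aic,
        "AICC": aic + 2 * k * (k + 1) / (n - k - 1),
        "BIC": -2 * loglik + k * math.log(n),
        "HQIC": -2 * loglik + 2 * k * math.log(math.log(n)),
    }

n = len(data)
rate_hat = n / sum(data)  # closed-form MLE of the exponential rate
criteria = information_criteria(exponential_loglik(rate_hat, data), k=1, n=n)
print(rate_hat, criteria)
```

The printed values can be compared with the exponential column of Table 2.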
Also, for calculating the values of KS we use the sample estimates of the parameters. Table 1 lists the special cases of the SPL distribution, and Table 2 shows the parameter estimates based on maximum likelihood and gives the values of the criteria AIC, AICC, BIC, HQIC, and the KS test.
Distribution | Parameters | | | | | | | Author
Gamma | | | 1 | 1 | 0 | 1 | 1 | Brown & Flood11
ED | | 1 | 1 | 1 | 0 | 1 | 1 | Steffensen12
LD | | 1 | 2 | 1 | 1 | 1 | 1 | Lindley1
Erlang | | | 1 | 1 | 0 | 1 | 1 | A. K. Erlang13
QLD | | 1 | 2 | | | 1 | 1 | Rama & Mishra7
GLD | | | | 1 | | 1 | 1 | Zakerzadeh & Dolati6
Janardan | | 1 | 2 | 1 | | 1 | 1 | Rama et al.8
NGLD | | | | 1 | 1 | 1 | 1 | Elbatal et al.9
Chi-square | 1/2 | | 1 | 1 | 0 | 1 | 1 | Fisher14
4-p L type I | | | | 1 | | 1 | 1 | New
4-p L type II | | | | | 1 | 1 | 1 | New
2-p L | | 1 | 2 | | 1 | 1 | 1 | New
5-p L | | | | | | 1 | 1 | New
Table 1 The special cases of the SPL distribution
Parameter | SPL coef | SPL std.e | FPLD coef | FPLD std.e | Lindley coef | Lindley std.e | Gamma coef | Gamma std.e | Exponential coef | Exponential std.e
alpha | 4.918145682 | 3.1418 | 4.918019 | 2.273544 | 4.9181 | 2.2736 | 7.48803 | 1.27552 | - | -
beta | 13.32661771 | 2.273564 | 13.32648 | 3.141746 | 13.3266 | 3.1418 | 2.7135 | 0.47806 | - | -
theta | 4.608695814 | 1.033154 | 4.608644 | 1.033136 | 4.6087 | 1.0332 | - | - | 0.362379 | 0.044606
phi | 0.033886236 | 0.452362 | - | - | - | - | - | - | - | -
k | 1.471738523 | 0.056224 | 0.103372 | 0.080925 | - | - | - | - | - | -
eta | 3.406750687 | 0.187052 | 6.104362 | 0.001398 | 59.0538 | 46.2436 | - | - | - | -
sigma | 2.438262765 | 0.236172 | - | - | - | - | - | - | - | -
Criterion | SPL | FPLD | Lindley | Gamma | Exponential
AIC | 177.3931 | 181.3931 | 179.3931 | 186.3351 | 267.9887
BIC | 182.0655 | 192.3413 | 188.1517 | 190.7144 | 270.1784
AICC | 175.462 | 182.3931 | 180.0488 | 186.5256 | 268.0512
HQIC | 171.3364 | 185.7192 | 182.854 | 188.0656 | 268.8539
K-S | 0.07 | 0.070003 | 0.0713806 | 0.13285 | 0.35811
P-value | 0.9031 | 0.9028 | 0.8569 | 0.1945 | 8.89E-08
Table 2 Parameter estimates (coef), standard errors (std.e) and goodness-of-fit criteria for the fitted models
The values in Table 2 indicate that the SPL distribution leads to a better fit over all the other models.
A density plot compares the fitted densities of the models with the empirical histogram of the observed data (Figures 3-5). The fitted density for the SPL model is closer to the empirical histogram than the fits of the other models.15-22
Figure 3 Increasing, decreasing, constant, bathtub and upside-down shapes for the hazard rate function of the SPLD.
Figure 4 (a) Estimated densities of the SPL distributions for the data.
(b) Estimated cdf from the fitted SPL distributions and the empirical cdf for the data.
Figure 5 PP plots for the fitted SPL distribution and for the data set.
Concluding remarks
There has been great interest among statisticians and applied researchers in constructing flexible lifetime models to facilitate better modeling of survival data. Consequently, significant progress has been made towards the generalization of some well-known lifetime models and their successful application to problems in several areas. In this paper, we introduce a new seven-parameter distribution obtained using the idea of mixtures of distributions. We refer to the new model as the seven parameter Lindley (SPL) distribution and study some of its mathematical and statistical properties. We provide the pdf, the cdf and the hazard rate function of the new model and explicit expressions for the moments. The model parameters are estimated by the method of maximum likelihood. The new model is compared with four lifetime models and provides a better fit than each of them on the data considered. We hope that the proposed distribution will serve as an alternative to other models available in the literature for modeling positive real data in many areas such as engineering, survival analysis, hydrology and economics.
References
- Lindley DV. Fiducial distributions and Bayes' theorem. Journal of the Royal Statistical Society. Series B (Methodological). 1958;20(1):102‒107.
- Sankaran M. The Discrete Poisson-Lindley Distribution. Biometrics. 1970;26(1):145‒149.
- Ghitany M E, Al-Mutairi D K, Nadarajah S. Zero-truncated Poisson–Lindley distribution and its application. Mathematics and Computers in Simulation. 2008;79(3):279‒287.
- Gómez-Deniz E, Calderon-Ojeda E. The discrete Lindley distribution: properties and applications. Journal of Statistical Computation and Simulation. 2011;81(11):1405‒1416.
- Ghitany M E, Atieh B, Nadarajah S. Lindley distribution and its application. Mathematics and computers in simulation. 2008;78(4):493‒506.
- Zakerzadeh H, Dolati A. Generalized Lindley Distribution. Journal of Mathematical Extension. 2009;3(2):13‒22.
- Shanker R, Mishra A. A quasi Lindley distribution. African Journal of Mathematics and Computer Science Research. 2013;6(4):64‒71.
- Shanker R, Sharma S, Shanker, et al. Janardan distribution and its application to waiting times data. Indian Journal of Applied Research. 2013;3(8):500‒502.
- Elbatal I, Merovci F, Elgarhy M. A new generalized Lindley distribution. Mathematical Theory and Modeling. 2013;3(13):30‒47.
- Dagum C. Specification and analysis of wealth distribution models with applications. American Statistical Association, Business and Economic Statistics Section, Toronto. 2014;1116‒1123.
- Brown G W, Flood M M. Tumbler mortality. Journal of the American Statistical Association. 1947;42(240):562‒574.
- https://www.cambridge.org/core/journals/transactions-of-the-faculty-of-actuaries/article/abs/some-recent-researches-in-the-theory-of-statistics-and-actuarial-science-by-j-f-steffensen-pp-48-cambridge-institute-of-actuaries-1930/ACD1158BE3A85F98F666E2B09B5589AD
- Erlang A K. Solution of some problems in the theory of probabilities of significance in automatic telephone exchanges. The Post Office Electrical Engineer’s Journal. 1997;10:189‒197.
- Fisher RA. On a distribution yielding the error functions of several well known statistics. Proceedings International Mathematical Congress, Toronto. 1924;2:805‒813.
- Balakrishnan K. Exponential distribution: theory, methods and applications. CRC Press. 1996.
- Bozdogan H. Model selection and Akaike's information criterion (AIC): The general theory and its analytical extensions. Psychometrika. 1987;52(3):345‒370.
- Guess F, Proschan F. Mean residual life: theory and applications (No. FSU-STATISTICS-M702). Dept. of Statistics, Florida State University, Tallahassee. 1985.
- Hogg RV, McKean J, Craig AT. Introduction to mathematical statistics. Pearson Education. 2005.
- Jeffrey A, Zwillinger D. Table of integrals, series and products. Academic Press. 2007.
- Klein JP, Zhang MJ. Survival analysis, software. John Wiley & Sons, Ltd. 2005.
- Nichols MD, Padgett WJ. A bootstrap control chart for Weibull percentiles. Quality and Reliability Engineering International. 2006;22(2):141‒151.
- Renyi A. On measures of entropy and information. In Fourth Berkeley Symposium on Mathematical Statistics and Probability. 1961. p. 547‒561.
©2021 Ahmed, et al. This is an open access article distributed under the terms of the,
which
permits unrestricted use, distribution, and build upon your work non-commercially.