Research Article Volume 12 Issue 2
Department of Statistics, Assam University, Silchar, Assam, India
Correspondence: Department of Statistics, Assam University, Silchar, Assam, India
Received: April 05, 2023  Published: April 25, 2023
Citation: Shanker R. Komal distribution with properties and application in survival analysis. Biom Biostat Int J. 2023;12(2):40-44. DOI: 10.15406/bbij.2023.12.00381
The modeling and analysis of lifetime data remain a challenge for statisticians and policy makers because lifetime data are, in general, stochastic in nature. During recent decades several one-parameter lifetime distributions have been proposed, but they are not always suitable, owing either to the nature of the distribution or to the stochastic nature of the data. In this paper a new one-parameter lifetime distribution, named the Komal distribution, is proposed. Its statistical properties, the estimation of its parameter and an application to a lifetime dataset are presented.
Keywords: lifetime distributions, statistical properties, estimation of parameter, applications
In the present era, modeling lifetime data is a serious challenge because such data are stochastic in nature, and policy makers often struggle to find a suitable distribution for a given lifetime dataset. During recent decades several one-parameter lifetime distributions have been proposed in the statistics literature, but owing either to the nature of the distribution or to the nature of the data, they do not always give a proper fit. Researchers in distribution theory therefore keep proposing new lifetime distributions suited to the stochastic nature of lifetime data. Up to 1958, the exponential distribution was the only lifetime distribution in use for the analysis and modeling of lifetime data. Lindley^{1} proposed another lifetime distribution, known as the Lindley distribution, and Ghitany et al.^{2}, after a detailed study of its statistical properties and applications, observed that the Lindley distribution gives a much closer fit than the exponential distribution. In a comparative study of the exponential and Lindley distributions, Shanker et al.^{3} observed that the two distributions compete well with each other, but that there are datasets where neither provides a good fit. Shanker^{4,5} proposed two new one-parameter lifetime distributions, the Shanker distribution and the Akash distribution, which gave a much better fit than both the exponential and Lindley distributions. Shanker et al.^{6} provide a comparative study of applications of the exponential, Lindley and Akash distributions. Further, Shanker & Hagos^{7} presented a detailed study of applications of the exponential, Lindley, Shanker and Akash distributions and showed that there are still datasets where these distributions do not provide a good fit. Shanker^{8} then introduced the Sujatha distribution, which provides a much better fit than the exponential, Lindley, Shanker and Akash distributions.
Again, Shanker^{9} proposed another one-parameter lifetime distribution, the Garima distribution, to model data arising from the behavioral sciences, but this also does not give a good fit to several real lifetime datasets. The question is therefore how to find a distribution that is both flexible and tractable enough to capture the variation in such datasets. When a distribution does not give a good fit, some researchers transform the dataset to satisfy the assumptions of the distribution, but this is not preferable because the original nature of the dataset is lost. Others modify the distribution by adding an extra shape or scale parameter to suit the nature of the dataset. Instead of transforming the original dataset or modifying the distribution to suit it, it is better to search for a new distribution that provides a better fit for the given datasets when the existing distributions fail to provide a good fit.
In the present paper an attempt has been made to propose a new one-parameter lifetime distribution, named the Komal distribution, which would provide a better fit than the exponential, Lindley, Shanker, Akash and Sujatha distributions. Some of its statistical properties, the estimation of its parameter and an application to a real lifetime dataset are discussed and presented.
Taking the convex combination of exponential $\left(\theta \right)$ and gamma $\left(2,\theta \right)$ distributions with respective mixing proportions $\frac{\theta \left(\theta +1\right)}{{\theta}^{2}+\theta +1}$ and $\frac{1}{{\theta}^{2}+\theta +1}$ , a new probability density function (pdf) can be expressed as
$f\left(x;\theta \right)=\frac{{\theta}^{2}}{{\theta}^{2}+\theta +1}\left(1+\theta +x\right)\,{e}^{-\theta x};\quad x>0,\ \theta >0$
We call this new distribution the ‘Komal distribution’. Since, like other one-parameter lifetime distributions, it is derived as a convex combination of an exponential distribution and a gamma distribution, it is expected to give a better fit than the exponential distribution and other one-parameter distributions derived in the same way. The cumulative distribution function (cdf) of the Komal distribution can be obtained as
$F\left(x;\theta \right)=1-\left[1+\frac{\theta x}{{\theta}^{2}+\theta +1}\right]{e}^{-\theta x};\quad x>0,\ \theta >0$
The behaviour of the pdf and the cdf of the Komal distribution for varying values of the parameter $\theta $ is presented in Figures 1 and 2, respectively.
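The pdf and cdf above are easy to check numerically. The following is a minimal Python sketch (the language and the function names `komal_pdf`, `komal_cdf` and `mixture_pdf` are illustrative choices, not part of the paper) that evaluates the density both in closed form and as the exponential-gamma mixture, with $\theta =0.5$ picked arbitrarily.

```python
import math


def komal_pdf(x, theta):
    """Closed-form Komal density: theta^2/(theta^2+theta+1) * (1+theta+x) * exp(-theta*x)."""
    d = theta**2 + theta + 1.0
    return theta**2 / d * (1.0 + theta + x) * math.exp(-theta * x)


def komal_cdf(x, theta):
    """Komal cdf: 1 - [1 + theta*x/(theta^2+theta+1)] * exp(-theta*x)."""
    d = theta**2 + theta + 1.0
    return 1.0 - (1.0 + theta * x / d) * math.exp(-theta * x)


def mixture_pdf(x, theta):
    """The same density written as a two-component mixture."""
    d = theta**2 + theta + 1.0
    p = theta * (theta + 1.0) / d                 # weight of the exponential(theta) component
    expo = theta * math.exp(-theta * x)           # exponential(theta) density
    gam2 = theta**2 * x * math.exp(-theta * x)    # gamma(2, theta) density
    return p * expo + (1.0 - p) * gam2            # 1 - p equals 1/(theta^2+theta+1)


theta = 0.5  # arbitrary illustrative value
for x in (0.1, 1.0, 5.0):
    print(x, komal_pdf(x, theta), mixture_pdf(x, theta), komal_cdf(x, theta))
```

The two density columns should agree up to rounding error, and the cdf column should increase from 0 towards 1.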
Moments are needed to study the descriptive measures of a distribution, such as the coefficient of variation, skewness, kurtosis and index of dispersion. Following the approach used by Shanker^{4,5} to obtain the $r$ th moment of the Shanker and Akash distributions, the $r$ th moment about the origin, ${\mu}_{r}{}^{\prime}$ , of the Komal distribution can be obtained as
${\mu}_{r}{}^{\prime}=E\left({X}^{r}\right)=\frac{{\theta}^{2}}{{\theta}^{2}+\theta +1}\int_{0}^{\infty}{x}^{r}\left(1+\theta +x\right){e}^{-\theta x}\,dx$
$=\frac{r!\left({\theta}^{2}+\theta +r+1\right)}{{\theta}^{r}\left({\theta}^{2}+\theta +1\right)};\quad r=1,2,3,\ldots $ (3.1)
Substituting $r=1,2,3,4$ in (3.1), the first four moments about origin of Komal distribution can be obtained as
${\mu}_{1}{}^{\prime}=\frac{{\theta}^{2}+\theta +2}{\theta \left({\theta}^{2}+\theta +1\right)}$ ,${\mu}_{2}{}^{\prime}=\frac{2\left({\theta}^{2}+\theta +3\right)}{{\theta}^{2}\left({\theta}^{2}+\theta +1\right)}$
${\mu}_{3}{}^{\prime}=\frac{6\left({\theta}^{2}+\theta +4\right)}{{\theta}^{3}\left({\theta}^{2}+\theta +1\right)}$ ,${\mu}_{4}{}^{\prime}=\frac{24\left({\theta}^{2}+\theta +5\right)}{{\theta}^{4}\left({\theta}^{2}+\theta +1\right)}$ .
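As a sanity check on (3.1), the closed-form moments can be compared with a direct numerical integration of $x^{r}f(x;\theta )$. The sketch below is in Python and uses a hand-written composite Simpson rule; the names `komal_raw_moment` and `simpson`, the value $\theta =1.5$ and the truncation of the integral at 60 are assumptions made only for illustration.

```python
import math


def komal_pdf(x, theta):
    """Komal density."""
    d = theta**2 + theta + 1.0
    return theta**2 / d * (1.0 + theta + x) * math.exp(-theta * x)


def komal_raw_moment(r, theta):
    """Closed form (3.1): r! (theta^2+theta+r+1) / (theta^r (theta^2+theta+1))."""
    d = theta**2 + theta + 1.0
    return math.factorial(r) * (d + r) / (theta**r * d)


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


theta = 1.5  # arbitrary illustrative value
for r in range(1, 5):
    # The upper limit 60 truncates the integral; the tail is negligible for this theta.
    numeric = simpson(lambda x: x**r * komal_pdf(x, theta), 0.0, 60.0)
    print(r, komal_raw_moment(r, theta), numeric)
```

For each $r$ the two printed values should agree to several decimal places.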
The moments about the mean of Komal distribution, using relationship between moments about the mean and the moments about the origin, can thus be obtained as
${\mu}_{2}=\frac{{\theta}^{4}+2{\theta}^{3}+5{\theta}^{2}+4\theta +2}{{\theta}^{2}{\left({\theta}^{2}+\theta +1\right)}^{2}}$
${\mu}_{3}=\frac{2\left({\theta}^{6}+3{\theta}^{5}+9{\theta}^{4}+13{\theta}^{3}+12{\theta}^{2}+6\theta +2\right)}{{\theta}^{3}{\left({\theta}^{2}+\theta +1\right)}^{3}}$
${\mu}_{4}=\frac{3\left(3{\theta}^{8}+12{\theta}^{7}+42{\theta}^{6}+84{\theta}^{5}+119{\theta}^{4}+112{\theta}^{3}+76{\theta}^{2}+32\theta +8\right)}{{\theta}^{4}{\left({\theta}^{2}+\theta +1\right)}^{4}}$ .
The descriptive constants including coefficient of variation (CV), coefficient of skewness (CS), coefficient of kurtosis (CK) and the index of dispersion (ID) of Komal distribution are thus obtained as
$CV=\frac{\sqrt{{\mu}_{2}}}{{\mu}_{1}{}^{\prime}}=\frac{\sqrt{{\theta}^{4}+2{\theta}^{3}+5{\theta}^{2}+4\theta +2}}{{\theta}^{2}+\theta +2}$
$CS=\frac{{\mu}_{3}{}^{2}}{{\mu}_{2}{}^{3}}=\frac{4{\left({\theta}^{6}+3{\theta}^{5}+9{\theta}^{4}+13{\theta}^{3}+12{\theta}^{2}+6\theta +2\right)}^{2}}{{\left({\theta}^{4}+2{\theta}^{3}+5{\theta}^{2}+4\theta +2\right)}^{3}}$
$CK=\frac{{\mu}_{4}}{{\mu}_{2}{}^{2}}=\frac{3\left(3{\theta}^{8}+12{\theta}^{7}+42{\theta}^{6}+84{\theta}^{5}+119{\theta}^{4}+112{\theta}^{3}+76{\theta}^{2}+32\theta +8\right)}{{\left({\theta}^{4}+2{\theta}^{3}+5{\theta}^{2}+4\theta +2\right)}^{2}}$
$ID=\frac{{\mu}_{2}}{{\mu}_{1}{}^{\prime}}=\frac{{\theta}^{4}+2{\theta}^{3}+5{\theta}^{2}+4\theta +2}{\theta \left({\theta}^{2}+\theta +1\right)\left({\theta}^{2}+\theta +2\right)}$ .
The behaviour of the coefficient of variation (CV), coefficient of skewness (CS), coefficient of kurtosis (CK) and index of dispersion (ID) of the Komal distribution for changing values of the parameter is shown in Figure 3. From the expressions above, the CV, CS and CK are non-decreasing in $\theta $ (the CK, for instance, rises from 6 as $\theta \to 0$ to 9 as $\theta \to \infty $ ), whereas the ID is non-increasing.
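The descriptive constants can also be tabulated directly from the raw moments in (3.1), which gives an independent check on the closed-form expressions. The following Python sketch uses exact rational arithmetic from the standard library; the function names and the grid of $\theta $ values are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial, sqrt


def raw_moment(r, theta):
    """Raw moment (3.1), exact when theta is a Fraction."""
    d = theta**2 + theta + 1
    return Fraction(factorial(r)) * (d + r) / (theta**r * d)


def central_moments(theta):
    """Mean and 2nd to 4th central moments via the standard raw-to-central relations."""
    m1, m2, m3, m4 = (raw_moment(r, theta) for r in (1, 2, 3, 4))
    mu2 = m2 - m1**2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1**3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4
    return m1, mu2, mu3, mu4


def descriptive_constants(theta):
    """CV, CS (= mu3^2 / mu2^3), CK and ID as defined in the text."""
    m1, mu2, mu3, mu4 = central_moments(theta)
    return {
        "CV": sqrt(mu2) / float(m1),
        "CS": float(mu3**2 / mu2**3),
        "CK": float(mu4 / mu2**2),
        "ID": float(mu2 / m1),
    }


for theta in (Fraction(1, 10), Fraction(1, 2), Fraction(1), Fraction(2), Fraction(5)):
    constants = descriptive_constants(theta)
    print(float(theta), {name: round(value, 4) for name, value in constants.items()})
```

Because the moments are exact fractions, `central_moments` can be compared term by term with the polynomial expressions for ${\mu}_{2}$ , ${\mu}_{3}$ and ${\mu}_{4}$ given above.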
Hazard rate function
The hazard rate function of a random variable $X$ having pdf $f\left(x;\theta \right)$ and cdf $F\left(x;\theta \right)$ is defined as
$h\left(x,\theta \right)=\underset{\Delta x\to 0}{\mathrm{lim}}\frac{P\left(X<x+\Delta x\mid X>x\right)}{\Delta x}=\frac{f\left(x;\theta \right)}{1-F\left(x;\theta \right)}$
Thus, the hazard rate function of Komal distribution can be obtained as
$h\left(x,\theta \right)=\frac{{\theta}^{2}\left(1+\theta +x\right)}{\left({\theta}^{2}+\theta +1+\theta x\right)}$ .
This gives $h\left(0,\theta \right)=\frac{{\theta}^{2}\left(\theta +1\right)}{{\theta}^{2}+\theta +1}=f\left(0,\theta \right)$ . The behaviour of the hazard rate function of the Komal distribution for various values of the parameter $\theta $ is shown in Figure 4. The hazard rate of the Komal distribution is monotonically non-decreasing in $x$ . Further, as the value of the parameter increases, the hazard rate is scaled up.
Figure 4 Graphs of the hazard rate function of Komal distribution for selected values of the parameter.
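The closed-form hazard rate can be checked against its definition $f/(1-F)$ . The Python sketch below does this for an arbitrary $\theta =0.8$ ; the function names are illustrative, and the remark that the hazard approaches $\theta $ for large $x$ follows from the closed form rather than from a statement in the paper.

```python
import math


def komal_pdf(x, theta):
    """Komal density."""
    d = theta**2 + theta + 1.0
    return theta**2 / d * (1.0 + theta + x) * math.exp(-theta * x)


def komal_survival(x, theta):
    """Survival function S(x) = 1 - F(x)."""
    d = theta**2 + theta + 1.0
    return (1.0 + theta * x / d) * math.exp(-theta * x)


def komal_hazard(x, theta):
    """Closed-form hazard: theta^2 (1+theta+x) / (theta^2+theta+1+theta*x)."""
    d = theta**2 + theta + 1.0
    return theta**2 * (1.0 + theta + x) / (d + theta * x)


theta = 0.8  # arbitrary illustrative value
for x in (0.0, 1.0, 5.0, 50.0):
    ratio = komal_pdf(x, theta) / komal_survival(x, theta)  # definition f / (1 - F)
    print(x, komal_hazard(x, theta), ratio)
```

The two columns should coincide, start at $f(0,\theta )$ , and increase with $x$ while staying below $\theta $ .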
Mean residual life function
Let $X$ be a random variable over the support $\left(0,\infty \right)$ representing the lifetime of a system. The mean residual life (MRL) function measures the expected value of the remaining lifetime of the system, given that it has survived up to time $x$ . Consider the conditional random variable ${X}_{x}=\left(X-x\mid X>x\right);\ x>0$ . Then, with $S\left(x\right)=1-F\left(x\right)$ denoting the survival function, the MRL function, denoted by $m\left(x\right)$ , is defined as
$m\left(x\right)=E\left({X}_{x}\right)=\frac{1}{S\left(x\right)}\int_{x}^{\infty}\left[1-F\left(t\right)\right]\,dt$
The MRL function of Komal distribution can thus be obtained as
$m\left(x\right)=\frac{1}{\left\{{\theta}^{2}+\theta +1+\theta x\right\}{e}^{-\theta x}}\int_{x}^{\infty}t\,{\theta}^{2}\left(1+\theta +t\right){e}^{-\theta t}\,dt-x$ $=\frac{{\theta}^{2}+\theta +2+\theta x}{\theta \left({\theta}^{2}+\theta +1+\theta x\right)}$ .
This gives $m\left(0\right)=\frac{{\theta}^{2}+\theta +2}{\theta \left({\theta}^{2}+\theta +1\right)}={\mu}_{1}{}^{\prime}$ . The behaviour of the mean residual life function of the Komal distribution for various values of the parameter is shown in Figure 5. It is clear that the mean residual life function of the Komal distribution is monotonically non-increasing.
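The closed-form MRL can be verified against its defining integral of the survival function. The Python sketch below uses a hand-written Simpson rule; the function names, the value $\theta =1.2$ and the finite integration width of 80 are illustrative assumptions.

```python
import math


def komal_survival(x, theta):
    """Survival function S(x) = [1 + theta*x/(theta^2+theta+1)] exp(-theta*x)."""
    d = theta**2 + theta + 1.0
    return (1.0 + theta * x / d) * math.exp(-theta * x)


def komal_mrl(x, theta):
    """Closed-form MRL: (theta^2+theta+2+theta*x) / (theta (theta^2+theta+1+theta*x))."""
    d = theta**2 + theta + 1.0
    return (d + 1.0 + theta * x) / (theta * (d + theta * x))


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def komal_mrl_numeric(x, theta, width=80.0):
    """MRL from the definition, truncating the integral at x + width."""
    integral = simpson(lambda t: komal_survival(t, theta), x, x + width)
    return integral / komal_survival(x, theta)


theta = 1.2  # arbitrary illustrative value
for x in (0.0, 1.0, 5.0):
    print(x, komal_mrl(x, theta), komal_mrl_numeric(x, theta))
```

At $x=0$ both values should equal the mean ${\mu}_{1}{}^{\prime}$ , and both should decrease as $x$ grows.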
Stochastic ordering
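In probability theory and statistics, a stochastic order quantifies the concept of one random variable being bigger than another. A random variable $X$ is said to be smaller than a random variable $Y$ in the
(i) stochastic order $\left(X{<}_{st}Y\right)$ if ${F}_{X}\left(x\right)\ge {F}_{Y}\left(x\right)$ for all $x$
(ii) hazard rate order $\left(X{<}_{hr}Y\right)$ if ${h}_{X}\left(x\right)\ge {h}_{Y}\left(x\right)$ for all $x$
(iii) mean residual life order $\left(X{<}_{mrl}Y\right)$ if ${m}_{X}\left(x\right)\le {m}_{Y}\left(x\right)$ for all $x$
(iv) likelihood ratio order $\left(X{<}_{lr}Y\right)$ if $\frac{{f}_{X}\left(x\right)}{{f}_{Y}\left(x\right)}$ decreases in $x$ .
The following results due to Shaked & Shanthikumar^{10} are well known for establishing stochastic ordering of distributions:
$X{<}_{lr}Y\Rightarrow X{<}_{hr}Y\Rightarrow X{<}_{mrl}Y$ and $X{<}_{hr}Y\Rightarrow X{<}_{st}Y$ .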
Theorem: Let $X\sim $ Komal distribution $\left({\theta}_{1}\right)$ and $Y\sim $ Komal distribution $\left({\theta}_{2}\right)$ . If ${\theta}_{1}>{\theta}_{2}$ , then $X{<}_{lr}Y$ and hence $X{<}_{hr}Y$ , $X{<}_{mrl}Y$ and $X{<}_{st}Y$ .
Proof: We have
$\frac{{f}_{X}\left(x;{\theta}_{1}\right)}{{f}_{Y}\left(x;{\theta}_{2}\right)}=\left[\frac{{\theta}_{1}{}^{2}\left({\theta}_{2}{}^{2}+{\theta}_{2}+1\right)}{{\theta}_{2}{}^{2}\left({\theta}_{1}{}^{2}+{\theta}_{1}+1\right)}\right]\left(\frac{1+{\theta}_{1}+x}{1+{\theta}_{2}+x}\right){e}^{-\left({\theta}_{1}-{\theta}_{2}\right)x}$ .
This gives
$\mathrm{log}\left[\frac{{f}_{X}\left(x;{\theta}_{1}\right)}{{f}_{Y}\left(x;{\theta}_{2}\right)}\right]=\mathrm{log}\left[\frac{{\theta}_{1}{}^{2}\left({\theta}_{2}{}^{2}+{\theta}_{2}+1\right)}{{\theta}_{2}{}^{2}\left({\theta}_{1}{}^{2}+{\theta}_{1}+1\right)}\right]+\mathrm{log}\left(\frac{1+{\theta}_{1}+x}{1+{\theta}_{2}+x}\right)-\left({\theta}_{1}-{\theta}_{2}\right)x$
Therefore, $\frac{d}{dx}\mathrm{log}\left[\frac{{f}_{X}\left(x;{\theta}_{1}\right)}{{f}_{Y}\left(x;{\theta}_{2}\right)}\right]=\frac{{\theta}_{2}-{\theta}_{1}}{\left(1+{\theta}_{1}+x\right)\left(1+{\theta}_{2}+x\right)}-\left({\theta}_{1}-{\theta}_{2}\right)$
Thus, for ${\theta}_{1}>{\theta}_{2}$ , $\frac{d}{dx}\mathrm{log}\left[\frac{{f}_{X}\left(x;{\theta}_{1}\right)}{{f}_{Y}\left(x;{\theta}_{2}\right)}\right]<0$ . This means that $X{<}_{lr}Y$ and hence $X{<}_{hr}Y$ , $X{<}_{mrl}Y$ and $X{<}_{st}Y$ .
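The ordering result can be illustrated numerically for one pair of parameters. In the Python sketch below, the values ${\theta}_{1}=1.5$ and ${\theta}_{2}=0.5$ , the evaluation grid and the function names are assumptions chosen for illustration; a check on a finite grid is of course not a proof.

```python
import math


def komal_pdf(x, theta):
    """Komal density."""
    d = theta**2 + theta + 1.0
    return theta**2 / d * (1.0 + theta + x) * math.exp(-theta * x)


def komal_cdf(x, theta):
    """Komal cdf."""
    d = theta**2 + theta + 1.0
    return 1.0 - (1.0 + theta * x / d) * math.exp(-theta * x)


def log_ratio(x, t1, t2):
    """log of f_X(x; t1) / f_Y(x; t2)."""
    return math.log(komal_pdf(x, t1)) - math.log(komal_pdf(x, t2))


def log_ratio_derivative(x, t1, t2):
    """Derivative of the log ratio with respect to x, as derived in the proof."""
    return (t2 - t1) / ((1.0 + t1 + x) * (1.0 + t2 + x)) - (t1 - t2)


theta1, theta2 = 1.5, 0.5                 # theta1 > theta2
grid = [0.25 * i for i in range(81)]      # x = 0, 0.25, ..., 20
values = [log_ratio(x, theta1, theta2) for x in grid]

# Likelihood ratio order: the log ratio is decreasing in x.
is_decreasing = all(a > b for a, b in zip(values, values[1:]))
# Implied stochastic order: F_X(x) >= F_Y(x) on the grid.
cdf_dominates = all(komal_cdf(x, theta1) >= komal_cdf(x, theta2) for x in grid)
print(is_decreasing, cdf_dominates)
```

Both flags should come out true for any pair with ${\theta}_{1}>{\theta}_{2}$ .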
Method of moment estimate
Since the Komal distribution has one parameter, equating the population mean to the corresponding sample mean $\left(\overline{x}\right)$ , we get a third-degree polynomial equation in the parameter $\theta $ of the form
$\overline{x}\,{\theta}^{3}+\left(\overline{x}-1\right){\theta}^{2}+\left(\overline{x}-1\right)\theta -2=0$ .
Solving this third-degree polynomial equation using the Newton-Raphson method, we can easily get the moment estimate of the parameter.
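A minimal Python sketch of this Newton-Raphson iteration is given below. The function name `moment_estimate`, the tolerance and the starting value $1/\overline{x}$ are assumptions; the starting value is motivated by the fact that ${\mu}_{1}{}^{\prime}\theta $ always lies between 1 and 2, so the root lies between $1/\overline{x}$ and $2/\overline{x}$ . The data are the 20 failure times analysed later in the paper.

```python
def moment_estimate(xbar, tol=1e-12, max_iter=100):
    """Solve xbar*t^3 + (xbar-1)*t^2 + (xbar-1)*t - 2 = 0 for t > 0 by Newton-Raphson."""
    theta = 1.0 / xbar  # lower end of the bracket (1/xbar, 2/xbar) containing the root
    for _ in range(max_iter):
        g = xbar * theta**3 + (xbar - 1.0) * theta**2 + (xbar - 1.0) * theta - 2.0
        dg = 3.0 * xbar * theta**2 + 2.0 * (xbar - 1.0) * theta + (xbar - 1.0)
        step = g / dg
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")


# Failure times of 20 electric bulbs (the dataset used in the application section).
data = [1.32, 12.37, 6.56, 5.05, 11.58, 10.56, 21.82, 3.60, 1.33, 12.62,
        5.36, 7.71, 3.53, 19.61, 36.63, 0.39, 21.35, 7.22, 12.42, 8.92]
xbar = sum(data) / len(data)
theta_mom = moment_estimate(xbar)
print(xbar, theta_mom)
```

The returned value can then serve as the starting point for maximum likelihood estimation.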
Maximum likelihood estimate
Let $\left({x}_{1},{x}_{2},{x}_{3},\ldots ,{x}_{n}\right)$ be a random sample of size $n$ from the Komal distribution. The log-likelihood function, $\mathrm{log}L$ , of the Komal distribution is given by
$\mathrm{log}L=\sum_{i=1}^{n}\mathrm{log}f\left({x}_{i};\theta \right)=n\left\{2\,\mathrm{log}\theta -\mathrm{log}\left({\theta}^{2}+\theta +1\right)\right\}+\sum_{i=1}^{n}\mathrm{log}\left(1+\theta +{x}_{i}\right)-n\,\theta \,\overline{x}$
The maximum likelihood estimate (MLE) $\left(\widehat{\theta}\right)$ of the parameter $\left(\theta \right)$ of the Komal distribution is the solution of the following log-likelihood equation
$\frac{d\,\mathrm{log}L}{d\theta}=\frac{2n}{\theta}-\frac{n\left(2\theta +1\right)}{{\theta}^{2}+\theta +1}+\sum_{i=1}^{n}\frac{1}{1+\theta +{x}_{i}}-n\,\overline{x}=0$
This gives
$\sum_{i=1}^{n}\frac{1}{1+\theta +{x}_{i}}+\frac{n\left(\theta +2\right)}{\theta \left({\theta}^{2}+\theta +1\right)}-n\,\overline{x}=0$ .
This is a nonlinear equation in $\theta $ . It can be solved using the Newton-Raphson method, available in R, to get the maximum likelihood estimate (MLE) of the parameter $\theta $ , taking the moment estimate of the parameter as the initial value. It should be noted that the moment estimate of the parameter will in general not be the same as the MLE.
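The paper refers to R for this step; the following is an equivalent minimal sketch in Python, written from the log-likelihood equation above. All function names are illustrative. The standard error is computed from the observed information, which is a common approximation that the paper does not spell out, so that line should be read as an assumption.

```python
import math

# Failure times of 20 electric bulbs (the dataset used in the application section).
data = [1.32, 12.37, 6.56, 5.05, 11.58, 10.56, 21.82, 3.60, 1.33, 12.62,
        5.36, 7.71, 3.53, 19.61, 36.63, 0.39, 21.35, 7.22, 12.42, 8.92]


def loglik(theta, xs):
    """Komal log-likelihood."""
    n = len(xs)
    d = theta**2 + theta + 1.0
    return (n * (2.0 * math.log(theta) - math.log(d))
            + sum(math.log(1.0 + theta + x) for x in xs)
            - theta * sum(xs))


def score(theta, xs):
    """First derivative of the log-likelihood with respect to theta."""
    n = len(xs)
    d = theta**2 + theta + 1.0
    return (2.0 * n / theta - n * (2.0 * theta + 1.0) / d
            + sum(1.0 / (1.0 + theta + x) for x in xs) - sum(xs))


def score_derivative(theta, xs):
    """Second derivative of the log-likelihood with respect to theta."""
    n = len(xs)
    d = theta**2 + theta + 1.0
    return (-2.0 * n / theta**2
            - n * (2.0 * d - (2.0 * theta + 1.0) ** 2) / d**2
            - sum(1.0 / (1.0 + theta + x) ** 2 for x in xs))


def moment_estimate(xbar, tol=1e-12, max_iter=100):
    """Moment estimate from the cubic equation, used as the starting value."""
    theta = 1.0 / xbar
    for _ in range(max_iter):
        g = xbar * theta**3 + (xbar - 1.0) * theta**2 + (xbar - 1.0) * theta - 2.0
        dg = 3.0 * xbar * theta**2 + 2.0 * (xbar - 1.0) * theta + (xbar - 1.0)
        theta, step = theta - g / dg, g / dg
        if abs(step) < tol:
            return theta
    raise RuntimeError("moment equation: Newton-Raphson did not converge")


def komal_mle(xs, theta0, tol=1e-10, max_iter=100):
    """Newton-Raphson on the score equation, started at theta0."""
    theta = theta0
    for _ in range(max_iter):
        step = score(theta, xs) / score_derivative(theta, xs)
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("score equation: Newton-Raphson did not converge")


theta_hat = komal_mle(data, moment_estimate(sum(data) / len(data)))
# Assumption: standard error from the observed information, 1 / sqrt(-d2 logL / d theta2).
std_error = 1.0 / math.sqrt(-score_derivative(theta_hat, data))
print(theta_hat, std_error, -2.0 * loglik(theta_hat, data))
```

The three printed quantities correspond to the $\widehat{\theta}$ , standard error and $-2\mathrm{log}L$ columns reported for the Komal distribution in Table 2.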
The application and goodness of fit of the Komal distribution are illustrated with the following failure-time dataset.
Data set: The following right-skewed dataset, relating to the failure times of 20 electric bulbs discussed by Murthy et al.^{11}, is considered; the observations are:
1.32, 12.37, 6.56, 5.05, 11.58, 10.56, 21.82, 3.60, 1.33, 12.62, 5.36, 7.71, 3.53, 19.61, 36.63,
0.39, 21.35, 7.22, 12.42, 8.92.
The ML estimate of the parameter (with its standard error in parentheses), $-2\mathrm{log}L$ , AIC (Akaike Information Criterion), AICC (Akaike Information Criterion corrected), BIC (Bayesian Information Criterion) and KS (Kolmogorov-Smirnov) statistic for the considered distributions for the given dataset have been computed and are presented in Table 2. The formulae for computing the AIC, AICC, BIC and KS statistic are as follows:
$AIC=-2\mathrm{log}L+2k$ , $AICC=AIC+\frac{2k\left(k+1\right)}{n-k-1}$ , $BIC=-2\mathrm{log}L+k\,\mathrm{log}n$ , $D=\underset{x}{\mathrm{sup}}\left|{F}_{n}\left(x\right)-{F}_{0}\left(x\right)\right|$ ,
where $k$ = number of parameters, $n$ = sample size, ${F}_{n}\left(x\right)$ = empirical distribution function of the sample and ${F}_{0}\left(x\right)$ = fitted cdf.
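These criteria are straightforward to compute once the maximised log-likelihood and the fitted cdf are available. The Python sketch below implements the four formulae as stated; the function names are illustrative, and the inputs in the usage lines (a log-likelihood of -50 and a three-point sample tested against a uniform cdf) are toy values that are not taken from the paper.

```python
import math


def information_criteria(loglik, k, n):
    """-2 log L, AIC, AICC and BIC for k parameters and sample size n."""
    minus2ll = -2.0 * loglik
    aic = minus2ll + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    bic = minus2ll + k * math.log(n)
    return {"-2logL": minus2ll, "AIC": aic, "AICC": aicc, "BIC": bic}


def ks_statistic(sample, cdf):
    """D = sup_x |F_n(x) - F_0(x)|, evaluated at the jumps of the empirical cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        # F_n jumps from (i-1)/n to i/n at x, so both sides of the jump are checked.
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d


def uniform_cdf(x):
    """cdf of the uniform distribution on [0, 1], used only for the toy example."""
    return min(max(x, 0.0), 1.0)


# Toy inputs, for illustration only.
ic = information_criteria(loglik=-50.0, k=1, n=20)
ks = ks_statistic([0.1, 0.5, 0.9], uniform_cdf)
print(ic, ks)
```

In an application, `loglik` would be the maximised log-likelihood of the fitted distribution and `cdf` its fitted cdf evaluated at $\widehat{\theta}$ .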
The pdf and the cdf of the fitted distributions are given in Table 1.
| Distributions | Pdf | Cdf |
|---|---|---|
| Garima | $f\left(x;\theta \right)=\frac{\theta}{\theta +2}\left(1+\theta +\theta x\right){e}^{-\theta x}$ | $F\left(x;\theta \right)=1-\left(1+\frac{\theta x}{\theta +2}\right){e}^{-\theta x}$ |
| Sujatha | $f\left(x;\theta \right)=\frac{{\theta}^{3}}{{\theta}^{2}+\theta +2}\left(1+x+{x}^{2}\right){e}^{-\theta x}$ | $F\left(x;\theta \right)=1-\left[1+\frac{\theta x\left(\theta x+\theta +2\right)}{{\theta}^{2}+\theta +2}\right]{e}^{-\theta x}$ |
| Akash | $f\left(x;\theta \right)=\frac{{\theta}^{3}}{{\theta}^{2}+2}\left(1+{x}^{2}\right){e}^{-\theta x}$ | $F\left(x;\theta \right)=1-\left[1+\frac{\theta x\left(\theta x+2\right)}{{\theta}^{2}+2}\right]{e}^{-\theta x}$ |
| Shanker | $f\left(x;\theta \right)=\frac{{\theta}^{2}}{{\theta}^{2}+1}\left(\theta +x\right){e}^{-\theta x}$ | $F\left(x;\theta \right)=1-\left(1+\frac{\theta x}{{\theta}^{2}+1}\right){e}^{-\theta x}$ |
| Lindley | $f\left(x;\theta \right)=\frac{{\theta}^{2}}{\theta +1}\left(1+x\right){e}^{-\theta x}$ | $F\left(x;\theta \right)=1-\left[1+\frac{\theta x}{\theta +1}\right]{e}^{-\theta x}$ |
| Exponential | $f\left(x;\theta \right)=\theta {e}^{-\theta x}$ | $F\left(x;\theta \right)=1-{e}^{-\theta x}$ |

Table 1 The pdf and the cdf of fitted distributions
The fitted plots of the considered distributions for the given dataset are presented in Figure 6. The goodness of fit in Table 2 and the fitted plots in Figure 6 show that the Komal distribution provides a better fit than the exponential, Lindley, Shanker, Akash and Sujatha distributions, and it can therefore be considered the most suitable of these for modeling lifetime data from biomedical science and engineering.
| Distributions | $\widehat{\theta}$ (SE) | $-2\mathrm{log}L$ | AIC | AICC | BIC | KS | P-value |
|---|---|---|---|---|---|---|---|
| Komal | 0.1745 (0.0275) | 133.33 | 135.33 | 135.90 | 135.52 | 0.0992 | 0.9914 |
| Garima | 0.1408 (0.0273) | 133.18 | 135.18 | 135.75 | 135.37 | 0.1255 | 0.9218 |
| Sujatha | 0.2689 (0.0345) | 137.54 | 139.54 | 140.11 | 139.73 | 0.1294 | 0.9037 |
| Akash | 0.2786 (0.0355) | 138.47 | 140.47 | 141.04 | 140.66 | 0.1607 | 0.6434 |
| Shanker | 0.1885 (0.0292) | 134.65 | 136.65 | 137.22 | 136.84 | 0.1172 | 0.9539 |
| Lindley | 0.1762 (0.0280) | 133.44 | 135.44 | 136.01 | 135.63 | 0.1122 | 0.9684 |
| Exponential | 0.0952 (0.0212) | 134.04 | 136.04 | 136.61 | 136.23 | 0.1255 | 0.9218 |

Table 2 ML estimates, $-2\mathrm{log}L$ , AIC, AICC, BIC, KS of the distribution for the dataset
In this paper a new lifetime distribution, named the Komal distribution, has been proposed for analysing and modeling lifetime data from biomedical science and engineering. Some of its important statistical properties, the estimation of its parameter and an application to a real lifetime dataset from survival analysis have been presented. Since the distribution is a new one, it is hoped that it will attract the attention of researchers working in biomedical science, engineering and insurance for modeling lifetime data in their respective fields. Because the distribution is flexible, tractable and practicable, it is expected to be useful to researchers in biomedical sciences and engineering.
The author is thankful to the editor in chief of the journal and the anonymous reviewer of the paper for fruitful comments.
The author declares that there are no conflicts of interest.
None.
©2023 Shanker. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work noncommercially.