The odd log-logistic generalized gamma model: properties, applications, classical and bayesian approach

doi:10.15406/bbij.2017.06.00174

eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 6 Issue 4

Fábio Prataviera,¹ Gauss M Cordeiro,² Adriano K Suzuki,³ Edwin MM Ortega⁴

¹Departamento de Ciências Exatas, Universidade de São Paulo, Brazil
²Departamento de Estatística, Universidade Federal de Pernambuco, Brazil
³Departamento de Matemática Aplicada e Estatística, Universidade de São Paulo, Brazil
⁴Departamento de Ciências Exatas, Universidade de São Paulo, Brazil

Correspondence: Edwin M. M. Ortega, Departamento de Ciências Exatas, Universidade de São Paulo, Piracicaba, SP, Brazil

Received: September 23, 2017 | Published: October 27, 2017

Citation: Prataviera, F, Cordeiro GM, Suzuki AK, et al. The odd log-logistic generalized gamma model: properties, applications, classical and bayesian approach. Biom Biostat Int J. 2017;6(4):388-405. DOI: 10.15406/bbij.2017.06.00174

Abstract

We propose a new lifetime model called the odd log-logistic generalized gamma distribution that can be easily interpreted. Some of its special models are discussed. We obtain general mathematical properties of this distribution, including the ordinary moments and the quantile function. We discuss parameter estimation by the maximum likelihood method and by a Bayesian approach, where Gibbs algorithms along with Metropolis steps are used to obtain the posterior summaries of interest for survival data with right censoring. Further, for different parameter settings, sample sizes and censoring percentages, we perform various simulations and evaluate the behavior of the estimators. The potential of the new distribution is illustrated by means of two real data sets, for which it can produce better fits than some well-known distributions.

Keywords: censored data, exponentiated distribution, generalized gamma distribution, moments, survival analysis

Introduction

The statistics literature is filled with hundreds of continuous univariate distributions. Recent developments focus on new techniques for building meaningful models. More recently, several methods of introducing one or more parameters to generate new distributions have been proposed. Among these methods, the compounding of some discrete and important lifetime distributions has been in the vanguard of lifetime modeling. So, several families of distributions were investigated by compounding some useful lifetime and truncated discrete distributions. The log-logistic (LL) distribution with a shape parameter $λ > 0$ is a useful model for survival analysis and it is an alternative to the log-normal distribution. Unlike the more commonly used Weibull distribution, the LL distribution can have a non-monotonic hazard rate function (hrf), which makes it suitable for modeling cancer survival data. For $λ > 1$, the hrf is unimodal and, when $λ \leq 1$, the hazard decreases monotonically. The fact that its cumulative distribution function (cdf) has a closed form is particularly useful for the analysis of survival data with censoring.

The odd log-logistic (OLL) family of distributions was pioneered by Gleaton and Lynch;¹ they called this family the generalized log-logistic (GLL) family. Recently, Braga et al.² studied the odd log-logistic normal distribution, da Cruz et al.³ proposed the odd log-logistic Weibull distribution and Cordeiro et al.² proposed the beta odd log-logistic generalized family. We develop a similar methodology to propose a new model based on the generalized gamma (GG) distribution. The GG distribution plays a very important role in statistical inferential problems. When modeling monotone hazard rates, the Weibull distribution may be an initial choice because of its negatively and positively skewed density shapes. However, the Weibull distribution does not provide a reasonable parametric fit for modeling phenomena with bathtub-shaped and unimodal failure rates, which are common in biological and reliability studies. Alternatively, other extensions of the GG distribution were developed for modeling lifetime data. For example, Cordeiro et al.⁴ defined the exponentiated generalized gamma distribution with applications, Pascoa et al.⁵ introduced the Kumaraswamy generalized gamma distribution, Ortega et al.⁶ proposed the generalized gamma geometric distribution, Cordeiro et al.⁷ studied the beta generalized gamma distribution and, more recently, Lucena et al.⁸ defined the transmuted generalized gamma distribution and Silva et al.⁹ proposed the generalized gamma power series class.

Given a continuous baseline cdf $G (t; ξ)$ with a parameter vector $ξ$ , the cdf of the odd log-logistic-G (“OLL-G” for short) distribution with an extra shape parameter $λ > 0$ is defined by

$F (t) = \int_{0}^{\frac{G (t; ξ)}{\bar{G} (t; ξ)}} \frac{λ x^{λ - 1}}{{(1 + x^{λ})}^{2}} d x = \frac{G {(t; ξ)}^{λ}}{G {(t; ξ)}^{λ} + \bar{G} {(t; ξ)}^{λ}} .$ (1)

We can write

$λ = \frac{\log [\frac{F (t)}{\bar{F} (t)}]}{\log [\frac{G (t)}{\bar{G} (t)}]}$ and $\bar{G} (t; ξ) = 1 - G (t; ξ)$

So, the parameter $λ$ represents the quotient of the log odds ratio for the generated and baseline distributions. We note that there is no complicated function in equation (1), in contrast with the beta generalized family (Eugene et al.¹⁰), which includes two extra parameters and also involves the incomplete beta function. The baseline cdf $G (t; ξ)$ is clearly a special case of (1) when $λ = 1$. If $G (t; ξ) = t /(1+ t)$, it becomes the LL distribution. Several distributions can be generated from equation (1). For example, the odd log-logistic Fréchet and odd log-logistic gamma distributions are obtained by taking $G (t; ξ)$ to be the Fréchet and gamma cumulative distributions, respectively. The probability density function (pdf) of the new family is given by

$f (t) = \frac{λ g (t; ξ) \{G (t; ξ) [1 - G (t; ξ)]\}^{λ - 1}}{\{G {(t; ξ)}^{λ} + {[1 - G (t; ξ)]}^{λ}\}^{2}} .$ (2)

The OLL-G family of densities (2) allows for greater flexibility of its tails and can be widely applied in many areas of engineering and biology. We can study some of its mathematical properties because it extends several well-known distributions.
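The relationship between the OLL-G cdf, its density and the log-odds interpretation of $λ$ can be checked numerically. The following is a minimal sketch, not the authors' code: the unit exponential baseline, the function names and the values of `lam` and `t` are assumptions chosen only for illustration.

```python
import math

def oll_cdf(G, lam):
    """OLL-G cdf for a baseline cdf value G in (0, 1): G^lam / (G^lam + (1-G)^lam)."""
    return G ** lam / (G ** lam + (1.0 - G) ** lam)

def oll_pdf(G, g, lam):
    """OLL-G pdf for baseline cdf value G and baseline pdf value g."""
    num = lam * g * (G * (1.0 - G)) ** (lam - 1.0)
    den = (G ** lam + (1.0 - G) ** lam) ** 2
    return num / den

# Hypothetical baseline used only for illustration: unit exponential.
def base_cdf(t):
    return 1.0 - math.exp(-t)

def base_pdf(t):
    return math.exp(-t)

lam, t = 0.4, 1.3          # arbitrary illustrative values
G = base_cdf(t)
F = oll_cdf(G, lam)

# lambda recovered as the ratio of the log odds of F to the log odds of G
lam_back = math.log(F / (1.0 - F)) / math.log(G / (1.0 - G))

# A central finite difference of the cdf should agree with the pdf formula
eps = 1e-6
fd = (oll_cdf(base_cdf(t + eps), lam) - oll_cdf(base_cdf(t - eps), lam)) / (2 * eps)
pdf_val = oll_pdf(G, base_pdf(t), lam)
```

Setting `lam = 1.0` in `oll_cdf` returns the baseline value `G` itself, which is the special case noted in the text.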

The inferential part of this model is carried out using the asymptotic distribution of the maximum likelihood estimators (MLEs), which in situations when the sample size is small or moderate, might lead to poor inference on the model parameters. Hence, in this paper, we also explore the Markov Chain Monte Carlo (MCMC) techniques to develop a Bayesian inference as an alternative analysis for the model. So, we discuss the inference aspects of the OLL-G model following both a classical and a Bayesian approach.

The rest of the paper is organized as follows. In Section 2, we define the odd log-logistic generalized gamma (OLLGG) distribution and present some special cases. Section 3 provides a useful linear representation for the OLLGG density function. We derive in Section 4 some structural properties of the new distribution. Considering censored data, we adopt a classical analysis for the parameters of the model in Section 5. In Section 6, the Bayesian approach is considered using MCMC with Metropolis-Hastings algorithm steps to obtain the posterior summaries of interest. In Section 7, we present and discuss results from various simulation studies, which are also displayed graphically. Two applications to real data are performed in Section 8. Some concluding remarks are given in Section 9.

The OLLGG distribution

The gamma distribution is the most popular model for analyzing skewed data. The generalized gamma distribution (GG) was introduced by Stacy¹¹ and includes as special models: the exponential, Weibull, gamma and Rayleigh distributions, among others. It is suitable for modeling data with different forms of the hazard rate function (hrf): increasing, decreasing, bathtub and unimodal. This characteristic is useful for estimating individual hrfs and both relative hazards and relative times. The GG distribution has been used in several research areas such as engineering, hydrology and survival analysis.

The cdf and pdf of the $GG(α, τ, k)$ distribution (Stacy¹¹) are given by

$G(t; α, τ, k) = γ_{1} (k, {(\frac{t}{α})}^{τ}) = \frac{γ (k, {(t / α)}^{τ})}{Γ (k)}, t > 0$ (3)

$g (t; α, τ, k) = \frac{τ}{α Γ (k)} {(\frac{t}{α})}^{τ k - 1} \exp [- {(\frac{t}{α})}^{τ}]$ (4)

where $α > 0$, $τ > 0$, $k > 0$, $γ (k, x) = \int_{0}^{x} w^{k - 1} e^{- w} d w$ is the incomplete gamma function and $Γ (\cdot)$ is the gamma function. Basic properties of the GG distribution are given by Stacy and Mihram¹² and Lawless.¹³ The OLLGG distribution (for $t > 0$) is defined by substituting $G (t; α, τ, k)$ and $g (t; α, τ, k)$ in equations (1) and (2), respectively. Hence, its density function with four positive parameters, $ξ = {(α, τ, k)}^{T}$ and $λ$, has the form

$f (t) = \frac{λ τ {(t / α)}^{τ k - 1} \exp [- {(t / α)}^{τ}] \{γ_{1} (k, {(t / α)}^{τ}) [1 - γ_{1} (k, {(t / α)}^{τ})]\}^{λ - 1}}{α Γ (k) \{γ_{1}^{λ} (k, {(t / α)}^{τ}) + {[1 - γ_{1} (k, {(t / α)}^{τ})]}^{λ}\}^{2}}, t > 0,$ (5)

where $α$ is a scale parameter and the other positive parameters $τ$, $k$ and $λ$ are shape parameters. One major benefit of (5) is its ability to fit skewed data that cannot be properly fitted by existing distributions. The OLLGG density allows for greater flexibility of its tails and can be widely applied in many areas of engineering and biology.

The Weibull and GG distributions are the most important sub-models of (5), obtained for $λ = k = 1$ and $λ = 1$, respectively. The OLLGG distribution approaches the log-normal (LN) distribution when $λ = 1$ and $k \to \infty$. Other sub-models are listed in Table 1: OLL-Gamma, OLL-Chi-Square, OLL-Exponential, OLL-Weibull, OLL-Rayleigh, OLL-Maxwell, OLL-Folded normal, among others.

Distribution	$α$	$τ$	$k$	$λ$
OLL-Gamma	$α$	1	$k$	$λ$
OLL-Weibull	$α$	$τ$	1	$λ$
OLL-Exponential	$α$	1	1	$λ$
OLL-Chi-square	2	1	$\frac{n}{2}$	$λ$
OLL-Chi	$\sqrt{2}$	2	$\frac{n}{2}$	$λ$
OLL-Rayleigh	$α$	2	1	$λ$
OLL-Maxwell	$α$	2	$\frac{3}{2}$	$λ$
OLL-Folded normal	$\sqrt{2}$	2	$\frac{1}{2}$	$λ$
OLL-Circular normal	$\sqrt{2}$	2	1	$λ$
OLL-Spherical Normal	$\sqrt{2}$	2	$\frac{3}{2}$	$λ$

Table 1 Some new OLL-G sub-models

If $T$ is a random variable with density function (5), we write $T \sim \mathrm{OLLGG} (α, τ, k, λ)$. The survival and hazard rate functions corresponding to (5) are

$S (t) = 1 - F (t) = \frac{{[1 - γ_{1} (k, {(t / α)}^{τ})]}^{λ}}{γ_{1}^{λ} (k, {(t / α)}^{τ}) + {[1 - γ_{1} (k, {(t / α)}^{τ})]}^{λ}}$ (6)

$h (t) = \frac{λ τ {(t / α)}^{τ k - 1} \exp [- {(t / α)}^{τ}] γ_{1}^{λ - 1} (k, {(t / α)}^{τ}) \{γ_{1}^{λ} (k, {(t / α)}^{τ}) + {[1 - γ_{1} (k, {(t / α)}^{τ})]}^{λ}\}}{α Γ (k) \{γ_{1}^{λ} (k, {(t / α)}^{τ}) + {[1 - γ_{1} (k, {(t / α)}^{τ})]}^{λ}\}^{2} [1 - γ_{1} (k, {(t / α)}^{τ})]},$ (7)

respectively. Plots of the OLLGG density function for selected parameter values are given in Figure 1. We note that the OLLGG density function can be symmetrical, left-skewed, right-skewed, unimodal and bimodal shaped.

The hrf (7) is quite flexible for modeling survival data as indicated by the plots for selected parameter values in Figure 2. The hrf can be increasing, decreasing, unimodal, bathtub and have other forms.

Figure 1 Plots of the OLLGG density function for some parameter values. (a) Fixed $λ = 1$. (b) Fixed $α = 2$, $τ = 3$ and $k = 10$. (c) Fixed $α = 2$, $τ = 5$ and $λ = 0.15$.

Figure 2 The OLLGG hrf. (a) Bathtub. (b) Unimodal. (c) Increasing, decreasing and other forms.

The OLLGG model is easily simulated by inverting (1) as follows:

$t = Q_{G G} (\frac{u^{1 / λ}}{{(1 - u)}^{1 / λ} + u^{1 / λ}}; α, τ, k),$ (8)

where $u$ has a uniform $U (0, 1)$ distribution and $Q_{G G} (\cdot) = G^{- 1} (\cdot)$ is the baseline quantile function (qf).
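The inversion method can be sketched in a few lines. This is only an illustration under stated assumptions: the regularized incomplete gamma function is evaluated by a standard positive-term series, the GG quantile is found by bisection, and all function names, the seed and the clamping constant `1e-12` are hypothetical choices rather than anything specified in the paper.

```python
import math
import random

def reg_lower_gamma(k, x, tol=1e-14, max_terms=100000):
    """Regularized lower incomplete gamma gamma_1(k, x), positive-term series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    n = 0
    while term > tol * total and n < max_terms:
        n += 1
        term *= x / (k + n)
        total += term
    log_pref = -x + k * math.log(x) - math.lgamma(k)
    return min(1.0, total * math.exp(log_pref))

def gg_quantile(p, alpha, tau, k, iters=80):
    """Solve gamma_1(k, x) = p for x by bisection, then return t = alpha * x^(1/tau)."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(k, hi) < p:      # bracket the root
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(k, mid) < p:
            lo = mid
        else:
            hi = mid
    return alpha * (0.5 * (lo + hi)) ** (1.0 / tau)

def ollgg_quantile(u, alpha, tau, k, lam):
    """Map a uniform value u to an OLLGG value through the baseline GG quantile."""
    a = u ** (1.0 / lam)
    b = (1.0 - u) ** (1.0 / lam)
    p = a / (a + b)
    p = min(max(p, 1e-12), 1.0 - 1e-12)    # assumption: keep bisection away from 0 and 1
    return gg_quantile(p, alpha, tau, k)

def ollgg_cdf(t, alpha, tau, k, lam):
    G = reg_lower_gamma(k, (t / alpha) ** tau)
    return G ** lam / (G ** lam + (1.0 - G) ** lam)

def ollgg_rvs(n, alpha, tau, k, lam, rng):
    return [ollgg_quantile(rng.random(), alpha, tau, k, lam) for _ in range(n)]

rng = random.Random(2017)                  # arbitrary seed
sample = ollgg_rvs(200, 2.0, 5.0, 10.0, 0.15, rng)
```

A quick sanity check is the round trip `ollgg_cdf(ollgg_quantile(u, ...), ...)`, which should return `u` up to numerical error.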

Some properties of the OLLGG distribution are:

If $T \sim \mathrm{OLLGG} (α, τ, k, λ)$, then $b T \sim \mathrm{OLLGG} (b α, τ, k, λ)$ for all $b > 0$.

If $T \sim \mathrm{OLLGG} (α, τ, k, λ)$, then $T^{m} \sim \mathrm{OLLGG} (α^{m}, τ / m, k, λ)$ for all $m \neq 0$.

So, the new distribution is closed under power transformation.

Linear representation for the OLLGG distribution

First, we define the exponentiated generalized gamma (“Exp-GG”) distribution, say $W \sim \mathrm{Exp}^{c} (G G)$ with power parameter $c > 0$, if $W$ has cdf and pdf given by

$H_{c} (t) = G {(t; α, τ, k)}^{c}$ and $h_{c} (t) = \frac{c τ}{α Γ (k)} {(\frac{t}{α})}^{τ k - 1} \exp [- {(\frac{t}{α})}^{τ}] G {(t; α, τ, k)}^{c - 1},$

respectively. In a general context, the properties of the exponentiated-G (Exp-G) distributions have been studied by several authors for some baseline G models; see Mudholkar and Srivastava¹⁴ and Mudholkar et al.¹⁵ for exponentiated Weibull, Nadarajah¹⁶ for exponentiated Gumbel, Shirke and Kakade¹⁷ for exponentiated log-normal and Nadarajah and Gupta¹⁸ for exponentiated gamma distributions. See also Nadarajah and Kotz,¹⁹ among others.

First, we obtain an expansion for $F (t; α, τ, k, λ)$ using a power series for $G {(t; α, τ, k)}^{λ}$ ( $λ > 0$ real)

$G {(t; α, τ, k)}^{λ} = \sum_{j = 0}^{\infty} a_{j} G {(t; α, τ, k)}^{j},$ (9)

where

$a_{j} = a_{j} (λ) = \sum_{υ = j}^{\infty} {(- 1)}^{j + υ} \binom{λ}{υ} \binom{υ}{j} .$

For any real $λ > 0$, we consider the generalized binomial expansion

${[1 - G (t; α, τ, k)]}^{λ} = \sum_{j = 0}^{\infty} {(- 1)}^{j} \binom{λ}{j} G {(t; α, τ, k)}^{j} .$ (10)

Inserting (9) and (10) in equation (1), we obtain

$F (t; α, τ, k, λ) = \frac{\sum_{j = 0}^{\infty} a_{j} G {(t; α, τ, k)}^{j}}{\sum_{j = 0}^{\infty} b_{j} G {(t; α, τ, k)}^{j}},$

where $b_{j} = a_{j} + {(- 1)}^{j} (\begin{matrix} λ \\ j \end{matrix})$ for $j \geq 0.$

The ratio of the two power series can be expressed as

$F (t; α, τ, k, λ) = \sum_{j = 0}^{\infty} c_{j} G {(t; α, τ, k)}^{j},$ (11)

where $c_{0} = a_{0} / b_{0}$ and the coefficients $c_{j}$ ’s (for $j \geq 1$ ) are determined from the recurrence equation

$c_{j} = b_{0}^{- 1} (a_{j} - \sum_{r = 1}^{j} b_{r} c_{j - r}) .$
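The recurrence for the coefficients $c_{j}$ is the usual rule for dividing one power series by another, and it is easy to program. The sketch below is illustrative only; the function name is hypothetical, and the worked case $λ = 2$ is chosen because the coefficients $a_{j}$ and $b_{j}$ are then finite: $G^{2}$ gives $a = (0, 0, 1, 0, \ldots)$ and $G^{2} + (1 - G)^{2} = 1 - 2 G + 2 G^{2}$ gives $b = (1, -2, 2, 0, \ldots)$.

```python
def series_ratio(a, b):
    """Coefficients c_j of (sum_j a_j x^j) / (sum_j b_j x^j), truncated to len(a) terms.

    Uses c_0 = a_0 / b_0 and c_j = (a_j - sum_{r=1}^{j} b_r c_{j-r}) / b_0.
    """
    n = len(a)
    c = [0.0] * n
    c[0] = a[0] / b[0]
    for j in range(1, n):
        s = sum(b[r] * c[j - r] for r in range(1, j + 1) if r < len(b))
        c[j] = (a[j] - s) / b[0]
    return c

# Illustrative special case lambda = 2 (finite coefficient vectors)
n_terms = 40
a = [0.0, 0.0, 1.0] + [0.0] * (n_terms - 3)
b = [1.0, -2.0, 2.0] + [0.0] * (n_terms - 3)
c = series_ratio(a, b)

# Compare the truncated series with the closed form at a small baseline value G
G = 0.1
F_exact = G ** 2 / (G ** 2 + (1.0 - G) ** 2)
F_series = sum(c[j] * G ** j for j in range(n_terms))
```

The check is made at a small value of `G` on purpose; the sketch does not investigate how far toward $G = 1$ the truncated series remains accurate.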

The pdf of $T$ is obtained by differentiating (11) as

$f (t; α, τ, k, λ) = \sum_{j = 0}^{\infty} c_{j + 1} h_{j + 1} (t),$ (12)

where

$h_{j + 1} (t) = \frac{(j + 1) τ}{α Γ (k)} {(\frac{t}{α})}^{τ k - 1} \exp [- {(\frac{t}{α})}^{τ}] G {(t; α, τ, k)}^{j}$

is the Exp-GG density function with power parameter $j + 1$ .

For $j \geq 0$ , we can write

$h_{j + 1} (t) = \frac{(j + 1) τ}{α Γ (k)} {(\frac{t}{α})}^{τ k - 1} \exp [- {(\frac{t}{α})}^{τ}] γ_{1} {(k, {(t / α)}^{τ})}^{j},$ (13)

where $γ_{1} (k, {(t / α)}^{τ}) = γ (k, {(t / α)}^{τ}) / Γ (k) .$

By application of an equation in Section 0.314 of Gradshteyn and Ryzhik²⁰ for a power series raised to a power, we obtain for any positive integer $j$

${(\sum_{i = 0}^{\infty} a_{i} x^{i})}^{j} = \sum_{i = 0}^{\infty} d_{j, i} x^{i},$ (14)

where the coefficients $d_{j, i}$ (for $i = 1, 2, \ldots$) satisfy the recurrence relation

$d_{j, i} = {(i a_{0})}^{- 1} \sum_{p = 1}^{i} [p (j + 1) - i] a_{p} d_{j, i - p}$ (15)

and $d_{j, 0} = a_{0}^{j}$. The coefficient $d_{j, i}$ can be expressed explicitly from $d_{j, 0}, \ldots, d_{j, i - 1}$ and then from $a_{0}, \ldots, a_{i}$, although this is not necessary for programming our expansions numerically in any software with numerical facilities.
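The recurrence for a power series raised to an integer power can be programmed directly. The following sketch assumes the Gradshteyn and Ryzhik 0.314 form with weights $p (j + 1) - i$; the function name and the test values $k = 2$, $j = 2$, $x = 0.5$ are hypothetical. It checks the recurrence on $(1 + x)^{3}$ and on the series of the incomplete gamma function with $a_{p} = (-1)^{p} / [(k + p) p!]$, for which $γ_{1} (2, x) = 1 - e^{- x} (1 + x)$ is available in closed form.

```python
import math

def series_power(a, j, n_terms):
    """Coefficients d_{j,i} of (sum_i a_i x^i)^j for a positive integer j.

    d_{j,0} = a_0^j and
    d_{j,i} = (i a_0)^(-1) * sum_{p=1}^{i} [p (j + 1) - i] a_p d_{j,i-p}.
    """
    d = [0.0] * n_terms
    d[0] = a[0] ** j
    for i in range(1, n_terms):
        s = 0.0
        for p in range(1, i + 1):
            if p < len(a):
                s += (p * (j + 1) - i) * a[p] * d[i - p]
        d[i] = s / (i * a[0])
    return d

# Check 1: (1 + x)^3 has coefficients 1, 3, 3, 1, 0
cube = series_power([1.0, 1.0], 3, 5)

# Check 2: power of the incomplete gamma series, with illustrative values
k, j, x = 2.0, 2, 0.5
n_terms = 30
a = [(-1) ** p / ((k + p) * math.factorial(p)) for p in range(n_terms)]
d = series_power(a, j, n_terms)
lhs = (1.0 - math.exp(-x) * (1.0 + x)) ** j          # gamma_1(2, x)^j in closed form
rhs = x ** (j * k) / math.gamma(k) ** j * sum(d[i] * x ** i for i in range(n_terms))
```

Here `x` plays the role of $(t / α)^{τ}$ in the expansions of this section.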

Further, using equation (14), we can write (for $j \geq 1$ )

$γ_{1} {(k, {(t / α)}^{τ})}^{j} = \frac{{(t / α)}^{j k τ}}{Γ {(k)}^{j}} \sum_{i = 0}^{\infty} d_{j, i} {(t / α)}^{i τ},$ (16)

where the coefficients $d_{j, i}$ (for $i \geq 1$) are determined from (15) with $a_{p} = {(- 1)}^{p} / [(k + p) p!]$. Based upon equation (16), we can write the Exp-GG density (for $j \geq 1$) from (13) as

$h_{j + 1} (t) = \frac{(j + 1) τ}{α Γ {(k)}^{j + 1}} \exp [- {(\frac{t}{α})}^{τ}] \sum_{i = 0}^{\infty} d_{j, i} {(\frac{t}{α})}^{[i + (j + 1) k] τ - 1} .$

The last density can be expressed in terms of the GG density functions. By noting the form of (4), we can write (for $j \geq 1$ )

$h_{j + 1} (t) = \sum_{i = 0}^{\infty} e_{j, i} g (t; α, τ, [i + (j + 1) k]),$ (17)

where $g (t; α, τ, [i + (j + 1) k])$ is the GG density function with parameters $α, τ$ and $i + (j + 1) k$ and

$e_{j, i} = \frac{(j + 1) Γ ([i + (j + 1) k])}{Γ {(k)}^{j + 1}} d_{j, i} .$ (18)

For $j = 0$, we have from (13) $h_{1} (t) = \frac{τ}{α Γ (k)} {(t / α)}^{τ k - 1} \exp [- {(\frac{t}{α})}^{τ}] = g (t; α, τ, k)$. Combining the result (17) (for $j \geq 1$) and that for $j = 0$, we can write $f (t) = f (t; α, τ, k, λ)$ in (12) as

$f (t) = c_{1} g (t; α, τ, k) + \sum_{j = 1}^{\infty} \sum_{i = 0}^{\infty} e_{j, i} c_{j + 1} g (t; α, τ, [i + (j + 1) k]) .$ (19)

Equation (19) reveals that the OLLGG density function is a linear combination of GG densities. Hence, some mathematical properties of the OLLGG distribution can follow directly from those of the GG distribution. For example, the ordinary, central and factorial moments and the moment generating function (mgf) of the proposed distribution can be obtained from the same weighted infinite linear combination of the corresponding quantities for the GG distribution. This equation is the main result of this section.

Mathematical properties

Some of the most important features and characteristics of a distribution can be studied through moments (e.g., tendency, dispersion, skewness and kurtosis). In this section, we give two different expansions for calculating the moments of the OLLGG distribution.

First, we obtain an infinite sum representation for the $r$th ordinary moment $μ_{r}^{'}$ of the OLLGG distribution based on equation (19). The $r$th moment of the $G G (α, τ, k)$ distribution is well known to be

$μ_{r, G G}^{'} = \frac{α^{r} Γ (k + r / τ)}{Γ (k)}$

Equation (19) then immediately gives

$μ_{r}^{'} = \frac{c_{1} α^{r} Γ (k + r / τ)}{Γ (k)} + α^{r} \sum_{j = 1}^{\infty} \sum_{i = 0}^{\infty} e_{j, i} c_{j + 1} \frac{Γ ([i + (j + 1) k] + r / τ)}{Γ (i + (j + 1) k)} .$ (20)

Equation (20) reveals that the moment $μ_{r}^{'}$ has the inconvenience of depending on the quantities $e_{j, i}$ given by (18).

We now derive another infinite sum representation for $μ_{r}^{'}$ by computing the $r$th moment directly without requiring the quantities $e_{j, i}$ (in the expressions below, $β$ denotes the shape parameter $τ$). We readily obtain

$μ_{r}^{'} = \frac{λ β α^{r - 1}}{Γ (k)} \int_{0}^{\infty} {(\frac{t}{α})}^{β k + r - 1} \exp \{- {(\frac{t}{α})}^{β}\} \{γ_{1} [k, {(\frac{t}{α})}^{β}]\}^{λ - 1} d t$

and then $x = {(t / α)}^{β}$ gives

$μ_{r}^{'} = \frac{λ α^{r}}{Γ {(k)}^{λ}} \int_{0}^{\infty} x^{k + r / β - 1} e^{- x} γ {(k, x)}^{λ - 1} d x .$

Using expansion (16) for $γ (k, x)$ leads to

$γ {(k, x)}^{λ - 1} = \sum_{j = 0}^{\infty} \sum_{m = 0}^{j} {(- 1)}^{j + m} \binom{λ - 1}{j} \binom{j}{m} γ {(k, x)}^{m} .$

Inserting the last equation in the expression for $μ_{r}^{'}$ and interchanging terms, we obtain

$μ_{r}^{'} = \frac{λ α^{r}}{Γ {(k)}^{λ}} \sum_{j = 0}^{\infty} \sum_{m = 0}^{j} {(- 1)}^{j + m} \binom{λ - 1}{j} \binom{j}{m} I (k, r / β, m),$ (21)

where

$I (k, r / β, m) = \int_{0}^{\infty} x^{k + r / β - 1} e^{- x} γ {(k, x)}^{m} d x .$

For calculating the last integral, the series expansion (16) for the incomplete gamma function gives

$I (k, r / β, m) = \int_{0}^{\infty} x^{k + r / β - 1} e^{- x} {[x^{k} \sum_{p = 0}^{\infty} \frac{{(- x)}^{p}}{(k + p) p!}]}^{m} d x .$

Now this integral can be obtained from equations (24) and (25) of Nadarajah²¹ in terms of the Lauricella function of type A (Exton,²² Aarts,²³) defined by

$F_{A}^{(n)} (a; b_{1}, \ldots, b_{n}; c_{1}, \ldots, c_{n}; x_{1}, \ldots, x_{n}) = \sum_{m_{1} = 0}^{\infty} \cdots \sum_{m_{n} = 0}^{\infty} \frac{{(a)}_{m_{1} + \cdots + m_{n}} {(b_{1})}_{m_{1}} \cdots {(b_{n})}_{m_{n}}}{{(c_{1})}_{m_{1}} \cdots {(c_{n})}_{m_{n}}} \frac{x_{1}^{m_{1}} \cdots x_{n}^{m_{n}}}{m_{1}! \cdots m_{n}!},$

where ${(a)}_{i}$ is the ascending factorial defined by (with the convention that ${(a)}_{0} = 1$ )

${(a)}_{i} = a (a + 1) ... (a + i - 1) .$

Numerical routines for the direct computation of the Lauricella function of type A are available, see Exton²² and Mathematica (Trott,²⁴). We obtain

$I (k, r / β, m) = k^{- m} Γ (r / β + k (m + 1)) F_{A}^{(m)} (r / β + k (m + 1); k, \ldots, k; k + 1, \ldots, k + 1; - 1, \ldots, - 1) .$ (22)

Hence, as an alternative to equation (20), the $r$th moment of the OLLGG distribution follows from formulae (21) and (22) as an infinite weighted sum of Lauricella functions of type A. In Figures 3 and 4, we display plots of the skewness and kurtosis of the OLLGG distribution for some parameter values.

Maximum likelihood estimation

Let $T_{i}$ be a random variable following (5) with the vector of parameters $θ = {(α, τ, k, λ)}^{T}$. The data encountered in survival analysis and reliability studies are often censored. A very simple random censoring mechanism that is often realistic is one in which each individual $i$ is assumed to have a lifetime $T_{i}$ and a censoring time $C_{i}$, where $T_{i}$ and $C_{i}$ are independent random variables. Suppose that the data consist of $n$ independent observations $t_{i} = \min (T_{i}, C_{i})$ for $i = 1, \ldots, n$.

Figure 3 Skewness and kurtosis of the OLLGG distribution as a function of $λ$ for some values of $k$ with $α = 2$ and $τ = 1$ .

Figure 4 Skewness and kurtosis of the OLLGG distribution as a function of $τ$ for some values of $λ$ with $α = 2$ and $k = 1$ .

The distribution of $C_{i}$ does not depend on any of the unknown parameters of $T_{i}$. Parametric inference for such data is usually based on likelihood methods and their asymptotic theory. The censored log-likelihood $l (θ)$ for the model parameters is given by

$\begin{array}{l} l (θ) = r \log [\frac{λ τ}{α Γ (k)}] - \sum_{i \in F} {(\frac{t_{i}}{α})}^{τ} + (τ k - 1) \sum_{i \in F} \log (\frac{t_{i}}{α}) + (λ - 1) \sum_{i \in F} \log (u_{i}) + \\ (λ - 1) \sum_{i \in F} \log (1 - u_{i}) - 2 \sum_{i \in F} \log [u_{i}^{λ} + {(1 - u_{i})}^{λ}] + λ \sum_{i \in C} \log (1 - u_{i}) - \sum_{i \in C} \log [u_{i}^{λ} + {(1 - u_{i})}^{λ}], \end{array}$ (23)

where $u_{i} = γ_{1} (k, {(\frac{t_{i}}{α})}^{τ})$, $r$ is the number of failures and $F$ and $C$ denote the uncensored and censored sets of observations, respectively.
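The censored log-likelihood adds the log-density for each failure and the log-survival for each censored observation. The sketch below is one possible implementation, not the authors' code; the function names, the series used for $γ_{1}$ and the small data set are hypothetical. As a consistency check, with $λ = 1$ and $k = 1$ the value should coincide with the censored Weibull log-likelihood.

```python
import math

def reg_lower_gamma(k, x, tol=1e-14, max_terms=100000):
    """Regularized lower incomplete gamma gamma_1(k, x), positive-term series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    n = 0
    while term > tol * total and n < max_terms:
        n += 1
        term *= x / (k + n)
        total += term
    return min(1.0, total * math.exp(-x + k * math.log(x) - math.lgamma(k)))

def ollgg_loglik(theta, times, events):
    """Censored OLLGG log-likelihood; events[i] is 1 for a failure, 0 if censored."""
    alpha, tau, k, lam = theta
    ll = 0.0
    for t, d in zip(times, events):
        z = (t / alpha) ** tau
        u = reg_lower_gamma(k, z)
        log_den = math.log(u ** lam + (1.0 - u) ** lam)
        if d:
            # log of the GG density g(t; alpha, tau, k)
            log_g = (math.log(tau / alpha) - math.lgamma(k)
                     + (tau * k - 1.0) * math.log(t / alpha) - z)
            ll += (math.log(lam) + log_g
                   + (lam - 1.0) * (math.log(u) + math.log(1.0 - u))
                   - 2.0 * log_den)
        else:
            # log of the survival function S(t)
            ll += lam * math.log(1.0 - u) - log_den
    return ll

# Hypothetical toy data, for illustration only
times = [1.2, 0.8, 2.0, 1.5]
events = [1, 0, 1, 1]
value = ollgg_loglik((1.5, 2.0, 1.0, 1.0), times, events)
```

In practice such a function would be passed (with a sign change) to a numerical optimizer; very small or very large values of $u_{i}$ may require additional safeguards that are omitted here.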

The score components corresponding to the parameters in $θ$ are:

$\begin{array}{l} U_{α} (θ) = - \frac{r τ k}{α} + \frac{τ}{α} \sum_{i \in F} {(\frac{t_{i}}{α})}^{τ} + (λ - 1) \sum_{i \in F} \frac{{[{\dot{u}}_{i}]}_{α}}{u_{i}} - (λ - 1) \sum_{i \in F} \frac{{[{\dot{u}}_{i}]}_{α}}{1 - u_{i}} \\ - 2 λ \sum_{i \in F} \frac{{[{\dot{u}}_{i}]}_{α} [u_{i}^{λ - 1} - {(1 - u_{i})}^{λ - 1}]}{u_{i}^{λ} + {(1 - u_{i})}^{λ}} - λ \sum_{i \in C} \frac{{[{\dot{u}}_{i}]}_{α}}{1 - u_{i}} - λ \sum_{i \in C} \frac{{[{\dot{u}}_{i}]}_{α} [u_{i}^{λ - 1} - {(1 - u_{i})}^{λ - 1}]}{u_{i}^{λ} + {(1 - u_{i})}^{λ}}, \end{array}$

$\begin{array}{l} U_{τ} (θ) = \frac{r}{τ} - \sum_{i \in F} {(\frac{t_{i}}{α})}^{τ} \log (\frac{t_{i}}{α}) + k \sum_{i \in F} \log (\frac{t_{i}}{α}) + (λ - 1) \sum_{i \in F} \frac{{[{\dot{u}}_{i}]}_{τ}}{u_{i}} - (λ - 1) \sum_{i \in F} \frac{{[{\dot{u}}_{i}]}_{τ}}{1 - u_{i}} \\ - 2 λ \sum_{i \in F} \frac{{[{\dot{u}}_{i}]}_{τ} [u_{i}^{λ - 1} - {(1 - u_{i})}^{λ - 1}]}{u_{i}^{λ} + {(1 - u_{i})}^{λ}} - λ \sum_{i \in C} \frac{{[{\dot{u}}_{i}]}_{τ}}{1 - u_{i}} - λ \sum_{i \in C} \frac{{[{\dot{u}}_{i}]}_{τ} [u_{i}^{λ - 1} - {(1 - u_{i})}^{λ - 1}]}{u_{i}^{λ} + {(1 - u_{i})}^{λ}}, \end{array}$

$\begin{array}{l} U_{k} (θ) = - r ψ (k) + τ \sum_{i \in F} \log (\frac{t_{i}}{α}) + (λ - 1) \sum_{i \in F} \frac{{[{\dot{u}}_{i}]}_{k}}{u_{i}} - (λ - 1) \sum_{i \in F} \frac{{[{\dot{u}}_{i}]}_{k}}{1 - u_{i}} \\ - 2 λ \sum_{i \in F} \frac{{[{\dot{u}}_{i}]}_{k} [u_{i}^{λ - 1} - {(1 - u_{i})}^{λ - 1}]}{u_{i}^{λ} + {(1 - u_{i})}^{λ}} - λ \sum_{i \in C} \frac{{[{\dot{u}}_{i}]}_{k}}{1 - u_{i}} - λ \sum_{i \in C} \frac{{[{\dot{u}}_{i}]}_{k} [u_{i}^{λ - 1} - {(1 - u_{i})}^{λ - 1}]}{u_{i}^{λ} + {(1 - u_{i})}^{λ}}, \end{array}$

and

$\begin{array}{l} U_{λ} (θ) = \frac{r}{λ} + \sum_{i \in F} \log (u_{i}) + \sum_{i \in F} \log (1 - u_{i}) - 2 \sum_{i \in F} \frac{u_{i}^{λ} \log (u_{i}) + {(1 - u_{i})}^{λ} \log (1 - u_{i})}{u_{i}^{λ} + {(1 - u_{i})}^{λ}} \\ + \sum_{i \in C} \log (1 - u_{i}) - \sum_{i \in C} \frac{u_{i}^{λ} \log (u_{i}) + {(1 - u_{i})}^{λ} \log (1 - u_{i})}{u_{i}^{λ} + {(1 - u_{i})}^{λ}}, \end{array}$

where

${[{\dot{u}}_{i}]}_{α} = \frac{\partial γ_{1} (k, {(\frac{t_{i}}{α})}^{τ})}{\partial α}, \quad {[{\dot{u}}_{i}]}_{τ} = \frac{\partial γ_{1} (k, {(\frac{t_{i}}{α})}^{τ})}{\partial τ}, \quad {[{\dot{u}}_{i}]}_{k} = \frac{\partial γ_{1} (k, {(\frac{t_{i}}{α})}^{τ})}{\partial k},$

$u_{i} = γ_{1} (k, {(\frac{t_{i}}{α})}^{τ})$, $ψ (\cdot)$ is the digamma function and $i = 1, \ldots, n$.

The MLE $\hat{θ}$ of $θ$ can be obtained numerically from the nonlinear equations $U_{α} (θ) = U_{τ} (θ) = U_{k} (θ) = U_{λ} (θ) = 0$. For interval estimation and hypothesis tests on the model parameters, we require the $4 \times 4$ observed information matrix $J = J (θ)$, whose elements are evaluated numerically. Under general regularity conditions, the asymptotic distribution of $(\hat{θ} - θ)$ is $N_{4} (0, I {(θ)}^{- 1})$, where $I (θ)$ is the expected information matrix. This matrix can be replaced by $J (\hat{θ})$, i.e., the observed information matrix evaluated at $\hat{θ}$. The multivariate normal $N_{4} (0, J {(\hat{θ})}^{- 1})$ distribution can be used to construct approximate confidence intervals for the individual parameters. Further, the likelihood ratio (LR) statistic can be adopted for comparing this distribution with some of its special models. We can compute the maximum values of the unrestricted and restricted log-likelihoods to construct LR statistics for testing some sub-models of the OLLGG distribution. For example, the test of $H_{0} : λ = 1$ versus $H : H_{0}$ is not true is equivalent to comparing the OLLGG and GG distributions, and the LR statistic reduces to

$w = 2 \{l (\hat{α}, \hat{τ}, \hat{k}, \hat{λ}) - l (\tilde{α}, \tilde{τ}, \tilde{k}, 1)\},$

where $\hat{α}, \hat{τ}, \hat{k}$ and $\hat{λ}$ are the MLEs under $H$ and $\tilde{α}, \tilde{τ}$ and $\tilde{k}$ are the estimates under $H_{0}$.
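Once the two maximized log-likelihoods are available, the LR statistic and an approximate p-value are simple to compute. The sketch below assumes the usual chi-square reference distribution with one degree of freedom for $H_{0} : λ = 1$, which the text does not state explicitly; the function name and the log-likelihood values are hypothetical.

```python
import math

def lr_test_1df(loglik_full, loglik_restricted):
    """LR statistic w = 2 (l_full - l_restricted) and chi-square(1) p-value.

    For one degree of freedom, P(chi2_1 > w) = erfc(sqrt(w / 2)).
    """
    w = 2.0 * (loglik_full - loglik_restricted)
    p_value = math.erfc(math.sqrt(max(w, 0.0) / 2.0))
    return w, p_value

# Hypothetical maximized log-likelihoods (OLLGG versus GG), for illustration only
w, p_value = lr_test_1df(-100.0, -101.9207294)
```

With these illustrative inputs the statistic sits essentially at the 5% critical value of the chi-square distribution with one degree of freedom.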

Bayesian inference

In this section, we briefly discuss inference from a Bayesian viewpoint. We reparameterize the model as $ξ = (\log (λ), \log (k), \log (τ), \log (α))$, so that the parameter space is transformed into $R^{4}$ (necessary for working with the Gaussian proposal densities). We assume that $λ$, $k$, $τ$ and $α$ are a priori independent, that is,

$π (θ) = π (λ) π (k) π (τ) π (α),$

where

$\log (λ) \sim N (0, σ^{2})$, $\log (k) \sim N (0, σ^{2})$, $\log (τ) \sim N (0, σ^{2})$ and $\log (α) \sim N (0, σ^{2})$, and $N (μ, σ^{2})$ denotes the normal distribution with mean $μ$ and variance $σ^{2}$. All the hyper-parameters $σ^{2}$ have been specified to express non-informative priors.

Taking into account the Jacobian of this transformation, our joint posterior density (or target density) reduces to

$π (ξ | D) \propto L (ξ; D) \exp \{- \frac{1}{2} [\frac{{[\log (λ)]}^{2}}{σ^{2}} + \frac{{[\log (k)]}^{2}}{σ^{2}} + \frac{{[\log (τ)]}^{2}}{σ^{2}} + \frac{{[\log (α)]}^{2}}{σ^{2}}]\},$ (25)

where $L (ξ; D)$ is the likelihood function.

This joint posterior density is analytically intractable. Therefore, we base our inference on MCMC simulation methods. No closed form is available for any of the full conditional distributions necessary for the implementation of the Gibbs sampler, so we resort to the Metropolis–Hastings algorithm. To implement this algorithm, we proceed as follows:

(1) Start with any point $ξ_{(0)}$ and stage indicator $j = 0$;

(2) Generate a point $ξ'$ according to the transitional kernel $Q (ξ', ξ_{(j)}) = N_{4} (ξ_{(j)}, \tilde{Σ})$, where $\tilde{Σ}$ is the covariance matrix of $ξ$, which is the same in any stage;

(3) Update $ξ_{(j)}$ to $ξ_{(j + 1)} = ξ'$ with probability $p_{j} = \min \{1, π (ξ' | D) / π (ξ_{(j)} | D)\}$, or keep $ξ_{(j)}$ with probability $1 - p_{j}$;

(4) Repeat steps (2) and (3) by increasing the stage indicator until the process has reached a stationary distribution.
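Steps (1) to (4) describe a random-walk Metropolis-Hastings sampler, which can be sketched as follows. This is a generic illustration, not the authors' implementation: it assumes a diagonal proposal covariance, and the target is a toy two-dimensional standard normal log-density standing in for the log-posterior of $ξ$; the function names, seed, chain length, burn-in and thinning values are arbitrary.

```python
import math
import random

def metropolis_hastings(log_post, xi0, prop_sd, n_iter, rng):
    """Random-walk Metropolis-Hastings with independent Gaussian proposals."""
    xi = list(xi0)
    lp = log_post(xi)
    chain, accepted = [], 0
    for _ in range(n_iter):
        # Step (2): propose a candidate centred at the current state
        cand = [x + rng.gauss(0.0, s) for x, s in zip(xi, prop_sd)]
        lp_cand = log_post(cand)
        # Step (3): symmetric proposal, accept with probability min(1, ratio)
        if math.log(1.0 - rng.random()) < lp_cand - lp:
            xi, lp = cand, lp_cand
            accepted += 1
        chain.append(list(xi))
    return chain, accepted / n_iter

def toy_log_post(xi):
    """Assumed stand-in target: independent standard normal components."""
    return -0.5 * sum(v * v for v in xi)

rng = random.Random(1)                       # arbitrary seed
chain, acc_rate = metropolis_hastings(toy_log_post, [3.0, -3.0], [1.0, 1.0], 20000, rng)
kept = chain[2000::10]                       # discard burn-in, keep every tenth draw
post_mean = [sum(row[i] for row in kept) / len(kept) for i in range(2)]
```

For the OLLGG model, `log_post` would be replaced by the log of the target density in $ξ$, that is, the log-likelihood plus the log-prior terms.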

In this scheme, we use a burn-in of 30,000 samples and keep every tenth sample from the 200,000 MCMC posterior samples to reduce the autocorrelations and yield better convergence results, thus obtaining an effective sample of size 20,000 on which the posterior summaries are based. We monitor the convergence of the Metropolis-Hastings algorithm using the method proposed by Geweke (1992), as well as trace plots. All computations are performed in the R software (R Development Core Team, 2011).

Bayesian model comparison

In the literature, a variety of Bayesian methodologies can be applied for comparing several competing models for a given data set and selecting the one that best fits the data. In this paper, we use the deviance information criterion (DIC) proposed by Spiegelhalter et al.,²⁵ the expected Akaike information criterion (EAIC) given by Brooks,²⁶ and the expected Bayesian (or Schwarz) information criterion (EBIC) discussed by Carlin and Louis.²⁷

They are based on the posterior mean of the deviance, which can be approximated by $\bar{d} = \sum_{q = 1}^{Q} d (θ_{q}) / Q$, where $d (θ) = - 2 \sum_{i = 1}^{n} \log [f (t_{i} | θ)]$. The DIC criterion can be estimated using the MCMC output by $\widehat{DIC} = \bar{d} + \hat{ρ}_{D} = 2 \bar{d} - \hat{d}$, where $ρ_{D}$ is the effective number of parameters given by $E \{d (θ)\} - d \{E (θ)\}$, and $d \{E (θ)\}$ is the deviance evaluated at the posterior mean, estimated by $\hat{d}$. Similarly, the EAIC and EBIC criteria can be estimated by means of $\widehat{EAIC} = \bar{d} + 2 \# (θ)$ and $\widehat{EBIC} = \bar{d} + \# (θ) \log (n)$, where $\# (θ)$ is the number of model parameters.
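Given the deviance evaluated at each retained MCMC draw and at the posterior mean, the three criteria are simple arithmetic. The sketch below uses a hypothetical function name and made-up deviance values purely to show the calculation.

```python
import math

def model_criteria(deviance_draws, deviance_at_post_mean, n_params, n_obs):
    """Estimate DIC, EAIC and EBIC from MCMC deviance values."""
    d_bar = sum(deviance_draws) / len(deviance_draws)     # posterior mean deviance
    rho_d = d_bar - deviance_at_post_mean                 # effective number of parameters
    return {
        "rho_D": rho_d,
        "DIC": d_bar + rho_d,                             # equals 2 * d_bar - d_hat
        "EAIC": d_bar + 2 * n_params,
        "EBIC": d_bar + n_params * math.log(n_obs),
    }

# Made-up values for illustration: three deviance draws, four parameters, n = 50
crit = model_criteria([210.0, 212.0, 214.0], 208.0, n_params=4, n_obs=50)
```

Smaller values of each criterion indicate a better trade-off between fit and complexity when competing models are compared on the same data.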

Simulation study

We evaluate some properties of the MLEs using the classical and Bayesian analysis by means of a simulation study. We simulate the OLLGG distribution considering modality form from equation (8) by using a random variable U having a uniform distribution in (0, 1).

We take $n = 50$, 150 and 350 and, for each replication, we calculate the MLEs $\hat{α}, \hat{τ}, \hat{k}$ and $\hat{λ}$. We repeat this process 1,000 times and determine the average estimates (AEs), biases and mean squared errors (MSEs). In this study, we consider two scenarios. In the first scenario, we take $α = 2, τ = 5, k = 10, λ = 0.15$. In the second scenario, we use the values fitted to the temperature data set in Section 8 $(α = 21.2911, τ = 13.0661, k = 2.8755, λ = 0.2882)$. The estimates of $α, τ, k$ and $λ$ are determined by solving the nonlinear equations $U_{α} (θ) = 0, U_{τ} (θ) = 0, U_{k} (θ) = 0, U_{λ} (θ) = 0$. The results of the Monte Carlo study under maximum likelihood and Bayesian estimation are given in Tables 2 and 3, respectively. They indicate that the MSEs of the MLEs of $α, τ, k$ and $λ$ decay toward zero as the sample size increases, as expected under first-order asymptotic theory. The same results are obtained using the Bayesian approach. In Figures 5 and 6, we present the estimated densities based on 1,000 samples of the AEs of the parameters $α, τ, k$ and $λ$, for $n = 50$, 150 and 350 and for both scenarios. These plots are in agreement with the first-order asymptotic theory for the MLEs and reveal a fast convergence even for small sample sizes.
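The Monte Carlo summaries can be computed from the replicated estimates of a single parameter as sketched below. The function name and the three toy estimates are hypothetical, and the sign convention for the bias (true value minus average estimate) is an assumption inferred from most entries of Table 2 rather than something stated in the text.

```python
def mc_summary(estimates, true_value):
    """Average estimate (AE), bias and mean squared error over replications."""
    n = len(estimates)
    ae = sum(estimates) / n
    bias = true_value - ae            # assumed sign convention, inferred from Table 2
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return ae, bias, mse

# Toy replicated estimates of a parameter whose true value is 2.0
ae, bias, mse = mc_summary([2.1, 1.9, 2.3], 2.0)
```

In the study itself, `estimates` would hold the 1,000 replicated MLEs (or posterior means) of one parameter for a given sample size.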

Simulation study of random censored values

Similarly, we also consider a simulation study in the presence of censored data. The censoring times $C_{i}$ are sampled from the uniform distribution in the interval $(0, v), w h e r e v$ denotes the proportion of censored observations. In this study, the proportions of censored observations are approximately equal to 10% and 30%. In this scenario, we take the values of the parameters as $α = 2, τ = 5, k = 10, λ = 0.15$ . Table 4 lists the averages of the MLEs (Mean) and the MSEs. The figures in this table indicate that the MSEs increase when the censoring percentage increases. Further, the MSEs of the MLEs of $α, τ, k, a n d λ$ decay toward zero as the sample size increases, as expected under first-order asymptotic theory.

Table 5 lists the posterior means (Mean) and the MSEs. As the sample size increases and the censoring percentage decreases, the estimates move closer to the true values and the MSEs become smaller.

Scenario 1
$n$	Parameters	AEs	Biases	MSEs
50	$α$	2.0404	-0.0404	0.1984
	$τ$	5.3257	-0.3257	1.8523
	$k$	10.7653	-0.7653	2.9000
	$λ$	0.1708	-0.0208	0.0115
150	$α$	2.0393	-0.0393	0.0242
	$τ$	5.1585	-0.1585	0.2070
	$k$	9.8491	0.1509	1.9955
	$λ$	0.1528	-0.0028	0.0011
350	$α$	2.0065	-0.0065	0.0024
	$τ$	5.0417	-0.0417	0.0276
	$k$	10.012	-0.0012	0.2220
	$λ$	0.1511	-0.0011	0.0001
Scenario 2
$n$	Parameters	AEs	Biases	MSEs
50	$α$	21.1422	0.1489	7.7557
	$τ$	15.5491	-2.483	64.7128
	$k$	4.5288	-1.6533	22.2571
	$λ$	0.3400	-0.0518	0.0685
150	$α$	21.3407	-0.0496	2.1903
	$τ$	13.8973	-0.8312	9.9415
	$k$	3.2666	-0.3911	3.3779
	$λ$	0.3060	-0.0178	0.0167
350	$α$	21.2908	0.0003	0.8393
	$τ$	13.3138	-0.2477	3.0814
	$k$	3.0593	-0.1838	1.2018
	$λ$	0.2956	-0.0074	0.0058

Table 2 AEs, biases and MSEs for the estimates of the OLLGG parameters

In Figures 7 and 8, we present the OLLGG densities evaluated at the true parameter values and at the AEs of $α, τ, k$ and $λ$ (based on 1,000 samples) for n = 50, 150 and 350 in scenarios 1 and 2, respectively, with 10% and 30% of censored observations. These plots are in agreement with the first-order asymptotic theory for the MLEs and indicate a fast convergence even for small sample sizes and censored data.

Applications

In this section, we provide two applications to real data to demonstrate empirically the flexibility of the OLLGG model. The computations are performed using the R software and the NLMixed procedure in SAS. In the first application, we fit bimodal data and compare the OLLGG, GG and Weibull models. In the second application, we show the usefulness of the new distribution for censored data.

Figure 5 Some OLLGG density functions at the true parameter values and at the AEs for scenario 1.

Figure 6 Some OLLGG density functions at the true parameter values and at the AEs for scenario 2.

Temperature data

The first data set refers to daily temperatures (°C) $(n = 365)$ in the period from January 1 to December 31, 2011 in the city of Piracicaba, obtained from the Department of Biosystems Engineering of the Luiz de Queiroz Superior School of Agriculture (ESALQ), part of the University of São Paulo (USP).

We show the superiority of the OLLGG distribution as compared to some of its sub-models and also to the following non-nested models: the exponentiated generalized gamma (EGG) distribution proposed by Cordeiro et al.²⁸ and the beta Weibull (BW) distribution. The BW cdf (Famoye et al.²⁹) is given by

Scenario 1
$n$	Parameters	Means	Biases	MSEs
50	$α$	1.8130	0.1870	0.0703
	$τ$	4.1719	0.8281	1.1152
	$k$	9.9011	0.0989	0.0601
	$λ$	0.2795	-0.1295	0.0319
150	$α$	1.8891	0.1109	0.0240
	$τ$	4.4648	0.5352	0.4132
	$k$	9.9893	0.0107	0.0824
	$λ$	0.2005	-0.0505	0.0031
350	$α$	1.9283	0.0717	0.0128
	$τ$	4.6425	0.3575	0.2232
	$k$	9.9929	0.0071	0.0913
	$λ$	0.1812	-0.0312	0.0014
Scenario 2
$n$	Parameters	Means	Biases	MSEs
50	$α$	19.4002	1.8909	6.0127
	$τ$	10.6098	2.4563	15.3457
	$k$	5.2667	-2.3912	6.8778
	$λ$	0.4200	-0.1318	0.0536
150	$α$	20.4151	0.8760	1.5490
	$τ$	11.5327	1.5334	5.6849
	$k$	4.1478	-1.2723	2.3679
	$λ$	0.3344	-0.0462	0.0070
350	$α$	21.3516	-0.0605	0.1011
	$τ$	13.2395	-0.1734	0.2929
	$k$	3.0900	-0.2145	0.2465
	$λ$	0.3040	-0.0158	0.0020

Table 3 Posterior means, biases and MSEs for the estimates of the OLLGG parameters

$F (t) = \frac{1}{B (a, b)} \int_{0}^{1 - \exp [- {(\frac{t}{α})}^{γ}]} w^{a - 1} {(1 - w)}^{b - 1} d w .$
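As a brief check of this expression, the BW cdf can be written in terms of the regularized incomplete beta function. The symbol $I_{x} (a, b)$ is notation introduced here for illustration and is not used elsewhere in the text:

$F (t) = I_{x (t)} (a, b), \quad x (t) = 1 - \exp [- {(\frac{t}{α})}^{γ}], \quad I_{x} (a, b) = \frac{1}{B (a, b)} \int_{0}^{x} w^{a - 1} {(1 - w)}^{b - 1} d w .$

For $a = b = 1$ we have $B (1, 1) = 1$ and the integrand equals one, so that $F (t) = 1 - \exp [- {(\frac{t}{α})}^{γ}]$, which is the Weibull cdf with scale parameter $α$ and shape parameter $γ$.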

The Kumaraswamy generalized gamma (KumGG) distribution (for t > 0) is defined by Pascoa et al.⁵ Its density function with five positive parameters $α, τ, k, λ$ and $φ$ is given by

$f (t) = \frac{λ φ τ}{α Γ (k)} {(\frac{t}{α})}^{τ k - 1} \exp [- {(\frac{t}{α})}^{τ}] {\{γ_{1} [k, {(\frac{t}{α})}^{τ}]\}}^{λ - 1} {(1 - {\{γ_{1} [k, {(\frac{t}{α})}^{τ}]\}}^{λ})}^{φ - 1}$ , (26)

$n$	Parameters	Actual values	0%	10%	30%
50	$α$	2.00	2.0404 (0.1984)	2.0366 (0.2257)	2.0441 (0.2836)
	$τ$	5.00	5.3257 (1.8523)	5.395 (3.3121)	5.5626 (4.3955)
	$k$	10.00	10.7653 (2.9900)	10.9566 (3.20461)	11.2739 (3.63055)
	$λ$	0.15	0.1708 (0.0115)	0.1708 (0.0149)	0.1736 (0.0201)
150	$α$	2.00	2.0393 (0.0242)	2.0382 (0.03220)	2.0427 (0.0621)
	$τ$	5.00	5.1585 (0.2070)	5.1763 (0.2784)	5.2257 (0.5882)
	$k$	10.00	9.8491 (1.9955)	9.9663 (3.2201)	10.0686 (7.2771)
	$λ$	0.15	0.1528 (0.0011)	0.1521 (0.0015)	0.1539 (0.0022)
350	$α$	2.00	2.0065 (0.0024)	2.0089 (0.0033)	2.0181 (0.0115)
	$τ$	5.00	5.0417 (0.0276)	5.0483 (0.0315)	5.0823 (0.0969)
	$k$	10.00	10.0120 (0.2220)	9.9941 (0.3281)	9.9645 (1.3263)
	$λ$	0.15	0.1511 (0.0001)	0.1506 (0.0002)	0.1510 (0.0005)

Table 4 Averages of the MLEs (MSEs in parentheses) of the OLLGG parameters under 0%, 10% and 30% censoring

$n$	Parameters	Actual values	0%	10%	30%
50	$α$	2.00	1.8130 (0.0703)	1.7642 (0.1691)	1.6585 (0.3824)
	$τ$	5.00	4.1719 (1.1152)	3.9498 (1.8535)	3.6105 (2.7121)
	$k$	10.00	9.9011 (0.0601)	9.6298 (3.3379)	10.2626 (6.3004)
	$λ$	0.15	0.2795 (0.0319)	0.3293 (0.0623)	0.4377 (0.3072)
150	$α$	2.00	1.8891 (0.0240)	1.9183 (0.0474)	1.9070 (0.0548)
	$τ$	5.00	4.4648 (0.4132)	4.4970 (0.5506)	4.4131 (0.6805)
	$k$	10.00	9.9893 (0.0824)	9.6082 (3.9058)	9.5601 (4.2357)
	$λ$	0.15	0.2005 (0.0031)	0.2169 (0.0061)	0.2397 (0.0119)
350	$α$	2.00	1.9283 (0.0128)	1.9348 (0.0169)	1.9336 (0.0211)
	$τ$	5.00	4.6425 (0.2232)	4.6729 (0.2098)	4.6333 (0.2833)
	$k$	10.00	9.9929 (0.0913)	10.0471 (1.1531)	9.9108 (1.1627)
	$λ$	0.15	0.1812 (0.0014)	0.1795 (0.0012)	0.1876 (0.0020)

Table 5 Posterior means (MSEs in parentheses) of the OLLGG parameters under 0%, 10% and 30% censoring

where $γ_{1} (k, x) = \frac{γ (k, x)}{Γ (k)}$ is the incomplete gamma function ratio, $α$ is a scale parameter and the other positive parameters $τ, k, φ$ and $λ$ are shape parameters.

Next, we report the MLEs of the parameters, with their standard errors (SEs) in parentheses, and the values of the Akaike Information Criterion (AIC), Consistent Akaike Information Criterion (CAIC) and Bayesian Information Criterion (BIC). The lower the values of these criteria, the better the fit. In each case, the parameters are estimated by maximum likelihood using the NLMixed procedure in SAS.
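The three criteria can be computed from the maximized log-likelihood as in the sketch below. The formulas are assumptions, since the text does not state them: the usual AIC and BIC, and for CAIC a small-sample corrected form, AIC + 2p(p+1)/(n-p-1), chosen because it reproduces the OLLGG values reported in Table 6 for the temperature data. The input -2 log-likelihood of 1744.1 is back-calculated from the reported AIC (1752.1 minus 2 times 4) and is not a value given in the paper; the function name `information_criteria` is hypothetical.

```python
import math


def information_criteria(neg2loglik, p, n):
    """Return (AIC, CAIC, BIC) from -2 * maximized log-likelihood.

    p is the number of estimated parameters and n the sample size.
    The CAIC form used here is an assumption (small-sample corrected AIC).
    """
    aic = neg2loglik + 2 * p
    caic = aic + 2 * p * (p + 1) / (n - p - 1)
    bic = neg2loglik + p * math.log(n)
    return aic, caic, bic


# OLLGG fit to the temperature data: p = 4 parameters, n = 365 observations.
aic, caic, bic = information_criteria(1744.1, p=4, n=365)
print(f"AIC={aic:.1f}  CAIC={caic:.1f}  BIC={bic:.1f}")
```

Because BIC penalizes each parameter by log(n) rather than 2, it favors smaller models more strongly than AIC once n exceeds about 7.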

We compute the MLEs of the model parameters and the AIC, CAIC and BIC statistics for each model fitted to these data. The OLLGG model is fitted and compared with the fits of the two sub-models cited before. The results are reported in Table 6. The three information criteria agree on the ranking of the models. The lowest values of these criteria correspond to the OLLGG distribution, which could be preferred in this case.

Figure 7 Some OLLGG density functions at the true parameter values and at the AEs for scenario 1 and censored data.

Figure 8 Some OLLGG density functions at the true parameter values and at the AEs for scenario 2 and censored data.

We perform LR tests to verify whether the extra shape parameter $λ$ is really necessary. Formal tests for the skewness parameter in the generated distribution can be based on LR statistics. The LR statistics for comparing the fitted models are listed in Table 7. We reject the null hypotheses in the two tests in favor of the wider distribution. The rejection is highly significant and gives clear evidence of the need for the shape parameter $λ$ when modeling these data. More information is provided by a visual comparison of the histogram of the data and the fitted density functions. The plots of the fitted OLLGG, GG and Weibull densities are displayed in Figure 9a. The estimated OLLGG density provides the closest fit to the histogram of the data.

Model	$α$	$τ$	$k$	$λ$	$φ$	AIC	CAIC	BIC
OLLGG	21.2911	13.0661	2.8755	0.2882		1752.1	1752.2	1767.7
	(0.0012)	(0.0234)	(0.1095)	(0.0127)
KumGG	25.3965	25.2759	12.8897	0.0243	2.3730	1780.6	1780.7	1800.1
	(1.6147)	(3.0850)	(0.6885)	(0.0079)	(2.4887)
EGG	23.8850	22.9475	12.8766	0.0215	1	1777.6	1777.7	1793.2
	(2.8175)	(7.7331)	(9.1805)	(0.0019)
GG	26.1868	33.1789	0.1888	1		1777.6	1777.7	1788.3
	(0.1877)	(7.5737)	(0.0514)	(-)
Weibull	23.5808	9.4296	1	1		1796.4	1796.5	1804.2
	(0.1376)	(0.4038)	(-)	(-)
	$α$	$γ$	$a$	$b$
BW	25.0516	25.7636	0.2460	0.6159		1778.1	1778.3	1793.7
	(1.3335)	(8.2195)	(0.0858)	(0.4512)

Table 6 MLEs of the model parameters for the temperature data and information criteria

Models	Hypotheses	Statistic w	p-value
OLLGG vs GG	$H_{0} : λ = 1$ vs $H_{1} : H_{0}$ is false	26.5	<0.0001
OLLGG vs Weibull	$H_{0} : λ = k = 1$ vs $H_{1} : H_{0}$ is false	48.3	<0.0001

Table 7 LR tests
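The p-values of such LR tests can be sketched as follows, assuming the usual chi-square reference distribution with degrees of freedom equal to the number of restrictions under $H_{0}$ (one for $λ = 1$, two for $λ = k = 1$). The function name `lr_pvalue` is hypothetical, and only the closed forms for one and two degrees of freedom are implemented so that the example needs nothing beyond the standard library.

```python
import math


def lr_pvalue(w, df):
    """Upper-tail chi-square probability of the LR statistic w (df = 1 or 2 only)."""
    if w < 0:
        raise ValueError("the LR statistic must be non-negative")
    if df == 1:
        # P(Z^2 > w) for a standard normal Z
        return math.erfc(math.sqrt(w / 2.0))
    if df == 2:
        # chi-square with 2 df is exponential with mean 2
        return math.exp(-w / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


p_gg = lr_pvalue(26.5, df=1)       # OLLGG vs GG, Table 7
p_weibull = lr_pvalue(48.3, df=2)  # OLLGG vs Weibull, Table 7
print(f"OLLGG vs GG: p = {p_gg:.2e};  OLLGG vs Weibull: p = {p_weibull:.2e}")
```

Both values fall well below 0.0001, in line with Table 7. The same function applied to w = 13.0 with one degree of freedom gives approximately 0.00031, the p-value reported in Table 13 for the serum reversal data.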

In order to assess if the model is appropriate, plots of the fitted OLLGG, GG and Weibull cumulative distributions and the empirical cdf are displayed in Figure 9b. They indicate that the OLLGG distribution gives a good fit to these data.

Under a Bayesian approach, we also fit the OLLGG model and some of the models described above. For each model fitted to these data, the Bayesian estimates of the model parameters and the DIC, EAIC and EBIC statistics are shown in Tables 8 and 9, respectively. According to the three Bayesian information criteria, the OLLGG model stands out as the best one.
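A sketch of how these three criteria are commonly computed from MCMC output is given below. The definitions are assumptions, as they are not restated in this section: with posterior mean deviance D-bar and deviance at the posterior mean D-hat, DIC = D-bar + (D-bar - D-hat), EAIC = D-bar + 2p and EBIC = D-bar + p log(n). The function name `bayes_criteria` and the deviance draws are hypothetical and serve only to exercise the formulas.

```python
import math
import statistics


def bayes_criteria(deviance_draws, deviance_at_post_mean, p, n):
    """Return DIC, EAIC and EBIC from posterior deviance draws (assumed definitions)."""
    d_bar = statistics.fmean(deviance_draws)  # posterior mean deviance
    p_d = d_bar - deviance_at_post_mean       # effective number of parameters
    return {
        "DIC": d_bar + p_d,
        "EAIC": d_bar + 2 * p,
        "EBIC": d_bar + p * math.log(n),
    }


# Hypothetical deviance draws, not taken from the paper.
draws = [1743.0, 1744.5, 1745.0, 1746.0]
out = bayes_criteria(draws, deviance_at_post_mean=1742.0, p=4, n=365)
print({name: round(value, 3) for name, value in out.items()})
```

Under these definitions EBIC - EAIC = p log(n) - 2p, which for p = 4 and n = 365 is about 15.600. This matches the difference between the OLLGG entries of Table 9 (1768.146 - 1752.546) and lends some support to the assumed forms.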

Survival data

AIDS is a pathology that mobilizes its sufferers because of the implications for their interpersonal relationships and reproduction. Therapeutic advances have enabled seropositive women to bear children safely. In this respect, the pediatric immunology outpatient service and social service of Hospital das Clínicas have a special program for care of newborns of seropositive mothers to provide orientation and support for antiretroviral therapy to allow these women and their babies to live as normally as possible. Here, we analyze a data set on the time to serum reversal of 148 children exposed to HIV by vertical transmission, born at Hospital das Clínicas (associated with the Ribeirão Preto School of Medicine) from 1995 to 2001, where the mothers were not treated (Silva;³⁰ Perdoná³¹). Vertical HIV transmission can occur during gestation in around 35% of cases, during labor and birth itself in some 65% of cases, or during breast feeding, varying from 7% to 22% of cases. Serum reversal or serological reversal can occur in children of HIV-infected mothers. It is the process by which HIV antibodies disappear from the blood of an individual who tested positive for HIV infection. As the months pass, the maternal antibodies are eliminated and the child ceases to be HIV positive. The exposed newborns were monitored until definition of their serological condition, after administration of Zidovudine (AZT) in the first 24 hours and for the following 6 weeks. We assume that the lifetimes are independently distributed, and also independent of the censoring mechanism.
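Under these two assumptions, the log-likelihood for right-censored data takes the standard form sketched below. The notation is introduced here for illustration: $F$ and $C$ denote the sets of individuals with observed and censored times, respectively, and $f (t ; θ)$ and $S (t ; θ)$ are the density and survival functions of the fitted model.

$l (θ) = \sum_{i \in F} \log f (t_{i} ; θ) + \sum_{i \in C} \log S (t_{i} ; θ) .$

An observed serum reversal time contributes through the density, whereas a censored child contributes only the probability that the reversal time exceeds the last follow-up time.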

Figure 9 (a) Estimated densities of the OLLGG, GG and Weibull models for temperature data. (b) Estimated cumulative functions of the OLLGG, GG and Weibull models and the empirical cdf for temperature data.

Model	$α$	$τ$	$k$	$λ$	$φ$
OLLGG	20.5189 (0.7529)	12.6685 (1.2244)	4.3121 (1.2345)	0.2245 (0.0456)
	(18.8963, 21.8643)	(10.2667, 14.8841)	(2.0842, 6.8480)	(0.1573, 0.3168)
KumGG	25.4831 (0.1970)	25.7606 (0.2195)	13.3406 (0.1878)	0.0226 (0.00109)	2.3779 (0.1234)
	(25.1727, 25.8274)	(25.3986, 26.1771)	(13.0556, 13.7286)	(0.0205, 0.0247)	(2.1607, 2.6765)
EGG	24.2333 (0.1250)	23.9107 (0.3303)	9.5727 (0.6326)	0.0278 (0.0022)
	(23.9937, 24.4557)	(23.3771, 24.4652)	(8.6935, 10.9631)	(0.0237, 0.0322)
GG	26.1305 (0.2334)	32.4133 (7.9028)	0.2104 (0.0669)
	(25.6783, 26.5446)	(18.2675, 48.2415)	(0.0981, 0.3381)
Weibull	23.5782 (0.1381)	9.3741 (0.4078)
	(23.3033, 23.8465)	(8.5262, 10.1351)

Table 8 Posterior mean (standard deviation) and 95% Highest Posterior Density (HPD) interval of the model parameters

Model	DIC	EAIC	EBIC
OLLGG	1746.344	1752.546	1768.146
KumGG	1775.319	1783.009	1802.508
EGG	1773.724	1779.722	1795.322
GG	1774.657	1779.718	1791.418
Weibull	1796.501	1798.483	1806.283

Table 9 Bayesian information criteria

Tables 10-12 list, respectively, the MLEs with their SEs in parentheses, the posterior means (standard deviations) with 95% highest posterior density (HPD) intervals for the parameters, and the values of the Bayesian model selection statistics. These results indicate that the OLLGG model has the lowest BIC, DIC and EBIC values among all fitted models, with AIC, CAIC and EAIC values close to those of the KumGG model, and hence it could be chosen as the best model.

Note that the KumGG model is competitive with the OLLGG model. However, the KumGG model has two disadvantages:

It does not model bimodal data.

It has five parameters, i.e., it is less parsimonious.

Model	$α$	$τ$	$k$	$λ$	$φ$	AIC	BIC	CAIC
OLLGG	352.0	46.9706	0.1043	0.4468		771.1	783.6	771.9
	(1.0590)	(1.4847)	(0.0324)	(0.0881)
KumGG	350.05	49.8303	0.2176	0.1282	0.3424	770.7	785.7	771.1
	(1.5707)	(5.8895)	(0.0073)	(0.0236)	(0.0522)
EGG	350.45	22.2991	1.0741	0.1072	1	798.1	810.1	798.3
	(2.4187)	(0.0375)	(0.0004)	(0.0113)
GG	379.40	24.5312	0.0974	1	1	783.7	792.7	783.9
	(8.8211)	(10.3258)	(0.0402)
Weibull	307.62	3.1132	1	1	1	808.0	814.0	808.1
	(12.3523)	(0.3250)
	$α$	$γ$	$a$	$b$
BW	349.99	6.3895	0.3944	0.9273		797.9	809.9	798.2
	(23.0923)	(0.7657)	(0.0468)	(0.3361)

Table 10 MLEs of the model parameters for the serum reversal data, the corresponding SEs (given in parentheses) and the AIC, BIC and CAIC statistics

A comparison of the proposed distribution with some of its sub-models using LR statistics is performed in Table 13. The figures in this table, especially the p-values, suggest that the OLLGG model yields a better fit to these data than its sub-models. In order to assess if the model is appropriate, plots of the estimated survival functions of the OLLGG distribution and of the KumGG, EGG, GG, Weibull and BW distributions, together with the empirical survival function, are given in Figure 10. We conclude that the OLLGG distribution provides a good fit for these data.

Model	$α$	$τ$	$k$	$λ$	$φ$
OLLGG	348.9 (11.5813)	47.7542 (22.7428)	0.1741 (0.1443)	0.4342 (0.1619)
	(324.1, 366.5)	(15.3289, 98.0374)	(0.0230, 0.4910)	(0.1331, 0.7222)
KumGG	351 (1.0623)	42.8395 (1.4827)	0.0114 (0.00383)	3.0697 (0.5911)	0.3601 (0.0550)
	(349.0, 353.1)	(39.7113, 45.1984)	(0.0058, 0.0191)	(1.7862, 4.0205)	(0.2678, 0.4790)
EGG	348.6 (0.8519)	19.7657 (1.1290)	4.3776 (1.0638)	0.0309 (0.0097)
	(347.3, 350.4)	(18.3590, 22.5768)	(2.5525, 6.0764)	(0.0177, 0.0505)
GG	376.3 (6.7347)	44.2185 (16.2531)	0.0652 (0.0341)
	(364.5, 389.9)	(15.9226, 71.1764)	(0.0279, 0.1302)
Weibull	307.5 (12.6278)	3.0864 (0.3237)
	(283.7, 333.4)	(2.4619, 3.7203)

Table 11 Posterior means (standard deviations) and 95% HPD intervals for the model parameters in the serum reversal data

Model	DIC	EAIC	EBIC
OLLGG	752.017	775.385	787.3738
KumGG	764.746	772.79	787.776
EGG	781.475	788.53	800.519
GG	776.599	783.425	792.417
Weibull	807.984	809.989	815.983

Table 12 Bayesian information criteria for the serum reversal data

Model	Hypotheses	Statistic w	p-value
OLLGG vs GG	$H_{0} : λ = 1$ vs $H_{1} : H_{0}$ is false	13.0	0.00031
OLLGG vs Weibull	$H_{0} : φ = λ = k = 1$ vs $H_{1} : H_{0}$ is false	40.3	<0.0001

Table 13 LR statistics for the serum reversal data

Concluding remarks

The odd log-logistic generalized gamma (OLLGG) distribution provides a rather general and flexible framework for statistical analysis of positive data. It unifies some previously known distributions and yields a general overview of these distributions for theoretical studies. It also represents a rather flexible mechanism for fitting a wide spectrum of real world data sets. The OLLGG distribution is motivated by the wide use of the generalized gamma (GG) distribution in practice, and also by the fact that the generalization provides more flexibility to analyze skewed data. This extension provides a continuous crossover to other cases with different shapes (e.g. a particular combination of skewness and kurtosis). We derive an expansion for the density function as a linear combination of GG density functions. We obtain explicit expressions for the moments and moment generating function. The estimation of parameters is approached by the maximum likelihood method and by a Bayesian approach, where Gibbs algorithms along with Metropolis steps are used to obtain the posterior summaries of interest for survival data with right censoring. Two applications of the OLLGG distribution to real data show that it could provide a better fit than other statistical models frequently used in lifetime data analysis.

Figure 10 Estimated survival functions obtained by fitting the OLLGG distribution and some other models, and the empirical survival function for the serum reversal data. (a) OLLGG vs KumGG and GG. (b) OLLGG vs BW and Weibull.