Research Article Volume 8 Issue 6
Department of Statistics, Mathematics and Insurance, Faculty of Commerce, Tanta University, Egypt
Correspondence: Dina H. Abdel Hady, Department of Statistics, Mathematics and Insurance, Faculty of Commerce, Tanta University, Egypt
Received: September 24, 2019 | Published: December 30, 2019
Citation: Abdelhady DH. Parameter estimation for the bivariate inverse lomax distribution by the EM algorithm based on censored samples. Biom Biostat Int J. 2019;8(6):223-229. DOI: 10.15406/bbij.2019.08.00293
In this article, we estimate the parameters of the Marshall-Olkin bivariate Inverse Lomax distribution based on right-censored samples. The EM algorithm is a natural choice because the vector of observed data is incomplete and can be viewed as an observable function of complete data; the algorithm then exploits the simplicity of maximum likelihood estimation for complete data. In addition, standard deviations of the estimates of the bivariate Inverse Lomax distribution are derived.
A simulation study compares the estimates obtained with the EM algorithm to those obtained without it.
Keywords: bivariate lomax distribution, censored sample, EM algorithm, missing data, simulation study
The Inverse Lomax distribution is one of the inverted distributions; it is very flexible for analyzing situations with a non-monotonic failure rate (Singh et al.1). If a random variable Y has a Lomax distribution, then X = 1/Y has an Inverse Lomax distribution (ILD) (Singh et al.2). The ILD is used in stochastic modeling of the lifetimes of components with a decreasing failure rate. Like the other distributions in the generalized Beta family, the Inverse Lomax distribution also has applications in actuarial science and economics (Kleiber and Kotz3).
McKenzie et al.4 applied the Inverse Lomax distribution to geophysical databases concerning the sizes of land fires in the state of California. Rahman et al.5 carried out research on statistical inference and prediction for the Inverse Lomax distribution via Bayesian methods. Kleiber6 used the Inverse Lomax distribution to establish the Lorenz ordering relationship between order statistics.
In this article, maximum likelihood estimates (MLEs) of the parameters of the Marshall-Olkin bivariate Inverse Lomax distribution (MOBIL)7 are obtained from censored samples. The censoring time T is assumed to be independent of the lifetimes (X, Y) of the two components. This situation arises, for example, in medical studies of paired organs such as kidneys, eyes or lungs, which behave like an interdependent two-component system: death of the individual may censor the failure of one of the paired organs or of both.
This censoring scheme is right censoring. A similar situation occurs in engineering whenever a sub-system consists of two components with lifetimes (X, Y) that are independent of the lifetime T of the entire system; failure of the main system may censor the failure of either component or of both. Maximum likelihood estimators of the parameters under univariate right censoring were derived by Hanagal.8,9 Censoring can also arise in other ways: patients may be lost to follow-up during the study, may move elsewhere so that the experimenter can no longer trace them, or may stop cooperating because of adverse side effects of the treatment. These cases are called withdrawals from the study. Censored observations still carry useful information and should therefore not be omitted from the analysis. Because data on real processes were not available, the data in this study are generated from the Marshall-Olkin bivariate Inverse Lomax distribution7 with Matlab, and the parameters are then estimated using the EM algorithm.
Let X be a random variable with the following CDF:

F_IL(x) = (1 + θ/x)^(−α),  x > 0,  α, θ > 0   (1)

A distribution of this form is said to be an Inverse Lomax distribution with parameters α and θ, denoted IL(α, θ). The PDF of the Inverse Lomax distribution with parameters (α, θ) is

f_IL(x; α, θ) = (αθ/x²)(1 + θ/x)^(−α−1),  x > 0   (2)
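As a quick sanity check on (1) and (2), the following is a minimal Python sketch of the Inverse Lomax CDF and PDF (the function names are illustrative, not from the paper); the PDF should agree with a numerical derivative of the CDF.

```python
def il_cdf(x, alpha, theta):
    """CDF of the Inverse Lomax distribution, F(x) = (1 + theta/x)^(-alpha)."""
    return (1.0 + theta / x) ** (-alpha)

def il_pdf(x, alpha, theta):
    """PDF of the Inverse Lomax distribution,
    f(x) = alpha*theta/x^2 * (1 + theta/x)^(-alpha-1)."""
    return alpha * theta / x**2 * (1.0 + theta / x) ** (-alpha - 1)

# The PDF should match a central-difference derivative of the CDF.
alpha, theta, x, h = 2.0, 1.5, 0.7, 1e-6
numeric = (il_cdf(x + h, alpha, theta) - il_cdf(x - h, alpha, theta)) / (2 * h)
assert abs(numeric - il_pdf(x, alpha, theta)) < 1e-5
```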
Suppose U1, U2 and U3 are three independent random variables with Ui ~ IL(αi, θ) for i = 1, 2, 3, where α1, α2, α3, θ > 0. Define X = max(U1, U3) and Y = max(U2, U3).
Then the bivariate vector (X, Y) is said to have a Marshall-Olkin bivariate Inverse Lomax distribution.7
The joint cumulative distribution function of X and Y is

F_X,Y(x, y) = (1 + θ/x)^(−α1) (1 + θ/y)^(−α2) (1 + θ/min(x, y))^(−α3)
            = F_IL(x; α1, θ) F_IL(y; α2, θ) F_IL(min(x, y); α3, θ)   (3)

Equivalently,

F_X,Y(x, y) = { F1(x, y),  0 < x < y < ∞
             { F2(x, y),  0 < y < x < ∞
             { F0(x),     0 < x = y < ∞   (4)

where

F1(x, y) = F(x; α1 + α3, θ) F(y; α2, θ) = (1 + θ/x)^(−(α1+α3)) (1 + θ/y)^(−α2)

F2(x, y) = F(y; α2 + α3, θ) F(x; α1, θ) = (1 + θ/y)^(−(α2+α3)) (1 + θ/x)^(−α1)

F0(x) = F(x; α, θ) = (1 + θ/x)^(−α)

and α = α1 + α2 + α3.
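The product form (3) and the piecewise form (4) describe the same function, which can be verified numerically with a small sketch (function names are illustrative):

```python
def il_cdf(x, alpha, theta):
    return (1.0 + theta / x) ** (-alpha)

def mobil_cdf(x, y, a1, a2, a3, theta):
    """Joint CDF (3): product of three Inverse Lomax factors with min(x, y)."""
    return il_cdf(x, a1, theta) * il_cdf(y, a2, theta) * il_cdf(min(x, y), a3, theta)

def mobil_cdf_cases(x, y, a1, a2, a3, theta):
    """Piecewise form (4): F1 on x < y, F2 on y < x, F0 on the diagonal."""
    if x < y:   # alpha3 attaches to the smaller coordinate x
        return (1 + theta / x) ** (-(a1 + a3)) * (1 + theta / y) ** (-a2)
    if y < x:   # alpha3 attaches to y
        return (1 + theta / y) ** (-(a2 + a3)) * (1 + theta / x) ** (-a1)
    return (1 + theta / x) ** (-(a1 + a2 + a3))   # x == y

# The two forms agree on each region and on the diagonal.
for (x, y) in [(0.5, 2.0), (2.0, 0.5), (1.3, 1.3)]:
    assert abs(mobil_cdf(x, y, 1.6, 1.4, 1.8, 1.0) -
               mobil_cdf_cases(x, y, 1.6, 1.4, 1.8, 1.0)) < 1e-12
```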
The joint probability density function f_X,Y(x, y) of X and Y takes the form

f_X,Y(x, y) = { f1(x, y)  if x < y
             { f2(x, y)  if y < x
             { f0(x)     if x = y   (5)

where

f1(x, y) = (α1 + α3) α2 (θ²/(x² y²)) (1 + θ/x)^(−(α1+α3)−1) (1 + θ/y)^(−α2−1)

f2(x, y) = (α2 + α3) α1 (θ²/(x² y²)) (1 + θ/y)^(−(α2+α3)−1) (1 + θ/x)^(−α1−1)

and

f0(x) = (θα3/x²) (1 + θ/x)^(−α−1)
The density functions of X | {(x, y) | x > y}, Y | {(x, y) | y > x} and Z = min(X, Y) are given as follows:

f_{X | x > y}(x) = (θ(α1 + α3)/x²) (1 + θ/x)^(−(α1+α3)−1) (1 + θ/y)^(α1+α3),  x > y

f_{Y | y > x}(y) = (θ(α2 + α3)/y²) (1 + θ/y)^(−(α2+α3)−1) (1 + θ/x)^(α2+α3),  y > x

f_Z(z) = (θα/z²) (1 + θ/z)^(−α−1),  Z = min(X, Y)
This article aims at deriving an estimation method for the parameters of a bivariate Inverse Lomax distribution of Marshall-Olkin by the EM algorithm. In Section 2, the EM algorithm is presented, while in Section 3, the parameters are estimated by applying the EM algorithm. Finally in Section 4, we present the results of a simulation study.
The expectation-maximization (EM) algorithm was introduced by Dempster et al.10 It is an iterative procedure for finding maximum likelihood estimates when data are missing, incomplete, unobserved or censored. Because it is easy to implement, the EM algorithm has had a great impact, not only as a computational tool but also as a method for solving complicated statistical problems.
The basic idea behind the method is to transform an incomplete-data problem into a complete-data problem for which the required maximization is computationally more tractable and numerically stable. Each iteration raises the likelihood, which finally converges to a local maximum. The complete data set can be viewed as consisting of vectors (t, t*), where t is the observed incomplete data and t* is the missing data.
The EM Algorithm method has been applied by several authors, for example, see Qin, et al.,11 Rudolf,12 Ning, et al.,13 Zanini, et al.,14 Açıkgöz,15 Kalabatsos,16 Attia, et al.17 and Hanagal and Ahmadi.18
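Before specializing to the bivariate Inverse Lomax model, the E-step/M-step mechanics can be illustrated on a much simpler censored model. The following is a sketch, not from the paper, using a right-censored exponential sample (the rate, censoring point and sample size are illustrative); the E-step fills in E[T | T > c] = c + 1/λ for censored units, and the M-step applies the complete-data MLE. The fixed point agrees with the known closed-form censored-sample MLE.

```python
import random

rng = random.Random(7)
lam_true, c = 2.0, 0.4
raw = [rng.expovariate(lam_true) for _ in range(2000)]
obs = [min(t, c) for t in raw]          # right-censored at c
is_cens = [t >= c for t in raw]

lam = 1.0
for _ in range(200):
    # E-step: expected complete lifetime of a censored unit is c + 1/lam
    # (memorylessness of the exponential distribution).
    total = sum(c + 1.0 / lam if d else o for o, d in zip(obs, is_cens))
    # M-step: complete-data exponential MLE.
    lam_new = len(obs) / total
    if abs(lam_new - lam) < 1e-10:
        break
    lam = lam_new

d = sum(1 for f in is_cens if not f)    # number of uncensored observations
closed_form = d / sum(obs)              # known censored-exponential MLE
assert abs(lam - closed_form) < 1e-6
```

Each EM iteration increases the observed-data likelihood, mirroring inequality (8) below.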
The iterations
The objective is to draw inferences about the parameter vector α_. We will use L(α_, t) to denote the likelihood function, where t is the vector of observed data, and let t* denote the vector of missing data. Starting from a guessed value of the parameter α_, the following iterations are carried out:
Execution of the algorithm
The two steps are repeated iteratively until the difference between two successive iterations is less than 0.00001. This iterative procedure leads to a monotonic increase of log L(α_, E[T*|t]):

log L(α_(k+1), E[T*|t]) ≥ log L(α_(k), E[T*|t]),  k = 1, 2, ...   (8)
Since the likelihood increases at each step, the EM algorithm generally converges to a local maximum. When the M-step has no closed-form solution, a numerical algorithm such as the Newton-Raphson procedure may be used to compute α_(k) iteratively. In this paper, the Newton-Raphson procedure is used to obtain the maximum likelihood estimate of α_ at the (k+1)th iteration as follows:

α_(k+1) = α_(k) − [∂²Q(α_|α_(k)) / ∂α_ ∂α_′]⁻¹ (∂Q(α_|α_(k)) / ∂α_),  evaluated at α_ = α_(k)   (9)

The iterative procedure is carried out until |α_(k+1) − α_(k)| < 0.0001; for more details see Hanagal and Ahmadi.18
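The Newton-Raphson update (9) can be sketched in one dimension. The example below is illustrative (not the paper's bivariate M-step): for a complete Inverse Lomax sample with θ known, the score is n/α − Σ ln(1 + θ/xᵢ) and the second derivative is −n/α², so the Newton iteration converges to the closed-form MLE α̂ = n / Σ ln(1 + θ/xᵢ).

```python
import math, random

random.seed(1)
theta, alpha_true = 1.0, 2.0
# Simulate a complete Inverse Lomax sample via the quantile
# x = theta / (u^(-1/alpha) - 1), obtained by inverting (1).
xs = [theta / (random.random() ** (-1.0 / alpha_true) - 1.0) for _ in range(5000)]

n = len(xs)
s = sum(math.log(1.0 + theta / x) for x in xs)

# Newton-Raphson on the score n/alpha - s = 0, the scalar analogue of (9).
a = 1.0
for _ in range(100):
    score = n / a - s
    hess = -n / a**2
    a, prev = a - score / hess, a
    if abs(a - prev) < 1e-12:
        break

assert abs(a - n / s) < 1e-8   # converges to the closed-form MLE
```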
The density function of (X,Y) is given by
f_X,Y(x, y) = { (α1 + α3) α2 (θ²/(x² y²)) (1 + θ/x)^(−(α1+α3)−1) (1 + θ/y)^(−α2−1)  for 0 < x < y < ∞
             { (α2 + α3) α1 (θ²/(x² y²)) (1 + θ/y)^(−(α2+α3)−1) (1 + θ/x)^(−α1−1)  for 0 < y < x < ∞
             { (θα3/x²) (1 + θ/x)^(−α−1)                                            for x = y   (10)
For the bivariate lifetime distribution, we use the univariate censoring scheme of Hanagal:8,9 individuals do not join the study at the same time, and withdrawal or death of an individual, or the end of the study, censors the lifetimes of both components. The censoring time is independent of the lifetimes of both components. This is the standard univariate right-censoring scheme.
Suppose there are n independent pairs of components under study (for example, paired kidneys, lungs, eyes or ears of an individual), where the ith pair has lifetimes (Xi, Yi) and censoring time Ti. The observed lifetimes associated with the ith pair are given by

(Xi, Yi) = { (Xi, Yi)      if max(Xi, Yi) < Ti
           { (Xi, Tiy)     if Xi < Tiy < Yi
           { (Tix, Yi)     if Yi < Tix < Xi
           { (Tixy, Tixy)  if min(Xi, Yi) > Tixy   (11)

for i = 1, 2, ..., n,

where Tix, Tiy and Tixy represent the unobserved random variables in the cases x < y, x > y and x = y, respectively.
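The censoring scheme (11) partitions the sample into six cases, matching the six densities in (13) below. A minimal sketch of this classification (the function name is illustrative):

```python
from collections import Counter

def censor_pair(x, y, t):
    """Apply the univariate right-censoring scheme (11) to one pair.
    Returns (observed_x, observed_y, case), case 1..6 matching (13)."""
    if max(x, y) < t:              # both failures observed before censoring
        if x < y:
            return (x, y, 1)
        if y < x:
            return (x, y, 2)
        return (x, y, 3)           # simultaneous failure (diagonal)
    if x < t < y:                  # y censored at t
        return (x, t, 4)
    if y < t < x:                  # x censored at t
        return (t, y, 5)
    return (t, t, 6)               # both censored: min(x, y) > t

# Tallies n1..n6 used throughout the log-likelihood (14):
pairs = [(0.2, 0.9, 1.0), (0.9, 0.2, 1.0), (0.5, 0.5, 1.0),
         (0.3, 2.0, 1.0), (2.0, 0.3, 1.0), (2.0, 3.0, 1.0)]
counts = Counter(censor_pair(*p)[2] for p in pairs)
assert [counts[j] for j in range(1, 7)] == [1, 1, 1, 1, 1, 1]
```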
The likelihood of a sample of size n, after discarding factors that do not contain any of the parameters of interest, is given by

L(α_, tix, tiy, tixy) = ∏_{j=1}^{6} ∏_{i=1}^{nj} f_{jX,Y}(xi, yi)   (12)
where
f_{1X,Y}(x, y) = (α1 + α3) α2 (θ²/(x² y²)) (1 + θ/x)^(−(α1+α3)−1) (1 + θ/y)^(−α2−1),  0 < x < y < t
f_{2X,Y}(x, y) = (α2 + α3) α1 (θ²/(x² y²)) (1 + θ/y)^(−(α2+α3)−1) (1 + θ/x)^(−α1−1),  0 < y < x < t
f_{3X,Y}(x, y) = (θα3/x²) (1 + θ/x)^(−α−1),  0 < x = y < t
f_{4X,Y}(x, y) = (α1 + α3) α2 (θ²/(x² ty²)) (1 + θ/x)^(−(α1+α3)−1) (1 + θ/ty)^(−α2−1),  0 < x < t < ty = y
f_{5X,Y}(x, y) = (α2 + α3) α1 (θ²/(tx² y²)) (1 + θ/y)^(−(α2+α3)−1) (1 + θ/tx)^(−α1−1),  0 < y < t < tx = x
f_{6X,Y}(x, y) = (θα3/txy²) (1 + θ/txy)^(−α−1),  0 < t < txy = min(x, y)   (13)
where Σ_{j=1}^{6} nj = n, and n1, n2, n3, n4, n5, n6 are the numbers of realizations falling in the range corresponding to f_{jX,Y}(x, y), j = 1, 2, ..., 6, respectively. f_{1X,Y}, f_{2X,Y}, f_{4X,Y} and f_{5X,Y} are density functions with respect to Lebesgue measure on R², while f_{3X,Y} and f_{6X,Y} are density functions with respect to Lebesgue measure on R.
Let Aj denote the range of variability corresponding to f_{jX,Y}(x, y), j = 1, 2, ..., 6, so that A1, ..., A6 are disjoint sets. Letting E1 = A1 ∪ A4, E2 = A2 ∪ A5, E3 = A3 ∪ A6, E4 = E1 ∪ E2, E5 = E1 ∪ A2 ∪ A3 and E6 = E2 ∪ A1, the log-likelihood function ln Lc(α_, tix, tiy, tixy) can be written as follows:
ln L = (n1 + n4)(ln(α1 + α3) + ln α2) + (n2 + n5)(ln(α2 + α3) + ln α1) + (n3 + n6) ln α3
 − 2 Σ_{i∈E5} ln(xi) − 2 Σ_{i∈E6} ln(yi) − (α1 + α3 + 1) Σ_{i∈E1} ln(1 + θ/xi) − (α + 1) Σ_{i∈A3} ln(1 + θ/xi)
 − (α1 + 1) Σ_{i∈A2} ln(1 + θ/xi) − (α2 + α3 + 1) Σ_{i∈E2} ln(1 + θ/yi) − (α2 + 1) Σ_{i∈A6} ln(1 + θ/yi)
 − 2 Σ_{i∈A4} ln(tyi) − 2 Σ_{i∈A5} ln(txi) − (α1 + 1) Σ_{i∈A5} ln(1 + θ/txi) − (α2 + 1) Σ_{i∈A4} ln(1 + θ/tyi)
 − 2 Σ_{i∈A6} ln(txyi) − (α + 1) Σ_{i∈A6} ln(1 + θ/txyi) + n ln(θ) + 2(n1 + n4 + n2 + n5) ln(θ)   (14)
Parameter estimation with the EM algorithm
The E-step and M-step are obtained as follows.
E-step:
The unobserved random variables Tix, Tiy and Tixy follow the distributions stated in (7). The conditional distributions of Tx | {(tx, t, y) | tx > t > y}, Ty | {(ty, t, x) | ty > t > x} and Txy | {(txy, t) | txy > t} are given as follows:
f_{Tx|(tx,t,y)}(tx) = (θ(α1 + α3)/tx²) (1 + θ/tx)^(−(α1+α3)−1) (1 + θ/t)^(α1+α3),  y < t < tx
f_{Ty|(ty,t,x)}(ty) = (θ(α2 + α3)/ty²) (1 + θ/ty)^(−(α2+α3)−1) (1 + θ/t)^(α2+α3),  x < t < ty
f_{Txy|(txy,t)}(txy) = (θα/txy²) (1 + θ/txy)^(−α−1) (1 + θ/t)^α,  txy > t   (15)
The first moments of the conditional unobserved random variables (1 + θ/Tx) | {(tx, t, y) | tx > t > y}, (1 + θ/Ty) | {(ty, t, x) | ty > t > x} and (1 + θ/Txy) | {(txy, t) | txy > t} are

E[(1 + θ/Tx) | tx > t > y] = ((α1 + α3)/((α1 + α3) − 1)) (1 + θ/t) = γ1 (1 + θ/t)  (say)
E[(1 + θ/Ty) | ty > t > x] = ((α2 + α3)/((α2 + α3) − 1)) (1 + θ/t) = γ2 (1 + θ/t)  (say)
E[(1 + θ/Txy) | txy > t] = (α/(α − 1)) (1 + θ/t) = γ3 (1 + θ/t)  (say)   (16)
The conditional expectation Q(α_ | α_(k)) is obtained as follows:

Q(α_ | α_(k)) = (n1 + n4)(ln(α1 + α3) + ln α2) + (n2 + n5)(ln(α2 + α3) + ln α1) + (n3 + n6) ln α3
 − 2 Σ_{i∈E5} ln(xi) − 2 Σ_{i∈E6} ln(yi) − (α1 + α3 + 1) Σ_{i∈E1} ẋi − (α + 1) Σ_{i∈A3} ẋi
 − (α1 + 1) Σ_{i∈A2} ẋi − (α2 + α3 + 1) Σ_{i∈E2} ẏi − (α2 + 1) Σ_{i∈A6} ẏi
 + 2 Σ_{i∈A4} ln((γ2(k) − 1)/θ + γ2(k)/ti) + 2 Σ_{i∈A5} ln((γ1(k) − 1)/θ + γ1(k)/ti)
 − (α1 + 1) Σ_{i∈A5} (γ1(k) + ṫi) − (α2 + 1) Σ_{i∈A4} (γ2(k) + ṫi)
 + 2 Σ_{i∈A6} ln((γ3(k) − 1)/θ + γ3(k)/ti) − (α + 1) Σ_{i∈A6} (γ3(k) + ṫi)
 + n ln(θ) + 2(n1 + n4 + n2 + n5) ln(θ)   (17)

where θ is assumed known, γ1(k), γ2(k), γ3(k) are expressed in terms of α1(k), α2(k), α3(k) at the kth iteration, and

ẋi = ln(1 + θ/xi),  ẏi = ln(1 + θ/yi)  and  ṫi = ln(1 + θ/ti).
M-step:
The following equations are obtained by equating the partial derivatives of Q(α_ | α_(k)) with respect to α1, α2 and α3 to zero:

(n2 + n5)/α1 + (n1 + n4)/(α1 + α3) − Σ_{i∈D1} ẋi − Σ_{i∈A5} (γ1(k) + ṫi) − Σ_{i∈A6} (γ3(k) + ṫi) = 0   (18)

(n1 + n4)/α2 + (n2 + n5)/(α2 + α3) − Σ_{i∈A3} ẋi − Σ_{i∈D2} ẏi − Σ_{i∈A4} (γ2(k) + ṫi) − Σ_{i∈A6} (γ3(k) + ṫi) = 0   (19)

(n1 + n4)/(α1 + α3) + (n2 + n5)/(α2 + α3) + (n3 + n6)/α3 − Σ_{i∈A3} ẋi − Σ_{i∈E1} ẋi − Σ_{i∈E2} ẏi − Σ_{i∈A6} (γ3(k) + ṫi) = 0   (20)

where D1 = E1 ∪ A2 ∪ A3 and D2 = E2 ∪ A6.
The above likelihood equations are solved for the maximum likelihood estimates (α̂1(k), α̂2(k), α̂3(k)) using the Newton-Raphson procedure. The elements of the symmetric observed information matrix at the current guess α_ = α_(k), required in the Newton-Raphson iterations, are:

−∂²Q(α_|α_(k))/∂α1² = (n2 + n5)/α1² + (n1 + n4)/(α1 + α3)²
−∂²Q(α_|α_(k))/∂α1∂α2 = 0
−∂²Q(α_|α_(k))/∂α1∂α3 = (n1 + n4)/(α1 + α3)²
−∂²Q(α_|α_(k))/∂α2² = (n1 + n4)/α2² + (n2 + n5)/(α2 + α3)²
−∂²Q(α_|α_(k))/∂α2∂α3 = (n2 + n5)/(α2 + α3)²
−∂²Q(α_|α_(k))/∂α3² = (n1 + n4)/(α1 + α3)² + (n2 + n5)/(α2 + α3)² + (n3 + n6)/α3²   (21)
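A pure-Python sketch of assembling the information matrix (21) and taking one Newton step of type (9). The counts and the score vector here are illustrative placeholders, not output of the paper's simulation:

```python
def neg_hessian(a1, a2, a3, n14, n25, n36):
    """Observed information matrix (21); n14 = n1+n4, n25 = n2+n5, n36 = n3+n6."""
    h11 = n25 / a1**2 + n14 / (a1 + a3)**2
    h13 = n14 / (a1 + a3)**2
    h22 = n14 / a2**2 + n25 / (a2 + a3)**2
    h23 = n25 / (a2 + a3)**2
    h33 = n14 / (a1 + a3)**2 + n25 / (a2 + a3)**2 + n36 / a3**2
    return [[h11, 0.0, h13],
            [0.0, h22, h23],
            [h13, h23, h33]]

def solve3(A, b):
    """Gaussian elimination with partial pivoting for the 3x3 Newton step."""
    A = [row[:] for row in A]
    b = b[:]
    n = 3
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

# One Newton-Raphson step: alpha_new = alpha_old + I^{-1} * score.
H = neg_hessian(1.6, 1.4, 1.8, 40, 35, 25)
score = [0.5, -0.3, 0.2]            # illustrative score vector
step = solve3(H, score)
# Residual check: H * step should reproduce the score vector.
assert all(abs(sum(H[i][j] * step[j] for j in range(3)) - score[i]) < 1e-9
           for i in range(3))
```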
As mentioned above, the iterative process is carried out until the following condition is met by two subsequent solutions:

L(α_(k+1), E[tix, tiy, tixy | xi, yi, ti]) − L(α_(k), E[tix, tiy, tixy | xi, yi, ti]) < 0.0001   (22)
Parameter estimation without the EM algorithm
The likelihood of a sample of size n, after discarding factors that do not contain any of the parameters of interest, is given by

L(α_, t) = ∏_{j=1}^{6} ∏_{i=1}^{nj} f_{jX,Y}(xi, yi)   (23)
where
f_{1X,Y}(x, y) = (α1 + α3) α2 (θ²/(x² y²)) (1 + θ/x)^(−(α1+α3)−1) (1 + θ/y)^(−α2−1),  0 < x < y < t
f_{2X,Y}(x, y) = (α2 + α3) α1 (θ²/(x² y²)) (1 + θ/y)^(−(α2+α3)−1) (1 + θ/x)^(−α1−1),  0 < y < x < t
f_{3X,Y}(x, y) = (θα3/x²) (1 + θ/x)^(−α−1),  0 < x = y < t
f_{4X,Y}(x, y) = (α1 + α3) α2 (θ²/(x² t²)) (1 + θ/x)^(−(α1+α3)−1) (1 + θ/t)^(−α2−1),  0 < x < t < y
f_{5X,Y}(x, y) = (α2 + α3) α1 (θ²/(t² y²)) (1 + θ/y)^(−(α2+α3)−1) (1 + θ/t)^(−α1−1),  0 < y < t < x
f_{6X,Y}(x, y) = (θα3/t²) (1 + θ/t)^(−α−1),  0 < t < min(x, y)   (24)
The log-likelihood function ln Lc(α_, t) can be written as follows:

ln L = (n1 + n4)(ln(α1 + α3) + ln α2) + (n2 + n5)(ln(α2 + α3) + ln α1) + (n3 + n6) ln α3
 − 2 Σ_{i∈E5} ln(xi) − 2 Σ_{i∈E6} ln(yi) − (α1 + α3 + 1) Σ_{i∈E1} ẋi − (α + 1) Σ_{i∈A3} ẋi
 − (α1 + 1) Σ_{i∈A2} ẋi − (α2 + α3 + 1) Σ_{i∈E2} ẏi − (α2 + 1) Σ_{i∈A6} ẏi
 − 2 Σ_{i∈A4} ln(ti) − 2 Σ_{i∈A5} ln(ti) − (α1 + 1) Σ_{i∈A5} ṫi − (α2 + 1) Σ_{i∈A4} ṫi
 − 2 Σ_{i∈A6} ln(ti) − (α + 1) Σ_{i∈A6} ṫi + n ln(θ) + 2(n1 + n4 + n2 + n5) ln(θ)   (25)
The following likelihood equations are obtained by equating the partial derivatives of ln L with respect to α1, α2 and α3 to zero:

(n2 + n5)/α1 + (n1 + n4)/(α1 + α3) − Σ_{i∈D1} ẋi − Σ_{i∈A5∪A6} ṫi = 0   (26)

(n1 + n4)/α2 + (n2 + n5)/(α2 + α3) − Σ_{i∈A3} ẋi − Σ_{i∈D2} ẏi − Σ_{i∈A4∪A6} ṫi = 0   (27)

(n1 + n4)/(α1 + α3) + (n2 + n5)/(α2 + α3) + (n3 + n6)/α3 − Σ_{i∈A3} ẋi − Σ_{i∈E1} ẋi − Σ_{i∈E2} ẏi − Σ_{i∈A6} ṫi = 0   (28)
The sample data are generated by the following algorithm:
Step 1: Generate u1, u2 and u3 from the Inverse Lomax distribution with parameters α1, α2 and α3, respectively (and common θ).
Step 2: Take X = max(u1, u3) and Y = max(u2, u3); then (X, Y) follows a bivariate Inverse Lomax distribution of Marshall-Olkin type.
Step 3: Generate the censoring times ti from the Inverse Lomax distribution with parameters β and θ.
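Steps 1 and 2 can be sketched in Python. Inverting the CDF (1) gives the quantile x = θ/(u^(−1/α) − 1) for u ~ Uniform(0, 1); the Marshall-Olkin maxima then yield a bivariate draw, which can be checked empirically against the joint CDF (3). The parameter values below are illustrative:

```python
import random

def il_sample(alpha, theta, rng):
    """Inverse-CDF draw: F(x) = (1 + theta/x)^(-alpha)  =>  x = theta/(u^(-1/alpha) - 1)."""
    u = rng.random()
    while u == 0.0:            # guard against the degenerate u = 0 draw
        u = rng.random()
    return theta / (u ** (-1.0 / alpha) - 1.0)

def mobil_sample(a1, a2, a3, theta, rng):
    """Marshall-Olkin construction: X = max(U1, U3), Y = max(U2, U3)."""
    u1 = il_sample(a1, theta, rng)
    u2 = il_sample(a2, theta, rng)
    u3 = il_sample(a3, theta, rng)
    return max(u1, u3), max(u2, u3)

rng = random.Random(2019)
sample = [mobil_sample(1.6, 1.4, 1.8, 1.0, rng) for _ in range(10000)]

# Empirical check of the joint CDF (3) at one point.
a1, a2, a3, theta, x0, y0 = 1.6, 1.4, 1.8, 1.0, 1.0, 2.0
target = ((1 + theta / x0) ** (-a1) * (1 + theta / y0) ** (-a2)
          * (1 + theta / min(x0, y0)) ** (-a3))
emp = sum(1 for (x, y) in sample if x <= x0 and y <= y0) / len(sample)
assert abs(emp - target) < 0.02
```

Censoring times (Step 3) can be drawn with the same `il_sample` helper using the censoring parameters.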
For two cases of the αi values, we generated 1000 sets of samples, each with sample sizes n = 20, 35, 50 and 100. The corresponding maximum likelihood estimates are displayed in Table 1 together with the empirical standard deviations. The estimates denoted MLEem and SEem are obtained using the EM algorithm, while those denoted MLE and SE are obtained without the EM algorithm.
| Parameters | α1 | α2 | α3 | α4 | α5 | α6 |
|---|---|---|---|---|---|---|
| True value | 1.6 | 1.4 | 1.8 | 0.6 | 0.5 | 0.4 |
| n=20 | | | | | | |
| MLEem | 1.6113 | 1.3882 | 1.8154 | 0.6131 | 0.4812 | 0.3874 |
| SEem | 0.0771 | 0.0682 | 0.0854 | 0.0865 | 0.0967 | 0.0793 |
| MLE | 1.6173 | 1.3845 | 1.8193 | 0.6216 | 0.4735 | 0.3818 |
| SE | 0.0943 | 0.0817 | 0.0975 | 0.0972 | 0.1076 | 0.0926 |
| n=35 | | | | | | |
| MLEem | 1.6092 | 1.3914 | 1.8122 | 0.6092 | 0.4885 | 0.3917 |
| SEem | 0.0723 | 0.0637 | 0.0761 | 0.0782 | 0.0913 | 0.0725 |
| MLE | 1.6117 | 1.3902 | 1.8164 | 0.6173 | 0.4834 | 0.3861 |
| SE | 0.0788 | 0.0636 | 0.0826 | 0.0841 | 0.1015 | 0.0844 |
| n=50 | | | | | | |
| MLEem | 1.6064 | 1.3931 | 1.8093 | 0.6062 | 0.4921 | 0.3943 |
| SEem | 0.0681 | 0.0586 | 0.0717 | 0.0726 | 0.0813 | 0.0637 |
| MLE | 1.6113 | 1.3915 | 1.8124 | 0.6125 | 0.4876 | 0.3903 |
| SE | 0.0714 | 0.0622 | 0.0773 | 0.0796 | 0.0972 | 0.0779 |
| n=100 | | | | | | |
| MLEem | 1.6033 | 1.3947 | 1.8068 | 0.6046 | 0.4969 | 0.3973 |
| SEem | 0.0655 | 0.0561 | 0.0693 | 0.0597 | 0.0772 | 0.0589 |
| MLE | 1.6055 | 1.3968 | 1.8059 | 0.6071 | 0.4934 | 0.3942 |
| SE | 0.0658 | 0.0572 | 0.0714 | 0.0615 | 0.0786 | 0.0603 |

Table 1 Comparison of MLEs obtained with and without the EM algorithm
Tables 1 and 2 show that the estimates obtained using the EM algorithm have smaller empirical standard errors than those obtained without it.
| Parameters | α1 | α2 | α3 | α4 | α5 | α6 |
|---|---|---|---|---|---|---|
| True value | 2.5 | 2 | 2.2 | 1.1 | 0.9 | 1.4 |
| n=20 | | | | | | |
| MLEem | 2.613 | 2.221 | 2.414 | 1.289 | 1.089 | 1.603 |
| SEem | 0.141 | 0.135 | 0.157 | 0.167 | 0.149 | 0.163 |
| MLE | 2.777 | 2.335 | 2.575 | 1.432 | 1.176 | 1.699 |
| SE | 0.173 | 0.169 | 0.189 | 0.192 | 0.176 | 0.194 |
| n=35 | | | | | | |
| MLEem | 2.587 | 2.198 | 2.385 | 1.176 | 1.054 | 1.547 |
| SEem | 0.109 | 0.119 | 0.132 | 0.123 | 0.135 | 0.152 |
| MLE | 2.711 | 2.282 | 2.513 | 1.356 | 1.103 | 1.621 |
| SE | 0.148 | 0.135 | 0.159 | 0.184 | 0.161 | 0.183 |
| n=50 | | | | | | |
| MLEem | 2.534 | 2.145 | 2.313 | 1.142 | 0.987 | 1.498 |
| SEem | 0.089 | 0.097 | 0.114 | 0.114 | 0.119 | 0.124 |
| MLE | 2.665 | 2.214 | 2.478 | 1.258 | 1.067 | 1.557 |
| SE | 0.123 | 0.123 | 0.137 | 0.162 | 0.149 | 0.168 |
| n=100 | | | | | | |
| MLEem | 2.516 | 2.061 | 2.257 | 1.122 | 0.931 | 1.433 |
| SEem | 0.078 | 0.068 | 0.096 | 0.079 | 0.088 | 0.104 |
| MLE | 2.608 | 2.111 | 2.319 | 1.207 | 0.987 | 1.519 |
| SE | 0.117 | 0.109 | 0.121 | 0.123 | 0.117 | 0.147 |

Table 2 Comparison of MLEs obtained with and without the EM algorithm
Moreover, the MLEem estimates are close to the true parameter values, and the standard errors SEem decrease as the sample size increases. For both methods, the reported values are the means of the 1000 maximum likelihood estimates and of the 1000 standard errors obtained from the 1000 samples of each size n = 20, 35, 50 and 100.
Also note that for large sample sizes the estimates of the two methods (with and without the EM algorithm) approach each other, as do the standard errors; thus, for large samples, both methods give comparable results and whichever is more convenient may be used.
Acknowledgments: None.
Conflicts of interest: None.
©2019 Abdelhady. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and building upon the work non-commercially.