eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 4 Issue 7

Inference for zero inflated truncated power series family of distributions

MK Patil

Padmabhushan Vasantraodada Patil Mahavidyalaya, India

Correspondence: MK Patil, Padmabhushan Vasantraodada Patil Mahavidyalaya, Kavathe Mahankal, Dist. Sangli, India

Received: August 14, 2016 | Published: December 6, 2016

Citation: Patil MK. Inference for zero inflated truncated power series family of distributions. Biom Biostat Int J. 2016;4(7):119-122. DOI: 10.15406/bbij.2016.04.00115


Abstract

Zero-inflated data are data sets that contain an excessive number of zeros. The term zero-inflation emphasizes that the probability mass at the point zero exceeds the one allowed under a standard parametric family of discrete distributions. Gupta et al.,1 Murat & Szynal,2 and Patil & Shirke3 have contributed to estimation and testing of the parameters involved in Zero Inflated Power Series Distributions. If the data set under study does not contain observations beyond some known point in the support, the Zero Inflated Power Series Distribution (ZIPSD) has to be modified accordingly in order to obtain better inferential properties. The Zero Inflated Truncated Power Series Distribution (ZITPSD) is one such option. In the present work we address the problem of estimation for ZITPSD, with more emphasis on statistical tests. We provide three asymptotic tests for the parameter of ZITPSD, using an unconditional (standard) likelihood approach, a conditional likelihood approach and the sample mean, respectively. The performance of the first two tests has been studied for the Zero Inflated Truncated Poisson Distribution (ZITPD). Asymptotic confidence intervals for the parameter are also provided. The model has been applied to a real-life data set.

Keywords: zero inflation, zero inflated power series distribution, zero inflated truncated power series distribution, zero inflated truncated poisson distribution

Introduction

In certain applications involving discrete data, the observed frequency of zero is significantly higher than the one predicted by the assumed model. The problem of a high proportion of zeros has long been of interest in data analysis and modeling. Such situations arise in the medical field, engineering applications, manufacturing, economics, public health, road safety, epidemiology and other areas. In a highly automated production process, the number of defects is assumed to follow a Poisson distribution. However, many samples contain no defectives at all, which leads to an excess number of zeros. Models that accommodate significantly more zeros than the standard model are known as zero-inflated models.

In the literature, a number of researchers have worked on the family of zero-inflated power series distributions. Gupta et al.1 studied the structural properties and point estimation of the parameters of Zero-Inflated Modified Power Series distributions, in particular for the zero-inflated Poisson distribution. Murat & Szynal2 studied the class of inflated modified power series distributions where inflation occurs at any of the support points; they obtained moments, factorial moments, central moments, the maximum likelihood estimators and the variance-covariance matrix of the estimators. Murat & Szynal2 thus extended the results of Gupta et al.1 to discrete distributions inflated at any point $s$.

The Zero Inflated Truncated Power Series Distribution contains two parameters. The first parameter ($\pi$) indicates inflation of zero and the other parameter ($\theta$) is that of the power series distribution. A literature survey reveals that many researchers have concentrated on the inflation parameter of the model. In the present study, we focus on the inferential aspects of the basic parameter $\theta$ of the model. In this article, we provide maximum likelihood estimators, Fisher information and asymptotic tests for the parameter of the Zero Inflated Truncated Power Series Distribution. Additionally, an asymptotic confidence interval for the parameter is provided.

In section 2.1 we report estimation of both parameters of ZITPSD and the corresponding asymptotic variances using the full likelihood approach, the conditional likelihood approach and the method of moments. In section 2.2, we provide three asymptotic tests for the parameter of ZITPSD. Section 2.3 is devoted to asymptotic confidence intervals for the parameter of ZITPSD. In section 3.1 we report estimation of the parameters of the Zero Inflated Truncated Poisson Distribution (ZITPD) and inference related to the model. Section 3.2 is devoted to three asymptotic tests for the parameter of ZITPD, and in section 3.3 we provide asymptotic confidence intervals for the parameter of ZITPD. A simulation study is carried out in section 4 to study the performance of the tests. An illustrative example is provided in section 5.

Zero-inflated truncated power series distribution (ZITPSD)

Before we define the truncated ZIPSD, we first consider the Truncated Power Series Distribution (TPSD) truncated at the support point $t$ onwards, where $t$ is known. The probability mass function of the TPSD is given by

$$P(X=x)=\frac{b_x\theta^x}{f(\theta)\,\bigl(1-P(X>t)\bigr)},\quad x=0,1,2,\ldots,t$$

$$=\frac{b_x\theta^x}{\sum_{y=0}^{t}b_y\theta^y}=\frac{b_x\theta^x}{G(\theta)},\quad\text{where } G(\theta)=\sum_{y=0}^{t}b_y\theta^y .$$

It is clear that the truncated distribution is also Power series distribution. Based on the same, we define ZITPSD as follows:

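Let the probability mass function of a random variable $X$ be given by

$$P(X=x)=\begin{cases}1-\pi+\dfrac{\pi b_0}{G(\theta)}, & x=0\\[2ex]\dfrac{\pi b_x\theta^x}{G(\theta)}, & x=1,2,3,\ldots,t\end{cases}\tag{2.1}$$

where $G(\theta)=\sum_{y=0}^{t}b_y\theta^y$.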
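As an illustration of eq. (2.1), the probability mass function can be sketched in a few lines of Python. This is a minimal sketch, not code from the article: the function name `zitpsd_pmf`, the callable `b` that supplies the coefficients $b_y$, and the numerical values of $\pi$, $\theta$ and $t$ are all assumptions chosen for the example.

```python
from math import factorial

def zitpsd_pmf(x, pi, theta, b, t):
    """P(X = x) under eq. (2.1); b(y) returns the series coefficient b_y."""
    if x < 0 or x > t:
        return 0.0
    # G(theta) = sum_{y=0}^{t} b_y * theta^y, the truncated series function
    G = sum(b(y) * theta ** y for y in range(t + 1))
    if x == 0:
        return 1.0 - pi + pi * b(0) / G
    return pi * b(x) * theta ** x / G

# Illustration with Poisson-type coefficients b_y = 1/y! (pi, theta, t chosen arbitrarily)
poisson_b = lambda y: 1.0 / factorial(y)
probs = [zitpsd_pmf(x, 0.4, 2.0, poisson_b, 7) for x in range(8)]
total = sum(probs)  # the probabilities over 0..t should add up to one
```

Passing the coefficients as a callable keeps the sketch usable for any member of the power series family; only `b` changes from one distribution to another.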

Estimation of π and θ

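Estimation of the parameters using full likelihood function: Suppose a random sample $X_1,X_2,\ldots,X_n$ of size $n$ from ZITPSD is available. Then the likelihood function is given by

$$L(\theta,\pi;\underline{x})=\prod_{i=1}^{n}\left(1-\pi+\frac{\pi b_0}{G(\theta)}\right)^{1-a_i}\left(\frac{\pi b_{x_i}\theta^{x_i}}{G(\theta)}\right)^{a_i},\quad\theta,\pi>0,\tag{2.2}$$

where $a_i=0$ if $x_i=0$ and $a_i=1$ if $x_i=1,2,3,\ldots,t$. Then, writing $n_0$ for the number of zero observations,

$$\log L(\theta,\pi;\underline{x})=n_0\log\left(1-\pi+\frac{\pi b_0}{G(\theta)}\right)+\sum_{i=1}^{n}a_i\log\pi+\sum_{i=1}^{n}a_i\log b_{x_i}+\sum_{i=1}^{n}a_ix_i\log\theta-\sum_{i=1}^{n}a_i\log G(\theta).\tag{2.3}$$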

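Maximum likelihood estimators of $\theta$ and $\pi$ are obtained by solving the following two equations:

$$\hat\pi=\frac{(n-n_0)\,G(\hat\theta)}{n\,\bigl(G(\hat\theta)-b_0\bigr)}\tag{2.4}$$

$$\frac{\sum_{i=1}^{n}a_ix_i}{\hat\theta}=\frac{n_0\hat\pi b_0G'(\hat\theta)}{G(\hat\theta)^2\left(1-\hat\pi+\frac{\hat\pi b_0}{G(\hat\theta)}\right)}+\frac{\sum_{i=1}^{n}a_i\,G'(\hat\theta)}{G(\hat\theta)}.\tag{2.5}$$

Substituting $\hat\pi=\dfrac{(n-n_0)G(\hat\theta)}{n(G(\hat\theta)-b_0)}$ in eq. (2.5) we get

$$\bar{x}=\frac{\hat\theta\,G'(\hat\theta)}{G(\hat\theta)-b_0},\tag{2.6}$$

which is a non-linear equation in $\hat\theta$; here $\bar x=\sum_{i=1}^{n}a_ix_i/(n-n_0)$ is the mean of the positive observations. Using the Newton-Raphson method we first find $\hat\theta$; substituting this value of $\hat\theta$ in eq. (2.4) we find $\hat\pi$. The Fisher information matrix of $\underline\delta=(\pi,\theta)$ is given by

$$I(\underline\delta)=\begin{pmatrix}E\left(-\dfrac{\partial^2\log L}{\partial\pi^2}\right)&E\left(-\dfrac{\partial^2\log L}{\partial\pi\,\partial\theta}\right)\\[2ex] E\left(-\dfrac{\partial^2\log L}{\partial\pi\,\partial\theta}\right)&E\left(-\dfrac{\partial^2\log L}{\partial\theta^2}\right)\end{pmatrix}=\begin{pmatrix}I_{11}&I_{12}\\I_{21}&I_{22}\end{pmatrix},$$

where

$$I_{11}=\frac{n\,\bigl(G(\theta)-b_0\bigr)}{\pi\,\bigl(G(\theta)-\pi G(\theta)+\pi b_0\bigr)},\tag{2.7}$$

$$I_{12}=\frac{n\,b_0\,G'(\theta)}{G(\theta)\,\bigl(G(\theta)-\pi G(\theta)+\pi b_0\bigr)}\tag{2.8}$$

and

$$I_{22}=n\pi b_0\left(\frac{G(\theta)^2G''(\theta)-2G'(\theta)^2G(\theta)}{G(\theta)^4}+\frac{\pi b_0G'(\theta)^2}{G(\theta)^4\left(1-\pi+\frac{\pi b_0}{G(\theta)}\right)}\right)+n\pi\left(\frac{G'(\theta)}{\theta G(\theta)}\right)+\frac{n\pi\left(1-\frac{b_0}{G(\theta)}\right)\bigl(G(\theta)G''(\theta)-G'(\theta)^2\bigr)}{G(\theta)^2}.\tag{2.9}$$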

Assuming that the conditions required for asymptotic normality of maximum likelihood estimators are satisfied, we have the following theorem:

Theorem 2.1: Let $X_1,X_2,\ldots,X_n$ be a random sample from ZITPSD with parameters $\pi$ and $\theta$. Then the maximum likelihood estimators obtained by solving eq. (2.4) and eq. (2.6) have an asymptotic bivariate normal distribution with mean vector $(\pi,\theta)'$ and dispersion matrix $I^{-1}(\underline\delta)$ for $n$ sufficiently large.

That is, as $n\to\infty$, $\bigl(\sqrt n(\hat\pi-\pi),\sqrt n(\hat\theta-\theta)\bigr)'\to N_2\bigl(0,I^{-1}(\underline\delta)\bigr)$.
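The substitution that leads from eq. (2.5) to eq. (2.6) is not written out in the text. A sketch of the missing step, assuming that $\bar x$ in eq. (2.6) denotes the mean of the $n-n_0$ positive observations and using $\sum_{i=1}^{n}a_i=n-n_0$, is:

```latex
\begin{aligned}
1-\hat\pi+\frac{\hat\pi b_0}{G(\hat\theta)}
  &= 1-\hat\pi\,\frac{G(\hat\theta)-b_0}{G(\hat\theta)}
   = 1-\frac{n-n_0}{n}=\frac{n_0}{n},\\[1ex]
\frac{\sum_{i=1}^{n}a_ix_i}{\hat\theta}
  &= \frac{n\,\hat\pi\, b_0\,G'(\hat\theta)}{G(\hat\theta)^2}
     +(n-n_0)\frac{G'(\hat\theta)}{G(\hat\theta)}\\
  &= (n-n_0)\frac{G'(\hat\theta)}{G(\hat\theta)}
     \left(\frac{b_0}{G(\hat\theta)-b_0}+1\right)
   = (n-n_0)\frac{G'(\hat\theta)}{G(\hat\theta)-b_0}.
\end{aligned}
```

Dividing both sides of the last line by $n-n_0$ and multiplying by $\hat\theta$ gives eq. (2.6).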

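In the following we present the conditional likelihood approach and obtain the MLE of $\theta$.

Conditional likelihood function approach: We observe that the conditional distribution of $X_i$ given $A_i=a_i$ is independent of the inflation parameter $\pi$, since

$$P(X_i=x_i\mid A_i=a_i)=\left(\frac{b_{x_i}\theta^{x_i}}{G(\theta)-b_0}\right)^{a_i}.\tag{2.10}$$

The conditional log likelihood function is given by

$$\log L^{*}(\theta;\underline x)=\sum_{i=1}^{n-n_0}\log b_{x_i}+\sum_{i=1}^{n-n_0}x_i\log\theta-\sum_{i=1}^{n-n_0}\log\bigl(G(\theta)-b_0\bigr).\tag{2.11}$$

The mle $\tilde\theta$ of $\theta$ is the solution of the equation

$$\bar x=\frac{\tilde\theta\,G'(\tilde\theta)}{G(\tilde\theta)-b_0},\tag{2.12}$$

where $\bar x=\dfrac{1}{n-n_0}\sum_{i=1}^{n-n_0}x_i$ is the mean of the positive observations only. We note that the mle of $\theta$ based on the full likelihood (eq. 2.6) and the one based on the conditional likelihood (eq. 2.12) are the same, and

$$AV_{\tilde\theta}(\theta)=\left((n-n_0)\left\{\frac{G'(\theta)}{\theta\,(G(\theta)-b_0)}+\frac{G''(\theta)\,(G(\theta)-b_0)-G'(\theta)^2}{(G(\theta)-b_0)^2}\right\}\right)^{-1}.\tag{2.13}$$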

Assuming that the Cramer-Huzurbazar conditions required for asymptotic normality of MLEs are satisfied, we have the following theorem:

Theorem 2.2: Let $X_1,X_2,\ldots,X_n$ be a random sample from ZITPSD with parameters $\pi$ and $\theta$. Then the mle of $\theta$ is the solution of eq. (2.12) and has an asymptotic normal distribution with mean $\theta$ and variance $AV_{\tilde\theta}(\theta)$ for $n$ sufficiently large. That is, as $n\to\infty$, $\sqrt n(\tilde\theta-\theta)\to N_1\bigl(0,AV_{\tilde\theta}(\theta)\bigr)$.

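In the following we present the moment estimators for ZITPSD.

Moment estimator of ZITPSD: We have

$$E(X)=\frac{\pi\theta G'(\theta)}{G(\theta)}=\pi\mu(\theta),$$

$$E(X^2)=\frac{\theta\pi}{G(\theta)}\bigl(\theta G''(\theta)+G'(\theta)\bigr)$$

and

$$Var(X)=\frac{\theta\pi}{G(\theta)}\left(\theta G''(\theta)+G'(\theta)-\frac{\pi\theta\,G'(\theta)^2}{G(\theta)}\right)=\sigma^2(\pi,\theta),\ \text{say}.$$

Let

$$\bar X=\pi\mu(\theta),\tag{2.14}$$

$$\frac{\sum_{i=1}^{n}x_i^2}{n}=\frac{\pi\theta}{G(\theta)}\bigl(\theta G''(\theta)+G'(\theta)\bigr).\tag{2.15}$$

Solving eq. (2.14) and eq. (2.15) we get the moment estimators of $\pi$ and $\theta$.

Theorem 2.3: Let $X_1,X_2,\ldots,X_n$ be a random sample from ZITPSD with parameters $\pi$ and $\theta$. Then the moment estimators of $\pi$ and $\theta$ are obtained by solving eq. (2.14) and eq. (2.15). The moment estimator of $\theta$ has an asymptotic normal distribution with mean $\pi\mu(\theta)$ and variance $\sigma^2(\pi,\theta)/n$, for $n$ sufficiently large. That is, as $n\to\infty$, $\sqrt n(\bar\theta-\theta)\to N_1\left(0,\dfrac{\sigma^2(\pi,\theta)}{n}\right)$.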

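Tests for the parameter θ of the ZITPS distribution

Test based on $\hat\theta$: Suppose we wish to test $H_0:\theta=\theta_0$ vs $H_1:\theta\neq\theta_0$. Let us assume that $\pi$ is known. Then, under $H_0$, from Theorem 2.1 we have

$$(\hat\theta-\theta_0)\sim AN\bigl(0,AV_{\hat\theta}(\pi,\theta_0)\bigr).\tag{2.16}$$

Define the test statistic $Z_1=\dfrac{\hat\theta-\theta_0}{\sqrt{AV_{\hat\theta}(\pi,\theta_0)}}$. Based on $Z_1$ we define the test $\psi_1$, which rejects $H_0$ at level of significance $\alpha$ if $|Z_1|>z_{1-\alpha/2}$, where $z_{1-\alpha/2}$ is the upper $100(\alpha/2)$th percentile of the standard normal variate (SNV).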
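The decision rule above can be sketched as a small Python helper. The function name `z_test` and the numbers passed to it are hypothetical and serve only as an illustration; the asymptotic variance is taken as an input, so the same helper applies to any estimator whose variance under $H_0$ is available.

```python
from math import sqrt
from statistics import NormalDist

def z_test(estimate, theta0, asym_var, alpha=0.05):
    """Return (Z, reject) for H0: theta = theta0 against a two-sided alternative."""
    z = (estimate - theta0) / sqrt(asym_var)
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{1 - alpha/2}
    return z, abs(z) > critical

# Hypothetical numbers, for illustration only
z_stat, reject = z_test(estimate=2.35, theta0=2.0, asym_var=0.02)
```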

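Let $\Phi(\cdot)$ be the cumulative distribution function of the standard normal variate (SNV). Then the power of the test $\psi_1$ is given by

$$\beta_{\psi_1}(\pi,\theta)=1-\Phi(B)+\Phi(A),$$

where

$$A=\frac{\theta_0-\theta-z_{1-\alpha/2}\sqrt{AV_{\hat\theta}(\pi,\theta_0)}}{\sqrt{AV_{\hat\theta}(\pi,\theta)}}\quad\text{and}\quad B=\frac{\theta_0-\theta+z_{1-\alpha/2}\sqrt{AV_{\hat\theta}(\pi,\theta_0)}}{\sqrt{AV_{\hat\theta}(\pi,\theta)}}.$$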
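The power expression $1-\Phi(B)+\Phi(A)$ can be evaluated numerically once the two asymptotic variances are known. The sketch below takes them as plain numbers; the function name `power_known_pi` and the argument names `av0` (variance at $\theta_0$) and `av` (variance at the true $\theta$) are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist

def power_known_pi(theta, theta0, av0, av, alpha=0.05):
    """Power 1 - Phi(B) + Phi(A); av0 is the AV at theta0, av the AV at the true theta."""
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - alpha / 2.0)
    a = (theta0 - theta - z * sqrt(av0)) / sqrt(av)
    b = (theta0 - theta + z * sqrt(av0)) / sqrt(av)
    return 1.0 - nd.cdf(b) + nd.cdf(a)

# At theta = theta0 (with av = av0) the power reduces to the nominal size alpha
size = power_known_pi(2.0, 2.0, 0.02, 0.02)
power_alt = power_known_pi(2.5, 2.0, 0.02, 0.025)  # hypothetical variances
```

A quick sanity check on any implementation of this formula is that the value at $\theta=\theta_0$ equals $\alpha$.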

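However, in practice $\pi$ is unknown. Hence we modify the test statistic by replacing $\pi$ by its maximum likelihood estimator $\hat\pi_0$ under $H_0$. By doing so, we define the test statistic $Z_1=\dfrac{\hat\theta-\theta_0}{\sqrt{AV_{\hat\theta}(\hat\pi_0,\theta_0)}}$, where $\hat\pi_0=\dfrac{(n-n_0)\,G(\theta_0)}{n\,(G(\theta_0)-b_0)}$.

Based on this $Z_1$, we propose a test $\psi_1$ which rejects $H_0$ at level of significance $\alpha$ if $|Z_1|>z_{1-\alpha/2}$.

The power of this test is given by

$$\beta_{\psi_1}(\pi,\theta)=\sum_{k=0}^{n}\bigl(1-\Phi(\hat B_k)+\Phi(\hat A_k)\bigr)P(n_0=k),\tag{2.17}$$

where

$$\hat B_k=\frac{\theta_0-\theta+z_{1-\alpha/2}\sqrt{AV_{\hat\theta}(\hat\pi_0,\theta_0)}}{\sqrt{AV_{\hat\theta}(\pi,\theta)}},\qquad \hat A_k=\frac{\theta_0-\theta-z_{1-\alpha/2}\sqrt{AV_{\hat\theta}(\hat\pi_0,\theta_0)}}{\sqrt{AV_{\hat\theta}(\pi,\theta)}},$$

$$P(n_0=k)=\binom{n}{k}P_0^{\,k}(1-P_0)^{n-k},\quad\text{with } P_0=1-\pi+\frac{\pi b_0}{G(\theta)}.\tag{2.18}$$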

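Below we develop a test based on $\tilde\theta$, the estimator obtained from the conditional likelihood approach.

Test based on $\tilde\theta$: Theorem 2.2 gives

$$(\tilde\theta-\theta_0)\sim AN\bigl(0,AV_{\tilde\theta}(\theta_0)\bigr).\tag{2.19}$$

Hence, we define the test statistic $Z_2=\dfrac{\tilde\theta-\theta_0}{\sqrt{AV_{\tilde\theta}(\theta_0)}}$. The test $\psi_2$ based on $Z_2$ rejects $H_0$ at level of significance $\alpha$ if $|Z_2|>z_{1-\alpha/2}$.

The power of the test $\psi_2$ is given by

$$\beta_{\psi_2}(\pi,\theta)=1-\Phi(B)+\Phi(A),\tag{2.20}$$

where

$$A=\frac{\theta_0-\theta-z_{1-\alpha/2}\sqrt{AV_{\tilde\theta}(\theta_0)}}{\sqrt{AV_{\tilde\theta}(\theta)}},\qquad B=\frac{\theta_0-\theta+z_{1-\alpha/2}\sqrt{AV_{\tilde\theta}(\theta_0)}}{\sqrt{AV_{\tilde\theta}(\theta)}}.$$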

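Test based on the moment estimator $\bar\theta$ of $\theta$: The problem of testing $H_0:\theta=\theta_0$ vs $H_1:\theta\neq\theta_0$ is equivalent to testing $H_0:\mu(\theta)=\mu(\theta_0)$ vs $H_1:\mu(\theta)\neq\mu(\theta_0)$, where $\mu(\theta)=\dfrac{\theta G'(\theta)}{G(\theta)}$. From Theorem 2.3, the sample mean is consistent and asymptotically normal for the population mean.

That is, $\bar X\sim AN\left(\pi\mu(\theta),\dfrac{\sigma^2(\pi,\theta)}{n}\right)$.

Therefore, under $H_0$, we have

$$\sqrt n\left(\frac{\bar X}{\pi}-\mu(\theta_0)\right)\sim AN\left(0,\frac{\sigma^2(\pi,\theta_0)}{\pi^2}\right).$$

Define the test statistic

$$Z_3=\frac{\sqrt n\left(\frac{\bar X}{\pi}-\mu(\theta_0)\right)}{\sqrt{\sigma^2(\pi,\theta_0)/\pi^2}}\sim N(0,1),\quad\text{when }\pi\text{ is known}.$$

The test $\psi_3$ rejects $H_0$ at level of significance $\alpha$ if $|Z_3|>z_{1-\alpha/2}$.

That is, reject $H_0$ if

$$\frac{\sqrt n\left|\frac{\bar X}{\pi}-\mu(\theta_0)\right|}{\sqrt{\sigma^2(\pi,\theta_0)/\pi^2}}>z_{1-\alpha/2}.$$

The power of the test $\psi_3$ is given by

$$\beta_{\psi_3}(\pi,\theta)=1-\Phi(B)+\Phi(A),\tag{2.21}$$

where

$$A=\frac{\pi\left(\mu(\theta_0)-z_{1-\alpha/2}\sqrt{\frac{\sigma^2(\pi,\theta_0)}{n\pi^2}}\right)-\pi\mu(\theta)}{\sqrt{\sigma^2(\pi,\theta)/n}}\quad\text{and}\quad B=\frac{\pi\left(\mu(\theta_0)+z_{1-\alpha/2}\sqrt{\frac{\sigma^2(\pi,\theta_0)}{n\pi^2}}\right)-\pi\mu(\theta)}{\sqrt{\sigma^2(\pi,\theta)/n}}.$$

If $\pi$ is unknown, we modify the test statistic by replacing $\pi$ by its estimate $\hat\pi_0$ under $H_0$. By doing so, we define the test statistic

$$Z_3=\frac{\sqrt n\left(\frac{\bar X}{\hat\pi_0}-\mu(\theta_0)\right)}{\sqrt{\sigma^2(\hat\pi_0,\theta_0)/\hat\pi_0^{\,2}}},\tag{2.22}$$

where $\hat\pi_0$ is given by $\hat\pi_0=\dfrac{\bar X}{\mu(\theta_0)}$.

Based on this $Z_3$ we propose a test $\psi_3$ which rejects $H_0$ at level of significance $\alpha$ if $|Z_3|>z_{1-\alpha/2}$.

The power of the test is given by

$$\beta_{\psi_3}(\pi,\theta)=\sum_{k=0}^{n}\bigl(1-\Phi(B_k)+\Phi(A_k)\bigr)P(n_0=k),\tag{2.23}$$

where

$$A_k=\frac{\hat\pi_0\left(\mu(\theta_0)-z_{1-\alpha/2}\sqrt{\frac{\sigma^2(\hat\pi_0,\theta_0)}{n\hat\pi_0^{\,2}}}\right)-\pi\mu(\theta)}{\sqrt{\sigma^2(\pi,\theta)/n}},\qquad B_k=\frac{\hat\pi_0\left(\mu(\theta_0)+z_{1-\alpha/2}\sqrt{\frac{\sigma^2(\hat\pi_0,\theta_0)}{n\hat\pi_0^{\,2}}}\right)-\pi\mu(\theta)}{\sqrt{\sigma^2(\pi,\theta)/n}}$$

and $P(n_0=k)=\dbinom{n}{k}P_0^{\,k}(1-P_0)^{n-k}$, with $P_0=1-\pi+\dfrac{\pi b_0}{G(\theta)}$.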

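Using the tests developed above, we can define two-sided asymptotic confidence intervals for $\theta$ by inverting the acceptance regions of the tests appropriately. Below we report the same.

Asymptotic confidence interval for the parameter θ

The asymptotic confidence interval for $\theta$ based on the test $\psi_1$ is given by

$$\left(\hat\theta-z_{1-\alpha/2}\sqrt{AV_{\hat\theta}(\hat\pi,\hat\theta)},\ \hat\theta+z_{1-\alpha/2}\sqrt{AV_{\hat\theta}(\hat\pi,\hat\theta)}\right),\tag{2.24}$$

where $AV_{\hat\theta}(\hat\pi,\hat\theta)$ is an estimate of the asymptotic variance of $\hat\theta$. The asymptotic confidence interval for $\theta$ based on the test $\psi_2$ is given by

$$\left(\tilde\theta-z_{1-\alpha/2}\sqrt{AV_{\tilde\theta}(\tilde\theta)},\ \tilde\theta+z_{1-\alpha/2}\sqrt{AV_{\tilde\theta}(\tilde\theta)}\right),\tag{2.25}$$

where $AV_{\tilde\theta}(\tilde\theta)$ is an estimate of the asymptotic variance of $\tilde\theta$ as given in eq. (2.13).

The asymptotic confidence interval for $\theta$ based on the test $\psi_3$ is given by

$$\left(\frac{\bar X}{\hat\pi}-z_{1-\alpha/2}\sqrt{AV_{\bar\theta}(\hat\pi,\bar\theta)},\ \frac{\bar X}{\hat\pi}+z_{1-\alpha/2}\sqrt{AV_{\bar\theta}(\hat\pi,\bar\theta)}\right),\tag{2.26}$$

where $AV_{\bar\theta}(\hat\pi,\bar\theta)=\dfrac{\sqrt n\left(\frac{\bar X}{\pi}-\mu(\theta)\right)}{\sqrt{\sigma^2(\pi,\theta)/\pi^2}}$.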
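All three intervals share the form "estimate plus or minus $z_{1-\alpha/2}$ times the square root of an estimated asymptotic variance". A minimal Python sketch of that common form follows; the function name `wald_interval` and the input numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(estimate, asym_var, alpha=0.05):
    """Two-sided 100(1 - alpha)% interval: estimate +/- z_{1-alpha/2} * sqrt(asym_var)."""
    half_width = NormalDist().inv_cdf(1.0 - alpha / 2.0) * sqrt(asym_var)
    return estimate - half_width, estimate + half_width

# Hypothetical estimate and estimated asymptotic variance
lower, upper = wald_interval(estimate=2.35, asym_var=0.02)
```

With these illustrative numbers the interval lies entirely above 2, which agrees with a two-sided test of $\theta_0=2$ rejecting at the same level when the same variance is used.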

In the following we study inference for the zero-inflated truncated Poisson distribution using the results reported in the earlier sections.

Zero-inflated truncated poisson distribution

Truncated samples from discrete distributions arise in numerous situations where counts of zero are not observed. As an example, consider the distribution of the number of children per family in developing nations, where records are maintained only if there is at least one child in the family. The number of childless families remains unknown. The resulting sample is thus truncated, with the zero class missing. For a continuous distribution, a sample of this type would be described as singly left truncated. In other situations, samples from discrete distributions might be censored on the right.

In this section, we consider the zero-inflated truncated Poisson distribution truncated on the right at the support point $t$ onwards, where $t$ is known. Moments, maximum likelihood estimators and the Fisher information matrix for the full and conditional likelihoods are provided. We also provide three tests for the parameter of the ZITPD.

Consider the truncated Poisson distribution (TPD) truncated at the support point $t$ onwards. The probability mass function of the TPD is given by

$$P(X=x)=\frac{e^{-\theta}\theta^x}{x!\,\bigl(1-P(X>t)\bigr)},\quad x=0,1,2,\ldots,t$$

$$=\frac{e^{-\theta}\theta^x}{x!\left(\sum_{y=0}^{t}\frac{e^{-\theta}\theta^y}{y!}\right)}=\frac{\theta^x}{x!\,A(\theta)},\quad\text{where } A(\theta)=\sum_{y=0}^{t}\frac{\theta^y}{y!}.$$

Using this truncated distribution, we define the zero-inflated truncated Poisson distribution truncated at $t$ onwards.

The probability mass function of the ZITP distribution is given by

$$P(X=x)=\begin{cases}(1-\pi)+\dfrac{\pi}{A(\theta)}, & x=0\\[2ex]\dfrac{\pi\theta^x}{x!\,A(\theta)}, & x=1,2,3,\ldots,t\end{cases}\qquad\theta>0,\ 0<\pi<1.\tag{3.1}$$
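Eq. (3.1) describes a two-component mixture, which suggests a simple way to generate observations: with probability $1-\pi$ return a structural zero, otherwise draw from the truncated Poisson on $0,1,\ldots,t$. The Python sketch below follows that reading; the function name `draw_zitpd`, the seed and the parameter values are assumptions for illustration only.

```python
import random
from math import factorial

def draw_zitpd(pi, theta, t, rng):
    """One draw from eq. (3.1): structural zero w.p. 1 - pi, else truncated Poisson on 0..t."""
    if rng.random() >= pi:
        return 0
    weights = [theta ** y / factorial(y) for y in range(t + 1)]  # proportional to the TPD pmf
    u = rng.random() * sum(weights)  # inverse-CDF step on the unnormalised weights
    cumulative = 0.0
    for y, w in enumerate(weights):
        cumulative += w
        if u < cumulative:
            return y
    return t  # guard against floating-point round-off

rng = random.Random(1)
sample = [draw_zitpd(0.4, 2.0, 7, rng) for _ in range(5000)]
zero_share = sample.count(0) / len(sample)  # should be near (1 - pi) + pi / A(theta)
```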

Estimation of the parameters π  and θ  

Estimation of the parameters using full likelihood function

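Let $X_1,X_2,\ldots,X_n$ be a random sample observed from the zero-inflated truncated Poisson distribution truncated at $t$ onwards, where $t$ is the point in the support defined in the above probability mass function. Then the likelihood function is given by

$$L(\theta,\pi;\underline x)=\prod_{i=1}^{n}\left(1-\pi+\frac{\pi}{A(\theta)}\right)^{1-a_i}\left(\frac{\pi\theta^{x_i}}{x_i!\,A(\theta)}\right)^{a_i},\quad\theta,\pi>0.$$

The corresponding log likelihood function is given by

$$\log L(\theta,\pi;\underline x)=n_0\log\left(1-\pi+\frac{\pi}{A(\theta)}\right)+\sum_{i=1}^{n}a_i\log\pi-\sum_{i=1}^{n}a_i\log x_i!+\sum_{i=1}^{n}a_ix_i\log\theta-\sum_{i=1}^{n}a_i\log A(\theta).\tag{3.2}$$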

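To find the MLEs of $\theta$ and $\pi$, we differentiate eq. (3.2) with respect to $\pi$ and $\theta$ and equate the derivatives to zero. We get

$$\hat\pi=\frac{(n-n_0)\,A(\hat\theta)}{n\,\bigl(A(\hat\theta)-1\bigr)}\tag{3.3}$$

and

$$\frac{\sum_{i=1}^{n}a_ix_i}{\hat\theta}=\frac{\hat\pi\,n_0\,A'(\hat\theta)}{\left(1-\hat\pi+\frac{\hat\pi}{A(\hat\theta)}\right)A(\hat\theta)^2}+\frac{\sum_{i=1}^{n}a_i\,A'(\hat\theta)}{A(\hat\theta)}$$

$$=\frac{A'(\hat\theta)}{A(\hat\theta)}\left((n-n_0)+\frac{\hat\pi\,n_0}{\left(1-\hat\pi+\frac{\hat\pi}{A(\hat\theta)}\right)A(\hat\theta)}\right),$$

that is,

$$\frac{\sum_{i=1}^{n}a_ix_i}{\hat\theta}=\frac{A'(\hat\theta)}{A(\hat\theta)}\left(\frac{(n-n_0)(1-\hat\pi)A(\hat\theta)+n\hat\pi}{(1-\hat\pi)A(\hat\theta)+\hat\pi}\right).\tag{3.4}$$

Substituting $\hat\pi=\dfrac{(n-n_0)A(\hat\theta)}{n(A(\hat\theta)-1)}$ in the above equation we have

$$\frac{\sum_{i=1}^{n}a_ix_i}{\hat\theta}=\frac{(n-n_0)\,A'(\hat\theta)}{A(\hat\theta)-1},$$

$$\sum_{i=1}^{n}a_ix_i\bigl(A(\hat\theta)-1\bigr)-(n-n_0)\,\hat\theta A'(\hat\theta)=0,\tag{3.5}$$

which is a non-linear equation in $\hat\theta$. Therefore, we use a numerical technique to solve it. Let

$$h(\hat\theta)=\sum_{i=1}^{n}a_ix_i\bigl(A(\hat\theta)-1\bigr)-(n-n_0)\,\hat\theta A'(\hat\theta)$$

and

$$h'(\hat\theta)=\sum_{i=1}^{n}a_ix_i\,A'(\hat\theta)-(n-n_0)\bigl(\hat\theta A''(\hat\theta)+A'(\hat\theta)\bigr).$$

Using the Newton-Raphson iterative formula $\hat\theta_{i+1}=\hat\theta_i-\dfrac{h(\hat\theta_i)}{h'(\hat\theta_i)}$, $i=0,1,2,\ldots$, with a suitable initial value $\hat\theta_0$, we get $\hat\theta$. Substituting this value of $\hat\theta$ in eq. (3.3), we get the value of $\hat\pi$.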
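The Newton-Raphson step for eq. (3.5), followed by eq. (3.3), can be sketched in Python as below. This is not the author's program: the names `A` and `fit_zitpd`, the default starting value (the mean of the positive observations) and the small data set are assumptions for illustration. Note that $\theta=0$ is a trivial root of $h$, so the starting value should be kept well away from zero.

```python
from math import factorial

def A(theta, t, d=0):
    """d-th derivative (d = 0, 1, 2) of A(theta) = sum_{y=0}^{t} theta^y / y!."""
    return sum(theta ** (y - d) / factorial(y - d) for y in range(d, t + 1))

def fit_zitpd(xs, t, theta_init=None, tol=1e-10, max_iter=100):
    """Solve eq. (3.5) by Newton-Raphson, then plug the root into eq. (3.3)."""
    n = len(xs)
    n0 = sum(1 for x in xs if x == 0)
    s = sum(xs)  # equals sum a_i x_i, because zeros contribute nothing
    # Assumed starting value: mean of the positive observations
    theta = s / (n - n0) if theta_init is None else theta_init
    for _ in range(max_iter):
        h = s * (A(theta, t) - 1.0) - (n - n0) * theta * A(theta, t, 1)
        dh = s * A(theta, t, 1) - (n - n0) * (theta * A(theta, t, 2) + A(theta, t, 1))
        step = h / dh
        theta -= step
        if abs(step) < tol:
            break
    pi_hat = (n - n0) * A(theta, t) / (n * (A(theta, t) - 1.0))
    return pi_hat, theta

# Hypothetical sample of size 100 with truncation point t = 7
data = [0] * 60 + [1] * 12 + [2] * 14 + [3] * 8 + [4] * 4 + [5] * 2
pi_hat, theta_hat = fit_zitpd(data, t=7)
```

For these data the mean of the positive observations is 2.25, and the fitted $\hat\theta$ should reproduce it through $\hat\theta A'(\hat\theta)/(A(\hat\theta)-1)$.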

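In the following we find the elements of the Fisher information matrix.

Here we have

$$\frac{\partial\log L}{\partial\pi}=\frac{n_0\left(-1+\frac{1}{A(\theta)}\right)}{1-\pi+\frac{\pi}{A(\theta)}}+\frac{\sum_{i=1}^{n}a_i}{\pi},$$

$$\frac{\partial^2\log L}{\partial\pi^2}=-\frac{n_0\left(-1+\frac{1}{A(\theta)}\right)^2}{\left(1-\pi+\frac{\pi}{A(\theta)}\right)^2}-\frac{\sum_{i=1}^{n}a_i}{\pi^2},$$

$$E\left(-\frac{\partial^2\log L}{\partial\pi^2}\right)=\frac{E(n_0)\left(-1+\frac{1}{A(\theta)}\right)^2}{\left(1-\pi+\frac{\pi}{A(\theta)}\right)^2}+\frac{E\left(\sum_{i=1}^{n}a_i\right)}{\pi^2}$$

$$=\frac{n\left(-1+\frac{1}{A(\theta)}\right)^2}{1-\pi+\frac{\pi}{A(\theta)}}+\frac{n\left(1-\frac{1}{A(\theta)}\right)}{\pi},$$

$$E\left(\frac{\partial^2\log L}{\partial\pi^2}\right)=-\frac{n\left(-1+\frac{1}{A(\theta)}\right)^2}{1-\pi+\frac{\pi}{A(\theta)}}+\frac{n\left(-1+\frac{1}{A(\theta)}\right)}{\pi},$$

$$I_{11}=\frac{n\,\bigl(A(\theta)-1\bigr)}{\pi\,\bigl(A(\theta)-\pi A(\theta)+\pi\bigr)}.\tag{3.6}$$

Now

$$\frac{\partial\log L}{\partial\pi}=\frac{n_0\left(-1+\frac{1}{A(\theta)}\right)}{1-\pi+\frac{\pi}{A(\theta)}}+\frac{\sum_{i=1}^{n}a_i}{\pi},$$

$$\frac{\partial^2\log L}{\partial\pi\,\partial\theta}=-\,n_0\,\frac{A'(\theta)}{A(\theta)^2}\cdot\frac{\left(1-\pi+\frac{\pi}{A(\theta)}\right)-\pi\left(-1+\frac{1}{A(\theta)}\right)}{\left(1-\pi+\frac{\pi}{A(\theta)}\right)^2},$$

$$E\left(-\frac{\partial^2\log L}{\partial\pi\,\partial\theta}\right)=E(n_0)\left(\frac{A'(\theta)}{A(\theta)^2}\right)\frac{\left(1-\pi+\frac{\pi}{A(\theta)}\right)-\pi\left(-1+\frac{1}{A(\theta)}\right)}{\left(1-\pi+\frac{\pi}{A(\theta)}\right)^2},$$

$$E\left(-\frac{\partial^2\log L}{\partial\pi\,\partial\theta}\right)=I_{12}=\frac{n\,A'(\theta)}{A(\theta)\,\bigl(A(\theta)-\pi A(\theta)+\pi\bigr)}.\tag{3.7}$$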

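Further, differentiating eq. (3.2) twice with respect to $\theta$, we get

$$\frac{\partial^2\log L}{\partial\theta^2}=-\frac{n_0\pi\left\{\left(1-\pi+\frac{\pi}{A(\theta)}\right)\left(\frac{A(\theta)^2A''(\theta)-2A'(\theta)^2A(\theta)}{A(\theta)^4}\right)+\frac{A'(\theta)^2}{A(\theta)^2}\left(\frac{\pi}{A(\theta)^2}\right)\right\}}{\left(1-\pi+\frac{\pi}{A(\theta)}\right)^2}-\frac{\sum_{i=1}^{n}a_ix_i}{\theta^2}-\frac{\sum_{i=1}^{n}a_i\bigl(A(\theta)A''(\theta)-A'(\theta)^2\bigr)}{A(\theta)^2}.$$

Therefore,

$$E\left(-\frac{\partial^2\log L}{\partial\theta^2}\right)=n\pi\left(\frac{A(\theta)^2A''(\theta)-2A'(\theta)^2A(\theta)}{A(\theta)^4}+\frac{\pi A'(\theta)^2}{A(\theta)^4\left(1-\pi+\frac{\pi}{A(\theta)}\right)}\right)+\frac{n\pi A'(\theta)}{\theta A(\theta)}+\frac{n\pi\left(1-\frac{1}{A(\theta)}\right)\bigl(A(\theta)A''(\theta)-A'(\theta)^2\bigr)}{A(\theta)^2}.$$

Hence,

$$I_{22}=n\pi\left(\frac{A(\theta)^2A''(\theta)-2A'(\theta)^2A(\theta)}{A(\theta)^4}+\frac{\pi A'(\theta)^2}{A(\theta)^4\left(1-\pi+\frac{\pi}{A(\theta)}\right)}\right)+\frac{n\pi A'(\theta)}{\theta A(\theta)}+\frac{n\pi\left(1-\frac{1}{A(\theta)}\right)\bigl(A(\theta)A''(\theta)-A'(\theta)^2\bigr)}{A(\theta)^2}.$$

The asymptotic variances of $\hat\pi$ and $\hat\theta$ are

$$AV_{\hat\pi}(\pi,\theta)=I^{11}=\left(I_{11}-\frac{I_{12}^{\,2}}{I_{22}}\right)^{-1},$$

$$AV_{\hat\theta}(\pi,\theta)=I^{22}=\left(I_{22}-\frac{I_{12}^{\,2}}{I_{11}}\right)^{-1}.\tag{3.8}$$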
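The elements (3.6), (3.7) and $I_{22}$, together with the asymptotic variances in eq. (3.8), can be evaluated numerically. The following Python sketch does so for given $n$, $\pi$, $\theta$ and $t$; the function names `zitpd_information` and `asymptotic_variances` and the parameter values are hypothetical, and the off-diagonal element is assumed to enter eq. (3.8) as a square.

```python
from math import factorial

def A(theta, t, d=0):
    """d-th derivative (d = 0, 1, 2) of A(theta) = sum_{y=0}^{t} theta^y / y!."""
    return sum(theta ** (y - d) / factorial(y - d) for y in range(d, t + 1))

def zitpd_information(n, pi, theta, t):
    """Elements I11, I12, I22 of the Fisher information matrix of (pi, theta)."""
    a0, a1, a2 = A(theta, t), A(theta, t, 1), A(theta, t, 2)
    p0 = 1.0 - pi + pi / a0                      # P(X = 0)
    i11 = n * (a0 - 1.0) / (pi * (a0 - pi * a0 + pi))
    i12 = n * a1 / (a0 * (a0 - pi * a0 + pi))
    i22 = (n * pi * ((a0 ** 2 * a2 - 2.0 * a1 ** 2 * a0) / a0 ** 4
                     + pi * a1 ** 2 / (a0 ** 4 * p0))
           + n * pi * a1 / (theta * a0)
           + n * pi * (1.0 - 1.0 / a0) * (a0 * a2 - a1 ** 2) / a0 ** 2)
    return i11, i12, i22

def asymptotic_variances(n, pi, theta, t):
    """Asymptotic variances of pi-hat and theta-hat from the inverse information matrix."""
    i11, i12, i22 = zitpd_information(n, pi, theta, t)
    return 1.0 / (i11 - i12 ** 2 / i22), 1.0 / (i22 - i12 ** 2 / i11)

# Illustrative values: n = 100, pi = 0.5, theta = 2, t = 7
av_pi, av_theta = asymptotic_variances(100, 0.5, 2.0, 7)
```

Because every element is proportional to $n$, doubling the sample size halves both variances, which is an easy check on an implementation.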

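Conditional likelihood function approach

The conditional likelihood function is given by

$$L^{*}(\theta;\underline x)=\prod_{i=1}^{n}\left(\frac{\theta^{x_i}}{x_i!\,\bigl(A(\theta)-1\bigr)}\right)^{a_i},\quad\theta>0.\tag{3.9}$$

The corresponding log likelihood function is given by

$$\log L^{*}(\theta;\underline x)=-\sum_{i=1}^{n-n_0}\log x_i!+\sum_{i=1}^{n-n_0}x_i\log\theta-(n-n_0)\log\bigl(A(\theta)-1\bigr).\tag{3.10}$$

The corresponding mle $\tilde\theta$ is the solution of the equation

$$\bar x=\frac{\tilde\theta\,A'(\tilde\theta)}{A(\tilde\theta)-1}.\tag{3.11}$$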

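Now consider

$$\frac{\partial^2\log L^{*}}{\partial\theta^2}=-\frac{\sum_{i=1}^{n-n_0}x_i}{\theta^2}-\sum_{i=1}^{n-n_0}\frac{\bigl(A(\theta)-1\bigr)A''(\theta)-A'(\theta)^2}{\bigl(A(\theta)-1\bigr)^2},$$

$$E\left(-\frac{\partial^2\log L^{*}}{\partial\theta^2}\right)=E\left(\frac{\sum_{i=1}^{n-n_0}x_i}{\theta^2}\right)+E\left(\sum_{i=1}^{n-n_0}\frac{\bigl(A(\theta)-1\bigr)A''(\theta)-A'(\theta)^2}{\bigl(A(\theta)-1\bigr)^2}\right)$$

$$=\frac{(n-n_0)\,\theta A'(\theta)}{\theta^2\bigl(A(\theta)-1\bigr)}+\frac{(n-n_0)\bigl\{\bigl(A(\theta)-1\bigr)A''(\theta)-A'(\theta)^2\bigr\}}{\bigl(A(\theta)-1\bigr)^2}$$

$$=\frac{n-n_0}{A(\theta)-1}\left(\frac{A'(\theta)}{\theta}+\frac{A''(\theta)\bigl(A(\theta)-1\bigr)-A'(\theta)^2}{A(\theta)-1}\right).\tag{3.12}$$

Therefore, the asymptotic variance of $\tilde\theta$ is different from the asymptotic variance of the estimate of $\theta$ based on the standard likelihood approach. It is given by

$$AV_{\tilde\theta}(\theta)=\left(\frac{n-n_0}{A(\theta)-1}\left(\frac{A'(\theta)}{\theta}+\frac{\bigl(A(\theta)-1\bigr)A''(\theta)-A'(\theta)^2}{A(\theta)-1}\right)\right)^{-1}.\tag{3.13}$$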
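Eq. (3.13) depends on the data only through the number of positive observations $n-n_0$, so it is straightforward to evaluate. A minimal Python sketch, with the hypothetical function name `av_conditional` and illustrative inputs:

```python
from math import factorial

def A(theta, t, d=0):
    """d-th derivative (d = 0, 1, 2) of A(theta) = sum_{y=0}^{t} theta^y / y!."""
    return sum(theta ** (y - d) / factorial(y - d) for y in range(d, t + 1))

def av_conditional(n, n0, theta, t):
    """Asymptotic variance of the conditional mle of theta, following eq. (3.13)."""
    a0, a1, a2 = A(theta, t), A(theta, t, 1), A(theta, t, 2)
    information = (n - n0) / (a0 - 1.0) * (
        a1 / theta + ((a0 - 1.0) * a2 - a1 ** 2) / (a0 - 1.0)
    )
    return 1.0 / information

# Illustrative values: 43 positive observations out of n = 100, theta = 2, t = 7
av_tilde = av_conditional(n=100, n0=57, theta=2.0, t=7)
```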

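Moment estimator of ZITP distribution

$$\text{Mean}=E(X)=\frac{\pi\theta A'(\theta)}{A(\theta)}\tag{3.14}$$

$$E(X^2)=\frac{\theta\pi}{A(\theta)}\bigl(\theta A''(\theta)+A'(\theta)\bigr)$$

$$Var(X)=\frac{\theta\pi}{A(\theta)}\left(\theta A''(\theta)+A'(\theta)-\frac{\pi\theta A'(\theta)^2}{A(\theta)}\right)=\sigma^2(\pi,\theta),\ \text{say}\tag{3.15}$$

$$\bar x=\frac{\pi\theta A'(\theta)}{A(\theta)}\tag{3.16}$$

$$\frac{\sum_{i=1}^{n}x_i^2}{n}=\frac{\theta\pi}{A(\theta)}\bigl(\theta A''(\theta)+A'(\theta)\bigr)\tag{3.17}$$

Solving eq. (3.16) and eq. (3.17), we get the moment estimators of $\pi$ and $\theta$.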
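One way to solve eq. (3.16) and eq. (3.17) is to divide the second by the first, which eliminates $\pi$ and leaves $1+\theta A''(\theta)/A'(\theta)$ equal to the ratio of the second to the first sample moment. The Python sketch below solves this one-dimensional equation by bisection and then recovers $\pi$ from eq. (3.16). The solution method, the function name `moment_estimates` and the bracketing interval are assumptions of this sketch, not part of the article; it also assumes $t\ge 2$ and that the left-hand side increases in $\theta$.

```python
from math import factorial

def A(theta, t, d=0):
    """d-th derivative (d = 0, 1, 2) of A(theta) = sum_{y=0}^{t} theta^y / y!."""
    return sum(theta ** (y - d) / factorial(y - d) for y in range(d, t + 1))

def moment_estimates(m1, m2, t, lo=1e-8, hi=50.0, iters=200):
    """Moment estimates (pi, theta) from the first two raw sample moments m1 and m2."""
    target = m2 / m1  # equals 1 + theta * A''(theta) / A'(theta) after eliminating pi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if 1.0 + mid * A(mid, t, 2) / A(mid, t, 1) < target:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    pi = m1 * A(theta, t) / (theta * A(theta, t, 1))  # from eq. (3.16)
    return pi, theta

# Check on exact population moments for pi = 0.4, theta = 2, t = 7
pi_true, theta_true, t = 0.4, 2.0, 7
m1 = pi_true * theta_true * A(theta_true, t, 1) / A(theta_true, t)
m2 = theta_true * pi_true / A(theta_true, t) * (theta_true * A(theta_true, t, 2) + A(theta_true, t, 1))
pi_mom, theta_mom = moment_estimates(m1, m2, t)
```

Feeding in the exact population moments should return the parameters that generated them, which is what the last lines verify.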

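Tests for the parameter θ of the ZITP distribution

Suppose we want to test $H_0:\theta=\theta_0$ vs $H_1:\theta\neq\theta_0$ (assuming $\pi$ is unknown) [4].

Test based on $\hat\theta$

The test statistic is

$$Z_4=\frac{\hat\theta-\theta_0}{\sqrt{AV_{\hat\theta}(\hat\pi_0,\theta_0)}},\tag{3.18}$$

where $AV_{\hat\theta}(\hat\pi_0,\theta_0)$ is defined in eq. (3.8). The test $\psi_4$ rejects $H_0$ if $|Z_4|>z_{1-\alpha/2}$.

Test based on $\tilde\theta$

The test statistic here is

$$Z_5=\frac{\tilde\theta-\theta_0}{\sqrt{AV_{\tilde\theta}(\theta_0)}},\tag{3.19}$$

where $AV_{\tilde\theta}(\theta_0)$ is as defined in eq. (3.13). The test $\psi_5$ rejects $H_0$ if $|Z_5|>z_{1-\alpha/2}$.

Test based on sample mean

The test statistic is

$$Z_6=\frac{\sqrt n\left(\frac{\bar X}{\hat\pi_0}-\theta_0\right)}{\sqrt{\hat\pi_0^{-2}\,AV_{\bar X}(\hat\pi_0,\theta_0)}},\tag{3.20}$$

where $\hat\pi_0=\bar X\left(\dfrac{A(\theta_0)}{\theta_0A'(\theta_0)}\right)$.

The power of the test is given by

$$\beta_{\psi_6}(\hat\pi,\theta)=\sum_{k=0}^{n}\bigl(1-\Phi(\hat B_k)+\Phi(\hat A_k)\bigr)P(n_0=k),$$

where

$$\hat B_k=\frac{\hat\pi_0\left(\theta_0+z_{1-\alpha/2}\sqrt{\hat\pi_0^{-2}AV_{\bar X}(\hat\pi_0,\theta_0)}\right)-\pi\theta}{\sqrt{AV_{\bar X}(\pi,\theta_0)}},\qquad \hat A_k=\frac{\hat\pi_0\left(\theta_0-z_{1-\alpha/2}\sqrt{\hat\pi_0^{-2}AV_{\bar X}(\hat\pi_0,\theta_0)}\right)-\pi\theta}{\sqrt{AV_{\bar X}(\pi,\theta_0)}}$$

and $P(n_0=k)=\dbinom{n}{k}P_0^{\,k}(1-P_0)^{n-k}$, with $P_0=1-\pi+\dfrac{\pi}{A(\theta)}$.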

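Asymptotic confidence interval for the parameter θ

The asymptotic confidence interval for $\theta$ based on the test $\psi_4$ is given by

$$\left(\hat\theta-z_{1-\alpha/2}\sqrt{AV_{\hat\theta}(\hat\pi,\hat\theta)},\ \hat\theta+z_{1-\alpha/2}\sqrt{AV_{\hat\theta}(\hat\pi,\hat\theta)}\right),\tag{3.21}$$

where $AV_{\hat\theta}(\hat\pi,\hat\theta)$ is an estimate of the asymptotic variance of $\hat\theta$. The asymptotic confidence interval for $\theta$ based on the test $\psi_5$ is given by

$$\left(\tilde\theta-z_{1-\alpha/2}\sqrt{AV_{\tilde\theta}(\tilde\theta)},\ \tilde\theta+z_{1-\alpha/2}\sqrt{AV_{\tilde\theta}(\tilde\theta)}\right),\tag{3.22}$$

where $AV_{\tilde\theta}(\tilde\theta)$ is an estimate of the asymptotic variance of $\tilde\theta$ as given in eq. (3.13).

The asymptotic confidence interval for $\theta$ based on the test $\psi_6$ is given by

$$\left(\frac{\bar X}{\hat\pi}-z_{1-\alpha/2}\sqrt{AV_{\bar\theta}(\hat\pi,\bar\theta)},\ \frac{\bar X}{\hat\pi}+z_{1-\alpha/2}\sqrt{AV_{\bar\theta}(\hat\pi,\bar\theta)}\right),\tag{3.23}$$

where $AV_{\bar\theta}(\hat\pi,\bar\theta)=\dfrac{\sqrt n\left(\frac{\bar X}{\pi}-\mu(\theta)\right)}{\sqrt{\sigma^2(\pi,\theta)/\pi^2}}$.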

Simulation study

A simulation study is carried out to investigate the power of the two tests $\psi_4$ and $\psi_5$ proposed in section 3.2. We generate 10000 samples of sizes 100 and 200 for different values of $\pi$, $\theta$ and truncation point $t$. For each generated sample, the test statistics are calculated. The percentage of times the absolute value of a test statistic exceeds $z_{1-\alpha/2}$ is computed, which is an estimate of the power of the respective test. An R programme was developed to compute the power of the tests. The results for $\theta_0=2$ and $4$, $\pi=0.3, 0.4, 0.5, 0.6, 0.7$, $\alpha=0.05$ and truncation points $t=7$ and $9$ are presented in Table 1 & Table 2.
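The simulation design can be sketched as follows. This is a Python illustration, not the R programme mentioned in the text, and it covers only the conditional test $\psi_5$ of eq. (3.19). The function names, the bisection solver for eq. (3.11), the seed, the reduced number of replications and the convention of counting a sample with no positive observations as a non-rejection are all assumptions of this sketch.

```python
import random
from math import factorial, sqrt
from statistics import NormalDist

def A(theta, t, d=0):
    """d-th derivative (d = 0, 1, 2) of A(theta) = sum_{y=0}^{t} theta^y / y!."""
    return sum(theta ** (y - d) / factorial(y - d) for y in range(d, t + 1))

def solve_theta(xbar, t, lo=1e-8, hi=60.0, iters=60):
    """Bisection for xbar = theta * A'(theta) / (A(theta) - 1), eq. (3.11)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid * A(mid, t, 1) / (A(mid, t) - 1.0) < xbar:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def av_conditional(m, theta, t):
    """Eq. (3.13) with m = n - n0 positive observations."""
    a0, a1, a2 = A(theta, t), A(theta, t, 1), A(theta, t, 2)
    return 1.0 / (m / (a0 - 1.0) * (a1 / theta + ((a0 - 1.0) * a2 - a1 ** 2) / (a0 - 1.0)))

def estimate_power(pi, theta, theta0, t, n, reps, alpha=0.05, seed=0):
    """Percentage of simulated ZITPD samples in which |Z5| exceeds z_{1-alpha/2}."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    a0 = A(theta, t)
    support = list(range(t + 1))
    weights = [(1.0 - pi) + pi / a0] + [pi * theta ** x / (factorial(x) * a0) for x in range(1, t + 1)]
    rejections = 0
    for _ in range(reps):
        positives = [x for x in rng.choices(support, weights=weights, k=n) if x > 0]
        m = len(positives)
        if m == 0:  # no information about theta; counted as a non-rejection here
            continue
        theta_tilde = solve_theta(sum(positives) / m, t)
        z5 = (theta_tilde - theta0) / sqrt(av_conditional(m, theta0, t))
        rejections += abs(z5) > z_crit
    return 100.0 * rejections / reps

# Estimated size at theta = theta0 = 2 (pi = 0.5, t = 7, n = 100), with few replications
size_pct = estimate_power(pi=0.5, theta=2.0, theta0=2.0, t=7, n=100, reps=300)
```

With only a few hundred replications the estimates are noisy; the 10000 replications used in the study would give much tighter values.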

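| π | θ | ψ4 (n=100) | ψ5 (n=100) | ψ4 (n=200) | ψ5 (n=200) |
|---|---|---|---|---|---|
| 0.3 | 2.0 | 6.57 | 4.28 | 6.57 | 4.63 |
| | 2.2 | 11.49 | 8.9 | 16.08 | 12.88 |
| | 2.4 | 26.85 | 22.24 | 45.08 | 39.93 |
| | 2.6 | 49.43 | 44.5 | 76.81 | 72.82 |
| | 2.8 | 71.27 | 66.99 | 93.49 | 91.72 |
| | 3 | 86.28 | 83.23 | 98.91 | 98.46 |
| | 3.2 | 94.64 | 93.18 | 99.77 | 99.73 |
| | 3.4 | 98.08 | 97.58 | 99.99 | 99.99 |
| | 3.6 | 99.44 | 99.08 | 100 | 100 |
| | 3.8 | 99.8 | 99.76 | 100 | 100 |
| | 4 | 99.95 | 99.94 | 100 | 100 |
| | 4.2 | 99.98 | 99.97 | 100 | 100 |
| | 4.4 | 100 | 100 | 100 | 100 |
| 0.4 | 2 | 6.44 | 4.24 | 6.29 | 4.46 |
| | 2.2 | 12.83 | 10.33 | 20.01 | 15.59 |
| | 2.4 | 33.16 | 28.94 | 56.94 | 50.24 |
| | 2.6 | 60.11 | 55.4 | 87.34 | 83.64 |
| | 2.8 | 81.87 | 78.72 | 97.82 | 97 |
| | 3 | 93.8 | 92.31 | 99.83 | 99.79 |
| | 3.2 | 98.35 | 97.87 | 100 | 100 |
| | 3.4 | 99.62 | 99.54 | 100 | 100 |
| | 3.6 | 99.98 | 99.96 | 100 | 100 |
| | 3.8 | 99.99 | 99.97 | 100 | 100 |
| | 3.8 | 100 | 100 | 100 | 100 |
| 0.5 | 2 | 6.17 | 4.46 | 6.25 | 4.2 |
| | 2.2 | 14.83 | 12.01 | 24.63 | 19.24 |
| | 2.4 | 40.31 | 34.76 | 66.99 | 60.39 |
| | 2.6 | 70.38 | 65.06 | 92.88 | 90.37 |
| | 2.8 | 90.13 | 87.37 | 99.52 | 99.14 |
| | 3 | 97.23 | 96.45 | 99.99 | 99.97 |
| | 3.2 | 99.55 | 99.36 | 100 | 100 |
| | 3.4 | 99.96 | 99.94 | 100 | 100 |
| | 3.6 | 100 | 99.99 | 100 | 100 |
| | 3.8 | 100 | 100 | 100 | 100 |
| 0.6 | 2 | 6.8 | 4.43 | 7.12 | 4.89 |
| | 2.2 | 18.04 | 13.52 | 28.35 | 21.91 |
| | 2.4 | 47.01 | 40.71 | 73.41 | 65.85 |
| | 2.6 | 77.65 | 72.17 | 96.33 | 94.68 |
| | 2.8 | 94.05 | 91.78 | 99.86 | 99.74 |
| | 3 | 99.01 | 98.48 | 99.99 | 99.99 |
| | 3.2 | 99.85 | 99.74 | 100 | 99.99 |
| | 3.4 | 99.99 | 99.97 | 100 | 100 |
| | 3.6 | 100 | 100 | 100 | 100 |
| 0.7 | 2 | 7.11 | 4.17 | 7.34 | 4.95 |
| | 2.2 | 19.69 | 14.15 | 32.46 | 24.21 |
| | 2.4 | 54.29 | 45.76 | 80.95 | 73.64 |
| | 2.6 | 84.17 | 78.59 | 98.35 | 97.26 |
| | 2.8 | 96.91 | 95.1 | 99.95 | 99.9 |
| | 3 | 99.65 | 99.28 | 100 | 100 |
| | 3.2 | 99.99 | 99.97 | 100 | 100 |
| | 3.4 | 100 | 100 | 100 | 100 |

Table 1 Power (in %) of the tests ψ4 and ψ5 for θ0 = 2, t = 7, n = 100 and 200, α = 0.05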

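| π | θ | ψ4 (n=100) | ψ5 (n=100) | ψ4 (n=200) | ψ5 (n=200) |
|---|---|---|---|---|---|
| 0.3 | 4 | 5.56 | 3.58 | 4.65 | 3.37 |
| | 4.2 | 9.33 | 4.71 | 12.38 | 5.58 |
| | 4.4 | 19.4 | 9.95 | 31.31 | 17.04 |
| | 4.6 | 33.8 | 20.81 | 56.28 | 38.36 |
| | 4.8 | 50.58 | 35.48 | 78.37 | 62.91 |
| | 5 | 68.14 | 53.07 | 92 | 82.84 |
| | 5.2 | 80.88 | 67.97 | 97.5 | 93.47 |
| | 5.4 | 89.97 | 81.06 | 99.45 | 98.4 |
| | 5.6 | 95.31 | 89.59 | 99.83 | 99.53 |
| | 5.8 | 97.77 | 94.74 | 99.97 | 99.94 |
| | 6 | 99.05 | 97.48 | 100 | 99.99 |
| | 6.2 | 99.6 | 98.72 | 100 | 100 |
| | 6.4 | 99.85 | 99.5 | 100 | 100 |
| 0.4 | 4 | 5.29 | 3.57 | 5.26 | 3.95 |
| | 4.2 | 10.24 | 4.86 | 13.8 | 5.8 |
| | 4.4 | 22.52 | 12.32 | 38.38 | 21.74 |
| | 4.6 | 41.49 | 26.14 | 68.09 | 49.91 |
| | 4.8 | 62.45 | 46.12 | 88.35 | 76.97 |
| | 5 | 78.69 | 65.52 | 97.45 | 92.41 |
| | 5.2 | 90.17 | 81.34 | 99.52 | 98.26 |
| | 5.4 | 95.75 | 90.75 | 99.96 | 99.67 |
| | 5.6 | 98.55 | 96.16 | 99.99 | 99.97 |
| | 5.8 | 99.53 | 98.56 | 100 | 100 |
| | 6 | 99.88 | 99.52 | 100 | 100 |
| | 6.2 | 99.94 | 99.78 | 100 | 100 |
| | 6.4 | 99.96 | 99.94 | 100 | 100 |
| 0.5 | 4 | 5.39 | 3.75 | 4.88 | 3.91 |
| | 4.2 | 11.78 | 5.51 | 15.75 | 6.94 |
| | 4.4 | 26.72 | 14.86 | 45.12 | 26.2 |
| | 4.6 | 49.72 | 33.44 | 76.69 | 59.34 |
| | 4.8 | 70.81 | 55 | 94.06 | 85.45 |
| | 5 | 86.58 | 75.44 | 98.94 | 96.71 |
| | 5.2 | 95.84 | 89.47 | 99.95 | 99.49 |
| | 5.4 | 98.59 | 95.86 | 99.98 | 99.95 |
| | 5.6 | 99.62 | 98.82 | 100 | 100 |
| | 5.8 | 99.93 | 99.79 | 100 | 100 |
| | 6 | 99.98 | 99.86 | 100 | 100 |
| | 6.2 | 99.99 | 99.97 | 100 | 100 |
| | 6.4 | 100 | 100 | 100 | 100 |
| 0.6 | 4 | 4.71 | 3.41 | 5.27 | 4.12 |
| | 4.2 | 13.45 | 5.88 | 20.35 | 8.06 |
| | 4.4 | 34.38 | 19.15 | 57.57 | 35.63 |
| | 4.6 | 62.82 | 45.15 | 89.27 | 74.97 |
| | 4.8 | 84.74 | 70.72 | 98.5 | 94.9 |
| | 5 | 95.58 | 88.95 | 99.96 | 99.56 |
| | 5.2 | 98.9 | 96.8 | 100 | 99.97 |
| | 5.4 | 99.77 | 99.19 | 100 | 100 |
| | 5.6 | 99.98 | 99.88 | 100 | 100 |
| | 5.8 | 100 | 100 | 100 | 100 |
| 0.7 | 4 | 4.71 | 3.41 | 5.27 | 4.12 |
| | 4.2 | 13.45 | 5.88 | 20.35 | 8.06 |
| | 4.4 | 34.38 | 19.15 | 57.57 | 35.63 |
| | 4.6 | 62.82 | 45.15 | 89.27 | 74.97 |
| | 4.8 | 84.74 | 70.72 | 98.5 | 94.9 |
| | 5 | 95.58 | 88.95 | 99.96 | 99.56 |
| | 5.2 | 98.9 | 96.8 | 100 | 99.97 |
| | 5.4 | 99.77 | 99.19 | 100 | 100 |
| | 5.6 | 99.98 | 99.88 | 100 | 100 |
| | 5.8 | 100 | 100 | 100 | 100 |

Table 2 Power (in %) of the tests ψ4 and ψ5 for θ0 = 4, t = 9, n = 100 and 200, α = 0.05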

From the simulation study reported in Table 1 & Table 2, we observe that

  1. The test based on the full likelihood approach is more powerful than the one based on the conditional likelihood approach when θ is small. For large θ, both tests are equally good.
  2. The probability of Type I error of the former test is higher than that of the latter.
  3. Since both tests are equally good for large values of θ, we recommend the conditional likelihood approach when θ is large, from the computational point of view.
  4. If θ is large, the proportion of zeros arising from the Poisson distribution is relatively low. Hence these zeros can be ignored while making inference about θ. However, for smaller values of θ, ignoring them will affect inference about θ.

Illustrative example

Let us consider the traffic accident research data given by Kuan et al.5

The data from the department of motor vehicles master driver license file

| Traffic accidents | 0 | 1 | 2 | ≥3 |
|---|---|---|---|---|
| Number of drivers | 4499 | 766 | 136 | 21 |

From the data we see that there is an excess number of zero counts and that the frequency of X greater than or equal to 3 is 21. Generally such data are modeled by the Poisson distribution, but the Poisson distribution does not fit these data well. We therefore fit the zero-inflated Poisson distribution (ZIPD), which has two parameters $\pi$ and $\theta$. In this problem $n_0=4499$, $n=5422$, and the estimated values are $\hat\pi=0.5583$ and $\hat\theta=0.3637$. Using these values we fit the ZIPD to the above data. The calculated Chi-square value is 0.4043, the table value of $\chi^2_{(1,\,0.05)}$ is 3.841459 and the P-value is 0.5249.
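The goodness-of-fit computation for the ZIPD can be retraced with a short Python sketch that plugs the reported estimates into the ZIPD probabilities and pools the last cell as "3 or more". The cell grouping and the use of the rounded estimates are assumptions of this sketch, so the result should be close to, but not necessarily identical with, the reported value of 0.4043.

```python
from math import exp, sqrt
from statistics import NormalDist

observed = [4499, 766, 136, 21]        # drivers with 0, 1, 2 and 3-or-more accidents
pi_hat, theta_hat = 0.5583, 0.3637     # estimates reported in the text
n = sum(observed)

# ZIPD cell probabilities; the last cell collects everything from 3 upwards
p0 = 1.0 - pi_hat + pi_hat * exp(-theta_hat)
p1 = pi_hat * theta_hat * exp(-theta_hat)
p2 = p1 * theta_hat / 2.0
probs = [p0, p1, p2, 1.0 - p0 - p1 - p2]

expected = [n * p for p in probs]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With 4 cells and 2 estimated parameters there is 1 degree of freedom;
# a chi-square variate with 1 df is the square of a standard normal variate.
p_value = 2.0 * (1.0 - NormalDist().cdf(sqrt(chi2)))
```

The statistic is far below the 5% critical value 3.841, in line with the conclusion in the text that the ZIPD fits these data.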

The same data are fitted to the ZITPD truncated at 4 and above. The parameter estimates are $\hat\pi=0.5582$ and $\hat\theta=0.3646$. The calculated Chi-square value is 0.4018, the table value of $\chi^2_{(1,\,0.05)}$ is 3.8415 and the P-value is 0.5262. If the same data are fitted to the ZITPD truncated at 5 and above, the parameter estimates are $\hat\pi=0.5583$ and $\hat\theta=0.3638$, the calculated Chi-square value is 0.4012, the table value of $\chi^2_{(1,\,0.05)}$ is 3.8415 and the P-value is 0.5265. Here we prefer the ZITPD to model the data.

Acknowledgments

None.

Conflicts of interest

Author declares that there are no conflicts of interest.

References

©2016 Patil. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.