eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 11 Issue 3

The Poisson-Adya distribution 

Rama Shanker,1 Kamlesh Kumar Shukla2

1Department of Statistics, Assam University, Silchar, Assam, India
2Department of Mathematics, Noida International University, Gautam Buddh Nagar, India

Correspondence: Rama Shanker, Department of Statistics, Assam University, Silchar, Assam, India

Received: July 20, 2022 | Published: August 17, 2022

Citation: Shanker R, Shukla KK. The Poisson-Adya distribution. Biom Biostat Int J. 2022;11(3):100-103. DOI: 10.15406/bbij.2022.11.00361


Abstract

In this paper, a Poisson mixture of the Adya distribution, called the Poisson-Adya distribution, is proposed. Expressions for its statistical constants, including the coefficients of variation, skewness and kurtosis and the index of dispersion, are obtained, and their behavior for varying values of the parameter is studied. The distribution is shown to be unimodal and over-dispersed, with an increasing hazard rate. Maximum likelihood estimation and the method of moments are discussed for estimating the parameter. Finally, the goodness of fit of the proposed distribution is compared with that of the Poisson and Poisson-Lindley distributions.

Keywords: Adya distribution, compounding, unimodality, over-dispersion, estimation, goodness of fit

Introduction

The Poisson distribution is suitable for data exhibiting equi-dispersion (mean equal to variance). In real-life situations, however, most datasets, being stochastic in nature, are either over-dispersed (variance greater than mean) or under-dispersed (variance less than mean). In recent decades, researchers have derived over-dispersed one-parameter discrete distributions by compounding the Poisson distribution with one-parameter continuous lifetime distributions. A popular one-parameter discrete distribution for over-dispersed data is the Poisson-Lindley distribution (PLD) proposed by Sankaran1. PLD is a Poisson mixture of the Lindley distribution introduced by Lindley2. It has further been observed that these one-parameter discrete distributions are not suitable for some over-dispersed datasets from the biological sciences because of their levels of over-dispersion. Shanker & Hagos3 give a detailed discussion of applications of PLD to data arising from the biological sciences, which are, in general, over-dispersed. Shanker & Hagos3 observed that PLD does not give a good fit for some biological science data, and hence another over-dispersed discrete distribution is needed.

Shanker, et al4 proposed a one-parameter continuous lifetime distribution named the Adya distribution, defined by its probability density function (pdf) and cumulative distribution function (cdf)

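$$f(x;\theta)=\frac{\theta^{3}}{\theta^{4}+2\theta^{2}+2}\,(\theta+x)^{2}\,e^{-\theta x};\quad x>0,\ \theta>0 \tag{1.1}$$

$$F(x;\theta)=1-\left[1+\frac{\theta x\left(\theta x+2\theta^{2}+2\right)}{\theta^{4}+2\theta^{2}+2}\right]e^{-\theta x};\quad x>0,\ \theta>0 \tag{1.2}$$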

Shanker, et al4 derived the Adya distribution as a convex combination of exponential $(\theta)$, gamma $(2,\theta)$ and gamma $(3,\theta)$ distributions with respective proportions $\frac{\theta^{4}}{\theta^{4}+2\theta^{2}+2}$, $\frac{2\theta^{2}}{\theta^{4}+2\theta^{2}+2}$ and $\frac{2}{\theta^{4}+2\theta^{2}+2}$, which sum to one and reproduce the pdf (1.1) when $(\theta+x)^{2}$ is expanded. Its statistical properties, including moments and moment-based measures, hazard rate function, mean residual life function, stochastic ordering, deviations from the mean and the median, Bonferroni and Lorenz curves, stress-strength reliability, estimation of the parameter and applications, are available in Shanker et al4.
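The mixture representation can be checked numerically. The following is a minimal Python sketch, not code from the paper: the function names and the value $\theta=1.5$ are illustrative assumptions. It evaluates the pdf (1.1), the cdf (1.2) and the three-component mixture with weights $\theta^{4}/D$, $2\theta^{2}/D$ and $2/D$, where $D=\theta^{4}+2\theta^{2}+2$.

```python
import math


def adya_pdf(x, theta):
    """Adya density, equation (1.1)."""
    d = theta**4 + 2 * theta**2 + 2
    return theta**3 / d * (theta + x) ** 2 * math.exp(-theta * x)


def adya_cdf(x, theta):
    """Adya distribution function, equation (1.2)."""
    d = theta**4 + 2 * theta**2 + 2
    bracket = 1 + theta * x * (theta * x + 2 * theta**2 + 2) / d
    return 1 - bracket * math.exp(-theta * x)


def gamma_pdf(x, shape, rate):
    """Gamma density with integer shape and rate parameterisation."""
    return rate**shape * x ** (shape - 1) * math.exp(-rate * x) / math.factorial(shape - 1)


def adya_mixture_pdf(x, theta):
    """Convex combination of exponential(theta), gamma(2, theta) and gamma(3, theta)."""
    d = theta**4 + 2 * theta**2 + 2
    weights = (theta**4 / d, 2 * theta**2 / d, 2 / d)
    return sum(w * gamma_pdf(x, k, theta) for k, w in zip((1, 2, 3), weights))


theta = 1.5  # arbitrary illustrative value
for x in (0.1, 1.0, 4.0):
    print(x, adya_pdf(x, theta), adya_mixture_pdf(x, theta), adya_cdf(x, theta))
```

The first two printed columns should agree to rounding error, and a central difference of `adya_cdf` should reproduce `adya_pdf`.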

In the present paper, a Poisson mixture of the Adya distribution is derived and its statistical constants, including the coefficients of variation, skewness and kurtosis and the index of dispersion, are studied. The unimodality, increasing hazard rate and over-dispersion of the distribution are established. Estimation of the parameter using the method of moments and maximum likelihood is discussed. Applications, goodness of fit and a comparison with other one-parameter discrete distributions are presented.

Poisson-Adya distribution

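Let $X$ follow a Poisson distribution with parameter $\lambda>0$ and pmf

$$P(X=x\mid\lambda)=\frac{e^{-\lambda}\lambda^{x}}{x!};\quad x=0,1,2,\ldots$$

Now suppose that the parameter $\lambda$ follows the Adya distribution with parameter $\theta$ and pdf

$$f(\lambda\mid\theta)=\frac{\theta^{3}}{\theta^{4}+2\theta^{2}+2}\,(\theta+\lambda)^{2}\,e^{-\theta\lambda};\quad \lambda>0,\ \theta>0$$

Thus, the marginal pmf of $X$ can be obtained as

$$P(X=x)=\int_{0}^{\infty}P(X=x\mid\lambda)\,f(\lambda\mid\theta)\,d\lambda=\int_{0}^{\infty}\frac{e^{-\lambda}\lambda^{x}}{x!}\cdot\frac{\theta^{3}}{\theta^{4}+2\theta^{2}+2}\,(\theta+\lambda)^{2}\,e^{-\theta\lambda}\,d\lambda \tag{2.1}$$

$$=\frac{\theta^{3}}{\left(\theta^{4}+2\theta^{2}+2\right)x!}\int_{0}^{\infty}e^{-(\theta+1)\lambda}\,\lambda^{x}\left(\theta^{2}+2\theta\lambda+\lambda^{2}\right)d\lambda$$

$$=\frac{\theta^{3}}{\theta^{4}+2\theta^{2}+2}\cdot\frac{x^{2}+\left(2\theta^{2}+2\theta+3\right)x+\left(\theta^{4}+2\theta^{3}+3\theta^{2}+2\theta+2\right)}{(\theta+1)^{x+3}};\quad x=0,1,2,\ldots,\ \theta>0 \tag{2.2}$$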

We call this distribution the Poisson-Adya distribution (PAD). The subsequent sections show that PAD is unimodal and over-dispersed and has an increasing hazard rate. Figure 1 shows the pmf of PAD for varying values of the parameter $\theta$. As $\theta$ increases, the distribution becomes more positively skewed while remaining over-dispersed (Figure 1).

Figure 1 pmf of PAD for varying values of parameter.
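As a numerical sanity check on (2.2), the following minimal Python sketch evaluates the pmf and confirms that the probabilities sum to one. The names `pad_pmf` and `pad_pmf_termwise` and the chosen values of $\theta$ are illustrative assumptions, not part of the paper. The second function evaluates the three gamma integrals in the line before (2.2) term by term, using $\int_{0}^{\infty}e^{-(\theta+1)\lambda}\lambda^{x+j}\,d\lambda=(x+j)!/(\theta+1)^{x+j+1}$.

```python
import math


def pad_pmf(x, theta):
    """Poisson-Adya pmf, equation (2.2)."""
    d = theta**4 + 2 * theta**2 + 2
    quad = (x**2 + (2 * theta**2 + 2 * theta + 3) * x
            + (theta**4 + 2 * theta**3 + 3 * theta**2 + 2 * theta + 2))
    # (theta + 1)^-(x + 3), computed via exp/log so that large x cannot overflow
    return theta**3 / d * quad * math.exp(-(x + 3) * math.log1p(theta))


def pad_pmf_termwise(x, theta):
    """Same pmf written as the three integrated terms before simplification."""
    d = theta**4 + 2 * theta**2 + 2

    def inv_pow(k):
        return math.exp(-k * math.log1p(theta))  # (theta + 1)^-k

    terms = (theta**2 * inv_pow(x + 1)
             + 2 * theta * (x + 1) * inv_pow(x + 2)
             + (x + 1) * (x + 2) * inv_pow(x + 3))
    return theta**3 / d * terms


for theta in (0.5, 1.0, 2.5):
    total = sum(pad_pmf(x, theta) for x in range(1000))
    print(theta, total, pad_pmf(3, theta), pad_pmf_termwise(3, theta))
```

For each $\theta$ the printed total should be 1 up to rounding, and the two evaluations at $x=3$ should coincide.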

Statistical constants

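Using (2.1), the $r$th factorial moment about the origin, $\mu'_{(r)}$, of PAD can be obtained as

$$\mu'_{(r)}=E\left[E\left(X^{(r)}\mid\lambda\right)\right]=\frac{\theta^{3}}{\theta^{4}+2\theta^{2}+2}\int_{0}^{\infty}\left[\sum_{x=0}^{\infty}x^{(r)}\,\frac{e^{-\lambda}\lambda^{x}}{x!}\right](\theta+\lambda)^{2}\,e^{-\theta\lambda}\,d\lambda$$

$$=\frac{\theta^{3}}{\theta^{4}+2\theta^{2}+2}\int_{0}^{\infty}\lambda^{r}\left[\sum_{x=r}^{\infty}\frac{e^{-\lambda}\lambda^{x-r}}{(x-r)!}\right]\left(\theta^{2}+2\theta\lambda+\lambda^{2}\right)e^{-\theta\lambda}\,d\lambda$$

$$=\frac{\theta^{3}}{\theta^{4}+2\theta^{2}+2}\int_{0}^{\infty}\lambda^{r}\left(\theta^{2}+2\theta\lambda+\lambda^{2}\right)e^{-\theta\lambda}\,d\lambda$$

$$=\frac{r!\left\{\theta^{4}+2(r+1)\theta^{2}+(r+1)(r+2)\right\}}{\theta^{r}\left(\theta^{4}+2\theta^{2}+2\right)};\quad r=1,2,3,\ldots$$

where $x^{(r)}=x(x-1)\cdots(x-r+1)$.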

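Substituting $r=1,2,3$ and $4$, the first four factorial moments about the origin of PAD are obtained as

$$\mu'_{(1)}=\frac{\theta^{4}+4\theta^{2}+6}{\theta\left(\theta^{4}+2\theta^{2}+2\right)},\qquad \mu'_{(2)}=\frac{2\left(\theta^{4}+6\theta^{2}+12\right)}{\theta^{2}\left(\theta^{4}+2\theta^{2}+2\right)}$$

$$\mu'_{(3)}=\frac{6\left(\theta^{4}+8\theta^{2}+20\right)}{\theta^{3}\left(\theta^{4}+2\theta^{2}+2\right)},\qquad \mu'_{(4)}=\frac{24\left(\theta^{4}+10\theta^{2}+30\right)}{\theta^{4}\left(\theta^{4}+2\theta^{2}+2\right)}$$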
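The closed form for the factorial moments can be compared with a direct summation over the pmf (2.2). The sketch below is illustrative only: the function names, the truncation point `x_max` and the value $\theta=1.2$ are assumptions made for the example.

```python
import math


def pad_pmf(x, theta):
    """Poisson-Adya pmf, equation (2.2)."""
    d = theta**4 + 2 * theta**2 + 2
    quad = (x**2 + (2 * theta**2 + 2 * theta + 3) * x
            + (theta**4 + 2 * theta**3 + 3 * theta**2 + 2 * theta + 2))
    return theta**3 / d * quad * math.exp(-(x + 3) * math.log1p(theta))


def factorial_moment(r, theta):
    """Closed form r! {theta^4 + 2(r+1) theta^2 + (r+1)(r+2)} / (theta^r D)."""
    d = theta**4 + 2 * theta**2 + 2
    return (math.factorial(r) * (theta**4 + 2 * (r + 1) * theta**2 + (r + 1) * (r + 2))
            / (theta**r * d))


def falling_factorial(x, r):
    """x (x - 1) ... (x - r + 1)."""
    out = 1
    for j in range(r):
        out *= x - j
    return out


def factorial_moment_numeric(r, theta, x_max=1000):
    """Truncated sum of x^(r) P(X = x); x_max is an assumed cut-off."""
    return sum(falling_factorial(x, r) * pad_pmf(x, theta) for x in range(x_max))


theta = 1.2  # arbitrary illustrative value
for r in (1, 2, 3, 4):
    print(r, factorial_moment(r, theta), factorial_moment_numeric(r, theta))
```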

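The relationship between moments about the origin and factorial moments about the origin gives the following four moments about the origin:

$$\mu_{1}'=\frac{\theta^{4}+4\theta^{2}+6}{\theta\left(\theta^{4}+2\theta^{2}+2\right)}$$

$$\mu_{2}'=\frac{\theta^{5}+2\theta^{4}+4\theta^{3}+12\theta^{2}+6\theta+24}{\theta^{2}\left(\theta^{4}+2\theta^{2}+2\right)}$$

$$\mu_{3}'=\frac{\theta^{6}+6\theta^{5}+10\theta^{4}+36\theta^{3}+54\theta^{2}+72\theta+120}{\theta^{3}\left(\theta^{4}+2\theta^{2}+2\right)}$$

$$\mu_{4}'=\frac{\theta^{7}+14\theta^{6}+40\theta^{5}+108\theta^{4}+294\theta^{3}+408\theta^{2}+720\theta+720}{\theta^{4}\left(\theta^{4}+2\theta^{2}+2\right)}$$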
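The paper does not spell out the relationship it uses here. For reference, the standard identities linking raw moments to factorial moments (with Stirling numbers of the second kind as coefficients) are written out below, together with the second moment as a worked instance.

```latex
\begin{aligned}
\mu_1' &= \mu'_{(1)}, \\
\mu_2' &= \mu'_{(2)} + \mu'_{(1)}, \\
\mu_3' &= \mu'_{(3)} + 3\mu'_{(2)} + \mu'_{(1)}, \\
\mu_4' &= \mu'_{(4)} + 6\mu'_{(3)} + 7\mu'_{(2)} + \mu'_{(1)}. \\[6pt]
\mu_2' &= \frac{2(\theta^4+6\theta^2+12) + \theta(\theta^4+4\theta^2+6)}
               {\theta^2(\theta^4+2\theta^2+2)}
        = \frac{\theta^5+2\theta^4+4\theta^3+12\theta^2+6\theta+24}
               {\theta^2(\theta^4+2\theta^2+2)}.
\end{aligned}
```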

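Using the relationship between moments about the mean and moments about the origin, the moments about the mean are obtained as

$$\mu_{2}=\frac{\theta^{9}+\theta^{8}+6\theta^{7}+8\theta^{6}+16\theta^{5}+24\theta^{4}+20\theta^{3}+24\theta^{2}+12\theta+12}{\theta^{2}\left(\theta^{4}+2\theta^{2}+2\right)^{2}}$$

$$\mu_{3}=\frac{\theta^{14}+3\theta^{13}+10\theta^{12}+30\theta^{11}+54\theta^{10}+126\theta^{9}+172\theta^{8}+264\theta^{7}+284\theta^{6}+324\theta^{5}+280\theta^{4}+216\theta^{3}+168\theta^{2}+72\theta+48}{\theta^{3}\left(\theta^{4}+2\theta^{2}+2\right)^{3}}$$

$$\mu_{4}=\frac{\left(\begin{aligned}&\theta^{19}+10\theta^{18}+28\theta^{17}+129\theta^{16}+300\theta^{15}+796\theta^{14}+1628\theta^{13}+2952\theta^{12}+4952\theta^{11}+6968\theta^{10}\\&+9624\theta^{9}+11048\theta^{8}+12368\theta^{7}+11952\theta^{6}+10544\theta^{5}+8544\theta^{4}+5520\theta^{3}+3648\theta^{2}+1440\theta+720\end{aligned}\right)}{\theta^{4}\left(\theta^{4}+2\theta^{2}+2\right)^{4}}$$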
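For completeness, the standard relations between central and raw moments are listed below, followed by the expansion that produces the numerator of $\mu_2$. This is a worked check using only the expressions already given, with $D=\theta^{4}+2\theta^{2}+2$ introduced here as shorthand.

```latex
\begin{aligned}
\mu_2 &= \mu_2' - (\mu_1')^2, \\
\mu_3 &= \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3, \\
\mu_4 &= \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4. \\[6pt]
\theta^2 D^2\,\mu_2
  &= (\theta^5+2\theta^4+4\theta^3+12\theta^2+6\theta+24)\,D - (\theta^4+4\theta^2+6)^2 \\
  &= (\theta^9+2\theta^8+6\theta^7+16\theta^6+16\theta^5+52\theta^4+20\theta^3+72\theta^2+12\theta+48) \\
  &\quad - (\theta^8+8\theta^6+28\theta^4+48\theta^2+36) \\
  &= \theta^9+\theta^8+6\theta^7+8\theta^6+16\theta^5+24\theta^4+20\theta^3+24\theta^2+12\theta+12.
\end{aligned}
```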

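Now, the descriptive measures of PAD, including the coefficient of variation (C.V), skewness $\left(\sqrt{\beta_{1}}\right)$, kurtosis $(\beta_{2})$ and index of dispersion $(\gamma)$, are obtained as

$$C.V=\frac{\sigma}{\mu_{1}'}=\frac{\sqrt{\theta^{9}+\theta^{8}+6\theta^{7}+8\theta^{6}+16\theta^{5}+24\theta^{4}+20\theta^{3}+24\theta^{2}+12\theta+12}}{\theta^{4}+4\theta^{2}+6}$$

$$\sqrt{\beta_{1}}=\frac{\mu_{3}}{\left(\mu_{2}\right)^{3/2}}=\frac{\theta^{14}+3\theta^{13}+10\theta^{12}+30\theta^{11}+54\theta^{10}+126\theta^{9}+172\theta^{8}+264\theta^{7}+284\theta^{6}+324\theta^{5}+280\theta^{4}+216\theta^{3}+168\theta^{2}+72\theta+48}{\left(\theta^{9}+\theta^{8}+6\theta^{7}+8\theta^{6}+16\theta^{5}+24\theta^{4}+20\theta^{3}+24\theta^{2}+12\theta+12\right)^{3/2}}$$

$$\beta_{2}=\frac{\mu_{4}}{\mu_{2}^{2}}=\frac{\left(\begin{aligned}&\theta^{19}+10\theta^{18}+28\theta^{17}+129\theta^{16}+300\theta^{15}+796\theta^{14}+1628\theta^{13}+2952\theta^{12}+4952\theta^{11}+6968\theta^{10}\\&+9624\theta^{9}+11048\theta^{8}+12368\theta^{7}+11952\theta^{6}+10544\theta^{5}+8544\theta^{4}+5520\theta^{3}+3648\theta^{2}+1440\theta+720\end{aligned}\right)}{\left(\theta^{9}+\theta^{8}+6\theta^{7}+8\theta^{6}+16\theta^{5}+24\theta^{4}+20\theta^{3}+24\theta^{2}+12\theta+12\right)^{2}}$$

$$\gamma=\frac{\sigma^{2}}{\mu_{1}'}=\frac{\theta^{9}+\theta^{8}+6\theta^{7}+8\theta^{6}+16\theta^{5}+24\theta^{4}+20\theta^{3}+24\theta^{2}+12\theta+12}{\theta\left(\theta^{4}+2\theta^{2}+2\right)\left(\theta^{4}+4\theta^{2}+6\right)}$$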

Figure 2 shows the behavior of the coefficient of variation, skewness, kurtosis and index of dispersion of PAD for varying values of the parameter $\theta$. The coefficient of variation, skewness and kurtosis increase with $\theta$. The index of dispersion, by the expression for $\gamma$ above, is large for small $\theta$ (approximately $1/\theta$) and approaches 1 as $\theta$ increases (Figure 2).

Figure 2 Coefficients of variation, skewness, kurtosis and index of dispersion for varying values of parameter.
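The four descriptive measures can also be computed directly from the pmf, without using the long closed-form polynomials. The following minimal sketch does so by truncated summation; the function names, the cut-off `x_max` and the grid of $\theta$ values are assumptions made for illustration.

```python
import math


def pad_pmf(x, theta):
    """Poisson-Adya pmf, equation (2.2)."""
    d = theta**4 + 2 * theta**2 + 2
    quad = (x**2 + (2 * theta**2 + 2 * theta + 3) * x
            + (theta**4 + 2 * theta**3 + 3 * theta**2 + 2 * theta + 2))
    return theta**3 / d * quad * math.exp(-(x + 3) * math.log1p(theta))


def pad_measures(theta, x_max=2000):
    """C.V, skewness, kurtosis and index of dispersion by direct summation."""
    probs = [pad_pmf(x, theta) for x in range(x_max)]
    mean = sum(x * p for x, p in enumerate(probs))
    # central moments of order 2, 3 and 4
    mu2, mu3, mu4 = (sum((x - mean) ** k * p for x, p in enumerate(probs))
                     for k in (2, 3, 4))
    return {
        "cv": math.sqrt(mu2) / mean,
        "skewness": mu3 / mu2**1.5,
        "kurtosis": mu4 / mu2**2,
        "index_of_dispersion": mu2 / mean,
    }


for theta in (0.5, 1.0, 2.0, 3.0):
    print(theta, pad_measures(theta))
```

At $\theta=1$ the closed forms reduce to $C.V=\sqrt{124}/11$, $\sqrt{\beta_1}=2052/124^{3/2}$, $\beta_2=93172/124^{2}$ and $\gamma=124/55$, which gives a convenient spot check of the summation.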

Statistical properties

Over-dispersion

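We have

$$\mu_{2}=\frac{\theta^{9}+\theta^{8}+6\theta^{7}+8\theta^{6}+16\theta^{5}+24\theta^{4}+20\theta^{3}+24\theta^{2}+12\theta+12}{\theta^{2}\left(\theta^{4}+2\theta^{2}+2\right)^{2}}$$

$$=\frac{\theta^{4}+4\theta^{2}+6}{\theta\left(\theta^{4}+2\theta^{2}+2\right)}\left[\frac{\theta^{9}+\theta^{8}+6\theta^{7}+8\theta^{6}+16\theta^{5}+24\theta^{4}+20\theta^{3}+24\theta^{2}+12\theta+12}{\theta\left(\theta^{4}+2\theta^{2}+2\right)\left(\theta^{4}+4\theta^{2}+6\right)}\right]$$

$$=\frac{\theta^{4}+4\theta^{2}+6}{\theta\left(\theta^{4}+2\theta^{2}+2\right)}\left[1+\frac{\theta^{8}+8\theta^{6}+24\theta^{4}+24\theta^{2}+12}{\theta\left(\theta^{4}+2\theta^{2}+2\right)\left(\theta^{4}+4\theta^{2}+6\right)}\right]$$

$$=\mu_{1}'\left[1+\frac{\theta^{8}+8\theta^{6}+24\theta^{4}+24\theta^{2}+12}{\theta\left(\theta^{4}+2\theta^{2}+2\right)\left(\theta^{4}+4\theta^{2}+6\right)}\right]$$

Since the fraction inside the brackets is positive for every $\theta>0$, this shows that $\mu_{2}>\mu_{1}'$, and thus PAD is always over-dispersed. PAD can therefore be used for discrete datasets that are over-dispersed.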
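The factorisation above can be confirmed numerically. In this short sketch (function names and the grid of $\theta$ values are illustrative assumptions) the variance is evaluated from its closed form and compared with $\mu_1'$ times one plus the excess term.

```python
def pad_mean(theta):
    """Mean of PAD."""
    d = theta**4 + 2 * theta**2 + 2
    return (theta**4 + 4 * theta**2 + 6) / (theta * d)


def pad_variance(theta):
    """Variance of PAD from its closed form."""
    d = theta**4 + 2 * theta**2 + 2
    num = (theta**9 + theta**8 + 6 * theta**7 + 8 * theta**6 + 16 * theta**5
           + 24 * theta**4 + 20 * theta**3 + 24 * theta**2 + 12 * theta + 12)
    return num / (theta**2 * d**2)


def excess_over_mean(theta):
    """Positive term added to 1 in the over-dispersion identity."""
    d = theta**4 + 2 * theta**2 + 2
    num = theta**8 + 8 * theta**6 + 24 * theta**4 + 24 * theta**2 + 12
    return num / (theta * d * (theta**4 + 4 * theta**2 + 6))


for theta in (0.1, 0.5, 1.0, 2.0, 5.0, 20.0):
    mean, var = pad_mean(theta), pad_variance(theta)
    print(theta, mean, var, mean * (1 + excess_over_mean(theta)))
```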

Increasing Hazard Rate and Unimodality

It can easily be shown that PAD has an increasing hazard rate (IHR) and is unimodal. Since

$$\frac{P(x+1;\theta)}{P(x;\theta)}=\frac{1}{\theta+1}\left[1+\frac{2\left\{x+\left(\theta^{2}+\theta+2\right)\right\}}{x^{2}+\left(2\theta^{2}+2\theta+3\right)x+\left(\theta^{4}+2\theta^{3}+3\theta^{2}+2\theta+2\right)}\right]$$

is a decreasing function of $x$ for a given $\theta$, $P(x;\theta)$ is log-concave. This implies that PAD has an increasing hazard rate and is unimodal. Grandell5 gives a detailed discussion of the relationship between log-concavity, IHR and unimodality of discrete distributions.
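Both properties can be illustrated numerically. The sketch below evaluates the ratio $P(x+1;\theta)/P(x;\theta)$ from its closed form and the discrete hazard rate, taken here as $h(x)=P(X=x)/P(X\ge x)$, which is an assumed convention since the paper does not write the hazard out. Function names, the cut-off `x_max` and the values of $\theta$ are illustrative.

```python
import math


def pad_pmf(x, theta):
    """Poisson-Adya pmf, equation (2.2)."""
    d = theta**4 + 2 * theta**2 + 2
    quad = (x**2 + (2 * theta**2 + 2 * theta + 3) * x
            + (theta**4 + 2 * theta**3 + 3 * theta**2 + 2 * theta + 2))
    return theta**3 / d * quad * math.exp(-(x + 3) * math.log1p(theta))


def pad_ratio(x, theta):
    """Closed form of P(x + 1) / P(x)."""
    quad = (x**2 + (2 * theta**2 + 2 * theta + 3) * x
            + (theta**4 + 2 * theta**3 + 3 * theta**2 + 2 * theta + 2))
    return (1 + 2 * (x + theta**2 + theta + 2) / quad) / (theta + 1)


def pad_hazard(theta, x_count=30, x_max=2000):
    """Hazard h(x) = P(X = x) / P(X >= x) for x = 0, ..., x_count - 1."""
    probs = [pad_pmf(x, theta) for x in range(x_max)]
    survival = [0.0] * x_max
    tail = 0.0
    for x in reversed(range(x_max)):  # accumulate the tail from the far end
        tail += probs[x]
        survival[x] = tail
    return [probs[x] / survival[x] for x in range(x_count)]


for theta in (0.5, 2.0):
    ratios = [pad_ratio(x, theta) for x in range(10)]
    print(theta, [round(r, 4) for r in ratios])
    print(theta, [round(h, 4) for h in pad_hazard(theta, x_count=10)])
```

The printed ratios should fall with $x$ and the hazards should rise with $x$, in line with log-concavity.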

Parameter estimation

Method of moment estimate

Let $x_{1},x_{2},\ldots,x_{n}$ be a random sample of size $n$ from PAD. Equating the first moment about the origin to the corresponding sample moment, the method of moments estimate (MOME) $\tilde{\theta}$ of $\theta$ is the solution of the following fifth-degree polynomial equation

$$\bar{x}\theta^{5}-\theta^{4}+2\bar{x}\theta^{3}-4\theta^{2}+2\bar{x}\theta-6=0,$$

where $\bar{x}$ is the sample mean.

This equation can be solved using the Newton-Raphson method to obtain the estimate of the parameter.
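A minimal Newton-Raphson sketch for the moment equation is given below. It is not the authors' code: the function names, the starting rule (doubling $\theta$ from 1 until the polynomial is positive, so that the iteration starts to the right of the root), the tolerance and the iteration cap are all assumptions made for illustration. The example uses $\bar{x}=47/60$, the sample mean of the data in Table 1.

```python
def pad_mean(theta):
    """Mean of PAD, used to check the solution."""
    d = theta**4 + 2 * theta**2 + 2
    return (theta**4 + 4 * theta**2 + 6) / (theta * d)


def pad_mome(xbar, tol=1e-12, max_iter=100):
    """Solve xbar*t^5 - t^4 + 2*xbar*t^3 - 4*t^2 + 2*xbar*t - 6 = 0 for t > 0."""

    def g(t):
        return xbar * t**5 - t**4 + 2 * xbar * t**3 - 4 * t**2 + 2 * xbar * t - 6

    def dg(t):
        return 5 * xbar * t**4 - 4 * t**3 + 6 * xbar * t**2 - 8 * t + 2 * xbar

    theta = 1.0
    while g(theta) <= 0:  # assumed starting rule: begin where g is positive
        theta *= 2
    for _ in range(max_iter):
        step = g(theta) / dg(theta)
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")


xbar = 47 / 60  # sample mean of the Table 1 data
theta_tilde = pad_mome(xbar)
print(theta_tilde, pad_mean(theta_tilde))
```

The second printed value should reproduce $\bar{x}$, confirming that the root satisfies the moment equation.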

Maximum Likelihood Estimate

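Let $x_{1},x_{2},\ldots,x_{n}$ be a random sample of size $n$ from PAD and let $f_{x}$ be the observed frequency in the sample corresponding to $X=x$ $(x=0,1,2,\ldots,k)$ such that $\sum_{x=0}^{k}f_{x}=n$, where $k$ is the largest observed value having non-zero frequency. The likelihood function $L$ of PAD is given by

$$L=\left(\frac{\theta^{3}}{\theta^{4}+2\theta^{2}+2}\right)^{n}\frac{1}{(\theta+1)^{\sum_{x=0}^{k}f_{x}(x+3)}}\prod_{x=0}^{k}\left[x^{2}+\left(2\theta^{2}+2\theta+3\right)x+\left(\theta^{4}+2\theta^{3}+3\theta^{2}+2\theta+2\right)\right]^{f_{x}}$$

The log-likelihood function is obtained as

$$\log L=n\log\left(\frac{\theta^{3}}{\theta^{4}+2\theta^{2}+2}\right)-\sum_{x=0}^{k}f_{x}(x+3)\log(\theta+1)+\sum_{x=0}^{k}f_{x}\log\left[x^{2}+\left(2\theta^{2}+2\theta+3\right)x+\left(\theta^{4}+2\theta^{3}+3\theta^{2}+2\theta+2\right)\right]$$

The first derivative of the log-likelihood function is given by

$$\frac{d\log L}{d\theta}=\frac{3n}{\theta}-\frac{4n\left(\theta^{3}+\theta\right)}{\theta^{4}+2\theta^{2}+2}-\frac{n(\bar{x}+3)}{\theta+1}+\sum_{x=0}^{k}\frac{\left[(4\theta+2)x+\left(4\theta^{3}+6\theta^{2}+6\theta+2\right)\right]f_{x}}{x^{2}+\left(2\theta^{2}+2\theta+3\right)x+\left(\theta^{4}+2\theta^{3}+3\theta^{2}+2\theta+2\right)},$$

where $\bar{x}$ is the sample mean.

The maximum likelihood estimate (MLE) $\hat{\theta}$ of $\theta$ is the solution of the equation $\frac{d\log L}{d\theta}=0$, that is, of the non-linear equation

$$\frac{3n}{\theta}-\frac{4n\left(\theta^{3}+\theta\right)}{\theta^{4}+2\theta^{2}+2}-\frac{n(\bar{x}+3)}{\theta+1}+\sum_{x=0}^{k}\frac{\left[(4\theta+2)x+\left(4\theta^{3}+6\theta^{2}+6\theta+2\right)\right]f_{x}}{x^{2}+\left(2\theta^{2}+2\theta+3\right)x+\left(\theta^{4}+2\theta^{3}+3\theta^{2}+2\theta+2\right)}=0$$

Since this log-likelihood equation has no closed-form solution, it cannot be solved directly. The MLE of the parameter $\theta$ can instead be computed iteratively by solving the log-likelihood equation using the Newton-Raphson iteration available in R software, until successive values of $\theta$ are sufficiently close. The method of moments estimate can be taken as the initial value of $\theta$.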
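The score equation can be solved with any one-dimensional root finder. The sketch below is an illustration, not the authors' R code: it swaps in plain bisection for Newton-Raphson because bisection needs no starting value, and it assumes the score changes sign exactly once on the bracket `[lo, hi]`. The function names, the bracket and the tolerance are assumptions. The score uses the pmf (2.2), so its first term is $3n/\theta$, and the sums run over all observed values including $x=0$. The data are the observed frequencies of Table 1.

```python
def pad_score(theta, freq):
    """Derivative of the PAD log-likelihood; freq maps each value x to its frequency."""
    n = sum(freq.values())
    total = sum(x * f for x, f in freq.items())  # equals n * xbar
    d = theta**4 + 2 * theta**2 + 2
    score = 3 * n / theta - 4 * n * (theta**3 + theta) / d - (total + 3 * n) / (theta + 1)
    for x, f in freq.items():
        quad = (x**2 + (2 * theta**2 + 2 * theta + 3) * x
                + (theta**4 + 2 * theta**3 + 3 * theta**2 + 2 * theta + 2))
        dquad = (4 * theta + 2) * x + (4 * theta**3 + 6 * theta**2 + 6 * theta + 2)
        score += f * dquad / quad
    return score


def pad_mle(freq, lo=0.01, hi=100.0, tol=1e-10):
    """Bisection on the score; assumes a single sign change on [lo, hi]."""
    if pad_score(lo, freq) <= 0 or pad_score(hi, freq) >= 0:
        raise ValueError("score does not change sign on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pad_score(mid, freq) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Observed frequencies from Table 1 (mistakes in copying groups of random digits)
table1 = {0: 35, 1: 11, 2: 8, 3: 4, 4: 2}
theta_hat = pad_mle(table1)
print(theta_hat)
```

The result can be compared with the PAD estimate reported in Table 1.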

Applications

In this section, applications of PAD are discussed for three over-dispersed count datasets. The goodness of fit of PAD is compared with that of the Poisson distribution (PD) and PLD. The pmf of PLD is given by

$$P(x;\theta)=\frac{\theta^{2}(x+\theta+2)}{(\theta+1)^{x+3}};\quad x=0,1,2,\ldots,\ \theta>0$$

The expected frequencies under PD, PLD and PAD are given in Tables 1-3 for ready comparison. In all three tables PAD has the smallest $\chi^{2}$ value and the largest p-value, indicating that PAD provides a better fit than PD and PLD (Tables 1-3).

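| No. of errors per group | Observed frequency | Expected frequency (PD) | Expected frequency (PLD) | Expected frequency (PAD) |
|---|---|---|---|---|
| 0 | 35 | 27.4 | 33 | 33.1 |
| 1 | 11 | 21.5 | 15.3 | 15.2 |
| 2 | 8 | 8.4 | 6.8 | 6.7 |
| 3 | 4 | 2.2 | 2.9 | 2.8 |
| 4 | 2 | 0.5 | 2.0 | 2.9 |
| Total | 60 | 60 | 60 | 60 |
| ML estimate | | $\hat{\theta}=0.7833$ | $\hat{\theta}=1.7434$ | $\hat{\theta}=1.9141$ |
| $\chi^{2}$ | | 7.98 | 2.20 | 1.72 |
| d.f. | | 1 | 1 | 2 |
| p-value | | 0.0047 | 0.1380 | 0.4232 |

Table 1 Distribution of mistakes in copying groups of random digits, available in Kemp and Kemp6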
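The goodness-of-fit computation behind the PAD column of Table 1 can be sketched as follows. The paper does not state how cells were pooled; the sketch assumes the classes 0, 1, 2 and "3 or more", a choice inferred from the reported 2 degrees of freedom (4 classes, minus 1, minus 1 estimated parameter). The p-value uses the fact that a $\chi^{2}$ variable with 2 degrees of freedom has survival function $e^{-x/2}$, so that line is valid only for d.f. = 2. Variable and function names are illustrative.

```python
import math


def pad_pmf(x, theta):
    """Poisson-Adya pmf, equation (2.2)."""
    d = theta**4 + 2 * theta**2 + 2
    quad = (x**2 + (2 * theta**2 + 2 * theta + 3) * x
            + (theta**4 + 2 * theta**3 + 3 * theta**2 + 2 * theta + 2))
    return theta**3 / d * quad * math.exp(-(x + 3) * math.log1p(theta))


observed = [35, 11, 8, 4, 2]   # Table 1, x = 0, ..., 4
theta_hat = 1.9141             # PAD ML estimate reported in Table 1
n = sum(observed)

# Assumed pooling: classes 0, 1, 2 and "3 or more"
expected = [n * pad_pmf(x, theta_hat) for x in range(3)]
expected.append(n - sum(expected))
observed_pooled = observed[:3] + [sum(observed[3:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed_pooled, expected))
dof = len(observed_pooled) - 1 - 1   # classes - 1 - number of estimated parameters
p_value = math.exp(-chi2 / 2)        # exact chi-square tail only when dof == 2

print([round(e, 1) for e in expected], round(chi2, 2), dof, round(p_value, 4))
```

Small differences from the tabulated $\chi^{2}$ and p-value are to be expected, since the table appears to be computed from expected frequencies rounded to one decimal place.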

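| No. of chromatid aberrations | Observed frequency | Expected frequency (PD) | Expected frequency (PLD) | Expected frequency (PAD) |
|---|---|---|---|---|
| 0 | 268 | 231.3 | 257 | 258.1 |
| 1 | 87 | 126.7 | 93.4 | 92.5 |
| 2 | 26 | 34.7 | 32.8 | 32.4 |
| 3 | 9 | 6.3 | 11.2 | 11.2 |
| 4 | 4 | 0.8 | 3.8 | 3.8 |
| 5 | 2 | 0.1 | 1.2 | 1.3 |
| 6 | 1 | 0.1 | 0.4 | 0.4 |
| 7+ | 3 | 0.1 | 0.2 | 0.4 |
| Total | 400 | 400 | 400 | 400 |
| ML estimate | | $\hat{\theta}=0.5475$ | $\hat{\theta}=2.380442$ | $\hat{\theta}=2.4406$ |
| $\chi^{2}$ | | 38.21 | 6.21 | 5.21 |
| d.f. | | 2 | 3 | 3 |
| p-value | | 0.0000 | 0.1018 | 0.1577 |

Table 2 Distribution of number of chromatid aberrations (0.2 g chinon 1, 24 hours), available in Loeschke & Kohler7 and Janardan & Schaeffer8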

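| No. of accidents | Observed frequency | Expected frequency (PD) | Expected frequency (PLD) | Expected frequency (PAD) |
|---|---|---|---|---|
| 0 | 447 | 406 | 439.5 | 440.5 |
| 1 | 132 | 189 | 142.8 | 141.5 |
| 2 | 42 | 45 | 45 | 44.7 |
| 3 | 21 | 7 | 13.9 | 14 |
| 4 | 3 | 1 | 4.2 | 4.3 |
| 5 | 2 | 0.1 | 1.2 | 2.0 |
| Total | 647 | 647 | 647 | 647 |
| ML estimate | | $\hat{\theta}=0.465$ | $\hat{\theta}=2.729$ | $\hat{\theta}=2.7182$ |
| $\chi^{2}$ | | 61.08 | 4.82 | 4.66 |
| d.f. | | 1 | 3 | 2 |
| p-value | | 0.0273 | 0.1855 | 0.1985 |

Table 3 Accidents to 647 women working on high explosive shells in 5 weeks, available in Sankaran1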

Concluding remarks

In this paper, a Poisson mixture of the Adya distribution, called the Poisson-Adya distribution (PAD), has been proposed. Expressions for its statistical constants, including the coefficients of variation, skewness and kurtosis and the index of dispersion, have been obtained, and their behavior for varying values of the parameter has been studied. The distribution is unimodal and over-dispersed and has an increasing hazard rate. Maximum likelihood estimation and the method of moments have been discussed for estimating the parameter. Finally, the goodness of fit of the proposed distribution has been compared with that of other one-parameter discrete distributions, namely the Poisson distribution and PLD, on three datasets from biological science.

Acknowledgments

None.

Conflicts of interest

The authors declare no conflicts of interest.

References

©2022 Shanker, et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.