Research Article Volume 5 Issue 2
Department of Statistics, Eritrea Institute of Technology, Eritrea
Correspondence: Rama Shanker, Department of Statistics, Eritrea Institute of Technology, Eritrea
Received: December 13, 2016 | Published: February 13, 2017
Citation: Shanker R. The discrete Poisson-Garima distribution. Biom Biostat Int J. 2017;5(2):48-53. DOI: 10.15406/bbij.2017.05.00127
In this paper, a discrete Poisson-Garima distribution is obtained by compounding the Poisson distribution with the Garima distribution introduced by Shanker.1 A general expression for the rth factorial moment is derived, from which the moments about the origin and the central moments are obtained. Expressions for the coefficient of variation, skewness, kurtosis and index of dispersion are given. Maximum likelihood estimation and the method of moments are discussed for estimating the parameter of the distribution. Two real data sets are used to test the goodness of fit of the discrete Poisson-Garima distribution, and the fit is compared with that of the Poisson and Poisson-Lindley distributions.
Keywords: Garima distribution, Poisson-Lindley distribution, compounding, moments, skewness, kurtosis, index of dispersion, estimation of parameter, goodness of fit.
Shanker1 introduced a lifetime distribution named Garima distribution having probability density function
$$f(x;\theta)=\frac{\theta}{\theta+2}\,(1+\theta+\theta x)\,e^{-\theta x};\quad x>0,\ \theta>0 \tag{1.1}$$
to model behavioral science data. Shanker1 showed that the Garima distribution gives a much better fit than the exponential and Lindley2 distributions, and than the lifetime distributions introduced by Shanker3–6, namely the Shanker, Akash, Aradhana and Sujatha distributions. The first four moments about the origin of the Garima distribution, obtained by Shanker1, are
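$$\mu_1'=\frac{\theta+3}{\theta(\theta+2)},\qquad \mu_2'=\frac{2(\theta+4)}{\theta^2(\theta+2)},$$
$$\mu_3'=\frac{6(\theta+5)}{\theta^3(\theta+2)},\qquad \mu_4'=\frac{24(\theta+6)}{\theta^4(\theta+2)}.$$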
The central moments of Garima distribution obtained by Shanker1 are as follows
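$$\mu_2=\frac{\theta^2+6\theta+7}{\theta^2(\theta+2)^2}$$
$$\mu_3=\frac{2\left(\theta^3+9\theta^2+21\theta+15\right)}{\theta^3(\theta+2)^3}$$
$$\mu_4=\frac{3\left(3\theta^4+36\theta^3+134\theta^2+204\theta+111\right)}{\theta^4(\theta+2)^4}$$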
Shanker1 has studied various properties of Garima distribution including its shape, moments, skewness, kurtosis, hazard rate function, mean residual life function, stochastic ordering, mean deviations, order statistics, Bonferroni and Lorenz curves, entropy measure and stress-strength reliability, amongst others. The estimation of the parameter of Garima distribution has been discussed by Shanker1 using both maximum likelihood estimation and the method of moments.
In this paper, a Poisson mixture of the Garima distribution, named the Poisson-Garima distribution (PGD), is proposed and its statistical and mathematical properties are investigated. The estimation of its parameter is studied using maximum likelihood estimation and the method of moments. Since the Poisson-Lindley distribution (PLD), a Poisson mixture of the Lindley2 distribution introduced by Sankaran,7 gives a better fit than the Poisson distribution, and the Garima distribution gives a better fit than the Lindley distribution, the PGD is expected to give a better fit than both the Poisson and Poisson-Lindley distributions. The goodness of fit of the PGD is discussed and compared with that of the Poisson and Poisson-Lindley distributions.
Assuming that the parameter λ of Poisson distribution follows Garima distribution (1.1), the Poisson mixture of Garima distribution can be obtained as
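$$P(X=x)=\int_0^\infty \frac{e^{-\lambda}\lambda^{x}}{x!}\cdot\frac{\theta}{\theta+2}\,(1+\theta+\theta\lambda)\,e^{-\theta\lambda}\,d\lambda \tag{2.1}$$
$$=\frac{\theta}{(\theta+2)\,x!}\int_0^\infty e^{-(\theta+1)\lambda}\left[(1+\theta)\lambda^{x}+\theta\lambda^{x+1}\right]d\lambda$$
$$=\frac{\theta}{\theta+2}\cdot\frac{\theta x+\left(\theta^2+3\theta+1\right)}{(\theta+1)^{x+2}};\quad x=0,1,2,3,\ldots,\ \theta>0 \tag{2.2}$$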
This probability mass function (p.m.f.) has been named as “Poisson-Garima distribution (PGD)”.
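As a quick numerical illustration of (2.1) and (2.2), the following minimal Python sketch evaluates the closed-form p.m.f. and compares it with a direct midpoint-rule approximation of the mixing integral. The function names, the integration limit `upper` and the step count are illustrative assumptions, not part of the original paper.

```python
from math import exp, factorial


def pgd_pmf(x, theta):
    """Closed-form P(X = x) of the Poisson-Garima distribution, equation (2.2)."""
    numerator = theta * x + theta**2 + 3 * theta + 1
    return theta / (theta + 2) * numerator / (theta + 1) ** (x + 2)


def pgd_pmf_by_mixing(x, theta, upper=40.0, steps=50_000):
    """Approximate the mixing integral (2.1) with the midpoint rule.

    `upper` truncates the infinite range; it is an assumption that is
    adequate for moderate theta because the integrand decays exponentially.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        poisson = exp(-lam) * lam**x / factorial(x)
        garima = theta / (theta + 2) * (1 + theta + theta * lam) * exp(-theta * lam)
        total += poisson * garima * h
    return total


if __name__ == "__main__":
    theta = 1.5
    print(sum(pgd_pmf(x, theta) for x in range(200)))  # should be very close to 1
    print(pgd_pmf(2, theta), pgd_pmf_by_mixing(2, theta))
```

The two printed values on the last line should agree to several decimal places, which is a useful sanity check on the algebra leading from (2.1) to (2.2).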
It should be noted that Sankaran7 obtained the Poisson-Lindley distribution (PLD), with probability mass function (p.m.f.)
$$P(X=x)=\frac{\theta^2\,(x+\theta+2)}{(\theta+1)^{x+3}};\quad x=0,1,2,\ldots,\ \theta>0 \tag{2.3}$$
by compounding the Poisson distribution with the Lindley distribution, introduced by Lindley2, which has probability density function (p.d.f.)
$$f(x;\theta)=\frac{\theta^2}{\theta+1}\,(1+x)\,e^{-\theta x};\quad x>0,\ \theta>0 \tag{2.4}$$
Graphs of the p.m.f. of the Poisson-Garima distribution (PGD) and the Poisson-Lindley distribution (PLD) for varying values of the parameter are shown in Figure 1.
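The rth factorial moment about the origin of the Poisson-Garima distribution (2.2) can be obtained as
$$\mu_{(r)}'=E\left[E\left(X^{(r)}\mid\lambda\right)\right],\quad\text{where } X^{(r)}=X(X-1)(X-2)\cdots(X-r+1)$$
$$=\frac{\theta}{\theta+2}\int_0^\infty\left[\sum_{x=0}^{\infty}x^{(r)}\,\frac{e^{-\lambda}\lambda^{x}}{x!}\right](1+\theta+\theta\lambda)\,e^{-\theta\lambda}\,d\lambda$$
$$=\frac{\theta}{\theta+2}\int_0^\infty\left[\lambda^{r}\sum_{x=r}^{\infty}\frac{e^{-\lambda}\lambda^{x-r}}{(x-r)!}\right](1+\theta+\theta\lambda)\,e^{-\theta\lambda}\,d\lambda$$
Taking $x+r$ in place of $x$ within the bracket, we get
$$\mu_{(r)}'=\frac{\theta}{\theta+2}\int_0^\infty\lambda^{r}\left[\sum_{x=0}^{\infty}\frac{e^{-\lambda}\lambda^{x}}{x!}\right](1+\theta+\theta\lambda)\,e^{-\theta\lambda}\,d\lambda$$
The expression within the bracket is clearly unity, and hence we have
$$\mu_{(r)}'=\frac{\theta}{\theta+2}\int_0^\infty\lambda^{r}\,(1+\theta+\theta\lambda)\,e^{-\theta\lambda}\,d\lambda$$
Using the gamma integral and some algebraic simplification, we finally get a general expression for the rth factorial moment of the PGD (2.2) as
$$\mu_{(r)}'=\frac{r!\,(\theta+r+2)}{\theta^{r}(\theta+2)};\quad r=1,2,3,\ldots \tag{3.1}$$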
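The gamma-integral step leading to (3.1) can be written out as follows. This is a sketch of the omitted algebra, using only the standard identity $\int_0^\infty \lambda^{a}e^{-\theta\lambda}\,d\lambda=a!/\theta^{a+1}$ for a non-negative integer $a$:

```latex
\begin{aligned}
\mu_{(r)}'
 &= \frac{\theta}{\theta+2}\left[(1+\theta)\int_0^\infty \lambda^{r}e^{-\theta\lambda}\,d\lambda
    +\theta\int_0^\infty \lambda^{r+1}e^{-\theta\lambda}\,d\lambda\right] \\
 &= \frac{\theta}{\theta+2}\left[\frac{(1+\theta)\,r!}{\theta^{r+1}}
    +\frac{\theta\,(r+1)!}{\theta^{r+2}}\right] \\
 &= \frac{r!}{\theta^{r}(\theta+2)}\Bigl[(1+\theta)+(r+1)\Bigr]
  = \frac{r!\,(\theta+r+2)}{\theta^{r}(\theta+2)} .
\end{aligned}
```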
Substituting $r=1,2,3$ and $4$ in (3.1) gives the first four factorial moments. Using the relationship between factorial moments and moments about the origin, the first four moments about the origin of the PGD (2.2) are obtained as
$$\mu_1'=\frac{\theta+3}{\theta(\theta+2)}$$
$$\mu_2'=\frac{\theta^2+5\theta+8}{\theta^2(\theta+2)}$$
$$\mu_3'=\frac{\theta^3+9\theta^2+30\theta+30}{\theta^3(\theta+2)}$$
$$\mu_4'=\frac{\theta^4+17\theta^3+92\theta^2+204\theta+144}{\theta^4(\theta+2)}$$
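The conversion from factorial moments to moments about the origin can be checked with exact rational arithmetic. The sketch below is illustrative only (the function names are assumptions); it uses the standard identities $x^2=x^{(2)}+x^{(1)}$, $x^3=x^{(3)}+3x^{(2)}+x^{(1)}$ and $x^4=x^{(4)}+6x^{(3)}+7x^{(2)}+x^{(1)}$, whose coefficients are Stirling numbers of the second kind.

```python
from fractions import Fraction
from math import factorial


def factorial_moment(r, theta):
    """r-th factorial moment of the PGD, equation (3.1), as an exact fraction."""
    return Fraction(factorial(r)) * (theta + r + 2) / (theta**r * (theta + 2))


def raw_moments(theta):
    """First four moments about the origin built from the factorial moments."""
    f1, f2, f3, f4 = (factorial_moment(r, theta) for r in range(1, 5))
    return (
        f1,
        f2 + f1,
        f3 + 3 * f2 + f1,
        f4 + 6 * f3 + 7 * f2 + f1,
    )


if __name__ == "__main__":
    # With theta = 1 the closed forms give 4/3, 14/3, 70/3 and 458/3.
    print(raw_moments(Fraction(1)))
```

Passing a `Fraction` for `theta` keeps every intermediate result exact, so the output can be compared term by term with the polynomial expressions for the moments about the origin.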
Using the relationship between moments about mean and the moments about origin, the moments about mean of the PGD (2.2) are obtained as
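$$\mu_2=\sigma^2=\frac{\theta^3+6\theta^2+12\theta+7}{\theta^2(\theta+2)^2}$$
$$\mu_3=\frac{\theta^5+10\theta^4+42\theta^3+87\theta^2+84\theta+30}{\theta^3(\theta+2)^3}$$
$$\mu_4=\frac{\theta^7+19\theta^6+148\theta^5+607\theta^4+1402\theta^3+1816\theta^2+1224\theta+333}{\theta^4(\theta+2)^4}$$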
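The coefficient of variation (C.V), coefficient of skewness $(\sqrt{\beta_1})$, coefficient of kurtosis $(\beta_2)$ and index of dispersion $(\gamma)$ of the PGD (2.2) are thus obtained as
$$\text{C.V}=\frac{\sigma}{\mu_1'}=\frac{\sqrt{\theta^3+6\theta^2+12\theta+7}}{\theta+3}$$
$$\sqrt{\beta_1}=\frac{\mu_3}{\mu_2^{3/2}}=\frac{\theta^5+10\theta^4+42\theta^3+87\theta^2+84\theta+30}{\left(\theta^3+6\theta^2+12\theta+7\right)^{3/2}}$$
$$\beta_2=\frac{\mu_4}{\mu_2^{2}}=\frac{\theta^7+19\theta^6+148\theta^5+607\theta^4+1402\theta^3+1816\theta^2+1224\theta+333}{\left(\theta^3+6\theta^2+12\theta+7\right)^{2}}$$
$$\gamma=\frac{\sigma^2}{\mu_1'}=\frac{\theta^3+6\theta^2+12\theta+7}{\theta(\theta+2)(\theta+3)}$$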
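These closed forms are straightforward to evaluate numerically. The following sketch (the function name and dictionary keys are illustrative assumptions) computes the mean, variance, C.V, $\sqrt{\beta_1}$, $\beta_2$ and $\gamma$ of the PGD for a given $\theta$.

```python
from math import sqrt


def pgd_summary(theta):
    """Mean, variance, C.V, sqrt(beta1), beta2 and index of dispersion of the PGD."""
    # q is the numerator polynomial of the variance; it recurs in every measure.
    q = theta**3 + 6 * theta**2 + 12 * theta + 7
    mu3_num = (theta**5 + 10 * theta**4 + 42 * theta**3
               + 87 * theta**2 + 84 * theta + 30)
    mu4_num = (theta**7 + 19 * theta**6 + 148 * theta**5 + 607 * theta**4
               + 1402 * theta**3 + 1816 * theta**2 + 1224 * theta + 333)
    return {
        "mean": (theta + 3) / (theta * (theta + 2)),
        "variance": q / (theta**2 * (theta + 2) ** 2),
        "cv": sqrt(q) / (theta + 3),
        "skewness": mu3_num / q**1.5,
        "kurtosis": mu4_num / q**2,
        "dispersion": q / (theta * (theta + 2) * (theta + 3)),
    }


if __name__ == "__main__":
    for name, value in pgd_summary(1.0).items():
        print(f"{name:>10}: {value:.6f}")
```

For $\theta=1$ the results should agree with the tabulated PGD values (mean 1.333333, variance 2.888889, C.V 1.274755, and so on) to about five decimal places.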
To study the nature and behavior of $\mu_1'$, $\mu_2$, C.V, $\sqrt{\beta_1}$, $\beta_2$ and $\gamma$ of the PGD and PLD, values of these characteristics for varying values of the parameter $\theta$ have been computed and are presented in Table 1.
Values of θ for Poisson-Garima Distribution

| | θ = 1 | θ = 2 | θ = 3 | θ = 4 | θ = 5 | θ = 6 |
|---|---|---|---|---|---|---|
| μ1′ | 1.333333 | 0.625 | 0.4 | 0.291667 | 0.228571 | 0.1875 |
| μ2 | 2.888889 | 0.984375 | 0.551111 | 0.373264 | 0.279184 | 0.221788 |
| C.V | 1.274755 | 1.587451 | 1.855921 | 2.094697 | 2.311655 | 2.511701 |
| √β1 | 1.915904 | 2.147798 | 2.355147 | 2.54717 | 2.727407 | 2.897852 |
| β2 | 8.210059 | 9.335601 | 10.36498 | 11.36106 | 12.34549 | 13.32641 |
| γ | 2.166667 | 1.575 | 1.377778 | 1.279762 | 1.221429 | 1.18287 |

Values of θ for Poisson-Lindley Distribution

| | θ = 1 | θ = 2 | θ = 3 | θ = 4 | θ = 5 | θ = 6 |
|---|---|---|---|---|---|---|
| μ1′ | 1.5 | 0.666667 | 0.416667 | 0.3 | 0.233333 | 0.190476 |
| μ2 | 3.25 | 1.055556 | 0.576389 | 0.385 | 0.285556 | 0.225624 |
| C.V | 1.20185 | 1.541104 | 1.822087 | 2.068279 | 2.290174 | 2.493742 |
| √β1 | 1.792108 | 2.083265 | 2.314307 | 2.517935 | 2.704839 | 2.87957 |
| β2 | 7.532544 | 8.941828 | 10.10611 | 11.17187 | 12.19654 | 13.203 |
| γ | 2.166667 | 1.583333 | 1.383333 | 1.283333 | 1.22381 | 1.184524 |
Table 1 Values of μ1′, μ2, C.V, √β1, β2 and γ of PGD and PLD for varying values of the parameter θ
| No. of errors per group | Observed frequency | Expected frequency (PD) | Expected frequency (PLD) | Expected frequency (PGD) |
|---|---|---|---|---|
| 0 | 35 | 27.4 | 33.0 | 33.3 |
| Total | 60 | 60.0 | 60.0 | 60.0 |
| ML estimate | | θ̂ = 0.7833 | θ̂ = 1.7434 | θ̂ = 1.628413 |
| χ² | | 7.98 | 2.20 | 1.71 |
| d.f. | | 1 | 1 | 2 |
| p-value | | 0.0047 | 0.1380 | 0.4253 |
Table 2 Distribution of mistakes in copying groups of random digits
| No. of insects | Observed frequency | Expected frequency (PD) | Expected frequency (PLD) | Expected frequency (PGD) |
|---|---|---|---|---|
| 0 | 33 | 26.4 | 31.5 | 31.7 |
| Total | 56 | 56.0 | 56.0 | 56.0 |
| ML estimate | | θ̂ = 0.7500 | θ̂ = 1.8081 | θ̂ = 1.695033 |
| χ² | | 4.87 | 0.53 | 0.38 |
| d.f. | | 1 | 1 | 1 |
| p-value | | 0.0273 | 0.4660 | 0.5376 |

Table 3 Distribution of Pyrausta nubilalis
Graphs of the coefficient of variation (C.V), coefficient of skewness (√β1), coefficient of kurtosis (β2) and index of dispersion (γ) of the PGD and PLD are presented in Figure 2.
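The PGD (2.2) is always over-dispersed $(\sigma^2>\mu)$, where $\mu=\mu_1'$ denotes the mean. We have
$$\sigma^2=\frac{\theta^3+6\theta^2+12\theta+7}{\theta^2(\theta+2)^2}$$
$$=\frac{\theta+3}{\theta(\theta+2)}\left[\frac{\theta^3+6\theta^2+12\theta+7}{\theta(\theta+2)(\theta+3)}\right]$$
$$=\frac{\theta+3}{\theta(\theta+2)}\left[1+\frac{\theta^2+6\theta+7}{\theta(\theta+2)(\theta+3)}\right]$$
$$=\mu\left[1+\frac{\theta^2+6\theta+7}{\theta(\theta+2)(\theta+3)}\right]>\mu$$
This shows that the PGD (2.2) is always over-dispersed.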
Unimodality and increasing hazard rate
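Since
$$\frac{P(x+1;\theta)}{P(x;\theta)}=\frac{\theta(x+1)+\left(\theta^2+3\theta+1\right)}{(\theta+1)\left[\theta x+\left(\theta^2+3\theta+1\right)\right]}=\frac{1}{\theta+1}\left[1+\frac{\theta}{\theta x+\left(\theta^2+3\theta+1\right)}\right]$$
is a decreasing function of $x$, $P(x;\theta)$ is log-concave. Therefore, the PGD has an increasing hazard rate and is unimodal. A detailed discussion of the relationship between log-concavity, unimodality and increasing hazard rate of discrete distributions can be found in Grandell.8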
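The monotonicity of the probability ratio can also be checked numerically. The sketch below is a sanity check over a finite range rather than a proof, and the helper names are illustrative assumptions.

```python
def pgd_ratio(x, theta):
    """Ratio P(x+1; theta) / P(x; theta) of successive PGD probabilities."""
    c = theta**2 + 3 * theta + 1
    return (1 + theta / (theta * x + c)) / (theta + 1)


def is_log_concave(theta, x_max=200):
    """Check that the ratio is strictly decreasing for x = 0, 1, ..., x_max."""
    ratios = [pgd_ratio(x, theta) for x in range(x_max + 1)]
    return all(a > b for a, b in zip(ratios, ratios[1:]))


if __name__ == "__main__":
    for theta in (0.5, 1.0, 3.0):
        print(theta, is_log_concave(theta), pgd_ratio(0, theta))
```

For the values of $\theta$ tried here the ratio at $x=0$ is already below 1, which is consistent with the p.m.f. having its mode at zero.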
Generating functions
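Probability generating function: The probability generating function of the PGD (2.2) can be obtained as
$$P_X(t)=E\left(t^X\right)=\frac{\theta}{(\theta+2)(\theta+1)^2}\left[\theta\sum_{x=0}^{\infty}x\left(\frac{t}{\theta+1}\right)^{x}+\left(\theta^2+3\theta+1\right)\sum_{x=0}^{\infty}\left(\frac{t}{\theta+1}\right)^{x}\right]$$
$$=\frac{\theta}{(\theta+2)(\theta+1)^2}\left[\frac{\theta(\theta+1)\,t}{(\theta+1-t)^2}+\frac{\left(\theta^2+3\theta+1\right)(\theta+1)}{\theta+1-t}\right]$$
$$=\frac{\theta}{(\theta+2)(\theta+1)}\left[\frac{\theta t}{(\theta+1-t)^2}+\frac{\theta^2+3\theta+1}{\theta+1-t}\right]$$
$$=\frac{\theta\left[\theta^3+(4-t)\theta^2+2(2-t)\theta+(1-t)\right]}{(\theta+1)(\theta+2)(\theta+1-t)^2}$$
Moment generating function: The moment generating function of the PGD (2.2) is thus given by
$$M_X(t)=\frac{\theta\left[\theta^3+\left(4-e^{t}\right)\theta^2+2\left(2-e^{t}\right)\theta+\left(1-e^{t}\right)\right]}{(\theta+1)(\theta+2)\left(\theta+1-e^{t}\right)^2}.$$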
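Two simple checks on a probability generating function are that $P_X(1)=1$ and that $P_X'(1)$ equals the mean. The sketch below evaluates the bracketed form of the generating function (the step before the final simplification) and approximates the derivative by a central difference. The function names and the step size `h` are illustrative assumptions.

```python
def pgd_pgf(t, theta):
    """Probability generating function of the PGD, valid for |t| < theta + 1."""
    d = theta + 1 - t
    bracket = theta * t / d**2 + (theta**2 + 3 * theta + 1) / d
    return theta / ((theta + 2) * (theta + 1)) * bracket


def pgf_mean(theta, h=1e-5):
    """Mean recovered as P'(1) using a central finite difference."""
    return (pgd_pgf(1 + h, theta) - pgd_pgf(1 - h, theta)) / (2 * h)


if __name__ == "__main__":
    theta = 2.0
    print(pgd_pgf(1.0, theta))                                   # total probability
    print(pgf_mean(theta), (theta + 3) / (theta * (theta + 2)))  # two routes to the mean
```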
Maximum likelihood estimate (MLE): Let $x_1,x_2,\ldots,x_n$ be a random sample of size $n$ from the PGD (2.2) and let $f_x$ be the observed frequency in the sample corresponding to $X=x$ $(x=0,1,2,\ldots,k)$ such that $\sum_{x=0}^{k}f_x=n$, where $k$ is the largest observed value having non-zero frequency. The likelihood function $L$ of the PGD (2.2) is given by
$$L=\left(\frac{\theta}{\theta+2}\right)^{n}\frac{1}{(\theta+1)^{\sum_{x=0}^{k}(x+2)f_x}}\prod_{x=0}^{k}\left[\theta x+\left(\theta^2+3\theta+1\right)\right]^{f_x}$$
The log-likelihood function is obtained as
$$\log L=n\log\left(\frac{\theta}{\theta+2}\right)-\sum_{x=0}^{k}f_x\,(x+2)\log(\theta+1)+\sum_{x=0}^{k}f_x\log\left[\theta x+\left(\theta^2+3\theta+1\right)\right]$$
The first derivative of the log-likelihood function is given by
$$\frac{d\log L}{d\theta}=\frac{2n}{\theta(\theta+2)}-\frac{n(\bar{x}+2)}{\theta+1}+\sum_{x=0}^{k}\frac{(x+2\theta+3)\,f_x}{\theta x+\left(\theta^2+3\theta+1\right)}$$
where $\bar{x}$ is the sample mean.
The maximum likelihood estimate (MLE) $\hat{\theta}$ of $\theta$ is the solution of the equation $\dfrac{d\log L}{d\theta}=0$, that is, of the non-linear equation
$$\frac{2n}{\theta(\theta+2)}-\frac{n(\bar{x}+2)}{\theta+1}+\sum_{x=0}^{k}\frac{(x+2\theta+3)\,f_x}{\theta x+\left(\theta^2+3\theta+1\right)}=0$$
This non-linear equation can be solved by any numerical iteration method, such as the Newton-Raphson, bisection or regula falsi method.
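As an illustration of the numerical step, the following sketch solves the likelihood equation by bisection on the first derivative of the log-likelihood. It is a minimal example under stated assumptions, not the procedure used in the paper: the function names, the bracket `[lo, hi]` and the frequency table `counts` are all hypothetical, and the sums run over every observed value including zero.

```python
def score(theta, freq):
    """First derivative of the PGD log-likelihood for a frequency table {x: f_x}."""
    n = sum(freq.values())
    xbar = sum(x * f for x, f in freq.items()) / n
    tail = sum(
        (x + 2 * theta + 3) * f / (theta * x + theta**2 + 3 * theta + 1)
        for x, f in freq.items()
    )
    return 2 * n / (theta * (theta + 2)) - n * (xbar + 2) / (theta + 1) + tail


def mle_bisection(freq, lo=1e-3, hi=100.0, tol=1e-10):
    """Solve score(theta) = 0 by bisection; assumes a sign change on [lo, hi]."""
    f_lo = score(lo, freq)
    if f_lo * score(hi, freq) > 0:
        raise ValueError("score does not change sign on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = score(mid, freq)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # root lies in [mid, hi]
    return 0.5 * (lo + hi)


# Illustrative counts only (an assumption, not data quoted from the article).
counts = {0: 35, 1: 11, 2: 8, 3: 4, 4: 2}
theta_hat = mle_bisection(counts)

if __name__ == "__main__":
    print(theta_hat, score(theta_hat, counts))
```

Bisection is chosen here because it needs only the sign of the derivative and cannot diverge once a bracket is found. Newton-Raphson would typically converge in fewer iterations but requires the second derivative and a reasonable starting value, for which the method of moments estimate is a natural candidate.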
Method of moment estimate (MOME): Let $x_1,x_2,\ldots,x_n$ be a random sample of size $n$ from the PGD (2.2). Equating the first population moment about the origin to the corresponding sample moment, the MOME $\tilde{\theta}$ of $\theta$ is given by
$$\tilde{\theta}=\frac{(1-2\bar{x})+\sqrt{4\bar{x}^2+8\bar{x}+1}}{2\bar{x}};\quad \bar{x}>0$$
where $\bar{x}$ is the sample mean.
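The moment estimate is the positive root of $\bar{x}\,\theta^2+(2\bar{x}-1)\,\theta-3=0$, which follows from setting $(\theta+3)/(\theta(\theta+2))=\bar{x}$. A short sketch (function names are illustrative assumptions) computes it and confirms that the fitted mean reproduces the sample mean. The value $\bar{x}=0.7833$ is taken from the Poisson ML estimate reported in Table 2, since the Poisson MLE equals the sample mean.

```python
from math import sqrt


def pgd_mean(theta):
    """Mean of the PGD as a function of theta."""
    return (theta + 3) / (theta * (theta + 2))


def mome(xbar):
    """Method of moments estimate of theta from the sample mean."""
    if xbar <= 0:
        raise ValueError("sample mean must be positive")
    return ((1 - 2 * xbar) + sqrt(4 * xbar**2 + 8 * xbar + 1)) / (2 * xbar)


if __name__ == "__main__":
    xbar = 0.7833
    theta_tilde = mome(xbar)
    print(theta_tilde, pgd_mean(theta_tilde))
```

For this sample mean the estimate is roughly 1.63, close to the ML estimate of the PGD parameter reported for the same data set.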
The PGD has been fitted to a number of data sets to test its goodness of fit against the Poisson distribution (PD) and the Poisson-Lindley distribution (PLD). The parameter has been estimated using maximum likelihood estimation. Two examples of observed data sets, to which the PD, PLD and PGD have been fitted, are presented in Tables 2 and 3. The first data set is due to Kemp and Kemp9 regarding the distribution of mistakes in copying groups of random digits, and the second data set is due to Beall10 regarding the distribution of Pyrausta nubilalis.
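The p-values in Tables 2 and 3 can be recovered from the reported χ² statistics and degrees of freedom. The sketch below uses the closed forms of the chi-square upper-tail probability that exist for 1 and 2 degrees of freedom, so it needs only the standard library. The function name is an illustrative assumption, and other degrees of freedom would need a general routine such as a regularized incomplete gamma function.

```python
from math import erfc, exp, sqrt


def chi2_p_value(stat, df):
    """Upper-tail chi-square probability for df = 1 or df = 2 (closed forms)."""
    if df == 1:
        # A chi-square variable with 1 d.f. is the square of a standard normal.
        return erfc(sqrt(stat / 2))
    if df == 2:
        # With 2 d.f. the chi-square distribution is exponential with mean 2.
        return exp(-stat / 2)
    raise ValueError("this sketch only covers df = 1 and df = 2")


if __name__ == "__main__":
    # (chi-square, d.f.) pairs as reported for PD, PLD and PGD in Table 2.
    for stat, df in [(7.98, 1), (2.20, 1), (1.71, 2)]:
        print(stat, df, round(chi2_p_value(stat, df), 4))
```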
A discrete Poisson-Garima distribution has been proposed by compounding the Poisson distribution with the Garima distribution introduced by Shanker.1 An expression for the rth factorial moment about the origin has been derived, and from it the moments about the origin and the central moments have been given. The nature and behavior of the coefficient of variation, skewness, kurtosis and index of dispersion of the proposed distribution have been studied for varying values of the parameter. The estimation of the parameter has been discussed using both maximum likelihood estimation and the method of moments. The goodness of fit of the proposed distribution has been examined on two real data sets and compared with that of the Poisson and Poisson-Lindley distributions. In both examples the Poisson-Garima distribution gives a smaller χ² value and a larger p-value than the Poisson and Poisson-Lindley distributions, and hence it can be considered a useful alternative to these two distributions for modelling discrete data.
None.
None.
©2017 Shanker. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work non-commercially.