eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 6 Issue 3

Size-biased Poisson-Garima distribution with applications

Rama Shanker, Kamlesh Kumar Shukla

Department of Statistics, Eritrea Institute of Technology, Eritrea

Correspondence: Rama Shanker, Department of Statistics, Eritrea Institute of Technology, Asmara, Eritrea

Received: August 23, 2017 | Published: August 28, 2017

Citation: Shanker R, Shukla KK. Size-biased poisson-garima distribution with applications. Biom Biostat Int J. 2017;6(3):335-340. DOI: 10.15406/bbij.2017.06.00167


Abstract

In this paper, a size-biased Poisson-Garima distribution (SBPGD) is obtained by size-biasing the Poisson-Garima distribution (PGD) introduced by Shanker (2017). Its moments about the origin and about the mean are derived, and from them expressions for the coefficient of variation (C.V.), skewness and kurtosis. Estimation of its parameter by the method of moments and by maximum likelihood is discussed. The goodness of fit of the SBPGD is examined on two real data sets using the maximum likelihood estimate, and the fit is quite satisfactory compared with the size-biased Poisson distribution (SBPD) and the size-biased Poisson-Lindley distribution (SBPLD).

Keywords: Garima distribution, Poisson-Garima distribution, size-biasing, moments, estimation of parameter, goodness of fit

Introduction

Shanker1 obtained the Poisson-Garima distribution (PGD) for modeling count data, with probability mass function (p.m.f.)

$$P_0(x;\theta)=\frac{\theta}{\theta+2}\,\frac{\theta x+(\theta^2+3\theta+1)}{(\theta+1)^{x+2}};\quad x=0,1,2,\ldots,\ \theta>0 \tag{1.1}$$

The first four moments about origin and the variance of PGD obtained by Shanker1 are as follows:

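$$\mu_1'=\frac{\theta+3}{\theta(\theta+2)},\quad \mu_2'=\frac{\theta^2+5\theta+8}{\theta^2(\theta+2)},\quad \mu_3'=\frac{\theta^3+9\theta^2+30\theta+30}{\theta^3(\theta+2)},$$

$$\mu_4'=\frac{\theta^4+17\theta^3+92\theta^2+204\theta+144}{\theta^4(\theta+2)},\quad\text{and}\quad \mu_2=\frac{\theta^3+6\theta^2+12\theta+7}{\theta^2(\theta+2)^2}.$$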

Shanker1 has discussed its properties, parameter estimation and applications in detail, and has shown that it performs better than the Poisson and Poisson-Lindley distributions for modeling count data in various fields of knowledge. The PGD arises from the Poisson distribution when its parameter λ follows the Garima distribution introduced by Shanker2, which has probability density function (p.d.f.)

$$f_0(\lambda;\theta)=\frac{\theta}{\theta+2}\,(1+\theta+\theta\lambda)\,e^{-\theta\lambda};\quad \lambda>0,\ \theta>0 \tag{1.2}$$
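As a quick numerical sanity check of the PGD p.m.f. (1.1) that results from mixing the Poisson distribution over (1.2), the sketch below evaluates the p.m.f. in plain Python and compares the summed mean with the closed form. The function names `pgd_pmf` and `pgd_mean`, the value of θ and the truncation point of the infinite sum are illustrative assumptions, not part of the original paper.

```python
def pgd_pmf(x, theta):
    """PGD p.m.f. (1.1)."""
    return theta / (theta + 2) * (theta * x + theta**2 + 3 * theta + 1) / (theta + 1) ** (x + 2)


def pgd_mean(theta):
    """Closed-form PGD mean (theta + 3) / (theta (theta + 2))."""
    return (theta + 3) / (theta * (theta + 2))


theta = 1.5                # arbitrary illustrative value
support = range(0, 400)    # truncated support; the omitted tail is negligible here
total = sum(pgd_pmf(x, theta) for x in support)
mean = sum(x * pgd_pmf(x, theta) for x in support)
print(total, mean, pgd_mean(theta))
```

The total probability should be numerically indistinguishable from one, and the summed mean should agree with the closed-form mean.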

Size-biased distributions arise in practice when observations are sampled with probability proportional to some measure of unit size. Fisher3 first introduced these distributions to model ascertainment bias, and Rao4 later formalized them in a unifying theory. Size-biased observations occur in many research areas, with applications in medical science, sociology, psychology, ecology and the geological sciences, among others. Van Deusen5 discussed the application of size-biased distribution theory to fitting distributions of diameter at breast height (DBH) data arising from horizontal point sampling (HPS). Lappi and Bailey6 further applied size-biased distributions to analyze HPS diameter increment data. Statistical applications of size-biased distributions to the analysis of data on human populations and ecology can be found in Patil and Rao.7,8 Gove9 discussed some recent results on size-biased distributions pertaining to parameter estimation in forestry, with special emphasis on the Weibull family. Ducey and Gove10 discussed size-biased distributions in the generalized beta distribution family, with applications to forestry.

Let a random variable X have the original probability distribution $P_0(x;\theta)$. Then a simple size-biased distribution is given by the probability function

$$P_1(x;\theta)=\frac{x\,P_0(x;\theta)}{\mu_1'} \tag{1.3}$$

where $\mu_1'=E(X)$ is the mean of the original probability distribution.

In the present paper, a size-biased Poisson-Garima distribution (SBPGD) is proposed. Its raw and central moments are obtained, together with the properties based on them: the coefficient of variation, skewness, kurtosis and index of dispersion. Some of its statistical properties are discussed. The method of moments and the method of maximum likelihood are discussed for estimating the parameter of the SBPGD, and its goodness of fit is also presented.

Size-biased Poisson-Garima distribution

Using (1.1) and (1.3), the p.m.f. of the size-biased Poisson-Garima distribution (SBPGD) with parameter θ can be obtained as

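$$P_1(x;\theta)=\frac{x\,P_0(x;\theta)}{\mu_1'}=\frac{\theta^2}{\theta+3}\,\frac{x^2\theta+x(\theta^2+3\theta+1)}{(\theta+1)^{x+2}};\quad x=1,2,3,\ldots,\ \theta>0 \tag{2.1}$$

where $\mu_1'=\dfrac{\theta+3}{\theta(\theta+2)}$ is the mean of the PGD (1.1).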
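The size-biasing step can be checked numerically. The following minimal sketch, with hypothetical function names `pgd_pmf` and `sbpgd_pmf` and an arbitrary θ, compares the closed form (2.1) with x·P0(x;θ)/μ1′ term by term and confirms that (2.1) sums to one over a truncated support.

```python
def pgd_pmf(x, theta):
    """PGD p.m.f. (1.1), support x = 0, 1, 2, ..."""
    return theta / (theta + 2) * (theta * x + theta**2 + 3 * theta + 1) / (theta + 1) ** (x + 2)


def sbpgd_pmf(x, theta):
    """SBPGD p.m.f. (2.1), support x = 1, 2, 3, ..."""
    a = theta**2 + 3 * theta + 1
    return theta**2 / (theta + 3) * (x**2 * theta + x * a) / (theta + 1) ** (x + 2)


theta = 2.0                                     # arbitrary illustrative value
mu1 = (theta + 3) / (theta * (theta + 2))       # PGD mean

# Largest discrepancy between (2.1) and the size-biased form x * P0 / mu1
max_gap = max(abs(sbpgd_pmf(x, theta) - x * pgd_pmf(x, theta) / mu1) for x in range(1, 50))

# Total probability over a truncated support (the tail is negligible here)
total = sum(sbpgd_pmf(x, theta) for x in range(1, 400))
print(max_gap, total)
```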

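The SBPGD can also be obtained from the size-biased Poisson distribution (SBPD) with p.m.f.

$$g(x\mid\lambda)=\frac{e^{-\lambda}\lambda^{x-1}}{(x-1)!};\quad x=1,2,3,\ldots,\ \lambda>0 \tag{2.2}$$

when its parameter λ follows the size-biased Garima distribution (SBGD) with p.d.f.

$$h(\lambda;\theta)=\frac{\theta^2}{\theta+3}\,\lambda(1+\theta+\theta\lambda)\,e^{-\theta\lambda};\quad \lambda>0,\ \theta>0 \tag{2.3}$$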

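Thus the p.m.f. of the SBPGD can be obtained as

$$\begin{aligned}
P(X=x)&=\int_0^\infty g(x\mid\lambda)\,h(\lambda;\theta)\,d\lambda\\
&=\int_0^\infty \frac{e^{-\lambda}\lambda^{x-1}}{(x-1)!}\,\frac{\theta^2}{\theta+3}\,\lambda(1+\theta+\theta\lambda)\,e^{-\theta\lambda}\,d\lambda\\
&=\frac{\theta^2}{(\theta+3)(x-1)!}\int_0^\infty e^{-(\theta+1)\lambda}\left[(1+\theta)\lambda^{x}+\theta\lambda^{x+1}\right]d\lambda\\
&=\frac{\theta^2}{\theta+3}\,\frac{x^2\theta+x(\theta^2+3\theta+1)}{(\theta+1)^{x+2}};\quad x=1,2,3,\ldots,\ \theta>0
\end{aligned} \tag{2.4}$$

which is the p.m.f. of the SBPGD with parameter θ.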
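The last step of this derivation rests on the gamma integral $\int_0^\infty e^{-a\lambda}\lambda^{k}\,d\lambda=k!/a^{k+1}$ with $a=\theta+1$. Written out as a worked step (added here for clarity, not taken from the original paper):

$$\begin{aligned}
\int_0^\infty e^{-(\theta+1)\lambda}\left[(1+\theta)\lambda^{x}+\theta\lambda^{x+1}\right]d\lambda
&=\frac{(1+\theta)\,x!}{(\theta+1)^{x+1}}+\frac{\theta\,(x+1)!}{(\theta+1)^{x+2}}\\
&=\frac{x!\left[(\theta+1)^2+\theta(x+1)\right]}{(\theta+1)^{x+2}}
=\frac{x!\left[x\theta+(\theta^2+3\theta+1)\right]}{(\theta+1)^{x+2}}
\end{aligned}$$

Multiplying by $\theta^2/\{(\theta+3)(x-1)!\}$ and using $x!/(x-1)!=x$ gives the closed form of the p.m.f.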

Graphs of the SBPGD for varying values of the parameter θ are shown in Figure 1. The graphs show that as the value of θ increases, the p.m.f. initially shifts upward and then decreases quickly for increasing values of x. The graphs also become convex for values of θ ≥ 2.

Figure 1 Graphs of SBPGD for varying values of parameter θ .

Recall that the p.m.f. of the size-biased Poisson-Lindley distribution (SBPLD), given by

$$P_2(X=x)=\frac{\theta^3}{\theta+2}\,\frac{x(x+\theta+2)}{(\theta+1)^{x+2}};\quad x=1,2,3,\ldots,\ \theta>0 \tag{1.7}$$

was introduced by Ghitany and Al-Mutairi11 as a size-biased version of the Poisson-Lindley distribution (PLD) introduced by Sankaran.12 Ghitany and Al-Mutairi11 discussed its mathematical and statistical properties, estimation of the parameter by maximum likelihood and the method of moments, and goodness of fit. Shanker et al.13 made a detailed study of the applications of the SBPLD for modeling data on thunderstorms and observed that, in most data sets, the SBPLD gives a better fit than the size-biased Poisson distribution (SBPD).

Moments and moments based measures

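Using (2.4), the r th factorial moment about origin of the SBPGD (2.1), with $X^{(r)}=X(X-1)\cdots(X-r+1)$, can be obtained as

$$\begin{aligned}
\mu_{(r)}'&=E\left[E\left(X^{(r)}\mid\lambda\right)\right]=\int_0^\infty\left[\sum_{x=1}^{\infty}x^{(r)}\,\frac{e^{-\lambda}\lambda^{x-1}}{(x-1)!}\right]\frac{\theta^2}{\theta+3}\,\lambda(1+\theta+\theta\lambda)\,e^{-\theta\lambda}\,d\lambda\\
&=\frac{\theta^2}{\theta+3}\int_0^\infty\left[\lambda^{r-1}\sum_{x=r}^{\infty}x\,\frac{e^{-\lambda}\lambda^{x-r}}{(x-r)!}\right]\lambda(1+\theta+\theta\lambda)\,e^{-\theta\lambda}\,d\lambda
\end{aligned}$$

Taking $y=x-r$, we get

$$\begin{aligned}
\mu_{(r)}'&=\frac{\theta^2}{\theta+3}\int_0^\infty\left[\lambda^{r-1}\sum_{y=0}^{\infty}(y+r)\,\frac{e^{-\lambda}\lambda^{y}}{y!}\right]\lambda(1+\theta+\theta\lambda)\,e^{-\theta\lambda}\,d\lambda\\
&=\frac{\theta^2}{\theta+3}\int_0^\infty\lambda^{r-1}(\lambda+r)\,\lambda(1+\theta+\theta\lambda)\,e^{-\theta\lambda}\,d\lambda\\
&=\frac{r!\left\{(\theta+1)(r\theta+r+1)+(r+1)(r\theta+r+2)\right\}}{\theta^{r}(\theta+3)};\quad r=1,2,3,\ldots
\end{aligned} \tag{3.1}$$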

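Substituting r = 1, 2, 3 and 4 gives the first four factorial moments about origin. Using the relationship between factorial moments about origin and moments about origin, the first four moments about origin of the SBPGD are obtained as

$$\mu_1'=\frac{\theta^2+5\theta+8}{\theta(\theta+3)}$$

$$\mu_2'=\frac{\theta^3+9\theta^2+30\theta+30}{\theta^2(\theta+3)}$$

$$\mu_3'=\frac{\theta^4+17\theta^3+92\theta^2+204\theta+144}{\theta^3(\theta+3)}$$

$$\mu_4'=\frac{\theta^5+33\theta^4+270\theta^3+990\theta^2+1560\theta+840}{\theta^4(\theta+3)}$$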
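The conversion from factorial moments to moments about origin uses Stirling numbers of the second kind, for example μ2′ = μ(2)′ + μ(1)′ and μ3′ = μ(3)′ + 3μ(2)′ + μ(1)′. The sketch below checks the four closed forms against (3.1) in exact rational arithmetic. The helper names and the test values of θ are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial


def factorial_moment(r, theta):
    """r-th factorial moment about origin of the SBPGD, equation (3.1)."""
    num = factorial(r) * ((theta + 1) * (r * theta + r + 1) + (r + 1) * (r * theta + r + 2))
    return Fraction(num) / (theta**r * (theta + 3))


# Stirling numbers of the second kind S(r, j) for j = 1..r
STIRLING2 = {1: [1], 2: [1, 1], 3: [1, 3, 1], 4: [1, 7, 6, 1]}

# Numerator coefficients (descending powers of theta) of the closed-form raw moments
RAW_NUMERATORS = {
    1: [1, 5, 8],
    2: [1, 9, 30, 30],
    3: [1, 17, 92, 204, 144],
    4: [1, 33, 270, 990, 1560, 840],
}


def raw_moment_from_factorial(r, theta):
    """mu_r' = sum_j S(r, j) * mu_(j)'."""
    return sum(s * factorial_moment(j, theta) for j, s in enumerate(STIRLING2[r], start=1))


def raw_moment_closed_form(r, theta):
    coeffs = RAW_NUMERATORS[r]
    num = sum(c * theta ** (len(coeffs) - 1 - i) for i, c in enumerate(coeffs))
    return Fraction(num) / (theta**r * (theta + 3))


thetas = [Fraction(1, 2), Fraction(2), Fraction(7, 3)]
all_match = all(
    raw_moment_from_factorial(r, t) == raw_moment_closed_form(r, t)
    for r in (1, 2, 3, 4)
    for t in thetas
)
print(all_match)
```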

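Using the relationship between moments about mean and the moments about origin, the moments about mean of the SBPGD are thus obtained as

$$\mu_2=\frac{2(\theta^3+8\theta^2+20\theta+13)}{\theta^2(\theta+3)^2}$$

$$\mu_3=\frac{2(\theta^5+13\theta^4+68\theta^3+171\theta^2+195\theta+80)}{\theta^3(\theta+3)^3}$$

$$\mu_4=\frac{2(\theta^7+26\theta^6+269\theta^5+1435\theta^4+4230\theta^3+6819\theta^2+5520\theta+1740)}{\theta^4(\theta+3)^4}$$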
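The standard relations μ2 = μ2′ − μ1′², μ3 = μ3′ − 3μ2′μ1′ + 2μ1′³ and μ4 = μ4′ − 4μ3′μ1′ + 6μ2′μ1′² − 3μ1′⁴ link the two sets of moments. A minimal exact-arithmetic check at θ = 1 and θ = 2 (illustrative values, with hypothetical helper names) might look like this:

```python
from fractions import Fraction


def poly(coeffs, t):
    """Evaluate a polynomial given in descending powers of t (Horner's rule)."""
    acc = Fraction(0)
    for c in coeffs:
        acc = acc * t + c
    return acc


RAW = {
    1: [1, 5, 8],
    2: [1, 9, 30, 30],
    3: [1, 17, 92, 204, 144],
    4: [1, 33, 270, 990, 1560, 840],
}
CENTRAL = {
    2: [1, 8, 20, 13],
    3: [1, 13, 68, 171, 195, 80],
    4: [1, 26, 269, 1435, 4230, 6819, 5520, 1740],
}


def raw(r, t):
    """Closed-form moment about origin of the SBPGD."""
    return poly(RAW[r], t) / (t**r * (t + 3))


def central_closed_form(r, t):
    """Closed-form moment about mean of the SBPGD."""
    return 2 * poly(CENTRAL[r], t) / (t**r * (t + 3) ** r)


def central_from_raw(r, t):
    m1, m2, m3, m4 = (raw(k, t) for k in (1, 2, 3, 4))
    if r == 2:
        return m2 - m1**2
    if r == 3:
        return m3 - 3 * m2 * m1 + 2 * m1**3
    if r == 4:
        return m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4
    raise ValueError("only r = 2, 3, 4 are handled")


consistent = all(
    central_from_raw(r, t) == central_closed_form(r, t)
    for r in (2, 3, 4)
    for t in (1, 2)
)
print(consistent)
```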

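The coefficient of variation (C.V), coefficient of skewness $(\sqrt{\beta_1})$, coefficient of kurtosis $(\beta_2)$ and the index of dispersion $(\gamma)$ of the SBPGD are thus obtained as

$$C.V=\frac{\sigma}{\mu_1'}=\frac{\sqrt{2(\theta^3+8\theta^2+20\theta+13)}}{\theta^2+5\theta+8}$$

$$\sqrt{\beta_1}=\frac{\mu_3}{\mu_2^{3/2}}=\frac{\theta^5+13\theta^4+68\theta^3+171\theta^2+195\theta+80}{\sqrt{2}\,(\theta^3+8\theta^2+20\theta+13)^{3/2}}$$

$$\beta_2=\frac{\mu_4}{\mu_2^{2}}=\frac{\theta^7+26\theta^6+269\theta^5+1435\theta^4+4230\theta^3+6819\theta^2+5520\theta+1740}{2(\theta^3+8\theta^2+20\theta+13)^2}$$

$$\gamma=\frac{\sigma^2}{\mu_1'}=\frac{2(\theta^3+8\theta^2+20\theta+13)}{\theta(\theta+3)(\theta^2+5\theta+8)}$$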
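To see how these four measures behave, they can be tabulated for a few parameter values. The sketch below is one way to do so. The function names and the grid of θ values are illustrative assumptions.

```python
from math import sqrt


def _p3(t):
    return t**3 + 8 * t**2 + 20 * t + 13


def _p5(t):
    return t**5 + 13 * t**4 + 68 * t**3 + 171 * t**2 + 195 * t + 80


def _p7(t):
    return (t**7 + 26 * t**6 + 269 * t**5 + 1435 * t**4
            + 4230 * t**3 + 6819 * t**2 + 5520 * t + 1740)


def cv(t):
    """Coefficient of variation sigma / mu_1'."""
    return sqrt(2 * _p3(t)) / (t**2 + 5 * t + 8)


def skewness(t):
    """sqrt(beta_1) = mu_3 / mu_2^(3/2)."""
    return _p5(t) / (sqrt(2) * _p3(t) ** 1.5)


def kurtosis(t):
    """beta_2 = mu_4 / mu_2^2."""
    return _p7(t) / (2 * _p3(t) ** 2)


def index_of_dispersion(t):
    """gamma = sigma^2 / mu_1'."""
    return 2 * _p3(t) / (t * (t + 3) * (t**2 + 5 * t + 8))


for t in (1.0, 2.0, 5.0):
    print(t, cv(t), skewness(t), kurtosis(t), index_of_dispersion(t))
```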

Graphs of the coefficient of variation, coefficient of skewness, coefficient of kurtosis and index of dispersion of the SBPGD for varying values of the parameter θ are shown in Figure 2. The graphs show that the C.V and the index of dispersion are monotonically decreasing, while the coefficient of skewness and the coefficient of kurtosis are increasing, for increasing values of θ.

Figure 2 Graphs of coefficient of variation, coefficient of skewness, coefficient of kurtosis and index of dispersion of SBPGD for varying values of parameter θ .

The conditions under which the SBPGD and SBPLD are over-dispersed, equi-dispersed or under-dispersed are presented in Table 1.

Distribution   Over-dispersion (μ < σ²)   Equi-dispersion (μ = σ²)   Under-dispersion (μ > σ²)
SBPGD          θ < 1.636061               θ = 1.636061               θ > 1.636061
SBPLD          θ < 1.671162               θ = 1.671162               θ > 1.671162

Table 1 Over-dispersion, equi-dispersion and under-dispersion of SBPGD and SBPLD
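For the SBPGD, the equi-dispersion boundary is the value of θ at which the index of dispersion γ = σ²/μ1′ equals one. A minimal bisection sketch for locating that root is given below. The function names, bracketing interval and tolerance are illustrative assumptions.

```python
def index_of_dispersion(theta):
    """gamma = sigma^2 / mu_1' for the SBPGD."""
    num = 2 * (theta**3 + 8 * theta**2 + 20 * theta + 13)
    den = theta * (theta + 3) * (theta**2 + 5 * theta + 8)
    return num / den


def bisect_root(f, lo, hi, tol=1e-10):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# gamma > 1 (over-dispersion) at theta = 0.5 and gamma < 1 at theta = 5
theta_star = bisect_root(lambda t: index_of_dispersion(t) - 1.0, 0.5, 5.0)
print(theta_star)
```

Values of θ below the root give over-dispersion and values above it give under-dispersion, since γ is decreasing across the bracketing interval.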

Statistical properties of SBPGD

Unimodality and increasing failure rate

Since

$$\frac{P_1(x+1;\theta)}{P_1(x;\theta)}=\left(\frac{1}{\theta+1}\right)\left[1+\frac{2x\theta+(\theta^2+4\theta+1)}{x^2\theta+x(\theta^2+3\theta+1)}\right]$$

is a decreasing function of x, $P_1(x;\theta)$ is log-concave. Therefore, the SBPGD is unimodal, has an increasing failure rate (IFR), and hence an increasing failure rate average (IFRA). It is new better than used in expectation (NBUE) and has decreasing mean residual life (DMRL). A detailed discussion of the definitions of these aging concepts and their interrelationships is available in Barlow and Proschan.14
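The monotonicity of the successive-probability ratio can be spot-checked numerically. The sketch below evaluates the ratio formula for a few illustrative values of θ, confirms that it agrees with the direct ratio of SBPGD probabilities, and checks that it decreases in x. The function names and the grid are assumptions made for illustration.

```python
def sbpgd_pmf(x, theta):
    """SBPGD p.m.f. (2.1)."""
    a = theta**2 + 3 * theta + 1
    return theta**2 / (theta + 3) * (x**2 * theta + x * a) / (theta + 1) ** (x + 2)


def successive_ratio(x, theta):
    """Closed-form ratio P1(x+1; theta) / P1(x; theta)."""
    a = theta**2 + 3 * theta + 1
    return (1 + (2 * x * theta + a + theta) / (x**2 * theta + x * a)) / (theta + 1)


is_decreasing = True
max_gap = 0.0
for theta in (0.5, 2.0, 5.0):
    ratios = [successive_ratio(x, theta) for x in range(1, 60)]
    # Each ratio should be strictly smaller than the previous one
    is_decreasing = is_decreasing and all(r1 > r2 for r1, r2 in zip(ratios, ratios[1:]))
    # The closed form should match the ratio computed directly from the p.m.f.
    for x in range(1, 60):
        direct = sbpgd_pmf(x + 1, theta) / sbpgd_pmf(x, theta)
        max_gap = max(max_gap, abs(direct - successive_ratio(x, theta)))

print(is_decreasing, max_gap)
```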

Generating functions

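Probability generating function: The probability generating function of the SBPGD (2.1) can be obtained as

$$\begin{aligned}
P_X(t)=E\left(t^X\right)&=\frac{\theta^2}{(\theta+3)(\theta+1)^2}\left[\theta\sum_{x=1}^{\infty}x^2\left(\frac{t}{\theta+1}\right)^{x}+(\theta^2+3\theta+1)\sum_{x=1}^{\infty}x\left(\frac{t}{\theta+1}\right)^{x}\right]\\
&=\frac{\theta^2}{(\theta+3)(\theta+1)^2}\left[\frac{\theta\,t(\theta+1+t)(\theta+1)}{(\theta+1-t)^3}+\frac{t(\theta^2+3\theta+1)(\theta+1)}{(\theta+1-t)^2}\right]\\
&=\frac{\theta^2\,t}{(\theta+3)(\theta+1)}\left[\frac{\theta(\theta+1+t)}{(\theta+1-t)^3}+\frac{\theta^2+3\theta+1}{(\theta+1-t)^2}\right]
\end{aligned}$$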

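Moment generating function: The moment generating function of the SBPGD (2.1) can be given by

$$M_X(t)=E\left(e^{tX}\right)=\frac{\theta^2\,e^{t}}{(\theta+3)(\theta+1)}\left[\frac{\theta(\theta+1+e^{t})}{(\theta+1-e^{t})^3}+\frac{\theta^2+3\theta+1}{(\theta+1-e^{t})^2}\right]$$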
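The closed-form generating function can be compared with the defining series. In the sketch below, `sbpgd_pgf` is a hypothetical name for the probability generating function. The series converges for |t| < θ + 1, and the moment generating function follows by evaluating the same expression at e^t. The derivative at t = 1, approximated here by a central difference, should reproduce the SBPGD mean.

```python
def sbpgd_pmf(x, theta):
    """SBPGD p.m.f. (2.1)."""
    a = theta**2 + 3 * theta + 1
    return theta**2 / (theta + 3) * (x**2 * theta + x * a) / (theta + 1) ** (x + 2)


def sbpgd_pgf(t, theta):
    """Closed-form p.g.f. of the SBPGD, valid for |t| < theta + 1."""
    a = theta**2 + 3 * theta + 1
    d = theta + 1 - t
    bracket = theta * (theta + 1 + t) / d**3 + a / d**2
    return theta**2 * t / ((theta + 3) * (theta + 1)) * bracket


theta, t = 1.2, 0.7   # arbitrary illustrative values

# Truncated series E(t^X) versus the closed form
series = sum(t**x * sbpgd_pmf(x, theta) for x in range(1, 400))
closed = sbpgd_pgf(t, theta)

# Central-difference derivative at t = 1 versus the closed-form mean
h = 1e-5
mean_from_pgf = (sbpgd_pgf(1 + h, theta) - sbpgd_pgf(1 - h, theta)) / (2 * h)
mean_closed = (theta**2 + 5 * theta + 8) / (theta * (theta + 3))
print(series, closed, mean_from_pgf, mean_closed)
```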

Estimation of parameter

Method of moment estimate (MOME): Let $x_1,x_2,\ldots,x_n$ be a random sample of size n from the SBPGD (2.1). Equating the population mean to the corresponding sample mean, the MOME $\tilde\theta$ of θ of the SBPGD (2.1) can be obtained as

$$\tilde\theta=\frac{-(3\bar x-5)+\sqrt{9\bar x^{2}+2\bar x-7}}{2(\bar x-1)};\quad \bar x>1$$

where $\bar x$ is the sample mean.
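The estimator is the positive root of the quadratic (x̄ − 1)θ² + (3x̄ − 5)θ − 8 = 0 that results from setting the SBPGD mean equal to x̄. A minimal sketch, using the European red mite frequencies from Table 3 as input and hypothetical function names:

```python
from math import sqrt


def sbpgd_mean(theta):
    """Mean of the SBPGD."""
    return (theta**2 + 5 * theta + 8) / (theta * (theta + 3))


def sbpgd_mome(xbar):
    """Method of moment estimate of theta; requires a sample mean above 1."""
    if xbar <= 1:
        raise ValueError("the sample mean must exceed 1")
    return (-(3 * xbar - 5) + sqrt(9 * xbar**2 + 2 * xbar - 7)) / (2 * (xbar - 1))


# Observed frequencies of European red mites per leaf (Table 3)
freq = {1: 38, 2: 17, 3: 10, 4: 9, 5: 3, 6: 2, 7: 1, 8: 0}
n = sum(freq.values())
xbar = sum(x * f for x, f in freq.items()) / n

theta_tilde = sbpgd_mome(xbar)
print(xbar, theta_tilde, sbpgd_mean(theta_tilde))
```

Plugging the estimate back into the mean formula should return the sample mean, which is a convenient self-check.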

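Maximum likelihood estimate (MLE): Let $x_1,x_2,\ldots,x_n$ be a random sample of size n from the SBPGD (2.1) and let $f_x$ be the observed frequency in the sample corresponding to $X=x$ $(x=1,2,3,\ldots,k)$ such that $\sum_{x=1}^{k}f_x=n$, where k is the largest observed value having non-zero frequency. The likelihood function L of the SBPGD (2.1) is given by

$$L=\left(\frac{\theta^2}{\theta+3}\right)^{n}\frac{1}{(\theta+1)^{\sum_{x=1}^{k}f_x(x+2)}}\prod_{x=1}^{k}\left[x^2\theta+x(\theta^2+3\theta+1)\right]^{f_x}$$

The log likelihood function is obtained as

$$\log L=n\log\left(\frac{\theta^2}{\theta+3}\right)-\sum_{x=1}^{k}f_x(x+2)\log(\theta+1)+\sum_{x=1}^{k}f_x\log\left[x^2\theta+x(\theta^2+3\theta+1)\right]$$

The first derivative of the log likelihood function is given by

$$\frac{d\log L}{d\theta}=\frac{n(\theta+6)}{\theta(\theta+3)}-\frac{n(\bar x+2)}{\theta+1}+\sum_{x=1}^{k}\frac{(x+2\theta+3)\,f_x}{x\theta+(\theta^2+3\theta+1)}$$

where $\bar x$ is the sample mean.

The maximum likelihood estimate (MLE) $\hat\theta$ of θ is the solution of the equation $\dfrac{d\log L}{d\theta}=0$ and is given by the solution of the non-linear equation

$$\sum_{x=1}^{k}\frac{(x+2\theta+3)\,f_x}{x\theta+(\theta^2+3\theta+1)}-\frac{n(\bar x+2)}{\theta+1}+\frac{n(\theta+6)}{\theta(\theta+3)}=0$$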

This non-linear equation can be solved by any numerical iteration method, such as the Newton-Raphson method, the bisection method or the regula falsi method. In this paper, the equation has been solved using the Newton-Raphson method, with the method of moment estimate as the initial value of θ.
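The Newton-Raphson iteration described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function names, tolerance and iteration cap are hypothetical, the derivative of the score is worked out analytically from the likelihood equation, and the data are the European red mite frequencies of Table 3.

```python
from math import sqrt

# Observed frequencies (Table 3); the empty class x = 8 contributes nothing
freq = {1: 38, 2: 17, 3: 10, 4: 9, 5: 3, 6: 2, 7: 1}
n = sum(freq.values())
xbar = sum(x * f for x, f in freq.items()) / n


def score(theta):
    """d log L / d theta for the SBPGD."""
    a = theta**2 + 3 * theta + 1
    s = n * (theta + 6) / (theta * (theta + 3)) - n * (xbar + 2) / (theta + 1)
    s += sum(f * (x + 2 * theta + 3) / (x * theta + a) for x, f in freq.items())
    return s


def score_derivative(theta):
    """Second derivative of log L, obtained by differentiating the score term by term."""
    a = theta**2 + 3 * theta + 1
    d = n * (-2 / theta**2 + 1 / (theta + 3) ** 2) + n * (xbar + 2) / (theta + 1) ** 2
    d += sum(
        f * (2 * (x * theta + a) - (x + 2 * theta + 3) ** 2) / (x * theta + a) ** 2
        for x, f in freq.items()
    )
    return d


def newton_mle(theta0, tol=1e-10, max_iter=50):
    theta = theta0
    for _ in range(max_iter):
        step = score(theta) / score_derivative(theta)
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")


# Method of moment estimate as the starting value
theta0 = (-(3 * xbar - 5) + sqrt(9 * xbar**2 + 2 * xbar - 7)) / (2 * (xbar - 1))
theta_hat = newton_mle(theta0)
print(theta0, theta_hat)
```

Because the moment estimate is already close to the root, the iteration typically settles within a few steps for data of this kind.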

Data analysis

In this section, the SBPGD is fitted using the maximum likelihood estimate to test its goodness of fit against the SBPD and SBPLD. The first data set is the immunogold assay data of Cullen et al.15 on the distribution of the number of counts of sites with particles; the second data set is the number of European red mites on apple leaves, reported by Garman16 (Tables 2 and 3).

It is evident from Tables 2 and 3 that the SBPGD gives a better fit than both the SBPD and SBPLD.

No. of sites       Observed      Expected frequency
with particles     frequency     SBPD         SBPLD        SBPGD
1                  122           111.3        119.0        119.1
2                  50            64.1         53.8         53.7
3                  18            18.5*        18.0         18.0
4                  4             3.5*         5.3*         5.3*
5                  4             0.6*         1.9*         1.9*
Total              198           198.0        198.0        198.0
ML estimate                      θ̂=0.576     θ̂=4.051     θ̂=2.0992
χ²                               4.642        0.51         0.40
d.f.                             1            2            2
p-value                          0.031        0.7749       0.8187

* Within each column, the starred expected frequencies are bracketed together in the source table, indicating classes pooled for the χ² computation.

Table 2 Distribution of number of counts of sites with particles from immunogold data

Number of European    Observed      Expected frequency
red mites             frequency     SBPD           SBPLD          SBPGD
1                     38            28.7           31.7           31.9
2                     17            25.7           23.9           23.8
3                     10            15.3           13.2           13.1
4                     9             6.9*           6.3*           6.3*
5                     3             2.5*           2.8*           2.8*
6                     2             0.7*           1.2*           1.2*
7                     1             0.2*           0.5*           0.5*
8                     0             0.1*           0.4*           0.4*
Total                 80            80.0           80.0           80.0
ML estimate                         θ̂=1.791615    θ̂=2.163462    θ̂=2.08381
χ²                                  9.827          5.30           5.11
d.f.                                2              2              2
p-value                             0.0073         0.0706         0.0777

* Within each column, the starred expected frequencies are bracketed together in the source table, indicating classes pooled for the χ² computation.

Table 3 Number of European red mites on apple leaves, reported by Garman (1923)
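The goodness-of-fit figures for the SBPGD column of Table 3 can be approximately reproduced from the p.m.f. (2.1) and the reported estimate. The sketch below pools the classes x ≥ 4 into one cell and takes its expected frequency as the remaining probability mass. This pooling rule, the variable names and the use of the closed-form survival function exp(−χ²/2) for a χ² variable with 2 degrees of freedom are assumptions made for illustration, so small differences from the tabulated values due to rounding are to be expected.

```python
from math import exp


def sbpgd_pmf(x, theta):
    """SBPGD p.m.f. (2.1)."""
    a = theta**2 + 3 * theta + 1
    return theta**2 / (theta + 3) * (x**2 * theta + x * a) / (theta + 1) ** (x + 2)


observed = {1: 38, 2: 17, 3: 10, 4: 9, 5: 3, 6: 2, 7: 1, 8: 0}
theta_hat = 2.08381                 # SBPGD ML estimate reported in Table 3
n = sum(observed.values())

# Cells: x = 1, x = 2, x = 3 and the pooled cell x >= 4
obs_cells = [observed[1], observed[2], observed[3],
             sum(f for x, f in observed.items() if x >= 4)]
probs = [sbpgd_pmf(x, theta_hat) for x in (1, 2, 3)]
exp_cells = [n * p for p in probs] + [n * (1 - sum(probs))]

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_cells, exp_cells))
df = len(obs_cells) - 1 - 1         # cells minus one, minus one estimated parameter
p_value = exp(-chi2 / 2)            # exact upper tail of chi-square with 2 d.f.
print(exp_cells, chi2, df, p_value)
```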

Acknowledgements

None.

Conflicts of interest

None.

References

  1. Shanker R. The Discrete Poisson‒Garima Distribution. Biom Biostat Int J. 2017;5(2):00127.
  2. Shanker R. Garima Distribution and its Application to Model Behavioral Science Data. Biom Biostat Int J. 2016;4(7):00116.
  3. Fisher RA. The effects of methods of ascertainment upon the estimation of frequencies. Ann Eugenics. 1934;6(1):13‒25.
  4. Rao CR. On discrete distributions arising out of methods of ascertainment. In: Patil GP (Ed.), Classical and Contagious Discrete Distributions. Statistical Publishing Society, Calcutta. 1965;pp. 320‒332.
  5. Van Deusen PC. Fitting assumed distributions to horizontal point sample diameters. For Sci. 1986;32(1):146‒148.
  6. Lappi J, Bailey RL. Estimation of diameter increment function or other tree relations using angle‒count samples. Forest science. 1987;33:725‒739.
  7. Patil GP, Rao CR. The weighted distributions: A survey and their applications. In: Krishnaiah PR (Ed.), Applications of Statistics. North Holland Publications Co., Amsterdam. 1977;pp. 383‒405.
  8. Patil GP, Rao CR. Weighted distributions and size‒biased sampling with applications to wild‒life populations and human families. Biometrics. 1978;34(2):179‒189.
  9. Gove JH. Estimation and applications of size‒biased distributions in forestry. In: Amaro A, et al. (Eds), Modeling Forest Systems. CABI Publishing, USA. 2003;pp. 201‒212.
  10. Ducey MJ, Gove JH. Size‒biased distributions in the generalized beta distribution family, with applications to forestry. Forestry‒ An International Journal of Forest Research. 2015;88:143‒151.
  11. Ghitany ME, Al‒Mutairi DK. Size‒biased Poisson‒Lindley distribution and its applications. Metron ‒ International Journal of Statistics. 2008;LXVI(3):299‒311.
  12. Sankaran M. The discrete Poisson‒Lindley distribution. Biometrics. 1970;26(1):145‒149.
  13. Shanker R, Hagos F, Abrehe Y. On Size‒Biased Poisson‒Lindley Distribution and Its Applications to Model Thunderstorms. American Journal of Mathematics and Statistics. 2015;5(6):354‒360.
  14. Barlow RE, Proschan F. Statistical Theory of Reliability and Life Testing, Silver Spring, MD. 1981.
  15. Cullen MJ, Walsh J, Nicholson LV, et al. Ultrastructural localization of dystrophin in human muscle by using gold immunolabelling. Proc R Soc Lond B Biol Sci. 240(1297):197‒210.
  16. Garman P. The European red mites in Connecticut apple orchards. Connecticut Agri Exper Station Bull. 1923;252:103‒125.
  17. Lindley DV. Fiducial distributions and Bayes theorem. Journal of the Royal Statistical Society. 1958;20(1):102‒107.
©2017 Shanker, et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.