eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 2 Issue 4

On Poisson-Lindley distribution and its applications to biological sciences

Rama Shanker,1 Hagos Fesshaye2

1Department of Statistics, Eritrea Institute of Technology, Eritrea
2Department of Economics, College of Business and Economics, Eritrea

Correspondence: Rama Shanker, Department of Statistics, Eritrea Institute of Technology, Eritrea

Received: April 13, 2015 | Published: April 27, 2015

Citation: Shanker R, Fesshaye H. On Poisson-Lindley distribution and its applications to biological sciences. Biom Biostat Int J. 2015;2(4):103-107. DOI: 10.15406/bbij.2015.02.00036


Abstract

A general expression for the $r$th factorial moment of the Poisson-Lindley distribution has been obtained, and hence its first four moments about origin have been obtained. The distribution has been fitted to some data sets relating to ecology and genetics to test its goodness of fit, and the fit shows that it can be an important tool for modeling biological science data.

Keywords: Lindley distribution, Poisson-Lindley distribution, moments, compounding, estimation of parameters, goodness of fit

Introduction

The Poisson-Lindley distribution (PLD) given by its probability mass function

$$P(X=x) = \frac{\theta^2 (x+\theta+2)}{(\theta+1)^{x+3}}; \quad x = 0, 1, 2, \ldots, \ \theta > 0 \tag{1.1}$$

has been introduced by Sankaran (1970) to model count data. The distribution arises from the Poisson distribution when its parameter λ follows the Lindley (1958) distribution with probability density function

$$f(\lambda;\theta) = \frac{\theta^2}{\theta+1}(1+\lambda)e^{-\theta\lambda}; \quad \lambda > 0, \ \theta > 0 \tag{1.2}$$

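We have

$$P(X=x) = \int_0^\infty P(x \mid \lambda)\, f(\lambda;\theta)\, d\lambda = \int_0^\infty \frac{e^{-\lambda}\lambda^x}{x!} \cdot \frac{\theta^2}{\theta+1}(1+\lambda)e^{-\theta\lambda}\, d\lambda \tag{1.3}$$

$$= \frac{\theta^2}{(\theta+1)\,x!} \int_0^\infty e^{-(\theta+1)\lambda}\left(\lambda^x + \lambda^{x+1}\right) d\lambda$$

$$= \frac{\theta^2}{\theta+1}\left[\frac{1}{(\theta+1)^{x+1}} + \frac{x+1}{(\theta+1)^{x+2}}\right]$$

$$= \frac{\theta^2 (x+\theta+2)}{(\theta+1)^{x+3}}; \quad \theta > 0, \ x = 0, 1, 2, 3, \ldots$$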

which is the Poisson-Lindley distribution (PLD).
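As a numerical sanity check on the derivation above, the following minimal sketch evaluates the closed-form probability mass function (1.1) and compares it with a direct numerical evaluation of the mixing integral (1.3). The function names, the trapezoidal rule, the truncation point `upper` and the value θ = 2 are illustrative assumptions, not part of the original paper.

```python
import math


def pld_pmf(x: int, theta: float) -> float:
    """Closed-form PLD pmf (1.1), evaluated on the log scale to avoid overflow."""
    log_p = 2 * math.log(theta) + math.log(x + theta + 2) - (x + 3) * math.log(theta + 1)
    return math.exp(log_p)


def lindley_pdf(lam: float, theta: float) -> float:
    """Lindley density (1.2)."""
    return theta**2 / (theta + 1) * (1 + lam) * math.exp(-theta * lam)


def pld_pmf_by_mixing(x: int, theta: float, upper: float = 40.0, steps: int = 20000) -> float:
    """Approximate the integral in (1.3) with the composite trapezoidal rule.

    The integral is truncated at `upper`; this is adequate for moderate theta
    because the integrand decays like exp(-(theta + 1) * lam).
    """
    def integrand(lam: float) -> float:
        poisson = math.exp(-lam) * lam**x / math.factorial(x)
        return poisson * lindley_pdf(lam, theta)

    h = upper / steps
    total = 0.5 * (integrand(0.0) + integrand(upper))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h


theta = 2.0  # illustrative value only
for x in range(4):
    print(x, round(pld_pmf(x, theta), 6), round(pld_pmf_by_mixing(x, theta), 6))
print("sum of pmf over x = 0..199:", sum(pld_pmf(x, theta) for x in range(200)))
```

The two columns should agree to several decimal places, and the probabilities should sum to one up to rounding.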

The PLD has been extensively studied by Sankaran1 and Ghitany & Al-Mutairi,2 who have discussed its various properties. The PLD has been generalized by many researchers. Shanker & Mishra3 obtained a two-parameter Poisson-Lindley distribution by compounding the Poisson distribution with the two-parameter Lindley distribution introduced by Shanker & Mishra.4 A quasi Poisson-Lindley distribution has been introduced by Shanker & Mishra5 by compounding the Poisson distribution with the quasi Lindley distribution introduced by Shanker & Mishra.6 Shanker et al.7 obtained a discrete two-parameter Poisson-Lindley distribution by mixing the Poisson distribution with the two-parameter Lindley distribution for modeling waiting and survival times data introduced by Shanker et al.8 Further, Shanker & Tekie9 obtained a new quasi Poisson-Lindley distribution by compounding the Poisson distribution with the new quasi Lindley distribution introduced by Shanker & Amanuel.10

In this paper, a general expression for the $r$th factorial moment of the PLD has been obtained, and hence its first four moments about origin have also been obtained. It seems that not much work has been done on the applications of the PLD. The PLD has been fitted to some data sets in ecology and genetics along with the Poisson distribution, and it has been found that the PLD is more flexible than the Poisson distribution for analyzing different types of count data.

Moments of Poisson-Lindley distribution

The $r$th factorial moment about origin of the PLD (1.1) can be obtained as

$$\mu_{(r)}' = E\left[E\left(X^{(r)} \mid \lambda\right)\right] \tag{2.1}$$

where

$$X^{(r)} = X(X-1)(X-2)\cdots(X-r+1)$$

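From (1.3), we thus have

$$\mu_{(r)}' = \int_0^\infty \left[\sum_{x=0}^{\infty} x^{(r)} \frac{e^{-\lambda}\lambda^x}{x!}\right] \frac{\theta^2}{\theta+1}(1+\lambda)e^{-\theta\lambda}\, d\lambda$$

$$= \int_0^\infty \left[\lambda^r \sum_{x=r}^{\infty} \frac{e^{-\lambda}\lambda^{x-r}}{(x-r)!}\right] \frac{\theta^2}{\theta+1}(1+\lambda)e^{-\theta\lambda}\, d\lambda$$

Taking $x+r$ in place of $x$, we get

$$\mu_{(r)}' = \int_0^\infty \lambda^r \left[\sum_{x=0}^{\infty} \frac{e^{-\lambda}\lambda^x}{x!}\right] \frac{\theta^2}{\theta+1}(1+\lambda)e^{-\theta\lambda}\, d\lambda$$

The expression within the bracket is clearly unity and hence we have

$$\mu_{(r)}' = \frac{\theta^2}{\theta+1} \int_0^\infty \lambda^r (1+\lambda) e^{-\theta\lambda}\, d\lambda$$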

Using the gamma integral and a little algebraic simplification, we finally get a general expression for the $r$th factorial moment of the PLD as

$$\mu_{(r)}' = \frac{r!\,(\theta+r+1)}{\theta^r(\theta+1)}; \quad r = 1, 2, 3, \ldots \tag{2.2}$$
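The closed form (2.2) can be checked against a brute-force summation of falling factorials over the probability mass function (1.1). The sketch below does this for r = 1 to 4; the function names, the truncation at 400 terms and the value θ = 1.5 are assumptions chosen for illustration.

```python
import math


def pld_pmf(x: int, theta: float) -> float:
    """PLD pmf (1.1), computed on the log scale so large x underflows to 0."""
    log_p = 2 * math.log(theta) + math.log(x + theta + 2) - (x + 3) * math.log(theta + 1)
    return math.exp(log_p)


def falling_factorial(x: int, r: int) -> int:
    """x (x - 1) ... (x - r + 1); equals 0 whenever x < r."""
    out = 1
    for j in range(r):
        out *= x - j
    return out


def factorial_moment_closed_form(r: int, theta: float) -> float:
    """Equation (2.2): r! (theta + r + 1) / (theta^r (theta + 1))."""
    return math.factorial(r) * (theta + r + 1) / (theta**r * (theta + 1))


def factorial_moment_by_summation(r: int, theta: float, terms: int = 400) -> float:
    """Truncated sum of x^(r) P(X = x); the tail is negligible for moderate theta."""
    return sum(falling_factorial(x, r) * pld_pmf(x, theta) for x in range(terms))


theta = 1.5  # illustrative value only
for r in range(1, 5):
    closed = factorial_moment_closed_form(r, theta)
    summed = factorial_moment_by_summation(r, theta)
    print(r, closed, summed)
```

For very small θ the distribution has a heavier tail, so `terms` would need to be increased for the summation to converge to the same precision.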

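Substituting $r = 1, 2, 3$ and $4$ in (2.2), the first four factorial moments can be obtained. Using the relationship between factorial moments and moments about origin, the first four moments about origin of the PLD (1.1) are given by

$$\mu_1' = \frac{\theta+2}{\theta(\theta+1)} \tag{2.3}$$

$$\mu_2' = \frac{\theta+2}{\theta(\theta+1)} + \frac{2(\theta+3)}{\theta^2(\theta+1)} \tag{2.4}$$

$$\mu_3' = \frac{\theta+2}{\theta(\theta+1)} + \frac{6(\theta+3)}{\theta^2(\theta+1)} + \frac{6(\theta+4)}{\theta^3(\theta+1)} \tag{2.5}$$

$$\mu_4' = \frac{\theta+2}{\theta(\theta+1)} + \frac{14(\theta+3)}{\theta^2(\theta+1)} + \frac{36(\theta+4)}{\theta^3(\theta+1)} + \frac{24(\theta+5)}{\theta^4(\theta+1)} \tag{2.6}$$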
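The "relationship between factorial moments and moments about origin" used here is the standard expansion of a raw moment as a sum of factorial moments weighted by Stirling numbers of the second kind. The sketch below applies it to (2.2) and compares the result with the explicit expressions for the first four moments; the helper names and the test value θ = 2 are illustrative assumptions.

```python
import math

# Stirling numbers of the second kind S(k, j) for j = 1..k, used in
# mu'_k = sum_j S(k, j) * mu'_(j).
STIRLING2 = {1: [1], 2: [1, 1], 3: [1, 3, 1], 4: [1, 7, 6, 1]}


def factorial_moment(r: int, theta: float) -> float:
    """Equation (2.2)."""
    return math.factorial(r) * (theta + r + 1) / (theta**r * (theta + 1))


def raw_moment(k: int, theta: float) -> float:
    """k-th moment about origin built from the factorial moments."""
    return sum(s * factorial_moment(j, theta) for j, s in enumerate(STIRLING2[k], start=1))


def raw_moments_explicit(theta: float) -> list:
    """The four explicit expressions for the moments about origin, written out term by term."""
    a = (theta + 2) / (theta * (theta + 1))
    b = (theta + 3) / (theta**2 * (theta + 1))
    c = (theta + 4) / (theta**3 * (theta + 1))
    d = (theta + 5) / (theta**4 * (theta + 1))
    return [a, a + 2 * b, a + 6 * b + 6 * c, a + 14 * b + 36 * c + 24 * d]


theta = 2.0  # illustrative value only
explicit = raw_moments_explicit(theta)
for k in range(1, 5):
    print(k, raw_moment(k, theta), explicit[k - 1])

mean = raw_moment(1, theta)
variance = raw_moment(2, theta) - mean**2
print("mean =", mean, "variance =", variance)
```

The variance printed at the end is obtained in the usual way as the second moment about origin minus the squared mean.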

Ghitany & Al-Mutairi2 discussed the estimation methods for the PLD (1.1) and its applications.

Estimation of parameters

Maximum likelihood (ML) estimates: Let $x_1, x_2, \ldots, x_n$ be a random sample of size $n$ from the PLD (1.1). Let $f_x$ be the observed frequency in the sample corresponding to $X = x$ ($x = 0, 1, 2, \ldots, k$) such that $\sum_{x=0}^{k} f_x = n$, where $k$ is the largest observed value having non-zero frequency. The likelihood function, $L$, of the PLD (1.1) is given by

$$L = \theta^{2n}\, \frac{1}{(\theta+1)^{\sum_{x=0}^{k} f_x (x+3)}}\, \prod_{x=0}^{k} (x+\theta+2)^{f_x}$$

The log likelihood function is given by

$$\log L = 2n \log\theta - \sum_{x=0}^{k} f_x (x+3) \log(\theta+1) + \sum_{x=0}^{k} f_x \log(x+\theta+2)$$

The maximum likelihood estimate, $\hat{\theta}$, of $\theta$ is the solution of the equation $\frac{d \log L}{d\theta} = 0$, that is, of the following non-linear equation

$$\frac{2n}{\theta} - \frac{n(\bar{x}+3)}{\theta+1} + \sum_{x=0}^{k} \frac{f_x}{x+\theta+2} = 0$$

where $\bar{x}$ is the sample mean. It has been shown by Ghitany & Al-Mutairi2 that the ML estimator $\hat{\theta}$ of $\theta$ is consistent and asymptotically normal.
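The non-linear likelihood equation has no closed-form solution, so it must be solved numerically. The paper does not say which solver was used; the sketch below uses plain bisection as one possible choice, applied to the observed yeast cell frequencies of Table 1. The function names, the bracketing interval and the tolerance are assumptions, and the frequencies are indexed from x = 0.

```python
def pld_score(theta: float, freqs: list) -> float:
    """Derivative of the log likelihood; freqs[x] is the observed frequency of count x."""
    n = sum(freqs)
    xbar = sum(x * f for x, f in enumerate(freqs)) / n
    tail = sum(f / (x + theta + 2) for x, f in enumerate(freqs))
    return 2 * n / theta - n * (xbar + 3) / (theta + 1) + tail


def pld_mle(freqs: list, lo: float = 1e-6, hi: float = 1e3, tol: float = 1e-10) -> float:
    """Solve pld_score(theta) = 0 by bisection on [lo, hi] (an assumed bracket)."""
    if pld_score(lo, freqs) <= 0 or pld_score(hi, freqs) >= 0:
        raise ValueError("score does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pld_score(mid, freqs) > 0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return 0.5 * (lo + hi)


# Observed frequencies for 0, 1, 2, 3, 4 and 5+ cells per square (Table 1).
yeast = [128, 37, 18, 3, 1, 0]
theta_hat = pld_mle(yeast)
print(f"theta_hat = {theta_hat:.6f}")
```

The result can be compared with the PLD parameter estimate reported in Table 1. A sample consisting only of zeros has no finite solution, which is why the sign check raises an error in that case.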

Estimates from moments: Let $x_1, x_2, \ldots, x_n$ be a random sample of size $n$ from the PLD (1.1). Equating the first moment about origin to the sample mean, the method of moments (MOM) estimate, $\tilde{\theta}$, of $\theta$ is given by

$$\tilde{\theta} = \frac{-(\bar{x}-1) + \sqrt{(\bar{x}-1)^2 + 8\bar{x}}}{2\bar{x}}; \quad \bar{x} > 0$$

where $\bar{x}$ is the sample mean. It has been shown by Ghitany & Al-Mutairi2 that the MOM estimator $\tilde{\theta}$ of $\theta$ is positively biased, consistent and asymptotically normal.
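The MOM estimate is the positive root of the quadratic obtained by setting the PLD mean (θ + 2)/(θ(θ + 1)) equal to the sample mean. A minimal sketch follows, with a round-trip check that the estimate reproduces the sample mean; the function names are illustrative, and the sample mean 86/187 is the one implied by the observed frequencies in Table 1.

```python
import math


def pld_mean(theta: float) -> float:
    """First moment about origin of the PLD."""
    return (theta + 2) / (theta * (theta + 1))


def pld_mom(xbar: float) -> float:
    """Positive root of xbar * theta^2 + (xbar - 1) * theta - 2 = 0."""
    if xbar <= 0:
        raise ValueError("the sample mean must be positive")
    return (-(xbar - 1) + math.sqrt((xbar - 1) ** 2 + 8 * xbar)) / (2 * xbar)


xbar = 86 / 187  # sample mean of the yeast cell counts in Table 1
theta_tilde = pld_mom(xbar)
print(f"theta_tilde = {theta_tilde:.6f}, implied mean = {pld_mean(theta_tilde):.6f}")
```

Because it needs only the sample mean, this estimate is also a convenient starting value for an iterative solution of the likelihood equation.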

Applications of Poisson-Lindley distribution

The Poisson distribution is a suitable model for situations where events seem to occur at random, such as the number of customers arriving at a service point, the number of telephone calls arriving at an exchange, the number of fatal traffic accidents per week in a given state, the number of radioactive particle emissions per unit of time, the number of meteorites that collide with a test satellite during a single orbit, the number of organisms per unit volume of some fluid, the number of defects per unit of some material, the number of flaws per unit length of some wire, etc. However, the Poisson distribution requires events to be independent, a condition which is rarely satisfied completely. In biological science and medical science, the occurrence of successive events is dependent. The negative binomial distribution is a possible alternative to the Poisson distribution when successive events are possibly dependent.11 Further, for fitting the Poisson distribution to count data, equality of mean and variance should be satisfied. Similarly, for fitting the negative binomial distribution (NBD) to count data, the mean should be less than the variance. In biological and medical sciences, these conditions are not fully satisfied.
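Whether a data set satisfies the equal mean and variance condition can be checked directly from its frequency table. The sketch below computes the sample mean, the sample variance and their ratio for the observed red mite frequencies of Table 2; the function name and the use of the n − 1 denominator are assumptions made for illustration.

```python
def grouped_mean_variance(freqs: list) -> tuple:
    """Sample mean and (n - 1)-denominator variance; freqs[x] is the number of units with count x."""
    n = sum(freqs)
    mean = sum(x * f for x, f in enumerate(freqs)) / n
    var = sum(f * (x - mean) ** 2 for x, f in enumerate(freqs)) / (n - 1)
    return mean, var


# Observed frequencies for 0, 1, ..., 6 and 7+ mites per leaf (Table 2).
# The open class 7+ has zero frequency, so treating it as exactly 7 is harmless.
mites = [38, 17, 10, 9, 3, 2, 1, 0]
mean, var = grouped_mean_variance(mites)
print(f"mean = {mean:.4f}, variance = {var:.4f}, variance/mean = {var / mean:.3f}")
```

A ratio well above one indicates overdispersion relative to the Poisson distribution.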

The theoretical and empirical justification for the selection of the PLD to describe biological science and medical science data is that the PLD is overdispersed (μ < σ²).
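The overdispersion claim follows from the first two moments about origin of the PLD. As a short worked derivation (the symbols μ'₁ and μ'₂ denote the first and second moments about origin; the simplification steps are added here and are not spelled out in the original):

```latex
\begin{aligned}
\mu &= \mu_1' = \frac{\theta+2}{\theta(\theta+1)}, \\
\mu_2' &= \frac{\theta+2}{\theta(\theta+1)} + \frac{2(\theta+3)}{\theta^2(\theta+1)}
        = \frac{\theta^2+4\theta+6}{\theta^2(\theta+1)}, \\
\sigma^2 &= \mu_2' - (\mu_1')^2
          = \frac{(\theta^2+4\theta+6)(\theta+1) - (\theta+2)^2}{\theta^2(\theta+1)^2}
          = \frac{\theta^3+4\theta^2+6\theta+2}{\theta^2(\theta+1)^2}, \\
\sigma^2 - \mu &= \frac{\theta^3+4\theta^2+6\theta+2 - \theta(\theta+1)(\theta+2)}{\theta^2(\theta+1)^2}
                = \frac{\theta^2+4\theta+2}{\theta^2(\theta+1)^2} > 0 \quad \text{for all } \theta > 0.
\end{aligned}
```

The difference is strictly positive for every admissible θ, so the variance always exceeds the mean.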

Application in ecology

Organisms and their environment in nature are not only complex and dynamic but also interdependent, mutually reactive and interrelated. Ecology deals with the various principles which govern such relationships between organisms and their environment. Fisher et al.12 have discussed the applications of the logarithmic series distribution (LSD) to model count data in the science of ecology. Kempton13 fitted the generalized form of Fisher's logarithmic series distribution to insect data and concluded that it gives a superior fit compared to the ordinary LSD. He also concluded that it gives a better explanation for data having an exceptionally long tail. Tripathi & Gupta14 proposed another generalization of the LSD, fitted it to insect data and found that it gives a better fit than the ordinary LSD. They concluded that the distribution is flexible enough to describe short-tailed as well as long-tailed data. Mishra & Shanker15 have discussed applications of generalized logarithmic series distributions (GLSD) to model data in ecology.

In this section, the Poisson distribution and the Poisson-Lindley distribution have been fitted to several biological data sets using maximum likelihood estimates. The data are haemocytometer yeast cell counts per square, European red mites on apple leaves, and European corn borers per plant.

Tables 1-3 show that the PLD gives a much closer fit than the Poisson distribution, and thus it can be considered an important tool for modeling data in ecology.

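| Number of cells per square | Observed frequency | Expected frequency (Poisson distribution) | Expected frequency (Poisson-Lindley distribution) |
|---|---|---|---|
| 0 | 128 | 118.1 | 127.4 |
| 1 | 37 | 54.3 | 41.1 |
| 2 | 18 | 12.5 | 12.9 |
| 3 | 3 | 1.9 | 3.9 |
| 4 | 1 | 0.2 | 1.2 |
| 5+ | 0 | 0.0 | 0.5 |
| Total | 187 | 187 | 187 |
| Estimate of parameter | | θ̂ = 0.459893 | θ̂ = 2.751579 |
| χ² | | 9.903 | 1.431 |
| d.f. | | 1 | 1 |
| p-value | | 0.0016 | 0.2316 |

In both expected-frequency columns, the classes 2 to 5+ are grouped by a brace.

Table 1 Observed and expected number of haemocytometer yeast cell counts per square observed by 'Student' (1907)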
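The expected frequencies in Table 1 can be reproduced from the fitted parameter as n·P(X = x), with the last open class taking the remaining probability. The sketch below does this for the PLD column and then computes a Pearson chi-square statistic. The pooling rule used here (classes 2 and above merged into one) is an assumption based on the grouping shown in the table, so the statistic it prints need not agree with the tabulated value. The p-value helper applies only to 1 degree of freedom, where the upper tail of the chi-square distribution reduces to a complementary error function.

```python
import math


def pld_pmf(x: int, theta: float) -> float:
    """PLD pmf (1.1), computed on the log scale."""
    log_p = 2 * math.log(theta) + math.log(x + theta + 2) - (x + 3) * math.log(theta + 1)
    return math.exp(log_p)


def expected_frequencies(n: int, theta: float, k: int) -> list:
    """Expected counts for classes 0..k-1 plus a final open class 'k or more'."""
    probs = [pld_pmf(x, theta) for x in range(k)]
    probs.append(1.0 - sum(probs))  # remaining mass goes to the open class
    return [n * p for p in probs]


def pooled_chi_square(observed: list, expected: list, first_pooled: int) -> float:
    """Pearson chi-square after merging classes first_pooled and above into one class."""
    obs = observed[:first_pooled] + [sum(observed[first_pooled:])]
    exp = expected[:first_pooled] + [sum(expected[first_pooled:])]
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


def p_value_df1(chi2: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 d.f."""
    return math.erfc(math.sqrt(chi2 / 2))


observed = [128, 37, 18, 3, 1, 0]   # classes 0, 1, 2, 3, 4, 5+
theta_hat = 2.751579                 # PLD estimate reported in Table 1
expected = expected_frequencies(sum(observed), theta_hat, k=5)
print([round(e, 1) for e in expected])

chi2 = pooled_chi_square(observed, expected, first_pooled=2)
print(f"chi-square (assumed pooling) = {chi2:.3f}, p-value = {p_value_df1(chi2):.4f}")
print(f"p-value for the tabulated chi-square 1.431: {p_value_df1(1.431):.4f}")
```

For other degrees of freedom a general chi-square survival function would be needed, for example from a statistics library.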

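| Number of mites per leaf | Observed frequency | Expected frequency (Poisson distribution) | Expected frequency (Poisson-Lindley distribution) |
|---|---|---|---|
| 0 | 38 | 25.3 | 35.8 |
| 1 | 17 | 29.1 | 20.7 |
| 2 | 10 | 16.7 | 11.4 |
| 3 | 9 | 6.4 | 6.0 |
| 4 | 3 | 1.8 | 3.1 |
| 5 | 2 | 0.4 | 1.6 |
| 6 | 1 | 0.2 | 0.8 |
| 7+ | 0 | 0.1 | 0.6 |
| Total | 80 | 80 | 80 |
| Estimate of parameter | | θ̂ = 1.15 | θ̂ = 1.255891 |
| χ² | | 18.275 | 2.469 |
| d.f. | | 2 | 3 |
| p-value | | 0.0001 | 0.4809 |

A brace groups the classes 3 to 7+ in the Poisson column and the classes 4 to 7+ in the Poisson-Lindley column.

Table 2 Observed and expected number of red mites on apple leaves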

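| Number of borers per plant | Observed frequency | Expected frequency (Poisson distribution) | Expected frequency (Poisson-Lindley distribution) |
|---|---|---|---|
| 0 | 83 | 78.9 | 87.2 |
| 1 | 36 | 42.9 | 31.8 |
| 2 | 14 | 11.7 | 11.2 |
| 3 | 2 | 2.1 | 3.8 |
| 4+ | 1 | 0.4 | 2.0 |
| Total | 136 | 136 | 136 |
| Estimate of parameter | | θ̂ = 0.544118 | θ̂ = 2.372252 |
| χ² | | 1.885 | 0.757 |
| d.f. | | 1 | 1 |
| p-value | | 0.1698 | 0.3843 |

In both expected-frequency columns, the classes 2 to 4+ are grouped by a brace.

Table 3 Observed and expected number of European corn borers per plant, data of McGuire et al.18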

Application in genetics

Genetics is the branch of biological science which deals with heredity and variation. Heredity includes those traits or characteristics which are transmitted from generation to generation, and is therefore fixed for a particular individual. Variation, on the other hand, is mainly of two types, namely hereditary and environmental. Hereditary variation refers to differences in inherited traits, whereas environmental variation is mainly due to the environment. Many quantitative studies seem to have been done in the field of genetics. The segregation of chromosomes has been studied using statistical tools, mainly chi-square (χ²). In the analysis of data observed on chemically induced chromosome aberrations in cultures of human leukocytes, Loeschke & Kohler16 suggested the negative binomial distribution while Janardan & Schaeffer17 suggested a modified Poisson distribution. Mishra & Shanker15 have discussed applications of generalized logarithmic series distributions (GLSD) to model data in mortality, ecology and genetics. In this section, an attempt has been made to fit the PLD and the Poisson distribution to data relating to genetics using maximum likelihood estimates. An attempt has also been made to fit the Poisson distribution and the PLD to the data of Catcheside et al.19,20 (Tables 4-7).

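| Number of aberrations | Observed frequency | Expected frequency (Poisson distribution) | Expected frequency (Poisson-Lindley distribution) |
|---|---|---|---|
| 0 | 268 | 231.3 | 257.0 |
| 1 | 87 | 126.7 | 93.4 |
| 2 | 26 | 34.7 | 32.8 |
| 3 | 9 | 6.3 | 11.2 |
| 4 | 4 | 0.8 | 3.8 |
| 5 | 2 | 0.1 | 1.2 |
| 6 | 1 | 0.1 | 0.4 |
| 7+ | 3 | 0.1 | 0.2 |
| Total | 400 | 400 | 400 |
| Estimate of parameter | | θ̂ = 0.5475 | θ̂ = 2.380442 |
| χ² | | 38.208 | 6.208 |
| d.f. | | 2 | 3 |
| p-value | | 0 | 0.1019 |

A brace groups the classes 3 to 7+ in the Poisson column and the classes 4 to 7+ in the Poisson-Lindley column.

Table 4 Distribution of number of chromatid aberrations (0.2 g chinon 1, 24 hours)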

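| Number of aberrations | Observed frequency | Expected frequency (Poisson distribution) | Expected frequency (Poisson-Lindley distribution) |
|---|---|---|---|
| 0 | 268 | 231.3 | 257.0 |
| 1 | 87 | 126.7 | 93.4 |
| 2 | 26 | 34.7 | 32.8 |
| 3 | 9 | 6.3 | 11.2 |
| 4 | 4 | 0.8 | 3.8 |
| 5 | 2 | 0.1 | 1.2 |
| 6 | 1 | 0.1 | 0.4 |
| 7+ | 3 | 0.1 | 0.2 |
| Total | 400 | 400 | 400 |
| Estimate of parameter | | θ̂ = 0.5475 | θ̂ = 2.380442 |
| χ² | | 38.208 | 6.208 |
| d.f. | | 2 | 3 |
| p-value | | 0 | 0.1019 |

A brace groups the classes 3 to 7+ in the Poisson column and the classes 4 to 7+ in the Poisson-Lindley column.

Table 5 Mammalian cytogenetic dosimetry lesions in rabbit lymphoblast induced by streptonigrin (NSC-45383), exposure 60 μg/kg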

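| Class/Exposure (μg/kg) | Observed frequency | Expected frequency (Poisson distribution) | Expected frequency (Poisson-Lindley distribution) |
|---|---|---|---|
| 0 | 413 | 374.0 | 405.7 |
| 1 | 124 | 177.4 | 133.6 |
| 2 | 42 | 42.1 | 42.6 |
| 3 | 15 | 6.6 | 13.3 |
| 4 | 5 | 0.8 | 4.1 |
| 5 | 0 | 0.1 | 1.2 |
| 6 | 2 | 0.0 | 0.5 |
| Total | 601 | 601 | 601 |
| Estimate of parameter | | θ̂ = 0.47421 | θ̂ = 2.685373 |
| χ² | | 48.169 | 1.336 |
| d.f. | | 2 | 3 |
| p-value | | 0 | 0.7206 |

A brace groups the classes 3 to 6 in the Poisson column and the classes 4 to 6 in the Poisson-Lindley column.

Table 6 Mammalian cytogenetic dosimetry lesions in rabbit lymphoblast induced by streptonigrin (NSC-45383), exposure 70 μg/kg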

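| Class/Exposure (μg/kg) | Observed frequency | Expected frequency (Poisson distribution) | Expected frequency (Poisson-Lindley distribution) |
|---|---|---|---|
| 0 | 155 | 127.8 | 158.3 |
| 1 | 83 | 109.0 | 77.2 |
| 2 | 33 | 46.5 | 35.9 |
| 3 | 14 | 13.2 | 16.1 |
| 4 | 11 | 2.8 | 7.1 |
| 5 | 3 | 0.5 | 3.1 |
| 6 | 1 | 0.2 | 2.3 |
| Total | 300 | 300 | 300 |
| Estimate of parameter | | θ̂ = 0.853333 | θ̂ = 1.617611 |
| χ² | | 24.969 | 1.51 |
| d.f. | | 2 | 3 |
| p-value | | 0 | 0.6799 |

A brace groups the classes 3 to 6 in the Poisson column and the classes 4 to 6 in the Poisson-Lindley column.

Table 7 Mammalian cytogenetic dosimetry lesions in rabbit lymphoblast induced by streptonigrin (NSC-45383), exposure 90 μg/kg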

The tables above show that the PLD gives a much closer fit than the Poisson distribution, and thus it can be considered an important tool for modeling data in genetics.

References

  1. Sankaran M. The discrete poisson-lindley distribution. Biometrics. 1970;26(1):145–149.
  2. Ghitany ME, Al-Mutairi DK. Estimation Methods for the discrete Poisson-Lindley distribution. Journal of Statistical Computation and Simulation. 2009;79(1):1–9.
  3. Shanker R, Mishra A. A two-parameter Poisson-Lindley distribution. International Journal of Statistics and Systems. 2014;9(1):79–85.
  4. Shanker R, Mishra A. A two-parameter Lindley distribution. Statistics in Transition new Series. 2013;14(1):45–56.
  5. Shanker R, Mishra A. A quasi Poisson-Lindley distribution (submitted). 2015.
  6. Shanker R, Mishra A. A quasi Lindley distribution. African journal of Mathematics and Computer Science Research. 2013;6(4):64–71.
  7. Shanker R, Sharma S, Shanker R. A Discrete two-Parameter Poisson Lindley Distribution. Journal of Ethiopian Statistical Association. 2012;21:15–22.
  8. Shanker R, Sharma S, Shanker R. A two-parameter Lindley distribution for modeling waiting and survival times data. Applied Mathematics. 2013;4:363–368.
  9. Shanker R, Tekie AL. A new quasi Poisson-Lindley distribution. International Journal of Statistics and Systems. 2014;9(1):87–94.
  10. Shanker R, Amanuel AG. A new quasi Lindley distribution. International Journal of Statistics and Systems. 2013;8(2):143–156.
  11. Johnson NL, Kotz S, Kemp AW. Univariate discrete distributions. 2nd ed. John Wiley & sons Inc; 1992.
  12. Fisher RA, Corbet AS, Williams CB. The relation between the number of species and the number of individuals in a random sample of an animal population. Journal of Animal Ecology. 1943;12(1):42–58.
  13. Kempton RA. A generalized form of Fisher’s logarithmic series. Biometrika. 1975;62(1):29–38.
  14. Tripathi RC, Gupta RC. A generalization of the log-series distribution. Comm. in Stat. (Theory and Methods). 1985;14(8):1779–1799.
  15. Mishra A, Shanker R. Generalized logarithmic series distribution-Its nature and applications. Proceedings of the Vth International Symposium on Optimization and Statistics. 2002:155–168.
  16. Loeschke V, Kohler W. Deterministic and Stochastic models of the negative binomial distribution and the analysis of chromosomal aberrations in human leukocytes. Biometrische Zeitschrift. 1976;18:427–51.
  17. Janardan KG, Schaeffer DJ. Models for the analysis of chromosomal aberrations in human leukocytes. Biometrical Journal. 1977;(8):599–612.
  18. McGuire JU, Brindley TA, Bancroft TA. The distribution of European corn-borer larvae pyrausta in field corn. Biometrics. 1957;13:65–78.
  19. Catcheside DG, Lea DE, Thoday JM. Types of chromosome structural change induced by the irradiation on Tradescantia microspores. Journal of Genetics. 1946;47:113–136.
  20. Catcheside DG, Lea DE, Thoday JM. The production of chromosome structural changes in Tradescantia microspores in relation to dosage, intensity and temperature. Journal of Genetics. 1946:137–149.
  21. Lindley DV. Fiducial distributions and Bayes theorem. Journal of Royal Statistical Society Ser. B. 1958;20:102–107.
©2015 Shanker, et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.