Research Article Volume 3 Issue 6
On Poisson-Akash distribution and its applications
Rama Shanker,1
Hagos Fesshaye,2 Teklay Tesfazghi3
1Department of Statistics, Eritrea Institute of Technology, Eritrea
2Department of Economics, College of Business and Economics, Eritrea
3Department of Computer Engineering, Eritrea Institute of Technology, Eritrea
Correspondence: Rama Shanker, Department of Statistics, Eritrea Institute of Technology, Asmara, Eritrea
Received: April 25, 2016 | Published: May 11, 2016
Citation: Shanker R, Fesshaye H, Tesfazghi T. On Poisson-Akash distribution and its applications. Biom Biostat Int J. 2016;3(5):146-153. DOI: 10.15406/bbij.2016.03.00075
Abstract
A simple and interesting method for finding the moments of the Poisson-Akash distribution (PAD) of Shanker,1 a Poisson mixture of the Akash distribution introduced by Shanker,2 is suggested. The first two moments about origin and the variance of PAD are obtained and presented. The applications and goodness of fit of PAD are discussed using data-sets relating to ecology, genetics and thunderstorms, and the fit is compared with that of the Poisson distribution and the Poisson-Lindley distribution (PLD), a Poisson mixture of the Lindley3 distribution introduced by Sankaran.4 PAD gives a satisfactory fit in most of the data-sets.
Keywords: Akash distribution, Poisson-Akash distribution, Lindley distribution, Poisson-Lindley distribution, compounding, moments, estimation of parameter, goodness of fit
Introduction
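The probability mass function (p.m.f.) of the Poisson-Akash distribution (PAD) having parameter $\theta$, given by
$$P(X=x)=\frac{\theta^{3}}{\theta^{2}+2}\cdot\frac{x^{2}+3x+\left(\theta^{2}+2\theta+3\right)}{(\theta+1)^{x+3}};\quad x=0,1,2,\ldots,\ \theta>0 \qquad (1.1)$$
has been introduced by Shanker1 for modeling various count data-sets. The PAD arises from the Poisson distribution when its parameter $\lambda$ follows the one-parameter Akash distribution introduced by Shanker2 having probability density function
$$f(\lambda;\theta)=\frac{\theta^{3}}{\theta^{2}+2}\left(1+\lambda^{2}\right)e^{-\theta\lambda};\quad \lambda>0,\ \theta>0 \qquad (1.2)$$
We have
$$P(X=x)=\int_{0}^{\infty}\frac{e^{-\lambda}\lambda^{x}}{x!}\cdot\frac{\theta^{3}}{\theta^{2}+2}\left(1+\lambda^{2}\right)e^{-\theta\lambda}\,d\lambda \qquad (1.3)$$
$$=\frac{\theta^{3}}{\theta^{2}+2}\cdot\frac{x^{2}+3x+\left(\theta^{2}+2\theta+3\right)}{(\theta+1)^{x+3}};\quad x=0,1,2,\ldots,\ \theta>0 \qquad (1.4)$$
This is the probability mass function of the Poisson-Akash distribution (PAD).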
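As a quick numerical check of the compounding step, the following Python sketch evaluates the PAD p.m.f. in closed form and compares it with a direct numerical evaluation of the Poisson-Akash mixing integral. It assumes the closed form $\frac{\theta^{3}}{\theta^{2}+2}\cdot\frac{x^{2}+3x+(\theta^{2}+2\theta+3)}{(\theta+1)^{x+3}}$ for the p.m.f. and the density $\frac{\theta^{3}}{\theta^{2}+2}(1+\lambda^{2})e^{-\theta\lambda}$ for the Akash distribution. The function names, the midpoint rule, the integration limit and the value $\theta=1.5$ are illustrative choices, not part of the original paper.

```python
import math


def pad_pmf(x: int, theta: float) -> float:
    """Closed-form P(X = x) of the Poisson-Akash distribution, theta > 0."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    if x < 0:
        return 0.0
    const = theta**3 / (theta**2 + 2.0)
    numerator = x * x + 3 * x + (theta**2 + 2.0 * theta + 3.0)
    # (theta + 1)^-(x + 3), written with exp/log so large x underflows to 0
    return const * numerator * math.exp(-(x + 3) * math.log1p(theta))


def pad_pmf_by_integration(x: int, theta: float,
                           upper: float = 60.0, steps: int = 60000) -> float:
    """Midpoint-rule approximation of the Poisson-Akash mixing integral.

    The finite upper limit is an assumption that is adequate for moderate
    theta; very small theta would need a larger limit.
    """
    h = upper / steps
    const = theta**3 / (theta**2 + 2.0)
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        poisson = math.exp(-lam) * lam**x / math.factorial(x)
        akash = const * (1.0 + lam * lam) * math.exp(-theta * lam)
        total += poisson * akash * h
    return total


theta = 1.5  # illustrative parameter value
print(sum(pad_pmf(x, theta) for x in range(500)))  # should be very close to 1
for x in range(4):
    print(x, pad_pmf(x, theta), pad_pmf_by_integration(x, theta))
```

The two columns printed for each $x$ should agree to several decimal places, which is the content of the step from the integral to the closed form.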
It has been shown by Shanker2 that the Akash distribution (1.2) is a two-component mixture of an exponential ($\theta$) distribution and a gamma (3, $\theta$) distribution with mixing proportions $\frac{\theta^{2}}{\theta^{2}+2}$ and $\frac{2}{\theta^{2}+2}$, respectively. Shanker2 has discussed its mathematical and statistical properties, including its shape, moments, skewness, kurtosis, hazard rate function, mean residual life function, stochastic orderings, mean deviations, distribution of order statistics, Bonferroni and Lorenz curves, Renyi entropy measure and stress-strength reliability, along with the estimation of its parameter and applications for modeling lifetime data from engineering and biomedical science.
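The mixture representation can be checked pointwise. The sketch below assumes the Akash density $\frac{\theta^{3}}{\theta^{2}+2}(1+x^{2})e^{-\theta x}$, an exponential density with rate $\theta$, a gamma density with shape 3 and rate $\theta$, and mixing proportions $\frac{\theta^{2}}{\theta^{2}+2}$ and $\frac{2}{\theta^{2}+2}$; the function names and the evaluation points are illustrative.

```python
import math


def akash_pdf(x: float, theta: float) -> float:
    """Akash density on x > 0 (assumed form, see lead-in)."""
    return theta**3 / (theta**2 + 2.0) * (1.0 + x * x) * math.exp(-theta * x)


def exponential_pdf(x: float, theta: float) -> float:
    """Exponential density with rate theta."""
    return theta * math.exp(-theta * x)


def gamma3_pdf(x: float, theta: float) -> float:
    """Gamma density with shape 3 and rate theta; Gamma(3) = 2."""
    return theta**3 * x * x * math.exp(-theta * x) / 2.0


def akash_as_mixture(x: float, theta: float) -> float:
    """Two-component mixture with weight theta^2 / (theta^2 + 2) on the exponential."""
    p = theta**2 / (theta**2 + 2.0)
    return p * exponential_pdf(x, theta) + (1.0 - p) * gamma3_pdf(x, theta)


for x in (0.1, 1.0, 2.5):
    print(x, akash_pdf(x, 0.8), akash_as_mixture(x, 0.8))
```

Both columns should coincide up to floating-point rounding for any $x>0$ and $\theta>0$.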
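Sankaran4 obtained the Poisson-Lindley distribution (PLD) having probability mass function (p.m.f.)
$$P(X=x)=\frac{\theta^{2}(x+\theta+2)}{(\theta+1)^{x+3}};\quad x=0,1,2,\ldots,\ \theta>0 \qquad (1.5)$$
by compounding the Poisson distribution with the Lindley distribution, that is, when the parameter $\lambda$ of the Poisson distribution follows the Lindley distribution introduced by Lindley3 having probability density function (p.d.f.)
$$f(\lambda;\theta)=\frac{\theta^{2}}{\theta+1}(1+\lambda)e^{-\theta\lambda};\quad \lambda>0,\ \theta>0 \qquad (1.6)$$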
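For comparison with PAD, a minimal Python sketch of the PLD p.m.f. follows. It assumes the form $\frac{\theta^{2}(x+\theta+2)}{(\theta+1)^{x+3}}$ and checks that the probabilities sum to one and that the mean equals the Lindley mean $\frac{\theta+2}{\theta(\theta+1)}$; the function name and the value $\theta=2$ are illustrative.

```python
import math


def pld_pmf(x: int, theta: float) -> float:
    """P(X = x) of the Poisson-Lindley distribution, theta > 0 (assumed form)."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    if x < 0:
        return 0.0
    # (theta + 1)^-(x + 3) via exp/log to avoid overflow for large x
    return theta**2 * (x + theta + 2.0) * math.exp(-(x + 3) * math.log1p(theta))


theta = 2.0  # illustrative parameter value
probs = [pld_pmf(x, theta) for x in range(400)]
print(sum(probs))                                   # should be close to 1
print(sum(x * p for x, p in enumerate(probs)),      # numerical mean
      (theta + 2.0) / (theta * (theta + 1.0)))      # mean of the mixing Lindley law
```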
In this paper a simple and interesting method for finding the moments of the Poisson-Akash distribution (PAD) introduced by Shanker1 is suggested, and the first two moments about origin and the variance are presented. It seems that not much work has been done so far on the applications of PAD to count data arising in various fields of knowledge. The applications and goodness of fit of PAD are discussed with various count data from ecology, genetics and thunderstorms, and the goodness of fit of PAD is compared with that of the Poisson distribution and the Poisson-Lindley distribution (PLD). PAD gives a satisfactory fit in most of the data-sets.
Moments of PAD
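Using (1.3), the $r$th moment about origin of PAD (1.1) can be obtained as
$$\mu_{r}'=E\left[E\left(X^{r}\mid\lambda\right)\right]=\frac{\theta^{3}}{\theta^{2}+2}\int_{0}^{\infty}\left[\sum_{x=0}^{\infty}x^{r}\frac{e^{-\lambda}\lambda^{x}}{x!}\right]\left(1+\lambda^{2}\right)e^{-\theta\lambda}\,d\lambda \qquad (2.1)$$
It is obvious that the expression under the bracket in (2.1) is the $r$th moment about origin of the Poisson distribution. Taking $r=1$ in (2.1) and using the first moment about origin of the Poisson distribution, the first moment about origin of the PAD (1.1) can be obtained as
$$\mu_{1}'=\frac{\theta^{3}}{\theta^{2}+2}\int_{0}^{\infty}\lambda\left(1+\lambda^{2}\right)e^{-\theta\lambda}\,d\lambda=\frac{\theta^{2}+6}{\theta\left(\theta^{2}+2\right)} \qquad (2.2)$$
Again taking $r=2$ in (2.1) and using the second moment about origin of the Poisson distribution, the second moment about origin of the PAD (1.1) can be obtained as
$$\mu_{2}'=\frac{\theta^{3}}{\theta^{2}+2}\int_{0}^{\infty}\left(\lambda^{2}+\lambda\right)\left(1+\lambda^{2}\right)e^{-\theta\lambda}\,d\lambda=\frac{\theta^{3}+2\theta^{2}+6\theta+24}{\theta^{2}\left(\theta^{2}+2\right)} \qquad (2.3)$$
Similarly, taking $r=3$ and $r=4$ in (2.1) and using the third and the fourth moments about origin of the Poisson distribution, the third and the fourth moments about origin of the PAD (1.1) can be obtained as
$$\mu_{3}'=\frac{\theta^{4}+6\theta^{3}+12\theta^{2}+72\theta+120}{\theta^{3}\left(\theta^{2}+2\right)} \qquad (2.4)$$
$$\mu_{4}'=\frac{\theta^{5}+14\theta^{4}+42\theta^{3}+192\theta^{2}+720\theta+720}{\theta^{4}\left(\theta^{2}+2\right)} \qquad (2.5)$$
The variance of the PAD (1.1) can thus be obtained as
$$\mu_{2}=\mu_{2}'-\left(\mu_{1}'\right)^{2}=\frac{\theta^{5}+\theta^{4}+8\theta^{3}+16\theta^{2}+12\theta+12}{\theta^{2}\left(\theta^{2}+2\right)^{2}} \qquad (2.6)$$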
It has been shown by Shanker1 that PAD (1.1) has an increasing hazard rate, is unimodal and is always over-dispersed, and thus is a suitable model for over-dispersed count data.
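The mean and variance of PAD can be cross-checked numerically. The sketch below assumes the closed forms $\mu_{1}'=\frac{\theta^{2}+6}{\theta(\theta^{2}+2)}$ and $\mu_{2}=\frac{\theta^{5}+\theta^{4}+8\theta^{3}+16\theta^{2}+12\theta+12}{\theta^{2}(\theta^{2}+2)^{2}}$, compares them with moments obtained by summing the p.m.f. directly, and prints the variance-to-mean ratio as an over-dispersion indicator. The function names, the truncation at 3000 terms and the chosen $\theta$ values are illustrative.

```python
import math


def pad_pmf(x: int, theta: float) -> float:
    """P(X = x) of the Poisson-Akash distribution (assumed closed form)."""
    const = theta**3 / (theta**2 + 2.0)
    numerator = x * x + 3 * x + (theta**2 + 2.0 * theta + 3.0)
    return const * numerator * math.exp(-(x + 3) * math.log1p(theta))


def pad_mean(theta: float) -> float:
    """Closed-form first moment about origin."""
    return (theta**2 + 6.0) / (theta * (theta**2 + 2.0))


def pad_variance(theta: float) -> float:
    """Closed-form variance."""
    num = (theta**5 + theta**4 + 8.0 * theta**3
           + 16.0 * theta**2 + 12.0 * theta + 12.0)
    return num / (theta**2 * (theta**2 + 2.0) ** 2)


def numeric_mean_variance(theta: float, terms: int = 3000):
    """Mean and variance from a truncated sum over the p.m.f."""
    m1 = sum(x * pad_pmf(x, theta) for x in range(terms))
    m2 = sum(x * x * pad_pmf(x, theta) for x in range(terms))
    return m1, m2 - m1 * m1


for theta in (0.5, 1.0, 2.5):
    print(theta, pad_mean(theta), pad_variance(theta),
          numeric_mean_variance(theta),
          pad_variance(theta) / pad_mean(theta))  # ratio > 1 means over-dispersion
```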
Parameter estimation of PAD
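Maximum Likelihood Estimate (MLE) of the Parameter: Let $x_{1},x_{2},\ldots,x_{n}$ be a random sample of size $n$ from the PAD (1.1) and let $f_{x}$ be the observed frequency in the sample corresponding to $X=x$ $(x=0,1,2,\ldots,k)$ such that $\sum_{x=0}^{k}f_{x}=n$, where $k$ is the largest observed value having non-zero frequency.
The likelihood function $L$ of the PAD (1.1) can be given by
$$L=\left(\frac{\theta^{3}}{\theta^{2}+2}\right)^{n}\frac{1}{(\theta+1)^{\sum_{x=0}^{k}f_{x}(x+3)}}\prod_{x=0}^{k}\left[x^{2}+3x+\left(\theta^{2}+2\theta+3\right)\right]^{f_{x}}$$
The log likelihood function is thus obtained as
$$\log L=n\log\left(\frac{\theta^{3}}{\theta^{2}+2}\right)-\sum_{x=0}^{k}f_{x}(x+3)\log(\theta+1)+\sum_{x=0}^{k}f_{x}\log\left[x^{2}+3x+\left(\theta^{2}+2\theta+3\right)\right]$$
The first derivative of the log likelihood function is given by
$$\frac{d\log L}{d\theta}=\frac{3n}{\theta}-\frac{2n\theta}{\theta^{2}+2}-\frac{n\left(\bar{x}+3\right)}{\theta+1}+\sum_{x=0}^{k}\frac{2(\theta+1)f_{x}}{x^{2}+3x+\left(\theta^{2}+2\theta+3\right)}$$
where $\bar{x}$ is the sample mean.
The maximum likelihood estimate (MLE) $\hat{\theta}$ of $\theta$ of PAD (1.1) is the solution of the equation $\frac{d\log L}{d\theta}=0$ and is thus given by the solution of the non-linear equation
$$\frac{3n}{\theta}-\frac{2n\theta}{\theta^{2}+2}-\frac{n\left(\bar{x}+3\right)}{\theta+1}+\sum_{x=0}^{k}\frac{2(\theta+1)f_{x}}{x^{2}+3x+\left(\theta^{2}+2\theta+3\right)}=0$$
This non-linear equation can be solved by any numerical iteration method, such as the Newton-Raphson method, the bisection method or the regula falsi method. In this paper the Newton-Raphson method has been used to solve the above non-linear equation and obtain the maximum likelihood estimate of the parameter.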
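The Newton-Raphson step can be sketched in Python as follows. The sketch assumes the score function $\frac{3n}{\theta}-\frac{2n\theta}{\theta^{2}+2}-\frac{n(\bar{x}+3)}{\theta+1}+\sum_{x}\frac{2(\theta+1)f_{x}}{x^{2}+3x+(\theta^{2}+2\theta+3)}$ and differentiates it analytically. The function names, the starting value, the tolerance and the step-halving safeguard are illustrative assumptions rather than the authors' implementation. The observed frequencies are the red mite counts of Table 4.1.2, with the empty class "7+" treated as the value 7.

```python
def pad_score(theta: float, freq) -> float:
    """Derivative of the PAD log likelihood; freq[x] is the count of value x."""
    n = sum(freq)
    xbar = sum(x * f for x, f in enumerate(freq)) / n
    c = theta**2 + 2.0 * theta + 3.0
    tail = sum(2.0 * (theta + 1.0) * f / (x * x + 3 * x + c)
               for x, f in enumerate(freq))
    return (3.0 * n / theta - 2.0 * n * theta / (theta**2 + 2.0)
            - n * (xbar + 3.0) / (theta + 1.0) + tail)


def pad_score_derivative(theta: float, freq) -> float:
    """Analytic derivative of pad_score with respect to theta."""
    n = sum(freq)
    xbar = sum(x * f for x, f in enumerate(freq)) / n
    c = theta**2 + 2.0 * theta + 3.0
    tail = 0.0
    for x, f in enumerate(freq):
        g = x * x + 3 * x + c
        tail += f * (2.0 * g - 4.0 * (theta + 1.0) ** 2) / g**2
    return (-3.0 * n / theta**2
            - 2.0 * n * (2.0 - theta**2) / (theta**2 + 2.0) ** 2
            + n * (xbar + 3.0) / (theta + 1.0) ** 2 + tail)


def pad_mle(freq, theta0: float = 1.0, tol: float = 1e-10, max_iter: int = 100) -> float:
    """Newton-Raphson iteration on the score equation."""
    theta = theta0
    for _ in range(max_iter):
        step = pad_score(theta, freq) / pad_score_derivative(theta, freq)
        new_theta = theta - step
        while new_theta <= 0.0:      # safeguard: stay inside theta > 0
            step /= 2.0
            new_theta = theta - step
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    raise RuntimeError("Newton-Raphson did not converge")


red_mites = [38, 17, 10, 9, 3, 2, 1, 0]  # observed frequencies, Table 4.1.2
theta_hat = pad_mle(red_mites)
print(theta_hat, pad_score(theta_hat, red_mites))  # score should be near 0
```

A bisection or regula falsi solver could be substituted for `pad_mle` without changing the score function.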
Method of moment estimate (MOME) of the parameter: Let $x_{1},x_{2},\ldots,x_{n}$ be a random sample of size $n$ from the PAD (1.1). Equating the population mean to the corresponding sample mean, the MOME $\tilde{\theta}$ of $\theta$ of PAD (1.1) is the solution of the following cubic equation
$$\bar{x}\,\theta^{3}-\theta^{2}+2\bar{x}\,\theta-6=0$$
where $\bar{x}$ is the sample mean.
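A minimal sketch of the moment estimate follows. It assumes the cubic takes the form $\bar{x}\theta^{3}-\theta^{2}+2\bar{x}\theta-6=0$, obtained by setting $\bar{x}=\frac{\theta^{2}+6}{\theta(\theta^{2}+2)}$. Because the PAD mean is strictly decreasing in $\theta$, the cubic has a single positive root, and plain bisection is used here for simplicity. The function name and the sample mean 1.15 (the red mite data of Table 4.1.2) are illustrative.

```python
def pad_mome(sample_mean: float, iterations: int = 200) -> float:
    """Positive root of  xbar*t^3 - t^2 + 2*xbar*t - 6 = 0  by bisection."""
    if sample_mean <= 0:
        raise ValueError("sample mean must be positive")

    def cubic(t: float) -> float:
        return sample_mean * t**3 - t**2 + 2.0 * sample_mean * t - 6.0

    lo, hi = 0.0, 1.0          # cubic(0) = -6 < 0
    while cubic(hi) < 0.0:     # expand until the root is bracketed
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cubic(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


theta_tilde = pad_mome(1.15)
# plugging the estimate back into the mean formula should return the sample mean
print(theta_tilde, (theta_tilde**2 + 6.0) / (theta_tilde * (theta_tilde**2 + 2.0)))
```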
Applications and goodness of fit of PAD
When events seem to occur at random, the Poisson distribution is a suitable statistical model. Examples include the number of customers arriving at a service point, the number of telephone calls arriving at an exchange, the number of fatal traffic accidents per week in a given state, the number of radioactive particle emissions per unit of time, the number of meteorites that collide with a test satellite during a single orbit, the number of organisms per unit volume of some fluid, the number of defects per unit of some material, and the number of flaws per unit length of some wire. The conditions for using the Poisson distribution are the independence of events and the equality of mean and variance. These conditions are rarely satisfied completely in biomedical science and thunderstorms, because the occurrences of successive events in these fields are dependent. The negative binomial distribution is the appropriate choice when successive events are dependent, but it requires a higher degree of over-dispersion; see Johnson et al.6 In biomedical science and thunderstorms, these conditions are also not fully satisfied. Generally, count data in biomedical science and thunderstorms are either over-dispersed or under-dispersed. The main reason for selecting PLD and PAD to fit count data from biomedical science and thunderstorms is that these two distributions are always over-dispersed, and PAD has some flexibility over PLD.
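Whether a data-set departs from the Poisson condition of equal mean and variance can be checked directly from its frequency table. The sketch below computes the sample mean, the sample variance (with divisor $n-1$, an assumption) and their ratio for the observed red mite frequencies of Table 4.1.2; the function name is illustrative.

```python
def frequency_mean_variance(freq):
    """Sample mean and variance of count data; freq[x] is the count of value x."""
    n = sum(freq)
    mean = sum(x * f for x, f in enumerate(freq)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in enumerate(freq)) / (n - 1)
    return mean, variance


red_mites = [38, 17, 10, 9, 3, 2, 1, 0]  # observed frequencies, Table 4.1.2
mean, variance = frequency_mean_variance(red_mites)
# a ratio above 1 indicates over-dispersion relative to the Poisson distribution
print(mean, variance, variance / mean)
```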
Applications in ecology
Ecology is the branch of biology which deals with the relations and interactions between organisms and their environment, including other organisms. Since organisms and their environment are complex, dynamic, interdependent, mutually reactive and interrelated, ecology deals with the various principles which govern the relationship between organisms and their environment. Fisher et al.7 first discussed the applications of the logarithmic series distribution (LSD) to model count data in ecology. Later, Kempton8 fitted a generalized form of Fisher's LSD to insect data and concluded that it gives a superior fit compared to the ordinary LSD and a better explanation for data having an exceptionally long tail. Tripathi & Gupta9 proposed another generalization of the LSD which is flexible enough to describe short-tailed as well as long-tailed data, fitted it to insect data, and found that it gives a better fit than the ordinary LSD. Shanker10 and Mishra & Shanker11 have discussed applications of generalized logarithmic series distributions (GLSD) to model data in ecology. Shanker & Hagos12 fitted PLD to data relating to ecology and observed that PLD gives a satisfactory fit.
In this section we have tried to fit the Poisson distribution (PD), Poisson-Lindley distribution (PLD) and Poisson-Akash distribution (PAD) to several count data-sets from biological sciences using maximum likelihood estimates. The data are haemocytometer yeast cell counts per square, counts of European red mites on apple leaves, and counts of European corn borers per plant.
It is obvious from Tables 4.1.1 to 4.1.3 that in Table 4.1.1 PD gives a better fit than PLD and PAD, in Table 4.1.2 PAD gives a better fit than PD and PLD, while in Table 4.1.3 PLD gives a better fit than PD and PAD.
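The goodness-of-fit comparison rests on expected frequencies and a Pearson chi-square statistic. The Python sketch below shows one way such a computation might be carried out for PAD. Several parts are assumptions: the closed form used for the p.m.f., the pooling rule (tail classes are merged until the expected frequency is at least 5), the degrees of freedom (number of classes after pooling, minus 1, minus one estimated parameter), all function names, and the value $\theta=1.62$, which is an illustrative number and not an estimate reported in the paper. The observed frequencies are the red mite counts of Table 4.1.2.

```python
import math


def pad_pmf(x: int, theta: float) -> float:
    """P(X = x) of the Poisson-Akash distribution (assumed closed form)."""
    const = theta**3 / (theta**2 + 2.0)
    numerator = x * x + 3 * x + (theta**2 + 2.0 * theta + 3.0)
    return const * numerator * math.exp(-(x + 3) * math.log1p(theta))


def expected_frequencies(theta: float, n: int, k: int):
    """Expected counts for classes 0, ..., k-1 and an open class 'k or more'."""
    probs = [pad_pmf(x, theta) for x in range(k)]
    probs.append(1.0 - sum(probs))          # remaining tail probability
    return [n * p for p in probs]


def pool_tail(observed, expected, min_expected: float = 5.0):
    """Merge classes from the right until the last expected count is large enough."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < min_expected:
        last_exp, last_obs = exp.pop(), obs.pop()
        exp[-1] += last_exp
        obs[-1] += last_obs
    return obs, exp


def pearson_chi2(observed, expected, n_params: int = 1):
    """Chi-square statistic and degrees of freedom after pooling."""
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1 - n_params


observed = [38, 17, 10, 9, 3, 2, 1, 0]      # classes 0..6 and 7+, Table 4.1.2
theta = 1.62                                 # illustrative value (assumption)
expected = expected_frequencies(theta, sum(observed), len(observed) - 1)
obs_pooled, exp_pooled = pool_tail(observed, expected)
print([round(e, 1) for e in exp_pooled], pearson_chi2(obs_pooled, exp_pooled))
```

The same helpers work for PD or PLD if `pad_pmf` is replaced by the corresponding p.m.f.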
Applications in genetics
Genetics is the branch of biological science which deals with heredity and variation. Heredity includes those traits or characteristics which are transmitted from generation to generation, and is therefore fixed for a particular individual. Variation, on the other hand, is mainly of two types, namely hereditary and environmental. Hereditary variation refers to differences in inherited traits, whereas environmental variations are those which are mainly due to environment. Much quantitative work has been done in genetics, but so far no work has been done on fitting PAD to count data in genetics. The segregation of chromosomes has been studied using statistical tools, mainly chi-square ($\chi^{2}$). In the analysis of data observed on chemically induced chromosome aberrations in cultures of human leukocytes, Loeschke & Kohler13 suggested the negative binomial distribution while Janardan & Schaeffer14 suggested a modified Poisson distribution. Shanker10 and Mishra & Shanker11 have discussed applications of generalized logarithmic series distributions (GLSD) to model data in mortality, ecology and genetics. Shanker & Hagos12 have made a detailed study of the applications of PLD to model data from genetics. In this section an attempt has been made to fit PAD, PLD and PD to data relating to genetics using maximum likelihood estimates. An attempt has also been made to fit PAD, PLD and PD to the data of Catcheside et al.15,16 in Tables 4.2.2, 4.2.3 and 4.2.4.
| Number of Yeast Cells Per Square | Observed Frequency | Expected Frequency: PD | Expected Frequency: PLD | Expected Frequency: PAD |
|---|---|---|---|---|
| 0 | 213 | 202.1 | 234 | 236.8 |
| 1 | 128 | 138.0 | 99.4 | 95.6 |
| 2 | 37 | 47.1 | 40.5 | 39.9 |
| 3 | 18 | | | |
| 4 | 3 | | | |
| 5 | 1 | | | |
| 6 | 0 | | | |
| Total | 400 | 400.0 | 400.0 | 400.0 |
| Estimate of Parameter | | | | |
| $\chi^{2}$ | | 10.08 | 11.04 | 14.68 |
| d.f. | | 2 | 2 | 2 |
| p-value | | 0.0065 | 0.004 | 0.0006 |

Table 4.1.1 Observed and expected number of haemocytometer yeast cell counts per square observed by Gosset18
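The p-values in the table follow from the chi-square statistic and its degrees of freedom. The sketch below computes the upper-tail probability of the chi-square distribution for integer degrees of freedom using only the Python standard library: a finite series for even degrees of freedom, and the complementary error function plus a finite series for odd degrees of freedom. The function name `chi2_sf` is illustrative, and reading the unlabeled row of 10.08, 11.04 and 14.68 as chi-square values is an assumption based on the d.f. and p-value rows beneath it.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(Chi2_df > x) for integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_k (x/2)^(k - 1/2) / Gamma(k + 1/2)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (k + 0.5)
    return total


# chi-square values and d.f. for PD, PLD and PAD as listed in Table 4.1.1
for chi2, df in ((10.08, 2), (11.04, 2), (14.68, 2)):
    print(chi2, df, round(chi2_sf(chi2, df), 4))
```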
| Number of Red Mites Per Leaf | Observed Frequency | Expected Frequency: PD | Expected Frequency: PLD | Expected Frequency: PAD |
|---|---|---|---|---|
| 0 | 38 | 25.3 | 35.8 | 36.3 |
| 1 | 17 | 29.1 | 20.7 | 20.1 |
| 2 | 10 | 16.7 | 11.4 | 11.2 |
| 3 | 9 | | 6.0 | 6.1 |
| 4 | 3 | | | |
| 5 | 2 | | | |
| 6 | 1 | | | |
| 7+ | 0 | | | |
| Total | 80 | 80.0 | 80.0 | 80.0 |
| Estimate of Parameter | | | | |
| $\chi^{2}$ | | 18.27 | 2.47 | 2.07 |
| d.f. | | 2 | 3 | 3 |
| p-value | | 0.0001 | 0.4807 | 0.558 |

Table 4.1.2 Observed and expected number of red mites on apple leaves
| Number of Corn-Borers Per Plant | Observed Frequency | Expected Frequency: PD | Expected Frequency: PLD | Expected Frequency: PAD |
|---|---|---|---|---|
| 0 | 188 | 169.4 | 194.0 | 196.3 |
| 1 | 83 | 109.8 | 79.5 | 76.5 |
| 2 | 36 | 35.6 | 31.3 | 30.8 |
| 3 | 14 | | | |
| 4 | 2 | | | |
| 5 | 1 | | | |
| Total | 324 | 324.0 | 324.0 | 324.0 |
| Estimate of Parameter | | | | |
| $\chi^{2}$ | | 15.19 | 1.29 | 2.33 |
| d.f. | | 2 | 2 | 2 |
| p-value | | 0.0005 | 0.5247 | 0.3119 |

Table 4.1.3 Observed and expected number of European corn-borer of Mc Guire et al.19
It is obvious from the fitting of PAD, PLD and PD that PAD gives a much closer fit in Tables 4.2.1, 4.2.2 and 4.2.3, but in Table 4.2.4 PLD gives a better fit than PD and PAD.
| Number of Aberrations | Observed Frequency | Expected Frequency: PD | Expected Frequency: PLD | Expected Frequency: PAD |
|---|---|---|---|---|
| 0 | 268 | 231.3 | 257.0 | 260.4 |
| 1 | 87 | 126.7 | 93.4 | 89.7 |
| 2 | 26 | 34.7 | 32.8 | 32.1 |
| 3 | 9 | | 11.2 | 11.5 |
| 4 | 4 | | | |
| 5 | 2 | | | |
| 6 | 1 | | | |
| 7+ | 3 | | | |
| Total | 400 | 400.0 | 400.0 | 400.0 |
| Estimate of Parameter | | | | |
| $\chi^{2}$ | | 38.21 | 6.21 | 4.17 |
| d.f. | | 2 | 3 | 3 |
| p-value | | 0.0000 | 0.1018 | 0.2437 |

Table 4.2.1 Distribution of number of Chromatid aberrations (0.2 g chinon 1, 24 hours)
| Class/Exposure | Observed Frequency | Expected Frequency: PD | Expected Frequency: PLD | Expected Frequency: PAD |
|---|---|---|---|---|
| 0 | 413 | 374.0 | 405.7 | 409.5 |
| 1 | 124 | 177.4 | 133.6 | 128.7 |
| 2 | 42 | 42.1 | 42.6 | 42.1 |
| 3 | 15 | | 13.3 | 13.9 |
| 4 | 5 | | | |
| 5 | 0 | | | |
| 6 | 2 | | | |
| Total | 601 | 601.0 | 601.0 | 601.0 |
| Estimate of Parameter | | | | |
| $\chi^{2}$ | | 48.17 | 1.34 | 0.29 |
| d.f. | | 2 | 3 | 3 |
| p-value | | 0.0000 | 0.7196 | 0.9619 |

Table 4.2.2 Mammalian cytogenetic dosimetry lesions in rabbit lymphoblast induced by streptonigrin (NSC-45383), Exposure-60
| Class/Exposure | Observed Frequency | Expected Frequency: PD | Expected Frequency: PLD | Expected Frequency: PAD |
|---|---|---|---|---|
| 0 | 200 | 172.5 | 191.8 | 194.1 |
| 1 | 57 | 95.4 | 70.3 | 67.6 |
| 2 | 30 | 26.4 | 24.9 | 24.5 |
| 3 | 7 | | | |
| 4 | 4 | | | |
| 5 | 0 | | | |
| 6 | 2 | | | |
| Total | 300 | 300.0 | 300.0 | 300.0 |
| Estimate of Parameter | | | | |
| $\chi^{2}$ | | 29.68 | 3.91 | 3.12 |
| d.f. | | 2 | 2 | 2 |
| p-value | | 0.0000 | 0.1415 | 0.2101 |

Table 4.2.3 Mammalian cytogenetic dosimetry lesions in rabbit lymphoblast induced by streptonigrin (NSC-45383), Exposure-70
| Class/Exposure | Observed Frequency | Expected Frequency: PD | Expected Frequency: PLD | Expected Frequency: PAD |
|---|---|---|---|---|
| 0 | 155 | 127.8 | 158.3 | 160.7 |
| 1 | 83 | 109.0 | 77.2 | 74.3 |
| 2 | 33 | 46.5 | 35.9 | 35.3 |
| 3 | 14 | | 16.1 | 16.5 |
| 4 | 11 | | | |
| 5 | 3 | | | |
| 6 | 1 | | | |
| Total | 300 | 300.0 | 300.0 | 300.0 |
| Estimate of Parameter | | | | |
| $\chi^{2}$ | | 24.97 | 1.51 | 1.98 |
| d.f. | | 2 | 3 | 3 |
| p-value | | 0.0000 | 0.6799 | 0.5766 |

Table 4.2.4 Mammalian cytogenetic dosimetry lesions in rabbit lymphoblast induced by streptonigrin (NSC-45383), Exposure-90
Applications in thunderstorms
In thunderstorm activity, the occurrence of successive thunderstorm events (THEs) is often a dependent process: the occurrence of a THE indicates that the atmosphere is unstable and that conditions are favorable for the formation of further thunderstorm activity. The negative binomial distribution (NBD) is a possible alternative to the Poisson distribution when successive events are possibly dependent; see Johnson et al.6 The theoretical and empirical justification for using the NBD to describe THE activity has been fully explained and discussed by Falls et al.17 Further, for fitting the Poisson distribution to count data, the equality of mean and variance should be satisfied. Similarly, for fitting the NBD to count data, the mean should be less than the variance. In THE activity, these conditions are not fully satisfied. As a model to describe the frequencies of thunderstorms (THs), given an occurrence of THE, the PAD can be considered because it is always over-dispersed (Tables 4.3.1, 4.3.2, 4.3.3 and 4.3.4).
| No. of Thunderstorms | Observed Frequency | Expected Frequency: PD | Expected Frequency: PLD | Expected Frequency: PAD |
|---|---|---|---|---|
| 0 | 187 | 155.6 | 185.3 | 187.9 |
| 1 | 77 | 117.0 | 83.5 | 80.2 |
| 2 | 40 | 43.9 | 35.9 | 35.3 |
| 3 | 17 | | 15.0 | 15.4 |
| 4 | 6 | | | |
| 5 | 2 | | | |
| 6 | 1 | | | |
| Total | 330 | 330.0 | 330.0 | 330.0 |
| ML estimate | | | | |
| $\chi^{2}$ | | 31.93 | 1.43 | 1.35 |
| d.f. | | 2 | 3 | 3 |
| p-value | | 0.0000 | 0.6985 | 0.7173 |

Table 4.3.1 Observed and expected number of days that experienced X thunderstorm events at Cape Kennedy, Florida for the 11-year period of record for the month of June, January 1957 to December 1967, Falls et al.17
| No. of Thunderstorms | Observed Frequency | Expected Frequency: PD | Expected Frequency: PLD | Expected Frequency: PAD |
|---|---|---|---|---|
| 0 | 177 | 142.3 | 177.7 | 180.0 |
| 1 | 80 | 124.4 | 88.0 | 84.7 |
| 2 | 47 | 54.3 | 41.5 | 40.9 |
| 3 | 26 | | 18.9 | 19.4 |
| 4 | 9 | | | |
| 5 | 2 | | | |
| Total | 341 | 341.0 | 341.0 | 341.0 |
| ML estimate | | | | |
| $\chi^{2}$ | | 39.74 | 5.15 | 5.02 |
| d.f. | | 2 | 3 | 3 |
| p-value | | 0.0000 | 0.1611 | 0.1703 |

Table 4.3.2 Observed and expected number of days that experienced X thunderstorm events at Cape Kennedy, Florida for the 11-year period of record for the month of July, January 1957 to December 1967, Falls et al.17
| No. of Thunderstorms | Observed Frequency | Expected Frequency: PD | Expected Frequency: PLD | Expected Frequency: PAD |
|---|---|---|---|---|
| 0 | 185 | 151.8 | 184.8 | 187.5 |
| 1 | 89 | 122.9 | 87.2 | 83.9 |
| 2 | 30 | 49.7 | 39.3 | 38.6 |
| 3 | 24 | | 17.1 | 17.5 |
| 4 | 10 | | | |
| 5 | 3 | | | |
| Total | 341 | 341.0 | 341.0 | 341.0 |
| ML estimate | | | | |
| $\chi^{2}$ | | 49.49 | 5.03 | 4.69 |
| d.f. | | 2 | 3 | 3 |
| p-value | | 0.0000 | 0.1696 | 0.196 |

Table 4.3.3 Observed and expected number of days that experienced X thunderstorm events at Cape Kennedy, Florida for the 11-year period of record for the month of August, January 1957 to December 1967, Falls et al.17
| No. of Thunderstorms | Observed Frequency | Expected Frequency: PD | Expected Frequency: PLD | Expected Frequency: PAD |
|---|---|---|---|---|
| 0 | 549 | 449.0 | 547.5 | 555.1 |
| 1 | 246 | 364.8 | 259.0 | 249.2 |
| 2 | 117 | 148.2 | 116.9 | 114.9 |
| 3 | 67 | 40.1 | 51.2 | 52.3 |
| 4 | 25 | | 21.9 | 23.2 |
| 5 | 7 | | | |
| 6 | 1 | | | |
| Total | 1012 | 1012.0 | 1012.0 | 1012.0 |
| ML estimate | | | | |
| $\chi^{2}$ | | 119.45 | 9.60 | 9.40 |
| d.f. | | 3 | 4 | 4 |
| p-value | | 0.0000 | 0.0477 | 0.0518 |

Table 4.3.4 Observed and expected number of days that experienced X thunderstorm events at Cape Kennedy, Florida for the 11-year period of record for the summer, January 1957 to December 1967, Falls et al.17
It is obvious from the fitting of PAD, PLD and PD that PAD gives a closer fit than PLD and PD in all data-sets relating to thunderstorms, and hence PAD can be considered an important model for modeling thunderstorm events.
Acknowledgments
Conflicts of interest
The authors declare that there are no conflicts of interest.
References
- Shanker R. The discrete Poisson-Akash distribution. Communicated, 2016.
- Shanker R. Akash distribution and Its Applications. International Journal of Probability and Statistics. 2015;4(3):65‒75.
- Lindley DV. Fiducial distributions and Bayes theorem. Journal of Royal Statistical Society. 1958;20(1):102‒107.
- Sankaran M. The discrete Poisson-Lindley distribution. Biometrics. 1970;26(1):145‒149.
- Shanker R, Hagos F, Sujatha S. On Modeling of Lifetime Data Using One Parameter Akash, Lindley and Exponential Distributions. Biometrics & Biostatistics International Journal. 2016;3(2):1‒10.
- Johnson NL, Kotz S, Kemp AW. Univariate Discrete Distributions, 2nd ed. John Wiley & sons Inc, USA, 1992.
- Fisher RA, Corbet AS, Williams CB. The relation between the number of species and the number of individuals in a random sample of an animal population. Journal of Animal Ecology. 1943;12(1):42‒58.
- Kempton RA. A generalized form of Fisher’s logarithmic series. Biometrika. 1975;62(1):29‒38.
- Tripathi RC, Gupta RC. A generalization of the log-series distribution. Communications in Statistics. 1985;14(8):1779‒1799.
- Shanker R. Generalized Logarithmic Series Distributions and Their Applications, Unpublished Ph.D Thesis, Patna, India, 2002.
- Mishra A, Shanker R. Generalized logarithmic series distribution-Its nature and applications, Proceedings of the Vth International Symposium on Optimization and Statistics. 2002; p. 155‒168.
- Shanker R, Hagos F. On Poisson-Lindley distribution and Its applications to Biological Sciences. Biometrics and Biostatistics International Journal. 2015;2(4):1‒5.
- Loeschke V, Kohler W. Deterministic and Stochastic models of the negative binomial distribution and the analysis of chromosomal aberrations in human leukocytes. Biometrische Zeitschrift. 1976;18(6):427‒451.
- Janardan KG, Schaeffer DJ. Models for the analysis of chromosomal aberrations in human leukocytes. Biometrical Journal. 1977;19(8):599‒612.
- Catcheside DG, Lea DE, Thoday JM. Types of chromosome structural change induced by the irradiation on Tradescantia microspores. J Genet. 1946;47:113‒136.
- Catcheside DG, Lea DE, Thoday JM. The production of chromosome structural changes in Tradescantia microspores in relation to dosage, intensity and temperature. J Genet. 1946b;47:137‒149.
- Falls LW, Williford WO, Carter MC. Probability distributions for thunderstorm activity at Cape Kennedy, Florida. Journal of Applied Meteorology. 1970;10:97‒104.
- Gosset WS. The probable error of a mean. Biometrika. 1908;6:1‒25.
- Mc Guire JU, Brindley TA, Bancroft TA. The distribution of European corn-borer larvae pyrausta in field corn. Biometrics. 1957;13(1):65‒78.
©2016 Shanker, et al. This is an open access article distributed under the terms of the,
which
permits unrestricted use, distribution, and build upon your work non-commercially.