The discrete Poisson-Garima distribution

doi:10.15406/bbij.2017.05.00127

eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 5 Issue 2

Rama Shanker

Department of Statistics, Eritrea Institute of Technology, Eritrea

Correspondence: Rama Shanker, Department of Statistics, Eritrea Institute of Technology, Eritrea

Received: December 13, 2016 | Published: February 13, 2017

Citation: Shanker R. The discrete poisson-garima distribution. Biom Biostat Int J. 2017;5(2):48-53. DOI: 10.15406/bbij.2017.05.00127

Abstract

In this paper, a discrete Poisson-Garima distribution has been obtained by compounding the Poisson distribution with the Garima distribution introduced by Shanker.¹ The general expression for the $r$th factorial moment has been derived, and hence the moments about origin and the central moments have been obtained. Expressions for the coefficient of variation, skewness, kurtosis and index of dispersion have been given. Maximum likelihood estimation and the method of moments have been discussed for estimating the parameter of the distribution. Two real data sets have been used to test the goodness of fit of the discrete Poisson-Garima distribution, and the fit has been compared with that of the Poisson and Poisson-Lindley distributions.

Keywords: Garima distribution, Poisson-Lindley distribution, compounding, moments, skewness, kurtosis, index of dispersion, estimation of parameter, goodness of fit

Introduction

Shanker¹ introduced a lifetime distribution named Garima distribution having probability density function

$f (x; θ) = \frac{θ}{θ + 2} (1 + θ + θ x) e^{- θ x}; x > 0, θ > 0$ (1.1)

to model behavioral science data. Shanker¹ has shown that the Garima distribution gives a much better fit than the exponential and Lindley² distributions and than the lifetime distributions introduced by Shanker,³⁻⁶ namely the Shanker, Akash, Aradhana and Sujatha distributions. The first four moments about origin of the Garima distribution obtained by Shanker¹ are given by

$μ_{1}^{'} = \frac{θ + 3}{θ (θ + 2)}$ , $μ_{2}^{'} = \frac{2 (θ + 4)}{θ^{2} (θ + 2)}$
$μ_{3}^{'} = \frac{6 (θ + 5)}{θ^{3} (θ + 2)}$ , $μ_{4}^{'} = \frac{24 (θ + 6)}{θ^{4} (θ + 2)}$ .
The central moments of Garima distribution obtained by Shanker¹ are as follows

$μ_{2} = \frac{θ^{2} + 6 θ + 7}{θ^{2} {(θ + 2)}^{2}}$
$μ_{3} = \frac{2 (θ^{3} + 9 θ^{2} + 21 θ + 15)}{θ^{3} {(θ + 2)}^{3}}$
$μ_{4} = \frac{3 (3 θ^{4} + 36 θ^{3} + 134 θ^{2} + 204 θ + 111)}{θ^{4} {(θ + 2)}^{4}}$

Shanker¹ has studied various properties of Garima distribution including its shape, moments, skewness, kurtosis, hazard rate function, mean residual life function, stochastic ordering, mean deviations, order statistics, Bonferroni and Lorenz curves, entropy measure and stress-strength reliability, amongst others. The estimation of the parameter of Garima distribution has been discussed by Shanker¹ using both maximum likelihood estimation and the method of moments.

In this paper, a Poisson mixture of the Garima distribution, named the Poisson-Garima distribution (PGD), has been proposed and its statistical and mathematical properties have been investigated. The estimation of its parameter has been studied using maximum likelihood estimation and the method of moments. The Poisson-Lindley distribution (PLD), a Poisson mixture of the Lindley² distribution introduced by Sankaran,⁷ gives a better fit than the Poisson distribution. Since the Garima distribution gives a better fit than the Lindley distribution, the Poisson-Garima distribution is expected to give a better fit than both the Poisson and Poisson-Lindley distributions. The goodness of fit of the Poisson-Garima distribution has been discussed and compared with that of the Poisson and Poisson-Lindley distributions.

Poisson-Garima distribution

Assuming that the parameter $λ$ of Poisson distribution follows Garima distribution (1.1), the Poisson mixture of Garima distribution can be obtained as

$P (X = x) = \int_{0}^{\infty} \frac{e^{- λ} λ^{x}}{x!} \cdot \frac{θ}{θ + 2} (1 + θ + θ λ) e^{- θ λ} d λ$ (2.1)
$= \frac{θ}{(θ + 2) x!} \int_{0}^{\infty} e^{- (θ + 1) λ} [(1 + θ) λ^{x} + θ λ^{x + 1}] d λ$
$= \frac{θ}{θ + 2} \cdot \frac{θ x + (θ^{2} + 3 θ + 1)}{{(θ + 1)}^{x + 2}}; x = 0, 1, 2, 3, ..., θ > 0$ (2.2)

This probability mass function (p.m.f.) has been named the “Poisson-Garima distribution (PGD)”.
It should be noted that Sankaran⁷ obtained the Poisson-Lindley distribution (PLD), having probability mass function (p.m.f.)
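As a quick numerical sanity check on (2.2), the following minimal Python sketch compares the closed-form p.m.f. with a direct numerical evaluation of the mixture integral (2.1). The function names, the value $θ = 1.5$, the truncation point of the integral and the use of Simpson's rule are illustrative assumptions, not part of the original paper.

```python
import math


def pgd_pmf(x: int, theta: float) -> float:
    """Closed-form PGD probability P(X = x) from equation (2.2)."""
    c = theta**2 + 3.0 * theta + 1.0
    return theta / (theta + 2.0) * (theta * x + c) / (theta + 1.0) ** (x + 2)


def pgd_by_integration(x: int, theta: float, upper: float = 60.0, steps: int = 20000) -> float:
    """Composite Simpson approximation of the mixture integral (2.1).

    The infinite upper limit is truncated at `upper` (an assumption that is
    harmless here because the integrand decays like exp(-(theta + 1) * lam)).
    """
    def integrand(lam: float) -> float:
        poisson = math.exp(-lam) * lam**x / math.factorial(x)
        garima = theta / (theta + 2.0) * (1.0 + theta + theta * lam) * math.exp(-theta * lam)
        return poisson * garima

    h = upper / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


theta = 1.5  # illustrative parameter value
for x in range(4):
    print(x, pgd_pmf(x, theta), pgd_by_integration(x, theta))

# The probabilities should add up to one (the tail beyond x = 200 is negligible)
print(sum(pgd_pmf(x, theta) for x in range(200)))
```

The two columns should agree to many decimal places, and the truncated sum should be numerically indistinguishable from one.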
$P (X = x) = \frac{θ^{2} (x + θ + 2)}{{(θ + 1)}^{x + 3}}; x = 0, 1, 2, ..., θ > 0$ (2.3)

by compounding the Poisson distribution with the Lindley distribution, introduced by Lindley,² having probability density function (p.d.f.)

$f (x, θ) = \frac{θ^{2}}{θ + 1} (1 + x) e^{- θ x}; x > 0, θ > 0$ (2.4)

The graphs of the p.m.f. of the Poisson-Garima distribution (PGD) and the Poisson-Lindley distribution (PLD) for varying values of the parameter are shown in Figure 1.

Figure 1 Graphs of the pmf of PGD and PLD For Varying Values of the Parameter.

Moments and related measures of PGD

The $r$th factorial moment about origin of the Poisson-Garima distribution (2.2) can be obtained as
$μ_{(r)}^{'} = E [E (X^{(r)} | λ)]$ , where $X^{(r)} = X (X - 1) (X - 2) ... (X - r + 1)$
$= \frac{θ}{θ + 2} \int_{0}^{\infty} [\sum_{x = 0}^{\infty} x^{(r)} \frac{e^{- λ} λ^{x}}{x!}] (1 + θ + θ λ) e^{- θ λ} d λ$
$= \frac{θ}{θ + 2} \int_{0}^{\infty} [λ^{r} \sum_{x = r}^{\infty} \frac{e^{- λ} λ^{x - r}}{(x - r)!}] (1 + θ + θ λ) e^{- θ λ} d λ$
Replacing $x$ by $x + r$ inside the bracket, we get
$μ_{(r)}^{'} =$ $\frac{θ}{θ + 2} \int_{0}^{\infty} λ^{r} [\sum_{x = 0}^{\infty} \frac{e^{- λ} λ^{x}}{x!}] (1 + θ + θ λ) e^{- θ λ} d λ$

The expression within the bracket is the sum of all Poisson probabilities and hence equals unity, so we have
$μ_{(r)}^{'} =$ $\frac{θ}{θ + 2} \int_{0}^{\infty} λ^{r} (1 + θ + θ λ) e^{- θ λ} d λ$
Using the gamma integral and some algebraic simplification, we finally get a general expression for the $r$th factorial moment of the PGD (2.2) as
$μ_{(r)}^{'} = \frac{r! (θ + r + 2)}{θ^{r} (θ + 2)}; r = 1, 2, 3, ....$ (3.1)
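For reference, the gamma-integral step leading to (3.1) can be written out explicitly. The lines below are a worked restatement that uses only the standard result $\int_{0}^{\infty} λ^{r} e^{- θ λ} d λ = r! / θ^{r + 1}$ and the symbols already defined above.
$\int_{0}^{\infty} λ^{r} (1 + θ + θ λ) e^{- θ λ} d λ = (1 + θ) \frac{r!}{θ^{r + 1}} + θ \frac{(r + 1)!}{θ^{r + 2}} = \frac{r! (θ + r + 2)}{θ^{r + 1}}$
$μ_{(r)}^{'} = \frac{θ}{θ + 2} \cdot \frac{r! (θ + r + 2)}{θ^{r + 1}} = \frac{r! (θ + r + 2)}{θ^{r} (θ + 2)}$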

Substituting $r = 1, 2, 3$ and $4$ in (3.1), the first four factorial moments can be obtained. Using the relationship between factorial moments and moments about origin, the first four moments about origin of the PGD (2.2) are obtained as
${μ^{'}}_{1} = \frac{θ + 3}{θ (θ + 2)}$
${μ^{'}}_{2} = \frac{θ^{2} + 5 θ + 8}{θ^{2} (θ + 2)}$
${μ^{'}}_{3} = \frac{θ^{3} + 9 θ^{2} + 30 θ + 30}{θ^{3} (θ + 2)}$
${μ^{'}}_{4} = \frac{θ^{4} + 17 θ^{3} + 92 θ^{2} + 204 θ + 144}{θ^{4} (θ + 2)}$
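The relationship between factorial moments and moments about origin used here is the standard one; as a worked illustration (not spelled out in the original), it reads
$μ_{1}^{'} = μ_{(1)}^{'}$ , $μ_{2}^{'} = μ_{(2)}^{'} + μ_{(1)}^{'}$ , $μ_{3}^{'} = μ_{(3)}^{'} + 3 μ_{(2)}^{'} + μ_{(1)}^{'}$ , $μ_{4}^{'} = μ_{(4)}^{'} + 6 μ_{(3)}^{'} + 7 μ_{(2)}^{'} + μ_{(1)}^{'}$
For example, with $μ_{(1)}^{'} = \frac{θ + 3}{θ (θ + 2)}$ and $μ_{(2)}^{'} = \frac{2 (θ + 4)}{θ^{2} (θ + 2)}$ from (3.1),
$μ_{2}^{'} = \frac{2 (θ + 4)}{θ^{2} (θ + 2)} + \frac{θ + 3}{θ (θ + 2)} = \frac{2 θ + 8 + θ^{2} + 3 θ}{θ^{2} (θ + 2)} = \frac{θ^{2} + 5 θ + 8}{θ^{2} (θ + 2)}$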
Using the relationship between moments about mean and the moments about origin, the moments about mean of the PGD (2.2) are obtained as
$μ_{2} = σ^{2} = \frac{θ^{3} + 6 θ^{2} + 12 θ + 7}{θ^{2} {(θ + 2)}^{2}}$
$μ_{3} = \frac{θ^{5} + 10 θ^{4} + 42 θ^{3} + 87 θ^{2} + 84 θ + 30}{θ^{3} {(θ + 2)}^{3}}$
$μ_{4} = \frac{(θ^{7} + 19 θ^{6} + 148 θ^{5} + 607 θ^{4} + 1402 θ^{3} + 1816 θ^{2} + 1224 θ + 333)}{θ^{4} {(θ + 2)}^{4}}$
The coefficient of variation $(C . V)$ , coefficient of skewness $(\sqrt{β_{1}})$ , coefficient of kurtosis $(β_{2})$ and index of dispersion $(γ)$ of the PGD (2.2) are thus obtained as
$C . V = \frac{σ}{{μ^{'}}_{1}} = \frac{\sqrt{θ^{3} + 6 θ^{2} + 12 θ + 7}}{θ + 3}$
$\sqrt{β_{1}} = \frac{μ_{3}}{μ_{2}^{3 / 2}} = \frac{θ^{5} + 10 θ^{4} + 42 θ^{3} + 87 θ^{2} + 84 θ + 30}{{(θ^{3} + 6 θ^{2} + 12 θ + 7)}^{3 / 2}}$
$β_{2} = \frac{μ_{4}}{μ_{2}^{2}} = \frac{(θ^{7} + 19 θ^{6} + 148 θ^{5} + 607 θ^{4} + 1402 θ^{3} + 1816 θ^{2} + 1224 θ + 333)}{{(θ^{3} + 6 θ^{2} + 12 θ + 7)}^{2}}$
$γ = \frac{σ^{2}}{μ_{1}^{'}} = \frac{θ^{3} + 6 θ^{2} + 12 θ + 7}{θ (θ + 2) (θ + 3)}$
To study the nature and behavior of $μ_{1}^{'}$, $μ_{2}$, C.V, $\sqrt{β_{1}}$, $β_{2}$ and $γ$ of the PGD and PLD, values of these characteristics for varying values of the parameter $θ$ have been computed and are presented in Table 1.

	Values of $θ$ for Poisson-Garima Distribution
	1	2	3	4	5	6
$μ_{1}'$	1.333333	0.625	0.4	0.291667	0.228571	0.1875
$μ_{2}$	2.888889	0.984375	0.551111	0.373264	0.279184	0.221788
CV	1.274755	1.587451	1.855921	2.094697	2.311655	2.511701
$\sqrt{β_{1}}$	1.915904	2.147798	2.355147	2.54717	2.727407	2.897852
$β_{2}$	8.210059	9.335601	10.36498	11.36106	12.34549	13.32641
$γ$	2.166667	1.575	1.377778	1.279762	1.221429	1.18287

	Values of $θ$ for Poisson-Lindley Distribution
	1	2	3	4	5	6
$μ_{1}'$	1.5	0.666667	0.416667	0.3	0.233333	0.190476
$μ_{2}$	3.25	1.055556	0.576389	0.385	0.285556	0.225624
CV	1.20185	1.541104	1.822087	2.068279	2.290174	2.493742
$\sqrt{β_{1}}$	1.792108	2.083265	2.314307	2.517935	2.704839	2.87957
$β_{2}$	7.532544	8.941828	10.10611	11.17187	12.19654	13.203
$γ$	2.166667	1.583333	1.383333	1.283333	1.22381	1.184524

Table 1 Values of $μ_{1}^{'}$, $μ_{2}$, C.V, $\sqrt{β_{1}}$, $β_{2}$ and $γ$ of PGD and PLD for varying values of the parameter $θ$
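The PGD rows of Table 1 can be recomputed directly from the closed-form expressions for $μ_{1}^{'}$, $μ_{2}$, $μ_{3}$ and $μ_{4}$ given above. The sketch below is one way to do so; the function name and the dictionary layout are illustrative choices, not the author's code.

```python
import math


def pgd_measures(theta: float) -> dict:
    """Mean, variance, C.V, skewness, kurtosis and index of dispersion of the PGD."""
    d = theta * (theta + 2.0)
    # Numerators of the central moments mu_2, mu_3 and mu_4 as given in the text
    m2_num = theta**3 + 6 * theta**2 + 12 * theta + 7
    m3_num = (theta**5 + 10 * theta**4 + 42 * theta**3
              + 87 * theta**2 + 84 * theta + 30)
    m4_num = (theta**7 + 19 * theta**6 + 148 * theta**5 + 607 * theta**4
              + 1402 * theta**3 + 1816 * theta**2 + 1224 * theta + 333)
    return {
        "mean": (theta + 3.0) / d,
        "variance": m2_num / d**2,
        "cv": math.sqrt(m2_num) / (theta + 3.0),
        "skewness": m3_num / m2_num**1.5,
        "kurtosis": m4_num / m2_num**2,
        "dispersion": m2_num / (d * (theta + 3.0)),
    }


# One printed line per column of the PGD block of Table 1 (theta = 1, ..., 6)
for theta in range(1, 7):
    measures = pgd_measures(float(theta))
    print(theta, " ".join(f"{value:.6f}" for value in measures.values()))
```

For $θ = 1$ the mean, variance and index of dispersion reduce to $4/3$, $26/9$ and $13/6$, which agree with the first column of the table.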

No. of errors per group	Observed frequency	Expected frequency (PD)	Expected frequency (PLD)	Expected frequency (PGD)
0	35	27.4	33.0	33.3
1	11	21.5	15.3	15.1
2	8	8.4	6.8	6.6
3	4	2.2	2.9	2.9
4	2	0.5	2.0	2.1
Total	60	60.0	60.0	60.0
ML estimate		$\hat{θ} = 0.7833$	$\hat{θ} = 1.7434$	$\hat{θ} = 1.628413$
$χ^{2}$		7.98	2.20	1.71
d.f.		1	1	2
p-value		0.0047	0.1380	0.4253
Classes pooled for the $χ^{2}$ test (braced in the original table): 2 to 4 for PD and PLD; 3 to 4 for PGD.

Table 2 Distribution of mistakes in copying groups of random digits

No. of insects	Observed frequency	Expected frequency (PD)	Expected frequency (PLD)	Expected frequency (PGD)
0	33	26.4	31.5	31.7
1	12	19.8	14.2	14.0
2	6	7.4	6.1	6.0
3	3	1.8	2.5	2.5
4	1	0.3	1.0	1.0
5	1	0.3	0.7	0.8
Total	56	56.0	56.0	56.0
ML estimate		$\hat{θ} = 0.7500$	$\hat{θ} = 1.8081$	$\hat{θ} = 1.695033$
$χ^{2}$		4.87	0.53	0.38
d.f.		1	1	1
p-value		0.0273	0.4660	0.5376
Classes pooled for the $χ^{2}$ test (braced in the original table): 2 to 5 for PD, PLD and PGD.

Table 3 Distribution of Pyrausta nubilalis

The graphs of the coefficient of variation (C.V), coefficient of skewness $(\sqrt{β_{1}})$ , coefficient of kurtosis $(β_{2})$ , and index of dispersion $(γ)$ of the PGD and PLD are presented in Figure 2.

Figure 2 Graphs of C.V, $\sqrt{β_{1}}$, $β_{2}$, and $γ$ of PGD and PLD for varying values of the parameter $θ$.

Statistical properties of PGD

The PGD (2.2) is always over-dispersed $(σ^{2} > μ)$ .
We have
$σ^{2} = \frac{θ^{3} + 6 θ^{2} + 12 θ + 7}{θ^{2} {(θ + 2)}^{2}}$
$= \frac{θ + 3}{θ (θ + 2)} [\frac{θ^{3} + 6 θ^{2} + 12 θ + 7}{θ (θ + 2) (θ + 3)}]$
$= \frac{θ + 3}{θ (θ + 2)} [1 + \frac{θ^{2} + 6 θ + 7}{θ (θ + 2) (θ + 3)}]$
$= μ [1 + \frac{θ^{2} + 6 θ + 7}{θ (θ + 2) (θ + 3)}] > μ$
This shows that the PGD (2.2) is always over-dispersed.

Unimodality and increasing hazard rate

Since
$\frac{P (x + 1; θ)}{P (x; θ)} = \frac{θ (x + 1) + (θ^{2} + 3 θ + 1)}{(θ + 1) [θ x + (θ^{2} + 3 θ + 1)]} = \frac{1}{θ + 1} [1 + \frac{θ}{θ x + (θ^{2} + 3 θ + 1)}]$ is a decreasing function of $x$, $P (x; θ)$ is log-concave. Therefore, the PGD is unimodal and has an increasing hazard rate. A detailed discussion of the relationship between log-concavity, unimodality and increasing hazard rate of discrete distributions can be found in Grandell.⁸
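The two monotonicity claims can also be checked numerically for particular parameter values. The sketch below is only an illustration under stated assumptions: the values of $θ$, the range of $x$ and the truncation point used for the survival function $P (X ≥ x)$ are arbitrary choices, and the hazard rate is taken as $P (X = x) / P (X ≥ x)$.

```python
def pgd_pmf(x: int, theta: float) -> float:
    """PGD probability P(X = x) from equation (2.2)."""
    c = theta**2 + 3.0 * theta + 1.0
    return theta / (theta + 2.0) * (theta * x + c) / (theta + 1.0) ** (x + 2)


def successive_ratios(theta: float, x_max: int = 30) -> list:
    """Ratios P(x + 1) / P(x) for x = 0, ..., x_max - 1."""
    return [pgd_pmf(x + 1, theta) / pgd_pmf(x, theta) for x in range(x_max)]


def hazard_rates(theta: float, x_max: int = 30, tail: int = 300) -> list:
    """Hazard rates P(X = x) / P(X >= x), with the survival sum truncated at `tail`."""
    probs = [pgd_pmf(x, theta) for x in range(tail)]
    survival = [0.0] * tail
    running = 0.0
    for x in range(tail - 1, -1, -1):  # accumulate from the far tail inwards
        running += probs[x]
        survival[x] = running
    return [probs[x] / survival[x] for x in range(x_max + 1)]


def strictly_decreasing(values: list) -> bool:
    return all(b < a for a, b in zip(values, values[1:]))


def strictly_increasing(values: list) -> bool:
    return all(b > a for a, b in zip(values, values[1:]))


for theta in (0.5, 1.5, 4.0):  # illustrative parameter values
    print(theta,
          strictly_decreasing(successive_ratios(theta)),
          strictly_increasing(hazard_rates(theta)))
```

A numerical check of this kind does not replace the argument above; it only confirms the behaviour on the grid that is examined.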

Generating functions

Probability generating function: The probability generating function of the PGD (2.2) can be obtained as
$P_{X} (t) = E (t^{X}) = \frac{θ}{(θ + 2) {(θ + 1)}^{2}} [θ \sum_{x = 0}^{\infty} x {(\frac{t}{θ + 1})}^{x} + (θ^{2} + 3 θ + 1) \sum_{x = 0}^{\infty} {(\frac{t}{θ + 1})}^{x}]$
$= \frac{θ}{(θ + 2) {(θ + 1)}^{2}} [\frac{θ (θ + 1) t}{{(θ + 1 - t)}^{2}} + \frac{(θ^{2} + 3 θ + 1) (θ + 1)}{θ + 1 - t}]$
$= \frac{θ}{(θ + 2) (θ + 1)} [\frac{θ t}{{(θ + 1 - t)}^{2}} + \frac{θ^{2} + 3 θ + 1}{θ + 1 - t}]$
$= \frac{θ [θ^{3} + (4 - t) θ^{2} + 2 (2 - t) θ + (1 - t)]}{(θ + 1) (θ + 2) {(θ + 1 - t)}^{2}}$

Moment generating function: The moment generating function of the PGD (2.2) is thus given by
$M_{X} (t) = \frac{θ [θ^{3} + (4 - e^{t}) θ^{2} + 2 (2 - e^{t}) θ + (1 - e^{t})]}{(θ + 1) (θ + 2) {(θ + 1 - e^{t})}^{2}}$ .
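A generating function is easy to get wrong by a constant factor, so it is worth checking it against the defining series. The sketch below evaluates the probability generating function in the partial-fraction form $\frac{θ}{(θ + 2) (θ + 1)} [\frac{θ t}{{(θ + 1 - t)}^{2}} + \frac{θ^{2} + 3 θ + 1}{θ + 1 - t}]$ obtained in the derivation above, and compares it with $\sum_{x} t^{x} P (X = x)$. It also checks that $P_{X} (1) = 1$ and that the derivative at $t = 1$ recovers the mean. The function names, the test point $t = 0.7$ and the finite-difference step are illustrative assumptions.

```python
def pgd_pmf(x: int, theta: float) -> float:
    """PGD probability P(X = x) from equation (2.2)."""
    c = theta**2 + 3.0 * theta + 1.0
    return theta / (theta + 2.0) * (theta * x + c) / (theta + 1.0) ** (x + 2)


def pgd_pgf(t: float, theta: float) -> float:
    """Closed-form PGF in partial-fraction form, valid for |t| < theta + 1."""
    c = theta**2 + 3.0 * theta + 1.0
    s = theta + 1.0 - t
    return theta / ((theta + 2.0) * (theta + 1.0)) * (theta * t / s**2 + c / s)


def pgd_pgf_series(t: float, theta: float, terms: int = 300) -> float:
    """Truncated series sum of t**x * P(X = x)."""
    return sum(t**x * pgd_pmf(x, theta) for x in range(terms))


theta, t = 1.5, 0.7  # illustrative values
print(pgd_pgf(t, theta), pgd_pgf_series(t, theta))

# A PGF must equal one at t = 1
print(pgd_pgf(1.0, theta))

# Central-difference derivative at t = 1 approximates the mean
h = 1e-5
mean_from_pgf = (pgd_pgf(1.0 + h, theta) - pgd_pgf(1.0 - h, theta)) / (2.0 * h)
print(mean_from_pgf, (theta + 3.0) / (theta * (theta + 2.0)))
```

The moment generating function follows by substituting $e^{t}$ for $t$, so the same check applies to it for $e^{t} < θ + 1$.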

Estimation of parameter

Maximum likelihood estimate (MLE): Let $x_{1}, x_{2}, ..., x_{n}$ be a random sample of size $n$ from the PGD (2.2) and let $f_{x}$ be the observed frequency in the sample corresponding to $X = x$ $(x = 0, 1, 2, ..., k)$ such that $\sum_{x = 0}^{k} f_{x} = n$ , where $k$ is the largest observed value having non-zero frequency. The likelihood function $L$ of the PGD (2.2) is given by
$L = {(\frac{θ}{θ + 2})}^{n} \frac{1}{{(θ + 1)}^{\sum_{x = 0}^{k} (x + 2) f_{x}}} \prod_{x = 0}^{k} {[θ x + (θ^{2} + 3 θ + 1)]}^{f_{x}}$
The log likelihood function is obtained as
$\log L = n \log (\frac{θ}{θ + 2}) - \sum_{x = 0}^{k} f_{x} (x + 2) \log (θ + 1) + \sum_{x = 0}^{k} f_{x} \log [θ x + (θ^{2} + 3 θ + 1)]$
The first derivative of the log likelihood function is given by
$\frac{d \log L}{d θ} = \frac{2 n}{θ (θ + 2)} - \frac{n (\bar{x} + 2)}{θ + 1} + \sum_{x = 0}^{k} \frac{(x + 2 θ + 3) f_{x}}{θ x + (θ^{2} + 3 θ + 1)}$
where $\bar{x}$ is the sample mean.
The maximum likelihood estimate (MLE), $\hat{θ}$ of $θ$ is the solution of the equation $\frac{d \log L}{d θ} = 0$ and is given by the solution of the non-linear equation
$\frac{2 n}{θ (θ + 2)} - \frac{n (\bar{x} + 2)}{θ + 1} + \sum_{x = 0}^{k} \frac{(x + 2 θ + 3) f_{x}}{θ x + (θ^{2} + 3 θ + 1)} = 0$
This non-linear equation can be solved by any numerical iteration method, such as the Newton-Raphson, bisection or regula falsi method.
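As an illustration of the bisection option, the sketch below solves the likelihood equation for the observed frequencies of the first data set in the Applications section (35, 11, 8, 4, 2 for counts 0 to 4). The frequencies are indexed from $x = 0$ because the zero class belongs to the support of (2.2). The function names, the bracketing interval and the tolerance are assumptions made for this example, not the author's implementation.

```python
def score(theta: float, freqs: list) -> float:
    """Derivative of the PGD log likelihood; freqs[x] is the frequency of count x."""
    n = sum(freqs)
    xbar = sum(x * f for x, f in enumerate(freqs)) / n
    c = theta**2 + 3.0 * theta + 1.0
    last_term = sum(f * (x + 2.0 * theta + 3.0) / (theta * x + c)
                    for x, f in enumerate(freqs))
    return 2.0 * n / (theta * (theta + 2.0)) - n * (xbar + 2.0) / (theta + 1.0) + last_term


def mle_bisection(freqs: list, lo: float = 0.01, hi: float = 100.0, tol: float = 1e-10) -> float:
    """Root of the score function on [lo, hi] by bisection (bracket is an assumption)."""
    f_lo, f_hi = score(lo, freqs), score(hi, freqs)
    if f_lo * f_hi > 0:
        raise ValueError("score does not change sign on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = score(mid, freqs)
        if f_lo * f_mid <= 0:
            hi = mid            # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return 0.5 * (lo + hi)


observed = [35, 11, 8, 4, 2]  # frequencies of counts 0, 1, 2, 3, 4
theta_hat = mle_bisection(observed)
print(theta_hat)
```

The result should be close to the ML estimate $\hat{θ} = 1.628413$ reported for the PGD on this data set.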

Method of moment estimate (MOME): Let $x_{1}, x_{2}, ..., x_{n}$ be a random sample of size $n$ from the PGD (2.2). Equating the first population moment about origin to the corresponding sample moment, the MOME $\tilde{θ}$ of $θ$ is given by
$\tilde{θ} = \frac{(1 - 2 \bar{x}) + \sqrt{4 {\bar{x}}^{2} + 8 \bar{x} + 1}}{2 \bar{x}}; \bar{x} > 0$
where $\bar{x}$ is the sample mean.
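The MOME is the positive root of the quadratic $\bar{x} θ^{2} + (2 \bar{x} - 1) θ - 3 = 0$ obtained by setting $\frac{θ + 3}{θ (θ + 2)} = \bar{x}$. A minimal sketch of the computation follows; the function names are illustrative, and the sample mean $47 / 60$ is the mean of the first data set in the Applications section.

```python
import math


def pgd_mome(xbar: float) -> float:
    """Method-of-moments estimate of theta from the sample mean."""
    if xbar <= 0:
        raise ValueError("the sample mean must be positive")
    return ((1.0 - 2.0 * xbar) + math.sqrt(4.0 * xbar**2 + 8.0 * xbar + 1.0)) / (2.0 * xbar)


def pgd_mean(theta: float) -> float:
    """Population mean of the PGD."""
    return (theta + 3.0) / (theta * (theta + 2.0))


xbar = 47 / 60  # mean of the counts 0..4 with frequencies 35, 11, 8, 4, 2
theta_tilde = pgd_mome(xbar)

# Plugging the estimate back into the mean should return the sample mean
print(theta_tilde, pgd_mean(theta_tilde), xbar)
```

Because the estimate has a closed form, it is also a convenient starting value for an iterative solution of the likelihood equation.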

Applications

The PGD has been fitted to a number of data sets to test its goodness of fit over the Poisson distribution (PD) and the Poisson-Lindley distribution (PLD). The parameter has been estimated using maximum likelihood estimation. Two examples of observed data sets, to which the PD, PLD and PGD have been fitted, are presented in Tables 2 and 3. The first data set is due to Kemp and Kemp⁹ regarding the distribution of mistakes in copying groups of random digits, and the second is due to Beall¹⁰ regarding the distribution of Pyrausta nubilalis.
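The goodness-of-fit computation for the PGD column of Table 2 can be sketched as follows. The sketch assumes the reported estimate $\hat{θ} = 1.628413$, lets the last class absorb the upper tail $P (X ≥ 4)$ so that the expected frequencies sum to 60, and pools the classes 3 and 4 as indicated in the table. The variable names are illustrative, and the closed-form p-value $e^{- χ^{2} / 2}$ is valid only because the test has 2 degrees of freedom.

```python
import math


def pgd_pmf(x: int, theta: float) -> float:
    """PGD probability P(X = x) from equation (2.2)."""
    c = theta**2 + 3.0 * theta + 1.0
    return theta / (theta + 2.0) * (theta * x + c) / (theta + 1.0) ** (x + 2)


observed = [35, 11, 8, 4, 2]  # frequencies of counts 0, 1, 2, 3, 4 (Table 2)
theta_hat = 1.628413          # ML estimate reported in Table 2
n = sum(observed)

# Expected frequencies; the last class absorbs the upper tail P(X >= 4)
probs = [pgd_pmf(x, theta_hat) for x in range(4)]
probs.append(1.0 - sum(probs))
expected = [n * p for p in probs]

# Pool the last two classes, as done for the PGD column of Table 2
obs_pooled = observed[:3] + [sum(observed[3:])]
exp_pooled = expected[:3] + [sum(expected[3:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 1 - 1      # classes - 1 - number of estimated parameters
p_value = math.exp(-chi2 / 2.0)   # chi-square survival function for df == 2 only

print([round(e, 1) for e in expected])
print(round(chi2, 2), df, round(p_value, 4))
```

The statistic obtained this way should land near the reported value of 1.71; small differences are to be expected because the table works with expected frequencies rounded to one decimal place.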

Conclusions

A discrete Poisson-Garima distribution has been proposed by compounding the Poisson distribution with the Garima distribution introduced by Shanker.¹ An expression for the $r$th factorial moment about origin has been derived, and hence the moments about origin and the central moments have been given. The nature and behavior of the coefficient of variation, skewness, kurtosis and index of dispersion of the proposed distribution have been studied for varying values of the parameter. The estimation of the parameter has been discussed using both maximum likelihood estimation and the method of moments. The goodness of fit of the proposed distribution has been examined with two real data sets and compared with that of the Poisson and Poisson-Lindley distributions. In both examples the Poisson-Garima distribution gives a better fit (a smaller $χ^{2}$ and a larger p-value) than both the Poisson and Poisson-Lindley distributions, and hence it can be considered an important alternative to these two distributions for modelling discrete data.