eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 8 Issue 1

A discrete Pranav distribution and its applications

Berhane Abebe, Kamlesh Kumar Shukla

Department of Statistics, College of Science, Eritrea

Received: December 28, 2018 | Published: February 26, 2019

Citation: Abebe B, Shukla KK. A discrete Pranav distribution and its applications. Biom Biostat Int J. 2019;8(1):33-37. DOI: 10.15406/bbij.2019.08.00267

Abstract

In recent decades, the discretization of continuous distributions has attracted the attention of researchers because it generates distributions that can be used for strictly discrete data. In this paper, a discrete Pranav distribution, which is a discrete analogue of the continuous Pranav distribution, is proposed. Its important properties, including the coefficient of variation, skewness, kurtosis and index of dispersion, have been obtained and discussed graphically. The method of maximum likelihood has been used to estimate its parameter. The goodness of fit of the proposed distribution has been illustrated using real count datasets, and it was found to give a better fit than other one-parameter discrete distributions.

Keywords: Pranav distribution, discretization, moment generating function, moments, estimation, goodness of fit

Introduction

In recent years, discrete analogues of continuous distributions have been used so that a continuous distribution need not be fitted to strictly discrete data.

In many cases, it is not easy to obtain samples from continuous distributions. Observed values are, in most cases, discrete in nature because they are measured to only a finite number of decimal places and cannot represent all points in a continuum. According to Lai,1 discretization of a continuous lifetime model is an appealing approach to deriving a discrete lifetime model corresponding to the continuous one. It is therefore reasonable and convenient to model such situations with an appropriate discrete distribution generated from the underlying continuous distribution, preserving one or more of its important characteristics, such as the probability density function (pdf), the mean residual life function and other important statistical properties.

In the statistics literature, researchers have used different methods of discretization to propose discrete analogues of continuous distributions. In this study, the infinite series method is used to obtain the probability mass function (pmf) of the discrete analogue of the continuous Pranav distribution introduced by Shukla;2 the definition of the method is given below. This method was first used by Good,3 who proposed the discrete Good distribution to model the frequencies of species. It is given as follows.

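A random variable Y is said to have a discrete Good distribution if its pmf can be expressed as

$$P(Y=y)=\frac{\alpha^{y}y^{\beta}}{\sum_{j=0}^{\infty}\alpha^{j}j^{\beta}};\quad y=0,1,2,\ldots \qquad (1.1)$$

where $\beta\in\mathbb{R}$ and $\alpha\in(0,1)$.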

The method of infinite series is formulated by the definition which is given as below:

Definition: Let X be a continuous random variable having pdf $f_X(x)$ with support on $\mathbb{R}$. Then the corresponding discrete random variable Y has pmf given by

$$P(Y=y)=P(y;\theta)=\frac{f_X(y;\theta)}{\sum_{j=-\infty}^{\infty}f_X(j;\theta)};\quad y\in\mathbb{Z} \qquad (1.2)$$

where $\theta$ may be a vector of parameters indexing the distribution of X.

This method has been used by many researchers to discretize continuous distributions, among them Kulasekara & Tonkyn,4 Doray & Luong,5 Sato et al.,6 and Nekoukhou et al.,7 who proposed a version of the method for the case in which the continuous random variable of interest is defined on $\mathbb{R}^{+}$. Thus, if the random variable X is defined on $\mathbb{R}^{+}$, the pmf of Y can be defined as

$$P(Y=y)=P(y;\theta)=\frac{f_X(y;\theta)}{\sum_{j=0}^{\infty}f_X(j;\theta)};\quad y\in\mathbb{Z}^{+} \qquad (1.3)$$
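The definition in (1.3) can be illustrated numerically. The sketch below is a minimal illustration and not the authors' code: the helper `discretize`, the truncation point `y_max` and the exponential density used as the example are assumptions introduced here. It normalises a pdf evaluated at the non-negative integers; for the exponential density this reproduces the geometric pmf.

```python
import math

def discretize(pdf, theta, y_max=2000):
    """Return the pmf on {0, 1, ..., y_max} obtained by normalising pdf(j, theta)
    over the non-negative integers, as in (1.3). The infinite sum is truncated
    at y_max (an assumption of this sketch), which is harmless when the tail is negligible."""
    weights = [pdf(j, theta) for j in range(y_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def exponential_pdf(x, theta):
    """Exponential density, used only as a simple worked example."""
    return theta * math.exp(-theta * x)

theta = 0.5
pmf = discretize(exponential_pdf, theta)

# Discretising the exponential density gives the geometric pmf (1 - q) q^y, q = exp(-theta).
q = math.exp(-theta)
for y in range(4):
    print(y, round(pmf[y], 6), round((1 - q) * q ** y, 6))
```

The same helper can be applied to any density on the positive half-line, including the Pranav density discussed later.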

Josmar et al.,8 using the infinite series method, derived a discrete Shanker distribution (DSD) with parameter $\theta>0$ and pmf

$$P_1(y;\theta)=\frac{(e^{\theta}-1)^{2}(\theta+y)\,e^{-\theta(y+1)}}{1+(e^{\theta}-1)\theta};\quad y=0,1,2,\ldots \qquad (1.4)$$

They discussed its statistical properties and its applications to modelling count datasets. The DSD is a discrete analogue of the continuous Shanker distribution introduced by Shanker,9 which has pdf

$$f_1(x;\theta)=\frac{\theta^{2}}{\theta^{2}+1}(\theta+x)\,e^{-\theta x};\quad x>0,\ \theta>0 \qquad (1.5)$$

Using the same method of discretization, the pmf of the discrete Lindley distribution (DLD) proposed by Berhane & Shanker10 is given by

$$P_2(y;\theta)=\frac{(e^{\theta}-1)^{2}}{e^{2\theta}}(1+y)\,e^{-\theta y};\quad y=0,1,2,\ldots \qquad (1.6)$$

where the parameter $\theta>0$.

They discussed its important statistical properties, including estimation of its parameter, and applied it to count datasets from engineering and biology. They showed its superiority over other one-parameter discrete distributions such as the Poisson-Lindley distribution (PLD) proposed by Sankaran,11 the Poisson-Akash distribution (PAD) proposed by Shanker,12 and the DSD proposed by Josmar et al.8 As mentioned above, the DLD is a discrete analogue of the continuous Lindley distribution introduced by Lindley,13 which has pdf

$$f_2(x;\theta)=\frac{\theta^{2}}{\theta+1}(1+x)\,e^{-\theta x};\quad x>0,\ \theta>0 \qquad (1.7)$$

Recently, Berhane & Shanker14 proposed a discrete Akash distribution (DAD) using the infinite series method. Its pmf is given by

$$P_3(y;\theta)=\frac{(e^{\theta}-1)^{3}}{e^{\theta}\left(e^{2\theta}-e^{\theta}+2\right)}(1+y^{2})\,e^{-\theta y};\quad y=0,1,2,\ldots \qquad (1.8)$$

They discussed its important statistical properties, including estimation of its parameter, applied it to count datasets, and showed its superiority over DSD, DLD, PLD and PAD. The DAD is the discrete analogue of the continuous Akash distribution introduced by Shanker,15 whose pdf is

$$f_3(x;\theta)=\frac{\theta^{3}}{\theta^{2}+2}(1+x^{2})\,e^{-\theta x};\quad x>0,\ \theta>0 \qquad (1.9)$$

Shanker12 proposed the PAD, a Poisson mixture of the Akash distribution, having pmf

$$P_4(x;\theta)=\frac{\theta^{3}}{\theta^{2}+2}\cdot\frac{x^{2}+3x+(\theta^{2}+2\theta+3)}{(\theta+1)^{x+3}};\quad x=0,1,2,\ldots,\ \theta>0 \qquad (1.10)$$

He discussed its important statistical properties, including estimation of its parameter, along with applications of the PAD. The PAD was applied to count datasets and showed superiority over the PLD and other one-parameter distributions.

The PLD is a Poisson mixture of the Lindley distribution introduced by Sankaran11 and is defined by its pmf

$$P_5(x;\theta)=\frac{\theta^{2}(x+\theta+2)}{(\theta+1)^{x+3}};\quad x=0,1,2,\ldots,\ \theta>0 \qquad (1.11)$$
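The five competing pmfs in (1.4), (1.6), (1.8), (1.10) and (1.11) can be written as short functions. The sketch below is an illustration under stated assumptions: the function names, the test value θ = 1.5 and the truncation at 300 terms are choices made here, not part of the cited papers. It checks numerically that each pmf sums to one.

```python
import math

def dsd_pmf(y, theta):
    """Discrete Shanker pmf, equation (1.4)."""
    a = math.exp(theta)
    return (a - 1) ** 2 * (theta + y) * math.exp(-theta * (y + 1)) / (1 + (a - 1) * theta)

def dld_pmf(y, theta):
    """Discrete Lindley pmf, equation (1.6)."""
    a = math.exp(theta)
    return (a - 1) ** 2 / a ** 2 * (1 + y) * math.exp(-theta * y)

def dad_pmf(y, theta):
    """Discrete Akash pmf, equation (1.8)."""
    a = math.exp(theta)
    return (a - 1) ** 3 / (a * (a * a - a + 2)) * (1 + y * y) * math.exp(-theta * y)

def pad_pmf(x, theta):
    """Poisson-Akash pmf, equation (1.10)."""
    numerator = x * x + 3 * x + theta ** 2 + 2 * theta + 3
    return theta ** 3 / (theta ** 2 + 2) * numerator / (theta + 1) ** (x + 3)

def pld_pmf(x, theta):
    """Poisson-Lindley pmf, equation (1.11)."""
    return theta ** 2 * (x + theta + 2) / (theta + 1) ** (x + 3)

def total_mass(pmf, theta, terms=300):
    """Truncated sum of the pmf; the neglected tail is negligible for theta = 1.5."""
    return sum(pmf(k, theta) for k in range(terms))

MODELS = {"DSD": dsd_pmf, "DLD": dld_pmf, "DAD": dad_pmf, "PAD": pad_pmf, "PLD": pld_pmf}
for name, pmf in MODELS.items():
    print(name, round(total_mass(pmf, 1.5), 10))
```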

The main objective of this paper is to propose a discretization of the Pranav distribution, because the Pranav distribution has been observed to give a better fit than the one-parameter continuous Akash, Shanker, Sujatha, Lindley and exponential distributions. With this in mind, it is hoped that its discrete analogue will give a better fit than the discrete Akash, discrete Shanker and discrete Lindley distributions and other one-parameter discrete distributions.

A discrete Pranav distribution

The pdf and the cdf of a random variable X having the Pranav distribution proposed by Shukla2 are given by

$$f(x;\theta)=\frac{\theta^{4}}{\theta^{4}+6}\left(\theta+x^{3}\right)e^{-\theta x};\quad x>0,\ \theta>0 \qquad (2.1)$$

$$F(x;\theta)=1-\left[1+\frac{\theta x\left(\theta^{2}x^{2}+3\theta x+6\right)}{\theta^{4}+6}\right]e^{-\theta x};\quad x>0,\ \theta>0 \qquad (2.2)$$

Shukla2 discussed its statistical properties, including moment-based measures, the hazard rate function, Bonferroni and Lorenz curves and stress-strength reliability. The Pranav distribution was applied to modelling lifetime data from biomedical sciences and engineering, and its superiority over the Akash, Shanker, Ishita, Sujatha and exponential distributions was demonstrated.

Using the above definition, the pmf of the discrete random variable Y corresponding to a continuous random variable X that follows the Pranav distribution (2.1) with parameter $\theta>0$ can be obtained as

$$P(y;\theta)=\frac{(e^{\theta}-1)^{4}}{e^{\theta}\left(\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1\right)}\left(\theta+y^{3}\right)e^{-\theta y};\quad y=0,1,2,\ldots \qquad (2.3)$$

We call this distribution the discrete Pranav distribution (DPD). The nature and behavior of the DPD for varying values of its parameter are shown graphically in Figure 1. From the figure, it was observed that the pmf of DPD increases for increasing values of $\theta$.

Figure 1 The pmf plot of DPD for varying values of the parameter θ.
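The pmf in (2.3) is straightforward to evaluate. The following is a minimal sketch, not the authors' code; the function name `dpd_pmf`, the chosen values of θ and the truncation at 400 terms are assumptions made for illustration. It prints the total probability mass and the first few probabilities.

```python
import math

def dpd_pmf(y, theta):
    """pmf of the discrete Pranav distribution, equation (2.3)."""
    a = math.exp(theta)
    denom = a * (theta * (a - 1) ** 3 + a * a + 4 * a + 1)
    return (a - 1) ** 4 / denom * (theta + y ** 3) * math.exp(-theta * y)

for theta in (0.5, 1.0, 2.0):
    probs = [dpd_pmf(y, theta) for y in range(400)]
    # The truncated sum should be 1 up to rounding error.
    print(theta, round(sum(probs), 10), [round(p, 4) for p in probs[:5]])
```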

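The survival function, $S(y;\theta)=P(Y>y)$, and the cumulative distribution function (cdf), $F(y;\theta)=P(Y\le y)$, of DPD can be obtained as

$$S(y;\theta)=\left[1+\frac{y^{3}(e^{\theta}-1)^{3}+3e^{\theta}y^{2}(e^{\theta}-1)^{2}+3ye^{\theta}(e^{2\theta}-1)+(e^{\theta}-1)(e^{2\theta}+4e^{\theta}+1)}{\theta(e^{\theta}-1)^{3}+(e^{2\theta}+4e^{\theta}+1)}\right]e^{-\theta(y+1)};\quad y=0,1,2,\ldots,\ \theta>0 \qquad (2.5)$$

$$F(y;\theta)=1-\left[1+\frac{y^{3}(e^{\theta}-1)^{3}+3e^{\theta}y^{2}(e^{\theta}-1)^{2}+3ye^{\theta}(e^{2\theta}-1)+(e^{\theta}-1)(e^{2\theta}+4e^{\theta}+1)}{\theta(e^{\theta}-1)^{3}+(e^{2\theta}+4e^{\theta}+1)}\right]e^{-\theta(y+1)};\quad y=0,1,2,\ldots,\ \theta>0$$

The cdf of DPD is plotted in Figure 2.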

Figure 2 The cdf plot of DPD for varying values of the parameter θ.
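The cdf and survival function can also be obtained directly from the pmf by cumulative summation, which gives an independent numerical check on any closed-form expression. The sketch below assumes the conventions F(y) = P(Y ≤ y) and S(y) = 1 − F(y); the helper names are introduced here for illustration.

```python
import math
from itertools import accumulate

def dpd_pmf(y, theta):
    """pmf of the discrete Pranav distribution, equation (2.3)."""
    a = math.exp(theta)
    denom = a * (theta * (a - 1) ** 3 + a * a + 4 * a + 1)
    return (a - 1) ** 4 / denom * (theta + y ** 3) * math.exp(-theta * y)

def dpd_cdf_table(theta, y_max):
    """Return [F(0), ..., F(y_max)] by summing the pmf cumulatively."""
    return list(accumulate(dpd_pmf(y, theta) for y in range(y_max + 1)))

cdf = dpd_cdf_table(1.0, 30)
survival = [1.0 - f for f in cdf]
for y in range(5):
    print(y, round(cdf[y], 4), round(survival[y], 4))
```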

Since $\dfrac{P(y+1;\theta)}{P(y;\theta)}=\left[1+\dfrac{3y^{2}+3y+1}{\theta+y^{3}}\right]e^{-\theta}$ is a decreasing function of y for $y\ge 3$, $P(y;\theta)$ is log-concave and therefore the DPD has an increasing hazard rate. Further, $[P(y;\theta)]^{2}\ge P(y-1;\theta)\,P(y+1;\theta)$ for $y\ge 3$, which implies unimodality, by Theorem 3 of Keilson & Gerber.17 Details of the interrelationship between log-concavity, unimodality and increasing hazard rate of discrete distributions can be found in Grandell.18
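The log-concavity inequality can be checked numerically for particular parameter values. The sketch below is illustrative only: the helper names, the values of θ and the upper limit of 60 on y are assumptions. It works on the log scale, where the normalising constant of the pmf cancels, and also shows that the inequality can fail when the check starts at y = 1, which is why the restriction on y matters.

```python
import math

def log_kernel(y, theta):
    """log of (theta + y^3) * exp(-theta * y); the normalising constant cancels in the check."""
    return math.log(theta + y ** 3) - theta * y

def log_concave_on(theta, y_start, y_end=60):
    """True if P(y)^2 >= P(y-1) * P(y+1) for every integer y in [y_start, y_end]."""
    return all(
        2 * log_kernel(y, theta) >= log_kernel(y - 1, theta) + log_kernel(y + 1, theta)
        for y in range(y_start, y_end + 1)
    )

for theta in (0.5, 1.0, 2.0, 4.0):
    print(theta, log_concave_on(theta, 3), log_concave_on(theta, 1))
```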

Mean, variance and statistical constants

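The probability generating function (pgf) and the moment generating function (mgf) of DPD can be obtained as

$$G(t)=\frac{(e^{\theta}-1)^{4}}{\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1}\left[\frac{\theta(e^{\theta}-t)^{3}+t\left(e^{2\theta}+4te^{\theta}+t^{2}\right)}{(e^{\theta}-t)^{4}}\right],\quad\text{for } |t|<e^{\theta}$$

and

$$M(t)=\frac{(e^{\theta}-1)^{4}}{\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1}\left[\frac{\theta(e^{\theta}-e^{t})^{3}+e^{t}\left(e^{2\theta}+4e^{t}e^{\theta}+e^{2t}\right)}{(e^{\theta}-e^{t})^{4}}\right],\quad\text{for } t<\theta$$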
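A closed-form pgf can be compared against the defining series and used to recover the mean as G'(1). The following sketch is an illustration under assumptions: the function names, the finite-difference step and the truncation at 400 terms are choices made here. The mgf follows from the pgf as M(t) = G(e^t).

```python
import math

def dpd_pmf(y, theta):
    """pmf of the discrete Pranav distribution, equation (2.3)."""
    a = math.exp(theta)
    denom = a * (theta * (a - 1) ** 3 + a * a + 4 * a + 1)
    return (a - 1) ** 4 / denom * (theta + y ** 3) * math.exp(-theta * y)

def dpd_pgf(t, theta):
    """Closed-form pgf, valid for |t| < exp(theta)."""
    a = math.exp(theta)
    norm = (a - 1) ** 4 / (theta * (a - 1) ** 3 + a * a + 4 * a + 1)
    return norm * (theta * (a - t) ** 3 + t * (a * a + 4 * t * a + t * t)) / (a - t) ** 4

def dpd_pgf_series(t, theta, terms=400):
    """pgf evaluated from its definition, sum of t^y * P(y), truncated."""
    return sum(t ** y * dpd_pmf(y, theta) for y in range(terms))

theta, h = 1.0, 1e-6
mean_from_pgf = (dpd_pgf(1 + h, theta) - dpd_pgf(1 - h, theta)) / (2 * h)  # G'(1)
print(dpd_pgf(1.0, theta), dpd_pgf(0.5, theta), dpd_pgf_series(0.5, theta))
print(round(mean_from_pgf, 4))
```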

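The first four moments about the origin of DPD can thus be obtained as

$$\mu_1'=\frac{\theta(e^{\theta}-1)^{3}+e^{3\theta}+11e^{2\theta}+11e^{\theta}+1}{(e^{\theta}-1)\left(\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1\right)}$$

$$\mu_2'=\frac{\theta e^{4\theta}+e^{4\theta}-2\theta e^{3\theta}+26e^{3\theta}+66e^{2\theta}+2\theta e^{\theta}+26e^{\theta}-\theta+1}{(e^{\theta}-1)^{2}\left(\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1\right)}$$

$$\mu_3'=\frac{\theta e^{5\theta}+e^{5\theta}+\theta e^{4\theta}+57e^{4\theta}-8\theta e^{3\theta}+302e^{3\theta}+8\theta e^{2\theta}+302e^{2\theta}-\theta e^{\theta}+57e^{\theta}-\theta+1}{(e^{\theta}-1)^{3}\left(\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1\right)}$$

$$\mu_4'=\frac{\theta e^{6\theta}+e^{6\theta}+8\theta e^{5\theta}+120e^{5\theta}-19\theta e^{4\theta}+1191e^{4\theta}+2416e^{3\theta}+19\theta e^{2\theta}+1191e^{2\theta}-8\theta e^{\theta}+120e^{\theta}-\theta+1}{(e^{\theta}-1)^{4}\left(\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1\right)}$$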
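The first raw moment can be checked by hand from standard power-series sums. The worked derivation below is a sketch added for illustration; the shorthand $q=e^{-\theta}$ is introduced here and is not part of the original notation.

```latex
% Standard sums, valid for 0 < q < 1, with q = e^{-\theta}:
\sum_{y=0}^{\infty} q^{y}=\frac{1}{1-q},\qquad
\sum_{y=0}^{\infty} y\,q^{y}=\frac{q}{(1-q)^{2}},\qquad
\sum_{y=0}^{\infty} y^{3}q^{y}=\frac{q(1+4q+q^{2})}{(1-q)^{4}},\qquad
\sum_{y=0}^{\infty} y^{4}q^{y}=\frac{q(1+11q+11q^{2}+q^{3})}{(1-q)^{5}}.

% Mean of the DPD as a ratio of two such sums:
\mu_1'
=\frac{\sum_{y}y(\theta+y^{3})q^{y}}{\sum_{y}(\theta+y^{3})q^{y}}
=\frac{\theta q(1-q)^{3}+q(1+11q+11q^{2}+q^{3})}
       {(1-q)\left[\theta(1-q)^{3}+q(1+4q+q^{2})\right]}.

% Multiplying numerator and denominator by e^{4\theta}:
\mu_1'
=\frac{\theta(e^{\theta}-1)^{3}+e^{3\theta}+11e^{2\theta}+11e^{\theta}+1}
       {(e^{\theta}-1)\left[\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1\right]}.
```

The higher raw moments follow in the same way from the sums of $y^{5}q^{y}$, $y^{6}q^{y}$ and $y^{7}q^{y}$.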

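Using the relationship $\mu_r=E(Y-\mu_1')^{r}=\sum_{k=0}^{r}\binom{r}{k}\mu_k'(-\mu_1')^{r-k}$ between central moments and moments about the origin, the central moments of DPD are derived as

$$\mu_2=\frac{e^{\theta}\,N_2(\theta)}{(e^{\theta}-1)^{2}\left(\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1\right)^{2}},\qquad
\mu_3=\frac{e^{\theta}\,N_3(\theta)}{\left(\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1\right)^{3}(e^{\theta}-1)^{3}},\qquad
\mu_4=\frac{e^{\theta}\,N_4(\theta)}{\left(\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1\right)^{4}(e^{\theta}-1)^{4}}$$

where the numerator polynomials are

$$\begin{aligned}
N_2(\theta)={}&(\theta+1)\theta e^{6\theta}+(22-6\theta)\theta e^{5\theta}+(15\theta^{2}-23\theta+8)e^{4\theta}-(20\theta^{2}+64\theta-28)e^{3\theta}\\
&+(15\theta^{2}+95\theta+72)e^{2\theta}-(6\theta^{2}+22\theta-28)e^{\theta}+\theta^{2}-9\theta+8
\end{aligned}$$

$$\begin{aligned}
N_3(\theta)={}&\theta^{2}(\theta+1)e^{10\theta}-(8\theta^{3}-47\theta^{2}+\theta)e^{9\theta}+(27\theta^{3}-96\theta^{2}+15\theta)e^{8\theta}\\
&-(48\theta^{3}+342\theta^{2}+60\theta-8)e^{7\theta}+(42\theta^{3}+1320\theta^{2}+300\theta+32)e^{6\theta}\\
&-(1536\theta^{2}+390\theta-288)e^{5\theta}-(42\theta^{3}-438\theta^{2}+78\theta-536)e^{4\theta}\\
&+(48\theta^{3}+486\theta^{2}+204\theta+536)e^{3\theta}-(27\theta^{3}+393\theta^{2}-132\theta-288)e^{2\theta}\\
&+(8\theta^{3}+65\theta^{2}-105\theta+32)e^{\theta}-\theta^{3}+10\theta^{2}-17\theta+8
\end{aligned}$$

$$\begin{aligned}
N_4(\theta)={}&\theta^{3}(\theta+1)e^{14\theta}-(5\theta^{4}-106\theta^{3}+\theta^{2})e^{13\theta}-(17\theta^{4}+41\theta^{3}-111\theta^{2}-\theta)e^{12\theta}\\
&+(230\theta^{4}-3184\theta^{3}+606\theta^{2}+92\theta)e^{11\theta}-(979\theta^{4}-12857\theta^{3}+578\theta^{2}-1136\theta-8)e^{10\theta}\\
&+(2453\theta^{4}-19414\theta^{3}-6363\theta^{2}+5620\theta+208)e^{9\theta}-(4125\theta^{4}-2391\theta^{3}-6813\theta^{2}-165\theta-2232)e^{8\theta}\\
&+(4884\theta^{4}+34800\theta^{3}+22116\theta^{2}-13416\theta+8400)e^{7\theta}-(4125\theta^{4}+55749\theta^{3}+49788\theta^{2}+17856\theta-20544)e^{6\theta}\\
&+(2453\theta^{4}+41054\theta^{3}+33561\theta^{2}+26664\theta+30528)e^{5\theta}-(979\theta^{4}+13987\theta^{3}+3047\theta^{2}-6387\theta-20544)e^{4\theta}\\
&+(230\theta^{4}+128\theta^{3}-3522\theta^{2}-5236\theta+8400)e^{3\theta}-(17\theta^{4}-1291\theta^{3}+642\theta^{2}+2864\theta-2232)e^{2\theta}\\
&-(5\theta^{4}+242\theta^{3}-707\theta^{2}+668\theta-208)e^{\theta}+\theta^{4}-11\theta^{3}+27\theta^{2}-25\theta+8
\end{aligned}$$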

The coefficient of variation (C.V), coefficient of skewness $(\sqrt{\beta_1})$, coefficient of kurtosis $(\beta_2)$ and index of dispersion $(\gamma)$ of DPD can be obtained using the relationships

$$C.V=\frac{\sigma}{\mu_1'},\qquad \sqrt{\beta_1}=\frac{\mu_3}{(\mu_2)^{3/2}},\qquad \beta_2=\frac{\mu_4}{\mu_2^{2}},\qquad \gamma=\frac{\sigma^{2}}{\mu_1'}$$

Table 1 shows the coefficient of variation (C.V), coefficient of skewness, coefficient of kurtosis and index of dispersion (ID) for varying values of the parameter $\theta$; the same measures are plotted in Figure 3.

Figure 3 The plot of measures of descriptive statistics of DPD for varying values of the parameter θ.

| θ | Mean | Variance | C.V | Skewness | Kurtosis | ID |
|---|---|---|---|---|---|---|
| 0.5 | 7.915 | 16.3833 | 0.5114 | 0.95 | 4.4328 | 2.0699 |
| 1 | 3.2844 | 5.288 | 0.7001 | 0.6689 | 3.6405 | 1.61 |
| 1.5 | 1.1918 | 2.2352 | 1.2545 | 1.3138 | 4.5122 | 1.8755 |
| 2 | 0.4147 | 0.7018 | 2.0197 | 2.4253 | 9.6591 | 1.692 |
| 2.5 | 0.1712 | 0.2418 | 2.8725 | 3.4854 | 17.7216 | 1.4125 |
| 3 | 0.0825 | 0.1013 | 3.8558 | 4.479 | 27.2341 | 1.2272 |
| 3.5 | 0.0438 | 0.0492 | 5.0675 | 5.5959 | 39.358 | 1.1239 |
| 4 | 0.0245 | 0.0261 | 6.6069 | 7.0201 | 57.6163 | 1.0682 |

Table 1 Values of descriptive statistics of DPD for varying values of θ

Table 1 shows that the mean and variance of DPD decrease, and the coefficient of variation increases, as the parameter θ increases. The coefficients of skewness and kurtosis dip at θ = 1 and increase thereafter, while the index of dispersion decreases for θ ≥ 1.5. For every tabulated value of θ the index of dispersion exceeds one, that is, $\sigma^{2}>\mu_1'$, which indicates that DPD can be a suitable model for over-dispersed data.
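The entries of Table 1 can be reproduced without the closed-form moment expressions by summing over the pmf directly. The sketch below is an illustration, not the authors' computation; the function names and the truncation at 1000 terms are assumptions. Skewness is computed as μ3/μ2^(3/2) and kurtosis as μ4/μ2².

```python
import math

def dpd_pmf(y, theta):
    """pmf of the discrete Pranav distribution, equation (2.3)."""
    a = math.exp(theta)
    denom = a * (theta * (a - 1) ** 3 + a * a + 4 * a + 1)
    return (a - 1) ** 4 / denom * (theta + y ** 3) * math.exp(-theta * y)

def dpd_summary(theta, terms=1000):
    """Descriptive statistics of DPD from truncated sums over the pmf."""
    probs = [dpd_pmf(y, theta) for y in range(terms)]
    mean = sum(y * p for y, p in enumerate(probs))

    def central(r):
        return sum((y - mean) ** r * p for y, p in enumerate(probs))

    var, mu3, mu4 = central(2), central(3), central(4)
    return {
        "mean": mean,
        "variance": var,
        "cv": math.sqrt(var) / mean,
        "skewness": mu3 / var ** 1.5,
        "kurtosis": mu4 / var ** 2,
        "id": var / mean,
    }

for theta in (0.5, 1.0, 2.0):
    print(theta, {k: round(v, 4) for k, v in dpd_summary(theta).items()})
```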

Maximum likelihood estimation

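The likelihood function L of (2.3) for a random sample $y_1,y_2,\ldots,y_n$ is given by

$$L=\left(\frac{(e^{\theta}-1)^{4}}{e^{\theta}\left(\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1\right)}\right)^{n}\prod_{i=1}^{n}\left(\theta+y_i^{3}\right)e^{-\theta y_i}$$

and its log-likelihood function is

$$\ln L=n\ln\left(\frac{(e^{\theta}-1)^{4}}{e^{\theta}\left(\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1\right)}\right)+\sum_{i=1}^{n}\ln\left(\theta+y_i^{3}\right)-n\theta\bar{y}$$

$$\ln L=4n\ln(e^{\theta}-1)-n\theta-n\ln\left(\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1\right)+\sum_{i=1}^{n}\ln\left(\theta+y_i^{3}\right)-n\theta\bar{y}$$

Differentiating the above equation with respect to θ, we have

$$\frac{\partial\ln L}{\partial\theta}=\frac{4ne^{\theta}}{e^{\theta}-1}-n-\frac{n\left[(e^{\theta}-1)^{3}+3\theta e^{\theta}(e^{\theta}-1)^{2}+2e^{2\theta}+4e^{\theta}\right]}{\theta(e^{\theta}-1)^{3}+e^{2\theta}+4e^{\theta}+1}+\sum_{i=1}^{n}\frac{1}{\theta+y_i^{3}}-n\bar{y}=0$$

The maximum likelihood estimate of θ is the solution of this equation, which has to be obtained numerically. In this paper, R software is used to estimate the value of θ.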
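As a rough illustration of the numerical step, the log-likelihood can be maximised directly. The sketch below is not the R code used in the paper: it is a Python stand-in that uses a golden-section search, assumes the log-likelihood is unimodal on the bracket [0.05, 10], and uses the observed frequencies of Table 2 as data. All function names are introduced here.

```python
import math

def dpd_log_pmf(y, theta):
    """Log of the DPD pmf (2.3)."""
    a = math.exp(theta)
    log_norm = 4 * math.log(a - 1) - theta - math.log(theta * (a - 1) ** 3 + a * a + 4 * a + 1)
    return log_norm + math.log(theta + y ** 3) - theta * y

def log_likelihood(theta, freq):
    """Log-likelihood for grouped count data given as {value: frequency}."""
    return sum(n_y * dpd_log_pmf(y, theta) for y, n_y in freq.items())

def golden_section_max(f, lo, hi, tol=1e-8):
    """Maximise a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) > f(d):
            hi = d          # the maximum lies in [lo, d]
        else:
            lo = c          # the maximum lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2

table2 = {0: 35, 1: 11, 2: 8, 3: 4, 4: 2}   # observed frequencies from Table 2
theta_hat = golden_section_max(lambda t: log_likelihood(t, table2), 0.05, 10.0)
print(round(theta_hat, 4))
```

The result can be compared with the DPD estimate of θ reported in Table 2.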

Application and goodness of fit

In this section, the goodness of fit of the DPD is discussed using two count datasets. The dataset in Table 2 is taken from Kemp & Kemp19 and the dataset in Table 3 is taken from Beall;20 details of the datasets can be found in their papers. The proposed model is compared with DSD, DLD, PLD, PAD and DAD (Figures 4 and 5).

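| No. of errors per group | Observed frequency | PLD | PAD | DLD | DSD | DAD | DPD |
|---|---|---|---|---|---|---|---|
| 0 | 35 | 33.1 | 33.5 | 31 | 31.7 | 33.2 | 36 |
| 1 | 11 | 15.3 | 14.7 | 17.4 | 16.9 | 14.2 | 10.6 |
| 2 | 8 | 6.7 | 6.6 | 7.4 | 7.2 | 7.6 | 7.1 |
| 3 | 4 | 2.9 | 3 | 2.8 | 2.7 | 3.3 | 3.9 |
| 4 | 2 | 2 | 2.2 | 1.4 | 1.5 | 1.7 | 2.4 |
| Total | 60 | 60 | 60 | 60 | 60 | 60 | 60 |
| Estimate of θ | | 1.7434 | 2.078 | 1.2678 | 1.2276 | 1.5404 | 1.689 |
| χ2 | | 1.8141 | 1.4185 | 3.3667 | 2.9963 | 1.0398 | 0.1712 |
| d.f. | | 1 | 2 | 1 | 1 | 2 | 2 |
| p-value | | 0.178 | 0.492 | 0.066 | 0.0837 | 0.595 | 0.919 |

Table 2 Distribution of mistakes in copying groups of random digits (columns PLD to DPD give the expected frequencies)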
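The DPD column of Table 2 can be approximately reproduced from the pmf and the reported estimate of θ. The sketch below is illustrative and rests on assumptions made here: the last class is treated as "4 or more", and classes 3 and 4 are pooled, which gives four classes and hence 4 − 1 − 1 = 2 degrees of freedom as in the table. With the rounded expected frequencies from the table, the Pearson statistic comes out at about 0.1712; with unrounded expected frequencies it is slightly larger.

```python
import math

def dpd_pmf(y, theta):
    """pmf of the discrete Pranav distribution, equation (2.3)."""
    a = math.exp(theta)
    denom = a * (theta * (a - 1) ** 3 + a * a + 4 * a + 1)
    return (a - 1) ** 4 / denom * (theta + y ** 3) * math.exp(-theta * y)

observed = [35, 11, 8, 4, 2]   # Table 2, classes 0, 1, 2, 3, 4
theta_hat = 1.689              # DPD estimate reported in Table 2
n = sum(observed)

# Expected frequencies; the last class absorbs the upper tail so that the total is n.
expected = [n * dpd_pmf(y, theta_hat) for y in range(4)]
expected.append(n - sum(expected))

def pooled_chi_square(obs, exp, pool_from):
    """Pearson chi-square with all classes from index pool_from onwards pooled."""
    obs_p = obs[:pool_from] + [sum(obs[pool_from:])]
    exp_p = exp[:pool_from] + [sum(exp[pool_from:])]
    return sum((o - e) ** 2 / e for o, e in zip(obs_p, exp_p))

chi2_exact = pooled_chi_square(observed, expected, 3)
chi2_rounded = pooled_chi_square(observed, [36.0, 10.6, 7.1, 3.9, 2.4], 3)
p_value = math.exp(-chi2_rounded / 2)   # chi-square upper tail probability for 2 d.f.

print([round(e, 1) for e in expected])
print(round(chi2_exact, 4), round(chi2_rounded, 4), round(p_value, 3))
```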

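| No. of insects | Observed frequency | PLD | PAD | DLD | DSD | DAD | DPD |
|---|---|---|---|---|---|---|---|
| 0 | 33 | 31.5 | 32 | 29.6 | 30.3 | 31.6 | 34.4 |
| 1 | 12 | 14.2 | 13.6 | 16.2 | 15.6 | 13.2 | 9.8 |
| 2 | 6 | 6.1 | 6 | 6.6 | 6.4 | 6.9 | 6.4 |
| 3 | 3 | 2.5 | 2.6 | 2.4 | 2.4 | 2.9 | 3.4 |
| 4 | 1 | 1 | 1.1 | 0.8 | 0.8 | 1 | 1.4 |
| 5 | 1 | 0.7 | 0.7 | 0.4 | 0.5 | 0.4 | 0.6 |
| Total | 56 | 56 | 56 | 56 | 56 | 56 | 56 |
| Estimate of θ | | 1.8115 | 2.1446 | 1.2993 | 1.2535 | 1.5686 | 1.7122 |
| χ2 | | 0.4598 | 0.2541 | 1.5422 | 1.1516 | 0.1747 | 0.6055 |
| d.f. | | 1 | 1 | 1 | 1 | 1 | 2 |
| p-value | | 0.498 | 0.614 | 0.215 | 0.283 | 0.676 | 0.739 |

Table 3 Observed and expected frequencies for distribution of Pyrausta nubilalis in 1937 (columns PLD to DPD give the expected frequencies)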

Figure 4 Fitted plot of distributions on first data set.

Figure 5 Fitted plot of distributions on second data set.

Conclusion

In this paper, a discrete Pranav distribution (DPD) has been proposed. Its moment generating function, moments and moment-based measures, including statistical constants, have been derived, and their nature and behavior have been discussed numerically and graphically. The method of maximum likelihood has been discussed for estimating its parameter. The goodness of fit of DPD has been illustrated using two real count datasets. In terms of p-values, the DPD gives a better fit than PLD, PAD, DLD, DSD and DAD for the presented datasets.

Acknowledgments

None.

Conflicts of interest

The authors declare that there is no conflict of interest.

References

  1. Lai CD. Issues concerning constructions of discrete lifetime models. Qual Technol Quant Manag. 2013;10(2):251–262.
  2. Shukla KK. Pranav distribution with properties and applications. Biom Biostat Int J. 2018;7(3):244–254.
  3. Good IJ. The population frequencies of species and the estimation of population parameters. Biometrika. 1953;40:237–264.
  4. Kulasekara KB, Tonkyn DW. A new discrete distribution with application to survival, dispersal and dispersion. Commun Stat Simul Comput. 1992;21:499–518.
  5. Doray LG, Luong A. Efficient estimators for the Good family. Commun Stat Simul Comput. 1997;21:499–518.
  6. Sato H, Ikota M, Aritoshi S, Masuda H. A new defect distribution metrology with a consistent discrete exponential formula and its applications. IEEE Trans Semicond Manufactur. 1999;12(4):409–418.
  7. Nekoukhou VM, Alamatsaz MH, Bidram H. Discrete generalized exponential distribution. Communications in Statistics–Theory & Methods. 2012;41:2000–2013.
  8. Josmar M, Wesley BDS, Ricardo PO. On the discrete Shanker distribution. Chilean Journal of Statistics. 2017.
  9. Shanker R. Shanker distribution and its applications. International Journal of Statistics and Applications. 2015a;5(6):338–348.
  10. Abebe B, Shanker R. A discrete Lindley distribution with applications in biological science. Biom Biostat Int J. 2018a;7(1):1–5.
  11. Sankaran M. The discrete Poisson–Lindley distribution. Biometrics. 1970;26:145–149.
  12. Shanker R. The discrete Poisson–Akash distribution. International Journal of Probability and Statistics. 2017;6(1):1–10.
  13. Lindley DV. Fiducial distributions and Bayes’ theorem. Journal of the Royal Statistical Society, Series B. 1958;20:102–107.
  14. Abebe B, Shanker R. A discrete Akash distribution with applications. Turkiye Klinikleri Journal of Biostatistics. 2018b;10(1):1–12.
  15. Shanker R. Akash distribution and its applications. International Journal of Probability and Statistics. 2015b;4(3):65–75.
  16. Nakagawa T, Osaki S. The discrete Weibull distribution. IEEE Trans Reliability. 1975;R–24(5):300–301.
  17. Keilson J, Gerber H. Some results for discrete unimodality. Journal of the American Statistical Association. 1971;66:386–389.
  18. Grandell J. Mixed Poisson Processes. USA: CRC Press. 1997.
  19. Kemp CD, Kemp AW. Some properties of the Hermite distribution. Biometrika. 1965;52:381–394.
  20. Beall G. The fit and significance of contagious distributions when applied to observations on larval insects. Ecology. 1940;21:460–474.
©2019 Abebe, et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.