eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 11 Issue 2

Effect of correlated measurement errors on estimation of population mean with modified ratio estimator

Okafor Ikechukwu Boniface, Onyeka Aloysius Chijioke, Ogbonna Chukwudi Justin, Izunobi Chinyeaka Hostensia

Department of Statistics, Federal University of Technology, Nigeria

Correspondence: Okafor Ikechukwu Boniface, Department of Statistics, Federal University of Technology, Owerri, Imo state, Nigeria

Received: November 19, 2021 | Published: April 25, 2022

Citation: Boniface OI, Chijioke OA, Justin OC, et al. Effect of correlated measurement errors on estimation of population mean with modified ratio estimator. Biom Biostat Int J. 2022;11(2):52-56. DOI: 10.15406/bbij.2022.11.00354


Abstract

This paper proposes a class of modified ratio estimators of the population mean that uses the correlation coefficient between the study and auxiliary variables in the presence of correlated measurement errors under simple random sampling. The usual unbiased sample mean per unit estimator and the ratio-type and product-type estimators are members of the suggested class. Properties of the proposed estimator are obtained under a large-sample approximation. Theoretical and empirical analyses show that, at its optimum, the proposed class of estimators is more efficient than some existing estimators.

Keywords: Correlated measurement errors, ratio estimator, bias, mean squared error, correlation coefficient.

Introduction

Auxiliary information is widely used in estimating population parameters, and its use has advanced sampling theory by improving the accuracy of sampling strategies and reducing their design variances. Because sample sizes in most surveys are not sufficiently large, estimators of population parameters based on them may have unsatisfactorily large variances. At the same time, auxiliary information related to the study variable is often available, for example a known variable to which the study variable is approximately related, and it can be used to improve the properties of estimators. Estimators that exploit such information include the ratio, product and regression estimators. Although auxiliary information may improve the estimates of population parameters, measurement errors may still affect the efficiency of the estimators.

In sample surveys, the properties of estimators are usually derived on the presumption that the observed values are the true values. However, repeated observations of the same quantity on the same subject often differ because of natural variation in the subject, variation in the observational process, or both. Hence, it is generally accepted that data available for statistical analysis are subject to error.

The differences between the individual observed values and their corresponding true values are referred to as measurement errors. They constitute an essential part of the errors in any sample survey data, and their presence is practically inevitable whatever precautions one takes. Measurement errors may arise at the data collection stage through respondent bias, enumerator bias or both, and during data collation and coding.1,2 The effect of measurement errors on statistical inference about a population parameter may sometimes be inconsequential. In other situations, however, the effect may be large enough to invalidate the inference drawn and lead to misleading conclusions.

Shalabh3 examined the effect of observational or measurement errors on the ratio estimator under simple random sampling. Following his work, other researchers investigated the impact of measurement errors on estimators of population parameters under different sampling schemes. Manish and Singh4 considered a linear combination of the ratio estimator and the sample mean per unit and obtained a family of estimators of the population mean, together with its bias and mean squared error when the sample data are contaminated with measurement errors. Using a variable transformation, Diwakar et al.5 proposed an estimator of the population mean in the presence of measurement errors and obtained its properties. Comparing their estimator with those of Manish and Singh4 and Shalabh3 when both the study and auxiliary variables are contaminated with measurement errors, they observed that the proposed estimator is more efficient in a localized domain. Also using a variable transformation, Viplav et al.6 studied a class of difference-type estimators of the population mean of the study variable when measurement errors are present, and generated some new estimators belonging to that class. Their empirical study showed that the suggested estimators gain efficiency over other existing estimators.

Gregoire and Salas7 studied systematic measurement errors as well as measurement errors assumed to be stochastic in nature. They obtained the statistical properties of three ratio estimators under these measurement error conditions and concluded that the ratio-of-means estimator appears to be the least affected when the auxiliary variate is contaminated with measurement errors. A Monte Carlo study of the ratio and regression estimators by Sahoo et al.8, with the auxiliary variable contaminated by measurement errors, shows that the efficiency of the regression estimator is more sensitive to measurement errors than that of the ratio estimator. The bias of both estimators is sensitive to measurement errors: it decreases as the sample size increases, and increases as the regression line of $y$ (study variable) on $x$ (auxiliary variable) moves away from the origin.

All the work reviewed so far was based on the general assumption that the measurement errors are uncorrelated even though the study variable $y$ and the auxiliary variable $x$ are correlated. However, Shalabh and Jia-Ren9 relaxed this assumption and studied the performance of the ratio and product estimators of the population mean with correlated measurement errors.

In this work, we examine the performance of a modified ratio-type estimator of the population mean under the influence of correlated measurement errors in simple random sampling.

Measurement error model definition

Consider a finite population $U = (U_1, U_2, \ldots, U_N)$ of size $N$. Let $y$ denote the study variable and $x$ the auxiliary variable, taking the values $y_i$ and $x_i$ respectively on the $i$th unit $U_i$ $(i = 1, 2, \ldots, N)$. We denote the population means of $y$ and $x$ by $\mu_Y$ and $\mu_X$, and the population variances of $y$ and $x$ by $\sigma_Y^2$ and $\sigma_X^2$, respectively. Also, let $\sigma_{XY}$ and $\rho$ denote the population covariance and the correlation coefficient between $y$ and $x$.

Assume a simple random sample without replacement (SRSWOR) of size $n$ is drawn from the population $U$, and let $\bar{y}$ and $\bar{x}$ be the sample means of $y$ and $x$ respectively. For the $i$th sampled unit $(i = 1, 2, \ldots, n)$, let $(y_i, x_i)$ be the observed values recorded instead of the true values $(y_i^*, x_i^*)$ of the two characteristics $(y, x)$. Let the measurement errors be defined as:

$$u_i = y_i - y_i^* \tag{1}$$

$$v_i = x_i - x_i^* \tag{2}$$

such that

$$E(u) = E(v) = 0,$$

$$\mathrm{Var}(u) = \sigma_u^2, \qquad \mathrm{Var}(v) = \sigma_v^2,$$

$$\mathrm{Cov}(u, v) = \rho^* \sigma_u \sigma_v.$$

Thus, expressing each observed value as a function of the true value and the measurement error, we have

$$y_i = y_i^* + u_i \tag{3}$$

$$x_i = x_i^* + v_i \tag{4}$$
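The additive model in (1) to (4) can be illustrated with a short simulation. The following is a minimal sketch, not part of the original study: it assumes the error pairs $(u, v)$ are bivariate normal (the model itself only fixes their means, variances and correlation), and the function names `simulate_observed` and `correlation` are hypothetical.

```python
import math
import random


def simulate_observed(true_y, true_x, sigma_u, sigma_v, rho_star, rng):
    """Return observed (y, x) lists built as y = y* + u and x = x* + v.

    The error pairs (u, v) have zero mean, standard deviations sigma_u and
    sigma_v, and correlation rho_star. Normality is an illustrative
    assumption; the model only specifies the first two moments.
    """
    ys, xs = [], []
    for y_star, x_star in zip(true_y, true_x):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        u = sigma_u * z1
        # Mixing z1 and z2 gives Corr(u, v) = rho_star and Var(v) = sigma_v^2.
        v = sigma_v * (rho_star * z1 + math.sqrt(1.0 - rho_star ** 2) * z2)
        ys.append(y_star + u)
        xs.append(x_star + v)
    return ys, xs


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    s_aa = sum((p - mean_a) ** 2 for p in a)
    s_bb = sum((q - mean_b) ** 2 for q in b)
    return s_ab / math.sqrt(s_aa * s_bb)


# Hypothetical true values, held constant so that the errors are easy to recover.
rng = random.Random(0)
true_y = [100.0] * 20000
true_x = [150.0] * 20000
obs_y, obs_x = simulate_observed(true_y, true_x, 6.0, 6.4, -0.3, rng)
u = [y - 100.0 for y in obs_y]
v = [x - 150.0 for x in obs_x]
print("sample correlation of errors:", round(correlation(u, v), 3))
```

With a large number of simulated units, the sample correlation of the recovered errors should be close to the chosen $\rho^*$, which is the quantity that the uncorrelated-error literature sets to zero.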

Notations

Under the large-sample approximation, the finite population correction $1 - f$ can be ignored, where $f = n/N$.

We define the means and variances of the study variable $Y$ and the auxiliary variable $X$ as

$$\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i, \quad \bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i, \quad \sigma_X^2 = \frac{1}{N}\sum_{i=1}^{N} (X_i - \bar{X})^2, \quad \sigma_Y^2 = \frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2.$$

Further, we define the coefficients of variation of $X$ and $Y$ as

$$C_X = \frac{\sigma_X}{\bar{X}} \quad \text{and} \quad C_Y = \frac{\sigma_Y}{\bar{Y}}$$

respectively. The covariance of $Y$ and $X$, the correlation coefficient between $Y$ and $X$, and the correlation coefficient between $u$ and $v$ are defined as

$$\sigma_{XY} = \frac{1}{N}\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y}), \quad \rho = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} \quad \text{and} \quad \rho^* = \frac{\sigma_{uv}}{\sigma_v \sigma_u}$$

respectively.

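Using delta notation, we define the following:

$$\delta_0 = \frac{\bar{y}}{\bar{Y}} - 1 \;\Rightarrow\; \bar{y} = \bar{Y}(1 + \delta_0) \tag{5}$$

$$\delta_1 = \frac{\bar{x}}{\bar{X}} - 1 \;\Rightarrow\; \bar{x} = \bar{X}(1 + \delta_1) \tag{6}$$

such that

$$E(\delta_0) = E(\delta_1) = 0 \tag{7}$$

$$E(\delta_0^2) = \frac{\sigma_Y^2}{n\,\theta_Y\,\bar{Y}^2} \tag{8}$$

$$E(\delta_1^2) = \frac{\sigma_X^2}{n\bar{X}^2}\left(\frac{\sigma_X^2 + \sigma_v^2}{\sigma_X^2}\right) = \frac{\sigma_X^2}{n\,\theta_X\,\bar{X}^2} \tag{9}$$

where

$$\theta_Y = \frac{\sigma_Y^2}{\sigma_Y^2 + \sigma_u^2} \quad \text{and} \quad \theta_X = \frac{\sigma_X^2}{\sigma_X^2 + \sigma_v^2}$$

are bounded on $(0, 1)$.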

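Also,

$$E(\delta_0 \delta_1) = \frac{1}{n\bar{Y}\bar{X}}\left(\rho\,\sigma_Y \sigma_X + \rho^* \sigma_u \sigma_v\right) = \frac{1}{n}\left(\rho\, C_Y C_X + \frac{\sigma_u \sigma_v}{\bar{Y}\bar{X}}\rho^*\right) \tag{10}$$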
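The moments in (8) to (10) are simple functions of the population parameters, so they can be evaluated directly. The sketch below is illustrative only: the function name `delta_moments` and the numerical inputs are hypothetical, and the cross-moment is written in the normalized form $\frac{1}{n}\left(\rho C_Y C_X + \rho^*\sigma_u\sigma_v/(\bar{Y}\bar{X})\right)$, which is the form in which it enters the mean square errors in (15) and (16).

```python
import math


def delta_moments(n, y_bar, x_bar, var_y, var_x, var_u, var_v, rho, rho_star):
    """Return theta_Y, theta_X and the second-order moments of delta_0, delta_1."""
    theta_y = var_y / (var_y + var_u)      # reliability ratio of y
    theta_x = var_x / (var_x + var_v)      # reliability ratio of x
    c_y_sq = var_y / y_bar ** 2            # squared coefficient of variation of y
    c_x_sq = var_x / x_bar ** 2            # squared coefficient of variation of x
    cross = (rho * math.sqrt(c_y_sq * c_x_sq)
             + rho_star * math.sqrt(var_u * var_v) / (y_bar * x_bar))
    return {
        "theta_y": theta_y,
        "theta_x": theta_x,
        "E_d0_sq": c_y_sq / (n * theta_y),   # equation (8)
        "E_d1_sq": c_x_sq / (n * theta_x),   # equation (9)
        "E_d0_d1": cross / n,                # equation (10)
    }


# Hypothetical inputs: measurement error inflates both second moments.
with_error = delta_moments(25, 10.0, 20.0, 90.0, 160.0, 10.0, 40.0, 0.5, 0.2)
no_error = delta_moments(25, 10.0, 20.0, 90.0, 160.0, 0.0, 0.0, 0.5, 0.0)
print(with_error["theta_y"], with_error["theta_x"])
print(with_error["E_d0_sq"] > no_error["E_d0_sq"])
```

Setting both error variances to zero gives $\theta_Y = \theta_X = 1$, and the moments reduce to their usual error-free values.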

Adapted Estimators

The traditional sample mean per unit estimator of the population mean, when the sample data are contaminated with measurement errors, is given by

$$t_0 = \bar{y} \tag{11}$$

Its variance is

$$V(t_0) = \frac{\bar{Y}^2}{n}\,\frac{C_Y^2}{\theta_Y} \tag{12}$$

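Shalabh and Jia-Ren9 proposed the following ratio and product estimators when the general assumption of uncorrelated measurement errors is relaxed:

$$t_1 = \bar{y}\,\frac{\bar{X}}{\bar{x}} \tag{13}$$

$$t_2 = \bar{y}\,\frac{\bar{x}}{\bar{X}} \tag{14}$$

They obtained the mean square errors of the ratio and product estimators as

$$\mathrm{MSE}(t_1) = \frac{\bar{Y}^2}{n}\left(\frac{C_Y^2}{\theta_Y} + \frac{C_X^2}{\theta_X} - 2\left(C_Y C_X \rho + \frac{\sigma_u \sigma_v \rho^*}{\bar{Y}\bar{X}}\right)\right) \tag{15}$$

$$\mathrm{MSE}(t_2) = \frac{\bar{Y}^2}{n}\left(\frac{C_Y^2}{\theta_Y} + \frac{C_X^2}{\theta_X} + 2\left(C_Y C_X \rho + \frac{\sigma_u \sigma_v \rho^*}{\bar{Y}\bar{X}}\right)\right) \tag{16}$$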

Proposed estimator

Motivated by Shalabh and Jia-Ren,9 we propose the following modified ratio estimator of the population mean in the presence of correlated measurement errors:

$$t_r = \bar{y}\left(\frac{\bar{X} + \rho}{\bar{x} + \rho}\right)^{\beta} \tag{17}$$

where $\beta$ is any real number, chosen so as to minimize the mean squared error of $t_r$. The proposed estimator defines a class of estimators, and the following estimators are particular members of the class:

$$\beta = 0: \quad t_{r0} = \bar{y} \tag{18}$$

$$\beta = 1: \quad t_{r1} = \bar{y}\left(\frac{\bar{X} + \rho}{\bar{x} + \rho}\right) \tag{19}$$

$$\beta = -1: \quad t_{r2} = \bar{y}\left(\frac{\bar{x} + \rho}{\bar{X} + \rho}\right) \tag{20}$$

$$\beta = \tfrac{1}{2}: \quad t_{r3} = \bar{y}\left(\frac{\bar{X} + \rho}{\bar{x} + \rho}\right)^{1/2} \tag{21}$$

$$\beta = -\tfrac{1}{2}: \quad t_{r4} = \bar{y}\left(\frac{\bar{X} + \rho}{\bar{x} + \rho}\right)^{-1/2} \tag{22}$$
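As a concrete illustration of how one formula generates the whole class, the sketch below evaluates the estimator in (17) for a given $\beta$. The function name `modified_ratio_estimate` and the sample values are hypothetical; the code assumes $(\bar{X} + \rho)/(\bar{x} + \rho) > 0$ so that fractional powers are real.

```python
def modified_ratio_estimate(y_bar, x_bar, pop_x_bar, rho, beta):
    """Evaluate t_r = y_bar * ((X_bar + rho) / (x_bar + rho)) ** beta, as in (17)."""
    ratio = (pop_x_bar + rho) / (x_bar + rho)
    if ratio <= 0:
        raise ValueError("(X_bar + rho) / (x_bar + rho) must be positive")
    return y_bar * ratio ** beta


# Hypothetical sample summary: sample means, known population mean of x, and rho.
y_bar, x_bar, pop_x_bar, rho = 120.0, 160.0, 170.0, 0.9
for beta in (0.0, 1.0, -1.0, 0.5, -0.5):
    estimate = modified_ratio_estimate(y_bar, x_bar, pop_x_bar, rho, beta)
    print(f"beta = {beta:+.1f}: t_r = {estimate:.4f}")
```

Choosing $\beta = 0$ returns the sample mean per unit, while $\beta = 1$ and $\beta = -1$ give the ratio-type and product-type members in (19) and (20).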

Properties of proposed estimator

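Using the notation defined above, we obtain the properties of the proposed estimators. Expressing (17) in terms of $\delta_i$ $(i = 0, 1)$ gives

$$t_r = \bar{Y}(1 + \delta_0)\left(\frac{\bar{X} + \rho}{\bar{X}(1 + \delta_1) + \rho}\right)^{\beta} \tag{23}$$

(23) can be rewritten as

$$t_r = \bar{Y}(1 + \delta_0)\left(1 + \frac{\bar{X}\rho}{\bar{X} + \rho}\,\delta_1\right)^{-\beta}$$

$$= \bar{Y}(1 + \delta_0)\left[1 - \beta\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)\delta_1 + \frac{\beta(\beta + 1)}{2}\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)^2 \delta_1^2 + O(\delta_1^3)\right]$$

$$t_r = \bar{Y} + \bar{Y}\left[\delta_0 - \beta\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)\delta_1 \delta_0 + \frac{\beta(\beta + 1)}{2}\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)^2 \delta_1^2 \delta_0 - \beta\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)\delta_1 + \frac{\beta(\beta + 1)}{2}\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)^2 \delta_1^2\right]$$

$$t_r - \bar{Y} = \bar{Y}\left[\delta_0 - \beta\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)\delta_1 \delta_0 + \frac{\beta(\beta + 1)}{2}\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)^2 \delta_1^2 \delta_0 - \beta\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)\delta_1 + \frac{\beta(\beta + 1)}{2}\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)^2 \delta_1^2\right] \tag{24}$$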

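Taking the expectation of both sides of (24), substituting (8), (9) and (10), and retaining terms up to the first order of approximation, the bias of $t_r$ is

$$\mathrm{Bias}(t_r) = E(t_r - \bar{Y}) = \frac{\bar{Y}\beta}{n}\left[\left(\frac{\beta + 1}{2}\right)\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)^2 \frac{C_X^2}{\theta_X} - \left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)\left(\rho\, C_Y C_X + \frac{\sigma_u \sigma_v \rho^*}{\bar{Y}\bar{X}}\right)\right] \tag{25}$$

Squaring both sides of (24), taking the expectation, substituting (8), (9) and (10), and retaining terms up to the first order of approximation, the mean square error of $t_r$ is

$$\mathrm{MSE}(t_r) = E(t_r - \bar{Y})^2 = \frac{\bar{Y}^2}{n}\left[\frac{C_Y^2}{\theta_Y} + \beta^2\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)^2 \frac{C_X^2}{\theta_X} - 2\beta\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)\left(\rho\, C_Y C_X + \frac{\sigma_u \sigma_v}{\bar{Y}\bar{X}}\rho^*\right)\right] \tag{26}$$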
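The first-order mean square error in (26) is a quadratic in $\beta$, which is easy to see numerically. The following sketch evaluates (26) for the five particular members of the class. It is only an illustration: the function name `mse_tr` and the parameter values are hypothetical, and the coefficient $\bar{X}\rho/(\bar{X} + \rho)$ is entered exactly as it appears in (26).

```python
import math


def mse_tr(beta, n, y_bar, x_bar, var_y, var_x, var_u, var_v, rho, rho_star):
    """First-order MSE of t_r, following equation (26) term by term."""
    theta_y = var_y / (var_y + var_u)       # reliability ratio of y
    theta_x = var_x / (var_x + var_v)       # reliability ratio of x
    c_y = math.sqrt(var_y) / y_bar          # coefficient of variation of y
    c_x = math.sqrt(var_x) / x_bar          # coefficient of variation of x
    lam = x_bar * rho / (x_bar + rho)       # coefficient of delta_1 as written in (26)
    cross = rho * c_y * c_x + rho_star * math.sqrt(var_u * var_v) / (y_bar * x_bar)
    return (y_bar ** 2 / n) * (
        c_y ** 2 / theta_y
        + beta ** 2 * lam ** 2 * c_x ** 2 / theta_x
        - 2.0 * beta * lam * cross
    )


# Hypothetical parameter values, chosen only to exercise the formula.
params = dict(n=10, y_bar=50.0, x_bar=80.0, var_y=400.0, var_x=900.0,
              var_u=16.0, var_v=25.0, rho=0.8, rho_star=0.3)
for beta in (0.0, 1.0, -1.0, 0.5, -0.5):
    print(f"beta = {beta:+.1f}: MSE = {mse_tr(beta, **params):.4f}")
```

At $\beta = 0$ the expression collapses to $(\sigma_Y^2 + \sigma_u^2)/n$, the variance of the sample mean per unit under measurement error, which gives a quick sanity check on any implementation.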

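Differentiating (26) with respect to $\beta$ and setting the derivative equal to zero, we obtain the optimum value of $\beta$ that minimizes the mean square error of $t_r$ as

$$\beta = \beta_{opt} = \left(\frac{\bar{X} + \rho}{\bar{X}\rho}\right)\left(\rho\, C_Y C_X + \frac{\sigma_u \sigma_v}{\bar{Y}\bar{X}}\rho^*\right)\frac{\theta_X}{C_X^2} \tag{27}$$

Substituting (27) in (26), we obtain the minimum mean square error of $t_r$ as

$$\mathrm{MSE}_{\min}(t_r) = \frac{\bar{Y}^2}{n}\left[\frac{C_Y^2}{\theta_Y} - \frac{\theta_X}{C_X^2}\left(\rho\, C_Y C_X + \frac{\sigma_u \sigma_v}{\bar{Y}\bar{X}}\rho^*\right)^2\right] \tag{28}$$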
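The optimum in (27) and the minimum in (28) can be checked against a direct evaluation of (26). The sketch below does this for one hypothetical parameter set; the container `Population` and the function names are illustrative assumptions, not part of the original study.

```python
import math
from typing import NamedTuple


class Population(NamedTuple):
    """Hypothetical container for the parameters entering (26) to (28)."""
    n: int
    y_bar: float
    x_bar: float
    var_y: float
    var_x: float
    var_u: float
    var_v: float
    rho: float
    rho_star: float


def _terms(p):
    """Return (C_Y^2/theta_Y, C_X^2/theta_X, lambda, cross term) for population p."""
    cy2_over_theta = (p.var_y + p.var_u) / p.y_bar ** 2   # equals C_Y^2 / theta_Y
    cx2_over_theta = (p.var_x + p.var_v) / p.x_bar ** 2   # equals C_X^2 / theta_X
    lam = p.x_bar * p.rho / (p.x_bar + p.rho)             # coefficient as written in (26)
    cross = (p.rho * math.sqrt(p.var_y * p.var_x)
             + p.rho_star * math.sqrt(p.var_u * p.var_v)) / (p.y_bar * p.x_bar)
    return cy2_over_theta, cx2_over_theta, lam, cross


def mse_tr(beta, p):
    """Equation (26)."""
    cy, cx, lam, cross = _terms(p)
    return (p.y_bar ** 2 / p.n) * (cy + beta ** 2 * lam ** 2 * cx - 2.0 * beta * lam * cross)


def beta_opt(p):
    """Equation (27): the value of beta at which d MSE / d beta = 0."""
    _, cx, lam, cross = _terms(p)
    return cross / (lam * cx)


def mse_min(p):
    """Equation (28)."""
    cy, cx, _, cross = _terms(p)
    return (p.y_bar ** 2 / p.n) * (cy - cross ** 2 / cx)


# Hypothetical parameter values, chosen only to exercise the formulas.
pop = Population(10, 50.0, 80.0, 400.0, 900.0, 16.0, 25.0, 0.8, 0.3)
b = beta_opt(pop)
print("beta_opt:", round(b, 4))
print("MSE at beta_opt:", round(mse_tr(b, pop), 4), "| formula (28):", round(mse_min(pop), 4))
```

Because (26) is a convex quadratic in $\beta$, evaluating it slightly to either side of `beta_opt(pop)` should always give a larger value than `mse_min(pop)`.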

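The variance and the mean square errors of the estimators that are particular members of the proposed class are obtained by substituting $\beta = 0, 1, -1, \tfrac{1}{2}, -\tfrac{1}{2}$ in (26). Thus,

$$\mathrm{Var}(t_{r0}) = \frac{\bar{Y}^2}{n}\,\frac{C_Y^2}{\theta_Y} \tag{29}$$

$$\mathrm{MSE}(t_{r1}) = \frac{\bar{Y}^2}{n}\left[\frac{C_Y^2}{\theta_Y} + \left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)^2 \frac{C_X^2}{\theta_X} - 2\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)\left(\rho\, C_Y C_X + \frac{\sigma_u \sigma_v}{\bar{Y}\bar{X}}\rho^*\right)\right] \tag{30}$$

$$\mathrm{MSE}(t_{r2}) = \frac{\bar{Y}^2}{n}\left[\frac{C_Y^2}{\theta_Y} + \left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)^2 \frac{C_X^2}{\theta_X} + 2\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)\left(\rho\, C_Y C_X + \frac{\sigma_u \sigma_v}{\bar{Y}\bar{X}}\rho^*\right)\right] \tag{31}$$

$$\mathrm{MSE}(t_{r3}) = \frac{\bar{Y}^2}{n}\left[\frac{C_Y^2}{\theta_Y} + \frac{1}{4}\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)^2 \frac{C_X^2}{\theta_X} - \left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)\left(\rho\, C_Y C_X + \frac{\sigma_u \sigma_v}{\bar{Y}\bar{X}}\rho^*\right)\right] \tag{32}$$

$$\mathrm{MSE}(t_{r4}) = \frac{\bar{Y}^2}{n}\left[\frac{C_Y^2}{\theta_Y} + \frac{1}{4}\left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)^2 \frac{C_X^2}{\theta_X} + \left(\frac{\bar{X}\rho}{\bar{X} + \rho}\right)\left(\rho\, C_Y C_X + \frac{\sigma_u \sigma_v}{\bar{Y}\bar{X}}\rho^*\right)\right] \tag{33}$$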

Theoretical efficiency comparison of t_r with some existing estimators

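The minimum mean square error of $t_r$ is compared with the mean square errors of the existing estimators $t_0$, $t_1$ and $t_2$. For brevity, write $A = \rho\, C_Y C_X + \dfrac{\sigma_u \sigma_v}{\bar{Y}\bar{X}}\rho^*$. From (28) and (12), we observe that

$$\mathrm{MSE}_{\min}(t_r) - \mathrm{Var}(t_0) = -\frac{\bar{Y}^2}{n}\,\frac{\theta_X}{C_X^2}\,A^2 \le 0 \tag{34}$$

Since $A^2$ is never negative and $\theta_X / C_X^2 > 0$, the difference in (34) is never positive, and it is strictly negative whenever $A \ne 0$. The proposed estimator at its optimum is therefore at least as efficient as the usual unbiased sample mean per unit estimator.

From (28) and (15), we observe that

$$\mathrm{MSE}_{\min}(t_r) - \mathrm{MSE}(t_1) = \frac{\bar{Y}^2}{n}\left[-\frac{\theta_X}{C_X^2}A^2 - \frac{C_X^2}{\theta_X} + 2A\right] = -\frac{\bar{Y}^2}{n}\,\frac{\theta_X}{C_X^2}\left(A - \frac{C_X^2}{\theta_X}\right)^2 \le 0 \tag{35}$$

From (28) and (16), we observe that

$$\mathrm{MSE}_{\min}(t_r) - \mathrm{MSE}(t_2) = \frac{\bar{Y}^2}{n}\left[-\frac{\theta_X}{C_X^2}A^2 - \frac{C_X^2}{\theta_X} - 2A\right] = -\frac{\bar{Y}^2}{n}\,\frac{\theta_X}{C_X^2}\left(A + \frac{C_X^2}{\theta_X}\right)^2 \le 0 \tag{36}$$

From (34), (35) and (36), the proposed estimator at its optimum is never less efficient than the sample mean per unit estimator, the ratio estimator and the product estimator in the presence of correlated measurement errors, and it is strictly more efficient except in the boundary cases where the squared terms vanish.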

Empirical efficiency comparison

The efficiency of the proposed estimator $t_r$ is illustrated using a hypothetical data set on income and expenditure from Gujarati and Porter,10 where

$y_i^*$ = household spending (true value)

$x_i^*$ = household earning (true value)

$y_i$ = household spending (observed value)

$x_i$ = household earning (observed value)

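The following parameter values were obtained from the given data.

| N | $\bar{Y}$ | $\bar{X}$ | $\sigma_Y^2$ | $\sigma_X^2$ | $\sigma_u^2$ | $\sigma_v^2$ | $\rho$ | $\rho^*$ | $\theta_Y$ | $\theta_X$ |
|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 127 | 170 | 1278 | 3300 | 36 | 41 | 0.964 | -0.09087 | 0.975 | 0.988 |

Table 1 Value of the Parameters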

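Table 2 shows the mean square error and the percentage relative efficiency (PRE), with respect to the sample mean per unit $\bar{y}$, of the proposed estimator and some existing estimators. The PRE is defined as

$$\mathrm{PRE}(\cdot) = \frac{\mathrm{Var}(\bar{y})}{\mathrm{MSE}(\cdot)} \times 100 \tag{37}$$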
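The definition in (37) is a one-line computation. The sketch below applies it to four of the mean square errors reported in Table 2; the function name `percent_relative_efficiency` is hypothetical, and the inputs are copied from the table rather than recomputed.

```python
def percent_relative_efficiency(var_mean_per_unit, mse_estimator):
    """PRE of an estimator relative to the sample mean per unit, as in (37)."""
    if mse_estimator <= 0:
        raise ValueError("MSE must be positive")
    return 100.0 * var_mean_per_unit / mse_estimator


# MSE values copied from Table 2.
mse_table2 = {"t0": 131.3974, "t1": 22.5620, "t2": 613.1759, "tr1": 19.6744}
for name, mse in mse_table2.items():
    pre = percent_relative_efficiency(mse_table2["t0"], mse)
    print(f"{name}: PRE = {pre:.2f}")
```

A PRE above 100 means the estimator has a smaller mean square error than $\bar{y}$, and a PRE below 100 means it performs worse.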

| Estimators | Mean square error | Percentage relative efficiency |
|---|---|---|
| $t_0$ | 131.3974 | 100 |
| $t_{r(opt)}$ | 14.4820 | 907.32 |
| $t_1$ | 22.5620 | 582.38 |
| $t_2$ | 613.1759 | 21.43 |
| $t_{r1}$ | 19.6744 | 667.86 |
| $t_{r2}$ | 611.8517 | 21.48 |
| $t_{r3}$ | 32.6882 | 401.97 |
| $t_{r4}$ | 315.8020 | 41.61 |

Table 2 Mean square error and relative efficiency

The efficiency of the proposed estimator is further illustrated using another hypothetical data set from Okafor12 on land area available for cultivation and land area cultivated with maize, where

$y_i$ = the observed land area of the village cultivated with maize

$x_i$ = the observed land area of the village available for cultivation

$y_i^*$ = the true land area of the village cultivated with maize

$x_i^*$ = the true land area of the village available for cultivation

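The following values of the population parameters were obtained from the given data.

| N | $\bar{Y}$ | $\bar{X}$ | $\sigma_Y^2$ | $\sigma_X^2$ | $\sigma_u^2$ | $\sigma_v^2$ | $\rho$ | $\rho^*$ | $\theta_Y$ | $\theta_X$ |
|---|---|---|---|---|---|---|---|---|---|---|
| 20 | 530.08 | 829.16 | 61824.97 | 190361.30 | 9.57 | 9.31 | 0.814 | 0.998 | 0.99985 | 0.99995 |

Table 3 Value of the Parameters Population II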

Table 4 shows the mean squared error and the percentage relative efficiency (PRE), with respect to the sample mean per unit $\bar{y}$, of the proposed estimator, the particular members of the proposed class, and the existing ratio and product estimators.

| Estimators | Mean square error | Percentage relative efficiency |
|---|---|---|
| $t_0$ | 3091.712 | 100.00 |
| $t_{r(opt)}$ | 0.892 | 346460.000 |
| $t_{r1}$ | 1073.425 | 288.023 |
| $t_{r2}$ | 10253.820 | 30.152 |
| $t_{r3}$ | 2587.140 | 119.503 |
| $t_{r4}$ | 4882.238 | 63.326 |
| $t_1$ | 1336.565 | 231.318 |
| $t_2$ | 12627.140 | 24.485 |

Table 4 Mean Squared Error and Percentage Relative Efficiency

For different values of $\beta$, we also obtained the relative efficiency (RE) of $t_r$ over $t_0$, defined as

$$\mathrm{RE}(t_r) = \frac{\mathrm{Var}(t_0)}{\mathrm{MSE}(t_r)} \tag{38}$$

Table 5 presents the relative efficiency of $t_r$ with respect to $t_0$ for different values of $\beta$.

| Value of $\beta$ | MSE($t_r$) | Relative Efficiency |
|---|---|---|
| 0.00 | 131.397 | 1.000 |
| 0.05 | 117.645 | 1.117 |
| 0.10 | 104.750 | 1.254 |
| 0.15 | 92.711 | 1.417 |
| 0.20 | 81.530 | 1.612 |
| 0.25 | 71.205 | 1.845 |
| 0.30 | 61.738 | 2.128 |
| 0.35 | 53.127 | 2.473 |
| 0.40 | 45.374 | 2.896 |
| 0.45 | 38.477 | 3.415 |
| 0.50 | 32.437 | 4.051 |
| 0.55 | 27.255 | 4.821 |
| 0.60 | 22.929 | 5.731 |
| 0.65 | 19.460 | 6.752 |
| 0.70 | 16.848 | 7.799 |
| 0.75 | 15.093 | 8.706 |
| 0.80 | 14.195 | 9.256 |
| $\beta_{opt} = 0.828$ | 14.067 | 9.341 |
| 0.85 | 14.154 | 9.283 |
| 0.90 | 14.970 | 8.777 |
| 0.95 | 16.643 | 7.895 |
| 1.00 | 19.173 | 6.853 |
| 1.05 | 22.560 | 5.824 |
| 1.10 | 26.803 | 4.902 |
| 1.15 | 31.904 | 4.119 |
| 1.20 | 37.862 | 3.470 |
| 1.25 | 44.676 | 2.941 |
| 1.30 | 52.348 | 2.510 |
| 1.35 | 60.876 | 2.158 |
| 1.40 | 70.262 | 1.870 |
| 1.45 | 80.504 | 1.632 |
| 1.50 | 91.603 | 1.434 |
| 1.55 | 103.560 | 1.269 |

Table 5 Relative efficiency of $t_r$ with respect to $t_0$ for different values of $\beta$

Conclusion

The main aim of this work was to ascertain the extent to which correlated measurement errors affect the quality of the sample statistics used to estimate population parameters. Since $\mathrm{Bias}(t_r)$ is a function of $\theta_X$, the bias of the proposed class of estimators is affected by measurement error in the auxiliary variable. Likewise, since $\mathrm{MSE}_{\min}(t_r)$ is a function of $\theta_Y$ and $\theta_X$, the mean squared error of the proposed class is affected by measurement errors in both the study and auxiliary variables. At its optimum value of $\beta$, the proposed modified ratio estimator gains efficiency over some existing estimators in the presence of correlated measurement errors. The study also showed that even when $\beta$ deviates from its optimum value, there is still a range of values of $\beta$ for which the estimator is more efficient than $t_0$ (Table 5). Therefore, the proposed estimator should be preferred in practice.

Acknowledgments

None.

Conflicts of interest

The authors declare that they have no conflict of interest.

References

  1. Cochran WG. Sampling Techniques. 3rd ed. Wiley, New York. 1977.
  2. Biemer PP, Groves RM, Lyberg LE, et al. Measurement errors in survey. John Wiley and Sons, Inc. 1991.
  3. Shalabh S. Ratio method of estimation in the presence of measurement errors. Journal of Indian Society of Agricultural Statistics. 1997;50(2):150–155.
  4. Manish, Singh KR. An estimation of population mean in the presence of measurement errors. Journal of Indian Society of Agricultural Statistics. 2001;54(1):13–18.
  5. Diwakar S, Pathak S, Thakur NS. An estimator for mean estimation in presence of measurement errors. Research and Reviews: A Journal of Statistics. 2012;1(1):1–8.
  6. Viplav SK, Singh R, Smarandache F. Difference-type estimators for estimation of mean in the presence of measurement errors. Mathematics arXiv preprint arXiv:1410.0279. 2014.
  7. Gregoire TG, Salas C. Ratio estimator with measurement error in the auxiliary variate. Journal of International Biometrics Society. 2008;65(2):590–598.
  8. Sahoo LN, Sahoo RK, Senapati SC. An empirical study on the accuracy of ratio and regression estimator in the presence of measurement errors. Monte Carlo Methods and Application. 2006;12(5–6):495–501.
  9. Shalabh S, Jia-Ren T. Ratio and product methods of estimation of population mean in the presence of correlated measurement errors. Communication in Statistics Simulation and Computation. 2016;46(7).
  10. Gujarati DN, Porter DC. Basic Econometrics. McGraw-Hill. 2009.
  11. Singh HP, Tailor R, Tailor R, et al. An improved estimator of population mean using power transformation. Journal of Indian Society of Agricultural Statistics. 2004;58(2):223–230.
  12. Okafor FC. Sample Survey Theory with Applications. Afro-Orbis Publications Ltd., Nsukka. 2002.

©2022 Boniface, et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.