eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 13 Issue 2

Calibration for efficiency of ratio estimator in domains of study with sub-sampling the nonrespondents

Ikot Ekemini E, Iseh Matthew Joshua

Department of Statistics, Akwa Ibom State University, Mkpat Enin, Nigeria

Correspondence: Iseh Matthew Joshua, Department of Statistics, Akwa Ibom State University, Mkpat Enin, Nigeria, Tel +23480386405

Received: April 15, 2024 | Published: May 20, 2024

Citation: Ikot EE, Iseh MJ. Calibration for efficiency of ratio estimator in domains of study with sub-sampling the nonrespondents. Biom Biostat Int J. 2024;13(2):42-50. DOI: 10.15406/bbij.2024.13.00413


Abstract

In sample surveys, information is expected to be collected from all the selected units in the sample, but in practice this is generally not possible because of non-response: some units may not respond or may not be contacted during the survey period. This work focuses on domain estimation of the population mean with sub-sampling the non-respondents. We consider the calibration technique as a method of correcting for non-response in domains of study by minimizing the chi-square distance function between the weight of the main estimator and the calibrated weight, subject to the formulated constraints on the auxiliary variable. As a result, two estimators are proposed: a ratio estimator for the domain mean and a ratio estimator for double sampling. The bias and Mean Square Error (MSE) of the proposed estimators are derived.

We use an auxiliary variable to estimate the population mean, assuming that non-response is observed only on the study variable. The proposed estimators and the existing estimators were compared empirically, in terms of the MSE and Percentage Relative Efficiency (PRE), in domains with small numbers of sampling units, using two populations. We considered two cases: one where non-response is uniform across the two strata at approximately 30%, and one where the non-response rates differ, at 20% and 40% in strata 1 and 2 respectively. The proposed estimators are more efficient than the existing estimators.

Keywords: auxiliary variable, calibration, non-response, ratio estimator, sub-sampling, domain

Introduction

Society cannot run effectively on hunches or trial and error; decisions based on data generally give better results than those based on intuition or gut feeling. Statistics is a range of procedures for gathering, organizing, analyzing and presenting quantitative data, and it helps us to turn data into information. In modern society, the need for statistical information seems endless. In particular, data are regularly collected to satisfy the need for information about specified sets of elements, called finite populations. One of the most important modes of data collection for this purpose is the sample survey, that is, a partial investigation of the finite population; on the basis of this partial (sample) information one tries to draw inferences about the finite population characteristics (parameters). A sample survey is less expensive than a complete enumeration, is usually less time consuming, and may even be more accurate. The term sample is used for the set of units, or portion of the aggregate of material, that has been selected in the belief that it will be representative of the whole aggregate. Sampling theory deals with scientific and objective procedures for choosing an appropriate sampling design, i.e. selecting a sample that is representative of the population as a whole, and also provides suitable procedures for estimating the population parameters. The main challenge to the representativeness of the sample is the effect of non-response on the estimation of the population parameter. Different authors have suggested different techniques for obtaining a reliable and efficient estimator, among which is the calibration technique.

Calibration estimation in sample surveys has since its introduction by Deville JC, et al.1 developed an established theory and method for estimation of finite population parameter. Calibration of weights is a technique that uses population data on auxiliary variables to improve estimates in sample surveys. If auxiliary data are available, some improvement in the precision of estimate may be achieved. Incorporation of auxiliary data in the estimation process is known as calibration. In stratified random sampling, calibration approach is used to obtain optimum strata weights for improving the precision of survey estimates of population parameters. Koyuncu N, et al.2 defined some calibration estimators in stratified random sampling for population characteristics and Clement EP, et al.3 applied the concept of calibration estimators for domain totals in stratified random sampling. Clement EP, et al.4 combined some scalars with the mean of the auxiliary variable and proposed calibration alternative ratio estimator of mean in stratified sampling.

When a researcher is interested in obtaining information about a local or small area, estimation becomes challenging when the sample size in some of the areas of interest is small, and even more difficult when non-response occurs. Several authors have attempted to obtain reliable estimates in such areas of interest, popularly called domains of study. Among them is Godwin A, et al.,5 who considered modifications of some of the procedures for global ratio estimation in single-phase sampling with sub-sampling the non-respondents, proposed by Rao P,6 to obtain an estimate of the mean for a small domain that cuts across constituent strata of a population with unknown weights. The bias and mean square error of each of the modified estimators were obtained for comparison. However, the estimators were not subjected to numerical tests to validate the analytical claims, most importantly in areas of small or zero sample sizes. Unlike in Rao P,6 the population mean of the auxiliary variable adopted by Godwin A, et al.5 is assumed to be unknown before the start of the survey, and hence double sampling was applied under stratified simple random sampling.

In a bid to improve on the efficiency of the estimators under non-response,7,8 adopted the concept of calibration with a single constraint to estimate the population mean and the result was encouraging. Cochran WG9 showed that knowledge of, of domain j that is of interest reduces the variance of the estimator of domain mean in a single-phase simple random design. The reduction in variance is shown to be greater when the proportion of non-domain elements in the population is large and the study variable varies little among the domain elements. Ashutosh10 proposed estimators for domain mean utilizing stratified sampling with non-response. The proposed estimator was compared to a direct ratio estimator for domain mean utilizing stratified sampling with non-response. Clement EP, et al.4 stated that in the presence of powerful auxiliary variables, the calibration estimation meets the objective of reducing both non-response bias and the sampling error. Etebong P11 develops a new approach to ratio estimation that produces a more efficient class of ratio estimators that do not depend on any optimality conditions for optimum performance using calibration weightings. Iseh MJ, et al.,12 Iseh MJ, et al.,13 Iseh MJ, et al.14 considered the challenges of population mean estimation in small area that is characterized by small or no sample size and in the presence of unit non-response and presents a calibration estimator that produces reliable estimates under stratified random sampling from a class of synthetic estimators using calibration approach with alternative distance measure. To overcome the challenges of poor performance of the ratio estimator in small area occasioned with small/no sample size as a result of non-response, this work considers the calibration approach using the constraints of equal weights adjustment criteria, unbiased estimator of the population mean and variance of the auxiliary variable.

In this paper, based on the attempt by Godwin A, et al.5 who suggested the global ratio estimation in single-phase sampling with sub-sampling the non-respondents to obtain an estimate of mean for a small domain that cuts across constituent strata of a population with unknown weights, a new improved ratio estimator for population mean in stratified random sampling is suggested using the theory of calibration estimation with three constraints to achieve optimal precision and efficiency.

Some existing estimators and theoretical underpinnings

This section considers some existing ratio estimators of the domain population mean and the theoretical underpinnings of the proposed ratio estimator. Although not much has been done on domains of study in the presence of non-response, probably because of the intricate nature of the estimation, this paper highlights some existing estimators applicable to domain estimation that use the concept of sub-sampling the non-respondents.

Some existing estimators

Study notations and definitions

$N$ = population size under study

$N_d$ = population size for the $d$th domain

$N_{dh}$ = population size of the $h$th stratum in the $d$th domain

$n_{dh}$ = sample size for the $d$th domain in the $h$th stratum (the domain sample size)

$n_{1dh}$ = sample size of respondent units for the $d$th domain in the $h$th stratum

$n_{2dh}$ = sample size of non-respondent units for the $d$th domain in the $h$th stratum

$W^*_{dh}$ = calibration weight

$W_{dh}$ = stratum weight

$W_{dh1}$ = response rate of the $d$th domain in the $h$th stratum

$W_{dh2}$ = non-response rate of the $d$th domain in the $h$th stratum

$\lambda_1, \lambda_2$ and $\lambda_3$ = the Lagrange multipliers

$X$ = auxiliary variable

$Y$ = study variable

$\bar x_{dh}$ = sample mean of the auxiliary variable for the $d$th domain in the $h$th stratum

$\bar y^*_{dh}$ = unbiased estimator of the population mean of the study variable for the $d$th domain in the $h$th stratum

$\bar X_{dh}$ = population mean of the auxiliary variable for the $d$th domain in the $h$th stratum

$\bar Y_{dh}$ = population mean of the study variable for the $d$th domain in the $h$th stratum

$\bar X_d$ = population mean of the auxiliary variable for the $d$th domain

$\bar Y_d$ = population mean of the study variable for the $d$th domain

$S^2_{ydh}$ = mean square of the study variable for the $d$th domain in the $h$th stratum

$S^2_{xdh}$ = mean square of the auxiliary variable for the $d$th domain in the $h$th stratum

$C_{xdh}$ = coefficient of variation of the auxiliary variable for the $d$th domain in the $h$th stratum

$C_{ydh}$ = coefficient of variation of the study variable for the $d$th domain in the $h$th stratum

$S^2_{ydh2}$ = mean square of the study variable among the non-respondents for the $d$th domain in the $h$th stratum

$k_{dh}$ = inverse sampling rate

$Q_{dh}$ = tuning parameter

Udofia (2004) estimator

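An alternative ratio estimator for the domain mean suggested by [5] is as follows:

$$t_{2j} = \sum_{h=1}^{k} W_h \frac{\bar y^*_h}{\bar x_h}\bar X_h \qquad (1)$$

with

$$Bias(t_{2j}) = \sum_{h=1}^{k} W_h \frac{1-f_h}{n_h \bar X_h}\left(R_h S^2_{xh} - S_{xhyh}\right)$$

and

$$MSE(t_{2j}) = \sum_{h=1}^{k} W_h^2\left[\frac{1-f_h}{n_h}\left(S^2_{yh} + R_h^2 S^2_{xh} - 2R_h S_{xhyh}\right) + \frac{W_{2h}(k-1)}{n_h}S^2_{2yh}\right] \qquad (2)$$

where

$$R_h = \frac{\bar Y_h}{\bar X_h}, \qquad f_h = \left(\frac{1}{n_h} - \frac{1}{N_h}\right)$$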

Pal and Singh HP estimator

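Pal and Singh15 proposed a class of ratio-cum-ratio-type exponential estimators for the population mean with sub-sampling the non-respondents. The estimator and its mean square error are given as:

$$t_{ps1} = \alpha\,\bar y^*\left(\frac{\bar X}{\bar x}\right) + (1-\alpha)\,\bar y^*\exp\left(\frac{\bar X - \bar x}{\bar X + \bar x}\right)$$

and

$$MSE(t_{ps1}) = \bar Y^2\left(\lambda C_y^2\left(1-\rho_{xy}^2\right) + \frac{W_2(Z-1)}{n}C^2_{y(2)}\right) \qquad (3)$$

where

$$W_2 = \frac{n_2}{n}, \qquad \lambda = \frac{1-f}{n}, \qquad f = \frac{n}{N}$$

and $\alpha$ is a constant.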

Ashutosh estimator

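Ashutosh10 proposed a direct ratio generalized estimator for the domain mean through stratified sampling with non-response as

$$T_{DG.st.\beta.d} = \bar y_{st.d}\left[\frac{\bar x_{st.d}}{\bar X_{st.d}}\right]^{\beta}$$

where $\beta$ is a chosen constant, and the stratified means of $y$ (for the respondents) and of $x$ for the $d$th domain can be written as

$$\bar y_{st.d} = \sum_{h=1}^{H} W_{h.d}\,\bar y_{h.d}, \qquad \bar x_{st.d} = \sum_{h=1}^{H} W_{h.d}\,\bar x_{h.d}$$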

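Members of the proposed class of estimators $T^*_{DG.st.\beta.d}$:

$T^*_{DG.st.0.a} = \bar y^*_{st.a}$ if $\beta = 0$

$T^*_{DG.st.1.a} = \bar y^*_{st.a}\,\dfrac{\bar x^*_{st.a}}{\bar X_{st.a}}$ if $\beta = 1$

$T^*_{DG.st.-1.a} = \bar y^*_{st.a}\,\dfrac{\bar X_{st.a}}{\bar x^*_{st.a}}$ if $\beta = -1$

$T^*_{DG.st.2.a} = \bar y^*_{st.a}\left[\dfrac{\bar x^*_{st.a}}{\bar X_{st.a}}\right]^2$ if $\beta = 2$

The bias and mean square error of $T^*_{DG.st.1.a}$ are given as

$$Bias(T^*_{DG.st.1.a}) = \sum_{h=1}^{H} W_{h.a}\bar Y_{h.a}\left[\frac{N_{h.a}-n_{h.a}}{N_{h.a}\,n_{h.a}}C^2_{Xh.a} + \frac{(g_{h.a}-1)W_{2h.a}}{n_{h.a}}C^2_{2Yh.a}\right] - \bar Y_a$$

$$MSE(T^*_{DG.st.1.a}) = \sum_{h=1}^{H} W^2_{h.a}\bar Y^2_{h.a}\left[\frac{N_{h.a}-n_{h.a}}{N_{h.a}\,n_{h.a}}\left(C^2_{Yh.a} + C^2_{Xh.a} - 2C_{YXh.a}\right) + \frac{(g_{h.a}-1)W_{2h.a}}{n_{h.a}}\left(C^2_{2Yh.a} + C^2_{2Xh.a} - 2C_{2YXh.a}\right)\right] \qquad (4)$$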

Sampling design in single phase

Let $\pi = \{U_1, U_2, \ldots, U_N\}$ denote a finite population whose elements fall into $L$ known strata, with $N_{dh}$ elements in the $h$th stratum, $h = 1, 2, \ldots, L$, and $\sum_h N_{dh} = N_d$. It is assumed that $\pi$ can also be partitioned, according to the distribution of a variable $Z$, into an exhaustive set of $D$ sub-populations or domains of study denoted by $\{A^*_d;\ d = 1, 2, \ldots, D\}$. Each stratum consists of a substratum of $N_{1dh}$ respondents and a substratum of $N_{2dh}$ non-respondents, with $N_{1dh} + N_{2dh} = N_{dh}$ for all $h$. Let $A^*_{dh}$ denote the part of domain $d$ ($A^*_d$) in stratum $h$ and $N_{dh}$ the unknown number of elements in $A^*_{dh}$. Let $y_{dhj}$ denote the value of characteristic $Y$ for element $j$ in $A^*_{dh}$.
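The text does not spell out how the unbiased estimator $\bar y^*_{dh}$ is formed from the respondents and the sub-sampled non-respondents. A minimal sketch is given below, assuming the usual Hansen-Hurwitz form $\bar y^* = (n_1\bar y_1 + n_2\bar y_{2r})/n$, where $\bar y_{2r}$ is the mean of the $r = n_2/k$ non-respondents that are followed up. The function name and the toy data are hypothetical and serve only as an illustration.

```python
def hansen_hurwitz_mean(y_resp, y_sub, n2):
    """Hansen-Hurwitz-type mean for one domain-stratum cell (assumed form).

    y_resp : observed values from the n1 respondents
    y_sub  : values obtained from the followed-up subsample of non-respondents
    n2     : total number of non-respondents in the sample (n = n1 + n2)
    """
    n1 = len(y_resp)
    n = n1 + n2
    if n == 0:
        raise ValueError("empty cell: no sampled units")
    if n2 > 0 and not y_sub:
        raise ValueError("non-respondents present but no subsample values given")
    ybar1 = sum(y_resp) / n1 if n1 else 0.0
    ybar2r = sum(y_sub) / len(y_sub) if n2 else 0.0
    # Weight each group mean by its share of the original sample.
    return (n1 * ybar1 + n2 * ybar2r) / n


# Toy cell: 3 respondents, 4 non-respondents of whom 2 are followed up (k = 2).
ystar = hansen_hurwitz_mean([10.0, 12.0, 14.0], [20.0, 22.0], n2=4)
```

A cell with no sampled units at all raises an error here, which mirrors the situation in which no estimate can be formed for a domain.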

Proposed estimator

Calibration has been proven to be an estimation technique to smoothen an existing estimator for a better precision and an improved efficiency. For household survey and other economic data that requires knowledge of the supplementary information, a new ratio estimator is suggested to enhance efficiency in domains of study even in the presence of non-response.  Motivated by [5] in an Alternative Ratio Estimator for domain mean, we proposed the following estimator:

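$$t^*_{cal} = \sum_{h=1}^{L} W^*_{dh}\,\frac{\bar y^*_{dh}}{\bar x_{dh}}\,\bar X_{dh} \qquad (5)$$

Equation (5) can be written as

$$t^*_{cal} = \sum_{h=1}^{L} W^*_{dh}\,\bar y_{dhr} \qquad (6)$$

where

$$\bar y_{dhr} = r_{dh}\bar X_{dh}$$

and

$$r_{dh} = \frac{\bar y^*_{dh}}{\bar x_{dh}},$$

$\bar X_{dh}$ is assumed to be known and $W^*_{dh}$ is the calibration weight, aimed at adjusting the existing weight in the estimators of [5] using the chi-square distance measure

$$\varphi = \sum_{h=1}^{L} \frac{\left(W^*_{dh} - W_{dh}\right)^2}{Q_{dh}W_{dh}}$$

subject to the following constraints:

$$\sum_{h=1}^{L} W^*_{dh} = 1, \qquad \sum_{h=1}^{L} W^*_{dh}\bar x_{dh} = \sum_{h=1}^{L} W_{dh}\bar X_{dh}, \qquad \sum_{h=1}^{L} W^*_{dh}s^2_{dh} = \sum_{h=1}^{L} W_{dh}S^2_{dh}$$

Thus the optimization problem is given by:

$$\varphi = \sum_{h=1}^{L} \frac{\left(W^*_{dh} - W_{dh}\right)^2}{Q_{dh}W_{dh}} - 2\lambda_1\left(\sum_{h=1}^{L} W^*_{dh} - 1\right) - 2\lambda_2\left(\sum_{h=1}^{L} W^*_{dh}\bar x_{dh} - \sum_{h=1}^{L} W_{dh}\bar X_{dh}\right) - 2\lambda_3\left(\sum_{h=1}^{L} W^*_{dh}s^2_{dh} - \sum_{h=1}^{L} W_{dh}S^2_{dh}\right)$$

where $\lambda_1, \lambda_2$ and $\lambda_3$ are the Lagrange multipliers, such that

$$\frac{\partial\varphi}{\partial W^*_{dh}} = \frac{2\left(W^*_{dh} - W_{dh}\right)}{Q_{dh}W_{dh}} - 2\lambda_1 - 2\lambda_2\bar x_{dh} - 2\lambda_3 s^2_{dh} = 0$$

$$W^*_{dh} = W_{dh} + Q_{dh}W_{dh}\left(\lambda_1 + \lambda_2\bar x_{dh} + \lambda_3 s^2_{dh}\right)$$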
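To make the calibration step concrete, the sketch below solves for the three Lagrange multipliers by substituting the expression for $W^*_{dh}$ into the three constraints, which gives a 3×3 linear system. The function name, the default $Q_{dh} = 1$ and the four-stratum toy data are assumptions made purely for illustration. The linear system has a unique solution only when there are at least three strata and the columns $(1, \bar x_{dh}, s^2_{dh})$ are linearly independent.

```python
import numpy as np


def calibrated_weights(W, xbar, s2, Xbar, S2, Q=None):
    """Return (W_star, lam) for the chi-square distance with three constraints."""
    W, xbar, s2, Xbar, S2 = (np.asarray(a, dtype=float)
                             for a in (W, xbar, s2, Xbar, S2))
    Q = np.ones_like(W) if Q is None else np.asarray(Q, dtype=float)
    # Columns of Z are the calibration variables: 1, xbar_dh, s2_dh.
    Z = np.column_stack([np.ones_like(W), xbar, s2])
    # M[j, k] = sum_h Q_h W_h z_hj z_hk (symmetric 3x3 matrix).
    M = Z.T @ (Z * (Q * W)[:, None])
    # Right-hand side: amount by which each constraint must be adjusted.
    c = np.array([1.0 - W.sum(),
                  np.sum(W * (Xbar - xbar)),
                  np.sum(W * (S2 - s2))])
    lam = np.linalg.solve(M, c)
    W_star = W + Q * W * (Z @ lam)
    return W_star, lam


# Hypothetical four-stratum example (values are illustrative only).
W = np.array([0.1, 0.2, 0.3, 0.4])
xbar = np.array([10.0, 20.0, 15.0, 30.0])
s2 = np.array([4.0, 9.0, 25.0, 16.0])
Xbar = np.array([11.0, 19.0, 16.0, 29.0])
S2 = np.array([5.0, 8.0, 24.0, 18.0])
W_star, lam = calibrated_weights(W, xbar, s2, Xbar, S2)
```

By construction the returned weights sum to one and reproduce the weighted population mean and variance of the auxiliary variable.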

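Substituting $W^*_{dh}$ in Eq. (6) gives

$$\hat{\bar t}_{cal} = \sum_{h=1}^{L} W_{dh}\bar y_{dhr} + \beta_{1(dh)}\left(1 - \sum_{h=1}^{L} W_{dh}\right) + \beta_{2(dh)}\left(\sum_{h=1}^{L} W_{dh}\left(\bar X_{dh} - \bar x_{dh}\right)\right) + \beta_{3(dh)}\left(\sum_{h=1}^{L} W_{dh}\left(S^2_{dh} - s^2_{dh}\right)\right) \qquad (7)$$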

Where

the coefficients are obtained by solving the three constraint equations simultaneously. Writing $z_{0} = 1$, $z_{1} = \bar x_{dh}$, $z_{2} = s^2_{dh}$ and

$$m_{jk} = \sum_{h=1}^{L} Q_{dh}W_{dh}\,z_j z_k, \qquad g_j = \sum_{h=1}^{L} Q_{dh}W_{dh}\,z_j\,\bar y_{dhr}, \qquad j, k = 0, 1, 2,$$

so that, for example, $m_{00} = \sum Q_{dh}W_{dh}$, $m_{01} = \sum Q_{dh}W_{dh}\bar x_{dh}$, $m_{12} = \sum Q_{dh}W_{dh}\bar x_{dh}s^2_{dh}$, $m_{22} = \sum Q_{dh}W_{dh}s^4_{dh}$ and $g_1 = \sum Q_{dh}W_{dh}\bar x_{dh}\bar y_{dhr}$, the coefficients are

$$\beta_{1(dh)} = \frac{g_0\left(m_{11}m_{22} - m_{12}^2\right) - m_{01}\left(g_1 m_{22} - g_2 m_{12}\right) + m_{02}\left(g_1 m_{12} - g_2 m_{11}\right)}{\Delta}$$

$$\beta_{2(dh)} = \frac{m_{00}\left(g_1 m_{22} - g_2 m_{12}\right) - g_0\left(m_{01}m_{22} - m_{02}m_{12}\right) + m_{02}\left(m_{01}g_2 - m_{02}g_1\right)}{\Delta}$$

$$\beta_{3(dh)} = \frac{m_{00}\left(m_{11}g_2 - m_{12}g_1\right) - m_{01}\left(m_{01}g_2 - m_{02}g_1\right) + g_0\left(m_{01}m_{12} - m_{02}m_{11}\right)}{\Delta}$$

with common denominator

$$\Delta = m_{00}\left(m_{11}m_{22} - m_{12}^2\right) - m_{01}\left(m_{01}m_{22} - m_{02}m_{12}\right) + m_{02}\left(m_{01}m_{12} - m_{02}m_{11}\right)$$
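The regression-type form (7) and the weighted form (6) should give the same number. The sketch below checks this numerically, treating the three coefficients as the solution of a symmetric 3×3 linear system rather than expanding the determinants by hand. The function name, the choice $Q_{dh} = 1$ and the toy data are hypothetical, and at least three strata are assumed so that the system is solvable.

```python
import numpy as np


def calibration_estimate(W, ybar_r, xbar, s2, Xbar, S2, Q=None):
    """Return (t_regression_form, t_weighted_form, beta) for one domain."""
    W, ybar_r, xbar, s2, Xbar, S2 = (np.asarray(a, dtype=float)
                                     for a in (W, ybar_r, xbar, s2, Xbar, S2))
    Q = np.ones_like(W) if Q is None else np.asarray(Q, dtype=float)
    qw = Q * W
    Z = np.column_stack([np.ones_like(W), xbar, s2])   # 1, xbar_dh, s2_dh
    M = Z.T @ (Z * qw[:, None])                        # sums of Q W z_j z_k
    g = Z.T @ (qw * ybar_r)                            # sums of Q W z_j ybar_dhr
    c = np.array([1.0 - W.sum(),
                  np.sum(W * (Xbar - xbar)),
                  np.sum(W * (S2 - s2))])
    beta = np.linalg.solve(M, g)                       # beta_1, beta_2, beta_3
    t_regression = np.sum(W * ybar_r) + beta @ c       # form of Eq. (7)
    lam = np.linalg.solve(M, c)                        # Lagrange multipliers
    W_star = W + qw * (Z @ lam)
    t_weighted = np.sum(W_star * ybar_r)               # form of Eq. (6)
    return t_regression, t_weighted, beta


# Hypothetical four-stratum data (illustrative values only).
t_reg, t_wtd, beta = calibration_estimate(
    W=[0.1, 0.2, 0.3, 0.4],
    ybar_r=[50.0, 80.0, 65.0, 120.0],
    xbar=[10.0, 20.0, 15.0, 30.0],
    s2=[4.0, 9.0, 25.0, 16.0],
    Xbar=[11.0, 19.0, 16.0, 29.0],
    S2=[5.0, 8.0, 24.0, 18.0],
)
```

The two forms agree because the coefficient matrix is symmetric, so applying its inverse to either side of the product yields the same scalar.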

Bias and variance of the proposed estimator

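From the proposed estimator above, with $W^*_{dh} = W_{dh} + Q_{dh}W_{dh}\left(\lambda_1 + \lambda_2\bar x_{dh} + \lambda_3 s^2_{dh}\right)$, let

$$e_0 = \frac{\bar y^*_{dh} - \bar Y_{dh}}{\bar Y_{dh}}, \qquad e_1 = \frac{\bar x_{dh} - \bar X_{dh}}{\bar X_{dh}}, \qquad e_2 = \frac{s^2_{xdh} - S^2_{xdh}}{S^2_{xdh}}$$

where

$$\bar y^*_{dh} = \frac{\sum_{h=1}^{L} y^*_{dh}}{n_{dh}}, \quad \bar x_{dh} = \frac{\sum_{h=1}^{L} x_{dh}}{n_{dh}}, \quad \bar X_{dh} = \frac{\sum_{h=1}^{L} X_{dh}}{N_{dh}}, \quad s^2_{dh} = \frac{\sum_{h=1}^{L}\left(\bar x_{dh} - \bar X_{dh}\right)^2}{n_{dh} - 1} \quad \text{and} \quad S^2_{dh} = \frac{\sum_{h=1}^{L}\left(\bar X_{dh} - \bar X\right)^2}{N_{dh} - 1}$$

Also,

$$\bar y^*_{dh} = \bar Y_{dh}(1 + e_0), \qquad \bar x_{dh} = \bar X_{dh}(1 + e_1), \qquad s^2_{xdh} = S^2_{xdh}(1 + e_2)$$

Let

$$E[e_0^2] = \frac{Var(\bar y^*_{dh})}{\bar Y^2_{dh}} = \left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)C^2_{ydh} + \frac{(K_{dh} - 1)}{n_{dh}}W_{dh2}C^2_{ydh2} = \left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\frac{S^2_{ydh}}{\bar Y^2_{dh}} + \frac{(K_{dh} - 1)}{n_{dh}\bar Y^2_{dh}}W_{dh2}S^2_{ydh2}$$

$$E[e_1^2] = \frac{Var(\bar x_{dh})}{\bar X^2_{dh}} = \left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)C^2_{xdh} = \left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\frac{S^2_{xdh}}{\bar X^2_{dh}}$$

$$E[e_2^2] = \frac{Var(s^2_{xdh})}{S^2_{xdh}} = \left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\frac{S^4_{xdh}}{S^2_{xdh}} = \left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{xdh}$$

$$E[e_0e_1] = \frac{Cov(\bar x_{dh}, \bar y^*_{dh})}{\bar X_{dh}\bar Y_{dh}} = \left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\rho_{xy}C_{ydh}C_{xdh} = \frac{1}{\bar X_{dh}\bar Y_{dh}}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\rho_{xy}S_{xdh}S_{ydh}$$

$$E[e_0] = E[e_1] = E[e_2] = 0$$

$$E[e_1e_2] = \left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)C_{xdh}\lambda_{03} = \left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\frac{S_{xdh}}{\bar X_{dh}}\lambda_{03}$$

where

$$\lambda_{rs} = \frac{\mu_{rs}}{\mu_{20}^{r/2}\mu_{02}^{s/2}}$$

and

$$\mu_{rs} = \frac{1}{N_{dh} - 1}\sum_{i=1}^{N}\left(Y_{dhi} - \bar Y_{dh}\right)^r\left(X_{dhi} - \bar X_{dh}\right)^s, \qquad \mu_{20} = S^2_{ydh}, \qquad \mu_{02} = S^2_{xdh}$$

Hence

$$\lambda_{03} = \frac{\mu_{03}}{\mu_{20}^{0/2}\mu_{02}^{3/2}}$$

$$E[e_0e_2] = \left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)C_{ydh}\lambda_{12} = \left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\frac{S_{ydh}}{\bar Y_{dh}}\lambda_{12}$$

where $\lambda_{12} = \dfrac{\mu_{12}}{\mu_{20}^{1/2}\mu_{02}}$. Then

$$\bar y_{dhr} = \frac{\bar y^*_{dh}}{\bar x_{dh}}\bar X_{dh} = \bar Y_{dh}\left(1 + e_0 - e_1 + e_1^2 - e_0e_1\right)$$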
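The expansion of $\bar y_{dhr}$ above can be checked with a short worked step. It is sketched here under the usual assumption that $|e_1| < 1$, so that $(1 + e_1)^{-1}$ admits a power series, and with terms of order higher than two dropped:

```latex
\bar y_{dhr}
  = \frac{\bar Y_{dh}(1+e_0)}{\bar X_{dh}(1+e_1)}\,\bar X_{dh}
  = \bar Y_{dh}(1+e_0)(1+e_1)^{-1}
  = \bar Y_{dh}(1+e_0)\left(1 - e_1 + e_1^2 - \cdots\right)
  \approx \bar Y_{dh}\left(1 + e_0 - e_1 + e_1^2 - e_0 e_1\right)
```

The only second-order terms that survive are $e_1^2$ and $-e_0e_1$; the product $e_0e_1^2$ is of third order and is discarded.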

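To obtain the bias,

$$B(t^*_{cal}) = E\left[t^*_{cal} - \bar Y_d\right] = E\left[\sum_{h=1}^{L} W_{dh}\bar Y_{dh}\left(1 + e_0 - e_1 + e_1^2 - e_0e_1\right) - \beta_{2(dh)}\sum_{h=1}^{L} W_{dh}\bar X_{dh}e_1 - \beta_{3(dh)}\sum_{h=1}^{L} W_{dh}S^2_{xdh}e_2 - \bar Y_d\right]$$

$$= E\left[\sum_{h=1}^{L} W_{dh}\bar Y_{dh}\left(e_0 - e_1 + e_1^2 - e_0e_1\right) - \beta_{2(dh)}\sum_{h=1}^{L} W_{dh}\bar X_{dh}e_1 - \beta_{3(dh)}\sum_{h=1}^{L} W_{dh}S^2_{xdh}e_2\right] = \sum_{h=1}^{L} W_{dh}\bar Y_{dh}\left[E(e_1^2) - E(e_0e_1)\right]$$

$$B(t^*_{cal}) = \sum_{h=1}^{L} W_{dh}\bar Y_{dh}\left[\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\frac{S^2_{xdh}}{\bar X^2_{dh}}\right] - \sum_{h=1}^{L} W_{dh}\bar Y_{dh}\left[\frac{1}{\bar X_{dh}\bar Y_{dh}}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\rho_{xy}S_{xdh}S_{ydh}\right] \qquad (8)$$

Similarly, the mean square error is

$$MSE(t^*_{cal}) = E\left[\sum_{h=1}^{L} W_{dh}\bar Y_{dh}\left(1 + e_0 - e_1 + e_1^2 - e_0e_1\right) - \beta_{2(dh)}\sum_{h=1}^{L} W_{dh}\bar X_{dh}e_1 - \beta_{3(dh)}\sum_{h=1}^{L} W_{dh}S^2_{xdh}e_2 - \bar Y_d\right]^2$$

which, ignoring terms with power greater than 2, gives

$$MSE(t^*_{cal}) = \sum_{h=1}^{L} W^2_{dh}\left[\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{ydh} + \frac{(K_{dh} - 1)}{n_{dh}}W_{dh2}S^2_{ydh2}\right] - 2\sum_{h=1}^{L} W^2_{dh}\left[\frac{\bar Y_{dh}}{\bar X_{dh}}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\rho_{xy}S_{xdh}S_{ydh}\right]$$

$$- 2\beta_{2(dh)}\sum_{h=1}^{L} W^2_{dh}\left[\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\rho_{xy}S_{xdh}S_{ydh}\right] - 2\beta_{3(dh)}\sum_{h=1}^{L} W^2_{dh}\left[\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{xdh}S_{ydh}\lambda_{12}\right] + \sum_{h=1}^{L} W^2_{dh}\frac{\bar Y^2_{dh}}{\bar X^2_{dh}}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{xdh}$$

$$+ 2\beta_{2(dh)}\sum_{h=1}^{L} W^2_{dh}\frac{\bar Y_{dh}}{\bar X_{dh}}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{xdh} + 2\beta_{3(dh)}\sum_{h=1}^{L} W^2_{dh}\frac{\bar Y_{dh}}{\bar X_{dh}}\left[\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^3_{xdh}\lambda_{03}\right]$$

$$+ \beta^2_{2(dh)}\sum_{h=1}^{L} W^2_{dh}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{xdh} + \beta^2_{3(dh)}\sum_{h=1}^{L} W^2_{dh}\left[\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^4_{xdh}\left(\lambda_{04} - 1\right)\right] \qquad (9)$$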

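To obtain the minimum variance, we differentiate (9) partially with respect to $\beta_{2(dh)}$ and $\beta_{3(dh)}$ and equate the derivatives to zero, such that

$$\beta_{2(dh)} = \frac{\sum_{h=1}^{L} W^2_{dh}\left[\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\rho_{xy}S_{xdh}S_{ydh}\right] - \sum_{h=1}^{L} W^2_{dh}\frac{\bar Y_{dh}}{\bar X_{dh}}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{xdh}}{\sum_{h=1}^{L} W^2_{dh}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{xdh}} \qquad (10)$$

$$\beta_{3(dh)} = \frac{\sum_{h=1}^{L} W^2_{dh}\left[\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{xdh}S_{ydh}\lambda_{12}\right] - \sum_{h=1}^{L} W^2_{dh}\frac{\bar Y_{dh}}{\bar X_{dh}}\left[\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^3_{xdh}\lambda_{03}\right]}{\sum_{h=1}^{L} W^2_{dh}\left[\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^4_{xdh}\left(\lambda_{04} - 1\right)\right]} \qquad (11)$$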
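As a sketch of how (10) and (11) arise, write $\theta_{dh} = \left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)$ (a shorthand introduced only for this illustration) and set the two partial derivatives of (9) to zero:

```latex
\frac{\partial\, MSE(t^*_{cal})}{\partial \beta_{2(dh)}}
  = -2\sum_{h=1}^{L} W_{dh}^2\,\theta_{dh}\,\rho_{xy} S_{xdh} S_{ydh}
    + 2\sum_{h=1}^{L} W_{dh}^2\,\frac{\bar Y_{dh}}{\bar X_{dh}}\,\theta_{dh}\, S^2_{xdh}
    + 2\beta_{2(dh)}\sum_{h=1}^{L} W_{dh}^2\,\theta_{dh}\, S^2_{xdh} = 0

\frac{\partial\, MSE(t^*_{cal})}{\partial \beta_{3(dh)}}
  = -2\sum_{h=1}^{L} W_{dh}^2\,\theta_{dh}\, S^2_{xdh} S_{ydh}\lambda_{12}
    + 2\sum_{h=1}^{L} W_{dh}^2\,\frac{\bar Y_{dh}}{\bar X_{dh}}\,\theta_{dh}\, S^3_{xdh}\lambda_{03}
    + 2\beta_{3(dh)}\sum_{h=1}^{L} W_{dh}^2\,\theta_{dh}\, S^4_{xdh}(\lambda_{04}-1) = 0
```

Each condition is linear in its own coefficient, so solving for $\beta_{2(dh)}$ and $\beta_{3(dh)}$ in turn gives the two ratios stated in (10) and (11).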

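$$\min MSE(t^*_{cal}) = \sum_{h=1}^{L} W^2_{dh}\left[\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{ydh} + \frac{(K_{dh} - 1)}{n_{dh}}W_{dh2}S^2_{ydh2}\right] - 2\sum_{h=1}^{L} W^2_{dh}\frac{\bar Y_{dh}}{\bar X_{dh}}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\rho_{xy}S_{xdh}S_{ydh}$$

$$+ \sum_{h=1}^{L} W^2_{dh}\frac{\bar Y^2_{dh}}{\bar X^2_{dh}}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{xdh} - \frac{\left[\sum_{h=1}^{L} W^2_{dh}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)\rho_{xy}S_{xdh}S_{ydh} - \sum_{h=1}^{L} W^2_{dh}\frac{\bar Y_{dh}}{\bar X_{dh}}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{xdh}\right]^2}{\sum_{h=1}^{L} W^2_{dh}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{xdh}}$$

$$- \frac{\left[\sum_{h=1}^{L} W^2_{dh}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^2_{xdh}S_{ydh}\lambda_{12} - \sum_{h=1}^{L} W^2_{dh}\frac{\bar Y_{dh}}{\bar X_{dh}}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^3_{xdh}\lambda_{03}\right]^2}{\sum_{h=1}^{L} W^2_{dh}\left(\frac{1}{n_{dh}} - \frac{1}{N_{dh}}\right)S^4_{xdh}\left(\lambda_{04} - 1\right)} \qquad (12)$$

Equation (12) is the minimum variance of the proposed estimator.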

Percentage relative efficiency of the estimators

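The percentage relative efficiency of the proposed estimators with respect to the existing estimators is given as:

$$PRE = \frac{MSE(P)}{MSE(E)} \times 100$$

where $MSE(P)$ is the mean square error of the proposed estimator and $MSE(E)$ is that of the existing estimator, so that a value below 100 indicates that the proposed estimator has the smaller MSE.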
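As a small worked illustration of the PRE formula, the sketch below applies it to the Case 1, Population 1 MSE values reported for $t_{2j}$ and $t^*_{cal}$ in Table 7. The function and variable names are hypothetical; only the numbers come from the paper.

```python
def pre(mse_proposed, mse_existing):
    """Percentage relative efficiency as defined above: MSE(P) / MSE(E) * 100."""
    return mse_proposed / mse_existing * 100.0


# Table 7, Case 1 (Population 1): MSE by domain for the existing estimator t2j
# and for the proposed calibration estimator t*cal.
mse_t2j = [1950061, 74298, 817651, 367172]
mse_tcal = [389212, 68914.02, 613475, 41078.01]

pre_t2j = [pre(p, e) for p, e in zip(mse_tcal, mse_t2j)]
for domain, value in enumerate(pre_t2j, start=1):
    print(f"Domain {domain}: PRE of t2j relative to t*cal = {value:.5f}")
```

The resulting figures agree, to the reported precision, with the $t_{2j}$ row for Case 1 (Population 1) in Table 8.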

Empirical study

We use the Swedish municipalities population MU284,16 (Appendix B). The population is sub-divided geographically into eight parts (domains) 1, 2, 3, 4, 5, 6, 7 and 8, of sizes 25, 48, 32, 38, 56, 41, 15 and 29 respectively. However, we considered only four domains, 1, 3, 7 and 8, because these domains have fewer units than the others. The proposed estimator is a calibration estimator. Quantities such as $n_{dh1}$ and $n_{dh}$ were computed from existing information on the populations. Each domain is then classified, for convenience, into two homogeneous strata: values below 1500 (millions of kronor) and values above 1500 (millions of kronor). We consider two cases, 1 and 2, of non-response (in both Population I and Population II).

Case 1: Non-response occurs at a uniform rate of approximately 30% in both strata (1 and 2) of every domain.

Case 2: Non-response rates differ between the strata, at approximately 20% in stratum 1 and 40% in stratum 2.

Population I

Y: Real estate values according to 1984 assessment (in millions of kronor).

X: Total number of municipal employees in 1984.

Population II

Another population is considered ([16], Appendix B), which is classified into four domains with strata 1 and 2 defined by revenues less than 100 (in millions of kronor) and revenues above 100 (in millions of kronor).

Y: Revenues of 1985 municipal taxation assessment (in millions of kronor).

X: 1985 population (in thousands).

Discussion

This discussion is based on the empirical analysis carried out and the results presented in Tables 1–8. From Table 7 (Populations I and II), which gives the MSE of the estimators of the domain mean under single-stage sampling, the mean square error of the proposed estimator is less than the MSE of the existing estimators in all the domains for which estimates could be computed. This holds in both cases of non-response, that is, where the non-response rate is uniform across the strata and where it is non-uniform, as specified in the data. The Average Mean Squared Error (AMSE) confirms the behavior of the MSE in both populations and cases. From Table 8 (Populations I and II), with the Percentage Relative Efficiency (PRE) of the proposed estimator kept at a benchmark of 100%, every existing estimator has a PRE below 100% in each of these domains, which shows that the proposed estimator has greater gains in efficiency than the existing estimators.

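| Domain size | 25 | 25 | 32 | 32 | 15 | 15 | 29 | 29 |
|---|---|---|---|---|---|---|---|---|
| Stratum | 1 | 2 | 1 | 2 | 1 | 2 | 1 | 2 |
| $N_{dh}$ | 2 | 23 | 12 | 20 | 2 | 13 | 18 | 11 |
| $W_{dh}$ | 0.080 | 0.920 | 0.375 | 0.625 | 0.133 | 0.867 | 0.620 | 0.379 |
| $\bar Y_{dh}$ | 955.50 | 6888 | 1056.9 | 3364 | 1231 | 4020 | 723.2 | 4799 |
| $\bar X_{dh}$ | 529 | 4385 | 485.4 | 1816 | 493 | 1694 | 354.1 | 2205 |
| $S^2_{Ydh}$ | 40.5 | 136775663 | 71715.5 | 4652460 | 135721 | 5643626 | 81863.7 | 10236271 |
| $S^2_{Xdh}$ | 101250 | 81259476 | 29239.7 | 2530489 | 162 | 2354475 | 18128.3 | 2902966 |
| $S_{XYdh}$ | -2025 | 104701660 | 34761.1 | 3144594 | -4689 | 2999017 | 13306 | 3557068 |
| $\rho_{XYdh}$ | -1.000 | 0.993 | 0.759 | 0.916 | -1.000 | 0.823 | 0.345 | 0.653 |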

Table 1 Value of parameters of the strata (1 and 2) and domains
Source: Statistical computation from original data 2023.

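| Domain | Stratum | $S^2_{Ydh2}$ | $S^2_{Xdh2}$ | $S_{XYdh2}$ | $K_{dh}$ | $n_{dh2}$ | $W_{dh2}$ | $n_{dh1}$ | $n_{dh}$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 0 | 3 | 0 | 0.3 | 0 | 0 |
| 1 | 2 | 223415888 | 132328227 | 171255299 | 2 | 4 | 0.3 | 10 | 14 |
| 2 | 1 | 120583 | 9862.3 | 28095.5 | 2 | 1 | 0.3 | 2 | 3 |
| 2 | 2 | 3977507 | 2517165 | 2669307 | 2 | 3 | 0.3 | 8 | 11 |
| 3 | 1 | 0 | 0 | 0 | 2 | 0 | 0.3 | 0 | 0 |
| 3 | 2 | 987699 | 1771955 | 1137277 | 3 | 1 | 0.3 | 3 | 4 |
| 4 | 1 | 79129 | 20311.8 | 8114.3 | 2 | 3 | 0.3 | 6 | 9 |
| 4 | 2 | 141512 | 5618 | -28196 | 2 | 1 | 0.3 | 1 | 2 |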
Table 2 The parameter values of strata (1 and 2) for domains (1, 2, 3 and 4) in case 1
Source: Statistical computation from original data 2023.

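| Domain | Stratum | $S^2_{Ydh2}$ | $S^2_{Xdh2}$ | $S_{XYdh2}$ | $k_{dh}$ | $n_{dh2}$ | $W_{dh2}$ | $n_{dh1}$ | $n_{dh}$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0.2 | 0 | 0 |
| 1 | 2 | 256872328 | 152987958 | 197463945 | 3 | 5 | 0.4 | 7 | 12 |
| 2 | 1 | 120583 | 9862.3 | 28095.5 | 2 | 1 | 0.2 | 2 | 3 |
| 2 | 2 | 3977507 | 2517165 | 2669307 | 3 | 4 | 0.4 | 7 | 11 |
| 3 | 1 | 0 | 0 | 0 | 2 | 0 | 0.2 | 0 | 0 |
| 3 | 2 | 987699 | 1771955 | 1137277 | 4 | 2 | 0.4 | 2 | 4 |
| 4 | 1 | 79129 | 20311.8 | 8114.3 | 2 | 2 | 0.2 | 7 | 9 |
| 4 | 2 | 141512 | 5618 | -28196 | 3 | 1 | 0.4 | 1 | 2 |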

Table 3 The parameter values of Strata (1 and 2) for domain (1,2,3 and 4) in case 2
Source: Statistical computation from original data 2023.

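| Domain size | 25 | 25 | 32 | 32 | 15 | 15 | 29 | 29 |
|---|---|---|---|---|---|---|---|---|
| Stratum | 1 | 2 | 1 | 2 | 1 | 2 | 1 | 2 |
| $N_{dh}$ | 2 | 23 | 14 | 18 | 7 | 8 | 20 | 9 |
| $W_{dh}$ | 0.08 | 0.92 | 0.438 | 0.563 | 0.467 | 0.533 | 0.69 | 0.31 |
| $\bar Y_{dh}$ | 75.5 | 594 | 67.5 | 260.6 | 73 | 315 | 44.55 | 345.2 |
| $\bar X_{dh}$ | 9.00 | 67.1 | 10.643 | 34.5 | 10.714 | 40.63 | 6.55 | 41.89 |
| $S^2_{Ydh}$ | 840.5 | 1551426 | 275.96 | 41200.8 | 369.67 | 51631.7 | 187.21 | 54848.9 |
| $S^2_{Xdh}$ | 18.00 | 16649.6 | 4.555 | 544.97 | 6.905 | 731.13 | 5.103 | 681.61 |
| $S_{XYdh}$ | 123 | 160633.5 | 32.038 | 4559.147 | 49.5 | 6116.143 | 29.839 | 6076.778 |
| $\rho_{XYdh}$ | 1.00 | 0.999 | 0.904 | 0.962 | 0.98 | 0.995 | 0.965 | 0.994 |
| $\lambda_{12}$ | 0.003414 | 0.0000163 | 0.0004078 | 0.0000537 | 0.005061 | 0.0001168 | 0.0003394 | 0.0000987 |
| $\lambda_{03}$ | 0.001047 | 0.0000066 | 0.000086 | 0.0000156 | 0.000813 | 0.0000207 | 0.0000813 | 0.0000208 |
| $\lambda_{04}$ | 0.250000 | 0.5941080 | 0.407718 | 0.543903 | 0.001194 | 0.417192 | 0.221445 | 0.283596 |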

Table 4 The parameter value of the strata for the domains (1, 2, 3 and 4)

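| Domain | Stratum | $S^2_{Ydh2}$ | $S^2_{Xdh2}$ | $S_{XYdh2}$ | $k_{dh}$ | $n_{dh2}$ | $W_{dh2}$ | $n_{dh1}$ | $n_{dh}$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0.3 | 0 | 0 |
| 1 | 2 | 2478255 | 26533 | 256331 | 2 | 4 | 0.3 | 10 | 14 |
| 2 | 1 | 373.7 | 4.200 | 35.05 | 2 | 2 | 0.3 | 3 | 5 |
| 2 | 2 | 64541.6 | 875.25 | 7146.75 | 2 | 3 | 0.3 | 6 | 9 |
| 3 | 1 | 0 | 0 | 0 | 2 | 0 | 0.3 | 0 | 0 |
| 3 | 2 | 0 | 0 | 0 | 2 | 0 | 0.3 | 0 | 0 |
| 4 | 1 | 168.16 | 5.018 | 27.945 | 3 | 3 | 0.3 | 8 | 11 |
| 4 | 2 | 0 | 0 | 0 | 2 | 0 | 0.3 | 0 | 0 |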

Table 5 The parameter values of Strata (1 and 2) for domain (1,2,3 and 4) in case 1
Source: Statistical computation from original data 2023.

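| Domain | Stratum | $S^2_{Ydh2}$ | $S^2_{Xdh2}$ | $S_{XYdh2}$ | $k_{dh}$ | $n_{dh2}$ | $W_{dh2}$ | $n_{dh1}$ | $n_{dh}$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 0 | 2 | 0 | 0.2 | 0 | 0 |
| 1 | 2 | 2654913 | 28395.1 | 274463.7 | 3 | 5 | 0.4 | 8 | 13 |
| 2 | 1 | 176.25 | 1.333 | 9.667 | 2 | 1 | 0.2 | 3 | 4 |
| 2 | 2 | 64184.8 | 874.3 | 7069.214 | 2 | 3 | 0.4 | 5 | 8 |
| 3 | 1 | 0 | 0 | 0 | 2 | 0 | 0.2 | 0 | 0 |
| 3 | 2 | 0 | 0 | 0 | 3 | 0 | 0.4 | 0 | 0 |
| 4 | 1 | 169.778 | 5.511 | 30 | 2 | 2 | 0.2 | 8 | 10 |
| 4 | 2 | 0 | 0 | 0 | 3 | 0 | 0.4 | 0 | 0 |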

Table 6 Parameter values of strata (1 and 2) for each domain in the case 2

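| Case | Estimator | Domain 1 | Domain 2 | Domain 3 | Domain 4 | AMSE |
|---|---|---|---|---|---|---|
| Case 1 (Population 1) | $t_{2j}$ | 1950061 | 74298 | 817651 | 367172 | 802295.5 |
| Case 1 (Population 1) | $T^*_{DG.st.1.d}$ | 969486 | 1524915 | 8225052 | 7635549 | 4588751 |
| Case 1 (Population 1) | $t_{exp1}$ | 3388887 | 2662270 | 2108968 | 1580771 | 2435224 |
| Case 1 (Population 1) | $t^*_{cal}$ | 389212 | 68914.02 | 613475 | 41078.01 | 278169.8 |
| Case 2 (Population 1) | $t_{2j}$ | 1724942 | 69614.41 | 771605.6 | 36752.6 | 650728.7 |
| Case 2 (Population 1) | $T^*_{DG.st.1.d}$ | 531387 | 334565 | 501732.17 | 1024511 | 598048.8 |
| Case 2 (Population 1) | $t_{exp1}$ | 1190074 | 28456.45 | 270455 | 927116.3 | 604025.4 |
| Case 2 (Population 1) | $t^*_{cal}$ | 31623 | 2178.416 | 35028.05 | 32543.11 | 25343.14 |
| Case 1 (Population 2) | $t_{2j}$ | 716146 | 115492 | - | 721 | 208089.8 |
| Case 1 (Population 2) | $T^*_{DG.st.1.d}$ | 279503 | 68413 | - | 409 | 87081.25 |
| Case 1 (Population 2) | $t_{exp1}$ | 30790869 | 615419.7 | - | 246 | 7851634 |
| Case 1 (Population 2) | $t^*_{cal}$ | 222071 | 21236 | - | 169 | 60869 |
| Case 2 (Population 2) | $t_{2j}$ | 11048.4 | 246.8 | - | 9 | 2826.05 |
| Case 2 (Population 2) | $T^*_{DG.st.1.d}$ | 11989.8 | 639.6 | - | 313 | 3235.6 |
| Case 2 (Population 2) | $t_{exp1}$ | 97184 | 942 | - | 206 | 24583 |
| Case 2 (Population 2) | $t^*_{cal}$ | 10431 | 127.3 | - | 4.8 | 2640.775 |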

Table 7 MSE of Estimators for domain mean in both cases 1 and 2(Population 1&2)
Note: AMSE, average mean square error.
Source: Statistical computation from original data 2023.
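The AMSE column of Table 7 can be reproduced from the domain-level MSE values. The sketch below rests on an inference from the tabulated numbers rather than on anything stated in the text: the AMSE appears to be the sum of the available domain MSEs divided by four, even for Population 2, where domain 3 has no entry. The function name and the use of `None` for a missing entry are hypothetical.

```python
def amse(mse_by_domain, n_domains=4):
    """Average MSE over domains; a missing entry (None) contributes nothing.

    Dividing by n_domains rather than by the number of available entries is an
    assumption inferred from the reported AMSE values.
    """
    return sum(v for v in mse_by_domain if v is not None) / n_domains


# Case 1, Population 1: all four domains are available.
amse_t2j_pop1 = amse([1950061, 74298, 817651, 367172])
amse_tcal_pop1 = amse([389212, 68914.02, 613475, 41078.01])

# Case 1, Population 2: domain 3 could not be estimated.
amse_t2j_pop2 = amse([716146, 115492, None, 721])
amse_tcal_pop2 = amse([222071, 21236, None, 169])
```

Under this reading, the Population 2 averages understate the typical MSE of the three domains that were actually estimated, which is worth bearing in mind when comparing AMSE across populations.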

 

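| Case | Estimator | D1 | D2 | D3 | D4 |
|---|---|---|---|---|---|
| Case 1 (Population 1) | $t_{2j}$ | 19.95897 | 92.75353 | 75.02895 | 11.18767 |
| Case 1 (Population 1) | $T^*_{DG.st.1.d}$ | 40.14622 | 4.519204 | 7.458615 | 0.537984 |
| Case 1 (Population 1) | $t_{exp1}$ | 11.48495 | 2.588544 | 29.08887 | 2.598606 |
| Case 1 (Population 1) | $t^*_{cal}$ | 100 | 100 | 100 | 100 |
| Case 2 (Population 1) | $t_{2j}$ | 1.833279 | 3.12926 | 4.539631 | 88.54642 |
| Case 2 (Population 1) | $T^*_{DG.st.1.d}$ | 5.95103 | 0.651119 | 6.981424 | 3.176453 |
| Case 2 (Population 1) | $t_{exp1}$ | 2.65723 | 7.655263 | 12.95153 | 3.510143 |
| Case 2 (Population 1) | $t^*_{cal}$ | 100 | 100 | 100 | 100 |
| Case 1 (Population 2) | $t_{2j}$ | 31.00918 | 18.38742 | 0 | 23.43967 |
| Case 1 (Population 2) | $T^*_{DG.st.1.d}$ | 79.4521 | 31.04088 | 0 | 41.32029 |
| Case 1 (Population 2) | $t_{exp1}$ | 0.721224 | 3.450653 | 0 | 68.69919 |
| Case 1 (Population 2) | $t^*_{cal}$ | 100 | 100 | 0 | 100 |
| Case 2 (Population 2) | $t_{2j}$ | 94.41186 | 51.58023 | 0 | 53.33333 |
| Case 2 (Population 2) | $T^*_{DG.st.1.d}$ | 86.99895 | 19.90306 | 0 | 1.533546 |
| Case 2 (Population 2) | $t_{exp1}$ | 10.73325 | 13.5138 | 0 | 2.330097 |
| Case 2 (Population 2) | $t^*_{cal}$ | 100 | 100 | 0 | 100 |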

Table 8 PRE of the estimators for domain mean in both cases 1 and 2(Population 1 and 2)

Conclusion

This study develops the concept of calibration estimator for ratio estimation and proposes calibration ratio estimators of population mean in single stage sampling. The study contributes to the theory of domain estimation in stratified random sampling of the population mean of the study variable with sub-sampling the non-respondents when there is non-response in the study variable and auxiliary variable is free from non-response.

The proposed class of estimators provides an opportunity for different known values of the domain population parameters of the auxiliary variable to be incorporated in constructing estimators in the presence of non-response, using the concept of calibration. The study revealed that the first constraint simply requires the calibration weights to sum to one, and that the third constraint, which involves the stratum variance, also contributes immensely to the efficiency of the proposed estimator. Furthermore, with the adoption of the procedure of sub-sampling the non-respondents, even with a ratio estimator, the study has revealed that subjecting an estimator to conditions where the study variable is affected by non-response while the auxiliary variable is free of non-response has no effect on the mean estimate.

From the efficiency comparison and empirical work, it becomes pertinent that the use of calibration technique has really paid off in providing estimates of the population mean with sub-sampling the non-respondents that provides greater gains in efficiency better than the existing estimators. This will proffer useful results to users of statistics and researchers when working on economic data that requires the use of auxiliary data either from the records or from previous survey.

However, it could be seen clearly from Table 7 that it was impossible to compute estimates for domain 3 in both cases of population II and hence, the mean square error was not computed. As a result, the PRE was accorded zero value. This is as a result of no sample size for both the respondents and the non-respondents as indicated in Table 6. Future research is encouraged in the light of this through the use of synthetic estimation technique.

Acknowledgments

None.

Conflicts of interest

The authors declare that there is no conflict of interest.

Funding

None.

References

  1. Deville JC, Särndal CE. Calibration estimators in survey sampling. JASA. 1992;87:376–382.
  2. Koyuncu N, Kadilar C. Calibration estimators using different measures in stratified random sampling. International Journal of Modern Engineering Research. 2013;3(1):415–419.
  3. Clement EP, Udofia GA, Enang EI. Sample design for domain calibration estimators. International Journal of Probability and Statistics. 2014;3(1):8–14.
  4. Clement EP, Enang EI. Calibration approach alternative ratio estimator for population mean in stratified sampling. International Journal of Statistics and Economics. 2015;16(1):83–93.
  5. Udofia GA. Ratio estimation for small domains with subsampling the non-respondents: an application of Rao strategy. Statistics in Transition. 2004;6(5):713–724.
  6. Rao PSRS. Ratio estimation with sub-sampling the non-respondents. Survey Methodology. 1986;12:217–230.
  7. Iseh MJ, Bassey MO. Calibration estimators for population mean with subsampling the nonrespondents under stratified sampling. Science Journal of Applied Mathematics and Statistics. 2022;10(4):45–56.
  8. Iseh M, Bassey M. Smoothing of estimators of population mean using calibration technique with sample errors. Journal of Modern Applied Statistical Methods. 2024;23(1).
  9. Cochran WG. Sampling Techniques. 3rd ed. New York: Wiley; 1977.
  10. Ashutosh. Estimator of domain mean using stratified sampling in the presence of non-response. Sri Lankan Journal of Applied Statistics. 2021;22(1):13–29.
  11. Clement EP, Inyang EJ. Improving the efficiency of ratio estimators by calibration weightings. International Journal of Statistics and Mathematics. 2021;8(1):164–172.
  12. Iseh MJ, Bassey KJ. A new calibration estimator of population mean for small area with nonresponse. Asian Journal of Probability and Statistics. 2021a;12(2):14–51.
  13. Iseh MJ, Bassey KJ. Calibration estimator for population mean in small sample size with non-response. European Journal of Statistics and Probability. 2021b;9(1):32–42.
  14. Iseh MJ, Enang EI. A calibration synthetic estimator of population mean in small area under stratified sampling design. Transition in Statistics new series. 2021;22(3):15–30.
  15. Pal SK, Singh HP. A class of ratio-cum-ratio-type exponential estimators for population mean with subsampling the non-respondents. Jordan Journal of Mathematics and Statistics. 2017;10(1):73–94.
  16. Särndal CE, Swensson B, Wretman J. Model-assisted surveys. New York: Springer-Verlag; 1992.
Creative Commons Attribution License

©2024 Ikot, et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and building upon the work non-commercially.