Society cannot run effectively on hunches or trial and error: decisions based on data generally give better results than those based on intuition or gut feeling. Statistics is a body of procedures for gathering, organizing, analyzing and presenting quantitative data, and it helps us turn data into information. In modern society, the need for statistical information seems endless. In particular, data are regularly collected to satisfy the need for information about specified sets of elements, called finite populations. One of the most important modes of data collection for satisfying such needs is the sample survey, that is, a partial investigation of the finite population; on the basis of this partial (sample) information, one tries to make inferences about the finite population characteristics (parameters). A sample survey is less expensive than a complete enumeration, is usually less time consuming, and may even be more accurate. The term sample refers to the set of units, or portion of the aggregate of material, that has been selected in the belief that it will be representative of the whole aggregate. Sampling theory deals with scientific and objective procedures for choosing an appropriate sampling design, i.e. selecting a sample that is representative of the population as a whole, and it also provides suitable procedures for estimating the population parameters. The greatest challenge to the representativeness of a sample is the effect of non-response on the estimation of the population parameter. Different authors have suggested different techniques for obtaining a reliable and efficient estimator, among which is the calibration technique.
Since its introduction by Deville JC, et al. [1], calibration estimation in sample surveys has developed into an established theory and method for estimating finite population parameters. Calibration of weights is a technique that uses population data on auxiliary variables to improve estimates in sample surveys: if auxiliary data are available, some improvement in the precision of the estimates may be achieved. Incorporating auxiliary data in the estimation process in this way is known as calibration. In stratified random sampling, the calibration approach is used to obtain optimum strata weights for improving the precision of survey estimates of population parameters. Koyuncu N, et al. [2] defined some calibration estimators in stratified random sampling for population characteristics, and Clement EP, et al. [3] applied the concept of calibration estimators to domain totals in stratified random sampling. Clement EP, et al. [4] combined some scalars with the mean of the auxiliary variable and proposed a calibration alternative ratio estimator of the mean in stratified sampling.
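As a minimal illustration of how calibration adjusts strata weights, the sketch below minimizes the chi-square distance between new and original weights subject to a single constraint: that the calibrated weights reproduce a known population mean of the auxiliary variable. This is the generic single-constraint form, not the estimator proposed in this paper; the function name, the tuning weights q and all numerical values are assumptions made for the example.

```python
def calibrate_stratum_weights(W, xbar, X_bar, q=None):
    """Chi-square calibration of stratum weights under one constraint.

    Minimizes sum((omega_h - W_h)**2 / (W_h * q_h)) subject to
    sum(omega_h * xbar_h) == X_bar. All names here are illustrative.
    """
    if q is None:
        q = [1.0] * len(W)  # assumption: unit tuning weights
    current = sum(w * x for w, x in zip(W, xbar))
    denom = sum(w * qh * x * x for w, qh, x in zip(W, q, xbar))
    lam = (X_bar - current) / denom  # Lagrange multiplier of the constraint
    return [w + lam * w * qh * x for w, qh, x in zip(W, q, xbar)]


# Hypothetical stratum weights, auxiliary sample means and known mean
W = [0.5, 0.3, 0.2]
xbar = [10.0, 20.0, 40.0]
X_bar = 20.0

omega = calibrate_stratum_weights(W, xbar, X_bar)
calibrated_mean = sum(o * x for o, x in zip(omega, xbar))
```

With the original weights the auxiliary means combine to 19; after calibration the same combination matches the assumed known mean of 20.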
When a researcher is interested in obtaining information for a local or small area, estimation becomes challenging when the sample size in some of the areas of interest is small, and even more difficult when non-response occurs. Several authors have attempted to obtain reliable estimates in such areas of interest, popularly called domains of study. Among them is Godwin A, et al. [5], who modified some of the procedures for global ratio estimation in single-phase sampling with sub-sampling of the non-respondents proposed by Rao P [6] to obtain an estimate of the mean for a small domain that cuts across the constituent strata of a population with unknown weights. The bias and mean square error of each of the modified estimators were obtained for comparison. However, the estimators were not subjected to numerical tests to validate the analytical claims, most importantly in areas with small or zero sample sizes. Unlike in Rao P [6], the population mean of the auxiliary variable in Godwin A, et al. [5] is assumed to be unknown before the start of the survey, and hence double sampling was applied under stratified simple random sampling.
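The idea of sub-sampling the non-respondents can be sketched with the classical Hansen and Hurwitz combination of the respondent mean and the mean of a followed-up sub-sample of non-respondents. This is offered as background under the assumption that the estimators discussed here build on that scheme; it is not the estimator of [5] or [6], and the function name and data are hypothetical.

```python
def hansen_hurwitz_mean(respondent_values, subsample_values, n_nonrespondents):
    """Combine respondents with a followed-up sub-sample of non-respondents.

    respondent_values : study values from the units that responded
    subsample_values  : study values from the sub-sampled non-respondents
    n_nonrespondents  : total number of non-respondents in the sample
    """
    n1 = len(respondent_values)
    n2 = n_nonrespondents
    mean_resp = sum(respondent_values) / n1
    mean_sub = sum(subsample_values) / len(subsample_values)
    # The sub-sample mean stands in for all n2 non-respondents
    return (n1 * mean_resp + n2 * mean_sub) / (n1 + n2)


# Hypothetical sample: 4 respondents, 6 non-respondents, 3 of them followed up
# (an inverse sampling rate of 2)
respondents = [10.0, 12.0, 14.0, 16.0]
followed_up = [20.0, 22.0, 24.0]
estimate = hansen_hurwitz_mean(respondents, followed_up, n_nonrespondents=6)
```

Using the respondent mean alone would give 13, whereas weighting the followed-up units by the full non-respondent count pulls the estimate towards the non-respondent stratum.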
In a bid to improve the efficiency of estimators under non-response, the authors of [7, 8] adopted the concept of calibration with a single constraint to estimate the population mean, and the results were encouraging. Cochran WG [9] showed that knowledge of the size of domain j, the domain of interest, reduces the variance of the estimator of the domain mean in a single-phase simple random design. The reduction in variance is greater when the proportion of non-domain elements in the population is large and the study variable varies little among the domain elements. Ashutosh [10] proposed estimators for the domain mean using stratified sampling with non-response and compared them with a direct ratio estimator for the domain mean under the same design. Clement EP, et al. [4] stated that, in the presence of powerful auxiliary variables, calibration estimation meets the objective of reducing both non-response bias and sampling error. Etebong P [11] developed a new approach to ratio estimation, based on calibration weightings, that produces a more efficient class of ratio estimators that do not depend on any optimality conditions for optimum performance. Iseh MJ, et al. [12-14] considered the challenges of estimating the population mean in a small area characterized by small or zero sample size and by unit non-response, and presented a calibration estimator that produces reliable estimates under stratified random sampling from a class of synthetic estimators, using the calibration approach with an alternative distance measure. To overcome the poor performance of the ratio estimator in small areas with small or zero sample sizes resulting from non-response, this work considers the calibration approach using three constraints: the equal weights adjustment criterion, and unbiased estimators of the population mean and variance of the auxiliary variable.
In this paper, building on the work of Godwin A, et al. [5], who suggested global ratio estimation in single-phase sampling with sub-sampling of the non-respondents to obtain an estimate of the mean for a small domain that cuts across the constituent strata of a population with unknown weights, we suggest a new improved ratio estimator for the population mean in stratified random sampling, using the theory of calibration estimation with three constraints to achieve optimal precision and efficiency.
Some existing estimators and theoretical underpinnings
This section considers some existing ratio estimators of the domain population mean and the theoretical underpinnings of the proposed ratio estimator. Although not much has been done on domains of study in the presence of non-response, probably because of the intricate nature of the estimation, this paper highlights some existing estimators applicable to domain estimation that use the concept of sub-sampling the non-respondents.
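For orientation, the classical ratio estimator of a mean, on which the estimators reviewed in this section are variations, can be sketched as follows. It scales the sample mean of the study variable by the ratio of the known auxiliary mean to its sample mean. The function name and the numbers are hypothetical, and the sketch ignores domains, strata and non-response.

```python
def ratio_estimate_of_mean(y_sample, x_sample, X_bar):
    """Classical ratio estimator: ybar * (X_bar / xbar).

    y_sample : sampled values of the study variable
    x_sample : sampled values of the auxiliary variable (same units)
    X_bar    : known population mean of the auxiliary variable
    """
    ybar = sum(y_sample) / len(y_sample)
    xbar = sum(x_sample) / len(x_sample)
    return ybar * X_bar / xbar


# Hypothetical sample in which x under-represents the population (xbar = 10 < 12)
y = [40.0, 50.0, 60.0]
x = [8.0, 10.0, 12.0]
ratio_mean = ratio_estimate_of_mean(y, x, X_bar=12.0)
```

Because the sample mean of x falls below its known population mean, the estimator revises the sample mean of y upwards in the same proportion.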
Some existing estimators
Study notations and definitions
Population size under study
Population size for the domain
Population size of the stratum in the domain
Sample size for the domain in the stratum
Domain sample size
Sample size of respondent units for the domain in the stratum
Sample size of non-respondent units for the domain in the stratum
The calibration weight
Stratum weight
Response rate of the domain in the stratum
Non-response rate of the domain in the stratum
The Lagrange multipliers
Auxiliary variable
Study variable
Sample mean of the auxiliary variable for the domain in the stratum
Unbiased estimator of the population mean of the study variable for the domain in the stratum
Population mean of the auxiliary variable for the domain in the stratum
Population mean of the study variable for the domain in the stratum
Population mean of the auxiliary variable for the domain
Population mean of the study variable for the domain
Mean square of the study variable for the domain in the stratum
Mean square of the auxiliary variable for the domain in the stratum
Coefficient of variation of the auxiliary variable for the domain in the stratum
Coefficient of variation of the study variable for the domain in the stratum
Mean square of the study variable among non-respondents for the domain in the stratum
Inverse sampling rate
Tuning parameter
Udofia (2004) estimator
An alternative ratio estimator for the domain mean, suggested by [5], is as follows:
(1)
With
and
(2)
where
Pal and Singh HP estimator
Pal and Singh [15] proposed a class of ratio-cum-ratio-type exponential estimators for the population mean with sub-sampling of the non-respondents. The estimator and its mean square error are given as:
And
(3)
Where
and α is a constant
Ashutosh estimator
Ashutosh [10] proposed a direct ratio generalized estimator for the domain mean through stratified sampling with non-response as:
Where β is a chosen constant of
domain mean of x and the value of y respondents can be written as;
Members of the proposed estimators
if
if
if
if
Bias and Mean Square Error of
are given as:
(4)
Sampling design in single phase
Let
denote a finite population, the elements of which fall into L known strata with
elements in the
stratum,
. It is assumed that π can also be partitioned according to the distribution of variable Z into exhaustive set of D sub-populations or domains of study that is denoted by
. Each stratum consists of a substratum of
respondents and a substratum of
non-respondents,
for all h. Let
denote the part of domain
in stratum h and
the unknown number of elements in
. Let
denote the value of characteristic Y for element i in
.
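The cross-classification described above, in which a domain cuts across several strata, can be sketched as a grouping of sampled units by (stratum, domain) cell. The record layout, the function names and the data are assumptions made for illustration. In the design itself the population cell sizes are unknown, so only the sample counts are observed.

```python
from collections import defaultdict


def cross_classify(units):
    """Group study-variable values by (stratum, domain) cell."""
    cells = defaultdict(list)
    for stratum, domain, y in units:
        cells[(stratum, domain)].append(y)
    return dict(cells)


def domain_values(cells, domain):
    """Collect the values of one domain across every stratum it cuts through."""
    return [y for (h, d), ys in sorted(cells.items()) if d == domain for y in ys]


# Hypothetical sampled records: (stratum h, domain label, value of Y)
units = [
    (1, "A", 12.0), (1, "B", 15.0),
    (2, "A", 30.0), (2, "A", 34.0), (2, "B", 41.0),
]
cells = cross_classify(units)
cell_sizes = {cell: len(ys) for cell, ys in cells.items()}
domain_a = domain_values(cells, "A")
```

Each cell count plays the role of the observed sample size for one domain within one stratum, and a cell that is empty or absent corresponds to the zero sample size situation discussed in the introduction.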
Proposed estimator
Calibration is an established estimation technique for adjusting an existing estimator to obtain better precision and improved efficiency. For household surveys and other economic data that require supplementary information, a new ratio estimator is suggested to enhance efficiency in domains of study, even in the presence of non-response. Motivated by the alternative ratio estimator for the domain mean in [5], we propose the following estimator:
(5)
(5) can be written as
(6)
where
and
is assumed to be known and
is the calibration weight aimed at adjusting the existing weight in the estimators of [5] using a chi-square distance measure.
Subject to the following constraints
Thus the optimization problem is given by:
where
and
are the Lagrange multipliers such that
Substituting
in Eq. 6 gives
(7)
Where
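A generic numerical sketch of chi-square calibration under several linear constraints may help to fix ideas. It assumes constraints of the kind described in the introduction (the calibrated weights keep the total of the original weights and reproduce a known mean and a known variance of the auxiliary variable), and it assumes more strata than constraints so that the linear system for the Lagrange multipliers is solvable. The function names, the unit tuning weights and all numbers are hypothetical, and the code does not reproduce the closed form in Eq. (7).

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("constraints are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def calibrate(W, constraints, targets, q=None):
    """Minimize sum((omega - W)**2 / (W * q)) subject to linear constraints.

    constraints[i][h] is the coefficient of stratum h in constraint i and
    targets[i] is the value that constraint must reach.
    """
    L, m = len(W), len(constraints)
    q = q or [1.0] * L  # assumption: unit tuning weights
    d = [W[h] * q[h] for h in range(L)]
    # Normal equations for the Lagrange multipliers
    G = [[sum(d[h] * constraints[i][h] * constraints[j][h] for h in range(L))
          for j in range(m)] for i in range(m)]
    gap = [targets[i] - sum(W[h] * constraints[i][h] for h in range(L))
           for i in range(m)]
    lam = solve_linear_system(G, gap)
    return [W[h] + d[h] * sum(lam[i] * constraints[i][h] for i in range(m))
            for h in range(L)]


# Hypothetical strata: weights, auxiliary means and auxiliary variances
W = [0.4, 0.3, 0.2, 0.1]
xbar = [10.0, 20.0, 30.0, 50.0]
s2x = [4.0, 9.0, 16.0, 36.0]
rows = [[1.0] * 4, xbar, s2x]
targets = [sum(W), 22.0, 12.0]  # weight total, assumed known mean and variance

omega = calibrate(W, rows, targets)
achieved = [sum(o * c for o, c in zip(omega, row)) for row in rows]
```

The list achieved should match targets up to rounding error. Nothing in this formulation prevents a calibrated weight from becoming negative, which is one reason the choice of constraints and tuning parameter matters in practice.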
Bias and variance of the proposed estimator
From the proposed estimator above
Let
,
,
Where
,
,
,
and
Also,
Let
where
And
Hence
Where
To obtain the bias
(8)
ignoring terms with power >2
(9)
To obtain minimum variance, we differentiate (9) partially with respect to and
Such that
(10)
(11)
(12)
Equation (12) is the minimum variance of the proposed estimator.
Percentage relative efficiency of the estimators
The percentage relative efficiency of the proposed estimators with respect to the existing estimators is given as:
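A small sketch of this comparison, assuming the usual definition in which the mean square error (or variance) of the existing estimator is divided by that of the proposed estimator and multiplied by 100. The function name and the numerical values are hypothetical and are not results from this study.

```python
def percentage_relative_efficiency(mse_existing, mse_proposed):
    """PRE of a proposed estimator relative to an existing one, in percent."""
    if mse_proposed <= 0:
        raise ValueError("mse_proposed must be positive")
    return 100.0 * mse_existing / mse_proposed


# Hypothetical mean square errors
pre = percentage_relative_efficiency(mse_existing=2.5, mse_proposed=2.0)
```

Under this definition a value above 100 means the proposed estimator has the smaller mean square error.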
Empirical study
We use the MU284 population of Swedish municipalities [16] (Appendix B). The population is divided geographically into eight regions (domains), numbered 1 to 8, of sizes 25, 48, 32, 38, 56, 41, 15 and 29 respectively. However, we consider only four domains, 1, 3, 7 and 8, because these domains contain fewer units than the others. The proposed estimator is a calibration estimator. Variables like
and
were computed from existing information on the populations. Each domain is then divided, for convenience, into two homogeneous strata: values below 1500 (millions of kronor) and values above 1500 (millions of kronor). We consider two cases of non-response, Case 1 and Case 2, in both Population I and Population II.
Case 1: Non-response occurs at the same rate, approximately 30%, in both strata (1 and 2) and across the domains.
Case 2: Non-response rates differ between the strata, approximately 20% in stratum 1 and 40% in stratum 2.
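The set-up of the two strata and of a non-response case can be sketched as follows. The way non-respondents are designated here (a seeded random draw of a fixed share of each stratum) is an assumption for illustration and may differ from the procedure actually used; the values are hypothetical rather than MU284 data, and placing a value equal to the threshold in the upper stratum is also an assumption.

```python
import random


def split_into_strata(values, threshold):
    """Stratum 1: values below the threshold; stratum 2: the rest."""
    below = [v for v in values if v < threshold]
    above = [v for v in values if v >= threshold]  # assumption: ties go up
    return below, above


def assign_nonresponse(units, rate, rng):
    """Mark round(rate * len(units)) randomly chosen units as non-respondents."""
    n_non = round(rate * len(units))
    non_idx = set(rng.sample(range(len(units)), n_non))
    respondents = [u for i, u in enumerate(units) if i not in non_idx]
    nonrespondents = [u for i, u in enumerate(units) if i in non_idx]
    return respondents, nonrespondents


# Hypothetical values of the stratification variable (millions of kronor)
values = [300, 800, 1200, 1400, 900, 1600, 2500, 4000, 1800, 2200]
stratum1, stratum2 = split_into_strata(values, threshold=1500)

rng = random.Random(2024)  # fixed seed so the illustration is repeatable
# Case 2: about 20% non-response in stratum 1 and 40% in stratum 2
resp1, non1 = assign_nonresponse(stratum1, 0.20, rng)
resp2, non2 = assign_nonresponse(stratum2, 0.40, rng)
```

Case 1 would use the same rate (about 0.30) in both calls.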
Population I
Y: Real estate values according to 1984 assessment (in millions of kronor).
X: Total number of municipal employees in 1984.
Population II
Another population from [16] (Appendix B) is also considered. It is classified into four domains, each with strata 1 and 2 defined by revenues of less than 100 (millions of kronor) and revenues above 100 (millions of kronor).
Y: Revenues of 1985 municipal taxation assessment (in millions of kronor).
X: 1985 population (in thousands).