Let $U = \{U_1, U_2, \ldots, U_N\}$ be a finite population with $N$ distinct and identifiable units, and let $Y$ be a real variable with value $Y_i$ measured on $U_i$, $i = 1, 2, \ldots, N$, giving a vector of measurements $Y = (Y_1, Y_2, \ldots, Y_N)$. If $Y_i = a + ib$, $i = 1, 2, \ldots, N$, then the population is called a labelled population with a perfect linear trend among the population values. The problem, in general, is to estimate the population mean $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$ on the basis of a sample of size $n$ selected from the finite population $U$. Any ordered sequence of $n$ units drawn from $U$ is called a sample of size $n$. Several sampling schemes, such as simple random sampling without replacement (SRSWOR) and systematic sampling, are available in the literature for selecting a sample of fixed size $n$ from a finite population of size $N$. For a finite population with a linear trend among the population values and $N = kn$, linear systematic sampling (LSS) is normally recommended for selecting a random sample of fixed size $n$. Further, it has been shown algebraically that the estimator from linear systematic sampling is better than the estimator provided by simple random sampling without replacement in the presence of a linear trend.
The performance of the systematic sample mean can be improved further by modifying the selection of the samples, as in centered systematic sampling,1 balanced systematic sampling2 and modified systematic sampling,3 and also by changing the estimator itself, as in Yates type end corrections.4 In recent times several attempts have been made to find an alternative to LSS. In this connection the following works are worth noting: Diagonal systematic sampling,5 Generalized diagonal systematic sampling,6,7 Determinant sampling,8 Modified linear systematic sampling,9-11 Generalized modified linear systematic sampling,12 Star type systematic sampling,13 Remainder linear systematic sampling,14 Generalized systematic sampling,15 Modified balanced circular systematic sampling,16 and modified systematic sampling by Huang,17 Lahiri,18 Leu & Tsui,19 Sampath & Uthayakumaran,20 Singh & Garg,21 Singh & Singh,22 and Uthayakumaran.23
For further discussions on linear systematic sampling the readers are referred to Cochran,24 Gautschi,25 Khan et al.,26,27 Gupta & Kabe,28 Murthy,29 Singh,30 Sukhatme et al.,31 Fountain & Pathak,32 Mukerjee & Sengupta,33 Murthy & Rao,34 Wu,35 Zinger36 and the references cited therein.
If the population size $N$ is not a multiple of the sample size $n$, then linear systematic sampling is not applicable for selecting a sample of fixed size $n$. In such situations, circular systematic sampling (CSS), introduced by Lahiri18 and cited in Murthy,29 provides a constant sample size $n$, and the selected units are distinct if and only if $N$ and $k$ are relatively prime numbers. However, the circular systematic samples are multiple copies of linear systematic samples when $N = kn$, and CSS repeats sampled units when $N$ and $k$ are not relatively prime numbers. As pointed out by Subramani et al.,13 the problems in circular systematic sampling are the following:
- The choice of the sampling interval that ensures distinct units in the sample and minimum variance.
- The explicit expression for the variance of the CSS sample mean, which is needed to assess the efficiency of CSS relative to other sampling schemes, particularly SRSWOR.
Several attempts have been made in the past to find a suitable value of the sampling interval $k$ for given values of the population size $N$ and the sample size $n$. Murthy & Rao34 have given the choice for $k$ as “It may be noted that the sample mean is unbiased for the population mean for all values of $k$, though the spread of the sample and hence efficiency is better if $k$ is taken as an integer nearest to $N/n$
. However, if repetition of the same unit in a sample is to be avoided, then it is desirable to take the sampling interval as
. It is shown that necessary and sufficient condition for all samples in CSS to have distinct units is that $N$ and $k$ are relatively co-prime”.37 Bellhouse38–40 has suggested that the choice for the sampling interval
when
and
when
.
Sengupta & Chattopadhyay33 have proposed the following: “A necessary and sufficient condition for a circular systematic sampling of size $n$, drawn from a population of $N$ units with sampling interval $k$, to contain all distinct units is that
or equivalently,
where
and
denote respectively the least common multiple and the greatest common divisor of $N$ and $k$”. However, it seems that no theoretical result or empirical study is available to justify a choice of $k$ that gives an efficient estimator, or the estimator with minimum variance, compared with other choices of $k$.
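The distinctness condition quoted above can be checked numerically. The following is a minimal sketch, assuming the condition takes the standard number-theoretic form $n \le N/\gcd(N, k)$, which is the same as $n \le \mathrm{lcm}(N, k)/k$; the function names are illustrative and do not come from the cited papers. It compares that closed form with a brute-force enumeration of all $N$ random starts.

```python
from math import gcd

def css_sample_labels(N, n, k, start):
    """Labels (1..N) picked by circular systematic sampling from a given start."""
    return [((start - 1 + j * k) % N) + 1 for j in range(n)]

def all_samples_distinct(N, n, k):
    """Brute force: does every random start 1..N give n distinct units?"""
    return all(
        len(set(css_sample_labels(N, n, k, start))) == n
        for start in range(1, N + 1)
    )

def distinct_by_gcd(N, n, k):
    """Assumed closed form: n <= N / gcd(N, k), i.e. n <= lcm(N, k) / k."""
    return n <= N // gcd(N, k)

N, n = 12, 5
# The two checks should agree for every candidate interval.
for k in range(1, N):
    assert all_samples_distinct(N, n, k) == distinct_by_gcd(N, n, k)

distinct_k = [k for k in range(1, N) if distinct_by_gcd(N, n, k)]
print(distinct_k)
```

For $N = 12$ and $n = 5$ the check also passes for $k = 2$ and $k = 10$, which share a factor with $N$, so in this small case coprimality of $N$ and $k$ is sufficient for distinct units but not required.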
Recently Subramani et al.,41 and Subramani & Singh42 have made attempts to address the above problems and introduced the optimal circular systematic sampling (OCSS) together with the explicit expressions for its variance and the sampling interval in the presence of linear trend. In OCSS, the choice for the sampling interval k is
(1.1)
where
represents
For the hypothetical population with values
,
The variance of OCSS sample mean is given as
(1.2)
The variance of SRSWOR sample mean is given as
(1.3)
Example 1.1: The procedure for obtaining the optimum value of $k$ is explained for fixed values of the sample size, $n = 5$, and the population size, $N = 12$.
If
and
then
. That is
If
and
then
. That is
The selected OCSS samples, their means, expected value and variance are given for the sampling intervals $k = 7$ and $k = 5$ in Tables 1.1 and 1.2, respectively.
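For both sampling intervals, $k = 5$ and $k = 7$, the mean of the 12 OCSS sample means is 6.5, which equals the population mean, and the variance of the OCSS sample mean is 0.4767.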
The value of the variance given above coincides with the value obtained from the formula given in (1.2).
Sample Number | Sample Values | OCSS Mean
1 | 1, 6, 11, 4, 9 | 6.2
2 | 2, 7, 12, 5, 10 | 7.2
3 | 3, 8, 1, 6, 11 | 5.8
4 | 4, 9, 2, 7, 12 | 6.8
5 | 5, 10, 3, 8, 1 | 5.4
6 | 6, 11, 4, 9, 2 | 6.4
7 | 7, 12, 5, 10, 3 | 7.4
8 | 8, 1, 6, 11, 4 | 6.0
9 | 9, 2, 7, 12, 5 | 7.0
10 | 10, 3, 8, 1, 6 | 5.6
11 | 11, 4, 9, 2, 7 | 6.6
12 | 12, 5, 10, 3, 8 | 7.6
Table 1.2 OCSS samples and their means for the sampling interval $k = 5$
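The numbers in Example 1.1 can be reproduced by direct enumeration. The sketch below assumes the hypothetical population $1, 2, \ldots, 12$ with $n = 5$ and lists every circular sample for $k = 5$ and $k = 7$; it uses exact fractions so that the expected value and variance carry no rounding error. For comparison it also evaluates the textbook SRSWOR variance $\frac{N-n}{Nn}S^2$, with $S^2$ computed using the divisor $N-1$. The helper names are illustrative only.

```python
from fractions import Fraction

def css_samples(values, n, k):
    """All N circular systematic samples (one per random start) for interval k."""
    N = len(values)
    return [[values[(i + j * k) % N] for j in range(n)] for i in range(N)]

def mean_and_variance(samples):
    """Expected value and variance of the sample mean over equally likely samples."""
    means = [Fraction(sum(s), len(s)) for s in samples]
    expected = sum(means) / len(means)
    variance = sum((m - expected) ** 2 for m in means) / len(means)
    return expected, variance

values = list(range(1, 13))   # assumed population 1, 2, ..., 12
N, n = len(values), 5

results = {k: mean_and_variance(css_samples(values, n, k)) for k in (5, 7)}

# SRSWOR variance of the sample mean, (N - n) / (N n) * S^2
pop_mean = Fraction(sum(values), N)
S2 = sum((y - pop_mean) ** 2 for y in values) / (N - 1)
srswor_var = Fraction(N - n, N * n) * S2

for k, (expected, variance) in results.items():
    print(k, float(expected), round(float(variance), 4))
print("SRSWOR", round(float(srswor_var), 4))
```

Both intervals give the same set of twelve sample means, which is why the expected value and the variance agree for $k = 5$ and $k = 7$.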
Further, it seems that no attempt has been made to derive an explicit expression for the variance of the circular systematic sample mean for the case of a labelled population with a perfect linear trend, even 65 years after the introduction of CSS. As a consequence, the efficiency of circular systematic sampling has not been compared algebraically with that of simple random sampling without replacement.
The points noted above motivate the present study, which deals with the following:
- To derive the explicit expression for the variance of the circular systematic sample mean for a population with a perfect linear trend among the population values.
- To derive the explicit expressions for the Yates type end corrections for further improvement of circular systematic sampling.
- To assess the performance of circular systematic sampling relative to simple random sampling without replacement and optimal circular systematic sampling, both algebraically and for certain natural populations.
- To deduce the optimum values of the sampling fraction and the optimum variance for circular systematic sampling.
Circular systematic sampling
As stated earlier, LSS is not applicable for selecting a sample of fixed size $n$ when the population size $N$ is not a multiple of the sample size $n$, whereas CSS, introduced by Lahiri18 and cited in Murthy,29 provides a constant sample size $n$. The steps involved in CSS for selecting a sample of size $n$ with sampling interval $k$ are given below:
Step 1: Arrange the $N$ population units around a circle.
Step 2: Select a random number $i$ such that $1 \le i \le N$.
Step 3: To select a circular systematic sample of size $n$, select every $k$-th element from the random start $i$ in the circle until $n$ elements are accumulated.
The selected units $U_i, U_{i+k}, \ldots, U_{i+(n-1)k}$ form the circular systematic sample of size $n$ for the random start $i$. If $i + jk > N$, then select the unit whose label is $i + jk$ reduced modulo $N$.
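The selection steps above can be sketched as a short function. This is a minimal illustration rather than a reference implementation: the function name, the optional `start` argument (used to make a draw reproducible) and the use of Python's `random` module are assumptions made here, and the wrap-around is handled by modular arithmetic on the labels $1, \ldots, N$.

```python
import random

def circular_systematic_sample(units, n, k, start=None, rng=random):
    """Draw one circular systematic sample of size n with sampling interval k.

    units : population units in their labelled order (labels 1..N)
    start : random start in 1..N; drawn uniformly when omitted (Step 2)
    """
    N = len(units)
    if start is None:
        start = rng.randint(1, N)          # Step 2: random start, 1 <= start <= N
    if not 1 <= start <= N:
        raise ValueError("start must lie between 1 and N")
    # Step 3: take every k-th unit around the circle until n units are collected.
    labels = [((start - 1 + j * k) % N) + 1 for j in range(n)]
    return start, [units[label - 1] for label in labels]

population = list(range(1, 13))
print(circular_systematic_sample(population, n=5, k=2, start=11))
```

With `start=11`, `n=5` and `k=2` on the population $1, \ldots, 12$ the selected labels wrap past 12 and continue from 1.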
The variance of the circular systematic sample mean is obtained as given below:
(2.1)
Example 2.1: The procedure for obtaining the value of the sampling interval $k$ in the case of circular systematic sampling is explained for fixed values of the sample size, $n = 5$, and the population size, $N = 12$.
If $N = 12$ and $n = 5$ then $N/n = 2.4$. That is, $k = 2$.
The selected CSS samples, their means, expected value and variance are given for the sampling interval $k = 2$ in the following Table 2.1.
Sample number | Sample values | CSS mean
1 | 1, 3, 5, 7, 9 | 5
2 | 2, 4, 6, 8, 10 | 6
3 | 3, 5, 7, 9, 11 | 7
4 | 4, 6, 8, 10, 12 | 8
5 | 5, 7, 9, 11, 1 | 6.6
6 | 6, 8, 10, 12, 2 | 7.6
7 | 7, 9, 11, 1, 3 | 6.2
8 | 8, 10, 12, 2, 4 | 7.2
9 | 9, 11, 1, 3, 5 | 5.8
10 | 10, 12, 2, 4, 6 | 6.8
11 | 11, 1, 3, 5, 7 | 5.4
12 | 12, 2, 4, 6, 8 | 6.4
Table 2.1 CSS samples and their means for the sampling interval $k = 2$
For the cases of sampling interval and , it is obtained that
Computation of circular systematic sample means
Consider the labelled population with the population values $Y_i = a + ib$, $i = 1, 2, \ldots, N$.
The population mean is
$\bar{Y} = a + \frac{(N+1)b}{2}$ (2.2)
After a little algebra the circular systematic sample means are obtained as:
(2.3)
Remark 2.1: Since
then
From the above expressions the sum of the CSS sample means is obtained as
(2.4)
That is, the CSS sample mean is an unbiased estimator for its population mean.
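The unbiasedness statement can be verified numerically for any sampling interval. The sketch below assumes a perfect linear trend of the form $Y_i = a + ib$ with the illustrative values $a = 3$, $b = 2$, $N = 12$ and $n = 5$, none of which come from the paper, and checks that the average of the $N$ CSS sample means equals the population mean for every $k$ from 1 to $N - 1$.

```python
from fractions import Fraction

def css_means(a, b, N, n, k):
    """CSS sample means for all N random starts under Y_i = a + i*b (assumed form)."""
    values = [a + b * i for i in range(1, N + 1)]
    return [
        Fraction(sum(values[(i + j * k) % N] for j in range(n)), n)
        for i in range(N)
    ]

a, b, N, n = 3, 2, 12, 5                       # illustrative values only
pop_mean = Fraction(2 * a + (N + 1) * b, 2)    # a + (N + 1) b / 2

# Each random start is equally likely, so the expectation is the plain average.
unbiased = {
    k: sum(css_means(a, b, N, n, k)) / N == pop_mean
    for k in range(1, N)
}
print(unbiased)
```

The check succeeds for every $k$ because each unit is selected the same number of times across the $N$ random starts, whether or not the units within a sample are distinct.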
Computation of variance of circular systematic sample mean
For the labelled population and the corresponding CSS sample means defined in Section 2.1, the derivation of the variance of circular systematic sample mean is given below.
Consider
(2.5)
By substituting the CSS sample means and the population means in the above expression, the variance of CSS sample mean for the labelled population is obtained as
After a little algebra, the variance of CSS sample mean is obtained as
By simplifying the above expression one may get
(2.6)
Computation of optimum values for the sampling fraction and the variance of circular systematic sample mean
We know that the sampling fraction is obtained as or the positive integer closest to . Without loss of generality, let us assume that . That is is the difference between and and . By replacing the values of k in the variance expression, one may get
By simplifying the above expression one may get
(2.7)
The above expression attains minimum at , which implies or .
That is, the optimum variance of CSS sample mean is exactly the same as given in (1.2).
Hence we conclude that the optimum value of the sampling fraction is obtained as stated by Subramani et al.,41 and Subramani & Singh42 as given in (Table 1.1)
Sample number | Sample values | OCSS mean
1 | 1, 8, 3, 10, 5 | 5.4
2 | 2, 9, 4, 11, 6 | 6.4
3 | 3, 10, 5, 12, 7 | 7.4
4 | 4, 11, 6, 1, 8 | 6
5 | 5, 12, 7, 2, 9 | 7
6 | 6, 1, 8, 3, 10 | 5.6
7 | 7, 2, 9, 4, 11 | 6.6
8 | 8, 3, 10, 5, 12 | 7.6
9 | 9, 4, 11, 6, 1 | 6.2
10 | 10, 5, 12, 7, 2 | 7.2
11 | 11, 6, 1, 8, 3 | 5.8
12 | 12, 7, 2, 9, 4 | 6.8
Table 1.1 OCSS samples and their means for the sampling interval $k = 7$
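The optimum interval can also be located by brute force for a small case. The following sketch assumes the population $1, 2, \ldots, 12$ with $n = 5$, computes the exact variance of the CSS sample mean for every interval $k = 1, \ldots, N - 1$, and reports the intervals that attain the minimum. It is a numerical check for this one configuration, not a substitute for the algebraic derivation.

```python
from fractions import Fraction

def css_variance(N, n, k):
    """Exact variance of the CSS sample mean for the population 1, 2, ..., N."""
    values = range(1, N + 1)
    means = [
        Fraction(sum(values[(i + j * k) % N] for j in range(n)), n)
        for i in range(N)
    ]
    centre = Fraction(N + 1, 2)   # population mean; the sample mean is unbiased
    return sum((m - centre) ** 2 for m in means) / N

N, n = 12, 5
variances = {k: css_variance(N, n, k) for k in range(1, N)}
best = min(variances.values())
best_k = sorted(k for k, v in variances.items() if v == best)

print(best_k, float(best))
print(float(variances[2]))   # the interval nearest to N/n, for comparison
```

In this configuration the minimum is attained at $k = 5$ and $k = 7$, the two intervals tabulated in Tables 1.1 and 1.2, and it is smaller than the variance obtained with $k = 2$.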
It has been shown in Section 3 that circular systematic sampling performs better than simple random sampling without replacement. However, it is not a trend free sampling scheme;33 trend free estimation can be achieved by introducing Yates type end corrections4 as given below:
The modification involves the usual circular systematic sampling but the modified sample mean is defined as
(4.1)
That is, the units selected first and last are given the weights and respectively whereas the remaining units get the weight .By equating for the population with a perfect linear trend, we get the values for from (4.1) as:
Here one may have the following two situations: (i).The random start is less than or equal to and (ii). The random start is greater than .
Case (i). When the random start is less than or equal to
By setting (4.1) is equal to we get
By putting
,
we get
(4.2)
Case (ii). When the random start is greater than
Let the random start lies between and
By setting (4.1) is equal to we get
to
By putting
we get
to
(4.3)
Remark 4.1: In the presence of a perfect linear trend the modified circular systematic sample mean becomes the population mean and hence the . In this case the circular systematic sampling becomes a completely trend free sampling (See Mukerjee and Sengupta, 1990).
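The effect described in Remark 4.1 can be illustrated numerically. The sketch below is an assumption-laden reading of the end correction, not the closed-form weights of (4.2) and (4.3): it supposes that the first and last selected units receive weights $1/n + w$ and $1/n - w$ while the other units keep $1/n$, and it solves for $w$ from the labels alone by requiring the corrected mean of the labels to equal $(N+1)/2$. The function names and the values $a = 3$, $b = 2$, $N = 12$, $n = 5$, $k = 2$ are illustrative.

```python
from fractions import Fraction

def css_labels(N, n, k, start):
    """Labels (1..N) selected by CSS for a given random start."""
    return [((start - 1 + j * k) % N) + 1 for j in range(n)]

def end_correction_weight(N, n, k, start):
    """Solve for w so that the corrected mean of the labels equals (N + 1) / 2."""
    labels = css_labels(N, n, k, start)
    first, last = labels[0], labels[-1]
    if first == last:
        raise ValueError("first and last selected units coincide")
    return (Fraction(N + 1, 2) - Fraction(sum(labels), n)) / (first - last)

def corrected_mean(values, n, k, start):
    """Sample mean plus w * (first value - last value), with w from the labels."""
    N = len(values)
    sample = [values[label - 1] for label in css_labels(N, n, k, start)]
    w = end_correction_weight(N, n, k, start)
    return Fraction(sum(sample), n) + w * (sample[0] - sample[-1])

a, b, N, n, k = 3, 2, 12, 5, 2                 # illustrative values only
values = [a + b * i for i in range(1, N + 1)]  # assumed perfect linear trend
pop_mean = Fraction(sum(values), N)
corrected = [corrected_mean(values, n, k, start) for start in range(1, N + 1)]

print(all(c == pop_mean for c in corrected))
```

Because $w$ depends only on $N$, $n$, $k$ and the random start, the same weights work for any $a$ and $b$, and every corrected sample mean coincides with the population mean, so its variance is zero under a perfect linear trend.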