Many statistical applications and inferences rely on the validity of the underlying distributional assumption. Symmetry of the underlying distribution is essential in many statistical inference and modeling procedures. There are several tests of symmetry in the literature; however, most of these tests suffer from low statistical power. Tests have been suggested by Butler,1 Rothman & Woodroofe,2 Hill & Rao,3 Baklizi,4 and McWilliams.5 McWilliams5 showed, using simulation, that his runs test of symmetry is more powerful than those provided by Butler,1 Rothman & Woodroofe,2 and Hill & Rao3 for various asymmetric alternatives. However, Tajuddin6 introduced a distribution-free test of symmetry based on the Wilcoxon two-sample test which is more powerful than the runs test.
Moreover, Modarres & Gastwirth7 modified the McWilliams5 runs test by using Wilcoxon scores to weight the runs. The new test improved the power for testing symmetry about a known center, but did not perform well when the asymmetry is concentrated in regions close to the median of the distribution. Mira8 introduced a distribution-free test of symmetry based on Bonferroni's measure, and showed that it outperforms the tests introduced by Modarres & Gastwirth7 and Tajuddin.6 Recently, Samawi et al.9 provided a test of symmetry based on a nonparametric overlap measure, and demonstrated that it outperformed other tests of symmetry in the literature, including the runs test. Samawi & Helu10 introduced a runs test of conditional symmetry which is powerful enough to detect even small asymmetry in the shape of the conditional distribution. In addition, the Samawi & Helu10 test requires neither approximations nor extra computations, such as kernel estimation of the density function, that are needed by other tests in the literature.
This paper uses the Kullback-Leibler information to test for the symmetry of the underlying distribution. Let $f_1(x)$ and $f_2(x)$ be two probability density functions, and assume samples of observations are drawn from continuous distributions. The Kullback-Leibler discrimination information function is given by

$$D(f_1, f_2) = \int f_1(x)\,\log\frac{f_1(x)}{f_2(x)}\,dx, \qquad (1)$$

as defined by Kullback & Leibler.11 For simplicity we will write (1) as $D(1,2)$, where $D(1,2) = E_{f_1}\!\left[\log\{f_1(X)/f_2(X)\}\right]$.
This measure can be directly applied to discrete distributions by replacing the integrals with summations. It is well known that $D(f_1, f_2) \geq 0$, and the equality holds if and only if $f_1 = f_2$ almost everywhere. The discrimination function $D(f_1, f_2)$ measures the disparity between $f_1$ and $f_2$.
Many authors have used the discrimination function $D(f_1, f_2)$ for testing the goodness of fit of some distributions; see, for example, Alizadeh & Arghami.12,13
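As a minimal numerical sketch of (1), the discrimination function can be evaluated by direct quadrature; the two normal densities below are illustrative assumptions, not part of the original study.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Numerical sketch of the discrimination function (1):
# D(f1, f2) = integral of f1(x) * log(f1(x) / f2(x)) dx.
# Two normal densities are used purely for illustration.
f1 = stats.norm(loc=0.0, scale=1.0).pdf
f2 = stats.norm(loc=0.5, scale=1.0).pdf

D, _ = quad(lambda x: f1(x) * np.log(f1(x) / f2(x)), -10, 10)
print(f"D(f1, f2) = {D:.4f}")  # D >= 0, with D = 0 iff f1 = f2 a.e.;
                               # the closed form here is (0.5**2)/2 = 0.125
```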
In this paper we consider testing the null hypothesis of symmetry for an underlying absolutely continuous distribution $F$ with known location parameter and density denoted by $f(x)$. Under the null hypothesis of symmetry, if we let $f_1(x) = f(x)$ and $f_2(x) = f(-x)$, then $D(f_1, f_2) = 0$.
Since kernel density estimation procedures are readily available in various statistical software packages such as SAS, STATA, S-Plus, and R, we were interested in developing a new test of symmetry using kernel density estimation of $f_1$ and $f_2$. This paper introduces a powerful test of symmetry based on the Kullback-Leibler discrimination information function. The Kullback-Leibler information test of symmetry and its asymptotic properties are introduced in Section 2. A simulation study is provided in Section 3. Illustrations of the test using base deficit score data and final comments are given in Section 4.
Assume that a random sample, $X_1, X_2, \ldots, X_n$, is drawn from an absolutely continuous distribution having known median, assumed to be 0. If the median, or the center of the distribution, is not known, the data can be centered by a consistent estimate of the median. However, the implications of centering the data around a consistent estimator of the median for the asymptotic properties are not straightforward; further investigation is needed to study the robustness of the proposed test of symmetry and to compare it with other available tests of symmetry when the median is unknown. In this paper we discuss only the case where the median of the underlying distribution is assumed known.
Consider testing for symmetry

$$H_0: f(x) = f(-x) \text{ for all } x \quad \text{versus} \quad H_1: f(x) \neq f(-x) \text{ for some } x.$$

Let $f_1(x) = f(x)$ and $f_2(x) = f(-x)$. Under the null hypothesis, $D(f_1, f_2) = 0$. An equivalent hypothesis for testing symmetry is therefore $H_0: D(f_1, f_2) = 0$ versus $H_1: D(f_1, f_2) > 0$.

Let $\hat{D}(f_1, f_2)$ be a consistent nonparametric estimator of $D(f_1, f_2)$. Under the null hypothesis of symmetry and some regularity assumptions, which will be discussed later in this paper, we propose the following test of symmetry:

$$Z = \frac{\hat{D}(f_1, f_2)}{\hat{\sigma}_{\hat{D}}} \;\dot{\sim}\; N(0, 1)$$

for large $n$, where $\hat{\sigma}_{\hat{D}}$ is a consistent estimator of the standard error of $\hat{D}(f_1, f_2)$. An asymptotic significance test procedure at level $\alpha$ is to reject $H_0$ if $Z > z_{\alpha}$, where $z_{\alpha}$ is the upper $\alpha$ percentile of the standard normal distribution.
Kernel estimation of $D(f_1, f_2)$
For the i.i.d. sample $X_1, X_2, \ldots, X_n$, let $\hat{f}$ be an estimate of $f$. To determine which estimator of $D(f_1, f_2)$ is appropriate for our inference procedure, we need to state some necessary conditions:
C1: $f$ is continuous (smoothness condition).
C2: $f$ is $k$ times differentiable (smoothness condition).
C3: a tail condition, stated in terms of $[X]$, the integer part of $X$.
C4: a second tail condition.
C5: a peak condition (note that this is also a mild tail condition).
C6: $f$ is bounded (peak condition).
Some suggested estimators for $D(f_1, f_2)$ may be found in the literature. These include plug-in estimates of entropy, which are based on a consistent density estimate $\hat{f}$ of $f$; an example is the integral estimate of entropy introduced by Dmitriev & Tarasenko.14 Joe15 considers estimating the entropy when the underlying density is a multivariate pdf, but he points out that the calculation becomes more difficult when $\hat{f}$ is a kernel estimator and the dimension of the integral is more than two. He therefore excludes the integral estimate from further study. The integral estimator can, however, be easily calculated if, for example, $\hat{f}$ is a histogram.
The re-substitution estimate proposed by Ahmad & Lin16 is

$$\hat{H}(f) = -\frac{1}{n}\sum_{i=1}^{n}\log \hat{f}(X_i), \qquad (3)$$

where $\hat{f}$ is a kernel density estimator and $H(f) = -\int f(x)\log f(x)\,dx$ denotes the entropy of $f$. They showed the mean square consistency of (3), that is, $E[\hat{H}(f) - H(f)]^2 \to 0$ as $n \to \infty$.
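A minimal sketch of a re-substitution estimate in the spirit of (3), assuming a Gaussian kernel with its default bandwidth rule; the sample and the comparison value are illustrative only.

```python
import numpy as np
from scipy import stats

# Re-substitution entropy estimate, cf. (3):
# H_hat = -(1/n) * sum_i log f_hat(X_i), where f_hat is a kernel
# density estimate evaluated back at the sample points.
rng = np.random.default_rng(1)
x = rng.normal(size=200)               # illustrative N(0, 1) sample

f_hat = stats.gaussian_kde(x)          # Gaussian kernel, Scott's bandwidth rule
H_hat = -np.mean(np.log(f_hat(x)))     # plug the sample back into f_hat
print(f"H_hat = {H_hat:.3f}")          # true N(0, 1) entropy is about 1.419
```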
Joe15 considers the estimation of entropy for multivariate pdfs by an entropy estimate of the re-substitution type (3), also based on a kernel density estimate. He obtained asymptotic bias and variance terms and showed that non-unimodal kernels satisfying certain conditions can reduce the mean square error. His analysis and simulations suggest that the sample size needed for good estimates increases rapidly with the dimension of the multivariate density. His results rely heavily on conditions C4 and C6. Hall & Morton17 investigated the properties of an estimator of type (3) both when $\hat{f}$ is a histogram density estimator and when it is a kernel estimator. For the histogram estimator, they established the asymptotic behavior of $\hat{H}(f)$ under certain tail and smoothness conditions. (4)
Other estimators based on sample spacings are investigated by Tarasenko,18 Beirlant & van Zuijlen,19 Hall,20 Cressie,21 Dudewicz & van der Meulen,22 and Beirlant.23 Finally, other nonparametric estimators have been discussed by many authors, including Vasicek,24 Dudewicz & van der Meulen,22 Bowman,25 and Alizadeh.26 Among these various entropy estimators, Vasicek's sample entropy has been the most widely used in developing entropy-based statistical procedures. However, the asymptotic distribution of that estimator is hard to establish. Therefore, in this paper we adopt the kernel re-substitution estimate proposed by Ahmad & Lin.16
We will adopt the notation of Samawi et al.9 Our proposed test of symmetry is as follows. Let $X_1, X_2, \ldots, X_n$ be a random sample from an absolutely continuous distribution $F$ which is continuously differentiable with uniformly bounded derivatives and has known median. Let $K$ be a kernel function satisfying the condition $\int K(t)\,dt = 1$. For simplicity, the kernel $K$ will be assumed to be a symmetric density function with mean 0 and finite variance; an example is the standard normal density. The kernel estimators for $f_1(x) = f(x)$ and $f_2(x) = f(-x)$ are:
$$\hat{f}_1(x) = \frac{1}{nh_1}\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h_1}\right) \qquad (6)$$

and

$$\hat{f}_2(x) = \frac{1}{nh_2}\sum_{i=1}^{n} K\!\left(\frac{x + X_i}{h_2}\right), \qquad (7)$$
respectively, where $C$ is the number of bins and depends on the sample size; as in Samawi et al.,9 we suggest taking $C$ to be the integer part of the quantity recommended there. In addition, $h_1$ and $h_2$ are the bandwidths of the kernel estimators, satisfying the conditions that $h_j \to 0$ and $nh_j \to \infty$ as $n \to \infty$. There are many choices of the bandwidths ($h_1, h_2$). In our procedure we use the method suggested by Silverman.27 Using the normal distribution as the parametric family, the bandwidths of the kernel estimators are

$$h_j = 0.9\,\sigma_j^{*}\,n^{-1/5}, \qquad j = 1, 2, \qquad (8)$$

where $\sigma_j^{*} = \min\{\text{standard deviation}, \text{interquartile range}/1.349\}$ of the corresponding sample. The form (8) is found to be an adequate choice of bandwidth for many purposes, approximately minimizing the integrated mean squared error (IMSE).
We will use the Samawi et al.9 suggestion to calculate the bins; the bin endpoints and midpoints are constructed from the sample as described there.
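A short sketch of the estimators (6) and (7) with the rule-of-thumb bandwidth (8), assuming the reconstructed forms above; the sample, the evaluation grid standing in for the $C$ bin points, and the seed are illustrative assumptions.

```python
import numpy as np

def silverman_bandwidth(x):
    # Rule-of-thumb bandwidth (8): h = 0.9 * sigma_star * n**(-1/5),
    # with sigma_star = min(standard deviation, interquartile range / 1.349).
    n = len(x)
    iqr = np.subtract(*np.percentile(x, [75, 25]))
    sigma_star = min(np.std(x, ddof=1), iqr / 1.349)
    return 0.9 * sigma_star * n ** (-0.2)

def kde(x_eval, sample, h):
    # Kernel density estimate with a standard normal kernel, cf. (6)-(7).
    u = (x_eval[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(2)
x = rng.normal(size=100)        # sample with known median 0 (illustrative)
grid = np.linspace(-3, 3, 7)    # stand-in for the C bin points

h = silverman_bandwidth(x)
f1_hat = kde(grid, x, h)        # estimate of f(x), cf. (6)
f2_hat = kde(grid, -x, h)       # estimate of f(-x), cf. (7)
```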
Using the above kernel estimators and the re-substitution form of Ahmad & Lin,16 the nonparametric kernel estimator of $D(f_1, f_2)$ under the null hypothesis is given by

$$\hat{D}(f_1, f_2) = \frac{1}{n}\sum_{i=1}^{n}\log\frac{\hat{f}_1(X_i)}{\hat{f}_2(X_i)},$$

which can be approximated by a finite sum over the $C$ bins described above. The approximate variance of $\hat{D}(f_1, f_2)$ is obtained from the functional representation derived in the next section.
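A compact end-to-end sketch of the proposed test, assuming the re-substitution form of $\hat{D}$ above. The standard error used here is a simple plug-in estimate from the log-ratio terms, not the paper's influence-function variance, and the kernel bandwidth is scipy's default rather than (8); both are simplifying assumptions.

```python
import numpy as np
from scipy import stats

def kl_symmetry_test(x, alpha=0.05):
    # Sketch of the KL-based test of symmetry about a known median 0:
    # D_hat = mean of log(f1_hat(X_i) / f2_hat(X_i)), compared with N(0, 1).
    n = len(x)
    f1 = stats.gaussian_kde(x)    # kernel estimate of f(x)
    f2 = stats.gaussian_kde(-x)   # kernel estimate of f(-x)
    log_ratio = np.log(f1(x) / f2(x))
    D_hat = log_ratio.mean()
    se = log_ratio.std(ddof=1) / np.sqrt(n)    # simple plug-in SE (assumption)
    z = D_hat / se
    return D_hat, z, z > stats.norm.ppf(1 - alpha)  # one-sided rejection rule

rng = np.random.default_rng(3)
print(kl_symmetry_test(rng.normal(size=100)))   # symmetric: H0 usually retained
print(kl_symmetry_test(rng.exponential(size=100) - np.log(2)))  # skewed, median 0
```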
Asymptotic properties of $\hat{D}(f_1, f_2)$

The nonparametric kernel estimator $\hat{D}(f_1, f_2)$ of $D(f_1, f_2)$ is based on the univariate kernel density estimator $\hat{f}$. The necessary regularity conditions imposed on the univariate kernel for density estimation are:

I. $\int K(t)\,dt = 1$;
II. $\int t\,K(t)\,dt = 0$;
III. $\int t^2 K(t)\,dt < \infty$;
IV. $K$ is bounded, i.e., $\sup_t |K(t)| < \infty$.

These conditions may be found in Silverman27 (Chapter 3) or Wand & Jones28 (Chapter 2).
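As a quick illustrative check (under the reconstruction of conditions I-III above), the standard normal kernel satisfies them, which can be verified numerically:

```python
import numpy as np
from scipy.integrate import quad

# Numerical check that the standard normal kernel satisfies I-III.
K = lambda t: np.exp(-0.5 * t ** 2) / np.sqrt(2 * np.pi)

print(quad(K, -np.inf, np.inf)[0])                        # I:   1.0
print(quad(lambda t: t * K(t), -np.inf, np.inf)[0])       # II:  0.0
print(quad(lambda t: t ** 2 * K(t), -np.inf, np.inf)[0])  # III: 1.0 (finite)
```

It is also bounded, so condition IV holds as well.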
To show consistency of $\hat{D}(f_1, f_2)$, apply the kernel density asymptotic properties found in Silverman27 (Chapter 3) or Wand & Jones28 (Chapter 2). Under conditions I-IV, and assuming that the density $f$ is continuous at each $x_i$, $i = 1, 2, \ldots, C$,

$$E[\hat{f}(x_i)] \to f(x_i) \qquad (12)$$

and

$$\operatorname{Var}[\hat{f}(x_i)] \to 0 \qquad (13)$$

as $n \to \infty$, so that $\hat{f}(x_i) \xrightarrow{p} f(x_i)$. If $f(\cdot)$ is uniformly continuous, then the kernel density estimate is strongly consistent. Moreover, as in Ahmad & Lin,16 the re-substitution estimate (3) is mean square consistent, and hence $\hat{D}(f_1, f_2) \xrightarrow{p} D(f_1, f_2)$; in particular, since $D(f_1, f_2) = 0$ under the null hypothesis, $\hat{D}(f_1, f_2) \xrightarrow{p} 0$ under $H_0$.
To derive the asymptotic distribution of $\hat{D}(f_1, f_2)$, we will define $D(f_1, f_2)$ as a functional of the pair $(f_1, f_2)$. Using the previously stated regularity conditions, some regularity conditions given by Serfling,29 and assuming that the Gâteaux derivatives of the functional exist, the partial influence functions of the functional30 can be derived; note that they have mean zero at the true densities. Now, using this functional representation of $\hat{D}(f_1, f_2)$, then as in Samawi et al.30 and Serfling,29

$$\sqrt{n}\left(\hat{D}(f_1, f_2) - D(f_1, f_2)\right) \xrightarrow{d} N(0, \sigma_D^2),$$

where $\sigma_D^2$ is the asymptotic variance determined by the partial influence functions. A consistent estimate of $\sigma_D^2$ is obtained by evaluating the influence functions at the kernel density estimates, where in our case $\hat{f}_1$ and $\hat{f}_2$ are given by (6) and (7).
For a discussion of methods addressing the performance of kernel density estimation at the boundary, see Hall & Park.31
As in Samawi et al.,9 to gain some insight into our procedure, a simulation study was conducted to investigate the performance of the new test of symmetry based on $\hat{D}(f_1, f_2)$. We compared our proposed test of symmetry with the tests proposed by McWilliams,5 Modarres & Gastwirth,32 Mira's8 Bonferroni-based test, and the Samawi et al.9 test of symmetry.
As in McWilliams,5 the runs test is described as follows. For any random sample of size $n$, let $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ denote the sample values ordered from smallest to largest according to their absolute value (signs are retained), and let $S_1, S_2, \ldots, S_n$ denote indicator variables designating the sign of the $X_{(i)}$ values [$S_i = 1$ if $X_{(i)} > 0$, and $S_i = 0$ otherwise]. The test statistic used for testing symmetry is

$$R = \text{the number of runs in the } S_1, S_2, \ldots, S_n \text{ sequence} = 1 + \sum_{i=2}^{n} I(S_i \neq S_{i-1}),$$

where $I(\cdot)$ is the indicator function. We reject the null hypothesis if $R$ is smaller than a critical value ($r_\alpha$) at the pre-specified value of $\alpha$. Moreover, Mira's8 Bonferroni-based test statistic is $\hat{\gamma}_1 = 2(\bar{X} - \hat{m})$, where $\bar{X}$ is the sample mean and $\hat{m}$ is the sample median. The procedure is to reject the null hypothesis if $|\hat{\gamma}_1|/\hat{\sigma}_{\hat{\gamma}_1} > z_{\alpha/2}$, where $\hat{\sigma}_{\hat{\gamma}_1}$ is a consistent estimate of the standard error of $\hat{\gamma}_1$.
The Modarres & Gastwirth32 test is a hybrid test, applying the sign test in the first stage and a percentile-modified two-sample Wilcoxon test (see Gastwirth33) in the second stage. Finally, the Samawi et al.9 test of symmetry is based on a kernel estimate of the overlap measure.
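A minimal sketch of the McWilliams runs statistic described above; the samples are illustrative.

```python
import numpy as np

def runs_statistic(x):
    # McWilliams runs test of symmetry about 0: order the observations by
    # absolute value (keeping signs), then count runs in the sign sequence.
    signs = np.sign(x[np.argsort(np.abs(x))])
    return 1 + int(np.sum(signs[1:] != signs[:-1]))

rng = np.random.default_rng(4)
print(runs_statistic(rng.normal(size=50)))                   # symmetric: many runs
print(runs_statistic(rng.exponential(size=50) - np.log(2)))  # skewed: fewer runs
# Small values of the statistic lead to rejection of symmetry.
```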
In the following simulation, SAS version 9.3 {proc kde; method=srot} is used. As in McWilliams,5 the generalized lambda distribution (see Ramberg & Schmeiser34) is used in our simulation, with the parameter sets corresponding to the nine cases of McWilliams.5 To generate the observations we used

$$X = \lambda_1 + \frac{U^{\lambda_3} - (1 - U)^{\lambda_4}}{\lambda_2},$$

where $U$ is a uniform random number. The significance level used in the simulation is $\alpha = 0.05$, with sample sizes $n$ = 30, 50, and 100. To investigate the Type I error, the symmetric distributions used in the simulation are the first case of the generalized lambda and the normal. Our simulation is based on 5000 simulated samples. The 95% confidence interval for the estimated probability of type I error under the null hypothesis with $\alpha = 0.05$ is (0.04396, 0.05604).
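A short sketch of the inversion step above for generating generalized lambda observations; the parameter values shown are a commonly cited symmetric set approximating the standard normal, used here only as an illustration (the nine simulation cases themselves are taken from McWilliams5).

```python
import numpy as np

def gld_sample(n, lam1, lam2, lam3, lam4, rng):
    # Ramberg & Schmeiser generalized lambda via inversion:
    # X = lam1 + (U**lam3 - (1 - U)**lam4) / lam2, with U ~ Uniform(0, 1).
    u = rng.uniform(size=n)
    return lam1 + (u ** lam3 - (1 - u) ** lam4) / lam2

rng = np.random.default_rng(5)
# Illustrative symmetric case (approximately standard normal).
x = gld_sample(30, 0.0, 0.1975, 0.1349, 0.1349, rng)
```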
Table 1.1 shows the estimated probability of type I error. Our test is an asymptotic test, with a slight bias in $\hat{D}(\cdot, \cdot)$ and in the variance estimate for small sample sizes. For sample sizes of more than 30, the test has an estimated probability of type I error close to the nominal value 0.05. However, Bonferroni's test appears to be a conservative procedure, while the Modarres & Gastwirth test is slightly conservative for small sample sizes. Tables 1.2 and 1.3 show that the test based on $\hat{D}(\cdot, \cdot)$ is more powerful than the McWilliams,5 Bonferroni's, Modarres & Gastwirth,32 and Samawi et al.9 tests in all of the presented cases, and its power advantage increases with the sample size.
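For reference, the binomial confidence interval quoted above follows from the normal approximation with 5000 replications:

```python
import numpy as np

# 95% CI for an estimated rejection rate when the true rate is 0.05
# and each entry is based on 5000 simulated samples.
p, reps = 0.05, 5000
half = 1.96 * np.sqrt(p * (1 - p) / reps)
print(p - half, p + half)   # approximately (0.04396, 0.05604)
```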
| Distribution | n | Runs Test | Test Based on the Overlap | Bonferroni's Test | Modarres and Gastwirth (1998) Test | Test Based on Kullback-Leibler Information |
|---|---|---|---|---|---|---|
| Case #1 generalized lambda | 30 | 0.046 | 0.056 | 0.030 | 0.027 | 0.051 |
| | 50 | 0.052 | 0.051 | 0.032 | 0.044 | 0.047 |
| | 100 | 0.058 | 0.052 | 0.027 | 0.046 | 0.051 |
| Normal (0, 1) | 30 | 0.052 | 0.057 | 0.030 | 0.030 | 0.052 |
| | 50 | 0.048 | 0.055 | 0.030 | 0.043 | 0.051 |
| | 100 | 0.051 | 0.052 | 0.032 | 0.048 | 0.052 |

Table 1.1 Probability of type I error under the null hypothesis (α = 0.05)
| Case # | n | Runs Test | Test Based on the Overlap | Bonferroni's Test | Modarres and Gastwirth (1998) Test | Test Based on Kullback-Leibler Information |
|---|---|---|---|---|---|---|
| 2 | 30 | 0.282 | 0.501 | 0.253 | 0.495 | 0.948 |
| | 50 | 0.456 | 0.839 | 0.352 | 0.941 | 0.992 |
| | 100 | 0.781 | 0.999 | 0.500 | 1.000 | 1.000 |
| 3 | 30 | 0.444 | 0.846 | 0.508 | 0.610 | 0.980 |
| | 50 | 0.678 | 0.953 | 0.756 | 0.990 | 0.999 |
| | 100 | 0.913 | 1.000 | 0.966 | 1.000 | 1.000 |
| 4 | 30 | 0.120 | 0.380 | 0.154 | 0.179 | 0.684 |
| | 50 | 0.134 | 0.541 | 0.260 | 0.474 | 0.854 |
| | 100 | 0.245 | 0.761 | 0.488 | 0.845 | 0.946 |
| 5 | 30 | 0.141 | 0.451 | 0.231 | 0.247 | 0.810 |
| | 50 | 0.201 | 0.601 | 0.410 | 0.652 | 0.920 |
| | 100 | 0.336 | 0.839 | 0.741 | 0.954 | 0.980 |

Table 1.2 Power of the Kullback-Leibler information based test compared with other tests under alternative hypotheses (α = 0.05)
| Case # | n | Runs Test | Test Based on the Overlap | Bonferroni's Test | Modarres and Gastwirth (1998) Test | Test Based on Kullback-Leibler Information |
|---|---|---|---|---|---|---|
| 6 | 30 | 0.051 | 0.161 | 0.034 | 0.033 | 0.191 |
| | 50 | 0.055 | 0.174 | 0.040 | 0.055 | 0.225 |
| | 100 | 0.053 | 0.210 | 0.059 | 0.120 | 0.331 |
| 7 | 30 | 0.101 | 0.189 | 0.091 | 0.092 | 0.452 |
| | 50 | 0.111 | 0.241 | 0.155 | 0.210 | 0.611 |
| | 100 | 0.122 | 0.361 | 0.336 | 0.478 | 0.737 |
| 8 | 30 | 0.544 | 0.980 | 0.643 | 0.655 | 0.993 |
| | 50 | 0.752 | 0.999 | 0.888 | 0.992 | 1.000 |
| | 100 | 0.961 | 1.000 | 0.996 | 1.000 | 1.000 |
| 9 | 30 | 0.571 | 1.000 | 0.685 | 0.676 | 0.993 |
| | 50 | 0.810 | 1.000 | 0.916 | 0.995 | 0.999 |
| | 100 | 0.963 | 1.000 | 0.999 | 1.000 | 1.000 |

Table 1.3 Power of the Kullback-Leibler information based test compared with other tests under alternative hypotheses (α = 0.05), continued
Note: The values of skewness ($\sqrt{\beta_1}$) and kurtosis ($\beta_2$) are from McWilliams.5