A Test of Symmetry Based on the Kernel Kullback-Leibler Information with Application to Base Deficit Data
Many statistical applications and inferences rely on the validity of the underlying distributional assumption. Symmetry of the underlying distribution is essential in many statistical inference and modeling procedures. There are several tests of symmetry in the literature; however, most of these tests suffer from low statistical power. Tests have been suggested by Butler [1], Rothman & Woodroofe [2], Hill & Rao [3], Baklizi [4], and McWilliams [5]. McWilliams [5] showed, using simulation, that his runs test of symmetry is more powerful than those provided by Butler [1], Rothman & Woodroofe [2], and Hill & Rao [3] for various asymmetric alternatives. However, Tajuddin [6] introduced a distribution-free test for symmetry, based on the Wilcoxon two-sample test, which is more powerful than the runs test.
Moreover, Modarres & Gastwirth [7] modified the McWilliams [5] runs test by using Wilcoxon scores to weight the runs. The new test improved the power for testing symmetry about a known center, but did not perform well when the asymmetry is concentrated in regions close to the median. Mira [8] introduced a distribution-free test for symmetry based on Bonferroni's measure. She showed that her test outperforms the tests introduced by Modarres & Gastwirth [7] and Tajuddin [6]. Recently, Samawi et al. [9] provided a test of symmetry based on a nonparametric overlap measure. They demonstrated that the test of symmetry based on an overlap measure outperformed other tests of symmetry in the literature, including the runs test. Samawi & Helu [10] introduced a runs test of conditional symmetry which is reasonably powerful in detecting even small asymmetry in the shape of the conditional distribution. In addition, the Samawi & Helu [10] test requires neither approximation nor extra computation, such as the kernel estimation of the density function used in other tests found in the literature.
This paper uses the Kullback-Leibler information to test for the symmetry of the underlying distribution. Let ${f}_{1}(x)$ and ${f}_{2}(x)$ be two probability density functions. Assume samples of observations are drawn from continuous distributions. The Kullback-Leibler discrimination information function is given by
$$D({f}_{1},{f}_{2})={\int}_{-\infty}^{\infty}{f}_{1}(x)\mathrm{ln}\left(\frac{{f}_{1}(x)}{{f}_{2}(x)}\right)dx \tag{1}$$
as defined by Kullback & Leibler [11]. For simplicity we will write (1) as
$$D({f}_{1},{f}_{2})={D}_{11}({f}_{1},{f}_{1})-{D}_{12}({f}_{1},{f}_{2}),$$
where
$${D}_{11}({f}_{1},{f}_{1})={\int}_{-\infty}^{\infty}{f}_{1}(x)\mathrm{ln}\left({f}_{1}(x)\right)dx\text{ and }{D}_{12}({f}_{1},{f}_{2})={\int}_{-\infty}^{\infty}{f}_{1}(x)\mathrm{ln}\left({f}_{2}(x)\right)dx.$$
This measure can be directly applied to discrete distributions by replacing the integrals with summations. It is well known that $D({f}_{1},{f}_{2})\ge 0$, with equality if and only if ${f}_{1}(x)={f}_{2}(x)$ almost everywhere. The discrimination function $D({f}_{1},{f}_{2})$ measures the disparity between ${f}_{1}$ and ${f}_{2}$.
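As a numerical illustration (ours, not part of the original development), the integral in (1) can be approximated by a Riemann sum; for two normal densities with common variance the true value is known in closed form, $D=(\mu_1-\mu_2)^2/(2\sigma^2)$, which gives a check on the approximation. A minimal Python sketch, with function names of our own choosing:

```python
import math

def norm_pdf(x, mu=0.0, sigma=1.0):
    """Normal density, used here as a stand-in for f1 and f2."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def kl_divergence(f1, f2, lo=-10.0, hi=10.0, m=20000):
    """Midpoint-rule approximation of D(f1, f2) = integral of f1 ln(f1/f2)."""
    dx = (hi - lo) / m
    total = 0.0
    for i in range(m):
        x = lo + (i + 0.5) * dx
        p, q = f1(x), f2(x)
        if p > 0 and q > 0:
            total += p * math.log(p / q) * dx
    return total

# D(N(0,1), N(1,1)) = (0 - 1)^2 / 2 = 0.5, and D(f, f) = 0.
d = kl_divergence(norm_pdf, lambda x: norm_pdf(x, mu=1.0))
```

Note that $D$ is not symmetric in its arguments, which is why the test below always places the sample density in the first slot.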
Many authors have used the discrimination function $D(\cdot ,\cdot)$ for testing the goodness of fit of some distributions; see, for example, Alizadeh & Arghami [12,13].
In this paper we consider testing the null hypothesis of symmetry for an underlying absolutely continuous distribution $F(\cdot)$ with known location parameter and density denoted by $f(\cdot)$:
$${H}_{0}:f(x)=f(-x)\text{ versus }{H}_{a}:f(x)\ne f(-x)\text{ for some }x.$$
Under the null hypothesis of symmetry, if we let ${f}_{1}(x)=f(x)$ and ${f}_{2}(x)=f(-x)$, then $D({f}_{1},{f}_{2})=0$.
Since kernel density estimation procedures are readily available in various statistical software packages such as SAS, Stata, S-Plus and R, we were interested in exploring the development of a new test of symmetry using a kernel density estimate of $D({f}_{1},{f}_{2})$. This paper introduces a powerful test of symmetry based on the Kullback-Leibler discrimination information function. The Kullback-Leibler information test of symmetry and its asymptotic properties are introduced in Section 2. A simulation study is provided in Section 3. Illustrations of the test using base deficit score data and final comments are given in Section 4.
Assume that a random sample, ${X}_{1},{X}_{2},\mathrm{...},{X}_{n}$, is drawn from an absolutely continuous distribution having known median, assumed to be 0. In the case of an unknown median, or if the center of the distribution is not known, the data can be centered by a consistent estimate of the median. However, the implications for the asymptotic properties of centering the data around a consistent estimator of the median are not straightforward. Therefore, further investigation is needed to study the robustness of the proposed test of symmetry, and to compare it with other available tests, when the median is unknown. In this paper we discuss only the case where the median of the underlying distribution is assumed known.
Consider testing for symmetry
$${H}_{0}:f(x)=f(-x)\text{ versus }{H}_{a}:f(x)\ne f(-x)\text{ for some }x.$$
Let ${f}_{1}(x)=f(x)$ and ${f}_{2}(x)=f(-x)$. Under the null hypothesis, $D({f}_{1},{f}_{2})=0$. An equivalent hypothesis for testing symmetry is
$${H}_{0}:D({f}_{1},{f}_{2})=0\text{ versus }{H}_{a}:D({f}_{1},{f}_{2})>0.$$
Let $\widehat{D}$ be a consistent nonparametric estimator of $D({f}_{1},{f}_{2})$. Under the null hypothesis of symmetry and some regularity assumptions, which will be discussed later in this paper, we propose the following test of symmetry:
$${z}_{0}=\frac{\widehat{D}-0}{{\widehat{\sigma}}_{\widehat{D}}}\stackrel{L}{\to}N(0,1)$$
for large $n$, where ${\widehat{\sigma}}_{\widehat{D}}$ is a consistent estimator of the standard error of $\widehat{D}$. An asymptotic significance test procedure at level $\alpha$ is to reject ${H}_{0}$ if ${z}_{0}>{z}_{\alpha}$, where ${z}_{\alpha}$ is the upper $\alpha$ percentile of the standard normal distribution.
Kernel estimation of $D({f}_{1},{f}_{2})$
For the i.i.d. sample ${X}_{1},{X}_{2},\mathrm{...},{X}_{n}$, let ${\widehat{D}}_{11}({f}_{1},{f}_{1})$ be an estimate of ${D}_{11}({f}_{1},{f}_{1})$. To determine which estimator of ${D}_{11}({f}_{1},{f}_{1})$ is appropriate for our inference procedure, we need to state some necessary conditions:
C1: $f$ is continuous. (Smoothness condition)
C2: $f$ is $k$ times differentiable. (Smoothness condition)
C3: ${D}_{11}([X],[X])<\infty$, where $[X]$ is the integer part of $X$. (Tail condition)
C4: $\underset{f(x)>0}{\mathrm{inf}}f(x)>0.$ (Tail condition)
C5: $\int f{(\mathrm{ln}f)}^{2}<\infty .$ (Peak condition; note that this is also a mild tail condition.)
C6: $f$ is bounded. (Peak condition)
Some suggested estimators for ${D}_{11}({f}_{1},{f}_{1})={\int}_{-\infty}^{\infty}{f}_{1}(x)\mathrm{ln}\left({f}_{1}(x)\right)dx$ may be found in the literature. These include the plug-in estimates of entropy, which are based on a consistent density estimate ${f}_{n}$ of $f$; for example, the integral estimate of entropy introduced by Dmitriev & Tarasenko [14]. Joe [15] considers estimating ${D}_{11}({f}_{1},{f}_{1})$ when ${f}_{1}$ is a multivariate pdf, but he points out that the calculation when ${\widehat{f}}_{1}$ is a kernel estimator gets more difficult when the dimension of the integral is more than two. He therefore excludes the integral estimate from further study. The integral estimator can, however, be easily calculated if, for example, ${\widehat{f}}_{1}$ is a histogram.
The resubstitution estimate proposed by Ahmad & Lin [16] is
$${\widehat{D}}_{11}({\widehat{f}}_{1},{\widehat{f}}_{1})=\frac{1}{n}\sum _{i=1}^{n}\mathrm{ln}{\widehat{f}}_{1}({X}_{i}), \tag{3}$$
where ${\widehat{f}}_{1}$ is a kernel density estimator. They showed the mean square consistency of (3), such that
$$\underset{n\to \infty}{\mathrm{lim}}E\left\{{\left({\widehat{D}}_{11}({\widehat{f}}_{1},{\widehat{f}}_{1})-{D}_{11}({f}_{1},{f}_{1})\right)}^{2}\right\}=0.$$
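To make (3) concrete, the following sketch (ours, not from Ahmad & Lin [16]) computes the resubstitution estimate for a standard normal sample using a Gaussian kernel and a Silverman-type bandwidth with $A=1$ assumed; the true value is $\int \phi \ln \phi = -\tfrac{1}{2}\mathrm{ln}(2\pi e)\approx -1.419$, so the estimate should land nearby:

```python
import math
import random

def gauss_kernel(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def kde(x, data, h):
    """Kernel density estimate f-hat at the point x."""
    n = len(data)
    return sum(gauss_kernel((x - xj) / h) for xj in data) / (n * h)

def resub_entropy(data, h):
    """Resubstitution estimate (3): (1/n) * sum of ln f-hat(X_i)."""
    n = len(data)
    return sum(math.log(kde(xi, data, h)) for xi in data) / n

random.seed(1)
sample = [random.gauss(0.0, 1.0) for _ in range(500)]
h = 0.9 * 1.0 * len(sample) ** (-0.2)   # Silverman-type bandwidth, A = 1 assumed
d11_hat = resub_entropy(sample, h)      # should be close to -1.419
```

The estimate carries a small positive bias because each point contributes $K(0)/(nh)$ to its own density estimate; this is the bias term analyzed by Joe [15] and Hall & Morton [17].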
Joe [15] considers the estimation of ${D}_{11}({f}_{1},{f}_{1})$ for multivariate pdfs by an entropy estimate of the resubstitution type (3), also based on a kernel density estimate. He obtained asymptotic bias and variance terms, and showed that non-unimodal kernels satisfying certain conditions can reduce the mean square error. His analysis and simulations suggest that the sample size needed for good estimates increases rapidly with the dimension of the multivariate density. His results rely heavily on conditions C4 and C6. Hall & Morton [17] investigated the properties of an estimator of the type (3) both when ${f}_{n}$ is a histogram density estimator and when it is a kernel estimator. For the histogram estimator they showed that
$${n}^{1/2}\left({\widehat{D}}_{11}({\widehat{f}}_{1},{f}_{1})-{D}_{11}({f}_{1},{f}_{1})\right)\stackrel{L}{\to}N(0,{\sigma}^{2})\text{ as }n\to \infty \tag{4}$$
under certain tail and smoothness conditions, with ${\sigma}^{2}=Var(\mathrm{ln}f(X))$.
Other estimators using sample spacings are investigated by Tarasenko [18], Beirlant & van Zuijlen [19], Hall [20], Cressie [21], Dudewicz & van der Meulen [22], and Beirlant [23]. Finally, other nonparametric estimators have been discussed by many authors, including Vasicek [24], Dudewicz & van der Meulen [22], Bowman [25] and Alizadeh [26]. Among these various entropy estimators, Vasicek's sample entropy has been most widely used in developing entropy-based statistical procedures. However, the asymptotic distribution of these estimators is hard to establish. Therefore, in this paper we adopt the kernel resubstitution estimate proposed by Ahmad & Lin [16].
We will adopt the notation of Samawi et al. [9]. Our proposed test of symmetry is as follows: Let ${X}_{1},{X}_{2},\mathrm{...},{X}_{n}$ be a random sample from an absolutely continuous distribution $F(\cdot)$ which is continuously differentiable with uniformly bounded derivatives and has a known median. Let $K$ be a kernel function satisfying the condition ${\int}_{-\infty}^{\infty}K(x)dx=1.$ For simplicity, the kernel $K$ will be assumed to be a symmetric density function with mean 0 and finite variance; an example is the standard normal density. The kernel estimators for $f({w}_{i})$ and $f(-{w}_{i})$, $i=1,2,\mathrm{...},C$, are:
$${\widehat{f}}_{K}({w}_{i})=\frac{1}{nh}\sum _{j=1}^{n}K\left(\frac{{w}_{i}-{x}_{j}}{h}\right) \tag{6}$$
and
$${\widehat{f}}_{K}(-{w}_{i})=\frac{1}{nh}\sum _{j=1}^{n}K\left(\frac{-{w}_{i}-{x}_{j}}{h}\right), \tag{7}$$
respectively, where $C$ is the number of bins and depends on the sample size. As in Samawi et al. [9], we suggest taking $C$ to be the integer part of $\sqrt{n}$. In addition, $h$ is the bandwidth of the kernel estimators, satisfying the conditions $h>0$, $h\to 0$ and $nh\to \infty$ as $n\to \infty$. There are many choices of the bandwidth $h$; in our procedure we use the method suggested by Silverman [27]. Using the normal distribution as the parametric family, the bandwidth of the kernel estimators is
$$h=0.9A{n}^{-1/5}, \tag{8}$$
where $A=\mathrm{min}\{$standard deviation of $({x}_{1},{x}_{2},\mathrm{...},{x}_{n})$, interquartile range of $({x}_{1},{x}_{2},\mathrm{...},{x}_{n})/1.349\}$. The form (8) is found to be an adequate choice of the bandwidth for many purposes, as it approximately minimizes the integrated mean squared error (IMSE),
$$IMSE=\int E{\left[{\widehat{f}}_{K}(x)-f(x)\right]}^{2}dx.$$
We will use the Samawi et al. [9] suggestion to calculate the bins as follows: Let $R=range({x}_{1},{x}_{2},\mathrm{...},{x}_{n})$; then the bins are selected as ${w}_{i}={w}_{i-1}+{\delta}_{x}$, where $i=2,\mathrm{...},C$, ${w}_{1}=\mathrm{min}({x}_{1},{x}_{2},\mathrm{...},{x}_{n})$ and ${\delta}_{x}=\frac{R}{C}$.
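The bandwidth rule (8) and the bin construction can be sketched as follows (a minimal implementation of our own; the quartiles are computed by a simple order-statistic approximation rather than any particular interpolation rule):

```python
import math
import random
import statistics

def silverman_bandwidth(x):
    """Eq. (8): h = 0.9 * A * n^(-1/5), with A = min(sd, IQR / 1.349)."""
    n = len(x)
    xs = sorted(x)
    q1 = xs[int(0.25 * (n - 1))]      # rough lower quartile
    q3 = xs[int(0.75 * (n - 1))]      # rough upper quartile
    a = min(statistics.stdev(x), (q3 - q1) / 1.349)
    return 0.9 * a * n ** (-0.2)

def make_bins(x):
    """w_1 = min(x), w_i = w_{i-1} + R/C, with C = integer part of sqrt(n)."""
    c = int(math.sqrt(len(x)))
    delta = (max(x) - min(x)) / c
    return [min(x) + i * delta for i in range(c)]

random.seed(2)
x = [random.gauss(0.0, 1.0) for _ in range(100)]
h = silverman_bandwidth(x)
w = make_bins(x)   # C = 10 bin points for n = 100
```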
Using the above kernel estimators, the nonparametric kernel estimator of $D({f}_{1},{f}_{2})$ under the null hypothesis is given by
$$\widehat{D}=\int {\widehat{f}}_{K}(x)\mathrm{ln}\left(\frac{{\widehat{f}}_{K}(x)}{{\widehat{f}}_{K}(-x)}\right)dx={\widehat{D}}_{11}({\widehat{f}}_{K}(x),{\widehat{f}}_{K}(x))-{\widehat{D}}_{12}({\widehat{f}}_{K}(x),{\widehat{f}}_{K}(-x)),$$
which can be approximated by
$$\widehat{D}=\frac{1}{C}\sum _{i=1}^{C}\mathrm{ln}{\widehat{f}}_{K}({w}_{i})-\frac{1}{C}\sum _{i=1}^{C}\mathrm{ln}{\widehat{f}}_{K}(-{w}_{i}).$$
The approximate variance of $\widehat{D}$ is given by
$$Var(\widehat{D})=\frac{Var\left(\sum _{i=1}^{C}\mathrm{ln}{\widehat{f}}_{K}({w}_{i})\right)}{{C}^{2}}+\frac{Var\left(\sum _{i=1}^{C}\mathrm{ln}{\widehat{f}}_{K}(-{w}_{i})\right)}{{C}^{2}}.$$
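Putting (6), (7) and the bin approximation together, $\widehat{D}$ can be computed as in the following sketch (our own; note that for a sample that is exactly symmetrized, so that $-x_j$ is present whenever $x_j$ is, the estimate is identically zero because $\widehat{f}_K(-w)=\widehat{f}_K(w)$):

```python
import math

def gauss_kernel(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def f_hat(w, data, h):
    """Kernel estimate (6)/(7), evaluated at w (pass -w for f-hat(-w_i))."""
    n = len(data)
    return sum(gauss_kernel((w - xj) / h) for xj in data) / (n * h)

def d_hat(data, h):
    """D-hat = (1/C) sum ln f-hat(w_i) - (1/C) sum ln f-hat(-w_i)."""
    c = int(math.sqrt(len(data)))
    delta = (max(data) - min(data)) / c
    bins = [min(data) + i * delta for i in range(c)]
    s1 = sum(math.log(f_hat(w, data, h)) for w in bins) / c
    s2 = sum(math.log(f_hat(-w, data, h)) for w in bins) / c
    return s1 - s2

# An exactly symmetric sample gives D-hat = 0 (up to rounding).
base = [0.3, 0.7, 1.1, 1.9, 2.5]
d0 = d_hat(base + [-v for v in base], h=0.5)
```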
Asymptotic properties of $\widehat{D}$
The nonparametric kernel estimator $\widehat{D}$ of $D({f}_{1},{f}_{2})$ is based on the univariate kernel for density estimation, $K:\mathbb{R}\to \mathbb{R}$. The necessary regularity conditions imposed on the univariate kernel for density estimation are:
I. ${\int}_{\mathbb{R}}K(z)dz=1.$
II. ${\int}_{\mathbb{R}}{z}^{\beta}K(z)dz=0$ for any $\beta =1,\mathrm{...},r-1$, and ${\int}_{\mathbb{R}}{z}^{r}K(z)dz<\infty .$
III. ${\int}_{\mathbb{R}}{K}^{2}(z)dz<\infty .$
IV. $h>0$, $h\to 0$, $nh\to \infty$ and $\frac{nh}{\mathrm{log}n}\to \infty .$
These conditions may be found in Silverman [27] (Chapter 3) or Wand & Jones [28] (Chapter 2).
To show the consistency of $\widehat{D}$, apply the kernel density asymptotic properties found in Silverman [27] (Chapter 3) or Wand & Jones [28] (Chapter 2). Under assumptions I-IV, and assuming that the density $f:\mathbb{R}\to \mathbb{R}$ is continuous at each ${w}_{i}$, $i=1,2,\mathrm{...},C$,
$$Bias({\widehat{f}}_{K}({w}_{i}))=o(1)\text{ and }Bias({\widehat{f}}_{K}(-{w}_{i}))=o(1), \tag{12}$$
$$Var({\widehat{f}}_{K}({w}_{i}))=\frac{f({w}_{i})}{nh}{\int}_{\mathbb{R}}{K}^{2}(z)dz+o\left(\frac{1}{nh}\right)\text{ and }Var({\widehat{f}}_{K}(-{w}_{i}))=\frac{f(-{w}_{i})}{nh}{\int}_{\mathbb{R}}{K}^{2}(z)dz+o\left(\frac{1}{nh}\right), \tag{13}$$
and for $h>0$, $h\to 0$ and $nh\to \infty$ as $n\to \infty$,
$${\widehat{f}}_{K}({w}_{i})\stackrel{P}{\to}f({w}_{i})\text{ and }{\widehat{f}}_{K}(-{w}_{i})\stackrel{P}{\to}f(-{w}_{i}).$$
If $f(\cdot)$ is uniformly continuous, then the kernel density estimate is strongly consistent. Moreover, as in Ahmad & Lin [16],
$$\underset{C\to \infty}{\mathrm{lim}}E\left\{{\left({\widehat{D}}_{11}({\widehat{f}}_{K}(x),{\widehat{f}}_{K}(x))-{D}_{11}(f(x),f(x))\right)}^{2}\right\}=0,$$
and hence
$${\widehat{D}}_{11}({\widehat{f}}_{K}(x),{\widehat{f}}_{K}(x))\stackrel{P}{\to}{D}_{11}(f(x),f(x))\text{ as }C\to \infty ,$$
and similarly for ${\widehat{D}}_{12}$. Since $\widehat{D}={\widehat{D}}_{11}({\widehat{f}}_{K}(x),{\widehat{f}}_{K}(x))-{\widehat{D}}_{12}({\widehat{f}}_{K}(x),{\widehat{f}}_{K}(-x))$, it follows that
$$\widehat{D}\stackrel{P}{\to}D(f(w),f(-w))\text{ as }C\to \infty .$$
To derive the asymptotic distribution of $\widehat{D}$, we write $D({f}_{1},{f}_{2})$ as a functional:
$$D({f}_{1},{f}_{2})={\int}_{-\infty}^{\infty}{f}_{1}(w)\mathrm{ln}({f}_{1}(w))dw-{\int}_{-\infty}^{\infty}{f}_{1}(w)\mathrm{ln}({f}_{2}(w))dw={\int}_{-\infty}^{\infty}\mathrm{ln}({f}_{1}(w))d{F}_{1}-{\int}_{-\infty}^{\infty}\mathrm{ln}({f}_{2}(w))d{F}_{1}.$$
Using the previously stated regularity conditions, some regularity conditions given by Serfling [29], and assuming that the Gâteaux derivatives of the functional $D({f}_{1},{f}_{2})$ exist, we can show that the partial influence functions of the functional $D({f}_{1},{f}_{2})$ [30] are as follows:
$${L}_{1}(w;{F}_{1},{F}_{1})=\mathrm{ln}{f}_{1}(w)-{\int}_{-\infty}^{\infty}{f}_{1}(w)\mathrm{ln}{f}_{1}(w)dw,$$
and
$${L}_{2}(w;{F}_{1},{F}_{2})=\mathrm{ln}{f}_{2}(w)-{\int}_{-\infty}^{\infty}{f}_{1}(w)\mathrm{ln}{f}_{2}(w)dw.$$
Note that
$$\int {L}_{1}(w;{F}_{1}(w),{F}_{1}(w))d{F}_{1}(w)=0\text{ and }\int {L}_{2}(w;{F}_{1}(w),{F}_{2}(w))d{F}_{1}(w)=0.$$
Now, using this functional representation of $D({f}_{1},{f}_{2})$, then as in Samawi et al. [30] and Serfling [29],
$$\sqrt{C}(\widehat{D}-D({f}_{1},{f}_{2}))\stackrel{L}{\to}N(0,{\sigma}_{\widehat{D}}^{2}),$$
where
$${\sigma}_{\widehat{D}}^{2}=\int {L}_{1}^{2}(w;{F}_{1},{F}_{1})d{F}_{1}+\int {L}_{2}^{2}(w;{F}_{1},{F}_{2})d{F}_{1}.$$
A consistent estimate of ${\sigma}_{\widehat{D}}^{2}$ is given by
$${\widehat{\sigma}}_{\widehat{D}}^{2}=\frac{1}{C}\sum _{i=1}^{C}{L}_{1}^{2}({w}_{i};{\widehat{F}}_{1},{\widehat{F}}_{1})+\frac{1}{C}\sum _{i=1}^{C}{L}_{2}^{2}({w}_{i};{\widehat{F}}_{1},{\widehat{F}}_{2}),$$
where
$${L}_{1}({w}_{i};{\widehat{F}}_{1},{\widehat{F}}_{1})=\mathrm{ln}{\widehat{f}}_{1}({w}_{i})-{\widehat{D}}_{11}({\widehat{f}}_{1}({w}_{i}),{\widehat{f}}_{1}({w}_{i}))\text{ and }{L}_{2}({w}_{i};{\widehat{F}}_{1},{\widehat{F}}_{2})=\mathrm{ln}{\widehat{f}}_{2}({w}_{i})-{\widehat{D}}_{12}({\widehat{f}}_{1}({w}_{i}),{\widehat{f}}_{2}({w}_{i})),\quad i=1,2,\mathrm{...},C,$$
and where in our case ${f}_{1}({w}_{i})=f({w}_{i})$ and ${f}_{2}({w}_{i})=f(-{w}_{i})$.
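The pieces above combine into the full test statistic: a sketch of our own (same Gaussian-kernel estimates as before) computing $\widehat{D}$, ${\widehat{\sigma}}_{\widehat{D}}^{2}$ and ${z}_{0}=\sqrt{C}\,\widehat{D}/{\widehat{\sigma}}_{\widehat{D}}$ for a sample with known median 0:

```python
import math

def gauss_kernel(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def f_hat(w, data, h):
    n = len(data)
    return sum(gauss_kernel((w - xj) / h) for xj in data) / (n * h)

def kl_symmetry_test(data, h):
    """Returns (D-hat, sigma2-hat, z0) for the Kullback-Leibler symmetry test."""
    c = int(math.sqrt(len(data)))
    delta = (max(data) - min(data)) / c
    bins = [min(data) + i * delta for i in range(c)]
    lf1 = [math.log(f_hat(w, data, h)) for w in bins]    # ln f-hat(w_i)
    lf2 = [math.log(f_hat(-w, data, h)) for w in bins]   # ln f-hat(-w_i)
    d11 = sum(lf1) / c
    d12 = sum(lf2) / c
    d = d11 - d12
    # Influence functions: L1_i = ln f1(w_i) - D11-hat, L2_i = ln f2(w_i) - D12-hat.
    s2 = sum((a - d11) ** 2 for a in lf1) / c + sum((b - d12) ** 2 for b in lf2) / c
    z0 = math.sqrt(c) * d / math.sqrt(s2)
    return d, s2, z0

# For an exactly symmetric sample, D-hat = 0 and hence z0 = 0.
base = [0.4, 0.9, 1.3, 2.1]
d, s2, z0 = kl_symmetry_test(base + [-v for v in base], h=0.6)
```

The test then rejects $H_0$ at level $\alpha$ when $z_0 > z_\alpha$, as described in Section 2.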
For discussions about different methods addressing the issue of the performance of kernel density estimation at the boundary, see Hall & Park [31].
As in Samawi et al. [9], to gain some insight into our procedure, a simulation study was conducted to investigate the performance of our new test of symmetry based on $\widehat{D}$. We compared our proposed test of symmetry with the tests proposed by McWilliams [5], Modarres & Gastwirth [32], Mira's [8] Bonferroni test, and Samawi et al. [9].
As in McWilliams [5], the runs test is described as follows: For any random sample of size $n$, let ${Y}_{(1)},{Y}_{(2)},\mathrm{...},{Y}_{(n)}$ denote the sample values ordered from smallest to largest according to their absolute value (signs are retained), and ${S}_{1},{S}_{2},\mathrm{...},{S}_{n}$ denote indicator variables designating the signs of the ${Y}_{(j)}$ values (${S}_{j}=1$ if ${Y}_{(j)}$ is nonnegative, 0 otherwise). The test statistic used for testing symmetry is ${R}^{*}$ = the number of runs in the ${S}_{1},{S}_{2},\mathrm{...},{S}_{n}$ sequence $=1+\sum _{j=2}^{n}{I}_{j}$, where
$${I}_{j}=\{\begin{array}{c}0\text{ if }{S}_{j}={S}_{j-1}\\ 1\text{ if }{S}_{j}\ne {S}_{j-1}\end{array}$$
We reject the null hypothesis if ${R}^{*}$ is smaller than a critical value ${c}_{\alpha}$ at the prespecified value of $\alpha$. Moreover, Mira's [8] Bonferroni test statistic is ${\gamma}_{1}({F}_{n})=2({\overline{X}}_{n}-{X}_{s:n})$, where ${X}_{s:n}=Median({X}_{1},{X}_{2},\mathrm{...},{X}_{n})$. The procedure is to reject the null hypothesis if
$$\left|{\gamma}_{1}({F}_{n})\right|\ge \frac{{a}_{n}}{\sqrt{n}}{S}_{c}({\gamma}_{1},{F}_{n}),$$
where ${a}_{n}\to {z}_{1-\frac{\alpha}{2}}$ as $n\to \infty$, ${S}_{c}^{2}({\gamma}_{1},{F}_{n})=4{\widehat{\sigma}}^{2}+{D}_{n,c}^{2}-4{D}_{n,c}{S}_{{\overline{\mu}}_{F}}$, ${\widehat{\sigma}}^{2}=\frac{1}{n-1}\sum _{i=1}^{n}{({X}_{i}-{\overline{X}}_{n})}^{2}$, ${S}_{{\overline{\mu}}_{F}}={\overline{X}}_{n}-\frac{2}{n}\sum _{i=1}^{n}{X}_{i}I({X}_{i}\le {X}_{s:n})$, ${D}_{n,c}=\frac{{n}^{-1/5}}{2c}\left({X}_{[(n/2)+c{n}^{4/5}]:n}-{X}_{[(n/2)-c{n}^{4/5}+1]:n}\right)$, and $c=0.5$.
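McWilliams's runs statistic ${R}^{*}$ described above is straightforward to compute; a minimal sketch of our own:

```python
def runs_statistic(x):
    """R* = 1 + number of sign changes after ordering the data by absolute
    value (signs retained); S_j = 1 if the j-th ordered value is nonnegative."""
    ordered = sorted(x, key=abs)
    s = [1 if v >= 0 else 0 for v in ordered]
    return 1 + sum(1 for j in range(1, len(s)) if s[j] != s[j - 1])

# Alternating signs after |.|-ordering give the maximum number of runs.
r = runs_statistic([1.0, -2.0, 3.0, -4.0, 5.0])   # R* = 5
```

Small values of ${R}^{*}$ indicate asymmetry, since positive and negative values then cluster at different distances from the center.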
The Modarres & Gastwirth [32] test is a hybrid test: a sign test in the first stage and a percentile-modified two-sample Wilcoxon test (see Gastwirth [33]) in the second stage. Finally, the Samawi et al. [9] test of symmetry is based on a kernel estimate of the overlap measure.
In the following simulation, SAS version 9.3 (proc kde; method=srot) is used. As in McWilliams [5], the generalized lambda distribution (see Ramberg & Schmeiser [34]) is used in our simulation with the following sets of parameters. To generate the observations we used
$${x}_{i}={\lambda}_{1}+\frac{1}{{\lambda}_{2}}\left({u}_{i}^{{\lambda}_{3}}-{(1-{u}_{i})}^{{\lambda}_{4}}\right),\quad i=1,\mathrm{...},n,$$
where ${u}_{i}$ is a uniform random number. The significance level used in the simulation is $\alpha =0.05$, with sample sizes $n$ = 30, 50, and 100. To investigate the Type I error, the symmetric distributions used in the simulation are the first case of the generalized lambda and the normal. Our simulation is based on 5000 simulated samples. The 95% confidence interval of the true probability of Type I error under the null hypothesis with $\alpha =0.05$ is (0.04396, 0.05504).
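The generalized lambda generator used in the simulation can be sketched as follows (our own implementation of the Ramberg & Schmeiser [34] inverse-CDF form; function names are ours):

```python
import random

def gld_quantile(u, lam1, lam2, lam3, lam4):
    """Inverse CDF: x = lam1 + (u^lam3 - (1-u)^lam4) / lam2."""
    return lam1 + (u ** lam3 - (1.0 - u) ** lam4) / lam2

def gld_sample(n, lam1, lam2, lam3, lam4, seed=0):
    """Generate n observations by feeding uniforms through the quantile function."""
    rng = random.Random(seed)
    return [gld_quantile(rng.random(), lam1, lam2, lam3, lam4) for _ in range(n)]

# Case #1 is symmetric (lam3 = lam4), so its median (u = 0.5) equals lam1 = 0.
m1 = gld_quantile(0.5, 0.0, 0.197454, 0.134915, 0.134915)
# Case #2 (lam3 = 1.4, lam4 = 0.25) is asymmetric: the u = 0.5 point is not 0.
m2 = gld_quantile(0.5, 0.0, 1.0, 1.4, 0.25)
```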
Table 1.1 shows the estimated probability of Type I error. Our test is an asymptotic test, with a slight bias in $D(\cdot ,\cdot)$ and in the variance estimate for small sample sizes. For sample sizes above 30, the test has an estimated probability of Type I error close to the nominal value 0.05. However, Bonferroni's test appears to be a conservative procedure, while the Modarres & Gastwirth test is slightly conservative for small sample sizes. Table 1.2 and Table 1.3 show that the $D(\cdot ,\cdot)$-based test is more powerful than the McWilliams [5], Bonferroni, Modarres & Gastwirth [32] and Samawi et al. [9] tests in all of the presented cases. The power advantage increases as the sample size increases.
| Distribution | n | Runs Test | Test Based on the Overlap | Bonferroni's $\gamma_1(F_n)$ | Modarres & Gastwirth (1998) $W_{0.80}$ | Test Based on Kullback-Leibler Information |
|---|---|---|---|---|---|---|
| Case #1 generalized lambda ($\lambda_1=0$, $\lambda_2=0.197454$, $\lambda_3=0.134915$, $\lambda_4=0.134915$; $\alpha_3=0$, $\alpha_4=3.0$) | 30 | 0.046 | 0.056 | 0.03 | 0.027 | 0.051 |
| | 50 | 0.052 | 0.051 | 0.032 | 0.044 | 0.047 |
| | 100 | 0.058 | 0.052 | 0.027 | 0.046 | 0.051 |
| Normal (0, 1) | 30 | 0.052 | 0.057 | 0.03 | 0.03 | 0.052 |
| | 50 | 0.048 | 0.055 | 0.03 | 0.043 | 0.051 |
| | 100 | 0.051 | 0.052 | 0.032 | 0.048 | 0.052 |

Table 1.1: Probability of Type I error under the null hypothesis (α = 0.05).
| Case # | n | Runs Test | Test Based on the Overlap | Bonferroni's $\gamma_1(F_n)$ | Modarres & Gastwirth (1998) $W_{0.80}$ | Test Based on Kullback-Leibler Information |
|---|---|---|---|---|---|---|
| 2 ($\lambda_1=0$, $\lambda_2=1$, $\lambda_3=1.4$, $\lambda_4=0.25$; $\alpha_3=0.5$, $\alpha_4=2.2$) | 30 | 0.282 | 0.501 | 0.253 | 0.495 | 0.948 |
| | 50 | 0.456 | 0.839 | 0.352 | 0.941 | 0.992 |
| | 100 | 0.781 | 0.999 | 0.5 | 1 | 1 |
| 3 ($\lambda_1=0$, $\lambda_2=1$, $\lambda_3=0.00007$, $\lambda_4=0.1$; $\alpha_3=1.5$, $\alpha_4=5.8$) | 30 | 0.444 | 0.846 | 0.508 | 0.61 | 0.98 |
| | 50 | 0.678 | 0.953 | 0.756 | 0.99 | 0.999 |
| | 100 | 0.913 | 1 | 0.966 | 1 | 1 |
| 4 ($\lambda_1=3.586508$, $\lambda_2=0.04306$, $\lambda_3=0.025213$, $\lambda_4=0.094029$; $\alpha_3=0.9$, $\alpha_4=4.2$) | 30 | 0.12 | 0.38 | 0.154 | 0.179 | 0.684 |
| | 50 | 0.134 | 0.541 | 0.26 | 0.474 | 0.854 |
| | 100 | 0.245 | 0.761 | 0.488 | 0.845 | 0.946 |
| 5 ($\lambda_1=0$, $\lambda_2=1$, $\lambda_3=0.0075$, $\lambda_4=0.03$; $\alpha_3=1.5$, $\alpha_4=7.5$) | 30 | 0.141 | 0.451 | 0.231 | 0.247 | 0.81 |
| | 50 | 0.201 | 0.601 | 0.41 | 0.652 | 0.92 |
| | 100 | 0.336 | 0.839 | 0.741 | 0.954 | 0.98 |

Table 1.2: Power of the Kullback-Leibler information based test, compared with other tests, under alternative hypotheses (α = 0.05).
| Case # | n | Runs Test | Test Based on the Overlap | Bonferroni's $\gamma_1(F_n)$ | Modarres & Gastwirth (1998) $W_{0.80}$ | Test Based on Kullback-Leibler Information |
|---|---|---|---|---|---|---|
| 6 ($\lambda_1=0.116734$, $\lambda_2=0.351663$, $\lambda_3=0.13$, $\lambda_4=0.16$; $\alpha_3=0.8$, $\alpha_4=11.4$) | 30 | 0.051 | 0.161 | 0.034 | 0.033 | 0.191 |
| | 50 | 0.055 | 0.174 | 0.04 | 0.055 | 0.225 |
| | 100 | 0.053 | 0.21 | 0.059 | 0.12 | 0.331 |
| 7 ($\lambda_1=0$, $\lambda_2=1$, $\lambda_3=0.1$, $\lambda_4=0.18$; $\alpha_3=2.0$, $\alpha_4=21.2$) | 30 | 0.101 | 0.189 | 0.091 | 0.092 | 0.452 |
| | 50 | 0.111 | 0.241 | 0.155 | 0.21 | 0.611 |
| | 100 | 0.122 | 0.361 | 0.336 | 0.478 | 0.737 |
| 8 ($\lambda_1=0$, $\lambda_2=1$, $\lambda_3=0.001$, $\lambda_4=0.13$; $\alpha_3=3.16$, $\alpha_4=23.8$) | 30 | 0.544 | 0.98 | 0.643 | 0.655 | 0.993 |
| | 50 | 0.752 | 0.999 | 0.888 | 0.992 | 1 |
| | 100 | 0.961 | 1 | 0.996 | 1 | 1 |
| 9 ($\lambda_1=0$, $\lambda_2=1$, $\lambda_3=0.0001$, $\lambda_4=0.17$; $\alpha_3=3.88$, $\alpha_4=40.7$) | 30 | 0.571 | 1 | 0.685 | 0.676 | 0.993 |
| | 50 | 0.81 | 1 | 0.916 | 0.995 | 0.999 |
| | 100 | 0.963 | 1 | 0.999 | 1 | 1 |

Table 1.3: Power of the overlap based test and runs test under alternative hypotheses (α = 0.05).
Note: The values of skewness $({\alpha}_{3})$ and kurtosis $({\alpha}_{4})$ are from McWilliams [5].
Illustration Using Base Deficit Data
We applied our new test procedure of symmetry to the base deficit (BD) data as in Samawi et al. [9]. The base deficit score refers to a deficit of "base" present in the blood. Base deficit scores were first established by Davis et al. [35]. The base deficit score has been found to be correlated with many variables in the trauma population, such as mechanism of injury, the presence of intra-abdominal injury, transfusion requirements, mortality, the risk of complications, and the number of days spent in the intensive care unit, as indicated by Tremblay et al. [36] and Davis et al. [37].
The samples used in this illustration are part of the data collected in a retrospective study of the trauma registry at a level 1 trauma center between January 1998 and May 2000. The primary concern was to determine at what point we can differentiate between life and death based on a base deficit score. A first step in this analysis is to determine whether there is a difference in location for the base deficit score of those who survive and those who fail to survive. As is frequently the case in such studies, the underlying distribution is assumed "normal" or at least symmetric, and a t-test or a nonparametric test would be performed without checking the assumptions. In either case a test of symmetry is almost never considered as a means of determining how one may proceed in the analysis. Based on the conclusions of a test of symmetry, the analyst can choose the most powerful test for location. The goal is to test the hypothesis that, on average, the base deficit score is the same for those who survive and those who fail to survive their injuries. The injuries of interest in this group of patients are either penetrating injury or blunt injury. However, before deciding on the test procedure, we need to check the assumptions about the underlying distribution of the base deficit for both the penetrating injury and blunt injury groups of patients. In particular, the assumption of symmetry of the underlying distribution needs to be verified. The data are centered about the estimated measure of location to perform the tests of symmetry.
Figure 1.1 and Figure 1.2 show the box plots for the penetrating injury and blunt injury groups for dead and alive patients, respectively. Clearly there is some asymmetry in all four distributions. Also, Table 2.1 and Table 2.2 show summary statistics for the penetrating injury and blunt injury groups for dead and alive patients, respectively. Table 2.3 shows the overlap based test, the runs test and the proposed test of symmetry based on the Kullback-Leibler information for the underlying distributions of patients discharged alive and dead patients with blunt trauma and penetrating trauma. We reject the assumption of symmetry for the underlying distributions of these groups.
| Type of Wound | Statistic | Value | Std. Error |
|---|---|---|---|
| Penetrating | Mean | -10.81 | 0.846 |
| | 95% Confidence Interval for Mean, Lower Bound | -12.49 | |
| | 95% Confidence Interval for Mean, Upper Bound | -9.12 | |
| | 5% Trimmed Mean | -10.68 | |
| | Median | -10 | |
| | Variance | 52.904 | |
| | Std. Deviation | 7.274 | |
| | Minimum | -29 | |
| | Maximum | 9 | |
| | Range | 38 | |
| | Interquartile Range | 10 | |
| | Skewness | 0.21 | 0.279 |
| | Kurtosis | 0.102 | 0.552 |
| Blunt | Mean | -7.59 | 0.444 |
| | 95% Confidence Interval for Mean, Lower Bound | -8.46 | |
| | 95% Confidence Interval for Mean, Upper Bound | -6.71 | |
| | 5% Trimmed Mean | -7.3 | |
| | Median | -6 | |
| | Variance | 60.65 | |
| | Std. Deviation | 7.788 | |
| | Minimum | -37 | |
| | Maximum | 23 | |
| | Range | 60 | |
| | Interquartile Range | 10 | |
| | Skewness | 0.518 | 0.139 |
| | Kurtosis | 1.368 | 0.277 |

Table 2.1: Summary statistics for base deficit for dead patients.
| Type of Wound | Statistic | Value | Std. Error |
|---|---|---|---|
| Penetrating | Mean | -3.52 | 0.202 |
| | 95% Confidence Interval for Mean, Lower Bound | -3.91 | |
| | 95% Confidence Interval for Mean, Upper Bound | -3.12 | |
| | 5% Trimmed Mean | -3.06 | |
| | Median | -2.7 | |
| | Variance | 24.683 | |
| | Std. Deviation | 4.968 | |
| | Minimum | -28 | |
| | Maximum | 12 | |
| | Range | 40 | |
| | Interquartile Range | 5 | |
| | Skewness | 1.75 | 0.099 |
| | Kurtosis | 5.079 | 0.199 |
| Blunt | Mean | -1.8 | 0.059 |
| | 95% Confidence Interval for Mean, Lower Bound | -1.92 | |
| | 95% Confidence Interval for Mean, Upper Bound | -1.69 | |
| | 5% Trimmed Mean | -1.61 | |
| | Median | -1.3 | |
| | Variance | 11.601 | |
| | Std. Deviation | 3.406 | |
| | Minimum | -27 | |
| | Maximum | 13 | |
| | Range | 40 | |
| | Interquartile Range | 3 | |
| | Skewness | 1.22 | 0.043 |
| | Kurtosis | 4.39 | 0.085 |

Table 2.2: Summary statistics for base deficit for alive patients.

| Test | Injury Type | N | Test Statistic | Significance |
|---|---|---|---|---|
| Kullback-Leibler Information | Penetrating - Dead | 74 | 3.989 | <0.0001 |
| | Penetrating - Alive | 603 | 13.057 | <0.0001 |
| Overlap test* | Penetrating - Dead | 74 | 2.09 | 0.0183 |
| | Penetrating - Alive | 603 | 16.928 | <0.0001 |
| Runs test* | Penetrating - Dead | 74 | 2.065 | 0.0195 |
| | Penetrating - Alive | 603 | 16.41 | <0.0001 |
| Kullback-Leibler Information | Blunt - Dead | 306 | 13.92 | <0.0001 |
| | Blunt - Alive | 3275 | 8.053 | <0.0001 |
| Overlap test* | Blunt - Dead | 306 | 13.264 | <0.0001 |
| | Blunt - Alive | 3275 | 79.074 | <0.0001 |
| Runs test* | Blunt - Dead | 306 | 10.29 | <0.0001 |
| | Blunt - Alive | 3275 | 52.405 | <0.0001 |

Table 2.3: Tests of symmetry with summary statistics.
Figure 1.1: Box plot of base deficit for dead patients.
Figure 1.2: Box plot of base deficit for alive patients.
The proposed test of symmetry based on the Kullback-Leibler information appears to outperform the other tests of symmetry in the literature in terms of power. Our test is more sensitive in detecting slight asymmetry in the underlying distribution than other tests proposed in the literature. Moreover, the kernel density estimation literature is very rich, and many of the proposed and improved methods are available in statistical software such as SAS, S-Plus, Stata and R. Since the Kullback-Leibler information can be used in multivariate as well as univariate cases, our proposed test of symmetry can be extended to multivariate settings for diagonal symmetry, conditional symmetry and other types of symmetry.
References
 Butler CC (1969) A test for symmetry using the sample distribution function. The Annals of Mathematical Statistics 40: 2211-2214.
 Rothman ED, Woodroofe M (1972) A Cramér-von Mises type statistic for testing symmetry. The Annals of Mathematical Statistics 43: 2035-2038.
 Hill DL, Rao PV (1977) Tests of symmetry based on Cramér-von Mises statistics. Biometrika 64(3): 489-494.
 Baklizi A (2003) A conditional distribution-free runs test for symmetry. Journal of Nonparametric Statistics 15(6): 713-718.
 McWilliams TP (1990) A distribution-free test of symmetry based on a runs statistic. Journal of the American Statistical Association 85(412): 1130-1133.
 Tajuddin IH (1994) Distribution-free test for symmetry based on the Wilcoxon two-sample test. Journal of Applied Statistics 21(5): 409-415.
 Modarres R, Gastwirth JL (1996) A modified runs test of symmetry. Statistics & Probability Letters 31(2): 107-112.
 Mira A (1999) Distribution-free test for symmetry based on Bonferroni's measure. Journal of Applied Statistics 26(8): 959-971.
 Samawi HM, Helu A, Vogel R (2011) A nonparametric test of symmetry based on the overlapping coefficient. Journal of Applied Statistics 38(5): 885-898.
 Samawi HM, Helu A (2011) Distribution-free runs test for conditional symmetry. Communications in Statistics - Theory and Methods 40(15): 2709-2718.
 Kullback S, Leibler RA (1951) On information and sufficiency. Annals of Mathematical Statistics 22(1): 79-86.
 Alizadeh Noughabi H, Arghami NR (2011a) Testing exponentiality using transformed data. Journal of Statistical Computation and Simulation 81(4): 511-516.
 Alizadeh Noughabi H, Arghami NR (2011b) Monte Carlo comparison of five exponentiality tests using different entropy estimates. Journal of Statistical Computation and Simulation 80(11): 1579-1592.
 Dmitriev YuG, Tarasenko FP (1973) On the estimation of functions of the probability density and its derivatives. Theory of Probability and Its Applications 18: 628-633.
 Joe H (1989) On the estimation of entropy and other functionals of a multivariate density. Annals of the Institute of Statistical Mathematics 41(4): 683-697.
 Ahmad IA, Lin PE (1976) A nonparametric estimation of the entropy for absolutely continuous distributions. IEEE Transactions on Information Theory 22(3): 372-375.
 Hall P, Morton SC (1993) On the estimation of entropy. Annals of the Institute of Statistical Mathematics 45(1): 69-88.
 Tarasenko FP (1968) On the evaluation of an unknown probability density function, the direct estimation of the entropy from independent observations of a continuous random variable, and the distribution-free entropy test of goodness-of-fit. Proceedings of the IEEE 56(11): 2052-2053.
 Beirlant J (1985) Limit theory for spacing statistics from general univariate distributions. Pub Inst Stat Univ Paris XXXI fasc 1: 27-57.
 Hall P (1984) Limit theorems for sums of general functions of m-spacings. Mathematical Proceedings of the Cambridge Philosophical Society 96(3): 517-532.
 Cressie N (1977) The minimum of higher order gaps. Australian Journal of Statistics 19(2): 132-143.
 Dudewicz E, van der Meulen E (1981) Entropy-based tests of uniformity. Journal of the American Statistical Association 76(376): 967-974.
 Beirlant J, van Zuijlen MCA (1985) The empirical distribution function and strong laws for functions of order statistics of uniform spacings. Journal of Multivariate Analysis 16(3): 300-317.
 Vasicek O (1976) A test for normality based on sample entropy. Journal of the Royal Statistical Society, Series B 38(1): 54-59.
 Bowman AW (1992) Density based tests for goodness-of-fit. Journal of Statistical Computation and Simulation 40: 1-13.
 Alizadeh Noughabi H (2010) A new estimator of entropy and its application in testing normality. Journal of Statistical Computation and Simulation 80(10): 1151-1162.
 Silverman BW (1986) Density Estimation for Statistics and Data Analysis. Chapman and Hall, London.
 Wand MP, Jones MC (1995) Kernel Smoothing. Chapman and Hall, London.
 Serfling RJ (1980) Approximation Theorems of Mathematical Statistics. John Wiley & Sons, New York.
 Samawi HM, Woodworth GG, Lemke J (1998) Power estimation for two-sample tests using importance and antithetic resampling. Biometrical Journal 40(3): 341-354.
 Hall P, Park BU (2002) New methods for bias correction at endpoints and boundaries. The Annals of Statistics 30(5): 1460-1479.
 Modarres R, Gastwirth JL (1998) Hybrid test for the hypothesis of symmetry. Journal of Applied Statistics 25(6): 777-783.
 Gastwirth JL (1965) Percentile modifications of two-sample rank tests. Journal of the American Statistical Association 60(312): 1127-1141.
 Ramberg JS, Schmeiser BW (1974) An approximate method for generating asymmetric random variables. Communications of the ACM 17: 78-82.
 Davis JW, Shackford SR, Mackersie RC, Hoyt DB (1988) Base deficit as a guide to volume resuscitation. Journal of Trauma 28(10): 1464-1467.
 Tremblay LN, Feliciano DV, Rozycki GS (2002) Assessment of initial base deficit as a predictor of outcome: mechanism of injury does make a difference. The American Surgeon 68(8): 689-694.
 Davis JW, Mackersie RC, Holbrook TL, Hoyt DB (1991) Base deficit as an indicator of significant abdominal injury. Annals of Emergency Medicine 20(8): 842-844.