eISSN: 2378-315X

Biometrics & Biostatistics International Journal

Research Article Volume 9 Issue 2

A New coefficient of Skewness for grouped data

Mahmoud.A. Eltehiwy,1 Abu-Bakr A. AbdulMotaal2

1Faculty of Politics and Economics, Department of Statistics, Beni Suef University, Egypt
2Faculty of Commerce, Department of quantitative methods, South Valley University, Egypt

Correspondence: Mahmoud.A.Eltehiwy, Faculty of Politics and Economics, Department of Statistics, Beni Suef University, Egypt

Received: February 17, 2020 | Published: April 6, 2020

Citation: Eltehiwy MA, Abdul-Motaal ABA. A New coefficient of Skewness for grouped data. Biom Biostat Int J. 2020;9(2):54-59. DOI: 10.15406/bbij.2020.09.00300

Abstract

The primary objective of this paper is to introduce a new measure for detecting skewness in grouped data that is simpler to apply than the current measures. The proposed coefficient of skewness is based on the cumulative frequencies and hence uses more information from the tails of the distribution, which makes it more appropriate for detecting asymmetry in the data. Another advantage of the new statistic is that it is bounded by -1 and +1, so its values can be interpreted easily. A simulation study compares the performance of the proposed coefficient with that of three classical measures of skewness from the literature, using the mean square error (MSE) and the mean absolute error (MAE). The simulation study strongly supports the use of the proposed measure for comparing the degrees of skewness of different frequency distributions.

Keywords: coefficient of skewness, symmetry, mean absolute error, mean square error.

Introduction

Skewness is usually described with reference to symmetry. On the other hand, symmetry is not usually defined clearly, and it is assumed that everyone understands it. There may be many definitions of symmetry depending on the areas where it is used. As Murphy1 explains, any statement about symmetry of a structure must be made with reference to some principle of symmetry: a point, a line, an axis. In statistical distributions, the significant point or axis is taken as the center of a distribution. Thus, for the unimodal case, the mass is distributed evenly around the center in a symmetrical distribution. As explained in many statistics textbooks and elsewhere, in a symmetrical distribution the three popular measures of center (or central tendency), namely the mean, median and mode, coincide at the center. This equality can be considered the most important characteristic of a unimodal symmetric distribution. Thus a deviation from the symmetry condition is called asymmetry, or simply skewness, according to Arnold and Groeneveld.2 In a positively skewed distribution, the ordering of the measures of central tendency generally occurs as mode < median < mean, with the reverse ordering in negatively skewed distributions. The mean-median-mode inequality has been investigated by Groeneveld and Meeden,3 Runnenburg,4 MacGillivray,5 van Zwet,6 Abdous and Theodorescu,7 Abadir,8 and von Hippel,9 among others, for both continuous and discrete distributions. These studies show that, although there are some exceptions, the mean-median-mode inequality generally holds in unimodal continuous distributions.

However, there are many counter-examples for the mean-median-mode ordering in discrete distributions. Despite the fact that the mean-median-mode inequality is not universal, many measures of skewness are based on this inequality, to be more precise, on the difference between the location parameters in asymmetrical distributions.

As Arnold and Groeneveld10 explain, several measures of skewness had been proposed by 1920. Let $\mu$ denote the mean, $m$ the median, $M$ the mode, $\sigma$ the standard deviation, and $Q_1$ and $Q_3$ the first and third quartiles, respectively. The measures are as follows:

1- Pearson’s coefficient of skewness ($K_1$):

$$K_1 = \frac{\mu - M}{\sigma}$$

The numerical value given by this coefficient usually varies between $\pm 3$. In fact, the mode is the least reliable measure of average, as it is strongly affected by grouping errors. Therefore, this coefficient is unreliable.

2- Pearson’s second coefficient ($K_2$):

$$K_2 = \frac{3(\mu - m)}{\sigma}$$

This coefficient also has limits of $\pm 3$ and is used when the mode cannot be properly defined. In fact, both of Karl Pearson’s coefficients give too much importance to the extreme values.

3- Bowley’s coefficient ($B$):

$$B = \frac{Q_1 + Q_3 - 2\,\mathrm{Median}}{Q_3 - Q_1}$$

where $Q_1$ and $Q_3$ are the first and third quartiles, respectively. This measure has the defect that it fails to take into consideration the magnitude of the extreme values; strictly speaking, it measures the skewness of the middle half and not of the whole distribution. As such, this measure is rarely used for determining asymmetry. The numerical value of this coefficient varies between $\pm 1$.

4- The standardized third central moment:

$$\gamma_1 = \frac{\mu_3}{\sigma^3}$$

where $\mu_3$ is the third central moment. The measure of skewness based on the moments is a very good measure. However, it is unpopular because it is difficult to calculate and because it gives too much importance to the extreme values.

Although several other measures, generally extensions of the above coefficients, were introduced later, the early measures are still used today; in particular, $\gamma_1$ (or one of its variants) is widely used in statistical software. The first two measures of skewness are apparently based on the mean-median-mode inequality generally encountered in asymmetrical distributions. In cases where the inequality does not hold, the skewness coefficients may give contradictory results.
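As an illustration of the classical measures listed above, the following Python sketch computes $K_2$, $B$ and $\gamma_1$ for an ungrouped sample. It is a minimal sketch and not the authors' code: the function name `classical_skewness` is hypothetical, the population standard deviation is assumed for $\sigma$, the quartiles follow the default rule of Python's `statistics.quantiles`, and Bowley's coefficient is taken with the standard denominator $Q_3 - Q_1$. $K_1$ is omitted because the mode of ungrouped data depends on a grouping rule.

```python
import statistics


def classical_skewness(sample):
    """Return (K2, B, gamma1) for a list of numbers.

    Assumptions (not from the paper): population standard deviation,
    quartiles from statistics.quantiles with its default method.
    """
    mean = statistics.fmean(sample)
    median = statistics.median(sample)
    sigma = statistics.pstdev(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4)

    # Pearson's second coefficient: 3 (mean - median) / sigma
    k2 = 3 * (mean - median) / sigma
    # Bowley's coefficient: (Q1 + Q3 - 2 median) / (Q3 - Q1)
    bowley = (q1 + q3 - 2 * median) / (q3 - q1)
    # Standardized third central moment: mu_3 / sigma^3
    mu3 = sum((x - mean) ** 3 for x in sample) / len(sample)
    gamma1 = mu3 / sigma ** 3
    return k2, bowley, gamma1


# Hypothetical samples, chosen only for illustration
symmetric = classical_skewness([1, 2, 3, 4, 5])
right_skewed = classical_skewness([1, 1, 2, 2, 3, 10])
```

For the symmetric sample all three values are zero, while the right-skewed sample gives three positive values.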

In the light of the above argument, it is proposed in this paper to develop a new measure of skewness which takes into account the entire set of values, a neglected measure of central tendency and is simpler than the current measures in its application.

In the next section, a new method for measuring skewness is developed. In sections 3 and 4, frequency distributions simulated under different conditions of symmetry and asymmetry are used to compare the performance of the proposed coefficient of skewness with that of Pearson’s and Bowley’s coefficients by Monte Carlo simulation. An empirical example using the General Social Surveys data is given in section 5, and section 6 concludes.

Proposed measure of skewness

For a frequency distribution, let:

$C$ be the number of classes,

$f_i$ be the frequency of the $i$th class, $i = 1, 2, 3, \ldots, C$,

$F_i$ be the cumulative frequency of the $i$th class, obtained by adding its frequency to the frequencies of all classes below it.

The proposed measure of skewness is defined in terms of $F$, where

$$F = \sum_{i=1}^{C} F_i,$$

and is based on the assumption that the frequency distribution has classes of equal width, none of which has a frequency of zero.
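The definition of $F$ can be sketched in a few lines of Python. The function name `sum_of_cumulative_frequencies` and the example frequencies are hypothetical, chosen only for illustration; the check for zero frequencies mirrors the assumption stated above.

```python
from itertools import accumulate


def sum_of_cumulative_frequencies(freqs):
    """Return F, the sum of the cumulative frequencies F_1, ..., F_C."""
    if any(f <= 0 for f in freqs):
        raise ValueError("every class must have a positive frequency")
    # accumulate() yields F_1, F_2, ..., F_C in order
    return sum(accumulate(freqs))


# Hypothetical distribution with C = 3 classes: F_i = 2, 7, 10
F = sum_of_cumulative_frequencies([2, 5, 3])
```

Here the cumulative frequencies are 2, 7 and 10, so $F = 19$.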

Some properties of F

Some properties of $F$ are now established; they are used to define the proposed measure of skewness, which will be denoted by $A$.

Theorem (1)

For a frequency distribution of equal classes with $f_i \neq 0$, $i = 1, 2, \ldots, C$, the lowest and highest values of $F$ are respectively given by:

$$F_L = \frac{C(C-1)}{2} + f \quad \text{and} \quad F_H = \frac{C(C-1)}{2} + C(f - C + 1),$$

where $C$ is the number of classes, $f = \sum_{i=1}^{C} f_i$, and $f_i$ is as defined above.

Proof

The lowest value of $F$, $F_L$, is achieved when each of the first $(C-1)$ classes has a frequency of one and the last class has a frequency of $(f-C+1)$, i.e., when:

$f_i = 1$ for $i = 1, 2, \ldots, (C-1)$,

and $f_C = f - C + 1$.

That is,

$$F_L = \sum_{i=1}^{C} F_i = \sum_{i=1}^{C-1} F_i + F_C = \sum_{i=1}^{C-1} i + F_C$$

Since $F_C = f$,

Then

$$F_L = \frac{C(C-1)}{2} + f \tag{1}$$

The highest value of $F$, $F_H$, is achieved when:

$f_1 = f - C + 1$,

and $f_i = 1$ for $i = 2, 3, \ldots, C$.

That is,

$$F_H = \sum_{i=1}^{C} F_i = \sum_{i=1}^{C-1} i + \sum_{i=1}^{C} (f - C + 1),$$

Then

$$F_H = \frac{C(C-1)}{2} + C(f - C + 1) \tag{2}$$
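Theorem (1) can be checked by brute force for small cases. The sketch below enumerates every way of distributing $f$ observations over $C$ classes with no empty class and compares the smallest and largest $F$ with (1) and (2). The helper names and the values $f = 10$, $C = 4$ are assumptions made for illustration.

```python
from itertools import accumulate, combinations


def positive_compositions(total, parts):
    """Yield every list of `parts` positive integers that sums to `total`."""
    # Choose parts-1 cut points among the total-1 gaps (stars and bars).
    for cuts in combinations(range(1, total), parts - 1):
        edges = (0, *cuts, total)
        yield [b - a for a, b in zip(edges, edges[1:])]


def sum_of_cumulative_frequencies(freqs):
    """Return F, the sum of the cumulative frequencies."""
    return sum(accumulate(freqs))


f, C = 10, 4  # illustrative values only
all_F = [sum_of_cumulative_frequencies(c) for c in positive_compositions(f, C)]

F_L = C * (C - 1) // 2 + f                  # formula (1)
F_H = C * (C - 1) // 2 + C * (f - C + 1)    # formula (2)
bounds_match = (min(all_F), max(all_F)) == (F_L, F_H)
```

For these values the formulas give $F_L = 16$ and $F_H = 34$, and the enumeration attains exactly these extremes.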

Theorem (2)

For a symmetrical frequency distribution, the value of $F$ is always equal to $f(C+1)/2$.

Proof

For any frequency distribution, $F$ can be expressed as follows:

$$F = \sum_{i=1}^{C} i f_{C-i+1} \tag{3}$$

Since $f_i = f_{C-i+1}$ for a symmetrical distribution, $F$ can, for a symmetrical distribution, also be written as:

$$F = \sum_{i=1}^{C} i f_i \tag{4}$$

Summing (4) with (3) gives

$$\begin{aligned}
2F &= \sum_{i=1}^{C} i \left( f_i + f_{C-i+1} \right) \\
   &= (f_1 + f_C) + 2(f_2 + f_{C-1}) + 3(f_3 + f_{C-2}) + \cdots + C(f_C + f_1) \\
   &= (C+1) f
\end{aligned}$$

That is,

$$F = f(C+1)/2 \tag{5}$$
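The last step of the proof, from the expanded sum to $(C+1)f$, can be made explicit by collecting the coefficient of each frequency. The following worked step is a sketch; the index $j$ is introduced here only for illustration and is not part of the original notation.

```latex
\begin{aligned}
2F &= \sum_{i=1}^{C} i f_i + \sum_{i=1}^{C} i f_{C-i+1} \\
   &= \sum_{j=1}^{C} j f_j + \sum_{j=1}^{C} (C-j+1) f_j
      && \text{(substituting } j = C-i+1 \text{ in the second sum)} \\
   &= \sum_{j=1}^{C} (C+1) f_j = (C+1) f .
\end{aligned}
```

Each frequency $f_j$ therefore appears with total weight $j + (C - j + 1) = C + 1$, which is why the sum does not depend on how the observations are spread across the classes once symmetry is assumed.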

Corollary

In terms of $F_L$ and $F_H$, the value of $F$ for a symmetrical distribution is given by:

$$F = \frac{F_L + F_H}{2} \tag{6}$$

Proof

By Theorems (1) and (2), (6) follows from (1), (2) and (5).
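The algebra behind (6) is short enough to write out. The following sketch substitutes (1) and (2) into the right-hand side of (6) and recovers (5):

```latex
\begin{aligned}
\frac{F_L + F_H}{2}
  &= \frac{1}{2}\left[ \frac{C(C-1)}{2} + f + \frac{C(C-1)}{2} + C(f-C+1) \right] \\
  &= \frac{1}{2}\left[ C(C-1) + f + Cf - C(C-1) \right] \\
  &= \frac{f(C+1)}{2} .
\end{aligned}
```

The second line uses $C(f - C + 1) = Cf - C(C-1)$.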

Now using formulas (1), (2), (5) and (6), the proposed coefficient of skewness ($A$) is defined so that its value lies between $\pm 1$. That is,

$$A = \frac{2F_{ob} - f(C+1)}{(f - C)(C - 1)}, \tag{7}$$

where $F_{ob}$ is the observed value of $F$ for the frequency distribution, $F = \sum_{i=1}^{C} i f_{C-i+1}$, and $f$ and $C$ are as defined above.
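Formula (7) can be sketched directly in Python. The function name `skewness_coefficient_a` and the three example distributions are hypothetical; the guards reflect the stated assumption of no zero-frequency class and the fact that the denominator vanishes when $f = C$ or $C = 1$.

```python
from itertools import accumulate


def skewness_coefficient_a(freqs):
    """Proposed coefficient A, formula (7), from class frequencies f_1..f_C."""
    C = len(freqs)
    f = sum(freqs)
    if C < 2 or any(x <= 0 for x in freqs):
        raise ValueError("need at least two classes, none with zero frequency")
    if f == C:
        raise ValueError("A is undefined when every class has frequency one")
    F_ob = sum(accumulate(freqs))  # observed sum of cumulative frequencies
    return (2 * F_ob - f * (C + 1)) / ((f - C) * (C - 1))


# Hypothetical frequency distributions, for illustration only
a_symmetric = skewness_coefficient_a([3, 7, 10, 7, 3])
a_bulk_low = skewness_coefficient_a([6, 1, 1, 1, 1])
a_bulk_high = skewness_coefficient_a([1, 1, 1, 1, 6])
```

The three distributions give $A = 0$, $A = 1$ and $A = -1$, matching Theorem (2) and the two extreme cases used in the proof of Theorem (1).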

The notion on which the proposed coefficient of skewness is based is that the larger the value of $F_{ob}$, the more likely it is that the bulk of the items have low values, and hence the stronger the evidence that the frequency distribution is positively skewed, and vice versa. In addition, the closer the value of $F_{ob}$ is to $f(C+1)/2$, the more likely it is that the frequency distribution is symmetrical.

Simulation study

In this study, only three well-known measures of skewness are considered for comparison with the proposed measure. These three measures are based either on Karl Pearson's method of measuring skewness or on Bowley's method. Simulated frequency distributions under different conditions of symmetry and asymmetry provided an opportunity to compare the performance of the proposed coefficient of skewness with that of Pearson's and Bowley's coefficients.

The simulations

  1. For symmetrical distributions: 1000 samples, each of size 500 observations, were generated from the normal distribution using the R program. The 500 observations for each sample were then grouped into a frequency distribution of equal classes, each of width 2. Then, using the Kolmogorov-Smirnov test, it was determined that each frequency distribution included in the analysis was consistent with the normal distribution. It was also determined that none of the frequency distributions considered had any class with zero frequency.
  2. For skewed distributions: 5000 samples, each of size 200 observations, were generated from a chi-square distribution of 10 degrees of freedom using the statistical program R. The 200 observations for each sample were then grouped into a frequency distribution of equal classes, each of width 2.

The following frequency distributions were discarded from the analysis and replaced by suitable ones:

  1. Frequency distributions found inconsistent with a chi-square distribution of 10 degrees of freedom.
  2. Frequency distributions with any class of a frequency zero.
  3. Bimodal frequency distributions.

This whole operation was then repeated with a chi-square distribution of 15 degrees of freedom. The number of classes was set at 11 and 13 for the chi-square distributions with 10 and 15 degrees of freedom respectively.
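The grouping and screening steps for the skewed case can be sketched as follows. This is only an illustration under stated assumptions, not the authors' R code: it uses Python's standard library, treats the last class as open-ended, uses a simple "rise to a peak, then fall" rule for unimodality, and omits the goodness-of-fit screening. All function names and default arguments are hypothetical.

```python
import random


def group_into_classes(sample, n_classes, width=2.0):
    """Count observations in [0, w), [w, 2w), ...; the last class is open-ended (assumption)."""
    freqs = [0] * n_classes
    for x in sample:
        idx = min(int(x // width), n_classes - 1)
        freqs[idx] += 1
    return freqs


def is_unimodal(freqs):
    """True if frequencies never fall before the first peak and never rise after it."""
    peak = freqs.index(max(freqs))
    rising = all(a <= b for a, b in zip(freqs[:peak], freqs[1:peak + 1]))
    falling = all(a >= b for a, b in zip(freqs[peak:], freqs[peak + 1:]))
    return rising and falling


def simulate_accepted(df, n_classes, n_obs=200, n_keep=3, seed=0):
    """Draw chi-square samples until n_keep grouped distributions pass the screening."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < n_keep:
        # A chi-square variate with df degrees of freedom is Gamma(shape=df/2, scale=2).
        sample = [rng.gammavariate(df / 2, 2.0) for _ in range(n_obs)]
        freqs = group_into_classes(sample, n_classes)
        # Discard distributions with an empty class or more than one mode.
        if 0 not in freqs and is_unimodal(freqs):
            kept.append(freqs)
    return kept


for freqs in simulate_accepted(df=10, n_classes=11):
    print(freqs)
```

Rejected samples are simply redrawn, which mirrors the replacement of discarded frequency distributions described above; a consistency test against the chi-square distribution would be added as a further condition in the same `if` statement.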

Methodology

Methods used: the coefficients of skewness considered in the analysis were:

  1. Pearson's two measures of skewness ($K_1$ and $K_2$).
  2. Bowley's coefficient (B).
  3. The proposed coefficient (A).

The four coefficients of skewness were computed for each frequency distribution of the three families of generated samples described above. Pearson's coefficients take values between $\pm 3$, whereas Bowley's coefficient, like the proposed coefficient, takes values between $\pm 1$. Therefore, to make a meaningful comparison, the measures 3A and 3B were considered instead of A and B respectively.

Criteria used for evaluation of various measures of skewness

For the measures of skewness considered, the mean-square error (MSE) and mean absolute error (MAE) were used for evaluating performance in the following manner:

(I) The Mean-Square Error (MSE):

Let $\hat{K}_1$, $\hat{K}_2$, $\hat{B}$ and $\hat{A}$ respectively denote the coefficients $K_1$, $K_2$, B and A when used as estimators of their corresponding true values $K_{1t}$, $K_{2t}$, $B_t$ and $A_t$.

(a) For symmetrical distributions, the MSE for each coefficient was obtained. The true value of any coefficient of skewness for a symmetrical distribution is equal to zero, that is:

$$K_{1t} = K_{2t} = B_t = A_t = 0$$

Therefore, the mean-square error (MSE) for Pearson's coefficient ($K_1$), for example, is given by:

$$\mathrm{MSE}(\hat{K}_1) = E\left(\hat{K}_1^{2}\right),$$

and similarly for the other measures of skewness.

(b) For the two skewed distributions, the MSE was obtained for each coefficient when used for estimating its corresponding true value. In this case, the MSE for Pearson's coefficient ($K_2$), for example, is given by:

$$\mathrm{MSE}(\hat{K}_2) = \mathrm{Var}(\hat{K}_2) + E\left(K_{2t} - E(\hat{K}_2)\right)^2,$$

and similarly for the other measures of skewness.
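The decomposition of the MSE into a variance term and a squared-bias term can be checked numerically. The sketch below is a hedged illustration only: the estimates are invented for the example, the function names are hypothetical, and the population variance (division by the number of estimates) is assumed so that the two forms agree exactly.

```python
from statistics import fmean, pvariance


def mse(estimates, true_value):
    """Average squared deviation of the estimates from the true value."""
    return fmean((e - true_value) ** 2 for e in estimates)


def mse_decomposed(estimates, true_value):
    """Variance of the estimates plus squared bias; equals mse() up to rounding."""
    bias = fmean(estimates) - true_value
    return pvariance(estimates) + bias ** 2


# Hypothetical estimates of K2 from five simulated samples (not the paper's data).
k2_hat = [0.38, 0.51, 0.44, 0.29, 0.47]
k2_true = 0.4414  # true K2 for the chi-square distribution with 10 d.f. (Table 3)

print(mse(k2_hat, k2_true), mse_decomposed(k2_hat, k2_true))
```

For a symmetrical distribution the true value is zero, so `mse(estimates, 0.0)` reduces to the average of the squared estimates, which corresponds to case (a).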

To determine the MSE and MAE for the different coefficients for the two chi-square distributions, it is necessary to determine the true values of these coefficients for both distributions, which in turn requires some measures to be computed (Table 2 shows the values of these measures). These measures were determined as follows:

  • The mean ($\mu$) and standard deviation ($\sigma$): it is known that they are $K$ and $\sqrt{2K}$ respectively for a chi-square distribution with $K$ degrees of freedom.
  • The median ($Q_2$), first quartile ($Q_1$) and third quartile ($Q_3$): these were determined to the required precision using the statistical package MathCad.
  • The mode: it can be proven that a chi-square distribution with $K$ degrees of freedom has a unique maximum at $X = K - 2$; that is, the mode of such a distribution is equal to $K - 2$.

The true values of Pearson's and Bowley's coefficients were then obtained (Table 3). In the case of the proposed coefficient, the expected frequencies for the theoretical chi-square distributions with 200 observations and equal classes of width 2 were first determined for 10 and 15 degrees of freedom using the statistical program R. Then, the true values of the proposed coefficient were obtained for each chi-square distribution considered (Table 3).

Since the true values of the various coefficients differ from each other, it is convenient to obtain the relative mean-square error (RMSE) for each coefficient by dividing its mean-square error by its true value, e.g. $\mathrm{RMSE}(\hat{K}_1) = \mathrm{MSE}(\hat{K}_1)/K_{1t}$. Note that RMSE here denotes the relative, not the root, mean-square error.
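As a quick check of this definition, the sketch below divides the errors reported in Table 4 for the chi-square distribution with 10 degrees of freedom by the true values in Table 3. Applying the same division to the MAE to obtain the RMAE is an assumption made by analogy with the RMSE; the dictionary and function names are hypothetical.

```python
# True values (Table 3) and errors (Table 4) for the chi-square distribution with 10 d.f.
true_values = {"K1": 0.4472, "K2": 0.4414, "3B": 0.3107, "3A": 0.8965}
mse = {"K1": 0.0493, "K2": 0.0190, "3B": 0.0361, "3A": 0.0216}
mae = {"K1": 0.1816, "K2": 0.1085, "3B": 0.1597, "3A": 0.1233}


def relative_error(error, true_value):
    """Error criterion divided by the true value of the coefficient."""
    return error / true_value


rmse = {k: relative_error(mse[k], true_values[k]) for k in true_values}
rmae = {k: relative_error(mae[k], true_values[k]) for k in true_values}

for k in true_values:
    print(f"{k}: RMSE = {rmse[k]:.4f}, RMAE = {rmae[k]:.4f}")
```

The resulting ratios agree with the S(10) rows of Table 5 up to the rounding of the tabulated inputs.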

It should be emphasized that the old and proposed coefficients of skewness are based on entirely different principles, and hence the results obtained will differ. Each coefficient, taken by itself, is therefore of little use; it becomes useful when we try to decide which of two distributions shows the greater degree of skewness. This was the main reason for considering two different skewed distributions. The proposed and old coefficients of skewness were used to compare the degrees of skewness of the two chi-square distributions, as described in the following point.

(c) To compare the degrees of skewness of the two chi-square distributions, the 50 frequency distributions constructed for each distribution were ranked from 1 to 50 according to their order of execution on the computer. Let $S(k)$, $k = 10, 15$, denote the degree of skewness of a chi-square distribution with $k$ degrees of freedom as obtained by a measure $S$. Then the ratio $R(S) = \frac{S(10)}{S(15)}$ was obtained for each pair of frequency distributions of the same rank. Thus, 50 values of $R$ were obtained for each coefficient of skewness considered. The MSE of $R$ was then determined for each coefficient when used for estimating the true value of its corresponding $R$. That is, the mean-square error of $R$ for the proposed coefficient, for example, takes the form:

$$\mathrm{MSE}\{R(3\hat{A})\} = \mathrm{Var}\{R(3\hat{A})\} + E\{R(3A_t) - E[R(3\hat{A})]\}^2,$$

and similarly for the other measures of skewness.

(II) The Mean Absolute Error (MAE):

The whole procedure described in (a), (b) and (c) was then repeated with the mean absolute error, obtained from the risk function

$$R_{\tau}(\theta) = E\left|T - \tau(\theta)\right|$$

where $T$ is an estimator of $\tau(\theta)$. For example, the mean absolute error of $\hat{K}_1$ is given by:

$$\mathrm{MAE}(\hat{K}_1) = E\left|\hat{K}_1 - K_{1t}\right|$$
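A minimal sketch of the MAE computation is given below, together with a simple consistency check: for any set of estimates the MAE cannot exceed the square root of the MSE, so reported pairs of values can be screened against that bound. The estimates are invented for illustration, the function name is hypothetical, and the (MSE, MAE) pairs are those reported in Table 1.

```python
from math import sqrt
from statistics import fmean


def mae(estimates, true_value):
    """Average absolute deviation of the estimates from the true value."""
    return fmean(abs(e - true_value) for e in estimates)


# Hypothetical estimates of K1 from five simulated samples (not the paper's data).
k1_hat = [0.31, 0.62, 0.45, 0.20, 0.55]
k1_true = 0.4472  # true K1 for the chi-square distribution with 10 d.f. (Table 3)
print(round(mae(k1_hat, k1_true), 4))

# (MSE, MAE) pairs as reported in Table 1 for K1, K2, 3B and 3A.
table1 = {"K1": (0.0581, 0.1973), "K2": (0.0075, 0.0699),
          "3B": (0.0248, 0.1253), "3A": (0.0223, 0.1236)}
# MAE <= sqrt(MSE) must hold for every coefficient.
consistent = all(m_abs <= sqrt(m_sq) for m_sq, m_abs in table1.values())
print(consistent)
```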

Results

Symmetrical distributions

For the symmetrical (normal) distribution, Table 1 shows the results for the mean-square error (MSE) and mean absolute error (MAE).

 

| Criterion | Pearson ($K_1$) | Pearson ($K_2$) | Bowley (3B) | Proposed (3A) |
|---|---|---|---|---|
| MSE | 0.0581 | 0.0075 | 0.0248 | 0.0223 |
| MAE | 0.1973 | 0.0699 | 0.1253 | 0.1236 |

Table 1 MSE and MAE for coefficients of skewness (Symmetrical distributions)

It can be concluded from Table 1 that Pearson's coefficient of skewness ($K_2$) gave the best results with respect to the MSE and MAE criteria. The proposed coefficient (A) was the second-best measure in terms of both criteria. A plausible explanation for the worst results, obtained by Pearson's coefficient ($K_1$), is that the mode is so strongly affected by grouping errors that it becomes unreliable.

Skewed distributions

Table 2 shows the values of the measures required to compute the true values of Pearson's and Bowley's coefficients of skewness for the chi-square distributions with 10 and 15 degrees of freedom, whereas Table 3 presents the true values of these coefficients together with the true value of the proposed one.

| Distribution | $\mu$ | $\sigma$ | $Q_2$ | $Q_1$ | $Q_3$ | Mode |
|---|---|---|---|---|---|---|
| Chi-square (10 d.f.) | 10 | 4.4721 | 9.342 | 6.737 | 12.549 | 8 |
| Chi-square (15 d.f.) | 15 | 5.4772 | 14.339 | 11.036 | 18.245 | 13 |

Table 2 The true values of measures required for determining the coefficients of skewness for the Chi-square distributions

| Skewness (Dist.) | Pearson ($K_1$) | Pearson ($K_2$) | Bowley (3B) | Proposed (3A) |
|---|---|---|---|---|
| S(10) | 0.4472 | 0.4414 | 0.3107 | 0.8965 |
| S(15) | 0.3652 | 0.362 | 0.2509 | 0.7205 |
| R(S) | 1.2245 | 1.2193 | 1.2383 | 1.2443 |

Table 3 The true values of various measures of skewness (the chi-square distributions)

It can be concluded from Table 3 that similar results were obtained when the true values of the different coefficients were used for comparing the skewness of the two chi-square distributions (the true values of S(10)/S(15) are identical across coefficients when rounded to the nearest tenth). It should be pointed out that this ratio is 1.2247 for the coefficient of skewness based on the moments, which agrees with our results.
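The true values of the classical coefficients can be reproduced from the measures in Table 2. The sketch below assumes the usual textbook forms $K_1 = (\mu - \text{mode})/\sigma$, $K_2 = 3(\mu - Q_2)/\sigma$ and $B = (Q_3 + Q_1 - 2Q_2)/(Q_3 - Q_1)$, which give values matching Table 3; the proposed coefficient is not included because it requires the expected class frequencies. The names used are hypothetical.

```python
from math import sqrt

# Measures from Table 2: (mean, sd, Q1, Q2, Q3, mode) for each chi-square distribution.
table2 = {
    10: (10.0, 4.4721, 6.737, 9.342, 12.549, 8.0),
    15: (15.0, 5.4772, 11.036, 14.339, 18.245, 13.0),
}


def true_coefficients(mean, sd, q1, q2, q3, mode):
    """Pearson's K1 and K2 and three times Bowley's B (textbook forms, assumed)."""
    k1 = (mean - mode) / sd
    k2 = 3 * (mean - q2) / sd
    b3 = 3 * (q3 + q1 - 2 * q2) / (q3 - q1)
    return {"K1": k1, "K2": k2, "3B": b3}


s10 = true_coefficients(*table2[10])
s15 = true_coefficients(*table2[15])
ratios = {name: s10[name] / s15[name] for name in s10}

# The moment-based skewness of a chi-square distribution with k d.f. is sqrt(8/k).
moment_ratio = sqrt(8 / 10) / sqrt(8 / 15)

print(s10)
print(s15)
print(ratios, moment_ratio)
```

The moment-based ratio simplifies to $\sqrt{15/10} \approx 1.2247$, the value quoted above.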

Using the true values of the different coefficients (Table 3), the mean-square error (MSE), relative mean-square error (RMSE), mean absolute error (MAE) and relative mean absolute error (RMAE) were obtained for each coefficient when used either for estimating its true value or for comparing the skewness of the two chi-square distributions (Tables 4 and 5). From these tables, the following points can be drawn:

 

| Criterion | Skew. (Dist.) | Pearson ($K_1$) | Pearson ($K_2$) | Bowley (3B) | Proposed (3A) |
|---|---|---|---|---|---|
| MSE | S(10) | 0.0493 | 0.0190 | 0.0361 | 0.0216 |
| MSE | S(15) | 0.0391 | 0.0119 | 0.0315 | 0.0178 |
| MSE | R(S) | 1.0751 | 0.4754 | 3.5957 | 0.0534 |
| MAE | S(10) | 0.1816 | 0.1085 | 0.1597 | 0.1233 |
| MAE | S(15) | 0.1685 | 0.0905 | 0.1352 | 0.1117 |
| MAE | R(S) | 0.8081 | 0.5006 | 1.2503 | 0.1695 |

Table 4 The MSE and MAE for different coefficients of skewness for the two chi-square distributions

 

| Criterion | Skew. (Dist.) | Pearson ($K_1$) | Pearson ($K_2$) | Bowley (3B) | Proposed (3A) |
|---|---|---|---|---|---|
| RMSE | S(10) | 0.1102 | 0.0431 | 0.1162 | 0.0240 |
| RMSE | S(15) | 0.1071 | 0.0329 | 0.1256 | 0.0247 |
| RMSE | R(S) | 0.8780 | 0.3899 | 2.9037 | 0.0429 |
| RMAE | S(10) | 0.4061 | 0.2459 | 0.5139 | 0.1375 |
| RMAE | S(15) | 0.4614 | 0.2499 | 0.5388 | 0.1551 |
| RMAE | R(S) | 0.6599 | 0.4105 | 1.0097 | 0.1362 |

Table 5 The RMSE and RMAE for different coefficients of skewness for the two Chi-square distributions

  (a) When the different coefficients were used for estimating their corresponding true values:
    (i) In terms of the mean-square and mean absolute errors, Pearson's coefficient ($K_2$) gave the best results and was followed by the proposed coefficient. However, the proposed coefficient was more competitive with the $K_2$ coefficient than it was in the case of the symmetrical distribution.
    (ii) The proposed coefficient gave by far the best results with respect to the RMSE and RMAE. It was followed by the $K_2$ coefficient.
    (iii) The results for the $K_1$ and B coefficients were much less satisfactory than those for the proposed and $K_2$ coefficients with respect to all criteria used.
  (b) When the different coefficients were used for comparing the degrees of skewness of the two chi-square distributions:
    (i) The performance of the proposed measure of skewness was superior to that of the other measures. This was true with respect to all criteria used (the MSE, MAE, RMSE and RMAE criteria).
    (ii) Again, Pearson's $K_1$ and Bowley's coefficients gave the worst results in terms of all criteria used.

The reason for the results given in (ii) and (iii) for (a), and in (i) and (ii) for (b), may be that the proposed coefficient was found to be the most stable measure of skewness, as determined by the coefficient of variation (Table 6). This held whether the measures of skewness were used for estimating their corresponding true values or for comparing the degrees of skewness of the two chi-square distributions.
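The two basic error criteria named above can be sketched as follows. This is a minimal illustration, assuming a list of replicate estimates of one coefficient and a known true value; the function names and the numbers in `estimates` are hypothetical and are not taken from the simulation study. Only the MSE and MAE are covered.

```python
def mse(estimates, true_value):
    """Mean square error of replicate estimates around a known true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def mae(estimates, true_value):
    """Mean absolute error of replicate estimates around a known true value."""
    return sum(abs(e - true_value) for e in estimates) / len(estimates)


# Hypothetical replicate estimates of a skewness coefficient (illustrative only).
estimates = [0.8, 1.0, 0.9]
true_value = 0.9

print(f"MSE = {mse(estimates, true_value):.6f}")
print(f"MAE = {mae(estimates, true_value):.6f}")
```

A coefficient with smaller MSE and MAE tracks its true value more closely across replicates, which is the sense in which the coefficients are ranked above.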

| Skewness (Dist.) | Pearson (K1) | Pearson (K2) | Bowley (3B) | Proposed (3A) |
|---|---|---|---|---|
| S(10) | 31.65 | 57.18 | 60.12 | 10.65 |
| S(15) | 30.03 | 56.48 | 58.42 | 12.27 |
| R(S) | 49.04 | 78.09 | 103.95 | 17.52 |

Table 6 The coefficients of variation (%) for various measures of skewness for the two Chi-square distributions

Results in Table 6 indicate that the measures of skewness considered differ in their relative stability. The proposed measure is the most stable, followed by Pearson's measure (K1). The coefficient of variation of Pearson's measure (K2) did not differ considerably from that of Bowley's measure (B), and these two were the least stable measures. In fact, the differences in stability between the proposed measure and the other measures of skewness were quite marked. These results may explain the results obtained for the various measures of skewness.
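The stability criterion can be illustrated with a small sketch. It assumes that the coefficient of variation is computed as 100 times the standard deviation of the replicate estimates divided by the absolute value of their mean, and it uses Pearson's K2 on raw (ungrouped) chi-square samples purely as an example. The sample size, number of replicates, seed and function names are assumptions of this sketch, so its output is not expected to reproduce Table 6.

```python
import random
import statistics


def pearson_k2(sample):
    """Pearson's second coefficient of skewness: 3 * (mean - median) / s."""
    s = statistics.stdev(sample)
    return 3 * (statistics.mean(sample) - statistics.median(sample)) / s


def cv_percent(values):
    """Coefficient of variation in percent: 100 * s / |mean|."""
    return 100 * statistics.stdev(values) / abs(statistics.mean(values))


def simulate_k2(df, n, reps, seed=0):
    """Return K2 estimates from `reps` chi-square samples of size `n`."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        # A chi-square variate with df degrees of freedom is Gamma(df/2, scale=2).
        sample = [rng.gammavariate(df / 2, 2) for _ in range(n)]
        estimates.append(pearson_k2(sample))
    return estimates


estimates = simulate_k2(df=10, n=200, reps=300)
print(f"CV of K2 over replicates: {cv_percent(estimates):.2f}%")
```

A smaller coefficient of variation means the estimates vary less relative to their average size, which is how "stable" is used in the discussion of Table 6.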

An empirical example

The results so far give some idea of the performance of the proposed statistic on continuous data. To assess its performance on discrete data, especially real-world data, we consider the General Social Surveys (1972-2010) data, as used in von Hippel9 and in Garcia et al.11 The data in Table 7 come from a survey in which respondents in the USA were asked in 2002 how many people older than 17 live in their household.

| # of Members | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Frequency | 1045 | 1365 | 259 | 75 | 21 |

Table 7 Number of Adult Household Members in the U.S. in 2002 (n = 2,765)

The summary statistics of the data in Table 7 are as follows.

| Mean | Median | Mode | s.d. | Min | Max | Range | Q1 | Q3 |
|---|---|---|---|---|---|---|---|---|
| 1.7928 | 2 | 2 | 0.7783 | 1 | 5 | 4 | 1 | 2 |
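The summary statistics can be recomputed directly from the frequencies in Table 7. The sketch below assumes the sample standard deviation (divisor n - 1) and an inverse-cumulative-frequency rule for the median and quartiles; both choices are assumptions of this sketch, made because they agree with the reported values, and the variable and function names are illustrative.

```python
import math

# Frequency table from Table 7: number of adult household members.
values = [1, 2, 3, 4, 5]
freqs = [1045, 1365, 259, 75, 21]
n = sum(freqs)

mean = sum(x * f for x, f in zip(values, freqs)) / n
# Sample standard deviation (divisor n - 1), an assumption of this sketch.
sd = math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / (n - 1))
mode = values[freqs.index(max(freqs))]


def quantile(p):
    """Smallest value whose cumulative frequency reaches p * n."""
    cumulative = 0
    for x, f in zip(values, freqs):
        cumulative += f
        if cumulative >= p * n:
            return x
    return values[-1]


q1, median, q3 = quantile(0.25), quantile(0.5), quantile(0.75)
print(f"n={n} mean={mean:.4f} median={median} mode={mode} sd={sd:.4f}")
print(f"min={min(values)} max={max(values)} Q1={q1} Q3={q3}")
```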

Although the frequencies suggest skewness to the right, the mean is lower than both the median and the mode. This is one of the counter-examples to the mean-median-mode inequality in discrete data. The coefficients of skewness corresponding to the data in Table 7 are as follows.

| A | γ | K1 | K2 | B |
|---|---|---|---|---|
| 0.60471 | 1.1103 | -0.2663 | -0.7988 | -1.000 |

Since the mean-median-mode inequality does not hold in this example, three of the coefficients of skewness, the ones based on the difference between measures of central tendency (namely, K1, K2 and B), yield negative values, indicating that the dataset is skewed to the left. In particular, Bowley's coefficient of skewness (B) points to extremely negative skewness. In contrast, the proposed coefficient of skewness (A) and γ both indicate that the dataset is skewed to the right. Although γ indicates a positively skewed distribution, its magnitude of 1.11 is difficult to interpret because γ is not bounded. The value of A (0.60471) indicates approximately moderate skewness to the right.
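The classical coefficients in this example can be checked with a short sketch. It assumes K1 = (mean - mode)/s and K2 = 3(mean - median)/s with the sample standard deviation s, Bowley's B = (Q3 + Q1 - 2 median)/(Q3 - Q1), and γ taken as the standardized third central moment with divisor n. These choices are assumptions of this sketch, made because they agree with the reported values. The proposed coefficient A is not recomputed here, and the median and quartiles are entered from the summary statistics above.

```python
import math

# Frequency table from Table 7.
values = [1, 2, 3, 4, 5]
freqs = [1045, 1365, 259, 75, 21]
n = sum(freqs)

# Location measures taken from the summary statistics of Table 7.
median, mode, q1, q3 = 2, 2, 1, 2

mean = sum(x * f for x, f in zip(values, freqs)) / n


def central_moment(k):
    """k-th central moment of the frequency table (divisor n)."""
    return sum(f * (x - mean) ** k for x, f in zip(values, freqs)) / n


# Sample standard deviation (divisor n - 1), an assumption of this sketch.
s = math.sqrt(central_moment(2) * n / (n - 1))

k1 = (mean - mode) / s                      # Pearson's first coefficient
k2 = 3 * (mean - median) / s                # Pearson's second coefficient
b = (q3 + q1 - 2 * median) / (q3 - q1)      # Bowley's quartile coefficient
gamma = central_moment(3) / central_moment(2) ** 1.5  # moment coefficient

print(f"K1={k1:.4f} K2={k2:.4f} B={b:.3f} gamma={gamma:.4f}")
```

The signs make the point of the example concrete: K1, K2 and B are negative because the mean lies below the median and mode, while γ is positive because the long tail toward 4 and 5 members dominates the third moment.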

Conclusion

This paper shows that the various measures of skewness considered can yield, as expected, different degrees of skewness for the same frequency distribution. Nevertheless, it was useful to use them both for estimating their true values for the symmetrical and skewed distributions and for comparing the degrees of skewness of the two chi-square distributions with 10 and 15 degrees of freedom. In the case of the symmetrical distribution, the MSE and MAE showed that Pearson's coefficient (K2) was the best measure for determining the symmetry of the normal distribution, followed by the proposed measure of skewness. In the case of the skewed distributions, the RMSE and RMAE strongly support the use of the proposed measure of skewness for comparing the degrees of skewness of the two chi-square distributions. In general, the results pointed to the relative inferiority of Bowley's and Pearson's (K1) measures of skewness compared with the proposed and Pearson's K2 measures.

Finally, it must be stressed that each coefficient, by itself, is of little use; it becomes useful when used for comparing the skewness of different frequency distributions. Therefore, the results obtained for the proposed measure of skewness are of great value, bearing in mind its simplicity in application relative to the complexity of the other measures.

Acknowledgments

None.

Conflicts of interest

None.

References

Creative Commons Attribution License

©2020 Eltehiwy, et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and building upon the work non-commercially.