9 Measures of Dispersion: Skewness and Kurtosis
Prof. Pankaj Madan
Learning Objectives:
After the completion of this module the student will understand:
- Introduction to skewness and kurtosis
- Measures of skewness
- Relative measures of skewness (Karl Pearson’s coefficient of skewness, Bowley’s coefficient of skewness, Kelly’s coefficient of skewness)
- Moments
- β and γ coefficients
- Measures of kurtosis (leptokurtic, mesokurtic, and platykurtic)
1. Introduction
Measures of dispersion describe the spread of individual values in a data set around a central value. Such descriptive analysis of a frequency distribution remains incomplete until we also measure the degree to which the individual values deviate from symmetry on either side of the central value and the direction in which they are distributed. This analysis is important because data sets may have the same mean and standard deviation while their frequency curves differ in shape. The shape of any unimodal frequency distribution may vary in two respects:
(a) Degree of asymmetry (Skewness)
(b) Flatness of mode (Kurtosis)
(a) Skewness
A frequency distribution that is not symmetrical about the mean is called asymmetrical or skewed; in other words, skewness is the departure from symmetry. When the extreme values in a data set lie towards the upper or right tail, the distribution is positively skewed. When such values lie towards the lower or left tail, the distribution is negatively skewed. The symmetrical and asymmetrical curves are shown in the following diagrams:
In fig. (1) we see that the curve is symmetrical about the mean, i.e., there is no skewness. In fig. (2) and fig. (3) we see that the curves are not symmetrical about the mean, i.e. they are skewed. In fig. (2) skewness is negative and in fig. (3) skewness is positive. For a positively skewed distribution A.M.>Median>Mode, and for a negatively skewed distribution A.M.<Median<Mode.
(b) Kurtosis
It is the property of a single-humped or unimodal distribution by virtue of which we can study the flatness of the mode. The flatness of the mode is of three types, which are shown in the following diagram.
2. Measures of skewness
The degree of skewness in a distribution can be measured both in the absolute and relative sense. For an asymmetrical distribution, the distance between mean and mode may be used to measure the degree of skewness because the mean is equal to the mode in a symmetrical distribution. Thus,
Absolute Sk = Mean-Mode
= Q3 + Q1 – 2 Median (if measured in terms of quartiles)
For a positively skewed distribution, Mean > Mode and therefore Sk is positive; for a negatively skewed distribution, Mean < Mode and Sk is negative.
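The two absolute measures can be illustrated with a short Python sketch. The data set below is made up for illustration and does not come from this module. The quartiles are taken from `statistics.quantiles` with the inclusive method, which is only one of several quartile conventions and may differ slightly from a hand calculation.

```python
import statistics

# Hypothetical illustrative data with a long right tail (not from the module)
data = [1, 2, 2, 3, 4, 6, 10]

mean = sum(data) / len(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Quartiles by linear interpolation (inclusive method); an assumed convention
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

abs_sk_mode = mean - mode                # Mean - Mode
abs_sk_quartile = q3 + q1 - 2 * median   # Q3 + Q1 - 2 Median

print(abs_sk_mode, abs_sk_quartile)
```

Both values are positive for this data set (2.0 and 1.0), which agrees with its long right tail. Because they are expressed in the units of the data, they cannot be compared across distributions measured in different units.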
3. Relative measures of skewness
The following are the three important relative measures of skewness.
(i) Karl Pearson’s coefficient of skewness
The measure suggested by Karl Pearson for measuring the coefficient of skewness is given by:
$$Sk_p = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}} = \frac{\bar{X} - \text{Mode}}{\sigma} \qquad (1)$$
where $Sk_p$ is Karl Pearson’s coefficient of skewness.
Since a unique mode does not always exist in a distribution, it is convenient to define this measure using the median. For a moderately skewed distribution, the following relationship holds:
Mean − Mode = 3(Mean − Median) or Mode = 3 Median − 2 Mean
When this value of the mode is substituted in eq. (1), we get
$$Sk_p = \frac{3(\bar{X} - \text{Median})}{\sigma} \qquad (2)$$
Theoretically, the value of Skp varies between ±3, but for a moderately skewed distribution it usually lies between ±1. Because this method needs the mean and the standard deviation, it cannot be applied directly to open-end distributions, for which a quartile-based measure is more suitable.
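As a rough illustration of eq. (1) and eq. (2), the sketch below computes both forms of Karl Pearson’s coefficient for a small made-up data set. It assumes the population standard deviation (division by N), which is a choice made here and not something the module states.

```python
import statistics

# Hypothetical illustrative data (not from the module)
data = [1, 2, 2, 3, 4, 6, 10]

mean = sum(data) / len(data)
median = statistics.median(data)
mode = statistics.mode(data)
sigma = statistics.pstdev(data)  # population SD (divides by N); an assumption

sk_p_mode = (mean - mode) / sigma           # eq. (1): (Mean - Mode) / SD
sk_p_median = 3 * (mean - median) / sigma   # eq. (2): 3(Mean - Median) / SD

print(round(sk_p_mode, 3), round(sk_p_median, 3))
```

The two forms give different values here because the relationship Mean − Mode = 3(Mean − Median) is only approximate: for this data set Mean − Mode = 2 while 3(Mean − Median) = 3. Both coefficients are positive, so both indicate positive skewness.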
(ii) Bowley’s Coefficient of Skewness
The method is based on the relative positions of the median and the quartiles in a distribution. If a distribution is symmetrical, then Q1 and Q3 would be at equal distances from the value of the median, that is,
Median − Q1 = Q3 − Median
$$Q_3 + Q_1 - 2\,\text{Median} = 0 \quad \text{or} \quad \text{Median} = \frac{Q_3 + Q_1}{2} \qquad (3)$$
The absolute measure of skewness is converted into a relative measure for comparing distributions expressed in different units of measurement. For this, the absolute measure is divided by the inter-quartile range. That is,
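$$\text{Relative } Sk_b = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1} = \frac{(Q_3 - \text{Median}) - (\text{Median} - Q_1)}{(Q_3 - \text{Median}) + (\text{Median} - Q_1)} \qquad (4)$$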
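A minimal Python sketch of Bowley’s coefficient follows. The function name `bowley_skewness` and the data are illustrative assumptions, and the quartiles again use the inclusive interpolation method of `statistics.quantiles`, which may not match every textbook convention.

```python
import statistics

def bowley_skewness(values):
    """Bowley's coefficient: (Q3 + Q1 - 2*Median) / (Q3 - Q1)."""
    # The middle cut point returned for n=4 is the median
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q3 + q1 - 2 * q2) / (q3 - q1)

skewed = [1, 2, 2, 3, 4, 6, 10]   # hypothetical right-skewed data
symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric data

print(bowley_skewness(skewed))
print(bowley_skewness(symmetric))
```

For the symmetric data the quartiles lie at equal distances from the median, so the coefficient is zero. For the right-skewed data it is positive. The coefficient always lies between −1 and +1 because the numerator can never exceed the inter-quartile range in absolute value.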
(iii) Kelly’s Coefficient of Skewness
The relative measures of skewness suggested by Prof. Kelly are based on percentiles and deciles:
$$Sk_k = \frac{P_{10} + P_{90} - 2P_{50}}{P_{90} - P_{10}} \quad \text{or} \quad \frac{D_1 + D_9 - 2D_5}{D_9 - D_1}$$
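Kelly’s coefficient can be sketched in the same way using deciles. The helper `kelly_skewness` and the data are hypothetical, and the deciles come from `statistics.quantiles` with `n=10` and the inclusive method, an assumed convention.

```python
import statistics

def kelly_skewness(values):
    """Kelly's coefficient: (D1 + D9 - 2*D5) / (D9 - D1)."""
    # quantiles(n=10) returns the nine cut points D1..D9
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    d1, d5, d9 = deciles[0], deciles[4], deciles[8]
    return (d1 + d9 - 2 * d5) / (d9 - d1)

skewed = [1, 2, 2, 3, 4, 6, 10]   # hypothetical right-skewed data
symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric data

print(round(kelly_skewness(skewed), 4))
print(round(kelly_skewness(symmetric), 4))
```

Since D1 = P10, D5 = P50 and D9 = P90, the decile and percentile forms of the formula give the same value. Compared with Bowley’s measure, this one takes more of the tails into account because it uses the 10th and 90th percentiles instead of the quartiles.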
4. Moments
The word moment has been borrowed from mechanics, where the phrase moment of a force is used. The moment of a force about a certain point is calculated by multiplying the force by its distance from the given point. A similar idea is used in statistics: we calculate moments about any arbitrary value, about the origin (X = 0), or about the arithmetic mean, using the deviations of the variate values from these values.
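4.1. Moments about any Arbitrary Value A
These moments are represented by the symbol µr′ and defined as:
$$\mu'_r = \frac{1}{N}\sum (X - A)^r = \frac{1}{N}\sum d^r, \quad \text{where } d = X - A \text{ and } N \text{ is the number of observations}$$
Thus
$$\mu'_1 = \frac{1}{N}\sum (X - A) = \frac{1}{N}\sum d$$
$$\mu'_2 = \frac{1}{N}\sum (X - A)^2 = \frac{1}{N}\sum d^2$$
$$\mu'_3 = \frac{1}{N}\sum (X - A)^3 = \frac{1}{N}\sum d^3$$
$$\mu'_4 = \frac{1}{N}\sum (X - A)^4 = \frac{1}{N}\sum d^4$$
Here µ1′, µ2′, µ3′, µ4′, …, µr′ are known as the first, second, third, fourth, …, rth moments about A.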
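The definition of the rth moment about A translates directly into code. The following sketch assumes ungrouped data and a made-up data set; the function name `moment_about` and the choice A = 3 are illustrative only.

```python
def moment_about(values, r, a=0.0):
    """r-th moment about the point a: (1/N) * sum((x - a)**r)."""
    n = len(values)
    return sum((x - a) ** r for x in values) / n

data = [1, 2, 2, 3, 4, 6, 10]  # hypothetical data (not from the module)
A = 3                          # arbitrary reference value chosen for illustration

# First four moments about A
raw = [moment_about(data, r, A) for r in range(1, 5)]
print(raw)
```

Here the deviations d = X − A are −2, −1, −1, 0, 1, 3 and 7, so the first moment about A is 7/7 = 1 and the second is 65/7. Calling the same function with `a=0` gives the moments about the origin.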
4.2. Moments about the Origin
These moments are represented by the symbol mr, where
$$m_r = \frac{1}{N}\sum X^r$$
$$m_1 = \frac{1}{N}\sum X, \quad m_2 = \frac{1}{N}\sum X^2, \quad m_3 = \frac{1}{N}\sum X^3, \quad m_4 = \frac{1}{N}\sum X^4$$
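4.3. Central Moments
The moments about the arithmetic mean (X̄) are known as central moments and are represented by µr. The rth moment about the mean is defined as
$$\mu_r = \frac{1}{N}\sum (X - \bar{X})^r = \frac{1}{N}\sum x^r, \quad \text{where } x = X - \bar{X}$$
$$\mu_1 = \frac{1}{N}\sum (X - \bar{X}), \quad \mu_2 = \frac{1}{N}\sum (X - \bar{X})^2, \quad \mu_3 = \frac{1}{N}\sum (X - \bar{X})^3, \quad \mu_4 = \frac{1}{N}\sum (X - \bar{X})^4$$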
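The central moments can be computed in the same manner, taking the deviations from the arithmetic mean. This is a minimal sketch for ungrouped data with a made-up data set; the function name `central_moment` is hypothetical.

```python
def central_moment(values, r):
    """r-th central moment: (1/N) * sum((x - mean)**r)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n

data = [1, 2, 2, 3, 4, 6, 10]  # hypothetical data (not from the module)

mu1, mu2, mu3, mu4 = (central_moment(data, r) for r in range(1, 5))
print(mu1, mu2, mu3, mu4)
```

The first central moment is always zero because the deviations from the mean sum to zero, and the second central moment is the variance (here 58/7). A positive third central moment (here 180/7) points to a longer right tail.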
5. β and γ Coefficients
Two quantities calculated from the moments about the mean are of particular importance in statistical work. These two quantities are β1 and β2, and they are given by the following expressions along with their derivatives γ1 and γ2:
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}; \quad \gamma_1 = +\sqrt{\beta_1}$$
$$\beta_2 = \frac{\mu_4}{\mu_2^2}; \quad \gamma_2 = \beta_2 - 3$$
β1 and γ1 measure skewness, while β2 and γ2 measure kurtosis.
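The β and γ coefficients follow from the central moments in a few lines. The sketch below is illustrative only: the helper names and data are assumptions, and γ1 is given the sign of µ3, a common convention that lets it show the direction of skewness (β1 itself is never negative).

```python
import math

def central_moment(values, r):
    """r-th central moment: (1/N) * sum((x - mean)**r)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n

def beta_gamma(values):
    """Return (beta1, gamma1, beta2, gamma2) from the central moments."""
    mu2 = central_moment(values, 2)
    mu3 = central_moment(values, 3)
    mu4 = central_moment(values, 4)
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    # Assumed convention: gamma1 carries the sign of mu3
    gamma1 = math.copysign(math.sqrt(beta1), mu3)
    gamma2 = beta2 - 3
    return beta1, gamma1, beta2, gamma2

data = [1, 2, 2, 3, 4, 6, 10]  # hypothetical data (not from the module)
beta1, gamma1, beta2, gamma2 = beta_gamma(data)
print(round(beta1, 4), round(gamma1, 4), round(beta2, 4), round(gamma2, 4))
```

For this data set γ1 is positive, which agrees with the long right tail, and β2 comes out slightly below 3, so γ2 is slightly negative.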
6. Measures of Kurtosis
The measures of kurtosis describe the degree of concentration of frequencies (observations) in a given distribution. That is, whether the observed values are concentrated more around the mode (a peaked curve) or away from the mode towards both tails of the frequency curve.
The word kurtosis comes from a Greek word meaning humped. In statistics, it refers to the degree of flatness or peakedness in the region about the mode of the frequency curve. There are three types of frequency curve.
(i) Leptokurtic
The curves which are very highly peaked, have the value of β2 greater than 3 and are called leptokurtic (γ2>0).
(ii) Mesokurtic
The curves, which have the value of β2 equal to 3, are called mesokurtic, (γ2 =0).
(iii) Platykurtic
The curves, which are flat-topped and have the value of β2 less than 3, are called platykurtic (γ2<0).
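The three cases can be expressed as a small classification rule on β2. The sketch below is a hypothetical helper: the tolerance `tol` is an assumption added so that floating-point rounding does not misclassify a value that is effectively 3, and both data sets are made up.

```python
def beta2(values):
    """beta2 = mu4 / mu2**2, computed from the central moments."""
    n = len(values)
    mean = sum(values) / n
    mu2 = sum((x - mean) ** 2 for x in values) / n
    mu4 = sum((x - mean) ** 4 for x in values) / n
    return mu4 / mu2 ** 2

def classify_kurtosis(b2, tol=1e-9):
    """Classify a curve by comparing beta2 with 3 (gamma2 with 0)."""
    gamma2 = b2 - 3
    if gamma2 > tol:
        return "leptokurtic"
    if gamma2 < -tol:
        return "platykurtic"
    return "mesokurtic"

flat = [1, 2, 3, 4, 5]                     # evenly spread values
peaked = [-5, 0, 0, 0, 0, 0, 0, 0, 0, 5]   # most values at the centre

print(classify_kurtosis(beta2(flat)))
print(classify_kurtosis(beta2(peaked)))
```

The evenly spread data give β2 = 1.7 and are classed as platykurtic, while the data concentrated at the centre with two extreme values give β2 = 5 and are classed as leptokurtic.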
7. Summary
This module enables students to carry out a descriptive analysis of a frequency distribution. Such analysis is important because data sets may have the same mean and standard deviation while their frequency curves differ in shape. A frequency distribution that is not symmetrical is called asymmetrical or skewed. When the extreme values in a data set lie towards the upper or right tail, the distribution is positively skewed; when they lie towards the lower or left tail, it is negatively skewed. The mean, median and mode respond differently to high-valued observations in a data set. Among these measures of central tendency, the mean is affected most by high-valued observations in one tail of a distribution and shifts substantially in the direction of the high values. The mode is unaffected, while the median, which is affected by the number but not the values of such observations, also shifts in the direction of the high-valued observations, but not as far as the mean. The median moves about 2/3 as far as the mean in the direction of the high-valued observations (called extremes). For a positively skewed distribution A.M. > Median > Mode, and for a negatively skewed distribution A.M. < Median < Mode.
8. Self-check Exercise with solutions
Q.1. From the following data on age of employees, calculate the coefficient of skewness.
Mode value lies in the class interval 35-40 thus
Karl Pearson’s coefficient of skewness:
The positive value of Skp indicates that the distribution is slightly positively skewed.
Q.2. Find the standard deviation and kurtosis of the following set of data pertaining to kilowatt hours (kWh) of electricity consumed by 100 persons in a city.
Learn More:
- Sharma, J.K. (2014). Business Statistics, 2nd ed., S. Chand & Company, New Delhi.
- Chandel, S.R.S. (2006). A Handbook of Agricultural Statistics, Anchal Prakashan Mandir, Kanpur.
- Gupta, K.R. (2012). Practical Statistics, Atlantic Publishers & Distributors (P) Ltd., New Delhi.
- http://www.itl.nist.gov/div898/handbook/eda/section3/eda35b.htm
- https://brownmath.com/stat/shape.htm