7 Measures of Central Tendency: Averages of Positions (Median, Mode, Quartile, Deciles, Percentile)
Sanjay Mishra
Objectives
After studying this module you would be able to understand:
- Concept of partition values;
- Median, quartiles, deciles and percentiles;
- Methods of calculating different partition values;
- Merits, demerits and uses of different partition values;
- Ogives;
- Mode and methods of calculating mode; and
- Merits, demerits and uses of mode.
Introduction
“Uttar Pradesh and Bihar’s populations have the lowest median ages-or youngest populations-in India while Kerala and Tamil Nadu have the highest median ages, according to Census 2011 data, compiled by Bengaluru-based think tank Takshashila Institution.
The median age is the age which divides the population into two equal halves, i.e. there are as many people older than the median age as there are people younger than it. A low median age would suggest that a country’s population has more young people than older people.”
Business Standard, New Delhi September 27, 2016.
The median is a positional average which divides the data into two equal parts when the data has been arranged in either ascending or descending order. Similarly, there are other positional values which divide the ordered data into different numbers of equal parts: quartiles divide it into four equal parts, deciles into ten equal parts and percentiles into a hundred equal parts.
A. Median
The median is the middle value in a set of data that has been arranged from smallest to largest. Half the values are smaller than or equal to the median, and half the values are larger than or equal to the median.
As distinct from the arithmetic mean, which is calculated from each and every item in the series, the median is what is called a ‘positional average’. The place of the median in a series is such that an equal number of items lie on either side of it, i.e. it splits the observations into two halves. We can also say that 50% of the observations lie above the median value, while the remaining 50% lie below it, i.e. the median lies in the middle of the series.
Calculation of Median – Ungrouped Data:
Step-1: Arrange the data in ascending or descending order of magnitude.
Step-2: Count the number of observations n, which can be odd or even.
Case-i If the number of observations is odd, then the median is the $\frac{n+1}{2}$th observation in the arranged order.
Suppose a researcher wants to determine the median for the following numbers. 14, 21, 17, 22, 16, 19, 16
The researcher arranges the numbers in an ascending order. 14, 16, 16, 17, 19, 21, 22
Since there are seven numbers, the median is the $\frac{7+1}{2}$th observation, i.e. the 4th observation. As 17 occurs at the 4th place, the median is 17.
Case-ii If the number of observations is even, then the median is the mean of the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th observations in the arranged order.
Suppose a researcher wants to determine the median for the following numbers. 14, 21, 17, 22, 16, 19, 16, 25
The researcher arranges the numbers in an ascending order. 14, 16, 16, 17, 19, 21, 22, 25
Since there are eight numbers, the median is the mean of the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th observations, i.e. the mean of the 4th and 5th observations. As 17 occurs at the 4th place and 19 occurs at the 5th place, the median is (17 + 19)/2 = 18.
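The two cases above can be sketched in a few lines of Python. This is only an illustrative sketch; the function name `median` is chosen here for illustration and is not part of the module.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)          # Step 1: arrange in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the (n + 1)/2-th observation (1-based) sits at index mid (0-based).
        return ordered[mid]
    # Even n: mean of the (n/2)-th and (n/2 + 1)-th observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([14, 21, 17, 22, 16, 19, 16]))      # 17
print(median([14, 21, 17, 22, 16, 19, 16, 25]))  # 18.0
```

Sorting first mirrors Step-1; the index arithmetic only converts the 1-based positions used in the text to Python's 0-based indexing.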
Calculation of Median-Grouped Data:
The median of grouped data can be calculated by using the following formula.
$$\text{Median} = l_1 + \frac{l_2 - l_1}{f}\,(m - c)$$
Where l1 = lower limit of the median class
l2 = upper limit of the median class
m = N/2 , N = total frequency
f = frequency corresponding to the median class
c = cumulative frequency of the class preceding the median class.
Example 1:- Find the median income from the following table showing the income distribution of persons in a particular region.
Income in Rs (in ’000) | No. Of persons (in hundreds) |
Below 10 | 2 |
Below 20 | 5 |
Below 30 | 9 |
Below 40 | 12 |
Below 50 | 14 |
Below 60 | 15 |
Below 70 | 15.5 |
70 and over | 15.6 |
Solution
First of all we make the classes continuous and calculate the frequencies & cumulative frequencies for different classes.
Income in Rs (in ’000) | No. of persons (in hundreds) | Cumulative frequency (less than type) |
0-10 | 2 | 2 |
10-20 | 3 | 5 |
20-30 | 4 | 9 |
30-40 | 3 | 12 |
40-50 | 2 | 14 |
50-60 | 1 | 15 |
60-70 | 0.5 | 15.5 |
70 and over | 0.1 | 15.6 |
N= ∑f =15.6 |
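Now, m = N/2 = 15.6/2 = 7.8. The cumulative frequency first exceeds 7.8 in the class 20-30, so this is the median class, with l1 = 20, l2 = 30, f = 4 and c = 5.
Putting these values in the median formula we get
$$\text{Median} = 20 + \frac{30 - 20}{4}\,(7.8 - 5) = 20 + \frac{5}{2} \times 2.8 = 20 + 5 \times 1.4 = 27$$
Hence, the median income is Rs 27,000.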
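The calculation in Example 1 can be reproduced with a short Python sketch. It assumes the interpolation formula Median = l1 + ((l2 - l1)/f)(m - c) used in the solution; the function name `grouped_median` and the tuple layout `(lower, upper, frequency)` are choices made here for illustration only. The open-ended class "70 and over" is given an upper limit of `None`, which is safe only because the median does not fall in that class.

```python
def grouped_median(classes):
    """classes: list of (lower, upper, frequency) tuples in ascending order."""
    total = sum(freq for _, _, freq in classes)   # N
    m = total / 2                                 # m = N/2
    cumulative = 0.0                              # c: cumulative frequency before the current class
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= m:
            # This is the median class: interpolate inside it.
            return lower + (upper - lower) / freq * (m - cumulative)
        cumulative += freq
    raise ValueError("no median class found")


# Income distribution of Example 1 (income in Rs '000, persons in hundreds).
income = [(0, 10, 2), (10, 20, 3), (20, 30, 4), (30, 40, 3),
          (40, 50, 2), (50, 60, 1), (60, 70, 0.5), (70, None, 0.1)]

print(round(grouped_median(income), 2))  # 27.0
```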
Important mathematical property of median:
The sum of the absolute deviations of the items from the median (i.e. the deviations ignoring signs) is the least,
i.e. $\sum_{i=1}^{n} |x_i - M_d|$ is least.
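This minimising property can be checked numerically on the seven numbers used earlier (14, 21, 17, 22, 16, 19, 16), whose median is 17. The snippet below is only a small sanity check and not a proof; the helper name `sum_abs_dev` is introduced here for illustration.

```python
data = [14, 21, 17, 22, 16, 19, 16]


def sum_abs_dev(values, a):
    """Sum of absolute deviations of the values from the point a."""
    return sum(abs(x - a) for x in values)


md = sorted(data)[len(data) // 2]   # median of an odd-sized data set (17)
mean = sum(data) / len(data)        # arithmetic mean (about 17.86)

print(sum_abs_dev(data, md))              # 16
print(round(sum_abs_dev(data, mean), 2))  # 16.86
```

The sum of absolute deviations about the median (16) is smaller than the sum about the mean (about 16.86), and trying other candidate points gives no smaller value.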
Merits of Median:
The median can be used in case of frequency distribution with open-end classes.
The median is not affected by extreme observations.
The value of the median can be determined graphically, whereas the value of the mean cannot.
It is easy to calculate and understand.
Demerits of Median:
For calculating the median it is necessary to arrange the data in some order, ascending or descending, whereas other averages do not need such arrangement.
Since it is a positional average, its value is not determined by all the observations in the series.
The median is not amenable to further algebraic treatment.
The sampling stability of the median is lower than that of the mean.
B. Quartiles:
There are three quartiles, i.e. Q1, Q2 and Q3 which divide the total data into four equal parts when it has been orderly arranged. Q1, Q2 and Q3 are termed as first quartile, second quartile and third quartile or lower quartile, middle quartile and upper quartile, respectively.
The first quartile, Q1, separates the first one-fourth of the data from the upper three-fourths and is equal to the 25th percentile. The second quartile, Q2, divides the data into two equal parts (like median) and is equal to the 50th percentile. The third quartile, Q3, separates the first three-quarters of the data from the last quarter and is equal to 75th percentile.
Calculation of Quartiles:
The calculation of quartiles is done exactly in the same manner as it is in case of the calculation of median.
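The different quartiles can be found using the formula given below:
$$Q_i = l_1 + \frac{l_2 - l_1}{f}\left(\frac{iN}{4} - c\right), \quad i = 1, 2, 3$$
Where,
N = total frequency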
l1= lower limit of ith quartile class
l2= upper limit of ith quartile class
c = cumulative frequency of the class preceding the ith quartile class
f = frequency of ith quartile class.
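As an illustration, the quartile calculation can be applied to the income distribution of Example 1 (N = 15.6). This worked sketch assumes the interpolation form $Q_i = l_1 + \frac{l_2 - l_1}{f}\left(\frac{iN}{4} - c\right)$, i.e. the median formula with m replaced by iN/4; the figures are taken from the cumulative frequency table of Example 1.

```latex
% Q_1: N/4 = 3.9 is first reached in the class 10-20 (cumulative frequency 5),
% so l_1 = 10, l_2 = 20, f = 3, c = 2.
\begin{align*}
Q_1 &= 10 + \frac{20 - 10}{3}\,(3.9 - 2) = 10 + 6.33 = 16.33 \\
% Q_3: 3N/4 = 11.7 is first reached in the class 30-40 (cumulative frequency 12),
% so l_1 = 30, l_2 = 40, f = 3, c = 9.
Q_3 &= 30 + \frac{40 - 30}{3}\,(11.7 - 9) = 30 + 9 = 39
\end{align*}
```

Under these assumptions the lower quartile of income is about Rs 16,330 and the upper quartile is Rs 39,000, with the median (Q2) of Rs 27,000 lying between them.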
C. Deciles
Deciles are the partition values which divide the arranged data into ten equal parts.
There are nine deciles, i.e. D1, D2, D3, ..., D9, and the 5th decile is the same as the median or Q2, because it divides the data into two equal parts.
Calculation of Deciles:
The calculation of deciles is done exactly in the same manner as it is in case of calculation of median.
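The different deciles can be found using the formula given below:
$$D_i = l_1 + \frac{l_2 - l_1}{f}\left(\frac{iN}{10} - c\right), \quad i = 1, 2, \ldots, 9$$
Where,
N = total frequency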
l1= lower limit of ith decile class
l2= upper limit of ith decile class
c = cumulative frequency of the class preceding the ith decile class
f = frequency of ith decile class.
D. Percentiles
Percentiles are the values which divide the arranged data into a hundred equal parts. There are 99 percentiles, i.e. P1, P2, P3, ..., P99. The 50th percentile divides the series into two equal parts, and P50 = D5 = Q2 = Median.
Similarly, Q1 = P25 and Q3 = P75.
Calculation of Percentiles:
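The different percentiles can be found using the formula given below:
$$P_i = l_1 + \frac{l_2 - l_1}{f}\left(\frac{iN}{100} - c\right), \quad i = 1, 2, \ldots, 99$$
Where,
N = total frequency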
l1= lower limit of ith percentile class
l2= upper limit of ith percentile class
c = cumulative frequency of the class preceding the ith percentile class
f = frequency of ith percentile class.
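Since quartiles, deciles and percentiles are all calculated in the same manner as the median, one function can cover them all. The sketch below assumes the common interpolation form l1 + ((l2 - l1)/f)(iN/k - c), with k = 4, 10 or 100; the name `partition_value` and its parameters are hypothetical and introduced here for illustration. The data are the income classes of Example 1.

```python
def partition_value(classes, i, parts):
    """i-th partition value when the data are divided into `parts` equal parts.

    classes: list of (lower, upper, frequency) tuples in ascending order.
    parts:   4 for quartiles, 10 for deciles, 100 for percentiles.
    """
    total = sum(freq for _, _, freq in classes)   # N
    target = i * total / parts                    # iN/4, iN/10 or iN/100
    cumulative = 0.0                              # c for the current class
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            return lower + (upper - lower) / freq * (target - cumulative)
        cumulative += freq
    raise ValueError("partition class not found")


# Income distribution of Example 1 (income in Rs '000, persons in hundreds).
income = [(0, 10, 2), (10, 20, 3), (20, 30, 4), (30, 40, 3),
          (40, 50, 2), (50, 60, 1), (60, 70, 0.5), (70, None, 0.1)]

print(round(partition_value(income, 1, 4), 2))     # Q1  -> 16.33
print(round(partition_value(income, 3, 4), 2))     # Q3  -> 39.0
print(round(partition_value(income, 5, 10), 2))    # D5  -> 27.0 (the median)
print(round(partition_value(income, 90, 100), 2))  # P90 -> 50.4
```

The open-ended class is stored with an upper limit of `None`, so this sketch would fail if a requested partition value fell in that class; none of the values above do.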
Merits of Quartiles, Deciles and Percentiles:
These positional values can be directly determined in case of open-end class intervals.
These positional values can be calculated easily in the absence of some data.
These are helpful in the calculation of measures of skewness.
These are not affected very much by the extreme items.
These can be located graphically.
Demerits of Quartiles, Deciles and Percentiles:
These values are not easily understood by a common man.
These values are not based on all the observations of a series.
These values cannot be computed if items are not given in ascending or descending order.
These values have less sampling stability.
E. Ogives
An ogive is a graph of cumulative frequencies. It shows how many values of the data lie below a certain boundary.
Construction of an Ogive: To make an ogive, first a cumulative-frequency table is constructed. Vertical scale (y-axis) on the graph represents cumulative frequencies and horizontal scale (x-axis) represents variable of interest.
Less than type Ogive curve: If we start from the upper limits of the class intervals and add the class frequencies successively to get the cumulative frequencies, the result is a less than type cumulative frequency distribution, and plotting it on a graph gives a less than type ogive curve. The less than type ogive looks like an elongated ‘S’.
More than type Ogive curve: If we start from the lower limits of the class intervals and subtract the class frequencies successively from the total frequency, the result is a more than type cumulative frequency distribution, and plotting it on a graph gives a more than type ogive curve. The more than type ogive looks like an elongated ‘S’ turned upside down.
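The points to be plotted for both types of ogive can be computed as in the following Python sketch. It uses the income classes of Example 1, with frequencies converted from hundreds of persons to persons so that all counts are whole numbers; the function names and the use of `None` for the missing upper limit of the open-ended class are assumptions made here for illustration.

```python
# (lower limit, upper limit, number of persons); None marks the open-ended class.
classes = [(0, 10, 200), (10, 20, 300), (20, 30, 400), (30, 40, 300),
           (40, 50, 200), (50, 60, 100), (60, 70, 50), (70, None, 10)]


def less_than_ogive(classes):
    """Points (upper limit, cumulative frequency below that limit)."""
    points, running = [], 0
    for _, upper, freq in classes:
        running += freq                # add class frequencies successively
        if upper is not None:          # the open-ended class has no upper limit to plot
            points.append((upper, running))
    return points


def more_than_ogive(classes):
    """Points (lower limit, cumulative frequency at or above that limit)."""
    remaining = sum(freq for _, _, freq in classes)   # start from the total N
    points = []
    for lower, _, freq in classes:
        points.append((lower, remaining))
        remaining -= freq              # subtract class frequencies successively
    return points


print(less_than_ogive(classes))
# [(10, 200), (20, 500), (30, 900), (40, 1200), (50, 1400), (60, 1500), (70, 1550)]
print(more_than_ogive(classes))
# [(0, 1560), (10, 1360), (20, 1060), (30, 660), (40, 360), (50, 160), (60, 60), (70, 10)]
```

Plotting the first list with the variable on the x-axis and cumulative frequency on the y-axis gives the rising less than type curve; the second list gives the falling more than type curve.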
Determining the Median graphically
Median can also be determined graphically by using ogives through two methods given below.
Method-1:
Step-1: Draw two ogives, one by the less than method and the other by the more than method.
Step-2: From the point where these two curves intersect, draw a perpendicular on the X-axis.
Step-3: The point where this perpendicular touches the X-axis gives the value of the median.
Method-2:
Step-1: Draw only one ogive by less than method or more than method by taking variable on the X-axis and cumulative frequency on the Y-axis.
Step-2: Determine the value of N/2.
Step-3: Locate this value on the Y-axis and from it draw a line parallel to the X-axis to meet the ogive.
Step-4: From the point where this line meets the ogive, drop a perpendicular on the X-axis. The foot of this perpendicular on the X-axis gives the value of the median.
Similarly, other partition values such as quartiles and deciles can also be determined graphically.
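Reading a value off an ogive amounts to linear interpolation between neighbouring plotted points. The sketch below imitates both graphical methods numerically for the income data of Example 1 (frequencies in persons, N = 1560). The helper `read_x_at` is hypothetical, and the starting point (0, 0) of the less than ogive is an assumption that nobody lies below the lower limit of the first class.

```python
def read_x_at(points, y):
    """x-value at which the straight-line ogive through `points` reaches height y."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 != y1 and min(y0, y1) <= y <= max(y0, y1):
            # Linear interpolation on the segment that contains height y.
            return x0 + (x1 - x0) * (y - y0) / (y1 - y0)
    raise ValueError("height y is not reached by the curve")


less_than = [(0, 0), (10, 200), (20, 500), (30, 900), (40, 1200),
             (50, 1400), (60, 1500), (70, 1550)]
more_than = [(0, 1560), (10, 1360), (20, 1060), (30, 660), (40, 360),
             (50, 160), (60, 60), (70, 10)]
n_total = 1560

# Method-2: locate N/2 on the y-axis and read the x-value from one ogive.
print(read_x_at(less_than, n_total / 2))            # 27.0
print(read_x_at(more_than, n_total / 2))            # 27.0
# Other partition values work the same way, e.g. Q1 at height N/4.
print(round(read_x_at(less_than, n_total / 4), 2))  # 16.33
```

Both curves reach the height N/2 at the same x-value, which is why their point of intersection in Method-1 also gives the median (27, i.e. Rs 27,000).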
F. Mode
The value of the variable which occurs most frequently in the data is called the mode. The concept of mode is often used in determining sizes. As an example, the most common shoe size is 6 or the most common shirt size is 42. It is a very appropriate measure of central tendency for nominal data.
Calculation of Mode – Ungrouped Data:
In this case mode is obtained by inspection.
Example 2:- The blood pressure of 9 patients is as follows:
86, 87, 80, 86, 76, 86, 90, 88, 86.
Calculate its mode.
Solution
The mode is 86, as it occurs the maximum number of times (i.e. 4 times).
Note: In certain cases there may not be a mode or there may be more than one mode.
Example 3:
b) 3, 4, 5, 5, 4, 2, 1 (modes 4 and 5)
c) 8, 8, 8, 8, 8 (no mode)
A series of data, having one mode is called ‘unimodal’ and a series of data having two modes is called ‘bimodal’. It may also have several modes and be called ‘multimodal’.
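Finding the mode by inspection amounts to counting how often each value occurs. The following Python sketch does this with `collections.Counter` from the standard library. It follows the convention of Example 3 that a series in which every distinct value occurs equally often has no mode; other texts may treat such cases differently. The function name `modes` is chosen here for illustration.

```python
from collections import Counter


def modes(values):
    """Return the list of modes (empty if the data have no mode)."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if all(count == highest for count in counts.values()):
        # Every distinct value occurs equally often: treated here as "no mode".
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([86, 87, 80, 86, 76, 86, 90, 88, 86]))  # [86]      unimodal
print(modes([3, 4, 5, 5, 4, 2, 1]))                 # [4, 5]    bimodal
print(modes([8, 8, 8, 8, 8]))                       # []        no mode
```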
Calculation of Mode – Grouped Data:
In case of grouped data, the modal class is determined by inspection or by preparing grouping and analysis tables. Then we apply the following formula.
$$\text{Mode} = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$$
Where l1 = lower limit of the modal class.
f1 = frequency of the modal class.
f0 = frequency of the class preceding the modal class.
f2 = frequency of the class succeeding the modal class.
i = size of the class.
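As a sketch of how the grouped-data calculation might be coded, the function below assumes the usual interpolation formula Mode = l1 + ((f1 - f0)/(2 f1 - f0 - f2)) × i and picks the modal class by inspection as the class with the highest frequency. The name `grouped_mode` is hypothetical, and the income classes of Example 1 are reused purely for illustration; the module itself does not work this example.

```python
def grouped_mode(classes):
    """classes: list of (lower, upper, frequency) with uniform class size."""
    freqs = [freq for _, _, freq in classes]
    k = freqs.index(max(freqs))                    # modal class by inspection (first one if tied)
    lower, upper, f1 = classes[k]
    f0 = freqs[k - 1] if k > 0 else 0              # class preceding the modal class
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0 # class succeeding the modal class
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is not applicable: f1 = f0 = f2")
    size = upper - lower                           # i: size of the modal class
    return lower + (f1 - f0) / denominator * size


# Income distribution of Example 1 (income in Rs '000, persons in hundreds).
income = [(0, 10, 2), (10, 20, 3), (20, 30, 4), (30, 40, 3),
          (40, 50, 2), (50, 60, 1), (60, 70, 0.5), (70, None, 0.1)]

print(grouped_mode(income))  # 25.0
```

Here the modal class is 20-30 with f1 = 4, f0 = 3 and f2 = 3, so the mode is 20 + (1/2) × 10 = 25, i.e. Rs 25,000 under these assumptions.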
Note:
1) While applying the above formula for calculating mode, it is necessary to see that the class intervals are uniform throughout. If they are unequal they should first be made equal on the assumption that the frequencies are equally distributed throughout.
2) In case of a bimodal distribution the mode cannot be found by the above formula.
Finding mode in case of bimodal distribution: In a bimodal distribution the value of the mode cannot be determined with the help of the above formula. In this case the mode can be estimated by using the empirical relation given below, which holds approximately for moderately skewed distributions.
Mode = 3 Median − 2 Mean
The mode obtained by using the above relation is called the ‘empirical mode’.
Merits of Mode:
It is easy to calculate and simple to understand.
It is not affected by the extreme values.
The value of mode can be determined graphically.
Its value can be determined in case of open-end class interval.
Demerits of Mode:
It is not suitable for further mathematical treatment.
The value of mode cannot always be determined.
The value of mode is not based on each and every item of the series.
The mode is not rigidly defined.
Summary
Partition values divide the data, when arranged in either ascending order or descending order, into different number of equal parts. Median is the middle value in a set of arranged data. The place of the median in a series is such that an equal number of items lie on either side of it, i.e. it splits the observations into two halves. We can also say that 50% of the observations lie above median value, while rest 50% of the observations lie below median value. Quartiles divide the total data into four equal parts when it has been orderly arranged. The first quartile, Q1, separates the first one-fourth of the data from the upper three-fourths and is equal to the 25th percentile. The second quartile, Q2, divides the data into two equal parts (like median) and is equal to the 50th percentile. The third quartile, Q3, separates the first three-quarters of the data from the last quarter and is equal to 75th percentile. Deciles are the partition values which divide the arranged data into ten equal parts whereas percentiles divide the data into hundred equal parts. Ogives are cumulative frequency graphs which help in finding different partition values graphically.
Mode is the value of variable which occurs most frequently in data. The concept of mode is often used in determining sizes. It is a very appropriate measure of central tendency for nominal data.
Learn More:
- Business Research Methods Authored by Naval Bajpai, Published by Pearson’s India PHI
- Business Statistics Authored by Dr. K.L. Gupta, Published by Nirupam Publications.
- Business Statistics Authored by G.C. Beri, Published by TMH Publications.
- Statistics For Managers using Microsoft Excel by David M. Levine, David F. Stephan, Timothy C. Krehbiel and Mark L. Berenson, Published by PEARSON
- Business Statistics by Ken Black, Published by John Wiley & Sons, Inc