
Statistical Analysis

Session 2: Measures of Central Tendency


Why do we need to consider measures of
central tendency?
 Frequency distributions and graphical
representations of data do not by themselves
quantify three characteristics of a data set:
 Central tendency: the numerical value around
which most observations in the data set tend to
cluster or group
 Variation: the extent to which numerical values
are dispersed around the central value
 Skewness: the extent to which numerical values
depart from a symmetrical distribution around
the central value
Objectives of average (central
tendency)
 Useful to extract and summarize the
characteristics of the entire data set.
 Since average represents the entire data set,
it is possible to make comparison between
two or more data sets, e.g. the performance of a
salesperson based on average sales over two
months or two years
 It becomes the base for computing other
measures such as dispersion, skewness,
kurtosis etc.
Measures of Central Tendency
 Mathematical Averages
 Arithmetic mean – simple or weighted
 Geometric mean
 Harmonic mean
 Averages of position
 Median
 Quartiles
 Deciles
 Percentiles
 Mode
Arithmetic Mean
 Direct Method
$$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n}$$ (1)

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$ (2)

 Example 1: In a survey, the profits earned by
five car manufacturing companies were 15, 20,
10, 35, and 32. Find the arithmetic mean of
the profits earned.
 Using formula 2 above, the arithmetic mean
 = (15+20+10+35+32)/5 = 22.4
Arithmetic Mean
 Direct Method
 Example 2: If A, B, C, and D are four chemicals
costing Rs. 15, Rs. 12, Rs. 8 and Rs. 9 per 100
gram, and are contained in a given compound
in the ratio of 1:2:3:4 parts respectively, what
should be the price of the resultant compound?
 Price = (15×1 + 12×2 + 8×3 + 9×4)/(1+2+3+4) = 99/10
 = Rs. 9.90 per 100 gram
Arithmetic Mean
 Short-Cut Method
 In this method an arbitrary assumed mean is
taken as the basis of calculating the
deviations from individual values in the data
set
Arithmetic Mean
 Indirect Method
 Example 1: The daily earnings (in rupees) of
employees working on a daily basis in a firm
are given below. Calculate the average daily
earning for all employees.
Daily earnings (Rs.)   100   120   140   160   180   200   220
Number of employees      3     6    10    15    24    42    75

Solution to the Example
Let the assumed mean be A = 160

Daily earnings (Rs.)   Number of employees   di = xi – A     fidi
xi                     fi                    = xi – 160
100                      3                    -60            -180
120                      6                    -40            -240
140                     10                    -20            -200
160                     15                      0               0
180                     24                    +20             480
200                     42                    +40            1680
220                     75                    +60            4500
Total                  175                                   6040

Mean = A + (Σ fidi)/(Σ fi) = 160 + 6040/175 ≈ Rs. 194.51
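The assumed-mean calculation can be sketched in Python as a cross-check against the direct method. The frequencies below are taken from the solution table (15 employees at Rs. 160); the variable names are illustrative.

```python
# Daily earnings (xi) and number of employees (fi) from the solution table
earnings = [100, 120, 140, 160, 180, 200, 220]
employees = [3, 6, 10, 15, 24, 42, 75]

A = 160  # assumed mean, as in the worked solution

# Deviations from the assumed mean: di = xi - A
deviations = [x - A for x in earnings]

sum_fd = sum(f * d for f, d in zip(employees, deviations))
n = sum(employees)

# Short-cut method: mean = A + (sum of fi*di) / (sum of fi)
shortcut_mean = A + sum_fd / n

# Direct method for comparison: (sum of fi*xi) / (sum of fi)
direct_mean = sum(f * x for f, x in zip(employees, earnings)) / n

print(round(shortcut_mean, 2), round(direct_mean, 2))
```

Both methods give the same mean whatever value is chosen for A; a value near the centre of the data simply keeps the deviations small.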
Exercises on Direct/ Indirect methods
 Exercise 1: Calculate the simple and weighted
arithmetic mean price per tonne of coal
purchased by a company for six months.
Month      Price/Tonne   Tonnes purchased
January    4205          25
February   5125          30
March      5000          40
April      5200          52
May        4425          10
June       5400          45
Exercises on Direct/ Indirect methods
 Exercise 2: Salary paid by a company to its
employees is as follows. Using the indirect
method calculate the mean salary for all
employees.
Designation      Monthly Salary (Rs.)   Number of persons
Senior Manager   35,000                   1
Manager          30,000                  20
Executives       25,000                  70
Jr. Executives   20,000                  10
Supervisors      15,000                 150
Arithmetic Mean of grouped data
 Direct method
 Indirect methods
 Short Cut method
 Step deviation method
 The following assumptions should be made
 The class intervals are closed
 The width of each class interval should be equal
 The values of observations in each class interval
must be uniformly distributed between the lower
and upper limits
 The mid value of the class interval must represent
the average of all values in that class.
Arithmetic mean of grouped data
 Direct method
$$\bar{x} = \frac{\sum_i f_i m_i}{\sum_i f_i}$$
 where mi = mid-value of the ith class interval
and fi = frequency of the ith class
interval
 Example 1: A company is planning to
improve plant safety. The following data show
the accidents that happened over 50
weeks. Calculate the average number of accidents per
week.
No. of accidents   0-4   5-9   10-14   15-19   20-24
No. of weeks       5     22    13      8       2
Solution to the example
No. of accidents   Mid-value (mi)   No. of weeks (fi)   fimi
0 – 4               2                5                   10
5 – 9               7               22                  154
10 – 14            12               13                  156
15 – 19            17                8                  136
20 – 24            22                2                   44
Total                               50                  500

Mean = 500/50 = 10 accidents per week
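The direct method for grouped data can be sketched in Python as follows. The class limits and frequencies are those of the accident example; representing each class as a (lower, upper) pair is an illustrative choice.

```python
# Class intervals (number of accidents) and number of weeks in each class
classes = [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24)]
weeks = [5, 22, 13, 8, 2]

# The mid-value of each class stands in for every observation in it
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Direct method: mean = (sum of fi*mi) / (sum of fi)
sum_fm = sum(f * m for f, m in zip(weeks, midpoints))
grouped_mean = sum_fm / sum(weeks)

print(midpoints, sum_fm, grouped_mean)
```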
Arithmetic Mean of grouped data
 Short-cut method
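Assumed mean A = 12, di = mi – A
No. of accidents   Mid-value (mi)   di = mi – A   No. of weeks (fi)   fidi
0 – 4               2               -10            5                   -50
5 – 9               7                -5           22                  -110
10 – 14            12 (= A)           0           13                     0
15 – 19            17                 5            8                    40
20 – 24            22                10            2                    20
Total                                             50                  -100

Mean = A + (Σ fidi)/(Σ fi) = 12 + (-100)/50 = 10 accidents per week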
Arithmetic mean of grouped data
 Step deviation method
$$\bar{x} = A + \frac{\sum_i f_i d_i}{\sum_i f_i} \times h, \qquad d_i = \frac{m_i - A}{h}$$
Where, A = assumed value for the arithmetic mean
h = width of the class intervals
mi = mid-value of the ith class interval
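As a rough check, the short-cut and step deviation methods can be applied to the accident data in Python. The sketch assumes A = 12 and h = 5 (the spacing between successive mid-values) and takes the step deviation as di = (mi - A)/h; the names are illustrative.

```python
# Mid-values and frequencies from the accident example
midpoints = [2, 7, 12, 17, 22]
weeks = [5, 22, 13, 8, 2]

A = 12  # assumed mean
h = 5   # class width (distance between successive mid-values)
n = sum(weeks)

# Short-cut method: deviations are measured from A
shortcut_mean = A + sum(f * (m - A) for f, m in zip(weeks, midpoints)) / n

# Step deviation method: deviations are also divided by the class width h
step_devs = [(m - A) / h for m in midpoints]
step_mean = A + sum(f * d for f, d in zip(weeks, step_devs)) / n * h

print(shortcut_mean, step_mean)
```

Dividing by h only rescales the deviations, so the two methods agree with each other and with the direct method.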
Arithmetic mean of grouped data
 Exercise 1: The following distribution gives the pattern of
overtime work done by 100 employees of a company.
Calculate the average overtime hours
Overtime hours     10-15   15-20   20-25   25-30   30-35   35-40
No. of employees   11      20      35      20      8       6
 Exercise 2: In an examination of 675 candidates, the
examiner supplied the following information. Calculate the
mean percentage of marks obtained
Marks obtained (%)   No. of students
Less than 10           7
Less than 20          39
Less than 30          95
Less than 40         201
Less than 50         381
Less than 60         545
Less than 70         631
Less than 80         675
Calculation of missing values
From the following data, find the missing item,
given that the mean wage of the workers is
115.86.
Wages (X)   Number of workers (f)
110         25
112         17
113         13
117         15
X           14
125          8
128          6
130          2
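One way to find the missing item is to rearrange the mean formula: the total of all wages must equal the mean times the number of workers, so the missing wage is whatever remains after subtracting the known totals. A minimal Python sketch, with illustrative names:

```python
# Known wages and their frequencies (the row with the missing wage is excluded)
known = {110: 25, 112: 17, 113: 13, 117: 15, 125: 8, 128: 6, 130: 2}
missing_freq = 14   # number of workers earning the unknown wage X
mean_wage = 115.86  # given mean wage

total_workers = sum(known.values()) + missing_freq
known_total = sum(wage * freq for wage, freq in known.items())

# mean * N = known_total + missing_freq * X, solved for X
missing_wage = (mean_wage * total_workers - known_total) / missing_freq

print(total_workers, known_total, round(missing_wage, 2))
```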
Merits and Demerits of Arithmetic
Mean
 Merits
 Calculation of AM is simple
 Calculation is based on all observations and hence
it can be regarded as representative of the given
data
 It is capable of being treated mathematically and
hence, is widely used in statistical analysis
 It represents center of gravity of the distribution
because it balances the magnitudes of
observations which are greater and less than it
 It gives good basis of comparison of two or more
distributions
Merits and Demerits of Arithmetic
Mean
 Demerits
 It can neither be determined by inspection nor by
graphical location
 Arithmetic mean cannot be computed for
qualitative data
 It is affected too much by extreme observations
and hence does not adequately represent data
consisting of some extreme observations
 AM cannot be computed when class intervals have
open ends
 Simple arithmetic mean gives greater importance
to larger values and lesser importance to smaller
values
Weighted Arithmetic Mean
$$\bar{x}_w = \frac{\sum_i w_i x_i}{\sum_i w_i}$$
Example 1: An examination was held to decide the award of a
scholarship.
The weights of the various subjects are different. The marks obtained by
3 students are given below:
Subject       Weight   Student A   Student B   Student C
Mathematics   4        60          57          62
Physics       3        62          61          67
Chemistry     2        55          53          60
English       1        67          77          49

Calculate the weighted AM to award the scholarship.

Solution to the example
                       Student A        Student B        Student C
Subject       Weight   Marks   xiwi     Marks   xiwi     Marks   xiwi
              (wi)     (xi)             (xi)             (xi)
Mathematics   4        60      240      57      228      62      248
Physics       3        62      186      61      183      67      201
Chemistry     2        55      110      53      106      60      120
English       1        67       67      77       77      49       49
Total         10       244     603      248     594      238     618

Weighted AM: Student A = 603/10 = 60.3, Student B = 594/10 = 59.4,
Student C = 618/10 = 61.8. Student C has the highest weighted AM.
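The weighted and simple means for the three students can be compared with a short Python sketch. The dictionary layout and the helper name `weighted_mean` are illustrative choices, not part of the original material.

```python
# Subject weights and marks from the scholarship example
weights = {"Mathematics": 4, "Physics": 3, "Chemistry": 2, "English": 1}
marks = {
    "A": {"Mathematics": 60, "Physics": 62, "Chemistry": 55, "English": 67},
    "B": {"Mathematics": 57, "Physics": 61, "Chemistry": 53, "English": 77},
    "C": {"Mathematics": 62, "Physics": 67, "Chemistry": 60, "English": 49},
}


def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) over the subjects in `weights`."""
    total_weight = sum(weights.values())
    return sum(values[subject] * w for subject, w in weights.items()) / total_weight


weighted = {student: weighted_mean(m, weights) for student, m in marks.items()}
simple = {student: sum(m.values()) / len(m) for student, m in marks.items()}

# Student with the highest weighted mean
best_weighted = max(weighted, key=weighted.get)

print(weighted, simple, best_weighted)
```

The ranking by simple mean differs from the ranking by weighted mean here, which is why the weights matter for the award.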
Geometric Mean
 In many business and economic problems we deal
with quantities that change over a period of time.
In such cases if we aim to know the average rate of
change, we consider geometric mean rather than
arithmetic mean
 Example 1: If the population of the country has
been growing at a rate of 3%, 2.5%, 2.8%, 2% and
1.9% respectively over the last five years, what
has been the average growth rate for the period.
 In this case, we need to calculate the geometric
mean rather than the arithmetic mean
Geometric Mean
 Example 2: The following table gives the
annual rate of growth of sales of a company in
the last five years. Calculate the average
growth rate over these five years.
Year   Growth rate (%)   Sales at the end of the year
2003    5.0              105
2004    7.5              112.87
2005    2.5              115.69
2006    5.0              121.47
2007   10.0              133.61
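Solution to the example
 The average annual growth rate is obtained from the geometric
mean of the yearly growth factors $X_i = 1 + (\text{growth rate})/100$
 GM = $(X_1 \times X_2 \times X_3 \times X_4 \times X_5)^{1/5}$
 = $(1.05 \times 1.075 \times 1.025 \times 1.05 \times 1.10)^{1/5}$
 = $(1.3363)^{1/5} \approx 1.0597$
 Average growth rate ≈ 5.97 percent
 Simplified solution:
 $\log(GM) = \frac{1}{n}\sum_{i=1}^{n} \log X_i$
 GM = antilog$\left\{ \frac{1}{n}\sum_{i=1}^{n} \log X_i \right\}$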
Geometric Mean
 Exercise 1: The rate of increase in population
of a country during the last three decades is 5
percent, 8 percent and 12 percent. Find the
average rate of growth during the last three
decades.
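The sales-growth calculation in Example 2 can be sketched in Python. The sketch assumes that each yearly rate is first converted to a growth factor (1 + rate/100) before the geometric mean is taken; `statistics.geometric_mean` and `math.prod` require Python 3.8 or later.

```python
from math import prod
from statistics import geometric_mean

# Annual growth rates of sales (percent), 2003 to 2007
growth_rates = [5.0, 7.5, 2.5, 5.0, 10.0]

# Convert each percentage to a growth factor, e.g. 5% -> 1.05
factors = [1 + rate / 100 for rate in growth_rates]

# Geometric mean as the nth root of the product of the factors
gm_direct = prod(factors) ** (1 / len(factors))

# The same quantity via the standard library (computed with logarithms)
gm_library = geometric_mean(factors)

# Average growth rate in percent
average_growth_pct = (gm_direct - 1) * 100

print(round(gm_direct, 4), round(average_growth_pct, 2))
```

The same approach applies to the decade growth rates in Exercise 1 by replacing the list of rates.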
Uses, Merits and Demerits of GM
 Uses
 GM is highly useful in averaging ratios, percentages,
and rates of increase between two periods
 GM is important for construction of index numbers
 Merits
 The value of GM is not much affected by extreme
observations and is computed by taking all observations
 Useful in studying economic and social data
 Demerits
 GM cannot be computed if any item in the series is
negative or zero
 Difficult to calculate
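Harmonic Mean
 Harmonic Mean of a set of observations is
defined as the reciprocal of the arithmetic
mean of the reciprocals of the individual
observations
$$HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$ (For ungrouped data)

$$HM = \frac{N}{\sum_i \frac{f_i}{m_i}}, \qquad N = \sum_i f_i$$ (For grouped data)
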
Harmonic Mean
 Example 1: An investor buys Rs. 20,000 worth
of shares of a company each month. During
the first 3 months he bought the shares at a
price of Rs. 120, Rs. 160 and Rs. 210. After 3
months, what is the average price paid by him
for the shares?
 Solution
HM = 3/(1/120 + 1/160 + 1/210) = 3/0.019345
 = Rs. 155.08
Harmonic Mean
 Example 2: Find the harmonic mean of the
following distribution of data
Dividend yield (%)    2 – 6   6 – 10   10 – 14
Number of companies   10      12       18
 Solution
Class (DY)   Mid value (mi)   No. of companies (fi)   Reciprocal (1/mi)   fi(1/mi)
2 – 6         4               10                      1/4                 2.5
6 – 10        8               12                      1/8                 1.5
10 – 14      12               18                      1/12                1.5
Total                         N = 40                                      5.5
HM = N / Σ fi(1/mi) = 40/5.5 = 7.27
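Both harmonic mean examples can be worked through in Python as a check. For the share purchase, the sketch also computes the average price directly as total money spent divided by total shares bought, which is what the harmonic mean represents when the same amount is invested each month. Names are illustrative.

```python
from statistics import harmonic_mean

# Example 1: equal amounts invested at three different prices
prices = [120, 160, 210]
monthly_investment = 20000

# Harmonic mean from its definition: n / sum of reciprocals
hm_prices = len(prices) / sum(1 / p for p in prices)
hm_library = harmonic_mean(prices)

# Cross-check: total amount spent / total number of shares bought
shares_bought = sum(monthly_investment / p for p in prices)
average_price_paid = monthly_investment * len(prices) / shares_bought

# Example 2: grouped data, using class mid-values and frequencies
midpoints = [4, 8, 12]
companies = [10, 12, 18]
hm_grouped = sum(companies) / sum(f / m for f, m in zip(companies, midpoints))

print(round(hm_prices, 2), round(average_price_paid, 2), round(hm_grouped, 2))
```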
Merits and Demerits of HM/
Relationship between AM, GM and HM
 Merits
 It is based on all observations of the series
 It is suitable in case of series having wide dispersion
 Demerits
 Difficult to calculate
 It is not often used for analyzing business problems
 Relationship between AM, GM, and HM
 If all values are equal then AM = GM = HM
 If the values are positive and not all equal then AM > GM > HM
 If the observations take the values a, ar,
ar^2, ar^3, …, ar^n, then (GM)^2 = AM × HM
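These relationships can be checked numerically. The sketch below uses a hypothetical geometric progression with a = 2, r = 3 and n = 4 (values chosen only for illustration) and Python's `statistics` module (3.8 or later).

```python
from math import isclose
from statistics import geometric_mean, harmonic_mean, mean

# Hypothetical geometric progression a, ar, ar^2, ..., ar^n
a, r, n = 2.0, 3.0, 4
values = [a * r**k for k in range(n + 1)]  # 2, 6, 18, 54, 162

am = mean(values)
gm = geometric_mean(values)
hm = harmonic_mean(values)

# For positive, unequal values: AM > GM > HM
ordering_holds = am > gm > hm

# For a geometric progression: GM^2 = AM * HM
gp_identity_holds = isclose(gm**2, am * hm, rel_tol=1e-9)

# When all values are equal, the three means coincide
equal_values = [5.0, 5.0, 5.0]
all_equal = (
    isclose(mean(equal_values), geometric_mean(equal_values))
    and isclose(mean(equal_values), harmonic_mean(equal_values))
)

print(am, gm, hm, ordering_holds, gp_identity_holds, all_equal)
```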
Averages of Position - Median
 Median – The median may be defined as the middle
value in the data set when the elements are
arranged in sequential order (either ascending or
descending)
 Median for ungrouped data:
 If the number of observations (n) is odd, then
 Median = size or value of the {(n+1)/2}th observation
 If the number of observations (n) is even, then
 Median = average of the (n/2)th and {(n/2)+1}th
observations in the data set
 Exercise 1: What is the median value for the
following data set: 3.5, 4, 3.8, 3, 5.5, 5, 4.5? What is
the median if 5.8 is added to this data set?
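The odd and even rules for the median of ungrouped data can be sketched in Python. The helper `median_by_position` is a hypothetical name written for illustration; `statistics.median` from the standard library applies the same rules.

```python
from statistics import median


def median_by_position(values):
    """Median of ungrouped data using the odd/even position rules."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)th observation (index mid when counting from 0)
        return ordered[mid]
    # n even: average of the (n/2)th and (n/2 + 1)th observations
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [3.5, 4, 3.8, 3, 5.5, 5, 4.5]
median_odd = median_by_position(data)            # 7 observations
median_even = median_by_position(data + [5.8])   # 8 observations

print(median_odd, median_even, median(data), median(data + [5.8]))
```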
Averages of Position - Median
 Median for grouped data
$$\text{Median} = l + \frac{(n/2) - cf}{f} \times h$$
 l = lower class limit of the median class
interval
 cf = cumulative frequency of the class prior to
the median class interval
 f = frequency of the median class
 h = width of the median class interval
 n = total number of observations in the
distribution
Averages of Position - Median
 Exercise 2: A survey was conducted to determine
the age in years of 120 automobiles. The result of
such a survey is given in the table below. What is
the median age of the autos?
Age of auto (years)   0 – 4   4 – 8   8 – 12   12 – 16   16 – 20
No. of autos          13      29      48       22        8

Averages of Position - Median
 Solution:
Age of autos (years)   Number of autos (fi)   Cumulative frequency (cf)
0 – 4                   13                     13
4 – 8                   29                     42
8 – 12                  48                     90   (median class)
12 – 16                 22                    112
16 – 20                  8                    120
Total                  120

n/2 = 60, so the median class is 8 – 12, with l = 8, cf = 42, f = 48 and h = 4.
Median = 8 + ((60 – 42)/48) × 4 = 8 + 1.5 = 9.5 years
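The grouped-data median can be sketched in Python as below. The function name `grouped_median` and its argument layout are illustrative assumptions; it locates the first class whose cumulative frequency reaches n/2 and then interpolates within that class.

```python
def grouped_median(lower_limits, frequencies, width):
    """Median of grouped data with equal class widths, by interpolation."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("the distribution has no observations")
    half = n / 2
    cumulative = 0  # cumulative frequency of the classes before the current one
    for lower, freq in zip(lower_limits, frequencies):
        if freq > 0 and cumulative + freq >= half:
            # Median class found: l + ((n/2 - cf) / f) * h
            return lower + (half - cumulative) / freq * width
        cumulative += freq
    raise ValueError("median class not found")


# Age of autos: lower class limits, number of autos, class width of 4 years
auto_lower_limits = [0, 4, 8, 12, 16]
auto_counts = [13, 29, 48, 22, 8]

median_age = grouped_median(auto_lower_limits, auto_counts, width=4)
print(median_age)
```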
Partition Values – Quartiles, Deciles,
Percentiles
 Quartiles: The values of observations in a data set,
when arranged in an ordered sequence, can be
divided into four equal parts, or quarters, using three
quartiles viz. Q1, Q2 and Q3. The first quartile Q1
divides the distribution in such a way that 25 percent
of the observations have a value less than Q1 and 75
percent of the values are more than Q1.

[Figure: ordered data divided into four equal parts by the quartiles Q1, Q2 and Q3]
Partition Values – Quartiles, Deciles,
Percentiles
 Deciles: The values of observations in a data
set when arranged in an ordered sequence
can be divided into ten equal parts, using
nine deciles (D1, D2, ….., D9)

 Percentiles: The values of observations in a
data set when arranged in an ordered
sequence can be divided into 100 equal parts
using 99 percentiles (P1, P2, ….., P99)
Partition Values – Quartiles, Deciles,
Percentiles
 Exercise 1: The following is the distribution of
weekly wages of 600 workers in a factory
Weekly wages (Rs.)   No. of workers
Below 375             69
375 – 450            167
450 – 525            207
525 – 600             65
600 – 625             58
625 – 750             24
750 – 825             10
 Find the 1st quartile and 3rd quartile
 Find the 5th decile and 7th decile
 Find the 29th percentile and 95th percentile
 Find the median
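A possible way to approach this exercise is to extend the grouped-median interpolation to any partition value by replacing n/2 with the required fraction of n (n/4 for Q1, 3n/4 for Q3, and so on). The slides do not state this formula, so the sketch below is an assumption; the function name `partition_value` and the (lower, upper, frequency) layout are hypothetical. The open-ended first class has no lower limit, so the function refuses to interpolate inside it.

```python
def partition_value(classes, fraction):
    """Interpolated partition value for grouped data.

    classes: list of (lower, upper, frequency); lower may be None for an
    open-ended first class. fraction: e.g. 0.25 for Q1, 0.7 for D7.
    """
    n = sum(freq for _, _, freq in classes)
    target = fraction * n
    cumulative = 0  # cumulative frequency before the current class
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            if lower is None:
                raise ValueError("value falls in an open-ended class")
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("no class contains the requested partition value")


# Weekly wages of 600 workers, as given in the exercise
wages = [
    (None, 375, 69), (375, 450, 167), (450, 525, 207), (525, 600, 65),
    (600, 625, 58), (625, 750, 24), (750, 825, 10),
]

q1 = partition_value(wages, 0.25)
q3 = partition_value(wages, 0.75)
print(round(q1, 2), round(q3, 2))
```

Deciles and percentiles follow by passing fractions such as 0.5, 0.7, 0.29 or 0.95.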
Averages of Position - Mode
 Mode: The mode is that value of an observation which
occurs most frequently in the data set, i.e. the
point or class mark with the highest frequency.
For grouped data:
$$\text{Mode} = l + \frac{f_m - f_{m-1}}{2f_m - f_{m-1} - f_{m+1}} \times h$$
$l$ = lower limit of the modal class interval
$f_m$ = frequency of the modal class
$f_{m-1}$ = frequency of the class preceding the modal class
$f_{m+1}$ = frequency of the class following the modal class
$h$ = width of the modal class interval
 Exercise 1: Find the mode of the distribution in the
earlier example
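A minimal Python sketch of the mode for grouped data follows. It assumes the usual interpolation formula, Mode = l + (fm - fm-1)/(2fm - fm-1 - fm+1) × h, and, purely as an illustration, applies it to the age-of-autos distribution used for the median; the function name `grouped_mode` is hypothetical.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Mode of grouped data with equal class widths, by interpolation."""
    # Index of the modal class (the class with the highest frequency)
    m = max(range(len(frequencies)), key=frequencies.__getitem__)
    f_m = frequencies[m]
    # Neighbouring frequencies; treated as 0 when the modal class is at an end
    f_prev = frequencies[m - 1] if m > 0 else 0
    f_next = frequencies[m + 1] if m < len(frequencies) - 1 else 0
    denominator = 2 * f_m - f_prev - f_next
    if denominator == 0:
        raise ValueError("mode is not defined by this formula for flat data")
    return lower_limits[m] + (f_m - f_prev) / denominator * width


# Age of autos: lower class limits, number of autos, class width of 4 years
mode_age = grouped_mode([0, 4, 8, 12, 16], [13, 29, 48, 22, 8], width=4)
print(round(mode_age, 2))
```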
Averages of Position - Mode
 Graphical method
[Figure: histogram with Frequency on the vertical axis and Class interval on the horizontal axis; the mode is marked on the class-interval axis]
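Relationship between Mean, Median
and Mode

[Figure: frequency curves showing the relative positions of mean, median and mode for symmetrical, positively skewed and negatively skewed distributions]

For a symmetrical distribution, Mean = Median = Mode

For a positively skewed distribution, Mean > Median > Mode

For a negatively skewed distribution, Mean < Median < Mode

For a moderately skewed distribution, the empirical relationship is
Mean – Mode = 3(Mean – Median)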
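The empirical relationship can be rearranged to estimate the mode from the mean and the median: Mode = 3 × Median - 2 × Mean. The sketch below applies it, only as an illustration, to the age-of-autos distribution, using class mid-values for the mean and the median of 9.5 years from the worked solution; the function name is hypothetical.

```python
def empirical_mode(mean_value, median_value):
    """Mode implied by Mean - Mode = 3 * (Mean - Median)."""
    return 3 * median_value - 2 * mean_value


# Age of autos: class mid-values and frequencies
midpoints = [2, 6, 10, 14, 18]
autos = [13, 29, 48, 22, 8]

mean_age = sum(f * m for f, m in zip(autos, midpoints)) / sum(autos)
median_age = 9.5  # from the grouped-median solution

estimated_mode = empirical_mode(mean_age, median_age)
print(round(mean_age, 2), round(estimated_mode, 2))
```

The result is an approximation that is reasonable only for moderately skewed distributions; for a symmetrical distribution the three measures coincide.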

Next Session : Measures of Dispersion
