
DISPERSION

RANGE
VARIANCE
COEFFICIENT OF VARIATION
PEARSON'S MEASURE OF SKEWNESS

Another important characteristic of a data set is how it is distributed, that is, how far each element lies from some measure of central tendency (its spread).
Two variables can have the same value for a measure of central tendency yet differ in other respects, such as consistency, performance and dependability.

Measures of central tendency (mean, mode and median) locate the distribution within the range of possible values; measures of dispersion describe the spread of those values.
A small dispersion (variability) means the measures of central tendency represent the data well.
In other words, the mean, median or mode is more credible when the variation about it is small.

Example:
Number of minutes 20 clients waited to see a consultant

Consultant X    Consultant Y
05   15         11   12
12   03         10   13
04   19         11   10
37   11         09   13
06   34         09   11

Consultant X:
Sees some clients almost immediately
Others wait over 1/2 hour
Highly inconsistent

Consultant Y:
Clients wait about 10 minutes
Least wait is 9 minutes and most is 13 minutes
Highly consistent

The range measures the total spread in a batch of data.
It is a simple, easily calculated measure of total variation in the data.
However, it does not take into account how the data are actually distributed between the smallest and largest values.

Calculation:

1. Find the largest and smallest numbers in the data set
2. Subtract the smallest number from the largest number
3. Difference = Range

Ungrouped data:
Range = highest value - lowest value

Grouped data:
Range = upper limit of last class - lower limit of first class

Example:

Consultant X:
Highest value: 37 minutes
Smallest value: 3 minutes
Range = 37 - 3 = 34 minutes

Consultant Y:
Highest value: 13 minutes
Smallest value: 9 minutes
Range = 13 - 9 = 4 minutes
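The range calculation for ungrouped data can be sketched in a few lines of Python. The list names and the helper `value_range` are illustrative choices, not part of the original slides; the values are the waiting times from the consultant table.

```python
# Waiting times (minutes) copied from the consultant table.
consultant_x = [5, 15, 12, 3, 4, 19, 37, 11, 6, 34]
consultant_y = [11, 12, 10, 13, 11, 10, 9, 13, 9, 11]


def value_range(values):
    """Range of ungrouped data: highest value minus lowest value."""
    return max(values) - min(values)


range_x = value_range(consultant_x)  # 37 - 3
range_y = value_range(consultant_y)  # 13 - 9
print(f"Consultant X range: {range_x} minutes")
print(f"Consultant Y range: {range_y} minutes")
```

Only the two extreme values enter the result, which is why the range says nothing about how the remaining values are distributed.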


http://www.mathgoodies.com/lessons/vol8/range.html

The interquartile range (IQR) is the distance between the 75th percentile and the 25th percentile. The IQR is essentially the range of the middle 50% of the data. Because it uses the middle 50%, the IQR is not affected by outliers or extreme values.
It is the difference between the third and first quartiles: IQR = Q3 - Q1

Advantages over the range:
1. Not sensitive to extreme values in a data set
2. Not sensitive to the sample size

Calculation:
1. Put the values in order from low to high
2. Divide the set of values into quarters (1/4s)
3. For the values in the middle 50%, subtract the lower value from the higher value

Example:
16 sales people were given 12 problems associated with on-the-road sales.
For keeping automobile expenses, the rankings follow:
1 1 1 2 3 4 5 6 7 8 8 9 10 11 11 12

Location of Q1 = 1/4 (n + 1) = 1/4 (16 + 1) = 4.25, rounded to the 4th observation
Q1 = 2

Location of Q3 = 3/4 (n + 1) = 3/4 (16 + 1) = 12.75, rounded to the 13th observation
Q3 = 10

Range = 12 - 1 = 11
Interquartile Range = 10 - 2 = 8

The middle 50% of respondents lie within 8 rank-order points of each other!
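A minimal sketch of this location rule in Python follows. The helper name `quartile_by_location` is hypothetical, and rounding the location to the nearest observation is an assumption that mirrors the hand calculation (4.25 to the 4th observation, 12.75 to the 13th); statistical software often interpolates between neighbours instead.

```python
rankings = [1, 1, 1, 2, 3, 4, 5, 6, 7, 8, 8, 9, 10, 11, 11, 12]


def quartile_by_location(values, fraction):
    """Return the observation nearest to position fraction * (n + 1).

    Positions are 1-based, matching the hand calculation.
    """
    ordered = sorted(values)
    location = fraction * (len(ordered) + 1)
    position = int(location + 0.5)  # round half up to the nearest observation
    position = min(max(position, 1), len(ordered))  # stay inside the data
    return ordered[position - 1]


q1 = quartile_by_location(rankings, 0.25)  # location 4.25 -> 4th observation
q3 = quartile_by_location(rankings, 0.75)  # location 12.75 -> 13th observation
iqr = q3 - q1
print(q1, q3, iqr)
```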

The quartile deviation is based on the lower quartile Q1 and the upper quartile Q3.

The difference Q3 - Q1 is called the interquartile range. The difference Q3 - Q1 divided by 2 is called the semi-interquartile range or the quartile deviation.
Thus
Quartile Deviation (Q.D) = (Q3 - Q1) / 2
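As a rough illustration, the quartile deviation for the 16 rankings in the example can be computed two ways: from the quartiles read off by hand (Q1 = 2, Q3 = 10), and from interpolated quartiles. Using `statistics.quantiles` with `method="exclusive"` is an assumption made here for comparison, not the method of the slides; it interpolates at the same (n + 1) locations instead of rounding to the nearest observation.

```python
import statistics

rankings = [1, 1, 1, 2, 3, 4, 5, 6, 7, 8, 8, 9, 10, 11, 11, 12]

# Quartiles read off by hand in the example (4th and 13th observations).
q1_rounded, q3_rounded = 2, 10
qd_rounded = (q3_rounded - q1_rounded) / 2

# Interpolated quartiles at the same (n + 1) locations, 4.25 and 12.75.
q1_interp, _median, q3_interp = statistics.quantiles(
    rankings, n=4, method="exclusive"
)
qd_interp = (q3_interp - q1_interp) / 2

print(qd_rounded, qd_interp)
```

The two conventions give slightly different answers, so it is worth stating which quartile rule was used when reporting a quartile deviation.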

The quartile deviation is a slightly better measure of absolute dispersion than the range, but it ignores the observations in the tails.
If we take different samples from a population and calculate their quartile deviations, the values are quite likely to differ noticeably. This is called sampling fluctuation. It is not a popular measure of dispersion.
The quartile deviation calculated from sample data does not help us draw any conclusion (inference) about the quartile deviation in the population.

Variance and standard deviation are the most familiar measures of dispersion.
Variance is the arithmetic mean (average) of the squared differences between each observation and the arithmetic mean of all observations.

http://www.mathsisfun.com/standarddeviation.html

Standard deviation is the square root of the variance.

Most frequently used measure of dispersion
It reflects the typical distance of the observed values from the mean of a data set (formally, the square root of the average squared distance)
Basic rule: more spread will yield a larger SD

Calculation:
1. Calculate the arithmetic mean (AM)
2. Subtract the AM from each individual value
3. Square each difference (multiply it by itself)
4. Sum (total) the squared differences
5. Divide the total by the number of values (N)
6. Calculate the square root of the result
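The six steps can be sketched directly in Python. The function name `standard_deviation` is illustrative, and the data are Consultant Y's waiting times from the earlier table. This sketch uses the unrounded mean (10.9), so intermediate figures can differ slightly from a hand calculation that rounds the mean first.

```python
import math


def standard_deviation(values):
    """Standard deviation dividing by N, following the six steps."""
    mean = sum(values) / len(values)          # step 1: arithmetic mean
    deviations = [v - mean for v in values]   # step 2: value minus mean
    squared = [d * d for d in deviations]     # step 3: square each difference
    total = sum(squared)                      # step 4: sum the squares
    variance = total / len(values)            # step 5: divide by N
    return math.sqrt(variance)                # step 6: square root


consultant_y = [11, 12, 10, 13, 11, 10, 9, 13, 9, 11]
sd_y = standard_deviation(consultant_y)
print(round(sd_y, 1))
```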

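Ungrouped data:

$$s = \sqrt{\frac{1}{n-1}\sum (x - \bar{x})^2} \quad \text{OR} \quad s = \sqrt{\frac{1}{n-1}\left[\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right]}$$

Grouped data:

$$s = \sqrt{\frac{1}{n-1}\sum f\,(x - \bar{x})^2} \quad \text{OR} \quad s = \sqrt{\frac{1}{n-1}\left[\sum f x^2 - \frac{\left(\sum f x\right)^2}{n}\right]}$$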
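Dividing by n - 1 gives the sample standard deviation, while dividing by the number of values N gives the population form. The sketch below compares the two on Consultant Y's waiting times and checks the grouped computational form, the square root of (sum of f x squared minus (sum of f x) squared over n) divided by n - 1, against the library result. The frequency table is simply those ten values tallied; the variable names are illustrative.

```python
import math
import statistics

consultant_y = [11, 12, 10, 13, 11, 10, 9, 13, 9, 11]

sample_sd = statistics.stdev(consultant_y)       # divides by n - 1
population_sd = statistics.pstdev(consultant_y)  # divides by N

# The same data as a frequency table: value -> frequency.
frequencies = {9: 2, 10: 2, 11: 3, 12: 1, 13: 2}
n = sum(frequencies.values())
sum_fx = sum(f * x for x, f in frequencies.items())
sum_fx2 = sum(f * x * x for x, f in frequencies.items())

# Grouped computational form with the n - 1 divisor.
grouped_sd = math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))

print(round(sample_sd, 3), round(population_sd, 3), round(grouped_sd, 3))
```

The n - 1 version is always somewhat larger than the N version; the gap shrinks as the number of values grows.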

$$SD = \sqrt{\frac{\text{Sum of squares of individual deviations from arithmetic mean}}{\text{Number of items}}}$$
Example:

No. of scores = 10
M = 143/10 = 14.3, rounded to 14

Scores    Deviations From Mean    Squares of Deviations
01        -13                     169
03        -11                     121
05        -09                     81
06        -08                     64
11        -03                     9
12        -02                     4
15        +01                     1
19        +05                     25
34        +20                     400
37        +23                     529
Total 143                         1403

SD = sqrt(1403/10) = 11.8

Example:

No. of scores = 10
M = 109/10 = 10.9, rounded to 11

Scores    Deviations From Mean    Squares of Deviations
09        -02                     4
09        -02                     4
10        -01                     1
10        -01                     1
11        00                      0
11        00                      0
11        00                      0
12        +01                     1
13        +02                     4
13        +02                     4
Total 109                         19

SD = sqrt(19/10) = 1.4

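NORMAL DISTRIBUTION CURVE

(Figures: three normal curves marking the intervals 1, 2 and 3 standard deviations either side of the mean, for the two examples with means 14 and 11.)

Interval                   Share of values    Mean = 14      Mean = 11
+/- 1 Standard Deviation   68%                2.2 to 25.8    9.6 to 12.4
+/- 2 Standard Deviations  95%                1 to 37        8.2 to 13.8
+/- 3 Standard Deviations  99.7%              1 to 37        6.8 to 15.2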
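The interval endpoints are just the mean plus or minus a multiple of the standard deviation. A minimal sketch, assuming the rounded means and SDs from the two worked examples (14 with 11.8, and 11 with 1.4); the helper `sd_interval` is a hypothetical name.

```python
def sd_interval(mean, sd, k):
    """Return (mean - k*sd, mean + k*sd), rounded to one decimal place."""
    return (round(mean - k * sd, 1), round(mean + k * sd, 1))


for k, share in [(1, "68%"), (2, "95%"), (3, "99.7%")]:
    wide = sd_interval(14, 11.8, k)    # first example: mean 14, SD 11.8
    narrow = sd_interval(11, 1.4, k)   # second example: mean 11, SD 1.4
    print(f"+/-{k} SD ({share}): {wide} and {narrow}")
```

For the first example the computed 2 SD and 3 SD intervals run below zero, which a waiting time cannot do, so the endpoints 1 and 37 reported for it are not the raw mean plus or minus k SD values. The percentages themselves hold only to the extent that the data are roughly normal.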

Range
Use the range sparingly as the measure of dispersion

Interquartile Range
When the median is the measure of central tendency, use the interquartile range

Standard Deviation
When the mean is the measure of central tendency, use the standard deviation

The coefficient of variation (CV), also
known as relative variability, equals the
standard deviation divided by the mean. It
can be expressed either as a fraction or a
percent.
It only makes sense to report CV for a
variable, such as mass or enzyme activity,
where 0.0 is defined to really mean zero.
A weight of zero means no weight. So it
would be meaningless to report a CV of
values expressed

$$CV = \frac{s}{\bar{x}} \times 100$$
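As a quick illustration, the CV expressed as a percent is the standard deviation divided by the mean, times 100. The sketch below applies this to the rounded means and SDs from the two standard deviation examples; the function name and the zero-mean check are illustrative additions.

```python
def coefficient_of_variation(sd, mean):
    """CV as a percentage: (standard deviation / mean) * 100."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean * 100


cv_first = coefficient_of_variation(11.8, 14)  # mean 14, SD 11.8
cv_second = coefficient_of_variation(1.4, 11)  # mean 11, SD 1.4
print(f"{cv_first:.1f}% {cv_second:.1f}%")
```

Because the CV is unit-free, it allows the variability of the two sets of scores to be compared relative to their own means.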

The terms skewed and askew are used to refer to something that is out of line or distorted on one side. When referring to the shape of frequency or probability distributions, skewness refers to asymmetry of the distribution.

A distribution with an asymmetric tail extending out to the right is referred to as positively skewed or skewed to the right, while a distribution with an asymmetric tail extending out to the left is referred to as negatively skewed or skewed to the left. Skewness can range from minus infinity to positive infinity.

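PEARSON'S MEASURE OF SKEWNESS

$$\text{Skewness} = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}}$$

or

$$\text{Skewness} = \frac{3\,(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$$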
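Both forms can be sketched with Python's `statistics` module, here applied to the waiting times from the consultant table. The function names are hypothetical, and using the population standard deviation (dividing by N) is an assumption chosen to match the worked SD examples. The mode form is shown only for Consultant Y, whose data have a single mode.

```python
import statistics

consultant_x = [5, 15, 12, 3, 4, 19, 37, 11, 6, 34]
consultant_y = [11, 12, 10, 13, 11, 10, 9, 13, 9, 11]


def pearson_median_skewness(values):
    """3 * (mean - median) / standard deviation."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.pstdev(values)  # assumption: divide by N
    return 3 * (mean - median) / sd


def pearson_mode_skewness(values):
    """(mean - mode) / standard deviation; needs a single clear mode."""
    mean = statistics.mean(values)
    mode = statistics.mode(values)
    sd = statistics.pstdev(values)  # assumption: divide by N
    return (mean - mode) / sd


skew_x = pearson_median_skewness(consultant_x)
skew_y = pearson_median_skewness(consultant_y)
mode_skew_y = pearson_mode_skewness(consultant_y)
print(round(skew_x, 2), round(skew_y, 2), round(mode_skew_y, 2))
```

Consultant X's times come out positively skewed, since a few long waits pull the mean above the median, while Consultant Y's are close to symmetric.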
