Lecture 7-Measure of Dispersion

DISPERSION
RANGE
VARIANCE
COEFFICIENT OF VARIATION
PEARSON'S MEASURE OF SKEWNESS
Another important characteristic of a data set is how it is distributed, or how far each element is from some measure of central tendency (its spread).
Two variables can have the same value for a measure of central tendency but differ in other respects, such as consistency, performance and dependability.
Measurements of central tendency (mean, mode and median) locate the distribution within the range of possible values; measurements of dispersion describe the spread of values.
A small dispersion or variability ensures a good representation of the data by measures of central tendency.
In other words, the mean, median or mode that is used has more credibility (is more believable) when the variation about it is small.
Example:
Number of minutes 20 clients waited to see a consultant

Consultant X    Consultant Y
05  15          11  12
12  03          10  13
04  19          11  10
37  11          09  13
06  34          09  11

Consultant X:
Sees some clients almost immediately
Others wait over 1/2 hour
Highly inconsistent
Consultant Y:
Clients wait about 10 minutes
9 minutes least wait and 13 minutes most
Highly consistent
The range measures the total spread in the batch of data.
It is a simple, easily calculated measure of total variation in the data.
However, it does not take into account how the data are actually distributed between the smallest and largest values.
Calculation:
1. Find the largest and smallest numbers in the data set
2. Subtract the smallest number from the largest number
3. Difference = Range
Ungrouped data:
Range = highest value - lowest value
Grouped data:
Range = upper limit of last class - lower limit of first class
Example:
Consultant X:
37 minutes highest value
3 minutes smallest value
Range = 37 - 3 = 34 minutes
Consultant Y:
13 minutes highest value
9 minutes smallest value
Range = 13 - 9 = 4 minutes
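The range calculation for ungrouped data can be sketched in a few lines of Python. The two lists are the waiting times as read from the example table, and the function name `value_range` is only an illustrative choice.

```python
def value_range(values):
    """Range of ungrouped data: highest value minus lowest value."""
    return max(values) - min(values)

# Waiting times in minutes, as read from the example table
consultant_x = [5, 15, 12, 3, 4, 19, 37, 11, 6, 34]
consultant_y = [11, 12, 10, 13, 11, 10, 9, 13, 9, 11]

range_x = value_range(consultant_x)
range_y = value_range(consultant_y)
print("Consultant X range:", range_x, "minutes")  # 34
print("Consultant Y range:", range_y, "minutes")  # 4
```

Only the two extreme values enter the result, which is why the range says nothing about how the other waiting times are distributed.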
http://www.mathgoodies.com/lessons/vol8/range.html
The interquartile range (IQR) is the distance between the 75th percentile and the 25th percentile. The IQR is essentially the range of the middle 50% of the data. Because it uses the middle 50%, the IQR is not affected by outliers or extreme values.
This range is the difference between the third and first quartiles: IQR = Q3 - Q1
Advantages over the range:
1. Not sensitive to extreme values in a data set
2. Not sensitive to the sample size
Calculation:
1. Put the values in order from low to high
2. Divide the set of values into quarters (1/4s)
3. For the middle 50% of the values, subtract the lower value from the higher value
Example:
16 salespeople were given 12 problems associated with on-the-road sales. For keeping automobile expenses, the rankings follow:
1 1 1 2 3 4 5 6 7 8 8 9 10 11 11 12
Range = 12 - 1 = 11
Location of Q1 = 1/4 (n + 1) = 1/4 (16 + 1) = 4.25, so take the 4th observation: Q1 = 2
Location of Q3 = 3/4 (n + 1) = 3/4 (16 + 1) = 12.75, so take the 13th observation: Q3 = 10
Interquartile Range = 10 - 2 = 8
The middle 50% of respondents lie within 8 rank-order points of each other.
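The quartile locations in this example can be reproduced with a short Python sketch. It assumes the convention used in the example: compute the location as a fraction of (n + 1) and take the nearest whole observation. The helper name `observation_at` is hypothetical.

```python
def observation_at(sorted_values, fraction):
    """Observation nearest to location fraction * (n + 1), counting from 1."""
    n = len(sorted_values)
    location = fraction * (n + 1)      # 4.25 for Q1, 12.75 for Q3 when n = 16
    position = round(location)         # nearest whole observation
    return sorted_values[position - 1]

rankings = [1, 1, 1, 2, 3, 4, 5, 6, 7, 8, 8, 9, 10, 11, 11, 12]
ordered = sorted(rankings)

q1 = observation_at(ordered, 0.25)
q3 = observation_at(ordered, 0.75)
iqr = q3 - q1
print("Q1 =", q1, "Q3 =", q3, "IQR =", iqr)  # Q1 = 2 Q3 = 10 IQR = 8
```

Other quartile conventions interpolate between neighbouring observations and can give slightly different values for the same data. Python's `round` also sends exact halves to the nearest even integer, so a location ending in .5 would need an explicit rule.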
The quartile deviation is based on the lower quartile Q1 and the upper quartile Q3.
The difference Q3 - Q1 is called the interquartile range. The difference Q3 - Q1 divided by 2 is called the semi-interquartile range or the quartile deviation.
Thus
Quartile Deviation (Q.D.) = (Q3 - Q1) / 2
The quartile deviation is a slightly better measure of absolute dispersion than the range, but it ignores the observations in the tails.
If we take different samples from a population and calculate their quartile deviations, their values are quite likely to differ considerably. This is called sampling fluctuation. The quartile deviation is not a popular measure of dispersion.
The quartile deviation calculated from sample data does not help us to draw any conclusion (inference) about the quartile deviation in the population.
Variance and standard deviation are the most familiar measurements of dispersion.
Variance is the arithmetic mean (average) of the squared differences between each observation and the arithmetic mean of all observations.
http://www.mathsisfun.com/standarddeviation.html
Standard deviation is the square root of the variance.
Most frequently used measure of dispersion
It is roughly the average distance of the observed values from the mean value for a set of data (strictly, the square root of the average squared distance)
Basic rule -- more spread will yield a larger SD
Calculation:
1. Calculate the arithmetic mean (AM)
2. Subtract each individual value from the AM
3. Square each difference -- multiply it by itself
4. Sum (total) the squared differences
5. Divide the total by the number of values (N)
6. Calculate the square root of the result
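The six steps can be sketched in Python as follows. The data are Consultant Y's waiting times from the earlier example table. The function name is illustrative, and the comparison with `statistics.pstdev` (the standard library's divide-by-N form) is added here only as a cross-check.

```python
import math
import statistics

def standard_deviation(values):
    """Standard deviation following the six steps, dividing by N."""
    mean = sum(values) / len(values)          # step 1: arithmetic mean
    deviations = [x - mean for x in values]   # step 2: differences (sign vanishes on squaring)
    squares = [d * d for d in deviations]     # step 3: square each difference
    total = sum(squares)                      # step 4: sum the squares
    variance = total / len(values)            # step 5: divide by N
    return math.sqrt(variance)                # step 6: square root

waits_y = [11, 12, 10, 13, 11, 10, 9, 13, 9, 11]
sd_y = standard_deviation(waits_y)
print(round(sd_y, 1))                                   # 1.4
print(math.isclose(sd_y, statistics.pstdev(waits_y)))   # True
```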
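Variance formulas dividing by n - 1 (the standard deviation s is the square root of s^2):
Ungrouped data:
$$s^2 = \frac{1}{n-1}\sum (x-\bar{x})^2 \quad\text{OR}\quad s^2 = \frac{1}{n-1}\left[\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right]$$
Grouped data (n = \sum f):
$$s^2 = \frac{1}{n-1}\sum f\,(x-\bar{x})^2 \quad\text{OR}\quad s^2 = \frac{1}{n-1}\left[\sum f x^2 - \frac{\left(\sum f x\right)^2}{n}\right]$$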
$$\text{SD} = \sqrt{\frac{\text{Sum of squares of individual deviations from arithmetic mean}}{\text{Number of items}}}$$
Example:
Scores   Deviations From Mean   Squares of Deviations
01       -13                    169
03       -11                    121
05       -09                    81
06       -08                    64
11       -03                    9
12       -02                    4
15       +01                    1
19       +05                    25
34       +20                    400
37       +23                    529
Total 143                       1403
No. of scores = 10
M = 143/10 = 14.3, rounded to 14
SD = square root of (1403/10) = 11.8
$$\text{SD} = \sqrt{\frac{\text{Sum of squares of individual deviations from arithmetic mean}}{\text{Number of items}}}$$
Example:
Scores   Deviations From Mean   Squares of Deviations
09       -02                    4
09       -02                    4
10       -01                    1
10       -01                    1
11       00                     0
11       00                     0
11       00                     0
12       +01                    1
13       +02                    4
13       +02                    4
Total 109                       19
No. of scores = 10
M = 109/10 = 10.9, rounded to 11
SD = square root of (19/10) = 1.4
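As a check on the two worked examples, the sketch below recomputes each SD from the listed deviations. Dividing by the number of items follows the examples. The n - 1 divisor (the usual sample form) is shown alongside purely for comparison, and the function name is hypothetical.

```python
import math

# Deviations from the (rounded) mean, as listed in the two examples
deviations_1 = [-13, -11, -9, -8, -3, -2, 1, 5, 20, 23]
deviations_2 = [-2, -2, -1, -1, 0, 0, 0, 1, 2, 2]

def sd_from_deviations(deviations, divisor):
    """Square root of (sum of squared deviations / divisor)."""
    sum_of_squares = sum(d * d for d in deviations)
    return math.sqrt(sum_of_squares / divisor)

for label, devs in (("Example 1", deviations_1), ("Example 2", deviations_2)):
    n = len(devs)
    by_n = sd_from_deviations(devs, n)                # divisor used in the examples
    by_n_minus_1 = sd_from_deviations(devs, n - 1)    # sample form, for comparison
    print(label, round(by_n, 1), round(by_n_minus_1, 1))
# Example 1 11.8 12.5
# Example 2 1.4 1.5
```

With only 10 scores the choice of divisor changes the result noticeably. The gap shrinks as the number of items grows.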
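NORMAL DISTRIBUTION CURVE
[Figure: normal distribution curve marked at 1, 2 and 3 standard deviations either side of the mean, for the two examples (mean 14, SD 11.8 and mean 11, SD 1.4)]
Within 1 standard deviation (-1 to +1), 68%: 2.2 to 25.8 (mean 14); 9.6 to 12.4 (mean 11)
Within 2 standard deviations (-2 to +2), 95%: 01 to 37 (mean 14); 8.2 to 13.8 (mean 11)
Within 3 standard deviations (-3 to +3), 99.7%: 01 to 37 (mean 14); 6.8 to 15.2 (mean 11)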
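The interval limits on these slides are the mean plus or minus 1, 2 or 3 standard deviations. A minimal sketch, using the rounded means and SDs from the two worked examples (14 and 11.8, 11 and 1.4) and an illustrative function name:

```python
def sd_interval(mean, sd, k):
    """Limits of mean +/- k standard deviations, rounded to one decimal."""
    return (round(mean - k * sd, 1), round(mean + k * sd, 1))

coverage = {1: "68%", 2: "95%", 3: "99.7%"}   # shares for a normal distribution
examples = {"mean 14, SD 11.8": (14, 11.8), "mean 11, SD 1.4": (11, 1.4)}

for k, share in coverage.items():
    for label, (mean, sd) in examples.items():
        low, high = sd_interval(mean, sd, k)
        print(f"{k} SD ({share}), {label}: {low} to {high}")
```

For the first example the raw limits at 2 and 3 standard deviations fall below zero and above the largest score. The slides show 01 and 37 there instead, which appear to be the smallest and largest observed scores. The percentages themselves apply only to data that are approximately normally distributed.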
Range: use the range sparingly as the measure of dispersion
Interquartile Range: when the median is the measure of central tendency, use the interquartile range
Standard Deviation: when the mean is the measure of central tendency, use the standard deviation
The coefficient of variation (CV), also known as relative variability, equals the standard deviation divided by the mean. It can be expressed either as a fraction or a percent.
It only makes sense to report CV for a variable, such as mass or enzyme activity, where 0.0 is defined to really mean zero.
A weight of zero means no weight. So it
would be meaningless to report a CV of
values expressed
$$CV = \frac{s}{\bar{x}} \times 100$$
where s is the standard deviation and \bar{x} is the mean.
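As an illustration, the CV can be computed for the two worked standard deviation examples (mean 14 with SD 11.8, and mean 11 with SD 1.4). The function name and the guard against a zero mean are assumptions made for this sketch.

```python
def coefficient_of_variation(sd, mean):
    """CV as a percent: standard deviation divided by the mean, times 100."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean * 100

cv_1 = coefficient_of_variation(11.8, 14)
cv_2 = coefficient_of_variation(1.4, 11)
print(round(cv_1, 1))  # 84.3
print(round(cv_2, 1))  # 12.7
```

Because the CV is relative to the mean, it allows the spread of the two sets of scores to be compared on a common scale.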
The terms skewed and askew are used to refer to something that is out of line or distorted on one side. When referring to the shape of frequency or probability distributions, skewness refers to asymmetry of the distribution.
A distribution with an asymmetric tail extending out to the right is referred to as positively skewed or skewed to the right, while a distribution with an asymmetric tail extending out to the left is referred to as negatively skewed or skewed to the left. Skewness can range from minus infinity to positive infinity.
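PEARSON'S MEASURE OF SKEWNESS
$$\text{Skewness} = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}} \quad\text{or}\quad \text{Skewness} = \frac{3\,(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$$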
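Both forms of Pearson's measure can be tried on the ten scores from the second standard deviation example. This is a sketch only. Using the divide-by-N standard deviation (`statistics.pstdev`) is an assumption, chosen to match the worked examples.

```python
import statistics

scores = [9, 9, 10, 10, 11, 11, 11, 12, 13, 13]

mean = statistics.mean(scores)       # 10.9
median = statistics.median(scores)   # 11.0
mode = statistics.mode(scores)       # 11 (occurs three times)
sd = statistics.pstdev(scores)       # divide-by-N standard deviation

skew_mode = (mean - mode) / sd             # (Mean - Mode) / SD
skew_median = 3 * (mean - median) / sd     # 3 (Mean - Median) / SD

print(round(skew_mode, 2))    # -0.07
print(round(skew_median, 2))  # -0.22
```

Both values are slightly negative, so these scores lean marginally to the left. Since the median and mode coincide here, the second form is exactly three times the first.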
