Frequency Distributions and Graphs

1. Frequency Distributions
A frequency distribution organizes a collection of observations by sorting them
into classes and showing the frequency (number of occurrences) in each class.
Constructing a frequency distribution is a convenient way of organizing raw data.

1.1 Basic Types of Frequency Distribution:

1. Categorical frequency distribution is used for data that can be placed in
specific categories, such as nominal or ordinal level data.
Nominal data consist of names, labels, or categories only.
Ordinal data can be arranged in some order, but differences between data
values either cannot be determined or are meaningless.

Example of Categorical Frequency Distribution:
The following results were obtained from a sample survey with categories
A, B and C. The third column is called the column of frequency, and CF is the
cumulative frequency.

Category   Tally                 Frequency (f)   CF
A          ||||/ |               6               6
B          ||||/ ||||            9               15
C          ||||/ ||||/ ||||/     15              30
                                 Sum = n = 30

2. Frequency Distribution for Ungrouped Data: observations are sorted into
classes of single values.

3. Frequency Distribution for Grouped Data: observations are sorted into
classes of more than one value.

The following are the basic terms associated with frequency tables.
a) Lower class limit: the smallest data value that can be included in the
class.
b) Upper class limit: the largest data value that can be included in the
class.
c) Class boundaries: the values used to separate the classes so that there are no
gaps in the frequency distribution.
d) Class mark: the midpoint of the class.

$$X_m = \frac{\text{lower limit} + \text{upper limit}}{2}$$
e) Class width: the difference between two consecutive lower class
limits.


Example 1:

Weekly Expenses of 80 Employees
Weekly Expenses    Number of Employees
101 - 300          5
301 - 500          16
501 - 700          11
701 - 900          40
901 - 1100         8

Class width = 301 - 101 = 200

Example 2: When 40 people were surveyed at Greenbelt 3, they reported
the distance they drove to the mall. The results (in kilometers) are given below.

2 8 1 5 9 5 14 10 31 20
15 4 10 6 5 5 1 8 12 10
25 40 31 24 20 20 3 9 15 15
25 8 1 1 16 23 18 25 21 12

Construct a frequency distribution table.
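
One possible way to work Example 2 is sketched below in Python. The notes do not fix the number of classes or the class width, so the grouping used here (5 classes of width 8, starting at 1 km) is an assumption chosen for illustration, and the helper name `frequency_table` is hypothetical.

```python
# Distances (km) reported by the 40 people surveyed in Example 2.
distances = [
    2, 8, 1, 5, 9, 5, 14, 10, 31, 20,
    15, 4, 10, 6, 5, 5, 1, 8, 12, 10,
    25, 40, 31, 24, 20, 20, 3, 9, 15, 15,
    25, 8, 1, 1, 16, 23, 18, 25, 21, 12,
]


def frequency_table(data, start, width, num_classes):
    """Return rows of (lower limit, upper limit, frequency, cumulative frequency)."""
    rows = []
    cumulative = 0
    for i in range(num_classes):
        lower = start + i * width
        upper = lower + width - 1  # integer data, so both limits are inclusive
        freq = sum(1 for x in data if lower <= x <= upper)
        cumulative += freq
        rows.append((lower, upper, freq, cumulative))
    return rows


# Assumed grouping: 5 classes of width 8, starting at the smallest value (1).
table = frequency_table(distances, start=1, width=8, num_classes=5)
for lower, upper, freq, cf in table:
    print(f"{lower:>2} - {upper:<2}  f = {freq:>2}  cf = {cf:>2}")
```

A different class width would give a different but equally valid table, as long as every observation falls in exactly one class.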

Cumulative Frequency for a table whose classes are in increasing order is the sum
of the frequencies for that class and all previous classes. It is used when cumulative
totals are desired.

Cumulative frequency for a table whose classes are in decreasing order is the sum of
the frequencies for that class and all succeeding classes.
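
The two definitions above can be sketched in Python using the frequencies of Example 1 (5, 16, 11, 40, 8). The variable names are illustrative only; the point is that both orderings lead to the same running totals, read from opposite ends of the table.

```python
from itertools import accumulate

# Frequencies from Example 1, classes listed in increasing order.
freq_increasing = [5, 16, 11, 40, 8]

# Increasing order: each class plus all previous classes (running total).
cf_increasing = list(accumulate(freq_increasing))

# The same table written with the classes in decreasing order.
freq_decreasing = freq_increasing[::-1]

# Decreasing order: each class plus all succeeding classes (suffix total).
cf_decreasing = list(accumulate(reversed(freq_decreasing)))[::-1]

print(cf_increasing)
print(cf_decreasing)
```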

2. Histograms, Frequency Polygons, and Ogives

Most people grasp the meaning of data more easily when they are presented
graphically rather than numerically.

2.1 Histogram displays the data using contiguous vertical bars of various heights
to represent the frequencies of the classes.

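[Figure: the frequency table of Example 1 with callouts for the variable, the 2nd class (301 - 500), the lower limit of the 5th class, the upper limit of the 5th class, and the frequency of the 2nd class; and a histogram with axes labeled frequency and class boundaries]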

2.2 Frequency Polygon displays the data by using lines that connect points plotted
for the frequencies at the midpoints of the classes.

2.3 Ogive is a graph that represents the cumulative frequencies of the classes.

2.4 Pie Graph is a circle that is divided into sections or wedges according to the
percentage of frequencies in each category of the distribution.

Example 3: In a survey, 500 families were asked the question "Where are you
planning to spend your vacation this summer?" The survey resulted in the following
distribution and the corresponding pie graph.

Place                Number of People   Percentage
Boracay              200                40%
Palawan              125                25%
Tagaytay             90                 18%
Baguio               35                 7%
None of the Above    50                 10%
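
The percentages in Example 3 can be checked with a short Python sketch. The central-angle step (share of the total times 360 degrees) is not stated in the notes; it is included here as the usual way each wedge of a pie graph is sized.

```python
# Responses from Example 3 (500 families surveyed).
responses = {
    "Boracay": 200,
    "Palawan": 125,
    "Tagaytay": 90,
    "Baguio": 35,
    "None of the above": 50,
}

n = sum(responses.values())

# Each wedge is proportional to its share of the total frequency.
percentages = {place: 100 * f / n for place, f in responses.items()}
angles = {place: 360 * f / n for place, f in responses.items()}

for place in responses:
    print(f"{place:<18} {percentages[place]:5.1f}%  {angles[place]:6.1f} degrees")
```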

[Figures: frequency polygon (axes: frequency, class midpoints); ogive (axes: cumulative frequency, class boundaries); pie graph for Example 3 (Boracay 40%, Palawan 25%, Tagaytay 18%, None of the above 10%, Baguio 7%)]

3. Data Description

Measures of Central Tendency (Averages) focus on the average or center of the
data.

Ungrouped Data: Mean:
$$\mu = \frac{\sum X}{N}$$ (population mean)

$$\bar{x} = \frac{\sum x}{n}$$ (sample mean)

N = total number of observations in the population
n = total number of observations in the sample

Grouped Mean

$$\text{mean} = \frac{\sum f \cdot X_m}{n}$$
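
As an illustration, the grouped mean formula can be applied to the weekly expenses table of Example 1, taking the second class as 301 - 500. This is a minimal Python sketch; the variable names are illustrative and not part of the notes.

```python
# Class limits and frequencies from Example 1.
classes = [(101, 300), (301, 500), (501, 700), (701, 900), (901, 1100)]
frequencies = [5, 16, 11, 40, 8]

# Class mark: X_m = (lower limit + upper limit) / 2
class_marks = [(lower + upper) / 2 for lower, upper in classes]

# Grouped mean: sum of f * X_m divided by n, the sum of the frequencies.
n = sum(frequencies)
grouped_mean = sum(f * xm for f, xm in zip(frequencies, class_marks)) / n

print(class_marks)
print(grouped_mean)
```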

Ungrouped Data: The median is the midpoint of the data array. Before finding this
value, the data are arranged in order, from least to greatest or vice versa. The median
will either be a specific value in the data set or fall between two values.

Grouped Median
$$Md\ (\text{median}) = L_{md} + \left(\frac{\frac{n}{2} - cf}{f}\right) w$$
Where n = sum of frequencies
cf = cumulative frequency of the class preceding the median class
f = frequency of the median class
w = class width
L_md = lower boundary of the median class

The median class is the class that contains the midpoint of the data, that is, the
first class whose cumulative frequency reaches n/2.
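
The grouped median can be sketched in Python for the table of Example 1. The class boundaries used here (100.5, 300.5, and so on) are assumed to lie halfway between adjacent class limits, and the function name `grouped_median` is hypothetical.

```python
def grouped_median(boundaries, frequencies):
    """Median of grouped data: L_md + ((n/2 - cf) / f) * w."""
    n = sum(frequencies)
    if n <= 0:
        raise ValueError("frequencies must sum to a positive number")
    half = n / 2
    cf = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(boundaries, frequencies):
        if cf + f >= half:  # first class whose cumulative frequency reaches n/2
            width = upper - lower
            return lower + (half - cf) / f * width
        cf += f


# Example 1, with boundaries assumed halfway between adjacent class limits.
boundaries = [(100.5, 300.5), (300.5, 500.5), (500.5, 700.5),
              (700.5, 900.5), (900.5, 1100.5)]
frequencies = [5, 16, 11, 40, 8]

median = grouped_median(boundaries, frequencies)
print(median)
```

Here n/2 = 40, the median class is 701 - 900 (cf = 32, f = 40, w = 200), so the formula gives 700.5 + (8/40)(200) = 740.5.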

Ungrouped Data: The mode is the value that occurs most often in the data set. A data
set can have more than one mode or none at all.

Grouped Mode
$$Mo\ (\text{mode}) = L_{Mo} + \left(\frac{d_1}{d_1 + d_2}\right) w$$
Where L_Mo = lower boundary of the modal class
w = class width
d_1 = difference between the frequency of the modal class and that of the class
preceding it
d_2 = difference between the frequency of the modal class and that of the class
succeeding it

The modal class is the class with the largest frequency.
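
A minimal Python sketch of the grouped mode, again using the table of Example 1 with boundaries assumed halfway between adjacent class limits. The function name `grouped_mode` is hypothetical, and treating a missing neighbouring class as having frequency 0 is an assumption the notes do not address.

```python
def grouped_mode(boundaries, frequencies):
    """Mode of grouped data: L_Mo + (d1 / (d1 + d2)) * w."""
    i = frequencies.index(max(frequencies))  # modal class (first one if tied)
    lower, upper = boundaries[i]
    # Assumption: a missing neighbour (first or last class) has frequency 0.
    before = frequencies[i - 1] if i > 0 else 0
    after = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    d1 = frequencies[i] - before
    d2 = frequencies[i] - after
    if d1 + d2 == 0:
        raise ValueError("mode is undefined when all frequencies are zero")
    return lower + d1 / (d1 + d2) * (upper - lower)


# Example 1: the modal class is 701 - 900 (f = 40), so d1 = 40 - 11, d2 = 40 - 8.
boundaries = [(100.5, 300.5), (300.5, 500.5), (500.5, 700.5),
              (700.5, 900.5), (900.5, 1100.5)]
frequencies = [5, 16, 11, 40, 8]

mode = grouped_mode(boundaries, frequencies)
print(round(mode, 2))
```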
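Midrange: This is a rough estimate of the middle value, found by averaging the
lowest and highest values in the data set.

$$MR = \frac{\text{lowest value} + \text{highest value}}{2}$$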

Weighted Mean: This is used to find the mean of the values of a data set that are not
equally represented. The weighted mean is found by multiplying each value by its
corresponding weight and dividing the sum of the products by the sum of the weights.

$$\bar{x}_w = \frac{\sum w x}{\sum w}$$

Geometric Mean:
$$GM = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$$

Harmonic Mean:
$$HM = \frac{n}{\sum \frac{1}{x}}$$

Example 4: A recent survey of a new cola reported the following percentages of people
who liked the taste. Find the weighted mean of the percentages.
Area   % favored   No. surveyed
1      40          1000
2      30          3000
3      50          800
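
Example 4 can be worked as follows in Python, taking the number surveyed in each area as the weight of that area's percentage. This is a sketch of the arithmetic only; the variable names are illustrative.

```python
# Example 4: percentage who liked the cola and number surveyed in each area.
percent_favored = [40, 30, 50]
number_surveyed = [1000, 3000, 800]  # these act as the weights

# Multiply each value by its weight, then divide by the sum of the weights.
weighted_sum = sum(w * x for x, w in zip(percent_favored, number_surveyed))
weighted_mean = weighted_sum / sum(number_surveyed)

print(round(weighted_mean, 2))
```

The result, 170000 / 4800, is about 35.4%, which is lower than the simple average of 40, 30 and 50 because the area with the lowest percentage has the most respondents.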

Shapes of Distribution
a. Positively Skewed Distribution: the majority of the data values fall to
the left of the mean and cluster at the lower end of the distribution;
the tail extends to the right.
b. Symmetrical Distribution: the data values are evenly distributed on
both sides of the mean. Also, when the distribution is unimodal, the
mean, median, and mode are the same and are at the center of the
distribution.
c. Negatively Skewed Distribution: the majority of the data values fall
to the right of the mean and cluster at the upper end of the
distribution; the tail extends to the left.

[Figure: three distribution curves. Symmetrical: mean, median, and mode coincide at the center. Positively skewed: mode < median < mean. Negatively skewed: mean < median < mode.]

Measures of Variation for Grouped Data

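Range: the difference between the largest and the smallest values in a given
data set.
Variance and Standard Deviation

Ungrouped Data:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$ (population variance)
$$\sigma = \sqrt{\sigma^2}$$ (population standard deviation)

Unbiased estimator of the population variance:

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$ (sample variance)

Where $\bar{x}$ = sample mean; x = observed value; n = sample size

s = sample standard deviation = $\sqrt{s^2}$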

Grouped Data:

$$s^2 = \frac{n \sum f x^2 - \left(\sum f x\right)^2}{n(n - 1)}$$

Where: x = class midpoint

Example 5: For 108 randomly selected high school students, the following IQ
frequency distribution was obtained.

Class Limits Frequency
90-98 6
99-107 22
108-116 43
117-125 28
126-134 9
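
The notes do not say what is to be computed for Example 5; assuming the intent is the grouped sample variance and standard deviation, a minimal Python sketch is given below. It uses the computational form s^2 = [n Σfx^2 - (Σfx)^2] / [n(n - 1)] with x as the class midpoint.

```python
from math import sqrt

# Class limits and frequencies from Example 5.
classes = [(90, 98), (99, 107), (108, 116), (117, 125), (126, 134)]
frequencies = [6, 22, 43, 28, 9]

# x = class midpoint
midpoints = [(lower + upper) / 2 for lower, upper in classes]
n = sum(frequencies)

sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
sum_fx2 = sum(f * x ** 2 for f, x in zip(frequencies, midpoints))

# s^2 = [n * sum(f x^2) - (sum(f x))^2] / [n (n - 1)]
variance = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
std_dev = sqrt(variance)
mean = sum_fx / n

print(n, mean, round(variance, 2), round(std_dev, 2))
```

With these data the grouped mean is 113 and the variance is 8802 / 107, roughly 82.26, so the standard deviation is about 9.07.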

Coefficient of Variation: a statistic that allows us to compare the variability of two
data sets that have different units of measurement.
For samples: $$CV = \frac{s}{\bar{x}} \cdot 100\%$$

For populations: $$CV = \frac{\sigma}{\mu} \cdot 100\%$$
The data set with the larger CV is more variable.

Coefficient of Skewness
A measure used to determine the skewness of a distribution is the Pearson
coefficient of skewness. The formula is

$$SK = \frac{3(\bar{X} - Md)}{s}$$

Where $\bar{X}$ = mean, Md = median, s = standard deviation

When the distribution is symmetrical, the coefficient is zero; when the
distribution is positively skewed, the coefficient is positive; when the distribution
is negatively skewed, the coefficient is negative.

Example 6: Find the coefficient of skewness of a distribution with mean 10,
median 8 and standard deviation 3.
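
Example 6 is a direct substitution into the Pearson formula, SK = 3(mean - median) / s. A short Python sketch, with the hypothetical helper name `pearson_skewness`:

```python
def pearson_skewness(mean, median, std_dev):
    """Pearson coefficient of skewness: SK = 3 * (mean - median) / s."""
    return 3 * (mean - median) / std_dev


# Example 6: mean 10, median 8, standard deviation 3.
sk = pearson_skewness(mean=10, median=8, std_dev=3)
print(sk)  # 2.0, so the distribution is positively skewed
```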

3.2.5 Measure of Kurtosis
Even if the curves of distributions have the same coefficient of skewness,
these curves may still differ in the sharpness of their peaks. The following figures
show different types of symmetrical curves.

Ungrouped Data:
$$K = \frac{\sum (x - \bar{x})^4}{n s^4}$$

Grouped Data:
$$K = \frac{\sum f (x - \bar{x})^4}{n s^4}$$

Where x = class midpoint and s = sample standard deviation

A distribution is said to be : Mesokurtic if K=3
Leptokurtic if K>3
Platykurtic if K<3

Example 7: Calculate the measure of kurtosis for the data in Example 5.

Measures of Position for Grouped Data

Standard Scores or Z scores measure the distance between an observation and the mean,
in units of standard deviation.

$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{x - \bar{x}}{s}$$
[Figure: types of symmetrical curves by kurtosis: mesokurtic (normal), leptokurtic (more peaked), platykurtic (flat-topped)]

If the z score is positive, the score is above the mean. If z = 0, the score equals the
mean. If z is negative, the score is below the mean.

Example 8: An IQ test has a mean of 105 and a standard deviation of 20. Find the
corresponding z score for each IQ.
a) 88 b) 122 c) 110
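
Example 8 can be checked with a few lines of Python that apply z = (value - mean) / standard deviation to each IQ. The helper name `z_score` is illustrative.

```python
def z_score(value, mean, std_dev):
    """Distance between a value and the mean, in standard deviations."""
    return (value - mean) / std_dev


# Example 8: IQ test with mean 105 and standard deviation 20.
iq_mean, iq_sd = 105, 20
scores = {"a": 88, "b": 122, "c": 110}

z = {label: z_score(x, iq_mean, iq_sd) for label, x in scores.items()}
for label, value in z.items():
    print(label, round(value, 2))
```

An IQ of 88 lies 0.85 standard deviations below the mean, 122 lies 0.85 above it, and 110 lies 0.25 above it.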

Grouped Data
The quartiles, deciles, and percentiles can be determined using the following
formula.
$$L + \left(\frac{kn - cf}{f}\right) w$$
Where k is equal to: i/4 for quartiles; i/10 for deciles; i/100 for percentiles
i = ith quartile, decile, or percentile
L = lower boundary of the quartile, decile, or percentile class
n = total number of observations
w = class width
cf = cumulative frequency of the class preceding the quartile, decile, or percentile class
f = frequency of the quartile, decile, or percentile class

Example 9: Find the third quartile, 4th decile, and 7th percentile for the given
frequency distribution below.

Class Boundaries Frequency cf
52.5-63.5 6 6
63.5-74.5 12 18
74.5-85.5 25 43
85.5-96.5 28 71
96.5-107.5 14 85
107.5-118.5 5 90
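
Example 9 can be worked with a short Python sketch of the formula L + ((kn - cf) / f) w. The function name `grouped_quantile` is hypothetical, and the class is located as the first one whose cumulative frequency reaches kn, which is an assumption about how the notes intend the class to be chosen.

```python
def grouped_quantile(k, boundaries, frequencies):
    """Return L + ((k*n - cf) / f) * w for the class containing position k*n."""
    n = sum(frequencies)
    target = k * n
    cf = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(boundaries, frequencies):
        if f > 0 and cf + f >= target:
            return lower + (target - cf) / f * (upper - lower)
        cf += f
    raise ValueError("k must be between 0 and 1")


# Class boundaries and frequencies from Example 9 (n = 90, w = 11).
boundaries = [(52.5, 63.5), (63.5, 74.5), (74.5, 85.5),
              (85.5, 96.5), (96.5, 107.5), (107.5, 118.5)]
frequencies = [6, 12, 25, 28, 14, 5]

q3 = grouped_quantile(3 / 4, boundaries, frequencies)    # third quartile
d4 = grouped_quantile(4 / 10, boundaries, frequencies)   # 4th decile
p7 = grouped_quantile(7 / 100, boundaries, frequencies)  # 7th percentile

print(round(q3, 3), round(d4, 3), round(p7, 3))
```

For the third quartile, kn = 67.5 falls in the class 85.5 - 96.5 (cf = 43, f = 28), giving 85.5 + (24.5 / 28)(11) = 95.125; the 4th decile and 7th percentile follow the same steps with kn = 36 and kn = 6.3.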
