

A frequency distribution is a tabular arrangement of numerical data showing its classes and
the frequency (denoted by the symbol f), or number of occurrences, of the values belonging to
each class. Data arranged in a frequency distribution table are usually measured on an interval or
ratio scale.

Classes are grouped according to a predetermined class interval that is defined by the end
numbers of each class. These end numbers are called class limits and are
composed of a lower limit (LL) and an upper limit (UL). Their midpoint is known as the class mark
(often denoted by X). The true class limits, though, are the class boundaries. These are the
values midway between the upper limit of a class and the lower limit of the next (upper
class boundary, denoted by UCB) and the values midway between the lower limit of a
class and the upper limit of the previous one (lower class boundary, denoted by LCB). The difference
between the upper class boundary of one class and that of the class before it is called the class
size. It can also be computed using the class limits instead of the class boundaries, for example as
the difference between two successive lower limits.
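
The definitions above can be sketched in a few lines of Python. This is only an illustration: the function name class_details and the three sample classes are assumptions made for the example, and the sketch assumes a constant gap between the upper limit of one class and the lower limit of the next.

```python
def class_details(limits):
    """Return (LCB, UCB, class mark) for each (LL, UL) pair.

    Assumes consecutive classes with a constant gap between one class's
    upper limit and the next class's lower limit (1 for whole-number data).
    """
    gap = limits[1][0] - limits[0][1]    # e.g. 60 - 59 = 1
    half = gap / 2
    details = []
    for ll, ul in limits:
        lcb = ll - half                  # lower class boundary
        ucb = ul + half                  # upper class boundary
        mark = (ll + ul) / 2             # class mark X
        details.append((lcb, ucb, mark))
    return details


limits = [(55, 59), (60, 64), (65, 69)]  # illustrative class limits
details = class_details(limits)

# Class size: difference between successive upper class boundaries
class_size = details[1][1] - details[0][1]

print(details)      # [(54.5, 59.5, 57.0), (59.5, 64.5, 62.0), (64.5, 69.5, 67.0)]
print(class_size)   # 5.0
```

The same class size results from subtracting two successive lower limits (60 - 55 = 5).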

In constructing a frequency distribution table, it should be clearly understood that classes
must be mutually exclusive, which means that the same value or observation cannot appear in
two classes. Likewise, classes must be exhaustive, which means that all observed values
must be accounted for.
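
One possible way to check both conditions is to count, for every value, how many classes contain it: exactly one class means the classes are mutually exclusive and exhaustive for that data. The helper names and the sample classes below are hypothetical, chosen only for illustration.

```python
def count_memberships(value, limits):
    """Number of classes whose limits (LL, UL) contain the value."""
    return sum(1 for ll, ul in limits if ll <= value <= ul)


def is_valid_classing(values, limits):
    """True when every value falls in exactly one class."""
    return all(count_memberships(v, limits) == 1 for v in values)


good = [(55, 59), (60, 64), (65, 69)]
overlapping = [(55, 60), (60, 65), (65, 70)]   # 60 and 65 fall in two classes
too_narrow = [(55, 59), (60, 64)]              # 65 and 68 are left out

values = [55, 60, 65, 68]
print(is_valid_classing(values, good))          # True
print(is_valid_classing(values, overlapping))   # False (not mutually exclusive)
print(is_valid_classing(values, too_narrow))    # False (not exhaustive)
```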

Steps in constructing a frequency distribution

1. Arrange the raw data in either ascending or descending order.

2. Determine the number of classes. There is no definite rule as to how many classes
should be used, but in general there should be no fewer than 5 and no more than 15.
Sturges’ formula can help in estimating the number of classes.
Sturges’ formula: K = 1 + 3.322 log n
Where: K = number of classes
n = number of observations
log = common (base-10) logarithm

3. Determine the approximate class size. This can be done with the following steps:

a. Compute the range (R):
R = maximum value – minimum value

b. Solve for the class size (C) by dividing the range (R) by the number of classes (K):
C = R / K
If necessary, round off the value of C.

4. Determine the lowest class limit, making sure that the smallest value in the data set is
included. To find the remaining class limits, simply add the class size to the limit of the previous
class.

5. Tally all the numerical figures in the appropriate classes. Get the sum and check against the
total number of values in the data set.
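
Steps 2 to 4 can be sketched in Python as follows. The function names are assumptions made for this illustration, the inputs (60 observations, minimum 55, maximum 88) are the figures of the worked example, and the upper limits are derived assuming whole-number data.

```python
import math


def sturges_classes(n):
    """Sturges' formula: K = 1 + 3.322 log n (base-10 logarithm)."""
    return 1 + 3.322 * math.log10(n)


def class_size(minimum, maximum, k):
    """Class size C = R / K, where R = maximum - minimum."""
    return (maximum - minimum) / k


def class_limits(minimum, size, count):
    """(LL, UL) pairs for whole-number data, starting at the minimum."""
    return [(minimum + i * size, minimum + (i + 1) * size - 1)
            for i in range(count)]


k = sturges_classes(60)
c = class_size(55, 88, k)
print(round(k, 3), round(c, 3))               # 6.907 4.778
print(class_limits(55, round(c), round(k)))   # 7 classes, from (55, 59) to (85, 89)
```

Rounding K and C to the nearest whole number is a choice made for this sketch; whichever rounding is used, the last class must still contain the largest value.
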
Using the data on the weights in kilograms of 60 diabetic female patients aged 30 – 50
years old, a frequency distribution table can be constructed following the steps discussed:

1. Organizing the data in an array.

55 60 65 68 71 77
57 60 65 70 72 77
57 60 65 70 72 77
58 63 65 70 72 77
59 63 65 70 74 80
59 63 66 70 74 81
59 63 66 70 74 83
59 63 66 71 74 83
60 63 67 71 75 85
60 65 68 71 75 88

2. Computing the number of classes with the use of Sturges’ formula.

K = 1 + 3.322 log n
= 1 + 3.322 log 60
= 1 + 3.322 (1.778151)
= 1 + 5.907018
= 6.907018, or about 7 classes
3. Determining the class size

R = 88 – 55 = 33
C = R / K
= 33 / 6.907018
= 4.777749, rounded off to 5

4. Deciding the lower class limit

The lower class limit should be 55 since it is the minimum value in the given set of data.
The upper class limit should be 59 since the class size is 5. The other classes are obtained by
adding the class size to the limits of the previous class. The result, together with the other
elements, is as follows:

Frequency Distribution of the Weights In Kilograms of 60 Female Patients

Classes    Tally    f     Class Boundaries    X

55 - 59             8     54.5 - 59.5         57
60 - 64             11    59.5 - 64.5         62
65 - 69             12    64.5 - 69.5         67
70 - 74             17    69.5 - 74.5         72
75 - 79             6     74.5 - 79.5         77
80 - 84             4     79.5 - 84.5         82
85 - 89             2     84.5 - 89.5         87
                    n = 60
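
As a cross-check of the tally, the frequencies can be recomputed from the 60 weights listed in step 1. The snippet below is a minimal sketch; the variable names are chosen for this illustration.

```python
# The 60 weights from the array in step 1, row by row
weights = [
    55, 60, 65, 68, 71, 77,
    57, 60, 65, 70, 72, 77,
    57, 60, 65, 70, 72, 77,
    58, 63, 65, 70, 72, 77,
    59, 63, 65, 70, 74, 80,
    59, 63, 66, 70, 74, 81,
    59, 63, 66, 70, 74, 83,
    59, 63, 66, 71, 74, 83,
    60, 63, 67, 71, 75, 85,
    60, 65, 68, 71, 75, 88,
]

# Class limits 55-59, 60-64, ..., 85-89 (class size 5, seven classes)
limits = [(55 + 5 * i, 59 + 5 * i) for i in range(7)]

# Tally: count the weights that fall within each pair of class limits
frequencies = [sum(1 for w in weights if ll <= w <= ul) for ll, ul in limits]

print(frequencies)        # [8, 11, 12, 17, 6, 4, 2]
print(sum(frequencies))   # 60, which matches the number of observations
```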

The complete frequency distribution table should look similar to the following tables when using
class limits and class boundaries, respectively:

Frequency Distribution of the Weights In Kilograms of 60 Female Patients

Weights (in kg)    f     X

55 - 59            8     57
60 - 64            11    62
65 - 69            12    67
70 - 74            17    72
75 - 79            6     77
80 - 84            4     82
85 - 89            2     87


Frequency Distribution of the Weights In Kilograms of 60 Female Patients

Weights (in kg)    f     X

54.5 - 59.5        8     57
59.5 - 64.5        11    62
64.5 - 69.5        12    67
69.5 - 74.5        17    72
74.5 - 79.5        6     77
79.5 - 84.5        4     82
84.5 - 89.5        2     87

The frequency distribution allows derivations of other distributions which can be used as tools
for interpretations of the given data set. These variations are the relative frequency and the
cumulative frequency distributions.

Relative frequency distribution

The relative frequency distribution (rf) expresses, in percent, the ratio of each class frequency to the total
frequency in the given set of data. It is computed by dividing the number of observations in
each class by the total number of observations (the sample size) and multiplying the result by
100. In formula,

rf = (f / n) * 100
where rf = the relative frequency of each class
f = frequency of each class
n = sample size

To illustrate, consider the data in Table 3.2. The relative frequency of the first class can be computed as follows:
rf = (8 / 60) * 100
= 0.1333 * 100
= 13.33%

For interpretation, we can say that 13.33% of the female diabetic patients aged 30 – 50 years old
weigh from 55 to 59 kilograms. The complete relative frequency distribution is as
follows:

Relative Frequency Distribution of the Weights of 60 Female Patients (In Percentage)

Weights (in kg)    f     rf (%)

55 - 59            8     13.33
60 - 64            11    18.33
65 - 69            12    20.00
70 - 74            17    28.33
75 - 79            6     10.00
80 - 84            4     6.67
85 - 89            2     3.33
Total              60    99.99

In theory, the total should be exactly 100%, but the difference is very small and can
be attributed to the rounding off of numbers.
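
The relative frequencies and the effect of rounding can be reproduced with a short Python sketch. It uses the class frequencies from the table; the variable names are chosen for this illustration.

```python
frequencies = [8, 11, 12, 17, 6, 4, 2]
n = sum(frequencies)                      # sample size, 60

# rf = (f / n) * 100, rounded to two decimal places as in the table
rf = [round(f * 100 / n, 2) for f in frequencies]

print(rf)                   # [13.33, 18.33, 20.0, 28.33, 10.0, 6.67, 3.33]
print(round(sum(rf), 2))    # 99.99, short of 100 because each entry was rounded
```

Summing the unrounded values instead gives 100, which confirms that the gap comes from rounding alone.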

Graphically, the above table can be shown through the use of a histogram or a frequency
polygon. A histogram is composed of a group of adjacent rectangles whose bases lie on the
horizontal axis. Each base either extends from the lower class boundary to the upper class
boundary or is centered on the class mark. The width of each rectangle corresponds to the
class size while its height corresponds to the class frequency.

In plotting the histogram, it is assumed that frequencies are evenly distributed within the
class. In a frequency polygon, it is assumed that the frequencies of each class are concentrated
at the midpoint of the class interval, or class mark. A frequency polygon is a graph that is drawn
by plotting the frequencies against the class marks and connecting the plotted points with
straight lines. The resulting polygon is closed by adding an empty class (frequency of zero) at each
end and plotting a point on the horizontal axis at each of their class marks.
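
The coordinates needed for both graphs can be prepared as in the sketch below. It only computes the points and does not draw anything; the variable names are assumptions for this illustration, and any plotting tool can take the resulting lists.

```python
class_marks = [57, 62, 67, 72, 77, 82, 87]
frequencies = [8, 11, 12, 17, 6, 4, 2]
size = 5                                   # class size

# Histogram: one rectangle per class as (left edge, width, height),
# where the left edge is the lower class boundary
bars = [(x - size / 2, size, f) for x, f in zip(class_marks, frequencies)]

# Frequency polygon: frequencies plotted against class marks, closed by
# an empty class (frequency 0) one class size before and after the data
polygon = (
    [(class_marks[0] - size, 0)]
    + list(zip(class_marks, frequencies))
    + [(class_marks[-1] + size, 0)]
)

print(bars[0])                    # (54.5, 5, 8)
print(polygon[0], polygon[-1])    # (52, 0) (92, 0)
```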

Cumulative frequency distribution

The cumulative frequency distribution shows the accumulation of frequencies of consecutive
classes from the start or end of the distribution. It can be derived by simply adding the class
frequencies. This type of distribution gives the partial sums of the data
categorized in terms of classes. As explained previously, it has two kinds. These are:

1. The less than cumulative frequency distribution (<cf) gives, for each class, the number of
observations that are less than the upper class boundary of that class. Using the
same set of data in Table 3.3, the following <cf distribution can be computed:

Less than Cumulative Frequency Distribution of the Weights In Kilograms of 60 Female Patients

Weights (in kg)    f     <cf

54.5 - 59.5        8     8
59.5 - 64.5        11    19
64.5 - 69.5        12    31
69.5 - 74.5        17    48
74.5 - 79.5        6     54
79.5 - 84.5        4     58
84.5 - 89.5        2     60

To interpret the first <cf, we can say that 8 female diabetic patients weigh less than 59.5
kg. Similarly, for the next <cf, we can say that 19 female diabetic patients weigh less than 64.5
kg, and so on.
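
The <cf column is a running total of the class frequencies, which can be sketched with itertools.accumulate from the Python standard library. The variable names are chosen for this illustration.

```python
from itertools import accumulate

frequencies = [8, 11, 12, 17, 6, 4, 2]
upper_boundaries = [59.5, 64.5, 69.5, 74.5, 79.5, 84.5, 89.5]

# Running total from the first class down to the last
less_than_cf = list(accumulate(frequencies))
print(less_than_cf)   # [8, 19, 31, 48, 54, 58, 60]

# Each <cf is read against the upper class boundary of its class
for ucb, cf in zip(upper_boundaries, less_than_cf):
    print(f"{cf} patients weigh less than {ucb} kg")
```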

2. The greater than cumulative frequency distribution (>cf) gives, for each class, the number of
observations that are greater than the lower class boundary of that class. Using the same
set of data, we can derive the following >cf distribution:

Greater than Cumulative Frequency Distribution of the Weights In Kilograms of 60 Female Patients

Weights (in kg)    f     >cf

54.5 - 59.5        8     60
59.5 - 64.5        11    52
64.5 - 69.5        12    41
69.5 - 74.5        17    29
74.5 - 79.5        6     12
79.5 - 84.5        4     6
84.5 - 89.5        2     2

For the first derived >cf, we can take 60 to mean that 60 female diabetic patients weigh
more than 54.5 kilograms. For the second computed >cf, we can say that 52 female diabetic
patients weigh more than 59.5 kilograms, and so on.
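
The >cf column is the same running total taken from the last class upward. A minimal sketch, with variable names chosen for this illustration:

```python
from itertools import accumulate

frequencies = [8, 11, 12, 17, 6, 4, 2]
lower_boundaries = [54.5, 59.5, 64.5, 69.5, 74.5, 79.5, 84.5]

# Accumulate from the last class backward, then restore the original order
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]
print(greater_than_cf)   # [60, 52, 41, 29, 12, 6, 2]

# Each >cf is read against the lower class boundary of its class
for lcb, cf in zip(lower_boundaries, greater_than_cf):
    print(f"{cf} patients weigh more than {lcb} kg")
```
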
The graph of the less than cumulative frequency is called the less than ogive (<ogive) while that
of the greater than cumulative frequency is called the greater than ogive (>ogive). These two types of
ogives are drawn by plotting the <cf against the upper class boundaries for the less
than ogive and the >cf against the lower class boundaries for the greater than ogive.

Using the above set of data, the following is a sample of < and > ogives:

Figure 1. Frequency polygon of the Weights of 60 Female Patients

Exercise: For submission next meeting

The mini business sentiment index of PROC in 30 months is given below.

57.2 37.2 61.7 28.2 44.2
39.9 67.2 78.2 65.4 56.2
63.9 44.3 55.4 38.2 45.5
49.3 60.3 40.1 50.5 46.2
45.6 41.1 46.0 36.0 47.5
40.8 36.5 62.1 60.2 50.1

a. Construct the Frequency Distribution Table with the following:

i. Class limits
ii. Class boundaries
iii. Frequency
iv. Class mark
v. Relative Frequency
vi. <CF
vii. >CF
b. Construct a histogram with superimposed polygon
c. Construct the < ogive and > ogive.
