A frequency distribution is a tabular arrangement of numerical data showing its classes and
the frequency (denoted by the symbol f), or number of occurrences, of the values belonging to
each class. Data arranged in a frequency distribution table are typically measured on an
interval or ratio scale.
Classes are grouped according to a predetermined class interval that is defined by the end
numbers of each class. These end numbers are called class limits and are
composed of a lower limit (LL) and an upper limit (UL). Their midpoint is known as the class mark
(often denoted by X). The true class limits, though, are the class boundaries. The upper class
boundary (UCB) is the value midway between the upper limit of a class and the lower limit of
the next class. The lower class boundary (LCB) is the value midway between the lower limit of a
class and the upper limit of the previous class. The difference between the upper class
boundary of a class and that of the class before it is called the class size. It can also be
computed from the class limits instead of the class boundaries, for example as the difference
between the lower limits of two consecutive classes.
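As a minimal sketch of these definitions, the Python snippet below derives the class marks, class boundaries, and class size from a list of class limits. The limits are those of the weights example used later in this section. The function name class_summary is hypothetical, and the code assumes equal-width classes listed in ascending order.

```python
def class_summary(limits):
    """Return (marks, boundaries, size) for equal-width classes.

    limits: list of (lower_limit, upper_limit) pairs in ascending order.
    """
    # Half the gap between one class's upper limit and the next class's lower limit.
    half = (limits[1][0] - limits[0][1]) / 2
    # Class mark: midpoint of the lower and upper limits.
    marks = [(ll + ul) / 2 for ll, ul in limits]
    # Class boundaries: limits pushed outward by half the gap.
    boundaries = [(ll - half, ul + half) for ll, ul in limits]
    # Class size: upper boundary minus lower boundary of the same class.
    size = boundaries[0][1] - boundaries[0][0]
    return marks, boundaries, size


limits = [(55, 59), (60, 64), (65, 69), (70, 74), (75, 79), (80, 84), (85, 89)]
marks, boundaries, size = class_summary(limits)
print(marks[0], boundaries[0], size)  # 57.0 (54.5, 59.5) 5.0
```

The same class size results from the class limits alone: the lower limits 55 and 60 of the first two classes also differ by 5.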
2. Determine the number of classes. There is no definite rule as to how many classes
should be used, but in general there should be no fewer than 5 and no more than 15.
Sturges’ formula can help in estimating the number of classes.
Sturges’ formula: $K = 1 + 3.322 \log_{10} n$
where: K = number of classes
n = number of observations
3. Determine the approximate class size. This can be done by doing the following steps:
4. Determine the lowest class limit, making sure that the smallest value in the data set is
included in the first class. To find the remaining class limits, add the class size to the
corresponding limit of the previous class.
5. Tally all the values in the appropriate classes. Sum the frequencies and check the total
against the number of values in the data set.
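The steps above can be sketched in Python for a data set of 60 observations whose smallest and largest values are 55 and 88, as in the example that follows. Two points are assumptions rather than part of the text: the class size is taken as the range divided by K and rounded up to a whole number, and the upper limit of each class is taken as one unit below the next lower limit (whole-number data). The function name sturges_classes is hypothetical.

```python
import math


def sturges_classes(n):
    """Estimate the number of classes with Sturges' formula (base-10 log)."""
    return 1 + 3.322 * math.log10(n)


n = 60                     # number of observations
lowest, highest = 55, 88   # smallest and largest values in the data set

# Step 2: number of classes, rounded to the nearest whole number.
k = round(sturges_classes(n))

# Step 3 (assumption): class size = range / K, rounded up.
size = math.ceil((highest - lowest) / k)

# Step 4: start at the smallest value and add the class size for each class.
# Assumption: whole-number data, so each upper limit is one unit below the
# next lower limit.
limits = [(lowest + i * size, lowest + (i + 1) * size - 1) for i in range(k)]

print(k, size, limits[0], limits[-1])  # 7 5 (55, 59) (85, 89)
```

With these assumptions the sketch yields 7 classes of size 5, which falls within the suggested range of 5 to 15 classes.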
Using the data on the weights in kilograms of 60 diabetic female patients aged 30 – 50
years old, a frequency distribution table can be constructed following the steps discussed:
55 60 65 68 71 77
57 60 65 70 72 77
57 60 65 70 72 77
58 63 65 70 72 77
59 63 65 70 74 80
59 63 66 70 74 81
59 63 66 70 74 83
59 63 66 71 74 83
60 63 67 71 75 85
60 65 68 71 75 88
The complete frequency distribution table should look similar to the following tables, which
use class limits and class boundaries, respectively:
Weights (in kg)      f      X
55 - 59              8     57
60 - 64             11     62
65 - 69             12     67
70 - 74             17     72
75 - 79              6     77
80 - 84              4     82
85 - 89              2     87
Total               60
Weights (in kg)      f      X
54.5 - 59.5          8     57
59.5 - 64.5         11     62
64.5 - 69.5         12     67
69.5 - 74.5         17     72
74.5 - 79.5          6     77
79.5 - 84.5          4     82
84.5 - 89.5          2     87
Total               60
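The tally in step 5 can be checked with a short Python sketch. It assumes the lowest class limit (55), class size (5), and number of classes (7) used in the tables above; the variable names are illustrative only.

```python
from collections import Counter

# The 60 weights (in kg), read row by row from the data listing.
weights = [
    55, 60, 65, 68, 71, 77,
    57, 60, 65, 70, 72, 77,
    57, 60, 65, 70, 72, 77,
    58, 63, 65, 70, 72, 77,
    59, 63, 65, 70, 74, 80,
    59, 63, 66, 70, 74, 81,
    59, 63, 66, 70, 74, 83,
    59, 63, 66, 71, 74, 83,
    60, 63, 67, 71, 75, 85,
    60, 65, 68, 71, 75, 88,
]

lowest_limit, size, k = 55, 5, 7

# Map each weight to the index of its class, then count per class.
tally = Counter((w - lowest_limit) // size for w in weights)
freqs = [tally[i] for i in range(k)]

for i, f in enumerate(freqs):
    ll = lowest_limit + i * size
    print(f"{ll} - {ll + size - 1}   f = {f}")

# Step 5: the frequencies must add up to the number of values.
assert sum(freqs) == len(weights)
```

The printed frequencies are 8, 11, 12, 17, 6, 4, and 2, matching the f column of both tables.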
VARIATIONS AND GRAPHICAL PRESENTATIONS OF FREQUENCY DISTRIBUTION
A frequency distribution allows other distributions to be derived from it, which serve as tools
for interpreting the given data set. These variations are the relative frequency and the
cumulative frequency distributions.
The relative frequency distribution (rf) expresses each class frequency as a percentage of the
total frequency in the given set of data. It is computed by dividing the number of observations
in each class by the total number of observations (the sample size) and multiplying the result
by 100. In formula,
$$rf = \frac{f}{n} \times 100$$
where rf = the relative frequency of each class
f = frequency of each class
n = sample size
To illustrate, consider the data in Table 3.2. The relative frequency of the first class is
computed as follows:
$$rf = \frac{8}{60} \times 100 = 0.1333 \times 100 = 13.33\%$$
For interpretation, we can say that 13.33% of the female diabetic patients aged 30 – 50 years
old weigh from 55 to 59 kilograms. The complete relative frequency distribution is as
follows:
Weights (in kg)      f     rf (%)
55 - 59              8     13.33
60 - 64             11     18.33
65 - 69             12     20.00
70 - 74             17     28.33
75 - 79              6     10.00
80 - 84              4      6.67
85 - 89              2      3.33
Total               60     99.99
In theory, the total should be exactly 100%. The difference is very small and can
be attributed to the rounding off of the individual relative frequencies.
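The effect of rounding can be seen in a small Python sketch that recomputes the rf column from the class frequencies. The variable names are illustrative, and rounding each value to two decimal places is assumed to match the table.

```python
freqs = [8, 11, 12, 17, 6, 4, 2]   # class frequencies f
n = sum(freqs)                     # sample size

# rf = (f / n) * 100, rounded to two decimal places as in the table.
rf = [round(f / n * 100, 2) for f in freqs]
print(rf)

# Total of the rounded percentages, as reported in the table.
rounded_total = round(sum(rf), 2)
print(rounded_total)

# Total of the unrounded ratios, for comparison.
exact_total = sum(f / n for f in freqs) * 100
print(round(exact_total, 6))
```

Summing the rounded percentages gives 99.99, while summing the unrounded ratios gives 100, so the shortfall comes only from rounding each class separately.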
Graphically, the above table can be shown through the use of a histogram or a frequency
polygon. A histogram is composed of a group of adjacent rectangles whose bases lie on the
horizontal axis. Each base either extends from the lower class boundary to the upper class
boundary of its class or is centered on the class mark. The width of each rectangle corresponds
to the class size, while its height corresponds to the class frequency.
In plotting the histogram, it is assumed that the frequencies are evenly distributed within
each class. In a frequency polygon, it is assumed that the frequencies of each class are
concentrated at the midpoint of the class interval, or class mark. A frequency polygon is a
graph drawn by plotting the frequencies against the class marks and connecting the plotted
points with straight lines. The resulting polygon is closed by adding one class with zero
frequency at each end and plotting it on the horizontal axis at its class mark.
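To make the two constructions concrete, the following Python sketch lists the rectangles of the histogram and the vertices of the closed frequency polygon for the weights data. It only prepares the coordinates and does not draw anything; the names bars and polygon are illustrative.

```python
marks = [57, 62, 67, 72, 77, 82, 87]   # class marks X
freqs = [8, 11, 12, 17, 6, 4, 2]       # class frequencies f
size = 5                               # class size

# Histogram: one rectangle per class, centered on the class mark,
# so that it spans the class boundaries; its height is the frequency.
bars = [(x - size / 2, x + size / 2, f) for x, f in zip(marks, freqs)]

# Frequency polygon: (class mark, frequency) points, closed by one
# zero-frequency class added before the first and after the last class.
polygon = (
    [(marks[0] - size, 0)]
    + list(zip(marks, freqs))
    + [(marks[-1] + size, 0)]
)

print(bars[0])                   # (54.5, 59.5, 8)
print(polygon[0], polygon[-1])   # (52, 0) (92, 0)
```

The added end points sit at 52 and 92, the class marks of the hypothetical classes 50 - 54 and 90 - 94.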
1. The less than cumulative frequency distribution (<cf) gives, for each class, the number of
observations whose values fall below the upper class boundary of that class. Using the
same set of data in Table 3.3, the following <cf distribution can be computed:
Weights (in kg)      f    <cf
54.5 - 59.5          8      8
59.5 - 64.5         11     19
64.5 - 69.5         12     31
69.5 - 74.5         17     48
74.5 - 79.5          6     54
79.5 - 84.5          4     58
84.5 - 89.5          2     60
Total               60
To interpret the first <cf, we can say that 8 female diabetic patients weigh less than 59.5
kg. Similarly, for the next <cf, we can say that 19 female diabetic patients weigh less than 64.5
kg, and so on.
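The <cf column is a running total of the frequencies taken from the first class onward. A minimal Python sketch, with illustrative variable names, is:

```python
from itertools import accumulate

upper_boundaries = [59.5, 64.5, 69.5, 74.5, 79.5, 84.5, 89.5]
freqs = [8, 11, 12, 17, 6, 4, 2]

# Running total of the frequencies, starting from the first class.
less_cf = list(accumulate(freqs))

for ucb, cf in zip(upper_boundaries, less_cf):
    print(f"{cf} patients weigh less than {ucb} kg")
```

The last <cf always equals the total number of observations, which is 60 here.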
2. The greater than cumulative frequency distribution (>cf) gives, for each class, the number
of observations whose values fall above the lower class boundary of that class. Using the
same set of data, we can derive the following >cf distribution:
Frequency Distribution of the Weights in Kilograms of 60 Female Patients
Weights (in kg)      f    >cf
54.5 - 59.5          8     60
59.5 - 64.5         11     52
64.5 - 69.5         12     41
69.5 - 74.5         17     29
74.5 - 79.5          6     12
79.5 - 84.5          4      6
84.5 - 89.5          2      2
Total               60
For the first derived >cf, we can take 60 to mean that 60 female diabetic patients weigh
more than 54.5 kilograms. For the second computed >cf, we can say that 52 female diabetic
patients weigh more than 59.5 kilograms, and so on.
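The >cf column is the same kind of running total, accumulated from the last class back to the first. A minimal Python sketch, with illustrative variable names, is:

```python
from itertools import accumulate

lower_boundaries = [54.5, 59.5, 64.5, 69.5, 74.5, 79.5, 84.5]
freqs = [8, 11, 12, 17, 6, 4, 2]

# Accumulate from the last class, then restore the original class order.
greater_cf = list(accumulate(reversed(freqs)))[::-1]

for lcb, cf in zip(lower_boundaries, greater_cf):
    print(f"{cf} patients weigh more than {lcb} kg")
```

Here the first >cf equals the total number of observations, and each later value drops by the frequency of the class before it (60 - 8 = 52, 52 - 11 = 41, and so on).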
The graph of the less than cumulative frequency is called the less than ogive (<ogive), while
that of the greater than cumulative frequency is called the greater than ogive (>ogive). These
two types of ogives are drawn by plotting the <cf against the upper class boundaries for the
less than ogive and the >cf against the lower class boundaries for the greater than ogive.
Using the above set of data, the following is a sample of < and > ogives:
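The points that such ogives connect can be listed with a short Python sketch. It follows the plotting rule stated above and only prepares the coordinates; the names less_ogive and greater_ogive are illustrative.

```python
from itertools import accumulate

# All class boundaries, from the first LCB to the last UCB.
boundaries = [54.5, 59.5, 64.5, 69.5, 74.5, 79.5, 84.5, 89.5]
freqs = [8, 11, 12, 17, 6, 4, 2]

less_cf = list(accumulate(freqs))
greater_cf = list(accumulate(reversed(freqs)))[::-1]

# <ogive: each <cf plotted against the upper class boundary of its class.
less_ogive = list(zip(boundaries[1:], less_cf))

# >ogive: each >cf plotted against the lower class boundary of its class.
greater_ogive = list(zip(boundaries[:-1], greater_cf))

print(less_ogive[0], less_ogive[-1])        # (59.5, 8) (89.5, 60)
print(greater_ogive[0], greater_ogive[-1])  # (54.5, 60) (84.5, 2)
```

The less than ogive rises from 8 to 60 as the boundaries increase, while the greater than ogive falls from 60 to 2.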