
INPUT ANALYZER

Let us consider a variable of the system (e.g. the inter-arrival time between entities). It must be modeled as a random variable following a certain probability distribution. Which distribution is it? Answering this question requires a data set obtained by experiment. The 40 observations used in this example are:

0.5 2.1 3.4 4.1 4.6 5.7 6.2 6.6 7.8 8.1
8.3 8.4 8.6 8.9 9.2 9.8 10.0 10.3 10.5 10.6
10.8 11.2 11.3 11.6 11.7 12.1 12.5 12.6 12.8 12.9
12.9 13.2 14.4 15.0 15.5 16.3 17.0 17.3 18.5 23.5

MANUAL APPROACH

The correct approach consists in building the frequency histogram. To build it, the following steps are needed:

1. divide the data range into intervals (bins), all of the same width;
2. divide the horizontal axis into as many intervals as there are bins;
3. count the number of points that fall within each interval;
4. label the vertical axis with these frequency values;
5. plot the frequencies in the chart.
[Figure: Histogram of the data set. Horizontal axis: Bin (upper limits 3, 6, 9, 12, 15, 18, 21, 24, More). Vertical axis: Frequency (0 to 12).]
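The binning steps can be sketched in a few lines of Python. This is only an illustration, not part of the original handout: it assumes bins of width 3 labeled by their upper limit, with each value counted in the first bin whose upper limit is greater than or equal to it (the convention that reproduces the frequencies used later in the chi-square table). The names `upper_edges` and `counts` are chosen here for the example.

```python
from bisect import bisect_left

# Data set from the text (40 observations).
data = [
    0.5, 2.1, 3.4, 4.1, 4.6, 5.7, 6.2, 6.6, 7.8, 8.1,
    8.3, 8.4, 8.6, 8.9, 9.2, 9.8, 10.0, 10.3, 10.5, 10.6,
    10.8, 11.2, 11.3, 11.6, 11.7, 12.1, 12.5, 12.6, 12.8, 12.9,
    12.9, 13.2, 14.4, 15.0, 15.5, 16.3, 17.0, 17.3, 18.5, 23.5,
]

BIN_WIDTH = 3.0  # assumption: same bin width as in the histogram
upper_edges = [BIN_WIDTH * i for i in range(1, 9)]  # 3, 6, ..., 24

# One counter per bin, plus a final "More" slot for values above 24.
counts = [0] * (len(upper_edges) + 1)
for x in data:
    # Index of the first upper limit that is >= x (right-closed bins).
    counts[bisect_left(upper_edges, x)] += 1

# Plain-text rendering of the histogram.
labels = [f"{edge:g}" for edge in upper_edges] + ["More"]
for label, count in zip(labels, counts):
    print(f"{label:>4} | {'#' * count} {count}")
```

With right-closed bins the value 15.0 is counted in the bin labeled 15, not in the bin labeled 18.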

Which probability distribution best fits the data? We must choose a distribution by inspection, looking at the plot. For this data set, the most similar distribution is the Gaussian, which is fully defined by its mean and variance. Estimating them from the data gives:

$$\bar{X}_n = 10.67, \qquad S^2 = 20.9451, \qquad S = 4.576$$

Thus, the distribution should be NORM(10.67, 4.576).
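As a quick check, these estimates can be recomputed with Python's standard `statistics` module. The sketch below assumes that S^2 is the variance obtained by dividing by n (`pvariance`), which is the choice that reproduces 20.9451; the version that divides by n - 1 (`stdev`) is printed only for comparison.

```python
import statistics

# Data set from the text (40 observations).
data = [
    0.5, 2.1, 3.4, 4.1, 4.6, 5.7, 6.2, 6.6, 7.8, 8.1,
    8.3, 8.4, 8.6, 8.9, 9.2, 9.8, 10.0, 10.3, 10.5, 10.6,
    10.8, 11.2, 11.3, 11.6, 11.7, 12.1, 12.5, 12.6, 12.8, 12.9,
    12.9, 13.2, 14.4, 15.0, 15.5, 16.3, 17.0, 17.3, 18.5, 23.5,
]

mean = statistics.mean(data)        # arithmetic mean
var_n = statistics.pvariance(data)  # variance, dividing by n
std_n = statistics.pstdev(data)     # standard deviation, dividing by n
std_n1 = statistics.stdev(data)     # standard deviation, dividing by n - 1

print(f"mean          = {mean:.2f}")
print(f"variance (n)  = {var_n:.4f}")
print(f"std dev (n)   = {std_n:.4f}")
print(f"std dev (n-1) = {std_n1:.4f}")
```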

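According to Arena's notation, the second parameter of NORM is the standard deviation. Here we use the standard deviation computed by dividing by n (the value of S given above) and not the sample standard deviation, which divides by n - 1. The following chart visually shows how good the choice has been.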
[Figure: Histogram of the data compared with the fitted normal distribution. Horizontal axis: Bin (3 to 24). Vertical axis: Frequency (0 to 12).]

If we want to accept the hypothesis we made about the type of the distribution, we have to use the chi-square test. We must compute the following statistic:
$$\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

where k is the number of classes into which the data are divided (typically the number of intervals of the histogram), Oi is the observed frequency for class i, and Ei is the expected frequency for class i.

Bin   Frequency   Cumulative     Interval       Expected       Chi-square
                  probability    probability    value          term
  3        2      0.046876582    0.046876582    1.875063275    0.008324618
  6        4      0.153766508    0.106889926    4.275597051    0.017764474
  9        8      0.357592649    0.203826141    8.15304564     0.00287291
 12       11      0.614325097    0.256732448    10.26929791    0.05199241
 15        9      0.827956576    0.213631479    8.545259163    0.024199293
 18        4      0.945381505    0.117424929    4.696997165    0.103428857
 21        1      0.988000472    0.042618967    1.704758677    0.291351966
 24        1      0.998208077    0.010207605    0.408304189    0.857458588

$$\chi_0^2 = 1.357393117$$
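The table and the resulting statistic can be reproduced with a short Python sketch. It assumes the fitted normal uses the unrounded mean and the standard deviation computed by dividing by n, and it follows the table's convention: the first class collects all the probability up to 3, and the small tail above 24 is ignored. Variable names are illustrative.

```python
from statistics import NormalDist, mean, pstdev

# Data set from the text (40 observations).
data = [
    0.5, 2.1, 3.4, 4.1, 4.6, 5.7, 6.2, 6.6, 7.8, 8.1,
    8.3, 8.4, 8.6, 8.9, 9.2, 9.8, 10.0, 10.3, 10.5, 10.6,
    10.8, 11.2, 11.3, 11.6, 11.7, 12.1, 12.5, 12.6, 12.8, 12.9,
    12.9, 13.2, 14.4, 15.0, 15.5, 16.3, 17.0, 17.3, 18.5, 23.5,
]

upper_edges = [3, 6, 9, 12, 15, 18, 21, 24]  # upper limit of each class
observed = [2, 4, 8, 11, 9, 4, 1, 1]         # O_i, from the histogram

# Fitted normal distribution (assumption: unrounded parameters).
fitted = NormalDist(mean(data), pstdev(data))

# Cumulative probability at each upper limit.
cumulative = [fitted.cdf(edge) for edge in upper_edges]

# Interval probability: the first class takes everything up to 3,
# the others are differences of consecutive cumulative values.
interval = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]

# Expected frequency E_i = n * interval probability.
expected = [len(data) * p for p in interval]

# Chi-square terms (O_i - E_i)^2 / E_i and their sum.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2_0 = sum(terms)

for edge, o, e, t in zip(upper_edges, observed, expected, terms):
    print(f"{edge:>3} {o:>3} {e:10.6f} {t:10.6f}")
print(f"chi-square statistic = {chi2_0:.6f}")
```

Note that the last two classes have expected frequencies below 5, and most of the statistic comes from them; a stricter analysis would usually merge such classes with their neighbours.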

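The corresponding p-value, computed with k - 1 - 2 = 5 degrees of freedom (8 classes, 2 parameters estimated from the data), is 0.9289167. Since it is greater than 0.9, the hypothesis cannot be rejected at any customary significance level.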
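The p-value can be checked without any statistical library. The sketch below assumes 5 degrees of freedom (8 classes minus 1, minus the 2 estimated parameters), which is the choice that reproduces the value quoted above, and uses the closed form of the chi-square upper tail that holds for exactly 5 degrees of freedom. The function name `chi2_sf_df5` is made up for this example.

```python
import math

def chi2_sf_df5(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 5
    degrees of freedom (closed form valid only for this number of d.o.f.)."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * (1 + x / 3))

chi2_0 = 1.357393117           # statistic computed in the text
p_value = chi2_sf_df5(chi2_0)  # probability of a value at least this large

print(f"p-value = {p_value:.7f}")
```

A large p-value means that a discrepancy at least as large as the observed one is very likely under the hypothesized distribution, so the data give no reason to reject it.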

ARENA APPROACH

We have to pass our data set to the Input Analyzer. To do so, we write a new file (typically a *.txt file) containing all the data. After opening the Input Analyzer, we start a new analysis and tell it to read the data from that file. The histogram is generated automatically, and the options let us specify the number of intervals. We can fit the histogram with many different distributions, and we can also let the program choose the best-fitting one. In our example, the best fit is the Normal distribution.
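The data file can be written by hand or generated. The following Python sketch writes one value per line, on the assumption that the Input Analyzer accepts a plain text file of whitespace-separated numbers; the helper names and the file name used in the note below are hypothetical.

```python
from pathlib import Path

# Data set from the text (40 observations).
data = [
    0.5, 2.1, 3.4, 4.1, 4.6, 5.7, 6.2, 6.6, 7.8, 8.1,
    8.3, 8.4, 8.6, 8.9, 9.2, 9.8, 10.0, 10.3, 10.5, 10.6,
    10.8, 11.2, 11.3, 11.6, 11.7, 12.1, 12.5, 12.6, 12.8, 12.9,
    12.9, 13.2, 14.4, 15.0, 15.5, 16.3, 17.0, 17.3, 18.5, 23.5,
]

def format_values(values) -> str:
    """Return the values as plain text, one number per line."""
    return "\n".join(str(v) for v in values) + "\n"

def write_data_file(path, values) -> None:
    """Write the values to a text file (assumed layout: one value per line)."""
    Path(path).write_text(format_values(values), encoding="ascii")
```

Calling `write_data_file("interarrival.txt", data)` would create the file to select in the Input Analyzer.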

The summary shows that the expression of the Normal distribution fitting the data is NORM(10.7, 4.58), in agreement with the manual result. The chi-square test performed by the Input Analyzer differs from the manual approach: special prize for those who understand how it works!
