
INDUSTRIAL ATTACHMENT REPORT

Submitted by KAPPO OBAFEMI SETONJI, MATRIC NO: 070541027

AT IBA LOCAL COUNCIL LCDA, IBA


Period of Attachment: 22.11.2010 to 13.5.2011

Submitted to the Department of Computer Science, Lagos State University, Ojo
2009/2010

TABLE OF CONTENTS

SUMMARY
ACKNOWLEDGEMENTS
I. INTRODUCTION
II. THE ATTACHMENT PROGRAMME
III. TRAINING ASSIGNMENTS
    1. Introduction to the hardware unit
    2. Introduction to Printers
    3. Introduction to CPUs
    4. Troubleshooting of printers
    5. Troubleshooting CPUs
    6. Understanding operating systems
    7. How to load operating systems (Windows & Linux)
    8. How to configure local area networks
IV. CONCLUSION
V. REFERENCES

SUMMARY

This Industrial Attachment Report describes the experience of an attachment with IBA LOCAL COUNCIL DEVELOPMENT AREA between 22 November 2010 and 13 May 2011. Iba LCDA is among the 57 new local councils created in 2006 during the tenure of Asiwaju Bola Ahmed Tinubu as Governor of Lagos State. It has an IT department that manages the council's network and hardware resources, as well as the software used in developing network facilities within the office premises and in connecting with other LCDAs and the main secretariat at Alausa.

During the attachment, we were required to familiarize ourselves with network simulations, data structures and hardware resources. The software development process trained us in analytical thinking, problem solving and efficient programming, while the hardware department trained us to troubleshoot hardware resources and configure small networks. Our team managed to make headway by developing more efficient queuing models that rely on better statistics and superior data structures.

ACKNOWLEDGEMENTS

I would like to acknowledge MR. SAMUEL ODU for his supervision of the Industrial Attachment, for his support of the project, and for his confidence in allowing us freedom in our research and development endeavors. I would also like to especially thank MR. BADRU for his expert knowledge and great willingness to help and to work hand-in-hand with us, without which we would not have made such headway in our research.

Lastly, I would like to acknowledge my fellow workmates Tobiloba Badru, Omotayo Erinle and Segun Joshua, who have been on the same team with me and whom I had the pleasure of working with.


I. INTRODUCTION
SIWES is an acronym for the Student Industrial Work Experience Scheme. The scheme was introduced by the ITF in the early 1970s to overcome the problem of inadequate practical skills, preparatory for employment in industry, among Nigerian graduates of higher institutions. The Scheme exposes students to the industry-based skills necessary for a smooth transition from the theoretical aspect to the practical world. It gives students of higher institutions the opportunity to be familiarized and exposed to the needed experience in handling machinery and equipment which are usually not available in their institutions.

Participation in SIWES is a necessary pre-condition for the award of diploma, NCE and degree certificates in specific disciplines in the universities, colleges and polytechnics across the country, in compliance with the educational policy of the country. The program is part of the curriculum of institutions of higher learning and part of the requirements for successful completion of degree, diploma and NCE programs. As part of this program, students undergo training for a minimum duration of four months for polytechnics and colleges of education, and six months for universities, in various industries and organizations depending on their fields of study, to give them a practical feel of the knowledge they have acquired in school.

One of the major problems of this program is the inability of students to secure placement in a suitable company where they can obtain good machinery and working experience related to their discipline. Even where a good company with the right work environment is obtained, students often find themselves being used for menial jobs.
This has led the school to require every student to submit to the department a written SIWES report on completion of the program, in addition to the log book in which the student records their day-to-day activities from the beginning to the end of the SIWES program.

AIMS AND OBJECTIVES OF SIWES

According to the Federal Government in its Gazette of April 1978, the aims and objectives of SIWES are as follows:

1. To prepare students for the work situations they are likely to meet after graduation, and to provide students an opportunity to apply their theoretical knowledge in real work situations.
2. To expose students to the working environment so as to learn the methods and techniques of handling equipment and machinery that may not be available in their school.
3. To provide an opportunity for students to put their knowledge into practice, thereby bridging the gap between class work and real-life application.
4. To provide an avenue for students in higher institutions to acquire industrial skills and experience in their respective courses of study.

Industrial Attachment Programme

The Industrial Attachment Programme is part of the course of study for a Bachelor's Degree in Computer Science. It involves a 24-week attachment to a company that provides experience relevant to the field of study. It aims to give students hands-on experience of work in the industry and allows them to apply what they have been studying. Students will be involved in training, organization, research and product development. This attachment is scheduled from 22 November 2010 to 13 May 2011.

COMPANY

Iba LCDA is among the 57 new local councils created in 2006 during the tenure of Asiwaju Bola Ahmed Tinubu as Governor of Lagos State. The council was headed by Hon. Toyin Isiaka Suarau as the Executive Chairman.

II. ATTACHMENT PROGRAMME

Scope

As the company is relatively new, research is in its infancy. The task was to pioneer the research and come up with findings that would set the direction for development. The aim of the research was to find novel ways of improving the speed and efficiency of network simulation software. More specifically, research was to be carried out on improving the management of the Pending Event Set (PES) within the network simulation engine.

The scope of the research covers the analysis of current-day network simulator engines and an understanding of how they function. During the course of the analysis, data structures have to be examined, modified and tested to find out where efficiency can be improved.

After identifying the areas to be improved, hypotheses have to be made and tested. Once the correct conclusions are established, a prototype will be developed and extensively benchmarked. Benchmarking is essential to optimizing the management of the PES so that it can handle a large variety of simulation models.

After the best solution is established, the final step is to incorporate the new Future Event List structure into the simulation engine. Development of a graphical user interface would improve the usability of the simulator and round out the software package.

Schedule

Week        Work Performed

1 - 2       Understanding of Data Structures
            Understanding of Discrete Event Simulation
            Understanding of the Pending Event Set (PES)
            Understanding of Various Methods of PES Management

3 - 6       Development of Benchmarking Tools to Aid in Problem Identification
            Analysis of Performance of Data Structures by Benchmarking
            Hypotheses of Various Possible Problems
            Testing of Hypotheses by Modifying Parameters and Methods

7 - 10      Evaluation of Results from Testing
            Implementation of a Solution to Increase Efficiency
            Extensive Benchmarking of Improved Data Structure
            Tweaking and Debugging to Obtain Optimal Solution

11 - 12     Research into Possible Improvement by Modifying Data Structure
            Hypotheses of Multi-tier Data Structures
            Development of Prototypes of Hypotheses

13 - 14     Testing of Prototypes
            Benchmarking of Prototypes
            Evaluation of Results from Testing
            Implementation of Best Data Structure

15 - 16     Formalizing New Data Structure with Variable Parameters and Thresholds

17 - 18     Analysis of Performance with Different Parameters and Thresholds
            Tweaking and Debugging to Obtain Optimal Solution

19 - 20     Investigation into Efficiency of New Mechanism on Other Data Structures

21 - 22     Understanding of Results Obtained from Testing
            Research into Theoretical Basis for New Data Structure

23 - 24     Consolidation of Research to Date
            Marking out the Direction for Continuation of the Project

III. TRAINING ASSIGNMENTS

1. Discrete Event Simulation

1.1 Introduction

The purpose of a simulator is to simulate the operations of various real-world facilities or processes. The facility or process of interest is usually called a system. In order to study a real world system, we have to make a set of assumptions on how the system works. These assumptions are usually in the form of mathematical or logical relationships, constituting a model that is used to try to gain some understanding of how the corresponding system behaves.

If the relationships that compose the models are simple enough, it may be possible to use mathematical methods to obtain exact information on questions of interest. This is known as an analytic solution. However, most real-world systems are too complex to be evaluated analytically, and these models must be studied by means of simulation. In a simulation, we use a computer to evaluate a model numerically, and data is gathered in order to estimate the desired true characteristics of the model.

Before proceeding further, there are some basic terms essential to understanding the working of simulations that have to be defined.

A system is defined to be a collection of entities, e.g., people or machines, that act and interact together toward the accomplishment of some logical end. At any given point in time a system will be in a certain state. The state of a system is defined to be the collection of variables necessary to describe the system at a particular time, relative to the objectives of a study. We categorize systems into two types, discrete and continuous.

A discrete system is one in which the state variables change instantaneously at separated or countable points in time, whereas a continuous system is one for which the state variables change continuously with respect to time. A system can also be classified as deterministic or stochastic.

A deterministic system is one in which the output of the system is completely determined by the input and the initial values of the system parameters or state. A stochastic system, on the other hand, contains a certain degree of randomness in its transitions from one state to another.

In simulations, a model has to be developed. A model is an abstraction of a system intended to replicate some of the properties of that system. The performance of a model is determined by the following criteria:

Accuracy:      the degree to which the model behaves like the real-world system
Precision:     the level of detail and non-ambiguity in the model
Completeness:  the extent to which the model contains all the elements necessary for satisfying the study purpose.

Simulation models can be classified along the following dimensions. A static simulation model is one which is independent of time, while a dynamic simulation model represents a system that evolves over time. A deterministic simulation model does not contain any probabilistic or random components; its output is determined once the set of inputs and relationships has been specified. A stochastic simulation model contains random input components, hence produces outputs that are also random and must be treated as only an estimate of the true characteristics of the model. A continuous simulation model is one that operates in continuous time, while a discrete simulation model, which is the model we are concerned with, is defined below.

1.2

Discrete Event Simulation

Discrete-event simulation concerns the modeling of a system as it evolves over time by a representation in which the state variables change instantaneously at separate points in time. These points in time are the ones at which an event occurs.

The discrete event simulation model is made up of the following building blocks:

Activity:  an action that changes the state of a system
Event:     an instantaneous occurrence that may change the state of the system
Process:   a time-ordered sequence of all events and activities of an entity.


Figure 1. Building Blocks of the Discrete Event Simulation Model

1.3 Modelling Approaches

In creating a discrete event simulation model, there are several available methods of approach using the basic building blocks mentioned above.

1.3.1 Activity Scanning Approach

This approach is based on the scanning of all activities in a simulation system to determine which activity can be started. All possible activities that may occur, and the conditions which may trigger them, must be identified beforehand. For this to be possible, an activity list keeps track of all activities in the simulation system.

The advantage of using this approach is that simulation models are easy to build. However, it requires repeated scans to ensure that any possible state changes are captured, which makes it run-time inefficient due to the excessive scanning. Moreover, this approach provides limited timing information due to its emphasis on actions rather than occurrence times.

Future Event List (FEL)

Before proceeding to the other two approaches, there is a need to introduce the concept of an FEL. An FEL is a list that stores all future events that are to be executed at a later time. In both the Event Scheduling Approach and the Process Interaction Approach, events are stored in the FEL sorted by their timestamps, or occurrence times.


During a simulation run, the following sequence of actions occurs:

1. The event with the highest priority (smallest timestamp) is taken from the future event list.
2. The system clock is updated to its occurrence time.
3. The activity associated with the event is then executed.


Figure 2. Future Event List
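The three-step cycle above can be sketched with a binary heap standing in for the FEL. This is a minimal illustrative engine, not SWAN's actual API; the names `Engine`, `schedule` and `run` are hypothetical.

```python
import heapq

class Engine:
    """Minimal discrete-event engine using a binary heap as the FEL."""

    def __init__(self):
        self.fel = []      # future event list: (timestamp, seq, action)
        self.clock = 0.0   # system clock
        self.seq = 0       # tie-breaker so equal timestamps stay FIFO

    def schedule(self, timestamp, action):
        # Enqueue an event, kept sorted by its occurrence time.
        heapq.heappush(self.fel, (timestamp, self.seq, action))
        self.seq += 1

    def run(self):
        while self.fel:
            # 1. Take the event with the smallest timestamp.
            timestamp, _, action = heapq.heappop(self.fel)
            # 2. Advance the system clock to its occurrence time.
            self.clock = timestamp
            # 3. Execute the associated activity.
            action(self)
```

Note that events are always executed in timestamp order regardless of the order in which they were scheduled, which is exactly the property the FEL must guarantee.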

1.3.2 Event Scheduling Approach

This approach is based on the detailed description of the actions that follow when an individual event occurs in a system. For this approach to be employed, several conditions must be satisfied:

1. The system must have finite states.
2. All possible states of the system must be identified.
3. All possible state transitions must be included as part of all possible events.
4. Conditions that would trigger state transitions must be identified.



Figure 3. Event Scheduling Approach

The advantage of using this approach is that it is run time efficient as events are only executed when changes occur. The disadvantage is that it is difficult to identify all possible states of a system.

1.3.3 Process Interaction Approach

This approach provides a process for each entity in the real world system and emphasizes the interaction between these processes. Thus the system behavior is described by a set of processes.

For this approach to be applied all entities in the system must be identified and their dynamics must be described with a sequence of events and actions to form a process. The process describes the entire experience of an entity as it flows through the system or provides its service.


The process interaction approach enables multiple processes to run concurrently. The processes are scheduled for execution by scheduling some of their events, which contain a reference to the action to be performed. These events are enqueued into the FEL.


Figure 4. Process Interaction Approach

Throughout a process interaction simulation (PIS), events belonging to different processes are interrelated by special scheduling statements, which may be conditional or unconditional wait statements. These statements enable interaction between the processes, as well as allow the measure of time to be included in the simulation.


The conditional wait is one where a process cannot continue until certain specified conditions are met. This delays the execution of a process for an unspecified time period. It is a representation of interactions between processes.


Figure 5. Conditional Wait


An unconditional wait is where a process is delayed from execution for a specified period of time. Thus it can represent the passage of time in a simulation.


Figure 6. Unconditional Wait

The process interaction approach combines the modeling power of the activity scanning approach with the runtime efficiency of the event scheduling approach. This is achieved by grouping the events and actions belonging to a particular entity into a process. It also captures the interactions between processes through the conditional and unconditional wait statements.
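The combination of conditional and unconditional waits can be sketched with Python generators standing in for processes. This is an illustrative toy kernel, not SWAN's implementation; the yield protocol, the `Simulator` class and the producer/consumer processes are all hypothetical.

```python
import heapq

class Simulator:
    """Toy process-interaction kernel. Processes are generators that yield
    ('sleep', dt) for an unconditional wait or ('await', name) for a
    conditional wait on a named condition."""

    def __init__(self):
        self.fel = []        # future event list: (time, seq, process)
        self.clock = 0.0
        self.seq = 0
        self.waiting = {}    # condition name -> processes blocked on it

    def _schedule(self, t, proc):
        heapq.heappush(self.fel, (t, self.seq, proc))
        self.seq += 1

    def start(self, proc):
        self._schedule(self.clock, proc)

    def signal(self, name):
        # Wake every process blocked on this condition at the current time.
        for proc in self.waiting.pop(name, []):
            self._schedule(self.clock, proc)

    def run(self):
        while self.fel:
            t, _, proc = heapq.heappop(self.fel)
            self.clock = t
            try:
                kind, arg = next(proc)       # run process to its next wait
            except StopIteration:
                continue                     # process finished
            if kind == 'sleep':              # unconditional wait: fixed delay
                self._schedule(self.clock + arg, proc)
            else:                            # 'await': conditional wait
                self.waiting.setdefault(arg, []).append(proc)

# Example mirroring Figures 5 and 6: the consumer waits conditionally until
# the producer, after an unconditional 2-unit wait, signals it.
log = []
sim = Simulator()

def consumer(sim):
    yield ('await', 'item_ready')            # conditional wait
    log.append(('consumed', sim.clock))

def producer(sim):
    yield ('sleep', 2.0)                     # unconditional wait
    log.append(('produced', sim.clock))
    sim.signal('item_ready')                 # wakes the consumer

sim.start(consumer(sim))
sim.start(producer(sim))
sim.run()
```

After `run()`, both entries in `log` carry timestamp 2.0: the consumer was delayed for an unspecified period (until the signal), while the producer's delay was the specified two units.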


1.4 SWAN

In our research, the simulator under development is known as SWAN, or the Simulator Without A Name. It is a discrete event simulator that utilizes the process interaction approach. SWAN was created by Boris Krunic of the Curtin University of Technology.

In SWAN the simulation system is designed such that it may be realized as a library, which can be used to derive the necessary data structures and functions when building a particular simulation application.

The focus of our research is to look into ways to improve the performance of the discrete event simulator engine. One of the main factors influencing the speed of the simulator engine is the management of the future event list (FEL), better known as the Pending Event Set (PES). The PES is in effect a priority queue, where events are stored in a data structure to be executed at a later specified time. The next section discusses the development of data structures and how they can be managed efficiently to reduce simulation run time.


DATA STRUCTURES

2. Introduction to Priority Queues

A priority queue is a queue for which each element has an associated priority, and for which the dequeue operation always removes the lowest (or highest) priority item remaining in the queue. Priority queues are used in discrete event simulation as a representation of the pending event set. The more efficient the priority queue is in managing the events in the list, the less time is wasted in the simulation. It has been shown that up to 40% of the simulation time may be spent on the management of the list alone if the number of events is large, as in the case of fine-grain simulation.

There are various ways of implementing a priority queue. They can be classified according to how their execution time varies with queue size. The most common measurement of execution time is the hold operation, in which a dequeue is made followed by an enqueue. The simplest priority queue is the linear list implementation, for which the average time taken for a hold operation is proportional to the queue size, known as O(n), where n is the queue size and O stands for the order of the queue implementation.
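The linear list implementation and the hold operation can be sketched as follows. The class and function names are illustrative; the O(n) cost comes from the shifting insert into the sorted list.

```python
import bisect
import random

class LinearListPQ:
    """Sorted linear list: the simplest priority queue. Enqueue is O(n)
    because of the shifting insert; dequeue of the minimum is O(1)."""

    def __init__(self):
        self.items = []                 # kept sorted, smallest first

    def enqueue(self, priority):
        bisect.insort(self.items, priority)

    def dequeue(self):
        return self.items.pop(0)

def hold(pq, rng):
    """One hold operation: a dequeue followed by an enqueue. The queue
    size is left unchanged, so repeated holds measure steady state."""
    t = pq.dequeue()
    pq.enqueue(t + rng.random())        # reschedule into the future
    return t
```

Because each hold re-enqueues an event later than the one removed, the sequence of dequeued priorities is nondecreasing, just as simulation time never runs backwards.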


Over the years, many researchers have tried to implement priority queue algorithms that perform close to the O(1) limit, and some have claimed O(1) performance under certain conditions. The performance of priority queues is very much dependent on the distribution of events in the queue, whether orderly or disorderly, even or skewed. It is also dependent on the distribution of enqueue and dequeue operations, as some implementations may do better given a steady-state queue size but much worse in a transitional state. Therefore, an ideal queue implementation is one that performs well on average over all possible simulations.

Through benchmarking by some researchers, it has been found that the Calendar Queue (CQ) implementations have some of the best performances. To date, before our research started, there were four variations of the CQ known to us. The original CQ was developed by Randy Brown and published in 1988. It was challenged by Robert Ronngren in 1991, when he published his Lazy Queue implementation. The latest implementations are from SeungHyun Oh, who published two similar implementations, the Dynamic CQ and the Dynamic Lazy CQ, in 1997. All of them claim O(1) performance over many simulation conditions.

It is from this point that our research starts. The challenge is to further improve on these CQ implementations in terms of efficiency and variance in performance over all possible simulation conditions.


3. Calendar Queue

3.1 Calendar Queue Structure

The Calendar Queue (CQ) is a data structure modeled after a real-world analogue: the management of events is likened to how a person manages a schedule using a calendar. An event is scheduled by filling in the appropriate page for its day. There may be many events in a day or none at all. The time at which an event is scheduled is its priority. The enqueue operation corresponds to scheduling an event. The earliest event on the calendar is dequeued by scanning the page for today's date and removing the earliest event written on that page.

In the computer implementation, each page of the calendar is represented by a sorted linked list of the events scheduled for that day. An array containing one pointer for each day of the year is used to find the linked list for a particular day. Each linked list is called a bucket. The current year is defined from the most current event, which may be in any bucket. The calendar is made to be circular so that buckets before the current one can be used to hold events after the tail of the array. This is analogous to a person using a previous page for the subsequent year, as previous events would have been removed. If an event falls outside of the defined year, then it is queued as if in the current year, but at the end of the bucket, according to its priority. This solves the problem of overflow.


The length of the year is chosen to be long enough that most events will be scheduled within the current year. The number of days and the length of each day determine the length of the year. They are chosen such that no one day contains too many events and no two consecutive non-empty days are too far apart. The length of the year can be adjusted periodically as the queue size grows or shrinks.

Bucket 0:  31.4
Bucket 1:  36.1 -> 36.1 -> 72.2
Bucket 2:  13.2 -> 42.1
Bucket 3:  (empty)
Bucket 4:  22.2
Bucket 5:  25.6 -> 27.3

Figure 7. Six-Day Calendar Queue

The diagram above is an example of a calendar queue with six days in the year. Each day has a length of 5 units; therefore, the length of the year is 30 units.


Supposing the simulation starts at time 0.0, the next event to be dequeued would be 13.2, which is in bucket 2, where the current pointer is. Note that the events in buckets 0 and 1 will only be dequeued when the pointer comes round circularly back to them. After the current event has been dequeued, the next event in the bucket is skipped, as it is not in the current year and will be dequeued in the appropriate year. When the pointer reaches an empty bucket, the pointer automatically moves to the next one. If there are no events in the current year, a direct search is initiated, where the next highest priority is searched for in all the buckets. This is inefficient and is to be avoided when determining the size of the calendar.
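The bucket mapping and circular dequeue scan can be sketched as follows. This is a simplified illustration with a fixed number of days and no resizing, and the direct-search fallback is replaced by simply continuing the circular scan into later years; the class and attribute names are assumed, not Brown's original code.

```python
class CalendarQueue:
    """Simplified calendar queue: fixed number of days, no resizing."""

    def __init__(self, num_days=6, day_length=5.0):
        self.num_days = num_days
        self.day_length = day_length
        self.buckets = [[] for _ in range(num_days)]
        self.current = 0                  # index of today's bucket
        self.bucket_top = day_length      # end time of the current day
        self.size = 0

    def enqueue(self, t):
        # Map the timestamp to its day of the year, circularly.
        bucket = self.buckets[int(t / self.day_length) % self.num_days]
        j = 0
        while j < len(bucket) and bucket[j] < t:
            j += 1                        # keep the day's page sorted
        bucket.insert(j, t)
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("dequeue from empty calendar queue")
        while True:
            bucket = self.buckets[self.current]
            # Take the head event only if it falls within the current year.
            if bucket and bucket[0] < self.bucket_top:
                self.size -= 1
                return bucket.pop(0)
            # Empty day, or head event scheduled for a later year:
            # advance the pointer circularly to the next day.
            self.current = (self.current + 1) % self.num_days
            self.bucket_top += self.day_length
```

Enqueueing the events from Figure 7 and dequeueing them all returns the events in timestamp order, with far-future events such as 72.2 skipped until the scan comes round to their year.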

3.2 Calendar Queue Resizing

As discussed above, the determination of the year size, day size and number of days is crucial to maintaining efficiency in the calendar queue. Since the year size is determined by the day size and the number of days, controlling these two parameters is sufficient, bearing in mind that the year size must be large enough.

The determination of the number of days depends on the number of events relative to the number of days, according to Brown. The greater the number of events compared to days, the worse the enqueue time; conversely, the smaller the number of events compared to days, the worse the dequeue time. This is shown in the following diagrams.


Figure 8. CQ with greater number of events compared to days (few buckets, each holding a long chain of events)

Figure 9. CQ with greater number of days compared to events (eight buckets, mostly empty)

The solution given is to allow the number of days to grow and shrink correspondingly as the queue size grows and shrinks. This is done by doubling the number of days each time the queue size doubles, and halving the number of days each time the queue size halves.
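The grow-and-shrink rule can be sketched as a small helper that checks, after each operation, whether the number of days should double or halve. The function name and the exact thresholds are assumptions for illustration.

```python
def resized_num_days(queue_size, num_days):
    """Sketch of the doubling/halving rule: keep the number of events
    within roughly a factor of two of the number of days."""
    if queue_size > 2 * num_days:
        return num_days * 2          # queue doubled: use twice as many days
    if num_days > 1 and queue_size < num_days // 2:
        return num_days // 2         # queue halved: use half as many days
    return num_days                  # no resize needed
```

A resize triggered by this rule would copy every event into the newly sized calendar, which is where the bounded recopying cost discussed below comes from.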


Each time a resize is done, the events are copied from the current calendar to the new calendar. It has been shown that for a steadily growing or shrinking queue size, the average number of times an event is copied is bounded between 1 and 2, and that the recopying time per event is O(1) in the queue size. As for fluctuating queue sizes, the worst case only comes about when the queue size fluctuates about a power of 2.

The length of a day was adjusted each time a resize occurred. It was calculated by sampling events from the queue and estimating the average separation between events. If the sampled events were far apart, the day length would be increased; conversely, if the events were close together, the day length would be decreased. The aim, as given by Brown, was to spread the events out regularly so that they are neither too clustered nor too far apart. The worst case of estimation occurs when the sampled events are not a reflection of the entire distribution.

On the whole, the calendar queue algorithm brought about some good solutions to improve queuing efficiency. Some benchmarking was done to test its performance; the results are shown below. From the charts, it can be seen that for most of the distributions the calendar queue performed close to the promised O(1). However, it performed badly for very skewed distributions like the Triangular and Camel distributions.


Figure 10. CQ Hold Performance (time per operation in milliseconds vs. queue size, for the Rect, NegTriag, Triag, Camel(70,20) and Camel(98,01) distributions)

Figure 11. CQ Up/Down Performance (time per operation in milliseconds vs. queue size, for the same distributions)


4. Dynamic Calendar Queue

4.1 Dynamic Calendar Queue Algorithm

The Dynamic Calendar Queue (DCQ) was developed to counter the problems of the original CQ. This was necessary as the CQ performed poorly over skewed distributions. Such distributions are common in network simulations, where simulations have to be done on bursty traffic, analogous to peak-period usage of the network.

The DCQ algorithm combats the problems of skewed distributions by allowing dynamic resizing to occur as and when events are too clustered or too far apart. Since the CQ only allows resizing at queue sizes that are powers of 2, dynamic resizing is far superior in terms of pre-empting bad distributions at any time. This obviously requires greater overhead to monitor the distribution of events; however, over the long run of non-uniform conditions, this proves to be worth the cost.

The DCQ adds two mechanisms to the CQ. The first decides the right time to resize dynamically. This is done by measuring the average queue operation cost, namely the average time taken to either enqueue or dequeue an event. When either of these two costs exceeds a predetermined threshold, indicating that events are not uniformly distributed over the multi-list, the DCQ re-computes the bucket width and redistributes the events over the newly configured buckets.


The second mechanism approximates the average inter-event time gap accurately by sampling appropriate events. It keeps the same sampling algorithm as the CQ when the distribution is even, and samples around the largest bucket when the distribution is skewed. To decide whether the distribution is skewed, the DCQ computes the ratio between the number of events in a few buckets around the largest bucket and the total number of stored events. When this ratio is above a predetermined threshold, meaning that the events are not evenly distributed, the DCQ chooses to sample around the largest bucket.
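The skew test can be sketched as follows. The window width and threshold here are assumed illustrative values, not the ones from Oh's paper.

```python
def is_skewed(bucket_sizes, window=2, threshold=0.6):
    """DCQ-style skew test (sketch): compare the number of events in a few
    buckets around the largest bucket with the total number of events."""
    total = sum(bucket_sizes)
    if total == 0:
        return False
    n = len(bucket_sizes)
    largest = max(range(n), key=lambda i: bucket_sizes[i])
    # Sum the buckets within `window` positions of the largest, circularly.
    around = sum(bucket_sizes[(largest + d) % n]
                 for d in range(-window, window + 1))
    return around / total > threshold
```

When the test returns True, the DCQ would sample events around the largest bucket instead of sampling uniformly, so that the estimated inter-event gap reflects the cluster that dominates the queue.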

The improvement made to queuing efficiency is quite significant. Again, benchmarking was done to test the performance improvement; the results are shown below. From the charts, it can be seen that there is much improvement for the skewed distributions; however, performance remains rather erratic at times. The erratic behavior could be due to occasional incorrect estimation, although there has been improvement most of the time. This is still unacceptable, and a better cost metric needs to be found to provide more accurate estimation.


Figure 12. DCQ Hold Performance (time per operation in milliseconds vs. queue size, for the Rect, NegTriag, Triag, Camel(70,20) and Camel(98,01) distributions)

Figure 13. DCQ Up/Down Performance (time per operation in milliseconds vs. queue size, for the same distributions)


5. Benchmarking Tools

5.1 Distribution Models

The distribution models used in our benchmarking were based on models recommended by other researchers. We realized that it was important to choose the right parameters to enable a much wider diversity of possible simulation conditions. It is also necessary to impose much stricter conditions to test the limits of our queuing model. By putting prototypes through all kinds of conditions, we can identify problem areas more clearly and better observe how different variables affect the performance of the queuing model.

5.2 Recommended Models

Five recommended distributions provide a fairly good cross-section of real-world simulation conditions: the rectangular distribution, the triangular distribution, the negative triangular distribution, the camel distribution with wider humps, and the camel distribution with narrower humps skewed to either extreme. These distributions are given in the diagram below.


Rectangular

Triangular

Negative Triangular

Camel

Figure 14. Distributions used in Benchmarking
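As an illustration, the five distributions can be sketched as inter-event increment generators. The following Python sketch assumes the camel parameters (h, w) mean h% of the probability mass in two end humps that together cover w% of the range; the exact parameterization used in the original benchmarks may differ.

```python
import random

def rect(lo=0.0, hi=2.0):
    # Rectangular: increments uniform over [lo, hi)
    return random.uniform(lo, hi)

def triag(lo=0.0, hi=2.0):
    # Triangular: density rises linearly towards hi
    # (the max of two uniforms has density 2x on [0, 1])
    return lo + (hi - lo) * max(random.random(), random.random())

def neg_triag(lo=0.0, hi=2.0):
    # Negative triangular: density falls linearly from lo
    return lo + (hi - lo) * min(random.random(), random.random())

def camel(hump_mass, hump_width, lo=0.0, hi=2.0):
    # Camel(h, w): h% of the mass in two end humps that together cover
    # w% of the range; the remaining mass is spread over the middle.
    span = hi - lo
    w = span * hump_width / 100.0
    if random.random() < hump_mass / 100.0:
        half = w / 2.0  # each hump is half the total hump width
        if random.random() < 0.5:
            return lo + random.uniform(0.0, half)
        return hi - random.uniform(0.0, half)
    return lo + w / 2.0 + random.uniform(0.0, span - w)
```

Camel(98,01) then concentrates almost all increments at the two extremes of the range, which is what makes it the harshest of the five on width estimation.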

Benchmarking is usually done using two types of performance measurements. The first is the hold performance, which tests the steady-state performance of the queuing model by keeping the queue size constant: a dequeue followed by an enqueue operation is repeated many times. The second measurement is the up/down performance, which tests the transient-state performance of the queuing model by filling up the queue and subsequently emptying it; this cycle is repeated many times consecutively.
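The two measurements can be expressed as short driver loops. This sketch times Python's heapq as a stand-in priority queue; in the actual benchmarks the queue under test (CQ, DCQ, and so on) would be substituted.

```python
import heapq
import random
import time

def hold(queue_size, n_ops):
    # Hold: steady-state test. Dequeue one event and enqueue a new one,
    # so the queue size stays constant throughout.
    q = [random.uniform(0.0, 1000.0) for _ in range(queue_size)]
    heapq.heapify(q)
    start = time.perf_counter()
    for _ in range(n_ops):
        t = heapq.heappop(q)
        heapq.heappush(q, t + random.uniform(0.0, 2.0))
    return (time.perf_counter() - start) * 1000.0 / n_ops  # ms per op

def up_down(queue_size):
    # Up/Down: transient test. Fill the queue completely, then empty it.
    q = []
    start = time.perf_counter()
    for _ in range(queue_size):
        heapq.heappush(q, random.uniform(0.0, 1000.0))
    while q:
        heapq.heappop(q)
    return (time.perf_counter() - start) * 1000.0 / (2 * queue_size)
```

Plotting the per-operation time returned by these loops against queue size gives charts of the kind shown in Figures 12 and 13.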

5.3 How We Tested

Using a very fast computer, an AMD K7 800 MHz PC with 384 MB of RAM, we were able to subject queuing models to extreme conditions without spending too much time waiting for results. This was a great advantage: the longer and more complex the simulations, the more accurately we were able to pinpoint the problem areas.

For example, we were able to increase the number of hold operations to 100 times the queue size. This provided a more accurate picture, as the effect of the distribution could be observed over a longer period of time. We were also able to increase the number of up/down operations to 10, which again allowed for a longer observation period.

Since queue size is a major consideration when benchmarking performance, testing queuing models at very large queue sizes of up to 300,000 events allows the observed pattern to be extrapolated towards infinite queue size. It was through this that new observations were made, as the complexity of the data structures was put to the test in practice, not just hypothetically.

Besides observing the performance of the queuing model, observations of the distribution in the queue at different times of the simulation were also taken. Snapshots were taken at regular intervals and plotted on bar charts to observe the distribution. It is with such precise observation that problems could be identified.


To subject queuing models to further extensive benchmarking, variations of the standard 5 distributions were also tested. The distributions were combined to form more irregular ones, so as to confuse unreliable algorithms and observe their performance. Besides the hold and up/down tests, we also tested other variations, such as an irregular build-up, where the numbers of enqueues and dequeues are random but one consistently exceeds the other.

One last observation concerned the resolution of the event times being generated. Since events generated within a finite range cannot be truly random, increasing the resolution of the generated event times brings them closer to being random.

6. Snoopy Calendar Queue

6.1 Snoopy CQ Algorithm

The Snoopy CQ was the result of rigorous testing and benchmarking using the above tools. As mentioned in previous sections, the CQ and DCQ use sampling techniques to estimate the optimum bucket width for the calendar. These techniques are highly sensitive to the consistency of the distribution in the queue. As such, when the queue holds combinations of extremely different distributions, or when there are multiple peaks, these sampling techniques begin to falter.

In order to prevent such sampling problems, the Snoopy CQ algorithm moves away from sampling and takes a different approach: gathering performance statistics. These statistics enable the bucket width to be calculated irrespective of distribution, depending instead on the average costs of enqueuing and dequeuing. In the DCQ algorithm, these costs are used only to trigger the calendar resize. Through our observations, we realized that these statistics were useful and rather accurate in estimating the distribution in the calendar. Because the statistics are highly susceptible to sudden surges, the algorithm maintains a long-term moving average to smooth them out.

The other area of improvement is the triggering of a resize. As resizing is an expensive operation, it is crucial to resize only when absolutely necessary. Accurate estimation of the bucket width, together with a more discerning resize threshold, enables the calendar to resize less often and to do so wisely. The Snoopy CQ continues to recognize the need to resize whenever the queue size grows or shrinks; the new algorithm comes in specifically when the queue is in a less transient state.

6.2 Bucket Width Optimization

The Snoopy CQ algorithm recognizes the importance of the average enqueue and dequeue cost statistics. In theory, the calendar is most efficient when the average enqueue and dequeue costs are at a minimum, as these are the definitions of efficiency, discounting the overheads. It is important to note from our observations that, when the queue becomes large or the simulated time is long, the benefit of minimizing queuing costs far outweighs the benefit of minimal overheads. Overheads are still kept to a minimum, but not at the expense of improved queuing costs.

Referring to the diagram below, the average enqueue cost and the average dequeue cost are inversely related to each other with respect to the bucket width: decreasing the bucket width improves the enqueue cost in the best situations but worsens the dequeue cost in the worst situations, and vice versa.


Figure 15. Effect of Decreasing or Increasing Bucket Width

As such, there is a need to balance both costs: the bucket width is chosen so that the sum of the two costs is minimized. In fact, it is this relationship that has been undergoing refinement in our research efforts, and only through many tests were we able to estimate a good function relating the two.
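One simple way to realize this balance is to pick, from a set of candidate widths, the one that minimizes the sum of the two average costs. The sketch below assumes the two cost functions are available as callables (hypothetical names standing in for measured statistics); the actual relating function refined in our research is not reproduced here.

```python
def choose_bucket_width(candidates, avg_enqueue_cost, avg_dequeue_cost):
    # avg_enqueue_cost / avg_dequeue_cost: callables mapping a candidate
    # bucket width to a measured or estimated average cost per operation.
    # The best width is the one with the smallest combined cost.
    return min(candidates,
               key=lambda w: avg_enqueue_cost(w) + avg_dequeue_cost(w))
```

Because the two costs pull in opposite directions as the width varies, the sum has an interior minimum, which is the balance point described above.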

6.3 Resize Triggering

The Snoopy CQ algorithm adds another set of thresholds to the existing one in the DCQ. This threshold is based on a 10-slot moving average of cost: when the moving average enqueue cost exceeds the dequeue cost by a non-tolerable factor, or vice versa, the calendar undergoes a resize. In our tests, a 10-slot moving average provided enough stability to strike a good balance between excessive and unresponsive triggering. The various thresholds were benchmarked extensively until an optimal one could be decided.

It was noted that, with this new set of thresholds together with the new bucket width calculation, the calendar underwent fewer resizes than the DCQ.
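A minimal sketch of such a trigger follows; the tolerance factor is an assumed placeholder, as the actual threshold values were determined by benchmarking.

```python
from collections import deque

class ResizeTrigger:
    # Keeps 10-slot moving averages of enqueue and dequeue cost and
    # fires when one exceeds the other by more than `factor`
    # (the factor here is an assumed placeholder value).
    def __init__(self, slots=10, factor=2.0):
        self.enq = deque(maxlen=slots)
        self.deq = deque(maxlen=slots)
        self.factor = factor

    def record(self, enqueue_cost, dequeue_cost):
        self.enq.append(enqueue_cost)
        self.deq.append(dequeue_cost)

    def should_resize(self):
        if len(self.enq) < self.enq.maxlen:
            return False  # not enough history to judge yet
        e = sum(self.enq) / len(self.enq)
        d = sum(self.deq) / len(self.deq)
        return e > self.factor * d or d > self.factor * e
```

The moving average is what gives the trigger its stability: a single surge in one cost cannot fire a resize on its own.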

6.4 Performance Improvement

The Snoopy CQ algorithm has enabled the calendar queue to perform with greater stability. Benchmarking was done to test the performance improvement; the results, shown below, show that the Snoopy CQ performed much better on the hold test. There is much improvement, especially for the skewed distributions, and the overall performance is much closer to O(1). Note that the charts show performance using a higher-resolution event generator than the previous charts; this exposes a flaw in the DCQ algorithm that the Snoopy CQ takes care of by adding new mechanisms.

Figure 16. DCQ Hold Performance at Higher Resolution (time in milliseconds against queue size, 3,000 to 30,000, for the Rect, NegTriag, Triag, Camel(70,20) and Camel(98,01) distributions)

Figure 17. Snoopy CQ Hold Performance at Higher Resolution (time in milliseconds against queue size, 3,000 to 30,000, for the same five distributions)

Figure 18. DCQ Up/Down Performance at Higher Resolution (time in milliseconds against queue size, 3,000 to 30,000, for the same five distributions)

Figure 19. Snoopy CQ Up/Down Performance at Higher Resolution (time in milliseconds against queue size, 3,000 to 30,000, for the same five distributions)

7. Multi-variety Queue Models - A Solution to Queue Limitations

7.1 Introduction

So far in our research, we have looked at ways to improve the CQ algorithm. Admittedly, there are many good algorithms available today that help to improve event list management; however, most of them have some form of limitation or another, thriving under some conditions while failing under others. This is where marrying good algorithms, with the aim of covering a larger variety of simulation conditions, is a boon. When combining different queue structures, it is important to employ each structure where it is most efficient, and to use another where the former is lacking. Since the Snoopy CQ has proved to be one of the better data structures to date for a large variety of conditions, research continued into building on the Snoopy CQ structure.

7.2 Merits of Various Queue Structures

The most elementary of queue structures is the linked list. With a linked list, one is able to add events without having to re-shuffle the queue, and it is quite easy to keep in order. A linked list can also grow to an unspecified size, which is more efficient than an array, whose size needs to be pre-specified. Arrays become useful only when direct referencing is needed. The simple priority linked list, though simple to implement, becomes rapidly more cumbersome with increasing queue sizes.

This is where the calendar queue shines. It uses an array to split the queue into multiple linked lists. The array of linked lists can be directly referenced, which speeds up searching; the merit of the calendar queue is its ability to cut down the size of each linked list. A good algorithm is therefore needed to ensure that the sizes of the linked lists are monitored. Bearing in mind that array sizes need to be pre-specified, this poses the problem of finding the right size, especially when the events being enqueued follow unpredictable distributions, ranging from even to bursty ones and combinations of both.
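A minimal sketch of the array-of-sorted-lists idea is given below. It simplifies away the bucket-width tuning and the year-scanning dequeue of the real calendar queue, so it is an illustration rather than Brown's full algorithm.

```python
import bisect

class MiniCalendarQueue:
    # An array of buckets, each a sorted list; an event with timestamp
    # t lands in bucket (t // width) mod nbuckets, so each linked list
    # stays short when the width matches the event distribution.
    def __init__(self, nbuckets=8, width=1.0):
        self.nbuckets = nbuckets
        self.width = width
        self.buckets = [[] for _ in range(nbuckets)]

    def enqueue(self, t):
        idx = int(t / self.width) % self.nbuckets
        bisect.insort(self.buckets[idx], t)  # keep the short list sorted

    def dequeue_min(self):
        # Simplified: scan every non-empty bucket head for the global
        # minimum. (The real calendar queue walks buckets within the
        # current "year" instead, which is what gives O(1) behaviour.)
        best = min((b for b in self.buckets if b), key=lambda b: b[0])
        return best.pop(0)
```

The direct array indexing in enqueue is the speed-up over a single priority linked list; the cost of keeping each bucket short is exactly the width-tuning problem discussed above.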

To solve this problem, researchers have split into two separate directions. One camp looked into effectively monitoring the distribution of the queue, and the other looked into giving attention to events of higher priority. The calendar queue evolved into the dynamic calendar queue (DCQ), which incorporates strategic sampling of events in the queue to determine the optimum width of its array of buckets. Once the optimum width is determined, the DCQ resizes itself by re-queuing its events into a newly formed DCQ. This is a serious problem when the number of events to be re-queued is very large, and the frequency at which this re-queuing happens is another consideration. Thus the DCQ has problems when it resizes often, which occurs when the queue size fluctuates around powers of 2 and when the distribution is a mixture of both even and bursty types. The Snoopy CQ, which was just developed, sought to solve some of these problems of sampling and resizing; it improved performance quite substantially by bringing about greater stability through its monitoring of statistics.

To avoid the DCQ's problems altogether, the other direction of research looked into limiting the queue structure so that only a bounded number of events would be involved in resizing, and so that attention could be given to higher-priority events when the width is determined. The merits of this approach are obvious. If the queue size can be capped, then frequent resizing involves a capped number of events, preventing the cost of resizing from increasing proportionally with the queue size. Also, the determination of the bucket width can be more accurately suited to the events of nearest concern. This research produced the lazy queue and the dynamic lazy calendar queue. These queue structures implement a multi-tier approach: the lazy queue has 3 tiers and the lazy DCQ has 2. The multi-tier approach enables the queue to place events in the far future into a dump. The dump is semi-sorted, requiring fewer queuing rules and saving valuable overheads until the events are needed. The algorithm becomes costly when the decisions about where and when to operate the dump are inaccurate, and when the semi-sorted future tier is not efficient enough. The challenge, therefore, is to find a good set of boundary thresholds for the future tier and to develop a future-tier structure that is efficient and requires little maintenance.

7.3 Multi-variety Queues

As the Snoopy CQ is the best of the queue structures for queue sizes that are not too large (hence avoiding too much overhead during resizing), it is the better choice to use as a base to develop from. However, to avoid the problems that the Snoopy CQ has with large queue sizes, a hybrid model can be used. To cap the queue size, a lazy approach is taken to split the queue structure: by implementing a second tier above the Snoopy CQ, events of lesser priority can be dumped and their enqueuing delayed until they become more significant as time proceeds.

The limit for the Snoopy CQ can be set in two ways: by setting a boundary on the timestamp, or by restricting the number of events. The future list can likewise be implemented in two ways: as an array of linked lists whose width moderates the length of each list, or as a structure of nodes that are limited in size.

In setting the limit for the DCQ by means of a timestamp, a good boundary value must first be determined. One of the more efficient and logical ways is to use the current calendar year size as the boundary. The calendar year size has been sized such that events in the DCQ are evenly distributed. Since the calculation of the year size only occurs during resizing, the boundary should be set when a resize occurs. Further research would look into the choice of the boundary.

The other way to set the boundary is to restrict the number of events in the calendar. Again, the choice of the boundary would have to be researched. When a resize occurs at this boundary, only a small number of events are enqueued back into the calendar, whilst the remaining events are put in the future list. This ensures that the calendar need not resize at such large queue sizes.

To implement a future list using an array of linked lists, a width for each bucket needs to be set such that events are evenly distributed across the array. The cost of optimizing this distribution should be kept to a minimum; as the future list is not of near concern, the structure can be quite loose. The queue can be semi-sorted, as events are transferred over to the calendar in batches. A stripped-down version of the calendar can be implemented, in which each bucket contains an unsorted list and the bucket width is determined by the same algorithm, albeit less frequently, so as to prevent unnecessarily many resizes.

Figure 20. Array of Unsorted Linked Lists (CQ tier with a fixed-year-size array of future buckets)

The other method of structuring the future list does away with the problem of width determination and instead concentrates on limiting the size of each node. This criterion suits a leaf tree, where events are located at the leaves of the tree structure. Each leaf node can contain a maximum number of events, beyond which it splits into two new leaf nodes that contain approximately half the events each, emptying the originating node. The tree is ordered on the minimum value of each node, and each node maintains an unordered linked list. The benefit of this structure is that there is no need to determine distributions and calculate widths. The catch is that the tree wastes resources, because the whole tree structure must be maintained even though only the leaves hold events.

Figure 21. Leaf Tree with Binary Structure (CQ tier with events held in the leaf nodes)

7.4 Experimentation

Some preliminary testing was done to assess the effectiveness of the above ideas. The caveats for these experiments are that the structures were not yet optimized and that tests were conducted selectively rather than thoroughly. The boundary conditions were implemented in the following manner.

Boundary based on Timestamp
When the calendar resizes while containing 8192 events, the boundary is set at the calendar year size. This means that when the timestamp of a new event being enqueued is greater than the first year, it is dumped into the future list. When the number of events decreases through dequeuing, the future list transfers events into the calendar, one bucket at a time, each time the queue size halves. If the newly calculated year size of the calendar increases because of a resize, a check is made to see whether the boundary exceeds the smallest event in the future list; buckets continue to be transferred for as long as it does.

Array of Unsorted Linked Lists
To match the above boundary method, a dump consisting of an array of unsorted linked lists was used. This is similar to a stripped-down version of the calendar. An array of 9 buckets was used, giving a total of 10 years including the first year, which is the original calendar; the last bucket contains all the remaining events beyond the 9th year. Each bucket contains an unsorted linked list. To monitor the future list, the year size is evaluated each time the list is emptied and reused. The smallest event in the whole list is also updated whenever a smaller new event is enqueued or a bucket is transferred back to the calendar.

Upon experimentation with these methods of setting the boundary and the array structure, it was discovered that the boundary was only effective while the calendar year size remained optimal. If the year size changed, the whole queue, including the array, had to be re-ordered, which took too much overhead. As such, this method was dropped until further efficiency improvements could be looked into.

Boundary based on Number of Events
To restrict the number of events, whenever the calendar resizes while containing 4096 events, only 512 events are enqueued back into the calendar, whilst the remaining events are dumped into the future list. The smallest event in the future list is once again updated to keep track of the boundary. After resizing, any new event with a timestamp greater than this smallest event is dumped into the future list, while smaller events are enqueued into the calendar. This causes the calendar to grow once again; when it reaches the 4096-event threshold, the cycle repeats.

Leaf Tree with Binary Structure
A binary structure was set up to grow dynamically as events are dumped into the future list and to shrink as nodes are transferred to the calendar. This structure holds all its events at the leaves of the tree, leaving the rest of the tree empty. This promotes fast location of events and enables nodes with a maximum capacity to be implemented efficiently. In this case, each node was limited to 500 events held unsorted in a linked list. Whenever a node reaches 500 events, it is quicksorted and split into 2 daughter nodes, with the first 250 events in the lower node and the next 250 in the upper node. New events enqueued after that find the relevant node to be dumped into by traversing the tree. When a transfer of events to the calendar is required, the leftmost node is plucked out and enqueued into the calendar.

This combination of a boundary fixed on queue size and a leaf-tree dump proved the more feasible implementation. The initial results were quite encouraging, showing no signs of major problems. This implementation was chosen as the prototype to build on, as queue size-based thresholds seemed a more logical approach than trying to estimate good time-based thresholds.

8. Development on Lazy Queuing Algorithm

After the second prototype was chosen, research was done to refine the model. Development was carried out using the lazy queue algorithm on top of the Snoopy CQ, and observations were made on this setup. Whether the lazy queue algorithm works just as well on top of other queue models remains to be researched; our initial opinion is that it would perform independently of the bottom tier, with only the parameters needing to be optimized.

8.1 Tree List Structure

The tree list structure (TLS) used in the prototype was further refined. Research was done to identify which parameters could be adjusted so as to improve the efficiency of the model. Two parameters were identified: the maximum size of the nodes and the way a node is split.

During the testing of the node size, two relationships were watched: how node-size performance varied with the queue size, and how it varied with the distribution. It was observed that node-size performance was proportional to the queue size; further testing revealed that, more specifically, it was the TLS size that affected node-size performance. As such, the node size was set to be a factor of the TLS size. The other observation was that node-size performance was proportional to the evenness of the distribution: the more skewed the distribution, the better the performance for smaller node sizes. A balance point was therefore needed when adjusting the node size factor.

For the prototype model, splitting a node required a quicksort followed by a division of the events into 2 equal halves. After some consideration, this was thought to be inefficient. Firstly, the sorting is extra work and not worth the overhead of a split. Secondly, an equal number of events in each half is not necessary for the structure to function. Splitting the node around the average timestamp of its events distributes the events sufficiently between the two halves. The new splitting algorithm therefore involves calculating the average timestamp and dividing the node by a simple comparison against that average.
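The new split can be written without any sorting. A sketch, assuming events are held as plain timestamps in the leaf's unordered list (the positional fallback for the degenerate all-equal case is our addition):

```python
def split_node(events):
    # Split an unsorted leaf around the average timestamp; both halves
    # stay unsorted, and every element of `lower` precedes every
    # element of `upper`. No quicksort is required.
    avg = sum(events) / len(events)
    lower = [t for t in events if t <= avg]
    upper = [t for t in events if t > avg]
    if not upper:
        # All timestamps equal: fall back to a positional split so
        # that both daughter nodes are non-empty.
        mid = len(lower) // 2
        lower, upper = lower[:mid], lower[mid:]
    return lower, upper
```

The split is a single linear pass rather than an O(n log n) sort, and the halves need not be equal for the tree ordering (by minimum value per node) to hold.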

8.2 Boundary Values of the Calendar

There are 2 boundary values that have been identified for calendar growth and 1 for shrinking. There are typically 3 phases in a simulation: the queue building up, the queue in steady state, and the queue emptying. During the build-up, the calendar is bounded by a maximum value beyond which a portion of the events is transferred to the TLS; the size of this portion is itself one of the boundaries, and it affects the rate of filling of the TLS. During the emptying stage, a boundary has to be set on the number of events to be transferred when the calendar requests events back from the TLS; this boundary affects the rate of emptying of the TLS.

During the testing of the upper limit, it was found that a suitable limit is one that is not so high as to render the lazy queue ineffective, and not so low as to cause frequent transferring. The proportion of the calendar to be emptied depended on how large a transfer frequency was acceptable (which in turn depended on the upper limit), and on which point would leave a reasonable number of events to keep the calendar stable.

The testing of the limit for transferring back from the TLS showed that the boundary did not make much of a difference as long as it was somewhere halfway between 2 resize boundaries (i.e., powers of 2). The wise choice is therefore to set the boundary lower, to delay the transfer until necessary. The boundary cannot be set too low either, because that causes many resizes to occur, due to the inherently closer resize boundaries at smaller queue sizes (smaller powers of 2). It was also observed that the closer this limit was to the upper limit, the greater the number of resizes; this gave better performance for the more skewed distributions.

8.3 Threshold Values for Transferring

The last but most important parameters are the threshold values that decide when transferring is to be done. The decision on when to transfer is based on the state of the queue, of which 4 states have been identified: the 3 states identified in the section before, plus an in-between state called the sinusoidal state. The sinusoidal state is a fluctuation about a stable mean: the queue fluctuates too greatly to be considered steady, yet does not move in any one general direction long enough to be considered a build-up or an emptying.

To check the state of the queue, a queue-size buffer was set up to store the total queue size at various time intervals. 2 parameters can be identified here: the size of this buffer and the rate at which the queue size is sampled. The greater the number of samples in the buffer, the less sensitive the threshold; the smaller the sampling interval, the more sensitive the threshold. These parameters have to be adjusted so as to make good judgments: not to falsely trigger the transferring, yet not to delay it for too long. During testing, a relationship was found between the sampling rate and the total queue size, so the sampling rate was set as a factor of the queue size.

An algorithm was developed to test the state of the queue based on the queue-size statistics collected in the buffer. The check is done whenever there is a calendar resize, so that any transferring happens together with the resize. The algorithm first checks whether all consecutive samples move consistently in one direction. If the queue is growing, the decision is to transfer from the calendar to the TLS; if it is shrinking, to transfer from the TLS to the calendar. If neither is the case, the algorithm goes on to check the variance of the samples. If the variance exceeds a certain threshold, the queue is deemed not steady but sinusoidal; no transferring is done and the queue proceeds as normal. If the variance is below the threshold, a full transfer of the whole TLS to the calendar is triggered. This decision has to be made carefully, as it incurs high overheads; it is controlled by another parameter, which was also tested. It was found that this threshold should depend on the total queue size rather than being fixed.
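The decision procedure can be sketched as follows. The variance threshold is a tunable parameter (and, per our finding, should depend on the total queue size rather than being fixed); the state names are ours.

```python
def classify_queue_state(samples, variance_threshold):
    # samples: recent total queue sizes, oldest first (the buffer).
    diffs = [b - a for a, b in zip(samples, samples[1:])]
    if all(d > 0 for d in diffs):
        return "building"    # transfer calendar -> TLS
    if all(d < 0 for d in diffs):
        return "emptying"    # transfer TLS -> calendar incrementally
    mean = sum(samples) / len(samples)
    variance = sum((s - mean) ** 2 for s in samples) / len(samples)
    if variance > variance_threshold:
        return "sinusoidal"  # no transfer; proceed as normal
    return "steady"          # full transfer of the TLS to the calendar
```

Running this at every calendar resize keeps the transfer decision and the resize in a single operation, as described above.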

The choice of algorithm was made based on testing whether it was profitable to transfer incrementally. The result was that a full transfer benefits a queue that is decidedly in steady state, while an incremental transfer benefits a queue that is in a transient state.

During the course of testing, and though not related to the lazy algorithm, it was noticed that the efficiency of the calendar improved significantly just by increasing the number of buckets compared to the original calendar algorithm. This could be due to the fact that the number of buckets is not optimized in the original algorithm. Through testing, it was found that the best performance was achieved with the number of buckets at least 4 times greater than the queue size (the original used at least half).

8.4 Comparison of Results

Figure 22. Lazy Snoopy CQ Hold Performance (time in milliseconds against queue size, 3,000 to 30,000, for the Rect, NegTriag, Triag, Camel(70,20) and Camel(98,01) distributions)

Figure 23. Lazy Snoopy CQ Up/Down Performance (time in milliseconds against queue size, 3,000 to 30,000, for the same five distributions)

8.5 Future Directions

Further research will examine whether the lazy algorithm is able to perform independently of the bottom tier. Research has already been started on improving the original CQ and DCQ algorithms by adding the new lazy algorithm.

Another possible direction would be to fall back on the time-based algorithm described previously. Through further refinement, another such lazy structure could emerge to challenge the one described here.

Research still needs to be carried out on the theory behind the parameters described above. There have been loose conclusions and relations, but they have yet to be sorted out theoretically. This challenge requires a strong mathematical background and a clear grasp of queuing models and statistics.


IV. CONCLUSION

The Industrial Attachment with High Speed Networks Research has been an eye-opening experience. It gave us the opportunity to get a taste of what research and development is like. During the course of the attachment, there was much freedom to explore different ideas, and co-operation with fellow colleagues enabled a healthy exchange of ideas and promoted speedier development. There were times when research was futile and ideas were not coming; these were the times when patience was built up and thinking was most draining. On the other hand, when times were better, success was sweet and spurred us on.

Much was learned during the attachment. We picked up skills in analysis and problem solving. Our knowledge of queuing algorithms and data structures in the field of network simulations grew to a great extent. The challenging research helped us to develop our thinking, and gave us confidence to come up with new developments.

At the end of the 6-month stint, our team managed to make headway in developing a more efficient queuing model that relies on better statistics. Headway was also made in development of a multi-list structure that opened the door to avenues for further advancement in the efficiency of queuing models.

V. REFERENCES

Vaucher, J. G., and Duval, P., 1975. A Comparison of Simulation Event Lists. Commun. ACM 18, 4, 223-230.

Jones, D. W., 1986. An Empirical Comparison of Priority-queue and Event-set Implementations. Commun. ACM 29, 4, 300-311.

Brown, R., 1988. Calendar Queues: A Fast O(1) Priority Queue Implementation for the Simulation Event Set Problem. Commun. ACM 31, 10, 1220-1227.

Ronngren, R., Riboe, J., and Ayani, R., 1993. Lazy Queue: A New Approach to Implementing the Pending Event Set. Int. J. Computer Simulation 3, 303-332.

Oh, S., and Ahn, J., 1999. Dynamic Calendar Queue. In Proceedings of the 32nd Annual Simulation Symposium.
