Design and Analysis of a Virtual Output Queueing Based Windowing Scheduling Scheme for IP Switching System

Jin Seek Choi¹ and BongSue Suh²

¹ Dept. of Computer Education, Hanyang University
17 Haengdang-dong, Seongdong-gu, Seoul 133-791, Korea
jinseek@hanyang.ac.kr
Tel: +82-2-2290-1129, Fax: +82-42-2290-1740

² Dept. of Information and Communication Engineering
Andong National University, 388, Songchun-dong, Andong, 760-749, Korea
bsuh@andong.ac.kr
Tel: +82-54-820-5163, Fax: +82-54-820-6125

Abstract. In this paper, we investigate the performance of a virtual output queueing (VOQ) based windowing (VOQW) scheduling scheme for IP switching systems. The results show that the proposed scheme considerably reduces arbitration complexity and improves switch throughput under nonuniform and correlated bursty traffic. Moreover, nonuniform IP traffic has no impact on the performance of the VOQW scheme, even though it severely degrades the performance of the FIFO based windowing scheme. We therefore conclude that the VOQW scheme is useful in the design of IP switching systems.
1 Introduction
There are several challenges in merging IP packet forwarding with cell-based switching. One important issue is that the length of an IP packet is variable, whereas the length of an asynchronous transfer mode (ATM) cell is fixed. The cells fragmented from a packet form bursty traffic with highly correlated destinations [1]. Another important issue is traffic imbalance. In Internet and telecommunication networks, traffic imbalance is inherent, since particular destinations such as popular databases, communication servers or outgoing trunks attract concentrated traffic, and the output ports leading to these destinations become unevenly loaded. Traffic imbalance refers to a traffic model with unevenly distributed routing and different intensity at certain output ports, which is called nonuniform traffic. Hence, the input traffic of an IP switching system is likely to be nonuniform and correlated bursty [1].
To overcome the problems associated with nonuniform and correlated bursty traffic, considerable work has been done on VOQ based scheduling algorithms [2, 5, 7, 8]. For example, maximum matching algorithms such as iSLIP and parallel iterative matching (PIM) have been proposed to achieve 100% throughput for cell-based input queueing switches [8]. Marsan et al. developed scheduling algorithms that deal with variable length IP packets in IP switching systems, and proved that operating an input queueing switch in packet mode imposes no throughput limitation compared with output queueing switches [4]. Nong et al. evaluated the maximum throughput of cell-based IP switching systems for the PIM algorithm under bursty traffic [5]. All of these works rely on VOQ based maximum matching algorithms such as PIM and iSLIP, which can achieve 100% throughput even under nonuniform traffic. However, they face two constraints. The first is that multiple arbitrations have to be completed within one cell time slot. The second is that each arbitration logic has to handle up to N contending cells at a time.
This work was supported in part by ETRI and ICU-OIRC funded by KOSEF.
Z. Mammeri and P. Lorenz (Eds.): HSNMC 2004, LNCS 3079, pp. 268–279, 2004.
© Springer-Verlag Berlin Heidelberg 2004
To address the first constraint, Smiljanic et al. proposed a pipeline-based scheduling algorithm called round-robin greedy scheduling (RRGS) [9]. More recently, Oki et al. introduced a pipeline-based scheduling scheme that relaxes the timing constraint for arbitration [10]. The second constraint, however, has not yet been studied, even though the arbitration logic becomes impractical as the switch size increases because of the implementation complexity of arbitrating multiple cells at each output port.
In this paper, we consider only the complexity of the arbitration logic and show that it is still a bottleneck. We then propose a VOQ-based windowing (VOQW) scheme and analyze its performance under nonuniform IP traffic. We believe that combining VOQ with the windowing scheme overcomes the performance degradation that the conventional windowing scheme suffers under nonuniform traffic. Moreover, the arbitration logic is suitable for hardware implementation, since the proposed scheme handles only a small number of contending cells in each arbitration, similar to the dual round-robin (DRR) scheme [3]. By analyzing the maximum throughput, we also show that the proposed scheme outperforms the FIFO based windowing scheme and the DRR scheme, although it performs slightly worse than iSLIP. We verify the analytic results through computer simulation.
The remainder of this paper is organized as follows. Section 2 describes the switch model and the VOQW scheme. Section 3 analyzes the complexity of the arbitration logic and derives the maximum throughput of the switch under various traffic patterns. Section 4 presents the numerical results and compares them with simulation. Section 5 concludes the paper.
2 VOQ Based Windowing Scheme
2.1 Switch Model
The switch architecture considered in this paper is an $N \times N$ input queueing cell-based switch with a windowing scheme [11]. The switch fabric is nonblocking and has no internal speed-up. Variable length packets are internally segmented into ATM-like cells, which are then switched. Cells are of fixed length, and the buffer size of each input queue is assumed to be infinite. The switch operates synchronously, so that cells are received and transmitted within a fixed time interval called a slot. In each slot, each input port can transmit at most one cell and each output port can receive at most one cell. However, multiple cells arrive at an input queue as a train of cells.
[Figure 1 (image not reproduced). Labels: HOL contention logic and W-HOL queue at inputs 1, 2, ..., N; arbiters 1, 2, ..., N with signals a1, ..., aN and g1, ..., gN; switching fabric.]
Fig. 1. VOQ based windowing structure
Each input has a separate FIFO queue for each output, called a virtual output queue (VOQ). For example, input port i has N VOQs, numbered from 1 to N, and VOQ_{i,j} stores the cells that arrive at input port i destined for output port j. Each input has its own contention logic, which operates independently of the others. In each contention phase, the contention logic decides which VOQ at the input port will contend for its output. Each output also has an arbiter, which picks one cell from among the contending cells. Fig. 1 shows an example of the switch structure with the VOQW scheduling scheme, where the W-HOL queue of input i consists of the HOL cells, that is, the cells queued first in each of its VOQs.
2.2 VOQW Scheduling Scheme
To arbitrate the cells in the input queues, the proposed scheduling procedure is divided into w contention phases, $t = \sum_{i=1}^{w} c_i$, where $c_i$ is the time duration of the ith contention phase. In the first contention phase $c_1$, each input port randomly selects a contending cell from its W-HOL queue and contends for that cell's output port. At each output port, an arbiter randomly selects one cell among the contending cells with the same destination and returns a grant to the winning input port. Because of output conflicts, some inputs are not selected to transmit; each such input port selects another cell from its W-HOL queue and contends for that cell's output in the second contention phase $c_2$. In contrast, an input port that won the output contention in a previous phase is not allowed to contend in the remaining phases. Likewise, an output port that was claimed by a cell in a previous contention phase does not arbitrate contending cells in later phases. The contention phase is repeated up to w times within a slot time, so each input can contend for an output up to w times, where w is the window size. However, at most one cell is transmitted from each input and to each output per slot.
The VOQW scheme is a simple, practical solution for IP switching, since it can be implemented with contention logic that handles only one cell at each input port, similar to the conventional FIFO based scheme [11]. Moreover, each output arbiter handles only a small number of contending cells with the same destination. The proposed scheme differs from the FIFO based scheme in two ways. First, it is a windowing scheme with a separate logical queue (VOQ) for each destination port instead of a single FIFO queue. Second, it randomly selects a contending cell among the separate VOQs instead of sequentially selecting cells from a single FIFO queue. From now on, we call this contention procedure the VOQW scheduling scheme.
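The contention procedure of this section can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `voqw_slot`, the occupancy matrix `voq_len`, and the rule that an input does not retry a VOQ it has already tried within the same slot are introduced here for illustration only (the text states only that a blocked input selects another cell).

```python
import random


def voqw_slot(voq_len, w, rng):
    """Run the w contention phases of one slot (illustrative sketch).

    voq_len[i][j] is the number of cells queued at input i for output j.
    Returns a dict mapping each winning input to its granted output.
    """
    n = len(voq_len)
    granted = {}                       # input -> output, at most one per input
    busy_outputs = set()               # outputs already claimed in this slot
    tried = [set() for _ in range(n)]  # assumption: no retry of the same VOQ

    for _ in range(w):
        # Request step: every still-unmatched input picks one nonempty VOQ.
        requests = {}                  # output -> list of requesting inputs
        for i in range(n):
            if i in granted:
                continue               # winners stop contending
            candidates = [j for j in range(n)
                          if voq_len[i][j] > 0 and j not in tried[i]]
            if candidates:
                j = rng.choice(candidates)
                tried[i].add(j)
                requests.setdefault(j, []).append(i)

        # Grant step: each free output picks one requester at random.
        for j, inputs in requests.items():
            if j in busy_outputs:
                continue               # occupied outputs no longer arbitrate
            winner = rng.choice(inputs)
            granted[winner] = j
            busy_outputs.add(j)
    return granted
```

For a saturated 4 × 4 switch, a call such as `voqw_slot([[1] * 4 for _ in range(4)], w=3, rng=random.Random(0))` returns a partial matching in which no input and no output appears twice.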
3 Performance Analysis
3.1 Traffic Model
The traffic intensity in the switch can be represented by a rate matrix describing the traffic passing from input i to output j. The particular form of the rate matrix used in previous studies is
$$\lambda_{ij} = \lambda_i Q_j \tag{1}$$
where $\lambda_i$ is the average arrival rate of cells at input i, and $Q_j$ is the probability that a cell at any input is destined for output j.
The arrival process considered in this paper is correlated bursty traffic. This model represents realistic IP traffic, since the cells of real IP traffic tend to be fragments of a variable length packet and therefore arrive in bursts. The input traffic alternates between burst and idle periods whose lengths are geometrically distributed, and all cells of a burst are destined for the same output. We can thus model the input traffic as a simple on/off arrival process, the interrupted Bernoulli process. As a second input traffic model, we consider a self-similar arrival process modelled by Pareto-distributed ON/OFF traffic with Hurst parameter $H = (3 - \alpha)/2$, where $\alpha$ is the shape parameter of the Pareto distribution. This model characterizes packet interarrival times with a heavy-tailed distribution.
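The on/off arrival process described above can be illustrated with a short generator. This is a sketch under assumptions that are not taken from the paper: the names `on_off_source`, `burst_start_prob` and `burst_end_prob` are hypothetical, a burst may start in the slot immediately after the previous one ends, and the destination of each burst is drawn from a given output distribution `q`. With `burst_end_prob = b`, the burst length is geometric with mean 1/b. The helper `pareto_shape_from_hurst` simply inverts the relation between H and the Pareto shape parameter.

```python
import random


def pareto_shape_from_hurst(hurst):
    """Invert H = (3 - alpha) / 2 to get the Pareto shape parameter alpha."""
    return 3.0 - 2.0 * hurst


def on_off_source(q, burst_start_prob, burst_end_prob, slots, rng):
    """Yield, slot by slot, the destination of the arriving cell or None.

    q[j] is the probability that a burst is destined for output j.
    """
    outputs = range(len(q))
    dest = None                                  # None means the source is idle
    for _ in range(slots):
        if dest is None and rng.random() < burst_start_prob:
            # A new burst starts; all of its cells share one destination.
            dest = rng.choices(outputs, weights=q)[0]
        yield dest
        if dest is not None and rng.random() < burst_end_prob:
            dest = None                          # burst ends after this cell
```

For example, `list(on_off_source([0.25] * 4, 0.5, 0.05, 100, random.Random(1)))` produces 100 slots of traffic with a mean burst length of 20 cells.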
Next, we consider the outgoing traffic intensity. In a real environment, particular destinations such as a popular database, a communication server or outgoing trunks attract concentrated traffic, so the output ports leading to them are loaded more heavily than the others. Thus, the number of packets destined for different outputs may not be identical. Traffic imbalance of this kind, as opposed to uniform traffic, is referred to as nonuniform traffic.
In this paper, we do not consider input imbalance. We consider only nonuniform traffic in which the output addresses are not uniformly distributed. The output imbalance factor $Q_j$ is defined as the probability that the output address of a cell is j, such that
$$Q_j \neq 1/N \quad \text{for all } j \tag{2}$$
and
$$\sum_{\text{for all } j} Q_j = 1. \tag{3}$$
Nonuniform traffic can be divided into the following two cases. The most general nonuniform traffic pattern is output imbalance consisting of two output groups. In this case, the outputs are divided into two groups, $N_1^o$ and $N_2^o$, and the output imbalance factor for each output group is given by
$$Q_j = \begin{cases} P_1 \dfrac{1}{N_1^o} & \text{if } j \in N_1^o \\ (1 - P_1) \dfrac{1}{N_2^o} & \text{if } j \in N_2^o \end{cases} \tag{4}$$
where $P_1$ (or $1 - P_1$) is the portion of input traffic going to group $N_1^o$ ($N_2^o$) and $1/N_1^o$ ($1/N_2^o$) is the portion received by a specific output within that group. From now on, we call $P_1$ the bi-group coefficient.
Another nonuniform traffic pattern is hot-spot imbalance, where a single hot spot is superimposed on a background of uniform traffic. This is a special case of the bi-group imbalance model with $N_1^o = 1$, and the output imbalance factor becomes
$$Q_j = \begin{cases} h + \dfrac{1 - h}{N} & \text{if } j \in N_1^o \\ \dfrac{1 - h}{N} & \text{otherwise} \end{cases} \tag{5}$$
where h is called the hot-spot coefficient.
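The rate matrix and the two output distributions defined in this subsection are easy to tabulate. The following sketch does so; the function names and the zero-based output numbering (group 1 or the hot spot comes first) are assumptions made for illustration.

```python
def bigroup_q(n, n1, p1):
    """Eq. (4): outputs 0..n1-1 form group 1, the remaining n - n1 form group 2."""
    n2 = n - n1
    return [p1 / n1] * n1 + [(1.0 - p1) / n2] * n2


def hotspot_q(n, h):
    """Eq. (5): output 0 is the hot spot on top of uniform background traffic."""
    return [h + (1.0 - h) / n] + [(1.0 - h) / n] * (n - 1)


def rate_matrix(lam, q):
    """Eq. (1): lambda_ij = lambda_i * Q_j."""
    return [[lam_i * q_j for q_j in q] for lam_i in lam]
```

With N = 128 and a first group of 32 outputs, `bigroup_q(128, 32, 0.25)` assigns 1/128 to every output, and so does `hotspot_q(128, 0.0)`; both therefore reduce to uniform traffic, and in every case the entries sum to one.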
3.2 Complexity Analysis
We now analyze the complexity of the VOQW scheduling scheme. An input queueing switch can switch at most one cell per input and per output in each slot. The scheduler determines which cell will contend for its output. Each output has its own arbiter, which operates independently of the others. The main task of an arbiter is to decide which of the contending cells destined for its output is scheduled in the next slot. In other words, an arbiter picks the cell to be transmitted next from among all contending cells with the same destination.
The switch throughput essentially depends on the service discipline that arranges the service order among the contending cells. A larger number of contending cells favors an optimal matching. However, the number of contending cells also determines the complexity of arbitration, that is, of deciding which contending cell to serve, because each arbiter has to record all inputs that are contending before it picks one of the contending cells according to the service discipline.
For example, iSLIP and PIM may let all cells queued in the W-HOL queue contend. The contending cells with the same destination have to be arbitrated by one arbiter. iSLIP and PIM therefore greatly increase the number of contending cells per arbitration, even though they improve the switch throughput. In contrast, the proposed VOQW scheme picks one cell from each W-HOL queue at random, the FIFO-based scheduling scheme picks the oldest cell, and the DRR scheduling scheme picks one cell from the W-HOL queue according to a slightly more complicated round-robin service discipline. Hence, the VOQW scheme limits the total number of contending cells to the number of inputs N and distributes the contending cells over the outputs randomly. The average number of contending cells destined for an output is therefore reduced considerably, to about the same level as in the DRR scheme, and the average number of contending cells per contention phase is also almost the same as in the DRR scheme.
Fig. 2 compares the average number of contending cells of the VOQW scheme with that of iSLIP under hot-spot nonuniform traffic with h = 0.005: (a) as a function of the offered load and (b) per contention phase when the offered load is 0.98. As shown in this figure, with iSLIP (w = 1) the average number of contending cells increases with the offered load, up to N. When the offered load approaches 0.98, the number of contending cells reaches 100 or more. For the iterative scheme (w = 5), the average number of contending cells is considerably reduced, to 28. However, the number of contending cells in the first contention phase is still high, greater than 40. In contrast, with the proposed VOQW scheme each input picks one cell from its W-HOL queue and contends for that cell's output. Since each input picks a nonempty VOQ at random, the contending cells are evenly distributed over all outputs. Therefore, the number of contending cells remains low as the traffic load increases. In addition, the arbitration of the proposed VOQW scheme is performed through a request-grant procedure, whereas iSLIP requires a three-way handshake (request-grant-accept) to arbitrate the input-queued cells. This indicates that the VOQW scheme considerably reduces the complexity of arbitration compared with iSLIP or PIM.
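The difference in arbiter load can be made concrete by counting the requests that reach each output arbiter in the first contention phase. The sketch below is illustrative only: the function names are hypothetical, `nonempty[i][j]` is an assumed Boolean occupancy matrix, and `requests_all_voqs` models a request step in which every nonempty VOQ contends, which is how the text above characterizes iSLIP and PIM.

```python
import random


def requests_all_voqs(nonempty):
    """Requests per output when every nonempty VOQ sends a request."""
    n = len(nonempty)
    return [sum(1 for i in range(n) if nonempty[i][j]) for j in range(n)]


def requests_one_per_input(nonempty, rng):
    """Requests per output when each input picks one nonempty VOQ at random."""
    n = len(nonempty)
    counts = [0] * n
    for row in nonempty:
        candidates = [j for j in range(n) if row[j]]
        if candidates:
            counts[rng.choice(candidates)] += 1
    return counts
```

When all VOQs of an N × N switch are occupied, the first function reports N requests at every arbiter, whereas the second distributes exactly N requests over all arbiters, an average of one per output.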
3.3 Throughput Analysis
Let us analyze the dynamics of the VOQW scheme. In the switching system, cells are served by the windowing scheme, so each input port contends for its desired outputs up to w times per slot, one contention after another. That is, the cells of an input contend for their outputs one at a time, and the next cell contends only when all of the former contentions have been blocked. For each contention, an input chooses one of its VOQs with equal probability.
[Figure 2 (image not reproduced). Panel (a): average number of contending cells versus offered load for iSLIP, DRR and VOQW with window sizes 1 and 5. Panel (b): average number of contending cells versus arbitration window for iSLIP (w = 5), DRR and VOQW (w = 5).]
Fig. 2. Average number of contending cells per arbiter as a function of (a) offered load and (b) arbitration window under hot-spot nonuniform traffic (h = 0.005)
Let us focus on the dynamics of a tagged output. From the point of view of the tagged output, the probability that none of the contending VOQs is destined for the tagged output is $(1 - 1/N)^M$, where M is the expected total number of contending VOQs. If $\rho$ is the utilization of each VOQ, the expected number of contending VOQs in the system is $\rho N$, so the probability that at least one of the contending cells is destined for the tagged output becomes $1 - (1 - 1/N)^{\rho N}$. By taking the expectation of this probability, we get the expected throughput for the tagged output,
$$E[T] = E\left[1 - (1 - 1/N)^{\rho N}\right]. \tag{6}$$
This expected throughput gives the switch throughput.
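As a quick numerical check of the tagged-output argument, the expression inside the expectation can be evaluated for a fixed N and utilization and compared with a small Monte Carlo experiment. The sketch below treats the number of contending cells as a constant rather than a random variable, which is a simplifying assumption, and the function names are hypothetical.

```python
import random


def tagged_output_throughput(n, rho):
    """Evaluate 1 - (1 - 1/N)^(rho * N) from Eq. (6) for fixed N and rho."""
    return 1.0 - (1.0 - 1.0 / n) ** (rho * n)


def simulated_hit_fraction(n, m, trials, rng):
    """Fraction of outputs that receive at least one of m uniform requests."""
    hit = 0
    for _ in range(trials):
        hit += len({rng.randrange(n) for _ in range(m)})
    return hit / (n * trials)
```

For N = 128 and a utilization of one, the analytic value lies close to $1 - e^{-1}$, and the simulated fraction with M = N uniformly distributed requests should agree with it up to sampling noise.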

In the saturated state, all VOQs always have cells waiting for contention, so $M \ge N$. Under this assumption, we can analyze the maximum throughput of the VOQ based windowing scheme. In this scheme, the total number of contending cells M is the sum of the cells contending in all phases of a slot time. Suppose there are w contention phases in a slot time. The number of contending cells in the first contention phase is N, since at saturation every input contends with one cell for its output. Only blocked inputs contend again in the second phase, so the number of contending cells in the second contention phase equals the number of inputs blocked in the first. In the same way, the number of contending cells in the ith contention phase equals the number of inputs blocked in all previous contention phases, which is the number of inputs N multiplied by one minus the maximum throughput of the switch with window size i − 1. Hence, the total number of contending cells for the outputs is
$$M_w = M_{w-1} + M_1\,(1 - E[T_{w-1}]), \tag{7}$$
where $M_{w-1}$ is obtained from the iteration
$$\begin{aligned} M_{w-1} &= M_{w-2} + M_1\,(1 - E[T_{w-2}]) \\ &\;\;\vdots \\ M_3 &= M_2 + M_1\,(1 - E[T_2]) \\ M_2 &= M_1 + M_1\,(1 - E[T_1]) \\ M_1 &= N \end{aligned} \tag{8}$$
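Applying (7) to (6) with $\rho N$ replaced by $M_w$, and using $(1 - 1/N)^{M_w} \approx e^{-M_w/N}$ for large N, we obtain the maximum throughput of the VOQ-based windowing scheme as follows:
$$\begin{aligned} E[T_1] &= 1 - e^{-1} \\ E[T_2] &= 1 - e^{-(1 + (1 - E[T_1]))} = 1 - e^{-(1 + e^{-1})} \\ E[T_3] &= 1 - e^{-(1 + (1 - E[T_1]) + (1 - E[T_2]))} = 1 - e^{-\left(1 + e^{-1} + e^{-(1 + e^{-1})}\right)} \\ &\;\;\vdots \end{aligned} \tag{9}$$
where $E[T_i]$ is the maximum throughput of the VOQ-based windowing scheme with window size i.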
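The recursion behind Eq. (9) is straightforward to evaluate numerically. The following sketch iterates it for a given window size; the function name `max_throughput` and the normalized variable `load` (standing for $M_i / N$) are introduced here for illustration, and the large-N exponential form of Eq. (9) is assumed.

```python
import math


def max_throughput(w):
    """Return [E[T_1], ..., E[T_w]] from the recursion behind Eq. (9)."""
    load = 0.0        # M_i / N, total contentions per input so far
    throughput = 0.0  # E[T_0] = 0: nobody has been served before phase 1
    values = []
    for _ in range(w):
        load += 1.0 - throughput           # blocked inputs contend again
        throughput = 1.0 - math.exp(-load)
        values.append(throughput)
    return values
```

Calling `max_throughput(5)` gives values of approximately 0.63, 0.75, 0.80, 0.84 and 0.86 for window sizes 1 to 5, which shows how each additional contention phase yields a smaller gain.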
So far, we have analyzed the maximum throughput of the VOQ based windowing scheme. The waiting time and the switch throughput below saturation, however, cannot be derived easily, because the service probabilities of contending cells are correlated with each other under correlated bursty traffic. Therefore, we evaluate the waiting time of the switch through computer simulation in the following section.
4 Numerical Results and Discussion
In this section, we present numerical results for the maximum throughput and compare the analytic results with simulation. In the simulation, the size of the switch is $128 \times 128$. Each input queue consists of w + 100 FIFO buffers, or of 50 VOQ buffers for each destination. There is no cell loss in the switch, and the slotted switch operation runs for $10^6$ slots. The input load is balanced, and overhead is not taken into account. In this analysis, we are interested in the maximum throughput, in order to observe the effects of traffic patterns on the switch performance for various window sizes. The maximum throughput is obtained when the switch is saturated, that is, at $\rho = 1$, without regard to switch stability. The maximum throughput is the upper bound on the average utilization of the N outputs. We also evaluate the switch throughput and the waiting time of a cell through computer simulation.
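A much reduced version of such a saturation experiment can be written as follows. This is a sketch under strong assumptions, not the simulator used for the results in this section: traffic is uniform, every VOQ is permanently nonempty, each contention picks an output uniformly at random and independently of earlier attempts, and the name `saturated_throughput` is hypothetical.

```python
import random


def saturated_throughput(n, w, slots, rng):
    """Estimate the per-output throughput of a saturated n x n switch."""
    served = 0
    for _ in range(slots):
        waiting = set(range(n))   # inputs that have not yet won a grant
        busy = set()              # outputs already claimed in this slot
        for _ in range(w):
            requests = {}         # output -> inputs requesting it this phase
            for i in waiting:
                requests.setdefault(rng.randrange(n), []).append(i)
            for j, inputs in requests.items():
                if j not in busy:
                    busy.add(j)
                    waiting.discard(rng.choice(inputs))
        served += len(busy)
    return served / (n * slots)
```

Under these assumptions, `saturated_throughput(64, 1, 300, random.Random(7))` should land near $1 - e^{-1}$, and larger window sizes should raise the estimate in the same way as the analytic recursion.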
In the simulation, we consider two factors: the average burst length of an incoming packet and the imbalance of the output address distribution. For bursty traffic, the average burst length of an incoming packet, 1/b, is varied from 1 to 20. The cells in the same burst have the same destination address. The arrival processes of the inputs are assumed to be independent. For the bi-group imbalance case, the outputs are divided into groups 1 and 2 of 32 and 96 outputs, respectively. The traffic intensity of output group 1, the bi-group coefficient $P_1$, is varied from 0.25 to 0.65. For the hot-spot imbalance case, the outputs are divided into groups of 1 and 127 outputs, and the traffic intensity of output 1, the hot-spot coefficient h, is varied from 0.0 to 0.025. When $P_1 = 0.25$ or h = 0, both cases reduce to uniform traffic (i.e., the addresses of incoming cells are uniformly distributed over all outputs). As self-similar traffic, we use the Pareto distribution with Hurst parameters H = 0.7 and H = 0.8. In the following figures, lines indicate simulation results and small circles indicate analytic results. The close match between the analytic and simulation results indicates that the analysis is adequate for predicting the performance.
Fig. 3 shows the maximum throughput versus (a) the bi-group coefficient and (b) the hot-spot coefficient under correlated bursty traffic. Fig. 4 shows the maximum throughput as a function of the average burst length and of the window size under correlated bursty and hot-spot nonuniform traffic (h = 0.005). As shown in these figures, the maximum throughput of the FIFO based windowing scheme decreases dramatically as the nonuniform coefficient increases. Hot-spot traffic has the more adverse effect on the maximum throughput, because the majority of cells in the input queues are destined for the hot-spot output, which is offered a load many times its capacity. Correlated bursty traffic has little effect on the FIFO based windowing scheme itself (the maximum throughput per port quickly converges to 0.5 once the burst size exceeds 5), but the improvement gained from windowing shrinks rapidly as the burst size increases [12]. This is because correlated bursty traffic only increases HOL blocking, owing to the dependency between consecutive cells at the same input port. Moreover, the blocked cells accumulate in the same input queue. In contrast, correlated bursty traffic and nonuniform traffic have no impact on the maximum throughput of the VOQW scheme, because the blocked cells accumulate in their VOQs and each input can select another VOQ at random for contention. Thus, the maximum throughput of the VOQW scheme increases considerably with the window size. As shown in Fig. 4, the maximum throughput reaches 0.85 when w = 5, and it stays at the same value under correlated bursty and nonuniform traffic. Consequently, the VOQW scheme is useful under correlated bursty and nonuniform traffic as well as under uniform traffic.
Figs. 5 and 6 show the switch throughput and the delay performance below the saturation point. These results are obtained from computer simulation for various self-similar traffic with 1/b = 20 and h = 0.05. As shown in Fig. 5, self-similar traffic (H = 0.7 and H = 0.8) has little impact on the switch throughput. The switch throughput of the VOQW scheme increases linearly with the offered load until it levels off at the saturated traffic load, which is determined by the window size. The switch throughput of iSLIP also increases linearly below the saturation point, and beyond it continues to increase at a lower rate, up to 1. This means that, below the saturation point, the switch throughput of the VOQW scheme is almost the same as that of iSLIP regardless of the traffic condition.
Fig. 6 shows the waiting time of the iSLIP and DRR schemes as well as the VOQW scheme. As shown in this figure, self-similar traffic slightly degrades the waiting time. Moreover, the VOQW scheme has a lower waiting time than iSLIP just below the saturated traffic load, but a higher waiting time than iSLIP above it. This is because the switch throughput of iSLIP keeps increasing through the desynchronization effect. The iSLIP (or DRR) scheme therefore reduces the waiting time in that region, although its waiting time below the saturated traffic load is slightly increased by the desynchronization effect. The VOQW scheme, in contrast, limits the switch throughput to its upper bound: its waiting time increases abruptly at the saturated traffic load but remains low below it. From these results, we observe that the VOQW scheme considerably reduces the total waiting time below the saturated traffic load compared with the iSLIP scheme, even though its waiting time increases abruptly at the saturated traffic load. Consequently, a designer can consider the VOQ based windowing scheme for operation below the saturated traffic load.
[Figure 3 (image not reproduced). Panels (a) and (b): maximum throughput versus bi-group coefficient and versus hot-spot coefficient for the VOQ based and FIFO based windowing schemes, w = 1, 3, 5, 7.]
Fig. 3. Maximum throughput versus imbalance coefficient under correlated bursty traffic for FIFO and VOQ based windowing schemes (1/b = 20)
[Figure 4 (image not reproduced). Panels (a) and (b): maximum throughput versus average burst length and versus window size. Legend entries: w = 1, w = 5; VOQ based scheme (P1 = 0.35), VOQ based scheme (h = 0.005), FIFO based scheme (P1 = 0.35), FIFO based scheme (h = 0.01).]
Fig. 4. Maximum throughput versus (a) average burst length and (b) window size, under correlated bursty and nonuniform traffic (1/b = 20)
[Figure 5 (image not reproduced). Switch throughput versus offered load for iSLIP, DRR and VOQW with window sizes 1 and 5, under Bernoulli and self-similar (H = 0.7, H = 0.8) traffic.]
Fig. 5. Switch throughput under correlated bursty and nonuniform self-similar traffic
[Figure 6 (image not reproduced). Total waiting time [slots] versus offered load for iSLIP, DRR and VOQW with window sizes 1 and 5, under Bernoulli and self-similar (H = 0.7, H = 0.8) traffic.]
Fig. 6. Total waiting time under correlated bursty and nonuniform self-similar traffic
5 Conclusion
The objective of this paper was to evaluate the performance of the proposed VOQ-based windowing (VOQW) scheme under correlated bursty and nonuniform traffic. The results show that the VOQW scheme can be implemented with simple arbitration logic similar to that of the DRR scheme, and that it considerably reduces the switch complexity compared with iSLIP. Moreover, nonuniform or correlated bursty traffic has no impact on the performance of a switch with the VOQW scheme; that is, the VOQW scheme provides consistent performance under various traffic patterns. In addition, the VOQW scheme considerably increases the switch throughput compared with the FIFO based windowing scheme or the DRR scheme, although its throughput is slightly lower than that of iSLIP. Consequently, we conclude that the VOQW scheme is useful when designing scheduling schemes for high-speed IP switches that operate below the saturation point.
References
1. A. Adas, "Traffic models in broadband networks," IEEE Commun. Mag., vol. 35, pp. 82–89, July 1997.
2. P. Gupta, "Scheduling in input queued switches: a survey," in citeseer.nj.nec.com/246798.html
3. Y. Li, S. Panwar, H. J. Chao, "On the performance of a Dual Round-Robin switch," in IEEE INFOCOM '01, vol. 3, pp. 1688–1697, April 2001.
4. M. A. Marsan, A. Bianco, P. Giaccone, E. Leonardi and F. Neri, "Packet scheduling in input-queued cell-based switches," in IEEE INFOCOM '01, 2001.
5. G. Nong, M. Hamdi, and J. K. Muppala, "Performance evaluation of multiple input-queued ATM switches with PIM scheduling under bursty traffic," IEEE Trans. Commun., vol. 49, pp. 1329–1333, Aug. 2001.
6. D. Manjunath and B. Sikdar, "Variable length packet switches: delay analysis of crossbar switches under Poisson and self similar traffic," in IEEE INFOCOM '00, 2000.
7. A. Mekkittikul and N. McKeown, "A practical scheduling algorithm to achieve 100% throughput in input-queued switches," IEEE INFOCOM '98, pp. 792–799, 1998.
8. N. McKeown, A. Mekkittikul, V. Anantharam, and J. Walrand, "Achieving 100% throughput in an input-queued switch," IEEE Trans. Commun., vol. 47, pp. 1260–1267, Aug. 1999.
9. A. Smiljanic, R. Fan and G. Ramamurthy, "RRGS-round-robin greedy scheduling for electronic/optical terabit switches," Proc. of GLOBECOM '99, pp. 1244–1250, 1999.
10. E. Oki, R. Rojas-Cessa and H. J. Chao, "A pipeline-based approach for maximal-sized matching scheduling in input-buffered switches," IEEE Commun. Letter, vol. 5, no. 6, pp. 263–265, June 2001.
11. A. Santhanam and A. Karandikar, "Window-based cell scheduling algorithm for VLSI implementation of an input-queued ATM switch," in IEE Proc.-Commun., vol. 147, no. 2, April 2000.
12. J. S. Choi and H. H. Lee, "Performance Study of an Input Queueing ATM Switch with windowing scheme for IP Switching System," in Proceeding of HPSR 2002, Kobe, Japan, May 2002.
13. Y. J. Hui, Switching and traffic theory for integrated broadband network. Boston: Kluwer Academic Publishers, 1990.
