multi-path routing

Sahel Alouneh a,*, Anjali Agarwal b, Abdeslam En-Nouaary c

a German-Jordanian University, Jordan
b Department of Electrical and Computer Engineering, Concordia University, 1515 St. Catherine West, Montreal, Canada H4G 2W1
c Institut National des Postes et Telecommunications (INPT), Madinat Al Irfane, Rabat, Morocco
Article info

Article history:
Received 17 April 2008
Received in revised form 17 November 2008
Accepted 3 February 2009
Available online 11 February 2009
Responsible Editor: G. Ventre

Keywords:
MPLS
Fault tolerance
Failure recovery
Path protection
Packet loss
Abstract

Multi-protocol label switching (MPLS) is an evolving network technology that is used to provide traffic engineering (TE) and high-speed networking. Internet service providers that support MPLS technology are increasingly required to provide high quality of service (QoS) guarantees. One aspect of QoS is fault tolerance, defined as the property of a system to continue operating in the event of failure of some of its parts. Fault tolerance techniques are very useful for maintaining the survivability of the network by recovering from failure within an acceptable delay and with minimum packet loss, while efficiently utilizing network resources.

In this paper, we propose a novel approach for fault tolerance in MPLS networks. Our approach uses a modified (k, n) threshold sharing scheme with multi-path routing. An IP packet entering the MPLS network is partitioned into n MPLS packets, which are assigned to node/link-disjoint LSPs across the MPLS network. Receiving the MPLS packets from k out of the n LSPs is sufficient to reconstruct the original IP packet. The approach introduces no packet loss and no recovery delay while requiring a reasonable amount of redundant bandwidth. In addition, it can easily handle single and multiple path failures.

© 2009 Elsevier B.V. All rights reserved.
1. Introduction
Multi-protocol label switching (MPLS) [1] is an evolving technology that improves routing performance and speed, and enables traffic engineering by providing significant flexibility in routing. It is also capable of providing controllable quality of service (QoS), primarily by prioritizing Internet traffic.

MPLS provides mechanisms in IP backbones for explicit routing using label switched paths (LSPs), encapsulating the IP packet in an MPLS packet. When IP packets enter an MPLS-based network, label edge routers (LERs) assign them a label identifier based on the classification of incoming packets, relating them to their forward equivalence class (FEC). Once this classification is complete and mapped, different packets are assigned to corresponding label switched paths (LSPs), where label switch routers (LSRs) place outgoing labels on the packets. In this basic procedure, all packets that belong to a particular FEC follow the same path to the destination, without regard to the original IP packet header information. The constraint-based label distribution protocol (CR-LDP) [2] or RSVP-TE [3], an extension of the resource reservation protocol, is used to distribute labels and bind them to LSPs. Fig. 1 shows a simple process for requesting and assigning labels in an MPLS
network. Here, an IP packet with IP prefix value 47.1 entering the MPLS network is assigned labels L7 and L4 to set up an LSP used to forward the packet from the ingress router towards the egress router. Therefore, each LSR that receives an MPLS packet with a Label In value, after checking its FEC value, forwards the packet to the next router with the corresponding Label Out value.

1389-1286/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.comnet.2009.02.001
* Corresponding author. Tel.: +1 514 692 9353. E-mail addresses: sahel.alouneh@gju.edu.jo, sahel_a@yahoo.com (S. Alouneh), aagarwal@ece.concordia.ca (A. Agarwal), ennouaar@ece.concordia.ca (A. En-Nouaary).
Computer Networks 53 (2009) 1530–1545
MPLS has many advantages for traffic engineering. It increases network scalability, simplifies network service integration, offers integrated recovery, and simplifies network management. However, MPLS is very vulnerable to failures because of its connection-oriented architecture. Path restoration is provided mainly by rerouting the traffic around a node/link failure in an LSP, which introduces considerable recovery delays and may incur packet loss. Such vulnerabilities are very costly for time-critical communication [5], such as real-time applications, which tolerate a recovery time on the order of seconds down to tens of milliseconds. Service disruption due to a network failure or high traffic in the network may therefore cause customers significant loss of revenue during the network down time, which may lead to bad publicity for the service provider [4]. To prevent such consequences, routers and other system elements in MPLS networks should be resilient to node or link failures. In other words, MPLS networks should implement efficient mechanisms for ensuring the continuity of operations in the event of a failure anywhere in the network. This aspect is usually referred to as fault tolerance, defined as the property of a system to continue operating properly in the event of failure of some of its parts.
Over the past years, several research works have dealt with fault tolerance in MPLS networks (for instance [5–14]). They can be classified into two categories, namely pre-established protection techniques and dynamic protection techniques. In the former, a backup LSP is pre-established and configured at the beginning of the communication in order to reserve extra bandwidth for each working path. In the latter, no backup LSP is established in advance but only after a failure occurs; hence, extra bandwidth is reserved upon the occurrence of a failure. Pre-established path protection is the most suitable for real-time restoration of MPLS networks due to its fast restoration speed. The dynamic protection model, however, does not waste bandwidth but may not be suitable for time-sensitive applications because of its large recovery time [5–7].
In addition, recovery schemes can be classified as either link restoration or path restoration according to the initialization location of the rerouting process. In link restoration, the nodes adjacent to a failed link are responsible for rerouting all affected traffic demands. In contrast, in path restoration, the ingress node initiates the rerouting process irrespective of the location of the failure. When the reserved spare capacity can be shared among different backup paths, the scheme is called shared path/link restoration. In general, path restoration requires less total spare capacity reservation than link restoration schemes [8].
Besides the above classification, protection techniques can be compared to one another based on parameters such as bandwidth redundancy, type of failures handled, recovery time, and packet loss. The first parameter measures how much extra bandwidth is required by the protection scheme. The second parameter determines whether the protection technique recovers from single or multiple failures. The third parameter indicates how much time is required by the technique to reroute traffic after a failure. Finally, the last parameter gives the percentage of packets lost due to failures. As pointed out later on, none of the existing fault tolerance schemes provides path protection with no packet loss, no recovery delay, and minimum redundant bandwidth. Therefore, there is still a need for new protection techniques that can optimize all of these factors.
In this paper, we present a novel approach for fault tolerance in MPLS networks using a modified (k, n) threshold sharing scheme with multi-path routing. An IP packet entering the MPLS network is partitioned into n MPLS packets, which are assigned to n disjoint LSPs across the MPLS network. Receiving the MPLS packets from k out of the n LSPs is sufficient to reconstruct the original IP packet. The proposed approach has the following advantages. Firstly, it handles single as well as multiple failures. Secondly, it allows the reconstruction of the original packet without any loss of packets and with zero recovery delay. Thirdly, it requires a reasonable amount of redundant bandwidth for full protection from failures. It is worth noting that in case the network topology does not offer n disjoint paths, the proposed approach should use n maximally disjoint paths. In this case, the reconstruction of the original packet might
Fig. 1. MPLS single path label distribution.
be at risk if failures occur in the shared links. We should
also point out here that our proposed method.
The rest of this paper is organized as follows. Section 2 is devoted to related work. Section 3 introduces our approach for fault tolerance in MPLS and discusses the main issues related to it. Section 4 presents the performance evaluation and the simulation results. Section 5 concludes the paper.
2. Related work
As indicated previously, the recovery approaches in MPLS fall into one of two categories: pre-established protection and dynamic protection. Since our approach is based on n pre-established paths between the ingress and egress routers, the following discussion is limited to existing pre-established protection schemes only.
The 1+1 (one plus one) protection discussed in [5,6,9] can provide path recovery without packet loss or recovery time. The resources (bandwidth, buffers, and processing capacity) on the recovery path are fully reserved and carry the same traffic as the working path, requiring a substantial amount of dedicated backup resources. Selection between the traffic on the working and recovery paths is made at the path merge LSR (PML) or the egress LER. In this scheme, the resources dedicated to the recovery of the working traffic may not be used for anything else.
Generally speaking, protection schemes use 1:1 path protection (extendible to 1:N and M:N shared protection) for efficient bandwidth utilization, where every link can carry either regular traffic or backup traffic and thus does not require dedicated backup links. The resources on the recovery path may be shared with other working paths. To support differentiated services, when the network is healthy the capacity of the backup paths is normally utilized to carry packets belonging to lower-priority class types (e.g. best effort). In case of failure, the lower-class traffic is blocked to support backup for high-priority traffic [10].
The paper by Makam et al. [5] proposes a PSL (path switch LSR) oriented path protection mechanism that consists of three components: a fast and efficient reverse notification tree structure that minimizes the delay experienced by a notification message traveling from the fault detection node to the protection switching node (i.e., the PSL), a hello protocol to detect faults, and a lightweight notification transport protocol to achieve scalability.
The paper by Haskin et al. [11] is also based on a pre-established alternative path. The backup path is comprised of two segments. The first segment is established between the last-hop working switch and the ingress LSR in the reverse direction of the working path. The second segment is built between the ingress LSR and the egress LSR along an LSP that does not utilize any working path. In Fig. 2, if the link between LSR4 and LSR9 fails, all the traffic on the working path is rerouted along the backup path, LSR 3–2–1–5–6–7–8–9. Optionally, as soon as LSR1 detects the reverse traffic flow, it may stop sending traffic downstream on the primary path and start sending data traffic directly along the second segment.
The paper by Buddhikot et al. [12] addresses guaranteed uninterrupted connectivity in case of a link/node failure on the primary path by finding a set of backup LSPs that protect the links along the primary LSP. The authors introduce the concept of "backtracking", where the backup path may originate at the failed link (i.e., local restoration) or, in the worst case, at the ingress node (i.e., path protection). In other words, it provides algorithms that offer a way to trade off bandwidth to meet a range of restoration latency requirements.
The paper by Virk et al. [13] presents an economical global protection framework designed to require minimal involvement of intermediate LSRs and to reduce the number of path switch LSRs responsible for switching the traffic from a failed working path to the backup path. The proposed scheme uses a directory service, a logically centralized and physically distributed database, to provide fast lookup of information. In the paper, the performance evaluation only considers packet loss.
The authors of [14] present a solution to further reduce the spare capacity allocation (SCA) to near-optimality by proposing a successive survivable routing (SSR) algorithm for mesh-based communication networks. First, per-flow spare capacity sharing is captured by a spare provision matrix (SPM). The SPM matrix has a dimension of the number of failure scenarios by the number of links. It is used by each demand to route the backup path and share spare capacity with other backup paths. Next, based on a special link matrix calculated from the SPM, SSR iteratively routes/updates backup paths in order to minimize the total spare capacity cost. The network redundancy (the ratio of the total spare capacity over the total working capacity) for the shared path restoration cases measured in this paper ranges from 35% to 70%. The paper, however, does not consider recovery delay, packet loss, or re-ordering.

Fig. 2. Path restoration example (Haskin et al.).
The proposed work in reference [31] provides a method for capacity optimization for path- and span-restorable networks. The approach uses an integer-program formulation based on flow constraints, which solves the spare and working capacity placement problem for both span- and path-restorable networks. The paper, however, does not consider restoration delay or packet loss overhead.
The paper in reference [32] proposes a mixed shared path protection scheme. The scheme defines three types of resources: (a) primary resources that can be used by primary paths, (b) spare resources that can be used by backup paths, and (c) mixed resources that can be shared by both the primary and backup paths. The approach in this paper differs from other shared protection schemes because it allows some primary and backup paths to share the common mixed resources if the corresponding constraints can be satisfied. However, the paper does not consider path restoration delay or packet loss overhead.
Seok et al. proposed in [15] a fault-tolerant multi-path traffic engineering scheme for MPLS networks with the objective of effectively controlling the network resource utilization. The proposed scheme consists of a maximally disjoint multi-path configuration and a traffic rerouting mechanism for fault recovery. When link failures are detected, the proposed mechanism routes the traffic flowing on the failed LSPs onto the available LSPs. Only bandwidth utilization is considered; recovery time and packet loss are not discussed in this work. Moreover, this approach applies the same mechanism used in any other path protection approach, where recovery time and packet loss do occur.
A distributed LSP scheme to reduce the spare bandwidth demand in MPLS networks, proposed in [7], is also based on a pre-established protection scheme. Here the ingress LER groups the incoming LSP, which is carrying the incoming IP packet flows, into several sub-groups, each of which is assigned to a distinct sub-LSP out of K sub-LSPs. Only a single failure is assumed; hence only one backup sub-LSP of the amount 1/K is required. The scheme proposed in [7] does not, however, consider reducing recovery delay and/or packet loss. A path failure results in switching over to the backup path and therefore incurs considerable recovery delay.
Dispersity routing on ATM networks [16] also requires networks with multiple disjoint paths to spread the data from a source over several paths. To provide redundancy, the message is divided into fewer sub-messages than there are paths. Additional sub-messages are constructed as a linear combination of the bits in the original sub-messages (e.g. modulo-2, parity check codes), such that the original message may be reconstructed without receiving all of the sub-messages. For an (N, K) system, N is the number of paths used and the original message is subdivided into K parts, where N > K. For a modulo-2 system, N = K + 1, and the system can only tolerate a single path failure. The missing sub-message creates holes in the codeword whose positions must be known and whose values can be reconstructed from the received sub-messages. To handle multiple failures, N > K + 1; for example, in a (3, 1) system a flooding strategy is used where the entire message is transmitted on all three paths until at least one copy is received, thus requiring more redundant bandwidth.
It is worth noting that we have also proposed, in [30], the use of a threshold sharing scheme to improve the security of MPLS networks. The difference between the two papers is in the application of this threshold sharing scheme. The security requirements and analysis are different from those required for fault tolerance; in other words, data confidentiality and integrity are the main issues considered in [30].
It is seen from the previous related work on MPLS fault tolerance that recovery time, packet loss, and bandwidth utilization are the main service parameters for real-time traffic. However, most of the approaches in the literature focus on reducing working and recovery bandwidth utilization while merely considering the recovery delay. No scheme can provide path protection with no packet loss and no recovery delay except 1+1 protection, at the cost of 100% redundant bandwidth reservation, or dispersity routing, which can handle single failures with lower redundant bandwidth but requires knowledge of the location of the failure.
3. Our approach to fault tolerance in MPLS
This section presents our approach for fault tolerance in MPLS networks. The approach uses a modified version of the (k, n) threshold sharing scheme (TSS) [17] with multi-path routing, wherein k out of n LSPs are required to reconstruct the original message. The threshold sharing scheme is a very well-known concept used to provide security. However, to the best of our knowledge, it has never been employed for providing fault tolerance in networks, particularly MPLS networks.
The idea behind the threshold sharing scheme (TSS) is to divide a message into n pieces, called shadows or shares, such that any k of them can be used to reconstruct the original message. Using fewer than k shares will not help to reconstruct the original message. There exist different ways of implementing TSS [17–19]. For instance, the Adi Shamir polynomial approach [17] is based on Lagrange interpolation for polynomials. The polynomial function is shown in Eq. (1), where p is a prime number, the coefficients a_0, ..., a_{k-1} are unknown elements over a finite field Z_p, and a_0 = M is the original message:

f(x) = a_{k-1} x^{k-1} + ... + a_2 x^2 + a_1 x + a_0 (mod p).    (1)
Using Lagrange linear interpolation, the polynomial function can be represented as follows:

f(x) = Σ_{j=1}^{k} y_{i_j} · Π_{1 ≤ s ≤ k, s ≠ j} (x − x_{i_s}) / (x_{i_j} − x_{i_s}).    (2)
Since y_{i_j} = f(x_{i_j}), 1 ≤ j ≤ k, 1 ≤ i ≤ n, a subset of participants can obtain k linear equations in the k unknowns a_0, ..., a_{k−1}, where all arithmetic is done in Z_p. If the equations are linearly independent, there will be a unique solution, and a_0 will be revealed.
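As a concrete illustration of the original scheme, the following Python sketch (our own illustration, not code from the paper) splits a one-byte secret over Z_257 with random coefficients a_1, ..., a_{k-1}, as in Shamir's original TSS, and reconstructs a_0 from any k shares by Lagrange interpolation at x = 0:

```python
# Illustrative (k, n) threshold sharing over Z_p (Shamir's original scheme,
# with randomly chosen coefficients a_1..a_{k-1}); names are ours.
import random

P = 257  # a prime larger than any secret value

def split(secret, k, n, p=P):
    """Return n shares (x, f(x)); any k of them recover the secret a_0."""
    coeffs = [secret] + [random.randrange(p) for _ in range(k - 1)]
    return [(x, sum(c * pow(x, e, p) for e, c in enumerate(coeffs)) % p)
            for x in range(1, n + 1)]

def reconstruct(shares, p=P):
    """Lagrange interpolation at x = 0 yields f(0) = a_0, the secret."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for s, (xs, _) in enumerate(shares):
            if s != j:
                num = num * (-xs) % p
                den = den * (xj - xs) % p
        # pow(den, p - 2, p) is the modular inverse of den in Z_p
        secret = (secret + yj * num * pow(den, p - 2, p)) % p
    return secret

shares = split(123, k=3, n=4)
assert reconstruct(shares[:3]) == 123   # any k = 3 shares suffice
assert reconstruct(shares[1:]) == 123   # a different subset works too
```

Note that fewer than k shares leave the system of equations underdetermined, which is precisely the secrecy property of the scheme.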
The original TSS is not suitable for network data communication because of its excessive overhead [22], as pointed out in Section 4.1. That is why, in our approach, we modify it such that the coefficients of Eq. (1) are not chosen randomly but are parts of the original IP packet. Fig. 3 gives an architectural view of our approach and shows how the original IP packet is distributed and reconstructed by our technique. Indeed, when an IP packet enters an MPLS ingress router, a distribution process at the ingress router is used to divide, encode, and generate the n share messages that will become the payloads of n MPLS packets. The generated MPLS packet shares are allocated over n disjoint LSPs obtained using multi-path routing [4,20–22] or equal-cost multi-path (ECMP) routing [23]. When an ECMP routing scheme is used, where all the multiple LSPs found have the same cost, at least k MPLS packets are to be received at the same time by the egress router. In the case of other multi-path routing protocols, at least k MPLS packets should be received within the time frame of receiving an MPLS packet from the slowest LSP. Thereafter, the reconstruction process at the egress router generates the original IP packet from the first k MPLS packets received. Notice that the intermediate LSRs are not involved at all in the division and reconstruction of the original packets; their role is limited to routing the MPLS packets they receive towards the egress router. To help better understand our proposed algorithm, the following illustrates the distribution and reconstruction processes through an example.
3.1. An example of the distribution process
Fig. 4 shows an example of how the distribution process applies a (3, 4) modified threshold sharing scheme to an IP packet. The IP packet is first divided into m blocks S_1, S_2, ..., S_m, where each block size L is a multiple of k bytes. The effect of the block size L on packet processing is shown later in Fig. 14. From Eq. (1), it can easily be seen that there are three coefficients, a_0, a_1 and a_2, for a (3, 4) scheme where k = 3. Each block is therefore divided into k equal parts, and these coefficients are assigned values from the block (unlike the original TSS scheme, where a_1 and a_2 are assigned random values). For example, a_0 = 06, a_1 = 28 (1C in hex), and a_2 = 08 for the block S_1. Next, m quadratic equations f(S_j, x), where 1 ≤ j ≤ m, are generated using the three coefficients from each of the m blocks; that is, every block generates a quadratic equation. Each quadratic equation is solved n times using the n different x_i values, 1 ≤ i ≤ n, as agreed between the sender (ingress) and the receiver (egress). Each MPLS packet payload therefore consists of the m encoded values obtained from the m quadratic equations using the same x_i value, as shown in Fig. 4. Each LSP corresponds to one x_i value. It can easily be seen that the size of each MPLS packet payload is m · L/k; in other words, it is equal to the size of the IP packet divided by k.

Fig. 3. Distribution and reconstruction processes.

Fig. 4. Distribution process in the ingress router applying a (3, 4) modified TSS. [The figure shows an IP packet divided into blocks of size L = 3 bytes, e.g. S_1 = 08 1C 06 and S_2 = 4A 1F 0B; the (3, 4) distributor process generates one quadratic per block, f(S_1, x) = (08)x^2 + (28)x + (06) mod 257 and f(S_2, x) = (74)x^2 + (31)x + (11) mod 257, and evaluates it at x_1, ..., x_4, e.g. f(S_1, x = x_1) = 42, f(S_1, x = x_2) = 94, f(S_1, x = x_3) = 162, f(S_1, x = x_4) = 246; each set of results forms one MPLS packet, 1/3 of the total IP packet size, sent over LSP 1 to LSP 4.]
For example, for block S_1 the equation generated is

f(S_1, x) = 8x^2 + 28x + 6 (mod 257).

The above equation is solved for the n different x values as follows:

f(S_1, x = 1) = 42,    f(S_2, x = 1) = 116,
f(S_1, x = 2) = 94,    f(S_2, x = 2) = 112,
f(S_1, x = 3) = 162,   f(S_2, x = 3) = 256,
f(S_1, x = 4) = 246,   f(S_2, x = 4) = 34.
The complexity of the distribution process can be deduced from the explanation above; it is expressed in terms of the original packet size, the size of the blocks used, and the number of LSPs over which the resulting MPLS packets are sent. More precisely, if a is the size of the original IP packet coming into the ingress router, b is the size of the blocks resulting from the division of the IP packet, and c is the number of LSPs used between the ingress and egress routers, then the complexity of the distribution process is O((a/b) · c).
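The distribution step above can be sketched in Python as follows (our own illustration; it assumes x_i = i and p = 257, as in the example, and takes the coefficients directly from each block as the modified TSS does):

```python
# Sketch of the distribution process for a modified (k, n) TSS: unlike the
# original TSS, all k coefficients (a_0, ..., a_{k-1}) come from the IP
# packet block itself, so no random padding is added.
P = 257

def distribute(blocks, k, n, p=P):
    """blocks: list of k-tuples (a_0, a_1, ..., a_{k-1}) taken from the packet.
    Returns n share payloads, one per LSP, each holding m encoded values."""
    assert all(len(b) == k for b in blocks)
    payloads = []
    for x in range(1, n + 1):      # x_i = i, agreed between ingress and egress
        payloads.append([sum(a * pow(x, e, p) for e, a in enumerate(block)) % p
                         for block in blocks])
    return payloads

# Blocks S_1 and S_2 from Fig. 4, written as (a_0, a_1, a_2):
shares = distribute([(0x06, 0x1C, 0x08), (0x0B, 0x1F, 0x4A)], k=3, n=4)
print(shares)   # -> [[42, 116], [94, 112], [162, 256], [246, 34]]
```

The printed payloads match the share values of the worked example: MPLS packet i carries the evaluations at x = i for every block, so each payload is 1/k of the original packet size.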
3.2. An example of the reconstruction process
Now, we consider again the example of Fig. 4 to illustrate how the reconstruction of the original IP packet is done at the egress router. Fig. 5 shows the process after receiving any k of the n MPLS packets of Fig. 4. In the figure, the MPLS packets received from LSP2, LSP3 and LSP4 are considered. Since both the ingress and egress routers use the same polynomial function, the order of the coefficients a_0, a_1, ..., a_{k−1} is already preserved and does not depend on the location of the failure or on the path that has failed.

Fig. 5. Reconstruction process in the egress router applying a (3, 4) modified TSS.
The following three equations are used to obtain the function f(S_1, x) using Lagrange interpolation:

a_0 + a_1(2) + a_2(2^2) = 94 (mod 257), from LSP2,
a_0 + a_1(3) + a_2(3^2) = 162 (mod 257), from LSP3,
a_0 + a_1(4) + a_2(4^2) = 246 (mod 257), from LSP4.
Using Eq. (2), the following is obtained:

f(S_1, x) = 8x^2 + 28x + 6 (mod 257),

where the original values of the coefficients obtained for block S_1 are a_2 = 8 (0x08 in hex), a_1 = 28 (0x1C in hex), and a_0 = 6 (0x06 in hex).
Similarly, the following three equations are used to obtain the function f(S_2, x) using Lagrange interpolation:

a_0 + a_1(2) + a_2(2^2) = 112 (mod 257), from LSP2,
a_0 + a_1(3) + a_2(3^2) = 256 (mod 257), from LSP3,
a_0 + a_1(4) + a_2(4^2) = 34 (mod 257), from LSP4.
Using Eq. (2), the following is obtained:

f(S_2, x) = 74x^2 + 31x + 11 (mod 257),

where the original decimal values of the coefficients for block S_2 are a_2 = 74 (4A in hex), a_1 = 31 (1F in hex), and a_0 = 11 (0B in hex).
Similar to the distribution process, the complexity of the reconstruction process can easily be derived from the previous explanation. It is expressed in terms of the number of MPLS packets required to reconstruct the original IP packet, the number of blocks used, and the complexity of the Lagrange linear interpolation. More precisely, if a is the number of MPLS packets required and b is the number of blocks used, then the complexity of the reconstruction process is O(b · a^3), where the complexity of the Lagrange linear interpolation is O(a^3) according to [29].
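The reconstruction step can be sketched as follows (our own Python illustration): Lagrange interpolation over Z_257 on the k received shares of one block recovers all three coefficients, which are exactly the original block bytes:

```python
# Sketch of the reconstruction at the egress: given k = 3 shares (x_i, y_i)
# of one block, recover the full polynomial a_2 x^2 + a_1 x + a_0 mod 257;
# its coefficients are the bytes of the original block.
P = 257

def recover_block(points, p=P):
    """Lagrange interpolation: return coefficients [a_0, a_1, ..., a_{k-1}]."""
    k = len(points)
    coeffs = [0] * k
    for j, (xj, yj) in enumerate(points):
        # Build the basis polynomial L_j(x) = prod_{s != j} (x - x_s)/(x_j - x_s)
        basis = [1]            # polynomial coefficients, lowest degree first
        den = 1
        for s, (xs, _) in enumerate(points):
            if s == j:
                continue
            den = den * (xj - xs) % p
            # multiply the basis polynomial by (x - xs)
            basis = ([(-xs * basis[0]) % p] +
                     [(basis[i - 1] - xs * basis[i]) % p
                      for i in range(1, len(basis))] +
                     [basis[-1]])
        scale = yj * pow(den, p - 2, p) % p   # modular inverse of den
        for i in range(k):
            coeffs[i] = (coeffs[i] + scale * basis[i]) % p
    return coeffs

# Shares of S_1 received from LSP2, LSP3 and LSP4 (x = 2, 3, 4):
print(recover_block([(2, 94), (3, 162), (4, 246)]))   # -> [6, 28, 8]
```

The recovered coefficients (a_0 = 6, a_1 = 28, a_2 = 8) are the bytes 06 1C 08 of block S_1, as in the worked example; running it on the S_2 shares yields [11, 31, 74] in the same way.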
Now that our approach for fault tolerance has been introduced, and the distribution and reconstruction processes have been illustrated with examples, the following subsections are devoted to some issues related to the proposed approach, namely: the variability of the transfer delay over the communication paths used between the ingress and egress routers, the multiple QoS classes of the traffic over MPLS networks, packet ordering in MPLS networks, the finding of disjoint paths between the ingress and egress routers, and the applicability of the proposed approach to MPLS multicast.
3.3. Considering variable path length
One of the issues related to multi-path routing is the case of uneven path lengths. The transfer delay may differ among the LSPs used to route the MPLS packets because one LSP may be longer than another. So, for the egress router to be able to reconstruct the original IP packet successfully, it should on the one hand possess enough buffers for storing the arriving MPLS packets and on the other hand use a timer to wait for the latest MPLS packet to arrive. The value of the timer is dictated by the slowest LSP used. To formalize this, let us consider an original traffic flow f with n subflows f_1, f_2, ..., f_n. The end-to-end delay for a subflow f_t, t = 1, ..., n, towards the egress node is denoted by d_{f_t,egress} and is calculated as follows:

d_{f_t,egress} = Σ_{(i,j) ∈ LSP_t} d_{ij},    (3)
where d_{ij} is the delay of each link (i, j) in LSP_t.

Hence, the delay for the slowest f_t belonging to f is:

d_{f_slowest} = max_{t=1...n} { d_{f_t,egress} }.    (4)

Therefore, the timer used by the egress router to reconstruct the original IP packet should be at least equal to d_{f_slowest}.
Moreover, the buffer size required at the egress should be large enough to store all the k − 1 subflows received while waiting for the slowest one, as well as any other traffic originating from other IP packets and received before the slowest subflow of f. In other words, the buffer size required for each subflow f_t is:

B_{f_t,egress} = (d_{f_slowest,egress} − d_{f_t,egress}) · b_{f_t},    (5)

and the buffer size needed by the reconstruction process is:

B_{f,egress} = Σ_{∀ LSP_t} B_{f_t,egress},    (6)

where b_{f_t} is the bit arrival rate at the egress node from flow f_t.

Fig. 6. A (2, 3) modified TSS example.
Let us consider an example to further illustrate the calculation of the required buffer size and the value of the timer at the egress router. Fig. 6 shows a network topology with a modified (2, 3) TSS model. The egress node receives shares from three disjoint LSPs (LSP_1, LSP_2 and LSP_3). In this case:

d_{f_1,egress} = Σ_{(i,j) ∈ LSP_1} d_{ij} = d_{ingress,3} + d_{3,5} + d_{5,7} + d_{7,9} + d_{9,egress} = 4 + 5 + 3 + 5 + 3 = 20,
d_{f_2,egress} = Σ_{(i,j) ∈ LSP_2} d_{ij} = d_{ingress,11} + d_{11,13} + d_{13,14} + d_{14,15} + d_{15,egress} = 4 + 5 + 4 + 5 + 3 = 21,
d_{f_3,egress} = Σ_{(i,j) ∈ LSP_3} d_{ij} = d_{ingress,2} + d_{2,4} + d_{4,6} + d_{6,8} + d_{8,10} + d_{10,egress} = 4 + 5 + 5 + 4 + 3 + 4 = 25.

For this example, the value of the timer to be used by the egress router should be at least 25 ms, and the total buffer size equals 1152 bits, as shown in Table 1.

The calculation shown above in Table 1 is only used to show how to calculate the buffering at the egress router for the (2, 3) modified TSS multi-path connection. However, in real networks the bit rate can be very large (i.e., Mega or Giga bits per second). In this case our approach may face a real challenge, as the buffer size can become very large, especially when there are large length variations between the LSPs.
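The timer and buffer sizing of Eqs. (3) through (6) for this example can be sketched as follows (our own Python illustration; the 128 bits/ms arrival rate and the per-link delays are those of the Fig. 6 example):

```python
# Sketch of egress timer/buffer sizing (Eqs. (3)-(6)) for the (2, 3)
# example of Fig. 6; link delays are in ms, arrival rate b_ft in bits/ms.
lsp_delays = {
    "LSP1": [4, 5, 3, 5, 3],       # ingress-3-5-7-9-egress
    "LSP2": [4, 5, 4, 5, 3],       # ingress-11-13-14-15-egress
    "LSP3": [4, 5, 5, 4, 3, 4],    # ingress-2-4-6-8-10-egress
}
rate = 128  # b_ft, assumed identical for all subflows (as in Table 1)

d = {lsp: sum(links) for lsp, links in lsp_delays.items()}        # Eq. (3)
timer = max(d.values())                                           # Eq. (4)
buffers = {lsp: (timer - d_t) * rate for lsp, d_t in d.items()}   # Eq. (5)
total = sum(buffers.values())                                     # Eq. (6)

print(timer, total)   # -> 25 1152
```

The slowest LSP (25 ms) sets the timer; the two faster subflows must be buffered for 5 ms and 4 ms respectively, giving 640 + 512 + 0 = 1152 bits in total.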
3.4. Considering packet ordering

As explained earlier, each IP packet creates n MPLS packets; each MPLS packet is sent over an LSP. The generated MPLS packets are sent in the same order in which the IP packets were received by the ingress router, and therefore the MPLS packets on each LSP will also be received in order if no MPLS packet is lost due to transmission errors. To identify packets lost due to transmission errors, we propose to use sequence numbering. However, in MPLS the shim header is only four bytes long, and its format does not provide space for a packet sequence number. In our approach, we adopted the solution proposed in [28] to provide sequencing for MPLS packets. A control word (CW) can be added to each MPLS packet share payload. In other words, this requires the CW to be carried as the first four bytes of the MPLS share payload, as shown in Fig. 7, where:

Flags (bits 4-7): These bits may be used for per-payload signaling.
FRG (bits 8 and 9): These bits are used when fragmenting an MPLS packet share payload.
Length (bits 10-15): When the path between the ingress and egress nodes includes an Ethernet segment, the MPLS packet shares may include padding appended by the Ethernet data link layer. The length field serves to determine the size of the padding added by the MPLS network, so that only the payload required for the reconstruction process is extracted by the egress router.
Sequence number (bits 16-31): These bits are used for ordering the MPLS packet shares.
Table 1
Buffer allocation required at the egress node.

LSP            d_{f_i,egress} (ms)   max (ms)   Partial buffer (bits)    Total buffer size (bits)
f_{1,egress}   20                    25         (25 - 20) . 128 = 640    1152
f_{2,egress}   21                    25         (25 - 21) . 128 = 512
f_{3,egress}   25                    25         0
Fig. 7. CW as part of MPLS packet payload (layout: Header | CW | Encoded Data; the CW and the encoded data together form the MPLS packet share payload).
S. Alouneh et al. / Computer Networks 53 (2009) 1530-1545
The use of the version field in the CW is summarized as
follows. All IP packets start with a version number that is
checked by LSRs performing MPLS payload inspection
[28]. Therefore, to prevent the incorrect processing of
packets, an MPLS packet payload must not start with the
value 4 (IPv4) or the value 6 (IPv6) in the first nibble, as
payloads beginning with those values are assumed to carry
normal IP packets. In our proposed scheme, the payload of
an MPLS packet share is not an IP packet; it is an encoded
payload produced by the distribution process at the ingress
node. So, to avoid these values at the beginning of an MPLS
packet share payload, the version field has to be given a
different value. The egress node, in turn, is responsible for
checking the content of the CW in the MPLS payload, and
accordingly uses the sequence number value to restore the
order of the MPLS packet shares.
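A minimal sketch of how the egress node might group shares of the same IP packet by CW sequence number before reconstruction, assuming a (k, n) scheme. The buffering strategy and names are illustrative, not the paper's exact mechanism:

```python
from collections import defaultdict

def share_collector(k):
    """Group incoming MPLS packet shares by CW sequence number and
    report when k shares (enough for TSS reconstruction) have arrived."""
    buffers = defaultdict(list)

    def on_share(seq, payload):
        buffers[seq].append(payload)
        if len(buffers[seq]) == k:
            # k of n shares received: hand them to the reconstruction step
            return buffers.pop(seq)
        return None

    return on_share

recv = share_collector(k=2)
print(recv(7, b"share-A"))  # None: only 1 of 2 shares for seq 7 so far
print(recv(8, b"share-X"))  # None
print(recv(7, b"share-B"))  # [b'share-A', b'share-B'] -> reconstruct
```

Shares arriving after the k-th for a given sequence number would simply be discarded, since the packet is already reconstructible.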
3.5. Considering multiple classes of quality of service
In networking, different traffic flows may require different
kinds of treatment. Therefore, our approach should be able
to support multiple classes of QoS. To clarify this point,
consider the example shown in Fig. 8. Two types of traffic
traverse the network. The first type is high-priority traffic,
represented by flows C1 and C2, which are allocated to three
disjoint LSPs, where shares from two LSPs are required to
reconstruct the original data at the egress router. In our
approach, high-priority traffic tolerates neither recovery
delay nor packet loss, and therefore cannot be pre-empted.
The second type is low-priority traffic, represented by flow
C3. This type has no stringent requirements such as recovery
delay or packet loss, and accordingly can be pre-empted. In
other words, the amount of traffic dedicated to C3 on LSP3
can be pre-empted in favor of high-priority traffic C4, which
may belong to another network connection. In this case, C3
can still be reconstructed at the egress node from only two
LSPs, but it is no longer protected if a link/node failure
occurs on either of these two LSPs.
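The pre-emption behaviour described for C3 and C4 can be sketched as a toy admission routine; the data structures and bandwidth figures below are hypothetical, chosen only to mirror the example:

```python
HIGH, LOW = 0, 1  # smaller value = higher priority

def admit(lsp, name, bw, prio):
    """Try to admit flow (name, bw, prio) on an LSP. If capacity is
    short, a HIGH-priority flow may pre-empt LOW-priority flows
    (which tolerate recovery delay and packet loss).
    Returns the list of pre-empted flow names, or None if rejected."""
    used = sum(f["bw"] for f in lsp["flows"])
    free = lsp["capacity"] - used
    if free >= bw:
        lsp["flows"].append({"name": name, "bw": bw, "prio": prio})
        return []
    if prio == HIGH:
        victims = []
        for f in sorted(lsp["flows"], key=lambda f: -f["prio"]):
            if free >= bw:
                break
            if f["prio"] == LOW:
                victims.append(f)
                free += f["bw"]
        if free >= bw:
            for v in victims:
                lsp["flows"].remove(v)
            lsp["flows"].append({"name": name, "bw": bw, "prio": prio})
            return [v["name"] for v in victims]
    return None  # rejected: cannot free enough capacity

# LSP3 carries low-priority C3; high-priority C4 then needs 8 units.
lsp3 = {"capacity": 10, "flows": [{"name": "C3", "bw": 6, "prio": LOW}]}
print(admit(lsp3, "C4", 8, HIGH))  # ['C3']: C3 pre-empted in favor of C4
```

After pre-emption, C3 would keep running on its remaining LSPs unprotected, exactly as in the Fig. 8 scenario.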
3.6. Finding n disjoint paths
As indicated previously, our approach builds on multi-path
routing. It assumes that n disjoint paths are available
through the MPLS network to carry the traffic. A number of
approaches have been proposed for finding such disjoint
LSPs. For instance, Bhandari [4] proposed an algorithm that
finds n > 2 disjoint paths by extending the modified
Dijkstra approach that finds n = 2 disjoint paths: the n > 2
disjoint paths are obtained from n iterations of the modified
Dijkstra algorithm on a graph that is modified at the end of
each iteration. Other approaches for deriving multiple
disjoint paths in a network can be found in [23-26].
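For illustration, a simple greedy variant (find a shortest path, remove its links, repeat) is sketched below. Note that this is weaker than Bhandari's algorithm [4]: the greedy method can fail on "trap" topologies where the first shortest path blocks all remaining disjoint ones, which is exactly what the graph-modification step of the modified Dijkstra approach avoids:

```python
from collections import deque

def bfs_path(adj, src, dst):
    """Shortest path by hop count via BFS; returns a node list or None."""
    prev = {src: None}
    q = deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj.get(u, set()):
            if v not in prev:
                prev[v] = u
                q.append(v)
    return None

def greedy_disjoint_paths(adj, src, dst, n):
    """Up to n link-disjoint paths: remove each found path's links
    (in both directions) and search again."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    paths = []
    for _ in range(n):
        p = bfs_path(adj, src, dst)
        if p is None:
            break
        paths.append(p)
        for u, v in zip(p, p[1:]):
            adj[u].discard(v)
            adj.get(v, set()).discard(u)
    return paths

# Toy topology with three link-disjoint s-t paths via a, b and c.
adj = {"s": {"a", "b", "c"}, "a": {"s", "t"}, "b": {"s", "t"},
       "c": {"s", "t"}, "t": {"a", "b", "c"}}
print(len(greedy_disjoint_paths(adj, "s", "t", 3)))  # 3
```

For node-disjointness rather than link-disjointness, one would remove the intermediate nodes of each found path instead of only its links.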
When n disjoint LSPs are not available, maximally disjoint
paths should be found between the ingress-egress pair,
where one or more links may be shared between two or more
LSPs. When n fully disjoint paths are used, the method
guarantees full protection from failures. However, if the
method builds on n maximally disjoint paths, then a failure
on a shared link is not protected against unless packets are
still received from at least k paths. To illustrate the
protection power of our approach, consider the network in
Fig. 9 and suppose that a (2, 3) TSS scheme is used. Here,
we need to set up 3 LSPs, and the egress router must receive
packets from at least 2 LSPs to reconstruct the original IP
packet. However, the topology in the figure does not offer 3
disjoint LSPs but 3 maximally disjoint LSPs; the link
between LSR1 and LSR2 is shared between the second and the
third LSPs. Hence, if a failure occurs on this link, the
egress router will fail to reconstruct the original IP
packet, and the path protection is at risk. If a failure
occurs anywhere else in the network, however, the egress
router will still be able to reconstruct the original IP
packet, and the path protection is guaranteed.
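The failure tolerance of a (k, n) scheme can be summarized numerically. The bandwidth factor below assumes each share carries roughly 1/k of the original IP packet, as in the modified TSS distribution; the function is our own illustration:

```python
def protection_level(n, k):
    """A (k, n) threshold scheme over n disjoint LSPs tolerates up to
    n - k simultaneous path failures; total bandwidth is roughly n/k
    times the original (n shares, each about 1/k of the packet)."""
    assert 1 <= k <= n
    return {"tolerated_failures": n - k, "bandwidth_factor": n / k}

print(protection_level(3, 2))  # {'tolerated_failures': 1, 'bandwidth_factor': 1.5}
```

Increasing n for a fixed k thus buys protection against more simultaneous failures at the cost of proportionally more redundant bandwidth.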
It is worth noting that our approach can tolerate multiple
path failures by increasing the number n of disjoint paths.
The following shows the level of protection obtained for
different values of n and k.
Level of protection