
Aalto University

School of Science and Technology


Faculty of Information and Natural Sciences
Degree programme of Computer Science and Engineering

Jyrki Oraskari

The Performance of Open Message-Oriented Middleware Protocols
in Smart Space Access

Master’s thesis submitted in partial fulfilment of the requirements for the degree
of Master of Science in Technology.

Espoo, 10 June 2010

Supervisor: Heikki Saikkonen, Professor


Instructor: Seppo Törmä, D.Sc. (Tech)

Aalto University
School of Science and Technology
Faculty of Information and Natural Sciences
Degree programme of Computer Science and Engineering

ABSTRACT OF MASTER’S THESIS

Author: Jyrki Oraskari


Name of the work:

The Performance of Open Message-Oriented Middleware Protocols in Smart Space Access

Date: Espoo, June 10, 2010 Number of pages: 126


Professorship: T-106 Software Technology
Supervisor: Heikki Saikkonen, Professor
Instructor: Seppo Törmä, D.Sc. (Tech)

The background of this research is a device interoperation solution in which an event-based RDF store
is used for information sharing between agents in a smart space. When the number of agents
increases, the performance of the RDF store becomes a bottleneck. In this study, I concentrate on the
performance of the application-level network communication of the agents and an RDF store.
This study addresses the following research questions. How can message systems be used to access a
smart space system that is implemented using an RDF store? What advantages does the message system
implementation offer in relation to direct TCP communication? What differences are there in the
performance of the different open message-oriented middleware protocols?
In the review of the literature, concepts that are closely related to messaging systems are presented.
Event-based systems and the notion of loose coupling are described in detail.
Ways in which an existing semantic broker system, Smart-M3, can be adapted to communicate through
a message system using the open protocols are presented. The protocols studied are AMQP, XMPP,
OpenWire, and Stomp.
The implementations that use these different protocols are compared, with attention being paid to the
effectiveness, the maintainability of the implementation, and the observed network communication
errors. The effect that narrow bandwidth has on the performance of the system is also shown.
To summarise the results briefly, it was possible to add a messaging system to the system in a
transparent way. A survey of the open message-oriented middleware protocols and their
implementation alternatives was conducted, and the alternative ways to connect the messaging systems
to Smart-M3 were studied. Performance comparisons were made. The main result was that all the
messaging system implementations outperformed the pure TCP implementation. In particular, Stomp was
found to be the best of the alternatives that were studied.
Keywords: AMQP, XMPP, OpenWire, Stomp, JMS, Message-Oriented Middleware, RDF, Smart space
Language: English

Aalto University
School of Science and Technology
Faculty of Information and Natural Sciences
Degree programme of Computer Science and Engineering

ABSTRACT OF MASTER’S THESIS (DIPLOMITYÖN TIIVISTELMÄ)

Author: Jyrki Oraskari

Name of the work:
Avointen viestijonoprotokollien suorituskyky älykkään tilan kommunikaatiossa
(The performance of open message queue protocols in smart space communication)

Date: 10 June 2010 Number of pages: 126

Professorship: T-106 Software Technology
Supervisor: Heikki Saikkonen, Professor
Instructor: Seppo Törmä, D.Sc. (Tech)

The background of this research is a device interoperability solution in which an event-based RDF
store is used for sharing information between agents in smart environments. When the number of
agents increases, the performance of the RDF store becomes a bottleneck. In this work, I concentrate
on the performance of the application-level network communication between the agents and the store.
The work focuses on the following research questions: How can message-oriented middleware be used
for the communication of the agents of a smart space that is implemented using an RDF store? What
advantages does a message system implementation offer compared with plain TCP/IP communication?
What differences are there between the open message-oriented middleware protocols from the
performance point of view?
The theory part of the thesis presents the concepts closely related to messaging systems, such as
event-based operation, message-oriented middleware, and the notion of loose coupling. The open
protocols of messaging systems are described, and the semantic information broker is briefly introduced.
The work presents how the Smart-M3 RDF store is modified to convey messages through the ActiveMQ
and RabbitMQ message-oriented middleware using open messaging system protocols such as AMQP,
XMPP, OpenWire, and Stomp.
The implementations that use the different protocols are compared with regard to efficiency, ease of
implementation, and the errors observed in the network communication. The effect of narrow
bandwidth on the efficiency of the system is also presented.
The results in brief: it was possible to connect the messaging system to the system in a transparent
way. The open protocols and their implementation alternatives were surveyed, and the alternative ways
of connecting these messaging systems to Smart-M3 were studied. Based on the performance
comparisons made in the work, the messaging system implementations were found to perform better
than the TCP implementation. Of the protocols examined, Stomp was found to be the best.
Keywords: AMQP, XMPP, OpenWire, Stomp, JMS, message-oriented middleware, RDF, Smart space
Language: English

ACKNOWLEDGEMENTS

I wish to express my sincere appreciation and gratitude to my supervisor,
Professor Heikki Saikkonen, my instructor, D.Sc. Seppo Törmä, and his colleague D.Sc.
Esko Nuutila. Your encouragement, guidance, and feedback made my thesis possible.

I would also like to take this opportunity to thank my colleagues at Aalto University.
Jukka Honkola gave invaluable help by clarifying the basic facts of the Smart-M3
technique. Special mentions go to Taina Hyppölä for organising constant support for
the thesis writing, my employer Capgemini for flexibility, and Mrs. Ruth Vilmi for help
with the English language. Lastly, my special thanks go to my friends and parents for
their irreplaceable support.

Jyrki Oraskari
Thursday, 10 June 2010
Espoo, Finland

ABBREVIATIONS AND ACRONYMS

AMQP Advanced Message Queuing Protocol
API Application Programming Interface
JMS Java Message Service
MOM Message-oriented middleware
RDF Resource Description Framework, a group of World Wide Web Consortium specifications for the conceptual description of information that is implemented in web resources
RFC Request for Comments
SSAP Smart Space Access Protocol
Stomp Streaming Text Oriented Message Protocol
URI Uniform Resource Identifier
W3C World Wide Web Consortium
XML Extensible Markup Language
XMPP Extensible Messaging and Presence Protocol

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABBREVIATIONS AND ACRONYMS
LIST OF FIGURES
1 INTRODUCTION
1.1 Background
1.2 Research problem and objectives of the thesis
1.3 Scope of the thesis
1.4 Thesis Outline
2 REVIEW OF LITERATURE
2.1 Event-Based Systems
2.2 Middleware
2.3 Message-oriented Middleware
2.4 Java Message Service
2.4.1 The Structure of the Message
2.4.2 The Publish/Subscribe Mode of Operation
2.4.3 Parameters and Performance
2.4.4 JMS Benchmarks
2.5 Open Message-Oriented Middleware Protocols
2.5.1 AMQP – Advanced Message Queuing Protocol
2.5.2 XMPP – Extensible Messaging and Presence Protocol
2.5.3 STOMP – Streaming Text Oriented Message Protocol
2.6 Smart Spaces
2.7 Semantic Web
2.8 RDF Store
2.9 Loosely Coupled Systems
3 The System Software and Available Libraries
3.1 The Message Service Providers
3.1.1 ActiveMQ
3.1.2 RabbitMQ
3.1.3 Apache Qpid
3.1.4 OpenAMQP
3.2 Libraries to connect to the Message Systems
3.3 Nokia Smart-M3 Interoperability Platform
4 The Architectural Solution for Test Arrangements
4.1 The architectural choice of SIBMQ
4.2 The Resulting Design
5 Tests and Methods
5.1 The General Picture
5.2 Estimating the Performance
5.2.1 The Performance Model of the System
5.2.2 The Benchmark
5.2.3 The Benchmark Tests
5.2.4 The Network Data
5.2.5 Evaluating the effect of the Network Bandwidth
5.2.6 Profiling the Software Solutions
5.2.7 The Mixed Topic and Queue Based Solution
5.3 The Maintainability of the Code
6 Results and Conclusion
7 Discussion
8 BIBLIOGRAPHY
Appendix A: The NodeMQ Stomp interface implementation
Appendix B: The NodeMQ OpenWire interface implementation
Appendix C: The NodeMQ XMPP interface implementation
Appendix D: The NodeMQ AMQP interface implementation
Appendix E: The effect of the Network Bandwidth
Appendix F: Time Series of Stomp Measurements

LIST OF TABLES

Table 1 The chosen set of technical solutions
Table 2 The results of the Welch's t test
Table 3 The time the system takes to handle the sample messages
Table 4 The error rates of the test runs
Table 5 The cumulative runtime with the MQ interface methods
Table 6 The effect of the Network Bandwidth

LIST OF FIGURES

Figure 1 The concepts of an event-based system
Figure 2 The handlers design patterns (Ferg 2006)
Figure 3 The message system pattern (Ferg 2006)
Figure 4 The Publish/Subscribe message dispatcher pattern (Ferg 2006)
Figure 5 The concept of middleware (Bernstein 1996)
Figure 6 The concept of middleware
Figure 7 The Message-Oriented Middleware concept
Figure 8 The concept of JMS message delivery (Barcia 2003)
Figure 9 The transaction options decision tree (Malani 2002)
Figure 10 Pub/Sub one-to-one model: 100 byte messages (Gaddah, Kunz 2006)
Figure 11 Pub/Sub few-to-many model (Gaddah, Kunz 2006)
Figure 12 Pub/Sub many-to-few model (Gaddah, Kunz 2006)
Figure 13 AMQP protocol stack (O'Hara 2007)
Figure 14 AMQP Semantic Model (O'Hara 2007)
Figure 15 The concept of Smart Space
Figure 16 The Semantic Web Pyramid of Languages
Figure 17 The RDF triple
Figure 18 The Nokia Python API implementation
Figure 19 The connector class interface
Figure 20 The basic Python Knowledge Processor communication sequence
Figure 21 The repeating communication pattern
Figure 22 The proposed message queue implementation
Figure 23 The extended connector interface of the Python KP API
Figure 24 The Open MOM Protocol Interfaces
Figure 25 SSAP messages can be set to a queue as a TextMessage
Figure 26 The equivalence of the implementations
Figure 27 The structure of SIB-TCP
Figure 28 SIBMQ using the SIB-TCP connection
Figure 29 The concept graph of the implementation choices
Figure 30 The minimal resolution set
Figure 31 The visualization of the whole resolution set
Figure 32 The final architecture
Figure 33 The layers of the test environment
Figure 34 The ActiveMQ test arrangements
Figure 35 The RabbitMQ test arrangement
Figure 36 The Stomp arrangement
Figure 37 The OpenWire arrangement
Figure 38 The XMPP arrangement
Figure 39 The AMQP arrangement
Figure 40 The average benchmark times
Figure 41 The extended benchmark test
Figure 42 The network data count
Figure 43 The network test arrangement
Figure 44 The class diagram of the TCP mediator tester
Figure 45 The regression line of the Stomp test arrangement
Figure 46 The regression line of the OpenWire test arrangement
Figure 47 The regression line of the XMPP test arrangement
Figure 48 The regression line of the AMQP test arrangement
Figure 49 The regression line of the reference test arrangement
Figure 50 The comparison of the amount of data sent per unit of information to the derivatives of the estimated trend lines
Figure 51 The trend lines of the test sets
Figure 52 The trend lines when the delay is set below 10 milliseconds
Figure 53 The queue topic solution and fixed number of threads
Figure 54 The effect the number of threads has on the alternative Stomp Solution
Figure 55 The comparison of the Stomp solutions
Figure 56 The Pymetrics Constructive Cost Model II (COCOMO II) cost estimation metrics of the NodeMQ implementations
Figure 57 The McCabe Cyclomatic complexity metrics of the NodeMQ implementations
Figure 58 The SLOCCount person-months estimates of the amount of work needed for the Node.py interface changes
Figure 59 The SLOCCount person-months estimates of the changes compared to the original Node.py
Figure 60 The SLOCCount person-months estimates of the SIBMQ implementations
Figure 61 The total person-months estimate of the test arrangements
Figure 62 The summary of the results
Figure 63 The selected implementation
Figure 64 The benchmark time series of a Stomp test arrangement. A message queue implementation

1 INTRODUCTION

Modern applications are not separate islets, as PC applications used to be years ago.
People around the world want to share their experiences, their moods, and their lives.
Programs are increasingly accessible online, which enables information to be shared
not only between people but also among diverse sets of applications. The applications
need to understand the information without any uncertainty. The limitations of real
life will have their effects, but one must know how to tack gracefully past the rocks.

The mobile application environment has its challenges. The physical limitations cannot
always be overcome. Battery life continues to limit the use of the devices and, although
they have an almost PC-level network connection, under unusual conditions it can slow
down or break down totally. Furthermore, although small gadgets are more
powerful than ever, increasingly fancy media-rich animated interfaces, concurrent
use, and the huge amount of simultaneously available information easily consume the
new capacity.

At a time when the number of available services is increasing, nomadic users need
to connect to them dynamically and combine them in a meaningful way. The
dynamic environment has to be flexible and efficient in order to handle the
ever-changing topology of the system. This work addresses some of the questions raised.

As applications provide most of their content by refining and combining data from
other resources, the quality of the connection between providers plays a vital role in the
user experience. Here the connection can be seen not only as a physical network line
but also as a logical abstraction. The data should be available just in time, in a form that
is usable for the user. If a component is used to convey data between applications, its
performance is a fundamental aspect.

Local geographical areas that are full of smart devices capable of communicating are
called smart spaces. This work shows that a message system can give better message
throughput for applications using smart spaces. Moreover, it is shown that the
throughput depends significantly on the protocol used.

1.1 Background

Smart space applications presume that a dynamic set of agents whose identities are not
known in advance are capable of sharing information with each other. The required
communication can be greatly facilitated by suitable smart space middleware. The DIEM
and SOFIA/Artemis projects¹ have developed a smart space middleware solution called
Smart-M3, whose core is an event-based RDF store. However, the use of a shared RDF
store and the RDF notation creates performance challenges.

Xiaohang Wang et al. evaluated the performance of the Jena2 RDF toolkit in the smart
space context, concentrating on the performance of RDF stores and context
queries. Their studies (Wang, Dong et al. 2004) show that growth in the size of the
RDF knowledge base increases the response time to an extent that is perceptible to
humans. Wilkinson et al. (2003) studied the effect that the Jena version, database
normalisation, and the state of the hardware cache have on the performance of the Jena
RDF toolkit. They found that the new version is consistently faster and benefits from
normalisation, and that the hardware cache has a significant effect on the performance.

Already at the beginning of this century, some studies of the performance of message
systems were made by Prakash Malani et al. (2002) and Umar Farooq et al. (2004).
However, to this author’s knowledge, neither the middleware messaging protocols,
such as XMPP and AMQP, nor smart spaces have been used in tests of this kind before.
That makes this study the first of its kind.

The Smart-M3 Semantic Information Broker is an open-source RDF store whose key
role in a smart space is to offer a coherent shared view of the resources and relevant
information in the environment. The communication between application clients and
the store takes place over TCP sockets. This access method has performance limitations,
which are addressed in this thesis.

¹ DIEM (Devices and Interoperability Ecosystem) is a large collaborative research project
coordinated by Tivit and funded by Tekes (2008-2010). SOFIA is a project funded through the
European Artemis programme under the subprogramme SP3 Smart environments and scalable
digital services (2009-2011).
1.2 Research problem and objectives of the thesis

The aim of this study is threefold. The main research questions in this work are the
following:
1. How can messaging systems be used to access a smart space system that is
implemented using an RDF store?
2. What advantages does the message system implementation offer in relation to
direct TCP communication?
3. What differences are there in the performance of the different open
message-oriented middleware protocols?

The first hypothesis is that forwarding messages through a messaging system can
increase the throughput of the messages that are sent between clients and smart spaces
during their dialogue. The second hypothesis is that the open message-oriented
middleware protocol that is used has a significant effect on the performance.

An important objective of this thesis is to determine the performance characteristics of
the publicly available and widely used open message-oriented middleware
protocols and the differences in performance between them.

To be practical, a further objective is to apply the findings to propose a software
architecture that has the best performance of the candidates.
It is presumed that the proposed architecture will promote non-blocking interfaces
in which the number of sockets that are opened or closed is reduced. It is postulated
that the system will be truly event-based, without any polling.

1.3 Scope of the thesis

The focus of the project is on the upper abstraction level of a smart space. That smart
space consists of agents and a semantic information broker. In the project, the
semantic broker is based on the use of an RDF store.

In this thesis, I concentrate on smart spaces that use Smart-M3 as an RDF store to
produce a consistent view of the resources and relevant information in the
environment for the entities in the space.

The solution is based on the assumption that the client applications are mainly written
in the Python programming language. That is why the Python Knowledge Module of
Smart-M3 has a major role here. However, to make development easy on a wider variety
of programming platforms, non-Python programs should also be able to use the
solution presented.

1.4 Thesis Outline

This thesis is organised as follows. The concepts of event-based systems,
message-oriented middleware, JMS, and the open message-oriented middleware protocols are
presented in Chapter 2. Chapter 3 presents the techniques available for implementing
the test arrangements. Chapter 4 explains how I arrived at the solution
that is presented and introduces the design criteria and the architectural solution of the
test arrangements. Chapter 5 describes the test arrangements and the
benchmark used, introduces a performance model of the software architecture that was
designed, and presents the preliminary test results. In Chapter 6, I
analyse the test results shown in the previous chapter and present a recommended
implementation alternative. Finally, in Chapter 7, the thesis is
summarised.

2 REVIEW OF LITERATURE

In this thesis, I study possible ways to use message-oriented middleware to access a
smart space system that is implemented using an event-based RDF store. Therefore, in
this chapter, I describe what message-oriented middleware is. As it is both event-based and
middleware, I start with a review of these two concepts. I continue by describing the Java
Message Service (JMS), a standard API that is widely used in current message-oriented
middleware systems and has a significant role in them. The performance
aspects of JMS are described, as performance is one of the major objectives of the
thesis.

I have selected a set of protocols that I consider representative. I review them in this
chapter.

I describe the concepts of smart space, semantic web, and RDF store. The semantic web
is clarified since it is closely related to the RDF store.

Finally, loose coupling is reviewed for two reasons. Firstly, messaging systems are
considered loosely coupled, and I clarify what that means. Secondly, loose coupling is related
to the quality of the design, and in this thesis I design an architecture that uses a messaging
system.

2.1 Event-Based Systems

This subchapter introduces the basic concepts of event-based systems mainly based on
the treatment in 2006).

Event-based systems have software components that can act as producers or
consumers of notifications. A notification reifies the fact that something of interest, an event, has
occurred in the system. The notifications are conveyed in messages to the event
notification service of the system.

A consumer can register its interests by submitting a subscription to the event
notification service. A subscription is a filter that defines the set of notifications that the
consumer is interested in. The service is responsible for conveying a notification to the
consumer when a new notification arrives that matches the criteria of the
subscription filter.
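
The roles described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the Smart-M3 implementation: the class name EventNotificationService, its methods, and the dictionary-based notification format are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Notification = Dict[str, object]
Filter = Callable[[Notification], bool]
Callback = Callable[[Notification], None]


class EventNotificationService:
    """Hypothetical service: conveys notifications to consumers whose filter matches."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[Filter, Callback]] = []

    def subscribe(self, matches: Filter, deliver: Callback) -> None:
        # A subscription is a filter plus the consumer's delivery callback.
        self._subscriptions.append((matches, deliver))

    def publish(self, notification: Notification) -> int:
        # Deliver to every matching subscription and return the delivery count.
        delivered = 0
        for matches, deliver in self._subscriptions:
            if matches(notification):
                deliver(notification)
                delivered += 1
        return delivered


service = EventNotificationService()
received: List[Notification] = []

# The consumer registers an interest in temperature notifications above 25 degrees.
service.subscribe(
    lambda n: n.get("kind") == "temperature" and n.get("value", 0) > 25,
    received.append,
)

# A producer reifies two events as notifications; only the first matches the filter.
service.publish({"kind": "temperature", "value": 30})
service.publish({"kind": "temperature", "value": 20})
```

The producer never refers to the consumer directly; the filter held by the service is the only link between them.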

Figure 1 shows a summary of the event-based system concepts.

Figure 1 The concepts of an event-based system

The first computing paradigm was batch programming. In that paradigm, computing
systems were seen as central processing units that take in data from an input stream,
process them, and put the result in an output stream. The units could be interlinked by
connecting the output streams to the inputs of other units. In 1968, Yourdon and
Constantine stated that some of the processing units could serve as transaction centres
that were able to analyse transactions, dispatch them, and complete the processing of
each of them. They considered transactions to be triggered when any element of data, a
control, a signal, an event, or a change of state was sent to that unit. (Ferg 2006)

In contrast to batch processing, which follows a predefined control flow, event-based
programs mainly react to external events. Batch programs start, do something, and
stop at the end, whereas event-based programs can wait for events indefinitely, until an
event signals them to stop.
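
The contrast can be illustrated with a small Python sketch. The function names and the use of the string "stop" as the terminating event are assumptions made for this example only.

```python
import queue
from typing import List


def run_batch(inputs: List[str]) -> List[str]:
    """Batch style: take the whole input, process it, and finish."""
    return [item.upper() for item in inputs]


def run_event_loop(events: "queue.Queue[str]") -> List[str]:
    """Event-based style: wait for events and react until a stop event arrives."""
    handled = []
    while True:
        event = events.get()  # blocks here, possibly forever, until an event arrives
        if event == "stop":   # the only way out of the loop is an event
            break
        handled.append(event.upper())
    return handled


events: "queue.Queue[str]" = queue.Queue()
for name in ("door opened", "light on", "stop"):
    events.put(name)

batch_result = run_batch(["door opened", "light on"])
loop_result = run_event_loop(events)
```

Both functions do the same work on each item; the difference lies in who decides when the program ends.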

In event-driven programming, the main concepts are events, event signals, event
handlers, and event generators. A distinction between events and event signals is not
always made. The event is the message itself, whereas an event signal is an indication
that an event has happened. Signal handling can be seen as an instance of a general
message-handling pattern. Usually, the operating system or the runtime system
performs the event-dispatching task and programs just receive the events in their
handlers. That is why the pattern is called the Headless Handlers Pattern (Figure 2).
(Ferg 2006)
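
The dispatching idea can be sketched in a few lines of Python. This is only an
illustrative sketch: the names Dispatcher, register, and dispatch, as well as the event
dictionaries, are assumptions made for this example and are not taken from Ferg (2006).

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class Dispatcher:
    """Routes each event to the handlers registered for its type."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def register(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def dispatch(self, event: dict) -> int:
        """Call every handler registered for the event type; return how many ran."""
        handlers = self._handlers.get(event["type"], [])
        for handler in handlers:
            handler(event)
        return len(handlers)


# In the headless variant, the operating system or runtime owns the
# dispatcher and the event loop; the program supplies only the handlers.
pressed_keys: List[str] = []
dispatcher = Dispatcher()
dispatcher.register("key_press", lambda event: pressed_keys.append(event["key"]))

for event in [{"type": "key_press", "key": "a"}, {"type": "mouse_move"}]:
    dispatcher.dispatch(event)

print(pressed_keys)  # ['a']
```

The mouse_move event has no registered handler and is silently ignored, which is one
possible policy among several.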

Figure 2 The handlers design patterns: the Handlers pattern and the Headless Handlers
pattern (Ferg 2006)

Stephen Ferg (2006) regards messaging systems as a highly evolved form of the
Handlers Pattern used in event-driven programs. The difference from
ordinary event-based programs is that in messaging systems, messages can be sent
to different locations, between applications, and across different software platforms. In
general, a message-based system functions like a post office. The sender addresses
messages to a named receiver. The principle of the message system pattern is shown in
Figure 3.

Figure 3 The message system pattern (Ferg 2006)

At the extreme end of the evolution, the receivers state which types of messages they
are interested in. No addresses are used. An analogy to this is an ordinary newspaper
subscription (Figure 4). (Ferg 2006)

Subscriptions can be topic-based or content-based. The clients that use a topic-based
subscription connect to a named topic, whereas, when content-based subscriptions
are used, the client specifies its interests and submits them to the event notification
service. That specification is typically a query run at that service. The subscriptions of
the Smart-M3 RDF store can be seen as an example of that.
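
The difference between the two subscription styles can be illustrated with a minimal
sketch. The class NotificationService and its method names are hypothetical and do not
describe the Smart-M3 interface; a plain Python predicate stands in for the query of a
content-based subscription.

```python
from typing import Callable, Dict, List, Tuple

Notification = Dict[str, object]
Callback = Callable[[Notification], None]
Predicate = Callable[[Notification], bool]


class NotificationService:
    """Toy event notification service with two subscription styles."""

    def __init__(self) -> None:
        self._topic_subs: Dict[str, List[Callback]] = {}
        self._content_subs: List[Tuple[Predicate, Callback]] = []

    def subscribe_topic(self, topic: str, callback: Callback) -> None:
        # Topic-based: the client names a channel and receives everything on it.
        self._topic_subs.setdefault(topic, []).append(callback)

    def subscribe_content(self, predicate: Predicate, callback: Callback) -> None:
        # Content-based: the client hands over a query that the service evaluates.
        self._content_subs.append((predicate, callback))

    def publish(self, topic: str, notification: Notification) -> None:
        for callback in self._topic_subs.get(topic, []):
            callback(notification)
        for predicate, callback in self._content_subs:
            if predicate(notification):
                callback(notification)


service = NotificationService()
by_topic: List[Notification] = []
by_content: List[Notification] = []
service.subscribe_topic("temperature", by_topic.append)
service.subscribe_content(lambda n: n.get("value", 0) > 30, by_content.append)

service.publish("temperature", {"sensor": "s1", "value": 21})
service.publish("humidity", {"sensor": "s2", "value": 55})
```

After the two publications, the topic subscriber holds only the temperature notification,
while the content subscriber holds only the notification whose value exceeds 30,
regardless of the topic it was published on.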

A type-based subscription can be seen as a subtype of the content-based subscription.
Events are filtered by the type of an event. The events are objects and can have a class
hierarchy, as in object-oriented programming. One benefit is type safety, which can
be promoted by parameterising the resulting abstraction interface with the event type
at compile time (Eugster, Felber et al. 2003). (Eugster 2007)
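
A minimal sketch of type-based filtering follows. The event classes and the
TypeBasedService class are hypothetical names chosen for this example. Python checks
the types only at run time, so the compile-time safety mentioned above would require a
statically typed language or an external type checker.

```python
from typing import Callable, List, Tuple, Type


class Event:
    """Root of the event class hierarchy."""


class SensorEvent(Event):
    def __init__(self, value: float) -> None:
        self.value = value


class TemperatureEvent(SensorEvent):
    pass


class DoorEvent(Event):
    pass


class TypeBasedService:
    def __init__(self) -> None:
        self._subs: List[Tuple[Type[Event], Callable[[Event], None]]] = []

    def subscribe(self, event_type: Type[Event], callback: Callable[[Event], None]) -> None:
        self._subs.append((event_type, callback))

    def publish(self, event: Event) -> None:
        # A subscription to a class also matches all of its subclasses.
        for event_type, callback in self._subs:
            if isinstance(event, event_type):
                callback(event)


service = TypeBasedService()
sensor_events: List[Event] = []
temperature_events: List[Event] = []
service.subscribe(SensorEvent, sensor_events.append)
service.subscribe(TemperatureEvent, temperature_events.append)

service.publish(TemperatureEvent(21.5))
service.publish(SensorEvent(3.0))
service.publish(DoorEvent())
```

The subscriber to SensorEvent receives both sensor events, the subscriber to
TemperatureEvent receives only the first one, and the DoorEvent reaches neither.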

Figure 4 The Publish/Subscribe message dispatcher pattern (Ferg 2006)

2.2 Middleware

The heterogeneity of the application environment places a heavy burden on developers
and makes implementing enterprise-wide information systems hard. To solve this
problem, vendors have offered distributed services with standard programming
interfaces and protocols. As these services sit between the applications and the
platforms, they are called middleware (Figure 5). Philip Bernstein, an architect at the
Microsoft Corporation, has clarified the concept and properties of middleware in his
article (Bernstein 1996). By platforms, he refers to the low-level services of operating
systems and the processor architecture.

Figure 5 The concept of middleware (Bernstein 1996)

Figure 6 shows a summary of the concept of middleware. The discussion below is based
on Coulouris, Dollimore et al. (2005). They state that middleware refers to a software
layer that offers a programming abstraction, i.e. it provides an application programming
interface and it masks the underlying details.

Middleware provides location transparency and a uniform computational model. It masks
the heterogeneity of the computer hardware, the implementations, networks, and
programming languages.

As Hanslo and MacGregor clarify, middleware is broadly used to address the questions that are
raised concerning the integration of incompatible components (Hanslo, MacGregor
2004). Middleware helps programming by offering a homogeneous view of a world that
can consist of distributed applications and many operating systems (Mühl, Fiege et al.
2006). On the other hand, Malani (Malani 2004) states that the purpose of middleware
is to provide an interface for applications to access networked resources and services.

Figure 6 The concept of middleware

Middleware services share many common properties but not all of them are required
for a system to be categorised as middleware. For example, Bernstein states that
middleware systems usually offer a standard API but, if a proprietary protocol is broadly
used and published, it can be considered enough. Likewise, ideally, broad platform
coverage is offered but if the system is ubiquitous in practice, that is not necessary.
(Bernstein 1996)

Andrahennadi et al. (2008) classify the types of middleware on the basis of scalability
and recoverability. The categories are Remote Procedure Calls (RPC), Object Request
Brokers (ORB), SQL-Oriented Data Access, and Message-Oriented Middleware (MOM).
Malani, for her part, stated that there are many categories but named only
transaction-oriented middleware and MOM (Malani 2004). Mühl et al. (2006), in turn,
see a trend from the fixed data and client-server approach towards extensions of
existing middleware that provide asynchronous messaging and decoupling.

2.3 Message-oriented Middleware

Message-Oriented Middleware (MOM) is also called a messaging system (Deitel, Deitel et al.
2002). The basic concept of MOM is to construct a
message, to pass it to the messaging middleware, and to receive it from the
middleware (Kuo, Palmer 2001) (Figure 7). Message-Oriented Middleware provides
two types of communication. The point-to-point mode is usually queue-based, while
publish/subscribe can be content-based or topic-based. The nature of the communication
is asynchronous.

Point-to-point is used when the sender addresses a message to a queue and one,
usually known, client is connected to that queue. Publish/subscribe offers a more
loosely coupled approach. Clients can subscribe to a named topic without the
publisher knowing about it. In general implementations, publish/subscribe has its
limitations, as keywords are used to match advertisements and subscriptions. (Schuldt 2008) (Li,
Jiang 2004)
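
The two communication modes can be contrasted with a small sketch. The Broker class
below is a hypothetical toy that runs in a single process; it is not the API of any MOM
product, and it leaves out networking, persistence, and acknowledgements.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class Broker:
    """Toy broker that contrasts queue and topic delivery."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[str]] = {}
        self._topics: Dict[str, List[List[str]]] = {}

    # Point-to-point: each message is consumed by exactly one receiver.
    def send(self, queue: str, message: str) -> None:
        self._queues.setdefault(queue, deque()).append(message)

    def receive(self, queue: str) -> Optional[str]:
        pending = self._queues.get(queue)
        return pending.popleft() if pending else None

    # Publish/subscribe: every current subscriber gets its own copy.
    def subscribe(self, topic: str) -> List[str]:
        inbox: List[str] = []
        self._topics.setdefault(topic, []).append(inbox)
        return inbox

    def publish(self, topic: str, message: str) -> None:
        for inbox in self._topics.get(topic, []):
            inbox.append(message)


broker = Broker()
broker.send("orders", "order-1")
first = broker.receive("orders")    # the message is removed from the queue
second = broker.receive("orders")   # nothing is left for a second receiver

inbox_a = broker.subscribe("prices")
inbox_b = broker.subscribe("prices")
broker.publish("prices", "price-update")
```

The publisher never learns how many inboxes exist for the topic, which is the loose
coupling referred to above.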

Figure 7 The Message-Oriented Middleware concept

In essence, MOM is used to construct distributed systems where the autonomy of
the communicating parts is desirable (Kuo, Palmer 2001).

MOM has an infrastructure for receiving and sending messages. All the handled
messages are queued, which means communication is always asynchronous. A sender
establishes no direct connection to the receiver. MOM offers an Application
Programming Interface to its clients. That interface simplifies the communication
between clients and the broker. Instead of performing the actual transfer itself, the client
application calls the offered library, which makes the network protocol calls. (Schuldt 2008)

Some MOMs can be accessed by using open message-oriented middleware protocols.
Examples of such protocols are AMQP, XMPP, and STOMP.

MOM queues and topics usually relay messages in the FIFO queue manner (Flieder
2005). Mistra (1991) proved that this kind of communication is loosely coupled by
nature. Proof of this is presented in Chapter 2.9.

MOM is middleware. That implies a layer that separates the application code from the
message management. The separation, as well as the asynchronous nature of the
communication, gives the whole concept a loosely coupled character.

2.4 Java Message Service

The first message queue systems were proprietary, which made application users
and developers dependent on the vendors and vendor specifications. With the
introduction of Sun Microsystems’ Java Message Service (JMS), that changed. The
specification gives access to all vendors' message queues through the same API
(Application Programming Interface).

The existing message-oriented middleware systems vary in their capabilities and the
features they offer. Learning and managing all of them would be laborious. To
help with the Java programming task, the authors of the JMS specification selected
the most common subset of the functionalities to simplify the concepts needed. The
objectives of the interface, listed by Farrell (2004), are:

1. Define a common set of messaging concepts and facilities.
2. Minimise the concepts a programmer must learn to use enterprise messaging.
3. Maximise the portability of messaging applications.
4. Minimise the work needed to implement a provider.
5. Provide client interfaces for both point-to-point and pub/sub domains.

“Domains” is the JMS term for the messaging models discussed earlier.

Formally put, the Java Message Service (JMS) consists of a set of interfaces and related
semantics that define how an application program uses the facilities of a message-
oriented middleware provider. The specification was developed by Sun Microsystems
to give Java programs a means to access middleware.

In JMS, the ends to which messages are delivered or where they are fetched from are
called destinations. The application program sends messages to a JMS destination and
then the receiving application gets that same message from the destination. In this
context, the sender is called the producer. (Farrell 2004)
That concept is illustrated in Figure 8.

Figure 8 The concept of JMS message delivery (Barcia 2003)

JMS is an Application Programming Interface specification only. It does not define how
the message delivery is implemented. As it concentrates on the interface, it does not
model the behaviour of the system and is vague in explaining the expected behaviour
(Kuo, Palmer 2001). The transport protocol used can vary and thus the different
vendors’ products cannot interact directly. On the other hand, the standardised API
makes it easier to create bridge products to enable that communication to take place.
(Voss 2006)

JMS is a de facto programming interface for MOM (Barcia 2003). Most MOM vendors
support it. The standard has two advantages. The first benefit is that a designer can
change the messaging broker product without any radical change to the application
itself; that is, JMS is vendor-agnostic (Pietzuch, Eyers et al. 2007). The second
advantage is that as the API is standard, that is probably the part of the code that
changes least often, and this adds robustness to the system to help it to stand up to
the ravages of time. (Monson-Haefel, Chappell 2001)

Messaging queues do not only add a messaging interface to the application. They
inherently make messaging asynchronous, which changes the design of applications in
a fundamental way. The queues are unidirectional (Flieder 2005) and usually a third
party, a messaging broker, is used between the sender and the receiver.

The Java Message Service is based on one-way delivery, known as the “fire and forget”
principle. The producer sends a message anonymously (Farooq, Parsons et al. 2004) to
a destination without knowing anything about the consumer. The producer does not
need to know who the receiver is or how many receivers there are, if any. On the other
hand, the consumer can fetch a message from any of the destinations by knowing only
the destination name. As the message is self-describing, the consumer does not need
to refer to the producer to decode the content. That keeps the producer and the
consumers independent of each other, which makes the system scalable and easy to
adapt to changes.

In addition to publish/subscribe and point-to-point (P2P), JMS offers selectors to filter
incoming messages. Message selectors are strings whose syntax is a subset of the
SQL92 database query language (Farrell 2004). They are stored in the messaging server
and evaluated when a client attempts to receive a message. Only messages that match
the filter are returned. Only message properties, not the message content, can be used for
filtering. For example, messages can be filtered by the application ID. Message
selectors delegate the work of filtering messages to the JMS server (Haase 2002).
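
The effect of a selector can be sketched as follows. This is not the JMS API: the
SelectiveQueue class is hypothetical, and a Python predicate over the property
dictionary stands in for a selector string, which a real JMS provider would parse from
its SQL92-like syntax. The property names appId and priority are illustrative only.

```python
from typing import Callable, Dict, List, Optional

Properties = Dict[str, object]
Selector = Callable[[Properties], bool]


class Message:
    def __init__(self, properties: Properties, body: str) -> None:
        self.properties = properties
        self.body = body


class SelectiveQueue:
    """Queue whose receive call filters on message properties only."""

    def __init__(self) -> None:
        self._messages: List[Message] = []

    def send(self, message: Message) -> None:
        self._messages.append(message)

    def receive(self, selector: Selector) -> Optional[Message]:
        # The filtering runs here, on the server side, not in the client.
        for index, message in enumerate(self._messages):
            if selector(message.properties):
                return self._messages.pop(index)
        return None


queue = SelectiveQueue()
queue.send(Message({"appId": "billing", "priority": 3}, "invoice"))
queue.send(Message({"appId": "shipping", "priority": 7}, "parcel"))

# Stands in for a selector string such as: appId = 'shipping' AND priority > 5
shipping_urgent = lambda p: p.get("appId") == "shipping" and p.get("priority", 0) > 5
picked = queue.receive(shipping_urgent)
```

The selector never sees the message body, which mirrors the restriction that only
properties can be used for filtering.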

2.4.1 The Structure of the Message

A JMS message consists of a header, properties, and a body. The header includes a set of
predefined fields that let the messaging system know how to route and identify the
message, e.g. the message ID, the destination, the timestamp, and the expiration time.
Properties are typically used for categorising or grouping messages. Examples are the
sender ID and the application ID. The body, in turn, holds the actual content, which
can be text or binary. To be precise, five different Java message types are used:
TextMessage, MapMessage, BytesMessage, StreamMessage, and ObjectMessage (Flieder
2005). (Farrell 2004)
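
The three-part structure can be mirrored in a small data model. The sketch below is an
illustration in Python, not the Java interfaces of JMS; the field names are
simplifications chosen for this example, and the expiry rule in is_expired is an
assumption about how an expiration time might be interpreted.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Union


@dataclass
class Header:
    """Predefined fields used for routing and identification."""
    message_id: str
    destination: str
    timestamp: float
    expiration: Optional[float] = None  # None means the message never expires


@dataclass
class Message:
    header: Header
    properties: Dict[str, object] = field(default_factory=dict)  # for grouping and filtering
    body: Union[str, bytes] = ""                                  # text or binary content

    def is_expired(self, now: float) -> bool:
        return self.header.expiration is not None and now >= self.header.expiration


header = Header(message_id="ID:1", destination="orders", timestamp=100.0, expiration=160.0)
message = Message(header, {"senderId": "agent-7", "appId": "billing"}, "hello")
```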

2.4.2 The Publish/Subscribe Mode of Operation

As Umar Farooq (Farooq, Parsons et al. 2004) put it, in the Publish/Subscribe model,
the client subscriptions are subject- or channel-based. That is, the client subscribes to
all the topics they are interested in and then the JMS provider sends the related
messages or events to the client as long as the subscription remains active.

The Java Message Service standard defines two subscription types, durable and
non-durable. Non-durable subscriptions are temporary only. The subscribers get only the
messages that are published on the topics they have subscribed to while they are
online. In practice, that means a subscriber loses any message published while it
is offline. To get all the messages sent, the subscriber needs to make its
subscription durable. In that case, the messaging broker keeps a record of the
subscription and stores the messages when needed. That is, the subscriber can get the
messages that were saved on the messaging broker while it was away, once the
client is online again and can receive them. (Farooq, Parsons et al. 2004)
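
The behavioural difference between the two subscription types can be sketched as
follows. The Topic and Subscriber classes are hypothetical and keep everything in
memory; a real broker would store the backlog of a durable subscription according to
its own persistence settings.

```python
from typing import Dict, List


class Subscriber:
    def __init__(self, durable: bool) -> None:
        self.durable = durable
        self.online = True
        self.inbox: List[str] = []    # messages delivered to the client
        self.backlog: List[str] = []  # messages the broker stores meanwhile


class Topic:
    def __init__(self) -> None:
        self._subscribers: Dict[str, Subscriber] = {}

    def subscribe(self, name: str, durable: bool) -> Subscriber:
        subscriber = Subscriber(durable)
        self._subscribers[name] = subscriber
        return subscriber

    def disconnect(self, name: str) -> None:
        self._subscribers[name].online = False

    def reconnect(self, name: str) -> None:
        subscriber = self._subscribers[name]
        subscriber.online = True
        # Stored messages are handed over once the client is back.
        subscriber.inbox.extend(subscriber.backlog)
        subscriber.backlog.clear()

    def publish(self, message: str) -> None:
        for subscriber in self._subscribers.values():
            if subscriber.online:
                subscriber.inbox.append(message)
            elif subscriber.durable:
                subscriber.backlog.append(message)
            # Offline non-durable subscribers simply miss the message.


topic = Topic()
durable = topic.subscribe("durable-client", durable=True)
volatile = topic.subscribe("non-durable-client", durable=False)

topic.disconnect("durable-client")
topic.disconnect("non-durable-client")
topic.publish("m1")
topic.reconnect("durable-client")
topic.reconnect("non-durable-client")
topic.publish("m2")
```

After this run the durable subscriber holds m1 and m2, whereas the non-durable
subscriber holds only m2.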

2.4.3 Parameters and Performance

The JMS API offers two fundamentally different models for handling messages. When the
performance factors of the system are considered, the point-to-point model and the
publish/subscribe model need to be considered separately, but there are also some
common factors. For example, in a mobile system, the bandwidth can have a
considerable effect on the throughput of the system, but factors related to the running
environment can also change the performance. If the MOM service is written in Java,
the version of the Java runtime system has been found to make a difference (Sachs,
Kounev et al. 2009b).

In mobile networks, the network bandwidth is usually considerably lower than in
fixed-line environments. In practice, the bandwidth restricts the
number of bytes that can be transmitted via the connection. Not only does it limit the
number of messages that can be sent or received during a certain time period,
but it also means that increasing the message size reduces the maximum number of
messages per second that the system can deliver. However, one bigger message can
deliver more information, and thus, although message throughput becomes lower, the
information delivery rate rises noticeably at the same time (Farooq, Parsons et al.
2004).
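
The trade-off can be made concrete with a small calculation. The figures below (a
384 kbit/s link and 200 bytes of per-message overhead) are hypothetical values chosen
for the arithmetic and are not measurements from Farooq et al. (2004); the model also
assumes that the link is the only bottleneck.

```python
def max_message_rate(bandwidth_bps: float, payload_bytes: int, overhead_bytes: int) -> float:
    """Upper bound on messages per second that fit into the link."""
    return bandwidth_bps / (8 * (payload_bytes + overhead_bytes))


def payload_rate(bandwidth_bps: float, payload_bytes: int, overhead_bytes: int) -> float:
    """Payload bytes per second delivered at the maximum message rate."""
    return max_message_rate(bandwidth_bps, payload_bytes, overhead_bytes) * payload_bytes


LINK_BPS = 384_000   # hypothetical mobile link, bits per second
OVERHEAD = 200       # hypothetical per-message header and protocol bytes

small = (max_message_rate(LINK_BPS, 100, OVERHEAD), payload_rate(LINK_BPS, 100, OVERHEAD))
large = (max_message_rate(LINK_BPS, 1000, OVERHEAD), payload_rate(LINK_BPS, 1000, OVERHEAD))

print(small)  # (160.0, 16000.0)
print(large)  # (40.0, 40000.0)
```

Under these assumptions, a tenfold larger payload cuts the message rate to a quarter
but raises the payload delivery rate by a factor of 2.5, because the fixed overhead is
spread over more payload bytes.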

In JMS, the sender of the message can choose from several acknowledgement options. The
decision tree of the selectable options is shown in Figure 9. The non-transactional
options are AUTO_ACKNOWLEDGE, DUPS_OK_ACKNOWLEDGE, and
CLIENT_ACKNOWLEDGE. In auto-acknowledge mode, messages that are sent or received
are automatically acknowledged by the JMS provider or the JMS library. The messages
are delivered once and only once.

The duplicates-OK option informs the JMS provider that automatic acknowledgement is
required, but duplicate messages are allowed in order to avoid the overhead caused by the
once-and-only-once delivery guarantee (Chappell, Monson-Haefel 2001). The use of
CLIENT_ACKNOWLEDGE makes the application responsible for sending the
acknowledgements. (Malani 2002)

[Decision tree: Transaction? No: auto acknowledgement, duplicates-okay acknowledgement,
or client acknowledgement. Yes: transacted session, or message-driven bean with
container-managed or bean-managed transaction demarcation.]

Figure 9 The transaction options decision tree (Malani 2002)

The lower bandwidth of the mobile connection can cause the acknowledgement packets to
be delayed, which can make a bad situation even worse. When a JMS provider does not
get the acknowledgement of a sent message, it tries to resend the message. If the message
really had been lost, the resent copy would replace it. However, when only the
acknowledgement is delayed, messages
are duplicated, which means the limited bandwidth is used even more. On the other
hand, if the delay grows even longer, the JMS provider can interpret the connection
as being dead and will not accept any new messages from it for a while. (Farooq,
Parsons et al. 2004) Both of the cases reduce the performance of the system.
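
Both cases can be reproduced with a simple timeout model. The function below is a
hypothetical sketch: the fixed resend timeout and the resend limit are assumptions made
for the illustration and do not describe the retransmission policy of any particular
JMS provider.

```python
from typing import Tuple


def simulate_send(ack_delay: float, timeout: float, max_resends: int) -> Tuple[int, bool]:
    """Return (transmissions, connection_considered_dead) for one message.

    The message itself always arrives; only its acknowledgement is slow.
    """
    transmissions = 1
    waited = timeout
    while waited < ack_delay and transmissions - 1 < max_resends:
        transmissions += 1   # every resend here is a duplicate on the wire
        waited += timeout
    dead = waited < ack_delay
    return transmissions, dead


fast_link = simulate_send(ack_delay=0.5, timeout=1.0, max_resends=5)
slow_link = simulate_send(ack_delay=2.5, timeout=1.0, max_resends=5)
stalled_link = simulate_send(ack_delay=10.0, timeout=1.0, max_resends=2)
```

On the fast link the message is sent once. On the slow link it is sent three times,
so two duplicates consume bandwidth. On the stalled link the resend limit is reached
before the acknowledgement arrives and the connection is treated as dead.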

Henjes et al. have studied the impact of filters on the maximum message throughput of
popular JMS servers (Menth, Henjes et al. 2006) and (Henjes, Menth et
al. 2006b). The throughput of the servers being tested differed by orders of magnitude.
When FioranoMQ, SunMQ, and WebSphereMQ were compared, FioranoMQ was the
most efficient (Menth, Henjes et al. 2006).

In detail, FioranoMQ processed the CorrelationID filter rules faster than the application
property filters, while SunMQ and WebSphereMQ showed no difference. All the
above-mentioned servers processed complex OR filters faster than the equivalent
number of simple OR filters. I assume that the server can take advantage of equal
sub-filters when optimising the complex filters, but as the simple OR filters are set by the
application, the same pruning could not be done for them. There was still variation,
depending on the server type, which was seen in almost every test. The high level of
complexity of the AND filters reduced the throughput of the FioranoMQ and SunMQ
servers, while WebSphereMQ processed the rules equally (Menth, Henjes et al. 2006).
The order of filter components matters only in FioranoMQ and SunMQ. That implies
that WebSphereMQ optimises the filters better. In general, the number of topics had almost
no effect on the overall throughput of the messages. Non-persistent messages were
routed considerably faster than persistent messages, but a non-persistent message can be lost.
For WebSphereMQ, the message size has a major impact on the throughput of the server.

Henjes et al. (2007) also tested the impact that complex OR and AND filters have on the
performance of ActiveMQ. Unexpectedly, all the filter types had comparable message
processing times. Only the number of sub-clauses in the filtering phrases affected the
time usage. Increasing the length and number of filters installed in the server reduced
the measured throughput of the server. With short filter lengths, the change was almost
linear.

Point-to-point
The point-to-point mode of operation allows messages to be saved so that the delivery
of the messages is guaranteed. That mode is called persistent mode. In that case, all
messages are saved onto a disk (Henjes, Menth et al. 2006b) or in a database to allow
recovery if a software or hardware failure takes place. The details of the
implementation vary according to the software and the installation of the MOM
service provider. Saving a message takes time, but the delay depends on the
implementation.

Henjes et al. (Henjes, Menth et al. 2006a) estimated that, in the SunOneMQ JMS
server, the throughput of received and dispatched messages was about 19000
msgs/s in the non-persistent mode in contrast to about 13000 msgs/s in the
persistent mode. The difference in throughput was constant when the number of
producers was more than three. Three or fewer producers could not fully saturate
the communication line.

Publish/Subscribe model
In mobile systems, the discontinuation of the network connection can cause messages
to be lost, if that is allowed (Farooq, Parsons et al. 2004).

Research conducted by Abdulbaset Gaddah and Thomas Kunz (2006) showed that non-
durable subscriptions outperformed durable subscriptions in general (Figure 10). They
also confirmed that using persistent messages causes an extra load on the broker since
updating the external message storage is a resource-intensive task. The reliable delivery
guarantee had a significant impact on the performance. They suspect that the same
phenomenon is to be found in the acknowledgement modes that limit messages to being
delivered once and only once. However, they made no observation that supports
that conclusion.

The non-durable message subscription over a wired line outperformed the durable one
by an order of magnitude (Figure 10). By contrast, in a wireless network the situation is
much more even. Gaddah and Kunz (2006) observed that when the network link is a
bottleneck the communication types could send and receive almost the same volume of
messages through the broker.

Figure 10 Pub/Sub one-to-one model: 100 byte messages (Gaddah, Kunz 2006)

Gaddah and Kunz (2006) also found other factors to have an effect on the
throughput. In general, larger message sizes tend to reduce the throughput of the
system (Figure 11 and Figure 12). Even here, non-durable subscriptions tend to
outperform durable ones in most of the tests, although they are more sensitive to the
growth in message sizes. Besides the growing message size, durable subscriptions are
more sensitive to a high load of the message broker in general. Finally, introducing
message selectors can cause the performance of both subscription types to diminish.
Selectors cause extra computation in the message broker, since the added filters need to be
matched against the messages sent.

Figure 11 Pub/Sub few-to-many model (Gaddah, Kunz 2006)

Figure 12 Pub/Sub many-to-few model (Gaddah, Kunz 2006)

2.4.4 JMS Benchmarks

The Standard Performance Evaluation Corporation’s (SPEC) subcommittee OSG-java has
created SPECjms2007 (Mendes, Bizarro et al. 2009), which is a standard industry
benchmark to analyse the messaging performance of a JMS system. The benchmark is
based on the scenario of a supermarket supply chain (Happe, Friedrich et al. 2008). That
means that it reproduces the way message queues are used in a real-life system. The
test allows us to estimate the effect of the message broker and its configuration without
disturbances from infrastructure components such as a database. SPECjms2007 uses
Point-to-Point communication. No selectors are used. The benchmark consists of a set of
Java programs, each running on its own JVM (Sachs, Kounev et al. 2009a).

A newer benchmark, jms2009-PS, is based on SPECjms2007 (Sachs, Kounev et al. 2009b).
In contrast to SPECjms2007, jms2009-PS uses only the publish/subscribe type of
communication and selectors (Sachs, Kounev et al. 2009a).

2.5 Open Message-Oriented Middleware Protocols

All the open message-oriented protocols are application-layer protocols in the ISO Open
Systems Interconnection (OSI) network model. They need a stream-oriented transport
protocol such as TCP to operate (Petri 2009).

2.5.1 AMQP – Advanced Message Queuing Protocol

The Advanced Message Queuing Protocol (AMQP) was specified to meet the demands
of the financial community. In contrast to the instant messaging protocol
specifications, AMQP does not cover the application-specific content of the online
data. The main goals of the specification are reliability and performance. That is, it was
designed to minimise the expenses that take place when a message is lost, delayed, or
mishandled. (O'Hara 2007, Vinoski 2006)

AMQP is arranged into two layers (Figure 13). It specifies a binary wire protocol that
consists of frames sent over a stream-based transport. TCP is normally used but AMQP
also supports SCTP (Stream Control Transmission Protocol) and InfiniBand. On top of
the protocol layer, AMQP specifies a semantic model that defines the abstract
concepts of exchanges, binds, and queues. The concept (O'Hara 2007, Vinoski 2006) is
shown in Figure 14.

Exchanges are routers that sort the messages to their destinations on the basis of a set
of rules. Queues are endpoints to keep messages saved and waiting for receivers to
fetch them. Binds attach them together. The use of binds makes the system more
loosely coupled than a general message system. In other systems, the applications
specify those queues from which they are going to receive messages. In the AMQP
model, the binds can be changed at any time without affecting the application code.
(O'Hara 2007, Vinoski 2006)
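
The roles of exchanges, binds, and queues can be sketched with a toy model. The
Exchange class below is hypothetical and matches messages to binds by an exact routing
key only; it is a simplification for illustration and does not reproduce the AMQP wire
protocol or the routing rules that the specification defines.

```python
from collections import deque
from typing import Deque, Dict, List


class Exchange:
    """Routes messages to queues according to binds (direct matching on a key)."""

    def __init__(self, queues: Dict[str, Deque[str]]) -> None:
        self._queues = queues
        self._binds: Dict[str, List[str]] = {}

    def bind(self, routing_key: str, queue_name: str) -> None:
        self._binds.setdefault(routing_key, []).append(queue_name)

    def unbind(self, routing_key: str, queue_name: str) -> None:
        self._binds[routing_key].remove(queue_name)

    def publish(self, routing_key: str, message: str) -> int:
        """Copy the message to every bound queue; return the number of queues."""
        targets = self._binds.get(routing_key, [])
        for name in targets:
            self._queues[name].append(message)
        return len(targets)


queues: Dict[str, Deque[str]] = {"audit": deque(), "billing": deque()}
exchange = Exchange(queues)

exchange.bind("order.created", "billing")
exchange.publish("order.created", "order-1")

# Re-routing is a change of binds only; producer and consumer code stay as they are.
exchange.unbind("order.created", "billing")
exchange.bind("order.created", "audit")
exchange.publish("order.created", "order-2")
```

The producer publishes with the same routing key both times, yet the second message
lands in a different queue, which illustrates why binds loosen the coupling.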

Figure 13 AMQP protocol stack (O'Hara 2007)

JPMorgan Chase (JPMC), Cisco Systems, Envoy Technologies, iMatix Corporation, IONA
Technologies, Red Hat, TWIST Process Innovations, and 29West together founded the
Advanced Message Queuing Protocol (AMQP) working group in 2006 (Vinoski 2006).

Figure 14 AMQP Semantic Model (O'Hara 2007)

2.5.2 XMPP – Extensible Messaging and Presence Protocol

Jeremie Miller specified the Extensible Messaging and Presence Protocol (XMPP) in 1998.
Frustrated by the multitude of instant messaging clients, he created the instant
messaging system Jabber to fulfil the requirements of freedom he had in mind.
(Saint-Andre 2009)

Jabber was open-sourced in 1999, and in 2002 the community contributed the core
protocols of XMPP to the Internet Engineering Task Force (IETF). The IETF formalised the
protocol in the standards RFC 3920 and RFC 3921. (Saint-Andre 2009)

Social networking service providers and mobile device manufacturers have shown
considerable interest in XMPP recently. Currently, companies such as Apple, Google,
Nokia, and Sun are actively engaged in the development of XMPP. Furthermore, as the
focus is on the protocol, the community has been renamed the XMPP Standards
Foundation (XSF). (Saint-Andre 2009)

XMPP is based on XML. This has made it easy to extend it to fit various purposes.
It is not a single protocol but a set of protocols that are layered on top of the
IETF-approved core. The XSF has provided some extensions, and private extensions
also exist. (Saint-Andre 2009)
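
The XML-based extension mechanism can be illustrated by building a message stanza
with one private extension element. This is a simplified sketch: the addresses, the
namespace urn:example:sensor, and the element name reading are hypothetical, and the
stream-level namespace that a real XMPP session declares is left out.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_message(sender: str, recipient: str, text: str,
                  extension_ns: Optional[str] = None,
                  extension_text: Optional[str] = None) -> str:
    """Serialise a chat message stanza, optionally with one extension element."""
    message = ET.Element("message", {"from": sender, "to": recipient, "type": "chat"})
    ET.SubElement(message, "body").text = text
    if extension_ns is not None:
        # Extensions live in their own XML namespace next to the core elements.
        extension = ET.SubElement(message, "{%s}reading" % extension_ns)
        extension.text = extension_text
    return ET.tostring(message, encoding="unicode")


stanza = build_message("sensor@example.org", "broker@example.org", "21.5 C",
                       extension_ns="urn:example:sensor", extension_text="21.5")
parsed = ET.fromstring(stanza)
```

A receiver that does not know the extension namespace can still read the body element,
which is what makes layering extensions on top of the core practical.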

2.5.3 STOMP - Streaming Text Oriented Message Protocol

The Streaming Text Oriented Message Protocol (Stomp) is a simple human-readable
MOM protocol (Andrahennadi, Samararathna et al. 2008) whose syntax is close to that of HTTP.
A frame consists of three parts: the frame type, the frame headers, and the body. The
current version of the specification is 1.0. The official reference implementation for this
protocol is in ActiveMQ. Client library implementations are available for 14 different
programming languages, such as Java, C#, Python, and Ruby. (Carter 2008)

Carter (2008) points out that the protocol specification of Stomp 1.0 is poorly written.
For example, it is inconsistent at the frame level. The ASCII null used for delimiting
frames cannot be used when sending binary data. Consequently, content length is only
used in the binary mode. Furthermore, the semantics and detailed syntax of the
destination tag are not specified at all. The reference implementation in ActiveMQ is
more specific, i.e. ActiveMQ has added syntax to indicate whether the message is
destined for a queue or a topic. This is totally missing from the official specification.
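
The frame layout and the role of the content length can be sketched as follows. The
helper functions encode_frame and decode_frame are hypothetical and cover a single
frame only; the destination value follows the ActiveMQ-style convention mentioned
above and is used here purely as an example.

```python
from typing import Dict, Tuple


def encode_frame(command: str, headers: Dict[str, str], body: bytes = b"") -> bytes:
    """Build one frame: command line, header lines, blank line, body, null byte."""
    headers = dict(headers)
    if b"\x00" in body:
        # A body that contains the delimiter needs an explicit length.
        headers["content-length"] = str(len(body))
    head = "".join(f"{key}:{value}\n" for key, value in headers.items())
    return f"{command}\n{head}\n".encode("utf-8") + body + b"\x00"


def decode_frame(data: bytes) -> Tuple[str, Dict[str, str], bytes]:
    """Split one frame into command, headers, and body."""
    head, _, rest = data.partition(b"\n\n")
    lines = head.decode("utf-8").split("\n")
    command = lines[0]
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        key, _, value = line.partition(":")
        headers[key] = value
    if "content-length" in headers:
        body = rest[: int(headers["content-length"])]
    else:
        body = rest.split(b"\x00", 1)[0]
    return command, headers, body


text_frame = encode_frame("SEND", {"destination": "/queue/a"}, b"hello")
binary_frame = encode_frame("SEND", {"destination": "/queue/a"}, b"a\x00b")
```

Without the content-length header, the decoder would cut the binary body at its first
null byte, which is the inconsistency that Carter (2008) describes.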

2.6 Smart Spaces

A Smart Space is a dynamic multi-user, multi-device environment that merges the
material and the digital worlds together to enhance a physical space (Prehofer, van Gurp
et al. 2007) (Wang, Dong et al. 2004).

A Smart Space is an area comprehensively equipped with numerous devices that have a
computation capability and a network connection. Nixon (Nixon, Dobson et al. 2000)
divides these devices into sensors, actuators, and computing components. The actuators
here refer to the physical or virtual devices that are capable of affecting the Smart
Space. (Okoshi, Wakayama et al. 2001)

The concept of a Smart Space is summarised in Figure 15.

Figure 15 The concept of Smart Space

Främling et al. (2009) aptly wrote that Smart Space is a heavily overloaded notion.
Lupiana et al. (2009) agree on the matter.

Smart Spaces can be seen as separate islets. When people are moving, they roam from
one smart space into another (Yang, Lim et al. 2006). The spaces are autonomous and
can be maintained by different entities such as an individual, a company, or a local
authority (O’Sullivan, Wade 2002). When a device joins a space, it becomes part of the
space. To find the services that are available, a discovery technique is needed. Van Gurp
et al. (2008) introduced a test implementation that uses the Zeroconf mDNS mechanism.

Främling et al. (2009) share the above-mentioned viewpoint quite closely. They
emphasise the idea that a Smart Space is a geographical area where information about
the environment is available. That area consists of devices and services but also the
people present and any information that is related and can be useful. Furthermore,
Boldyrev et al. (2009) have stated that devices can use a shared view of resources and
services in the environment when accessing information.

The Devices and Interoperability Ecosystem (DIEM) project (http://www.diem.fi/,
accessed 29 April 2010) takes an approach that is on
a higher level of abstraction. The work is based on the assumption that the physical
smart space is already established. The smart space is divided into three separate layers.
Physical devices and networks are beneath the two other layers. The middle layer
consists of shared services. The focus of the project is on the uppermost layer,
where the actual smart world, which consists of agents and a common shared memory
of information, a semantic information broker, is located. (Soininen, Liuha et al. 2010)

Soininen et al. (2010) have summarised the key features of a smart environment. They
state that the space is a physical area where embedded devices are interconnected but
it should also have the means to perceive any information that characterises the state
and situation of an entity. The information should be made available in an autonomous
way. The applications should also be able to perform smart actions.

In the vision of Oliver and Honkola, a Smart Space is regarded as a local semantic web,
in which information varies depending on the locality or is personal, or in which
information is organised according to usage. The space consists of semantic
information brokers, each of which contains an RDF repository (Oliver, Honkola et al.
2008).

2.7 Semantic Web

Sir Tim Berners-Lee, who is credited with inventing the World-Wide Web, had a two-
fold vision of the future of the Web. He thought the Web should be a collaborative
medium but that the meaning of the pages should also be understood by machines
(Daconta, Obrst et al. 2003).

The founder of W3C, Tim Berners-Lee, stated in (Berners-Lee, Hendler et al. 2001):
“The Semantic Web will bring structure to the meaningful content of Web pages,
creating an environment where software agents roaming from page to page can
readily carry out sophisticated tasks for users.”

One of the motivations for creating a semantic web was the fast growing number of
web pages. The fundamental reason why the enormous amount of data was difficult to
find was that it was written in natural languages. To make finding and maintaining the
pages easier, there were numerous initiatives to enrich the web data. RDF, Topic
Maps, and OWL are such ways. They add machine-readable meta-data to pages
written in HTML, XHTML, and other XML-based languages (Furche, Linse et al. 2006).

In 2000, Tim Berners-Lee gave a presentation where he clarified the protocols and
challenges underlying the semantic Web technologies (Davies, Fensel et al. 2002). The
Semantic Web Pyramid of Languages is shown in Figure 16.

[Figure: layered stack, from bottom to top: Unicode and URI; XML + NS + xmlschema;
RDF + rdfschema; Ontology vocabulary; Logic; Proof; Trust. Side labels: Digital
Signature, Self-desc. doc, Data, Rules]

Figure 16 The Semantic Web Pyramid of Languages

2.8 RDF Store

The Resource Description Framework (RDF) is a recommendation of the World Wide
Web Consortium (W3C) for adding semantic information to a web page. It was
originally designed to describe the pages but, as the specification does not require
them to be retrievable, RDF can in fact be used to describe anything (McBride 2004).

In the RDF context, all entities are resources, i.e. they have associated URIs (Uniform
Resource Identifiers). The RDF semantic model consists of statements about the
resources. The statements are expressed as Subject-Predicate-Object triples. The
simple tuple model, <S, P, O>, can be used to express all the information (Wilkinson,
Sayers et al. 2003). The interpretation of the tuple is that subject S has property P with
value O.

The terminology and syntax can vary. For example, Davies et al. (2002) state that the
triples consist of object-attribute-values and the common syntax to write them is
A(O,V). The interpretation of the notation is analogous to the tuple model. The key
elements of the RDF triple are shown in Figure 17 (Daconta, Obrst et al. 2003).
[Figure: a Subject connected by a Predicate to an Object and by a Predicate to a
Literal; the legend distinguishes URIs, literals, and properties or associations]

Figure 17 The RDF triple

The triples are usually seen as directed graphs, where subjects and objects are nodes. A
node can be a URI, a literal, or an unlabelled node. The last of these is also called a
“blank node”. Blank nodes are commonly used to define groups or an aggregation. The
literals can be e.g. strings or numbers.
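The tuple model and its graph reading can be illustrated with a minimal Python sketch. The resource names (ex:lamp1 and so on) and the helper functions match and neighbours are made up for this illustration and do not come from any RDF library.

```python
# Hypothetical example data; prefixed names stand in for full URIs.
triples = [
    ("ex:lamp1", "rdf:type", "ex:Lamp"),
    ("ex:lamp1", "ex:locatedIn", "ex:kitchen"),
    ("ex:lamp1", "ex:brightness", 80),        # the object is a literal
    ("ex:kitchen", "rdf:type", "ex:Room"),
]

def match(store, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

def neighbours(store, node):
    """Objects reachable from `node` along one outgoing edge of the graph."""
    return [o for (_, _, o) in match(store, s=node)]

print(match(triples, p="rdf:type"))
print(neighbours(triples, "ex:lamp1"))
```

Each tuple is one edge of the directed graph: the subject and the object are the nodes and the predicate labels the edge.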

In addition to the syntax itself, RDF has reserved words to describe the vocabulary used.
The RDF Schema (RDFS) describes the classes of resources and properties and the
relationships between them (McBride 2004). Classes describe the sets of resources,
while properties state binary relations between the subjects and the objects. Specific
predefined properties are used for this purpose. They are e.g. rdf:type that states the
type of a resource, rdfs:Class and rdfs:subClassOf to define the class hierarchy for
resources, and rdfs:Property and rdfs:subPropertyOf for the property hierarchy,
correspondingly.
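How the class hierarchy can be used is sketched below in Python, under the assumption that the schema and the data are held as plain tuples. The function names and the ex: resources are hypothetical; only rdf:type and rdfs:subClassOf are taken from the text.

```python
# Hypothetical schema and data triples.
triples = [
    ("ex:Lamp", "rdfs:subClassOf", "ex:Device"),
    ("ex:Device", "rdfs:subClassOf", "ex:Thing"),
    ("ex:lamp1", "rdf:type", "ex:Lamp"),
]

def superclasses(store, cls):
    """All classes reachable from `cls` by following rdfs:subClassOf."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        for s, p, o in store:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                todo.append(o)
    return found

def types_of(store, resource):
    """Stated rdf:type values plus the types implied by the class hierarchy."""
    stated = {o for s, p, o in store if s == resource and p == "rdf:type"}
    implied = set()
    for cls in stated:
        implied |= superclasses(store, cls)
    return stated | implied

print(sorted(types_of(triples, "ex:lamp1")))
```

The sketch treats rdfs:subClassOf as transitive, so a resource typed as ex:Lamp is also reported as an ex:Device and an ex:Thing.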

RDF query languages and RDF have evolved in parallel (Gutierrez, Hurtado et al. 2004).
Various query languages implement different query mechanisms. They are e.g.
relational and pattern-based languages, such as SPARQL, RQL, TRIPLE, and Xcerpt,
reactive rule query languages, such as Algae, and navigational access query languages,
such as Versa (Furche, Linse et al. 2006). Of these, SPARQL became the W3C
recommendation for RDF querying in 2008 (Prud'hommeaux, Seaborne 2008). It is
based on the SquishQL and RDQL query languages (Furche, Linse et al. 2006) and has
SQL-like syntax.

Although RDF is a W3C standard, there are many ways to serialise an RDF triple, i.e. to
write it in a format that can be stored or sent over a network. This is a direct
consequence of the fact that RDF allows several notations of the same data content
(Gutierrez, Hurtado et al. 2004). Of the notations, the RDF/XML encoding of RDF is one
of the most commonly used (Furche, Linse et al. 2006).

An RDF parser is a library or a tool that parses RDF serialisations and offers programmers
an application programming interface for accessing the triples or generating RDF queries.
Examples are SiRPAC, which is a de facto standard for Java-based RDF applications,
Profium, and W3C perllib. (Ding, Fensel et al. 2002)

RDF repositories offer a way to store large amounts of RDF triples. They are built on
top of the functionality of an RDF parser. An RDF repository typically has two features.
It stores RDF triples into a database but it also parses and formulates RDF queries into
relational database queries.
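The two features can be sketched with the sqlite3 module of the Python standard library. This is only an illustration of the idea, assuming a single three-column table; the table layout and the function names are hypothetical and are not taken from any of the repositories discussed here.

```python
import sqlite3

def open_store():
    """Create an in-memory triple table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    db.execute("CREATE INDEX idx_sp ON triples (s, p)")
    return db

def insert(db, s, p, o):
    db.execute("INSERT INTO triples VALUES (?, ?, ?)", (s, p, o))

def query(db, s=None, p=None, o=None):
    """Translate a triple pattern into a SQL SELECT; None is a wildcard."""
    clauses, params = [], []
    for column, value in (("s", s), ("p", p), ("o", o)):
        if value is not None:
            clauses.append(column + " = ?")
            params.append(value)
    sql = "SELECT s, p, o FROM triples"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY rowid"          # keep insertion order in the result
    return db.execute(sql, params).fetchall()

db = open_store()
insert(db, "ex:lamp1", "rdf:type", "ex:Lamp")
insert(db, "ex:lamp1", "ex:locatedIn", "ex:kitchen")
print(query(db, s="ex:lamp1"))
```

A real repository would translate a full query language rather than a single pattern, but the principle of mapping bound positions to WHERE conditions is the same.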

There are many existing RDF stores. Janik and Kochut (2005) have evaluated Jena,
Redland, and Sesame.

Jena is an RDF/OWL storage and a toolkit implemented in Java. It offers an RDQL query
interface. Jena2 is the second generation of the store. It stores data either in memory
or in a database in a triple-centric way, where triples are indexed. Thus, the neighbour
triples can be searched fast. Jena has a rich API for manipulating RDF graphs.
(Wilkinson, Sayers et al. 2003)

Redland is implemented in C. It calculates memory hashes in order to achieve the fast
computation of the node neighbourhood. However, the hashes use large amounts of
memory and the implementation is slow when compared with the Java RDF stores.
Sesame, for its part, is implemented in Java. It offers a SeRQL RDF query interface.
Of the above-mentioned RDF stores, only Sesame stores data in a node-centric way.

The Nokia Smart-M3 Semantic Information Broker (SIB) is an RDF store that is used as
a key element of a smart space. The core parts of Smart-M3 have been published as
open source software. (Soininen, Liuha et al. 2010) Smart-M3 uses Piglet as its RDF
store. Piglet was created by Ora Lassila to replace Wilbur, which was “a Nokia
Research Center's toolkit for programming Semantic Web applications that use RDF”.
Piglet is a lightweight RDF triple store written in C++. It uses the SQLite3 relational
database to store data.

2.9 Loosely Coupled Systems

According to Kell (2008), the first definition of coupling in a software system was
provided by Stevens, Myers, and Constantine in their article, “Structured Design”, in
1974. They stated that coupling is “the measure of the strength of association
established by a connection from one module to another”.

Kowalewski et al. (2008) and Davis (1999) share this view. They state that
coupling refers to the degree of association between modules, components, or
subsystems. In that way, loose coupling is a measure of how independent the
components of the systems are. In the literature that has addressed the model of
coordination and communication among parallel processes, a system is considered
loosely coupled if there are few constraints on the concurrent execution of the
processes and the interfaces between them are simple (Kell 2008).

Loose coupling is considered to reduce the complexity of the code and increase its
robustness regarding minor alterations. This is achieved as modules make fewer
assumptions about each other. However, this is not purely beneficial. Fewer
assumptions mean less automatic checking of the method names and the data types
(Hohpe 2005). That can make the debugging of the system more difficult. In addition,
human-to-human agreement on data semantics and use becomes more vital (Hohpe
2005), as the parties to the communication do not validate the data as thoroughly as in
traditional systems.

Eugster et al. (2003) divide decoupling into three dimensions, i.e.
space, time, and synchronisation. Space decoupling means that the communication
parties communicate indirectly, do not have to know one another, and do not even refer
to each other. Time decoupling refers to the condition that the communication parties do
not necessarily take part in the interaction concurrently. Synchronisation decoupling
denotes the level on which non-blocking interfaces and callbacks are used in the
system, i.e. the communication can take place in an asynchronous manner.

In messaging systems, messages are sent according to the “fire and forget” principle. It
is guaranteed that messages cannot be unsent (Misra 1991). These facts imply that the
sender and receiver are decoupled by time, i.e. their running states can be
independent (Barcia 2003).

Messages make the system data-centric, in contrast to an interface-centric approach
(Barcia 2003). The latter refers to previously agreed method names, parameters, and
parameter types. The use of messages increases the flexibility of a system. Messages
can be handled as data, which makes the error handling of the system robust. If the
sender changes the way it constructs the messages, the receiver should only see them
as corrupted or partly correct data. Error handling routines can apply. In addition to
that, Barcia (2003) assumes that using messages tends to help in avoiding chatty
interfaces. If a messaging system is event-based, a message maps to an event
notification and contains all the information of a happening of interest. That promotes
using chunkier messaging, which lessens the mutual references of the modules.

If the Waterfall sequential software development process is used, loose coupling
affects all the phases (Faison 2006). A loosely coupled module is easier to implement,
it can more easily be tested separately (Kowalewski, Bubak et al. 2008), and the
maintenance is lighter. Making a module loosely coupled is attractive, but coupling
cannot be totally avoided (Faison 2006). By definition, even an association between
components makes them coupled. An object-oriented program cannot be written
without constructing object instances and making calls to methods.

Coupling is ultimately a consequence of communication. It is not possible to
communicate, directly or indirectly, without being connected.

Hao He (2003) divides dependencies into two categories: real and artificial. Real
dependency is where one system depends on a functionality some other system offers,
e.g. a device needs a power supply. That linkage is unavoidable.

In He's words: “The real dependency is that you need power; the artificial dependency
is that your plug must fit into the local outlet.” He claims that loose coupling can be
achieved by minimising the artificial dependencies. (Kell 2008, He 2003) If a phone call
is considered, the need to make a call is the real dependency while the way voice is
carried on the lines is the artificial one. Some realisation is still needed. The idea
is to minimise the number of assumptions a user needs to know about a system to be
able to use it. In short, that means things should be made as simple as achievable.

Stephen Kell (2009) has an interesting point of view on how the coupling between
software components can be loosened. He proposes separating the functionality of the
software component clearly from its integration details. He argues that the software
industry should copy good practices from technical writing, where the writer defines
his own terminology at the beginning without any concern for standardising the terms,
and from IC design, where the integration of the components has a separate set of
concepts. In both cases, the decision on the integration is kept separate and deferred.

In (Kell 2008), Stephen Kell et al. point out that there are two ways to handle coupling.
One is to minimise the amount of coupling in the system. The other is to mitigate the
consequences caused by the coupling. They state that mitigation can be achieved by
localisation and by using standards. By localisation, they refer to “the features of
programming languages which enable definitions to be made in a single place but
referenced from many others”. Standards make coupling less challenging, since the
connected modules are designed against the same standard. Standards do not change
often, and it is more likely that developers will be familiar with the standard than a
separate specification.

Eugster et al. (2003) compared the decoupling capabilities of the common interaction
paradigms, e.g. message passing, RMI, notifications, message queuing and
publish/subscribe. Of these, publish/subscribe was the only one that fully provides
space decoupling, time decoupling, and synchronisation decoupling.
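The three dimensions can be illustrated with a small in-memory publish/subscribe sketch in Python. The Broker class is hypothetical and written only for this illustration. For simplicity, callbacks run inside the publish call; a real message broker would deliver them asynchronously.

```python
from collections import defaultdict, deque

class Broker:
    """Toy publish/subscribe broker (hypothetical, single process)."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> callbacks
        self._retained = defaultdict(deque)     # topic -> undelivered messages

    def publish(self, topic, message):
        # Space decoupling: the publisher names a topic, not a receiver.
        callbacks = self._subscribers[topic]
        if not callbacks:
            # Time decoupling: keep the message until somebody subscribes.
            self._retained[topic].append(message)
            return
        for callback in callbacks:
            callback(message)

    def subscribe(self, topic, callback):
        # Synchronisation decoupling: the subscriber registers a callback
        # instead of blocking in a receive call.
        self._subscribers[topic].append(callback)
        while self._retained[topic]:
            callback(self._retained[topic].popleft())

broker = Broker()
received = []
broker.publish("lights", "early message")      # no subscriber exists yet
broker.subscribe("lights", received.append)
broker.publish("lights", "late message")
print(received)
```

The publisher never holds a reference to the subscriber, and the first message is delivered even though it was sent before the subscription was made.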

Misra (1991) proved that system parts that communicate through FIFO queues are
loosely coupled. Misra justified his conclusion as follows. A system is considered to
consist of a finite set of processes, each of which has local variables and consists of a
set of actions. It is presumed that the actions are run deterministically and fairly, i.e.
every action is run eventually. If it holds for the processes f and g that
f(g(x)) = g(f(x)), they are loosely coupled. Misra proved that this condition holds for
two processes communicating via a FIFO queue but his mathematical proof also holds
true for a group of queues and even for baggy communication.
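The commutativity condition can be made concrete with a small Python check. It is only an illustration of the condition as stated above, not a reconstruction of Misra's proof. The queue is modelled as a tuple, and put and get are hypothetical names for the sender's and the receiver's actions. The receiver's action is assumed to be defined only when the queue is not empty.

```python
def put(queue, item):
    """Sender action: append an item to the tail of the queue."""
    return queue + (item,)

def get(queue):
    """Receiver action: remove the head; requires a non-empty queue."""
    if not queue:
        raise ValueError("get is not defined on an empty queue")
    return queue[1:]

state = ("a", "b")

# Applying the two actions in either order leads to the same queue state.
first_put_then_get = get(put(state, "x"))
first_get_then_put = put(get(state), "x")
print(first_put_then_get, first_get_then_put)
```

Both orders leave the queue holding b followed by x, which is the sense in which the sender and the receiver do not constrain each other's execution order.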

Gui & Scott (2006) give a short review of coupling metrics. These are e.g. CBO, RFC,
CF, and DAC. In CBO, two classes are considered coupled if a method or an instance
variable in one class is used by the other. RFC counts methods in classes and methods
called in other classes. CF for a software system is the number of coupled class pairs
divided by the total number of class pairs. DAC counts the number of attributes having
other classes as their types. All the metrics deal with class pairs and method pairs one
by one, observing the relations intransitively.
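As a worked example, the CF definition given above can be computed with a few lines of Python. The sketch assumes that class pairs are ordered (client, supplier) pairs, which is one possible reading of the definition; the class names are hypothetical.

```python
from itertools import permutations

def coupling_factor(classes, uses):
    """Coupled ordered class pairs divided by all ordered class pairs.

    `uses` is a set of (client, supplier) pairs, meaning that the client
    class refers to a method or an attribute of the supplier class.
    """
    pairs = list(permutations(classes, 2))
    if not pairs:
        return 0.0
    coupled = sum(1 for pair in pairs if pair in uses)
    return coupled / len(pairs)

classes = ["Agent", "Connector", "Parser"]
uses = {("Agent", "Connector"), ("Connector", "Parser")}
cf = coupling_factor(classes, uses)
print(round(cf, 3))
```

With three classes there are six ordered pairs, two of which are coupled, so CF is one third. The indirect dependency of Agent on Parser is not counted, which is the intransitivity mentioned above.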

Gui & Scott (2006) proposed that the relations between the classes should be handled
transitively to get a more precise estimate for ranking the reusability of Java
components. The new metric was compared with the above-mentioned ones, using as a
reference the source lines of the requisite extensions that a very experienced Java
programmer implemented. The proposed algorithm was shown to give the best linear
prediction of reusability.

Coupling measures the lack of cohesion between the modules or components of a
system. Strong cohesion means that an unpredicted change affects only a single part
of the system, i.e. a change in a module should not affect the other modules of the
application. Kell points out that this factor predicts the extensibility and maintainability
of the code (Kell 2008), i.e. loose coupling facilitates the reusability of the code.

3 The System Software and Available Libraries

In the first phase, I searched for knowledge of all the software components that can be
used to build up a test arrangement to access a smart space system via a message
system. Then the implementation techniques of the open message-oriented
middleware protocols were studied. The set is presented in this chapter.

I first concentrate on the available message service providers and give more details on
the open source message service providers that offer an open message-oriented
middleware protocol interface. Then I introduce the libraries that can be used to
produce messages using the protocols and are capable of connecting to a message
system. At the end of the chapter, I briefly introduce the Nokia Semantic Information
Broker and the Smart-M3 Python library that agents use for communicating
with the broker.

3.1 The Message Service Providers

There are plenty of proprietary message-oriented middleware products that can fulfil
the MOM specification. Some of them are IBM WebSphereMQ, Oracle
Open Message Queue (OpenMQ), Microsoft Message Queuing Server, TIBCO’s
Rendezvous and Enterprise Messaging Servers, and Progress Software Corporation’s
Sonic MQ (Marsh, Sampat et al.). What is interesting is that there are also a few efficient
open source implementations that are capable of communicating using open message-
oriented middleware protocols. A set that I considered representative is introduced in
the following sections.

3.1.1 ActiveMQ

Apache ActiveMQ is an open source message broker written in Java. The software offers
a standard JMS 1.1 specification-compliant Java application programming interface but
libraries are also available for other programming languages such as C/C++, .NET, Perl,
PHP, Python, and Ruby. Besides libraries, ActiveMQ offers the following protocol
interfaces: HTTP/S, JGroups, JXTA, multicast, SSL, TCP, UDP, XMPP, Stomp, and
OpenWire. (Snyder, Davies et al. 2009)

ActiveMQ is licensed under the Apache License (Snyder, Davies et al. 2009), which grants
anyone the right to use the software, including the right to modify it without the
obligation to distribute the changes.

3.1.2 RabbitMQ

RabbitMQ is one of the open source message brokers that implement the AMQP
protocol. RabbitMQ was written in Erlang, which was originally created at Ericsson for
telecom switches. Erlang is designed for concurrent programming. That is, processes
are fast to create, they share nothing, and they have their own garbage collection.
They are also lightweight.

Erlang gives RabbitMQ good scalability and performance. The code is also concise. The
RabbitMQ source code contains about 5000 lines of code. (Barthel 2009)(Radestock)
3.1.3 Apache Qpid

The Apache Software Foundation’s Qpid is an open source messaging system that
implements the Advanced Message Queuing Protocol. The developers of Apache Qpid
have been active members of the AMQP Working Group.

3.1.4 OpenAMQP

iMatix Corporation’s OpenAMQP is designed to be extremely robust and fast but it does
not implement the whole specification of the AMQP standard. However, it achieves a
throughput of 20K messages per second per CPU core and 50K messages per client.
Andrahennadi et al. (2008) have estimated the latency of the messages to be about 200
microseconds. iMatix has also developed a High-Performance Extension (HPX) for
OpenAMQP. This increases the message throughput tenfold.

3.2 Libraries to connect to the Message Systems

Stomppy

Julian Scheid et al. wrote the Python library to access message servers using the Stomp
protocol. The library is named Stomppy. Perfunctory tests have been made to ensure
compatibility with the ActiveMQ, RabbitMQ and stompserver software. The authors of
the library have granted anyone the right to use the software according to the Apache
License 2.0.

Pyactivemq

Albert Strasheim’s Pyactivemq is a Python module that implements the JMS API
specification for the Python language. It wraps method calls to the ActiveMQ-CPP
module, which offers the native implementation. Pyactivemq is licensed under the
Apache Licence 2.0.

ActiveMQ-CPP

The Apache Foundation has made a C++ interface that is as close an equivalent to the
Java Message Service API as possible. The interface is called CMS (C++ Messaging
Service). The name is similar to JMS on purpose. The objective was to design a
specification that diverges only when the JMS specification depends on the Java
programming language. The implementation is called ActiveMQ-CPP. ActiveMQ-CPP
supports the Stomp and OpenWire protocols. (Apache Software Foundation 2009)

xmpppy

Alexey Nezhdanov’s xmpppy is a Python library that was intended to offer scripting for
Jabber. Jabber is an IM service based on XMPP. Even though the library was made for
instant messaging, it produces valid XMPP protocol packets that are accepted by the
ActiveMQ messaging server. Conversations are mapped to anonymous publish/subscribe
messages. Messages that are addressed to a chat room are mapped to a topic with the
same name. xmpppy is distributed under the terms of the GNU General Public Licence.

py-amqplib

Barry Pederson has written a non-threaded Python library that implements the AMQP
0.8 specification. py-amqplib has been tested for compatibility with RabbitMQ.

rabbitmq-c

Tony Garnock-Jones et al. offer the RabbitMQ C client, which is still experimental but is
stable. It is capable of making valid AMQP connections to a RabbitMQ messaging server.

3.3 Nokia Smart-M3 Interoperability Platform

The Nokia Smart-M3 Semantic Information Broker (SIB) is an RDF store that is used as a
key element of a higher abstraction level of a smart space. SSAP is an XML-based
language, which is used to communicate with Smart-M3. SSAP contains the CRUD
operations, but it also allows subscriptions. One of the key ideas is that a user’s agents
open a session to a named smart space. It is possible to have many of them.

Smart-M3 offers comprehensive programming interfaces. There is good support for
the libraries that have generally been used in Symbian or Maemo mobile
programming. Applications written using the GTK+/Glib libraries can use a Gnome KP
API, whereas Qt programs can use the Qt KP library written in C++.

As this thesis is part of the Devices and Interoperability Ecosystem³ (DIEM) project, the
treatment focuses mostly on the Python API widely used in the project.

The Smart-M3 applications consist of agents. The agents written in the Python
language use the Node library module of the Smart-M3 Python Knowledge Processor
framework to access the Smart-M3 Semantic Information Broker (SIB) (Figure 18).

To be exact, the connection is made to SIB-TCP, which in turn communicates with the
core of Smart-M3 via the D-Bus channel of the Linux system.

The smart space access protocol (SSAP) is an XML-based Smart-M3 specific protocol that
the agents of a smart space use for accessing the semantic information broker (SIB) of
the space. The protocol specifies basic operations for joining and leaving the space and
for inserting and removing triples, as well as for making RDF queries and subscriptions.
(Soininen, Liuha et al. 2010)

The Python Knowledge Processor uses SSAP messages to communicate. SSAP is a
language in which all the messages are independent of each other. Semantically, the
messages are pieces of dialogue, single statements, on the communication line in a
client-server type of architectural pattern.

3 http://www.diem.fi/ (referred 29 April 2010)
SIB-TCP acts as a mediator. It translates SSAP XML messages into the lower-level
language used in the D-Bus channel. SIB-TCP delivers the translated messages to SIBD,
which is a daemon process of the core of Smart-M3. That sequence is performed for each
command.

[Figure: the Python KP API communicates with Smart-M3 using SSAP over TCP]

Figure 18 The Nokia Python API implementation

During the initialisation phase of the library, a connector class is selected and its
parameters are set. Then the Python API operates in two layers. The API layer provides
an interface to the application program and transforms the programming calls into
actual messages to be sent over the network. The lower level handles the actual sending
of the message and parses the received message packets into an internal structure of
the library.

Since the connector class is selected separately every time the library is initialised
and it has a clear and limited interface, it is easy to extend the library in such a way
that it is capable of handling new ways of communicating. You only need to provide a
new connector class with the same interface. The connector class interface is shown
in Figure 19.
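The extension idea can be sketched in Python as follows. The class and method names are assumptions made for this illustration: they follow the steps named in Figure 21 (connect, send, receive, close) and are not copied from the Smart-M3 Python library.

```python
import queue

class Connector:
    """Hypothetical minimal connector interface."""
    def connect(self):
        raise NotImplementedError
    def send(self, message):
        raise NotImplementedError
    def receive(self):
        raise NotImplementedError
    def close(self):
        raise NotImplementedError

class QueueConnector(Connector):
    """A new way of communicating: in-memory queues instead of a socket."""
    def __init__(self, outbox, inbox):
        self.outbox, self.inbox = outbox, inbox
        self.connected = False
    def connect(self):
        self.connected = True
    def send(self, message):
        if not self.connected:
            raise RuntimeError("not connected")
        self.outbox.put(message)
    def receive(self):
        return self.inbox.get(timeout=1)
    def close(self):
        self.connected = False

def transact(connector, message):
    """Upper layer: works with any object that offers the same interface."""
    connector.connect()
    try:
        connector.send(message)
        return connector.receive()
    finally:
        connector.close()

outbox, inbox = queue.Queue(), queue.Queue()
inbox.put("reply")                      # pre-loaded answer for the demo
connector = QueueConnector(outbox, inbox)
answer = transact(connector, "request")
print(answer)
```

Because transact depends only on the four methods, a TCP-based and a queue-based connector are interchangeable from the point of view of the upper layer.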

Figure 19 The connector class interface

When a knowledge processor accesses a SIB, it repeatedly opens and closes
network connections, which causes an unnecessarily large number of TCP connections
to be established in any given time period. There are two distinct reasons for avoiding
this. First, opening a new TCP connection is a task that consumes both processor time
and real time at the level of the operating system. Second, open
connections allocate resources, which can be precious in tiny devices such as
mobile phones.

The overall connection pattern of the Python interface is depicted in Figure 20. The
graph shows the smallest possible communication sequence that is still valid. It starts
with initialising the ParticipantNode class, which is used for the connection. A discovery
of the connection method is then made. This allows the agent to select the
communication protocol to be used and to set up communication parameters such as the
IP address of the semantic information broker. The discovery is followed by the required
SSAP messages of JOIN and LEAVE. They follow the communication pattern that is
used for every message.

Figure 21 shows the repeating connection sequence pattern of each message
transaction. It consists of initialising the connection, getting the transaction ID for
the message, opening the network session, sending a message, and waiting for an
answer to it. The sequence is ended by closing the network session. The physical line
is opened and closed every time.
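The pattern of one TCP connection per message can be sketched with the socket module of the Python standard library. The toy server, the reply format, and the payloads JOIN and LEAVE are placeholders chosen for this illustration; they are not real SSAP messages and the code is not taken from the Smart-M3 library.

```python
import socket
import threading

def serve(listener, count):
    """Toy stand-in for the broker: answer `count` connections, one each."""
    for _ in range(count):
        conn, _addr = listener.accept()
        with conn:
            data = conn.recv(4096)
            conn.sendall(b"ack:" + data)

def transaction(address, message):
    """One message transaction: connect, send, receive, close."""
    with socket.create_connection(address, timeout=5) as sock:
        sock.sendall(message)
        sock.shutdown(socket.SHUT_WR)       # signal the end of the request
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:                   # the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)

listener = socket.create_server(("127.0.0.1", 0))   # any free local port
address = listener.getsockname()
worker = threading.Thread(target=serve, args=(listener, 2), daemon=True)
worker.start()

# Two messages cause two complete connection set-ups and tear-downs.
answers = [transaction(address, m) for m in (b"JOIN", b"LEAVE")]
worker.join(timeout=5)
listener.close()
print(answers)
```

Every call to transaction pays the full cost of establishing and closing a connection, which is the overhead discussed above.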

Figure 20 The basic Python Knowledge Processor communication sequence

[Figure: sequence diagram for one message: create TcpConnector, initialize, get tr_id,
connect, send, receive, close]

Figure 21 The repeating communication pattern

4 The Architectural Solution for Test Arrangements

In this chapter, the first sections describe the overall architectural solution and the
principles that were used. They are followed by sections that describe the selection
process that was used to minimise the solution set.

The first objective of this thesis was to provide software agents with message queue
access to Smart-M3. The message system is therefore located between the agents and
the semantic information broker of Smart-M3. However, as the messaging system does
not actively deliver messages to the destined clients, there was a need to write
mediator software for both ends of the messaging system. At the client end, the
Python KP API can be extended naturally using its own extension mechanism. I named
the other end SIBMQ. The name was selected to be similar to the existing SIB-TCP
module so as to indicate the similar functionality of the two software components. The
role of SIBMQ is to redeliver messages to the semantic information broker and vice
versa. The overall architecture is shown in Figure 22.

[Figure: the proposed architecture. A Python application uses NodeMQ.py (with
MQConnector and discovery.py) to exchange SSAP messages with the message queues
over MQ; SIBMQ connects the queues to Smart M3 over SSAP/TCP]

Figure 22 The proposed message queue implementation

The extension mechanism of the Python KP API interface was used to generate the API
interface that provides an application with transparent access to the message system. A
class diagram of the connector classes is shown in Figure 23.

All of the send or receive function calls map to single SSAP messages. The extended
library copies the payload of a TCP connection to the data part of separate
message queue messages. The structure of the dialogue is kept. That is, every TCP write
is substituted with the transmission of a new message. When messages are being
received, the same takes place correspondingly but in the opposite order.

Figure 23 The extended connector interface of the Python KP API

The dialogue between the Python KP library and the Smart-M3 semantic information
broker consists of separate XML messages. All of them are text strings that can be cut
and sent as separate elements. As they are text, they can be sent as payload on top of the
open MOM protocols such as STOMP, OpenWire, XMPP, and AMQP. This is
illustrated in Figure 24. The same payload can be saved to a message system container,
a destination. Of the alternative message payload types that JMS specifies, the
TextMessage format is suitable for SSAP messages. Figure 25 depicts that.
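As an illustration of carrying a text payload on top of one of these protocols, the following Python sketch builds a STOMP SEND frame by hand. The frame layout (command line, header lines, an empty line, the body, and a terminating NUL byte) follows the STOMP specification. The destination name and the XML payload are placeholders and do not represent a real SSAP message.

```python
def stomp_send_frame(destination, body):
    """Build a STOMP SEND frame that carries a text payload."""
    payload = body.encode("utf-8")
    head = "SEND\ndestination:%s\ncontent-length:%d\n\n" % (
        destination, len(payload))
    return head.encode("utf-8") + payload + b"\x00"

# Hypothetical queue name and placeholder payload.
frame = stomp_send_frame("/queue/example", "<placeholder/>")
print(frame)
```

The XML text travels unchanged in the body of the frame, so the receiving mediator can hand it on without interpreting it.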

[Figure: a producer agent writes an SSAP XML message (Message A) to a destination over
STOMP, OpenWire, XMPP, or AMQP, each carried on TCP; the SIBMQ consumer reads the
same XML message from the destination]

Figure 24 The Open MOM Protocol Interfaces

The message queues differ from the TCP connections in that they are always one-way.
This must be taken into consideration in design.
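One way to take the one-way nature into account is to use a pair of queues, one for each direction. The Python sketch below shows the idea with the queue and threading modules of the standard library. The mediator thread, the fake_broker function, and the message texts are hypothetical stand-ins for SIBMQ and the broker.

```python
import queue
import threading

to_broker = queue.Queue()    # agent -> mediator (one-way)
to_agent = queue.Queue()     # mediator -> agent (one-way)

def fake_broker(text):
    """Placeholder for the round trip to the semantic information broker."""
    return "reply to " + text

def mediator():
    """Read requests from one queue and write answers to the other."""
    while True:
        message = to_broker.get()
        if message is None:          # sentinel value: stop the mediator
            break
        to_agent.put(fake_broker(message))

thread = threading.Thread(target=mediator, daemon=True)
thread.start()

def call(text):
    """Emulate a two-way exchange with two one-way queues."""
    to_broker.put(text)
    return to_agent.get(timeout=5)

answer = call("JOIN")
to_broker.put(None)
thread.join(timeout=5)
print(answer)
```

With several agents, a single shared reply queue would not be enough. Each agent would need its own reply queue or a correlation identifier in the messages so that answers reach the right sender.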

[Figure: a producer agent writes XML Message A to a destination and the SIBMQ consumer
reads it; the JMS message types TextMessage, MapMessage, BytesMessage,
StreamMessage, and ObjectMessage are listed]

Figure 25 SSAP messages can be set to a queue as a TextMessage

The above-mentioned facts make it possible to reroute the agent messaging via a
message system. As the actual SSAP messages and the application interface are
untouched, the solution is transparent to the application. That, concisely, is the whole
solution.

The idea of equivalent interfaces is shown in Figure 26. NodeMQ.py here refers to an
extended Python library that is capable of connecting to Message Queue services.

The API shown to the programmer is kept equivalent to the original one. The
equivalence of the implementations means the new Python API is identical to the
original one and thus can be used to replace it in the applications.

Keeping the SIB-TCP interface untouched was not strictly necessary. However, as the
aim was to create a loosely coupled solution and to keep a clear published interface, a
deeper integration was not the best alternative. As the SIB-TCP interface is kept
unmodified, there is no need to change the Nokia SSAP/TCP interface that Smart-M3
SIB-TCP offers. More detailed discussion is provided in the following section.

Figure 26 The equivalence of the implementations

The Python library is offered to give a familiar API to Python developers. As the
interface is kept identical to the original Smart-M3 implementation, old applications do
not need to be rewritten to gain the benefits of the messaging
solution. However, nothing prevents the creation of applications that communicate
with the queues directly.

4.1 The architectural choice of SIBMQ

If SIBMQ were connected to SIBD directly by using D-BUS, the architecture thus
designed would probably be somewhat faster, fewer TCP connections would be used,
and there would be fewer separate processes. However, no great performance
optimisation can be achieved here.

If SIBMQ connects to SIB-TCP in the same way as the Python library does but uses only
local connections, the loss of performance can be assumed to be moderate and linear.
Thus, it does not cause distortion when the benefits of the whole message queue
solution are being estimated.

Nevertheless, SIB-TCP could be modified without big changes to receive MQ messages
instead of TCP connections. Figure 27 shows the structure of SIB-TCP. The functions
that handle reading and writing TCP data are shown in yellow. The functions that
handle the TCP socket listeners are shown in green and red. There is no need to go into
the source code level here. The main point is that the accept, send, and receive
methods of TCP API functions can be replaced by message queue aware substitutes.
Only the part of the program that is shown is affected.

Figure 27 The structure of SIB-TCP

There are factors that favour using the SSAP/TCP connection of SIB-TCP. The protocol
used in the D-BUS is not openly documented, at least, not to my knowledge. Using the
documented interface of SIB-TCP and its XML-based language makes the solution more
loosely coupled. Furthermore, the added modularity improves its ability to withstand
changes. The solution is presented in Figure 28.

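[Figure: the proposed architecture. A Python application uses NodeMQ.py (with
MQConnector and discovery.py) to exchange SSAP messages with the message queues
over MQ; SIBMQ connects the queues over SSAP/TCP to SIB-tcp, which communicates
with SIBD inside Smart M3 over D-bus]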

Figure 28 SIBMQ using the SIB-TCP connection

4.2 The Resulting Design

The architecture was described at the outline level above. To implement that design,
the software and technique to be used have to be chosen. In the preliminary study, a
set of techniques was found to implement the message queue architecture. The
following steps were taken to refine that into a software architecture.

1. To visualise the alternatives, a concept map was drawn. That concept map is
shown in Figure 29.
2. The map was exported to a text file that contained the proposition triples of
the map.
3. A Java program was made to build up a search tree and solutions were printed.
4. A set of criteria was defined and a Java program was used to count the points
associated with the criteria rules.
5. A solution set was selected so that it includes all the major protocol variants.
6. A Java program was used to create a new proposition file of the selected set.
7. CmapTools was used to draw the propositions.
8. The design was refined manually.

I describe the steps in detail in the following sections.

Figure 29 The concept graph of the implementation choices.

The software components, which are shown in Figure 29, are open source and easily
available. The pieces of code certainly do not cover all the implementation alternatives
but they are good representatives of the techniques and they cover all the major open
standard protocols used in message queue systems.

The numerous solutions offer functionalities that overlap. Some of them are
unnecessarily complex. As I had limited time and resources available, it was essential
to select a subset that provides the most relevant information about the performance
aspects of protocols and the libraries used. This was accomplished using the following
method.

CmapTools was used to draw the concept graph. It can also export all the relations to
a text file in which each row contains one proposition triple.

In the next step, a Java program was made to build up a search tree that superimposed
the states of the propositions. A depth-first strategy was used but, in contrast to the
search strategy presented by (Russell, Norvig et al. 1995), all the solutions that were
found were printed.

The starting point of the traversal was set to NodeMQ.py, the new Python library. The
paths describe the possible solution alternatives. There were 25 of them.
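
The traversal described above can be illustrated with a minimal Python sketch. It is not the Java program used in the study. The tab-separated row format and the triples in TRIPLES are assumptions made for illustration only; they do not reproduce the actual concept map. The sketch walks the graph depth-first from NodeMQ.py and collects every complete path instead of stopping at the first one.

```python
from collections import defaultdict


def parse_triples(text):
    """Parse rows of the assumed form 'concept<TAB>linking phrase<TAB>concept'."""
    graph = defaultdict(list)
    for row in text.strip().splitlines():
        source, link, target = row.split("\t")
        graph[source].append((link, target))
    return graph


def all_paths(graph, start):
    """Depth-first traversal that yields every path ending at a leaf concept."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        # Skip concepts already on the path so that cycles cannot loop forever.
        successors = [t for _, t in graph.get(node, []) if t not in path]
        if not successors:
            yield path
            continue
        # Push in reverse so that the first successor is explored first.
        for target in reversed(successors):
            stack.append((target, path + [target]))


# Hypothetical proposition triples, not the ones exported from Figure 29.
TRIPLES = "\n".join([
    "NodeMQ.py\tuses\tStomppy",
    "NodeMQ.py\tuses\tpy-amqplib",
    "Stomppy\tconnects to\tActiveMQ",
    "py-amqplib\tconnects to\tRabbitMQ",
    "ActiveMQ\tis read by\tSIBMQ",
    "RabbitMQ\tis read by\tSIBMQ",
])

paths = list(all_paths(parse_triples(TRIPLES), "NodeMQ.py"))
for path in paths:
    print(" -> ".join(path))
```

Each printed path corresponds to one solution alternative in the sense used above.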

Because the best alternatives were not obvious, a list of criteria was named to make it
easier to extract the best candidates. The criteria were:

• unnecessary programming and software installation should be avoided;
• all the major open standard message queue application protocols have to be
represented, and
• the solution should have as few steps as possible.

The same Java program was used to count the points associated with the criteria rules.
The list was sorted according to the resulting values. The values are not absolute, but
the order suggests a possible solution. The result set was made wide enough to include
all the major protocol variants. The entire list of solutions is shown in Table 1. The
minimal subset of all the results is presented in green. The grey lines show the
equivalent solutions that had the same rank.

Table 1 The chosen set of technical solutions.

It was easy to modify the traversal Java program to list the propositions associated
with the individual solutions.

A new proposition file was created. Luckily, CmapTools is not only capable of writing
proposition files. It is also able to read the recreated and filtered propositions and
place the net of elements neatly on the screen.

The minimal resolution set is shown in Figure 30. Correspondingly, the whole solution
set is presented in Figure 31.

Figure 30 The minimal resolution set

Figure 31 The visualization of the whole resolution set

The final architectural design was created manually. Neither of the above-mentioned
solution sets was used alone, for two reasons. I wanted to keep the number of
programming languages limited, and Apache Qpid was pruned out of the design because its
functionality is redundant and the surplus value it offers is small. Furthermore, the
Stomppy library was added because, during the programming phase, it proved to be one of
the easiest to use. The Node adaptation that used that library was the first one to be
created; the others took more time. That is why including Stomppy was worthwhile.

The manually joined and cut resolution is shown in Figure 32. In practice, to implement
the design, the following software components needed to be written:
• NodeMQ version that uses the Stomppy library to connect to ActiveMQ
• NodeMQ version that uses the pyactivemq library to connect to ActiveMQ
• NodeMQ version that uses the XMPPY library to connect to ActiveMQ
• NodeMQ version that uses the py-amqplib library to connect to RabbitMQ
• SIBMQ version that uses the ActiveMQ-CPP library to connect to ActiveMQ
• SIBMQ version that uses the rabbitmq-c library to connect to RabbitMQ

The functionality of the separate solutions can be kept apart in detached program
modules to avoid loading unnecessary libraries into memory. That principle helps
in estimating the performance factors of the individual solutions.

Figure 32 The final architecture

5 Tests and Methods

5.1 The General Picture

The test system was run on a Lenovo laptop computer. All the tests were run on Linux
Mint, which was run on Sun VirtualBox in the Microsoft Windows XP operating system.

Apache ActiveMQ 5.3 | RabbitMQ 1.5.4
Python 2.6.2 | Java OpenJDK 1.6.0_0 | Erlang R12B-5 | Smart-M3 0.9.2
Linux Mint 7, kernel 2.6.28.17-generic
Sun VirtualBox 3.1.4
Microsoft Windows XP 2002 Service Pack 3
Intel Core 2 CPU 2.0 GHz, 2.0 GB RAM
Lenovo T60p Type 8742
Figure 33 The layers of the test environment

The layers of the configuration are shown in Figure 33.

As decided in Chapter 4.2, the Apache ActiveMQ 5.3 and RabbitMQ 1.5.4 messaging
systems were installed and used.

The overall RabbitMQ and ActiveMQ test arrangements are depicted in Figure 34 and
Figure 35.

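[Figure 34 diagram: the Python application uses NodeMQ.py (with MQConnector and
discovery.py) in one of three variants: NodeMQ for STOMP, NodeMQ for OpenWire, or
NodeMQ for XMPP. Messages travel through ActiveMQ topics to SIBMQ for ActiveMQ, which
connects to SIB-tcp; SIB-tcp communicates with SIBD over D-Bus inside Smart-M3.]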

Figure 34 The ActiveMQ test arrangements

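[Figure 35 diagram: the Python application uses NodeMQ.py (with MQConnector and
discovery.py) in the NodeMQ for AMQP variant. Messages travel through RabbitMQ topics to
SIBMQ for RabbitMQ, which connects to SIB-tcp; SIB-tcp communicates with SIBD over D-Bus
inside Smart-M3.]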

Figure 35 The RabbitMq test arrangement

The test constructions were made in three phases. During the first phase, the libraries
were used to make test connections that wrote to and read from a message queue and a
topic. The next phase concentrated on constructing the routing of the messages to
Smart-M3 and back. During the last phase, the Python KP library of Smart-M3 was modified
to use the tested library constructions from the first phase to forward the application
messages to the messaging system.

That modification was made using the existing extension system of the library. The
connection discovery service was adapted to recognise the new messaging connection
method alongside the existing TCP connection.

The results of the construction are shown in Figure 36, Figure 37, Figure 38, and Figure
39. The source code of the Python KP modification is available in Appendices A to D,
starting at page 102.

[Figure 36 diagram: the sample application and the Python KP use Stomp.py 2.3 on Python
2.6.2; the broker is Apache ActiveMQ 5.3 on Java OpenJDK 1.6.0_0; SIBMQ for ActiveMQ
uses ActiveMQ CPP version 3.0.1 and connects to Smart-M3 0.9.2.]

Figure 36 The Stomp arrangement

[Figure 37 diagram: the sample application and the Python KP use Pyactivemq 0.1.0 on
Python 2.6.2; the broker is Apache ActiveMQ 5.3 on Java OpenJDK 1.6.0_0; SIBMQ for
ActiveMQ uses ActiveMQ CPP version 3.0.1 and connects to Smart-M3 0.9.2.]

Figure 37 The OpenWire arrangement

[Figure 38 diagram: the sample application and the Python KP use Xmppy 0.5.0 rc1 on
Python 2.6.2; the broker is Apache ActiveMQ 5.3 on Java OpenJDK 1.6.0_0; SIBMQ for
ActiveMQ uses ActiveMQ CPP version 3.0.1 and connects to Smart-M3 0.9.2.]

Figure 38 The XMPP arrangement

[Figure 39 diagram: the sample application and the Python KP use py-amqplib 0.6 on
Python 2.6.2; the broker is RabbitMQ 1.5.4 on Erlang R12B-5; SIBMQ for RabbitMQ uses
the RabbitMQ C client 81dfceb1b769 and connects to Smart-M3 0.9.2.]

Figure 39 The AMQP arrangement

The sections above introduce the general test arrangements. Later in this chapter, in
subchapter 5.2, a performance model of the system is constructed, together with tests
to evaluate its factors, and some raw results are shown. The objective of the model is
to create a general view of how time is consumed in the elements of the system. Not only
can the model suggest which of the proposed architectures is preferred, but it also
gives guidance for future work and a way to evaluate new architectural solutions.

Chapter 5.3 shows the estimated person-time for each test arrangement.

5.2 Estimating the Performance

The tests were built on the assumption that the pure publish/subscribe model of
communication is the most suitable for a system that uses the subscription concept at
the application level. A solution with a single communication model is also easier to
maintain than mixed queue-based and topic-based solutions. Thus, the estimated solutions
use the topic channel mode of communication only.

However, mixed implementations were also constructed for the Stomp protocol. That
separate result is presented in 5.2.7 and can be used to validate the choice that has
been made.

5.2.1 The Performance Model of the System

The major subsystems in the proposed architecture are the Smart-M3 agent
application code, the Python connection library, the network, the MQ connection
library, the MQ provider software, SIBMQ, and Smart-M3. Here the total time spent on
the application is distinguished from the time spent on the application code. In a
single-threaded system, one might think that the times used in the subsystems are
mutually exclusive and that a total time can be obtained by adding them up. However,
the system here is inherently multithreaded and some parts of it are event-based. This
means that creating a model is not straightforward.

Benchmark runs are usually batch processes that can be repeated and that have
characteristics that represent a typical program run. The control thread of the
program orchestrates the whole functionality of the application. This thread also runs
for the whole time that the application is running. If this represents the user’s view
of the application run, it is important to model the time consumption characteristics
of that thread. It offers a timeline view of a “cross-section” of the operation of the
system.

The above-described model can be formulated as a mathematical equation. The terms of
the equation are listed below.

Total       = Total time for the application
AC          = Time spent on the application code, including library calls that are
              not MQ related
NMQcode     = Time spent on the MQ interface code of the Python library
NodeMQ      = Time consumed by the Python Node module when sending or receiving data
              to and from the MQ
NMQlib      = Time consumed by the Python MQ library
Nnet        = Time spent on the network between the library and the MQ provider
              service
MQ          = Time spent on the MQ provider
MQSIBMQnet  = Time spent on the network between the MQ provider and SIBMQ
SIBMQ       = Time spent on the SIBMQ
SIBMQcode   = Time spent on the SIBMQ code
SIBMQlib    = Time spent on the MQ library at SIBMQ
SIBMQSIBnet = Time spent on the network between SIBMQ and Smart-M3
SmartM3     = Time spent on the Smart-M3

As SIBMQSIBnet and MQSIBMQnet can be assumed to be local connections, the time
spent on them is small and can be ignored.

The model is thus:


SIBMQ = SIBMQlib + SIBMQcode
NodeMQ = NMQcode + NMQlib
Total = AC + NodeMQ + Nnet + MQ + SIBMQ + SmartM3

It may be impossible to estimate all the factors in a reasonable time. For that reason,
this can be seen mainly as a mental framework for the study.
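
As a small worked step, the two component definitions can be substituted into the total, which makes explicit that the model has eight measurable terms once the two local network terms are ignored. The LaTeX below is only a restatement of the three equations above and uses the same symbols.

```latex
\begin{align*}
\mathit{Total} &= \mathit{AC} + \mathit{NodeMQ} + \mathit{Nnet} + \mathit{MQ}
                  + \mathit{SIBMQ} + \mathit{SmartM3} \\
               &= \mathit{AC} + (\mathit{NMQcode} + \mathit{NMQlib}) + \mathit{Nnet}
                  + \mathit{MQ} + (\mathit{SIBMQlib} + \mathit{SIBMQcode})
                  + \mathit{SmartM3}
\end{align*}
```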

5.2.2 The Benchmark

Karl Huppler (Huppler 2009) states that a good benchmark has to have the following
characteristics, at least to some extent: it has to be relevant, repeatable, fair,
verifiable, and economical.

To my knowledge, there is no standard test for evaluating the performance of
message-oriented middleware protocols in smart space access. However, I use a
strategy that is commonly used. By running an application that is a good representative
of a typical Smart-M3 application, we can roughly estimate the changes to the
performance of the system caused by introducing the messaging system, server
software, and libraries.

basic_test.py is a sample application delivered as a part of the Smart-M3 software
distribution. It has its advantages. It is publicly available, it covers all the SSAP
message types, it is easy to run repeatedly, and it has enough transactions. It is
debatable whether it represents a typical Smart-M3 application. An average application
would probably have all the same elements, but it would run for longer and it could
emphasise the subscriptions more. However, it is beyond the scope of this study to find
a perfect representative.

basic_test.py, as a benchmark, has many of the characteristics Huppler expects of a
good benchmark. By running basic_test.py, the typical communication pattern between a
client and a Smart-M3 smart space can be timed. In that way, the test is relevant. As it
cleans its data from the semantic database at the end of the run, it can be repeated in
the same way time and again. All the message-oriented protocols are handled in the same
way. In that way, it is also fair. Its verifiability is not easy to prove, but the
benchmark is simple and, as such, it can be regarded as self-verifying. Finally, the
test is economical, since the source is publicly available free of charge at
SourceForge.

Here I use basic_test.py as a starting point for the evaluation of the libraries. Later, in
the following chapters, I will go deeper into analysing how the run time is divided in
different implementations.

5.2.3 The Benchmark Tests

The Linux time command was used to measure the run time of the program. The
application logic was left untouched, but the files basic_test.py, Node.py, and
discovery.py, which are part of the Smart-M3 software distribution, were modified so
that the application could be run in one batch run without any human interaction. That
was done to eliminate the random delay caused by a test person using the user
interface. In addition, the accuracy of the time command was validated by comparing its
values against those measured by a Java program that executed the same program. The
Java program calculated the run time by comparing the system time, in milliseconds, at
the start and at the end of the run. The two measurement methods gave comparable values.
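
The wall-clock measurement idea can be sketched in Python as follows. This is an illustration only, not the Java program used in the study; the function name timed_run and the placeholder command are assumptions. The sketch records the clock before and after a child process runs to completion and reports the difference in milliseconds.

```python
import subprocess
import sys
import time


def timed_run(command):
    """Run a command to completion and return the wall-clock time in milliseconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return (time.perf_counter() - start) * 1000.0


# Placeholder command; a real measurement would launch the benchmark script instead.
elapsed_ms = timed_run([sys.executable, "-c", "pass"])
print("run time: %.1f ms" % elapsed_ms)
```

Like the time command, this measures the whole process, including interpreter start-up, so it is suitable for comparing arrangements rather than for absolute timing.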

The benchmark tests are the following:


Test arrangement 1: the basic_test.py application is run using the NodeMQ
module, which uses the Stomppy library to connect to ActiveMQ. All the
communication takes place using subscriptions to topics. The topics are used
as destinations, so the application and the space have their own topics and a
new topic is created for each subscription. SIBMQ relays messages from
ActiveMQ to Smart-M3. This program uses the ActiveMQ-CPP library to access
the message channels of ActiveMQ. The message queue middleware protocol
is Stomp. The source code of the implementation is presented in Appendix A:
The NodeMQ Stomp interface implementation.

Test arrangement 2: the basic_test.py application is run using the NodeMQ
module, which uses the pyactivemq library to connect to ActiveMQ. All the
communication takes place using subscriptions to topics. The topics are used
as destinations, so the application and the space have their own topics and a
new topic is created for each subscription. SIBMQ relays messages from
ActiveMQ to Smart-M3. This program uses the ActiveMQ-CPP library to access
the message channels of ActiveMQ. The message queue middleware protocol
is Apache ActiveMQ OpenWire. The source code of the implementation is
presented in Appendix B: The NodeMQ OpenWire interface implementation.

Test arrangement 3: the basic_test.py application is run using the NodeMQ
module, which uses the XMPPY library to connect to ActiveMQ. All the
communication takes place using subscriptions to topics. The topics are used
as destinations, so the application and the space have their own topics and a
new topic is created for each subscription. SIBMQ relays messages from
ActiveMQ to Smart-M3. This program uses the ActiveMQ-CPP library to access
the message channels of ActiveMQ. The message queue middleware protocol
is XMPP. The source code of the implementation is presented in Appendix C:
The NodeMQ XMPP interface implementation.

Test arrangement 4: the basic_test.py application is run using the NodeMQ
module, which uses the py-amqplib library to connect to RabbitMQ. All the
communication takes place using queues and bindings to the queues. The
bindings are used as destinations. For each subscription, a new binding is
created. The binding here offers the same functionality as the topics offer in
the Message-oriented Middleware context. SIBMQ relays messages from
RabbitMQ to Smart-M3. This program uses the rabbitmq-c library to access
the message channels of RabbitMQ. The message queue middleware protocol
is AMQP. The source code of the implementation is presented in Appendix D:
The NodeMQ AMQP interface implementation.

The TCP/IP reference arrangement: the basic_test.py application is run using
the original and unmodified Node module, which connects directly to SIB-TCP.
No message queue implementation is used.

The first benchmark test was run five times with each of the above-mentioned test
arrangements. To be able to compare the results of the message queue implementations
with the original TCP arrangement, identical test runs were also made for the reference
arrangement. The average value of the data was calculated to reduce the effect of
random measurement errors. The sets are very small here. Therefore, the results are
mainly indicative. Figure 40 shows the average run times of the benchmark for each test
arrangement.

[Figure 40 chart: "The Benchmark test"; y-axis: the average runtime in seconds (0.0 to 4.0)]

                                       Test 1:  Test 2:   Test 3:  Test 4:  The reference
                                       STOMP    OpenWire  XMPP     AMQP     arrangement
The average benchmark time             0.9      1.0       1.9      1.5      3.5
The standard deviation of the values   0.05     0.03      0.52     0.03     0.03

Figure 40 The average benchmark times

In the first benchmark test, I showed that the arrangements that used a messaging
system had better benchmark times on average. To test whether these differences are
statistically significant, I made 35 sample runs for each test arrangement. The null
hypothesis was that the measured difference occurred by chance. The question was the
following: what is the probability that the average value measured for a test
arrangement could occur within the variation of the original TCP arrangement? Welch's
t-test answers that question.

Welch's t-test was used to test the null hypothesis. I decided that, if the calculated
p-value is below a threshold chosen for statistical significance, then the null
hypothesis is rejected in favour of the alternative hypothesis, which means that the
difference is statistically significant. I used the arbitrarily chosen threshold
p-value of 0.05 here.

All the test arrangements were evaluated. The calculated p-values are shown in Table
2. All the values are less than the set threshold value by orders of magnitude. This
means that all the messaging system implementations were significantly faster than the
original TCP arrangement. The corresponding graph is shown in Figure 41.

Welch's t-test was used to statistically estimate the probability that the improvements
that are shown were caused by chance. This test is a variant of Student’s t-test, which
was introduced by William Sealy Gosset in 1908. Welch’s t-test is an appropriate
solution to the Behrens–Fisher problem (Best, Rayner 1987), which describes this case
well. The Behrens–Fisher problem concerns testing a hypothesis about the difference
between the means of two normally distributed sets of measurements when the variances
of the sets are not assumed to be equal and the samples are independent. Here it can be
assumed that the samples are normally distributed. Furthermore, it is reasonable to
assume that the measurements are independent. The sample size per test arrangement was
35. That value was selected based on (Dallal 2008), which states that a sufficient
sample size per group is 30 when the two populations are roughly normally distributed.
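
To make the test concrete, the following Python sketch computes the Welch t statistic and the Welch–Satterthwaite degrees of freedom from summary statistics. It is an illustration, not the calculation tool used in the study. The input values are the Test 1 and TCP columns of Table 2; the function name welch_t is an assumption. Converting the statistic to a p-value additionally requires the cumulative distribution function of Student's t distribution, which is left out here.

```python
import math


def welch_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom.

    var_a and var_b are sample variances; n_a and n_b are the sample sizes.
    """
    se_a = var_a / n_a            # squared standard error of mean A
    se_b = var_b / n_b            # squared standard error of mean B
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Test 1 (STOMP) against the TCP reference, values in milliseconds from Table 2.
t_stat, dof = welch_t(807.0, 3188.0, 35, 3364.0, 4490.0, 35)
print("t = %.2f, degrees of freedom = %.1f" % (t_stat, dof))
```

A t statistic this far from zero with roughly 66 degrees of freedom is consistent with the extremely small p-values reported in Table 2.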

The results of the extended benchmark test were comparable to those of the first test.
While the Linux time command was used in the first test, the new Java program was used
here. The inherent garbage collection of Java may explain the slightly raised variance
in the measurements. The average values of the sets are alike. The first test was run
manually, while this one was measured automatically. This, and the larger sample, may
explain the lower variation in the third test set here.

This showed that the results are significantly better. The new implementations are
clearly faster, which is curious since typically the performance of a system will not
increase when more components are added to it. The performance of the individual
software components needed to be analysed in more detail.

Table 2 The results of the Welch's t test

                                  Test 1:      Test 2:      Test 3:      Test 4:      TCP
                                  STOMP        OpenWire     XMPP         AMQP
Average runtime in milliseconds   807          855          1609         1655         3364
Variance                          3188         21181        35063        13455        4490
Count of runs                     35           35           35           35           35
Estimated p-value                 2.77184E-90  3.45798E-56  9.06142E-41  1.9529E-57   1

[Figure 41 chart: "The Extended Benchmark Test"; y-axis: time in seconds (0 to 4)]

                             Test 1:      Test 2:      Test 3:      Test 4:      TCP
                             STOMP        OpenWire     XMPP         AMQP
Average runtime in seconds   0.807179555  0.855181584  1.609173638  1.654944363  3.363611231
Standard deviation           0.056463984  0.145537006  0.18725098   0.115997052  0.067004302

Figure 41 The extended benchmark test

5.2.4 The Network Data

A relaying program was written to measure the amount of data sent on the line and to
estimate the time spent on sending each message. The implementation was
straightforward. The bytes received from the application programs were relayed
unmodified to the destination port of the MQ service and, vice versa, the bytes received
from the message queue provider were relayed unmodified to the application.

The TCP connection characteristics of each test arrangement were estimated. The
input and output data counts and the number of TCP connections used were counted.
Figure 42 shows the statistics.

[Figure 42 chart: "The Network Data Count"; y-axis: number of bytes (0 to 45000)]

                       Test 1:  Test 2:   Test 3:  Test 4:  The reference
                       STOMP    OpenWire  XMPP     AMQP     arrangement
Total bytes sent       9325     15313     16910    10640    8426
Total bytes received   18670    20827     41707    18446    14942
TCP connection count   3        1         3        3        17

Figure 42 The network data count

The time the system takes to handle a message was estimated as follows. A time stamp
was printed out when the proxy finished sending a message, and a second one was printed
when the reception of a message from the message queue provider started. The time
stamps gave the system time in milliseconds. The time used for the estimation was the
difference between the two values.

The results were interesting. They are shown in Table 3. As expected, handling a single
message takes more time with the new implementations. This is a realistic outcome,
since the message queue implementations use the same communication methods of
SIB-TCP as the original implementation does. The only difference is the added software
in between.

However, it was equally interesting to find out that the total run time was still much
shorter than in the TCP implementation. The only explanation is that the application
and the libraries it uses run more slowly in the original implementation. The
numerous TCP connections it uses can explain that well, since they can cause overhead.

One more thing was interesting. The timestamps exposed the extra time the OpenWire
and XMPP protocols needed to initialise the MQ connection.

Table 3 The time the system takes to handle the sample messages

basic_test benchmark              The reference TCP   Test 1:   Test 2:    Test 3:
                                  arrangement         STOMP     OpenWire   XMPP
Beginning                         0                   0         233        1257
JOIN                              60                  62        48         223
SUBSCRIBE                         89                  133       87         98
INSERT                            49                  249       114        87
UNSUBSCRIBE                       28                  34        61         159
Total time used in milliseconds   4596                1664      1862       4428

5.2.5 Evaluating the effect of the Network Bandwidth

The network traffic of the system depends on the kind of network that is used. The
system can be divided into two distinct parts. The message queue providers, SIBMQ
and Smart-M3, are services. It is natural to locate them in the same local network. On
the other hand, the agent software written in Python and the related Python modules
can be seen as a separate unit and thus it can be placed apart.

It is clear that the bandwidth affects the effectiveness of the system. To estimate how
much and in which ways a narrow bandwidth can affect the system, a TCP connection
mediator program was written. The program simulates a slow network in three ways.
First, it receives and sends data in 100-byte packets; the packet size was an arbitrary
choice. Second, the network latency was simulated by adding a chosen delay before
each sent or received packet. That delay refers to the time the data spends on the line.

Finally, semaphores were added to the sending and receiving routines so that, at any
given time, only one thread was able to send data and only one thread was able to
receive data. That ensured that the measuring program took the supposed maximum
capacity of the virtual network into consideration.
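
The three mechanisms can be sketched for one direction of traffic in Python. This is a minimal illustration under stated assumptions, not the Java mediator used in the study: the names relay and demo are hypothetical, a lock stands in for the semaphore, and a local socket pair replaces the real application and MQ provider connections.

```python
import socket
import threading
import time

CHUNK_SIZE = 100  # bytes per simulated packet, as in the text


def relay(source, destination, delay_s, line_lock, counter):
    """Copy bytes from source to destination in CHUNK_SIZE pieces.

    Every piece is delayed by delay_s seconds, and line_lock lets only one
    thread use this direction of the simulated line at a time.
    """
    while True:
        chunk = source.recv(CHUNK_SIZE)
        if not chunk:                      # the peer closed its sending side
            break
        with line_lock:
            time.sleep(delay_s)            # simulated time spent on the line
            destination.sendall(chunk)
        counter[0] += len(chunk)           # count the relayed bytes
    destination.shutdown(socket.SHUT_WR)


def demo(payload, delay_s=0.001):
    """Push payload through one relay thread and return what arrives."""
    client, mediator_in = socket.socketpair()
    mediator_out, server = socket.socketpair()
    counter = [0]
    worker = threading.Thread(
        target=relay,
        args=(mediator_in, mediator_out, delay_s, threading.Lock(), counter))
    worker.start()
    client.sendall(payload)
    client.shutdown(socket.SHUT_WR)
    received = b""
    while True:
        data = server.recv(4096)
        if not data:
            break
        received += data
    worker.join()
    for sock in (client, mediator_in, mediator_out, server):
        sock.close()
    return received, counter[0]


received, relayed_bytes = demo(b"x" * 250)
print(len(received), relayed_bytes)
```

A full mediator would run a second relay thread, with its own lock, for the opposite direction.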

Figure 43 shows the test arrangement. In that figure, the red box shows the measuring
program. The local network of the server side is shown with light blue shadowing. The
class diagram of the program that was constructed is shown in Figure 44.

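[Figure 43 diagram: the TCPMediator tester is placed between the Python application
side (NodeMQ.py with MQConnector and discovery.py) and the server side, which consists
of the message service with its topics, SIBMQ, and Smart-M3 (SIB-tcp and SIBD connected
over D-Bus).]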

Figure 43 The network test arrangement

Figure 44 The class diagram of the TCP mediator tester

As Java garbage collection can have an effect on the measurements, the program was
written so as to take that into account. Before every test cycle, a new garbage
collection was triggered. The idea here is that triggering from the program code would
minimise the risk that it would take place while the measurement was taking place.

An automatic test run was performed for each test arrangement. At the beginning of
each test cycle, a new delay value in the range from 0 to 55 milliseconds was set
randomly. When 300 measurements had been made, they were sorted. Microsoft Excel was
used to create an XY scatter graph for each set, and that graph was used to calculate a
linear regression trend line and its equation. Linear regression was used here since
the phenomenon was believed to be linear.

The actual measurements are shown in Table 6 in Appendix E: The effect of the
Network Bandwidth.

The goodness of the regression type selection was tested by calculating the coefficient
of determination (R²), which is derived from the sum of the squares of the vertical
deviations, for the first test arrangement to estimate how closely the estimate fits
the sample. As R² values range from 0 to 1 and R² = 1 represents a perfect fit, it was
interesting to see that the power, logarithmic, and exponential equations give notably
poorer R² values than the linear one. On the other hand, polynomial equations of order
4 or 6 offer the best fit. However, these can lead to overfitting, which means that the
equation models the sample values well but not the phenomenon itself, and thus it is
not as good at predicting future values. The calculated R² values are shown below. To
summarize, the linear type of regression was estimated to be a good candidate for
modelling the sample.

R² values for the Test 1 samples

Linear equation fit: R² = 0.8696
Exponential equation fit: R² = 0.8264
Logarithmic equation fit: R² = 0.6809
Polynomial order 2: R² = 0.8698
Polynomial order 4: R² = 0.8882
Power equation fit: R² = 0.831
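
For readers who want to reproduce this kind of fit outside a spreadsheet, the following Python sketch computes an ordinary least-squares line and its R² value. It illustrates the standard formulas and is not the Excel calculation used in the study. The sample points are synthetic: they are generated from the Test 1 trend line reported in Figure 45, so the fit recovers that line exactly.

```python
def linear_fit(xs, ys):
    """Return the least-squares slope and intercept of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic points on the Test 1 trend line; real samples would scatter around it.
delays = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
runtimes = [263.12 * d + 3586.0 for d in delays]

slope, intercept = linear_fit(delays, runtimes)
fit_quality = r_squared(delays, runtimes, slope, intercept)
print(slope, intercept, fit_quality)
```

With measured data the residual sum of squares is no longer zero, and R² drops below 1 as in the list above.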

The sample values are plotted and the calculated trend lines of the test sets are shown
in the figures below.

[Figure 45 chart: Test 1: STOMP. Runtime in milliseconds (y-axis) against delay in
milliseconds (x-axis) with the linear trend line y = 263.12x + 3586, R² = 0.8696.]

Figure 45 The regression line of the Stomp test arrangement

[Figure 46 chart: Test 2: OpenWire. Runtime in milliseconds (y-axis) against delay in
milliseconds (x-axis) with the linear trend line y = 357.39x + 4396.3, R² = 0.8436.]

Figure 46 The regression line of the OpenWire test arrangement

[Figure 47 chart: Test 3: XMPP. Runtime in milliseconds (y-axis) against delay in
milliseconds (x-axis) with the linear trend line y = 588.84x + 6955, R² = 0.8646.]

Figure 47 The regression line of the XMPP test arrangement

[Figure 48 chart: Test 4: AMQP. Runtime in milliseconds (y-axis) against delay in
milliseconds (x-axis) with the linear trend line y = 300.15x + 5062.8, R² = 0.8356.]

Figure 48 The regression line of the AMQP test arrangement

[Figure 49 chart: The reference arrangement (TCP). Runtime in milliseconds (y-axis)
against delay in milliseconds (x-axis) with the linear trend line
y = 233.34x + 6343.9, R² = 0.8193.]

Figure 49 The regression line of the reference test arrangement

The x coefficients of the trend line equations can be assumed to be comparable to the
amount of data sent on the line per unit of information, i.e. per message. The time
needed to send data on a slower telecommunication line depends on the amount of data
the protocol sends and thus should be visible in the slopes (derivatives) of the trend
lines.

Figure 50 shows the comparison. Most of the values here fit very neatly to the
estimated trend line. This supports the theory, which is not a surprise since the
network-simulating model is built on that assumption. However, the deviations are very
interesting. Test 2 deviates noticeably from the trend line. A straightforward
conclusion is that the OpenWire solution is prone to extra delays caused by factors
that are as yet unknown. Still, the deviation is not large enough for the conclusion
to be self-evident.

[Figure 50 chart: the x coefficient of the trend line equation (y-axis) against the
amount of data sent in bytes (x-axis), with points for TCP, Stomp, AMQP, OpenWire, and
XMPP.]

Figure 50 The comparison of the amount of data sent per unit of information to the
derivatives of the estimated trend lines

All the trend lines of the test sets are shown in Figure 51 and Figure 52. The graphs show
that the bandwidth has a clear effect on the runtime values.

The delay values at which the trend line of each test set crosses the trend line of the
reference arrangement were calculated.

[Figure 51 chart: runtime in milliseconds (y-axis) against delay in milliseconds
(x-axis, 0 to 90) with the linear trend lines of the reference arrangement and Tests 1
to 4.]

Figure 51 The trend lines of the test sets

[Figure 52 chart: runtime in milliseconds (y-axis, 0 to 10000) against delay in
milliseconds (x-axis, 0 to 20) with the linear trend lines of the reference arrangement
and Tests 1 to 4.]

Figure 52 The trend lines when the delay is set below 10 milliseconds

The results are:

The Test 1 trend line crosses the reference trend line at x = 93 milliseconds.

The Test 2 trend line crosses the reference trend line at x = 16 milliseconds.

Test 3 was interesting, since its trend line crossed the reference line at x = -2
milliseconds. As the whole arrangement is faster when the connections are direct, i.e.
when there is no relaying software in between, the equations cannot estimate the effect
of very small delays well. In other words, without the measuring program the runtime
values can be assumed to be smaller on average. Thus, all the crossing points should be
shifted to the right to correct the systematic error.
The Test 4 trend line crosses the reference trend line at x = 19 milliseconds.

The corresponding estimated bandwidths are:


Test 1: 8 kbit/s
Test 2: 48 kbit/s
Test 4: 41 kbit/s

Test 3 was not listed above since negative values cannot be considered.
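
The crossing points follow directly from the trend line equations in Figures 45 to 49, and the Python sketch below recomputes them. The conversion from delay to bandwidth is an assumption on my part, since the formula is not stated in the text: one 100-byte packet per delay interval, 1 kbit taken as 1024 bits, and the result truncated to a whole number. Under that assumption the sketch reproduces the listed values.

```python
# (slope, intercept) of each trend line, taken from Figures 45 to 49.
TREND_LINES = {
    "Test 1: STOMP": (263.12, 3586.0),
    "Test 2: OpenWire": (357.39, 4396.3),
    "Test 3: XMPP": (588.84, 6955.0),
    "Test 4: AMQP": (300.15, 5062.8),
}
REFERENCE = (233.34, 6343.9)
PACKET_BITS = 100 * 8  # one simulated 100-byte packet


def crossing_delay_ms(line, reference=REFERENCE):
    """Delay at which a trend line meets the reference trend line."""
    slope, intercept = line
    ref_slope, ref_intercept = reference
    return (ref_intercept - intercept) / (slope - ref_slope)


def bandwidth_kbit_s(delay_ms):
    """Assumed conversion: one packet per delay interval, 1 kbit = 1024 bits."""
    return PACKET_BITS / (delay_ms / 1000.0) / 1024.0


results = {}
for name, line in TREND_LINES.items():
    delay = round(crossing_delay_ms(line))
    # A negative crossing point has no meaningful bandwidth estimate.
    bandwidth = int(bandwidth_kbit_s(delay)) if delay > 0 else None
    results[name] = (delay, bandwidth)
    print(name, delay, "ms", bandwidth, "kbit/s")
```

If a different packet-per-delay assumption or a 1000-bit kilobit were used, the bandwidth figures would shift slightly, but the ordering of the arrangements would not change.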

If the above estimates are correct enough, I claim that Test 1, the Stomp test
arrangement, is faster than the TCP implementation as long as the bandwidth of the
network used is higher than 8 kbit/s.

Correspondingly, Test 2, the OpenWire test arrangement, is faster than the reference
arrangement as long as the bandwidth is higher than 48 kbit/s and likewise, Test 4, the
AMQP, is faster than the reference when the bandwidth is higher than 41 kbit/s.

The named bandwidth values are low. That means that favourable conditions are present in
almost all widely used networks, i.e. the new implementations perform better regardless of
the delays caused by mobile networks.

When the test sets were run repeatedly, errors also occurred. When performance was being
estimated, only those runs in which all the messages of the benchmark set were transmitted
correctly were included. Correctness was verified by counting the end tags of the sent and
received SSAP messages.

All of the tests were almost error-free when the delays were small. The differences became
apparent with greater delay times. The implementation of the Stomp protocol behaved
well until the delays were higher than 30 milliseconds. The OpenWire implementation
had a relatively steady error rate. The XMPP implementation was the most error-prone
test arrangement. One explanation for this is that the XMPP protocol interface of Apache
ActiveMQ was not very stable.

Finally, it is notable that the AMQP and reference tests ran completely without errors.
Table 4 shows the error rates.

Table 4 The error rates of the test runs

                        Test 1: STOMP   Test 2: OpenWire   Test 3: XMPP   Test 4: AMQP   TCP
Total                   3.33 %          9.84 %             18.03 %        0.00 %         0.00 %
Delay under 10 ms       0.00 %          1.64 %             3.28 %         0.00 %         0.00 %
Delay 10 to 20 ms       0.00 %          3.28 %             1.64 %         0.00 %         0.00 %
Delay more than 20 ms   3.33 %          4.92 %             13.11 %        0.00 %         0.00 %
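
The validity check described above, counting the end tags of the sent and received SSAP
messages, and the resulting error rate can be sketched in Python 3 as follows. This is an
illustration under stated assumptions, not the script used in the tests: the closing tag
`</SSAP_message>` and all function names are assumptions.

```python
END_TAG = "</SSAP_message>"  # assumed closing tag of one SSAP message


def count_end_tags(stream_text, end_tag=END_TAG):
    """Count complete messages in a captured byte stream decoded as text."""
    return stream_text.count(end_tag)


def run_is_valid(sent_text, received_text, expected_sent, expected_received):
    """A run is accepted only if every expected message was seen in full."""
    return (count_end_tags(sent_text) == expected_sent
            and count_end_tags(received_text) == expected_received)


def error_rate(run_results):
    """Percentage of rejected runs in a list of True/False validity flags."""
    failed = sum(1 for ok in run_results if not ok)
    return 100.0 * failed / len(run_results)


if __name__ == "__main__":
    sent = "<SSAP_message>join</SSAP_message><SSAP_message>insert</SSAP_message>"
    received = "<SSAP_message>ok</SSAP_message><SSAP_message>ok"  # second reply cut off
    print(run_is_valid(sent, received, expected_sent=2, expected_received=2))
```

Only runs for which such a check succeeds would contribute to the runtime averages; the
rejected runs contribute to the error rates of Table 4.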

5.2.6 Profiling the Software Solutions

The test arrangement implementations of NodeMQ.py offer an API that is equivalent to the
original Node.py interface. That is, there is no need to change the application in order
for the message system implementation to be used.

In the module, the message handling is processed in a connector class that specifies the
methods to be used to connect, to send, to receive, and to close. In our message system
test arrangements that class is named MQConnector. In addition to the above methods,
it also specifies the method to reconnect the connection and to end the session.
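
The shape of such a connector class can be sketched in Python 3 as follows. Only the method
names are taken from the text; the signatures, the abstract base class and the in-memory
`LoopbackConnector` are assumptions made for illustration, and the actual implementation is
the one listed in Appendix A.

```python
from abc import ABC, abstractmethod
import queue


class MQConnector(ABC):
    """Common connector interface; signatures are assumed, not the original ones."""

    @abstractmethod
    def connect(self): ...

    @abstractmethod
    def send(self, message): ...

    @abstractmethod
    def receive(self, timeout=None): ...

    @abstractmethod
    def close(self): ...

    @abstractmethod
    def reconnect(self): ...

    @abstractmethod
    def end(self): ...


class LoopbackConnector(MQConnector):
    """Hypothetical stand-in that returns whatever was sent, without a broker."""

    def __init__(self):
        self._inbox = queue.Queue()
        self.connected = False

    def connect(self):
        self.connected = True

    def send(self, message):
        self._inbox.put(message)

    def receive(self, timeout=None):
        try:
            return self._inbox.get(timeout=timeout)
        except queue.Empty:
            return None  # nothing arrived within the timeout

    def close(self):
        self.connected = False

    def reconnect(self):
        self.close()
        self.connect()

    def end(self):
        self.close()
```

Because the application only sees this interface, a Stomp, OpenWire, XMPP or AMQP variant
can be swapped in without changing the calling code.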

As the MQConnector classes of all the test arrangements have exactly the same interface,
the cumulative time spent in each method could be compared. This reveals how the protocol
and the library that was used contribute to the performance. The Python cProfile module
was used to generate the statistics.
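
One way to collect cumulative per-method times with cProfile is sketched below in Python 3.
The `DummyConnector` class and the helper names are hypothetical, and the sleeps merely
imitate network work. The sketch reads the `stats` dictionary of a `pstats.Stats` object,
whose layout is stated in the comment as an assumption about the standard library.

```python
import cProfile
import pstats
import time


class DummyConnector:
    """Hypothetical stand-in for an MQConnector implementation."""

    def send(self, message):
        time.sleep(0.001)

    def receive(self):
        time.sleep(0.002)
        return "reply"


def profile_exchange(connector, rounds=5):
    """Profile a few send/receive rounds and return the profiler."""
    profiler = cProfile.Profile()
    profiler.enable()
    for i in range(rounds):
        connector.send("message %d" % i)
        connector.receive()
    profiler.disable()
    return profiler


def cumulative_times(profiler, method_names):
    """Return {method name: cumulative seconds} for the named methods."""
    stats = pstats.Stats(profiler)
    totals = {}
    # Assumed layout: (file, line, function name) ->
    # (primitive calls, total calls, own time, cumulative time, callers).
    for (_file, _line, name), (_cc, _nc, _tt, ct, _callers) in stats.stats.items():
        if name in method_names:
            totals[name] = totals.get(name, 0.0) + ct
    return totals


if __name__ == "__main__":
    times = cumulative_times(profile_exchange(DummyConnector()), {"send", "receive"})
    for name in sorted(times):
        print(name, round(times[name], 3))
```

Matching on the bare method name is sufficient here because every test arrangement uses the
same method names in its connector class.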

Table 5 shows the cumulative runtime values of each method used in the tests. The
connect and close methods are not shown as the class has only dummy
implementations and thus theoretically no time is used there.

Considerable differences can be seen in the initialisation of the class. In addition,
reconnect, which is called when a new subscription connection is established, varies
widely between the implementations. The send and receive methods have rather even values,
except for the AMQP implementation, which has the lowest send time (0.079) and the highest
receive time (0.987).

Table 5 The cumulative runtime with the MQ interface methods

MQConnector method   __init__   reconnect   send    receive   end
Test 1: STOMP        0.007      0.099       0.174   0.475     0.001
Test 2: OpenWire     0.020      0.016       0.156   0.503     0.004
Test 3: XMPP         0.273      0.910       0.187   0.510     0.000
Test 4: AMQP         0.068      0.123       0.079   0.987     0.001

In short, the OpenWire and AMQP implementations seem to be significantly better than
the corresponding Stomp and XMPP when numerous new subscriptions are created.
Furthermore, the AMQP implementation is fast at sending messages, but receiving takes about
twice as long as in the other implementations.

Note, however, that the above-mentioned figures are based on a single test run each, so any
run-to-run variation is not shown here.

5.2.7 The Mixed Topic and Queue Based Solution

The solution here deviates in the following way from the aforementioned one. It uses
queues for direct communication between the communication parties and topics only
when a subscription is used in the application. It also has one essential difference.
SIBMQforActiveMQ uses a fixed number of threads instead of an unlimited count of
processes. The architectural topology is shown in Figure 53.
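
The difference between the two destination types can be illustrated with a tiny in-memory
broker written in Python 3. This is a conceptual sketch only: the class `TinyBroker` and the
destination names are hypothetical and do not correspond to the ActiveMQ API. A queue hands
each message to exactly one receiver, whereas a topic delivers a copy to every subscriber,
which is why topics suit subscriptions.

```python
from collections import defaultdict, deque


class TinyBroker:
    """Hypothetical in-memory broker showing queue and topic semantics."""

    def __init__(self):
        self._queues = defaultdict(deque)
        self._topics = defaultdict(list)

    # Queue: point-to-point, each message is consumed once.
    def send_to_queue(self, name, message):
        self._queues[name].append(message)

    def receive_from_queue(self, name):
        pending = self._queues[name]
        return pending.popleft() if pending else None

    # Topic: publish/subscribe, every subscriber gets a copy.
    def subscribe(self, topic, callback):
        self._topics[topic].append(callback)

    def publish(self, topic, message):
        for callback in self._topics[topic]:
            callback(message)


if __name__ == "__main__":
    broker = TinyBroker()
    broker.send_to_queue("sib.requests", "INSERT request")  # direct communication
    print(broker.receive_from_queue("sib.requests"))
    broker.subscribe("sib.subscription.1", print)            # subscription results
    broker.publish("sib.subscription.1", "triple added")
```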

Figure 54 shows how much the thread count affects this alternative solution.
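
The idea of a fixed number of worker threads can be sketched in Python 3 as follows. This is
not the SIBMQ code, which is a separate mediator program; the function name, the handler and
the sentinel-based shutdown are assumptions made for illustration.

```python
import queue
import threading


def run_fixed_pool(messages, handler, thread_count=4):
    """Process messages with a fixed number of threads and collect the results."""
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            message = tasks.get()
            if message is None:      # sentinel: no more work for this thread
                break
            outcome = handler(message)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for message in messages:
        tasks.put(message)
    for _ in threads:
        tasks.put(None)              # one sentinel per worker
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(sorted(run_fixed_pool(["a", "b", "c"], str.upper, thread_count=2)))
```

With a fixed pool the number of concurrent handlers is bounded by `thread_count` regardless
of the load, in contrast to starting a new process for every connection.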

[Diagram; components shown: Python Application, Node.py, MQConnector, discovery.py, MQ with
SSAP message queues, SIBMQ, SSAP/TCP, and Smart M3 with SIB-tcp, D-bus and SIBD]

Figure 53 The queue topic solution and fixed number of threads

[Plot: bench time in seconds (y-axis) against the thread count at the SIBMQ (x-axis, 0 to 25)]

Figure 54 The effect that the number of threads has on the alternative Stomp solution

Figure 55 compares the benchmark runs of the alternative mixed solution and the pure topic
solution (Test 1). The difference is clear: the pure topic solution is plainly more
efficient than the alternative implementation that is presented.

The Comparison of the STOMP Solutions

                                  The STOMP Mixed Solution   Test 1: STOMP
Average runtime in milliseconds   1.57                       0.807
Standard deviation time           1.52                       0.056

Figure 55 The comparison of the Stomp solutions

The architectural difference here is too great for strong conclusions to be drawn. It is
enough to mention that the chosen pure topic-based solution is good but that
alternative solutions can also still be found.

5.3 The Maintainability of the Code

The maintainability of the implementations was estimated by calculating the complexity
metrics of the written source code. Reginald B. Charney’s pymetrics program was used for
this purpose.

The original Constructive Cost Model (COCOMO 81) was developed by Dr. Barry Boehm, who
published the model in 1981. COCOMO is one of the open software cost estimation models that
aim to estimate accurately the time and cost that a software project consumes. The model
was based on the waterfall model of software development, and the software development
process has evolved since that time. COCOMO II is the successor to COCOMO 81. It is a
redesigned version of the original model made by USC-CSSE, ISR at UC Irvine, and the
COCOMO II Project Affiliate Organisations. The objective was to tune the model for the new
life cycle practices, e.g. it supports continuous model improvement. It places increased
emphasis on off-the-shelf software components and the reuse of existing software (Center
for Systems and Software Engineering) (Boehm, Clark et al. 1995).

The COCOMO II Source Lines of Code (SLOC) value of the Python interface
implementations of the message system was estimated by using B. Charney’s pymetrics.
A comparison of the values is shown in Figure 56. The value refers to the size estimation
of a project where one SLOC is one logical line of the source code. Only lines that are
delivered as a part of the software are counted, i.e. the test code is not included. In
addition, comments are omitted. Here the Stomp and AMQP implementations had the
smallest estimates.
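
To illustrate what counting logical lines means for Python code, the following Python 3
sketch counts the NEWLINE tokens produced by the standard tokenize module, which mark the
end of each logical line. It is only an approximation for illustration; the counting rules
of pymetrics may differ, and the function name and the sample source are hypothetical.

```python
import io
import tokenize


def count_logical_lines(source):
    """Count logical lines; comment-only and blank lines are not counted."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return sum(1 for token in tokens if token.type == tokenize.NEWLINE)


# Hypothetical sample: seven physical lines, four logical lines.
SAMPLE = (
    "# a comment line\n"
    "import os\n"
    "\n"
    "x = (1 +\n"
    "     2)\n"
    "if x:\n"
    "    print(x)  # trailing comment\n"
)

if __name__ == "__main__":
    print(count_logical_lines(SAMPLE))
```

The statement spread over two physical lines counts once, which is the main reason why a
logical count is lower than a physical line count of the same file.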

[Bar chart; y-axis: calculated COCOMO II SLOC metric values]

                                              Test 1:   Test 2:    Test 3:   Test 4:
                                              STOMP     OpenWire   XMPP      AMQP
pymetrics COCOMO II estimate for NodeMQ.py    1032      1038       1043      1033

Figure 56 The Pymetrics Constructive Cost Model II (COCOMO II) cost estimation metrics of
the NodeMQ implementations

Thomas J. McCabe introduced cyclomatic complexity in 1976. Cyclomatic complexity is a
software metric in which the decision points of a program module are counted. The
complexity of the program is calculated by using a graph that is derived from the control
flow graph of the software. The value has been shown to correlate with the error rate of
the software (McCabe, Butler 1989).
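
Counting decision points can be sketched with the ast module of Python 3 as follows. The
set of node types that is treated as a decision point is an assumption of this sketch, and
the exact rules of pymetrics may differ; the sample function is hypothetical.

```python
import ast

# Node types treated as one decision point each (an assumption of this sketch).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                  ast.IfExp, ast.comprehension)


def cyclomatic_complexity(source):
    """Return 1 plus the number of decision points found in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra branches
            complexity += len(node.values) - 1
    return complexity


# Hypothetical receive routine: while, except, if and one "and".
SAMPLE = '''
def receive(inbox, retries):
    while retries > 0:
        try:
            message = inbox.pop()
        except IndexError:
            return None
        if message and message.strip():
            return message
        retries -= 1
    return None
'''

if __name__ == "__main__":
    print(cyclomatic_complexity(SAMPLE))
```

Summing such values over all the methods of a module gives a total of the kind compared
between the implementations.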

To estimate the differences in the total complexity of the interface implementations that
were constructed, the cyclomatic complexity values were summed up. The value of the
original implementation was subtracted to emphasise the work done to change the
implementation. The values are just for comparison. Figure 57 shows that evaluation. The
OpenWire solution is just a little less complex than the others are. However, to my
knowledge, the total sum of the values does not give an exact indication that the software
is complex.

To be more precise, the most complex method of the interface implementation was
the method that receives messages from the messaging system. All the
implementations had the same count of decision points here. A cyclomatic complexity of
five is quite a reasonable value (McCabe, Butler 1989). However, there were still
notable differences. The XMPP implementation has the most complex interface
initialisation routine. Its cyclomatic complexity was 5, while the other
implementations had 3. The class also had the most complex routine for creating
a new connection for a subscription. Furthermore, in the XMPP and AMQP
implementations, I had to use an additional handler class to realise the callback for the
messaging system.

To summarise, the complexities of the individual methods were reasonable. There is no
apparent reason to believe these are error-prone. However, the total complexity of the
implementations seems to be best in the OpenWire solution, although the Stomp
implementation was comparable.

[Bar chart; y-axis: McCabe complexity metric]

        Test 1: STOMP   Test 2: OpenWire   Test 3: XMPP   Test 4: AMQP
value   80              79                 87             83

Figure 57 The McCabe Cyclomatic complexity metrics of the NodeMQ implementations

David A. Wheeler’s SLOCCount was used to calculate the person-month estimates of the
interfaces. SLOCCount uses the COCOMO 81 estimation model (Wheeler 2004), which means it
is not as accurate as pymetrics. The basic difference is that SLOCCount counts physical
source code lines, while pymetrics counts only the corresponding logical lines. However,
of these two tools, only SLOCCount offers an estimate of the amount of work spent on the
program. The estimate is given in person-months.
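
By default, SLOCCount derives its effort figure from the Basic COCOMO model in organic
mode, in which the effort in person-months is 2.4 times the size in thousands of source
lines raised to the power 1.05 (Wheeler 2004). A minimal Python 3 sketch of that formula is
given below. The function name is hypothetical, and it is an assumption that the default
parameters were used for the estimates of this study.

```python
def basic_cocomo_effort(sloc, factor=2.4, exponent=1.05):
    """Basic COCOMO (organic mode) effort in person-months for a SLOC count."""
    ksloc = sloc / 1000.0
    return factor * ksloc ** exponent


if __name__ == "__main__":
    for sloc in (500, 1000, 2000):
        print(sloc, round(basic_cocomo_effort(sloc), 2))
```

Since the exponent is greater than one, the estimated effort grows slightly faster than the
line count, so doubling the size more than doubles the estimate.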

All the interface implementations were estimated. Then, to give an estimate of the
additional work done on each of the new messaging system interface sources, the estimate
for the original implementation work of the TCP module was subtracted. Figure 58 shows a
comparison of the estimates.

[Bar chart; y-axis: person-months]

                                    Test 1:   Test 2:    Test 3:   Test 4:
                                    STOMP     OpenWire   XMPP      AMQP
                                    Node.py   Node.py    Node.py   Node.py
SLOCCount person-months estimate    0.66      0.68       0.69      0.67

Figure 58 The SLOCCount person-months estimates of the amount of work needed for the
Node.py interface changes.

Figure 59 shows the same values compared to the estimate of the amount of work
done to implement the original Node.py interface. That value does not include the
discovery.py implementation.

[Bar chart; y-axis: person-months]

                                    Test 1:   Test 2:    Test 3:   Test 4:
                                    STOMP     OpenWire   XMPP      AMQP
                                    Node.py   Node.py    Node.py   Node.py   Node.py
SLOCCount person-months estimate    0.66      0.68       0.69      0.67      2.78

Figure 59 The SLOCCount person-months estimates of the changes compared to the original
Node.py

To estimate the complete test arrangements, the person-months estimates of both versions of
the SIBMQ mediator software, which transmits the messages from the messaging system to the
RDF store and vice versa, were calculated. The results are presented in Figure 60.
Finally, the total estimate of the test arrangements is shown in Figure 61. It was
interesting to see that, since the work estimate of the mediator software for RabbitMQ was
small, the total estimate of the AMQP test arrangement was clearly lower than that of the
other ones.

To summarise the maintainability of the code, the individual methods of the code can be
estimated to be easy to maintain, and the probability of there being a fault in the code is
low. The Stomp and OpenWire interfaces were the easiest ones to implement. However, the
total work cost of the test arrangements gave a different result. The AMQP solution was the
most cost-effective to implement. That does not necessarily mean that this solution is
superior. The RabbitMQ mediator was based on the ActiveMQ mediator software. As the
software structure was redesigned, the result was more streamlined. This may be reflected
in the estimates.

[Bar chart; y-axis: person-months]

                                    SIBMQ for ActiveMQ   SIBMQ for RabbitMQ
SLOCCount person-months estimate    4.65                 4.51

Figure 60 The SLOCCount person-months estimates of the SIBMQ implementations

[Bar chart; y-axis: person-months]

                                    Test 1:   Test 2:    Test 3:   Test 4:
                                    STOMP     OpenWire   XMPP      AMQP
SLOCCount person-months estimate    5.31      5.33       5.34      5.18

Figure 61 The total person-months estimate of the test arrangements

6 Results and Conclusion

A survey of the open message-oriented middleware protocols and their implementation
alternatives was conducted, and the alternative ways to connect the messaging systems
to Smart-M3 were studied. Performance comparisons were then made.

I have proposed a solution that uses a message queue to access a smart space system
implemented as an RDF store, and evaluated the performance of the solution using
different open message queuing protocols. It was possible to add a messaging system to
the system in a transparent way. Numerous implementations were constructed. The
performance advantage of all these implementations in relation to a direct TCP
connection was evident. All the messaging implementations had observably better
throughput times.

The open message-oriented middleware protocols showed visible differences in their
performance. Although the implementations of the libraries and messaging servers, as
well as the interface created in the study, have an unavoidable effect on the performance,
protocol-specific differences could also be found. However, it is also worth noting that
the implementation is a fixed part of the use of protocols.

Figure 62 shows a summary of the entire test. The tests where the results were better
than average are shown in green. The colour red emphasises those results that were
especially bad. In general, it can be seen that the Stomp test arrangement outperformed
the others on average. This protocol used fewer bytes for the network communication
than the others did. The Stomp.py library implementation was the fastest one in the
benchmark tests, and profiling shows that it also outperforms the others in the
initialisation phase. It was also the fastest of the implementations when receiving
messages.

                                                         Test 1:   Test 2:    Test 3:   Test 4:
                                                         STOMP     OpenWire   XMPP      AMQP
Benchmark test: the average runtime                      0.90      1.00       1.90      1.50
Benchmark test: the standard deviation value             0.05      0.03       0.52      0.03
Extended benchmark test: the average runtime value       0.81      0.86       1.61      1.66
Extended benchmark test: the standard deviation value    0.06      0.146      0.187     0.116
Network data count: Total bytes sent                     9325      15313      16910     10640
Network data count: Total bytes received                 18670     20827      41707     18446
Network data count: TCP connection count                 3         1          3         3
Total error rate at the network test                     3.33 %    9.84 %     18.03 %   0 %
Error rate when delay was under 30 ms                    0 %       1.64 %     3.28 %    0 %
Profiling: initialisation of the connector class         0.01      0.02       0.27      0.068
Profiling: reconnect                                     0.10      0.02       0.91      0.123
Profiling: send                                          0.17      0.16       0.19      0.08
Profiling: receive                                       0.48      0.50       0.51      0.99
McCabe Complexity Metrics                                382       381        389       385
SLOCCount person-months total estimate                   5.31      5.33       5.34      5.18

Figure 62 The summary of the results

In Chapter 1.2 I stated that the number of open TCP sockets should be reduced. The
OpenWire protocol implementation fits nicely here. The OpenWire test arrangement,
where the pyactive library and ActiveMQ server were used, uses only one TCP
connection per client. The counted cumulative McCabe Cyclomatic complexity value was
the lowest for the OpenWire implementation.

The OpenWire solution is also quite comparable to the Stomp implementation. However,
although OpenWire is an open protocol specification, it is Apache ActiveMQ specific
(Apache Software Foundation). That limits its use.

The proposed implementation is shown in Figure 63. Stomp has many advantages. It
performed well in the tests on average. It was the fastest one in the benchmark test.
The protocol does not have any extra overhead during the initialisation phase (see Table
3). Although the implementation does not use the minimum number of TCP
connections, Stomp is more flexible than OpenWire as the implementation is not bound
to only one server type (Figure 29). The total person-months estimate is also better than
for OpenWire.

[Diagram; components and versions shown: Application (Python KP), discovery.py, NodeMQ.py,
MQConnector, Stomp.py 2.3, Python 2.6.2, topics, Apache ActiveMQ 5.3, Java OpenJDK 1.6.0_0,
SIBMQ for ActiveMQ, ActiveMQ CPP version 3.0.1, and Smart-M3 0.9.2 with SIB-tcp, D-bus and
SIBD]

Figure 63 The selected implementation

7 Discussion

As Claudia Hanssen pointed out (Hanssen 2005), Java garbage collection can have a
considerable effect on the performance of a Java program. Shiffman (1996) describes
another characteristic of Java runtime systems: Java uses the late binding approach to
resolve the external references of a class. That means that the Java runtime engine
resolves a reference the first time an object is accessed by the code. This causes delays
in the program and affects the performance.

ActiveMQ is written in Java (Snyder, Davies et al. 2009). This explains why ActiveMQ
does not give steady throughput times. That phenomenon can be seen at the beginning
of a test series (Figure 64). However, it should not have a significant effect on a typical
application. ActiveMQ is a message-oriented middleware service and thus can be
assumed to be running in the background without frequent restarts.

Dabek et al. (2002) state that event-based programs display superior performance in
general. They also point out that callbacks are characteristic of an event-based system.
The original implementation uses blocking reads, but all the test arrangements have
non-blocking messaging interfaces as well as callbacks for receiving messages. As all the
test arrangements written for this thesis outperform the reference implementation, I have
to admit that their statement on performance is valid in this case.
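
The difference between a blocking read and callback-based reception can be sketched in
Python 3 as follows. The class `CallbackReceiver` and its methods are hypothetical and do
not reproduce the interface of any of the messaging libraries used; in a real arrangement
the library thread would call `deliver`.

```python
import queue
import threading


class CallbackReceiver:
    """Hypothetical receiver that hands each message to a callback."""

    def __init__(self, on_message):
        self._on_message = on_message
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=self._dispatch, daemon=True)
        self._thread.start()

    def deliver(self, message):
        """Called by the messaging side; returns immediately."""
        self._inbox.put(message)

    def _dispatch(self):
        while True:
            message = self._inbox.get()
            if message is None:          # sentinel set by close()
                break
            self._on_message(message)    # the application reacts to the event

    def close(self):
        self._inbox.put(None)
        self._thread.join()


if __name__ == "__main__":
    receiver = CallbackReceiver(print)
    receiver.deliver("subscription result 1")
    receiver.deliver("subscription result 2")
    receiver.close()
```

The caller of `deliver` is never held up by the application logic, whereas with a blocking
read the application thread itself waits on the socket until data arrives.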

8 BIBLIOGRAPHY

ANDRAHENNADI, S., SAMARARATHNA, E. and FERNANDO, S., 2008. Building High Performance
Distributed Applications using Middleware: A Case Based Example from the Industry.
http://www.sliit.lk/Research/SRS08/ResearchSymposium2008/journal/Page7-12.pdf
edn. Sri Lanka Institute of Information Technology.

APACHE SOFTWARE FOUNDATION, 23 Feb 2009, 2009-last update, Apache ActiveMQ -- CMS API
Overview [Homepage of Apache Software Foundation], [Online]. Available:
http://activemq.apache.org/cms/cms-api-overview.html [1/5/2010, 2010].

APACHE SOFTWARE FOUNDATION, Apache ActiveMQ -- OpenWire Version 2 Specification
[Homepage of Apache Software Foundation], [Online]. Available:
http://activemq.apache.org/openwire-version-2-specification.html [4/4/2010, 2010].

BARCIA, R., August 19, 2003, 2003-last update, JMSApplicationArchitectures.pdf
(application/pdf Object) [Homepage of TheServerSide.com], [Online]. Available:
http://www.theserverside.com/tt/articles/content/JMSArchitecture/JMSApplicationArchitectures.pdf
[12/30/2009, 2009].

BARTHEL, J., 13/09/09, 2009-last update, Getting started with AMQP and RabbitMQ
[Homepage of InfoQ.com], [Online]. Available:
http://www.infoq.com/articles/AMQP-RabbitMQ [03/29, 2010].

BERNERS-LEE, T., HENDLER, J. and LASSILA, O., 2001. The semantic web. Scientific
American, 284(5), 34-43.

BERNSTEIN, P.A., 1996. Middleware: a model for distributed system services. Commun.ACM,
39(2), 86-98.

BEST, D. and RAYNER, J., 1987. Welch's approximate solution for the Behrens-Fisher
problem. Technometrics, 29(2), 205.

BOEHM, B., CLARK, B., HOROWITZ, E., WESTLAND, C., MADACHY, R. and SELBY, R., 1995.
Cost models for future software life cycle processes: COCOMO 2.0. Annals of software
engineering, 1(1), 57-94.

BOLDYREV, S., OLIVER, I. and HONKOLA, J., 2009. A mechanism for managing and
distributing information and queries in a smart space environment, The 1st International
Workshop on Managing Data with Mobile Devices (MDMD 2009) 6-7 May, 2009-Milan,
Italy, 2009, .

CARTER, M., 10/08/08, 2008-last update, Comet Daily » Blog Archive » Scalable Real-Time
Web Architecture, Part 1: Stomp, Comet, and Message Queues [Homepage of Comet Daily],
[Online]. Available:
http://cometdaily.com/2008/10/08/scalable-real-time-web-architecture-part-1-stomp-comet-and-message-queues/
[3/30/2010, 2010].

CENTER FOR SYSTEMS AND SOFTWARE ENGINEERING, CSSE Website - COCOMO II [Homepage of
University of Southern California], [Online]. Available:
http://csse.usc.edu/csse/research/COCOMOII/cocomo_main.html [4/21/2010, 2010].

CHAPPELL, D. and MONSON-HAEFEL, R., April 1, 2001, 2001-last update, JAVA Developer's
Journal: Guaranteed Messaging With JMS [Homepage of SYS-CON Media], [Online]. Available:
http://java.sys-con.com/node/36239 [6/1/2010, 2010].

COULOURIS, G.F., DOLLIMORE, J. and KINDBERG, T., 2005. Distributed systems: concepts
and design. fourth edition edn. Addison-Wesley.

DABEK, F., ZELDOVICH, N., KAASHOEK, F., MAZIERES, D. and MORRIS, R., 2002. Event-
driven programming for robust software, Proceedings of the 10th workshop on ACM
SIGOPS European workshop, 2002, ACM pp189.

DACONTA, M.C., OBRST, L.J. and SMITH, K.T., 2003. The semantic web. Wiley Pub.

DALLAL, G.E., 07/16/2008, 2008-last update, Student's t Test for Independent Samples.
Available: http://www.jerrydallal.com/LHSP/STUDENT.HTM [6/3/2010, 2010].

DAVIES, J., FENSEL, D. and VAN HARMELEN, F., 2002. Towards the semantic web:
ontology-driven knowledge management. John Wiley & Sons, Inc. New York, NY, USA.

DAVIS, W.S., 1999-last update, Structured program design [Homepage of CRC Press LLC],
[Online]. Available:
ftp://ftp.seu.edu.cn/Pub2/EBooks/Books_from_EngnetBase/pdf/7001/7001_PDF_C62.pdf
[4/30, 2010].

DEITEL, H.M., DEITEL, P.J. and SANTRY, S.E., 2002. Messaging with JMS. In: M.
HORTON, ed, Advanced Java 2 Platform - How to Program. Upper Saddle River, New
Jersey: Prentice Hall, pp. 938-939.

DING, Y., FENSEL, D., KLEIN, M. and OMELAYENKO, B., 2002. The semantic web: yet
another hip? Data & Knowledge Engineering, 41(2-3), 205-227.

EUGSTER, P., 2007. Type-based publish/subscribe: Concepts and experiences. ACM
Transactions on Programming Languages and Systems (TOPLAS), 29(1), 6.

EUGSTER, P.T., FELBER, P.A., GUERRAOUI, R. and KERMARREC, A., 2003. The many faces
of publish/subscribe. ACM Comput.Surv., 35(2), 114-131.

FAISON, T., 2006. Event-based programming : taking events to the limit ; learn how to
use events to create better, simpler software systems in record time ; examples in both
C# and VB 2005. Berkeley, CA: Apress, pp. 3-4.

FAROOQ, U., PARSONS, E.W. and MAJUMDAR, S., 2004. Performance of publish/subscribe
middleware in mobile wireless networks, WOSP '04: Proceedings of the 4th international
workshop on Software and performance, 2004, ACM pp278-289.

FARRELL, W., 08 Jun 2004, 2004-last update, Introducing the Java Message Service
[Homepage of IBM Corporation], [Online]. Available:
https://www6.software.ibm.com/developerworks/education/j-jms/j-jms-updated-pdf.pdf
[1/5/2010, 2010].

FERG, S., 1/8/2006, 2006-last update, Event-Driven Programming: Introduction, Tutorial,
History [Homepage of sourceforge.net], [Online]. Available:
http://eventdrivenpgm.sourceforge.net/event_driven_programming.pdf [1/18/2010, 2010].

FLIEDER, K., 2005. Testing and Visualizing a Message Queuing Infrastructure. Proc.of
WMSCI, 5, 51-57.

FRÄMLING, K., OLIVER, I., HONKOLA, J. and NYMAN, J., 2009-last update, Smart Spaces
for Ubiquitously Smart Buildings. Available:
http://www.cs.hut.fi/u/framling/Publications/Ubicomm2009_BA.pdf [3/30, 2010].

FURCHE, T., LINSE, B., BRY, F., PLEXOUSAKIS, D. and GOTTLOB, G., 2006. Rdf querying:
Language constructs and evaluation methods compared. Reasoning Web, , 1-52.

GADDAH, A. and KUNZ, T., 2006. Performance of Pub/Sub Systems in Wired/Wireless Networks,
Vehicular Technology Conference, 2006. VTC-2006 Fall. 2006 IEEE 64th, 2006, pp1-5.

GUI, G. and SCOTT, P., D., 2006. Coupling and cohesion measures for evaluation of
component reusability, Proceedings of the 2006 international workshop on Mining
software repositories, 2006, ACM pp21.

GUTIERREZ, C., HURTADO, C. and MENDELZON, A.O., 2004. Foundations of semantic web
databases, PODS '04: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART
symposium on Principles of database systems, 2004, ACM pp95-106.

HAASE, K., 2002. Java Message Service API Tutorial. Palo Alto, California: Sun
Microsystems, Inc.

HANSLO, W. and MACGREGOR, K., 2004. The efficiency of XML as an intermediate data
representation for wireless middleware communication, SAICSIT '04: Proceedings of the
2004 annual research conference of the South African institute of computer scientists
and information technologists on IT research in developing countries, 2004, South
African Institute for Computer Scientists and Information Technologists pp279-283.

HANSSEN, C., 07/28/05, 2005-last update, Avoid Performance Bottlenecks from Java
Garbage Collection [Homepage of SAP Community Network], [Online]. Available:
http://www.sdn.sap.com/irj/sdn/performance-analysis?rid=/library/uuid/9771c65d-0501-0010-2c90-ad2a2f2487c7
[3/31/2010, 2010].

HAPPE, J., FRIEDRICH, H., BECKER, S. and REUSSNER, R.H., 2008. A pattern-based
performance completion for Message-oriented Middleware, WOSP '08: Proceedings of
the 7th international workshop on Software and performance, 2008, ACM pp165-176.

HE, H., September 30, 2003, 2003-last update, What is service-oriented architecture
[Homepage of O'Reilly Media, Inc.], [Online]. Available:
http://www.xml.com/pub/a/ws/2003/09/30/soa.html [6/2, 2010].

HENJES, R., MENTH, M. and HIMMLER, V., 2007. Impact of complex filters on the
message throughput of the ActiveMQ JMS server. Managing Traffic Performance in
Converged Networks, , 192-203.

HENJES, R., MENTH, M. and ZEPFEL, C., 2006a. Throughput performance of java
messaging services using sun java system message queue, Proceedings 20th European
Conference on Modelling and Simulation, Bonn, Germany, 2006a, Citeseer pp684–691.

HENJES, R., MENTH, M. and ZEPFEL, C., 2006b. Throughput Performance of Java
Messaging Services Using WebsphereMQ, Distributed Computing Systems Workshops,
2006. ICDCS Workshops 2006. 26th IEEE International Conference on, 2006b, pp26-26.

HOHPE, G., 2005. Developing software in a service-oriented world. Whitepaper,
ThoughtWorks Inc.

HUPPLER, K., 2009. The Art of Building a Good Benchmark. Performance Evaluation and
Benchmarking, , 18-30.

JANIK, M. and KOCHUT, K., 2005. Brahms: A workbench rdf store and high performance
memory system for semantic association discovery. Lecture Notes in Computer Science.
Berlin / Heidelberg: Springer, pp. 431-445.

KELL, S., 2009. The mythical matched modules: overcoming the tyranny of inflexible
software construction, Proceeding of the 24th ACM SIGPLAN conference companion on
Object oriented programming systems languages and applications, 2009, ACM pp881-888.

KELL, S., 2008. A Survey of Practical Software Adaptation Techniques. Journal of Universal
Computer Science, 14(13), 2110-2157.

KOWALEWSKI, B., BUBAK, M. and BALIŚ, B., 2008. An Event-Based Approach to Reducing
Coupling in Large-Scale Applications.

KUO, D. and PALMER, D., 2001. Automated Analysis of Java Message Service Providers.
Middleware 2001 : IFIP/ACM International Conference on Distributed Systems Platforms
Heidelberg, Germany, November 12-16, 2001.Proceedings, , 1.

LI, H. and JIANG, G., 2004. Semantic message oriented middleware for publish/subscribe
networks, proc of SPIE, 2004, pp124-133.

LUPIANA, D., O’DRISCOLL, C. and MTENZI, F., 2009. Defining Smart Space in the Context
of Ubiquitous Computing. Ubiquitous Computing and Communication Journal, 4(3), 516-
517-524.

MALANI, R., 2004. The Benefits of a Java Message Service Implementation of the C2
Framework. http://www.urop.uci.edu/journal/journal04/06/malani.pdf edn.

MALANI, P., 02/15/02, 2002-last update, Transaction and redelivery in JMS [Homepage
of Infoworld, Inc.], [Online]. Available:
http://www.javaworld.com/javaworld/jw-02-2002/jw-0315-jms.html?page=1 [1/15/2010, 2010].

MARSH, G., SAMPAT, A.P., POTLURI, S. and PANDA, D.K., Scaling Advanced Message
Queuing Protocol (AMQP) Architecture with Broker Federation and InfiniBand
[Homepage of The Ohio State University], [Online]. Available:
ftp://ftp.cse.ohio-state.edu/pub/tech-report/2009/TR17.pdf [03/30, 2010].

MCBRIDE, B., 2004. The resource description framework (rdf) and its vocabulary
description language rdfs. Handbook on Ontologies, , 51–66.

MCCABE, T.J. and BUTLER, C.W., 1989. Design complexity measurement and testing.
Communications of the ACM, 32(12), 1425.

MENDES, M., BIZARRO, P. and MARQUES, P., 2009. A Performance Study of Event
Processing Systems. Performance Evaluation and Benchmarking, , 221-236.

MENTH, M., HENJES, R., ZEPFEL, C. and GEHRSITZ, S., 2006. Throughput performance of
popular JMS servers, SIGMETRICS '06/Performance '06: Proceedings of the joint
international conference on Measurement and modeling of computer systems, 2006,
ACM pp367-368.

MISRA, J., 1991. Loosely-coupled processes (preliminary version), PARLE'91 Parallel
Architectures and Languages Europe, 1991, Springer pp1-26.

MONSON-HAEFEL, R. and CHAPPELL, D.A., 2001. Java message service. O'Reilly Media.

MUHL, G., FIEGE, L. and PIETZUCH, P., 2006. Distributed Event-Based Systems. Berlin:
Springer-Verlag.

MÜHL, G., 2006. Distributed Event-Based Systems. Springer-Verlag.


NIXON, P., DOBSON, S. and LACEY, G., 2000. Managing Smart Environments, Proceedings
of the Workshop on Software Engineering for Wearable and Pervasive Computing, 2000,
Citeseer.

O’SULLIVAN, D. and WADE, V., 2002. A smart space management framework. Computer
Science Technical Report, TCD-CS-2002-23, Trinity College Dublin, .

O'HARA, J., 2007. Toward a Commodity Enterprise Middleware. Queue, 5(4), 48-55.

OKOSHI, T., WAKAYAMA, S., SUGITA, Y., AOKI, S., IWAMOTO, T., NAKAZAWA, J.,
NAGATA, T., FURUSAKA, D., IWAI, M. and KUSUMOTO, A., 2001. Smart space laboratory
project: Toward the next generation computing environment, IEEE Third Workshop on
Networked Appliances (IWNA 2001), 2001, Citeseer.

OLIVER, I., HONKOLA, J. and ZIEGLER, J., 2008. Dynamic, Localised Space Based Semantic
Webs, IADIS Int’l WWW/Internet Conference, 2008, pp426–431.

PETRI, D., 8/9/2009, 2009-last update, OSI Model Concepts. Available:
http://www.petri.co.il/osi_concepts.htm [3/30/2010, 2010].

PIETZUCH, P., EYERS, D., KOUNEV, S. and SHAND, B., 2007. Towards a common API for
publish/subscribe, DEBS '07: Proceedings of the 2007 inaugural international conference
on Distributed event-based systems, 2007, ACM pp152-157.

PREHOFER, C., VAN GURP, J. and DI FLORA, C., 2007. Towards the web as a platform for
ubiquitous applications in smart spaces, Second Workshop on Requirements and
Solutions for Pervasive Software Infrastructures (RSPSI), at Ubicomp, 2007, .

PRUD'HOMMEAUX, E. and SEABORNE, A., 1/15/2008, 2008-last update, SPARQL Query Language
for RDF [Homepage of W3C], [Online]. Available:
http://www.w3.org/TR/rdf-sparql-query/ [4/28/2010, 2010].

RADESTOCK, M., A technical look at RabbitMQ and Erlang. Available:
http://www.rabbitmq.com/resources/RabbitMQ_FITEclub_2009.pdf [3/29/2010, 2010].

RUSSELL, S.J., NORVIG, P., CANNY, J.F., MALIK, J. and EDWARDS, D.D., 1995. Solving
Problems by Searching. Artificial intelligence: a modern approach. Prentice hall
Englewood Cliffs, NJ, pp. 70-77.

SACHS, K., KOUNEV, S., APPEL, S. and BUCHMANN, A., 15 June 2009, 2009a-last update,
A Performance Test Harness For Publish/Subscribe Middleware [Homepage of ACM
special interest group for the computer systems performance evaluation community],
[Online]. Available:
http://conferences.sigmetrics.org/sigmetrics/2009/demo/Demo_Sachs_et_al.pdf [4/7,
2010].

SACHS, K., KOUNEV, S., APPEL, S. and BUCHMANN, A., 2009b. Benchmarking of
message-oriented middleware, DEBS '09: Proceedings of the Third ACM International
Conference on Distributed Event-Based Systems, 2009b, ACM pp1-2.

SAINT-ANDRE, P., 2009. XMPP: lessons learned from ten years of XML messaging.
Communications Magazine, IEEE, 47(4), 92-96.

SCHULDT, H., 2008. Distributed Information Systems, Message-oriented Middleware.
Universität Basel: Department Informatic.

SHIFFMAN, H., 1996-last update, Boosting Java Performance: Native Code & JIT
Compilers. Available: http://disordered.org/Java-JIT.html [3/31/2010, 2010].

SNYDER, B., DAVIES, R. and BOSANAC, D., 2009. ActiveMQ in Action. 1.0-Alpha edn.
http://www.manning.com/snyder/snyder_meapch1.pdf: Manning Publications.

SOININEN, J., LIUHA, P., LAPPETELÄINEN, A., HONKOLA, J., FRÄMLING, K. and RAISAMO,
R., 2010. Device interoperability: Emergence of the smart environment ecosystems.
Tieto- ja viestintäteollisuuden tutkimus TIVIT Oy.

VAN GURP, J., PREHOFER, C. and DI FLORA, C., 2008. Experiences with realizing smart
space web service applications, Consumer Communications and Networking Conference,
2008. CCNC 2008. 5th IEEE, 2008, pp1171–1175.

VINOSKI, S., 2006. Advanced Message Queuing Protocol. Internet Computing, IEEE,
10(6), 87-89.

VOSS, R., 2006-last update, Using JMS For Distributed Software Development
[Homepage of Javalobby, Inc.], [Online]. Available:
http://www.javalobby.org/articles/distributed-jms/ [12/30/2009, 2009].

WANG, X., DONG, J.S., CHIN, C., HETTIARACHCHI, R. and ZHANG, D., 2004. Semantic
Space: an infrastructure for smart spaces. PERVASIVE Computing, 2004(JULY–
SEPTEMBER), 32-39.

WHEELER, D., 8/1/2004, 2004-last update, SLOCCount User's Guide. Available:
http://www.dwheeler.com/sloccount/sloccount.html [4/21/2010, 2010].

WILKINSON, K., SAYERS, C., KUNO, H. and REYNOLDS, D., 2003. Efficient RDF storage and
retrieval in Jena2, Proceedings of SWDB, 2003, Citeseer pp7–8.

YANG, H.I., LIM, S., KING, J. and HELAL, S., 2006. Open issues in nomadic pervasive
computing. Proceedings of UbiSys.

Appendix A: The NodeMQ Stomp interface implementation
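# Python 2 listing.
import md5
import base64

import stomp
import time
import Queue
import signal
import os
import logging
import logging.config
import StringIO
from xml.sax import make_parser
from xml.sax.handler import ContentHandler

# Connector, CURRENT_TR_ID and Connector._parse_msg are not part of this
# listing; they are assumed to come from the Smart-M3 Node library code
# that NodeMQ extends.

# Logging configuration is optional.
try:
    logging.config.fileConfig('stomp.log.conf')
except Exception:
    pass
log = logging.getLogger('stomp.py')
if not log:
    log = utils.DevNullLogger()

class SSAPQUERYMsgHandler(ContentHandler):
    # SAX handler that collects the XML content of <parameter name="query">.
    def __init__(self):
        self.queryTxt = ""
        self.inQParameter = False
        self.parameter_strings = []

    def startElement(self, name, attrs):
        if name == "parameter":
            if attrs.get("name",None) == "query":
                self.inQParameter = True
            return
        else:
            if self.inQParameter:
                # Re-serialise the nested element with sorted attributes.
                self.parameter_strings.extend(["<", name])
                for i in sorted(attrs.items()):
                    self.parameter_strings.extend([" ",str(i[0]), '="',str(i[1]), '"'])
                self.parameter_strings.append(">")
            return

    def characters(self, ch):
        if self.inQParameter:
            self.parameter_strings.append(ch)

    def endElement(self, name):
        if name == "parameter":
            if self.inQParameter:
                self.queryTxt = "".join(self.parameter_strings)
                self.inQParameter = False
            return
        else:
            if self.inQParameter:
                self.parameter_strings.extend(["</", name, ">"])
            return

def signal_handler(num, stack):
    # No-op: SIGUSR1 is only used to wake up signal.pause() in receive().
    pass

mq_incoming = Queue.Queue(0)    # received message bodies
mq_connectors = list()          # STOMP connections shared between connectors
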
class MQConnector(Connector):
    def __init__(self, arg_triple):
        self.sib_address, self.sib_space_name, self.node_id = arg_triple
        self.mq_incoming=mq_incoming
        self.connectiontype="destination:"+self.node_id
        if(len(mq_connectors)>0):
            # Reuse the connection opened by an earlier connector.
            self.mq_conn=mq_connectors[0]
        else:
            self.mq_conn = stomp.Connection(host_and_ports = [ ('localhost', 61613) ],
                                            prefer_localhost = True,
                                            try_loopback_connect = True)

            self.mq_conn.set_listener('', self)
            self.mq_conn.start()
            self.mq_conn.connect()

            # Replies to this node arrive on a topic named after the node id.
            self.mq_conn.subscribe(destination='/topic/'+self.node_id, ack='auto')
            mq_connectors.append(self.mq_conn)

        signal.signal(signal.SIGUSR1, signal_handler)

    def on_error(self, headers, message):
        print ('received an error %s' % message)

    def on_message(self, headers, message):
        # Called by stomp.py in its receiver thread: queue the message and
        # wake up the main thread waiting in receive().
        self.mq_incoming.put(message)
        os.kill(os.getpid(), signal.SIGUSR1)

    def connect(self):
        pass

    def reconnect(self,query_msg):
        # Subscriptions use a separate connection and a topic whose name is
        # the Base64-encoded MD5 digest of the query text.
        self.qparser = make_parser()
        self.handler = SSAPQUERYMsgHandler()
        self.qparser.setContentHandler(self.handler)
        self.qparser.parse(StringIO.StringIO(query_msg))
        self.m = md5.new()
        self.m.update(self.handler.queryTxt)
        self.topic_name="Topic."+base64.b64encode(self.m.digest())

        self.connectiontype="destination:"+self.topic_name
        self.mq_conn = stomp.Connection(host_and_ports = [ ('localhost', 61613) ],
                                        prefer_localhost = True,
                                        try_loopback_connect = True)
        self.mq_incoming = Queue.Queue(0)
        self.mq_conn.set_listener('', self)
        self.mq_conn.start()
        self.mq_conn.connect()
        self.mq_conn.subscribe(destination='/topic/'+self.topic_name, ack='auto')

    def send(self, msg):
        # All requests go to the topic of the smart space.
        self.mq_conn.send(msg, destination='/topic/SPACE.'+self.sib_space_name)
        self.counter=0

    def receive(self,tr_id = None):
        mes = None
        if(self.mq_incoming.qsize()>0):
            mes=self.mq_incoming.get()

        if mes is None:
            signal.pause() # UNIX
            mes=self.mq_incoming.get()
        ret=self._parse_msg(mes)
        self.counter=self.counter +1
        if tr_id is not None:
            # Skip messages that do not belong to the latest transaction.
            if int(ret["transaction_id"])!=CURRENT_TR_ID-1:
                ret=self.receive(tr_id)
        return ret

    def close(self):
        pass

    def end(self):
        self.mq_conn.stop()
        exit()
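
The reconnect method above derives the subscription topic from the query text: it extracts the content of the query parameter, hashes it with MD5 and Base64-encodes the digest. The following is a minimal Python 3 sketch of that step, not the thesis code itself. The names QueryExtractor, extract_query, topic_for_query and SAMPLE_MESSAGE are hypothetical, and the sample message is an invented illustration rather than a verbatim SSAP message.

```python
import base64
import hashlib
import xml.sax
from xml.sax.handler import ContentHandler


class QueryExtractor(ContentHandler):
    """Collects the XML content of <parameter name="query"> (hypothetical name)."""

    def __init__(self):
        super().__init__()
        self.in_query = False
        self.parts = []

    def startElement(self, name, attrs):
        if name == "parameter":
            if attrs.get("name") == "query":
                self.in_query = True
        elif self.in_query:
            # Re-serialise nested elements with attributes in sorted order.
            attr_text = "".join(' %s="%s"' % (k, v) for k, v in sorted(attrs.items()))
            self.parts.append("<%s%s>" % (name, attr_text))

    def characters(self, content):
        if self.in_query:
            self.parts.append(content)

    def endElement(self, name):
        if name == "parameter":
            self.in_query = False
        elif self.in_query:
            self.parts.append("</%s>" % name)


def extract_query(message):
    """Returns the serialised content of the query parameter."""
    handler = QueryExtractor()
    xml.sax.parseString(message.encode("utf-8"), handler)
    return "".join(handler.parts)


def topic_for_query(message):
    """Builds "Topic." + Base64(MD5(query text)), as in reconnect()."""
    digest = hashlib.md5(extract_query(message).encode("utf-8")).digest()
    return "Topic." + base64.b64encode(digest).decode("ascii")


# Invented sample message, for illustration only.
SAMPLE_MESSAGE = (
    '<SSAP_message><transaction_type>SUBSCRIBE</transaction_type>'
    '<parameter name="type">RDF-M3</parameter>'
    '<parameter name="query"><triple_list><triple>'
    '<subject type="uri">sib:any</subject>'
    '</triple></triple_list></parameter></SSAP_message>'
)

if __name__ == "__main__":
    print(topic_for_query(SAMPLE_MESSAGE))
```

Two subscribers that send the same query text obtain the same topic name, so they can share one topic. The standard Base64 alphabet contains the characters "/" and "+", which may matter if a broker treats them specially in destination names; base64.urlsafe_b64encode would be one way to avoid them.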

Appendix B: The NodeMQ OpenWire interface implementation
# Python 2 listing.
import md5
import base64

import time
import Queue
import signal
import os
import StringIO
from xml.sax import make_parser
from xml.sax.handler import ContentHandler

import pyactivemq
from pyactivemq import MessageListener
from pyactivemq import ActiveMQConnectionFactory
from pyactivemq import AcknowledgeMode
from pyactivemq import DeliveryMode
from numpy.testing import assert_array_equal
import numpy as N

# Connector, CURRENT_TR_ID and Connector._parse_msg are not part of this
# listing; they are assumed to come from the Smart-M3 Node library code
# that NodeMQ extends.

class SSAPQUERYMsgHandler(ContentHandler):
    # SAX handler that collects the XML content of <parameter name="query">.
    def __init__(self):
        self.queryTxt = ""
        self.inQParameter = False
        self.parameter_strings = []

    def startElement(self, name, attrs):
        if name == "parameter":
            if attrs.get("name",None) == "query":
                self.inQParameter = True
            return
        else:
            if self.inQParameter:
                # Re-serialise the nested element with sorted attributes.
                self.parameter_strings.extend(["<", name])
                for i in sorted(attrs.items()):
                    self.parameter_strings.extend([" ", str(i[0]), '="', str(i[1]), '"'])
                self.parameter_strings.append(">")
            return

    def characters(self, ch):
        if self.inQParameter:
            self.parameter_strings.append(ch)

    def endElement(self, name):
        if name == "parameter":
            if self.inQParameter:
                self.queryTxt = "".join(self.parameter_strings)
                self.inQParameter = False
            return
        else:
            if self.inQParameter:
                self.parameter_strings.extend(["</", name, ">"])
            return

def signal_handler(num, stack):
    # No-op: SIGUSR1 is only used to wake up signal.pause() in receive().
    pass

# Sessions, producers, consumers and connections shared between connectors.
mq_session = list()
mq_producer = list()
mq_consumer = list()
mq_conn = list()
mq_incoming = Queue.Queue(0)    # received message bodies

class MessageListener(pyactivemq.MessageListener):
    # Puts every received text message into the given queue and wakes up
    # the main thread waiting in receive().
    def __init__(self, queue):
        pyactivemq.MessageListener.__init__(self)
        self.queue = queue

    def onMessage(self, message):
        self.queue.put(message.text)
        os.kill(os.getpid(), signal.SIGUSR1)

class MQConnector(Connector):
    def __init__(self, arg_triple):
        self.sib_address, self.space_name, self.node_id = arg_triple
        self.connIndex=0
        self.mq_incoming=mq_incoming
        if(len(mq_session)>0):
            # Reuse the objects created by an earlier connector.
            self.session=mq_session[self.connIndex]
            self.producer=mq_producer[self.connIndex]
            self.consumer=mq_consumer[self.connIndex]
            self.conn=mq_conn[self.connIndex]
        else:
            self.f = ActiveMQConnectionFactory('tcp://127.0.0.1:61616?wireFormat=openwire')
            self.conn = self.f.createConnection()
            self.session = self.conn.createSession(AcknowledgeMode.AUTO_ACKNOWLEDGE)

            # All requests go to the topic of the smart space.
            self.sib_queue = self.session.createTopic('SPACE.'+self.space_name)
            self.producer = self.session.createProducer(self.sib_queue)
            self.producer.deliveryMode = DeliveryMode.NON_PERSISTENT

            # Replies to this node arrive on a topic named after the node id.
            self.rec_queue = self.session.createTopic(self.node_id)
            self.consumer = self.session.createConsumer(self.rec_queue)
            self.listener = MessageListener(self.mq_incoming)
            self.consumer.messageListener = self.listener

            self.conn.start()
            mq_session.append(self.session)
            mq_producer.append(self.producer)
            mq_consumer.append(self.consumer)
            mq_conn.append(self.conn)
        signal.signal(signal.SIGUSR1, signal_handler)

    def connect(self):
        pass

    def reconnect(self,query_msg):
        # Subscriptions use a topic whose name is the Base64-encoded MD5
        # digest of the query text.
        self.qparser = make_parser()
        self.handler = SSAPQUERYMsgHandler()
        self.qparser.setContentHandler(self.handler)
        self.qparser.parse(StringIO.StringIO(query_msg))
        self.m = md5.new()
        self.m.update(self.handler.queryTxt)

        self.topic_name="Topic."+base64.b64encode(self.m.digest())
        self.r_rec_topic = self.session.createTopic(self.topic_name)
        self.r_consumer = self.session.createConsumer(self.r_rec_topic)
        self.mq_incoming = Queue.Queue(0)
        self.listener = MessageListener(self.mq_incoming)
        self.r_consumer.messageListener = self.listener
        self.connIndex=1

    def send(self, msg):
        m = self.session.createTextMessage(msg.encode( "utf-8" ))
        self.producer.send(m)

    def receive(self,tr_id = None):
        mes = None
        if(self.mq_incoming.qsize()>0):
            mes=self.mq_incoming.get()

        if mes is None:
            signal.pause() # UNIX
            mes=self.mq_incoming.get()
        ret=self._parse_msg(mes)
        if tr_id is not None:
            # Skip messages that do not belong to the latest transaction.
            if int(ret["transaction_id"])!=CURRENT_TR_ID-1:
                ret=self.receive(tr_id)
        return ret

    def close(self):
        pass

    def end(self):
        self.conn.close()
        exit()
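
In all four connectors, receive() first checks the queue size and then waits in signal.pause() for SIGUSR1. This appears to leave a small window: if the listener thread delivers a message and its signal between the size check and the call to pause(), the main thread keeps waiting until the next signal arrives. The following Python 3 sketch shows one possible alternative, not the approach used in the thesis: it relies on the blocking get() of the standard queue module instead of signals. The names QueueReceiver and demo are hypothetical, and the transaction-id filtering of the original receive() is left out.

```python
import queue
import threading


class QueueReceiver:
    """Hands messages from a listener thread to the caller of receive()."""

    def __init__(self):
        self.incoming = queue.Queue()

    def on_message(self, body):
        # Called from the messaging library's listener thread.
        self.incoming.put(body)

    def receive(self, timeout=None):
        # Blocks until a message is available; raises queue.Empty on timeout.
        return self.incoming.get(timeout=timeout)


def demo():
    receiver = QueueReceiver()
    # Simulate a broker callback that arrives 0.1 s later on another thread.
    timer = threading.Timer(0.1, receiver.on_message, args=("<SSAP_message/>",))
    timer.start()
    message = receiver.receive(timeout=5)
    timer.join()
    return message


if __name__ == "__main__":
    print(demo())
```

Because the queue itself performs the waiting, this variant needs neither a signal handler nor os.kill, and it is not limited to UNIX platforms.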

Appendix C: The NodeMQ XMPP interface implementation
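# Python 2 listing.
import md5
import base64
import random

import sys
import xmpp

import time
import Queue
import signal
import os
import threading
import StringIO
from xml.sax import make_parser
from xml.sax.handler import ContentHandler

# Connector, CURRENT_TR_ID and Connector._parse_msg are not part of this
# listing; they are assumed to come from the Smart-M3 Node library code
# that NodeMQ extends.

class SSAPQUERYMsgHandler(ContentHandler):
    # SAX handler that collects the XML content of <parameter name="query">.
    def __init__(self):
        self.queryTxt = ""
        self.inQParameter = False
        self.parameter_strings = []

    def startElement(self, name, attrs):
        if name == "parameter":
            if attrs.get("name",None) == "query":
                self.inQParameter = True
            return
        else:
            if self.inQParameter:
                # Re-serialise the nested element with sorted attributes.
                self.parameter_strings.extend(["<", name])
                for i in sorted(attrs.items()):
                    self.parameter_strings.extend([" ", str(i[0]), '="', str(i[1]), '"'])
                self.parameter_strings.append(">")
            return

    def characters(self, ch):
        if self.inQParameter:
            self.parameter_strings.append(ch)

    def endElement(self, name):
        if name == "parameter":
            if self.inQParameter:
                self.queryTxt = "".join(self.parameter_strings)
                self.inQParameter = False
            return
        else:
            if self.inQParameter:
                self.parameter_strings.extend(["</", name, ">"])
            return

def signal_handler(num, stack):
    # No-op: SIGUSR1 is only used to wake up signal.pause() in receive().
    pass

mq_incoming = Queue.Queue(0)    # received message bodies
mq_connectors = list()          # XMPP clients shared between connectors
mq_jids = list()                # JID object and JID string of the shared client
mq_handlers = list()            # running MQChannelHandler threads
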

class MQChannelHandler(threading.Thread):
    # Thread that joins the room of the node and processes incoming stanzas.
    def __init__(self, cl,mq_incoming,node_id):
        threading.Thread.__init__(self)
        self.isRunning=True
        self.cl=cl
        self.mq_incoming=mq_incoming
        room = node_id+"@localhost/localhost"
        self.cl.send(xmpp.Presence(to=room))
        self.cl.RegisterHandler('message',self.xmpp_message)

    def stop(self):
        self.isRunning=False

    def run(self):
        while self.isRunning:
            self.cl.Process(1)

    def xmpp_message(self, dis, event):
        # Queue the message body and wake up the main thread waiting in
        # receive().
        self.mq_incoming.put(event.getBody())
        os.kill(os.getpid(), signal.SIGUSR1)

class MQConnector(Connector):
    def __init__(self, arg_triple):
        self.sib_address, self.sib_space_name, self.node_id = arg_triple
        self.mq_incoming=mq_incoming
        if(len(mq_connectors)>0):
            # Reuse the client created by an earlier connector.
            self.cl=mq_connectors[0]
            self.jid=mq_jids[0]
            self.jid1=mq_jids[1]
        else:
            self.jid1='j'+str(random.randint(1, 9999) ) +'@hut.fi'
            pwd="secret"
            self.jid=xmpp.protocol.JID(self.jid1)
            self.cl = xmpp.Client(self.jid.getDomain(), debug=[])
            if not self.cl.connect(server=('127.0.0.1',61222)):
                print "No XMPP connection"

            if not self.cl.auth(self.jid1,'jabberuserpassword','',0):
                print 'could not authenticate!'

            ch=MQChannelHandler(self.cl,self.mq_incoming,self.node_id)
            ch.start()
            mq_handlers.append(ch)
            mq_connectors.append(self.cl)
            mq_jids.append(self.jid)
            mq_jids.append(self.jid1)
        signal.signal(signal.SIGUSR1, signal_handler)

    def connect(self):
        pass

    def reconnect(self,query_msg):
        # Subscriptions use a separate client and a room whose name is the
        # Base64-encoded MD5 digest of the query text.
        self.qparser = make_parser()
        self.handler = SSAPQUERYMsgHandler()
        self.qparser.setContentHandler(self.handler)
        self.qparser.parse(StringIO.StringIO(query_msg))
        self.m = md5.new()
        self.m.update(self.handler.queryTxt)
        self.topic_name="Topic."+base64.b64encode(self.m.digest())
        self.jid1='j'+str(random.randint(1, 9999) ) +'@hut.fi'
        self.jid=xmpp.protocol.JID(self.jid1)
        self.cl = xmpp.Client(self.jid.getDomain(), debug=[])

        if not self.cl.connect(server=('127.0.0.1',61222)):
            print "No XMPP connection"

        if not self.cl.auth(self.jid1,'jabberuserpassword','',0):
            print 'could not authenticate!'

        self.mq_incoming = Queue.Queue(0)

        ch=MQChannelHandler(self.cl,self.mq_incoming,self.topic_name)
        mq_handlers.append(ch)
        ch.start()

    def send(self, msg):
        # All requests go to the address of the smart space.
        tojid="SPACE."+self.sib_space_name+'@localhost'
        self.cl.send(xmpp.protocol.Message(tojid,msg))

    def receive(self,tr_id = None):
        mes = None
        if(self.mq_incoming.qsize()>0):
            mes=self.mq_incoming.get()

        if mes is None:
            signal.pause() # UNIX
            mes=self.mq_incoming.get()
        ret=self._parse_msg(mes)
        if tr_id is not None:
            # Skip messages that do not belong to the latest transaction.
            if int(ret["transaction_id"])!=CURRENT_TR_ID-1:
                ret=self.receive(tr_id)
        return ret

    def close(self):
        pass

    def end(self):
        for i in mq_handlers :
            i.stop()
        exit()

Appendix D: The NodeMQ AMQP interface implementation
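# Python 2 listing.
import md5
import base64
import random

from amqplib import client_0_8 as amqp

import sys

import time
import Queue
import signal
import os
import threading
import StringIO
from xml.sax import make_parser
from xml.sax.handler import ContentHandler

# Connector, CURRENT_TR_ID and Connector._parse_msg are not part of this
# listing; they are assumed to come from the Smart-M3 Node library code
# that NodeMQ extends.

class SSAPQUERYMsgHandler(ContentHandler):
    # SAX handler that collects the XML content of <parameter name="query">.
    def __init__(self):
        self.queryTxt = ""
        self.inQParameter = False
        self.parameter_strings = []

    def startElement(self, name, attrs):
        if name == "parameter":
            if attrs.get("name",None) == "query":
                self.inQParameter = True
            return
        else:
            if self.inQParameter:
                # Re-serialise the nested element with sorted attributes.
                self.parameter_strings.extend(["<", name])
                for i in sorted(attrs.items()):
                    self.parameter_strings.extend([" ", str(i[0]), '="', str(i[1]), '"'])
                self.parameter_strings.append(">")
            return

    def characters(self, ch):
        if self.inQParameter:
            self.parameter_strings.append(ch)

    def endElement(self, name):
        if name == "parameter":
            if self.inQParameter:
                self.queryTxt = "".join(self.parameter_strings)
                self.inQParameter = False
            return
        else:
            if self.inQParameter:
                self.parameter_strings.extend(["</", name, ">"])
            return

def signal_handler(num, stack):
    # No-op: SIGUSR1 is only used to wake up signal.pause() in receive().
    pass

mq_incoming = Queue.Queue(0)    # received message bodies
mq_connectors = list()          # AMQP connections shared between connectors
mq_channels= list()             # AMQP channels shared between connectors
mq_handlers = list()            # running MQChannelHandler threads
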
class MQChannelHandler(threading.Thread):
    # Thread that declares and binds the queue of the node and waits for
    # deliveries on the channel.
    def __init__(self, chan,mq_incoming,node_id):
        threading.Thread.__init__(self)
        self.isRunning=True
        self.chan=chan
        self.mq_incoming=mq_incoming
        self.chan.queue_declare(queue=node_id, durable=True, exclusive=False, auto_delete=False)
        self.chan.queue_bind(queue=node_id, exchange="sibex", routing_key=node_id)

        self.chan.basic_consume(queue=node_id,no_ack=True,callback=self.recv_callback,
                                consumer_tag="sibtag")

    def stop(self):
        print "stopping thread"
        #self.chan.basic_cancel("sibtag") # Not working
        self.isRunning=False

    def run(self):
        while self.isRunning:
            self.chan.wait()

    def recv_callback(self,msg):
        # Queue the message body and wake up the main thread waiting in
        # receive().
        self.mq_incoming.put(msg.body)
        os.kill(os.getpid(), signal.SIGUSR1)

class MQConnector(Connector):
    def __init__(self, arg_triple):
        self.sib_address, self.sib_space_name, self.node_id = arg_triple
        self.mq_incoming=mq_incoming
        if(len(mq_connectors)>0):
            # Reuse the connection and channel of an earlier connector.
            self.cl=mq_connectors[0]
            self.chan=mq_channels[0]
        else:
            self.cl = amqp.Connection(host="localhost:5672", userid="guest", password="guest",
                                      virtual_host="/", insist=False)
            self.chan = self.cl.channel()
            self.chan.exchange_declare(exchange="sibex", type="direct", durable=True,
                                       auto_delete=False,)

            ch=MQChannelHandler(self.chan,self.mq_incoming,self.node_id)
            ch.start()

            mq_handlers.append(ch)
            mq_connectors.append(self.cl)
            mq_channels.append(self.chan)

        signal.signal(signal.SIGUSR1, signal_handler)

    def connect(self):
        pass

    def reconnect(self,query_msg):
        # Subscriptions use a separate connection and a queue whose name is
        # the Base64-encoded MD5 digest of the query text.
        self.qparser = make_parser()
        self.handler = SSAPQUERYMsgHandler()
        self.qparser.setContentHandler(self.handler)
        self.qparser.parse(StringIO.StringIO(query_msg))
        self.m = md5.new()
        self.m.update(self.handler.queryTxt)
        self.topic_name="Topic."+base64.b64encode(self.m.digest())

        self.cl = amqp.Connection(host="localhost:5672", userid="guest", password="guest",
                                  virtual_host="/", insist=False)
        self.chan = self.cl.channel()
        self.chan.exchange_declare(exchange="sibex", type="direct", durable=True, auto_delete=False,)

        self.mq_incoming = Queue.Queue(0)

        ch=MQChannelHandler(self.chan,self.mq_incoming,self.topic_name)
        mq_handlers.append(ch)
        ch.start()

    def send(self, msg):
        # All requests are routed to the queue of the smart space.
        msg = amqp.Message(msg)
        msg.properties["delivery_mode"] = 2
        self.chan.basic_publish(msg,exchange="sibex",routing_key="SPACE."+self.sib_space_name)

    def receive(self,tr_id = None):
        mes = None
        if(self.mq_incoming.qsize()>0):
            mes=self.mq_incoming.get()

        if mes is None:
            signal.pause() # UNIX
            mes=self.mq_incoming.get()
        ret=self._parse_msg(mes)
        if tr_id is not None:
            # Skip messages that do not belong to the latest transaction.
            if int(ret["transaction_id"])!=CURRENT_TR_ID-1:
                ret=self.receive(tr_id)
        return ret

    def close(self):
        pass

    def end(self):
        for i in mq_handlers :
            i.stop()
        exit()

Appendix E: The effect of the Network Bandwidth

Test 1: STOMP            Test 2: OpenWire         Test 3: XMPP             Test 4: AMQP
ms delay  runtime in ms  ms delay  runtime in ms  ms delay  runtime in ms  ms delay  runtime in ms
       1           3064         1           4023         5          14362         1           4470
       1           2786         5           8833         5          12631         1           3738
       5           6065         5           8305         6          12691         5           8908
       5           6202         6           8341         8          15409         5           7756
       6           6236         8           9257        10           8736         6           8877
       8           7238        10           5255        11          14019         8           9613
      10           4216        13           8991        13          16379        10           5066
      11           7582        14          10633        15          16890        11          10188
      13           8049        15           8415        15          15599        13           9128
      14           7644        15          10145        16          17158        14          10161
      15           7706        16          11007        16          17848        15           9376
      15           7006        17           8245        17          12369        15          10019
      16           8675        18           8495        18          13565        16          10610
      16           8096        18           8213        18          13554        16          10790
      17           6252        18          10821        18          18270        17           7577
      18           6438        19          11802        19          19591        18           8594
      18           6680        19          12258        19          19431        18           8350
      18           8878        20          12396        20          18845        18          11265
      19           9173        20          11647        20          19095        19          11111
      19           9145        22          12507        22          20938        19          11298
      20           9612        22          12366        22          19688        20          11424
      20           8940        23          12599        23          18995        20          10282
      22          10129        23          12601        23          19860        22          11315
      22           9455        24          12115        24          19936        22          11658
      23           9753        24          10367        24          16600        23          11936
      23           9521        24          12694        24          19937        23          11751
      24           9705        25          12412        25          20907        24          11791
      24           7580        27          12864        27          19983        24           9756
      24           9221        27          11489        27          17755        24          11908
      25           9564        27          12958        27          20659        25          12018
      27           9975        28          11429        28          18588        27          11954
      27           8502        31          14055        31          21345        27          10697
      27          10269        32          15239        32          24597        27          12206
      28           9173        32          15194        32          24939        28          11346
      31          10829        32          15259        32          23641        31          11774
      32          11363        34          19288        34          29703        32          14078
      32          11830        34          19863        34          31073        32          14516
      34          13268        34          19178        34          30749        34          15556
      34          14263        35          14230        35          31438        34          18360
      35          10786        35          19194        35          31563        34          18485
      35          15628        35          19992        36          31075        35          13096
      35          15876        36          19051        36          30568        35          18954
      36          14755        36          20286        38          30920        35          18688
      37          15017        37          19380        41          32468        36          17967
      38          15488        38          19990        41          29942        36          18064
      40          12398        40          15880        41          32892        37          17986
      40          12332        40          16267        43          32953        38          17704
      40          12416        40          15887        45          32794        40          15120
      41          15580        41          18817        46          33158        40          15120
      41          14847        41          18068        46          33595        40          14820
      41          14877        41          20382         -              -        41          18425
      43          15895        43          20712         -              -        41          17580
      45          15956        45          21975         -              -        41          19260
      46          14022        46          20162         -              -        43          19452
      46          16924        46          20676         -              -        45          19360
      46          16517         -              -         -              -        46          17000
      47          14833         -              -         -              -        46          19595
      49          15013         -              -         -              -        46          19594
       -              -         -              -         -              -        47          17194
       -              -         -              -         -              -        49          17830

Table 6 the effect of the Network Bandwidth
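
As a reading aid for Table 6, the following Python sketch estimates how strongly the runtime grows with the delay by fitting a least-squares line through a few rows. It is only an illustration under stated assumptions: the function name least_squares_slope is hypothetical, the six rows per protocol are a hand-picked subset of the STOMP and XMPP columns, and the result is not an analysis made in the thesis.

```python
def least_squares_slope(points):
    """Slope of the least-squares line through (delay, runtime) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Hand-picked (ms delay, runtime in ms) rows from Table 6.
STOMP_SAMPLE = [(1, 3064), (10, 4216), (20, 9612),
                (31, 10829), (40, 12398), (49, 15013)]
XMPP_SAMPLE = [(5, 14362), (10, 8736), (20, 18845),
               (31, 21345), (41, 32468), (46, 33595)]

if __name__ == "__main__":
    for name, sample in (("STOMP", STOMP_SAMPLE), ("XMPP", XMPP_SAMPLE)):
        slope = least_squares_slope(sample)
        print("%s: about %.0f ms of runtime per ms of delay" % (name, slope))
```

Using all rows of a column instead of a subset would give a more reliable estimate; the subset only keeps the example short.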
Appendix F: Time Series of Stomp Measurements

[Chart: The Benchmark Times of the System. Vertical axis: the benchmark time in seconds (0 to 5). Horizontal axis: test number (time series), 1 to 16. Series: OpenWire/pyactivemq, Stomp/Stomppy, Original TCP.]

Figure 64 the benchmark time series of a Stomp test arrangement. A message queue
implementation
