
SStreaM: A Model for Representing Sensor Data and Sensor Queries

Levent Gürgen, Cyril Labbé, Claudia Roncancio, Vincent Olive


France Telecom R&D, Grenoble - France, {levent.gurgen,vincent.olive}@rd.francetelecom.fr
LSR-IMAG Laboratory, Grenoble - France, {cyril.labbe,claudia.roncancio}@imag.fr

Abstract

Sensor querying has become one of the major challenges in the data processing community since the explosion of the new generation of sensors. Sensor networks and data stream processing solutions are two popular ways of querying sensor data. However, there are no sufficiently generic models either for representing sensor data or for formulating queries. In general, models are application dependent, so their reuse in different contexts is limited. This paper aims to provide a general conceptual model, named SStreaM, to represent different kinds of sensor data as well as various types of queries, including continuous queries on data streams or on sliding windows. Several aspects of the model are illustrated by an example. This paper also briefly discusses related domains such as sensor databases, data stream management systems, and sequential, temporal and real-time databases.

1 Introduction

Sensor querying has become a very popular research topic in many areas of computer science. This is because the contributions of these tiny yet intelligent devices to the new "information everywhere" paradigm are now well recognized. Therefore, various domains try to solve the challenges that have appeared with this new way of computing. The networking domain does research on self-adaptive, energy-efficient, and multi-hop routing protocols. The database community is concerned with modelling and querying sensor data. Embedded operating systems adapted to these tiny devices are being proposed by the systems community. And lastly, ubiquitous computing explores the opportunities that new generation sensors can offer to anywhere, anytime information processing.

Concerning sensor data management issues, one interesting (and widely used) approach to sensor information processing is to reuse classical database principles. With this approach, the sensors are considered as data sources generating data conforming to a schema. According to the place where the queries are evaluated, we can differentiate two extreme ways of sensor querying: fully distributed and centralized. Wireless Sensor Networks (WSN) [6] adopt a purely distributed approach. Queries are evaluated on the sensors thanks to their increasing computing and storage capacity. Hence, the notion of sensor database system (SDBS) [11, 25, 19] has been introduced. In Data Stream Management Systems (DSMS) [3, 12, 7], continuous queries over streaming sensor data are evaluated on a centralized, relatively powerful server. Compared to SDBSs, DSMSs provide more complex queries (sliding window joins, aggregations, etc.). Some recent works propose the integration of these two approaches, i.e. adopting a hybrid approach [23, 17, 4].

Regardless of the way queries are supported, the most popular approach for modeling sensor data is to use the well-known relational model. However, unique characteristics of sensor systems prevent a direct use of relational query evaluation techniques. Therefore, several proposals adapted to the sensor context at the level of data and query modeling, query languages and optimization techniques have been made in both SDBS and DSMS solutions.

This work mainly concerns data and query modeling issues for sensor data management. It reviews related work and concludes that a general model for representing sensor data and queries is still missing. Besides, we argue that some related domains such as temporal, sequential and real-time databases are not sufficiently explored in order to extend sensor query representation. We propose SStreaM, a model for representing sensor data and sensor queries. Our contributions can be summarized as follows:

Sensor data model: We propose a sensor stream data model which aims to give a general view over sensor data in order to represent different sensor data types. Basically, SStreaM's data model is a sequential model which considers sensor data as a (possibly infinite) sequence of tuples. Each tuple represents static sensor properties, as well as the time-varying sensor data.

Sensor query model: Query execution over sensor stream data and over windows (finite sub-streams) is formalized. Query operators include a time dimension in order
to reflect the real-time nature of sensor querying. Operators on streams are modeled with a tuple-by-tuple execution basis, while operators on windows are based on a set-oriented execution basis.

General window definition: We give a general definition in order to define different types of windows such as temporal and position-based windows with various behaviors such as fixed, sliding, tumbling or landmark windows.

The rest of the paper is organized as follows. Section 2 gives the positioning of our work with respect to related domains such as SDBSs, DSMSs, sequential, temporal, and real-time databases. Section 3 presents SStreaM, the data and query model we propose. Section 4 introduces an example scenario and illustrates the use of SStreaM. Section 5 concludes and gives our research perspectives.

2 Related Work

Our work is directly related to two domains, namely sensor database systems (SDBSs) and data stream management systems (DSMSs); and indirectly to three other domains, namely sequential, temporal and real-time databases. This section gives a brief overview of these domains as well as the position of our proposal with respect to them.

2.1 SDBSs and DSMSs

SDBSs [11, 25, 19] have been introduced with the arrival of the new generation of sensors. Thanks to their increasing computing, storage and wireless communication capacities, each sensor can be seen as an autonomous database containing data about its environment (temperature, pressure, geographic location, etc.). Sensors form a wireless sensor network [6], in which queries are distributed by a gateway (or base station) in a multi-hop manner. Continuous queries are evaluated on the sensors (or in the network for some aggregation operators [24]) and the results are collected by the gateway. Sensor data is not materialized until the evaluation of the query on the sensor.

DSMSs [3, 12, 7] are systems of continuous query processing over data streams. They are conceived not only for sensor applications, but also for monitoring applications of financial, telecommunication, or network data. Contrary to SDBSs, most DSMSs are centralized systems, i.e. stream sources send their data to the DSMS, and continuous queries [10] are evaluated on a centralized server. Sensor data is materialized as a data stream.

SDBSs and DSMSs are two strongly related domains. In fact, we can talk about a certain sensor stream management system (SSMS) as a sub-domain of DSMS (see Figure 1). Several recent works in the literature [23, 17, 4, 15] fall into this domain. Below we give some common sensor data and query representation aspects introduced by DSMSs and SDBSs, as well as the fundamental differences of our approach with respect to these aspects.

Representation of sensor data

The stream data model used by DSMSs is inspired by the sequential model [29], which is another extension of the well-known relational model. A stream is a (possibly infinite) sequence of tuples which represent sensor measures sampled at different instants. Tuples are ordered by the time domain. Our approach includes, in addition to the time ordering, a position ordering of tuples in order to include the positional semantics of tuples. Hence, we aim to enhance query representation by including some sequential operators.

In order to represent sensor data, each SDBS defines a data schema. Queries are formulated according to that schema. However, the proposed schemas are rigid and application dependent. There is no common agreement about how sensor data should be represented. SStreaM proposes a more general schema which distinguishes three types of attributes: sensor properties, sensor measurement, and timestamp. Hence, we aim to obtain a more flexible sensor data representation applicable to different types of sensors.

Query models

In both DSMSs and SDBSs, the most popular way of formulating queries is to use an SQL-like language [8, 12, 25, 11]. However, the underlying query models are not always formalized. The most serious effort in formalization is made by the STREAM system [7], namely CQL [8]. Our approach differs from their model principally in three aspects. First, we base our model of stream query evaluation on a timely tuple-by-tuple execution basis (our arguments for this choice are discussed later), while they have a relation granularity for query execution. Second, our model includes, as well as a timestamp ordering, a linear position ordering by which we aim to take advantage of some positional operators seen in sequential databases (see the overview of sequential databases below). And third, we give a more general definition for windows in order to define various types of windows. CQL is limited to sliding windows. In addition, contrary to their model, we also define sliding distance and sliding rate for windows, as well as a more flexible management of window edges. Another DSMS, namely TelegraphCQ [12], also provides a general definition for windows by using a for-loop construct. However, it is restricted to temporal ordering and thus only defines temporal windows. Besides, its sliding window definition does not include a sliding rate parameter.
Figure 1. Relations between DSMS and SDBS

Figure 2. Relations between SSMS and sequential, temporal, and real-time databases

2.2 Sequential, temporal and real-time databases

Sequential models [29, 5] are proposed in order to represent ordered and grouped data which is needed for efficient query execution in some special kinds of applications such as financial applications, scientific data analysis, pattern recognition, etc. As mentioned earlier, the sequential data and query model inspired the stream model used by most DSMSs. However, we argue that some sequential operators can be exploited more extensively in order to enhance stream query operators. For instance, positional operators (before, next, index, etc.) would increase the expressiveness of queries. SStreaM includes, as well as temporal ordering, linear position ordering in order to be able to include sequential operators.

Temporal databases [20, 13] aim to complete missing temporal aspects of traditional databases. Several proposals rely on the relational model and add a temporal domain in order to represent the temporal semantics of data. The existing models differ in several aspects such as dimension (valid time vs. transaction time), structure (discrete, continuous, etc.), and time domain representation (chronons, time intervals, set of intervals, etc.). Temporal expressions (at, along, present, valid at, somewhen, etc.) and temporal operators (overlap, after, during, precede, etc.) are proposed to extend already existing query languages and therefore to include temporal semantics in queries. DSMSs model the time domain with a 1-dimension representation (valid-time) and with a discrete structure where chronons represent one point in time. However, temporal operators are not exploited in DSMSs. We argue that, since temporal aspects are naturally included in sensor data, temporal operators and expressions would enhance queries on sensor data.

A real-time database system (RTDBS) [27, 21] is a system where transactions have time constraints to update the database, or to answer queries. Therefore, besides the logical consistency (that traditional database systems deal with), RTDBSs have to guarantee temporal consistency as well. This is also true for sensor stream processing, as sensors deal with real-time events. SStreaM includes the temporal dimension of query operators to reflect the real-time nature of sensor querying. In addition, in RTDBSs, in order to keep the consistency in the system, the choice of a concurrency control protocol plays an important role. Due to real-time constraints, blocking protocols are avoided. Concurrent access in sensor systems is not studied in existing solutions since the operations over sensor data are in general read-only. However, we argue that modifications made on sensor properties (see Table 1) may create the need to deal with concurrency control issues for sensor systems. We have a parallel research direction investigating the concurrency control aspects in sensor stream management systems. However, this point is out of the scope of this paper.

Several works analyzing relations between the aforementioned domains exist in the literature [26, 22, 14, 28] (see Figure 2). We have given an overview of the interrelations of those domains with the sensor querying domain. However, further work is required to identify the reusable results of those domains in sensor querying.

3 SStreaM: sensor stream data and query model

This section presents SStreaM, a model of sensor data and sensor queries. Section 3.1 gives our general sensor data schema definition and our stream model, which is based on the sequential model adopted by most DSMS solutions. Section 3.2 presents the query model, which is essentially based on three types of operators: stream operators, window creation operators and windowed operators.

3.1 Data model

3.1.1 Schema definition

Sensor data is represented by tuples. As in conventional database systems, tuples conform to a data schema according to which queries are formulated. Mostly, queries pertain
to three different parts of sensor data: meta-information of the sensor (identification, location, type, unit of measure, etc.), the sensor's measurement (temperature, pressure, GPS coordinates, etc.), and the timestamp representing the time at which the measurement is made. Continuous query operators execute on the measurement of sensors (e.g. sensors measuring less than 10). However, in order to locate the sensors whose data will be interrogated, a part of the query is executed on the sensors' meta-information (e.g. temperature sensors in room A measuring less than 10 Celsius). And lastly, most queries on sensor data also involve time (e.g. temperature sensors in room A measuring on average less than 10 Celsius in a sliding window of 5 minutes). Hence, in our general sensor data schema definition, we differentiate three types of attributes. The first type is property attributes. They contain meta-information of sensors. We note that not all property attributes are known by the sensors. Some of them would be added to the tuples by intermediary units such as proxies, gateways, or servers. The second type is the attribute that changes continuously over time, and is represented by the measurement field. The semantics of the measurement and the properties can vary depending on application contexts. The semantic interpretation would be done at the application level. For instance, measurement can represent a temperature reading for one application and an RFID tag number for another. In this way, our objective is to allow the coexistence of different types of sensors while giving a sufficiently general schema definition. Finally, the last type is the timestamp attribute that represents the time at which a measurement is made by the sensor. We assume that timestamps are attributed to tuples according to a global time (by sensors or by some other intermediary units). We argue that this representation is sufficiently generic that most sensor data types can be represented (see Table 1 for some examples).

Table 1. Examples of sensor data
properties | measurement | timestamp
Temperature Sensor: Building A, Room 102, temperature, Celsius | 30 | 10:23 12 June 2005
GPS Reader: Id 13232, GPS localization | 123, 343, 342 | 12:43 13 August 2005
RFID Reader: Id 3434, RFID localization, Room 101 | 1233424242 | 23:34 14 May 2004

We can therefore give a formal tuple definition as follows:

Definition 1. A tuple $s$ is a list of several property attributes $a_i$, one measurement attribute $m$, and one timestamp attribute $tmstmp$, i.e. $s = \langle a_1, a_2, \ldots, a_n, m, tmstmp \rangle$. Each attribute $a_i$ belongs to a particular domain $D_i$, the attribute $m$ to the measurement domain $M$, and $tmstmp$ to the time domain $T$. $T$ is a totally ordered set containing discrete points which represent different moments in time, $tmstmp \in T = \{t_0, t_1, t_2, \ldots\}$, $T \subseteq \mathbb{N}_0$.

Sequences of tuples form data streams. The next section gives basic stream definitions and notations in order to formalize the stream data model.

3.1.2 Sensor data stream

Definition 2. A stream $S = \{s_0, s_1, \ldots, s_n, \ldots\}$ is a set of tuples $s_i$ ordered by their $tmstmp$ value. In addition, tuples also have a linear positional ordering, i.e. the tuple $s_n$ is the nth element of the stream $S$.

We note that the set $S$ may contain distinct tuples which have the same value for the timestamp attribute, and an element of $T$ is not necessarily present among the timestamp attributes of the $s_i$. More formally:

Property 1. Let $\tau : S \to T$ be a function that gives the value of the timestamp attribute of a tuple $s_i$ (i.e. $\tau(s_i) = s_i.tmstmp$). This function is neither injective nor surjective from $S$ to $T$.

Conceptually, streams can be unbounded. However, only a bounded part of a stream is materialized for query processing. Therefore, we differentiate three parts in a stream: the past, the present and the future (see Figure 3).

Definition 3. The present part of a stream is the currently materialized part of the stream at an instant $t$. We denote this part as $S^t \subseteq S$ and, as in Definition 2, the tuple $s^t_n$ is the nth element of $S^t$. Thus, the first element of $S^t$ is $s^t_0$, $|S^t|$ is the cardinality of $S^t$, the last element of $S^t$ is $s^t_{|S^t|-1}$, and $\forall i,\ s^t_i.tmstmp \le t$.

The present part of the stream contains the data currently available for query evaluation. Mostly, this part is materialized in the form of a queue structure whose size is limited by the memory capacity of the query processing unit. Beyond this limit, the data expires from the queue and therefore becomes past data.

Definition 4. The past of a stream $S$, at an instant $t$, is composed of the $s_i \in S$ such that $s_i.tmstmp < s^t_0.tmstmp$.

The past can be stored in a persistent disc-based storage system. Queries over histories of data can be evaluated on this part.

Definition 5. The future of a stream $S$, at an instant $t$, is composed of the $s_i \in S$ such that $s_i.tmstmp > t$.
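As a minimal sketch of Definitions 1 and 3 to 5, the Python fragment below represents a tuple and splits a finite excerpt of a stream into past, present and future at an instant t. The names `SensorTuple` and `partition`, and the `oldest_kept` parameter standing in for the queue's memory bound, are assumptions made for illustration and are not part of the model.

```python
from dataclasses import dataclass
from typing import Any, Sequence, Tuple


@dataclass(frozen=True)
class SensorTuple:
    """Definition 1: s = <a_1, ..., a_n, m, tmstmp>."""
    properties: Tuple[Any, ...]  # a_1 ... a_n (sensor meta-information)
    measurement: Any             # m, interpreted at the application level
    tmstmp: int                  # element of T, a subset of the naturals


def partition(stream: Sequence[SensorTuple], t: int, oldest_kept: int):
    """Split a finite excerpt of S into (past, present, future) at instant t.

    `oldest_kept` is a hypothetical stand-in for the queue's memory bound:
    tuples older than it are assumed to have expired from the queue.
    """
    # sorted() is stable, so tuples sharing a timestamp keep their arrival order.
    ordered = sorted(stream, key=lambda s: s.tmstmp)
    present = [s for s in ordered if oldest_kept <= s.tmstmp <= t]
    # Definition 4 compares against the first present tuple; fall back to the
    # retention bound when nothing is currently materialized.
    boundary = present[0].tmstmp if present else oldest_kept
    past = [s for s in ordered if s.tmstmp < boundary]
    future = [s for s in ordered if s.tmstmp > t]  # Definition 5
    return past, present, future


room = ("Building A", "Room 102", "temperature", "Celsius")
readings = [SensorTuple(room, m, ts)
            for m, ts in [(28, 1), (29, 3), (30, 3), (31, 7), (32, 9)]]
past, present, future = partition(readings, t=7, oldest_kept=3)
print([s.tmstmp for s in past], [s.tmstmp for s in present], [s.tmstmp for s in future])
# prints [1] [3, 3, 7] [9]
```

The sample data also illustrates Property 1: two tuples share the timestamp 3 (so the timestamp function is not injective) and no tuple carries the timestamp 2 (so it is not surjective).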
The future part represents the measures not yet made by the sensors. We can therefore formulate queries to be evaluated on the future values of sensor measures.

Figure 3. The subset $S^t$ of the infinite stream $S$ represents data currently available for processing (e.g. as a queue in memory).

3.2 Query model

Queries are composed of several operators. The output of one operator can form the input of another. Operators on streams work on a tuple-by-tuple basis. They take the first tuple(s) from the input stream(s), execute the operation, and write the answer to the output stream. Operators on windows work on a set-oriented basis similar to relational operators. They take window(s) as input, execute the operation, and finally write the result to the output stream.

3.2.1 Stream Operators

Definition 6. A stream operator $Op$ has a certain number of input streams and a unique output stream (operators producing several streams are not considered) which contains the results of operations over the input streams, i.e. $Sout = Op(Sin_1, Sin_2, \ldots, Sin_n)$, where $Sout$ denotes the output stream and $Sin_i$ an input stream.

However, there are two types of operators which are mostly used: unary operators (i.e. $Sout = UnOp(Sin)$) and binary operators (i.e. $Sout = BinOp(Sin_1, Sin_2)$). Although numerous unary operators could be defined, concretely the selection and projection operators are general enough to answer a large number of different kinds of queries. Similarly, the join operator can be given as an example of a binary operator. However, join operations are generally executed on windows, as a result of the blocking nature of this operation. Therefore, in this section we will only define unary operators. In addition, we will particularly deal with the materialized part of streams, which represents present values; whence the following definition:

Definition 7. $UnOp^t$ is a unary stream operator which represents the execution of the operator $UnOp$ at time $t$ over the input stream $Sin^t$. $UnOp^t$ takes the first element of $Sin^t$ and executes the operation defined by the operator. The result forms the last element of $Sout^{t'}$, where $t' = t + \delta t$ is the termination time of the execution of the operator (see Figure 4).

Figure 4. Unary stream operator

According to Definition 7, the following formula can be given for $UnOp^t$:

$$UnOp^{t}(Sin^{t}) = UnOp^{t}(sin^{t}_{0}) = sout^{t+\delta t}_{|Sout^{t}|} \qquad (1)$$

(Remember: $s^t_n$ gives the nth term of $S^t$.)

From the preceding formula we can recover the infinite stream $Sout$ with the following:

$$Sout = \bigcup_{n=1}^{\infty} UnOp^{t_0+\Delta t_{n-1}}\left(sin^{t_0+\Delta t_{n-1}}_{0}\right) \qquad (2)$$

$$\phantom{Sout} = \bigcup_{n=1}^{\infty} sout^{t_0+\Delta t_{n}}_{|Sout^{t_0+\Delta t_{n-1}}|} \qquad (3)$$

where $t_0$ represents the time of the first execution of the operator, $n \in \mathbb{N}$ the nth execution of the operator, and $\Delta t_n$ the accumulated duration until the operator's nth execution (i.e. $\Delta t_n = \sum_{i=1}^{n} \delta t_i$). Note that $\delta t_i$ is the duration of the operator's ith execution, $|Sout^{t_0}| = 0$, and $\Delta t_0 = 0$.

Since tuples are being added to the stream continuously and possibly at a high rate, the temporal dimension of the query operators, which reflects the real-time nature of sensor queries, gains more importance. Typically, in real-time databases, $\delta t$ is the constraint on the execution of the operator. These systems will require $\delta t$ to be less than a certain threshold in order to keep temporal consistency in the system. Although this subject is out of the scope of this paper, we want to note that adding a temporal dimension to the operators would make it easier to take into account the real-time aspects of these systems. See the perspectives section for more details.

In the sensor context, periodic execution of operators is very common: periodic filters over the data periodically sent by sensors, operators over periodically sliding windows, etc. In order to represent these cases, it suffices to replace, in the preceding formula, $\Delta t_n$ with $rate \times n$, where $rate$ represents the execution periodicity of the operator.
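The timing in formulas (1) to (3) can be sketched in Python as follows. This is only an illustration under stated assumptions: `run_unary` and `double` are hypothetical names, the operator is assumed to emit exactly one result per input tuple, and execution durations are supplied as data rather than measured.

```python
from typing import Any, Callable, Iterable, List, Tuple


def run_unary(op: Callable[[Any], Any], sin: Iterable[Any], t0: int,
              durations: Iterable[int]) -> List[Tuple[int, int, Any]]:
    """Apply `op` tuple by tuple and record where and when each result lands.

    Returns (position in Sout, termination time, result) triples.
    `durations` holds the per-execution durations delta_t_1, delta_t_2, ...
    """
    sout: List[Tuple[int, int, Any]] = []
    elapsed = 0  # Delta_t_0 = 0
    for tup, dt in zip(sin, durations):
        elapsed += dt  # Delta_t_n = delta_t_1 + ... + delta_t_n
        position = len(sout)  # |Sout| before the write, as in formula (1)
        sout.append((position, t0 + elapsed, op(tup)))
    return sout


def double(m: int) -> int:
    """Stand-in operator that always emits one tuple."""
    return 2 * m


aperiodic = run_unary(double, [1, 2, 3], t0=10, durations=[2, 1, 3])
periodic = run_unary(double, [1, 2, 3], t0=0, durations=[5, 5, 5])  # rate = 5
print(aperiodic)  # [(0, 12, 2), (1, 13, 4), (2, 16, 6)]
print(periodic)   # [(0, 5, 2), (1, 10, 4), (2, 15, 6)]
```

In the periodic run every duration equals the rate, so the nth result terminates at t0 + rate × n, which is the substitution described above.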
3.2.2 Window Creation Operators

Windows are finite subsets of streams. From a general point of view, a window is bounded by two parameters: start and end. We differentiate two types of windows: temporal windows and position-based windows. In the case of temporal windows, window edges are time points ($start, end \in T$); in the case of position-based windows, window edges are positions of the tuples in the stream ($start, end \in \mathbb{N}$). Note that, in both cases, $start \le end$, and $start = end$ implies an empty window.

Window creation operators create windows from streams. Formally:

Definition 8. Let $W$ be a window creation operator over a data stream $S$; it returns a window $R$ bounded by the $start$ and $end$ parameters. For position-based windows: $W_{(start,end)}(S) = R = \{s_i \in S \mid start \le i \le end\}$, $start \ge 0$. For temporal windows: $W_{(start,end)}(S) = R = \{s_i \in S \mid start \le s_i.tmstmp \le end\}$, $start \ge s_0.tmstmp$.

As mentioned earlier, we will mostly deal with the present values of a stream. Thus, we define the instantaneous window creation operator $W^t$, which creates a window from the stream $S^t$. We will use temporal windows to illustrate the rest of the section. The reasoning would be similar for position-based windows.

Definition 9. $W^t$ is an instantaneous temporal window creation operator which, at instant $t$, takes as input a stream $S^t$ and returns a window $R$, i.e. $R = W^t_{(start,end)}(S^t) = \{s^t_i \in S^t \mid start \le s^t_i.tmstmp \le end\}$, $start \ge s^t_0.tmstmp$ and $end \le s^t_{|S^t|-1}.tmstmp$.

In the sensor querying context, the windows are generally not fixed, i.e. the edges of windows vary continuously as a function of time. In order to include this kind of window, we give a window description definition below:

Definition 10. A window description $desc$ is a 5-tuple containing the parameters $start$, $end$, $rate$, $start\_adv$, and $end\_adv$. $start$ (resp. $end$) is the initial value of the first (resp. second) edge of the window, $rate$ represents the periodicity of the window, and finally $start\_adv$ and $end\_adv$ determine the sliding distance (i.e. how much the window edges will advance) in the case of moving windows (see Figure 5 for an example).

Figure 5. Position-based sliding window with $rate = 2$, $start\_adv = end\_adv = 3$. Window width is 4 units.

Therefore, we can generalize the window creation operator definition given above:

Definition 11. Let $W'$ be a window creation operator; it takes as input a stream $S$ and a description $desc$, and it returns a set of windows created according to the behaviour description given in $desc$. Formally:

$$W'_{desc}(S) = \bigcup_{n=1}^{\infty} R_n \qquad (4)$$

$R_n$ is the window created during the nth execution of the operator $W^t$ over the stream $S^t$ (see Figure 6). Formally:

$$R_n = W^{t_0+(n-1)\times rate}_{(start(n),\,end(n))}\left(S^{t_0+(n-1)\times rate}\right)$$

where $start(n) = start + (n-1) \times start\_adv$ and $end(n) = end + (n-1) \times end\_adv$.

With this general definition, it is possible to define different types of windows: fixed windows ($start\_adv = end\_adv = 0$), landmark windows (either $start\_adv = 0$ or $end\_adv = 0$), tumbling windows ($start\_adv = end\_adv = end - start$), etc.

In general, window width is constant for sliding windows (i.e. $start\_adv = end\_adv = cnst$). However, if we want to have windows of different sizes at each sliding period, we can define $start\_adv(n)$ and $end\_adv(n)$, which can take different values at the nth execution of the window creation operator. Similarly, a non-constant rate parameter implies an aperiodic window. Therefore, a variable parameter $rate(n)$ defines a different rate for the operator's nth execution (e.g. every time that a new tuple arrives). In addition, in some cases, the window edges may go beyond the present part of a stream (the part where tuples are currently present). For instance, this can happen when a sliding window advances so fast that the window's end parameter falls into the future part of the stream (see Figure 7). One solution for this problem is to evaluate the windowed operator (see next section) over the window including only the present values of the window; hence, the end parameter of the window becomes the timestamp of the last element of $S^t$, i.e. if $end > s_{|S^t|-1}.tmstmp$ then $end = s_{|S^t|-1}.tmstmp$. This solution could be used for periodic operators in order to give at least a result at the end of each period. Another solution
could be to wait until the window fills with the required number of tuples before executing the windowed operator. This solution could be adopted when there is no rate specified for the operator. Similarly, if $start < s_0.tmstmp$, we can either take $start = s_0.tmstmp$ or take the tuples from the history, if it is available.

Figure 7. If $end > s_{|S^t|-1}.tmstmp$ then $end = s_{|S^t|-1}.tmstmp$.

3.2.3 Windowed Operators

This section introduces windowed operators, i.e. operators executing over windows. They are represented by the symbol $WOp$ (Windowed Operator). As in the case of stream operators, we take two types of windowed operators: unary ($WUnOp$) and binary ($WBinOp$). As examples of windowed unary operators, we can give traditional aggregation operators such as average, count, sum, min, and max. Similarly, a windowed join is a binary operator (see Section 4 for operator examples). In this section, we only define unary windowed aggregation operators due to size restrictions. However, other operators can be defined in an analogous way.

Similarly to Definition 7, an aggregation $WUnOp$ operator takes as input a window $R$ and returns the result tuple to the output stream:

$$WUnOp^{t}(R) = sout^{t+\delta t}_{|Sout^{t}|} \qquad (5)$$

where $R = W^t_{(start,end)}(Sin^t)$.

As in formulas 2 and 3, we can recover the output stream in the case of a periodically sliding window as follows:

$$Sout = \bigcup_{n=1}^{\infty} WUnOp^{t_0+(n-1)\times rate}(R_n) \qquad (6)$$

$$\phantom{Sout} = \bigcup_{n=1}^{\infty} sout^{t_0+n\times rate}_{|Sout^{t_0+(n-1)\times rate}|} \qquad (7)$$

where $R_n$ has been introduced in formula 4.

Figure 6. $W$ creates windows from the input stream. $WUnOp$ is executed on such windows and creates the output stream.

Note that for some windowed unary aggregation operators (e.g. average, count, sum) and binary operators (e.g. join), the timestamp value that the output tuple will take is not obvious. There are several possibilities to handle this problem: i) to choose as the output's timestamp one of the timestamp values of the input tuples which contributed to the output tuple (e.g. the minimum [16], the maximum [9], or the one indicated by the query [10]); ii) to assign a new timestamp (e.g. the operator's execution time [10]); iii) or alternatively to have a time interval $[min\_ts, max\_ts]$ instead of a unique timestamp [18]. In order to maintain the temporal order of the output tuples, we choose to take the maximum of the input timestamp values. It is also the value nearest to the one that the operator would assign if the second solution were chosen.

4 Query example

This section illustrates several aspects of SStreaM. The example is based on a hybrid multi-level architecture defined to query distributed heterogeneous sensors [17]. The architecture (see Figure 8) is composed of three main levels: control sites, gateways and sensors. Control sites are the entry points of the system. Users or applications send their queries to the control site, and the control site decomposes the query in order to send the sub-queries to the gateways concerned by the query. Gateways are distributed according to an attribute (mostly the location attribute). They group different kinds of sensors, more precisely their proxies. A proxy is the software controlling one or more sensors. On the gateway, there is also one adapter per proxy, which is the interface between the sensor-specific proxy and our sensor querying system. Adapters are responsible for translating between our query language and the proxy's sensor-specific control commands. Sensors are physically distributed in an environment and send their measures to their proxies in a periodic or aperiodic manner. There are different kinds of sensors (temperature, pressure, localization, etc.) with different capabilities, such as some query operator processing and storage capacities, which can be used for query optimization purposes (e.g. if a sensor can execute a selection operator, then push the selection operator to the sensor). Having this architecture in mind, consider the following scenario which will be used to illustrate a query example:
Figure 8. Architecture and query example
In a factory, each product passes in turn through a certain number of sections during its production lifecycle. The product stays for one minute in each section, where some operations are performed on it, and then passes to the next section. Each section has a gateway containing different kinds of sensor proxies. For our example, we consider two types of sensors: temperature sensors and RFID readers. We assume that there are several temperature sensors placed at different locations of a section, and one RFID reader per section detecting product tags (see Figure 8).

According to the general operator definitions given in section 3, we introduce the following operators, which will be used for the query example:

Stream operators:
$SelOp_P$ takes as input a tuple $s_i$ and returns $s_i$ if the tuple satisfies the predicate condition defined in $P$.
$ProjOp_L$ takes as input a tuple $s_i$ and returns the tuple $s'_i$, which contains only the attributes of $s_i$ listed in $L$.

Windowed operators:
$WAVG_{attr}$ takes as input a window and returns the average of the values of the $attr$ attribute of the tuples present in the window.
$WJoin_{JC}$ takes two windows as input and returns the concatenation of the tuples satisfying the join condition specified in $JC$.

In line with our objective of representing different kinds of sensor data, we define a global common schema sensor stream:

< SId, location, type, measurement, timestamp >

This schema is actually a view over different distributed databases located at different levels of the architecture (control sites, gateways, proxies) and over the stream data of sensors. Note that the first three attributes form the property attributes.

Let us consider the following query Q: Which products in the production chain underwent an average temperature of more than 40°C during their presence in a section?

This query will be executed on the gateway of each section. The partial results from the gateways will then be collected by the control site. We illustrate the part of the query that will be executed at one section.

Let $S_1$ be the stream created by the RFID reader, and $S_2$ the stream created by the temperature sensors of the section. Then Q can be represented in algebraic form as follows:

$$ProjOp_L(SelOp_P(WAVG_{attr}(W'_{desc_3}(WJoin_{JC}(W'_{desc_1}(S_1), W'_{desc_2}(S_2))))))$$

where $L = \langle S_1.measurement \rangle$, $P = (S_1.type = RFID \wedge S_2.type = Temperature \wedge average > 40)$, $attr = S_2.measurement$, and $JC = (S_1.location = S_2.location)$.

There is a join between the RFID readers' data stream ($S_1$) and a sliding window over the temperature sensors' data stream ($S_2$). It is an equi-join over the location property. However, given our assumption that there is one gateway per section, this condition will always be true. As a result, the join simply couples each product with the temperature readings made during its presence in the section. Since each product stays in a section for one minute, the width of the window is 60 seconds (the smallest time unit is a second). The window is aperiodic; its sliding rate is determined by the products' arrival rate. The sliding distance for both edges of the window is the difference between the arrival times of two consecutive products. Therefore, the join is computed between the tuples of $S_1$ and the windows created on stream $S_2$. This window creation operator uses the following description:

$$desc_2 = \langle t_0,\ t_0 + 60,\ rate(n),\ dist(n),\ dist(n) \rangle$$

where $t_0$ is the timestamp of the first tuple in stream $S_1$; $rate(n) = dist(n) = S_1[n+1].timestamp - S_1[n].timestamp$; and $S_1[n]$ gives the tuple currently in progress.

We note that, although our model does not define joins between a stream and a window, we can consider the stream as a position-based tumbling window whose start parameter is 0, end parameter is 1, and rate is aperiodic.
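To make the algebraic form of Q more concrete, here is a minimal Python sketch of the part of the query evaluated at one gateway. It is an illustration under assumptions, not the prototype's implementation: the `SensorTuple` class, the `evaluate_q` function and the sample readings are hypothetical, timestamps are integers in seconds, and each window on $S_2$ is taken as the half-open interval [t, t + 60) starting at the arrival time t of a product.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class SensorTuple:
    # Mirrors the schema <SId, location, type, measurement, timestamp>.
    sid: str
    location: str
    type: str
    measurement: object
    timestamp: int  # seconds


WIDTH = 60  # window width in seconds (one minute per section)


def evaluate_q(s1, s2, threshold=40.0):
    """Return the S1 measurements (product tags) whose average temperature
    over the 60 s following their arrival exceeds `threshold`."""
    result = []
    for product in s1:
        if product.type != "RFID":
            continue
        # Window over S2 anchored at the product's arrival (desc2),
        # joined on equal location (JC).
        window = [
            t.measurement
            for t in s2
            if t.type == "Temperature"
            and t.location == product.location
            and product.timestamp <= t.timestamp < product.timestamp + WIDTH
        ]
        if not window:
            continue  # no reading to average for this product
        average = mean(window)  # WAVG over attr = S2.measurement
        if average > threshold:  # SelOp with predicate P
            result.append(product.measurement)  # ProjOp with L
    return result


s1 = [
    SensorTuple("rfid-1", "section-A", "RFID", "product-1", 0),
    SensorTuple("rfid-1", "section-A", "RFID", "product-2", 60),
]
s2 = [
    SensorTuple("temp-1", "section-A", "Temperature", 38.0, 10),
    SensorTuple("temp-2", "section-A", "Temperature", 39.0, 40),
    SensorTuple("temp-1", "section-A", "Temperature", 41.0, 70),
    SensorTuple("temp-2", "section-A", "Temperature", 43.0, 100),
]

print(evaluate_q(s1, s2))
```

With these sample readings, product-1 sees an average of 38.5 and product-2 an average of 42, so only product-2 is reported. The sketch works on finite lists for readability; a continuous evaluation would process the tuples incrementally as they arrive.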
For the position-based tumbling window over $S_1$, $start\_adv = end\_adv = 1$. Therefore, $desc_1 = \langle 0, 1, rate(n), 1, 1 \rangle$.

Finally, the average operation is executed over a temporal tumbling window over the stream created by the windowed join operation. The start, end, and rate parameters have the same values as those in $desc_2$, and $start\_adv = end\_adv = 60$ (i.e. $desc_3 = \langle t_0, t_0 + 60, rate(n), 60, 60 \rangle$). The average operation calculates the average temperature over a 60-second window for each product and adds an average attribute to the result tuple. Figure 9 shows the part of the query executed on each gateway. The answers of the gateways are then merged by the control site.

Figure 9. Query processing

5 Conclusion and perspectives

This paper proposed SStreaM, a model for representing sensor stream data and queries. SStreaM provides a general sensor stream representation model. It defines three types of operators: stream operators, window creation operators and windowed operators. These operators include a temporal dimension to reflect the real-time aspects of sensor stream querying. A general window definition and flexible window edge management are also provided.

A prototype implementing SStreaM has been developed for the PISE project [1]. The aim of this project is to monitor electric power materials in real time. Various sensors give information about the current status of the materials (intensity, voltage, quality of electricity, etc.). The project uses the OSGI platform [2] and thus adopts a service-oriented approach. Data is collected by sensor services and aggregated by distributed query services on the gateways. Global sensor stream query services at control sites discover and query sensor stream data through query services and sensor services.

Our ongoing research concerns the management of sensor farms. We have found that continuous queries will be executed simultaneously with update transactions modifying sensor properties. This will require specific transaction management. We believe that the temporal dimension of the operators introduced in SStreaM will lead to a finer management of continuous queries.

References

[1] PISE Project, http://www.telecom.gouv.fr/rnrt/rnrt/projets/PISE.htm.
[2] OSGI (Open Services Gateway Initiative), http://www.osgi.org.
[3] D. Abadi, D. Carney, U. Çetintemel, M. Cherniack, C. Convey, S. Lee, M. Stonebraker, N. Tatbul, and S. Zdonik. Aurora: a new model and architecture for data stream management. VLDB J., 12(2):120–139, 2003.
[4] D. Abadi, W. Lindner, S. Madden, and J. Schuler. An integration framework for sensor networks and data stream management systems. In VLDB, pages 1361–1364, 2004.
[5] R. Agrawal and R. Srikant. Mining sequential patterns. In ICDE-11, pages 3–14, Taiwan, 1995.
[6] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. Wireless sensor networks: a survey. Computer Networks, 38(4):393–422, 2002.
[7] A. Arasu, B. Babcock, S. Babu, M. Datar, K. Ito, R. Motwani, I. Nishizawa, U. Srivastava, D. Thomas, R. Varma, and J. Widom. STREAM: The Stanford stream data manager. IEEE Data Eng. Bull., 26(1):19–26, 2003.
[8] A. Arasu, S. Babu, and J. Widom. The CQL continuous query language: Semantic foundations and query execution. Technical Report 2003-67, Stanford University, 2003.
[9] A. M. Ayad and J. F. Naughton. Static optimization of conjunctive queries with sliding windows over infinite streams. In SIGMOD '04, pages 419–430, NY, USA, 2004.
[10] B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Models and issues in data stream systems. In PODS '02, pages 1–16, NY, USA, 2002.
[11] P. Bonnet, J. Gehrke, and P. Seshadri. Towards sensor database systems. Lecture Notes in Computer Science, 2001.
[12] S. Chandrasekaran, O. Cooper, A. Deshpande, M. Franklin, J. Hellerstein, W. Hong, S. Krishnamurthy, S. Madden, V. Raman, F. Reiss, and M. Shah. TelegraphCQ: Continuous dataflow processing for an uncertain world. In CIDR, 2003.
[13] J. Chomicki. Temporal query languages: a survey. In ICTL'94, volume 827, pages 506–534.
[14] W. Dreyer, A. K. Dittrich, and D. Schmidt. Research perspectives for time series management systems. SIGMOD Record, 23(1):10–15, 1994.
[15] P. B. Gibbons, B. Karp, Y. Ke, S. Nath, and S. Seshan. IrisNet: An architecture for a world-wide sensor web. IEEE Pervasive Computing, 2003.
[16] L. Golab and M. T. Ozsu. Update-pattern-aware modeling and processing of continuous queries. In SIGMOD '05, pages 658–669, NY, USA, 2005.
[17] L. Gurgen, C. Labbé, V. Olive, and C. Roncancio. A scalable architecture for heterogeneous sensor management. In DEXA Workshops, pages 1108–1112, Denmark, 2005.
[18] M. Hammad, W. Aref, and M. Franklin. Efficient execution of sliding-window queries over data streams, 2003.
[19] C. Intanagonwiwat, R. Govindan, D. Estrin, J. Heidemann, and F. Silva. Directed diffusion for wireless sensor networking. IEEE/ACM Transactions on Networking, 2003.
[20] C. S. Jensen, R. T. Snodgrass, and M. D. Soo. The TSQL2 data model. In The TSQL2 Temporal Query Language, pages 153–238. 1995.
[21] B. Kao and H. Garcia-Molina. An overview of real-time database systems. pages 463–486, 1995.
[22] K. Ramamritham. Time for real-time temporal databases? In Proceedings of the International Workshop on an Infrastructure for Temporal Databases, 1993.
[23] S. Madden and M. J. Franklin. Fjording the stream: An architecture for queries over streaming sensor data. In ICDE, pages 555–566, 2002.
[24] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. TAG: A tiny aggregation service for ad-hoc sensor networks. In OSDI, 2002.
[25] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. TinyDB: an acquisitional query processing system for sensor networks. ACM Trans. Database Syst., 2005.
[26] G. Ozsoyoglu and R. T. Snodgrass. Temporal and real-time databases: A survey. TKDE, 7(4):513–532, 1995.
[27] K. Ramamritham. Real-time databases. Distributed and Parallel Databases, 1(2):199–226, 1993.
[28] D. Schmidt, A. K. Dittrich, W. Dreyer, and R. W. Marti. Time series, a neglected issue in temporal database research? In Int. Workshop on Temp. Databases, UK, 1995.
[29] P. Seshadri, M. Livny, and R. Ramakrishnan. The design and implementation of a sequence database system. In Proceedings of VLDB'96, pages 99–110, San Francisco, USA.
