OpenSense
opensense.epfl.ch
OVERVIEW
CONCLUSION
AIR POLLUTION
AIR POLLUTION MONITORING
Officials
pollution sources
municipalities: creating incentives to reduce environmental footprint
public health studies
Citizens
User concerns
Study participants are sensitive to data privacy
Participants would like personalized information about individual exposure and risks
MONITORING TODAY
Data is difficult to integrate into applications (e.g. for correlating with other features such as people's activities)
OPPORTUNITIES
Wireless communication: deploy larger numbers of stations
Mobility: deploy mobile stations and increase coverage
RESEARCH CHALLENGES
SENSING SYSTEM
From many wireless, mobile, heterogeneous, unreliable raw measurements
Nano-Tera
Microscale: 5m^2
SENSING SYSTEM
LAUSANNE DEPLOYMENT
Sensor modalities and communication
2 close to the NABEL station, 1 close to the e-vehicle garage; needed for sensor calibration and long-term experiments
EPFL|TRACE
WITH X-SENSE)
close to NABEL station
long-term sensor testing and sensor calibration
testing new sensors (combined CO/NO2 sensor)
O3, CO, Ultra Fine Particles, humidity, temperature
> 1 year of measurements and 30 million data points
Communication: GPRS/WLAN
Mar 2013
O3, Ultra Fine Particles, humidity, temperature
Covers all Switzerland
AIR POLLUTION SENSORS
Gases
We measure CO, CO2, NO2, O3
Lack of long-term spatially resolved data due to high sensor cost
Lack of good low-cost sensors on the market
Need for frequent sensor calibration
Ultrafine particles (UFP) are much smaller than PM10 or PM2.5 and are believed to have more severe health implications
Lack of epidemiological studies on health effects of long-term exposure
High cost of UFP monitoring equipment
Lack of spatially resolved exposure data
Lack of reliable dispersion models
SENSOR STATIONS ON PUBLIC TRANSPORTATION
Lausanne deployment
Zurich deployment
AirQualityEgg
Smartphone connected to ozone sensor and various application software for Android
[Hasenfratz et al., Mobile Sensing 2012] [Predic et al., PerCom 2013]
Low-cost devices for home deployment (calibration tests close to a NABEL station)
[Figure: sensor output, axis: Output [V]]
SAMPLING SYSTEM
Slow response of chemical sensors
active vs. passive sampling
open vs. closed sampling system
new flow pre-processing layer for Lausanne deployment enabling full flow control
[Diagram: sampling system with intake, pump, valve, chamber, and exhaust; plot axis: Amplitude [ppm]]
LOCALIZATION ACCURACY
Accurate chemical sampling requires accurate positioning
Low-cost, embedded sub-meter accuracy in urban settings requires sensor fusion & light map matching algorithms
Large set of rich data (stop coordinates, heading, odometer, acceleration, vehicle context data, etc.)
[Figure: vehicle context data, e.g. "doors open", "Next stop: Sallaz", "Next stop: Valmont"]
CITY COVERAGE
Public transport vehicles are not bound to a specific line number but rather to their host depot
We can choose how many stations we deploy in each depot but not on which lines
Depot Oerlikon
3 trams
Depot Irchel
Depot Kalkbreite
2 trams
5 trams
CALIBRATION PROCEDURE
Gas sensor drift (aging) -> periodic recalibration needed
Gas sensors are installed on mobile vehicles
Few expensive reference stations within city limits
Two recipes:
Calibration upon rendezvous of mobile vehicles and references
Passing of calibration data from vehicle to vehicle: Multi-hop Calibration
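The two recipes can be illustrated with a minimal sketch, assuming a linear sensor response (reference ≈ gain * raw + offset) fitted by least squares on readings taken while a vehicle is close to a reference. The function names, the linear model, and all numbers below are illustrative assumptions, not the project's actual calibration method.

```python
from statistics import mean

def fit_calibration(raw, ref):
    """Least-squares fit of ref ≈ gain * raw + offset from paired readings."""
    if len(raw) != len(ref) or len(raw) < 2:
        raise ValueError("need at least two paired readings")
    raw_mean, ref_mean = mean(raw), mean(ref)
    sxx = sum((x - raw_mean) ** 2 for x in raw)
    if sxx == 0:
        raise ValueError("raw readings must not all be identical")
    sxy = sum((x - raw_mean) * (y - ref_mean) for x, y in zip(raw, ref))
    gain = sxy / sxx
    return gain, ref_mean - gain * raw_mean

def apply_calibration(params, raw_value):
    """Convert a raw reading into a calibrated value."""
    gain, offset = params
    return gain * raw_value + offset

# Recipe 1: vehicle A passes a reference station (hypothetical paired readings).
a_raw = [0.10, 0.20, 0.30, 0.40]          # raw sensor output
station_ref = [12.0, 22.0, 32.0, 42.0]    # reference concentration
cal_a = fit_calibration(a_raw, station_ref)

# Recipe 2 (multi-hop): vehicle B meets the already calibrated vehicle A
# and uses A's calibrated readings as its reference.
a_raw_at_meeting = [0.15, 0.25, 0.35]
b_raw_at_meeting = [1.0, 2.0, 3.0]
a_calibrated = [apply_calibration(cal_a, v) for v in a_raw_at_meeting]
cal_b = fit_calibration(b_raw_at_meeting, a_calibrated)
```

In a multi-hop chain each hop inherits the error of the previous one, so a real system would presumably track how many hops separate a sensor from a reference station.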
DATA QUANTITY AND VISUALIZATION
CO: 2,820,000 data points; 20 s; 18 months
FROM DATA TO INFORMATION
CHOICE OF MODELS
Physics-based
[Figure: single measurements on a map; scale 3 km]
Processing steps:
Raw data
Data validation
LUR model
Pollution map
Map validation
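The processing steps above can be sketched end to end. This is a minimal illustration that assumes the land-use regression (LUR) model is an ordinary least-squares fit on a few land-use predictors; the predictor choice (traffic density, green area), the plausibility range, and all values are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def validate(samples, lo=0.0, hi=1e6):
    """Data validation: drop readings outside a plausible range."""
    return [s for s in samples if lo <= s["value"] <= hi]

def fit_lur(samples):
    """LUR model: value ≈ w0 + w1*f1 + w2*f2 + ... via normal equations."""
    rows = [[1.0] + list(s["features"]) for s in samples]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * s["value"] for r, s in zip(rows, samples)) for i in range(k)]
    return solve(xtx, xty)

def predict(weights, features):
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))

def rmse(weights, samples):
    errs = [(predict(weights, s["features"]) - s["value"]) ** 2 for s in samples]
    return math.sqrt(sum(errs) / len(errs))

# Raw data: features are (traffic density, green area), both hypothetical.
raw = [
    {"features": (1.0, 0.0), "value": 1050.0},
    {"features": (2.0, 10.0), "value": 1080.0},
    {"features": (0.0, 5.0), "value": 990.0},
    {"features": (3.0, 20.0), "value": 1110.0},
    {"features": (4.0, 0.0), "value": 1200.0},
    {"features": (2.0, 2.0), "value": -5.0},  # glitch, removed by validation
]
held_out = [{"features": (1.0, 5.0), "value": 1040.0}]

clean = validate(raw)                       # data validation
weights = fit_lur(clean)                    # LUR model
grid = {"cell_a": (0.5, 1.0), "cell_b": (3.5, 0.5)}
pollution_map = {c: predict(weights, f) for c, f in grid.items()}  # pollution map
map_error = rmse(weights, held_out)         # map validation on unseen points
```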
UFP DISTRIBUTION IN ZURICH
[Figure: UFP distribution maps for Zurich; panel title: Winter (Jan-Mar); scale ticks 0.5 to 2.5, x 10^3]
UFP DISTRIBUTION IN ZURICH
Li et al., poster
Spatial + Land Use Regression with Gaussian Processes
Random 10-fold validation: RMSE = 2324
UFP Estimation (Mean)
UFP Estimation Confidence (95% Conf. Int)
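For readers unfamiliar with the validation figure quoted above, random k-fold validation can be sketched as follows. This is a generic illustration, not the Gaussian-process model from the poster: the `fit` and `predict` callables are placeholders, and a trivial mean predictor stands in for the real model.

```python
import math
import random

def kfold_rmse(xs, ys, fit, predict, k=10, seed=0):
    """Random k-fold validation: pooled RMSE over all held-out points."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)        # random assignment to folds
    folds = [idx[i::k] for i in range(k)]
    sq_err, count = 0.0, 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            sq_err += (predict(model, xs[i]) - ys[i]) ** 2
            count += 1
    return math.sqrt(sq_err / count)

# Stand-in model (assumption): predict the training mean everywhere.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)

def predict_mean(model, x):
    return model

xs = list(range(30))
ys = [2.0 * x for x in xs]
score = kfold_rmse(xs, ys, fit_mean, predict_mean)
```

The RMSE is expressed in the unit of the predicted quantity, so it is only meaningful relative to the typical magnitude of the measurements.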
GENERALIZED USE OF MODELS
[Sathe et al., Model-based Sensor Data Acquisition and Management, Springer, to appear]
Problem:
original data stream approximation using user-selected models
detecting anomalies
user confirmation: is the anomaly an actual error?
[Paparrizos et al., ICDE, 2011]
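The idea of approximating the stream with a model and flagging large deviations can be sketched as below. The model here is a moving median over neighbouring samples, which is an assumption chosen for brevity; the cited work lets the user select the model. Window size, threshold, and data are hypothetical.

```python
from statistics import median

def flag_anomalies(stream, window=5, threshold=3.0):
    """Flag samples that deviate from a local median model by more than threshold.

    Returns (index, observed value, model estimate) tuples that would then
    be shown to the user for confirmation.
    """
    half = window // 2
    flagged = []
    for i, value in enumerate(stream):
        lo, hi = max(0, i - half), min(len(stream), i + half + 1)
        neighbours = stream[lo:i] + stream[i + 1:hi]   # exclude the sample itself
        if not neighbours:
            continue
        estimate = median(neighbours)
        if abs(value - estimate) > threshold:
            flagged.append((i, value, estimate))
    return flagged

readings = [5.9, 6.1, 6.2, 25.0, 5.7, 6.2, 6.0]
candidates = flag_anomalies(readings)
```

A flagged sample is only a candidate: a sudden peak may be a real pollution event, which is why the confirmation step is left to the user.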
QUERY PROCESSING
Continuous Moving Queries: Give a (in car) pollution update every 30 mins
Aggregate Queries: COx emitted yesterday in Lausanne center
Approach
Data aggregator produces a model cover from a set of models on an area
Continuous sensor updates
Continuous and ad-hoc queries
Challenge
Different sensor accuracies
Unreliable, erroneous data
Uncontrolled mobility
Results: 3 algorithms
[Cartier et al., SECON 2012]
DATA COMPRESSION
[Figure: sensor streams s1, s2 at times t1, t2, t3 (sample values 5.9 6.1 6.2 5.2 5.7 6.2; s2: 5.3 5.7 6.1), models m1, m2, transmission via internet]
Original data stream approximation using models
bitmap compression of model parameters
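A minimal sketch of model-based stream approximation, assuming the simplest possible model: a piecewise-constant segment per run of samples, with a guaranteed maximum error. The slide does not state which models are used, and the bitmap compression of the model parameters is not shown here. Names and the error bound are illustrative.

```python
def compress(stream, max_error):
    """Approximate a stream by constant segments, each within max_error.

    Returns a list of (start_index, value) pairs, one per segment.
    """
    if not stream:
        return []
    segments = []
    start, lo, hi = 0, stream[0], stream[0]
    for i in range(1, len(stream)):
        new_lo, new_hi = min(lo, stream[i]), max(hi, stream[i])
        if (new_hi - new_lo) / 2 > max_error:
            # The sample does not fit: close the segment at its mid-range value.
            segments.append((start, (lo + hi) / 2))
            start, lo, hi = i, stream[i], stream[i]
        else:
            lo, hi = new_lo, new_hi
    segments.append((start, (lo + hi) / 2))
    return segments

def decompress(segments, length):
    """Rebuild an approximate stream of the given length from the segments."""
    out = []
    for k, (start, value) in enumerate(segments):
        end = segments[k + 1][0] if k + 1 < len(segments) else length
        out.extend([value] * (end - start))
    return out

stream = [5.9, 6.1, 6.2, 5.2, 5.7, 6.2]
segments = compress(stream, max_error=0.2)
restored = decompress(segments, len(stream))
```

Only the segment parameters need to be transmitted, so the saving grows with the smoothness of the signal and with the error the application can tolerate.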
THE USERS
CONTEXT EXTRACTION
Objective: automatically annotating trajectories of different types of moving objects (cars, people)
Stops: Hidden Markov Model (HMM), stop behaviors
Moves: map matching, transportation means
Trajectory: land use coverage
Different concerns, perceptions, user groups, data quality requirements
Can we satisfy them simultaneously?
Incentives for participation
Trusted data
Protecting privacy
Objective: multi-query optimization for maximizing social benefit
Economic approach
Incentives
Users as producers
User
Utility-based framework
THE SETTINGS
ENHANCING SENSING EFFICIENCY
Exploit the spatial correlations
Reduce data readings
Maintain coverage (small reconstruction error)
Achieve fairness among all mobile users
Adaptively choosing subsets of active mobile users
[Figure: numbered mobile users (1 to 6) before and after reducing the active set]
CONCLUSION
Investigate all system layers: from sensors to user interfaces
Utility-based framework as integrative approach
System modeling as a key requirement
TEAM
Karl Aberer, EPFL-LSIR, PI
Thanasis Papaioannou, postdoc; Zhixian Yan, postdoc; Hoyoung Jeung, postdoc; Rammohan Narendula, PhD; Mehdi Riahi, PhD; Alex Arion, PhD; Saket Sathe, PhD; Tian Guo, PhD; Julien Eberle, PhD; Sofiane Sarni, engineer; Jason Jingshi Li, postdoc
Olga Saukh, postdoc; Jan Beutel, postdoc; David Hasenfratz, PhD; Christoph Walser, engineer
Alexander Bahr, postdoc; Ali Marjovi, postdoc; Adrian Arfire, PhD; William C. Evans, PhD; Emanuel Droz, engineer
BACKUP SLIDES
DEATHS FROM URBAN AIR POLLUTION
AIR POLLUTION AND CARDIOVASCULAR MORTALITY
Health studies show that air pollution increases the risk of cardiovascular mortality (heart attacks) by 5% to 20% at least
WHAT IS THE PROBLEM?
Node decides individually depending on its state, e.g. calibration
Nodes communicate with WSN and coordinate
Base station schedules nodes using
mobility model: a third node arrives, don't measure!
Air quality model: don't need measurement!
Privacy model: node 1 should measure!
Application model (e.g. health service): no measurement needed!
VALUE OF DENSE MEASUREMENTS
Traditional approach
Few stations
Low resolution interpolated estimates of pollutant concentrations across massive regions
Recent results
Massive deployment of stations (150) at street-level (2008/2009 New York City Community Air Quality Survey)
Pollutants of interest heavily concentrated along roads with high traffic densities
GRANULARITY OF MODELS
Mesoscale: 1km^2
Microscale: 5m^2
Statistical
CALIBRATION ACCURACY
[Figure: raw data over time; axis ticks 50 to 150]
SEGMENTATION HELPS
[Figure: two panels, "No segments" and "5 Segments"; vertical axis 0.2 to 0.8, horizontal axis 0 to 24]
Basic strategy: O(k log n)
Binary+: optimized approach in finding segment boundaries
Heuristic: segmentation using absolute errors
Heuristic+: segmentation using relative errors
Near-optimal segmentation
SEGMENTATION RESULTS
One day data as training and test sets
Heuristic is better than Binary, especially for testing
Large number of segments (k > 5) does not help much
SAMPLING
Optimal sampling
NP-hard
Uniform
Random
Points with highest entropy
Remove information redundancy
Recalculate entropy given the points already selected
Distribution-based sampling
SAMPLING RESULTS
Uniform is better than random: duty cycle sensing!
Entropy is as good as mutual-info when the sampling rate is large
Entropy gets worse when the sampling rate is small (bias for large errors)
Problem: the true distribution is not observable
How to determine the quality of the estimation?
EXPERIMENTAL EVALUATION
Three methods
Standard geo-statistics: kriging (one model)
Uniform gridding: linear regression for each grid cell
Adaptive k-means: cluster points that are jointly well approximated by a linear regression model
S. Cartier, S. Sathe, D. Chakraborty and K. Aberer, ConDense: Managing Data in Community-driven Mobile Geosensor Networks, SECON 2012.
ADAPTIVE K-MEANS
Algorithm sketch
Create k clusters in 2-d space
Build a model for each cluster
In each cluster identify the point with the largest approximation error
If that point is above error threshold it becomes a new cluster center
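The algorithm sketch above can be written out roughly as follows. Two simplifications are assumptions of this illustration: the per-cluster model is the cluster mean rather than a linear regression, and points are assigned to the nearest center without the usual k-means centroid update. Point layout `(x, y, value)`, the threshold, and the data are hypothetical.

```python
import math

def assign(points, centers):
    """Assign each (x, y, value) point to its nearest center in 2-d space."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)),
                      key=lambda c: math.dist(p[:2], centers[c]))
        clusters[nearest].append(p)
    return clusters

def adaptive_clusters(points, centers, threshold, max_rounds=20):
    """Add cluster centers until every cluster model meets the error threshold."""
    centers = list(centers)
    for _ in range(max_rounds):
        clusters = assign(points, centers)
        new_centers = []
        for members in clusters:
            if not members:
                continue
            model = sum(p[2] for p in members) / len(members)    # stand-in model
            worst = max(members, key=lambda p: abs(p[2] - model))
            if abs(worst[2] - model) > threshold and worst[:2] not in centers:
                new_centers.append(worst[:2])                    # becomes a center
        if not new_centers:
            break
        centers.extend(new_centers)
    return centers, assign(points, centers)

# Two spatially separated groups with different levels.
points = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0), (2.0, 0.0, 10.0),
          (10.0, 0.0, 50.0), (11.0, 0.0, 50.0), (12.0, 0.0, 50.0)]
final_centers, final_clusters = adaptive_clusters(points, [(6.0, 0.0)], threshold=5.0)
```

Starting from a single center, the sketch splits the area only where the model error demands it, which is the adaptive part of the scheme.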
EVALUATION: PROCESSING COST
EVALUATION: ERROR
EXPERIMENTAL RESULTS
[Figure: error and completeness]
Real data for electrosmog sensing from Nokia campaign
Avg Static: static parameters that meet the threshold on average
Max Static: static parameters that always meet the threshold
USER PRIVACY
VS.
Participatory sensing
Users reveal location
Semi-honest aggregation server infers user activity
Obfuscation affects data quality
B. Agir, T. Papaioannou, R. Narendula, K. Aberer, J.P. Hubaux, An adaptive scheme for personalized privacy in participatory sensing, WiSec 2012.
TASK ASSIGNMENT
They specify a valuation function v_q() and a limited budget B_q for each query q
Trustworthiness, accuracy
Battery consumption, privacy leakage
maximize utility (social welfare) u(S'), S' subset of S
Utility definition: difference between the value of the query results and the cost for obtaining the results.
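One way to write this definition down, as a sketch under assumed notation: S is the set of available sensors, S' the selected subset, Q the set of queries, v_q the valuation function of query q, and c_s the cost of using sensor s (for example battery consumption and privacy leakage). The symbols Q and c_s, and the assumption that costs add up per sensor, do not appear on the slide.

```latex
u(S') \;=\; \sum_{q \in Q} v_q(S') \;-\; \sum_{s \in S'} c_s ,
\qquad S' \subseteq S
```

Task assignment then amounts to choosing the S' that maximizes u(S') while respecting each query's budget B_q.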
QUERIES IN PARTICIPATORY SENSING
COMPUTING AN ALLOCATION
Optimal
Formulate the problem as binary integer linear program
Heuristic (LocalSearch)
Iteratively choose the sensor that maximizes the utility gain and remove obsolete sensors (exploits sub-modularity)
1/3-approximation algorithm for sub-modular functions
Greedy (Baseline)
Iteratively choose, for each query, the sensor with highest utility
Valuation functions v_q need not be sub-modular
Experiments: greedy scheduling works better than heuristic scheduling when the v_q are not sub-modular
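The greedy baseline (for each query, pick the sensor with the highest utility) can be sketched as below. The data layout, the use of value minus cost as utility, and the budget check against the sensor cost are assumptions made for illustration; the slides do not give these details.

```python
def greedy_allocation(queries, sensors):
    """For each query, pick the affordable sensor with the highest positive utility."""
    allocation = {}
    for q in queries:
        best, best_utility = None, 0.0
        for s in sensors:
            if s["cost"] > q["budget"]:
                continue                      # query cannot afford this sensor
            utility = q["value"](s) - s["cost"]
            if utility > best_utility:
                best, best_utility = s["id"], utility
        if best is not None:
            allocation[q["id"]] = best
    return allocation

# Hypothetical sensors: cost stands for battery use and privacy leakage.
sensors = [
    {"id": "bus_1", "cost": 4.0, "accuracy": 0.9},
    {"id": "phone_7", "cost": 0.5, "accuracy": 0.5},
    {"id": "phone_9", "cost": 2.0, "accuracy": 0.6},
]
# Hypothetical queries: valuation grows with sensor accuracy.
queries = [
    {"id": "q1", "budget": 5.0, "value": lambda s: 10.0 * s["accuracy"]},
    {"id": "q2", "budget": 1.0, "value": lambda s: 4.0 * s["accuracy"]},
]
allocation = greedy_allocation(queries, sensors)
```

Because each query is served independently, this baseline makes no assumption about the shape of the valuation functions, which fits the remark that they need not be sub-modular.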
EVALUATION
INCENTIVE SCHEMES FOR SENSORS
Reporting the true measurement is the best strategy for sensor operators
Incentivize measurements at locations of less certainty
Resistant to collusion (up to 70% collusion in simulation)
[Figure: Fig. 9; caption fragment: levels]
Extends existing incentive scheme to multivalue sensors.
An effective reputation system for sensors.
[Figure: Fig. 8. Payment to an average sensor for different levels of collusion reporting the public prior]
[Figure: Fig. 6. Average payment per sensor given uncertainty]
and the proper scoring rules. Figure 6 shows the average payment a given sensor received from the Peer Truth Serum given different degrees of uncertainty of the pollutant level for the given sensor location. The uncertainty is presented in the form of the root-mean-squared deviation between the ground truth and the most likely value from the public prior at the sensor location. This graph shows that in general, the Peer Truth Serum incentivizes reporting at locations of greater uncertainty, where the public prior differs more from the actual ground truth observed by the sensor. In contrast, the proper scoring rules are indifferent to the degree of imprecision at the location of measurement. Table 1 shows the distribution of payments received by an average sensor throughout the simulation, adopting
Service market
sampling for locations considering error, value Data aggregation server required samples priority
Data market
sensor locations Mobility model predictions measurements, location, status Mobile sensors
sensor status
predictions
local coordination
Context
Data Flow
Control Flow