http://research.microsoft.com/en-us/projects/urbancomputing/default.aspx
Big Challenges in Big Cities
Big Data in Cities AI Technology
Urban Computing: win-win-win for People, Cities, and The Environment
Service Providing: improve urban planning, ease traffic congestion, save energy, reduce air pollution, ...
Urban Data Analytics: data mining, machine learning, visualization
Urban Data Management: spatio-temporal index, streaming, trajectory, and graph data management, ...
OS
Urban Computing: Concepts, Methodologies, and Applications. Zheng, Y., et al. ACM TIST.
Improving Medical Emergency Services using Big Data
[Figure: dispatching center and hospital]
Yilun Wang, Yu Zheng, et al. Travel Time Estimation of a Path using Sparse Trajectories. KDD 2014
Location Selection for Ambulance Stations: A Data-Driven Approach, ACM SIGSPATIAL 2015
Web site: http://ambulance.chinacloudsites.cn/
Urban Big Data
• Data Structures
• Spatio-temporal (ST) Properties
Point-Based
– Spatial static, temporal dynamic data: US EPA, China MEP, IoT
– Spatio-temporal dynamic data: Foursquare, geo-tweets, Dianping
Network-Based
– Spatio-temporal static data: road/transportation networks (Bing, Google, Gaode, Baidu Maps)
– Spatial static, temporal dynamic data: road traffic data (Gaode Maps, traffic management bureau)
– Spatio-temporal dynamic data: trajectory data (taxis, DD, Uber, China Mobile, China Telecom)
Tutorial on Trajectory Data Mining
[Figure: trajectory data mining framework for spatial trajectories. Uncertainty: privacy preserving, reducing uncertainty. Trajectory pattern mining: moving together patterns, frequent sequential patterns, periodic patterns, clustering. Trajectory classification. Trajectory outlier/anomaly detection. Graph mining and routing. Representations: graph, matrix, tensor.]
Yu Zheng. Trajectory Data Mining: An Overview. ACM Transactions on Intelligent Systems and Technology. 2015, vol. 6, issue 3.
Why Urban Big Data Platform
Bridge the gap between urban big data and urban computing applications
• Urban computing applications: city-wide and instantaneous
• Urban big data: multi-modal, large-scale, and highly dynamic
[Figure: spatial index, temporal index, ...]
Jie Bao, et al. Managing Massive Trajectories on the Cloud, ACM SIGSPATIAL 2016
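The idea of a spatial index that answers location-based range queries can be sketched with a uniform grid. This is a minimal illustration only, not the indexing scheme of the cited system; the class name `GridIndex`, the `cell_size` parameter, and the sample points are all hypothetical.

```python
from collections import defaultdict
from math import floor


class GridIndex:
    """Uniform grid over (lon, lat) points; a hypothetical illustration only."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        # (col, row) -> list of (point_id, lon, lat)
        self.cells = defaultdict(list)

    def _cell(self, lon, lat):
        return (floor(lon / self.cell_size), floor(lat / self.cell_size))

    def insert(self, point_id, lon, lat):
        self.cells[self._cell(lon, lat)].append((point_id, lon, lat))

    def range_query(self, min_lon, min_lat, max_lon, max_lat):
        """Return the sorted ids of points inside the closed rectangle."""
        c0, r0 = self._cell(min_lon, min_lat)
        c1, r1 = self._cell(max_lon, max_lat)
        hits = []
        # Visit only the grid cells that overlap the query rectangle.
        for col in range(c0, c1 + 1):
            for row in range(r0, r1 + 1):
                for point_id, lon, lat in self.cells.get((col, row), ()):
                    if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
                        hits.append(point_id)
        return sorted(hits)


index = GridIndex(cell_size=0.01)
index.insert("p1", 116.30, 39.98)
index.insert("p2", 116.31, 39.99)
index.insert("p3", 116.50, 39.90)
nearby = index.range_query(116.29, 39.97, 116.32, 40.00)
```

A grid keeps the sketch short; a production system would more likely combine a tree-based or space-filling-curve index with a temporal index.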
Cloud Computing for ST Data
[Figure: platform architecture with an API interface; spatial index, spatio-temporal index, and value index; trajectories Tr1 and Tr2. Data source types: point-based and network-based, each either spatial and temporal static (POIs), spatial static-temporal dynamic (stationary readings), or spatial-temporal dynamic (crowd sourcing readings).]
J. Bao, T. He, S. Ruan, Y. Li, and Y. Zheng et al. Planning bike lanes based on Sharing-bike’s trajectories, KDD 2017
Imperative for enabling smart cities!
Finding Top-k Most Influential Location Set
A submodular maximization problem, NP-hard
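For monotone submodular objectives under a cardinality constraint, a common approach is the greedy heuristic, which is known to achieve a (1 − 1/e) approximation. The sketch below assumes a simple coverage model in which each candidate location covers a set of trajectory ids. That model, the function name `greedy_top_k`, and the toy data are assumptions for illustration, not the influence definition used in the cited work.

```python
def greedy_top_k(coverage, k):
    """Greedy selection of up to k locations.

    coverage: dict mapping location id -> set of covered trajectory ids
    (a hypothetical influence model).
    """
    chosen, covered = [], set()
    for _ in range(min(k, len(coverage))):
        # Iterate in sorted order so ties resolve to the smallest location id.
        candidates = [loc for loc in sorted(coverage) if loc not in chosen]
        best = max(candidates, key=lambda loc: len(coverage[loc] - covered))
        if not coverage[best] - covered:
            break  # no remaining location adds new coverage
        chosen.append(best)
        covered |= coverage[best]
    return chosen, covered


coverage = {
    "A": {1, 2, 3},
    "B": {3, 4},
    "C": {4, 5, 6, 7},
    "D": {1, 7},
}
locations, covered = greedy_top_k(coverage, k=2)
```

Each round picks the location with the largest marginal gain, so overlap with already covered trajectories is discounted automatically.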
Interactive Visual Data Analytics
“SmartAdP: Visual Analytics of Large-scale Taxi Trajectories for Selecting Billboard Locations”, VAST 2016
Planning Bike Lanes Based on
Sharing-Bikes’ Trajectories
KDD 2017
J. Bao, T. He, S. Ruan, Y. Li, and Y. Zheng et al. Planning bike lanes based on Sharing-bike’s trajectories, KDD 2017
Effective Bike Path Planning Based on Mobike’s Data
[Figure: Shanghai, around Hongqiao Airport; s1 = 2 km]
NP-Hard Problem
Road Space & Money; User & Trip Coverage; K-Connected Components
J. Bao, T. He, S. Ruan, Y. Li, and Y. Zheng et al. Planning bike lanes based on Sharing-bike’s trajectories, KDD 2017
• Texts and images
  spatial and spatio-temporal data
• A single data source
  data across different domains
• Separate data mining algorithms
  machine learning + data management
• Visual and interactive data analytics
Applications
Visualization
[Figure: data analytics layer. Example data: physical location history, books browsed online, air quality data, traffic data. Features: road networks (Fr), POIs (Fp), traffic (Ft), meteorology (Fm), human mobility (Fh). Models: spatial classifier, temporal classifier, and joint task learners over regions, time slots, and categories (X = R×U, Y = T×RT).]
Data Analytics
• Machine learning for spatial and spatio-temporal data
API Interface
Hybrid Indexing: spatial index, temporal index, ...
Why Spatio-Temporal Data Is Unique
Spatial Properties
• Distance
– Spatial closeness
– Triangle inequality: $|d_1 - d_2| \le d_3 \le d_1 + d_2$
• Hierarchy
[Figure: ratio versus distance (km), 0 to 120 km; points p1 to p3; users u1 to u3 and locations s1 to s3]
Junbo Zhang, Yu Zheng, et al. Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction, AAAI 2017
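The triangle inequality bounds an unknown distance using two known ones. The sketch below assumes planar Euclidean distance; the helper name `distance_bounds` and the sample points are hypothetical.

```python
from math import dist


def distance_bounds(d1, d2):
    """Lower and upper bounds on the third side, given the other two."""
    return abs(d1 - d2), d1 + d2


# Three hypothetical locations forming a 3-4-5 right triangle.
a, b, c = (0.0, 0.0), (3.0, 0.0), (3.0, 4.0)
d1 = dist(a, b)
d2 = dist(b, c)
d3 = dist(a, c)
lower, upper = distance_bounds(d1, d2)
```

Knowing only `d1` and `d2`, any valid `d3` must lie between `lower` and `upper`. Categorical data such as text or images has no comparable constraint.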
Why Spatio-Temporal Data Is Unique
• Temporal properties
– Temporal closeness
– Period
– Trend
[Figure: ratio versus time interval (hour), 0 to 12 hours; A) Air quality data]
[Figure: speed (km/h) versus time; A) Hourly traffic speed on consecutive days; B) Traffic speed at 9-10am on consecutive weekends]
Junbo Zhang, Yu Zheng, et al. Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction, AAAI 2017
DNN-Based Urban Flow Prediction
Predict the in-flow and out-flow of crowds in each region of a city at the next time interval
[Figure: regions r1, r2, r3 with in-flow, out-flow, new-flow, and end-flow]
• Important for:
▪ Traffic management
▪ Risk assessment
▪ Public safety
http://urbanflow.sigkdd.com.cn/
Junbo Zhang, Yu Zheng, et al. DNN-Based Prediction Model for Spatial-Temporal Data. ACM SIGSPATIAL 2016
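As a rough illustration of what in-flow and out-flow mean, the sketch below counts region transitions in trajectories that are already mapped to (time interval, region) pairs. This is a simplified assumption: the input format, the function name `crowd_flows`, and the toy trajectories are hypothetical, and new-flow and end-flow are not modeled.

```python
from collections import Counter


def crowd_flows(trajectories):
    """Count in-flow and out-flow per (time_interval, region_id).

    trajectories: lists of (time_interval, region_id), ordered by time.
    A move from region a to a different region b, observed at interval t,
    adds one to the out-flow of a and one to the in-flow of b at t.
    """
    inflow, outflow = Counter(), Counter()
    for traj in trajectories:
        for (_, prev_region), (t, region) in zip(traj, traj[1:]):
            if region != prev_region:
                outflow[(t, prev_region)] += 1
                inflow[(t, region)] += 1
    return inflow, outflow


trajectories = [
    [(0, "r1"), (1, "r2"), (2, "r2")],
    [(0, "r1"), (1, "r2"), (2, "r3")],
    [(0, "r3"), (1, "r3"), (2, "r2")],
]
inflow, outflow = crowd_flows(trajectories)
```

The resulting counts per region and interval are the quantities a flow prediction model would forecast for the next interval.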
Empowering Many Applications
on the Intelligent Cloud
[Figure: data sources (phone signals, trajectories, card swiping data, trajectories & requests, food orders); predicted flows (crowd flow, traffic flow, metro/bus flow, supply & demand); applications (anomaly detection, public safety, intelligent scheduling)]
• Capturing temporal properties
– Temporal closeness
[Figure: ratio versus time interval (hour), 0 to 12 hours; A) Air quality data; B) Humidity]
[Figure: speed versus time; A) Hourly traffic speed on consecutive days; B) Traffic speed at 9-10am on consecutive weekends]
Converting Trajectories into Video-like Data
Trajectories
Junbo Zhang, Yu Zheng, et al. Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction, AAAI 2017
ST-ResNet Architecture: A Collective Prediction
• Capture temporal closeness, period, and trend
• Capture external factors
• Capture spatial correlation of both near and far distances
Junbo Zhang, Yu Zheng, et al. Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction, AAAI 2017
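One way to read "closeness, period, and trend" is as three sets of past frames selected at different time lags. The sketch below picks the frame indices for a target interval t, assuming hourly intervals, a one-day period (24 intervals), and a one-week trend (168 intervals). The function name, default sequence lengths, and interval sizes are assumptions for illustration.

```python
def temporal_dependencies(t, len_closeness=3, len_period=1, len_trend=1,
                          period=24, trend=168):
    """Return the indices of past frames used to predict frame t.

    closeness: the most recent frames (t-1, t-2, ...)
    periodic:  frames at multiples of the period (e.g. same hour on prior days)
    trending:  frames at multiples of the trend span (e.g. same hour in prior weeks)
    Each list is ordered from oldest to newest.
    """
    closeness = [t - i for i in range(len_closeness, 0, -1)]
    periodic = [t - i * period for i in range(len_period, 0, -1)]
    trending = [t - i * trend for i in range(len_trend, 0, -1)]
    return closeness, periodic, trending


closeness, periodic, trending = temporal_dependencies(t=400, len_period=2)
```

Each index list would select frames from the video-like flow tensor and feed a separate branch of the network before the branches are fused.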
09:00 10:00
Methodologies for Cross-Domain Data Fusion
• Stage-based data fusion
• Feature-level-based data fusion
[Figure: ranking of real estates and of clusters of real estates, Rank 1 to Rank 5, from features x_i]
Yanjie Fu, Yong Ge, Yu Zheng, et al. Sparse Real Estate Ranking with Online User Reviews and Offline Moving Behaviors. ICDM 2014
http://urbanair.msra.cn/
Air Pollution: A Global Concern !
PM2.5, PM10, NO2, SO2, CO, O3 Air quality monitor station
[Figure: air quality monitoring stations S1 to S22 over a 50 km x 40 km area]
We do not really know the air quality of a location without
a monitoring station!
Inferring Real-Time and Fine-Grained air quality
throughout a city using Big Data
[Figure: air quality monitoring stations S1 to S22]
Historical air quality data; real-time air quality reports
Zheng, Y., et al. U-Air: When Urban Air Quality Inference Meets Big Data. KDD 2013
Forecasting Air Quality Based on Big Data
http://urbanair.msra.cn
Yu Zheng, et al. Forecasting Fine-Grained Air Quality Based on Big Data. KDD 2015
Covering 300+ cities
Inferring Gas Consumption and Pollution
Emission of Vehicles throughout a City
KDD 2014
Questions
How many liters of gas have been consumed by the vehicles, in the
entire city, in the past one hour?
What is the volume of PM2.5 that has been generated accordingly?
Goals
• Estimate gas consumption and vehicle emissions
– on an arbitrary road segment
– in any time interval
– using GPS trajectories of a sample of vehicles
[Figure: gas consumption (kg/km; legend values 240 and 150) and CO emission (kg/km; legend values 30 and 12) at 3-4 PM on 2013/09/17 (Tuesday), 2013/09/21 (Saturday), and 2013/10/02 (National Holiday)]
Jingbo Shang, Yu Zheng, et al. Inferring Gas Consumption and Pollution Emission of Vehicles throughout a City. KDD 2014.
Gas Consumption & Pollution Estimation
Transfer Learning
• Transfer learning
– Between datasets of a single type
– Between datasets of multiple types
[Figure: cities Beijing, Baoding, NYC, Chicago; domains D1 to D7; target domain]
Ying Wei, Yu Zheng, Qiang Yang. Transfer knowledge between Cities. KDD 2016.
Transfer Knowledge Between Cities
• What cannot be transferred
– Models trained in a city
– Data of a city
Ying Wei, Yu Zheng, Qiang Yang. Transfer knowledge between Cities. KDD 2016.
Framework
[Figure: knowledge transfer from a source city (datasets DBS1 to DBS4) to a target city (datasets DBT1 to DBT3), via original feature extraction and a sparse coding model]
Ying Wei, Yu Zheng, Qiang Yang. Transfer knowledge between Cities. KDD 2016.
[Figure: urban big data platform architecture]
Applications
• Air quality: inference, prediction, and causality
• City-wide traffic: speed, volume, energy and pollution emission
• Cross-domain spatio-temporal correlation pattern mining
Data Analytics
• Data mining and machine learning for spatial and spatio-temporal data (Azure Cloud ML)
Interface
1. Location-based Range Query
3. Value-based Inverted Lookup
Hybrid Indexing: spatial index, temporal index, spatio-temporal index, value index
Computing Environment: HDInsight, Virtual Machines, Azure Queue
Storage: Azure SQL, Azure Table, Azure Blob, Azure File
Data Source Types: spatial and temporal static (POIs), spatial static-temporal dynamic (stationary readings), spatial-temporal dynamic (crowd sourcing readings)
Thanks!
Yu Zheng (郑宇)
Search for “Urban Computing”