
Urban Computing

–Enabling Intelligent Cities with AI and Big Data


Dr. Yu Zheng
Senior Research Manager, Microsoft Research
Chair Professor at Shanghai Jiao Tong University
Editor-in-Chief of ACM Trans. Intelligent Systems and Technology

http://research.microsoft.com/en-us/projects/urbancomputing/default.aspx
Big Challenges in Big Cities
Big Data in Cities, AI Technology
Tackle the Big challenges in Big cities using Big data!

Urban Computing framework:
• Service Providing: Improve urban planning, Ease Traffic Congestion, Save Energy, Reduce Air Pollution, ...
• Urban Data Analytics: Data Mining, Machine Learning, Visualization
• Urban Data Management: Spatio-temporal index, streaming, trajectory, and graph data management, ...
• Urban Sensing & Data Acquisition: Participatory Sensing, Crowd Sensing, Mobile Sensing
• Data sources: Human mobility, Air Quality, Meteorology, Social Media, Energy, Road Networks, POIs, Traffic

[Figure: Urban Computing as a win-win-win among People, Cities, and The Environment; additional label: Cities OS]

Yu Zheng (郑宇). An Overview of Urban Computing (城市计算概述). Journal of Wuhan University (武汉大学学报), January 2015.

Urban Computing: Concepts, Methodologies, and Applications. Zheng, Y., et al. ACM TIST.
Improving Medical Emergency Services using Big Data
Dispatching Center

Patients Ambulance stations

Save 30+% time!

Hospital

• Select locations for Ambulance Stations


• Dynamic ambulance allocation

Yilun Wang, Yu Zheng, et al. Travel Time Estimation of a Path using Sparse Trajectories. KDD 2014
Location Selection for Ambulance Stations: A Data-Driven Approach, ACM SIGSPATIAL 2015
Web site: http://ambulance.chinacloudsites.cn/
Urban Data Management
• Manage cross-domain spatio-temporal data
• Data: ST properties
• Indexing and retrieval algorithms
• Enhanced Cloud computing platforms

[Figure: the urban computing framework (Service Providing; Urban Data Analytics; Urban Data Management; Urban Sensing & Data Acquisition; data sources; People, Cities, and The Environment)]

Zheng, Y., et al. Urban Computing: concepts, methodologies, and applications. ACM Transactions on Intelligent Systems and Technology.
Urban Big Data
• Data Structures
• Spatio-temporal (ST) Properties

Data types by structure and ST property (example sources in parentheses):
• Point-based
– Spatio-temporal static data: POI distributions
– Spatial static, temporal dynamic data: Weather/AQI station data (US EPA, China MEP, IoT)
– Spatio-temporal dynamic data: Crowd sourcing data (Foursquare, Geo-tweets, Dianping)
• Network-based
– Spatio-temporal static data: Road/Transportation networks (Bing, Google, Gaode, Baidu Maps)
– Spatial static, temporal dynamic data: Road traffic data (Gaode Maps, Traffic management Bureau)
– Spatio-temporal dynamic data: Trajectory data (TAXI, DD, Uber, China Mobile, China Telecom)
Tutorial on Trajectory Data Mining
[Figure: overview of the topics covered by the tutorial]
• Trajectory Preprocessing: Stay Point Detection, Noise Filtering, Segmentation, Map-Matching, Compression
• Trajectory Indexing and Retrieval: Distance of Trajectory, Query Historical Trajectories, Managing Recent Trajectories
• Uncertainty: Privacy Preserving, Reducing Uncertainty
• Traj. Pattern Mining: Moving Together Patterns, Freq. Seq. Patterns, Periodic Patterns, Clustering
• Trajectory Classification
• Outlier/Anomaly Detection
• Other representations of spatial trajectories: Graph (Graph Mining, Routing), Matrix (Matrix Analysis: MF, CF), Tensor (TD)

Yu Zheng. Trajectory Data Mining: An Overview. ACM Transactions on Intelligent Systems and Technology. 2015, vol. 6, issue 3.
Why Urban Big Data Platform
Bridge the gap between urban big data and urban computing applications
• Urban Computing Applications: city-wide and instantaneous
• Urban Big Data: multi-modal, large-scale, and highly dynamic

Imperative for enabling smart cities!

Urban Big Data Platform


Managing Urban Big Data
• Difficulties
– Large-scale and highly dynamic → cloud computing
– Cloud computing platforms do not support ST data well
• Unique ST data structures: trajectories (the most complex ST data)
• Unique queries: ST-range queries and KNN queries rather than keyword queries
• Data across different domains: Hybrid indexing for managing multi-modality data
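
To make the two query types above concrete, here is a minimal sketch. It assumes a uniform grid as the spatial index, a plain time filter inside each candidate cell, and a brute-force KNN. The class `GridIndex`, its method names, and the cell size are hypothetical illustrations and are not taken from the platform described in the slides.

```python
import math
from collections import defaultdict


class GridIndex:
    """Toy grid index over (x, y, t, point_id) records (hypothetical design)."""

    def __init__(self, cell_size=1.0):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (col, row) -> list of records

    def _cell(self, x, y):
        # Map a coordinate to the integer id of its grid cell.
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, point_id, x, y, t):
        self.cells[self._cell(x, y)].append((x, y, t, point_id))

    def st_range(self, x_min, y_min, x_max, y_max, t_min, t_max):
        """Return sorted ids of points inside the box and the time window."""
        c0, r0 = self._cell(x_min, y_min)
        c1, r1 = self._cell(x_max, y_max)
        hits = []
        # Only cells overlapping the query box are scanned.
        for c in range(c0, c1 + 1):
            for r in range(r0, r1 + 1):
                for x, y, t, pid in self.cells.get((c, r), []):
                    inside = x_min <= x <= x_max and y_min <= y <= y_max
                    if inside and t_min <= t <= t_max:
                        hits.append(pid)
        return sorted(hits)

    def knn(self, qx, qy, k):
        """Return ids of the k nearest points (brute force, for clarity only)."""
        records = [rec for recs in self.cells.values() for rec in recs]
        records.sort(key=lambda rec: (rec[0] - qx) ** 2 + (rec[1] - qy) ** 2)
        return [rec[3] for rec in records[:k]]
```

A production index would prune cells during KNN search and index time as well; the sketch only shows why such queries need structures that a keyword index does not provide.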

[Figure: trajectories, range queries, KNN queries, and hybrid indexing (a spatial index combined with temporal indexes)]

Jie Bao, et al. Managing Massive Trajectories on the Cloud, ACM SIGSPATIAL 2016
Cloud Computing for ST Data
[Figure: platform architecture, with example trajectories Tr1 to Tr5 organized under a hybrid index]
• API Interface: 1. Location-based Range Query; 2. Spatio-temporal Range Query; 3. Value-based Inverted Lookup
• Computing Environment: HDInsight, Virtual Machines, Azure Queue
• Indexing: Spatial Index, Spatio-temporal Index, Value Index, Hybrid Index (Spatial Index plus Temporal Indexes)
• Storage Types: Azure SQL, Azure Table, Azure Blob, Azure File
• Data Source Types:
– Network-based: Road Networks, Traffic Readings, Trajectories
– Point-based: POIs, Stationary Readings, Crowd Sourcing Readings
– ST properties of the three columns: Spatial and Temporal Static; Spatial Static-Temporal Dynamic; Spatial-Temporal Dynamic

Imperative for enabling smart cities!

Y. Li, J. Bao, Y. Li, Z. Gong, Y. Zheng. Mining the Most Influential k-Location Set From Massive Trajectories. IEEE Transactions on Big Data. 2017
J. Bao, T. He, S. Ruan, Y. Li, and Y. Zheng et al. Planning bike lanes based on Sharing-bike’s trajectories, KDD 2017
Finding Top-k Most Influential Location Set

A submodular maximization
problem, NP-hard
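
Because the problem is NP-hard, a common workaround for monotone submodular objectives under a size constraint is the greedy algorithm, which is known to reach at least a (1 − 1/e) fraction of the optimum. The sketch below illustrates that generic technique only; it assumes that the influence of a location set is the number of distinct trajectories it covers, and the function name and input format are hypothetical rather than taken from the cited work.

```python
def greedy_top_k(coverage, k):
    """Greedily pick up to k locations by marginal gain in covered trajectories.

    coverage: dict mapping a location id to the set of trajectory ids it covers.
    Returns (chosen location ids in pick order, set of covered trajectory ids).
    """
    chosen, covered = [], set()
    candidates = dict(coverage)
    for _ in range(min(k, len(candidates))):
        # Ties are broken by the sorted order of location ids.
        best = max(sorted(candidates), key=lambda loc: len(candidates[loc] - covered))
        if not candidates[best] - covered:
            break  # no remaining location adds new coverage
        chosen.append(best)
        covered |= candidates.pop(best)
    return chosen, covered
```

For example, with locations A, B, C, D covering trajectory sets {1, 2, 3}, {3, 4}, {4, 5, 6, 7}, and {1, 2}, asking for k = 2 picks C first and then A.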
Interactive Visual Data Analytics

“SmartAdP: Visual Analytics of Large-scale Taxi Trajectories for Selecting Billboard Locations”, VAST 2016
Planning Bike Lanes Based on
Sharing-Bikes’ Trajectories
KDD 2017

J. Bao, T. He, S. Ruan, Y. Li, and Y. Zheng et al. Planning bike lanes based on Sharing-bike’s trajectories, KDD 2017
Effective Bike Path Planning Based on Mobike’s Data

• General Requirements on bike lanes
– Budget Constraints: Road Space & Money
– Maximize Utilization: User & Trip Coverage
– Construction Convenience: K-Connected Components
• NP-Hard Problem

[Figure: map around Shanghai Hongqiao Airport; s1 = 2 km]
J. Bao, T. He, S. Ruan, Y. Li, and Y. Zheng et al. Planning bike lanes based on Sharing-bike’s trajectories, KDD 2017
Urban Data Analytics
• Texts and images → spatial and spatio-temporal data
• A single data source → data across different domains
• Separate data mining algorithms → machine learning + data management
• Visual and interactive data analytics

[Figure: the urban computing framework (Service Providing; Urban Data Analytics; Urban Data Management; Urban Sensing & Data Acquisition; data sources; People, Cities, and The Environment)]

Zheng, Y., et al. Urban Computing: concepts, methodologies, and applications. ACM Transactions on Intelligent Systems and Technology.
Applications
• Air Quality: Inference, Prediction and Causality
• City-wide Traffic: Speed, Volume, Energy and Pollution emission
• Cross-Domain Spatio-Temporal Correlation Pattern Mining

Data Analytics
• Visualization
• Machine learning for spatial and spatio-temporal data
• Cross-domain data fusion methods

Data Mining and Machine Learning for Spatial and Spatio-Temporal Data: Azure Cloud ML

[Figure: platform architecture (API Interface; Computing Environment; Indexing; Storage Types) with thumbnails of the cross-domain data fusion methods]

Yu Zheng. Methodologies for Cross-Domain Data Fusion: An Overview. IEEE Transactions on Big Data
Spatio-Temporal Data Is Unique
Spatial Properties
• Distance
– Spatial closeness
– Triangle inequality: $|d_1 - d_2| \le d_3 \le d_1 + d_2$
• Hierarchy
– Different spatial granularities
– City structures

[Figure: ratio vs. distance (km, 0 to 120) for A) Air quality data and B) Humidity]
[Figure: locations s1, s2, s3 and users u1, u2, u3; a cluster hierarchy over the data of u1, u2, u3, u4 with layers l1 (high) to l3 (low), clusters c10; c20 to c22; c30 to c36; panel labels u1: u2 > u4 and u1: u3 > u2. cij: the jth cluster on the ith layer]

Junbo Zhang, Yu Zheng, et al. Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction, AAAI 2017
Why Spatio-Temporal Data Is Unique
• Temporal properties
– Temporal closeness
– Period
– Trend

[Figure: ratio vs. time interval (hour, 0 to 12) for A) Air quality data]
[Figure: speed (km/h) over time: A) Hourly traffic speed on consecutive days; B) Traffic speed at 9-10am on consecutive weekends]


Junbo Zhang, Yu Zheng, et al. Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction, AAAI 2017
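
The three temporal properties can be read as three ways of picking past time steps when predicting step t. The sketch below is an illustration under stated assumptions: hourly intervals, a daily period, and a weekly trend. The function name and the default lengths are hypothetical and are not values given in the slides.

```python
def dependent_steps(t, len_closeness=3, len_period=2, len_trend=1,
                    steps_per_day=24, days_per_week=7):
    """Return the past time-step indices used to predict step t.

    closeness: the most recent steps
    period:    the same time of day on previous days (assumed daily period)
    trend:     the same time of day in previous weeks (assumed weekly trend)
    """
    closeness = [t - i for i in range(1, len_closeness + 1)]
    period = [t - i * steps_per_day for i in range(1, len_period + 1)]
    week = steps_per_day * days_per_week
    trend = [t - i * week for i in range(1, len_trend + 1)]
    return closeness, period, trend
```

With the defaults and t = 200, the closeness steps are 199, 198, 197, the period steps are 176 and 152, and the trend step is 32.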
DNN-Based Urban Flow Prediction
Predict the in-flow and out-flow of crowds in each region of a city at the next time interval

[Figure: regions r1, r2, r3 with in-flow, out-flow, new-flow, and end-flow]

• Important for:
▪ Traffic management
▪ Risk assessment
▪ Public safety
http://urbanflow.sigkdd.com.cn/

Junbo Zhang, Yu Zheng, et al. DNN-Based Prediction Model for Spatial-Temporal Data. ACM SIGSPATIAL 2016
Empowering Many Applications on the Intelligent Cloud
• Data: phone signals, trajectories, card swiping data, trajectories & requests, food orders
• Predicted flows: crowd flow, traffic flow, metro/bus flow, supply & demand, orders & dispatch
• Applications: public safety, anomaly detection, intelligent scheduling


Challenges
• Urban crowd flow depends on many factors
– Flows of previous time interval
– Flows of nearby regions and distant regions
– Weather, traffic control and events
• Capturing spatial properties
– Spatial distance and hierarchy
• Capturing temporal properties
– Temporal closeness
– Period and trend

[Figures: ratio vs. distance (km) and ratio vs. time interval (hour) for A) Air quality data and B) Humidity; cluster hierarchy over the data of u1, u2, u3, u4 (cij: the jth cluster on the ith layer); speed (km/h) over time for A) Hourly traffic speed on consecutive days and B) Traffic speed at 9-10am on consecutive weekends]
Converting Trajectories into Video-like Data

Trajectories

Junbo Zhang, Yu Zheng, et al. Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction, AAAI 2017
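
One way to picture the conversion is to partition the city into a grid and count, per time interval, how many trajectories enter and leave each cell, which yields a two-channel "frame" per interval. The sketch below assumes trajectories are already mapped to (interval, row, col) samples; the function name, the input format, and the choice to attribute an out-flow to the interval of the last sample inside the cell are simplifying assumptions rather than the exact definitions from the cited paper.

```python
import numpy as np


def flow_tensor(trajectories, n_intervals, n_rows, n_cols):
    """Build a video-like tensor of shape (n_intervals, 2, n_rows, n_cols).

    trajectories: list of trajectories, each a time-ordered list of
                  (interval, row, col) samples.
    Channel 0 holds in-flow counts and channel 1 holds out-flow counts.
    """
    flows = np.zeros((n_intervals, 2, n_rows, n_cols), dtype=np.int64)
    for traj in trajectories:
        for (t_prev, r_prev, c_prev), (t, r, c) in zip(traj, traj[1:]):
            if (r_prev, c_prev) == (r, c):
                continue  # still in the same cell, so no flow is counted
            flows[t, 0, r, c] += 1                 # entered (r, c) in interval t
            flows[t_prev, 1, r_prev, c_prev] += 1  # left the previous cell
    return flows
```

Each slice `flows[t]` can then be treated like an image with two channels, which is what lets convolutional models be applied to crowd flows.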

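ST-ResNet Architecture: A Collective Prediction
• Capture temporal closeness, period, and trend
• Capture external factors
• Capture spatial correlation of both near and far distances
• Residual learning to help training
• Fusing factors differently in different regions, with one weight triple per region:

$$
\begin{pmatrix}
(\omega_{c,1}, \omega_{p,1}, \omega_{q,1}) & \cdots & \\
\vdots & \ddots & \vdots \\
 & \cdots & (\omega_{c,n}, \omega_{p,n}, \omega_{q,n})
\end{pmatrix}
$$

[Figure: network architecture with dependencies labeled distant, near, and recent]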

Junbo Zhang, Yu Zheng, et al. Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction, AAAI 2017
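
The per-region weights (ω_c, ω_p, ω_q) suggest an element-wise weighted sum of the closeness, period, and trend outputs. The sketch below shows only that fusion step, with fixed weights for illustration; in a trained model the weights would be learned, and the function and argument names here are hypothetical.

```python
import numpy as np


def fuse(x_c, x_p, x_q, w_c, w_p, w_q):
    """Fuse closeness, period, and trend outputs with per-region weights.

    All arguments are arrays of the same grid shape. The products are
    element-wise, so each region (grid cell) applies its own weight triple.
    """
    return w_c * x_c + w_p * x_p + w_q * x_q
```

For example, a cell with weights (0.5, 0.5, 0) averages the closeness and period outputs and ignores the trend, while a neighboring cell with weights (0, 0, 1) relies on the trend alone.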
Methodologies for Cross-Domain Data Fusion
• Stage-based data fusion
• Feature-level-based data fusion
– Feature concatenation + regularization
– DNN-based
• Semantic meaning-based fusion
– Multi-view learning (Co-training)
– Probabilistic dependency-based (Topic Models)
– Similarity-based (matrix factorization)
– Transfer Learning-based

[Figure: thumbnails of each method, including a Conv/Dense network, spatial and temporal classifiers over road networks (Fr), POIs (Fp), traffic (Ft), meteorology (Fm), and human mobility (Fh), a topic model, and joint task learners for A) Book-travel interests co-estimation and B) Air quality-traffic co-prediction]

Yu Zheng. Methodologies for Cross-Domain Data Fusion: An Overview. IEEE Transactions on Big Data
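Stage-Based Data fusion
[Figure: two patterns of stage-based fusion]
• Sequential: Data A → Model 1 → Knowledge; Knowledge and Data B → Model 2 → Results
• Parallel: Data A, Data B, ..., Data K → Model 1, Model 2, ..., Model k → Result 1, Result 2, ..., Result k → Aggregation → Results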
Ranking and Clustering Real Estates using Big Data

• Values (learned from big data)
– Increase more in a rising market
– Decrease less in a falling market

[Figure A) The price of a real estate over t0, t1, t2, with price change ∆P]

B) Rank of estates by ∆P

House    Price Increase    Rank↓
H1       35%               R1
H5       29%               R1
H4       13%               R2
….       ….                ….
H2       9%                R3
H3       2%                R3
H6       -1.5%             R4
H7       -6.1%             R5

[Figure: map of estates colored from Rank 1 to Rank 5]

Yanjie Fu, Yong Ge, Yu Zheng, et al. Sparse Real Estate Ranking with Online User Reviews and Offline Moving Behaviors. ICDM 2014
Ranking of Real Estates; Clusters of Real Estates

Location, Location, Location
• Factors: Geographical Utility, Neighborhood Popularity, Business Zone’s Prosperity
• Data: Geospatial Infrastructure, Human Mobility, Business Areas (Public Commutes, Taxi Traces, User Check-ins and Comments)
• Learning: features x_i, pairwise ranking constraint, sparsity regularization

[Figure: sample user check-in for Pola Yuwei Hotpot (颇辣渝味火锅), 2F, Lehongfang Lifestyle Plaza, Shanghai, near Exit 2 of Longbai Xincun Station on Metro Line 10]
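
A pairwise ranking constraint combined with sparsity regularization can be written, in one generic form, as a hinge loss over ordered pairs plus an L1 penalty on the feature weights. The sketch below shows that generic form with a linear scorer and one proximal (soft-thresholding) update. It is an assumption-laden illustration and not the objective of the cited ICDM 2014 paper; all names are hypothetical.

```python
import numpy as np


def pairwise_objective(w, X, pairs, lam):
    """Hinge loss over ordered pairs plus an L1 penalty on the weights.

    X: (n_estates, n_features) feature matrix.
    pairs: list of (i, j) meaning estate i should rank above estate j.
    """
    margins = np.array([w @ (X[i] - X[j]) for i, j in pairs])
    return np.maximum(0.0, 1.0 - margins).sum() + lam * np.abs(w).sum()


def proximal_step(w, X, pairs, lam, lr):
    """One subgradient step on the hinge term, then soft-thresholding for L1."""
    grad = np.zeros_like(w)
    for i, j in pairs:
        diff = X[i] - X[j]
        if w @ diff < 1.0:  # this pair violates the margin
            grad -= diff
    w = w - lr * grad
    # Soft-thresholding shrinks weights toward zero, giving sparse solutions.
    return np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
```

The L1 term drives the weights of uninformative features to exactly zero, which is the usual motivation for sparsity when many mobility and review features are available.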
When Urban Air Meets Big Data
KDD 2013

http://urbanair.msra.cn/
Air Pollution: A Global Concern!
PM2.5, PM10, NO2, SO2, CO, O3

[Figure: map of air quality monitor stations (S1 to S22) over a 50 km × 40 km area]

We do not really know the air quality of a location without a monitoring station!
Inferring Real-Time and Fine-Grained air quality throughout a city using Big Data

Data sources: Meteorology, Traffic, Human Mobility, POIs, Road networks, Historical air quality data, Real-time air quality reports

[Figure: map of air quality monitor stations (S1 to S22)]

Zheng, Y., et al. U-Air: When Urban Air Quality Inference Meets Big Data. KDD 2013
http://urbanair.msra.cn
Forecasting Air Quality Based on Big Data
http://urbanair.msra.cn

Yu Zheng, et al. Forecasting Fine-Grained Air Quality Based on Big Data. KDD 2015
Covering 300+ cities
Inferring Gas Consumption and Pollution
Emission of Vehicles throughout a City

KDD 2014
Questions
How many liters of gas have been consumed by the vehicles, in the
entire city, in the past one hour?
What is the volume of PM2.5 that has been generated accordingly?
Goals
• Estimate the gas consumption and vehicle emissions
– on an arbitrary road segment
– at any time interval
– using GPS trajectories of a sample of vehicles
[Figure: city maps of gas consumption (kg/km; legend marks 240 and 150) and CO emission (kg/km; legend marks 30 and 12) at 3–4 PM on 2013/09/17 (Tuesday), 2013/09/21 (Saturday), and 2013/10/02 (National Holiday)]

Jingbo Shang, Yu Zheng, et al. Inferring Gas Consumption and Pollution Emission of Vehicles throughout a City. KDD 2014.
Gas Consumption & Pollution Estimation

Jingbo Shang, Yu Zheng, et al. Inferring Gas Consumption and Pollution Emission of Vehicles throughout a City. KDD 2014.
Transfer Learning
• Transfer learning
– Between a single type of datasets
– Between multiple types of datasets

Yesterday: Deep Learning (Features; Lots of Data; Only the Rich)
Today: Reinforcement Learning (Rewards; Lots of Data; Only the Rich)
Tomorrow: Transfer Learning (Adaptation; Few Data; Everyone)

Many cities lack data
• Example 1: Predicting air quality (source domain: Beijing; target domain: Baoding)
• Example 2: Diagnose urban noises (source domain: NYC; target domain: Chicago)

[Figure: datasets D1 to D7 in the source and target domains]

Ying Wei, Yu Zheng, Qiang Yang. Transfer knowledge between Cities. KDD 2016.
Transfer Knowledge Between Cities
• What cannot be transferred
– Models trained in a city
– Data of a city

• What we can transfer
– Relation between different datasets
– Latent spaces of a type of data

[Figure: graph construction]

Ying Wei, Yu Zheng, Qiang Yang. Transfer knowledge between Cities. KDD 2016.
Framework
[Figure: framework for transferring knowledge from a source city to a target city]
• Source city: datasets DB_S1 to DB_S4 → original feature extraction → features S1 to S4 (N_s instances) → graph clustering based dictionary learning → dictionaries D1 to D4
• Transfer: the dictionaries D1 to D4 are transferred to the target city
• Target city: datasets DB_T1 to DB_T3 → original feature extraction → features T1 to T3 (N_t instances)
• Sparse coding model: produces k-dimensional representations for the source and target features, followed by max pooling
• Multimodal Transfer AdaBoost → final classifier → results

Ying Wei, Yu Zheng, Qiang Yang. Transfer knowledge between Cities. KDD 2016.
Take Away Messages

• AI + Big Data + Cloud + Domain knowledge
• Platform: Data management + Machine Learning + Visualization for spatio-temporal data
• AI for spatio-temporal data is very promising but still very young

Search for “Urban Computing” (搜索“城市计算”)

Thanks!

Yu Zheng (郑宇)
msyuzheng@outlook.com

Zheng, Y., et al. Urban Computing: concepts, methodologies, and applications. ACM trans. on Intelligent Systems and Technology.

[Figure: the application stack and platform architecture shown on the earlier "Applications" slide]
