
Time Series Analysis with R [1]

Yanchang Zhao

http://www.RDataMining.com

R and Data Mining Workshop @ AusDM 2014


27 November 2014

[1] Presented at UJAT and Canberra R Users Group
Outline

Introduction

Time Series Decomposition

Time Series Forecasting

Time Series Clustering

Time Series Classification

Online Resources

Time Series Analysis with R [2]

- time series data in R
- time series decomposition, forecasting, clustering and classification
- autoregressive integrated moving average (ARIMA) model
- Dynamic Time Warping (DTW)
- Discrete Wavelet Transform (DWT)
- k-NN classification

[2] Chapter 8: Time Series Analysis and Mining, in R and Data Mining: Examples and Case Studies.
http://www.rdatamining.com/docs/RDataMining.pdf
R

- a free software environment for statistical computing and graphics
- runs on Windows, Linux and Mac OS
- widely used in academia and research, as well as in industrial applications
- 5,800+ packages (as of 13 Sept 2014)
- CRAN Task View: Time Series Analysis
  http://cran.r-project.org/web/views/TimeSeries.html
Time Series Data in R

- class ts represents data which has been sampled at equispaced points in time
- frequency = 7: a weekly series
- frequency = 12: a monthly series
- frequency = 4: a quarterly series
Time Series Data in R

a <- ts(1:20, frequency = 12, start = c(2011, 3))
print(a)

##      Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 2011           1   2   3   4   5   6   7   8   9  10
## 2012  11  12  13  14  15  16  17  18  19  20

str(a)

## Time-Series [1:20] from 2011 to 2013: 1 2 3 4 5 6 7 8 9 10...

attributes(a)

## $tsp
## [1] 2011 2013 12
##
## $class
## [1] "ts"
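A ts object can also be subset by time with window(); a quick sketch of my own using the series above:

# extract the 2012 part of the series
window(a, start = c(2012, 1), end = c(2012, 10))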

Outline

Introduction

Time Series Decomposition

Time Series Forecasting

Time Series Clustering

Time Series Classification

Online Resources

What is Time Series Decomposition

To decompose a time series into components:

- Trend component: the long-term trend
- Seasonal component: seasonal variation
- Cyclical component: repeated but non-periodic fluctuations
- Irregular component: the residuals
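Besides decompose() (used below), base R also offers loess-based seasonal-trend decomposition via stl(); a minimal sketch, not part of the original slides:

# STL decomposition of the same series
f2 <- stl(AirPassengers, s.window = "periodic")
plot(f2)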
Data AirPassengers
Data AirPassengers: monthly totals of Box & Jenkins international airline passengers, 1949 to 1960. It has 144 (= 12 × 12) values.
plot(AirPassengers)
[Figure: plot of AirPassengers (monthly totals) against Time, 1949-1960]
Decomposition
apts <- ts(AirPassengers, frequency = 12)
f <- decompose(apts)
plot(f$figure, type = "b") # seasonal figures
[Figure: seasonal figures f$figure plotted against the month index 1-12]
Decomposition
plot(f)

[Figure: decomposition of additive time series, showing the observed, trend, seasonal and random components]
Outline

Introduction

Time Series Decomposition

Time Series Forecasting

Time Series Clustering

Time Series Classification

Online Resources

Time Series Forecasting

- To forecast future events based on known past data
- For example, to predict the price of a stock based on its past performance
- Popular models:
  - Autoregressive moving average (ARMA)
  - Autoregressive integrated moving average (ARIMA)
Forecasting

# build an ARIMA model
fit <- arima(AirPassengers, order = c(1, 0, 0),
             seasonal = list(order = c(2, 1, 0), period = 12))
fore <- predict(fit, n.ahead = 24)
# error bounds at 95% confidence level
U <- fore$pred + 2 * fore$se
L <- fore$pred - 2 * fore$se

ts.plot(AirPassengers, fore$pred, U, L,
col = c(1, 2, 4, 4), lty = c(1, 1, 2, 2))
legend("topleft", col = c(1, 2, 4), lty = c(1, 1, 2),
c("Actual", "Forecast", "Error Bounds (95% Confidence)"))

Forecasting
[Figure: AirPassengers with the 24-month ARIMA forecast and its 95% error bounds]
Outline

Introduction

Time Series Decomposition

Time Series Forecasting

Time Series Clustering

Time Series Classification

Online Resources

Time Series Clustering

- To partition time series data into groups based on similarity or distance, so that time series in the same cluster are similar
- Measures of distance/dissimilarity:
  - Euclidean distance
  - Manhattan distance
  - Maximum norm
  - Hamming distance
  - The angle between two vectors (inner product)
  - Dynamic Time Warping (DTW) distance
  - ...
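As a quick sketch of how the choice of measure matters, compare Euclidean and DTW distance on two phase-shifted sine waves; this toy example is mine, not from the slides, and uses the dtw package introduced next:

library(dtw)
idx <- seq(0, 2 * pi, len = 50)
x <- sin(idx)
y <- sin(idx + 0.5)          # the same shape, shifted in phase
dist(rbind(x, y))            # Euclidean distance penalises the shift
dtw(x, y)$distance           # DTW allows elastic alignment of the two series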
Dynamic Time Warping (DTW)
DTW finds the optimal alignment between two time series.

library(dtw)
idx <- seq(0, 2 * pi, len = 100)
a <- sin(idx) + runif(100) / 10
b <- cos(idx)
align <- dtw(a, b, step = asymmetricP1, keep = TRUE)
dtwPlotTwoWay(align)
[Figure: two-way plot of the DTW alignment between the query and reference series]
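The alignment object also carries the accumulated distance; a quick check on the same objects:

align$distance            # cumulative DTW distance of the optimal alignment
align$normalizedDistance  # the same distance normalised by path length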
Synthetic Control Chart Time Series

- The dataset contains 600 examples of control charts synthetically generated by the process in Alcock and Manolopoulos (1999).
- Each control chart is a time series with 60 values.
- Six classes:
  - 1-100: Normal
  - 101-200: Cyclic
  - 201-300: Increasing trend
  - 301-400: Decreasing trend
  - 401-500: Upward shift
  - 501-600: Downward shift
- http://kdd.ics.uci.edu/databases/synthetic_control/synthetic_control.html
Synthetic Control Chart Time Series

# read data into R; sep = "" means the separator is white space,
# i.e. one or more spaces, tabs, newlines or carriage returns
sc <- read.table("./data/synthetic_control.data", header = FALSE, sep = "")
# show one sample from each class
idx <- c(1, 101, 201, 301, 401, 501)
sample1 <- t(sc[idx, ])
plot.ts(sample1, main = "")

[Figure: one sample series from each of the six classes (rows 1, 101, 201, 301, 401 and 501) plotted against Time]
Hierarchical Clustering with Euclidean distance

# sample n cases from every class
n <- 10
s <- sample(1:100, n)
idx <- c(s, 100 + s, 200 + s, 300 + s, 400 + s, 500 + s)
sample2 <- sc[idx, ]
observedLabels <- rep(1:6, each = n)
# hierarchical clustering with Euclidean distance
hc <- hclust(dist(sample2), method = "ave")
plot(hc, labels = observedLabels, main = "")

Hierarchical Clustering with Euclidean distance

[Figure: dendrogram from hclust with average linkage on dist(sample2); leaves are labelled with the observed class IDs]
Hierarchical Clustering with Euclidean distance

# cut tree to get 8 clusters
memb <- cutree(hc, k = 8)
table(observedLabels, memb)

##               memb
## observedLabels  1  2  3  4  5  6  7  8
##              1 10  0  0  0  0  0  0  0
##              2  0  3  1  1  3  2  0  0
##              3  0  0  0  0  0  0 10  0
##              4  0  0  0  0  0  0  0 10
##              5  0  0  0  0  0  0 10  0
##              6  0  0  0  0  0  0  0 10

Hierarchical Clustering with DTW Distance

# with the dtw package loaded, dist() supports method = "DTW" via proxy
myDist <- dist(sample2, method = "DTW")
hc <- hclust(myDist, method = "average")
plot(hc, labels = observedLabels, main = "")
# cut tree to get 8 clusters
memb <- cutree(hc, k = 8)
table(observedLabels, memb)

##               memb
## observedLabels  1  2  3  4  5  6  7  8
##              1 10  0  0  0  0  0  0  0
##              2  0  4  3  2  1  0  0  0
##              3  0  0  0  0  0  6  4  0
##              4  0  0  0  0  0  0  0 10
##              5  0  0  0  0  0  0 10  0
##              6  0  0  0  0  0  0  0 10

Hierarchical Clustering with DTW Distance

[Figure: dendrogram from hclust with average linkage on the DTW distance matrix; leaves are labelled with the observed class IDs]
Outline

Introduction

Time Series Decomposition

Time Series Forecasting

Time Series Clustering

Time Series Classification

Online Resources

Time Series Classification

Time Series Classification

- To build a classification model based on labelled time series,
- and then use the model to predict the label of unlabelled time series

Feature Extraction

- Singular Value Decomposition (SVD)
- Discrete Fourier Transform (DFT)
- Discrete Wavelet Transform (DWT)
- Piecewise Aggregate Approximation (PAA)
- Perceptually Important Points (PIP)
- Piecewise Linear Representation
- Symbolic Representation
Decision Tree (ctree)

ctree from package party

classId <- rep(as.character(1:6), each = 100)
newSc <- data.frame(cbind(classId, sc))
library(party)
ct <- ctree(classId ~ ., data = newSc,
            controls = ctree_control(minsplit = 20,
                                     minbucket = 5, maxdepth = 5))

Decision Tree

pClassId <- predict(ct)
table(classId, pClassId)

##        pClassId
## classId   1   2   3   4   5   6
##       1 100   0   0   0   0   0
##       2   1  97   2   0   0   0
##       3   0   0  99   0   1   0
##       4   0   0   0 100   0   0
##       5   4   0   8   0  88   0
##       6   0   3   0  90   0   7

# accuracy
sum(classId == pClassId) / nrow(sc)

## [1] 0.8183

DWT (Discrete Wavelet Transform)
- Wavelet transform provides a multi-resolution representation using wavelets.
- Haar Wavelet Transform: the simplest DWT
  http://dmr.ath.cx/gfx/haar/
- DFT (Discrete Fourier Transform): another popular feature extraction technique
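To make the idea concrete, one level of the Haar transform replaces each pair of values with a scaled average and a scaled difference; a hand-rolled sketch of my own, independent of any package:

x <- c(2, 4, 6, 8)                  # a toy series of length 4
odd <- x[seq(1, length(x), 2)]
even <- x[seq(2, length(x), 2)]
(approx <- (odd + even) / sqrt(2))  # smooth (approximation) coefficients
(detail <- (odd - even) / sqrt(2))  # detail coefficients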
DWT (Discrete Wavelet Transform)

# extract DWT (with Haar filter) coefficients
library(wavelets)
wtData <- NULL
for (i in 1:nrow(sc)) {
  a <- t(sc[i, ])
  wt <- dwt(a, filter = "haar", boundary = "periodic")
  wtData <- rbind(wtData, unlist(c(wt@W, wt@V[[wt@level]])))
}
wtData <- as.data.frame(wtData)
wtSc <- data.frame(cbind(classId, wtData))

Decision Tree with DWT

ct <- ctree(classId ~ ., data = wtSc,
            controls = ctree_control(minsplit = 20, minbucket = 5,
                                     maxdepth = 5))
pClassId <- predict(ct)
table(classId, pClassId)

##        pClassId
## classId  1  2  3  4  5  6
##       1 98  2  0  0  0  0
##       2  1 99  0  0  0  0
##       3  0  0 81  0 19  0
##       4  0  0  0 74  0 26
##       5  0  0 16  0 84  0
##       6  0  0  0  3  0 97

sum(classId == pClassId) / nrow(wtSc)

## [1] 0.8883

plot(ct, ip_args = list(pval = FALSE), ep_args = list(digits = 0))

[Figure: the ctree built on DWT coefficients; internal nodes split on wavelet coefficients (e.g. V57, W43, W31) and leaf nodes show the class distributions]
k-NN Classification

- find the k nearest neighbours of a new instance
- label it by majority voting
- needs an efficient indexing structure for large datasets

k <- 20
# create a new series by adding random noise to an existing one
newTS <- sc[501, ] + runif(100) * 15
# DTW distances between the new series and all series in sc
distances <- dist(newTS, sc, method = "DTW")
s <- sort(as.vector(distances), index.return = TRUE)
# class IDs of the k nearest neighbours
table(classId[s$ix[1:k]])

##
##  4  6
##  3 17

Results of majority voting: class 6

The TSclust Package

- TSclust: a package for time series clustering [3]
- measures of dissimilarity between time series to perform time series clustering
- metrics based on raw data, on generating models and on the forecast behavior
- time series clustering algorithms and cluster evaluation metrics

[3] http://cran.r-project.org/web/packages/TSclust/
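A sketch of how TSclust might plug into the earlier clustering example; diss() and the "DTWARP" method name are taken from the package documentation, so treat the details as assumptions:

library(TSclust)
# dissimilarity matrix over the sampled series, using a DTW-based measure
d <- diss(sample2, METHOD = "DTWARP")
hc2 <- hclust(d, method = "average")
plot(hc2, labels = observedLabels, main = "")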
Outline

Introduction

Time Series Decomposition

Time Series Forecasting

Time Series Clustering

Time Series Classification

Online Resources

Online Resources

- Chapter 8: Time Series Analysis and Mining, in R and Data Mining: Examples and Case Studies
  http://www.rdatamining.com/docs/RDataMining.pdf
- R Reference Card for Data Mining
  http://www.rdatamining.com/docs/R-refcard-data-mining.pdf
- Free online courses and documents
  http://www.rdatamining.com/resources/
- RDataMining Group on LinkedIn (8,000+ members)
  http://group.rdatamining.com
- RDataMining on Twitter (1,800+ followers)
  @RDataMining
The End

Thanks!

Email: yanchang(at)rdatamining.com
