
Environmental Data Analysis with MatLab Lecture 18:

Cross-correlation

SYLLABUS
Lecture 01 Using MatLab
Lecture 02 Looking At Data
Lecture 03 Probability and Measurement Error
Lecture 04 Multivariate Distributions
Lecture 05 Linear Models
Lecture 06 The Principle of Least Squares
Lecture 07 Prior Information
Lecture 08 Solving Generalized Least Squares Problems
Lecture 09 Fourier Series
Lecture 10 Complex Fourier Series
Lecture 11 Lessons Learned from the Fourier Transform
Lecture 12 Power Spectral Density
Lecture 13 Filter Theory
Lecture 14 Applications of Filters
Lecture 15 Factor Analysis
Lecture 16 Orthogonal functions
Lecture 17 Covariance and Autocorrelation
Lecture 18 Cross-correlation
Lecture 19 Smoothing, Correlation and Spectra
Lecture 20 Coherence; Tapering and Spectral Analysis
Lecture 21 Interpolation
Lecture 22 Hypothesis testing
Lecture 23 Hypothesis Testing continued; F-Tests
Lecture 24 Confidence Limits of Spectra, Bootstraps

purpose of the lecture

generalize the idea of autocorrelation to multiple time series

Review of last lecture


autocorrelation: correlations between samples within a time series

high degree of short-term correlation

whatever the river was doing yesterday, it's probably doing today, too, because water takes time to drain away

[Figure: Neuse River Hydrograph. A) time series d(t): discharge (cfs) versus time t (days, 0 to 4000). Second panel: power spectral density, (cfs)2 per cycle/day, versus frequency (cycles per day, 0 to 0.05).]

low degree of intermediate-term correlation

whatever the river was doing last month, today it could be doing something completely different, because storms are so unpredictable

[Figure: Neuse River Hydrograph, same time series and power spectral density panels as above.]

moderate degree of long-term correlation

whatever the river was doing this time last year, it's probably doing today, too, because seasons repeat

[Figure: Neuse River Hydrograph, same time series and power spectral density panels as above.]

[Figure: three scatter plots of discharge versus lagged discharge, for lags of 1 day, 3 days, and 30 days.]

Autocorrelation Function

[Figure: autocorrelation versus lag (days). Upper panel: lags from -30 to 30 days. Lower panel: lags from -3000 to 3000 days. Lags of 1, 3, and 30 days are marked.]

formula for covariance

formula for autocorrelation

autocorrelation at lag $(k-1)\Delta t$
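
The formulas on these slides did not survive extraction. As a hedged reconstruction, assuming a zero-mean time series $d_i$ of length $N$ sampled at interval $\Delta t$ (the symbols $d_i$, $N$, and $a_k$ are assumptions chosen to match the lag index $(k-1)\Delta t$ used here), a standard form of the two definitions would be:

```latex
% Unnormalized autocorrelation at lag (k-1)\Delta t
a_k = \sum_{i=1}^{N-k+1} d_i \, d_{i+k-1}

% Sample covariance of elements separated by that lag
% (normalization is an assumption; some texts divide by N instead)
\operatorname{cov}(d_i, d_{i+k-1}) \approx \frac{a_k}{N-k+1}
```

With this indexing, $k=1$ is zero lag, so $a_1$ is the sum of squares of the time series.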

autocorrelation similar to convolution


note difference in sign
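
The comparison can be written out as a sketch, assuming the usual discrete convolution of two series $g$ and $h$ (the symbols $g$, $h$, $\theta_k$, $d$, and $a_k$ are assumptions, not taken from the slide):

```latex
% Convolution: the index i enters the second factor with a minus sign
\theta_k = \sum_i g_i \, h_{k-i+1}

% Autocorrelation: the index i enters the second factor with a plus sign
a_k = \sum_i d_i \, d_{k+i-1}
```

The only structural difference is the sign of $i$ in the second subscript, which amounts to reversing one of the two series in time.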

autocorrelation in MatLab

Important Relation #1: autocorrelation is the convolution of a time series with its time-reversed self
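
The MatLab listing from the slides is not recoverable, so the following is a minimal Python sketch (standard library only) rather than the lecture's own code. It checks Relation #1 numerically on a short made-up series: the directly summed autocorrelation is compared with the convolution of the time-reversed series with itself. The function names and the test values are illustrative assumptions.

```python
def autocorrelation(d):
    """Unnormalized autocorrelation a[lag] = sum_i d[i] * d[i + lag],
    returned for lags -(N-1) .. (N-1)."""
    n = len(d)
    lags = list(range(-(n - 1), n))
    a = []
    for lag in lags:
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:  # skip products that fall off the ends
                total += d[i] * d[j]
        a.append(total)
    return lags, a


def convolve(g, h):
    """Full discrete convolution: out[k] = sum_i g[i] * h[k - i]."""
    out = [0.0] * (len(g) + len(h) - 1)
    for i, gi in enumerate(g):
        for j, hj in enumerate(h):
            out[i + j] += gi * hj
    return out


# Made-up example series (illustrative values only)
d = [1.0, 3.0, 2.0, -1.0, 0.5]

lags, a_direct = autocorrelation(d)
a_conv = convolve(d[::-1], d)  # time-reversed series convolved with the series

# Largest disagreement between the two computations
max_diff = max(abs(x - y) for x, y in zip(a_direct, a_conv))
print(max_diff)
```

The zero-lag value sits in the middle of the output and equals the sum of squares of the series; the result is symmetric about that point.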

Important Relation #2: the Fourier Transform of an autocorrelation is proportional to the Power Spectral Density of the time series
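
Relation #2 can also be checked numerically. The sketch below is Python (standard library only), not the lecture's MatLab, and makes two assumptions: the series is zero-padded to length 2N-1 so that wrap-around lags correspond to negative lags, and the power spectral density is taken as the squared magnitude of the discrete Fourier transform with all normalization constants omitted.

```python
import cmath


def dft(x):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]


def autocorrelation_wrapped(d):
    """Autocorrelation of d on a zero-padded grid of length M = 2N - 1.
    Entries 0..N-1 hold lags 0..N-1; entries N..M-1 hold lags -(N-1)..-1."""
    n = len(d)
    m = 2 * n - 1
    padded = list(d) + [0.0] * (m - n)
    a = [
        sum(padded[i] * padded[(i + lag) % m] for i in range(m))
        for lag in range(m)
    ]
    return a, padded


# Made-up example series (illustrative values only)
d = [1.0, 3.0, 2.0, -1.0, 0.5]

a, padded = autocorrelation_wrapped(d)
ft_of_autocorrelation = dft(a)
power = [abs(c) ** 2 for c in dft(padded)]  # unnormalized power spectrum

# Largest disagreement between the two sides of the relation
max_diff = max(abs(x - p) for x, p in zip(ft_of_autocorrelation, power))
print(max_diff)
```

With this choice of padding and normalization the constant of proportionality is one; other conventions for the power spectral density rescale it.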

End of Review

Part 1 correlations between time-series

scenario

discharge is correlated with rain, but discharge is delayed behind rain, because rain takes time to drain from the land

[Figure: rain (mm/day) and discharge (m3/s) versus time (days).]

rain ahead of discharge

shape not exactly the same, either

treat two time series u and v probabilistically

p.d.f. $p(u_i, v_{i+k-1})$ with elements lagged by time $(k-1)\Delta t$

and compute its covariance

this defines the cross-correlation

just a generalization of the auto-correlation

cross-correlation: different times in different time series

auto-correlation: different times in the same time series

like autocorrelation, similar to convolution
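
The slide's formula is not recoverable. As a sketch consistent with the p.d.f. $p(u_i, v_{i+k-1})$ and the lag $(k-1)\Delta t$, and assuming zero-mean series $u$ and $v$ (the symbols $c_k$ and $\theta_k$ are assumptions), the unnormalized cross-correlation and the convolution it resembles would read:

```latex
% Cross-correlation of u and v at lag (k-1)\Delta t
c_k = \sum_i u_i \, v_{i+k-1}

% Convolution of u and v, for comparison (note the sign of i)
\theta_k = \sum_i u_i \, v_{k-i+1}
```

Setting $v = u$ in the first expression gives back the autocorrelation, which is the sense in which cross-correlation is a generalization.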

As with auto-correlation two important properties


#1: relationship to convolution

#2: relationship to Fourier Transform

cross-spectral density
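
Both properties can be stated compactly for continuous, real time series. This is a sketch under assumed conventions, not the slide's own notation: $*$ denotes convolution, a tilde denotes the Fourier transform with kernel $e^{-i\omega t}$, and a superscript $*$ on a transform denotes complex conjugation.

```latex
% Property 1: cross-correlation as a convolution with time-reversed u
c(t) = \int u(\tau)\, v(t+\tau)\, d\tau = u(-t) * v(t)

% Property 2: its Fourier transform is the cross-spectral density
\tilde{c}(\omega) = \tilde{u}^{*}(\omega)\, \tilde{v}(\omega)
```

With $v = u$ the second line reduces to $|\tilde{u}(\omega)|^2$, which is the power spectral density up to normalization.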

cross-correlation in MatLab
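
The MatLab listing on this slide is not recoverable. As a stand-in, here is a minimal Python sketch (standard library only) of an unnormalized cross-correlation using the lag convention c[lag] = sum over i of u[i] * v[i + lag], so a positive lag means that v is delayed relative to u. The function name and the tiny rain and discharge series are made-up illustrations, not data from the lecture.

```python
def cross_correlation(u, v):
    """Unnormalized cross-correlation c[lag] = sum_i u[i] * v[i + lag]
    for lags -(N-1) .. (N-1). Assumes u and v have equal length."""
    n = len(u)
    if len(v) != n:
        raise ValueError("u and v must have the same length")
    lags = list(range(-(n - 1), n))
    c = [
        sum(u[i] * v[i + lag] for i in range(n) if 0 <= i + lag < n)
        for lag in lags
    ]
    return lags, c


# Made-up series: a burst of rain, and a delayed, smeared discharge response
rain = [0.0, 4.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
discharge = [0.0, 0.0, 0.0, 2.0, 3.0, 1.0, 0.0, 0.0]

lags, c = cross_correlation(rain, discharge)

# Lag (in samples) at which the two series are most correlated
k_best = max(range(len(c)), key=lambda k: c[k])
best_lag = lags[k_best]
print(best_lag, c[k_best])
```

In this toy example the cross-correlation peaks at a positive lag, reflecting that the discharge follows the rain.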

Part 2

aligning time-series: a simple application of cross-correlation

central idea
two time series are best aligned at the lag at which they are most correlated, which is the lag at which their cross-correlation is maximum

two similar time-series, with a time shift


(this is a simple test, or synthetic, dataset)

[Figure: the two time series u(t) and v(t) versus time, 0 to 100.]

cross-correlate

[Figure: cross-correlation versus time lag, -20 to 20.]

find maximum

[Figure: the same cross-correlation, with the maximum and its time lag marked.]

In MatLab
compute cross-correlation
find maximum
compute time lag
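
The MatLab code for these steps did not survive extraction. The steps can be sketched in Python (standard library only) on a synthetic dataset: cross-correlate, find the maximum, convert its position to a time lag, and shift the second series. The pulse shape, the sampling interval `dt`, the 7-sample shift, and all function names are assumptions made for illustration.

```python
import math


def cross_correlation(u, v):
    """Unnormalized cross-correlation c[lag] = sum_i u[i] * v[i + lag]
    for lags -(N-1) .. (N-1). Assumes u and v have equal length."""
    n = len(u)
    lags = list(range(-(n - 1), n))
    c = [
        sum(u[i] * v[i + lag] for i in range(n) if 0 <= i + lag < n)
        for lag in lags
    ]
    return lags, c


def pulse(t):
    """Synthetic wavelet: a sinusoid under a Gaussian envelope."""
    return math.exp(-((t - 40.0) / 8.0) ** 2) * math.sin(0.6 * t)


dt = 0.5          # assumed sampling interval
n = 100           # number of samples
true_shift = 7    # v is u delayed by this many samples

u = [pulse(i) for i in range(n)]
v = [pulse(i - true_shift) for i in range(n)]

# Step 1: compute the cross-correlation
lags, c = cross_correlation(u, v)

# Step 2: find its maximum
k_max = max(range(len(c)), key=lambda k: c[k])

# Step 3: compute the time lag
lag_samples = lags[k_max]
tlag = lag_samples * dt

# Step 4: align, v_aligned(t) = v(t + tlag), padding with zeros at the end
v_aligned = [
    v[i + lag_samples] if 0 <= i + lag_samples < n else 0.0
    for i in range(n)
]

# How well the shifted series now matches u
misfit = max(abs(a - b) for a, b in zip(u, v_aligned))
print(lag_samples, tlag, misfit)
```

Because the search is over whole samples, the recovered lag is only resolved to the nearest multiple of the sampling interval.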

align time series with measured lag

[Figure: u(t) and the shifted series v(t+tlag) versus time, 0 to 100.]

solar insolation and ground level ozone


(this is a real dataset from West Point NY)
[Figure: A) solar insolation (W/m2) versus time (days, 0 to 14). B) ozone (ppb) versus time (days, 0 to 14).]

[Figure: solar insolation and ground level ozone, same panels as above.]

note time lag

[Figure: C) cross-correlation versus time lag (hours, -10 to 10), with the maximum marked.]

time lag 3 hours

[Figure: A) solar radiation (W/m2) versus time (days). B) ozone (ppb) versus time (days), showing the original series and the series delagged by the 3.00 hour lag.]
