
A Novel Time Series Forecasting Approach with

Multi-Level Data Decomposing and Modeling∗


Xuemei Han, Congfu Xu, Huifeng Shen and Yunhe Pan
College of Computer Science,
Zhejiang University,
Hangzhou, 310027, P.R.China
zjuwendy@yahoo.com.cn, xucongfu@cs.zju.edu.cn
useraddcn@yahoo.com.cn, panyh@zju.edu.cn

Abstract— Time series produced in complex systems are always governed by multi-level laws, including macroscopic and microscopic laws. These multi-level laws combine long-memory effects and short-term irregular fluctuations in the same series. Traditional analysis and forecasting methods do not distinguish these multi-level influences and typically build a single model for prediction, which must introduce many parameters to describe the characteristics of the complex system and results in a loss of efficiency or accuracy. This paper looks into the structure of the series data, decomposes a time series into several simpler ones with different smoothness, and then samples them at multiple scales. Each of these series is modeled and predicted separately, and their results are finally integrated. Experimental results on stock forecasting show that the method is effective and satisfying, even for time series with large fluctuations.

Index Terms— time series forecasting, complex system, data decomposing, multi-scale sampling

∗This paper is supported by the National Natural Science Foundation of China (No. 60402010) and the Advanced Research Project of China Defense Ministry (No. 413150804), and partially supported by the Aerospace Research Foundation (No. 2003-HT-ZJDX-13). Congfu Xu is the corresponding author.

I. INTRODUCTION

Time series are one of the most frequently encountered forms of data. Forecasting time series is an important topic in data mining, since it plays a great role in economic decision making, prevention of natural calamities, and so on. This paper concentrates on time series in complex systems, such as financial and meteorological data. Since these data are controlled by multi-level laws, including macroscopic and microscopic laws, they are more stochastic than other types of data, such as engineering and physical data. Statisticians, economists, and mathematicians have paid much attention to the analysis of the time series structure of complex systems. Most economists consider time series to be composed of trends, cycles, seasonal variations, and irregular fluctuations [1]. In 1951, Hurst [2] investigated the long-term storage capacity of reservoirs and first found the long-memory trait of hydrology series. After the 1980s, researchers found this trait to be very common in time series from different areas [3]. On the other hand, Rosenblatt [4] presented the concept of the strong mixing condition, which reflects the short-term process in time series.

Many methods have been proposed to cope with time series forecasting, such as Box-Jenkins [5][6], Neural Networks [7][8], Genetic Algorithms [9][10], and the Kalman filter method [11]. These methods generally construct a single model with complicated parameters on the raw data when describing the complex system, while ignoring preprocessing before modeling. However, since long-term trends and short-term fluctuations coexist in the same sequence, it is a dilemma to balance accuracy against efficiency: discarding masses of historic data which may be useful for analysis and forecasting decreases accuracy, while giving all historic data the same weight inevitably increases processing time.

This paper proposes a new forecasting approach with multi-level decomposing and multi-scale sampling. In our method, the time series is decomposed into several simpler series, called separated-series, and each separated-series is then sampled at its own scale. The separated-series can be described by different models, or by the same model with different parameters. Forecasting is conducted for every separated-series, and the final forecast of the original series is obtained by integrating the forecasts of the separated-series.

The rest of this paper is organized as follows. In Section II, we give several definitions and formulations of time series preprocessing. Section III explains the modeling and forecasting processes. Experimental results of the approach are shown in Section IV. The last section offers our conclusion.

II. TIME SERIES PREPROCESSING

The data produced in complex systems are often influenced by multi-level factors, and the influence periods of these factors are diverse. Some factors result in a long-term evolution, for example the inherent value mechanism of a stock, which is decisive for the long-term trend of the stock price. Other factors' durations are much shorter, such as people's psychological factors in the stock market, which
may always cause short-term fluctuations. This calls for multi-scale models which can handle not only long-term trend forecasting but also short-term fluctuation prediction.

In this paper, original time series are decomposed into several series with different cycles. The traits of the new time series are more visible, and the new series are easier to model. After decomposing, each new time series is sampled with a different cycle, which brings a gain in efficiency without decreasing accuracy. This section gives the formulation of the approach.

A. Definitions

We start by giving some definitions related to time series which are convenient for describing our approach.

Definition 1 (standard sampling cycle): The standard sampling cycle, denoted δ, is the original interval of the recordings or statistics. This paper assumes that δ is invariable for a given system and that all other sampling cycles are integral multiples of δ.

Definition 2 (operator ⊕ and separated-series): Let T be an ordered set {t_1, t_2, ..., t_n, ...} with t_{i+1} − t_i = δ, i ∈ N, 1 ≤ i ≤ n. A time series X = {x_{t_i}, t_i ∈ T} can be decomposed into several time series; this operation is denoted by ⊕:

    X = X(1) ⊕ X(2) ⊕ ... ⊕ X(m),    (1)

where the elements at the i-th position of the sequences X(j) satisfy

    \sum_{j=1}^{m} x(j)_{t_i} = x_{t_i}.

X(j) is called a separated-series; it is equinumerous to X, and its sampling time at each position is the same as in X.

B. Time Series Decomposing

In this subsection we present a method to decompose a time series into several separated-series satisfying equation (1). This method, which we name multi-smoothness-factor decomposing, is an inductive process:

a) Set the smoothness-factor array: Given is an array

    L = {l_1, l_2, ..., l_{m−1}}

where l_i ∈ N, 1 ≤ i ≤ m − 1, and L is a monotonically decreasing sequence, for instance a binary-base exponent array like {2^6, 2^4, 2^2, 2^1}. L can be determined by experiments or experience.

b) Evaluation of X(1): Using smoothness-factor l_1, the p-th element of X(1) is calculated by

    x(1)_{t_p} =
      \begin{cases}
        \frac{1}{p} \sum_{q=1}^{p} x_{t_q},        & p \le l_1 + 1 \\
        \frac{1}{l_1} \sum_{q=p-l_1}^{p} x_{t_q},  & p > l_1 + 1
      \end{cases}

c) Evaluation of X(j), 1 < j < m: The elements of X(j) are obtained by subtracting the elements of X(1), X(2), ..., X(j−1) from X at the same positions. Using smoothness-factor l_j, the p-th element of X(j) is calculated by

    x(j)_{t_p} =
      \begin{cases}
        \frac{1}{p} \sum_{q=1}^{p} x_{t_q},  & p \le l_j + 1 \\
        \frac{1}{l_j} \sum_{q=p-l_j}^{p} \Bigl( x_{t_q} - \sum_{k=1}^{j-1} x(k)_{t_q} \Bigr),  & p > l_j + 1
      \end{cases}

d) Evaluation of X(m): The elements of X(m) are obtained by subtracting the elements of X(1), X(2), ..., X(m−1) from X at the same positions.

Algorithm 1 shows the process of evaluating x(j)_{t_p}.

Algorithm 1: Evaluate-Elements-of-X(j)
    Data: X, L
    Result: x(j)_{t_p}, 1 ≤ j ≤ m
    begin
        x(1)_{t_p} ← x_{t_p}
        for j = 1 to m − 1 do
            if p ≤ l_j + 1 then
                x(j)_{t_p} ← (x(j)_{t_{p−1}} · (p − 1) + x_{t_p}) / p
            else
                x(j)_{t_p} ← (x(j)_{t_{p−1}} · l_j − x(j)_{t_{p−l_j−1}} + x_{t_p}) / l_j
            x_{t_p} ← x_{t_p} − x(j)_{t_p}
        x(m)_{t_p} ← x_{t_p}
    end

The original time series is thus decomposed into several series with different traits. The number of separated-series X(j) can be determined by experiments; in most cases 3 to 5 is appropriate. The separated-series turn from smooth to coarse as j increases.

C. Multi-Scale Sampling

A typical decomposing result is shown in fig. 1. The original time series is decomposed into three separated-series by two smoothness factors. The separated-series turn from smooth to coarse, and the periods of fluctuation become shorter, as j increases. Since the smoother separated-series reflect the extension of long-term trends, their long history must be considered in modeling. However, each point in the smoother separated-series carries less information than a point in the coarser ones, so it is a waste of time to sample the smoother separated-series with the standard sampling cycle δ. Multi-scale sampling has been discussed in recent years [12]. The essence is that, within the same time series, new data are sampled at a high frequency and older data at a lower frequency. This seems simple, but the variable measurements always create extra trouble for the later processes.
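To make the decomposition and sampling of Sections II-B and II-C concrete, here is a minimal Python sketch (not the authors' implementation; `decompose` and `multi_scale_sample` are hypothetical names). Each separated-series is a moving average, with window l_j, of the residual left by the previous levels; X(m) collects the final residual, so by construction the separated-series sum back to the original series, as equation (1) requires. Each separated-series is then thinned with its own sampling cycle.

```python
def decompose(x, L):
    """Split series x into len(L)+1 separated-series using the monotonically
    decreasing smoothness factors in L. Each level is a moving average of the
    residual left by the previous levels; the last series is the remaining
    residual, so all levels sum back to x exactly."""
    residual = list(x)
    separated = []
    for l in L:
        xj = []
        for p in range(len(residual)):
            if p <= l:  # early points (p <= l_j + 1 in 1-based indexing): cumulative mean
                xj.append(sum(residual[:p + 1]) / (p + 1))
            else:       # later points: mean over the window t_{p-l_j} .. t_p
                xj.append(sum(residual[p - l:p + 1]) / l)
        residual = [r - s for r, s in zip(residual, xj)]
        separated.append(xj)
    separated.append(residual)  # X(m): the short-term fluctuations
    return separated


def multi_scale_sample(separated, S):
    """Sample separated-series j with cycle S[j] (a multiple of the standard
    cycle delta): smoother series are kept at coarser resolution."""
    return [xj[::s] for xj, s in zip(separated, S)]
```

With L = {60, 15} and S = {10, 5, 1}, as in the experiments of Section IV, a 530-point series would yield three separated-series of 53, 106, and 530 sampling points respectively.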
Fig. 1. A typical figure of separated-series (curves X(1), X(2), X(3)).

This paper proposes a new multi-scale sampling method. We sample any one separated-series with a constant frequency, while the sampling frequencies vary across the different separated-series. Let the ordered set S = {s_1, s_2, ..., s_m} be the sampling cycle array, where s_j (1 ≤ j ≤ m) is the sampling cycle of X(j) and the elements of S are monotonically decreasing. For convenience of calculation, we choose s_{j−1}/s_j ∈ N. The new time series sampled from X(j) is denoted Ẋ(j).

III. MODELING AND FORECASTING

In the following subsections, we show the process of modeling and forecasting. The entry of new data and other details related to modeling and forecasting are also discussed.

A. New Data Entering

Time series analysis and forecasting is always an online process: new data enter continuously during forecasting and should be processed at once. Algorithm 2 shows the update procedure of the sampled separated-series when new data enter.

Algorithm 2: New-Data-In
    Data: sampled separated-series {Ẋ(j)}, x_{t_k}, L, S; {CountS[j]} is the set of sampling counters.
    Result: {Ẋ′(j)} with new data, 1 ≤ j ≤ m
    begin
        Evaluate-Elements-of-X(j)
        for j = 1 to m do
            CountS[j] ← CountS[j] + 1
            if CountS[j] mod S[j] = 0 then
                Ẋ(j) ← Ẋ(j) + {x(j)_{t_k}}
    end

B. Construct Forecasting Models

Neural Networks, Genetic Algorithms, etc., can be employed for separated-series modeling. Different models, or one model with different parameters, can be adopted for different separated-series. Here we exemplify the modeling and forecasting process with the Box-Jenkins method.

A stationary time series {y_k} of mean zero can be taken as the response of a linear time-invariant random system driven by white noise. Then y_k satisfies the difference equation

    y_k = \sum_{i=1}^{p} \phi_i y_{k-i} + \varepsilon_k - \sum_{j=1}^{q} \theta_j \varepsilon_{k-j},    (2)

which is denoted ARMA(p, q), where \sum_{i=1}^{p} \phi_i y_{k-i} is the weighted sum of the most recent p responses and \sum_{j=1}^{q} \theta_j \varepsilon_{k-j} is the weighted sum of the most recent q white-noise terms. Estimates of φ_1, φ_2, ..., φ_p, θ_1, θ_2, ..., θ_q and σ_ε² can be obtained by the least squares method or maximum likelihood estimation, whereafter y_k can be predicted by equation (2). Details of constructing an ARMA model can be found in [3].

Assume there are H sampling points in Ẋ(j). The modeling process is:

1) Select the most recent h sampling points, h ≤ H. If Ẋ(j) is not a stationary sequence, it must be transformed into a stationary one by first-order or multistage differencing.
2) Construct an ARMA model for the sampling points with appropriate parameters. The determination of the order of an ARMA model is discussed in [13]. The reasonability of the model should be tested through the Box-Pierce test [3].
3) Predict the value of the next moment using equation (2).
4) Add the new data, compare them with the prediction of step 3, and calculate the residuals.

The model should be verified periodically. For a time-variant complex system, the parameters may no longer be appropriate for prediction as time passes. When the residuals do not satisfy the presumed assumptions, the most recent h sampling points are selected and the model is updated through steps 1 and 2.

C. Result Integration

Now the prediction of each separated-series has been obtained. The sampling cycle of each Ẋ(j) is different, and the prediction cycle becomes shorter as the sampling frequency grows higher. These predictions are useful on their own; for example, the prediction with the longer sampling cycle reveals the potential trend of the time series. But for accurate short-term prediction, the results of all the separated-series must be integrated to obtain the final forecasting result.

As shown in fig. 2, the final forecasting results should be integrated from those of all the frequency-sampled sequences. Assuming we predict the result at moment τ, the prediction
of τ for each separated-series should be obtained according to equation (1). Since the sampling cycles are different, not all the predictions at τ can be obtained directly, but the nearest observations and forecasts are available. Since these cases mostly occur on the smoother separated-series, the absent predictions can simply be calculated by polynomial interpolation. Therefore, we get the final result.

Fig. 2. Relations between multi-scale frequency series.

Fig. 4. The fitting and forecasting curve of Minsheng Bank stock (observed data vs. forecasting data).
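The integration step of Section III-C can be sketched in Python as follows (a minimal illustration, not the paper's code; `lagrange_predict` and `integrate_forecasts` are hypothetical names, and the grids would in practice be only the few nearest observations and forecasts of each separated-series). Each separated-series contributes its prediction at moment τ; a coarser series whose sampling grid misses τ is filled in by polynomial interpolation, and the contributions are summed according to equation (1).

```python
def lagrange_predict(ts, ys, tau):
    """Evaluate the Lagrange interpolating polynomial through (ts, ys) at tau."""
    total = 0.0
    for i, yi in enumerate(ys):
        term = yi
        for j, tj in enumerate(ts):
            if j != i:
                term *= (tau - tj) / (ts[i] - tj)
        total += term
    return total


def integrate_forecasts(grids, values, tau):
    """Sum the per-series predictions at moment tau; a series whose sampling
    grid does not contain tau is interpolated over its nearest points."""
    result = 0.0
    for ts, ys in zip(grids, values):
        if tau in ts:                      # prediction available directly
            result += ys[ts.index(tau)]
        else:                              # absent prediction: interpolate
            result += lagrange_predict(ts, ys, tau)
    return result
```

Keeping each grid short keeps the interpolating polynomial low-order, which matches the simple polynomial interpolation suggested above.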
IV. EXPERIMENTS AND RESULTS

Stock data are chosen to validate our approach. Some forecasting results are demonstrated in Figures 3, 4, and 5. We choose 530 days of close prices of Minsheng Bank stock for modeling and forecasting. The smoothness array L = {60, 15} decomposes the original time series into three separated-series. As fig. 3 shows, the top-left subplot is the fitting curve of the original time series, and the other subplots are the fitting curves of the separated-series.

The sampling cycle array S is chosen as {10, 5, 1}, according to the smoothness of each separated-series. The final fitting result is shown in fig. 4 (close price on the vertical axis, time on the horizontal axis).

Fig. 3. The original time series and the separated-series.

The curve of relative errors is shown in fig. 5. The mean relative error is 0.0105, and 92.83% of the relative errors are smaller than 3%. Compared with approaches that model with fixed parameters, such as [14][15], our approach obtains better forecasting results.

Fig. 5. Fitting curve of errors.

V. CONCLUSION

In this paper, we propose a new approach for time series analysis and forecasting. Through multi-level decomposing, a complex time series becomes a set of simpler ones; the separated-series are then sampled according to their smoothness and modeled respectively. Treating the long-term trends and short-term fluctuations with different weights brings gains in both accuracy and efficiency.

REFERENCES

[1] B. L. Bowerman and R. T. O'Connell, Forecasting and Time Series: An Applied Approach, 3rd ed. Florence, KY: Thomson Learning, 1993.
[2] H. Hurst, "Long-term storage capacity of reservoirs," Trans. of the American Society of Civil Engineers, vol. 116, pp. 770–779, 1951.
[3] Z. Shi-ying and F. Zhi, Cointegration Theory and SV Models, 1st ed. Tsinghua University, 2004.
[4] M. Rosenblatt, "A central limit theorem and a strong mixing condition," in Proc. of the National Academy of Sciences, vol. 42, 1956, pp. 43–47.
[5] S. I. Wu and R.-P. Lu, "Combining artificial neural networks and statistics for stock market forecasting," in Proc. of the ACM Conference on Computer Science, May 1993, pp. 257–264.
[6] S.-J. Huang and K.-R. Shih, "Short-term load forecasting via ARMA model identification including non-Gaussian process considerations," IEEE Trans. on Power Systems, vol. 18, no. 2, pp. 673–679, May 2003.
[7] R. S. T. Lee, "iJADE stock advisor: an intelligent agent based stock prediction system using hybrid RBF recurrent network," IEEE Trans. on Systems, Man, and Cybernetics, vol. 34, no. 3, pp. 421–428, May 2004.
[8] D. V. P. Emad, W. Saad, I. Donald, and C. Wunsch, "Comparative study of stock trend prediction using time delay, recurrent and probabilistic neural networks," IEEE Trans. on Neural Networks, vol. 9, no. 6, pp. 1456–1470, 1998.
[9] H. Iba and T. Sasaki, "Using genetic programming to predict financial data," in IEEE Proc. of the Congress on Evolutionary Computation, Jul. 1999, pp. 244–251.
[10] S. H. Ling, F. H. F. Leung, H. K. Lam, and P. K. S. Tam, "Short-term electric load forecasting based on a neural fuzzy network," IEEE Trans. on Industrial Electronics, vol. 50, no. 6, pp. 1305–1316, 2003.
[11] D. McGonigal and D. Ionescu, "An outline for a Kalman filter and recursive parameter estimation approach applied to stock market forecasting," in IEEE Conf. on Electrical and Computer Engineering, vol. 2, Sep. 1995, pp. 1148–1151.
[12] E. Palpanas, M. Vlachos, E. J. Keogh, et al., "Online amnesic approximation of streaming time series," in IEEE Conf. on ICDE, 2004, pp. 338–349.
[13] S. Beveridge and C. Oickle, "A comparison of Box-Jenkins and objective methods for determining the order of a nonseasonal ARMA model," Journal of Forecasting, vol. 13, pp. 419–434, 1994.
[14] B. R. Chang, "A study of non-periodic short-term random walk forecasting based on RBFNN, ARMA, or SVR-GM(1,1|τ) approach," in IEEE Conf. on Systems, Man, and Cybernetics, 2003.
[15] Y.-F. Wang, "On-demand forecasting of stock prices using a real-time predictor," IEEE Trans. on Knowledge and Data Engineering, vol. 15, no. 4, pp. 1033–1037, 2003.
