Abstract— Time series produced in complex systems are always controlled by multi-level laws, including macroscopic and microscopic laws. These multi-level laws bring about a combination of long-memory effects and short-term irregular fluctuations in the same series. Traditional analysis and forecasting methods do not distinguish these multi-level influences and usually build a single model for prediction, which has to introduce many parameters to describe the characteristics of the complex system and results in a loss of efficiency or accuracy. This paper goes deep into the structure of series data, decomposes a time series into several simpler ones with different smoothness, and then samples them at multiple scales. After that, each new series is modeled and predicted respectively, and the results are finally integrated. Experimental results on stock forecasting show that the method is effective and satisfying, even for time series with large fluctuations.

Index Terms— time series forecasting, complex system, data decomposing, multi-scale sampling

∗ This paper is supported by the National Natural Science Foundation of China (No. 60402010), the Advanced Research Project of China Defense Ministry (No. 413150804), and partially supported by the Aerospace Research Foundation (No. 2003-HT-ZJDX-13). Congfu Xu is the corresponding author.

I. INTRODUCTION

Time series is one of the most frequently encountered forms of data, and time series forecasting is an important topic in data mining since it plays a great role in economic decision making, prevention of natural calamities, and so on. This paper pays particular attention to time series in complex systems, such as financial and meteorological data. Since these data are controlled by multi-level laws, including macroscopic and microscopic laws, they are more stochastic than other types of data, such as engineering and physical data. Statisticians, economists and mathematicians have paid much attention to the analysis of the time series structure of complex systems. Most economists consider that time series are composed of trends, cycles, seasonal variations and irregular fluctuations [1]. In 1951, Hurst [2] investigated the long-term storage capacity of reservoirs and first found the long-memory trait of hydrology series. After the 1980s, researchers found this trait to be very common in time series of different areas [3]. On the other hand, Rosenblatt [4] introduced the strong mixing condition, which reflects the short-term process in time series.

Many methods have been proposed to cope with time series forecasting, such as Box-Jenkins [5][6], neural networks [7][8], genetic algorithms [9][10], the Kalman filter method [11], etc. These methods generally construct a single model with complicated parameters on the raw data when describing the complex system, while ignoring preprocessing before modeling. However, since both long-term trends and short-term fluctuations coexist in the same sequence, it is a dilemma to balance accuracy and efficiency: discarding masses of historic data which may be useful for analysis and forecasting will decrease the accuracy, while giving the same weight to all historic data will inevitably increase the processing time.

This paper proposes a new forecasting approach with multi-level decomposing and multi-scale sampling. In our method, the time series is decomposed into several simpler ones, called separated-series, and then every separated-series is sampled at a diverse scale. The separated-series can be described with different models, or with the same model under different parameters. Then the forecasting of every separated-series is conducted, and the final forecasting result of the original series is obtained by integrating those of the separated-series.

The rest of this paper is organized as follows. In Section II, we give several definitions and formulations of time series preprocessing. Section III explains the modeling and forecasting processes. Experimental results of the approach are shown in Section IV. The last section offers our conclusion.

II. TIME SERIES PREPROCESSING

The data produced in complex systems are often influenced by multi-level factors, and the influence periods of these factors are diverse. Some factors result in a long-term evolution, for example, the inherent value mechanism of a stock, which is decisive in the long-term trend of the stock price, while other factors' durations are much shorter, such as people's psychological factors in the stock market, which
may always cause short-term fluctuations. This calls for multi-scale models which can deal with not only long-term trend forecasting but also short-term fluctuation prediction.

In this paper, the original time series is decomposed into several series with different cycles. The traits of the new time series are more visible, and the new series are easier to model. After decomposing, each new time series is sampled with a different cycle, which brings a gain in efficiency without decreasing the accuracy. This section gives the formulation of the approach.

X(j) is named a separated-series; it is equinumerous to X, and the sampling time of the element of the same order as in X is the same.

B. Time Series Decomposing

In this subsection we present a method to decompose a time series into several separated-series satisfying equation (1). This method, which we call multi-smoothness-factor decomposing, is an inductive process as below:

a) Set smoothness-factor array: Given an array

L = {l_1, l_2, ..., l_{m−1}}

where l_i ∈ N, 1 ≤ i ≤ m − 1. L is a monotonically decreasing sequence, for instance, a binary-base exponent array like {2^6, 2^4, 2^2, 2^1}. L can be determined by experiments or experience.

b) Evaluation of X(1): Using smoothness-factor l_1, the p-th element of X(1) is calculated by

x(1)_{t_p} =
  (1/p) · Σ_{q=1}^{p} x_{t_q},            if p ≤ l_1 + 1
  (1/l_1) · Σ_{q=p−l_1}^{p} x_{t_q},      if p > l_1 + 1

c) Evaluation of X(j), 1 < j < m: Elements of time series X(j) are obtained by subtracting the elements of X(1), X(2), ..., X(j−1) from X in the same order. Using smoothness-factor l_j, the p-th element of X(j) is calculated by

x(j)_{t_p} =
  (1/p) · Σ_{q=1}^{p} x_{t_q},                                        if p ≤ l_j + 1
  (1/l_j) · Σ_{q=p−l_j}^{p} ( x_{t_q} − Σ_{k=1}^{j−1} x(k)_{t_q} ),   if p > l_j + 1

Therefore, the original time series is decomposed into several series with different traits. The number of separated-series can be determined by experiments; in most cases, 3 to 5 is appropriate. The separated-series turn from smooth to coarse with the increment of j.

C. Multi-Scale Sampling

A typical decomposing result is shown in fig. 1. The original time series is decomposed into three separated-series by two smoothness factors. We can find that the separated-series turn from smooth to coarse, and the periods of fluctuation become shorter, with the increment of j. For the smoother separated-series, since they reflect the extension of long-term trends, their long-term history must be considered in modeling. However, each point in the smoother separated-series carries less information than that in the coarser ones, so it is a waste of time if the smoother separated-series are sampled with the standard sampling cycle δ. Multi-scale sampling has been discussed in recent years [12]. The essence is that the newer data are sampled with high frequency and the older data with lower frequency in the same time series. This seems simple, but the variable measurement spacing always makes extra trouble for the later processes.
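As a concrete illustration of steps a)–c), multi-smoothness-factor decomposing can be sketched in Python. This is a minimal sketch under two assumptions not fixed by the text above: series are 0-indexed in code, and the last separated-series X(m) is taken as the residual of X after subtracting X(1), ..., X(m−1), so that the separated-series sum back to the original series (as equation (1), not reproduced here, suggests).

```python
def smooth_residual(x, prev, l):
    """One step of multi-smoothness-factor decomposing.

    Computes a separated-series from x, given the previously extracted
    separated-series `prev` and a smoothness-factor l, following the
    piecewise formulas in steps b) and c): the first l + 1 points are a
    cumulative average of x; later points are a trailing-window average
    of the residual x - sum(prev). The window holds l + 1 terms but is
    divided by l, as in the formulas above."""
    n = len(x)
    residual = [x[i] - sum(s[i] for s in prev) for i in range(n)]
    out = []
    for p in range(1, n + 1):                 # p is the 1-based index
        if p <= l + 1:
            out.append(sum(x[:p]) / p)        # (1/p) * sum_{q=1..p} x_{t_q}
        else:
            out.append(sum(residual[p - l - 1:p]) / l)  # window q = p-l..p
    return out


def decompose(x, L):
    """Decompose x into len(L) + 1 separated-series, smooth to coarse."""
    separated = []
    for l in L:
        separated.append(smooth_residual(x, separated, l))
    # Assumed closing step: the last separated-series is the residual,
    # so that the separated-series sum back to the original series.
    separated.append([x[i] - sum(s[i] for s in separated)
                      for i in range(len(x))])
    return separated
```

For example, decompose(x, [2**6, 2**2]) yields three separated-series whose pointwise sum reproduces x; the first is the smoothest and reflects the long-term trend, while the last carries the short-term fluctuations.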
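The multi-scale sampling idea just described — newer data at high frequency, older data at lower frequency — can be sketched as follows. The window length `recent` and the coarsening `step` are illustrative parameters, not values taken from the text; returning (position, value) pairs keeps the irregular sampling times attached to the values, since the variable measurement spacing must be handled by the later processes.

```python
def multiscale_sample(series, recent, step):
    """Sample one separated-series at two scales: keep every point in the
    last `recent` positions, and only every `step`-th point before that.
    Returns (position, value) pairs so that later modeling still knows
    when each retained point was sampled."""
    n = len(series)
    cut = max(n - recent, 0)
    older = [(i, series[i]) for i in range(0, cut, step)]
    newer = [(i, series[i]) for i in range(cut, n)]
    return older + newer
```

A smoother separated-series (small j) would be given a larger `step` than a coarse one, since each of its points carries less information.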
Fig. 3. The original time series and the three separated-series X(1), X(2), X(3).

B. Construct Forecasting Models

(Figure showing observed data and forecasting data.)

The curve of relative errors is shown in fig. 5. The mean relative error is 0.0105, and 92.83% of the relative errors are smaller than 3%. Compared with approaches that model with fixed parameters, such as those in [14][15], our approach obtains better forecasting results.

Fig. 5. Fitting curve of errors

V. CONCLUSION

In this paper, we propose a new approach for time series analysis and forecasting. Through multi-level decomposing, a complex time series is turned into a set of simpler ones; the separated-series are then sampled according to their smoothness and modeled respectively. Treating the long-term trends and short-term fluctuations with different weights brings gains in both accuracy and efficiency.

REFERENCES

[1] B. L. Bowerman and R. T. O'Connell, Forecasting and Time Series: An Applied Approach, 3rd ed. Florence, KY: Thomson Learning, 1993.
[2] H. Hurst, "Long-term storage capacity of reservoirs," Trans. of the American Society of Civil Engineers, vol. 116, pp. 770–779, 1951.
[3] Z. Shi-ying and F. Zhi, Cointegration Theory and SV Models, 1st ed. Tsinghua University, 2004.
[4] M. Rosenblatt, "A central limit theorem and a strong mixing condition," in Proc. of the National Academy of Sciences, vol. 42, 1956, pp. 43–47.
[5] S. I. Wu and R.-P. Lu, "Combining artificial neural networks and statistics for stock market forecasting," in Proc. of ACM Conference on Computer Science, May 1993, pp. 257–264.
[6] S.-J. Huang and K.-R. Shih, "Short-term load forecasting via ARMA model identification including non-Gaussian process considerations," IEEE Trans. on Power Systems, vol. 18, no. 2, pp. 673–679, May 2003.
[7] R. S. T. Lee, "iJADE stock advisor: an intelligent agent based stock prediction system using hybrid RBF recurrent network," IEEE Trans. on Systems, Man, and Cybernetics, vol. 34, no. 3, pp. 421–428, May 2004.
[8] E. W. Saad, D. V. Prokhorov, and D. C. Wunsch, "Comparative study of stock trend prediction using time delay, recurrent and probabilistic neural networks," IEEE Trans. on Neural Networks, vol. 9, no. 6, pp. 1456–1470, 1998.
[9] H. Iba and T. Sasaki, "Using genetic programming to predict financial data," in Proc. of IEEE Congress on Evolutionary Computation, Jul. 1999, pp. 244–251.
[10] S. H. Ling, F. H. F. Leung, H. K. Lam, and P. K. S. Tam, "Short-term electric load forecasting based on a neural fuzzy network," IEEE Trans. on Industrial Electronics, vol. 50, no. 6, pp. 1305–1316, 2003.
[11] D. McGonigal and D. Ionescu, "An outline for a Kalman filter and recursive parameter estimation approach applied to stock market forecasting," in Proc. of IEEE Conf. on Electrical and Computer Engineering, vol. 2, Sep. 1995, pp. 1148–1151.
[12] T. Palpanas, M. Vlachos, E. Keogh, et al., "Online amnesic approximation of streaming time series," in Proc. of IEEE ICDE, 2004, pp. 338–349.
[13] S. Beveridge and C. Oickle, "A comparison of Box-Jenkins and objective methods for determining the order of a nonseasonal ARMA model," Journal of Forecasting, vol. 13, pp. 419–434, 1994.
[14] B. R. Chang, "A study of non-periodic short-term random walk forecasting based on RBFNN, ARMA, or SVR-GM(1,1|τ) approach," in Proc. of IEEE Conf. on Systems, Man, and Cybernetics, 2003.
[15] Y.-F. Wang, "On-demand forecasting of stock prices using a real-time predictor," IEEE Trans. on Knowledge and Data Engineering, vol. 15, no. 4, pp. 1033–1037, 2003.