
PHYSA: 11049

pp. 1–13

ARTICLE IN PRESS

Physica A xx (xxxx) xxxxxx www.elsevier.com/locate/physa

Tai-Liang Chen a,*, Ching-Hsue Cheng a, Hia-Jong Teoh a,b


a Department of Information Management, National Yunlin University of Science and Technology, 123, Section 3, University Road, Touliu, Yunlin 640, Taiwan, ROC
b Department of Accounting Information, Ling Tung University, 1, Ling Tung Road, Nantun, Taichung 408, Taiwan, ROC

Received 30 May 2007; received in revised form 5 September 2007


High-order fuzzy time-series based on multi-period adaptation model for forecasting stock markets


Abstract

Stock investors usually make their short-term investment decisions according to recent stock information such as the latest market news, yesterday's technical analysis reports, and the price fluctuations of the last two days. To reflect these short-term factors which impact stock prices, this paper proposes a comprehensive fuzzy time-series model, which factors linear relationships between recent periods of stock prices and fuzzy logical relationships (nonlinear relationships) mined from time-series into the forecasting process. In the empirical analysis, the TAIEX (Taiwan Stock Exchange Capitalization Weighted Stock Index) and the HSI (Hang Seng Index) are employed as experimental datasets, and four recent fuzzy time-series models, Chen's (1996), Yu's (2005), Cheng's (2006) and Chen's (2007), are used as comparison models. In addition, to compare with a conventional statistical method, the method of least squares is used to estimate auto-regressive models for the testing periods within the databases. The performance comparisons indicate that the multi-period adaptation model proposed in this paper can effectively improve the forecasting performance of conventional fuzzy time-series models, which only factor fuzzy logical relationships into the forecasting process. In the empirical study, the traditional statistical method and the proposed model both reveal that stock price patterns in the Taiwan and Hong Kong stock markets are short-term. © 2007 Published by Elsevier B.V.

Keywords: High-order fuzzy time-series; Multi-period adaptation model; Stock index forecasting

1. Introduction


Time-series models have utilized fuzzy theory to solve forecasting problems in various domains, such as university enrollment forecasting [1–4], stock price forecasting [5–14] and temperature forecasting [15]. In the area of stock price forecasting, Huarng (2001) provided heuristic models [5] to improve forecasting performance by integrating problem-specific heuristic knowledge with Chen's (1996) model, which was proposed to forecast university enrollment. In subsequent research, an Nth-order heuristic fuzzy time-series model was proposed by Huarng (2003) to forecast the TAIEX [7]. Additionally, the researcher found that the length of intervals for the universe of discourse affects forecasting results, and proposed two linguistic interval partitioning approaches, distribution-based and average-based length, to address this issue [6]. Another researcher, Yu (2005), proposed a weighted method for forecasting the TAIEX to tackle two issues, recurrence and weighting, in fuzzy time-series forecasting [10]. Yu argued that recurrent fuzzy relationships, which were simply ignored in Chen's (1996) studies, should be considered in forecasting, and recommended that different weights be assigned to various fuzzy relationships. In subsequent research, a trend-weighted model [12] was proposed by Cheng (2006) to echo Yu's research.

From the literature above, mining fuzzy logical relationships (FLR) from time-series is considered one of the important factors influencing the forecasting accuracy of fuzzy time-series models. Therefore, in recent research, some advanced algorithms such as genetic algorithms [4] and neural networks [9] have been applied to improve this process. Chen (2006) proposed a new method, using genetic algorithms, to deal with university enrollment forecasting problems based on high-order fuzzy time-series, where the length of each interval in the universe of discourse is tuned [4]. Besides, Yu (2006) applied a backpropagation neural network to handle nonlinear forecasting problems in stock price forecasting [9]. Two models using a neural network approach, a basic model and a hybrid model, were proposed to forecast the TAIEX. Although these fuzzy time-series models using advanced algorithms have greatly improved forecasting performance, they are concerned only with nonlinear relationships such as fuzzy logical relationships, and the process of mining fuzzy logical relationships is not easily understandable, much like a black box.

Besides, stock market investors usually make their short-term decisions based on the latest stock information, such as yesterday's market news and the price fluctuations of the last two days. Therefore, in stock price forecasting, we argue that the linear relationships between recent periods of stock prices should be factored into forecasting models in addition to fuzzy logical relationships [12–14]. In this paper, we propose a high-order fuzzy time-series model based on a multi-period adaptation model [16], which is derived from the adaptive expectation model [17], to improve forecasting accuracy. In the empirical analysis, we employ two stock databases, the TAIEX (Taiwan Stock Exchange Capitalization Weighted Stock Index) and the HSI (Hang Seng Index), from 1991 to 1999, as experimental datasets. Four recent fuzzy time-series models, Chen's (1996), Yu's (2005), Cheng's (2006), and Chen's (2007), are employed as comparison models. The comparisons show that the proposed model outperforms conventional models such as Chen's (1996) and Yu's (2005), which only mine fuzzy logical relationships from time-series. Additionally, to verify the proposed model, we use a statistical method, the method of least squares [18], to estimate auto-regressive models for the nine years of stock index data within each stock database. The estimated results are employed to check the time lags against the orders recommended by the proposed model. This verification shows that they are highly consistent. Based on the empirical analysis, two major conclusions are given: (1) the multi-period adaptation model can effectively improve the forecasting performance of conventional fuzzy time-series models; and (2) the price change patterns in the two stock markets, TAIEX and HSI, are short-term.

The remainder of this paper is organized as follows: Section 2 introduces the fuzzy time-series model; Section 3 introduces the proposed model and algorithm; Section 4 presents the empirical analysis and model comparisons; and, in the last section, findings and concluding remarks are given.

* Corresponding author. Tel.: +886 920975168.
E-mail address: g9320817@yuntech.edu.tw (T.-L. Chen).
0378-4371/$ - see front matter © 2007 Published by Elsevier B.V.
doi:10.1016/j.physa.2007.10.004
Please cite this article in press as: T.-L. Chen, et al., High-order fuzzy time-series based on multi-period adaptation model for forecasting stock markets, Physica A (2007), doi:10.1016/j.physa.2007.10.004

2. Fuzzy time-series

Fuzzy theory, the modern concept of uncertainty, was introduced by Zadeh (1975) to deal with linguistic terms [19–21]. Membership in a fuzzy set is not a matter of affirmation or denial, but rather a matter of degree. Nowadays fuzzy theory is vigorously studied in expert systems, approximate reasoning, control, pattern recognition, databases, and information retrieval systems. Time-series models had failed to consider the application of this theory until fuzzy time-series was defined by Song and Chissom [22,23]. The definitions and processes of the fuzzy time-series presented by Song and Chissom (1993) are described as follows.

Definition 1. Let $Y(t)$ $(t = \ldots, 0, 1, 2, \ldots)$, a subset of the real numbers, be the universe of discourse on which the fuzzy sets $f_i(t)$ $(i = 1, 2, \ldots)$ are defined. If $F(t)$ consists of $f_i(t)$ $(i = 1, 2, \ldots)$, then $F(t)$ is defined as a fuzzy time-series on $Y(t)$ $(t = \ldots, 0, 1, 2, \ldots)$.


Definition 2. If there exists a fuzzy logical relationship $R(t-1, t)$ such that $F(t) = F(t-1) \circ R(t-1, t)$, where $\circ$ represents an operation, then $F(t)$ is said to be caused by $F(t-1)$. The logical relationship between $F(t)$ and $F(t-1)$ is denoted $F(t-1) \rightarrow F(t)$.

Definition 3. Let $F(t-1) = A_i$ and $F(t) = A_j$. The relationship between two consecutive observations, $F(t)$ and $F(t-1)$, referred to as a fuzzy logical relationship (FLR), can be denoted by $A_i \rightarrow A_j$, where $A_i$ is called the Left-Hand Side (LHS) and $A_j$ the Right-Hand Side (RHS) of the FLR.

Definition 4. All fuzzy logical relationships in the training dataset can be further grouped into different fuzzy logical relationship groups according to their Left-Hand Sides. For example, suppose there are two fuzzy logical relationships with the same Left-Hand Side ($A_i$): $A_i \rightarrow A_{j1}$ and $A_i \rightarrow A_{j2}$. These two fuzzy logical relationships can be grouped into one fuzzy logical relationship group.

Definition 5. Suppose $F(t)$ is caused by $F(t-1)$ only, and $F(t) = F(t-1) \circ R(t-1, t)$. If $R(t-1, t)$ is independent of $t$ for any $t$, then $F(t)$ is named a time-invariant fuzzy time-series; otherwise it is a time-variant fuzzy time-series.

Definition 6. Assume that $F(t)$ is a fuzzy time-series and $F(t)$ is caused by $F(t-1), F(t-2), \ldots,$ and $F(t-n)$. Then the fuzzy logical relationship can be represented as $F(t-1), F(t-2), \ldots, F(t-n) \rightarrow F(t)$. This expression is called the nth-order fuzzy time-series forecasting model, where $n \geq 2$ [2,4].

Song and Chissom employed six main procedures in time-invariant and time-variant fuzzy time-series models, as follows: (1) define and partition the universe of discourse; (2) define fuzzy sets for the observations; (3) partition the intervals; (4) fuzzify the observations; (5) establish the fuzzy relationship and forecast; and (6) defuzzify the forecasting results.
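Definitions 3, 4 and 6 can be illustrated with a minimal Python sketch. The function names and the short label sequence are hypothetical and chosen only for illustration; the sketch assumes the observations have already been fuzzified into labels.

```python
from collections import Counter, defaultdict


def extract_flrs(labels, order=1):
    """Return (LHS, RHS) pairs; the LHS is a tuple of `order` consecutive labels."""
    return [(tuple(labels[t - order:t]), labels[t])
            for t in range(order, len(labels))]


def group_flrs(flrs):
    """Group FLRs by their LHS and count how often each RHS occurs."""
    groups = defaultdict(Counter)
    for lhs, rhs in flrs:
        groups[lhs][rhs] += 1
    return groups


# Hypothetical sequence of fuzzified observations (illustration only).
labels = ["A1", "A1", "A2", "A1", "A2", "A3", "A3", "A4"]

first_order = extract_flrs(labels, order=1)   # A_i -> A_j
second_order = extract_flrs(labels, order=2)  # A_h, A_i -> A_j
groups = group_flrs(first_order)

for lhs, rhs_counts in groups.items():
    print(", ".join(lhs), "->", dict(rhs_counts))
```

Keeping the occurrence counts inside each group, rather than only the set of right-hand sides, is a design choice that makes it possible to weight recurrent relationships later.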


3. Proposed model and algorithm

In fuzzy time-series models, mining fuzzy logical relationships (FLR) from time-series is the most critical process influencing forecasting accuracy [9]. In recent scientific research, artificial intelligence algorithms such as genetic algorithms and neural networks play an important role. To improve forecasting accuracy, these algorithms have also been applied in fuzzy time-series models [4,9]. Although artificial intelligence algorithms perform well in forecasting, they only handle nonlinear relationships, that is, fuzzy logical relationships, in time-series. Besides, as mentioned in the Introduction, recent stock information such as the latest market news and yesterday's price fluctuations is usually referenced for short-term decisions, and reasonable investors will modify their predictions according to recent forecasting errors. Hence, a high-order fuzzy time-series model, such as Huarng's model (2003) [7], should be provided to capture the price patterns in historical data. Based on these facts, we argue that a thoughtful fuzzy time-series model should consider two price patterns together in the forecasting process: (1) nonlinear high-order fuzzy logical relationships in historical data; and (2) linear relationships between recent periods of stock prices. Therefore, this paper proposes a high-order fuzzy time-series model (see Fig. 1) to implement this notion.

3.1. Linear forecasting model

In time-series research, the adaptive expectation model [17] is a reasonable forecast model to represent how stock investors predict the future stock price: the forecast is generated from the stock price of the last period and a correction for the forecasting error of the last period. In stock markets, practical experience shows that investors usually make their decisions based on recent periods of stock prices. Based on this assumption, we can utilize the forecasting errors of recent periods to modify the forecast for the future stock price. Therefore, we extend the adaptive expectation model to derive a multi-period adaptation model (defined in Eq. (1), where forecast$(t+1)$ is the prediction for the future price; $P(t)$ is the present price; $\varepsilon_i$ is the forecasting error of the previous $i$th period; and $h_i$ is the adaptation parameter for $\varepsilon_i$) [24].

$$\mathrm{forecast}(t+1) = P(t) + \sum_{i=1}^{k} h_i \varepsilon_i. \tag{1}$$

With the multi-period adaptation model, we propose a comprehensive fuzzy time-series model which is capable of mining nonlinear and linear relationships in stock price time-series.

Fig. 1. Research process of the proposed model.

3.2. Proposed algorithm

In this section, we provide stepwise computations and a numerical example to introduce the proposed algorithm as follows.

Step 1: Reasonably define the universe of discourse, $U$. Let $U = [D_{\min} - D_1, D_{\max} + D_2]$, where $D_1$ and $D_2$ are two proper positive numbers [1]. For example, the minimum and maximum of the TAIEX in a training period are 6430 and 7750, respectively. Then the universe of discourse can be defined as $U = [6400, 7800]$, where $D_1$ is 30 and $D_2$ is 50. As a result, the defined universe of discourse, $U = [6400, 7800]$, covers every stock index value that occurred in the training period.

Step 2: Partition $U$ into several linguistic intervals of equal length. The number of intervals corresponds to the number of linguistic values considered for the universe of discourse. In Chen's research [2002, 2006], different numbers of linguistic values, from 7 to 14, were employed to test his models, and different forecasting performances were produced. Chen's models showed that more linguistic values result in better performance for the enrollment dataset (the enrollments of the University of Alabama), which is used only as a training dataset, not for testing. In this paper, seven different numbers of linguistic values, from 7 to 13, are taken to evaluate the proposed model. Taking seven linguistic values as an example, the linguistic intervals for these linguistic values, which partition the universe of discourse, $U = [6400, 7800]$, are listed in Table 1.

Table 1
Seven linguistic intervals for TAIEX

Linguistic interval   Interval range
I1                    [6400, 6600]
I2                    [6600, 6800]
I3                    [6800, 7000]
I4                    [7000, 7200]
I5                    [7200, 7400]
I6                    [7400, 7600]
I7                    [7600, 7800]

Step 3: Establish a related fuzzy set for each observation in the training dataset. Firstly, define the fuzzy sets, $L_1, L_2, \ldots, L_k$, based on the linguistic intervals by Eq. (2). The value of $a_{ij}$ indicates the grade of membership of $u_j$ in fuzzy set $L_i$, where $u_j$ is a triangular fuzzy number with a linguistic interval, $I_k$, and $a_{ij} \in [0, 1]$ ($1 \le i \le k$ and $1 \le j \le m$).

$$\begin{aligned}
L_1 &= a_{11}/u_1 + a_{12}/u_2 + \cdots + a_{1m}/u_m \\
L_2 &= a_{21}/u_1 + a_{22}/u_2 + \cdots + a_{2m}/u_m \\
&\;\;\vdots \\
L_k &= a_{k1}/u_1 + a_{k2}/u_2 + \cdots + a_{km}/u_m.
\end{aligned} \tag{2}$$

Secondly, find the degree to which each stock price belongs to each $L_i$ $(i = 1, \ldots, k)$. If the maximum membership of the stock price occurs under $L_k$, then the fuzzified stock price is labeled as $L_k$. Lastly, convert each stock price in the training dataset to the corresponding linguistic value, $L_k$. In the next step, the fuzzy logical relationships are constructed based on the fuzzified stock prices. For example, seven fuzzy linguistic values from Step 2 can be defined as follows: $L_1$ = (very low price), $L_2$ = (low price), $L_3$ = (little low price), $L_4$ = (normal price), $L_5$ = (little high price), $L_6$ = (high price) and $L_7$ = (very high price) [1]. Table 2 demonstrates how to classify eight periods of stock prices into the corresponding linguistic values.

Table 2
Assign a related linguistic value

Time    Stock price   Linguistic value
t = 1   6436          L1
t = 2   6537          L1
t = 3   6662          L2
t = 4   6550          L1
t = 5   6666          L2
t = 6   6890          L3
t = 7   6810          L3
t = 8   7020          L4
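Steps 1 to 3 can be sketched in a few lines of Python using the numbers of Tables 1 and 2. The sketch assumes that the maximum-membership label of a price is simply the index of the equal-length interval that contains it, which is a simplification of the triangular fuzzy sets of Eq. (2); the function names are hypothetical.

```python
def partition(lower, upper, n_intervals):
    """Split the universe of discourse [lower, upper] into equal-length intervals."""
    width = (upper - lower) / n_intervals
    return [(lower + i * width, lower + (i + 1) * width)
            for i in range(n_intervals)]


def fuzzify(price, intervals):
    """Return the 1-based index i of the interval I_i containing `price`.

    Assumption: the fuzzy set with maximum membership is the one attached
    to the containing interval; prices outside U are clamped to I_1 or I_k.
    """
    lower = intervals[0][0]
    width = intervals[0][1] - intervals[0][0]
    index = int((price - lower) // width)
    return max(0, min(index, len(intervals) - 1)) + 1


# Universe of discourse and prices taken from the worked example.
intervals = partition(6400, 7800, 7)
prices = [6436, 6537, 6662, 6550, 6666, 6890, 6810, 7020]
labels = [fuzzify(p, intervals) for p in prices]

for t, (p, l) in enumerate(zip(prices, labels), start=1):
    print(f"t = {t}: price {p} -> L{l}")
```

With these inputs the labels come out as L1, L1, L2, L1, L2, L3, L3, L4, which agrees with Table 2.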

Step 4: Establish the $i$th-order fuzzy logical relationships (FLR) for the time-series. A first-order fuzzy logical relationship is composed of two consecutive linguistic values; a second-order FLR is composed of three consecutive linguistic values; and an $i$th-order fuzzy logical relationship is composed of $i + 1$ consecutive linguistic values. For example, Tables 3 and 4 demonstrate how to establish first- and second-order fuzzy logical relationships based on Table 2.

Table 3
First-order fuzzy logical relationship

Time    Linguistic value   First-order FLR
t = 1   L1                 N.A.
t = 2   L1                 L1 → L1
t = 3   L2                 L1 → L2
t = 4   L1                 L2 → L1
t = 5   L2                 L1 → L2
t = 6   L3                 L2 → L3
t = 7   L3                 L3 → L3
t = 8   L4                 L3 → L4

Table 4
Second-order fuzzy logical relationship

Time    Linguistic value   Second-order FLR
t = 1   L1                 N.A.
t = 2   L1                 N.A.
t = 3   L2                 L1, L1 → L2
t = 4   L1                 L1, L2 → L1
t = 5   L2                 L2, L1 → L2
t = 6   L3                 L1, L2 → L3
t = 7   L3                 L2, L3 → L3
t = 8   L4                 L3, L3 → L4

Step 5: Establish FLR groups and assign a frequency weight to each FLR. The FLRs with the same LHS (Left-Hand Side) linguistic value can be grouped into one FLR group. All FLR groups together construct a fluctuation-type matrix. Tables 5 and 6 show the fluctuation-type matrices produced by the FLRs from Tables 3 and 4. Each row of the matrix represents one FLR group and each cell represents the occurrence frequency of the corresponding FLR. Each FLR within the same FLR group should be assigned a weight. For example, in Table 3, the FLR group of $L_1$ is $L_1 \rightarrow L_1, L_2$. The FLR $L_1 \rightarrow L_1$ occurs once and is assigned the weight 1. The FLR $L_1 \rightarrow L_2$ occurs twice and is assigned the weight 2. In this paper, a frequency-weighted method, in which each FLR weight is determined by its occurrence frequency, is employed. The weights of the FLRs are normalized by their sum to obtain a frequency-weighted matrix, $W(L_i, t)$, which is defined in Eq. (3).

$$W(L_i, t) = [W'_1, W'_2, \ldots, W'_k] = \left[\frac{W_1}{\sum_{i=1}^{k} W_i}, \frac{W_2}{\sum_{i=1}^{k} W_i}, \ldots, \frac{W_k}{\sum_{i=1}^{k} W_i}\right]. \tag{3}$$

For example, the frequency-weighted matrix for the FLR group of $L_1$ at $t$ is specified as follows: $W'_1 = 1/3$, $W'_2 = 2/3$, $W'_3 = 0$, $W'_4 = 0$, $W'_5 = 0$, $W'_6 = 0$ and $W'_7 = 0$.

Table 5
A fluctuation-type matrix for FLR groups (first order)

          P(t)
P(t−1)    L1   L2   L3   L4   L5   L6   L7
L1        1    2    0    0    0    0    0
L2        1    0    1    0    0    0    0
L3        0    0    1    1    0    0    0
L4        0    0    0    0    0    0    0
L5        0    0    0    0    0    0    0
L6        0    0    0    0    0    0    0
L7        0    0    0    0    0    0    0

Table 6
A fluctuation-type matrix for FLR groups (second order)

                  P(t)
P(t−2), P(t−1)    L1   L2   L3   L4   L5   L6   L7
L1, L1            1    2    0    0    0    0    0
L1, L2            1    0    1    0    0    0    0
L1, L3            0    0    1    1    0    0    0
...
L7, L5            0    0    0    0    0    0    0
L7, L6            0    0    0    0    0    0    0
L7, L7            0    0    0    0    0    0    0

Step 6: Compute initial linguistic forecasts and defuzzify the forecasts to a numeric forecast. There are two sub-procedures in this step, as follows.

Step 6-1: Produce linguistic forecasts based on the rules of the FLR groups. Taking Table 5 as an example, if a linguistic stock price is $L_1$, which fits the rule in the first row of Table 5, then the linguistic forecasts for the future stock price will be $L_1$ and $L_2$, and the frequency-weighted matrix for these two linguistic forecasts is [1/3, 2/3, 0, 0, 0, 0, 0]. If the linguistic stock price is not found in the rules of the FLR groups, the numeric stock price of $L_1$ is employed as the forecast for the future stock price.

Step 6-2: Defuzzify the linguistic forecasts to generate a numeric forecasted output [1]. This procedure is called defuzzification and is defined in Eq. (4), where $L_{df}$ is the defuzzified matrix, which is composed of the midpoints of the linguistic intervals, and $W^{T}(L_i, t)$ is the transpose of the frequency-weighted matrix from Step 5.

$$D(t) = L_{df} \cdot W^{T}(L_i, t). \tag{4}$$

For example, if the present stock price is $L_1$, which fits the rule in the first row of Table 5, then the transpose of the frequency-weighted matrix is $[1/3, 2/3, 0, 0, 0, 0, 0]^{T}$. From Table 1, the midpoint of $I_1$ is 6500, $I_2$ is 6700, $I_3$ is 6900, $I_4$ is 7100, $I_5$ is 7300, $I_6$ is 7500, and $I_7$ is 7700. Therefore, the defuzzified matrix, $L_{df}$, is [6500, 6700, 6900, 7100, 7300, 7500, 7700]. From the defuzzification equation (4), $[6500, 6700, 6900, 7100, 7300, 7500, 7700] \cdot [1/3, 2/3, 0, 0, 0, 0, 0]^{T}$ is computed to get the defuzzified result, 6633. Table 7 shows some examples of defuzzification using the data from Tables 1 and 5.

Step 7: Use the multi-period adaptation model to produce forecasts. The multi-period adaptation model is defined in Eq. (5), where Forecast$(t+1)$ is the conclusive forecast for the future stock price; $P(t)$ is the present stock price at time $t$; $\varepsilon_i$ is the $i$th forecasting error (where $\varepsilon = D(t) - P(t)$); and $h_i$ is a linear parameter for the $i$th period of forecasting error, $\varepsilon_i$.

$$\mathrm{Forecast}(t+1) = P(t) + \sum_{i=1}^{k} h_i \varepsilon_i. \tag{5}$$
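Steps 5 to 7 can be traced with a short Python sketch on the eight-period example. The function names and the adaptation parameters `h = [0.5, 0.25]` are hypothetical and chosen only for illustration; in particular, the parameter values are not estimates from any dataset.

```python
from collections import Counter, defaultdict

# Midpoints of the linguistic intervals I1..I7 from Table 1 (the matrix L_df).
MIDPOINTS = [6500, 6700, 6900, 7100, 7300, 7500, 7700]


def weight_matrix(labels, n_values):
    """Eq. (3): normalized first-order FLR frequencies for each LHS label."""
    counts = defaultdict(Counter)
    for prev, cur in zip(labels, labels[1:]):
        counts[prev][cur] += 1
    weights = {}
    for lhs, row in counts.items():
        total = sum(row.values())
        weights[lhs] = [row[j] / total for j in range(1, n_values + 1)]
    return weights


def defuzzify(label, weights, midpoints):
    """Eq. (4): D(t) = L_df . W^T(L_i, t); None when no rule matches."""
    if label not in weights:
        return None
    return sum(m * w for m, w in zip(midpoints, weights[label]))


def adapted_forecast(price, errors, h):
    """Eq. (5): P(t) + sum_i h_i * eps_i, with errors[0] the most recent error."""
    return price + sum(h_i * e for h_i, e in zip(h, errors))


prices = [6436, 6537, 6662, 6550, 6666, 6890, 6810, 7020]
labels = [1, 1, 2, 1, 2, 3, 3, 4]  # linguistic values L1..L4 from Table 2

weights = weight_matrix(labels, 7)
d = [defuzzify(l, weights, MIDPOINTS) for l in labels]
# Forecasting errors eps = D(t) - P(t) for the periods that have a rule.
eps = [round(dt) - p for dt, p in zip(d[:-1], prices)]

# Two adaptation periods with hypothetical parameters h1 = 0.5, h2 = 0.25.
example = adapted_forecast(prices[1], [eps[1], eps[0]], [0.5, 0.25])
print([round(x) for x in d[:-1]], eps, example)
```

The rounded defuzzified values for the first seven periods are 6633, 6633, 6700, 6633, 6700, 7000 and 7000; the last period is labeled L4, which has no rule, so `defuzzify` returns `None` for it.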

The linear parameters, $h_i$, range from −1 to 1, excluding 0, in steps of 0.001, and are tuned so that the forecasts reach the best forecasting performance in the training datasets. The parameters determined from the training datasets are then used to adapt the forecasts for the testing datasets. Table 8 demonstrates how to produce multi-period adaptations of forecasts using the data from Tables 2, 5 and 7.

Table 7
Examples of defuzzification (first order); in every row $L_{df}$ = [6500, 6700, 6900, 7100, 7300, 7500, 7700]

Time    Stock price   Linguistic value   W^T(L_i, t)                      D(t) = L_df · W^T(L_i, t)
t = 1   6436          L1                 [1/3, 2/3, 0, 0, 0, 0, 0]^T      6633
t = 2   6537          L1                 [1/3, 2/3, 0, 0, 0, 0, 0]^T      6633
t = 3   6662          L2                 [1/2, 0, 1/2, 0, 0, 0, 0]^T      6700
t = 4   6550          L1                 [1/3, 2/3, 0, 0, 0, 0, 0]^T      6633
t = 5   6666          L2                 [1/2, 0, 1/2, 0, 0, 0, 0]^T      6700
t = 6   6890          L3                 [0, 0, 1/2, 1/2, 0, 0, 0]^T      7000
t = 7   6810          L3                 [0, 0, 1/2, 1/2, 0, 0, 0]^T      7000
t = 8   7020          L4                 No rules

Table 8
Examples of forecasting processes for adaptive forecasts (first order)

Time    P(t)   Linguistic value   D(t)   ε = D(t) − P(t)
t = 1   6436   L1                 6633   197
t = 2   6537   L1                 6633   96
t = 3   6662   L2                 6700   38
t = 4   6550   L1                 6633   83
t = 5   6666   L2                 6700   34
t = 6   6890   L3                 7000   110
t = 7   6810   L3                 7000   190
t = 8   7020   L4                 N.A.   N.A.

Forecast(t + 1), 1 adaptation period
t = 1   N.A.
t = 2   6436 + h1 × 197
t = 3   6537 + h1 × 96
t = 4   6662 + h1 × 38
t = 5   6550 + h1 × 83
t = 6   6666 + h1 × 34
t = 7   6890 + h1 × 110
t = 8   6810 + h1 × 190

Forecast(t + 1), 2 adaptation periods
t = 1   N.A.
t = 2   N.A.
t = 3   6537 + h1 × 96 + h2 × 197
t = 4   6662 + h1 × 38 + h2 × 96
t = 5   6550 + h1 × 83 + h2 × 38
t = 6   6666 + h1 × 34 + h2 × 83
t = 7   6890 + h1 × 110 + h2 × 34
t = 8   6810 + h1 × 190 + h2 × 110

Forecast(t + 1), 3 adaptation periods
t = 1   N.A.
t = 2   N.A.
t = 3   N.A.
t = 4   6662 + h1 × 38 + h2 × 96 + h3 × 197
t = 5   6550 + h1 × 83 + h2 × 38 + h3 × 96
t = 6   6666 + h1 × 34 + h2 × 83 + h3 × 38
t = 7   6890 + h1 × 110 + h2 × 34 + h3 × 83
t = 8   6810 + h1 × 190 + h2 × 110 + h3 × 34

4. Empirical analysis

This section provides performance evaluation, performance comparison and model verification. The empirical databases used to verify the proposed model are two stock indexes: the TAIEX (Taiwan Stock Exchange Capitalization Weighted Stock Index) and the HSI (Hang Seng Index). A nine-year period of each stock index, from 1991 to 1999, is selected as the experimental datasets to evaluate the forecasting performance of the proposed model. Each year of the selected stock index is split into two subsets, a training dataset and a testing dataset. The first ten months of each year, from January to October, are used for training, and the last two months, November and December, are used for testing. Each stock index database contains over 2500 objects with one attribute, the closing price of the stock index. To measure the forecasting performance of the proposed model, we employ the RMSE (Root Mean Squared Error) as a performance indicator (defined in Eq. (6), where actual$(t)$ is the actual trading stock price at time $t$; forecast$(t)$ is the forecast value for actual$(t)$; and $n$ is the number of forecasts).

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{t=1}^{n} \left(\mathrm{actual}(t) - \mathrm{forecast}(t)\right)^{2}}{n}}. \tag{6}$$

4.1. Performance evaluation

To evaluate the proposed model, forecasting performances are produced using different orders, from 1st order to 4th order, and different adaptation periods, from 1 to 4 periods. Tables 9 and 10 list the forecasting performances
PHYSA: 11049

ARTICLE IN PRESS
T.-L. Chen et al. / Physica A xx (xxxx) xxxxxx Table 9 Forecasting performance for the TAIEX (testing datasets) Proposed model Order Adaptation period 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4
a Minimum RMSE.

Year 1991 99.08 43.44 44.24 38.18a 38.66 100.69 44.56 38.9 41.49 42.88 130.97 56.92 58.03 61.97 68.65 130.58 66.12 66.96 63.28 64.88

1992 81.13 43.81 43.32 41.78 41.2a 86.41 45.72 43.38 43.49 44.41 89.3 45.07 44.05 44.22 44.86 103.16 52.38 52.59 52.97 53.87

1993 126.55 105.58a 108.41 109.59 110.9 125.31 108.12 110.44 111.56 112.74 136.47 113.01 114.19 115.33 117.49 132.09 113.61 114.67 115.97 117.8

1994 110.69 75.71a 75.79 76.42 78.15 112.39 79.34 79.79 80.82 85.78 119.65 87.69 88.64 88.48 91.19 128.09 97.14 96.79 97.95 100.27

1995 105.57 54.99a 55.55 58.28 59.17 102.96 55.84 56.41 58.34 58.88

1996 76.56 51.05 50.93a 51.63 52.67 70.16 51.1 51.66 51.78 54.55

1997 158.89 134.41 137.32 135 132.48

1998 154.08 115.14 114.58 115.63 116.54

1999 133 103.86 102.02a 102.13 103.33 141.91 104.06 104.25 105.81 105.92 149.34 112.16 114.59 115.49 117.01 190.77 136.27 135.26 137.63 139.14

103.64 59.1 59.68 60.93 62.37 117.53 62.98 63.04 63.78 67.53

Proposed model Order Adaptation period 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4

Year 1991 53.04 39.75 39.01 39.57 40.01

1992

EC

Table 10 Forecasting performance for the HSI (testing datasets)

1993

TE
1994

D
1995 127.52 76.31 76.21 76.97 74.58a 129.62 79.82 81.83 80.03 80.88 128.21 82.68 79.92 80.42 79.74 146.20 93.36 92.57 89.68 89.10

PR OO
160 136.31 133.25 131.15 128.89a 184.56 135.66 135.54 131.66 130.62 206.64 143.24 138.35 139.65 141.18 70.19 53.36 53.09 54.01 56.05 71.42 54.31 54.9 56.46 57.94 1996 147.30 144.32 147.06 139.60a 141.62 148.20 146.82 139.65 141.52 143.45 149.87 139.69 141.46 143.44 145.49 139.69 141.46 143.44 145.49 147.56 1997 534.82 267.97a 274.11 277.26 278.70 465.22 278.67 282.32 280.94 282.12 431.08 292.96 291.75 292.21 294.19 399.76 299.83 299.84 301.48 284.68

F
149.66 113.65a 114.97 115.88 116.51 153.22 115.8 116.58 117.64 119.08 167.99 121.31 122.63 124.16 122.94 1998 296.65 204.43 205.83 203.70a 205.67 310.00 216.57 215.40 217.39 211.39 333.08 247.29 249.23 243.55 243.09 328.65 259.35 249.57 252.39 248.83

1999 295.59 238.73 244.70 239.75 242.90 296.11 245.88 238.50a 243.66 247.60 295.50 240.75 243.91 248.82 254.46 292.56 248.25 251.35 254.95 254.21

CO
48.28 39.18 40.62 40.96 41.56 48.49 40.61 40.73 41.68 42.13

52.21 38.63a 39.45 40.66 40.30

RR
190.71 133.78 135.24 136.67 143.96 216.62 144.28 145.88 147.72 154.14 178.25 142.30 144.19 146.10 147.70

182.56 131.49 131.32a 132.38 138.72

266.12 208.43 206.05 209.41 211.52

202.16 146.47a 152.94 154.51 162.94

219.74 205.07a 206.75 207.38 208.36 217.71 206.88 207.22 208.38 210.91 220.05 207.54 208.83 211.28 211.06

205.85 158.82 160.51 162.59 170.60 227.71 174.34 176.74 179.25 189.12 219.22 184.34 187.01 189.72 195.07

a Minimum RMSE.

for the empirical databases (Table 9 for the TAIEX and Table 10 for the HSI) by using 9 linguistic values to partition the universe of discourse dened in training datasets. It shows that, in the TAIEX, the forecasting models using the
Please cite this article in press as: T.-L. Chen, et al., High-order fuzzy time-series based on multi-period adaptation model for forecasting stock markets, Physica A (2007), doi:10.1016/j.physa.2007.10.004

UN

1 2

PHYSA: 11049

ARTICLE IN PRESS
10 Table 11 The statistics of best performance for the TAIEX Year Forecasting performance Order Adaptation periods 1991 38.18 1 3 1992 41.2 1 4 1993 105.58 1 1 1994 75.71 1 1 1995 54.99 1 1 1996 50.93 1 2 1997 128.89 2 4 1998 113.65 2 1 1999 102.02 1 2 T.-L. Chen et al. / Physica A xx (xxxx) xxxxxx

Table 12 The statistics of best performance for the HSI Year Forecasting performance Order Adaptation periods 1991 38.63 2 1 1992 131.32 1 2 1993 205.07 2 1 1994 146.47 1 1

1995

PR OO
1996 1997 139.60 1 3 267.97 1 1 1995 1996 1997 1998 79 70 55a 55a 55a 54 54 50a 50a 51 148 133 136 134 129a 167 151 115 115 114a

F
1998 1999 149 142 118 104 102a

1999 238.50 2 2

74.58 1 4

203.70 1 3

Table 13 Performance comparison table (TAIEX) Models Chens model [1] Yus model [10] Chens model [13] Chengs model [12,14] Proposed model (optimal order & adaptation periods) 1991 80 61 47 43 38a 1992 60 67 44 44 41a 1993 110 105a 107 105a 106 1994 112 135 78 76a 76a

Sum of RMSE 959 918 750 726 712a

a The best performance among 5 approaches (Minimum RMSE).


adaptation model (from 1 to 4) perform better than the one that does not use it (0). This result is also found in the HSI database. The performance tables for the stock databases are summarized in Tables 11 and 12 (Table 11 for the TAIEX, and Table 12 for the HSI), which list the orders and adaptation periods that give the best performance for the nine experimental datasets. Taking Table 11 as an example, among the 4 forecasting models of different orders, the 1st-order model ranks first in forecasting performance (it performs best in 7 testing periods, 1991-1996 and 1999). Among the 4 forecasting models using different adaptation periods, the model with one adaptation period ranks first (it performs best in 5 testing periods: 1993, 1994, 1995, 1998 and 1999). From these summary tables, Tables 11 and 12, it is found that higher-order models do not necessarily perform better than lower-order models.

4.2. Performance comparison

To examine the improvement in performance, four fuzzy time-series models, Chen's (1996), Yu's (2005), Cheng's (2006) and Chen's (2007) models, are employed as comparison models. Because the optimal number of linguistic values for the proposed model is 9, we re-implement the algorithms from the literature [1,10,12-14] using 9 linguistic values to produce the performance comparison tables for the empirical databases, shown in Tables 13 and 14. The comparison table for the TAIEX shows that the proposed model (using 9 linguistic values) is the sole best performer in 5 testing datasets (1991, 1992, 1997, 1998 and 1999), ties for the best in 2 more (1994 and 1995), and has the smallest sum of RMSE. In the HSI, however, the proposed model is not the single best model among the five: it shares the smallest sum of RMSE with Cheng's model.

4.3. Model verification

To compare the proposed model with conventional time-series models, a statistical method, the method of least squares [18], is used to estimate the stock price patterns of the two stock markets. Using the software (E-Views) to estimate the time-series models for different periods of the stock index (from 1991 to 1999), nine sets of hypothesis-testing statistics are generated for each stock market.
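The least-squares estimation described in Section 4.3 can be sketched in a few lines. The code below is only an illustration under stated assumptions, not the E-Views procedure used in the paper: it fits price(t) on an intercept and its first five lags by ordinary least squares, and it approximates the two-sided p-value of each t-statistic with the standard normal distribution (E-Views reports t-distribution probabilities). Whether the original regressions include an intercept is not stated, so the intercept is an assumption, and the series used is synthetic rather than TAIEX or HSI data. The names `simulate_ar1`, `fit_ar` and `significant_lags` are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_ar1(n, phi, seed=0):
    """Synthetic AR(1) series x(t) = phi * x(t-1) + noise (illustrative data only)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_ar(prices, lags=5):
    """OLS fit of price(t) = c + sum_k b_k * price(t-k).

    Returns one (coefficient, t_statistic, p_value) tuple per term;
    index 0 is the intercept and index k is the coefficient of price(t-k).
    """
    rows = [[1.0] + [prices[t - k] for k in range(1, lags + 1)]
            for t in range(lags, len(prices))]
    y = prices[lags:]
    m = lags + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(m)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(m)) for i in range(m)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (len(y) - m)  # residual variance
    result = []
    for i in range(m):
        se = (sigma2 * inv[i][i]) ** 0.5
        t = beta[i] / se
        # Assumption: normal approximation to the t distribution (large sample).
        p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
        result.append((beta[i], t, p))
    return result


def significant_lags(fit, alpha=0.01):
    """Lags whose slope coefficient is significant at level alpha (intercept skipped)."""
    return [k for k in range(1, len(fit)) if fit[k][2] < alpha]
```

Calling `significant_lags(fit_ar(series))` on one year of daily index values would give the kind of lag list that is compared with the order of the proposed model later in this section.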

Table 14
Performance comparison table (HSI)

| Models | 1991 | 1992 | 1993 | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 | Sum of RMSE |
|--------|------|------|------|------|------|------|------|------|------|-------------|
| Chen's model [1] | 105 | 197 | 336 | 313 | 214 | 169 | 512 | 282 | 306 | 2434 |
| Yu's model [10] | 40 | 183 | 203 | 198 | 105 | 141 | 375 | 258 | 273 | 1776 |
| Chen's model [13] | 38a | 130a | 206 | 147 | 75a | 147 | 274 | 205 | 241 | 1465 |
| Cheng's model [12,14] | 38a | 130a | 208 | 140a | 76 | 144 | 274 | 202a | 234a | 1446a |
| Proposed model (optimal order & adaptation periods) | 39 | 131 | 205a | 146 | 75a | 140a | 268a | 204 | 239 | 1446a |

a The best performance among 5 approaches (minimum RMSE).
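The "Sum of RMSE" column and the best-performance marks of Table 13 can be rechecked with a short script. The sketch below transcribes the TAIEX values from Table 13; the helper names (`rmse`, `sum_of_rmse`, `best_models_per_year`) are hypothetical and are not taken from the paper. The `rmse` function is the usual root mean squared error, included only to show how each table cell would be obtained from a year of actual and forecast values.

```python
# RMSE values transcribed from Table 13 (TAIEX), one list per model, years 1991-1999.
YEARS = list(range(1991, 2000))
TAIEX_RMSE = {
    "Chen [1]":      [80, 60, 110, 112, 79, 54, 148, 167, 149],
    "Yu [10]":       [61, 67, 105, 135, 70, 54, 133, 151, 142],
    "Chen [13]":     [47, 44, 107, 78, 55, 50, 136, 115, 118],
    "Cheng [12,14]": [43, 44, 105, 76, 55, 50, 134, 115, 104],
    "Proposed":      [38, 41, 106, 76, 55, 51, 129, 114, 102],
}


def rmse(actual, forecast):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    return (squared / len(actual)) ** 0.5


def sum_of_rmse(table):
    """Total RMSE over all testing years for each model."""
    return {model: sum(values) for model, values in table.items()}


def best_models_per_year(table, years):
    """Map each year to the models sharing the minimum RMSE (ties are kept)."""
    best = {}
    for i, year in enumerate(years):
        low = min(values[i] for values in table.values())
        best[year] = [m for m, values in table.items() if values[i] == low]
    return best


if __name__ == "__main__":
    print(sum_of_rmse(TAIEX_RMSE))
    for year, models in best_models_per_year(TAIEX_RMSE, YEARS).items():
        print(year, models)
```

Keeping ties in `best_models_per_year` mirrors the table, where several models can carry the best-performance mark in the same year.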

Fig. 2. Statistics of hypotheses testing for the TAIEX of 1991.


For each year of the stock index, five lagged regression variables, from price(t - 1) to price(t - 5), are estimated and tested. The p-value given just below the t-statistic, denoted probability (t-statistic) [18], is the marginal significance level of the t-test. If the p-value is less than the significance level being tested, set at 0.01 here, the null hypothesis that the corresponding slope coefficient equals zero is rejected. Taking Fig. 2 as an example, in the TAIEX of 1991 the only time-series variable significantly related to the present variable, price(t), is the previous variable, price(t - 1), because among the five variables (from price(t - 1) to price(t - 5)) only the p-value for price(t - 1) (0.0000) is less than the significance level (0.01).

To summarize the nine sets of hypothesis-testing statistics for the stock databases, Tables 15 and 16 list the estimated variables and statistics that are significant at the 1% level, including the estimated time-series variables, coefficients, t-statistics and p-values, for the nine-year period of each stock database (Table 15 for the TAIEX, and Table 16 for the HSI). From the statistics in Tables 15 and 16, nine estimated time lags, from 1991 to 1999, are derived for each stock database and compared with the orders of the proposed model under the best performance. Tables 17 and 18 list the two types of estimated time lags, produced by the least squares method and by the proposed model (Table 17 for the TAIEX, and Table 18 for the HSI). From these tables, it is apparent that there is high consistency between the statistical model and the proposed model for the two stock markets (the percentage of consistency is 13/18 = 72%). Based on this evidence, we can argue that the proposed model derives mostly the same forecasting patterns as the conventional statistical method in stock markets.
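The 13/18 consistency figure can be reproduced from Tables 17 and 18. The sketch below encodes each year as the set of lags found significant by least squares at p = 0.01, paired with the order of the proposed model. The matching rule used here, that a year counts as consistent when the proposed order is one of the significant lags, is an assumption inferred from the Yes/No rows of the two tables; the paper does not state the rule explicitly. The function names are hypothetical.

```python
# (significant lags from least squares at p = 0.01, order of the proposed model),
# one pair per year from 1991 to 1999, transcribed from Tables 17 and 18.
TAIEX = [({1}, 1), ({1}, 1), ({1}, 1), ({1}, 1), ({1}, 1),
         ({1}, 1), ({1}, 2), ({1, 4}, 2), ({1}, 1)]
HSI = [({1}, 2), ({1, 4}, 1), ({1}, 2), ({1}, 1), ({1}, 1),
       ({1}, 1), ({1, 3, 4}, 1), ({1, 2}, 1), ({1}, 2)]


def is_consistent(lags, order):
    """Assumed rule: consistent when the proposed order is a significant lag."""
    return order in lags


def count_consistent(rows):
    """Number of years in which the two methods agree under the assumed rule."""
    return sum(1 for lags, order in rows if is_consistent(lags, order))


def consistency_rate(*markets):
    """Share of consistent years over all markets, as a fraction between 0 and 1."""
    total = sum(len(rows) for rows in markets)
    agree = sum(count_consistent(rows) for rows in markets)
    return agree / total


if __name__ == "__main__":
    print(count_consistent(TAIEX), count_consistent(HSI))
    print(round(100 * consistency_rate(TAIEX, HSI)))
```

Under this rule the TAIEX contributes 7 consistent years and the HSI 6, which gives the 13 of 18 quoted in the text.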

5. Findings and conclusions

This paper provides a high-order fuzzy time-series model and employs the multi-period adaptation model to enhance forecasting accuracy. The empirical analysis yields three findings, as follows.

Table 15
Summarizations of statistics under significance level for TAIEX

| Year | Variable | Coefficient | t-statistic | Prob. |
|------|----------|-------------|-------------|-------|
| 1991 | price(t - 1) | 1.007620 | 17.03325 | 0.0000 |
| 1992 | price(t - 1) | 1.013346 | 16.88899 | 0.0000 |
| 1993 | price(t - 1) | 1.146745 | 18.80499 | 0.0000 |
| 1994 | price(t - 1) | 1.029945 | 17.74115 | 0.0000 |
| 1995 | price(t - 1) | 0.963102 | 16.14425 | 0.0000 |
| 1996 | price(t - 1) | 1.002557 | 16.76250 | 0.0000 |
| 1997 | price(t - 1) | 0.997510 | 16.78753 | 0.0000 |
| 1998 | price(t - 1) | 1.072445 | 17.43625 | 0.0000 |
| 1998 | price(t - 4) | 0.245564 | 2.756455 | 0.0062 |
| 1999 | price(t - 1) | 1.107256 | 17.88475 | 0.0000 |

Table 16
Summarizations of statistics under significance level for HSI

| Year | Variable | Coefficient | t-statistic | Prob. |
|------|----------|-------------|-------------|-------|
| 1991 | price(t - 1) | 0.970594 | 15.13667 | 0.0000 |
| 1992 | price(t - 1) | 1.080045 | 17.05132 | 0.0000 |
| 1992 | price(t - 4) | 0.302411 | 3.267867 | 0.0012 |
| 1993 | price(t - 1) | 1.120111 | 17.51516 | 0.0000 |
| 1994 | price(t - 1) | 0.997210 | 15.51544 | 0.0000 |
| 1995 | price(t - 1) | 1.078850 | 16.73271 | 0.0000 |
| 1996 | price(t - 1) | 0.955913 | 14.93345 | 0.0000 |
| 1997 | price(t - 1) | 1.028787 | 16.04084 | 0.0000 |
| 1997 | price(t - 3) | 0.320416 | 3.692714 | 0.0003 |
| 1997 | price(t - 4) | 0.391072 | 4.393940 | 0.0000 |
| 1998 | price(t - 1) | 1.116729 | 17.44999 | 0.0000 |
| 1998 | price(t - 2) | 0.254831 | 2.680829 | 0.0078 |
| 1999 | price(t - 1) | 1.065948 | 16.49485 | 0.0000 |

Table 17
The estimated time lags for the TAIEX

| Methods | 1991 | 1992 | 1993 | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 |
|---------|------|------|------|------|------|------|------|------|------|
| Least squares method (p = 0.01) | t-1 | t-1 | t-1 | t-1 | t-1 | t-1 | t-1 | t-1/t-4 | t-1 |
| Proposed model (order) | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 1 |
| Consistency | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes |

Table 18
The estimated time lags for the HSI

| Methods | 1991 | 1992 | 1993 | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 |
|---------|------|------|------|------|------|------|------|------|------|
| Least squares method (p = 0.01) | t-1 | t-1/t-4 | t-1 | t-1 | t-1 | t-1 | t-1/t-3/t-4 | t-1/t-2 | t-1 |
| Proposed model (order) | 2 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 2 |
| Consistency | No | Yes | No | Yes | Yes | Yes | Yes | Yes | No |

(1) Tables 9 and 10 show that the forecasting models using a lower order (1st or 2nd order) and fewer adaptation periods (1 or 2 adaptation periods) perform better than higher-order models. In addition, most of the estimated time lags from the method of least squares are less than 2 (see Tables 17 and 18). Both findings show that the stock price patterns in these stock markets are short-term. This implies that the investment decisions of stock investors are influenced by the most recent 1 or 2 periods of price fluctuations.

(2) The best performance is generated by forecasting models of different orders (from 1 to 2) and by models using different adaptation periods (from 1 to 4) (see Tables 15 and 16). This finding can be explained by assuming that different years of the stock data are generated from non-equivalent auto-regressive time-series models (see Tables 17 and 18).

(3) The experimental data in Tables 9 and 10 show that the adaptation model can effectively reduce the RMSE of the proposed model in all testing periods, and the performance comparison tables, Tables 13 and 14, show that the proposed model using different adaptation periods outperforms the listed conventional fuzzy time-series models, Chen's (1996) and Yu's (2005), which only mine fuzzy logical relationships from time-series.

In stock markets, investors usually base their decisions on stock prices from the most recent day or two rather than on prices from long ago. However, most conventional models only extract fuzzy logical relationships from a long period of historical data to generate forecasting rules. Therefore, the listed models, Chen's (1996) and Yu's (2005) models, cannot adapt their forecasts to recent price fluctuations to reduce forecasting error.

From these findings, two conclusions are drawn: (1) the multi-period adaptation model proposed in this paper can effectively improve the forecasting performance of conventional fuzzy time-series models; and (2) the price patterns in the TAIEX and HSI are short-term. For future research, two suggestions are offered to improve this work: (1) refine the evaluation approach by splitting the experimental datasets with various ratios of training to testing data; and (2) develop a computer system that provides forecasts for stock markets and evaluates the profit.

References

[1] S.M. Chen, Forecasting enrollments based on fuzzy time-series, Fuzzy Sets and Systems 81 (1996) 311-319.
[2] S.M. Chen, Forecasting enrollments based on high-order fuzzy time series, Cybernetics and Systems 33 (2002) 1-16.
[3] S.M. Chen, C.C. Hsu, A new method to forecast enrollments using fuzzy time series, Applied Science and Engineering 2 (2004) 234-244.
[4] S.M. Chen, N.Y. Chung, Forecasting enrollments using high-order fuzzy time series and genetic algorithms, International Journal of Intelligent Systems 21 (2006) 485-501.
[5] K. Huarng, Heuristic models of fuzzy time series for forecasting, Fuzzy Sets and Systems 123 (2001) 369-386.
[6] K. Huarng, Effective lengths of intervals to improve forecasting in fuzzy time series, Fuzzy Sets and Systems 123 (2001) 155-162.
[7] K. Huarng, H.K. Yu, An N-th order heuristic fuzzy time series model for TAIEX forecasting, International Journal of Fuzzy Systems 5 (4) (2003) 247-253.
[8] K. Huarng, H.K. Yu, A type-2 fuzzy time-series model for stock index forecasting, Physica A 353 (2005) 445-462.
[9] K. Huarng, T.H.K. Yu, The application of neural networks to forecast fuzzy time series, Physica A 336 (2006) 481-491.
[10] H.K. Yu, Weighted fuzzy time-series models for TAIEX forecasting, Physica A 349 (2005) 609-624.
[11] H.K. Yu, A refined fuzzy time-series model for forecasting, Physica A 346 (2005) 657-681.
[12] C.H. Cheng, T.L. Chen, C.H. Chiang, Trend-weighted fuzzy time-series model for TAIEX forecasting, Lecture Notes in Computer Science 4234 (2006) 469-477.
[13] T.-L. Chen, C.-H. Cheng, J.-T. Hia, Fuzzy time-series based on Fibonacci sequence for stock price forecasting, Physica A 380 (July) (2007) 377-390.
[14] C.H. Cheng, T.L. Chen, H.J. Teoh, C.H. Chiang, Fuzzy time-series based on adaptive expectation model for TAIEX forecasting, Expert Systems with Applications (2006) (in press).
[15] S.M. Chen, Temperature prediction using fuzzy time-series, IEEE Transactions on Cybernetics 30 (2000) 263-275.
[16] C.H. Cheng, T.L. Chen, H.J. Teoh, Multiple-period Modified Fuzzy Time-series for Forecasting TAIEX, 2007.
[17] J. Kmenta, Elements of Econometrics, MacMillan, 1986.
[18] Douglas C. Montgomery, George C. Runger, Norma Faris Hubele, Engineering Statistics, 3rd ed., Wiley, New York, 2004.
[19] L.A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning I, Information Science 8 (1975) 199-249.
[20] L.A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning II, Information Science 8 (1975) 301-357.
[21] L.A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning III, Information Science 9 (1976) 43-80.
[22] Q. Song, B.S. Chissom, Forecasting enrollments with fuzzy time-series Part I, Fuzzy Sets and Systems 54 (1993) 1-10.
[23] Q. Song, B.S. Chissom, Forecasting enrollments with fuzzy time-series Part II, Fuzzy Sets and Systems 62 (1994) 1-8.
[24] C.-H. Cheng, T.-L. Chen, H.-J. Teoh, Multiple-period Modified Fuzzy Time-series for Forecasting TAIEX, 2007.

Please cite this article in press as: T.-L. Chen, et al., High-order fuzzy time-series based on multi-period adaptation model for forecasting stock markets, Physica A (2007), doi:10.1016/j.physa.2007.10.004
