You are on page 1of 11

IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 5, NO.

2, MAY 1997 223

A Fuzzy Clustering-Based Rapid Prototyping


for Fuzzy Rule-Based Modeling
M. Delgado, Antonio F. Gómez-Skarmeta, and F. Martín

Abstract— This paper presents different approaches to the learning method of fuzzy rules based on the gradient descent
problem of fuzzy rules extraction by using fuzzy clustering as method, the works of Berenji et al. [4], Sin and de Figueredo
the main tool. Within these approaches we describe six methods [39], and others [3], [41], [52] that use a fuzzy clustering to
that represent different alternatives in the fuzzy modeling process
and how they can be integrated with a genetic algorithms. These obtain the fuzzy rules and functional form of the fuzzy sets
approaches attempt to obtain a first approximation to the fuzzy implied in those rules, and others authors that use alternative
rules without any assumption about the structure of the data. methods such as [14], [18], [23], [24] [33]. As a common
Because the main objective is to obtain an approximation, the drawback, most of the methods are complex, computationally
methods we propose must be as simple as possible, but also, they
demanding, and difficult to understand by the users when
must have a great approximative capacity and in that way we
work directly with fuzzy sets induced in the variables’ input they are nonexperts in fuzzy sets theory. We have called this
space. The methods will be applied to four examples and the first kind of fuzzy models an approximative approach because
errors obtained will be specified in the different cases. they try to extract from the sample data the fuzzy sets that
Index Terms— Fuzzy clustering, fuzzy modeling, fuzzy rules, characterize the fuzzy rules without any intention that the
unsupervised learning. fuzzy sets have a linguistic interpretation, so the main objective
in this case is to obtain a less descriptive fuzzy model, but a
more precise one.
I. INTRODUCTION
The second direction in formulating fuzzy rules is in the

I N this paper, we discuss several methods that use a fuzzy


clustering process within a rapid-prototyping approach and
attempt to generate a first approximation to a fuzzy model.
mode of what Yager [49] called template-based methods. In
this approach, the system expert provides template linguistic
values which are used to partition the input–output space.
Any system is to be described through the existing relations These template values are then used to generate potential rules
between its input variables and its output variables. To identify for the fuzzy system model. Input–output data are then used
such relations, a functional input–output description may be to generate weights, possibilities, or probabilities associated
available, but in the case of many complex processes, this is with the importance of the potential rules. In this approach,
not feasible and we need look for alternative methods. The the fuzzy subsets are a priori given and the emphasis is on the
use of fuzzy models and, more particularly, those described learning of the weights of the rules instead of the learning of
through fuzzy rules has been shown to be successful. the parameters of membership functions, as in the first method.
The problem of generation of fuzzy IF-THEN rules is one A common feature of these template-based methods is the
of the more important problems in the development of fuzzy assumption of given potential IF-THEN rules and, therefore, a
systems models. We can observe two major trends in this predefined rectangular partitioning of the input–output space.
endeavor. This first one is essentially a parameter identification The use of matrix product to express the induced degree of
approach; it is characterized by a back-propagation like learn- uncertainty for generating linguistic rules had been proposed
ing of the antecedent and consequent membership function by Peng and Wang [37] and Wang et al. [45]. In this same
parameters. This approach, typified by Wang and Mendel [44], approach, we can consider the works on linguistic fuzzy
Ichihashi [20], and Araki et al. [1], dominates the works models using fuzzy relational models of Pedrycz [36] or some
on learning in the fuzzy environment. In this approach, the works that starting with a linguistic model try to adjust the
description of the antecedent and consequent fuzzy subsets are parameters of the fuzzy sets using differents techniques like
reduced to a functional form whose parameters are estimated. neural nets or genetic algorithms [21]. A common drawback
In this same kind of work, we can also consider the works of of these methods is the combinatorial explosion they exhibit
Takagi and Sugeno [42] who investigate the identification of when the number of antecedents are high and, as a result, less
a system where the consequent of the fuzzy rule is a linear eficient methods are obtained. In the case of the second kind of
input–output (I/O) relation, Nomura et al. [34] who describe a models, we have called them a descriptive approach because
using a collection of predefined fuzzy sets in the domain of
Manuscript received October 10, 1995; revised October 21, 1996. This work
was supported in part by CICYT Project TIC95-1019. the variables they try to determine which combinations better
M. Delgado is with the Department Ciencias de la Computación e Inteligen- characterize the system, so its main objective is to obtain a
cia Artificial, Universidad de Granada, Granada, 18071 Spain. qualitative model of the system.
A. F. Gómez-Skarmeta and F. Martín are with the Department Informatica
y Sistemas, Universidad de Murcia, Murcia. Spain. In a normal situation when we are trying to model a system,
Publisher Item Identifier S 1063-6706(97)02839-7. the most important information about it is a collection of
1063–6706/97$10.00  1997 IEEE
224 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 5, NO. 2, MAY 1997

data obtained from the observation of its behavior. Also, input space and fuzzy sets of the output space. Numerical
normally, in the case of complex system, this may be the only examples are provided in Section IV to illustrate the perfor-
information we can have. The problem with this information mance of the presented methods. Finally, in Section V some
is that it does not have any structure, so either we use all conclusions and indications about future trends are presented.
of it directly, what could be a complex problem, or we try
to reduce its complexity. In this context, clustering in general
and fuzzy clustering, in particular, is one of the most promising II. FUZZY MODELING AND FUZZY CLUSTERING
techniques, basically because when we have a lot of data and In the following, we will consider that the system to be
we have no more information about it, clustering can be used modeled is a multi-input single-output (MISO) one, i.e., to be
to detect the possible groupings that exist, groups that have described by a set of rules with the form
similarity in their behaviors, and groups that can be used
to establish some hypothesis about the structure present in If is and and is then is (1)
the data; however, it is also interesting to make groups of
related data to reduce the complexity of the model. Therefore, where are input variables, is the output, and
in this way we are applying the incompatibility principle of and are fuzzy variables.
Zadeh [53], [54], reducing the complexity but maintaining the We will assume that the basic knowledge for modeling such
most information we can from the original data. Once we a (fuzzy) system is given in terms of a (usually) large set
have detected the groupings present on data, we can try to of sample I/O pairs, say
associate them to one or more fuzzy rules to obtain a fuzzy and obviously the selected rules are
model of the studied system. These characteristics of fuzzy required to be able to approximate the function
clustering techniques have led us to its use in the area of that theoretically underlies the system behavior the most
fuzzy modeling, either within the descriptive or within the consistently with the given sample of I/O pairs. We ought
approximative approach. to assume that the collection of data samples represent the
In this work, we present the results we have obtained in the system behavior in the product space where
study of different techniques of fuzzy rule generation using the with being the domains
information we can obtain from the fuzzy clusters of the data. of discourse of the inputs and is the domain of the output.
These approaches attempt to obtain a first approximation to It is clear that if we are trying to approximate the function
the fuzzy rules without any assumption about the structure of there must be a relation between the data in
the data. We are going to study the problem of fuzzy rule and and, in this way, the clustering of the data in
generation from a methodology point of view, considering seems to be an adequate technique to underline
the alternatives we have to obtain a rapid prototyping of this relation. The use of the clusters found in is
a fuzzy model using fuzzy clustering as the main tool and necessary because they represent the tendencies detected in
considering different situations. In this context, the term rapid the association between the input and output data. Moreover,
prototyping means being able to obtain (in a rapid way) an they are an efficient and flexible procedure to aggregate the
acceptable collection of fuzzy rules that can be considered behaviors present on the data, in this way, characterizing
as a first approximation to the fuzzy model and that can be their relations. The use of the information given by the fuzzy
introduced in a subsequent process or procedure to prune and clusters in is not new, and several authors have
tune them. This methodology is the central component of a proposed similar approaches (i.e, Yoshinari et al. [52], Yager
rapid-prototyping tool we are developing for the integration and Filev [48], Sin and de Figueiredo [39], Babuska and
of different techniques of fuzzy rules generation, and that we Verbruggen [2], Chiu [8], and others). Our contribution is the
have called integrating generators of rules (IGOR). use of the information the clustering gives us in a direct way to
These methods will provide the users with a useful tool in generate a rapid-prototyping approach to the fuzzy modeling.
the decision-making process on how to undertake the fuzzy Obviously, an alternative could be to generate fuzzy clusters
modeling, offering different alternative methods to generate separately in the input and output space and then to try putting
fuzzy rules going from a pure approximative approach to them in relation. We will see (in the next section) an approach
a pseudodescriptive one, the difference between them being of this kind and also an intermediate approach. However, as
where the emphasis is put, either in obtaining more precise our studies have revealed [12] and, as we will see in the
fuzzy rules or more “linguistic” ones in the sense that they examples in Section IV, this method is less efficient to obtain
can be interpreted by an expert. All the methods will be an approximative fuzzy model, although it has advantages in
characterized by their simplicity, by what makes them suitable case what we want is to obtain a more descriptive model.
for use in a rapid-prototyping approach to the fuzzy modeling, In this way, in our approach (a fuzzy clustering) is carried
and by the use of fuzzy sets defined directly in the product out on , which will group together those I/O pairs that
space of the input variables. are close with respect to some similarity measure. As we
The remainder of this paper is organized as follows. In do not assume any more information about the structure of
Section II, we present the fuzzy model with which we are the data (for example, about its linearity), we use a classical
going to work. In Section III, we define different methods that fuzzy clustering algorithm to generate fuzzy clusters in the
are going to use the information that gives us the clusters used product space with centers or centroids denoted by
to characterize the better combination of fuzzy relations of the for .
DELGADO et al.: FUZZY CLUSTERING-BASED RAPID PROTOTYPING FOR FUZZY RULE-BASED MODELING 225

From the information these cluster and their centroids To infer using this collection of fuzzy rules, we have
provide, the key idea is to generate a first collection of rough decided, because of its simplicity although without significance
fuzzy rules that can be used as a first approximation of decrease of the performance (remember we are trying to
the collection of rules to later obtain an inference machine obtain a first approximation to the fuzzy model), to use the
(approximative approach) or find a linguistic description of Mizumoto’s simplified approximate reasoning method [31],
the involved fuzzy sets (descriptive approach). where the fuzzy set in the consequent of the fuzzy rule is
To illustrate our comments, let us suppose that the fuzzy substituted by a “Singleton value” (i.e., a real number) that
-means (FCM) algorithm [5] is used in the initial clustering could be obtained from some suitable defuzzification method
process. Then the membership function of the fuzzy relation over , in particular, the gravity center of
associated with the th cluster is such that
(4)

Under the idea of using Mizumoto’s simplified reasoning


method, (3) is to be rewritten as

(2) If is then is (5)


where, obviously, is the Singleton consequent to each fuzzy
where and if then rule.
and otherwise. For any input and using the aforementioned simplified
These fuzzy relations will generate the fuzzy rules for reasoning method the inferred value will be
our proposed fuzzy system model and then it is obvious that
the number of needed rules plays a key role and is to be
established for every problem. In several papers, the number
of rules is obtained from cluster validation by using some of (6)
the validity measures built for the purpose in the literature.
Another alternative is the use of a progressive cluster iden-
tification algorithm like the one proposed by Krishnapuram
Now the problem is how to characterize the fuzzy sets
[27]. We have work in this area using a data preprocessing
and or the associated Singleton value , starting from
by means of a hierarchical clustering [11] that allows us to
the fuzzy clusters in . This topic will be the
select the “best” original crisp partitions of the data and that,
central objective of the next section.
additionally, can serve as initialization of the fuzzy clustering
algorithm, in this way improving the algorithm performance.
III. AN APPROACH TO RAPID
Let us remark that the use of a partitional fuzzy clustering
PROTOTYPING IN FUZZY MODELING
algorithm, in particular the FCM, give us several advantages
that leads us to use it: In this section, we will present the different methods
1) a fuzzy partition of the space is obtained, which is an grouped in virtue of the technique used to obtain them
important characteristic in fuzzy inference; and also in increasing computational complexity order. The
2) once we obtain the fuzzy clusters, the functional expres- description of different methods will be used to indicate
sion of their associated fuzzy sets depends only on their different alternatives in the process of generating the fuzzy
centroids and the distance of the data to them, so we can rules. Moreover, the increasing complexity in the methods
express these fuzzy sets in any domain, for example, will be due, in one case, because the intention of obtaining
directly in ; a pseudodescriptive fuzzy model and in other case because
3) the amount of information necessary in the fuzzy model considering more complex fuzzy model like the one proposed
is reduced and no a priori assumption about the “no by Takagi et al. [42].
interactivity” between the different variables in the input Several authors (Sugeno et al. [41], Yager et al. [50],
space is needed. Babuska et al. [2], etc.) have faced the use of the fuzzy cluster
in fuzzy modeling by projecting the clusters in the domains
Therefore, using the fuzzy clustering we have disclosed
of the variables. We have adopted a different point of view.
local behaviors of the data around the obtained cen-
In general, in all our methods we will directly characterize
troids—behaviors that are captured by these fuzzy relations.
fuzzy sets in , trying in this way to capture most part of
We assume that these fuzzy clusters found in the product
the information about the behavior or tendencies detected on
space are the best description of the tendencies
the data to obtain a better approximation. Additionally, we
present in the data, thus, this global structure ought to be kept
consider that the centroids of the clusters (which describe a
and then the fuzzy rules (1) will be considered as
local behavior) and their relation with the data (the fuzzy sets
If is then is (3) they generate) give much information and, thus, we assume
they must play a key role in our process.
where and and being In the following, we will present the different methods,
and respectively, fuzzy sets in and to be obtained which will be denoted , where stand for ESTi-
as representation of the th fuzzy cluster of . mation procedure and indicates an ordinal number.
226 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 5, NO. 2, MAY 1997

A. Inducing Fuzzy Sets in as a Global Input Space inference method


The first and most simple approximation to the problem
is to use directly the centroids and the metric found by the
clustering on the product space in the input and (11)
output space and . To do this, we consider the fuzzy sets
generated in the domains and by the components of the
centroids in and , and that will be noted as and
, respectively. We must note that these fuzzy sets are not where is the covariance matrix associated to the cluster
projections of the clusters, because the data do not preserve .
their membership value, so we must talk about a fuzzy sets The use here of a exponential membership function is mo-
induced in and by the fuzzy clusters. In this way, using tivated because, in this case, we can use another kind of fuzzy
the expression (1) defined in we can now consider clustering algorithm such as the possibilistic C-means (PCM)
fuzzy sets in and said and [26] instead the FCM to disclose the local behavior of the data.
with membership functions These authors use exponential membership functions because a
bell-like shape is more appropiate for linguistic interpretations
and it is rotationally symmetric in the multidimensional space;
consequently, it can be easily projected in each domain of the
variables. This is an interesting property if we try to transform
the generated fuzzy rules in the direction of a more descriptive
(7) fuzzy model (see Section III-C).

B. Considering the Relation between


where if (the same if and, Clusters in and
thus, we have fuzzy rules All previous methods are rather simple because they trans-
late on and the information disclosed by the fuzzy
If is then is (8) clustering in . Let us observe that the same output
is asigned in both cases.
According to the properties of the centroids, we can alter- A little more complicated method can be considered if we
natively generate a set of rules taking as Singleton consequent establish some relation between the groupings detected in
the component in of the centroids in the form and the tendencies that can be detected in .
The idea is that in the reasoning process [as we will need to
If is then is (9) work with data in (input data space)], we need a way to
relate the behavior in with the behavior detected in the
I/O data used in the modeling.
from which the inference for any new input is obtained
The first way we can do it without great complexity is
according to (10), given rise to the first of our proposed
to redefine the fuzzy rules so the consequent value is not
methods
obtained through the projection in the output space of the
fuzzy clusters in , though considering the groupings
detected in and the associated groupings, that can
be established in . To obtain these groupings in and,
(10) hence, the corresponding fuzzy sets, we have to work either
with the induced fuzzy sets shown in (7) or, alternatively, make
a fuzzy clustering in using the input data set. In this second
case, we have considered the same number of clusters than in
Though in the previous method we have used in the defini- and we have used the results of the fuzzy clustering
tion of the fuzzy sets in , the same distance measure used in as initialization of the FCM algorithm, letting the
in the fuzzy clustering algorithm, it is necessary to remark algorithm execute for just a few iterations. The idea behind
that using any other kind of distance measure to define the this process is to obtain a one-to-one relationship between the
membership functions is quite possible because the key results clusters found in and those found in . The results
of the clustering process are the position of the centroids and in either case are quite similar.
how the data are distributed around them. Once we have this Thus, these techniques lead us obtain the consequent value
information, we can use it in several alternative ways. for the fuzzy rules using the whole information provide by
For example, we can define fuzzy sets with exponential the clusters found in and the local information
membership functions in the domain using the components obtained from the clusters in . Using this procedure we
in this space of the centroids found in and the have a collection of fuzzy rules
fuzzy covariances calculated over the components in of
If is then is (12)
the sample data [15], [22]. In this way, we obtain a different
DELGADO et al.: FUZZY CLUSTERING-BASED RAPID PROTOTYPING FOR FUZZY RULE-BASED MODELING 227

where the new consequent is obtained using being a -norm and the Singleton consequent associated
to the fuzzy sets .
Although we have assigned a weight to each possible rule,
the reduction in the number of rules associated to any possible
(13) fuzzy set in the input space could be interesting. This can serve
not only to decrease the size of the weights matrices, reducing
in this way the complexity of the system, but also because in
this way the inference process is improved. Once we generate
where and are the membership degrees of
the matrix of weights, we can assign just one output fuzzy set
the data in the fuzzy sets corresponding to the th cluster on
to each input fuzzy set , for example, taking into account
and its induced one on , respectively.
The value provided by could be seen as the projection in
(18)
the domain of the centroids of resultants by combining the
fuzzy clusters of with the fuzzy sets in induced
by the clusters. C. Considering Separately
The main advantage of all the previous techniques is that
B. An Alternative Classical Approach they generate a good (local) fit of the given data allowing
Although up until now we have used the information the construction of fuzzy models, which potentially have a
obtained from the fuzzy clustering in , a more classical very good approximative ability. However, their drawback is
approach may be used. If and the lack of any linguistic interpretation for the obtained rules
are fuzzy sets supposed to be obtained from because all the input variables have been considered global.
independent fuzzy clustering processes in the input and the Here, we will present some methods that provide more de-
output space, respectively, then one can wanted to assign a scriptive fuzzy models, but keeping an always needed approxi-
weight to each rule [each pair ] without considering mative efficiency. To do that we will consider
the information obtained by a fuzzy clustering in the product separately.
space of the input and output variables. This approach (in the same line of the works of Babuska et
Thus, we are going to consider that collections of and al. [2] and Sugeno et al. [41]), try to generate fuzzy sets in each
fuzzy sets on and , respectively, are obtained from of the domains of discourse of the variables [of course, from
fuzzy clustering of the input and output space from which the the fuzzy clusters found in ]. The main difference
system model is composed by a collection of fuzzy with the methods of the aforementioned authors is that we
rules of the form do not make any previous assumption about the structure of
the data. On the other hand, we are trying to obtain a first
If is then is
(14) approximation of the fuzzy model—not to adjust it (i.e., to
i quickly obtain an approximator the simplest way). To compare
this approach with the one presented in the previous section,
with and being fuzzy sets in and
let us consider that once we obtain the projections of the fuzzy
, respectively, and a weight or certainty value assigned
clusters in each domain we make the extensional hull of the
to the rule. If rules with Singleton consequent are considered,
fuzzy sets obtained and then propose to approximate them by
(14) is to be rewritten as
trapezoidal fuzzy sets. In this way, part of the approximative
If is then is power of the methods that work directly in is lost; also, it
(15) does not warrant obtaining a fuzzy partition of the data which
i
may cause a sparse rule base, but in constrast, we obtain a more
Different alternatives to assigning the weights of the fuzzy descriptive fuzzy model. Within this formulation we look for
rules arise depending on the operators used. A classical a collection of fuzzy rules:
one, using the – composition, is obtained from the
collection of data by means of If is and and is then is (19)

(16) from which an output value can be inferred for


each input to be
Once the weights are available, we can infer using the
simplified reasoning methods and the weights using the
expression
(20)

(17)
where is a -norm and the component in of the centroid
of the th fuzzy clusters of . Depending on the -norm
228 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 5, NO. 2, MAY 1997

used we can have different methods. If we consider recursive least-squares algorithm (or a stationary Kalman
filter), the grade of membership of the data to the antecedent
of the fuzzy rules is considered using directly the grade
of membership of the data to the fuzzy clusters found in
(21) . Moreover, as in this kind of model, we are searching
for local linear models present in the data; a modification we
have adopted is not only to use an FCM algorithm, but as an
initialization to a Gustafson–Kessel fuzzy clustering algorithm
Our interest in obtaining fuzzy rules with fuzzy sets in each
[17]. This algorithm is based in detecting hyperplane fuzzy
of the domain of discourse is not only because its descriptive
clustering and, in this way, it is adequate for detecting the
power, but also because, as we will see in Section IV-C, we
fuzzy partitions that better fulfill the assumption of fuzzy linear
can apply a genetic algorithm to these kind of rules to tune
models [2].
them.
In general, although it is not always true, this method tends
to obtain fuzzy models with a higher accuracy. In this sense
D. Fuzzy Clusters and Complex Models a natural questions arise, and it is why not to use always this
When we have to deal with complex system, the methods we method. The answer is simple: the computational complexity
have presented above with Singleton values in the consequence of this method makes it inadequate in a rapid-prototyping
of the fuzzy rules could have no sufficient approximative approach when the increase in performance of the fuzzy model
power. In these cases, we can adopt a rapid-prototyping obtained does not compensate for the complexity increase
approach although using a more elaborate fuzzy model. The in generating the model. Frequently, this is the case with
Sugeno–Takagi–Kang (TSK) approach [40], [42] can be an systems which have a simple or medium complexity, where
adequate model because this approach tries to decompose the it is more interesting to obtain a rapid model with the more
complicated input space into subspace and then approximate simple methods that will generate eficient fuzzy model, and
the system in each subspace by a simple linear regression then if the model is acceptable, it could be later tuned and
model. Several alternative approaches have been proposed to pruned using optimization tecniques like genetics algorithms,
reduce the complexity of the building process associated to gradient descent methods, or any other well-known fitting
these approach. Many of these approaches have proposed the techniques.
use of clustering techniques in this process (see, for example,
[47] and [49]).
In our case, we are going to use directly the information
IV. NUMERICAL EXAMPLES
of the fuzzy clusters in to obtain, by means of the
recursive least-squared method, the coefficients of the linear In this section we present the behavior of the core of
regression model that will be associated to each fuzzy rule. our framework IGOR—that is to say, the different methods
We can use the same hypotheses as in the previous methods proposed below and, consistently, the methodology cycle of
with respect to the antecedent fuzzy sets and to make the our proposal. We will consider four different examples. The
assumptions about a linear relation between antecedents in first two are not excessive complex systems, but are frequently
and consequent in . In this case, we can use a used to illustrated the methods presented in the literature. In
reasoning method and obtain fuzzy rules our case, we will use them to show the good behavior of
the simplest methods we have proposed. The examples will
If is then is be the well-known problem of the inverted pendulum and
the nonlinear system proposed in [32] and [41]. Then we
where will consider two more complex systems: the Box–Jenkins
gas furnace and the Zimmermann and Zysno model based
(22) on empirical data [55]. Because of the complexity of these
systems, it is necessary to leave the simple methods and
with and the inference process is carried out by consider, instead, the method we have noted as to
means of obtain relatively good results. In this way, we can compare
our rapid-prototyping approach with those presented in the
literature.
The comparison of the methods proposed in the paper with
(23) others (in [28], [32], [40], [41], and [47]) does not seek to
conclude that our methods are better than the other ones, but
that our methods bring solutions that are within an acceptable
range, in this way, obtaining a validation of them. In the
where the is any of the membership functions to fuzzy forthcoming discussion, it is shown how our methodolody,
sets in we have described up until now. faced with different types of problems, will help the decision-
One of the differences we adopt in this approach, is that maker plan which method and underlying approach will let
to obtain the coefficient of the linear consequent using the him reach a better solution.
DELGADO et al.: FUZZY CLUSTERING-BASED RAPID PROTOTYPING FOR FUZZY RULE-BASED MODELING 229

Fig. 1. Inverted pendulum.

TABLE I
INVERTED PENDULUM RULES
Fig. 2. Nonlinear system.

weight, 5 m in length, applying force to the pendulum gravity


center during a constant time of 10 ms was simulated obtaining
in this way a set of 348 triples , such that

The performance index considered to evaluate the obtained


fuzzy models will be the square-mean error
From the 348 data above, a subset of 280 was used by as
training set to re-identify the fuzzy model for the inverted
pendulum obtaining, in this way, seven rules again. The
(24) remaining 68 data was used to check the model, obtaining
a performance error [19].
where is the number of data, is the actual output, and
We do not assume any known partition to start the fuzzy
is the model output.
clustering on and with several clustering validity
criteria as those described in [5], [7], [10], and [11]; we
A. Inverted Pendulum determine that the most suitable number of elements in the
The inverted pendulum is a very usual example used to fuzzy partition of this product space (and thus the number of
check fuzzy modeling techniques, specially applied to fuzzy fuzzy rules) is seven or nine.
control (see Fig. 1). After identifying the system, we use the same 68 data as
The stated variables are the angle , the angular speed , and Herrera et al. [19] to check the obtained models. The methods
the control variable force . For every the objective have been applied to the case of seven and nine rules. The
is to determine the force that must be applied to the gravity results with nine rules are significant superior (in the order of
center of the pendulum during a constant time to place and/or 15%), although to compare our results with those obtained in
keep the pendulum in the vertical position. the literature in Table II, we only show the errors obtained with
On the assumption of (radian), the behavior of seven rules. In the case of method , using the validity
the pendulum may be described by a nonlinear differential measures, we have considered seven fuzzy clusters in and
equation that can be solved to integrate control policies. How- five fuzzy clusters in .
ever, since Yamakawa’s works [51], the superior efficiency of
fuzzy control is known. B. Nonlinear Static System
Our reference model will be Yamakawa’s [51] described by
Let us consider the following nonlinear static system pre-
a set of seven fuzzy rules, with the linguistic term set
sented in [32] and [41], with two inputs and , and a single
output
where and stand for negative and positive, respectivley, (25)
and where the suffixes and mean large, medium, and
small, respectively. The fuzzy linguistics rules are shown in We show a three-dimensional I/O graph of this system in
Table I. Fig. 2. From this systemequation, 50 I/O data are obtained.
To evaluate the performance of our methods, we have used In [41], using the following fuzzy clustering validity criterion
the training data set and the test data set proposed by Herrera over that output data
et al. [19]. These authors consider Yamakawa’s fuzzy model
with trapezoidal membership functions for the semantics of (26)
linguistic labels [29]. From this, a pendulum with 5 kg in
230 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 5, NO. 2, MAY 1997

TABLE II TABLE III


EXAMPLES/ERROR ERROR EVOLUTION WITH THE GENETIC ALGORITHM

they found that six was the optimal number of fuzzy clusters.
with the results of others methods proposed in literature.
In our case, using this same validity criterion over the fuzzy
However, as a reference, we can indicate that while in the
clustering of the product space of the input and output do-
case of the inverted pendulum the performance with method
mains, we found that five was the optimal number of fuzzy
is a great improvement, in the case of the non-
clusters, so we use these number to compare the results.
linear system we have not reached such a great improvement
Before any parameter identification, these authors obtain that
if we also take into account the increase in the computational
, using the method they proposed, which is in some
complexity due to the method.
ways similar with our approaches.
The errors obtained with the different methods proposed in
C. Tuning the Rules with a Genetic Algorithm
this paper are shown in Table II. In the case of method ,
using a similar approach as indicated in the example above, To show that it seems adequate to try to first generate a
we have considered five fuzzy clusters in and five in . fuzzy model in a rapid and easy way, in a subsequent step,
It is clearly shown within these results that although simple, this first model can be tuned; we present here a possible
the results obtained are good enough as a first approximation approach to optimize the generated rules. There are several
to the fuzzy model. Also, as can be seen in the results possible techniques that could be used. For example, we
presented in Table II, there is a significant difference between can use the fuzzy learning vector quantization (FLVQ) or
the performance of method and and the others. fuzzy Kohonen clustering networks (FKCN) [6] to adjust the
This seems natural (from our point of view) as method membership functions parameters (as is proposed in [43]) or
and have less approximative power than they need we can use some more classical tuning techniques such as
to work—in one case, with the relation between the clusters genetic algorithms (GA’s).
found in and using some form of weight to validate In [19], a tuning method using GA’s is presented. In that
this relation and, in the other case, using fuzzy sets in each paper, the GA tuning method fits the membership functions
of the input variables instead of fuzzy sets directly in . of the fuzzy rules based on the expert knowledge with the
However, as we have indicated previously, these two methods inference system and the defuzzification strategy considered.
have an advantage over the others as it is their great capacity GA’s are search algorithms that use operations found in
to generate descriptive fuzzy models. This is possible because natural genetics to guide the trek through a search space. GA’s
in the case of , we generate fuzzy clusters in are theoretically and empirically proven to provide robust
independently of any other consideration; therefore, we can search in complex spaces, giving a valid approach to problems
then generate fuzzy sets in each of the domain of discourse requiring efficient search [16].
considering the projection of the fuzzy relation defined by In Table III, we can see the performance error results with
the fuzzy cluster in each of the input variables. Hence, as method when this GA is applied to the fuzzy model
happens with methods , we can later use some linguistic of the inverted pendulum. The performance expression is the
approximation technique to obtain a qualitative fuzzy model same as in (24).
of the system [41]. Our results with the inverted pendulum show that the
Moreover, as can be seen in Table II, the methods model obtained by the proposed methods is a good first
and have a better behavior than the other methods, approximation, which also may be used as a good initialization
especially , which seems to be due to the use of the for a tuning process. As can be seen, the resultant fuzzy model
exponential membership function and the fuzzy covariance. with nine rules has a performance similar to the one obtained
However, during the simulations we have performed, we have using the method , with the additional advantage that
observed that this good behavior is highly dependent of the in case of method , we have a collection of fuzzy rules
fuzzy partition obtained with the fuzzy clustering and, in this with fuzzy sets in each of the input and output space, so it
way, is relatively sensitive. In the case of method , its can be easily used to generate a collection of linguistic fuzzy
great virtue is its extreme simplicity with a relatively good rules using a linguistic approximation technique.
performance.
Finally, we want to point out that in these two examples, D. Box–Jenkins’ Gas Furnace
we have not shown the results considering the method Box–Jenkins’ gas furnace is a famous example of system
because, as the two examples are rather simple systems, the identification. The well-known Box–Jenkins’ data set consists
results obtained with the more simple methods ( and of 296 I/O observations, where the input is the rate
) are sufficient, as can be deduced from the comparison of gas flow into a furnace and the output is the
DELGADO et al.: FUZZY CLUSTERING-BASED RAPID PROTOTYPING FOR FUZZY RULE-BASED MODELING 231

TABLE IV , trying in this way to capture most of the information about


ERRORS IN LITERATURE the behavior or tendencies detected on the data to obtain a
better approximation.
The use of a fuzzy clustering algorithm such as FCM (in
the more simple methods) and the combination of FCM and
Gustafson–Kessel algorithm (in the more complex methods)
have let us describe the relation between input and output
varibles and, in this way, obtain a collection of fuzzy rules
that can approximate systems with great acccuracy by means
of collection of simple methods.
The most remarkable advantage of these methods is that
they have demostrated their utility to be used in a rapid-
TABLE V
ERRORS IN LITERATURE prototyping methodology whose objective is to obtain a first
model that could be good enough so it can serve as a first
approximation of fuzzy model for systems; this could be
subsequently be purged and tuned to obtain the final fuzzy
model. Within this methodology, we have also shown that
these methods can be integrated with other techniques such
as GA’s.
Although the results we have obtained with the different
concentration in the outlet gases. Since the process is dynamic, methods in the examples are quite exciting, we can point out
we can consider differents values in the time of the variables. that in case of simple systems, the results obtained with the
In Table IV, we indicate different fuzzy models proposed in more simple methods ( and ) are sufficient, as
the literature, showing the number of variables chosen in each can be deduced from the comparison with the results of others
case, the number of rules, and the performance of the models. methods proposed in literature. However, when we have to
To compare our results, we make a fuzzy clustering using the deal with complex systems modeling method, shows
296 data points, considering from six to four clusters, as in the better performance, but with a computational complexity
Sugeno et al. [41], Yoshinari et al. [52], and Wang et al. [47], increase. In any case, we have shown that the methods can be
obtaining better results in the case of four clusters. used alone to obtain a first approximation to the fuzzy model.
We should remark that except for the Sugeno and Tanaka Finally, we must remark that our approach seeks to pro-
model [40], we obtain a better performance with a number of vide the decision maker with a range of easily management
rules remarkably smaller in comparison with the classical ap- methods among which he can use the one better adjusted to
proaches. Sugeno and Tanaka [40] provide less error, although his requirements. In this way, depending on his interest on
with more variables and more computational complexity. either a qualitative (descriptive) fuzzy model or a quantitative
(approximative) one, or depending on the kind of emphasis
E. Zimmermann and Zysno Empirical Data he want to put on the accurate capacities of this first model,
Finally, in this section we applied our method to the the user can use different methods to test alternative possible
empirical data of Zimmermann and Zysno [55] that other fuzzy models.
authors (i.e., Dyckhoff and Pedrycz [13], Krishnapuram and The work we present here is part of a more general study we
Lee [25], or Langari and Wang [28]) have also used to test are doing about the integration of different techniques for fuzzy
their algorithms. rules generation in an environment we have called IGOR.
In Table V, we can see the performance [using (24)] ob- IGOR pretends to be a general framework for fuzzy modeling
tained with the methods suggested in the literature and with where different techniques of supervised and unsupervised
our method , considering four rules (as Langari et al. learning, together with different tuning methods, could be
[28] have done). integrated.
Again, even in this complex case, the improvement in the We are also working on other techniques based on the
model performance is clear. notion of matching between the fuzzy clusters obtained in
the differents domains , , and and the use of
V. CONCLUSIONS AND FUTURE TRENDS some measures like fuzzy frequency introduced by Delgado
This paper presents different alternatives to a rapid- and Gonzalez [10], and how the results we presented in [9]
prototyping approach to the fuzzy modeling by means of (about the use of hierarchical clustering) can be used to make a
relatively simple methods, based on the use of fuzzy clustering, tuning of the fuzzy model obtained with the methods described
that have shown good behavior in different contexts. in this paper.
Starting from the results appearing up until now in the
literature of fuzzy clustering, we have reformulated the way ACKNOWLEDGMENT
fuzzy clusters are used, defining different methods whose use The authors would like to thank the anonymous reviewers
will be determined by the users’ perspective in the fuzzy for their valuable comments on the manuscript and Professor
modeling. These methods directly characterize fuzzy sets in J. Bezdek for his helpful comments and suggestions.
232 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 5, NO. 2, MAY 1997

REFERENCES [30] E. Mamdani, “Advances in the linguistic synthesis of fuzzy controllers,”


Int. J. Man-Machine Std., vol. 8, pp. 669–678, 1976.
[1] S. Araki, H. Nomura, I. Hayashi and N. Wakami, “A self-generating [31] M. Mizumoto, “Method of fuzzy inference suitable for fuzzy control,”
method of fuzzy inference rules,” Fuzzy Eng. Toward Human Friendly J. Soc. Instrum. Contr. Engineers, vol. 58, pp. 959–963, 1989.
Syst., vol. IV, pp. 1047–1058, 1991. [32] H. Nakanishi, I. B. Turksen, and M. Sugeno, “A review and comparison
[2] R. Babuska and H. B. Verbruggen, “Applied fuzzy modeling,” in Proc. of six reasoning methods,” Fuzzy Sets Syst., vol. 57, pp. 257–294, 1992.
IFAC Congress Artificial Intell. Real Time Contr., Valencia, Spain, 1994, [33] J. Nie and D. A. Linkens, “Learning control using fuzzified self-
pp. 61–66. organizing radial basis function network,” IEEE Trans. Fuzzy Syst., vol.
[3] , “A new identification method for linguistic fuzzy models,” in 1, pp. 280–287, Nov. 1993.
Proc. FUZZ-IEEE/IFES’95, Yokohama, Japan, Mar. 1995, pp. 897–904. [34] H. Nomura, I. Hayashi, and N. Wakami, “A self-turning method of fuzzy
[4] H. R. Berenji and P. S. Khedkar, “Clustering in product space for fuzzy control by descent method,” Int. Fuzzy Syst. Assoc., Brussels, Belgium,
inference,” in Proc. IEEE Int. Conf. Fuzzy Syst., San Francisco, CA, 1991, pp. 155–158.
1993, pp. 1402–1407. [35] N. R. Pal, J. C. Bezdek, and E. C-K Tsao, “Generalized clustering
[5] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algo- networks and kohonens self-organizing scheme,” IEEE Trans. Neural
rithms. New York: Plenum, 1981. Networks, vol. 4, pp. 549–557, July 1993.
[6] J. C. Bezdek, E. C. Tsao, and N. R. Pal, “Fuzzy Kohonen clustering [36] W. Pedrycz, “An identificaction algorithm in fuzzy relational systems,”
networks, 1st Proc. IEEE Conf. Fuzzy Syst., San Diego, CA, 1992, pp. Fuzzy Sets Syst., vol. 13, pp. 153–167, 1984.
1035–1043. [37] X. Peng and P. Wang, “On generating linguistic rules for fuzzy models,”
[7] J. C. Bezdek and S. K. Pal, Eds., Fuzzy Models for Pattern Recognition Lectures Notes Comput. Sci. Uncertainty Intell. Syst., vol. 313, pp.
New York: IEEE Press, 1992. 185–192, 1988.
[8] S. L. Chiu, “A cluster estimation method with extension to fuzzy model [38] P. K. Simpson and G. Jahns, “Fuzzy min-max neural networks for
identification,” in Proc. IEEE Int. Conf. Fuzzy Syst., Orlando, FL, 1994, function approximation,” in Proc. IEEE Int. Conf. Fuzzy Syst., San
pp. 1240–1245. Francisco, CA, 1993, pp. 1967–1972.
[9] M. Delgado and A. González, “A frequency model in a fuzzy environ- [39] S. K. Sin and R. J. P. deFigueiredo, “Fuzzy system desing through
ment,” Int. J. Approx. Reasoning, vol. 11, pp. 159–174, 1994. fuzzy clustering and optimal predefuzzification,” in Proc. IEEE Int.
[10] M. Delgado, M. A. Vila and A. F. Gómez-Skarmeta, “An unsuper- Conf. Fuzzy Syst., San Francisco, CA, 1993, pp. 190–195.
vised learning procedure to obtain a classification from a hierarchical [40] M. Sugeno and K. Tanaka, “Successive identification of a fuzzy model
clustering,” in 2nd Proc. Eur. Congress Fuzzy Intell, Technol, Software and its applications to prediction of a complex system,” Fuzzy Sets
Computing, Aachen, Germany, Sept. 1994, pp. 528–533. Syst., vol. 42, pp. 315–334, 1991.
[11] M. Delgado, A. F. Gómez-Skarmeta and M. A. Vila, “Hierarchical [41] M. Sugeno and T. Yasukawa, “A Fuzzy-logic-based approach to qual-
clustering to validate fuzzy clustering,” in Proc. FUZZ-IEEE/IFES’95, itative modeling,” IEEE Trans. Fuzzy Syst., vol. 1, pp. 7–31, Feb.
Yokohama, Japan, 1995, pp. 1807–1812. 1993.
[12] M. Delgado, A. F. Gómez-Skarmeta and F. Martín, “Using fuzzy clusters [42] T. Takagi and M Sugeno, “Fuzzy identification of systems and its
to model fuzzy systems in a descriptive approach,” in Proc. Inform. applications to modeling and control,” IEEE Trans. Syst., Man, Cybern.,
Processing Management Uncertainty Knowledge-Based Syst. IPMU’96, vol. SMC-15, pp. 116–132, 1985.
Granada, Spain, July 1996, pp. 563–568. [43] P. Vourimaa, “Fuzzy self-organizing map,” Fuzzy Sets Syst., vol. 66, pp.
[13] H. Dyckhoff and W. Pedrycz, “Generalized means as model of com- 223–231, 1994.
pensative connectives,” Fuzzy Sets Syst., vol. 14, pp. 143–154, 1984. [44] L. Wang and J. M. Mendel, “Back-propagation fuzzy system as nonlin-
[14] D. Dubois and H. Prade, “Gradual inference rules in approximate ear dynamic system identifiers,” in Proc. IEEE Int. Conf. Fuzzy Syst.,
reasoning,” Inform. Sci., vol. 61, pp. 103–122, 1992. San Diego, CA, 1992, pp. 1409–1416.
[15] I. Gath and A. Geva, “Unsupervised optimal fuzzy clustering”, IEEE [45] , “Generating fuzzy rules by learning from examples,” IEEE
Trans. Pattern Anal. Machine Intell., vol. 11, pp. 773–781, July 1989. Trans. Syst., Man, Cybern., vol. 22, pp. 1414–1427, 1992.
[16] D. E. Goldberg, Genetics Algorithms in Search, Optimization, and [46] L. Wang, “Training of fuzzy logic systems using nearest neighborhood
Machine Learning. Reading, MA: Addison–Wesley, 1989. clustering,” in Proc. IEEE Int. Conf. Fuzzy Syst., San Francisco, CA,
[17] D. E. Gustafson and W. C. Kessel, “Fuzzy clustering with a fuzzy 1993, pp. 13–17.
covariance matrix,” Proc. IEEE CDC, San Diego, CA, Jan. 1979, pp. [47] L. Wang and R. Langari, “Complex systems modeling via fuzzy logic,”
761–766. IEEE Trans. Syst., Man, Cybern., vol. 26, pp. 100–106, Feb. 1996
[18] F. Guély and P. Siarry, “A centered formulation of Takagi–Sugeno rules [48] R. T. Yager and D. P. Filev, “Generation of fuzzy rules by mountain
for improved learning efficiency,” Fuzzy Sets Syst., vol. 62, pp. 277–285, clustering,” Tech. Rep. MII-1318, 1993.
1994. [49] , “Template based fuzzy systems modeling,” Tech. Rep. MII-
[19] F. Herrera, M. Lozano, and J. L. Verdegay, “Tuning fuzzy logic 1310, 1993.
controllers by genetic algorithms,” Int. J. Approx. Reasoning, vol. 12, [50] , “Unified structure and parameter identification of fuzzy models,”
pp. 299–315, 1995. IEEE Trans. Syst., Man, Cybern,, vol. 23, pp. 1198–1205, July 1993.
[20] H. Ichihashi, “Iterative fuzzy modeling and a hierarchical network,” in [51] T. Yamakawa, “Stabilization of an inverted pendulum by a high-speed
Int. Fuzzy Syst. Assoc., Brussels, Belgium, 1991, pp. 49–51. fuzzy logic controller hardware system,” Fuzzy Sets Syst., vol. 32, pp.
[21] H. Ishibuchi, K. Nozaki, N. Yamamoto, and H. Tanaka, “Selecting fuzzy 161–180, 1989.
[52] Y. Yoshinari, W. Pedrycz, and K. Hirota, “Construction of fuzzy models
if-then rules for classification problems using genetic algorithms,” IEEE
through clustering techniques,” Fuzzy Sets Syst., vol. 54, pp. 157–165,
Trans Fuzzy Syst., vol. 3, pp. 260–270, Aug. 1995
[22] N. B. Krayiannis, “MECA: Maximun entropy clustering algorithm,” in 1993.
[53] L. A. Zadeh, “Fuzzy sets,” Inform. Contr., vol. 8, pp. 338–353, 1965.
Proc. IEEE Int. Conf. Fuzzy Syst., Orlando, FL, 1994, pp. 630–1648.
[54] , “Outline of a new approach to the analysis of complex systems,”
[23] F. Klawonn and R. Kruse, “Equality relations as a basis for fuzzy
IEEE Trans. Syst., Man, Cybern,, vol. SMC-1, pp. 28–44, 1973.
control,” Fuzzy Sets Syst., vol. 54, pp. 147–156, 1993.
[55] H.-J. Zimmermann and P. Zysno, “Latent connective in human decision
[24] L. T. Koczy and K. Hirota, “Ordering, distance and closeness of fuzzy
making,” Fuzzy Sets Syst., vol. 4, pp. 37–51, 1980.
sets,” Fuzzy Sets Syst., vol. 59, pp. 281–293, 1993.
[25] R. Krishnapuram and J. Lee, “Fuzzy connective-based hierarchical
aggregation networks for decision making,” Fuzzy Sets Syst., vol. 46,
pp. 11–27, 1992.
[26] R. Krishnapuram and J. M. Keller, “A possibilistic approach to cluster-
ing,” IEEE Trans. Fuzzy Syst., vol. 1, pp. 98–110, May 1993.
[27] R. Krishnapuram, “Generation of membership functions via possibilistic
clustering,” in Proc. IEEE World Congress Computat. Intell., Orlando,
FL, 1994, vol. 2, pp. 902–908.
[28] R. Langari and L. Wang, “Fuzzy models, modular networks, and hybrid
learning,” Fuzzy Sets Syst., vol. 79, pp. 141–150, 1996
[29] C. Liaw and J. Wang, “Design and implementation of a fuzzy controller
for a high performance induction motor drive,” IEEE Trans. Syst., Man, M. Delgado, photograph and biography not available at the time of publi-
Cybern., vol. 21, pp. 921–929, 1991 cation.
DELGADO et al.: FUZZY CLUSTERING-BASED RAPID PROTOTYPING FOR FUZZY RULE-BASED MODELING 233

Antonio F. Gómez-Skarmeta was born in Santi- F. Martín, photograph and biography not available at the time of publication.
ago, Chile. He received the B.S. (honors) degree
from the University of Murcia, Spain, the M.S.
degree from the University of Granada, Spain, and
the Ph.D. degree from the University of Murcia,
all in computer science, in 1988, 1991, and 1995,
respectively.
From 1986 to 1992, he was an Assistant Professor
in the Department of Computer Science, University
of Murcia. Since 1993 he has been a Professor at
the same department. His research interests include
fuzzy control, fuzzy and linguistic modeling, intelligent cooperative systems,
and image analysis and processing.
Dr. Gómez-Skarmeta is an active member of the Spanish Fuzzy Logic and
Technologies Association (FLAT).

You might also like