
Comparative Analysis of Regression and Artificial Neural Network Models for Wind Turbine Power Curve Estimation

Shuhui Li
Department of Electrical Engineering and Computer Science,
Texas A&M University-Kingsville,
Kingsville, TX 78363
e-mail: shuhui.li@tamuk.edu

Donald C. Wunsch
Department of Electrical and Computer Engineering,
University of Missouri-Rolla,
Rolla, MO 65409

Edgar O'Hair
Michael G. Giesselmann
Department of Electrical Engineering,
Texas Tech University,
Lubbock, TX 79409

This paper examines and compares regression and artificial neural network models used for the estimation of wind turbine power curves. First, characteristics of wind turbine power generation are investigated. Then, models for turbine power curve estimation using both regression and neural network methods are presented and compared. The parameter estimates for the regression model and training of the neural network are completed with the wind farm data, and the performances of the two models are studied. The regression model is shown to be function dependent, and the neural network model obtains its power curve estimation through learning. The neural network model is found to possess better performance than the regression model for turbine power curve estimation under complicated influence factors. [DOI: 10.1115/1.1413216]

Contributed by the Solar Energy Division of THE AMERICAN SOCIETY OF MECHANICAL ENGINEERS for publication in the ASME JOURNAL OF SOLAR ENERGY ENGINEERING. Manuscript received by the ASME Solar Energy Division, January, 2001; final revision, July, 2001. Associate Editor: D. Berg.

1 Introduction

During the last decade, models have emerged [1-5] to estimate and predict the power produced by wind farms. To maximize the use of wind-generated electricity when connected to the electric grid, it is necessary to be able to estimate and predict the power production of a wind turbine. However, this power production can be influenced by many factors and usually fluctuates rapidly, imposing considerable difficulties on the management of combined electric power systems. Several different techniques have been presented to estimate and predict the highly variable energy production. Two typical models are regression models [1,2] and artificial neural networks [3-5]. In this paper, we compare regression and artificial neural network models for wind turbine power curve estimation using data from the Central and South West wind farm near Fort Davis, Texas.

2 The Wind Farm and the Wind Power Generation

The Fort Davis wind farm consists of 12 turbines and two meteorological towers (met. towers) (Fig. 1). Data received from the wind farm can be divided into two categories. The first contains data from the two met. towers, such as wind velocities and directions measured at three elevations (10, 30, and 40 m). The second contains information about turbine power generation, such as average power outputs, voltages, and currents. The two met. towers marked in Fig. 1 are the sites for measurement of wind speed and direction. Each dotted circle is the location of a wind turbine.

Turbine power production depends on the energy contained in the wind. The basic measuring unit of the energy contained in the wind is wind power density [6], or power per unit of area normal to the wind azimuth, calculated as Eq. (1), where $P_W$ is wind power density (W/m$^2$), $\rho$ is air density (kg/m$^3$), and $V$ is the horizontal component of the mean free-stream wind velocity (m/s).

$$ P_W = 0.5\, \rho V^3 \tag{1} $$

However, both wind velocity and air density are generally not constant. The hub height of the turbine is 40 m above the ground. The industry standard is to relate the power to the hub height wind velocity [7]. Such a relationship implies that the velocity at the hub height and the velocity profile are known, and that the velocity profile does not change. The velocity profile is defined as the difference in velocity as a function of height from the bottom to the top of the turbine blade and is another factor which can influence turbine power production [8,9]. The hub height velocity and the velocity profile are measured by the met. towers. However, due to the limited number of the met. towers, the variable terrain, the turbines distributed over a wide range on the wind farm, and wind dynamics, the actual wind velocity and profile for each turbine are usually quite different from those obtained from the met. towers. These are some of the main reasons that the measured turbine power production versus meteorological tower wind speed does not fall on the manufacturer's power curve, as shown by Fig. 2. In the figure, the line represents the manufacturer's warranted estimated output power curve for the wind turbine with 500 kW rated power, which has been adjusted especially for the Fort Davis wind farm to account for the difference in altitude with respect to sea level. The dots represent the measured wind turbine power output for a met. tower 10-minute average wind velocity. Which met. tower velocity to use is selected based on which direction the wind comes from, i.e., if the wind comes from the east, the measured wind speed from the east tower is used; if the wind comes from the west, the measured wind speed from the west tower is chosen. In Fig. 2, the large difference of turbine power production at the same wind speed, as well as high power productions at low wind speeds and low power productions at high wind speeds, implies that the wind at the turbine can be quite different from the wind at the met. towers.

The air density $\rho$ in Eq. (1) also influences the energy contained in the wind and therefore turbine power production. However, $\rho$ has less influence on turbine power production than the wind speed because the dynamic range of $\rho$ is usually small and wind power is proportional to the cube of wind speed. In addition to the above factors, wind power production is also affected by other factors such as seasons of a year, time of day [1], and wind fluctuation within a certain time period. In the following comparison between neural networks and regression models, the only factors considered are the 40 m wind speeds and wind directions from the two met. towers. Introducing other factors would make the specification of a function for a regression model quite difficult.
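As a rough numerical illustration of Eq. (1) in Section 2, the sketch below evaluates wind power density and compares the effect of a 10% change in wind speed with a 10% change in air density. The function name and the sample values (an air density of 1.0 kg/m^3 and a wind speed of 10 m/s) are assumptions chosen for illustration; they are not measurements from the Fort Davis wind farm.

```python
def wind_power_density(air_density: float, wind_speed: float) -> float:
    """Wind power density of Eq. (1): P_W = 0.5 * rho * V**3, in W/m^2.

    air_density -- rho in kg/m^3
    wind_speed  -- V, horizontal mean free-stream wind speed in m/s
    """
    return 0.5 * air_density * wind_speed ** 3


if __name__ == "__main__":
    rho, v = 1.0, 10.0  # hypothetical sample values, not site data
    base = wind_power_density(rho, v)
    # Raise each input by 10% in turn and compare the relative change.
    gain_from_speed = wind_power_density(rho, 1.1 * v) / base
    gain_from_density = wind_power_density(1.1 * rho, v) / base
    print(f"P_W = {base:.1f} W/m^2")
    print(f"+10% speed   -> x{gain_from_speed:.3f}")
    print(f"+10% density -> x{gain_from_density:.3f}")
```

Because of the cubic term, a 10% rise in wind speed raises the power density by about 33%, while a 10% rise in air density raises it by only 10%, which is consistent with the remark that air density has less influence than wind speed.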

Journal of Solar Energy Engineering Copyright 2001 by ASME NOVEMBER 2001, Vol. 123 327

Downloaded From: http://solarenergyengineering.asmedigitalcollection.asme.org/ on 01/29/2016 Terms of Use: http://www.asme.org/about-asme/terms-of-use


Fig. 1 Central and South West renewable project small wind farm, Fort Davis, Texas

Fig. 2 Wind power generation vs. wind speed (turbine No. 6, March 1996)

3 Regression Model for Wind Turbine Power Estimation

3.1 Prediction by Regression Model. Regression models quantitatively describe the variability among the observations by partitioning an observation into two parts [10]. The first part of this decomposition is the predicted portion having the characteristic that can be ascribed to all the observations considered as a group in a parametric framework. The remaining portion, called the residual, is the difference between the observed and the predicted values and must be ascribed to unknown sources. This can be expressed as

$$ y_i = f(x_i, \beta) + \varepsilon_i, \quad i = 1, 2, \ldots, n \tag{2} $$

where $n$ is the number of the observations, $y_i$ is the $i$th observation, $x_i = (x_{1i}, x_{2i}, \ldots, x_{ki})$ is the predictor variable vector related to observation $y_i$, $\beta = (\beta_0, \beta_1, \ldots, \beta_p)$ is the parameter vector, and $\varepsilon_i$ is the error associated with the $i$th observation. The function $f$ is estimated by fitting a polynomial or other type of function. Fitting refers to calculating values of the parameters from a set of data. Usually, the estimate $\hat{\beta}$, a least squares estimate of $\beta$, tries to minimize the error sum of squares shown by Eq. (3).

If function $f(x_i, \beta)$ is linear, the regression model can be expressed as Eq. (4). This can be written as a matrix Eq. (5), where $Y$ is an $n$-dimensional vector and $X$ is an $n \times (p+1)$ matrix. In case of the estimated regression coefficient $\hat{\beta}$, the predicted values are then calculated by multiplying each row in the $X$ matrix by the $\hat{\beta}$ column, that is, $\hat{Y} = X\hat{\beta}$. The least squares estimate of $\beta$ is the solution to Eq. (6), and only one function solving step is needed to get the solution. When $f$ in Eq. (2) is a polynomial, a linear representation of Eq. (4) can still be obtained, but the number of the columns of $X$ will be larger than the number of the predictor variables $x_{ji}$ ($j = 1, 2, \ldots, k$).

When $f$ is a nonlinear function, linearization (Taylor expansion) of $f$ with respect to parameters [10,11] is required for Eq. (7), where $\beta_0 = (\beta_{00}, \beta_{01}, \ldots, \beta_{0p})$ are initial values for parameter $\beta$, so that techniques for Eqs. (3)-(6) can be used. These initial values may be intelligent guesses or preliminary estimates based on whatever information is available; they will be iteratively improved.

$$ S(\beta) = \sum_{i=1}^{n} \left[ y_i - f(x_i, \beta) \right]^2 \tag{3} $$

$$ y_i = \sum_{j=0}^{p} x_{ji} \beta_j + \varepsilon_i, \quad x_{0i} = 1 \tag{4} $$

$$ Y = X\beta + \varepsilon \tag{5} $$

$$ (X^T X)\, \hat{\beta} = X^T Y \tag{6} $$

$$ f(x_i, \beta) \approx f(x_i, \beta_0) + \sum_{j=0}^{p} \left[ \frac{\partial f(x_i, \beta)}{\partial \beta_j} \right]_{\beta = \beta_0} (\beta_j - \beta_{0j}) \tag{7} $$

3.2 Regression Model for Wind Turbine Power Estimation. As stated in Section 2, there are many factors which can affect wind turbine power generation. However, we will only include 40 m wind velocity and direction measurements from the two met. towers in our modeling and comparative analysis.

The following rules have been used for the regression model: (a) A polynomial $f_i()$ is chosen as the main representation of the regression model; (b) Wind speeds are taken as the most important



predictor variables in the polynomial; (c) The wind direction influence on turbine power production is introduced as a product with wind speed; and (d) Topographic influence on turbine power production is obtained by fitting different regression models for different turbines.

The overall regression model for each turbine is shown by Eq. (8), where $i$ is related to a specific turbine, and $V_1$, $V_2$, $d_1$, and $d_2$ are 40 m wind speed and wind direction measurements from the two met. towers. Function $g()$ in Eq. (8) is a transform function for wind directions (Eq. (9)) to reflect high wind directions. The $M_1$ and $M_2$ in Eq. (9) are means corresponding to two high wind directions computed for a year for wind coming from the northwest and east, respectively (Fig. 3). The product of wind speed with a transformed wind direction (Eq. (9)) in the regression model implies that high wind mainly comes from certain directions and also that high power generation requires high wind speed.

$$ P_i = f_i(V_1, V_2) + \beta_{1i} V_1\, g(d_1) + \beta_{2i} V_2\, g(d_2), \quad i = 1, 2, \ldots, 12 \tag{8} $$

$$ g(x) = 0.6\, e^{-\frac{5}{3} \times 10^{-3} (x - M_1)^2} + 0.65\, e^{-10^{-3} (x - M_2)^2 / 10} \tag{9} $$

Fig. 3 Wind rose diagram on Fort Davis wind farm (Roxal, 40 m level, 5/1/94 to 4/30/95). It shows probability distribution of wind directions in a circle.

4 Neural Network Model for Wind Turbine Power Estimation

4.1 Multilayer Perceptron (MLP) Network. A multilayer perceptron network (Fig. 4) has three distinctive characteristics [12]. First, the network consists of a set of source nodes that constitute the input layer, one or more layers of hidden neurons, and the output layer. Second, the model of each neuron in the MLP network includes a differentiable nonlinearity at the output end. A commonly used form of nonlinearity that satisfies this requirement is a sigmoidal nonlinearity defined by the logistic function of Eq. (10), where $v_j$ is the net internal activity level of neuron $j$, and $y_j$ is the output of the neuron. Third, the network exhibits a high degree of connectivity determined by the weights of the network. A change in the connectivity of the network requires a change in the population of network weights.

$$ y_j = \frac{1}{1 + \exp(-v_j)} \tag{10} $$

Fig. 4 Structure of the multilayered perceptron network with single hidden layer (Node 0 is bias).

MLPs have been applied successfully to solve many difficult and diverse problems by training them in a supervised manner. Experiential knowledge for an MLP network is acquired by the network through the training process and stored in the network weights after it is trained [12,13]. It is through the combination of the characteristics of the MLP network together with the ability to learn from experience through training that the MLP derives its computing power.

An MLP network can be used for a function approximation problem in which the inputs to the network are equivalent to the predictor variables in the regression model of Eq. (2) and the output of the network is equivalent to the predicted value. For a given problem, there is a cost function T [12], which is similar to the error sum of squares of Eq. (3) for the regression model, as the measure of training set learning performance. The objective of the learning process is to adjust the weights of the network so as to minimize T. A highly popular training algorithm known as the backpropagation algorithm [13] is generally used to adjust the network weights until a stop criterion is reached.

4.2 Neural Network Model for Wind Turbine Power Estimation. We configure a neural network for each turbine. The input patterns to the network are 40 m wind velocities and directions obtained from the two met. towers. The network output is the estimated power generated corresponding to each individual input pattern. Three additional steps are introduced to improve learning. First, the wind velocity (usually from 0 mph to 50 mph) is normalized into a range from 0 to 4 through a transformation function. This transform function can also reflect the effect of the power output limitation applied on a wind turbine when the wind is high [5]. Second, Eq. (9) is used to preprocess the wind direction (0 to 360 degrees) into a more limited range and to increase the influence of high wind directions on turbine power production. This preprocessing for the input patterns enables the neural network to learn faster (i.e., fewer learning iterations) and produce a smaller difference between the predicted and the measured outputs for each turbine [5]. Third, for comparison, we also considered the case that the total inputs to a node are not only a linear combination of the components $x_j$, but also include high-order interactions represented by their products $x_j x_k$, triplets $x_j x_k x_l$, quadruplets, etc., which forms a feedforward high-order neural network [14,15]. To facilitate the comparison of the high-order neural network with the regression model of Eq. (8), the inputs presented to the high-order network used in this paper contain all the items represented by Eq. (8). For example, for the comparison of the high-order neural network with the $n$th order regression model, the network inputs include 1st, 2nd, ..., $n$th order wind speeds, and two products of wind speeds with preprocessed wind directions.
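To make the description of Sections 4.1 and 4.2 concrete, the following is a minimal sketch of a single-hidden-layer perceptron with the logistic nonlinearity of Eq. (10), trained by plain batch backpropagation. It is not the authors' implementation. The layer sizes (4 inputs, 8 hidden nodes, 1 output), the logistic output unit, the learning rate, the weight initialization, and the synthetic data are all assumptions made for illustration; the real inputs would be the preprocessed met. tower wind speeds and directions.

```python
import numpy as np


def logistic(v):
    """Logistic nonlinearity of Eq. (10): y = 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + np.exp(-v))


def init_mlp(n_in=4, n_hidden=8, n_out=1, seed=0):
    """Small random weights; row 0 of each matrix holds the bias (node 0)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.5, size=(n_in + 1, n_hidden)),
        "W2": rng.normal(0.0, 0.5, size=(n_hidden + 1, n_out)),
    }


def with_bias(a):
    """Prepend a column of ones so the bias is handled as an extra input."""
    return np.hstack([np.ones((a.shape[0], 1)), a])


def forward(params, x):
    """Return biased inputs, biased hidden outputs, and network outputs."""
    xb = with_bias(x)
    hb = with_bias(logistic(xb @ params["W1"]))
    y = logistic(hb @ params["W2"])
    return xb, hb, y


def train(params, x, t, lr=0.5, epochs=500):
    """Batch backpropagation on the mean squared error; returns the loss history."""
    n = x.shape[0]
    history = []
    for _ in range(epochs):
        xb, hb, y = forward(params, x)
        err = y - t
        history.append(0.5 * float(np.mean(err ** 2)))
        # Output-layer local gradient: error times logistic derivative y(1 - y).
        delta_out = err * y * (1.0 - y)
        # Hidden-layer local gradient, propagated back through W2 (bias row excluded).
        h = hb[:, 1:]
        delta_hid = (delta_out @ params["W2"][1:].T) * h * (1.0 - h)
        # Both gradients were computed with the old weights; now update.
        params["W2"] -= lr * (hb.T @ delta_out) / n
        params["W1"] -= lr * (xb.T @ delta_hid) / n
    return history


def make_demo_data(n=200, seed=1):
    """Synthetic 4-input patterns with a smooth target in (0, 1); not wind farm data."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=(n, 4))
    t = logistic(4.0 * (x[:, :1] - 0.5) + 2.0 * x[:, 1:2] * x[:, 2:3])
    return x, t


if __name__ == "__main__":
    x, t = make_demo_data()
    net = init_mlp()
    losses = train(net, x, t)
    print(f"loss: {losses[0]:.5f} -> {losses[-1]:.5f}")
```

The loop makes the contrast with regression visible: the weights are adjusted over many passes until a stop criterion is met (here simply a fixed number of epochs), rather than being obtained in one solving step.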



5 Comparative Analysis of Regression Model to Neural Network Model

The parameter estimates of $\beta$ for the regression and the weight adjustments for the neural network training try to minimize the error sum of squares. However, the regression model requires the form of the function $f$ in Eq. (2) to be predefined explicitly, making the model function dependent but easy to evaluate. On the other hand, the neural network achieves the ability of function approximation through learning from data, making it data dependent.

There are also differences in obtaining parameter estimates of $\beta$ for the regression model and weight adjustments for the neural network. For the regression model, a matrix representation of Eq. (5) is usually obtained, and then one function solving step is needed to get the least squares estimate (Section 3.1). For the neural network, the network weights are usually adjusted many times (perhaps thousands of times) until a stop criterion is reached.

Some evident relationships can also be seen between regression models and neural networks when the sigmoid nonlinear activation functions of neurons are reduced to linear functions. A general first-order neural network will be equivalent to a multi-variable linear regression model, and a high-order neural network will be equivalent to a multi-variable polynomial regression model. Thus, the nonlinear activation function is an important factor in a neural network for complicated function approximation problems.

Figures 5 and 6 compare some of the properties between regression and neural network models. This example uses measured wind speeds from the west met. tower to predict the wind speeds of the east met. tower using regression and neural network models. A data set of 1500 patterns in March 1996 is used for both the parameter estimates of the regression model and the training of the neural network model. For the regression model, the relation obtained between input and output (predicted value) is a straight line when a linear function is predefined (Fig. 5), reflecting the function dependent characteristics of the model. However, the neural network achieves the input-output relationship through learning. From the regression point of view, this relationship or equivalent function is more complicated (Fig. 5).

But the neural network model is more data dependent. In Fig. 5, the parameter estimates and the training of the network weights are based on the first 1500 patterns of data from March 1996. This selection of patterns has much higher data density in low wind than in high wind, making the neural network bias its weights more toward the low wind data [16]. However, when training data are selected equally from different wind speed sections, the neural network model gets better performance for both low and high wind speeds even for test data from April 1996, as shown by Fig. 6.

Fig. 5 Using wind speed of met. 18 to estimate that of met. 19 (March 1996)

Fig. 6 Using wind speed of met. 18 to estimate that of met. 19 (April 1996): scatter plot of the measured wind speeds, the prediction by a linear regression model, and the prediction by neural network model

6 Comparisons of Wind Turbine Power Estimation using Regression and Neural Network Models

For wind turbine power estimation, data for 40 m wind speeds and wind directions from both of the met. towers are used. For each turbine, 1500 data sets of 10-minute averages, which can represent typical wind turbine power productions, are selected from March 1996 and are then used to do both the parameter estimates for the regression model and the training of the neural network model. The selection of the training data is mainly based on the considerations of: (1) trying to make training data distribute equally in different wind speed sections, and (2) trying to make the training data cover, equally, the wind coming from different directions.

The regression model for the power estimation is based on Eq. (8), which contains a polynomial with two wind speed variables and two products of wind speeds with transformed wind directions. Figure 7 shows the comparison among the 2nd, 3rd, and 4th order polynomial regression models for turbine power estimation for test data for April 1996. It can be seen that the 3rd order regression model provides the best prediction results, and that either the lower or the higher order will degrade the results. This indicates that the low order polynomial is not suitable for this application, and the high order polynomial can introduce noise at high wind speed even though the parameters for high order items are very small. The products of wind speeds with transformed wind directions have been found to improve the prediction accuracy. By introducing the products, the estimation of turbine power production can be adjusted for wind coming from different directions.

The neural network is built and trained for each turbine. Each network has three layers. The nodes chosen in the three layers are 4, 8, and 1 (excluding bias). The input patterns are preprocessed measured wind velocities and directions, and the desired network outputs correspond to the normalized measured turbine power [5].

The performance of the trained network is compared with that of the 3rd order regression model for the same test data for April 1996 in Fig. 8. Table 1 gives the absolute average power difference (in kW) of turbine No. 6 between the measured and estimated power of various models for four months of test data. Both the figure and the table show that the neural network has better performance. The comparison does not show much difference between the high-order and general first-order neural networks. Even though all the items in Eq. (8) are taken as inputs to the



Fig. 7 Wind turbine power prediction by polynomial regression models for test data (turbine No. 6, April 1996)

Fig. 8 Comparison of wind turbine power prediction by 3rd order regression and neural network models for test data (turbine No. 6, April 1996)

high-order neural network, the RMS errors used to reflect the learning process of the first-order and high-order neural networks are very close (Fig. 9). Actually, for a general first-order neural network, the effect of the equivalent high-order interactions is obtained through the weighted sum of inputs and the nonlinear activation function of the neuron model. The regression model, on the other hand, must explicitly include those high-order interactions.
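The point that a regression model must spell out its high-order terms can be sketched as follows. The design matrix below lists the items named for Eq. (8): powers of the two wind speeds up to a chosen order plus the two products of wind speed with transformed wind direction. The coefficients are then obtained in a single solving step from the normal equations of Eq. (6). This is a sketch under stated assumptions, not the authors' code: the function names are hypothetical, cross terms between the two wind speeds are omitted because the text does not say whether they were used, the transformed directions are passed in precomputed, and the data are synthetic.

```python
import numpy as np


def design_matrix(v1, v2, g1, g2, order=3):
    """Columns: 1, V1..V1^order, V2..V2^order, V1*g(d1), V2*g(d2)."""
    cols = [np.ones_like(v1)]
    cols += [v1 ** k for k in range(1, order + 1)]
    cols += [v2 ** k for k in range(1, order + 1)]
    cols += [v1 * g1, v2 * g2]
    return np.column_stack(cols)


def fit_least_squares(x, y):
    """Solve the normal equations (X^T X) beta = X^T y of Eq. (6) in one step."""
    return np.linalg.solve(x.T @ x, x.T @ y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 300
    # Hypothetical normalized inputs in [0, 1]; not measurements from the wind farm.
    v1, v2 = rng.uniform(0, 1, n), rng.uniform(0, 1, n)
    g1, g2 = rng.uniform(0, 1, n), rng.uniform(0, 1, n)
    x = design_matrix(v1, v2, g1, g2, order=3)
    beta_true = np.arange(1.0, x.shape[1] + 1.0)  # made-up coefficients
    y = x @ beta_true                             # noise-free synthetic response
    beta_hat = fit_least_squares(x, y)
    print(np.round(beta_hat, 6))
```

Changing `order` changes the number of columns, which is the sense in which the model is function dependent: every candidate form has to be written out and refitted, whereas the first-order network receives only the raw inputs.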

Fig. 9 Comparison of RMS errors for training a high-order and the first-order neural networks. For the high-order network, the network inputs include 1st, 2nd, and 3rd orders of wind speeds, and two products of wind speeds with transformed wind directions, i.e., the same items as those in Eq. (8).

Table 1 Comparison of Power Difference (in kW) Between Measured and Estimated Power for Regression and 1st Order Neural Network Models

Prediction Model            Mar-96   Apr-96   May-96   June-96
Regression 2nd order         35.8     37.5     32.0     41.1
Regression 3rd order         27.8     32.0     26.8     28.4
Regression 4th order         32.6     39.5     29.3     29.6
1st order Neural Network     25.9     29.5     23.7     26.4
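The figures in Table 1 can be summarized with a few lines of arithmetic. The sketch below reads the table's metric as the mean of the absolute difference between measured and estimated power, which is an interpretation of "absolute average power difference" and not something the text defines. It then takes an unweighted average of the four monthly values for each model. The function names are hypothetical, and the monthly values are transcribed from Table 1.

```python
import numpy as np

# Values transcribed from Table 1 (kW); columns: Mar-96, Apr-96, May-96, June-96.
TABLE_1 = {
    "Regression 2nd order": [35.8, 37.5, 32.0, 41.1],
    "Regression 3rd order": [27.8, 32.0, 26.8, 28.4],
    "Regression 4th order": [32.6, 39.5, 29.3, 29.6],
    "1st order Neural Network": [25.9, 29.5, 23.7, 26.4],
}


def mean_abs_difference(measured, estimated):
    """Mean of |measured - estimated|; assumed reading of the Table 1 metric."""
    measured = np.asarray(measured, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return float(np.mean(np.abs(measured - estimated)))


def four_month_means(table):
    """Unweighted average of the monthly values for each model."""
    return {name: float(np.mean(values)) for name, values in table.items()}


if __name__ == "__main__":
    for name, mean_kw in four_month_means(TABLE_1).items():
        print(f"{name:26s} {mean_kw:6.2f} kW")
```

Averaged this way, the 3rd order regression comes to 28.75 kW and the first-order neural network to 26.375 kW, and the neural network has the lowest value in each of the four months.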



In actual practice, the power output comparisons for one month for a neural network model would then be used to predict power output for that same month and the next month. In addition, in complex terrain like that of the Fort Davis wind farm, it can be expected that each turbine will require its own model. We have not established the number of months of data that should be used in the training/development of the model, but the national weather service usually uses three-month periods of data in developing their models for weather forecasting. However, training of the neural network with more data may be needed for better network performance for different seasons of the year.

To handle more advanced problems, a wind power neural network with extended Kalman filter (EKF) [17] training on a SIMD parallel machine [18,19] is under investigation. The extended Kalman filter training has been shown to be advantageous both in training speed and network performance in many applications [17,20].

7 Conclusions

We have compared regression and neural network models to estimate wind turbine power curves. Both methods attempt to minimize the error sum of squares between observations and predicted values. Regression requires an explicit function to be defined before the least squares parameter estimates, while a neural network depends more on training data and the learning algorithm.

Input variables to the models have been restricted to the wind speed at 40 m and the wind direction, as measured by a met. tower. Comparison of model prediction with measurements not used in the model development shows that the neural network model performs better than the regression models.

Wind power generation can also be affected by other factors such as air density, vertical wind profile, season, and time of day. Under the complicated influence of numerous factors such as these, selection of an appropriate function for a regression model is extremely difficult, and this can give neural networks an added advantage.

References

[1] Joensen, A., Madsen, H., and Nielsen, T. S., 1997, "Non-Parametric Statistical Methods for Wind Power Prediction," presented at EWEC'97, Dublin, Ireland.
[2] Landberg, L., 1997, "A Mathematical Look at a Physical Power Prediction Model," presented at EWEC'97, Dublin, Ireland.
[3] Kariniotakis, G. N., Stavrakakis, G. S., and Nogaret, E. F., 1996, "Wind Power Forecasting using Advanced Neural Networks Models," IEEE Trans. on Energy Conversion, 11, No. 4, pp. 762-767.
[4] Bossanyi, E. A., 1985, "Stochastic Wind Prediction for Wind Turbine System Control," Proc. of 7th British Wind Energy Association Conf., Oxford, U.K., pp. 219-226.
[5] Li, S., O'Hair, E., and Giesselmann, M., 1997, "Using neural networks to predict wind power generation," Proc. of Int. Solar Energy Conf., Washington D.C., pp. 415-420.
[6] Walker, J. F., and Jenkins, N., 1997, Wind Turbine Technology, John Wiley & Sons.
[7] American Wind Energy Association, 1988, "Standard Performance Testing of Wind Energy Conversion Systems," AWEA Standard, AWEA 1.1.
[8] Frost, W., and Aspliden, C., 1995, "Characteristics of the Wind," in Wind Turbine Technology, David A. Spera, ed., ASME Press, pp. 371-445.
[9] Krause, P. C., and Man, D. T., 1981, "Dynamic Behavior of a Class of Wind Turbine Generators during Electrical Disturbances," IEEE Trans. Power Appar. Syst., PAS-100, No. 5, pp. 2204-2210.
[10] Allen, D. M., and Cady, F. B., 1982, Analyzing Experimental Data by Regression, Lifetime Learning Publications.
[11] Draper, N. R., and Smith, H., 1981, Applied Regression Analysis, John Wiley & Sons.
[12] Haykin, S. S., 1994, Neural Networks: A Comprehensive Foundation, Macmillan.
[13] Hush, D. R., and Horne, B., 1993, "Progress in Supervised Neural Networks," IEEE Signal Process. Mag., pp. 1-38.
[14] Kosmatopoulos, E. B., Polycarpou, M. M., Christodoulou, M. A., and Ioannou, P. A., 1995, "High-Order Neural Network Structures for Identification of Dynamic Systems," IEEE Trans. Neural Netw., 6, No. 2, pp. 422-431.
[15] Thimm, G., and Fiesler, E., 1997, "High-Order and Multilayer Perceptron Initialization," IEEE Trans. Neural Netw., 8, No. 2, pp. 349-359.
[16] Zhu, H., and Rohwer, R., 1997, "Measurements of Generalisation Based on Information Geometry," in Mathematics of Neural Networks: Models, Algorithms and Applications, S. W. Ellacott, J. C. Mason, and I. J. Anderson, eds., pp. 394-398.
[17] Singhal, S., and Wu, L., 1989, "Training Multilayer Perceptrons with the Extended Kalman Filter Algorithm," in Advances in Neural Information Processing Systems, Morgan Kaufmann, San Mateo, CA, pp. 133-140.
[18] Kumar, V., Grama, A., Gupta, A., and Karypis, G., 1999, Introduction to Parallel Computing, Benjamin/Cummings, pp. 151-196.
[19] Saratchandran, P., Sundararajan, N., and Foo, S., 1996, Parallel Implementations of Backpropagation Neural Networks on Transputers, World Scientific.
[20] Puskorius, G. V., and Feldkamp, L. A., 1991, "Decoupled Extended Kalman Filter Training of Feedforward Layered Networks," in Proc. of Int. Joint Conf. on Neural Networks, Seattle, WA, pp. 771-777.
