Abstract - The NPower Forecasting Challenge 2015 invited students and professionals worldwide to predict the daily energy usage of a group of customers. The BigDEAL team from the Big Data Energy Analytics Laboratory placed in the top three on the final leaderboard. This paper presents a refined methodology based on the implementation during the competition. We first build the individual forecasts using several forecasting techniques, such as Multiple Linear Regression (MLR), Autoregressive Integrated Moving Average (ARIMA), Artificial Neural Network (ANN), and Random Forests (RF). We then select a subset of the individual forecasts based on their performance on a validation period, a.k.a. post-sample. Finally, we obtain the final forecast by averaging the selected individual forecasts. The forecast combination on average yields a better result than the forecast from a single technique.

Keywords - electric load forecasting; forecasting combination; forecast competition; NPower Forecasting Challenge 2015

I. INTRODUCTION

The energy industry is one of the many industries that integrate forecasting into day-to-day business operations. Accurate forecasts help energy companies purchase the proper amount of electricity at the least cost on a daily basis. The NPower Forecasting Challenge 2015 was organized by NPower with the primary purpose of recruiting summer interns, though the competition was open to students and professionals worldwide. The competition topic was six-month-ahead ex post daily energy forecasting, which falls into the category of retail energy forecasting [1]. In this paper, we present a refined methodology based on the implementation during the competition.

Over the past several decades, extensive literature has been devoted to load forecasting [2]. A recent review of short-term load forecasting is in [3]. Common load forecasting techniques include regression analysis [4], time series models [5], artificial neural networks [6], and fuzzy regression [7]. While load forecasters have not yet found a technique that dominates all the others, forecast combination has been regarded by the forecasting community as a practical method to enhance forecast accuracy [8][9]. The literature on load forecast combination is limited. Some studies tackled the problem by combining weather stations [10][11], while others have focused on combining forecasts from different modeling techniques [12]–[15]. A recent load forecast combination study is presented in [16], where the author combined forecasts from several sister models to enhance the accuracy of the point load forecast.

In the NPower Forecasting Challenge 2015, we first generated individual forecasts using different techniques, such as Multiple Linear Regression (MLR), Autoregressive Integrated Moving Average (ARIMA), Artificial Neural Network (ANN), and Random Forests (RF). We then averaged a subset of these individual forecasts to achieve a more robust and accurate point forecast than each individual forecast.

The rest of the paper is organized as follows: Section II introduces the data used in the competition. Section III presents the modeling techniques and the forecast combining strategy. Section IV discusses the results. The paper is then concluded in Section V.

II. DATA DESCRIPTION

The competition included three rounds. In each round, the contestants were asked to submit a six-month-ahead ex post forecast of daily energy consumption, which is similar to the sliding simulation method discussed in [17]. At the beginning of the first round, two sets of data were provided. One dataset included the historical data used for modeling, such as the daily energy consumption and the weather and calendar information, as summarized in Table I. The other dataset included six months of actual weather and calendar information for the forecasting period. At the beginning of each of the next two rounds, the organizer provided the actual daily energy consumption over the forecasting period of the previous round, as well as the actual weather and calendar information for the next six months.

For each round of the competition, we split the data into three pieces as summarized in Table II: 1) the training data for parameter estimation; 2) the validation data for model selection; and 3) the test data (i.e., the forecast data), which covers the forecast period designated by the competition organizer, NPower. Fig. 1 shows the actual daily energy consumption data from the historical period of the first round to the forecast period of the third round. Fig. 2 shows the scatter plots between the daily energy consumption and the weather variables, such as temperature, wind speed, precipitation, and solar radiation.
Fig. 1. Daily energy consumption (Apr. 2011 – Sep. 2014)
Fig. 2. Scatter plots of daily energy consumption (kWh) and weather variables by day type
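
The three-way partition of each round's data into training, validation, and test periods can be illustrated with a minimal Python sketch. The function name `split_by_period` and the boundary dates below are hypothetical and chosen only for illustration; they are not the periods listed in Table II.

```python
from datetime import date, timedelta


def split_by_period(days, valid_start, test_start):
    """Partition chronologically ordered dates into train/validation/test.

    train: dates before valid_start       (parameter estimation)
    valid: valid_start <= date < test_start (model selection)
    test:  dates on or after test_start   (forecast period)
    """
    train = [d for d in days if d < valid_start]
    valid = [d for d in days if valid_start <= d < test_start]
    test = [d for d in days if d >= test_start]
    return train, valid, test


# Hypothetical calendar and boundaries (assumptions, not from the paper).
start = date(2011, 4, 1)
days = [start + timedelta(days=i) for i in range(3 * 365)]
train, valid, test = split_by_period(
    days, valid_start=date(2013, 4, 1), test_start=date(2013, 10, 1)
)
```

Splitting by date rather than at random keeps the validation period strictly after the training period, which mirrors how the forecast period follows the history in each round.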
III. MODELING PROCESS

TABLE IV. DAY CODE MODIFICATION (BACKWARD SELECTION MLR)
TABLE VII. VALIDATION MAPE (%) OF INDIVIDUAL MODELS AND COMBINED RESULTS

Round  MLR1  MLR2  ANN   ARIMA  RF    Combined  Combined Option
1      2.34  2.45  2.25  3.38   4.76  2.18      MLR1+ANN+MLR2
2      2.39  2.02  3.18  7.6    3.75  1.83      MLR1+MLR2
3      2.45  2.13  2.63  3.12   3.25  2.13      MLR2
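
The select-then-average idea behind the combined column can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the exhaustive search over subsets, the equal weights, the helper names (`mape`, `average_forecasts`, `select_combination`), and the synthetic numbers are all assumptions introduced here. The source states only that a subset is selected by validation performance and then averaged.

```python
from itertools import combinations


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def average_forecasts(forecasts):
    """Equal-weight average of several forecast series of equal length."""
    return [sum(vals) / len(vals) for vals in zip(*forecasts)]


def select_combination(actual, candidates):
    """Return (names, MAPE) of the subset whose averaged forecast has the
    lowest MAPE on the validation period. Single models count as subsets.
    Exhaustive search is an assumption made for this sketch."""
    best_names, best_score = None, float("inf")
    names = sorted(candidates)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            combined = average_forecasts([candidates[n] for n in subset])
            score = mape(actual, combined)
            if score < best_score:
                best_names, best_score = subset, score
    return best_names, best_score


# Synthetic validation data for illustration only (not competition data).
actual = [100.0, 110.0, 120.0, 130.0]
candidates = {
    "MLR1": [102.0, 112.0, 122.0, 132.0],   # biased high
    "ANN": [98.0, 108.0, 118.0, 128.0],     # biased low
    "ARIMA": [90.0, 125.0, 105.0, 150.0],   # noisy
}
best_names, best_score = select_combination(actual, candidates)
```

In this toy case the two models with opposite biases cancel when averaged, so the selected pair beats every single model. Because single models are allowed as subsets, the search can also return one model alone, as in Round 3 of Table VII.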