Energy
journal homepage: www.elsevier.com/locate/energy
Article info

Article history:
Received 28 October 2018
Received in revised form 13 May 2019
Accepted 24 May 2019
Available online 18 June 2019

Keywords:
Energy performance indicators
Energy baseline
Energy management systems
Surrogate models
Process monitoring

Abstract

Energy intensity is a commonly used key performance indicator (KPI) for the energy performance of production processes and often serves as an Energy Performance Indicator (EnPI). The energy performance of a process depends on a variety of factors like capacity utilization, ambient temperature and operational performance. Understanding the influence of these factors on the relevant KPI or EnPI helps to distinguish between influenceable and non-influenceable contributions and to identify the improvement potential. By describing the best historically observed performance as a function of the non-influenceable factors, valuable information on the efficiency of the current operation of a plant and the improvement potential is provided to plant managers and operators. In this contribution, a method is proposed to identify a surrogate performance model for the attainable energy performance considering the relevant factors. The modeling method is based solely on the evaluation of historical process data and employs a novel combination of known surrogate modeling techniques using clustering, model fitting and model simplification by backward elimination. The method is applied to real process data of a large industrial production plant and the use of the model for process performance monitoring and reporting in accordance with energy management system requirements is illustrated and discussed.
© 2019 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license
(http://creativecommons.org/licenses/by/4.0/).
* Corresponding author.
E-mail addresses: benedikt.beisheim@ineos.com (B. Beisheim), keivan.rahimi-adli@ineos.com (K. Rahimi-Adli), stefan.kraemer@bayer.com (S. Krämer), sebastian.engell@tu-dortmund.de (S. Engell).
https://doi.org/10.1016/j.energy.2019.05.176
0360-5442/© 2019 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
B. Beisheim et al. / Energy 183 (2019) 776–787

1. Introduction

1.1. Evaluation of the energy performance of industrial processes

The industry in Europe faces enormous challenges regarding productivity and competitiveness. In the chemical industry, significant investments in new production plants are made outside of Europe in countries with lower prices of feedstock and energy [1]. This trend is expected to continue and these new plants are often highly efficient, while the plant inventory is relatively old in Europe. In addition, companies have to respond to the political and societal pressure asking for a more sustainable production. Thus, the European process industry has to increase its energy efficiency to compete with other regions in the world and to comply with goals that are defined on the societal level, e.g. the reduction of the carbon footprint.

In Germany, the problem of the competitive disadvantage for energy intensive industries caused by high energy prices due to the energy transition was addressed by a reduction of levies under certain prerequisites [2]. As one of these prerequisites, energy intensive companies need to operate a certified energy management system according to ISO 50001:2018 [3] or EMAS [4]. Certified sites or companies commit themselves to continuously enhancing the environmental and energy performance. Morrow and Rondinelli [5] point out that the introduction of environmental management systems is motivated by the desire to improve the performance beyond regulatory compliance. However, the attribution of improvements to the adoption and certification of management systems is difficult. Poksinska et al. [6] point out that management systems can be used as a toolbox for environmental performance enhancements but that the certification alone is insufficient.

ISO 50001:2018 demands the use of energy performance indicators (EnPI) which have to be compared with an energy baseline.
The indicator and the baseline can be chosen by the users of the management systems. As a result, a variety of indicators and baselines are used. A major challenge is that the performance of production processes is rarely constant as ambient conditions, feedstock quality, throughput and the maintenance status of the processes vary over time. The influence of these variations on the process performance must be understood for a meaningful analysis of the effect of measures to improve the energy efficiency, both for long term analysis (changes of the performance due to production levels) and for short term analysis of the improvement potential under certain conditions as e.g. the outside temperature.

Using the ambient temperature as an illustrative example it can be seen that a single influencing factor can have different and contradictory effects on process performance. On the one hand, low temperatures have a negative influence on the energy demand of a process since low temperatures increase the heat loss through pipes and very low temperatures even require heat tracing. On the other hand, low ambient temperatures decrease the energy demand in cooling water production and absorption processes perform better at lower temperatures.

The knowledge of these factors and their influence on the process help to explain performance deviations of the processes and consequently to identify if fluctuations are induced by non-influenceable external disturbances or are due to sub-optimal operation. The minimization of the duration of periods of sub-optimal operation improves the efficiency, the economical performance and the environmental footprint of the processes.

1.2. Overview of currently available evaluation techniques

The evaluation of process performance and its comparison with modeled, historical or literature data is common practice in the process industry.

Energy efficiency benchmarks for different countries and industrial sectors are available (e.g. Lässig et al. [7] for Germany, Phylipsen et al. [8] for the Netherlands or Makridou et al. [9] on a European scale). There are also several studies available that compare the performance of different production plants of the same kind using a variety of indicators. The meta-study by Saygin et al. [10] revealed the energy saving potential for several processes and countries based on the Best Practice Technology (BPT), which is defined as the top 10% percentile of the available processes with respect to energy efficiency. This evaluation is helpful on high managerial levels or for policy makers to compare the plant portfolio with plants of other companies or in other countries to be able to estimate possible energy saving potentials [11]. However, it is not adequate to use these literature benchmarks to evaluate the current operational performance of an existing plant as the best-in-class efficiency might not be attainable with the given equipment.

The process industry has already developed solutions for the evaluation of the improvement potential and also standardised them in some cases. Bayer developed a holistic methodology to identify the most energy efficient operation of a production plant [12]. In this method, two different categories of inefficiencies for chemical processes are identified: dynamic losses and static losses. Dynamic losses are related to the operation of the given equipment, whereas static losses are related to the choice of suboptimal equipment, a lack of heat integration or the use of outdated
technology. This methodology introduced the term Best Demonstrated Practice (BDP) for the best possible mode of operation of a plant according to the present knowledge. It is based on rigorous models or on linear regression of process data after outlier removal. The definition of the best demonstrated practice aims at indicating the optimal mode of operation at a specific set of non-influenceable circumstances like ambient conditions or feedstock quality. The method found its way into the NAMUR Recommendations NE 140 and NE 162 [13,14].

The use of rigorous models for process performance monitoring is the best approach as it provides the most reliable information. Model-based simulation of chemical processes is a well developed field in academia and industry [15,16]. Many approaches for the efficient modeling of processes are documented (e.g. Dowling and Biegler [17]) and commercial software is available and used extensively in industry for process optimization (e.g. AspenPlus or gPROMS and many inhouse simulators). A detailed rigorous model is suitable for plant optimization and improving process automation but the development effort is typically not justified if the model is only used for process monitoring. In general, the complexity of a model has to match the given task [18]. If sufficiently precise rigorous models cannot be developed under the constraints of time and budget, surrogate models are an alternative that gained popularity in recent years.

A surrogate model is a parameterized mathematical structure (also called black-box model) that is fitted to the observations. Such models only represent the data that has been used to parameterize them, so extrapolation usually is not possible and their accuracy depends on the density, accuracy and reproducibility of the available data. Their development also requires considerable efforts in data pre-treatment and in the choice and parameterization of the model structure. As an additional advantage, surrogate models can typically be evaluated quickly, which speeds up process optimization [19].

For process monitoring and reporting, rather simple models are commonly used. ISO 50006:2017 [20] proposes the use of linear regression to account for different factors affecting the energy performance of a process like the daily temperature figure or the process utilization. Linear regression and principal component analysis are effective and established methods to identify models to monitor the performance of batch and continuous processes [21,22].

Linear regression models are relatively easy to fit to data, but they are not able to capture the nonlinearities that often are present in chemical processes. Kriging models [23] and artificial neural networks [24,25] are two of the most popular nonlinear surrogate models. Such surrogate models have been successfully employed in various fields. E.g. Audet et al. [26] use Kriging models in order to optimize the wing planform design of airplanes. Neural networks and Kriging models however may exhibit some "roughness" of the response surfaces which can make it difficult to use them for optimization using derivative-based methods [27].

1.3. Scope of this contribution

The requirements for the certification of energy management systems are continuously refined and the factor-based normalization of energy related indicators is increasingly in the focus of certified companies and certification bodies. However, to the best of the authors' knowledge, concepts beyond multivariate linear regression are not documented in the literature nor are they part of the current standards for energy management systems. In many cases, the use of linear regression models is insufficient since the performance curves of the process data clearly indicate nonlinear behavior. Moreover, the available concepts focus on the average performance of a process. The best possible operation is not considered during the performance evaluation. Thus, these concepts can only illustrate the differences in the performance during different time intervals. They are not able to assist plant personnel in improving the current operation as the average performance is not suitable for this purpose. Although different concepts for the assessment of process performance exist in the literature (e.g. [15,22]), they have not yet been used for energy management systems. Currently, there exists a gap between the methods proposed in the literature and the application in industry [28,29].

In order to close this gap, this contribution proposes a method to identify a model of the best demonstrated practice (the best observed process performance depending on the values of the influencing factors). The ultimate goal is to use this model to analyse the current process performance and to inform plant operators and plant managers where there is most likely room for improvement. Because of this intended use of the model, the best performance is described rather than the average performance, distinguishing the method from other applications of regression models in energy management.

The method is based on a statistical evaluation of historical data and a nonlinear interpolation of the BDP as a function of the non-influenceable factors. Only limited process knowledge is necessary to obtain this "baseline" which can be used to monitor the process behavior and to identify the potential for operational improvements. The resulting model represents the most important influencing variables so that a normalization of the process performance with respect to these factors is possible.

The method provides a basis for the evaluation of complex production plants and sites using decomposition and aggregation methods as proposed in Beisheim et al. [30] and Beisheim et al. [31].

Prior to the investigation of the BDP, a suitable indicator has to be chosen. In the industrial context several frameworks for the choice of indicators are available (e.g. Giljum et al. [32], Huysman et al. [33], Kujanpää et al. [34] or Beisheim et al. [31]). Two main types are applied in these frameworks: efficiency and intensity indicators. In general, the efficiency is the ratio of useful output to total input. The intensity is the reciprocal value of the efficiency. Throughout this paper, the Energy Performance Indicator EnPI is used which is defined as:

\[ \mathrm{EnPI}_j = \frac{E_j}{m_P}, \tag{1} \]

where $E_j$ denotes the energy demand and $m_P$ the amount of product in specification during a specified time interval. $j$ denotes the type of the energy source. In chemical processes different energy and utility sources are utilized, e.g. steam, electricity, pressurized air or cooling water. BDP models for each of these sources can be identified separately or cumulated indicators can be used (e.g. by using an Energy Currency [31]). The operational improvement potential (OIP) is defined as the difference between the EnPI and the BDP at the given non-influenceable conditions:

\[ \mathrm{OIP} = \mathrm{EnPI} - \mathrm{BDP}. \tag{2} \]

The proposed concept is independent of the chosen source of energy. Generally, indicators should be used that can be derived from measurements of physical flows.

The generation of a BDP model is a multi-step procedure which is visualized in Fig. 1. Since the approach is data driven, data acquisition is the first step of the method. The raw data is pretreated by mean centering and unit variance (UV) scaling to have the same range of variation of all variables. The data is then clustered to identify different operational regimes and to reduce the
number of data points that are used in the model identification step. The model is constructed using surrogate modeling techniques. The different steps of the modeling procedure are explained in detail in Section 2. The application to real plant data from INEOS in Cologne is described in Section 3.

The novelty of this contribution is the combination of state-of-the-art methods for data clustering and model identification which were not yet used in the context of the computation of energy baselines. The use of the best demonstrated practice enhances the usability of the energy baseline for both real-time performance monitoring and reporting. The use of a clustering algorithm responds to the issue of differently populated operational regimes in the model identification. The model identification combined with a statistical backward elimination of influence factors with minor significance generates a simple, non-linear model which is accessible for people with limited mathematical knowledge. Thereby, the gap between concepts in the scientific literature and large-scale application in industry can be closed.

2. Modeling procedure

2.1. Data acquisition

The first step is the acquisition of measurement data. The use of reconciled data is advised to avoid data inconsistency due to measurement uncertainties and gross errors. Data reconciliation is based on physical laws, in particular on conservation laws for mass and energy. If such relationships are not available, the method can also be applied to the raw data, possibly after removal of outliers.

The sampling time has to suit the purpose. For real-time monitoring, the compression interval of the data for model identification has to be a reasonable fraction (1/2 to 1/5) of the time constant of the process. Much smaller sampling times do not provide additional information as the rate of change of the process is limited by the time constant whereas longer intervals prevent the early identification of suboptimal operation and increase the reaction time of plant personnel.

The data has to be representative for typical operational scenarios. Abnormal regimes must be excluded from the data to avoid the identification of a non-representative BDP model. The data must be provided with consistent time stamps. For example, if lab data is used, the time when the sample was taken is important, not the time at which it was sent to the lab or when it was analyzed.

The result is a data matrix $U$ where each row represents one time step $t_j$ and each column corresponds to a state variable. In this matrix $c_i$ denotes a continuous state, $d_i$ a discrete state, $n_t$ the number of time steps, $n_c$ the number of continuous variables and $n_d$ the number of discrete variables:

\[ U = \begin{pmatrix} u_{t_1} \\ \vdots \\ u_{t_{n_t}} \end{pmatrix} = \begin{pmatrix} u_{c_1} & \cdots & u_{c_{n_c}} & u_{d_1} & \cdots & u_{d_{n_d}} \end{pmatrix} \tag{3} \]

\[ \phantom{U} = \begin{pmatrix} u_{t_1,c_1} & \cdots & u_{t_1,c_{n_c}} & u_{t_1,d_1} & \cdots & u_{t_1,d_{n_d}} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ u_{t_{n_t},c_1} & \cdots & u_{t_{n_t},c_{n_c}} & u_{t_{n_t},d_1} & \cdots & u_{t_{n_t},d_{n_d}} \end{pmatrix}. \tag{4} \]
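As an illustration of how raw samples might be compressed into the rows of the data matrix $U$ of Eqs. (3) and (4), a minimal sketch is given below. The function name `build_data_matrix`, the tuple layout of the samples, the averaging of continuous values and the choice to keep the last discrete state of each interval are assumptions made for this example only; they are not prescribed by the method.

```python
from collections import defaultdict

def build_data_matrix(samples, interval):
    """Compress raw samples into one row per compression interval.

    samples:  list of (timestamp, continuous_values, discrete_values) tuples
              (hypothetical layout, sorted by timestamp).
    interval: length of the compression interval in the timestamp unit.
    Returns a list of rows; each row lists the continuous columns first and
    the discrete columns last, mirroring the column order of Eq. (3).
    """
    bins = defaultdict(list)
    for t, cont, disc in samples:
        # Samples falling into the same interval share one bin.
        bins[int(t // interval)].append((cont, disc))

    U = []
    for key in sorted(bins):
        group = bins[key]
        n = len(group)
        n_c = len(group[0][0])
        # Assumption: continuous states are averaged over the interval.
        cont_mean = [sum(c[i] for c, _ in group) / n for i in range(n_c)]
        # Assumption: the last discrete state in the interval is kept.
        disc_last = list(group[-1][1])
        U.append(cont_mean + disc_last)
    return U

# Four raw samples, 30 time units apart, compressed to 60-unit intervals.
samples = [
    (0, [10.0, 1.0], [1]),
    (30, [12.0, 2.0], [1]),
    (60, [20.0, 3.0], [0]),
    (90, [22.0, 4.0], [0]),
]
U = build_data_matrix(samples, interval=60)
```

In practice the interval would be chosen relative to the time constant of the process as described above, and the time stamps would have to be made consistent before this step.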
obtained by the raw data with the variance obtained from filtered data and performing an F-test [35,36]. If the data was collected at stationary points for most of the time, KPI values in transient situations will also be filtered out by the exclusion of the outermost percentiles of the data. When the actual operation is compared to the BDP, care must be taken to only compare data from periods where the process is stationary.

The resulting input to the next step in the BDP modeling procedure is a pretreated data matrix $U_{PT}$ as defined below:

\[ U_{PT} = \Big\{ u_t \;\Big|\; u_{lb,c_i} \le u_{t_j,c_i} \le u_{ub,c_i} \;\; \forall c_i \in V_c,\; j \in [1,\dots,n_t] \;\wedge\; u_{t_j,d_k} = u_{d,d_k} \;\; \forall d_k \in V_d,\; j \in [1,\dots,n_t] \Big\} \tag{5} \]

\[ \hat{u}_{PT,c_i} = \Big( u_{PT,c_i} - \mu\big(u_{PT,c_i}\big) \cdot \vec{1} \Big) \cdot \frac{1}{\sigma\big(u_{PT,c_i}\big)} \quad \forall c_i \in V_c \;\text{ and }\; \dim \vec{1} = \dim u_{PT,c_i}, \tag{6} \]
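The pretreatment of Eqs. (5) and (6) can be sketched in a few lines. The example below is only an illustration under stated assumptions: the function name `pretreat`, the dictionary-based row layout, the variable names `load` and `temp`, the numerical bounds and the use of the sample standard deviation are hypothetical choices, not part of the published method.

```python
from statistics import mean, stdev

def pretreat(rows, cont_bounds, disc_targets):
    """Filter rows as in Eq. (5), then scale continuous columns as in Eq. (6).

    rows:         list of dicts, one per time step (hypothetical layout).
    cont_bounds:  {variable: (lower_bound, upper_bound)} for continuous states.
    disc_targets: {variable: required_value} for discrete states.
    """
    # Eq. (5): keep a time step only if every continuous state lies within
    # its bounds and every discrete state has the required value.
    kept = [
        r for r in rows
        if all(lb <= r[c] <= ub for c, (lb, ub) in cont_bounds.items())
        and all(r[d] == v for d, v in disc_targets.items())
    ]
    # Eq. (6): mean centering and unit variance scaling per continuous column.
    # Assumption: sigma is the sample standard deviation (n - 1 denominator).
    scaled = {}
    for c in cont_bounds:
        col = [r[c] for r in kept]
        mu, sigma = mean(col), stdev(col)
        scaled[c] = [(v - mu) / sigma for v in col]
    return kept, scaled

rows = [
    {"load": 10.0, "temp": 1.0, "mode": 1},
    {"load": 12.0, "temp": 2.0, "mode": 1},
    {"load": 50.0, "temp": 1.5, "mode": 1},  # load outside its bounds
    {"load": 11.0, "temp": 0.5, "mode": 0},  # wrong discrete state
    {"load": 14.0, "temp": 3.0, "mode": 1},
]
kept, scaled = pretreat(
    rows,
    cont_bounds={"load": (5.0, 20.0), "temp": (0.0, 5.0)},
    disc_targets={"mode": 1},
)
```

After this step each scaled column has zero mean and unit variance, so that no single variable dominates a distance-based clustering merely because of its units.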
3. Repeat step 2 until k initial cluster centers have been chosen.

After finding the initial cluster centers, the k-means algorithm works as follows:

1. For each $j \in \{1, \dots, k\}$, $C_j$ is the set of points in $X$ where the distance to the jth cluster center is the smallest among all centers.
2. For each $j \in \{1, \dots, k\}$, the updated cluster centers $c_j$ are computed by calculating the center of mass of all points in the corresponding cluster $C_j$:

\[ c_j = \frac{1}{|C_j|} \sum_{x \in C_j} x, \tag{8} \]

where $|C_j|$ is the cardinality of $C_j$.

3. Repeat steps 1 and 2 until the cluster centers remain constant.

The algorithm is available in many standard engineering software packages (e.g. MATLAB). For large data sets with a high number of clusters other algorithms are available (e.g. k-means|| [42]).

The k-means++ algorithm uses the Euclidean distance of the data points from the cluster centers and is therefore sensitive to the magnitude of the input data, which explains the necessity of mean centering and UV scaling.

The resulting clusters represent typical operational domains of a plant. As the idea of the identification of the BDP model is to find the most efficient operation regimes, using the cluster center for model identification is not suitable. Instead, a representative point from the cluster is chosen based on percentile filtering.

For this purpose, the following computation is used:

\[ r_j = \frac{1}{|R_j|} \sum_{x \in R_j} x \tag{9} \]

with

\[ x \in R_j \;\; \forall \; P_{j,n} \le \mathrm{EnPI}(x) \le P_{j,m} \;\wedge\; x \in C_j \tag{10} \]

2.4. Model selection and parametrization using an adapted ALAMO approach

The choice and parametrization of a surrogate model in the context of determining the best demonstrated practice is performed using an adapted ALAMO approach. Automated Learning of Algebraic Models for Optimization (ALAMO) is a software package developed by Cozad et al. [27] for the efficient development of surrogate models that are suitable for simulation-based optimization. It generates simple and accurate models from simulated or experimental data that are convenient for derivative-based optimization software. In order to overcome the shortcomings of linear regression models, ALAMO selects a combination of simple basis functions that fit the responses with an acceptable accuracy. The set of basis functions is defined by the user based on process knowledge, expected physical or chemical relationships etc. When the number of basis functions is increased, the regression has a larger number of degrees of freedom and starts to capture the noise or other secondary features of the training dataset which results in so-called "overfitting". As an outcome of overfitting, the model has a low bias but a high variance. This means that a small change in the input variables can result in an unrealistic change in the responses [44]. In order to prevent this behavior, ALAMO utilizes a model fitness measure that handles the trade-off between the goodness of the fit and the complexity of the model.

In the following, the mathematical background of model fitting in ALAMO is briefly discussed and an adaptation of this method for the application discussed here is presented.

The general idea behind the model identification step in ALAMO is the determination of the best combination of predefined basis functions and regression factors to represent the process data. The general formulation for a BDP model is given as:

\[ \mathrm{BDP} = \sum_{i=1}^{n} \sum_{j=1}^{m} \beta_{ij} f_i(x_j), \tag{13} \]

where $x_j$ denotes the model input variable $j$ and $f_i$ denotes the basis function $i$. $\beta_{ij}$ denotes the regression factor for basis function $i$ and model input $j$.

As presented by Cozad et al. [27], ALAMO solves a nested optimization problem of the following general form:
\[ \beta^{l} = -\sum_{j \in B} \big|\beta_j^{OLR}\big| \quad \text{and} \quad \beta^{u} = \sum_{j \in B} \big|\beta_j^{OLR}\big|, \tag{15} \]

where $\beta_j^{OLR}$ is the vector of coefficients calculated by the Ordinary Least Square Regression (OLR).

In the original ALAMO formulation, the "Corrected Akaike Information Criterion" ($\mathrm{AIC}_C$) is used as the fitness measure [46,47], which is an extended version of the "Akaike Information Criterion" [48] tailored for small sample sizes.

Here, as the resulting model is a curve to fit the calculated percentile-centers of the clusters, the different weights of the clusters are considered using a Weighted Least Square (WLS) formulation in the objective function rather than the ordinary least square error, similar to the approach proposed by Banks and Joyner [49]:

\[ \mathrm{AIC}_{\mathrm{WLSC}} = N \log\!\left( \frac{1}{N} \sum_{i=1}^{N} \omega_i \Big( z_i - \sum_{j \in B} \beta_j X_{ij} \Big)^{2} \right) + 2S + \frac{2S(S+1)}{N - S - 1} \tag{16} \]

\[ \text{with} \quad \omega_i = \frac{|C_i|}{\sum_{j} |C_j|}, \tag{17} \]

where $z_i$ is the value of the data point and $X_{ij}$ contains the value of the jth basis function of the input variable $x_i$. $N$ is the number of the data points and $S$ is the number of selected functions. $\omega_i$ is the weighting factor of observation (cluster) $i$ and is defined as the ratio of the number of the points in cluster $i$ to the sum of all data points in all clusters. A cluster with a low number of data points indicates an operating range that was not observed often. Consequently, these operating ranges might not have been explored as much as others.

The extension of the approach is suitable for all types of surrogate model development using clustered data, not only for the use in BDP modeling.

The numerical solution was also adapted. While the original formulation iterates the inner problem for increasing values of $S$ until the value of $\mathrm{AIC}_{\mathrm{WLSC}}$ increases for the first time, here it is solved for $S = S_{max}$. The maximum value of $S_{max}$ is

\[ S_{max} = |B|. \tag{18} \]

The user can decide on choosing a smaller $S_{max}$ in order to reduce the computational effort and the solution time. It has to be ensured that an increase of the outer objective function is observable when choosing a lower $S_{max}$ than proposed in Eq. (18).

In order to transform the resulting MIQP problem into a MILP problem to solve it with freely available solvers (e.g. CBC [50]), the quadratic formulation of the objective function of the inner problem is replaced by the sum of absolute values as proposed in Cozad et al. [27]:

\[ \frac{d}{d\beta_j} \sum_{i=1}^{N} \Big( z_i - \sum_{k \in S} \beta_k X_{ik} \Big)^{2} \overset{!}{=} 0 \tag{20} \]

\[ \Rightarrow \; \sum_{i=1}^{N} X_{ij} \Big( z_i - \sum_{k \in S} \beta_k X_{ik} \Big) = 0 \quad j \in S. \tag{21} \]

Eq. (21) is incorporated into the optimization problem as a set of big-M constraints (Eq. (22)). These constraints make sure that the first order optimality condition is met for the coefficients of the active basis functions:

\[ -U_j (1 - y_j) \le H_j \le U_j (1 - y_j) \quad j \in B \tag{22} \]

\[ H_j = \sum_{i=1}^{N} X_{ij} \Big( z_i - \sum_{k \in S} \beta_k X_{ik} \Big) \quad j \in B, \tag{23} \]

where $U_j$ is calculated as the maximum value that $H_j$ can have within the bounds (Eq. (15)) [27].

The inner optimization problem then results as:

\[ \min_{\beta, y} \; \sum_{i=1}^{N} e_i \tag{24a} \]

\[ \text{s.t.} \quad e_i \ge z_i - \sum_{j \in B} \beta_j X_{ij} \quad i = 1, \dots, N \tag{24b} \]

\[ e_i \ge \sum_{j \in B} \beta_j X_{ij} - z_i \quad i = 1, \dots, N \tag{24c} \]

\[ \sum_{j \in B} y_j = S \tag{24d} \]

\[ -U_j (1 - y_j) \le \sum_{i=1}^{N} X_{ij} \Big( z_i - \sum_{k \in B} \beta_k X_{ik} \Big) \le U_j (1 - y_j) \quad j \in B \tag{24e} \]

\[ \beta^{l} y_j \le \beta_j \le \beta^{u} y_j \quad j \in B \tag{24f} \]

\[ y_j \in \{0, 1\} \quad j \in B. \tag{24g} \]

The application of ALAMO may require a change of the range of the input variables prior to model identification. If the set of the basis functions includes logarithmic terms or functions with exponents which are negative or e.g. equal to 1/2, it must be ensured that the input variables are not close to zero or negative.

In data preprocessing the values were mean centered and UV scaled. This results in a distribution of the input variables around
zero with a variance equal to one. In order to prevent the aforementioned problem, all of the percentile center variables are shifted to the interval $[1, 2]$ after the clustering step.

2.5. Postprocessing by backward elimination

In the previous subsection, a method to identify a general surrogate model with multiple influencing factors ($x_i$) was provided. The method computes the model by fitting a curve to the averages of the data points between the chosen percentiles of each cluster. This model fitness measure ($\mathrm{AIC}_{\mathrm{WLSC}}$) selects the optimum number of the basis functions by solely using the distance of the model from the percentile centers as a measure of accuracy.

In plant operation, the main goal of the development of the BDP is to provide the plant operators with a measure that indicates the distance between the current resource consumption and the best demonstrated practice that was observed under similar conditions in the past, the operational improvement potential (OIP). Under the assumption that the cluster-center represents the average operating point of each cluster, the distance between this point and the fitted BDP is an indication of the average operational improvement potential.

For this purpose, it is sufficient to represent the dominating influences on the BDP so that the models can possibly be simplified. This also reduces the risk of artifacts appearing in the model. There are two possibilities to add this to the method: Either the model fitness measure can be adapted by additional terms in the objective function or it can be checked ex post whether the calculated model can be simplified because some factors have little influence on the OIP. Here, the second option as a pragmatic extension of the method is chosen and an approach based on backward elimination [51] to eliminate the influencing factors with a minor influence on the OIP is applied.

Firstly, a model is fitted using all of the possible influencing factors and the average OIP is calculated as:

\[ \mathrm{OIP}_{avg} = \sqrt{ \sum_{i=1}^{N} \omega_i \Big( \mathrm{EnPI}_{C_i} - \sum_{j \in S} \beta_j X_{ij} \Big)^{2} } \tag{25} \]

\[ \mathrm{EnPI}_{C_i} = \frac{1}{|C_i|} \sum_{x \in C_i} \mathrm{EnPI}_{PT}(x), \tag{26} \]

where $\mathrm{EnPI}_{C_i}$ is the EnPI of the cluster of preprocessed data points, $X$ is the matrix of the transformed inputs by the basis functions using the operating conditions of the cluster-center and $\omega_i$ is the weight of each cluster. Afterwards, at each step the influencing factors are eliminated one at a time, the model is fitted again and the average error of the new model is calculated as:

\[ \varepsilon_k = \sqrt{ \sum_{i=1}^{N} \omega_i \Big( z_i - \sum_{j \in B} \beta_{jk} X_{ijk} \Big)^{2} }, \tag{27} \]

improvement factor is a modified version of the criteria presented by Hocking [52] for backward elimination.

The procedure is repeated until the number of relevant influence factors has been determined.

While during model identification the distance between the percentiles and the cluster centers was not considered, it is now used for backward elimination. For processes with a small distance between the BDP model and the cluster centers a more complex model is necessary whereas for large distances a simpler model is sufficient.

The procedure is numerically straightforward and removes the necessity of a multicriterial optimization, which would arise if both criteria were used in one objective function.

2.6. Summary of the developed modeling procedure

The proposed procedure can now be summarized in the following steps:

1. Acquire representative data of the process.
2. Preprocess the data by mean centering and unit variance scaling according to Eq. (6).
3. Cluster the data using k-means++ and calculate the cluster centers.
4. Calculate the BDP point for each cluster using Eq. (9)-(12) where the percentile range is a tuning factor. The representative points for each cluster ($\mathrm{EnPI}_{R_j}$) enter the optimization problem as $z_i$ in step 7. The model is fitted to these data points.
5. Scale the input data $x$ to a range excluding zero, for example from 1 to 2.
6. Define the maximum number of basis functions $S_U$. A definition of $S_U < S_{max}$ is advised to reduce the computation time for problems with many input variables.
7. Start with $S = 1$ and solve the inner optimization problem in Eq. (24) for increasing values of $S$ until $S = S_U$. The result of the optimization at each step is a set of model parameters $\beta$.
8. The minimum value of $\mathrm{AIC}_{\mathrm{WLSC}}$ determines the optimal solution of the outer optimization problem. The corresponding set of parameters $\beta$ represents the best model.
9. For $|V_c| > 1$ define the threshold of the backward elimination, $\gamma$, and run the backward elimination procedure.
10. The model after the backward elimination step is the final BDP model.

The algorithm uses 3 tuning factors: the percentiles used for the identification of the BDP model, the number of clusters and $\gamma$. The user has to define a sensible set of basis functions for the curve fitting step.

3. Application to real process data of an industrial plant

3.1. Use of the best demonstrated practice (BDP) in the process industry
performance.
On the other hand, the concept is suitable for reporting pur-
poses. ISO 50006:2017 [20] proposes the analysis of factors influ-
encing the process performance. While this analysis is not an
explicit requirement of ISO 50001:2018, a continuous improvement
of energy performance is. In this context, it is suggested to make
comparisons between different time intervals more transparent by
normalizing the energy efficiency with respect to the key influ-
encing factors.
Fig. 4. A simplified representation of the PO plant.

Furthermore, ISO 50003:2016 [53], which defines the requirements for certification bodies, demands the verification of a continuous improvement of the energy performance, including energy efficiency, energy use, and consumption. While the total energy consumption is related to the productivity of a company and the energy use is often defined by the processes, the efficiency is subject to continuous optimization in the process industry. The proposed concept can be used in two ways to verify an improvement in efficiency: first, to indicate a decrease of the average deviation from the BDP from one time interval to another, which correlates with a more efficient operation, and second, to demonstrate a decrease of the BDP by performing another evaluation after a period of time, which indicates a technical improvement of the process either by changing operating conditions or by the use of different equipment.
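As a minimal sketch of these two checks, the following Python snippet compares the average deviation from a BDP curve for two time intervals and tests whether a re-evaluated BDP curve lies below the earlier one. The model coefficients, the sample data and all function names are illustrative assumptions and are not taken from the plant data discussed in this paper.

```python
def bdp_before(load):
    """Hypothetical BDP curve: attainable EnPI as a function of plant load."""
    return 0.8 * load**2 + 3.0 / load


def bdp_after(load):
    """Hypothetical BDP curve identified again after a process modification."""
    return 0.8 * load**2 + 2.7 / load


def average_deviation(samples, bdp):
    """Mean gap between measured EnPI and the BDP curve over (load, EnPI) pairs."""
    return sum(enpi - bdp(load) for load, enpi in samples) / len(samples)


def bdp_decreased(old_bdp, new_bdp, loads):
    """True if the new BDP curve lies below the old one at every given load."""
    return all(new_bdp(load) < old_bdp(load) for load in loads)


# Made-up (load, measured EnPI) pairs for two time intervals.
period_1 = [(1.0, 4.4), (1.2, 4.3), (1.5, 4.4)]
period_2 = [(1.0, 4.1), (1.2, 3.9), (1.5, 4.1)]

# Way 1: same BDP model, a smaller average deviation means more efficient operation.
dev_1 = average_deviation(period_1, bdp_before)
dev_2 = average_deviation(period_2, bdp_before)
operation_improved = dev_2 < dev_1

# Way 2: a re-evaluated BDP below the old one indicates a technical improvement.
process_improved = bdp_decreased(bdp_before, bdp_after, [1.0, 1.2, 1.5])
```

In the first check the BDP model stays fixed and only the measurements move; in the second the model itself is identified again, so the two results should be reported separately.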
Finally, the proposed method can be applied to redefine the
energy baseline of a process after retrofitting (cf. ISO 50001:2018).
Fig. 3 illustrates both of these methods to verify improvements in energy efficiency. The BDP model in this example is a function of the plant load. On the left, the operational improvement potential from period one to two was halved resulting in a smaller difference between sampling points and BDP model curve. The BDP model is still valid in this case. On the right, a process modification led to a BDP adjustment. The distance between the measurements and their corresponding BDP model remains constant. The overall EnPI level decreased. In both cases the specific energy consumption decreased.

Fig. 5. Preprocessed data points before clustering and scaling.

2. Dehydrochlorination of PCH to PO using an alkaline solution,
3. Purification of the product.1
Fig. 7. of the fitted models.
Fig. 9. Comparison of the models before and after backward elimination. Variables shifted and scaled to dimensionless units.
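The backward elimination compared in Fig. 9 can be illustrated with a small Python sketch built on a weighted least-squares fit and the error measure of Eq. (27). The exact elimination criterion is not reproduced in this part of the text, so the rule used below (drop a factor while the refitted error exceeds the full-model error by no more than the threshold gamma) is an assumption, as are the function names and the synthetic data.

```python
import numpy as np


def weighted_fit(X, z, w):
    """Weighted least squares: minimise sum_i w_i * (z_i - X_i @ beta)**2."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], z * sw, rcond=None)
    return beta


def model_error(columns, names, z, w):
    """Refit with the listed factors and return the error in the form of Eq. (27)."""
    X = np.column_stack([columns[name] for name in names])
    beta = weighted_fit(X, z, w)
    return float(np.sqrt(np.sum(w * (z - X @ beta) ** 2)))


def backward_elimination(columns, z, w, gamma):
    """Drop influencing factors one at a time.

    ASSUMPTION: a factor is removed while the refitted error exceeds the
    full-model error by no more than gamma.
    """
    kept = list(columns)
    full_error = model_error(columns, kept, z, w)
    while len(kept) > 1:
        trial = {
            name: model_error(columns, [m for m in kept if m != name], z, w)
            for name in kept
        }
        candidate = min(trial, key=trial.get)  # factor whose removal hurts least
        if trial[candidate] - full_error > gamma:
            break
        kept.remove(candidate)
    return kept


# Synthetic cluster centers: the target depends almost only on the load.
load = np.linspace(0.6, 1.4, 9)
temp = np.array([0.3, -0.2, 0.5, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3])
z = 0.75 * load**2 + 3.3 / load + 0.001 * temp
w = np.full(load.size, 1.0 / load.size)  # equal cluster weights

# Each factor contributes the columns of its (hypothetical) basis functions.
columns = {
    "load": np.column_stack([load**2, 1.0 / load]),
    "temperature": temp,
}
kept = backward_elimination(columns, z, w, gamma=0.01)
```

With this synthetic data the temperature column adds almost nothing to the fit, so only the load is retained, which mirrors the qualitative outcome reported for the plant.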
The solid line shows the model after backward elimination and the crosses show the data points without the temperature projection. The left and right bound in the figure indicate the minimum and maximum value of the percentile centers used in the model identification. Within these bounds the model is applicable for comparison with process data, whereas outside the limits the results have to be handled with care due to the lacking extrapolation capabilities of the BDP model.

The data that is used for backward elimination was originally clustered taking both of the influencing factors into consideration. As the temperature has a negligible effect on the model quality, as the final step, a model was fitted using only the plant load as the influencing factor. The motivation for doing this is to base the clustering step also only on the plant load.

The parameters of the algorithm were left unchanged compared to the previous case. The algorithm identified a model using 2 basis functions as:

\[
y = 0.7484\,\dot{m}_p^{2} + 3.2778\,\dot{m}_p^{-1} \tag{34}
\]

and the corresponding curve is shown in Fig. 10. The red line and the crosses present the model and the percentile centers after backward elimination, whereas the blue dashed line and circles present the model fitted and the percentile centers using only the plant load as input. This model is different from the model fitted after backward elimination despite using the same input data and the same settings, as there is a difference in the resulting clusters. Nevertheless, it can be seen that the models are very similar. The final model is depicted against the process data in Fig. 11.

Fig. 10. Comparison of the model after backward elimination and the model based only on the plant load.

4. Conclusion and outlook

In this contribution, a method is proposed to identify a best demonstrated practice model of a process based on historical process data.

Statistical methods and state-of-the-art surrogate modeling techniques are utilized to obtain a model that is composed of simple basis functions and can be used by plant personnel without mathematical knowledge. The models can be used for real-time process monitoring to improve the operational performance and for reporting purposes. The application of the concept to verify energy performance improvements in the context of an ISO 50001:2018 certification was discussed.

The proposed method is designed to handle real data from industrial processes. Robustness is achieved by clustering and by using percentile averages to identify the best demonstrated practice. Because as few basis functions as possible are used, the models are less prone to show local variations that are a result of the interpolation procedure and not a property of the process.

The identification of the best demonstrated practice is the first step towards a more resource-efficient production. Since the method only identifies the operational improvement potential, the success depends on the skills and the experience of the plant personnel. The development of data-driven rule extraction to support plant operators in deriving efficient process interventions is the next step for processes where expensive modeling and advanced process control are not viable.

The concept was applied to process data of a large production plant of INEOS in Cologne, which demonstrates the practical applicability of the method to imperfect real-world data from a process for which no rigorous model is available. The resulting models are non-linear but as simple as possible to present clear trends to the plant personnel. The method does not incorporate any process knowledge, which facilitates the general application to different processes. The backward elimination step indicates that the ambient temperature has only a minor influence on the process performance and can consequently be removed from the BDP model, which further simplifies the model structure without a significant loss of information. By following the workflow proposed in this contribution, the gap between state-of-the-art methods in the literature and the application in the industry can be closed, providing the basis for better process monitoring and energy efficient operation and a substantiated baseline for energy management systems.

As an area for further research, the disaggregation of a process into smaller sub-processes and the description of each sub-process by a specific BDP model should be considered. The process can afterwards be aggregated hierarchically to obtain consistent BDP models and performance figures on the top level. A concept which considers BDP and performance aggregation was proposed by Beisheim et al. [30]. It will be used at the INEOS site in Cologne with over 20 processing plants to monitor and report the energy and environmental performance. It is a future cornerstone for reporting in the context of the energy management system at INEOS in Cologne.
Acknowledgments