
Neural Comput & Applic

DOI 10.1007/s00521-016-2721-x

ORIGINAL ARTICLE

Predicting the oil production using the novel multivariate nonlinear model based on Arps decline model and kernel method

Xin Ma¹,² · Zhibin Liu²

Received: 27 January 2016 / Accepted: 17 November 2016

© The Natural Computing Applications Forum 2016

Abstract Prediction of petroleum production plays a key role in petroleum engineering, but an accurate prediction is difficult to achieve due to the complex underground conditions. In this paper, we employ the kernel method to extend the Arps decline model into a nonlinear multivariate prediction model, called the nonlinear extension of the Arps decline model (NEA). The basic structure of the NEA is developed from the Arps exponential decline equation, and the kernel method is used to build a nonlinear combination of the input series. The NEA can therefore capture the nonlinear relationship between the input series and the petroleum production with a one-step linear recursion, combining the merits of the commonly used decline curve methods and intelligent methods. Case studies are carried out with production data from two real-world oil fields in China and India to assess the efficiency of the NEA model. The results show that the NEA can describe the nonlinear relationship between the influence factors and the oil production, and that it is applicable to making accurate forecasts of oil production in real applications.

Keywords Petroleum production · Petroleum engineering · Kernel method · Arps decline · Multivariate regression

✉ Xin Ma, cauchy7203@gmail.com
¹ School of Science, Southwest University of Science and Technology, Mianyang, China
² School of Science, Southwest Petroleum University, Chengdu, China

1 Introduction

Prediction of future oil production is important in the petroleum industry: it helps petroleum engineers make project decisions, adjust the current schedule, analyse the effect of operations, and so on. However, an accurate prediction is difficult to achieve due to the complex underground conditions. It is not easy to build an accurate deterministic mathematical model, as petroleum production depends on many parameters with which it often shares a nonlinear relationship. Even if such an ideal mathematical model exists, the parameter data are not always available. As a result, accurate forecasting of petroleum production has long been an open issue in petroleum engineering.

Petroleum engineers have used decline curve methods to analyse reservoir performance and predict petroleum production for many decades. Most research is based on the Arps equations, which were developed in the mid-1900s [1]. The applicability of the Arps decline curves has been proved by numerous empirical studies [2–4], and a recent study [5] indicated that the Arps decline curves are not merely empirical equations but also have a theoretical basis. Nevertheless, the decline curves essentially consider only the time series of petroleum production, which leads to inevitable shortcomings such as underestimating or overestimating the reservoir performance.

In recent years, intelligent methods have also been applied [6], such as artificial neural networks (ANN) [7–10], support vector machines (SVM) [11], and the adaptive neuro-fuzzy inference system (ANFIS) [12]. Some of these methods were used to build time series prediction models [13–15], and others were used to build multivariate regression models [16–18]. Although the intelligent


methods are better at handling nonlinear situations, they do not always perform well when modelling highly nonlinear time series data [19]. The ones used for multivariate regression are often used to build static models, which are only input–output (black) boxes mapping the influence factors to the petroleum production. But no petroleum production system is static: it is well known that even if none of the influence factors changes, the petroleum production will still decline with time. Thus, these models are also limited in their applications.

Above all, the decline curve methods and the intelligent methods have their own advantages and disadvantages, and this research aims at building a new prediction model for petroleum production by combining the advantages of both. The main advantage of the decline curve methods is that they reflect the overall declining trend of the petroleum production, while that of the intelligent methods is their ability to deal with nonlinearities. To combine these advantages, we use the so-called kernel trick.

The kernel trick used in this paper is similar to that of the least squares support vector machines (LS-SVM) introduced by Suykens [20]. The LS-SVM employs the ridge regression form [21] with equality constraints instead of the inequalities of the standard SVM of Vapnik [22, 23]. In recent research, many classical linear methods have been transformed into nonlinear ones based on both the LS-SVM formulation [24–26] and the standard SVM formulation [27–29]. The LS-SVM formulation has been shown to be easier to use, as it only involves solving a set of linear equations, while the standard SVM formulation involves solving a quadratic programming problem with gradient-based training methods, which often suffer from many local minima and high computational cost [30].

The main idea of this paper starts with extending the Arps equation for exponential decline to a multivariate ordinary differential equation by adding a linear combination of the input parameters; its discrete form becomes a linear recursive multivariate model, which is called the linear extension of the Arps model (LEA). The kernel trick is then employed to transform this linear model into a nonlinear prediction model, which is called the nonlinear extension of the Arps model (NEA). This model differs from past kernel regression models, such as the LS-SVM regression [31–33] and the recursive LS-SVM [34–36], in that its one-step linear recursion is known explicitly, while the past models are totally black-box models. As with the LS-SVM, the solution of the NEA also follows directly from solving a set of linear equations, which makes the new model easy to use.

The rest of this paper is organized as follows: a brief overview of the Arps decline model and the representation of the LEA model are shown in Sect. 2; the details of the modelling procedures are described in Sect. 3; numerical experiments and analysis are shown in Sect. 4; and conclusions are drawn in Sect. 5.

2 The Arps decline model and its linear extension

2.1 The Arps decline model

The classical decline curve equation was presented by Arps [1] as follows:

$$\frac{1}{q(t)}\frac{\mathrm{d}q(t)}{\mathrm{d}t} = aq(t)^d \qquad (1)$$

where $a$ and $d$ are empirical constants to be determined from production data. When $d = 0$, Eq. (1) degenerates to an exponential decline model; when $d = 1$, Eq. (1) yields a harmonic decline model; and when $0 < d < 1$, Eq. (1) gives a hyperbolic decline model. These three types are applicable not only to analysing the performance of oil and gas wells, but also to predicting the performance of oil and gas reservoirs.

2.2 The linear extension of Arps decline model

Obviously, the production is also influenced by other factors, such as the pressure difference for oil and gas wells, and the water injections and the number of active production wells for oil or gas fields. If these factors are considered, one can add a linear combination of them to Eq. (1) with $d = 0$ (the exponential decline equation), which gives

$$\frac{\mathrm{d}q(t)}{\mathrm{d}t} = aq(t) + b_1u_1(t) + b_2u_2(t) + \cdots + b_Nu_N(t) + c \qquad (2)$$

where $u_i(t)$ represents the variable corresponding to the $i$-th influence factor (e.g. the pressure difference, the number of active oil or gas wells, or the amount of water injected), $b_i$ are the system parameters to be determined, and $c$ is a bias term. The corresponding difference equation of Eq. (2) can be written as

$$q(k) = aq(k-1) + b_1u_1(k) + b_2u_2(k) + \cdots + b_Nu_N(k) + c. \qquad (3)$$

Equation (3) is called the linear extension of the Arps decline model, abbreviated as the LEA model. Given a sample set $\{u_1(k), u_2(k), \ldots, u_N(k);\, q(k)\}$ $(k = 1, 2, \ldots, n)$, the parameters of the LEA model (3) can be obtained by the least squares method as

$$[a, b_1, b_2, \ldots, b_N, c]^T = \left(B^TB\right)^{-1}B^TY, \qquad (4)$$
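To make the estimation concrete: Eqs. (3)–(4) amount to an ordinary linear least-squares fit followed by a one-step recursion. The paper's own programs are in MATLAB (see Sect. 4); the following Python sketch is ours, with hypothetical function names, not the authors' implementation:

```python
import numpy as np

def lea_fit(U, q):
    """Least-squares estimate of the LEA parameters [a, b_1..b_N, c], Eq. (4).

    U: (n, N) array whose row k is [u_1(k), ..., u_N(k)]; q: (n,) production series.
    """
    n = len(q)
    # Regressor matrix: rows [q(k-1), u_1(k), ..., u_N(k), 1] for k = 2..n
    B = np.column_stack([q[:-1], U[1:], np.ones(n - 1)])
    Y = q[1:]                                  # targets q(2), ..., q(n)
    theta, *_ = np.linalg.lstsq(B, Y, rcond=None)
    return theta[0], theta[1:-1], theta[-1]    # a, b, c

def lea_predict(U, q1, a, b, c):
    """One-step recursion of Eq. (3): q^(k) = a q^(k-1) + f(k), q^(1) = q(1)."""
    f = U @ b + c
    qhat = np.empty(len(U))
    qhat[0] = q1
    for k in range(1, len(U)):
        qhat[k] = a * qhat[k - 1] + f[k]
    return qhat
```

On data generated exactly by Eq. (3), this fit recovers $a$, the $b_i$ and $c$, and the recursion reproduces the series.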


where

$$Y = \begin{pmatrix} q(2) \\ q(3) \\ \vdots \\ q(n) \end{pmatrix}, \qquad B = \begin{pmatrix} q(1) & u_1(2) & \cdots & u_N(2) & 1 \\ q(2) & u_1(3) & \cdots & u_N(3) & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ q(n-1) & u_1(n) & \cdots & u_N(n) & 1 \end{pmatrix}. \qquad (5)$$

Define the discrete function $f(k)$ as

$$f(k) = b_1u_1(k) + b_2u_2(k) + \cdots + b_Nu_N(k) + c. \qquad (6)$$

The solution of the LEA model can then be obtained by the recursive method as

$$\hat{q}(k) = a^{k-1}q(1) + \sum_{s=2}^{k} a^{k-s}f(s), \qquad (7)$$

where $q(1)$ is the given initial point.

In summary, once a sample set $\{u_1(k), \ldots, u_N(k);\, q(k)\}$ $(k = 1, 2, \ldots, n)$ is given, one can compute the parameters $a, b_1, b_2, \ldots, b_N, c$ by the least squares estimation (4) and then use the solution (7) to predict the production series.

3 The nonlinear extension of Arps decline model

It can be seen from (2) that the LEA model can only describe a linear relationship between the influence factors and the production series, which limits its applications. In this section, we build a nonlinear model based on the Arps decline model and the kernel method.

3.1 The representation of the nonlinear model

In order to describe the nonlinear relationship between the influence factors and the production series, we can extend the Arps decline model ($d = 0$) by adding a nonlinear function of the influence factors to Eq. (1), which gives

$$\frac{\mathrm{d}q(t)}{\mathrm{d}t} = aq(t) + g(u(t)) + \mu, \qquad (8)$$

where $u(t) = [u_1(t), u_2(t), \ldots, u_N(t)]$ and $g(u(t))$ is a general nonlinear function of $u(t)$. The corresponding discrete equation of (8) can be written as

$$q(k) = aq(k-1) + g(u(k)) + \mu. \qquad (9)$$

The nonlinear function $g$ is difficult to determine; in this paper we use a method by which an equivalent expression of $g$ can be found. One defines the nonlinear mapping

$$\varphi\colon \mathbb{R}^N \to \mathcal{F}, \qquad (10)$$

where $\mathcal{F}$ is a higher-dimensional feature space. With the nonlinear mapping $\varphi$, we can translate the nonlinear function $g(u(k))$ in the original space $\mathbb{R}^N$ into a linear function in the feature space $\mathcal{F}$ as

$$g(u(k)) = \omega^T\varphi(u(k)). \qquad (11)$$

For example, consider the vector $u(k) = [u_1(k), u_2(k), u_3(k)]$ in the original space $\mathbb{R}^3$ and the nonlinear function

$$g(u(k)) = u_1(k)^2 + u_1(k)u_2(k) + 2u_2(k)u_3(k)^{0.5} + u_3(k)^{1.5}.$$

We can define the nonlinear mapping $\varphi\colon \mathbb{R}^3 \to \mathbb{R}^4$ as

$$\varphi(u(k)) = \left[u_1(k)^2,\ u_1(k)u_2(k),\ u_2(k)u_3(k)^{0.5},\ u_3(k)^{1.5}\right];$$

then the nonlinear function can be rewritten as $g(u(k)) = \omega^T\varphi(u(k))$, where $\omega = [1, 1, 2, 1]^T$.

Substituting the $g(u(k))$ of (11) into (9), we have

$$q(k) = aq(k-1) + \omega^T\varphi(u(k)) + \mu. \qquad (12)$$

The difference equation (12) is called the nonlinear extension of the Arps decline model, abbreviated as the NEA model. When the mapping $\varphi$ is the identity mapping, i.e. $\varphi(u(k)) = u(k)$ and $\omega = [1, 1, \ldots, 1]^T_N$, the NEA model (12) reduces to the LEA model (3).

3.2 The parameters estimation for the NEA model

In this subsection, we discuss the parameter estimation of the NEA model. With a given sample $\{u(k);\, q(k)\}$ $(k = 1, 2, \ldots, n)$, we can define the error for each sample as

$$e_k = q(k) - aq(k-1) - \omega^T\varphi(u(k)) - \mu, \qquad (13)$$

and then the parameters $a, \omega, \mu$ can be obtained by solving the optimization problem

$$\min_{a,\omega,\mu} \sum_{k=2}^{n} e_k^2 \qquad (14)$$

when the nonlinear mapping $\varphi$ is known. However, it is computationally infeasible to determine the nonlinear mapping $\varphi$ in most cases [37]. Thus, we consider the regularized optimization problem, a combination of the LS-SVM formulation and the Arps decline model, as follows:


$$\min_{a,\omega,e}\ J(a, \omega, e) = \frac{1}{2}a^2 + \frac{1}{2}\|\omega\|^2 + \frac{\gamma}{2}\sum_{k=2}^{n} e_k^2 \qquad (15)$$
$$\text{s.t.}\quad q(k) = aq(k-1) + \omega^T\varphi(u(k)) + \mu + e_k,$$

where $\gamma$ is called the regularization parameter, which controls the smoothness of the model. Actually, the regularized problem (15) is a semi-parametric formulation, which has been used in the partially linear LS-SVM [38]. The Lagrangian of problem (15) is

$$L(a, \omega, \mu, e, \lambda) = J(a, \omega, e) - \sum_{k=2}^{n} \lambda_k\left\{aq(k-1) + \omega^T\varphi(u(k)) + \mu + e_k - q(k)\right\}, \qquad (16)$$

where the $\lambda_k$ are the Lagrangian multipliers. The KKT conditions for optimality of the Lagrangian (16) are

$$\begin{aligned}
\frac{\partial L}{\partial a} = 0 \;&\Rightarrow\; a = \sum_{k=2}^{n} \lambda_k q(k-1), \\
\frac{\partial L}{\partial \omega} = 0 \;&\Rightarrow\; \omega = \sum_{k=2}^{n} \lambda_k \varphi(u(k)), \\
\frac{\partial L}{\partial \mu} = 0 \;&\Rightarrow\; \sum_{k=2}^{n} \lambda_k = 0, \\
\frac{\partial L}{\partial e_k} = 0 \;&\Rightarrow\; e_k = \lambda_k\gamma^{-1}, \\
\frac{\partial L}{\partial \lambda_k} = 0 \;&\Rightarrow\; aq(k-1) + \omega^T\varphi(u(k)) + \mu + e_k = q(k).
\end{aligned} \qquad (17)$$

By eliminating $\omega$ and $e_k$, the bias $\mu$, the parameter $a$ and the Lagrangian multipliers $\lambda_k$ can be obtained by solving the linear equations

$$\begin{pmatrix} 0 & \mathbf{1}_{n-1}^T \\ \mathbf{1}_{n-1} & \Omega + Q + \frac{1}{\gamma}I_{n-1} \end{pmatrix}\begin{pmatrix} \mu \\ \lambda \end{pmatrix} = \begin{pmatrix} 0 \\ q_{2|n} \end{pmatrix}, \qquad (18)$$

where

$$\mathbf{1}_{n-1} = [1, 1, \ldots, 1]^T_{n-1}, \quad \lambda = [\lambda_2, \lambda_3, \ldots, \lambda_n]^T, \quad q_{2|n} = [q(2), q(3), \ldots, q(n)]^T, \quad q_{1|n-1} = [q(1), q(2), \ldots, q(n-1)]^T,$$
$$Q_{ij} = q(i)q(j) \ \ \text{(i.e. } Q = q_{1|n-1}\,q_{1|n-1}^T\text{)}, \qquad \Omega_{ij} = \varphi(u(i)) \cdot \varphi(u(j)),$$

and $I_{n-1}$ is the $(n-1)$-dimensional identity matrix, with all diagonal elements equal to 1 and all others zero. The $\Omega_{ij}$ can be expressed by a kernel function $K(\cdot,\cdot)$ satisfying Mercer's condition, that is,

$$\Omega_{ij} = \varphi(u(i)) \cdot \varphi(u(j)) = K(u(i), u(j)). \qquad (19)$$

Thus, the parameters $\lambda_i$ and $\mu$ can be obtained by solving the linear equations (18) with (19), and then the parameter $a$ can be computed from the first equation of the KKT conditions (17).

3.3 The solution of the NEA model

Substituting the second equation of the KKT conditions (17), $\omega = \sum_{k=2}^{n} \lambda_k\varphi(u(k))$, into the NEA model, we have

$$q(k) = aq(k-1) + \sum_{j=2}^{n} \lambda_j K(u(j), u(k)) + \mu. \qquad (20)$$

We define the discrete function

$$\Phi(k) = \sum_{j=2}^{n} \lambda_j K(u(j), u(k)) + \mu, \qquad (21)$$

and the solution of the NEA model can be obtained by the recursive method, which can be written as

$$\hat{q}(k) = a^{k-1}q(1) + \sum_{s=2}^{k} a^{k-s}\Phi(s). \qquad (22)$$

The predicted series can then be computed by the solution (22) with the parameters $a$, $\lambda_i$, $\mu$ $(i = 2, \ldots, n)$ and a proper kernel function.

3.4 Algorithm summary

With a given sample $\{u(k);\, q(k)\}$ $(k = 1, 2, \ldots, n, n+1, \ldots, n+p)$, the first $n$ groups are used to build the NEA model and the following $p$ groups are used for testing. The computational steps of the NEA model are as follows:

Step 1: select an appropriate kernel function $K(\cdot,\cdot)$ and the regularization parameter $\gamma$ in (15);
Step 2: compute the parameters $\mu$ and $\lambda_2, \lambda_3, \ldots, \lambda_n$ by solving the set of linear equations (18) with the first $n$ groups of the given sample, and then compute the parameter $a$ from the first equation of the KKT conditions (17);
Step 3: compute the values of the predicted series $\hat{q}(k)$ for $k$ from 1 to $n+p$ by the solution (22) with the given values of $u(k)$ $(k = 1, 2, \ldots, n+p)$.


Table 1 Oil production and water injection data collected from Huabei oil field in China (time given as YYYYMM)

Time Production Injection | Time Production Injection | Time Production Injection
199403 10.2985 16.6924 | 200007 8.525 54.1382 | 200611 6.248 53.9145
199404 9.51 13.163 | 200008 8.37 48.6407 | 200612 5.8297 56.6833
199405 9.796 13.4286 | 200009 7.89 48.912 | 200701 6.8293 55.8355
199406 9.4685 13.4069 | 200010 7.812 49.3935 | 200702 6.6944 47.7564
199407 9.672 12.1478 | 200011 7.62 50.342 | 200703 7.7262 53.992
199408 9.61 13.8557 | 200012 7.718 45.5614 | 200704 7.0544 54.5779
199409 9.24 14.9509 | 200101 8.3235 46.1312 | 200705 7.2689 52.3116
199410 10.3183 15.0522 | 200102 6.86 40.8461 | 200706 7.02 52.0587
199411 8.9748 14.8292 | 200103 8.308 46.2019 | 200707 6.51 64.1887
199412 9.114 14.9622 | 200104 8.1 44.4009 | 200708 6.3705 67.6799
199501 9.3 12.4241 | 200105 8.525 44.8998 | 200709 5.73 66.2878
199502 8.4 11.5622 | 200106 8.25 42.7893 | 200710 5.828 71.2031
199503 9.3 12.3137 | 200107 8.215 47.9857 | 200711 5.58 70.695
199504 9 11.4536 | 200108 8.1226 47.9394 | 200712 5.7099 76.0552
199505 9.3 13.2431 | 200109 7.7781 43.0131 | 200801 6.696 75.627
199506 9.46 12.8888 | 200110 7.9546 45.0127 | 200802 6.248 72.3389
199507 9.145 12.7708 | 200111 7.42 43.5656 | 200803 6.7116 72.4939
199508 9.021 12.2916 | 200112 7.5383 44.7165 | 200804 6.6001 71.9326
199509 8.75 9.8818 | 200201 7.905 41.2469 | 200805 7.5082 73.2608
199510 8.71 9.8102 | 200202 7.14 36.3642 | 200806 7.765 74.1608
199511 8.37 9.2507 | 200203 8.432 40.1168 | 200807 7.285 70.5047
199512 8.504 9.1579 | 200204 7.71 37.6542 | 200808 6.9595 67.9562
199601 9.8197 10.7291 | 200205 7.967 35.3813 | 200809 6.45 71.906
199602 9.8273 13.0146 | 200206 7.32 34.1979 | 200810 6.572 73.6473
199603 9.9298 15.5385 | 200207 7.502 33.5076 | 200811 6.6 71.0956
199604 9.288 15.6091 | 200208 7.409 33.061 | 200812 4.2653 72.913
199605 9.3 16.5426 | 200209 7.2006 32.132 | 200901 7.367 76.496
199606 9.06 16.7631 | 200210 7.865 33.2881 | 200902 6.544 68.7459
199607 8.835 17.5995 | 200211 6.69 34.7477 | 200903 6.9408 75.4489
199608 8.3886 17.1045 | 200212 6.8794 36.2652 | 200904 6.786 71.939
199609 8.4 16.4459 | 200301 7.44 35.6915 | 200905 6.9812 75.3072
199610 8.525 16.865 | 200302 6.86 31.5326 | 200906 6.756 73.7559
199611 8.25 16.5455 | 200303 7.595 33.8591 | 200907 6.7332 75.7665
199612 8.419 17.6785 | 200304 7.2 31.2304 | 200908 6.6712 74.3523
199701 9.455 17.8668 | 200305 7.13 32.0144 | 200909 6.2956 73.7282
199702 8.54 16.7203 | 200306 6.9 30.5278 | 200910 6.4325 75.4118
199703 9.455 18.0317 | 200307 7.13 31.9121 | 200911 6.153 73.0154
199704 9 16.9752 | 200308 7.13 31.9186 | 200912 6.3895 75.0095
199705 9.599 17.6045 | 200309 6.84 30.0179 | 201001 7.192 76.5787
199706 9.436 16.7647 | 200310 7.006 31.9468 | 201002 6.524 70.8311
199707 9.5398 16.8532 | 200311 6.78 34.2474 | 201003 7.2385 78.3418
199708 9.0286 17.026 | 200312 7.0896 34.6135 | 201004 6.99 76.5056
199709 8.932 15.3665 | 200401 6.882 34.0347 | 201005 7.254 82.485
199710 8.993 15.3529 | 200402 6.4467 31.8531 | 201006 6.72 79.7809
199711 8.6784 15.4757 | 200403 6.882 34.1887 | 201007 6.944 80.2322
199712 9.0111 15.7833 | 200404 6.6 32.8415 | 201008 7.0525 80.6947
199801 9.63 14.3599 | 200405 6.82 32.997 | 201009 6.69 78.7301
199802 8.5904 13.6218 | 200406 6.6 31.9271 | 201010 6.9099 80.4074
199803 9.7363 15.1075 | 200407 6.82 28.6707 | 201011 6.819 79.9855
199804 9.3845 14.5246 | 200408 6.665 24.6763 | 201012 7.1672 81.5187


Table 1 continued

Time Production Injection | Time Production Injection | Time Production Injection
199805 9.9472 15.384 | 200409 6.45 23.2625 | 201101 7.254 80.8041
199806 9.5771 14.933 | 200410 6.665 24.3332 | 201102 6.664 72.2455
199807 9.1172 13.636 | 200411 6.45 28.7657 | 201103 7.3935 79.3641
199808 9.1225 14.5434 | 200412 6.7221 32.3583 | 201104 7.125 78.2658
199809 8.88 15.0848 | 200501 6.82 34.8419 | 201105 7.347 83.0929
199810 8.7092 16.1132 | 200502 6.16 30.942 | 201106 7.2165 79.849
199811 8.4282 15.6865 | 200503 6.82 30.3225 | 201107 7.254 84.2972
199812 9.9076 15.9457 | 200504 6.48 28.0654 | 201108 7.2385 84.4448
199901 9.145 18.4104 | 200505 6.5969 26.3153 | 201109 6.99 81.8795
199902 8.498 17.1324 | 200506 6.492 25.2935 | 201110 7.192 85.2569
199903 9.362 18.9923 | 200507 6.51 25.0274 | 201111 6.9 83.6502
199904 9 20.2753 | 200508 6.3395 28.4806 | 201112 7.4273 87.1272
199905 9.455 23.9226 | 200509 6.0016 27.3753 | 201201 7.3005 87.4907
199906 9.3 24.0016 | 200510 6.107 33.1871 | 201202 6.902 81.3053
199907 8.99 20.8363 | 200511 5.79 35.3405 | 201203 7.409 87.3629
199908 8.99 29.4138 | 200512 5.885 32.3882 | 201204 7.179 83.3497
199909 8.79 36.4669 | 200601 7.28 35.032 | 201205 7.4245 86.3196
199910 8.835 40.7173 | 200602 5.9416 33.8688 | 201206 7.275 83.5424
199911 8.7 49.2413 | 200603 6.81 45.128 | 201207 7.316 85.5708
199912 8.935 52.5054 | 200604 6.182 53.3972 | 201208 7.0863 85.2351
200001 8.835 40.0141 | 200605 6.293 53.9176 | 201209 7.02 81.6368
200002 8.265 45.5969 | 200606 6.1186 52.5978 | 201210 7.2705 84.2399
200003 8.835 58.7685 | 200607 6.138 54.1677 | 201211 7.1688 82.3999
200004 8.55 58.6405 | 200608 6.107 58.1144 | 201212 7.4486 87.4297
200005 8.68 64.5271 | 200609 5.913 56.8863 | 201301 7.4402 87.6003
200006 8.4 54.7649 | 200610 6.1411 31.967 |

4 Case studies

In this section, we present two case studies of predicting the oil production of two oil fields using real-world production data. The numerical experiments are all carried out on the MATLAB (V.7.10) platform, and the programs used in this section are available online.¹

The root-mean-square percentage error (RMSPE) is used to evaluate the modelling accuracy in this study, which is defined as

$$\mathrm{RMSPE} = \sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(\frac{\hat{q}(k) - q(k)}{q(k)}\right)^2} \times 100\ (\%), \qquad (23)$$

where the $\hat{q}(k)$ are the values of oil production produced by the forecasting models and the $q(k)$ are the historical oil production values.

The kernel function we use in this study is the Gaussian kernel, which is often used in other kernel methods [20]. The Gaussian kernel is defined as

$$K(u(i), u(j)) = \exp\left\{-\frac{\|u(i) - u(j)\|^2}{2\sigma^2}\right\}, \qquad (24)$$

where $\|\cdot\|$ is the 2-norm of the vectors and $\sigma$ is the kernel parameter.

Table 2 RMSPE (%) by NEA and LEA in case study 1
Model Training Testing
NEA   3.9983   4.2216
LEA   13.1676  12.41

4.1 Numerical results by LEA and NEA

4.1.1 Case study 1: predicting the oil production of Block-1 of Huabei oil field in China

In this case, we are interested in forecasting the oil production using the historical data of the water injection and the oil production. The water injection is regarded as the influence factor.

¹ The source code is available at http://cn.mathworks.com/matlabcentral/fileexchange/58918-the-source-code-of-the-kernel-regularized-extension-of-the-arps-decline-model-knea-


Fig. 1 Predicted values of oil production of Block-1 in Huabei oil field by LEA and NEA (oil production, 10^4 m^3, versus time; the data range used for building the prediction models is marked)

According to reservoir engineering, it is difficult to determine the relationship between the water injection and the oil production due to the complex underground conditions [39]. In this study, we build the prediction models based on the LEA model and the NEA model, respectively.

The raw data are collected from Block-1 in Huabei oil field, which is located in north China. This block was developed in January 1992, and its oil production started to decline in 1994. In the same period, the oil field company started to develop water injection wells in order to enhance the oil production. The data set contains 227 pairs of historical data, of which the first 200 pairs are used to build the prediction models and the last 27 are used for testing the effectiveness of the two models (Table 1).

In this case, we set the parameters σ = 0.1 and γ = 10 for the NEA model. The evaluation criteria are shown in Table 2, and the predicted series by LEA and NEA, along with the production data, are plotted in Fig. 1.

It can be seen in Table 2 that both the training error and the testing error of NEA are much smaller than those of LEA, which indicates that the NEA is much more accurate than the LEA in predicting the oil production. The LEA model can only roughly describe the overall tendency of the oil production, as shown in Fig. 1, and the predicted values by LEA are quite far from the oil production points. The predicted values by NEA are very close to the oil production points, and the NEA also describes the tendency of the oil production accurately, as shown in Fig. 1. The LEA model cannot perform well because the relationship between the oil production and the water injection is not linear; the NEA model, on the other hand, can describe the nonlinear relationship between the inputs and outputs and therefore performs well in this study.

4.1.2 Case study 2: predicting the oil production of an oil field in Cambay Basin in India

In this case, we build the LEA model and the NEA model using the raw data from [40]² shown in Table 3, which contains five input series and one output series. The raw data shown in Table 3 are the smoothed data of the monthly production of five oil wells and the cumulative production of an oil field in Cambay Basin in India: the five input series correspond to the monthly production of the five oil wells, and the output corresponds to the cumulative production of the oil field. It was reported by Chakra et al. [40] that the relationship between the input and output is highly nonlinear.

The raw data contain 51 groups; the first 34 groups are used to build the prediction models, and the remaining 17 groups are used for validation. In this case, we set the parameters σ = 0.5 and γ = 100 for the NEA model. The evaluation criteria are shown in Table 4, and the predicted series by LEA and NEA, along with the production data, are plotted in Figs. 2 and 3, respectively.

The training error and the testing error of LEA are very large, as shown in Table 4, which indicates that the LEA model cannot describe the nonlinear relationship between the input and output series; thus, it is not suitable for predicting the cumulative production in this case. But the errors of NEA are small, which indicates that the NEA is efficient at predicting the cumulative production of this oil field. It can be clearly seen in Figs. 2 and 3 that the prediction results of LEA are much larger than the real values of cumulative production, while the results of NEA are very close to the real values. In summary, the NEA model still outperforms the LEA in this case.

² The raw data are listed in Table 4 of [40].
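The evaluation measure (23) and the Gaussian kernel (24) used throughout these case studies are simple to compute. As an illustrative sketch (in Python rather than the authors' MATLAB; the function names and example values are ours):

```python
import numpy as np

def rmspe(q_hat, q):
    # Eq. (23): root-mean-square percentage error, in percent
    return float(np.sqrt(np.mean(((q_hat - q) / q) ** 2)) * 100.0)

def gaussian_kernel(u_i, u_j, sigma):
    # Eq. (24): K(u(i), u(j)) = exp(-||u(i) - u(j)||^2 / (2 sigma^2))
    diff = np.asarray(u_i, float) - np.asarray(u_j, float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))
```

For instance, a forecast of [10.1, 9.5] against historical values [10.0, 10.0] gives an RMSPE of about 3.61%, and the kernel of any vector with itself is 1 regardless of σ.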


Table 3 Smoothed data of monthly production of the oil field in Cambay Basin in India [40]

Months Input-1 Input-2 Input-3 Input-4 Input-5 Target
1 0.111 0.072 0.309 0.09 0.237 0.775
2 0.107 0.068 0.303 0.084 0.233 0.758
3 0.106 0.065 0.299 0.081 0.224 0.752
4 0.099 0.068 0.3 0.075 0.216 0.752
5 0.099 0.064 0.304 0.074 0.211 0.745
6 0.101 0.061 0.309 0.073 0.207 0.736
7 0.091 0.059 0.319 0.074 0.202 0.722
8 0.081 0.057 0.326 0.078 0.195 0.708
9 0.074 0.048 0.324 0.082 0.195 0.685
10 0.062 0.044 0.326 0.085 0.191 0.676
11 0.05 0.04 0.327 0.081 0.187 0.67
12 0.051 0.036 0.332 0.065 0.192 0.637
13 0.057 0.032 0.322 0.069 0.189 0.624
14 0.061 0.028 0.307 0.072 0.168 0.628
15 0.069 0.027 0.295 0.079 0.154 0.621
16 0.074 0.026 0.291 0.089 0.147 0.601
17 0.078 0.025 0.281 0.1 0.137 0.622
18 0.074 0.023 0.279 0.091 0.133 0.625
19 0.073 0.023 0.297 0.082 0.147 0.616
20 0.068 0.022 0.311 0.065 0.158 0.641
21 0.063 0.022 0.312 0.061 0.158 0.671
22 0.058 0.023 0.35 0.06 0.15 0.675
23 0.055 0.025 0.389 0.059 0.143 0.663
24 0.049 0.025 0.41 0.062 0.129 0.659
25 0.043 0.024 0.421 0.065 0.111 0.612
26 0.041 0.024 0.437 0.057 0.1 0.554
27 0.04 0.024 0.402 0.051 0.094 0.571
28 0.038 0.027 0.357 0.045 0.086 0.598
29 0.045 0.031 0.363 0.048 0.084 0.618
30 0.052 0.037 0.376 0.051 0.082 0.621
31 0.055 0.043 0.386 0.058 0.076 0.675
32 0.055 0.042 0.393 0.058 0.073 0.714
33 0.054 0.04 0.444 0.062 0.075 0.738
34 0.048 0.04 0.493 0.057 0.076 0.77
35 0.042 0.038 0.53 0.052 0.078 0.824
36 0.036 0.031 0.574 0.046 0.083 0.747
37 0.035 0.029 0.624 0.048 0.088 0.722
38 0.033 0.025 0.566 0.041 0.083 0.704
39 0.033 0.021 0.548 0.038 0.083 0.685
40 0.034 0.017 0.534 0.036 0.083 0.67
41 0.03 0.017 0.528 0.033 0.076 0.756
42 0.03 0.02 0.519 0.033 0.067 0.746
43 0.034 0.024 0.59 0.04 0.067 0.725
44 0.033 0.027 0.586 0.04 0.06 0.712
45 0.034 0.029 0.579 0.04 0.043 0.713
46 0.041 0.03 0.569 0.041 0.031 0.711
47 0.042 0.026 0.57 0.042 0.033 0.716
48 0.043 0.022 0.57 0.043 0.033 0.757
49 0.047 0.028 0.567 0.046 0.029 0.786
50 0.048 0.048 0.577 0.05 0.035 0.8
51 0.048 0.068 0.579 0.05 0.041 0.827

Table 4 RMSPE (%) by NEA and LEA in case study 2
Model Training Testing
NEA   3.9983   4.2216
LEA   192.5    8878.3

4.2 Performances of NEA with different parameters σ and γ

In this subsection, we present two experiments to show the impact of the kernel parameter σ and the regularization parameter γ.

4.2.1 Results by NEA with different kernel parameters σ

This experiment is carried out to test the impact of the kernel parameter σ of the Gaussian kernel (24) on the performance of the NEA model. The raw data in Table 1 are used in this experiment. The parameter σ is set to 0.01, 0.1, 1, 10 and 100 in turn, with γ = 10 fixed. The RMSPEs for training and testing are listed in Table 5 and also plotted in Fig. 4. It can be seen that the training error grows with larger σ, while the testing error decreases. The predicted values by NEA with σ equal to 0.01, 1 and 100 are plotted in Figs. 5, 6 and 7, respectively. The predicted series are closer to the oil production with smaller σ, and flatter with larger σ. In this experiment, the NEA model with the best testing accuracy is the one with σ = 0.1. This implies that the optimal kernel parameter is often found among mid-size values.

4.2.2 Results by NEA with different regularized parameters γ

This experiment is carried out to test the impact of the regularization parameter γ on the performance of the NEA model. The raw data in Table 1 are also used. The parameter γ is set to 0.01, 0.1, 1, 10 and 100 in turn, with σ = 0.1 fixed. The RMSPEs for training and testing are listed in Table 6 and also plotted in Fig. 8. It can be seen that the training error decreases with larger γ, but the testing error is not monotonic. The details of the results can be seen more clearly in Figs. 9, 10 and 11, which show the plots of the predicted values.


Fig. 2 Predicted values of cumulative oil production of the oil field in Cambay Basin in India by LEA (predicted total oil production versus month; raw data and the range used for building the models are marked)

Fig. 3 Predicted values of cumulative oil production of the oil field in Cambay Basin in India by NEA (predicted total oil production versus month; raw data and the range used for building the models are marked)

Table 5 RMSPE (%) by NEA with different kernel parameters σ
σ     Training  Testing
0.01  1.5990    4.9007
0.1   3.9983    4.2216
1     7.1460    4.9339
10    8.8670    4.5510
100   11.9723   2.9737

A smaller γ corresponds to a smoother predicted series, and a larger γ leads to more accurate training results. These results are consistent with the optimization problem (15). The optimal performance of NEA is obtained with γ = 10, which again is neither too large nor too small; this implies that the optimal γ would also be found among mid-size values.
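The study does not prescribe a procedure for choosing σ and γ (the authors note this as an open problem in Sect. 5). One pragmatic, non-authoritative option in the spirit of the experiments above is a holdout grid search over the two parameters. A minimal Python sketch, where `fit_and_score` is a hypothetical user-supplied function returning the testing RMSPE of an NEA fit for one (σ, γ) pair:

```python
import itertools

def grid_search(fit_and_score, sigmas, gammas):
    """Pick the (sigma, gamma) pair minimizing a holdout testing RMSPE.

    fit_and_score(sigma, gamma) -> RMSPE (%) on held-out data (hypothetical callback).
    Returns the best pair and the full score table.
    """
    scores = {(s, g): fit_and_score(s, g) for s, g in itertools.product(sigmas, gammas)}
    best = min(scores, key=scores.get)
    return best, scores
```

Searching over logarithmically spaced grids (e.g. 0.01 to 100, as in Tables 5 and 6) matches the paper's observation that the optima tend to lie at mid-size values.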

Fig. 4 RMSPE (%) by NEA with different kernel parameters σ: training and testing curves plotted against lg(σ)


Fig. 5 Prediction results by NEA with σ = 0.01 (oil production, 10^4 m^3, versus time; the data range used for building the prediction models is marked)

Fig. 6 Prediction results by NEA with σ = 1

Fig. 7 Prediction results by NEA with σ = 100


Table 6 RMSPE by NEA with different regularized parameters γ

γ        Training    Testing
0.01     16.3535     7.6375
0.1      13.8975     6.4731
1         7.8722     4.7022
10        3.9982     4.2216
100       2.1993     4.9803

5 Conclusions

In this paper, we proposed a novel prediction model for the petroleum production, based on the Arps decline model and the kernel method, which is called the NEA model. The introduction of the kernel method extends the Arps decline model to a nonlinear model which can describe the nonlinear relationship between the system inputs and the output. The case studies show that the NEA outperforms the linear extension of the Arps decline model (LEA), which indicates that the NEA is well suited to nonlinear forecasting problems in petroleum engineering.

The impact of the kernel parameter σ and the regularized parameter γ has also been tested in this study. The results show that a smaller σ makes the NEA model more nonlinear, while a larger σ makes it approach a linear model. The NEA model produces smoother predictions with a smaller γ, and more accurate training results with a larger γ. These results all indicate that the optimal σ and γ are to be found among the mid-size values.
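The one-step linear recursion behind this kind of kernel model follows the least-squares support-vector formulation [20]: with kernel matrix K and regularization γ, the dual coefficients solve a single linear system. A minimal sketch, assuming an RBF kernel of width σ and omitting the bias term for brevity (the exact kernel parameterization and these function names are illustrative, not the paper's implementation):

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # K_ij = exp(-||a_i - b_j||^2 / (2 sigma^2)); the exact kernel
    # parameterization used in the paper is an assumption here.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_ls_svm(X, y, sigma, gamma):
    # LS-SVM-style dual solution: (K + I/gamma) alpha = y.
    # A larger gamma penalizes training errors more heavily,
    # so the fit tracks the training data more tightly.
    K = rbf_kernel(X, X, sigma)
    return np.linalg.solve(K + np.eye(len(X)) / gamma, y)

def predict(X_train, alpha, X_new, sigma):
    # Prediction is a kernel expansion over the training points.
    return rbf_kernel(X_new, X_train, sigma) @ alpha
```

On a toy series, sweeping γ reproduces the qualitative pattern of Table 6: the training error falls monotonically as γ grows.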

Fig. 8 RMSPE by NEA with different regularized parameters γ (x axis: lg(γ); y axis: RMSPE (%); curves: Training, Testing)

Fig. 9 Prediction results by NEA with γ = 0.01 (x axis: Time; y axis: Oil production (10^4 m^3); series: Oil production, Data for building the prediction models)


Fig. 10 Prediction results by NEA with γ = 1 (x axis: Time; y axis: Oil production (10^4 m^3); series: Oil production, Data for building the prediction models)

Fig. 11 Prediction results by NEA with γ = 100 (x axis: Time; y axis: Oil production (10^4 m^3); series: Oil production, Data for building the prediction models)
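The (σ, γ) sweeps reported in Tables 5 and 6 can be reproduced mechanically with a holdout grid search. The sketch below scores each grid point by validation RMSPE, with an RBF-kernel ridge fit standing in for the NEA's kernel step; this is a standard heuristic, not a selection algorithm proposed in the paper, and all names here are illustrative:

```python
import numpy as np
from itertools import product

def rmspe_score(y_true, y_pred):
    # Root-mean-square percentage error, in percent.
    return 100.0 * np.sqrt(np.mean(((y_true - y_pred) / y_true) ** 2))

def holdout_grid_search(X_tr, y_tr, X_val, y_val, sigmas, gammas):
    # Exhaustive search over a (sigma, gamma) grid, scored on a holdout set.
    def kern(A, B, s):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * s * s))
    best, best_err = None, np.inf
    for s, g in product(sigmas, gammas):
        # Ridge-regularized kernel fit: (K + I/gamma) alpha = y.
        alpha = np.linalg.solve(kern(X_tr, X_tr, s) + np.eye(len(X_tr)) / g, y_tr)
        err = rmspe_score(y_val, kern(X_val, X_tr, s) @ alpha)
        if err < best_err:
            best, best_err = (s, g), err
    return best, best_err
```

In practice the grid is taken log-spaced (as in Figs. 4 and 8, which sweep lg(σ) and lg(γ) from −2 to 2), and the search can be restricted to a promising interval once one is identified.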

The limitations of this study should also be noted: we have not proposed an algorithm to compute the optimal parameters σ and γ. In fact, the selection of these two parameters remains an open problem in machine learning research, and no algorithm is currently available that selects the optimal values. It would nevertheless be interesting, and might be possible, to find an interval which contains the optimal parameters σ and γ; once this interval is found, selecting the optimal parameters for the NEA model will be more computationally efficient.

Acknowledgements The authors thank the editors and anonymous referees for their useful comments and suggestions, which have helped to improve this paper. This work was supported by the Doctoral Research Foundation of Southwest University of Science and Technology (No. 16zx7140).

Compliance with ethical standards

Conflict of interest All authors declare that they have no conflict of interest.

References

1. Arps JJ (1945) Analysis of decline curves. Trans Am Inst Min Metall Pet Eng 160:228–247
2. Li K, Horne RN et al (2003) A decline curve analysis model based on fluid flow mechanisms. In: SPE western regional/AAPG pacific section joint meeting. Society of Petroleum Engineers
3. Camacho-Velázquez R, Fuentes-Cruz G, Vásquez-Cruz MA et al (2008) Decline-curve analysis of fractured reservoirs with fractal geometry. SPE Reserv Eval Eng 11(03):606–619


4. Sureshjani MH, Gerami S (2011) A new model for modern production-decline analysis of gas/condensate reservoirs. J Can Pet Technol 50(7–8):10–23
5. Ling K, Jun H et al (2012) Theoretical bases of Arps empirical decline curves. In: Abu Dhabi international petroleum conference and exhibition. Society of Petroleum Engineers
6. Mohaghegh SD et al (2005) Recent developments in application of artificial intelligence in petroleum engineering. J Pet Technol 57(04):86–91
7. Sun Q, Liebmann D, Nejad A, Bansal Y, Ayala LF, Balogun O, Suleen F, Ertekin T, Karpyn Z et al (2013) Forecasting well performance in a discontinuous tight oil reservoir using artificial neural networks. In: SPE unconventional resources conference—USA. Society of Petroleum Engineers
8. Aizenberg I, Sheremetov L, Villa-Vargas L, Martinez-Muñoz J (2016) Multilayer neural network with multi-valued neurons in time series forecasting of oil production. Neurocomputing 175:980–989
9. Theodosiou M (2011) Disaggregation & aggregation of time series components: a hybrid forecasting approach using generalized regression neural networks and the theta method. Neurocomputing 74(6):896–905
10. Sheremetov L, Cosultchi A, Martínez-Muñoz J, Gonzalez-Sánchez A, Jiménez-Aquino MA (2014) Data-driven forecasting of naturally fractured reservoirs based on nonlinear autoregressive neural networks with exogenous input. J Pet Sci Eng 123:106–119
11. Zhong Y, Zhao L, Liu Z, Xu Y, Li R (2010) Using a support vector machine method to predict the development indices of very high water cut oilfields. Pet Sci 7(3):379–384
12. Rammay MH, Abdulraheem A et al (2014) Automated history matching using combination of adaptive neuro fuzzy system (ANFIS) and differential evolution algorithm. In: SPE large scale computing and big data challenges in reservoir simulation conference and exhibition. Society of Petroleum Engineers
13. Knabe SP, Goel H, Al-Jasmi AK, Rebeschini J, Querales MM, Adnan FMd, Rivas F, Rodriguez JA, Carvajal GA, Villamizar M et al (2013) Building neural-network-based models using nodal and time-series analysis for short-term production forecasting. In: SPE middle east intelligent energy conference and exhibition. Society of Petroleum Engineers
14. Gupta S, Fuehrer F, Jeyachandra BC et al (2014) Production forecasting in unconventional resources using data mining and time series analysis. In: SPE/CSUR unconventional resources conference—Canada. Society of Petroleum Engineers
15. Harris C et al (2014) Potential pitfalls in exploration and production applications of machine learning. In: SPE western north American and rocky mountain joint meeting. Society of Petroleum Engineers
16. Al-Fattah SM, Startzman RA et al (2001) Predicting natural gas production using artificial neural network. In: SPE hydrocarbon economics and evaluation symposium. Society of Petroleum Engineers
17. Al-Fattah SM, Startzman RA et al (2001) Neural network approach predicts US natural gas production. In: SPE production and operations symposium. Society of Petroleum Engineers
18. Nakutnyy P, Asghari K, Torn A et al (2008) Analysis of waterflooding through application of neural networks. In: Canadian international petroleum conference. Petroleum Society of Canada
19. Tokar AS, Johnson PA (1999) Rainfall-runoff modeling using artificial neural networks. J Hydrol Eng 4(3):232–239
20. Suykens JAK, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9(3):293–300
21. Golub GH, Van Loan CF (2012) Matrix computations, vol 3. JHU Press, Baltimore
22. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
23. Drucker H, Burges CJC, Kaufman L, Smola A, Vapnik V (1997) Support vector regression machines. Adv Neural Inf Process Syst 9:155–161
24. Suykens JAK, Van Gestel T, Vandewalle J, De Moor B (2003) A support vector machine formulation to PCA analysis and its kernel version. IEEE Trans Neural Netw 14(2):447–450
25. Alzate C, Suykens JAK (2008) A regularized kernel CCA contrast function for ICA. Neural Netw 21(2):170–181
26. Xu J-H, Zhang X-G, Li Y-D (2004) Regularized kernel forms of minimum squared error methods. Acta Autom Sin 30(1):27–36
27. Schölkopf B, Smola A, Müller K-R (1997) Kernel principal component analysis. In: Artificial neural networks—ICANN'97. Springer, pp 583–588
28. Schölkopf B, Müller K-R (1999) Fisher discriminant analysis with kernels. Neural Netw Sig Process IX:1
29. Tipping M (2003) Relevance vector machine. US patent 6,633,857, 14 Oct 2003
30. Wang H, Hu D (2005) Comparison of SVM and LS-SVM for regression. In: International conference on neural networks and brain, 2005. ICNN&B'05, vol 1. IEEE, pp 279–283
31. De Brabanter K (2011) Least squares support vector regression with applications to large-scale data: a statistical approach. PhD thesis, Katholieke Universiteit Leuven
32. Hou L, Yang Q, An J (2009) An improved LSSVM regression algorithm. In: International conference on computational intelligence and natural computing, 2009. CINC'09, vol 2. IEEE, pp 138–140
33. Cao S-G, Liu Y-B, Wang Y-P (2008) A forecasting and forewarning model for methane hazard in working face of coal mine based on LS-SVM. J China Univ Min Technol 18(2):172–176
34. Suykens JAK, Vandewalle J (2000) Recurrent least squares support vector machines. IEEE Trans Circuits Syst I: Fundam Theory Appl 47(7):1109–1114
35. Xie J (2009) Time series prediction based on recurrent LS-SVM with mixed kernel. In: Asia-Pacific conference on information processing, 2009. APCIP 2009, vol 1. IEEE, pp 113–116
36. Qu H, Oussar Y, Dreyfus G, Xu W (2009) Regularized recurrent least squares support vector machines. In: International joint conference on bioinformatics, systems biology and intelligent computing, 2009. IJCBS'09. IEEE, pp 508–511
37. Smola AJ, Schölkopf B (2004) A tutorial on support vector regression. Stat Comput 14(3):199–222
38. Espinoza M, Suykens JAK, De Moor B (2005) Kernel based partially linear models and nonlinear identification. IEEE Trans Autom Control 50(10):1602–1606
39. Nield DA, Bejan A (2013) Mechanics of fluid flow through a porous medium. Springer, Berlin
40. Chakra NC, Song K-Y, Gupta MM, Saraf DN (2013) An innovative neural forecast of cumulative oil production from a petroleum reservoir employing higher-order neural networks (HONNs). J Pet Sci Eng 106:18–33
