
Journal of Petroleum Science and Engineering 64 (2009) 25–34


Forecasting PVT properties of crude oil systems based on support vector machines modeling scheme

Emad A. El-Sebakhy ⁎

Medical Artificial Intelligence (MEDai), Inc., an Elsevier Company, Millenia Park One, 4901 Vineland Road, Suite 450, Orlando, Florida 32811, United States

Article history: Received 2 January 2007; accepted 1 December 2008.

Keywords: Support Vector Machines Regression; feedforward neural networks; empirical correlations; PVT properties; Formation Volume Factor; bubble point pressure

Abstract

PVT properties are very important in reservoir engineering computations. There are numerous approaches for predicting various PVT properties, namely, empirical correlations and computational intelligence schemes. The achievements of neural networks opened the door for data mining modeling techniques to play a major role in the petroleum industry. Unfortunately, the developed neural network modeling schemes have many drawbacks and limitations, as they were originally developed for certain ranges of reservoir fluid characteristics. This article proposes support vector machines as a new intelligence framework for predicting the PVT properties of crude oil systems and for overcoming most of the drawbacks of existing neural networks. Both the modeling steps and the training algorithm are briefly illustrated. A comparative study is carried out to compare the performance of support vector machines regression with that of neural networks, nonlinear regression, and different empirical correlation techniques. Results show that the performance of support vector machines is accurate and reliable and outperforms most of the published correlations. This makes support vector machines modeling promising, and we recommend it for solving other oil and gas industry problems, such as permeability and porosity prediction, identification of liquid-holdup flow regimes, and other reservoir characterization tasks.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Reservoir fluid properties are very important in petroleum engineering computations, such as material balance calculations, well test analysis, reserve estimates, inflow performance calculations, and numerical reservoir simulation. The importance of accurate PVT data for material-balance calculations is well understood. Among these PVT properties are the bubble point pressure (Pb) and the oil formation volume factor (Bob), which is defined as the volume of reservoir oil that would produce one stock tank barrel. Precise prediction of Bob is very important in reservoir and production computations. It is crucial that all calculations in reservoir performance, in production operations and design, and in formation evaluation be as good as the PVT properties used in these calculations. The economics of the process also depend on the accuracy of such properties.

The currently available PVT simulators predict the physical properties of reservoir fluids with varying degrees of accuracy, depending on the type of model used, the nature of the fluid, and the prevailing conditions. Nevertheless, they all exhibit the significant drawback of lacking the ability to forecast the quality of their answers. Ideally, PVT properties are determined from laboratory studies on samples collected from the bottom of the wellbore or at the surface. Such experimental data are, however, very expensive to obtain. Therefore, the solution is to use equations of state, statistical regression, graphical techniques, and empirical correlations to predict these PVT properties.

The development of correlations for PVT calculations has been the subject of extensive research, resulting in a large volume of publications. Several graphical and mathematical correlations for determining both Pb and Bob have been proposed during the last decade. These correlations are essentially based on the assumption that Pb and Bob are strong functions of the solution gas–oil ratio (Rs), the reservoir temperature (Tf), the gas specific gravity (Gg), and the oil specific gravity (G0); see El-Sebakhy et al. (2007), Goda et al. (2003), and Osman et al. (2001) for more details. However, the equation of state involves numerous numerical computations that require knowing the detailed compositions of the reservoir fluids, which is expensive and time consuming. In addition, the success of such correlations in prediction is not reliable; it depends mainly on the range of data over which they were originally developed and on geographical areas with similar fluid compositions and API oil gravity. Furthermore, PVT correlations are based on easily measured field data, such as reservoir pressure, reservoir temperature, and oil and gas specific gravity.

Recently, artificial neural networks (ANNs) have been proposed for solving many problems in the oil and gas industry, including permeability and porosity prediction, identification of lithofacies types, seismic pattern recognition, prediction of PVT properties, estimation of pressure drop in pipes and wells, and optimization of well production. The most popular neural networks in both the machine learning and data mining communities are the feedforward neural network (FFN) and the multilayer perceptron (MLP); they have gained

⁎ Tel.: +1 321 281 4480; fax: +1 321 281 4499. E-mail address: eae6@cornell.edu.

doi:10.1016/j.petrol.2008.12.006
high popularity in oil and gas industry applications; see Goda et al. (2003), Osman et al. (2001), Al-Marhoun and Osman (2002), Osman and Abdel-Aal (2002), and Osman and Al-Marhoun (2005). However, these neural network modeling schemes suffer from numerous drawbacks, such as the limited ability to explicitly identify possible causal relationships and the time-consuming development of the back-propagation algorithm, which can lead to overfitting and to getting stuck at a local optimum of the cost function. In addition, the architectural parameters of an ANN have to be guessed in advance, such as the number and size of hidden layers and the type of transfer function(s) for neurons in the various layers. Moreover, the training algorithm parameters are determined from guessed initial random weights, learning rate, and momentum; see Castillo et al. (2002, 2006) for more details.

The main objective of this study is to investigate the capability of support vector machine regression (SVR) in modeling the PVT properties of crude oil systems and to solve some of the above neural network limitations. The considerable amount of user intervention not only slows down model development, but also works against the principle of 'letting the data speak'. In our research, we precisely investigate the capability of support vector machines regression in modeling both Pb and Bob based on the kernel function scheme, using worldwide published experimental PVT databases. Comparative studies are carried out to compare SVR performance with that of neural networks, nonlinear regression, and different empirical correlation techniques.

The potentiality prediction of enhanced oil recovery (EOR) is the basis of EOR potentiality analysis as well as the robust guarantee of the reliability of the analysis results. In the light of statistical learning theory, establishing an EOR predictive model substantially falls within the problem of function approximation. According to Vapnik's structural risk minimization principle, one should improve the generalization ability of the learning machine; that is, a small error on an effective training set should guarantee a small error on the corresponding independent testing set. The up-to-date results from recent studies on statistical learning theory are here first applied to EOR potentiality analysis. The improved back-propagation artificial neural network and the support vector machine are discussed. Comparative studies are carried out to compare SVR performance with the results of three distinct empirical correlations and a feedforward neural network modeling scheme. The results show that SVR outperforms most of the popular modeling schemes with reliable and efficient performance.

To demonstrate the usefulness of support vector machines regression as a new computational intelligence paradigm, the developed SVR calibration model was built using three distinct published PVT databases; see Osman et al. (2001) and Goda et al. (2003) for more details. The database of 782 observations (after deleting 21 redundant observations from the actual 803 observations in Osman et al. (2001) and Goda et al. (2003)), gathered from Malaysia, the Middle East, the Gulf of Mexico, and Colombia, is a very challenging data set with a wide domain and uncertain distributions. Therefore, we are able to build an SVR model for estimating both bubble point pressure and oil formation volume factor using different databases of four input parameters: solution gas–oil ratio (Rs), reservoir temperature (Tf), oil gravity (API), and gas relative density. As shown below, the SVR learning algorithm was found to be faster and more stable than other schemes reported in the petroleum engineering literature. In addition, the new support vector machines regression modeling scheme outperforms both the standard neural networks and the most common existing correlation models in terms of absolute average percent error, standard deviation, and correlation coefficient.

The rest of this paper is organized as follows. Section 2 provides a brief literature review of the most common empirical correlations and neural network modeling schemes for determining PVT relationships; the most common drawbacks of neural-network-based PVT correlation modeling are discussed there as well. The support vector machines regression (SVR) methodology and its training algorithm, with the most common architecture, are presented in Section 3. Section 4 provides both the data acquisition and the statistical quality measures. The experimental set-up is discussed in Section 5. Section 6 shows the performance of the approach through the experimental results.

2. Literature review

PVT properties, permeability and porosity, lithofacies types, and seismic patterns are important properties for the oil and gas industry. Predicting these properties in the laboratory is very expensive, and the accuracy of such predictions is critical and not often known in advance. In this section, we briefly summarize the most common empirical PVT correlations and the different computational intelligence schemes published for forecasting these PVT properties.

2.1. The most common empirical models and evaluation studies

Over the last six decades, engineers have realized the importance of developing and using empirical correlations for PVT properties. Studies carried out in this field resulted in the development of new correlations, such as Standing (1947, 1977), Katz (1942), Vasquez and Beggs (1980), Glaso (1980), and Al-Marhoun (1988, 1992). Glaso (1980) developed his empirical correlation for formation volume factor using 45 oil samples from North Sea hydrocarbon mixtures. Al-Marhoun (1988) published his empirical correlations for estimating bubble point pressure and oil formation volume factor for Middle East oils; he used 160 data sets from 69 Middle Eastern reservoirs to develop the correlation. Abdul-Majeed and Salman (1988) published an oil formation volume factor correlation based on 420 data sets, known as the Abdul-Majeed and Salman empirical correlation. Their model is similar to the Al-Marhoun (1988) oil formation volume factor correlation with newly calculated coefficients. Al-Marhoun (1992) published his second empirical correlation for oil formation volume factor based on a database of 11,728 experimentally obtained data points for formation volume factors at, above, and below the bubble point pressure. The data set represents samples from more than 700 reservoirs from all over the world, mostly from the Middle East and North America. The reader can consult other empirical correlations; see Al-Shammasi (1997) and El-Sebakhy et al. (2007). In this research, we concentrate only on the three most common empirical correlations, namely, Al-Marhoun (1992), Glaso (1980), and Standing (1947), for the sake of space.

Labedi (1990) presented correlations for oil formation volume factor for African crude oils; he used 97 data sets from Libya, 28 sets from Nigeria, and 4 sets from Angola to develop his correlations. Dokla and Osman (1992) published a set of correlations for estimating bubble point pressure and oil formation volume factor for UAE crudes. They used 51 data sets to calculate new coefficients for the Al-Marhoun Middle East models (Al-Marhoun, 1988). Al-Yousef and Al-Marhoun (1993) pointed out that the Dokla and Osman (1992, 1993) bubble point pressure correlation was found to contradict physical laws.

Macary and El-Batanoney (1992) presented correlations for bubble point pressure and oil formation volume factor.
They used 90 data sets from 30 independent reservoirs in the Gulf of Suez to develop the correlations. The new correlations were tested against other Egyptian data in Saleh et al. (1987) and showed improvement over published correlations. Omar and Todd (1993) presented an oil formation volume factor correlation, based on the Standing correlation model, using 93 observations from Malaysian oil reservoirs. Kartoatmodjo and Schmidt (1994) used a global data bank to develop new correlations for all PVT properties; data from 740 different crude oil samples gathered from all over the world provided 5392 data sets for the correlation development. Almehaideb (1997) published a new set of correlations for UAE crudes using 62 data sets from UAE reservoirs, covering both bubble point pressure and oil formation volume factor. His bubble point pressure correlation, like that of Omar and Todd (1993), uses the oil formation volume factor as input in addition to oil gravity, gas gravity, solution gas–oil ratio, and reservoir temperature. Sutton and Farshad (1990, 1991) published an evaluation for Gulf of Mexico crude oils using 285 data sets for gas-saturated oil and 134 data sets for undersaturated oil, representing 31 different crude oil and natural gas systems. The results show that the Glaso correlation for oil formation volume factor performs best for most of the data in the study. Petrosky and Farshad (1993) published a new correlation based on Gulf of Mexico crudes and reported that the best correlation for oil formation volume factor was the Al-Marhoun correlation. McCain et al. (1991) published an evaluation of all reservoir property correlations based on a large global database and recommended the Standing (1947) correlation for formation volume factor at and below the bubble point pressure for future use.

Ghetto et al. (1994) performed a comprehensive study of PVT property correlations based on 195 global data sets collected from the Mediterranean Basin, Africa, the Middle East, and the North Sea reservoirs. They recommended the Vasquez and Beggs correlation for the oil formation volume factor. On the other hand, Elsharkawy et al. (1994) evaluated PVT correlations for Kuwaiti crude oils using 44 samples; they concluded that the Standing correlation gave the best results for bubble point pressure, while the Al-Marhoun (1988) oil formation volume factor correlation performed satisfactorily. Mahmood and Al-Marhoun (1996) presented an evaluation of PVT correlations for Pakistani crude oils, using 166 data sets from 22 different crude samples. The Al-Marhoun (1992) oil formation volume factor correlation gave the best results. The bubble point pressure errors reported in that study, for all correlations, are among the highest reported in the literature. Furthermore, Hanafy et al. (1997) evaluated the most accurate correlations to apply to Egyptian crude oils; the formation volume factor correlation based on Macary and El-Batanoney (1992) showed an average absolute error of 4.9%, while Dokla and Osman (1992) showed 3.9%. The study therefore strongly supports the approach of developing a local correlation rather than a global one.

An evaluation of all available oil formation volume factor correlations was published in Al-Fattah and Al-Marhoun (1994), based on 674 data sets from the published literature; they determined that the Al-Marhoun (1992) correlation has the least error for a global data set. In addition, they performed trend tests to evaluate the models' physical behavior. Finally, Al-Shammasi (1997) evaluated the published correlations and neural network models for bubble point pressure and oil formation volume factor for accuracy and flexibility in representing hydrocarbon mixtures from different geographical locations worldwide. He presented a new correlation for bubble point pressure based on global data of 1661 published and 48 unpublished data sets. He also presented neural network models and compared their performance to numerical correlations, concluding that statistical and trend performance analysis showed that some of the correlations violate the physical behavior of hydrocarbon fluid properties.

2.2. Forecasting PVT properties based on artificial neural networks

Artificial neural networks (ANNs) are parallel distributed information processing models that can recognize highly complex patterns within available data. In recent years, neural networks have gained popularity in petroleum applications. Many authors have discussed the applications of neural networks in petroleum engineering, such as Ali (1994), Elsharkawy (1998), Gharbi and Elsharkawy (1997a,b), Kumoluyi and Daltaban (1994), Mohaghegh and Ameri (1994), Mohaghegh (1995), Mohaghegh (2000), and Varotsis et al. (1999). The most widely used neural network in the literature is the feedforward neural network with the backpropagation training algorithm; see Ali (1994), Duda et al. (2001), and Osman et al. (2001). This type of neural network is an excellent computational intelligence modeling scheme for both prediction and classification tasks. Few studies have been carried out to model PVT properties using neural networks. Recently, the feedforward neural network has served the petroleum industry in predicting PVT correlations; see the work of Gharbi and Elsharkawy (1997a,b) and Osman et al. (2001).

Al-Shammasi (1997, 2001) presented neural network models and compared their performance to numerical correlations. He concluded that statistical and trend performance analysis showed that some of the correlations violate the physical behavior of hydrocarbon fluid properties. In addition, he pointed out that the published neural network models missed major model parameters needed to reproduce them. He used a two-hidden-layer (4-5-3-1) neural network structure for predicting both properties, bubble point pressure and oil formation volume factor, and evaluated published correlations and neural network models for their accuracy and flexibility in representing hydrocarbon mixtures from different locations worldwide.

The authors of Gharbi and Elsharkawy (1997a,b) and Osman et al. (2001) carried out comparative studies between feedforward neural network performance and four empirical correlations: Standing, Al-Marhoun, Glaso, and Vasquez and Beggs; see Al-Marhoun (1988), El-Sebakhy et al. (2007), and Osman et al. (2001) for more details about the performance and the results of these comparative studies. In 1996, the authors in Gharbi and Elsharkawy (1997a,b) published neural network models, based on a neural system with a log-sigmoid activation function, for estimating bubble point pressure and oil formation volume factor for Middle East crude oil reservoirs, while in Gharbi and Elsharkawy (1997a,b) they developed a universal neural network for predicting PVT properties for any oil reservoir. In Gharbi and Elsharkawy (1997a,b), two neural networks were trained separately to estimate the bubble point pressure and the oil formation volume factor, respectively. The input data were the solution gas–oil ratio, reservoir temperature, oil gravity, and gas relative density. They used two-hidden-layer (2HL) neural networks: the first, (4-8-4-2), to predict the bubble point pressure, and the second, (4-6-6-2), to predict the oil formation volume factor. Both neural networks were built using a data set of 520 observations from the Middle East area, divided into a training set of 498 observations and a testing set of 22 observations.

The authors in Gharbi and Elsharkawy (1997a,b) followed the same criterion as Gharbi and Elsharkawy (1997a,b), but on a larger scale covering additional areas, North and South America, the North Sea, and South East Asia, together with the Middle East region. They developed a one-hidden-layer neural network using a database of size 5434, representing around 350 different crude oil systems. This database was divided into a training set with 5200 observations and a testing set with the other 234 observations. The results of their comparative studies showed that the FFN outperforms the conventional empirical correlation schemes in the prediction of PVT properties, with a reduction in the average absolute error and an increase in the correlation coefficients. The reader can consult Al-Shammasi (1997) and El-Sebakhy et al. (2007) for further uses of other types of neural networks in predicting PVT properties, for instance, radial basis functions and abductive networks.
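The small two-hidden-layer topologies described above can be sketched with an off-the-shelf feedforward regressor. The snippet below is an illustrative sketch only: it is trained on synthetic stand-in data rather than the measured PVT databases cited here, the input ranges are assumptions, and the well-known Standing (1947) bubble-point expression is used solely to synthesize a plausible target trend.

```python
# Illustrative sketch only: a small two-hidden-layer feedforward regressor of the
# kind described above, trained on SYNTHETIC stand-in data (the cited studies used
# measured PVT datasets, which are not reproduced here).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
Rs = rng.uniform(20, 2000, n)   # solution gas-oil ratio, scf/STB (assumed range)
Tf = rng.uniform(100, 300, n)   # reservoir temperature, deg F (assumed range)
API = rng.uniform(15, 45, n)    # oil gravity, deg API (assumed range)
Gg = rng.uniform(0.6, 1.2, n)   # gas specific gravity (assumed range)
X = np.column_stack([Rs, Tf, API, Gg])

# Synthetic bubble-point-pressure target generated from the Standing (1947)
# correlation, used here only to create a plausible nonlinear trend:
# Pb = 18.2 * ((Rs/Gg)^0.83 * 10^(0.00091*Tf - 0.0125*API) - 1.4)
y = 18.2 * ((Rs / Gg) ** 0.83 * 10.0 ** (0.00091 * Tf - 0.0125 * API) - 1.4)

# Two hidden layers with log-sigmoid activation, echoing the (4-8-4-...) style
# topologies discussed in the text (here with a single output node).
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(8, 4), activation="logistic",
                 solver="lbfgs", max_iter=5000, random_state=0),
)
model.fit(X, y)
print("training R^2:", model.score(X, y))
```

As in the studies reviewed above, scaling the four inputs before training matters in practice, since Rs and Gg differ by several orders of magnitude.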
2.3. The most popular drawbacks of the neural network modeling scheme

Experience with neural networks has revealed a number of limitations of the technique. One such limitation is the complexity of the design space, Omar and Todd (1993). With no analytical guidance on the choice of many design parameters, the developer often follows an ad hoc, trial-and-error approach of manual exploration that naturally focuses on just a small region of the potential search space. Architectural parameters that have to be guessed a priori include the number and size of hidden layers and the type of transfer function(s) for neurons in the various layers. Learning algorithm parameters to be determined include initial weights, learning rate, and momentum. Although acceptable results may be obtained with effort, it is obvious that potentially superior models can be overlooked. The considerable amount of user intervention not only slows down model development, but also works against the principle of 'letting the data speak'. To automate the design process, external optimization criteria, e.g. in the form of genetic algorithms, have been proposed, Petrosky and Farshad (1993). Over-fitting, or poor network generalization with new data during actual use, is another problem, Kartoatmodjo and Schmidt (1994). As training continues, fitting of the training data improves, but performance of the network on new data previously unseen during training may deteriorate due to over-learning. A separate part of the training data is often reserved for monitoring such performance in order to determine when training should be stopped, prior to complete convergence. However, this reduces the effective amount of data used for actual training and would be disadvantageous in many situations where good training data are scarce. Network pruning algorithms have also been described to improve generalization, Almehaideb (1997). The commonly used back-propagation training algorithm, with its gradient descent approach to minimizing the error during training, suffers from the local minima problem, which may prevent the synthesis of an optimum model. Another problem is the opacity, or black-box nature, of neural network models. The associated lack of explanation capabilities is a handicap in many decision support applications, such as medical diagnostics, where the user would usually like to know how the model came to a certain conclusion. Additional analysis is required to derive explanation facilities from neural network models, e.g. through rule extraction (Sutton and Farshad, 1991). Model parameters are buried in large weight matrices, making it difficult to gain insight into the modeled phenomenon or to compare the model with available empirical or theoretical models. Information on the relative importance of the various inputs to the model is not readily available, which hampers efforts at model reduction by discarding less significant inputs. Additional processing, using techniques such as principal component analysis, may be required for this purpose, McCain (1991).

In this research, we propose support vector machines regression to overcome these neural network shortcomings, and we then forecast PVT properties based on this new modeling scheme. The support vector machine modeling scheme is a new computational intelligence paradigm that is based on statistical learning theory and the structural risk minimization principle. Based on this principle, support vector machines achieve an optimum network structure by striking the right balance between the empirical error and the Vapnik–Chervonenkis (VC) confidence interval, which makes them unlikely to get trapped in local minima.

3. Support vector machines regression modeling scheme

Support vector machines regression is one of the most successful and effective algorithms in both the machine learning and data mining communities. It has been widely used as a robust tool for classification and regression; see Vapnik (1995), Cortes and Vapnik (1995), El-Sebakhy (2004), El-Sebakhy et al. (2007), and Osuna et al. (1997). It has been found to be very robust in many applications, for example in the fields of optical character recognition, text categorization, and face detection in images, Joachims (1997). The high generalization ability of SVR is ensured by special properties of the optimal hyperplane, which maximizes the distance to the training examples in a high-dimensional feature space.

3.1. Background and overview

In recent years, there has been plenty of research on support vector machines; overviews can be found in Vapnik (1995), Burges (1998), Schölkopf and Smola (2002), and Kobayashi and Komaki (2006). The support vector machine (SVM) modeling scheme is a novel learning machine based on statistical learning theory, which adheres to the principle of structural risk minimization, seeking to minimize an upper bound of the generalization error rather than the training error. This induction principle is based on bounding the generalization error by the sum of the training error and a confidence interval term depending on the VC dimension. Based on this principle, SVM achieves an optimum network structure by striking the right balance between the empirical error and the VC confidence interval, a balance that eventually leads to better generalization performance than other neural network models, Peng et al. (2004).

Originally, SVM was developed to solve pattern recognition problems. However, with the introduction of Vapnik's ε-insensitive loss function, SVM has been extended to solve nonlinear regression estimation problems through the technique known as support vector regression (SVR), which has been shown to exhibit excellent performance. The performance of SVR depends on predefined parameters (so-called hyper-parameters); therefore, to construct a good SVR forecasting model, these parameters must be set carefully. SVR has recently emerged as an alternative and powerful technique for predicting complex nonlinear relationships, and it has achieved great success in both academic and industrial settings due to its many attractive features and promising generalization performance.

3.2. The architecture of support vector machines regression

A regression version of SVM has emerged as an alternative and powerful technique for solving regression problems by introducing an alternative loss function. In this subsection, a brief description of SVR is given; detailed descriptions can be found in Vapnik (1998) and El-Sebakhy (2004). Generally, the SVR formulation follows the principle of structural risk minimization, seeking to minimize an upper bound of the generalization error rather than the prediction error on the training set. This feature gives SVR a greater potential to generalize the input–output relationship learnt during its training phase and to make good predictions for new input data. SVR maps the input data x into a high-dimensional feature space F by a nonlinear mapping and solves a linear regression problem in this feature space, as shown in Fig. 1.

The regression approximation estimates a function from given data, G = {(x_i, y_i): x_i ∈ R^p, i = 1, …, n}, where x_i denotes the input vector, y_i denotes the output (target) value, and n denotes the total number of data patterns. The modeling aim is to build a decision function, ŷ = f(x), that accurately predicts the outputs {y_i} corresponding to a new set of input–output examples, {(x_i, y_i)}. In mathematical notation, the linear approximation function (in the feature space) is

f(x) = ω^T φ(x) + b,   φ : R^p → F,   ω ∈ F,   (1)

where ω and b are coefficients and φ(x) denotes the high-dimensional feature space, which is nonlinearly mapped from the input space x. The linear relationship in the high-dimensional feature space therefore corresponds to a nonlinear relationship in the low-dimensional input space, without the inner product between ω and φ(x) having to be computed in the high-dimensional feature space. Correspondingly, the
E.A. El-Sebakhy / Journal of Petroleum Science and Engineering 64 (2009) 25–34 29

Correspondingly, the original optimization problem involving nonlinear regression is transformed into finding the flattest function in the feature space F, not in the input space x. The unknown parameters ω and b in Eq. (1) are estimated from the training set, G.

Fig. 1. Mapping input space x into high-dimensional feature space (Peng et al., 2004).

SVR performs linear regression in the high-dimensional feature space using the ε-insensitive loss. At the same time, to prevent over-fitting and thereby improve the generalization capability, a regularized functional involving the sum of the empirical risk and a complexity term ‖ω‖²/2 is minimized. The coefficients ω and b can thus be estimated by minimizing the regularized risk function

R_SVR(C) = R_emp + ‖ω‖²/2 = (C/n) Σ_{i=1}^{n} L_ε(y_i, ŷ_i) + ‖ω‖²/2,      (2)

where R_SVR and R_emp represent the regression and empirical risks, respectively; ‖ω‖ denotes the Euclidean norm; and C is a cost constant weighting the empirical risk. In the regularized risk function given by Eq. (2), the regression risk (test set error) R_SVR is the possible error committed by the function f in predicting the output corresponding to a test input vector. The ε-insensitive loss function is

L_ε(y, ŷ) = |y − ŷ| − ε   if |y − ŷ| ≥ ε,
L_ε(y, ŷ) = 0             otherwise.      (3)

In Eq. (2), the first term (C/n) Σ_{i=1}^{n} L_ε(y_i, ŷ_i) denotes the empirical error (the "training set error"), which is estimated by the ε-insensitive loss function in Eq. (3). The loss function is introduced so that the decision function in Eq. (1) can be obtained from fewer data points. The second term, ‖ω‖²/2, is the regularization term. The regularization constant C sets the penalty when an error occurs, by determining the trade-off between the empirical risk and the regularization term, which governs the prediction ability of the regression. Raising the value of C increases the significance of the empirical risk relative to the regularization term. A penalty is incurred only if the fitting error is larger than ε. The ε-insensitive loss function is employed to stabilize the estimation; in other words, it can reduce the noise. Thus, ε can be viewed as the size of a tube equivalent to the approximation accuracy in the training data, as shown in Fig. 2. In the empirical analysis, C and ε are parameters selected by the user.

To estimate ω and b, we introduce the positive slack variables ξi and ξi⁎; according to Fig. 2, the sizes of the excess positive and negative deviations are represented by ξi and ξi⁎, respectively. The slack variables assume non-zero values outside the [−ε, ε] region. The SVR fits f(x) to the data such that (i) the training error is minimized by minimizing ξi and ξi⁎, and (ii) ‖ω‖²/2 is minimized to raise the flatness of f(x), or equivalently to penalize excessively complex fitting functions. Thus, SVR is formulated as the minimization of the following functional:

Minimize  R_SVR(ω, ξ, ξ⁎) = ‖ω‖²/2 + C Σ_{i=1}^{n} (ξ_i + ξ_i⁎)      (4)
subject to  y_i − ω^T φ(x_i) − b ≤ ε + ξ_i,
            ω^T φ(x_i) + b − y_i ≤ ε + ξ_i⁎,
            ξ_i, ξ_i⁎ ≥ 0,

where ξi and ξi⁎ denote slack variables that measure the error on the up and down sides, respectively. The above formulae indicate that increasing ε decreases the corresponding ξi and ξi⁎ for the same constructed function f(x), thereby reducing the error attributed to the corresponding data points. Finally, by introducing Lagrange multipliers and exploiting the optimality constraints, the decision function given by Eq. (1) takes the following explicit form (Vapnik, 1995, 1998):

f(x; α_i, α_i⁎) = Σ_{i=1}^{n} (α_i − α_i⁎) K(x, x_i) + b,      (5)

where the parameters αi and αi⁎ in Eq. (5) are the Lagrange multipliers, which satisfy αi·αi⁎ = 0, αi ≥ 0, and αi⁎ ≥ 0 for i = 1, 2, …, n; see Vapnik (1998), Cristianini and Shawe-Taylor (2000), Peng et al. (2004), and El-Sebakhy et al. (2007). The term K(xi, xj) in Eq. (5) is the kernel function, whose value equals the inner product of the two vectors xi and xj in the feature space, i.e., K(xi, xj) = φ(xi)·φ(xj). The kernel function is intended to handle a feature space of any dimension without the need to compute φ(x) explicitly. Any function that satisfies Mercer's condition can be employed as a kernel function; see Cortes and Vapnik (1995) and Vapnik (1998). Typical examples of kernel functions are the polynomial kernel, K(x, y) = [x·y + 1]^d, and the Gaussian kernel, K(x, y) = exp[−(x − y)²/2σ²]. In these equations, d represents the degree of the polynomial kernel and σ² represents the bandwidth of the Gaussian kernel. These parameters must be selected carefully, since they determine the structure of the high-dimensional feature space and govern the complexity of the final solution.

Fig. 2. Soft margin loss setting for a linear SVR (Schölkopf and Smola, 2002).

4. Data acquisition and statistical quality measures

4.1. The acquired data

The implementation studies were carried out on three databases drawn from three distinct published research articles.

1. The first database was drawn from the article of Al-Marhoun (1988). It contains 160 observations drawn from 69 Middle Eastern reservoirs; that article published correlations for estimating bubble point pressure and oil formation volume factor for Middle East oils.

2. The second database was drawn from the articles of Al-Marhoun and Osman (2002), Osman and Abdel-Aal (2002), and Osman and Al-Marhoun (2005). It contains 283 data points collected from different Saudi fields for predicting the bubble point pressure and the oil formation volume factor at the bubble-point pressure for Saudi crude oils. The models were based on neural networks: 142 observations were used as a training set to build the FFN calibration model for predicting the bubble point pressure and the oil formation volume factor, 71 to cross-validate the relationships established during the training process and adjust the calculated weights, and the remaining 70 to test the model and evaluate its accuracy. The results showed that the developed Bob model provides better predictions and higher accuracy than the published empirical correlations.

3. The third database was obtained from the works of Goda et al. (2003) and Osman et al. (2001), where the authors used a feedforward learning scheme with a log-sigmoid transfer function to estimate the formation volume factor at the bubble point pressure. This database contains 782 observations after deleting 21 redundant observations from the actual 803 data points; the data were gathered from Malaysia, the Middle East, the Gulf of Mexico, and Colombia. The authors in Goda et al. (2003) and Osman et al. (2001) designed a one-hidden-layer (1HL) feedforward neural network (4-5-1) with the backpropagation learning algorithm: four input neurons covering gas–oil ratio, API oil gravity, relative gas density, and reservoir temperature; one hidden layer with five neurons; and a single neuron for the formation volume factor in the output layer.

To evaluate the performance of the SVR, the FFN, and the three empirical correlations on the three distinct databases defined above, each database is divided using the stratified criterion: we use 70% of the data for building the SVR learning model (internal validation) and 30% of the data for testing/validation (external validation, or the cross-validation criterion). We repeat both the internal and external validation processes 1000 times; the data are thereby divided into two or three groups for training and cross-validation checks.

In this study, of the 382 data points, 267 were used to build the calibration model, and the remaining 115 to cross-validate the relationships established during the training and testing process and to evaluate its accuracy and trend stability. For the testing data, the statistical summaries of the investigated quality measures for the SVR modeling scheme, the neural networks, and the most popular empirical correlations (Standing, Al-Marhoun, and Glaso) on the distinct data sets are shown in Tables 1–6; they cover the prediction performance for both bubble point pressure and oil formation volume factor.

Table 1
Testing results (Osman et al., 2001 and El-Sebakhy et al., 2007 data): statistical quality measures when estimating Bo

Correlation                          Er        EA       Emin     Emax      R2
Standing (1947)                      -0.170    2.724    0.008    20.180    0.974
Glaso (1980)                          1.8186   3.374    0.003    17.776    0.972
Al-Marhoun (1992)                    -0.115    2.205    0.003    13.179    0.981
ANN System                            0.3024   1.789    0.008    11.775    0.980
Support Vector Machine Regression     0.18     1.37     0.002     7.751    0.984

Table 2
Testing results (Osman et al., 2001 and El-Sebakhy et al., 2007 data): statistical quality measures when estimating Pb

Correlation                          Er        EA       Emin     Emax      R2
Standing (1947)                      67.60     67.73    0.1620   102.08    0.867
Glaso (1980)                         -1.616    18.52    0.1056   138.96    0.945
Al-Marhoun (1992)                     8.008    20.01    0.0254   109.12    0.906
ANN System                            8.129    21.02    0.0182   145.29    0.943
Support Vector Machine Regression     7.547    15.12    0.009    104.73    0.963

Table 3
Testing results (Al-Marhoun and Osman, 2002 and Osman and Abdel-Aal, 2002 dataset): statistical quality measures when estimating Bo

Correlation                          Er        EA       Emin     Emax     SD       R2
Standing (1947)                      -1.054    1.6833   0.066    7.7997   2.1021   0.9947
Glaso (1980)                          0.4538   1.7865   0.0062   7.3839   2.1662   0.9920
Al-Marhoun (1992)                    -0.392    0.8451   0.0003   3.5546   1.1029   0.9972
ANN System                            0.217    0.5116   0.0061   2.6001   0.6626   0.9977
Support Vector Machine Regression     0.006    0.353    0.0013   2.5835   0.4743   0.997

Table 4
Testing results (Al-Marhoun and Osman, 2002 and Osman and Abdel-Aal, 2002 dataset): statistical quality measures when estimating Pb

Correlation                          Er        EA        Emin     Emax      SD       R2
Standing (1947)                      -8.441    10.4562   0.2733   47.0213   11.841   0.8974
Glaso (1980)                         -18.48    20.7569   2.0345   63.7634   16.160   0.9837
Al-Marhoun (1992)                     0.941     8.1028   0.0935   38.085    11.41    0.9905
ANN System                           -0.222     5.8915   0.2037   38.1225    8.678   0.9930
Support Vector Machine Regression    -0.326     3.260    0.0016   25.447     4.978   0.995

During the implementation, the user should check the input values to make sure that they fall in a natural domain. This step, called quality control, is important for obtaining accurate and reliable results. The following are the most common domains for the input and output variables (gas–oil ratio, API oil gravity, relative gas density, reservoir temperature, bubble point pressure, and oil formation volume factor) used in the input and output layers of the modeling schemes for PVT analysis:

• Gas–oil ratio: between 26 and 1602 scf/stb.
• Oil formation volume factor: from 1.032 to 1.997 bbl/stb.
• Bubble point pressure: from 130 to 3573 psia.
• Reservoir temperature: from 74 °F to 240 °F.
• API gravity: between 19.4 and 44.6.
• Gas relative density: from 0.744 to 1.367.

4.2. Judgment and quality measures

Immediately after the learning process, we judge and evaluate the quality and capability of the fitted model (the validation step). To achieve this, we compute numerous quality measures, such as the correlation coefficient (r) between the actual and predicted output, the root mean-squared error (Erms), the average percent relative error (Er), the average absolute percent relative error (Ea), the minimum absolute percent error (Emin), the maximum absolute percent error (Emax), the standard deviation (SD), and the execution time. The best model is the one with the highest correlation coefficient and the smallest root mean-squared error.

The performance of the support vector machines modeling scheme is compared against both the neural networks and the most common published empirical correlations, namely the Standing (1947), Al-Marhoun (1992), and Glaso (1980) correlations, using the three distinct data sets of Al-Marhoun and Osman (2002), Al-Marhoun (1992), Osman et al. (2001), and Osman and Abdel-Aal (2002). The implementations were repeated 1000 times using the cross-validation criterion (internal and external validation). We obtained excellent results that show the capability of the new SVR modeling scheme, but for the sake of simplicity we record only the plots needed to give the reader a complete picture of both the efficiency and the reliability of the SVR modeling scheme.
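The quality measures of Section 4.2 have simple closed forms. The helper below is a hedged sketch (the function name and the sign convention for Er are my own assumptions; the paper refers to Duda et al. (2001) and Osman et al. (2001) for the exact equations), assuming NumPy is available.

```python
import numpy as np

def quality_measures(y_true, y_pred):
    """Er, Ea, Emin, Emax (percent relative errors) and the correlation
    coefficient r between measured and predicted values. The sign
    convention for Er (measured minus predicted) is an assumption."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rel = 100.0 * (y_true - y_pred) / y_true  # percent relative errors Ei
    return {
        "Er":   rel.mean(),          # average percent relative error
        "Ea":   np.abs(rel).mean(),  # average absolute percent relative error
        "Emin": np.abs(rel).min(),   # minimum absolute percent error
        "Emax": np.abs(rel).max(),   # maximum absolute percent error
        "r":    np.corrcoef(y_true, y_pred)[0, 1],
    }

# Tiny example with made-up Bo values (bbl/stb), for illustration only:
m = quality_measures([1.20, 1.35, 1.50], [1.18, 1.36, 1.47])
```

A model scoring well under this scheme has small Ea and r close to 1, which is exactly the selection rule applied in the comparative study below.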

Table 5
Testing results (Osman et al., 2001 and Goda et al., 2003 dataset): statistical quality measures when estimating Bo

Correlation                          Er        EA       Emin     Emax      R2
Standing (1947)                      -2.628    2.7202   0.0167   13.2922   0.9953
Glaso (1980)                         -0.5529   0.9821   0.0086    6.5123   0.9959
Al-Marhoun (1992)                    -0.4514   2.0084   0.0322   11.0755   0.9935
ANN System                            0.3251   1.4592   0.0083    5.3495   0.9968
Support Vector Machine Regression     0.065    0.386    0.007     2.502    0.9971

Table 6
Testing results (Osman et al., 2001 and Goda et al., 2003 dataset): statistical quality measures when estimating Pb

Correlation                          Er        EA        Emin      Emax      R2
Standing (1947)                      12.811    24.684    0.62334   59.038    0.8657
Glaso (1980)                         -18.887   26.551    0.28067   98.78     0.9675
Al-Marhoun (1992)                     5.1023    8.9416   0.13115   87.989    0.9701
ANN System                            4.9205    6.7495   0.16115   65.3839   0.9765
Support Vector Machine Regression     3.426     4.0757   0.128     53.219    0.9808

4.3. Statistical quality measures

To compare the performance and accuracy of the new model with the other empirical correlations, statistical error analysis is performed. The statistical parameters used for comparison are the average percent relative error (Er), average absolute percent relative error (Ea), minimum and maximum absolute percent errors (Emin and Emax), root mean square error (Erms), standard deviation (SD), and correlation coefficient (R2); see Duda et al. (2001) and Osman et al. (2001) for the corresponding equations. To demonstrate the usefulness of the SVR modeling scheme, the developed calibration models based on (i) the database with 160 observations, (ii) the database with 283 observations, and (iii) the worldwide database with 782 observations published in Goda et al. (2003) and Osman et al. (2001) are used to predict both Pb and Bob.

The results show that the support vector machines regression algorithm is both reliable and efficient. In addition, its performance surpasses that of the most popular empirical correlation schemes and the standard feedforward neural networks in terms of root mean squared error, absolute average percent error, standard deviation, and correlation coefficient.

Fig. 3. Average absolute relative error versus correlation coefficient for all schemes and empirical correlations for predicting Bob, based on Osman et al. (2001).

Fig. 4. Average absolute relative error versus correlation coefficient for all schemes and empirical correlations for predicting Pb, based on Osman et al. (2001).

5. Empirical study

We performed a quality-control check on all of these data sets and cleaned them of redundant data and unusable observations. To evaluate the performance of each modeling scheme, each database is divided using the stratified criterion: 70% of the data for building the SVR learning model (internal validation) and 30% for testing/validation (external validation, or the cross-validation criterion). Both the internal and external validation processes are repeated 1000 times, with the data divided into two or three groups for training and cross-validation checks. Of the 782 data points, 382 were used to train the neural network models, the remaining 200 to cross-validate the relationships established during the training process, and 200 to test the model and evaluate its accuracy and trend stability. For the testing data, statistical summaries were produced to investigate the different quality measures for the SVR modeling scheme, the feedforward neural networks, and the most popular empirical correlations in the literature for predicting both bubble point pressure and oil formation volume factor.

Generally, after training, the SVR calibration model is ready for testing and evaluation using the cross-validation criterion. Comparative studies were carried out to compare the performance and accuracy of the new SVR model with both the standard neural networks and the three common published empirical correlations, namely the Standing, Al-Marhoun, and Glaso correlations.

5.1. Parameters initialization

In this study, we follow the same procedures as in Al-Marhoun and Osman (2002), Osman et al. (2001), and Osman and Abdel-Aal (2002): a single- or two-hidden-layer feedforward neural network trained with the backpropagation (BP) learning algorithm and both linear and sigmoid activation functions. The initial weights were generated randomly, and training was run for 1000 epochs or to a 0.001 goal error with a 0.01 learning rate. Each layer contains neurons that are connected to all neurons in the neighboring layers. The connections have numerical values (weights) associated with them, which are adjusted during the training phase; training is complete when the network is able to predict the given output. For the two models, the first layer consists of four neurons representing the input values of reservoir temperature, solution gas–oil ratio, gas specific gravity, and API oil gravity. The second (hidden) layer consists of seven neurons for the Pb model and eight neurons for the Bob model. The third layer contains one neuron representing the output value of either Pb or Bob. Simplified schematics of the neural networks used for the Pb and Bob models are illustrated in Al-Marhoun and Osman (2002) and Osman and Abdel-Aal (2002). This setup makes it possible to monitor the generalization performance of the network and prevent it from over-fitting the training data, by repeating the computations 1000 times and taking the average over all runs.

The implementation process starts by feeding the support vector machine modeling scheme the available input data, one observation at a time; the learning process is then developed from the available input data. We observe that this cross-validation criterion makes it possible to monitor the generalization performance of the support vector machine regression scheme and prevent the kernel network from over-fitting the training data.
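Under stated assumptions (synthetic stand-in data, and scikit-learn's `SVR` in place of the author's toolbox, so the `lambda` and `verbose` settings of Section 5.1 have no direct counterpart here), the polynomial-kernel settings of Section 5.1 and the repeated 70/30 split protocol can be sketched as follows.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Synthetic stand-in for a four-input PVT data set (hypothetical values).
X = rng.uniform(size=(200, 4))
y = 1.0 + X[:, 0] - 0.5 * X[:, 1] ** 2 + 0.01 * rng.normal(size=200)

# Section 5.1 settings: degree-5 polynomial kernel, epsilon = 0.01, and
# C in {1, 10}; coef0=1 matches K(x, y) = [x.y + 1]^d from Section 3.
# 20 repeated 70/30 splits stand in for the paper's 1000 runs.
scores = {1.0: [], 10.0: []}
for rep in range(20):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              random_state=rep)
    for C in scores:
        model = SVR(kernel="poly", degree=5, coef0=1.0, C=C, epsilon=0.01)
        model.fit(X_tr, y_tr)
        scores[C].append(model.score(X_te, y_te))  # external-validation R^2

# Average the external-validation score over the repeated random splits.
avg = {C: float(np.mean(s)) for C, s in scores.items()}
print(avg)
```

Averaging over repeated splits, as the paper does, smooths out the sensitivity of the test score to any single random partition.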

Fig. 5. Average cross plot of the SVR modeling scheme for Bob, based on Osman et al. (2001).

Fig. 6. Average cross plot of the SVR modeling scheme for Pb, based on Osman et al. (2001).

Fig. 7. Cross plot of the SVR modeling scheme for Bo, based on the data of Al-Marhoun and Osman (2002) and Osman and Abdel-Aal (2002).

Fig. 8. Cross plot of the SVR modeling scheme for Pb, based on the data of Al-Marhoun and Osman (2002) and Osman and Abdel-Aal (2002).

In this implementation process, we used three distinct kernel functions, namely the polynomial, sigmoid, and Gaussian bell kernels. In designing the support vector machine regression, the important parameters that control its overall performance were initialized as follows: kernel = 'poly'; kernel opt. = 5; epsilon = 0.01; lambda = 0.0000001; verbose = 0; and the constant C set to either 1 or 10 for simplicity. The cross-validation method used in this study serves as a checking mechanism in the training algorithm to prevent both over-fitting and excess complexity, based on a root-mean-squared error threshold. The resulting weights for the Bob and Pb models are given below in different tables and graphs. Moreover, the relative importance of each input property is identified during the training process and given for the Bob and Pb models, as shown below.

5.2. Discussion and comparative studies

One can investigate other common empirical correlations besides those chosen here; see El-Sebakhy et al. (2007) and Osman et al. (2001) for more details about the mathematical formulas of these empirical correlations. The results of the comparisons on the testing data (the external validation check) are summarized in Tables 1–6. We observe from these results that the support vector machines modeling scheme outperforms both the neural networks with the backpropagation training algorithm and the most popular empirical correlations. The proposed model shows high accuracy in predicting both Pb and Bob values with a stable performance, and achieves the lowest absolute percent relative error, lowest minimum error, lowest maximum error, lowest root mean square error, and the highest correlation coefficient among the compared correlations for all three distinct data sets.

We draw a scatter plot of the average absolute percent relative error (EA) versus the correlation coefficient for all computational intelligence forecasting schemes and the most common empirical correlations. Each modeling scheme is represented by a symbol; a good forecasting scheme should appear in the upper left corner of the graph. Fig. 3 shows such a scatter plot of EA versus R2 (or r) for all modeling schemes used to determine Bob, based on the data set used in Osman et al. (2001). We observe that the symbol corresponding to the SVR scheme falls in the upper left corner with EA = 1.368% and r = 0.9884, while the neural network lies below the SVR with EA = 1.7886% and r = 0.9878, and the other correlations show higher error values with lower correlation coefficients; for instance, Al-Marhoun (1992) has EA = 2.2053% and r = 0.9806; Standing (1947) has EA = 2.7238% and r = 0.9742; and the Glaso (1980) correlation has EA = 3.3743% and r = 0.9715.
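The "upper left corner" criterion just described amounts to preferring low EA first and high r second. With the Bob figures quoted above, it can be expressed as a one-line sort (the dictionary below simply restates those published values; this is my illustration, not the paper's code).

```python
# Ranking schemes by the "upper left corner" criterion of Section 5.2:
# lowest average absolute percent error EA first, then highest r.
# Values are the Bob results quoted in the text (Osman et al., 2001 data).
schemes = {
    "SVR":               (1.368,  0.9884),
    "ANN":               (1.7886, 0.9878),
    "Al-Marhoun (1992)": (2.2053, 0.9806),
    "Standing (1947)":   (2.7238, 0.9742),
    "Glaso (1980)":      (3.3743, 0.9715),
}

ranked = sorted(schemes, key=lambda s: (schemes[s][0], -schemes[s][1]))
print(ranked[0])  # the scheme nearest the upper left corner
```

The sort reproduces the ordering visible in Fig. 3, with SVR first and the Glaso correlation last.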

Fig. 4 shows the same type of graph, but for Pb, based on the data set used in Osman et al. (2001) with the same modeling schemes. Again, the symbol corresponding to the SVR scheme falls in the upper left corner, the neural network lies just below it, and the other correlations show higher error values with lower correlation coefficients.

The same implementation process was repeated for the other data sets used in Al-Marhoun (1988, 1992), Al-Marhoun and Osman (2002), and Osman and Abdel-Aal (2002); for the sake of simplicity, we do not include those results here.

Figs. 5–10 illustrate six cross plots of the predicted results versus the experimental data for both Pb and Bob values using the same three distinct data sets. These cross plots indicate the degree of agreement between the experimental and the predicted values, reflecting the high-quality performance of the SVR modeling scheme. The reader can compare these patterns with the corresponding ones for the published neural network models and the most common empirical correlations in Al-Marhoun (1992), Al-Marhoun and Osman (2002), Osman and Abdel-Aal (2002), and Osman et al. (2001). Finally, we conclude that the developed SVR modeling scheme has better and more reliable performance than most published modeling schemes and empirical correlations.

The bottom line is that the developed SVR modeling scheme outperforms both the standard feedforward neural networks and the most common published empirical correlations in predicting both Pb and Bob from the four input variables: solution gas–oil ratio, reservoir temperature, oil gravity, and gas relative density.

Fig. 9. Cross plot of the SVR scheme for Bo, based on the data set provided in Al-Marhoun (1992).

Fig. 10. Cross plot of the SVR scheme for Pb, based on the data set provided in Al-Marhoun (1992).

6. Conclusion and recommendation

In this study, three distinct published data sets were used to investigate the capability of the SVR modeling scheme as a new framework for predicting the PVT properties of crude oil systems. Based on the obtained results and comparative studies, the conclusions and recommendations are as follows.

A new computational intelligence modeling scheme based on SVR was developed to predict both bubble point pressure and oil formation volume factor from the four input variables: solution gas–oil ratio, reservoir temperature, oil gravity, and gas relative density. As is well established in the petroleum engineering community, these two predicted properties are considered the most important PVT properties of crude oil systems.

The developed SVR modeling scheme outperforms both the standard feedforward neural networks and the most common published empirical correlations; it thus has better, more efficient, and more reliable performance than most published correlations. In addition, the developed SVR modeling scheme showed high accuracy in predicting the Bob values with a stable performance, and achieved the lowest absolute percent relative error, lowest minimum error, lowest maximum error, lowest root mean square error, and the highest correlation coefficient among the compared correlations for the three distinct data sets. Furthermore, the SVR modeling scheme is flexible and reliable, and it shows a bright future for oil and gas industry applications, especially permeability and porosity prediction, history matching, prediction of rock mechanics properties, flow regimes and liquid holdup in multiphase flow, and facies classification.

Nomenclature
Bob    oil formation volume factor (OFVF) at the bubble-point pressure, RB/STB
Pb     bubble point pressure, psia
Rs     solution gas–oil ratio, SCF/STB
T      reservoir temperature, degrees Fahrenheit
γo     oil relative density (water = 1.0)
γg     gas relative density (air = 1.0)
Er     average percent relative error
Ei     percent relative error
Ea     average absolute percent relative error
Emax   maximum absolute percent relative error
Emin   minimum absolute percent relative error
RMS    root mean square error

Acknowledgement

The author wishes to thank King Fahd University of Petroleum and Minerals and Mansoura University for the facilities utilized to perform the present work and for their support. In addition, he thanks both the editor-in-chief and the reviewers for their valuable comments and support.
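As a closing illustration of the ε-tube of Fig. 2 (a synthetic one-dimensional sketch, not one of the paper's data sets; scikit-learn assumed), points whose residuals fall strictly inside the tube incur zero ε-insensitive loss and do not become support vectors, which is why the fitted model of Eq. (5) is sparse.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(7)
X = np.sort(rng.uniform(0.0, 1.0, size=(120, 1)), axis=0)
y = np.sin(2.0 * np.pi * X[:, 0]) + 0.05 * rng.normal(size=120)

eps = 0.1
model = SVR(kernel="rbf", C=10.0, gamma=10.0, epsilon=eps)
model.fit(X, y)

resid = np.abs(y - model.predict(X))
inside = resid <= eps + 1e-8

# Points strictly inside the epsilon-tube of Fig. 2 incur zero loss and
# are not support vectors; support vectors sit on or outside the tube.
print("fraction inside tube:", inside.mean())
print("support vectors:", len(model.support_), "of", len(y))
```

Widening ε shrinks the set of support vectors (a cheaper, coarser model), while shrinking ε tightens the fit at the cost of more support vectors, which is the trade-off the user-selected C and ε of Section 3 control.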

References

Abdul-Majeed, G.H.A., Salman, N.H., 1988. An empirical correlation for FVF prediction. JCPT 118, July–August.
Al-Fattah, S.M., Al-Marhoun, M.A., 1994. Evaluation of empirical correlation for bubble point oil formation volume factor. J. Pet. Sci. Eng. 11, 341.
Ali, J.K., 1994. Neural Networks: A New Tool for the Petroleum Industry. Paper SPE 27561.
Al-Marhoun, M.A., 1988. PVT correlations for Middle East crude oils. J. Pet. Tech., 650–666.
Al-Marhoun, M.A., 1992. New correlation for formation volume factor of oil and gas mixtures. JCPT 22, March.
Al-Marhoun, M.A., Osman, E.A., 2002. Using artificial neural networks to develop new PVT correlations for Saudi crude oils. Paper SPE 78592, presented at the 10th Abu Dhabi International Petroleum Exhibition and Conference (ADIPEC), Abu Dhabi, UAE, October 8–11.
Almehaideb, R.A., 1997. Improved PVT correlations for UAE crude oils. Paper SPE 37691, presented at the 1997 SPE Middle East Oil Show and Conference, Bahrain, March 15–18.
Al-Shammasi, A.A., 1997. Bubble point pressure and oil formation volume factor correlations. Paper SPE 53185, presented at the 1997 SPE Middle East Oil Show and Conference, Bahrain, March 15–18.
Al-Shammasi, A.A., 2001. A review of bubble point pressure and oil formation volume factor correlations. SPE Reservoir Evaluation & Engineering, 146–160, April. This paper (SPE 71302) was revised for publication from paper SPE 53185.
Al-Yousef, H.Y., Al-Marhoun, M.A., 1993. Discussion of correlation of PVT properties for UAE crudes. SPEFE 80, March.
Burges, C.J.C., 1998. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery 2. Kluwer, Norwell, MA, pp. 121–167.
Castillo, E., Fontenla-Romero, O., Alonso-Betanzos, A., Guijarro-Berdiñas, B., 2002. A global optimum approach for one-layer neural networks. Neural Comput. 14 (6), 1429–1449.
Castillo, E., Guijarro-Berdiñas, B., Fontenla-Romero, O., Alonso-Betanzos, A., 2006. A very fast learning method for neural networks based on sensitivity analysis. J. Mach. Learn. Res. 7, 1159–1182.
Cortes, C., Vapnik, V., 1995. Support-vector networks. Mach. Learn. 20, 273–297.
Cristianini, N., Shawe-Taylor, J., 2000. An Introduction to Support Vector Machines. Cambridge Univ. Press, Cambridge, U.K.
Dokla, M., Osman, M., 1992. Correlation of PVT properties for UAE crudes. SPEFE 41, March.
Dokla, M., Osman, M., 1993. Authors' reply to discussion of correlation of PVT properties for UAE crudes. SPEFE, p. 82, March.
Duda, R.O., Hart, P.E., Stork, D.G., 2001. Pattern Classification, Second Edition. John Wiley and Sons, New York.
El-Sebakhy, E.A., 2004. A fast and efficient algorithm for multi-class support vector machines classifier. ICICS 2004, 28–30 November. IEEE Computer Society, pp. 397–412.
El-Sebakhy, E.A., Hadi, A.S., Faisal, A.K., 2007. Iterative least squares functional networks classifier. IEEE Trans. Neural Netw. 18 (3), 844–850, May.
Elsharkawy, A.M., 1998. Modeling the properties of crude oil and gas systems using RBF network. Paper SPE 49961, presented at the 1998 SPE Asia Pacific Oil & Gas Conference, Perth, Australia, October 12–14.
Elsharkawy, A.M., Elgibaly, A., Alikhan, A.A., 1994. Assessment of the PVT correlations for predicting the properties of the Kuwaiti crude oils. Paper presented at the 6th Abu Dhabi International Petroleum Exhibition & Conference, Oct. 16–19.
Gharbi, R.B., Elsharkawy, A.M., 1997a. Neural-network model for estimating the PVT properties of Middle East crude oils. Paper SPE 37695, presented at the 1997 SPE Middle East Oil Show and Conference, Bahrain, March 15–18.
Gharbi, R.B., Elsharkawy, A.M., 1997b. Universal neural-network model for estimating the PVT properties of crude oils. Paper SPE 38099, presented at the 1997 SPE Asia Pacific Oil & Gas Conference, Kuala Lumpur, Malaysia, April 14–16.
Ghetto, G., De Paone, F., Villa, M., 1994. Reliability analysis on PVT correlation. Paper SPE 28904, presented October 25–27, 1994, at the SPE European Petroleum
Kartoatmodjo, T., Schmidt, Z., 1994. Large data bank improves crude physical property correlations. Oil Gas J. 51, July 4.
Katz, D.L., 1942. Prediction of shrinkage of crude oils. Drill. & Prod. Pract., API, pp. 137–147.
Kobayashi, K., Komaki, F., 2006. Information criteria for support vector machines. IEEE Trans. Neural Netw. 17 (3), 571–577.
Kumoluyi, A.O., Daltaban, T.S., 1994. High-order neural networks in petroleum engineering. Paper SPE 27905, presented at the 1994 SPE Western Regional Meeting, Long Beach, California, USA, March 23–25.
Labedi, R., 1990. Use of production data to estimate volume factor, density and compressibility of reservoir fluids. J. Pet. Sci. Eng. 4, 357.
Macary, S.M., El-Batanoney, M.H., 1992. Derivation of PVT correlations for the Gulf of Suez crude oils. Paper presented at the EGPC 11th Petroleum Exploration & Production Conference, Cairo, Egypt.
Mahmood, M.M., Al-Marhoun, M.A., 1996. Evaluation of empirically derived PVT properties for Pakistani crude oils. J. Pet. Sci. Eng. 16, 275.
McCain, W.D., 1991. Reservoir fluid property correlations — state of the art. SPERE 266, May.
McCain Jr., W.D., Soto, R.B., Valko, P.P., Blasingame, T.A., 1991. Correlation of bubble point pressures for reservoir oils — a comparative study. Paper SPE 51086, presented at the 1998 SPE Eastern Regional Conference and Exhibition, Pittsburgh, PA, 9–11 November.
Mohaghegh, S., 1995. Neural networks: what it can do for petroleum engineers. JPT 42, Jan.
Mohaghegh, S., 2000. Virtual intelligence applications in petroleum engineering: part 1 — artificial neural networks. JPT, September.
Mohaghegh, S., Ameri, S., 1994. An artificial neural network as a valuable tool for petroleum engineers. SPE 29220, unsolicited paper for the Society of Petroleum Engineers.
Omar, M.I., Todd, A.C., 1993. Development of new modified black oil correlation for Malaysian crudes. Paper SPE 25338, presented at the 1993 SPE Asia Pacific Oil & Gas Conference and Exhibition, Singapore, Feb. 8–10.
Osman, E.A., Abdel-Aal, 2002. Abductive networks: a new modeling tool for the oil and gas industry. SPE 77882, Asia Pacific Oil and Gas Conference and Exhibition, Melbourne, Australia, 8–10 October.
Osman, E., Al-Marhoun, M., 2005. Artificial neural networks models for predicting PVT properties of oil field brines. Paper SPE 93765, 14th SPE Middle East Oil & Gas Show and Conference, Bahrain, March.
Osman, E.A., Abdel-Wahhab, O.A., Al-Marhoun, M.A., 2001. Prediction of Oil Properties Using Neural Networks. Paper SPE 68233.
Osuna, E., Freund, R., Girosi, F., 1997. Training support vector machines: an application to face detection. IEEE Conf. Comput. Vis. Pattern Recognition, pp. 130–136, Jan.
Peng, K.L., Wu, C.H., Goo, Y.J., 2004. The development of a new statistical technique for relating financial information to stock market returns. Int. J. Manag. 21 (4), 492–505.
Petrosky, J., Farshad, F., 1993. Pressure volume temperature correlation for the Gulf of Mexico. Paper SPE 26644, presented at the 1993 SPE Annual Technical Conference and Exhibition, Houston, TX, Oct. 3–6.
Saleh, A.M., Maggoub, I.S., Asaad, Y., 1987. Evaluation of empirically derived PVT properties for Egyptian oils. Paper SPE 15721, presented at the 1987 Middle East Oil Show & Conference, Bahrain, March 7–10.
Schölkopf, B., Smola, A.J., 2002. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA.
Standing, M.B., 1947. A pressure–volume–temperature correlation for mixtures of California oils and gases. Drill. & Prod. Pract., API, pp. 275–287.
Standing, M.B., 1977. Volumetric and Phase Behavior of Oil Field Hydrocarbon Systems. Millet Print Inc., Dallas, TX, p. 124.
Sutton, R.P., Farshad, F., 1990. Evaluation of empirically derived PVT properties for Gulf of Mexico crude oils. SPERE, p. 79, Feb.
Sutton, R.P., Farshad, F., 1991. Supplement to SPE 1372, Evaluation of Empirically Derived PVT Properties for Gulf of Mexico Crude Oils. SPE 20277. Available from SPE Book Order Dep., Richardson, TX.
Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer, New York.
Conference, London, UK. Vapnik, V., 1998. Statistical Learning Theory. Wiley, New York.
Glaso, O., 1980. Generalized pressure–volume temperature correlations. JPT 785 May. Varotsis, N., Gaganis, V., Nighswander, J., Guieze, P., 1999. A novel non-iterative method
Goda, M.H., Shokir, E.M., El, M., Fattah, K.A., Sayyouh, H.M., 2003. Prediction of the PVT for the prediction of the PVT behavior of reservoir fluids. paper SPE 56745
data using neural network computing theory. The 27th Annual SPE International presented at the 1999 SPE Annual Technical Conference and Exhibition, Houston,
Technical Conference and Exhibition in Abuja, Nigeria, August 4–6, p. SPE85650. Texas, October 3–6.
Hanafy, H.H., Macary, S.A., Elnady, Y.M., Bayomi, A.A., El-Batanoney, M.H., 1997. Vasquez, M., Beggs, H.D., 1980. Correlation for fluid physical property prediction. JPT
Empirical PVT correlation applied to Egyptian crude oils exemplify significance of 968 June.
using regional correlations. paper SPE 37295 presented at the SPE Oilfield
Chemistry International Symposium, Houston, Feb. 18–21.
Joachims, T., 1997. Text categorization with support vector machines: learning with
many relevant features. Proc. Europ. Conf. Mach.
