
Proceedings of ICEE 2009 3rd International Conference on Energy and Environment,

7-8 December 2009, Malacca, Malaysia



978-1-4244-5145-6/09/$26.00 ©2009 IEEE
Prediction of PVT Properties In Crude Oil Systems
Using Support Vector Machines
Jawad Nagi, Tiong Sieh Kiong, Syed Khaleel Ahmed
Department of Electronics and Communication Engineering
Universiti Tenaga Nasional
KM 7, Jalan Kajang-Puchong, 43009 Kajang, Malaysia
{jawad, siehkiong, syedkhaeel}@uniten.edu.my
Farrukh Nagi
Department of Mechanical Engineering
Universiti Tenaga Nasional
KM 7, Jalan Kajang-Puchong, 43009 Kajang, Malaysia
farrukh@uniten.edu.my


Abstract- Calculation of reserves in an oil reservoir and the
determination of its performance and economics require good
knowledge of its physical properties. Accurate determination of
the pressure-volume-temperature (PVT) properties such as the
bubble point pressure (P_b) and the oil formation volume factor
(B_ob) is important in the primary and subsequent development
of an oil field. This paper proposes Support Vector Machines
(SVMs) as a novel machine learning technique for predicting
outputs in uncertain situations using the ε-Support Vector
Regression (ε-SVR) method. The objective of this research is to
investigate the capability of SVRs in modeling PVT properties of
crude oil systems and in overcoming existing Artificial Neural
Network (ANN) drawbacks. Three datasets used for training and
testing the SVR prediction model were collected from distinct
published sources. The ε-SVR model incorporates four input
features from the datasets: (1) solution gas-oil ratio, (2) reservoir
temperature, (3) oil gravity, and (4) gas relative density. A
comparative study is carried out to compare ε-SVR performance
with ANNs, nonlinear regression, and different empirical
correlation techniques. The results obtained reveal that the
ε-SVR, once successfully trained and optimized, is more accurate
and reliable than the other existing approaches, such as empirical
correlations, for estimating crude oil PVT properties.
Keywords- Support vector machine; support vector
regression; PVT properties; bubble point pressure; oil formation
volume factor.
I. INTRODUCTION
Reservoir fluid properties are very important in reservoir
engineering computations such as material balance
calculations, well test analysis, reserve estimates, and reservoir
simulations [1]. Knowledge of the pressure-volume-temperature
(PVT) properties is a requirement for all types of
petroleum calculations such as determination of hydrocarbon
flowing properties, design of fluid handling equipment, and
reservoir volumetric estimates. For the development of a
correlation, the geological condition is considered important
because the chemical composition of crude oil differs from
region to region [3]. Among the PVT properties are the bubble
point pressure (P_b) and the oil formation volume factor (B_ob),
the latter defined as the volume of reservoir oil that would be
occupied by one stock tank barrel of oil plus any dissolved gas at
the bubble point pressure and reservoir temperature [4]. Since
all calculations in reservoir performance, production operations
and design, and formation evaluation are only as good as the
PVT properties they rely on, precise prediction of the P_b and
B_ob properties is necessary.
There are many empirical correlations for predicting
different PVT properties, based on the equation of state (EOS),
linear or non-linear multiple regression, or graphical techniques
[4]. Many correlations already exist in the oil and gas industry,
such as the Standing [5], Glaso [6], and Vasquez and Beggs [7]
correlations. Many researchers have reported that
artificial neural networks (ANNs) are able to solve the PVT
correlation problems in the petroleum industry [8]. Therefore,
in recent years researchers have utilized ANNs to develop more
accurate PVT correlations [4]. However, ANN-developed
correlations do not always perform accurately and suffer from a
number of drawbacks. They have some limitations as they were
originally developed for a certain range of reservoir fluid
characteristics and geographical area with similar fluid
compositions and API oil gravity [3].
The objective of this study is to investigate the capability of
support vector machines (SVMs), specifically the ε-Support Vector
Regression (ε-SVR) method, in modeling PVT properties of
crude oil systems and to overcome the limitations and
drawbacks of ANNs. The ε-SVR is an intelligence paradigm
based on statistical learning theory and the
structural risk minimization (SRM) principle. In this research
study, the capability of ε-SVR in modeling both P_b and B_ob is
investigated using worldwide published experimental PVT
databases. Comparative studies are also carried out to compare
the performance of the proposed ε-SVR model with ANNs,
nonlinear regression, and different empirical correlation
techniques.
II. LITERATURE SURVEY
In the oil and gas industry, PVT properties, as well as
permeability and porosity, liquid-holdup flow regimes, and
lithofacies types, are very important properties in reservoir
engineering computations. Determining these properties in the
laboratory is very expensive, and the accuracy of such
predictions is critical and not often known in advance. This
section briefly summarizes the most common empirical PVT
correlations and the different computational intelligence schemes
published for predicting PVT properties in recent years.

A. Common Empirical Correlation Methods
In the last six decades, researchers and engineers have
realized the importance of developing and using empirical
correlations for PVT properties [1]. Many studies have been
carried out in this field to develop new correlations. Standing
[5], [9] presented correlations for the bubble point pressure
(P_b) and the oil formation volume factor (B_ob). Standing's
correlations were based on laboratory experiments carried out
on 105 samples from 22 different crude oils in California. Katz
[10] presented five methods for predicting reservoir oil
shrinkage. Vasquez and Beggs [7] presented correlations for
B_ob by dividing oil mixtures into two groups, above and below
30° API gravity. Glaso [6] developed a correlation for B_ob using
45 oil samples from North Sea hydrocarbon mixtures.
Al-Marhoun [11] published correlations for estimating P_b and B_ob
for Middle East oils. He used 160 datasets from 69 Middle
Eastern reservoirs to develop the correlations. Abdul-Majeed
and Salman [12] published a B_ob correlation based on 420
datasets. A detailed review of the PVT correlation models and
schemes proposed by researchers and engineers over the last
six decades can be found in Osman et al. [1] and
Al-Marhoun et al. [4]. This research concentrates only on the three
most common empirical correlations, namely Al-Marhoun et
al. [13], Glaso [6], and Standing [5], for the purpose of a
comparative study.
B. Predicting PVT Properties Using ANNs
In recent years, Artificial Neural Networks (ANNs) have
gained much popularity in petroleum engineering applications.
Several studies have been carried out for predicting PVT
properties using feed-forward ANNs, such as Al-Shammasi
[14], Elsharkawy [16], Gharbi [2], [17], Kumoluyi [18],
McCain et al. [19], Osman et al. [1], and Varotsis et al. [20]. In
[14] Al-Shammasi presented neural network models and
compared their performance to numerical correlations. He used
a neural network with two hidden layers (4-5-3-1 structure) for
predicting both the P_b and B_ob properties. He concluded that the
statistical and trend performance analysis showed that some of the
correlations violate the physical behavior of hydrocarbon fluid
properties [4].
Gharbi and Elsharkawy [17] and Osman et al. [1]
conducted comparative studies of the performance of
feed-forward ANNs against four empirical correlations:
Standing [5], Al-Marhoun et al. [13], Glaso [6], and Vasquez
and Beggs [7]. In 1997, Gharbi et al. [2], [17]
published ANN models for estimating P_b and B_ob for Middle
Eastern crude oils based on a neural system with a sigmoid
activation function. Gharbi et al. [2] developed a universal
ANN model for predicting PVT properties for any oil reservoir.
In [17] Gharbi and Elsharkawy trained two ANNs separately to
estimate P_b and B_ob. The input data consisted of the solution
gas-oil ratio, reservoir temperature, oil gravity, and gas relative
density. They used a neural network with two hidden layers [1].
Results obtained from their comparative studies indicated that
the feed-forward ANNs outperformed conventional empirical
correlation schemes for predicting PVT properties. A detailed
review of PVT correlation models using ANN schemes
proposed by researchers in the past two decades can be
found in Osman et al. [1] and Al-Marhoun et al. [4].
III. IMPLEMENTATION
The prediction model developed in this paper is
implemented using datasets from authors who have published
work in this research area [1], [4], [11], [22-24]. Prediction of
P_b and B_ob uses the ε-Support Vector Regression (ε-SVR)
method in SVMs. The ε-SVR model is trained and optimized
using cross-validation (CV) and is tested using these datasets.
A. Data Acquisition
Datasets used to implement the ε-SVR prediction model
were collected from three distinct published sources. Properties
of the three datasets acquired are explained below:
1. The first dataset was retrieved from work conducted
by Al-Marhoun [11], which published correlations for
estimating P_b and B_ob for Middle Eastern oils. This
dataset consists of 160 observations collected from 69
Middle Eastern reservoirs.
2. The second dataset was obtained from work
conducted by Al-Marhoun [4] and Osman et al. [22],
[23]. This dataset consists of 283 observations
collected from different Saudi oil fields to predict
P_b and B_ob at P_b for Saudi crude oils.
3. The third dataset was retrieved from the work
conducted by Goda et al. [24] and Osman et al. [1].
This dataset contains 782 observations collected from
the oil fields of Malaysia, the Middle East, the Gulf of
Mexico, and Colombia.
B. SVR Model Development
For development of the prediction model, LIBSVM v2.86
[25], a library for support vector machines, is used in MATLAB
R2009a. LIBSVM is used as the core of the SVM to
predict P_b and B_ob values using an ε-SVR model. Four features
from the three datasets were selected to build the ε-SVR model for
predicting P_b and B_ob. The four input features are: (1)
solution gas-oil ratio, (2) reservoir temperature, (3) oil gravity,
and (4) gas relative density.
In order for all the features to fit the ε-SVR model properly,
all data points from the four features were linearly scaled
(normalized) to the range from 0 to 1. To evaluate the
performance of the ε-SVR model, the datasets were divided
using a stratified criterion such that 70% of the data was used
for training the ε-SVR and the remaining 30% of the data was
used for testing and validation. Of the 782 data
points, 382 were used to train the ε-SVR, 200 were used to
cross-validate (CV) the relationships established during the
training process, and the remaining 200 were used to test the
ε-SVR model for evaluation of its accuracy and trend stability.
The internal and external validation processes were repeated
1000 times for 10-fold CV trials.
The radial basis function (RBF) kernel is used for training
the ε-SVR. Model optimization was performed by optimizing
the three important ε-SVR hyper-parameters: (1) the cost of error,
C, (2) the width of the ε-tube, ε, and (3) the RBF kernel
parameter, γ. Experimentally, through 1000 iterative trials, the
best 10-fold CV accuracy was achieved by selecting ε = 0.56.
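To make the roles of the ε-tube and the RBF kernel parameter concrete, the following minimal Python sketch evaluates the two standard ingredients of an ε-SVR: the RBF kernel K(x, z) = exp(-γ‖x - z‖²) and the ε-insensitive loss max(0, |y - f(x)| - ε). The function names and the sample vectors are illustrative assumptions and are not taken from the paper or from LIBSVM; the values 0.56 and 0.89 are the ones reported in this section, read here as ε and γ.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x and z)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

def eps_insensitive_loss(y_true, y_pred, epsilon):
    """Errors inside the epsilon-tube cost nothing; outside, only the excess counts."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Hypothetical scaled feature vectors (four inputs, each in [0, 1]).
a = [0.20, 0.50, 0.70, 0.30]
b = [0.25, 0.45, 0.65, 0.35]

k_same = rbf_kernel(a, a, gamma=0.89)   # identical points: similarity of exactly 1
k_near = rbf_kernel(a, b, gamma=0.89)   # nearby points: similarity slightly below 1

# Residual of 0.20 lies inside the tube; residual of 1.00 lies outside it.
loss_inside = eps_insensitive_loss(1.30, 1.50, epsilon=0.56)
loss_outside = eps_insensitive_loss(1.00, 2.00, epsilon=0.56)
```

A larger γ makes the kernel decay faster with distance, and a wider ε-tube ignores larger residuals, which is why both are tuned together with C.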

Optimal ε-SVR hyper-parameters (C, γ) were selected using
the SVM Grid-Search method [26], which takes exponentially
growing sequences of the parameters (C, γ) and iterates over
all possible combinations to determine the optimal ε-SVR
hyper-parameter set. The optimal ε-SVR hyper-parameters (C,
γ) were found to be (1, 0.89), respectively. The 10-fold CV
method utilized in this study is used as a checking mechanism
to prevent overfitting and to control model complexity, with
the root mean square error (RMSE) as the criterion.
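The grid-search procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the exponent range (2^-5 to 2^5) is assumed because the paper does not report it, and `toy_cv_error` is a hypothetical stand-in for the 10-fold CV error of an ε-SVR trained with a given (C, γ) pair.

```python
import itertools

def exponential_grid(base, start_exp, stop_exp, step=1):
    """Return base**k for k = start_exp, start_exp + step, ..., up to stop_exp."""
    return [base ** k for k in range(start_exp, stop_exp + 1, step)]

def grid_search(c_values, gamma_values, cv_error):
    """Try every (C, gamma) pair and keep the one with the lowest CV error."""
    best_pair, best_err = None, float("inf")
    for c, gamma in itertools.product(c_values, gamma_values):
        err = cv_error(c, gamma)
        if err < best_err:
            best_pair, best_err = (c, gamma), err
    return best_pair, best_err

# Hypothetical stand-in for the 10-fold CV RMSE of an epsilon-SVR.
# This toy surface is NOT from the paper; it only has a known minimum
# so that the search logic can be checked.
def toy_cv_error(c, gamma):
    return (c - 1.0) ** 2 + (gamma - 0.5) ** 2

c_grid = exponential_grid(2.0, -5, 5)       # assumed range: 2^-5 ... 2^5
gamma_grid = exponential_grid(2.0, -5, 5)   # assumed range: 2^-5 ... 2^5
best_pair, best_err = grid_search(c_grid, gamma_grid, toy_cv_error)
print(best_pair, best_err)
```

In practice `cv_error` would wrap a call to an SVM library. With the LIBSVM command-line options, for instance, `-s 3` selects ε-SVR, `-t 2` the RBF kernel, `-c`, `-g` and `-p` set C, γ and ε, and `-v 10` requests 10-fold cross-validation. Because the grid is coarse, a reported optimum such as γ = 0.89 would typically come from a finer second pass around the best coarse cell.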
C. Quality Control and Dataset Domain
A statistical summary of the investigated quality measures
was compiled for the proposed ε-SVR modeling scheme, ANNs,
and the three common empirical correlations (Standing [5],
Al-Marhoun et al. [13], and Glaso [6]) using the three distinct
datasets. The following are the most common
domains for the variables that are used in both the input and
output layers of the modeling schemes for PVT analysis:
1. Input variables:
a. Solution gas-oil ratio: 26 to 1602 scf/stb.
b. Reservoir temperature: 74 °F to 240 °F.
c. Oil gravity (API): 19.4 to 44.6.
d. Gas relative density: 0.744 to 1.367.
2. Output variables:
a. Oil formation volume factor: 1.032 to 1.997 bbl/stb.
b. Bubble point pressure: 130 to 3573 psia.
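The input domains listed above lend themselves to a simple range check and to the linear scaling to [0, 1] mentioned in the model development subsection. The sketch below is a hedged illustration: the dictionary keys and function names are hypothetical, the sample is made up, and using the listed domain bounds as the scaling minimum and maximum is an assumption (the paper only states that features were linearly scaled to the range 0 to 1).

```python
# Input domains as listed above: (minimum, maximum) per feature.
INPUT_DOMAIN = {
    "solution_gor_scf_stb": (26.0, 1602.0),
    "reservoir_temp_F": (74.0, 240.0),
    "oil_gravity_API": (19.4, 44.6),
    "gas_relative_density": (0.744, 1.367),
}

def check_domain(sample):
    """Return the names of features that fall outside the listed ranges."""
    return [name for name, (lo, hi) in INPUT_DOMAIN.items()
            if not lo <= sample[name] <= hi]

def scale_to_unit(sample):
    """Linearly map each feature to [0, 1] using the domain bounds (assumed)."""
    return {name: (sample[name] - lo) / (hi - lo)
            for name, (lo, hi) in INPUT_DOMAIN.items()}

# Hypothetical sample, not taken from any of the three datasets.
sample = {
    "solution_gor_scf_stb": 814.0,
    "reservoir_temp_F": 157.0,
    "oil_gravity_API": 32.0,
    "gas_relative_density": 0.9,
}
out_of_range = check_domain(sample)   # empty list: every feature is in range
scaled = scale_to_unit(sample)        # every value now lies between 0 and 1
```

A check of this kind matters because a model trained on these ranges is interpolating inside them; predictions for samples outside the domain are extrapolations and should be treated with caution.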
D. Statistical Quality Measures
To compare the performance of the proposed ε-SVR model
to other schemes such as ANNs and empirical correlations,
seven quality measures based on statistical error analysis were
computed. The statistical parameters calculated were: average
percent relative error (E_r), average absolute percent relative
error (E_a), minimum absolute percent error (E_min), maximum
absolute percent error (E_max), root mean square error (E_rms),
standard deviation (σ), and the correlation coefficient (r).
Detailed information corresponding to the equations of these
statistical parameters can be found in Osman et al. [1]. The best
model in terms of P_b and B_ob prediction accuracy is the
model with the highest correlation coefficient (r) and the
smallest RMSE.
IV. EXPERIMENTAL RESULTS
Comparative studies were conducted to compare the
performance of the proposed ε-SVR model with ANNs and the
three common empirical correlations (Standing [5],
Al-Marhoun et al. [13], and Glaso [6]) for all three distinct
datasets. Plots of the average absolute percent relative
error (E_a) versus the correlation coefficient (r) for all
modeling schemes are shown in Figure 1 and Figure 2 for B_ob
and P_b, respectively. Figure 1 and Figure 2 are based on the
dataset used by Osman et al. [1]. It is observed that the SVR
scheme has the lowest error and the highest correlation
coefficient for both the bubble point pressure (P_b) and the oil
formation volume factor (B_ob), while the ANN and all
other schemes fall below the SVR.
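The error measures used in this comparison can be computed along the following lines. The paper defers the exact equations to Osman et al. [1], so this Python sketch relies on commonly used definitions, all of which are assumptions here: percent relative error taken relative to the measured value, the sample standard deviation of those errors for σ, and the Pearson coefficient for r. The numbers are made up for illustration and are not from any of the datasets.

```python
import math
import statistics

def quality_measures(measured, predicted):
    """Percent-relative-error statistics and Pearson correlation coefficient."""
    # Percent relative error of each point, relative to the measured value.
    rel = [(m - p) / m * 100.0 for m, p in zip(measured, predicted)]
    abs_rel = [abs(e) for e in rel]
    n = len(rel)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return {
        "E_r": sum(rel) / n,                      # average percent relative error
        "E_a": sum(abs_rel) / n,                  # average absolute percent relative error
        "E_min": min(abs_rel),                    # minimum absolute percent error
        "E_max": max(abs_rel),                    # maximum absolute percent error
        "E_rms": math.sqrt(sum(e * e for e in rel) / n),
        "sigma": statistics.stdev(rel),           # sample standard deviation of the errors
        "r": cov / math.sqrt(var_m * var_p),      # Pearson correlation coefficient
    }

# Made-up B_ob values (bbl/stb), for illustration only.
measured = [1.10, 1.25, 1.40, 1.60]
predicted = [1.12, 1.24, 1.37, 1.63]
stats = quality_measures(measured, predicted)
```

Because E_r averages signed errors, over- and under-predictions can cancel; this is why E_a and E_rms, rather than E_r, are the more informative measures when ranking schemes.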

Figure 1. Average absolute percent relative error (E_a) versus correlation
coefficients (r) for all modeling schemes for predicting B_ob


Figure 2. Average absolute percent relative error (E_a) versus correlation
coefficients (r) for all modeling schemes for predicting P_b

The comparative prediction results of both P_b and B_ob
for all modeling schemes (SVR, ANN, and empirical
correlations) using all three distinct datasets are summarized in
Tables I through VI.
V. CONCLUSION
In this study, three distinct published datasets were used to
investigate the capability of SVR as a new modeling
framework for predicting PVT properties of crude oil systems.
The ε-SVR, a novel machine learning technique, was used to
predict both the bubble point pressure (P_b) and the oil formation
volume factor (B_ob) using four input variables: (1) solution
gas-oil ratio, (2) reservoir temperature, (3) oil gravity, and (4) gas
relative density. The prediction results obtained revealed that
the developed ε-SVR modeling scheme outperforms both
ANNs and the three most common empirical correlations.


TABLE I. MODEL TESTING RESULTS (USING EL-SEBAKHY ET AL. [15] AND OSMAN ET AL. [1] DATASET) FOR PREDICTING B_ob

Correlation               E_r       E_a      E_min    E_max     r
Standing (1947) [5]       -0.170    2.724    0.008    20.180    0.974
Glaso (1980) [6]          1.8186    3.374    0.003    17.776    0.972
Al-Marhoun (1992) [13]    -0.115    2.205    0.003    13.179    0.981
ANN System                0.3024    1.789    0.008    11.775    0.980
SVR System                0.18      1.37     0.002    7.751     0.984

TABLE II. MODEL TESTING RESULTS (USING EL-SEBAKHY ET AL. [15] AND OSMAN ET AL. [1] DATASET) FOR PREDICTING P_b

Correlation               E_r       E_a      E_min     E_max     r
Standing (1947) [5]       67.60     67.73    0.1620    102.08    0.867
Glaso (1980) [6]          -1.616    18.52    0.1056    138.96    0.945
Al-Marhoun (1992) [13]    8.008     20.01    0.0254    109.12    0.906
ANN System                8.129     21.02    0.0182    145.29    0.943
SVR System                7.547     15.12    0.009     104.73    0.963

TABLE III. MODEL TESTING RESULTS (USING AL-MARHOUN [4] AND OSMAN [22] DATASET) FOR PREDICTING B_ob

Correlation               E_r       E_a       E_min     E_max     σ         r
Standing (1947) [5]       -1.054    1.6833    0.066     7.7997    2.1021    0.9947
Glaso (1980) [6]          0.4538    1.7865    0.0062    7.3839    2.1662    0.9920
Al-Marhoun (1992) [13]    -0.392    0.8451    0.0003    3.5546    1.1029    0.9972
ANN System                0.217     0.5116    0.0061    2.6001    0.6626    0.9977
SVR System                0.006     0.353     0.0013    2.5835    0.4743    0.997

TABLE IV. MODEL TESTING RESULTS (USING AL-MARHOUN [4] AND OSMAN [22] DATASET) FOR PREDICTING P_b

Correlation               E_r       E_a        E_min     E_max      σ         r
Standing (1947) [5]       -8.441    10.4562    0.2733    47.0213    11.841    0.8974
Glaso (1980) [6]          -18.48    20.7569    2.0345    63.7634    16.160    0.9837
Al-Marhoun (1992) [13]    0.941     8.1028     0.0935    38.085     11.41     0.9905
ANN System                -0.222    5.8915     0.2037    38.1225    8.678     0.9930
SVR System                -0.326    3.260      0.0016    25.447     4.978     0.995

TABLE V. MODEL TESTING RESULTS (USING GODA ET AL. [24] AND OSMAN ET AL. [1] DATASET) FOR PREDICTING B_ob

Correlation               E_r        E_a       E_min     E_max      r
Standing (1947) [5]       -2.628     2.7202    0.0167    13.2922    0.9953
Glaso (1980) [6]          -0.5529    0.9821    0.0086    6.5123     0.9959
Al-Marhoun (1992) [13]    -0.4514    2.0084    0.0322    11.0755    0.9935
ANN System                0.3251     1.4592    0.0083    5.3495     0.9968
SVR System                0.065      0.386     0.007     2.502      0.9971

TABLE VI. MODEL TESTING RESULTS (USING GODA ET AL. [24] AND OSMAN ET AL. [1] DATASET) FOR PREDICTING P_b

Correlation               E_r        E_a       E_min      E_max      r
Standing (1947) [5]       12.811     24.684    0.62334    59.038     0.8657
Glaso (1980) [6]          -18.887    26.551    0.28067    98.78      0.9675
Al-Marhoun (1992) [13]    5.1023     8.9416    0.13115    87.989     0.9701
ANN System                4.9205     6.7495    0.16115    65.3839    0.9765
SVR System                3.426      4.0757    0.128      53.219     0.9808

Thus, the proposed ε-SVR prediction model is more accurate,
efficient, and reliable than ANNs and
empirical correlations. In addition, the proposed scheme shows
higher prediction accuracy for B_ob by achieving the lowest
error metrics and the highest correlation coefficient among
the techniques compared.
Hence, the proposed prediction model is flexible and shows
a bright future for implementation in the oil and gas industry.
SVR models similar to this can be easily incorporated into
reservoir simulators and production optimization software.
Future work using this ε-SVR framework can address
other oil and gas industry problems such as
permeability and porosity prediction, identifying liquid-holdup
flow regimes, identifying fracture corridors and lithofacies types,
seismic pattern recognition, estimating pressure drops in
pipes and wells, optimizing well production, and predicting other
reservoir characteristics.
REFERENCES
[1] E. A. Osman, O. A. Abdel-Wahhab, and M. A. Al-Marhoun,
"Prediction of Oil PVT Properties Using Neural Networks," Paper
SPE 68233 presented at the 12th SPE Middle East Oil Show &
Conference, March, 2001.
[2] R. B. Gharbi, A. M. Elsharkawy, and M. Karkoub, "Universal
Neural-Network-Based Model for Estimating the PVT Properties of Crude
Oil Systems," Energy & Fuels, vol. 13, pp. 454-458, 1999.
[3] M. A. Al-Marhoun, "Evaluation of empirically derived PVT
properties for Middle East crude oils," Journal of Petroleum Science
and Engineering, vol. 42, pp. 209-221, 2004.
[4] M. A. Al-Marhoun and E. A. Osman, "Using Artificial Neural
Networks to Develop New PVT Correlations for Saudi Crude Oils,"
Paper SPE 78592 presented at the 10th Abu Dhabi International
Petroleum Exhibition and Conference, Abu Dhabi, UAE, 8-11
October, 2002.
[5] M. B. Standing, "A pressure-volume-temperature correlation for
mixtures of California oils and gases," Drilling and Production. Am.
Pet. Inst., Tulsa, OK, pp. 275-287, 1947.
[6] O. Glaso, "Generalized pressure-volume-temperature correlations,"
J. Pet. Technol., vol. 32, no. 5, pp. 785-795, 1980.
[7] M. E. Vasquez and H. D. Beggs, "Correlation for fluid physical
property prediction," J. Pet. Technol., vol. 32, no. 6, pp. 968-970,
1980.
[8] E. M. El-M. Shokir, H. M. Goda, K. A. Fattah, and M. H. Sayyouh,
"Modeling Approach for Predicting PVT Data," Engineering Journal
of the University of Qatar, vol. 17, pp. 11-28, 2004.
[9] M. B. Standing, Volumetric and Phase Behavior of Oil Field
Hydrocarbon System. Millet Print Inc., Dallas, pp. 124-126, 1977.
[10] D. L. Katz, "Prediction of Shrinkage of Crude Oils," Drill. & Prod.
Pract., API, pp. 137-147, 1942.
[11] M. A. Al-Marhoun, "PVT Correlations for Middle East Crude Oils,"
J. Pet. Technol., vol. 40, no. 5, pp. 650-666, 1988.
[12] G. H. A. Abdul-Majeed and N. H. Salman, "An Empirical
Correlation for FVF Prediction," Journal of Canadian Petroleum
Technology, pp. 118, 1988.
[13] M. A. Al-Marhoun, "New Correlations for Formation Volume Factors
of Oil and Gas Mixtures," Journal of Canadian Petroleum
Technology, vol. 31, no. 3, pp. 22-26, 1992.
[14] A. A. Al-Shammasi, "Bubble Point Pressure and Oil Formation
Volume Factor Correlations," Paper SPE 53185 in Proceedings of
the SPE Middle East Oil Show and Conference, Bahrain, March
15-18, 1997.
[15] A. E. El-Sebakhy, A. S. Hadi, and A. K. Faisal, "Iterative Least
Squares Functional Networks Classifier," IEEE Transactions Neural
Networks, vol. 18, no. 3, pp. 844-850, 2007.
[16] A. M. Elsharkawy, "Modeling the Properties of Crude Oil and Gas
Systems Using RBF Network," Paper SPE 49961 presented at the
SPE Asia Pacific Oil and Gas Conference, Australia, 12-14 October,
1998.
[17] R. B. Gharbi and A. M. Elsharkawy, "Neural-Network Model for
Estimating the PVT Properties of Middle East Crude Oils," Paper
SPE 37695 presented at the 1997 SPE Middle East Oil Conference,
Bahrain, 1997.
[18] A. O. Kumoluyi and T. S. Daltaban, "High Order Neural Network in
Petroleum Engineering," Paper SPE 27905 presented at the 1994
SPE Western Regional Meeting, Long Beach, California, 23-25 March,
1994.
[19] W. D. McCain Jr., B. R. Soto, P. P. Valko, and T. A. Blasingame,
"Correlation of Bubble point Pressures for Reservoir Oils - A
Comparative Study," Paper SPE 51086 presented at the 1998 SPE
Eastern Regional Conference and Exhibition, Pittsburgh, PA, 9-11
November, 1998.
[20] N. Varotsis, V. Gaganis, J. Nighswander, and P. Guieze, "A Novel
Non-Iterative Method for the Prediction of the PVT Behavior of
Reservoir Fluids," Paper SPE 56745 presented at the SPE Annual
Technical Conference and Exhibition, Texas, October 3-6, 1999.
[21] M. I. Omar and A. C. Todd, "Development of New Modified Black
oil Correlation for Malaysian Crudes," Paper SPE 25338 presented at
the SPE Asia Pacific Oil & Gas Conference, Singapore, 8-10
February, 1993.
[22] E. A. Osman and R. E. Abdel-Aal, "Abductive Networks: A New
Modeling Tool for the Oil and Gas Industry," Paper SPE 77882, Asia
Pacific Oil and Gas Conference and Exhibition, Melbourne, October,
2002.
[23] E. A. Osman and M. A. Al-Marhoun, "Artificial Neural Networks
Models for Predicting PVT Properties of Oil Field Brines," Paper
SPE 93765 presented at the 14th SPE Middle East Oil & Gas
Conference, March, 2005.
[24] M. H. Goda, E. M. El-M. Shokir, K. A. Fattah, and H. M. Sayyouh,
"Prediction of the PVT Data using Neural Network Computing
Theory," SPE-I Paper presented at the 27th Annual SPE
International Technical Conference and Exhibition in Abuja, Nigeria,
4-6 August, 2003.
[25] C.-C. Chang and C.-J. Lin, "LIBSVM: A Library for Support Vector
Machines." Available: http://www.csie.ntu.edu.tw/~cjlin/libsvm
[26] C.-W. Hsu, C.-C. Chang, and C.-J. Lin, "A practical guide to support
vector classification," Technical Report, Department of Computer
Science, National Taiwan University, Taipei, Taiwan, 2003.
