Incorporating Historical Data Into Monte Carlo Simulation

James A. Murtha, SPE • Consultant
Copyright 1994 Society of Petroleum Engineers
SPE Computer Applications, April 1994

Summary
This paper demonstrates how to incorporate historical data into Monte Carlo simulations, describes how the parameters are distributed, and quantifies dependencies among them. It also shows the effect of ignoring dependency and presents summary statistics on field data.

Introduction
As Cronquist¹ points out, the oil and gas industry has been unwilling to adopt stochastic definitions of reserves. Nevertheless, Monte Carlo simulation methods seem to be gaining acceptance by engineers, geoscientists, and other professionals who want to evaluate prospects or to otherwise analyze problems that involve uncertainty. Among the common applications of Monte Carlo simulation are estimation of recoverable hydrocarbons from a reservoir, forecasting production and revenue streams for a well or a field, evaluation of a waterflood prospect, and comparison of net present values of alternative investments. In each case, the user must prescribe statistical distributions for the input parameters. Selecting these distributions is often the most challenging aspect of the simulation. While experience and fundamental principles should guide us, field data may exist that could suggest both the type of distribution and the parameters necessary to describe it. Finding appropriate data and incorporating them into the model is one focus of this paper.
A second focus is the possibility that two or more of the parameters may depend on one another. For example, in some environments, area and net pay, porosity and permeability, or decline rate and pay thickness may exhibit bivariate dependency. Instead of ignoring the dependency or guessing how to quantify it, we may be able to examine historical data for clues.

What Is Monte Carlo Simulation?
A Monte Carlo simulation begins with a model. To illustrate, we select one form of a volumetric model for oil in place (OIP), N, in terms of area, pay, porosity, water saturation, and FVF:

$$N = 7{,}758\,A\,h\,\phi\,(1 - S_w)/B_o \tag{1}$$

Think of $A$, $h$, $\phi$, $S_w$, and $B_o$ as input parameters and $N$ as the output. Once we specify values for each input parameter, we can calculate a value for the output. Each parameter is viewed as a random variable. A trial consists of selecting one value for each input and calculating the output. A simulation is a succession of hundreds or thousands of repeated trials, during which the output values are stored. Afterward, the output values are grouped into a histogram or a cumulative distribution function.
Monte Carlo simulation is an alternative to both single-point (deterministic) estimation and the scenario approach that presents three cases: worst, most likely, and best.

How Do We Select a Value for Each Input? A Monte Carlo simulation customarily is run with special software, either spreadsheet add-ins or compiled programs. The key step is to choose a value for each input parameter according to a specified distribution. Among distributions commonly used are the familiar (i.e., normal, triangular, lognormal, and uniform) and the less-familiar types (i.e., beta, gamma, Weibull, and Pareto). As we see when we use field data, however, the distribution may take on a customized shape instead of being an exact theoretical distribution. These theoretical distributions and the grouped field data are represented as histograms, probability density functions (e.g., the bell-shaped curve), or cumulative distribution functions. The three types of graphs in Fig. 1 are examples of three common distributions.
Regardless of how we represent input distributions, it is helpful to represent an output distribution with a cumulative distribution function (CDF), which gives the user a means to compare alternatives. That is, two outputs may represent alternative prospects. Their CDFs can be overlaid for easy comparison.
The CDF is also useful to illustrate how Monte Carlo sampling is accomplished (Fig. 2). First, a uniformly distributed random number between 0 and 1 is selected and used to enter the vertical axis of a CDF, which represents cumulative probability. Proceeding to the curve and then down to the horizontal axis, a unique value of the corresponding random variable is determined. Thus, the sampling process requires only the existence of a CDF for the parameter being sampled, which is the key to using any set of field data as a model for an input distribution. We simply construct the CDF for those data, first by grouping them into classes and then by calculating the cumulative relative frequency.

Examples of Simulation Models
Although we concentrate on one model for our example, several applications can be described by the following two general types of simulation models: product and forecast models.
The volumetric model in Eq. 1 represents a class of models in which the output is a product of several input parameters. A simplified form for reserves often used in exploration prospects is given by

$$N = A\,h\,R, \tag{2}$$

where $R$ is a recovery factor that includes efficiency, porosity, saturation, and FVFs.
Caldwell and Heather² featured two product models, one for coalbed methane reservoirs,

$$R_c = A\,h\,C_g\,\rho\,R, \tag{3}$$

and another for naturally fractured reservoirs,

$$R_H = R_v\,L_H\,(1 - f_f)(1 - f_{fw})/\ell_f. \tag{4}$$

The exponential decline curve,

$$q = q_i \exp(-a t), \tag{5}$$

can be used in a Monte Carlo simulation by treating both the initial productivity, $q_i$, and the decline rate, $a$, as random variables. The production forecast appears no longer as a single curve, but as a band of uncertainty (Fig. 3).
After a production forecast is available, an economic forecast can be generated by assigning prices and operating costs. These parameters can be treated as random variables also. Typical output distributions include present value and discounted cash flow.

Monte Carlo Simulation Advantages and Disadvantages. Some of the reasons to use Monte Carlo simulation:
1. The results contain maximum information about possible outcomes compared with either the scenario or deterministic approach.
2. The simulation emphasizes the underlying model with its assumptions and helps the user quantify and incorporate historical data, including dependence.
3. The results enable us to answer such questions as (1) "How likely is the most likely outcome?"; (2) "Which alternative is more risky?"; (3) "Does one alternative dominate the other?"; (4) "How many wildcats will have to be drilled to have 90% confidence of at least two successes?"; and (5) "What is the probability of going broke before the second success?"
4. Sensitivity analyses reveal the key parameters and help quantify the value of additional information.
Some of the disadvantages of Monte Carlo simulation are that (1) users need to buy and learn how to use the software, (2) the language of probability and statistics can be a barrier to understanding and explaining results, and (3) the results are only as good as the model and the input assumptions.

Fig. 1-Three common distributions (triangular, normal, and lognormal), shown as cumulative distribution functions (first row), probability density functions (second row), and histograms (third row).

Fig. 2-Sampling using a cumulative distribution function.

Fig. 3-Production forecast incorporating uncertainty.

The first two disadvantages are simple to dismiss. Inexpensive software is available. The time needed to learn to use the software
and to improve on the fundamentals is a small price to pay for the power of the simulation tool, and some of the learning experience should broaden the user's general analytical skills.
The importance of the selection of the model and its inputs cannot be overemphasized. You should not use an exponential decline curve to model production that has a definite hyperbolic shape. Nor would you assume that reservoir acreage was uniformly distributed.

Sources for Input Distributions
Where do we look for guidance when we select models and make assumptions about input parameters? Three general sources are available: fundamental principles, expert opinion, and historical data. While this paper emphasizes historical data, the usefulness of guiding principles and experts should not be underrated. Indeed, the wise practitioner of risk analysis embraces all three sources.

Fundamental Principles. There are reasons you might expect certain parameters to be lognormally distributed. One key example is field size, which is the product of acreage, net pay, and recovery. A consequence of the Central Limit Theorem in statistics is that products of variables tend to be lognormally distributed. Similarly, sums of variables tend to be normally distributed. Cost engineers who estimate numerous subtotals should observe that their grand totals resemble normal distributions.
It is not mere happenstance that lab reports for core samples plot the log of permeability against porosity. The underlying facts are that permeabilities tend to be lognormally distributed and that there tends to be a positive correlation between permeability and porosity. Much to the chagrin of the well-test devotee, there is some truth in the assumption that porosity can be a predictor of permeability.
The way that water saturation and porosity are calculated imposes a negative correlation between those two parameters. Another argument for this inverse relationship, at least in water-wet rock, is the following. Water saturation is the ratio of water volume to total PV. In an idealized pore space, the water saturation is proportional to the surface area of a sphere (the pore space). As the radius of a sphere increases, its area/volume ratio shrinks.
Incidentally, the lognormal distribution has been studied extensively outside the oil and gas industry. While not for the timid, the treatise by Aitchison and Brown³ offers examples and explanations for the ubiquity of these types of random variables.

Expert Opinion. Monte Carlo simulation at its worst is just another black box. There is no substitute for experience. When used properly, risk analysis is done in groups where engineers and geoscientists collaborate on the description of the prospect being analyzed. They must strive to account for the particular depositional environments, driving mechanisms, heterogeneities, and other factors.
Just as the quantitative method suffers without the voice of experience, expert opinion rings hollow when its consequences are not scrutinized. One of the powerful aspects of risk analysis is the ability to compare alternatives and to examine the consequences of our assumptions. If there is disagreement about the type of distribution or the range of some input parameters, at least the simulation can be run under various assumptions to quantify the differences. Sensitivity analysis is an important partner to Monte Carlo simulation in the overall decision process.

Historical Data. The first place to look for data might be in your own company. Corporate databases have drawn more attention in recent years. There are obvious limitations of quantity and scope. In addition, there may be artificial barriers between groups, making access difficult. It is not uncommon for two or more databases for the same field to exist. For example, there may be a petrophysical database and a production database. It is even possible that these two sources will have conflicting data, resulting from revisions based on additional information or advances in interpretative technology. Nevertheless, this can be a valuable source of data, particularly if the personnel in charge of the databases were to participate in the modeling and analysis.
Commercial database vendors and software vendors who accumulate databases are natural sources of field data. State and federal agencies are other popular sources. On the list are the DOE, the U.S. Minerals Management Service, the Bureau of Economic Geology, the Gas Research Inst., and various state oil and gas commissions.
Technical papers and journal articles often summarize their data or present them only in graphical form, removing the level of detail necessary for use in Monte Carlo simulation. More-detailed topical reports and field studies, however, often preserve adequate detail.

Case Study: Using Historical Data for a Volumetric Model
For our purposes, let us suppose that we assembled the perfect team for the analysis, selected an appropriate model to use, and examined the fundamental principles. We are at a point where we need to select appropriate distributions for each input parameter. To be specific, suppose we are using the volumetric model described above to evaluate a drilling prospect in a play where extensive production data are available.
We have a database⁴ that contains parameters for 26 existing oil reservoirs in the Repetto turbidite sandstone, Geologic Play Code 415, which matches our prospect. We use these data along with data from two similar plays (Geologic Play Codes 414 and 416, Puente turbidite and Repetto/Puente turbidite sandstones, respectively) to generate our distributions and to look for dependence relationships. Together, all three plays comprise 83 reservoirs.
Table 1 contains the data for the Repetto sandstone. We selected 7 of the 61 available database fields (i.e., columns in the database): area, pay, porosity, initial water saturation, permeability, initial FVF, and initial GOR. Our volumetric model uses five of these parameters (all except permeability and initial GOR, which are included because they are commonly thought to be correlated with others in the group). Later, we calculate the correlation matrix for all seven parameters.
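The sampling procedure described earlier (enter a CDF with a uniform random number, then read off the parameter value) can be sketched for the volumetric model of Eq. 1. This is only an illustration under stated assumptions: the function names, the 10-class grouping, the linear interpolation inside a class, and the choice of eight rows of Table 1 as sample data are ours, and the inputs are sampled independently. It is not the software or the exact procedure used in the paper.

```python
import bisect
import random

def build_cdf(values, n_classes=10):
    """Group data into equal-width classes and return the cumulative
    relative frequency at the upper edge of each class."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes            # assumes hi > lo
    counts = [0] * n_classes
    for v in values:
        counts[min(int((v - lo) / width), n_classes - 1)] += 1
    edges = [lo + width * (k + 1) for k in range(n_classes)]
    cum, running = [], 0
    for c in counts:
        running += c
        cum.append(running / len(values))
    return lo, edges, cum

def sample(cdf, u):
    """Enter the CDF at cumulative probability u (vertical axis) and read
    off the value (horizontal axis), interpolating inside the class."""
    lo, edges, cum = cdf
    k = min(bisect.bisect_left(cum, u), len(cum) - 1)
    left_x = lo if k == 0 else edges[k - 1]
    left_p = 0.0 if k == 0 else cum[k - 1]
    span = cum[k] - left_p
    frac = (u - left_p) / span if span > 0 else 0.0
    return left_x + frac * (edges[k] - left_x)

def oil_in_place(area, pay, porosity, sw, bo):
    """Eq. 1 with porosity and water saturation given as fractions."""
    return 7758.0 * area * pay * porosity * (1.0 - sw) / bo

def simulate(columns, n_trials=500, seed=0):
    """Run independent trials; columns = [area, pay, porosity %, Sw %, Bo]."""
    rng = random.Random(seed)
    cdfs = [build_cdf(col) for col in columns]
    outputs = []
    for _ in range(n_trials):
        a, h, phi, sw, bo = (sample(c, rng.random()) for c in cdfs)
        outputs.append(oil_in_place(a, h, phi / 100.0, sw / 100.0, bo))
    return sorted(outputs)

# Illustrative subset: the first eight rows of Table 1 (assumption: a real
# study would use every available reservoir, not eight).
AREA = [200, 250, 355, 1268, 388, 265, 445, 525]           # acres
PAY = [172, 72, 388, 125, 224, 250, 332, 338]              # ft
PORO = [27, 38, 21, 32, 20, 20, 26, 29]                    # percent
SWI = [28, 30, 40, 35, 37, 37, 25, 27]                     # percent
BOI = [1.24, 1.05, 1.17, 1.04, 1.30, 1.30, 1.16, 1.16]     # FVF

results = simulate([AREA, PAY, PORO, SWI, BOI])
p10, p50, p90 = (results[int(q * len(results))] for q in (0.10, 0.50, 0.90))
print(f"P10={p10:,.0f}  P50={p50:,.0f}  P90={p90:,.0f} bbl")
```

Because the sorted outputs are themselves an empirical CDF for N, percentiles can be read directly from the sorted list, which is how two alternative prospects could be overlaid and compared.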
Two issues arise. First, how do we describe each parameter in terms of a probability distribution? Second, to what extent do the parameters depend on one another?

TABLE 1-DATA FROM 26 OIL RESERVOIRS USED TO GENERATE SIMULATION INPUT DISTRIBUTIONS

                                                   Initial
A        h      φ      Swi    k       Boi      GOR
(acres)  (ft)   (%)    (%)    (md)    (RB/B)   (scf/B)
200      172    27     28     790     1.24     420
250      72     38     30     2,091   1.05     215
355      388    21     40     300     1.17     800
1,268    125    32     35     1,000   1.04     97
388      224    20     37     133     1.30     400
265      250    20     37     70      1.30     550
445      332    26     25     700     1.16     300
525      338    29     27     600     1.16     300
144      95     36     40     137     1.08     160
365      133    32     25     680     1.04     98
1,200    511    24     31     337     1.05     697
320      85     28     25     260     1.05     125
3,000    250    36     36     1,100   1.05     200
445      150    38     35     2,300   1.05     100
1,133    300    23     40     60      1.15     150
1,133    400    32     23     200     1.10     185
1,133    325    26     30     250     1.15     200
374      91     20     40     58      1.43     860
355      300    30     50     250     1.24     350
373      130    28     35     290     1.08     110
1,000    80     33     19     500     1.05     124
859      123    33     19     500     1.05     15
270      80     34     18     1,600   1.05     109
400      50     35     18     1,600   1.05     113
200      75     30     26     1,000   1.05     200
180      325    25     37     600     1.00     40

Distribution Results: Which Type Is Best? We used the data from 83 reservoirs (Plays 414 through 416) to construct histograms with 10 classes or spreadsheet "bins." Fig. 4 shows the field data (in symbols) and a matching theoretical distribution (in lines). Each parameter is displayed as a "density function," essentially obtained by connecting the top center points of the histograms. The area under the density function between A and B represents the probability that a data point falls between A and B. Thus, the total area under the curve is 1.00.
We used history-matching software (BestFit⁵) to match our data with each of 12 common distributions and to indicate the chi-square "goodness-of-fit" measure, which allows us to select one distribution that fits better than others. The distributions in Fig. 4 are either the best fit or among the top two or three best fits. Table 2 shows how the different distribution types fit one parameter, porosity. The arguments appearing with each distribution type specify the particular function in the class. The normal curve, for example, has a mean of 29.84 and a standard deviation of 5.36, whereas the gamma distribution that fits our data best has a shape parameter (alpha) of 30.03 and a scale parameter (beta) of 1.01.
The lognormal distribution fits both area and pay quite well. Porosity data is slightly skewed left and matched by the normal curve. Water saturation is matched best by a beta distribution but also could be matched reasonably well either by a lognormal or by a normal distribution.

Fig. 4-Density functions from history match used as inputs to simulation. Panels: Area, Lognormal(7.88e+2, 9.06e+2); Pay, Lognormal(2.50e+2, 2.27e+2); Porosity, Normal(29.8, 5.36); Water saturation, Gamma(21.56, 1.35); FVF, Beta(1.14, 9.04) + 1.00.

Little has been published about fitting distributions to field data. Triangular and lognormal distributions are widely used in exploration-prospect simulation. Yet, some evidence indicates that
certain parameters defy being matched universally by any type of distribution. Holtz⁶ recently showed that, while net pay was generally positively skewed and often might be represented by a lognormal curve, porosity and water saturation were skewed to the right or the left, depending on the particular plays involved. Lognormal distributions are always skewed right and thus would be inappropriate in some cases. Triangular distributions assign relatively high probability (compared with normal or lognormal distributions) to values toward the extremes, particularly when highly skewed. Peterson et al.,⁷ using North Sea drilling data, found the gamma distribution to be the best fit for such operational stages as mob and demob time, drilling time, and evaluation time.
As more people use Monte Carlo methods, the collective experience in history matching field data may reveal some good guidelines. For now, however, one should approach the input parameters with an open mind. When you do not have any analogous data, the problem becomes more subtle. Then you must rely on experts and fundamental principles. You can, and often should, run a simulation several times to see what effects choices of input parameters have on the results.

TABLE 2-GOODNESS-OF-FIT RESULTS FROM HISTORY MATCHES OF 12 DISTRIBUTIONS TO POROSITY DATA

Function                            Chi Square
Weibull (6.90, 31.79)               2.14 x 10^-3
Logistic (29.73, 3.37)              2.28 x 10^-3
Normal (29.84, 5.36)                4.38 x 10^-3
Triangular (11.25, 30.67, 41.16)    4.81 x 10^-3
Chi square (29.00)                  7.88 x 10^-3
Poisson (29.15)                     1.04 x 10^-2
Lognormal (29.83, 6.24)             0.031
Lognormal2 (3.39, 0.21)             5.01 x 10^-2
Erlang (30.00, 1.01)                5.34 x 10^-2
Gamma (30.03, 1.01)                 5.41 x 10^-2
Exponential (29.15)                 0.102
Geometric (3.32 x 10^-2)            0.104
Beta (6.31, 3.88) x 29.9 + 11.2     0.872
Erf (0.13)                          5.87 x 10^7
Pareto (33.01, 11.16)               2.02 x 10^13

Correlation Results. Tables 3 and 4 show the matrices of correlation coefficients and rank-correlation coefficients for the seven parameters on the basis of a smaller 26-point sample. It is customary in Monte Carlo simulation software to use rank-correlation coefficients to describe relationships between parameters because this measure is not influenced by either the types of underlying distributions or the magnitudes of the parameters.
Some correlation coefficients should come as no surprise, although coefficients as large as 0.7 are unusual. In many depositional environments, pay and area (r=0.36) are positively related. We have already mentioned porosity and initial water saturation (r=-0.47) and permeability and porosity (r=0.70). FVF and GOR (r=0.72) commonly also are correlated positively, as can be seen by examining numerical correlation models for PVT properties. Other coefficients could be influenced by mutual dependencies. For example, if A and B are highly positively correlated as are B and C, then A and C could not be highly negatively correlated.
Holtz⁶ also found some interesting relationships between parameters. Net pay and porosity displayed a positive correlation in the Woodbine sandstone but negative correlations in the Grayburg lime and the Morrow sand. Porosity and initial water saturation showed the expected negative correlations in the Woodbine and Penn plays.
How significant are these numbers? There is a significance test for correlation coefficients. The standardized normal (z-score) is obtained from the following.

$$z = \sqrt{\frac{n-3}{4}}\,\ln\!\left(\frac{1+r}{1-r}\right), \tag{6}$$

where $n$ is the number of data and $r$ is the correlation coefficient. For 26 data points, a coefficient of 0.45 is significant at the 98% confidence level.
In many cases, the difference between the rank-correlation and the raw-data correlation coefficients is minor, but Table 5 illustrates how dramatic the difference can be. The original data contained a value for permeability, 26,816 md, which was subsequently determined to be incorrect. The correct value, based on other data for the reservoir, was 26.815 md. The effect of making this correction for a single point on the raw data correlation coefficients between permeability and the other parameters is astounding. The effect of the correction on the rank correlations, however, was relatively minor.

How To Look for Dependency. When you have field data, generating crossplots can be a simple method of identifying dependency. Spreadsheet functions and commands can yield regression lines and correlation coefficients. Some care must be exercised because these tools quantify linear relationships between pairs of variables. With a little work, you can replace the raw data with their ranks and then do regression and correlation calculations. Figs. 5 and 6 illustrate the Cartesian and semilog plots of permeability vs. porosity. The relationship between the variables is more apparent in the semilog plot.
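The three ways of measuring dependence mentioned above (raw-data correlation, correlation on a semilog scale, and correlation of ranks) can be compared with a short sketch. It uses the porosity and permeability columns of Table 1. The function names, the average-rank treatment of ties, and the corrupted value (58 md replaced by 58,000 md) are assumptions made for illustration; the corrupted point is hypothetical and is not the erroneous value discussed in the paper.

```python
import math

def pearson(x, y):
    """Ordinary (raw-data) correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out

def rank_correlation(x, y):
    """Correlation of the ranks instead of the raw values."""
    return pearson(ranks(x), ranks(y))

# Porosity (%) and permeability (md) columns of Table 1.
POROSITY = [27, 38, 21, 32, 20, 20, 26, 29, 36, 32, 24, 28, 36,
            38, 23, 32, 26, 20, 30, 28, 33, 33, 34, 35, 30, 25]
PERMEABILITY = [790, 2091, 300, 1000, 133, 70, 700, 600, 137, 680, 337,
                260, 1100, 2300, 60, 200, 250, 58, 250, 290, 500, 500,
                1600, 1600, 1000, 600]

raw_r = pearson(POROSITY, PERMEABILITY)
semilog_r = pearson(POROSITY, [math.log10(k) for k in PERMEABILITY])
rank_r = rank_correlation(POROSITY, PERMEABILITY)

# Hypothetical data-entry error: one permeability off by a factor of 1,000.
corrupted = list(PERMEABILITY)
corrupted[17] = 58000.0
raw_r_bad = pearson(POROSITY, corrupted)
rank_r_bad = rank_correlation(POROSITY, corrupted)

print(f"raw {raw_r:.2f}  semilog {semilog_r:.2f}  rank {rank_r:.2f}")
print(f"with bad point: raw {raw_r_bad:.2f}  rank {rank_r_bad:.2f}")
```

A single wild value dominates the sums of squares in the raw-data coefficient, whereas in the rank calculation it can move by at most the number of data points, which is why the rank coefficient shifts less.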
TABLE 3-CORRELATION MATRIX FOR PLAY 415 USING ORIGINAL DATA

                                                                  Initial
              A        h       φ       Swi     k       Boi      GOR
              (acres)  (ft)    (%)     (%)     (md)    (RB/B)   (scf/B)
A             1.00
h             0.29     1.00
φ             0.19     -0.47   1.00
Swi           0.01     0.30    -0.37   1.00
k             -0.02    -0.41   0.70    -0.33   1.00
Boi           -0.21    0.15    -0.67   0.45    -0.47   1.00
Initial GOR   -0.10    0.40    -0.67   0.41    -0.37   0.72     1.00

TABLE 4-CORRELATION MATRIX FOR PLAY 415 USING RANKS OF DATA

                                                                  Initial
              A        h       φ       Swi     k       Boi      GOR
              (acres)  (ft)    (%)     (%)     (md)    (RB/B)   (scf/B)
A             1.00
h             0.36     1.00
φ             0.03     -0.50   1.00
Swi           -0.14    0.32    -0.40   1.00
k             -0.03    -0.34   0.65    -0.50   1.00
Boi           -0.01    0.26    -0.43   0.35    -0.46   1.00
Initial GOR   -0.06    0.37    -0.54   0.38    -0.38   0.77     1.00
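The significance test of Eq. 6 can be checked numerically for coefficients such as those tabulated above. The sketch below evaluates z = sqrt((n - 3)/4) ln[(1 + r)/(1 - r)] and converts it to a confidence level with the standard normal distribution. The function names and the choice of a two-sided test are assumptions for illustration. For n = 26 and r = 0.45 the result is z of roughly 2.32, consistent with the 98% level quoted in the text.

```python
import math

def correlation_z(r, n):
    """z-score of Eq. 6 for a correlation coefficient r from n data points."""
    return math.sqrt((n - 3) / 4.0) * math.log((1.0 + r) / (1.0 - r))

def two_sided_confidence(z):
    """Probability that a standard normal value lies within +/- |z|."""
    return math.erf(abs(z) / math.sqrt(2.0))

z = correlation_z(0.45, 26)
print(f"z = {z:.2f}, confidence = {two_sided_confidence(z):.3f}")
```

The same two functions can be applied to any entry of the correlation matrices, provided n is the number of data points behind that entry.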
TABLE 5-BEFORE AND AFTER CORRELATION MATRIX ROW; EFFECT OF REMOVING OUTLIER

Raw Data Matrix
                                                             Initial
         A        h       φ       Swi     k       Boi      GOR
         (acres)  (ft)    (%)     (%)     (md)    (RB/B)   (scf/B)
Before   -0.06    0.01    -0.09   -0.12   1.00    0.03     0.09
After    0.09     -0.11   0.43    -0.30   1.00    -0.31    -0.16

Ranks of Data Matrix
                                                             Initial
         A        h       φ       Swi     k       Boi      GOR
         (acres)  (ft)    (%)     (%)     (md)    (RB/B)   (scf/B)
Before   0.17     -0.16   0.41    -0.36   1.00    -0.32    -0.18
After    0.22     -0.19   0.46    -0.35   1.00    -0.36    -0.22
Simulation Results. Using @RISK,⁸ a spreadsheet add-in Monte Carlo simulation software, we ran two cases of a simulation with 500 trials. In both cases, we used the five distribution functions in Fig. 4 as inputs. In one case, we assumed that all the parameters were independent. In the other case, we used the rank-correlation matrix in Table 3 to link the parameters. Fig. 7 displays the results in the form of a CDF for N.
The actual ranges of values for N were truncated to make the comparison easier. Table 6 provides some simple numeric comparisons.
The explanation for the effect of dependent sampling is straightforward. In the dependent case, whenever a large value of area is sampled, the tendency is for correspondingly large values of net pay, porosity, and oil FVF and a relatively small value of Sw (hence, a relatively large value of oil saturation). The resulting product, N, is relatively large. Similarly, when a small value of area is sampled, the other parameters, except Sw, are small and a small value of N is calculated. This magnification of extremes accounts for the larger spread of the N values for the dependent case.

TABLE 6-COMPARISON OF OUTPUTS IN MODEL USING DEPENDENT AND INDEPENDENT INPUTS

                           Dependent Inputs   Independent Inputs
                           (million bbl)      (million bbl)
Expected value             509                276
10th percentile            15                 28
Median (50th percentile)   144                130
90th percentile            1,181              709

Fig. 5-Crossplot of permeability vs. porosity using a Cartesian scale (Plays 414, 415, and 416; horizontal axis: porosity, percent).

Fig. 6-Crossplot of permeability vs. porosity using a semilog scale (Plays 414, 415, and 416; horizontal axis: porosity, percent).

Fig. 7-Comparison of output distribution from simulation: dependent vs. independent inputs (@RISK simulation results; horizontal axis: N, MMSTB).

Discussion
When field data are available, they can be used to generate input distributions for a Monte Carlo simulation. The first step is to group the data and construct a histogram and a cumulative distribution function. Spreadsheet add-in software can accommodate these general shaped distributions and common theoretical distributions. Relatively new software automatically history matches data with common distributions and provides goodness-of-fit measures.
The field data can also be used to test for dependency between pairs of parameters. Crossplots are a good first step to identify relationships. When two or more parameters in the underlying model appear to depend on one another, the degree of dependence can be measured by regression and correlation tools. Any dependency of this sort can be included in the Monte Carlo simulation.
In a field example, several reservoir parameters were analyzed both for their types of distributions and for bivariate correlation. A simulation incorporating these distributions revealed that the range of output parameter, OIP, was affected by the decision to include dependence relationships.
Experts and fundamental principles supplement data analysis. Types of distributions and relationships among the parameters are
dependent on the environment being modeled. As more users examine field data, general guidelines may emerge. For now, users must look carefully at available data. It is always wise to run multiple simulations to compare the effects of different assumptions.

Nomenclature
a = exponential production decline rate
A = area, acres
Bo = oil FVF, bbl/B
Cg = gas content, scf/ton
f_f = fraction of fractures depleted
f_fw = fraction of fractures water filled
h = net pay thickness, ft
k = permeability, md
L = length, ft
ℓ_f = fracture systems spacing, ft
n = number of data points in z-score calculation
N = reserves, bbl
q = production rate, bbl/day
r = Pearson correlation coefficient
R = well recovery factor, fraction
Rc = coalbed methane recovery, scf
Rv = vertical well recovery, STB
Sw = water saturation, percent
t = time, month or year
z = standardized normal value
ρ = density, ton/acre-ft
φ = porosity, percent

Subscripts
H = horizontal
i = initial

References
1. Cronquist, C.: "Reserves and Probabilities-Synergism or Anachronism?," JPT (Oct. 1991) 1258.
2. Caldwell, R.H. and Heather, D.I.: "How To Evaluate Hard-To-Evaluate Reserves," JPT (Aug. 1991) 998.
3. Aitchison, J. and Brown, J.A.C.: The Lognormal Distribution, Cambridge U. Press, Cambridge (1957).
4. "Enhanced Oil Recovery," Natl. Pet. Council (1984).
5. "BestFit-Distribution Fitting Software for Windows," Beta Release 1.0, Palisade Corp., Newfield, NY (1993).
6. Holtz, M.H.: "Estimating Oil Reserve Variability by Combining Geologic and Engineering Parameters," paper SPE 25827 presented at the 1993 SPE Hydrocarbon Economics and Evaluation Symposium, Dallas, March 29-30.
7. Peterson, S.K., Murtha, J.A., and Schneider, F.F.: "Risk Analysis and Monte Carlo Simulation Applied to the Generation of Drilling AFE Estimates," paper SPE 26339 presented at the 1993 Annual Technical Conference and Exhibition, Houston, Oct. 3-6.
8. "@RISK-Risk Analysis and Simulation Add-in for Microsoft Excel," Release 1.1 User's Guide, Palisade Corp., Newfield, NY (1992).

SI Metric Conversion Factors
acre x 4.046 873 E+03 = m²
bbl x 1.589 873 E-01 = m³
ft x 3.048* E-01 = m
*Conversion factor is exact.

SPECA
Original SPE manuscript received for review July 11, 1993. Revised manuscript received Feb. 17, 1994. Paper accepted for publication Dec. 12, 1993. Paper (SPE 26245) first presented at the 1993 SPE Petroleum Computer Conference in New Orleans, July 11-14.
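The effect described under Simulation Results, where linking inputs widens the spread of a product-type output, can be reproduced qualitatively with a small sketch. Everything specific here is an assumption for illustration: the model is reduced to the product of area and pay, both inputs are lognormal with made-up parameters, and dependence is induced by correlating the underlying normal scores (a Gaussian copula). This is a stand-in technique, not the rank-correlation method implemented in @RISK, and the numbers are not those of Table 6.

```python
import math
import random

# Hypothetical lognormal inputs (log-mean, log-standard deviation);
# these are not fitted to the paper's data.
LOG_MEAN_AREA, LOG_SD_AREA = math.log(500.0), 0.8   # acres
LOG_MEAN_PAY, LOG_SD_PAY = math.log(150.0), 0.7     # ft

def simulate_product(rho, n_trials=500, seed=1):
    """Sorted area*pay values; rho correlates the normal scores of the inputs."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_trials):
        z_area = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # Mix the area score into the pay score to induce dependence.
        z_pay = rho * z_area + math.sqrt(1.0 - rho * rho) * noise
        area = math.exp(LOG_MEAN_AREA + LOG_SD_AREA * z_area)
        pay = math.exp(LOG_MEAN_PAY + LOG_SD_PAY * z_pay)
        outputs.append(area * pay)
    return sorted(outputs)

def percentile(sorted_values, q):
    """Value at cumulative fraction q of an already sorted list."""
    idx = min(int(q * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]

independent = simulate_product(rho=0.0)
dependent = simulate_product(rho=0.7)
for label, run in (("independent", independent), ("dependent", dependent)):
    p10, p50, p90 = (percentile(run, q) for q in (0.10, 0.50, 0.90))
    print(f"{label:12s} P10={p10:10.0f} P50={p50:10.0f} P90={p90:10.0f} acre-ft")
```

With positive dependence, large areas tend to be paired with large pays and small with small, so the 10th percentile falls and the 90th percentile rises relative to the independent case while the median moves comparatively little.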
