
User Guide for the E-FACIES v1.0

1. Introduction
This is the user guide for E-FACIES, a Windows-based software package for electrofacies characterization
based on multivariate analysis of well logs. Generally, a suite of well logs can provide valuable but
indirect information about mineralogy, texture, sedimentary structure, fluid content and hydraulic
properties of a reservoir. The distinct log responses in the formation represent electrofacies that very often
can be correlated with actual lithofacies identified from cores, based on depositional and diagenetic
characteristics. The importance of electrofacies characterizations in reservoir description and management
has been widely recognized.
The computational part of the software is written in Fortran 77 and the graphical interface in C++. The
program requires an IBM PC or compatible computer running Windows 95 or Windows NT.
The electrofacies classification works by identifying clusters of well log responses with similar
characteristics. The approach is data-driven: the classification does not require any artificial
subdivision of the data population but follows naturally from the unique characteristics of the well log
measurements, which reflect the minerals and lithofacies within the logged interval. A combination of
principal components analysis, model-based cluster analysis and discriminant analysis is used to
characterize and identify electrofacies types.
1.1 Objectives
- Better understanding of the overall data structure of well logs
- Determining electrofacies groups from well logs
- Improving permeability correlation or prediction through a data partitioning scheme
1.2 Scope of the Application
- Well to well correlation
- Development of permeability correlation
- Prediction of lithofacies or electrofacies
- Other applications related to pattern recognition
1.3 Problems and Suggestions
We developed this software at Texas A&M University for research purposes. The software may not be
fully user friendly, but we plan to improve it considerably. If the software crashes, please send
your problems and comments to datta-gupta@spindletop.tamu.edu or fax us at (979) 845-1307 with
reference to Dr. A. Datta-Gupta on the cover page.
2. Three Basic Modules


E-FACIES consists of the following three basic multivariate analysis modules.

- Principal Component Analysis (PCA): It reveals the structure of the well log data and can reduce
  the dimensionality without a significant loss of information.
- Model-based CLUSTering (MCLUST): It classifies a data set (well log data) into distinct groups
  (electrofacies) based on a measure of similarity or dissimilarity between groups.
- Discriminant Analysis: It assigns an individual well log response to one of two or more predefined
  groups (e.g. those found by MCLUST) on the basis of measurements.

2.1 Principal Component Analysis (PCA)


Principal components analysis (PCA) is used to summarize the data effectively and to reduce the
dimensionality of the data without a significant loss of information. First, to minimize the effects of
the scales and units of the log variables, the logs are standardized by subtracting the mean from each
reading and dividing by the standard deviation (scale option = 1).
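The standardization described above can be sketched in a few lines of Python. This is only an illustration, not the program's Fortran code: the function name and the gamma-ray readings are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the guide does not say which form E-FACIES uses.

```python
import statistics


def standardize(values):
    """Return z-scores: (reading - mean) / standard deviation."""
    mean = statistics.fmean(values)
    # Assumption: sample standard deviation (n - 1 denominator).
    std = statistics.stdev(values)
    return [(x - mean) / std for x in values]


# Hypothetical gamma-ray readings, for illustration only.
gamma_ray = [45.0, 60.0, 75.0, 90.0, 105.0]
z_scores = standardize(gamma_ray)
print([round(z, 4) for z in z_scores])
```

After this step every log has zero mean and unit standard deviation, so logs recorded in different units contribute on a comparable scale.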
Principal components constitute an alternative form of displaying the data, thereby allowing better
knowledge of its structure without changing the information. In addition, because the total variance in a
data set can be defined as the sum of the variances associated with each principal component, the first few
principal components that explain most of the variation in the original variables are often useful to reveal
the structure in the data. This can reduce the dimensionality of the problem and complexity in the cluster
and discriminant analysis.
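The statement that the total variance equals the sum of the principal component variances can be checked by hand for two logs, where the eigenvalues of the covariance matrix have a closed form. The sketch below is an illustration only: the function names and the two series of readings are hypothetical, and the sample covariance (n - 1 denominator) is an assumption.

```python
import math


def covariance_2d(x, y):
    """Sample variances and covariance of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    syy = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    return sxx, sxy, syy


def eigenvalues_sym_2x2(sxx, sxy, syy):
    """Eigenvalues of [[sxx, sxy], [sxy, syy]], largest first."""
    half_trace = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    return half_trace + radius, half_trace - radius


# Hypothetical readings of two logs, for illustration only.
log_a = [1.0, 2.0, 3.0, 4.0, 5.0]
log_b = [2.1, 3.9, 6.2, 7.8, 10.1]

sxx, sxy, syy = covariance_2d(log_a, log_b)
pc1_var, pc2_var = eigenvalues_sym_2x2(sxx, sxy, syy)
total_variance = sxx + syy              # trace of the covariance matrix
pc1_fraction = pc1_var / total_variance  # share of variance explained by PC1
print(round(pc1_fraction, 4))
```

When the first component carries nearly all of the variance, as it does for two strongly correlated logs, the second component can be dropped with little loss of information.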
2.2 Model Clustering Analysis (MCLUST)
The aim of cluster analysis is to classify a data set into groups that are internally homogeneous and
externally isolated on the basis of a measure of similarity or dissimilarity between groups. In this software,
model-based clustering, a hierarchical agglomerative clustering technique, is used. This approach can give
much better performance than traditional procedures such as single-link (nearest neighbor) and k-means
clustering, which often fail to identify groups that are either overlapping or of varying sizes and shapes.
Another advantage of the model-based approach is that there is an associated Bayesian criterion for
assessing the model. This provides a means of selecting not only the parameterization of the model, but
also the number of clusters without the subjective judgements necessary in other conventional cluster
analysis techniques.
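The guide does not name the Bayesian criterion that E-FACIES uses. As a stand-in, the sketch below uses the Bayesian Information Criterion (BIC) in the form common in the model-based clustering literature, BIC = 2 log L - m log n, where larger values are better. The log-likelihood values, the sample size and the choice of an unconstrained Gaussian mixture for counting parameters are all hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """BIC in the 'larger is better' convention: 2 log L - m log n."""
    return 2.0 * log_likelihood - n_params * math.log(n_obs)


def n_params_full_gaussian_mixture(n_clusters, n_dims):
    """Free parameters of a mixture with unconstrained covariances."""
    means = n_clusters * n_dims
    covariances = n_clusters * n_dims * (n_dims + 1) // 2
    weights = n_clusters - 1
    return means + covariances + weights


# Hypothetical maximized log-likelihoods for 2, 3 and 4 clusters.
log_likelihoods = {2: -1510.0, 3: -1420.0, 4: -1405.0}
n_obs, n_dims = 500, 2

scores = {
    g: bic(ll, n_params_full_gaussian_mixture(g, n_dims), n_obs)
    for g, ll in log_likelihoods.items()
}
best_n_clusters = max(scores, key=scores.get)
print(best_n_clusters)
```

The penalty term grows with the number of clusters, so a small gain in likelihood from an extra cluster is not enough to justify it. This is how a criterion of this kind replaces a subjective choice of cluster count.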
Banfield and Raftery developed a model-based framework for clustering by parameterizing the
covariance matrix in terms of its eigenvalue decomposition. This model has had considerable success in a
number of practical applications, including character recognition, tissue segmentation, minefield and
seismic fault detection, identification of textile flaws from images, and classification of astronomical data.
We used this model-based cluster analysis (MCA) in our approach.
The identified clusters can be viewed as distinct electrofacies groups that reflect the hydrologic,
lithologic, and diagenetic characteristics. If we have additional information such as core observations or
geological insight, the identified electrofacies groups could be calibrated to ensure their interpretable
geological meaning.
2.3 Discriminant Analysis
Discriminant analysis is a multivariate method for assigning an individual observation vector to one of
two or more predefined groups on the basis of measurements. It requires prior classification of the data into
relatively homogeneous subgroups whose characteristics can be described by the statistical distributions of
the grouping variables associated with each subgroup. Typically, the classification is performed by defining
the distinct groups based on the unique characteristics of well log measurements or by applying known
external geologic criteria such as core-derived lithofacies information. However, because in many
situations a training dataset with absolutely known classifications is not easily obtained, a method like
model-based cluster analysis is commonly used. In this software, group-specific probability density
functions were determined by the distinct electrofacies groups defined in the model-based cluster analysis
using a training dataset.
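The idea of assigning an observation to the group with the highest group-specific density can be sketched as follows. This is a simplified illustration, not the program's algorithm: it assumes Gaussian densities with independent variables (diagonal covariance) and prior probabilities proportional to group size, and the function names and training values are hypothetical.

```python
import math
from collections import defaultdict


def fit_groups(observations, labels):
    """Estimate per-group means, variances and prior from training data."""
    grouped = defaultdict(list)
    for obs, label in zip(observations, labels):
        grouped[label].append(obs)
    params = {}
    for label, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / (n - 1)
            for col, m in zip(columns, means)
        ]
        params[label] = (means, variances, n / len(observations))
    return params


def log_density(obs, means, variances):
    """Log of a Gaussian density with independent variables."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * var) - (x - m) ** 2 / (2.0 * var)
        for x, m, var in zip(obs, means, variances)
    )


def classify(obs, params):
    """Return the group with the largest log prior + log density."""
    return max(
        params,
        key=lambda g: math.log(params[g][2]) + log_density(obs, *params[g][:2]),
    )


# Hypothetical training set: (PC1, PC2) scores with cluster labels.
training = [(-1.0, 0.2), (-1.2, -0.1), (-0.8, 0.0),
            (1.0, 0.5), (1.3, 0.3), (0.9, 0.6)]
clusters = [1, 1, 1, 2, 2, 2]

group_params = fit_groups(training, clusters)
print(classify((-0.9, 0.1), group_params), classify((1.1, 0.4), group_params))
```

In E-FACIES the training labels come from the model-based cluster analysis, and the fitted groups are then used to classify new or blind-well data.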
3. Procedure for determining electrofacies

The following step-by-step procedure illustrates how to run E-FACIES.
(1) To start E-FACIES, execute E_Facies.exe. As with any Windows program, you can double-click the
program icon with the left mouse button or launch the executable from the Windows Run menu. When
the program starts, the following window (Fig. 1) appears.
(2) Create the input data in a simplified Geo-EAS formatted file (Fig. 2). The data are ordered row-wise.
The first two columns are reserved for the well name and the zone (marker) name. These may be followed
by the list of logs. The width of the first two columns must be 15 characters.

Figure 1

Figure 2
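One possible way to write a data row with fixed-width name columns is sketched below. The exact layout in Fig. 2 is not reproduced here, so several details are assumptions: that each of the first two fields is left-justified and padded to 15 characters, that the log values follow on the same row, and that the numeric format is free. The well and zone names are hypothetical.

```python
def format_row(well, zone, log_values, name_width=15):
    """Build one data row: two fixed-width name fields, then log values."""
    # Assumption: each name field is padded (or truncated) to name_width.
    names = f"{well:<{name_width}.{name_width}}{zone:<{name_width}.{name_width}}"
    # Assumption: values are separated by blanks; the format is illustrative.
    values = " ".join(f"{v:10.4f}" for v in log_values)
    return names + values


# Hypothetical well name, zone name and log readings.
row = format_row("WELL-A", "ZONE1", [75.2, 2.45, 0.18])
print(row)
```

Padding the names in code avoids the column misalignment that easily occurs when such files are edited by hand.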

(3) For PCA, click the Parameter File option under the PCA menu item. WordPad opens (Fig. 3) and
displays the parameter file. The parameters required for the PCA program are listed below and
shown in Fig. 4.

Figure 3

PCA Parameter
START OF PARAMETERS:
scfu5wells.dat        \input file name
7                     \number of variables
5 6 7 8 9 10 11       \column numbers
scfu5_pca.out         \output file name
2                     \method(1,2,3)
1                     \scale option (0=no, 1=yes)
1                     \inew (0=no, 1=yes)
scfug517.dat          \input file name of new data
g517_pca.out          \output file name of new data
Figure 4: An example parameter file for PCA

input file name: the input data in a simplified Geo-EAS formatted file.
number of variables: the number of variables.
column numbers: the columns for the variables in data file.
output file name: file for PCA results.
method: analysis option.
= 1: on sums of squares & cross products matrix.
= 2: on covariance matrix.
= 3: on correlation matrix.

In practice, users most often select either 2 or 3. If method option is 3, then the principal
components are based on the correlation matrix rather than the covariance matrix. That is, the variables
are scaled to have unit variance.
scale option: If set to 1, data will be standardized.
new data option: If set to 1, new data are read and their principal components are computed.
input file name of new data: the name of new data file.
output file name of new data: file for PCA results of new data.
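A parameter file of this kind is easy to read programmatically. The sketch below assumes that each line after START OF PARAMETERS: holds a value followed by a backslash-prefixed comment. The function name is hypothetical, and the parser is an illustration rather than the reader used by the Fortran programs.

```python
def parse_parameter_file(text):
    """Return (value, comment) pairs for each line after the start marker."""
    lines = text.splitlines()
    start = next(
        i for i, line in enumerate(lines)
        if line.strip().upper().startswith("START OF PARAMETERS")
    )
    entries = []
    for line in lines[start + 1:]:
        if not line.strip():
            continue  # skip blank lines
        value, _, comment = line.partition("\\")
        entries.append((value.strip(), comment.strip()))
    return entries


# Raw string so that the backslashes are kept literally.
EXAMPLE = r"""PCA Parameter
START OF PARAMETERS:
scfu5wells.dat      \input file name
7                   \number of variables
5 6 7 8 9 10 11     \column numbers
scfu5_pca.out       \output file name
2                   \method(1,2,3)
1                   \scale option (0=no, 1=yes)
1                   \inew (0=no, 1=yes)
scfug517.dat        \input file name of new data
g517_pca.out        \output file name of new data
"""

entries = parse_parameter_file(EXAMPLE)
columns = [int(c) for c in entries[2][0].split()]
print(entries[0], columns)
```

A useful consistency check before running PCA is that the number of variables matches the number of column numbers listed.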
(4) To execute PCA, click the Run submenu. If you need a scree plot, a bar plot of the variances of
the principal components labeled by the cumulative proportion of variance
$\sum_{i=1}^{k} \lambda_i / \mathrm{trace}(\Sigma)$, use the Scree Plot option (Fig. 5). Here
$\lambda_i$ is the variance of the i-th principal component and $\Sigma$ is the matrix being analyzed.
To plot the distribution of well logs in the PC1-PC2 domain, use the PC1-PC2 option under the
Plot submenu. To see the final results, click the Result submenu.

Figure 5
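The labels on a scree plot of this kind are the running sum of the principal component variances divided by their total, which equals the trace of the matrix being analyzed. A minimal sketch, with a hypothetical function name and hypothetical eigenvalues:

```python
from itertools import accumulate


def cumulative_variance_fractions(eigenvalues):
    """Running sum of eigenvalues divided by their total (the trace)."""
    total = sum(eigenvalues)
    return [partial / total for partial in accumulate(eigenvalues)]


# Hypothetical principal component variances, largest first.
eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]
labels = cumulative_variance_fractions(eigenvalues)
print(labels)  # [0.5, 0.75, 0.875, 0.9375, 1.0]
```

In this example the first three components account for 87.5% of the total variance, which is the kind of reading used to decide how many components to pass on to MCLUST.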

(5) For MCLUST, click the Parameter File option under the MCLUST menu item. WordPad opens and
displays the parameter file. The parameters required for the MCLUST program are listed below and
shown in Fig. 6:

input file name: the input data in a simplified Geo-EAS formatted file.
number of variables: the number of variables.
column numbers: the columns for the variables in data file.
output file name: file for MCLUST results.
shape: (a(i)) vector determining the shape of clusters for method options 1 and 2
signif: (isig(i)) vector giving the number of significant decimal places in each column of x.
scale: (s(i)) vector for scaling the observations. The i-th column of x is multiplied by scale[i]
before cluster analysis begins.

method: (icrit) integer that selects the criterion for agglomerative hierarchical clustering
= 1: S* (default), which is optimal for ellipsoidal clusters that are long and point in different
directions, perhaps even overlapping. The shape of all clusters is fixed but the size and
direction are allowed to vary among clusters.
= 2: S, which gives importance to the shape of the clusters. The shape of all clusters is fixed and
the volume of the ellipsoidal clusters is equal, but the orientation varies.
= 3: spherical (with varying sizes)
= 4: sum of squares (Ward's method) = k-means
= 5: unconstrained option
= 6: determinant (all clusters have the same variance)
= 7: centroid
= 8: weighted average link
= 9: group average link
=10: complete link (farthest neighbor)
=11: single link (nearest neighbor)
Note: options 7-11 are heuristic methods that are not model-based.
noise: indicates whether or not Poisson noise should be assumed.
new data file name: the name of the new data file.
number of clusters: the number of clusters to be determined. The program uses the tree$merge
component of the output from mclust to determine the classification corresponding to a given number
of clusters.
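How a merge history yields a classification for a chosen number of clusters can be sketched as follows. The layout of tree$merge is not described in this guide, so the sketch assumes the convention used by hierarchical clustering routines in S and R: one row per merge step, where a negative entry -j denotes observation j and a positive entry j denotes the cluster formed at step j. The function name and the merge table are hypothetical.

```python
def cut_merge_tree(merge, n_clusters):
    """Apply the first n - k merges and return a cluster label per observation."""
    n_obs = len(merge) + 1
    # Singletons are keyed by negative observation number, merged
    # clusters by the (positive) step at which they were formed.
    active = {-i: [i] for i in range(1, n_obs + 1)}
    for step, (a, b) in enumerate(merge[: n_obs - n_clusters], start=1):
        active[step] = active.pop(a) + active.pop(b)
    labels = [0] * n_obs
    # Number the clusters by their smallest member for stable output.
    for label, members in enumerate(sorted(active.values(), key=min), start=1):
        for obs in members:
            labels[obs - 1] = label
    return labels


# Hypothetical merge history for five observations.
merge = [(-1, -2), (-4, -5), (-3, 1), (2, 3)]
print(cut_merge_tree(merge, 2))  # [1, 1, 1, 2, 2]
```

Because the full merge history is stored, classifications for different numbers of clusters can be obtained without rerunning the clustering.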

MCLUST Parameter
START OF PARAMETERS:
scfu5_pca.out         \input data file name
4                     \number of variables
1 2 3 4               \column numbers
scfu_mclust.out       \output file name
1.0 0.2 0.2 0.2       \a(i): i=1,M
0.0 0.0 0.0 0.0       \isig(i): i=1,M
1.0 1.0 1.0 1.0       \s(i): i=1,M
1                     \icrit: 1,...,9
1                     \noise: (0=no,1=yes)
8                     \number of clusters

Figure 6: An example parameter file for MCLUST

(6) To execute MCLUST, click the Run submenu. The Plot submenu shows the distribution of clusters on
the two major principal components, PC1 and PC2 (Fig. 7). To see the final results, click the Result
submenu.
(7) For discriminant analysis, click the Parameter File option under the Discriminant Analysis menu
item. The parameter file will then be displayed. The parameters required for the Discriminant
Analysis program are listed below and shown in Fig. 8:

input file name: the input data in a simplified Geo-EAS formatted file.
number of variables: the number of variables.
column numbers: the columns for the variables and cluster in data file.
output file name: file for Discriminant Analysis results.
number of clusters: the number of Gaussian clusters to be predetermined.
prediction option: If set to 1, group membership of new data will be calculated using
discriminant functions based on the training data set. (e.g. new well data or blind well data)
input file name of new data: the name of new data file.
output file name of new data: file for Discriminant Analysis results of new data.

Figure 7
Discriminant Analysis Parameter
START OF PARAMETERS:
scfu_mclust.out       \input file name
5                     \number of variables
2 3 4 5 1             \columns for variables and cluster
scfu_discrim.out      \output file name
8                     \number of clusters
1                     \predict option (yes=1,no=0)
g517_pca.out          \input file name of new data
g517_ef.out           \output file name of new data
Figure 8: An example parameter file for Discriminant Analysis

(8) To execute Discriminant Analysis, click the Run submenu. To see the final results, click the Result
submenu, which offers two options: Summary of Analysis and Prediction. The Prediction option shows
the predicted group membership for the new data set.
4. References

Lee, Sang Heon and Datta-Gupta, A.: Electrofacies Characterization and Permeability Predictions in
Carbonate Reservoirs: Role of Multivariate Analysis and Non-parametric Regression, SPE 56658
presented at the 1999 SPE Annual Technical Conference and Exhibition held in Houston, Texas, 3-6
October 1999.
Banfield, J. D. and Raftery, A. E.: Model-based Gaussian and Non-Gaussian Clustering, Biometrics
(1993) 49, 803.
Bucheb, J. A. and Evans, H. B.: Some Applications of Method Used in Electrofacies Identification,
The Log Analyst (Jan.-Feb. 1994) 14.

Serra, O. and Abbott, H.T.: The Contribution of Logging Data to Sedimentology and Stratigraphy,
SPEJ (Feb. 1982) 117.
Wolff, M. and Pelissier-Combescure, J.: FACIOLOG-Automatic Electrofacies Determination,
paper FF presented at the 1982 SPWLA Annual Logging Symposium, Corpus Christi, July 6-9.

5. Authors and contact

Dr. Sang Heon Lee (shlee@tamu.edu), Research Associate, Texas A&M University
Arun Kharghoria (a0k9854@spindletop.tamu.edu), Ph.D. candidate, Texas A&M University
Dr. Akhil Datta-Gupta (datta-gupta@tamu.edu), Associate Professor, Texas A&M University
6. Disclaimer and copyright information

The program is written for research purposes. The authors are not liable for any possible error. No portion
of the information and documents provided may be reproduced in any form without prior permission from
Texas A&M University, except as allowed under the copyright laws and except that the user may produce a
single human-readable copy of the data and source code for internal use. The source code and software
cannot be redistributed without permission.
