PETE 630: Geostatistics


Homework 2
Deepthi Sen

After mean-centering the well log data of the 5 given wells and normalizing it by dividing each element by the standard
deviation of its log, the covariance matrix obtained is:

\[
\mathrm{cov(data)} =
\begin{bmatrix}
1.0000 & 0.0019 & 0.1266 & 0.2399 & 0.1969 & 0.2343 & 0.2619 \\
0.0019 & 1.0000 & 0.6466 & 0.5597 & 0.5295 & 0.5473 & 0.2946 \\
0.1266 & 0.6466 & 1.0000 & 0.6283 & 0.5888 & 0.6454 & 0.1986 \\
0.2399 & 0.5597 & 0.6283 & 1.0000 & 0.8578 & 0.8714 & 0.0127 \\
0.1969 & 0.5295 & 0.5888 & 0.8578 & 1.0000 & 0.8864 & 0.0871 \\
0.2343 & 0.5473 & 0.6454 & 0.8714 & 0.8864 & 1.0000 & 0.1062 \\
0.2619 & 0.2946 & 0.1986 & 0.0127 & 0.0871 & 0.1062 & 1.0000
\end{bmatrix}
\tag{1}
\]
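The standardization and covariance step can be sketched as follows. This is a minimal illustration, assuming the logs are stacked in a NumPy array with one row per depth sample and one column per log; the array `logs` is synthetic stand-in data, not the homework data set.

```python
import numpy as np

# Synthetic stand-in for the stacked well logs (hypothetical data):
# 200 depth samples (rows) x 7 logs (columns).
rng = np.random.default_rng(0)
logs = rng.normal(loc=50.0, scale=10.0, size=(200, 7))

# Mean-center each column and divide by its sample standard deviation.
z = (logs - logs.mean(axis=0)) / logs.std(axis=0, ddof=1)

# Covariance of the standardized data; rowvar=False treats columns as variables.
cov = np.cov(z, rowvar=False)

# Every column now has unit variance, so the diagonal is 1 and the matrix
# coincides with the correlation matrix of the raw logs.
print(np.round(np.diag(cov), 4))
```

The unit diagonal is why the trace of the covariance matrix equals the number of logs.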

This was decomposed into two orthogonal matrices and one diagonal matrix using singular value decomposition (SVD). The
eigenvalues are listed below:

\[
\mathrm{eig(cov)} =
\begin{bmatrix}
3.8007 & 1.3755 & 0.6851 & 0.5579 & 0.3382 & 0.1355 & 0.1071
\end{bmatrix}^{T}
\tag{2}
\]
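The link between the SVD factors and the eigenvalues can be sketched on a small example. The 3 x 3 matrix `c` below is a hypothetical symmetric positive definite matrix chosen for illustration, not the homework covariance matrix.

```python
import numpy as np

# Hypothetical symmetric positive definite matrix with unit diagonal.
c = np.array([[1.0, 0.6, 0.2],
              [0.6, 1.0, 0.4],
              [0.2, 0.4, 1.0]])

# SVD factors c into u @ diag(s) @ vt, with u and vt orthogonal.
u, s, vt = np.linalg.svd(c)

# For a symmetric positive semi-definite matrix the singular values equal
# the eigenvalues; NumPy returns s sorted in descending order.
eigvals = np.sort(np.linalg.eigvalsh(c))[::-1]

print(np.allclose(s, eigvals))
```

The eigenvalues sum to the trace of the matrix, which for standardized data equals the number of variables.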

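The fraction of variance explained by each principal component is calculated as

\[
\frac{\lambda_i}{\sum_{j=1}^{N} \lambda_j},
\]

where \lambda_i is the i-th eigenvalue and N is the number of eigenvalues.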

This turns out to be:


\[
\mathrm{fracofvar} =
\begin{bmatrix}
0.5430 & 0.1965 & 0.0979 & 0.0797 & 0.0483 & 0.0194 & 0.0153
\end{bmatrix}^{T}
\tag{3}
\]
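As a check on the arithmetic, the fractions can be recomputed from the eigenvalues in (2). This is a short sketch; the variable names are illustrative.

```python
import numpy as np

# Eigenvalues as listed in (2).
eigvals = np.array([3.8007, 1.3755, 0.6851, 0.5579, 0.3382, 0.1355, 0.1071])

# Fraction of variance explained by each principal component.
frac = eigvals / eigvals.sum()

# Cumulative fraction, useful for deciding how many components to keep.
cum = np.cumsum(frac)

print(np.round(frac, 4))
print(np.round(cum, 4))
```

The first two components together account for about 74% of the total variance, which supports using PC1 and PC2 for a 2-D representation.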

The scree plot is given below:

The set of principal components (eigenvectors of the covariance matrix) is given below:

\[
\mathrm{eigvec(cov)} =
\begin{bmatrix}
0.1224 & 0.6279 & 0.7612 & 0.0273 & 0.0958 & 0.0153 & 0.0328 \\
0.3794 & 0.2931 & 0.1193 & 0.5673 & 0.6535 & 0.0716 & 0.0430 \\
0.4123 & 0.1268 & 0.1444 & 0.4834 & 0.7417 & 0.0457 & 0.0844 \\
0.4695 & 0.1402 & 0.1981 & 0.1606 & 0.0676 & 0.8048 & 0.2064 \\
0.4643 & 0.0894 & 0.2054 & 0.3396 & 0.0786 & 0.5534 & 0.5536 \\
0.4759 & 0.0893 & 0.1243 & 0.2828 & 0.0396 & 0.1719 & 0.7994 \\
0.0923 & 0.6842 & 0.5371 & 0.4719 & 0.0357 & 0.0953 & 0.0432
\end{bmatrix}
\tag{4}
\]

Here, each column represents one principal component, ordered by decreasing contribution to variance.

The projection of the data points onto PC1 and PC2 is shown below. This is the 2-D representation of the entire
dataset:
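A minimal sketch of such a projection, assuming standardized data in an array `z` (synthetic here, not the homework data). For brevity it uses `np.linalg.eigh` rather than SVD to obtain the eigenvectors; for a symmetric covariance matrix both give the same principal directions up to sign.

```python
import numpy as np

# Hypothetical data: 200 samples x 7 logs, standardized column by column.
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 7))
z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

cov = np.cov(z, rowvar=False)

# eigh returns eigenvalues in ascending order; reorder to descending so
# that column 0 of eigvecs is PC1, column 1 is PC2, and so on.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Scores: coordinates of each sample along PC1 and PC2.
scores = z @ eigvecs[:, :2]
print(scores.shape)
```

The variance of the scores along each component equals the corresponding eigenvalue, and the two score columns are uncorrelated.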

The plots of all log data with the first two principal components are shown below:

Next, k-means clustering of the projected data into 4 clusters gives the distribution of data points among the
clusters shown below:
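The report does not state which k-means implementation was used. The sketch below is a plain Lloyd's algorithm written with NumPy, run on synthetic 2-D points standing in for the PC1/PC2 scores; the function name `kmeans` and its parameters are illustrative assumptions.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids with k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid (Euclidean distance).
        diffs = points[:, None, :] - centroids[None, :, :]
        labels = np.linalg.norm(diffs, axis=2).argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j)
            else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical 2-D scores: four blobs standing in for the projected data.
rng = np.random.default_rng(2)
centers = np.array([[-4.0, -4.0], [-4.0, 4.0], [4.0, -4.0], [4.0, 4.0]])
scores = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

labels, centroids = kmeans(scores, k=4)
counts = np.bincount(labels, minlength=4)  # number of points per cluster
print(counts)
```

Because the result depends on the random initialization, repeated runs with different seeds may be needed before comparing results for 4, 5, and 8 clusters.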

On increasing the cluster number to 5, we obtain the following distribution:

The cluster number was further increased to 8, resulting in the distribution below:

On plotting histograms of the GR and DT logs in each cluster, it is seen that GR is roughly unimodal, whereas DT is
bimodal in certain clusters. The histograms of the GR log for 4 clusters are given below:
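Per-cluster histograms of a log can be computed as sketched here. The GR values and cluster labels are randomly generated placeholders, not the homework data; shared bin edges are an assumption made so that clusters can be compared directly.

```python
import numpy as np

# Hypothetical GR values and cluster labels for 300 samples in 4 clusters.
rng = np.random.default_rng(3)
gr = rng.normal(loc=80.0, scale=20.0, size=300)
labels = rng.integers(0, 4, size=300)

# Shared bin edges make the per-cluster histograms directly comparable.
edges = np.histogram_bin_edges(gr, bins=20)

# One array of bin counts per cluster.
hists = {k: np.histogram(gr[labels == k], bins=edges)[0] for k in range(4)}

for k, counts in hists.items():
    print(k, counts.sum())
```

The same loop applies to the DT log or to 5 and 8 clusters by changing the input array and the range of cluster labels.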

On increasing the number of clusters to 5, the following plots were obtained for GR log:

On increasing the cluster number to 8, the GR log histograms are as shown below:

We see that the multimodality of the histograms decreases as the cluster number increases.
The histograms of the DT log for 4 clusters are given below:

We see that the bimodality of the DT log in each cluster is reduced by increasing the cluster number.
The DT log histograms on increasing the cluster number to 5 are given below:

On increasing the cluster number to 8, the DT log histograms are as shown below:
