By
A. J. Sinclair
Professor Emeritus
Geological Engineering
Dept. of Earth and Ocean Sciences
The University of British Columbia
6339 Stores Rd.,
Vancouver, B. C. V6T 1Z4
e-mail: ajsincon@shaw.ca
May 2005
Revised February 2009
SECTION 1 .......................................................................................................................................... 1
1.0 INTRODUCTION........................................................................................................................................... 1
1.1 Recent Perspectives ..................................................................................................................................... 1
1.2 ISO 9000 and Other Quality Systems.......................................................................................................... 1
1.3 Appraisal and Quality Control Plan for Resource Estimation ..................................................................... 2
1.4 Salting.......................................................................................................................................................... 2
1.5 Example: Bre-X ......................................................................................................................... 4
1.6 General Precautions..................................................................................................................................... 8
1.7 Course Illustrations Using P-res Software................................................................................................... 8
SECTION 2 .......................................................................................................................................................... 9
2.0 STATISTICAL PARAMETERS COMMONLY USED IN ERROR ANALYSIS ........................................ 9
2.1 Introduction ................................................................................................................................................. 9
2.2 Measures of Central Tendency .................................................................................................................... 9
2.2.1 Arithmetic Mean................................................................................................................................. 10
2.2.2 Median.................................................................................................................................................... 14
2.3 Measures of Dispersion (Spread) of Values .............................................................................................. 14
2.3.1 Introduction ........................................................................................................................................ 14
2.3.2 Variance.............................................................................................................................................. 14
2.3.3 Standard Deviation ............................................................................................................................. 16
2.3.4 Standard Error of the Mean ................................................................................................................ 16
2.4 Coefficient of Variation............................................................................................................................. 17
2.5 The Simple Linear Model.......................................................................................................................... 18
2.5.1 Introduction ........................................................................................................................................ 18
2.5.2 Assumptions Inherent in a Linear Model Determined by Least Squares ........................................... 19
2.5.3 A Practical Linear Model ................................................................................................................... 20
2.5.4 Choice of an Estimation Method. ....................................................................................................... 20
2.5.5 Example: Silbak Premier Gold Deposit............................................................................ 22
2.6 Displaying Data: the Histogram .............................................................................................. 24
2.7 Displaying Data: Scatter (x-y) Diagrams .................................................................................. 26
2.7.1 xy Plots Using P-res ........................................................................................................................... 26
SECTION 3 ........................................................................................................................................................ 28
3.0 STATISTICAL TESTS COMMONLY USED IN TREATING DUPLICATE AND REPLICATE
ANALYSES ........................................................................................................................................................ 28
3.1 Introduction ............................................................................................................................................... 28
3.2 F-test: Comparison of Variances ............................................................................................. 29
3.3 Student's t-test: Comparison of Means ..................................................................................... 31
3.4 Paired t-Tests ............................................................................................................................................. 32
3.4.1 Example: Silbak Premier Blasthole Data............................................................................................ 35
3.5 Significance of r, the correlation coefficient ............................................................................................. 36
3.6 Statistical Tests Involving the Linear Model............................................................................................. 36
3.6.1 Significance Test ................................................................................................................ 38
3.6.2 Example: Silbak Premier Gold Deposit............................................................................ 38
SECTION 4 ........................................................................................................................................................ 41
4.0 PRACTICAL MEASURES OF SAMPLING AND ANALYTICAL ERRORS .......................................... 41
4.1 The nature of errors ................................................................................................................................... 41
4.2 Relative Error ............................................................................................................................................ 46
4.3 Mean Absolute Difference......................................................................................................................... 47
4.4 Thompson and Howarth Error Analysis.................................................................................................... 48
4.4.1 Assumptions ....................................................................................................................................... 48
4.4.2 The Method ........................................................................................................................................ 48
4.5 The Simple Linear Model (Sinclair and Blackwell, 2002) ........................................................................ 50
SECTION 5 ........................................................................................................................................................ 53
5.0 SOURCES OF ERRORS: SAMPLING, SUBSAMPLING AND ANALYSIS............................................ 54
5.1 Introduction ............................................................................................................................................... 54
5.2 Sampling.................................................................................................................................................... 54
5.2.1 Chip Samples...................................................................................................................................... 56
5.2.2 Channel Samples ................................................................................................................................ 56
5.2.3 Drill Core Samples ............................................................................................................................. 56
5.2.4 Drill Cuttings Samples........................................................................................................................ 57
5.3 Subsampling .............................................................................................................................................. 61
5.4 Pierre Gy's Fundamental Sampling Error .................................................................................. 62
5.5 Analysis ..................................................................................................................................................... 66
Figure 5.10: Scatter plot of thirty-one pulps analyzed by both NiS and Pb fusions for Pd. ........... 69
5.5.1 Metallic Assays................................................................................................................................... 69
SECTION 6 ........................................................................................................................................................ 70
6.0 MONITORING AND QUANTIFYING DATA QUALITY ........................................................................ 70
6.1 Introduction ............................................................................................................................................... 70
6.2 Dealing With Standards............................................................................................................................. 70
6.2.1 Introduction ........................................................................................................................................ 70
6.2.2 Blanks ................................................................................................................................................. 73
6.2.3 Monitoring Subsampling Contamination............................................................................................ 74
6.2.4 Monitoring the Analytical Environment............................................................................................. 74
6.3 Laboratory QA/QC Procedures ................................................................................................................. 75
6.4 Duplicate QC Data During Exploration/Evaluation .................................................................................. 75
6.4.1 Sampling Plan for Quality Control ..................................................................................................... 76
6.4.2 Data Editing........................................................................................................................................ 76
6.4.3 Principal lab........................................................................................................................................ 76
6.4.4 Check Lab........................................................................................................................................... 76
6.5 Interpretation of Results ............................................................................................................................ 77
6.6 AN EXAMPLE OF A COHERENT DATA SET ..................................................................................... 78
6.6.1 Introduction ........................................................................................................................................ 78
6.6.2 Estimating Analytical Error of the Due Diligence Lab ...................................................................... 78
6.6.3 Estimating Analytical Error of the Original Lab ................................................................................ 80
6.6.4 Sampling Error ................................................................................................................................... 82
6.7 SOME COMMONLY ENCOUNTERED SITUATIONS ........................................................................ 85
6.7.1 HALF-CORE VERSUS QUARTER CORE ...................................................................................... 85
6.7.2 DATA SPANNING A VERY LARGE RANGE (Perhaps several orders of magnitude) ................. 87
6.8 ANALYSES BY ALTERNATIVE ANALYTICAL METHODS ............................................................ 88
6.9 LACK OF CONSISTENCY IN DUPLICATE DATA ............................................................................. 90
SECTION 1
1.0 INTRODUCTION
1.1 Recent Perspectives
In recent years there has been a strong international move toward knowing and improving the quality
of information used in the mining industry for resource/reserve estimation. In Canada this trend has been
accentuated because of recent, highly publicized scams that involved contamination of samples so as to produce
assay results far above the true metal contents of the samples in question. One important aim of quality control
procedures is to minimize the likelihood of such scams so that the public is not misled as to the economic
potential of a mineral deposit. In addition, quality control procedures serve the technical purposes of identifying
the sources of, and quantifying, both random errors and unintentional bias in sampling, subsampling and analytical
routines, and thus provide the basis for improved data collection procedures that translate into improved
resource/reserve estimates.
One of the important reactions in Canada to recent mining scams has been the implementation of what
is known as National Instrument 43-101 (NI 43-101), in which a wide range of requirements relating to the
publication of assays and resource/reserve estimates are laid out. These requirements identify a Qualified
Person (QP) who is responsible for all technical matters related to obtaining and publicizing both assay data and
resource/reserve figures. This course incorporates a variety of procedures designed to fulfill the requirements of
NI 43-101 insofar as standard, blank and duplicate samples can be used to define and monitor quality of assay
values that are the basis of deposit evaluation.
3. Specific and documented procedures, methods and work instructions for individuals in 2, above.
4. Testing, controls, check and audit programs appropriate to the various work stages.
5. A method to achieve any necessary changes and modifications to the quality plan as the projects advance.
1.4 Salting
Salting, the surreptitious introduction of material into samples (McKinstry, 1948), has been with us
since the time of Agricola (Hoover and Hoover, 1950). Salting of possible ore material is an unfortunate
occurrence in the evaluation of some deposits and its presence or absence must be assured during the process of
data verification. Such invalid data cannot be recognized by the use of blanks, standards or reference samples
but must be checked by a program including examination of sample rejects as well as resampling. Salting by
the addition of particulate gold to samples of gold deposits at some stage in the subsampling procedure is
unlikely where very low grades are reported because such salting normally leads to an abundance of moderate
to high values. Consider the following hypothetical example: a single particle of gold in a subsample to be
assayed generally will lead to an assay well in excess of the 0 to 4 g/t range. One grain of gold in a subsample
consisting of 1 million grains (gold grain assumed to be the average size of non-gold particles) is equivalent to
an average grade (for the 1 million grain sample) of about 7 g Au/t.
Details of the calculation: 1 grain of gold is 19/2.8 times as heavy as an equivalent-sized grain of most gangue minerals. Consequently, the 1
million grain sample contains about 19 grams of Au per 2,800,016.2 grams of sample (i.e., about 7 grams Au per 1 million grams of sample,
or 7 grams per tonne).
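The footnote calculation above can be reproduced with a short script. This is a sketch under the stated assumptions: one gold grain among 1 million equal-volume grains, with densities of 19 for gold and 2.8 for typical gangue.

```python
def grade_from_single_gold_grain(n_grains=1_000_000, rho_gold=19.0, rho_gangue=2.8):
    """Grade (g Au/t) implied by one gold grain among n_grains equal-volume
    grains, where grain masses scale with density (g/cm^3)."""
    gold_mass = rho_gold                                   # the single gold grain
    total_mass = (n_grains - 1) * rho_gangue + rho_gold    # whole subsample
    return 1_000_000 * gold_mass / total_mass              # 1 tonne = 1e6 g

print(round(grade_from_single_gold_grain(), 1))  # about 6.8, i.e., ~7 g Au/t
```

The result (about 6.8 g Au/t) confirms that even a single contaminant grain in a fine pulp drives the assay well above the 0 to 4 g/t range mentioned above.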
The surest method to verify that salting has not occurred where half-core samples are involved is to
undertake duplicate sampling of the other half of the core and compare the assays for the second set of samples
with the previously reported data (e.g., McKinstry, 1948). Of course, this approach assumes that the second set
of half cores has neither altered due to natural processes nor has itself been contaminated in any way. Rejects
and pulps would not be re-analyzed as an analytical test for salting in the example cited because they might
already be contaminated. Rather, rejects and pulps (or heavy mineral concentrates from them) should be
examined by microscope to check for particulate, contaminant gold, i.e., shavings of refined gold or grains of
naturally occurring alluvial or colluvial gold.
In the case of gold assays, optical investigation of pulps and rejects with a microscope can be a partial
approach to recognizing the presence or absence of salting in cases where either placer gold grains or refined
gold shavings have been purposefully added to samples. Both types of contaminant gold are relatively straightforward for a mineralogist to identify. The simple procedure of testing a rough heavy mineral concentrate,
taken from samples intermittently for microscope examination, is a small price to pay for the benefit that arises
in establishing the absence of particulate, contaminant gold at an early stage in a scam.
Precious metal deposits are particularly susceptible to salting, perhaps in part because the romantic
lore surrounding highly publicized gold rushes of the 19th century (California, Klondike, Witwatersrand, etc.)
provides a picture that scam artists of today can draw on. Today's scam artists are particularly
innovative in their approaches to salting. Consequently, as Rogers (1998) emphasizes, security of samples
(chain of custody) from the time of their taking until assays are received (i.e., an open and clearly defined chain
of custody of samples and assays, strengthened through "target hardening") is the safest way of being confident
that salting has not occurred. Danger signs or "red flags" that alert one to the possibility of subterfuge in mineral
deposit evaluation, after Rogers (1998), are given in Table 1-1. Of course, it is impossible to foresee all of the
creative approaches that scam artists might take; hence, vigilance with regard to the samples and the sampling and
assaying procedures is essential.
TABLE 1-1: RED FLAGS IN MINERAL DEPOSIT EVALUATION (after Rogers, 1998)

Drilling Program
(i) consistently poor drill core recovery in mineralized intervals
(ii) skeletonized core or missing intervals in retained splits
(iii) inability to replicate lithology between adjacent drill holes
(iv) failure to maintain continuity of raw core and splits
(v) core logging inconsistencies, particularly concerning alteration/mineralization

Sampling and Assaying
(i) unaccounted delays in sample shipment
(ii) identical sample numbers used more than once
(iii) presence of chemicals, equipment and/or materials not related to work or site
(iv) drill results inconsistent with trench or channel samples
(v) inability of independent labs to replicate original assay results within acceptable limits

Resource Estimation
(i) resource estimates inconsistent with other deposits of same type
(ii) resource estimates inconsistent with other deposits in same area
(iii) resource estimates not reproducible by independent audit

Petrographic, Mineralogical and Mineral Processing Audit
(i) inconsistent mineral assemblages for deposit type
(ii) mineral processing/metallurgical anomalies; unusual metal recovery characteristics
1.5 Example: Bre-X
The recent Bre-X scandal is reputedly the most significant mining scam of modern times, with losses
said to be of the order of Can$6 billion. In brief, the scam seems to have been perpetrated by a small group of
employees who, at an in-transit storage location, reopened bags containing samples and contaminated them
with carefully weighed amounts of placer ("flour") gold. Difficulties arose in 1996 in reproducing assays, leading
to an auditing firm being employed to verify the assays. The resulting report (Farquharson et al., 1996) noted
a number of observations considered to be indicative of potential problems with the Bre-X sample data, as
follows:
1. We were surprised when hearing that all of the core from Busang was being assayed with the
exception of the 10-cm half-skeletons that were retained from each one metre of core. The
reason offered for this decision (which does not conform with normal industry practice) was
that the coarse nature of the mineralization required a large sample. However, the HQ core size
is larger than is normally used in many exploration programs and results in a sample weight of
14 kg for every two metres; one-half of this sample, as we have used in our audit, would still
provide a sample of 7 kg, more than enough to be representative of the two metres. The basic
reason for the industry practice of retaining half the core is to be able to review the geology at
any time and to carry out check assays should there be any questions as to the accuracy of the
original assays.
2. The decision by Bre-X to designate some core as "mineralized" and other core as "in-fill" is
very different from what one would expect as a standard practice particularly given that the two
categories of material would follow different sampling routes. The in-fill material was treated in
the sample preparation facility at Busang to produce a pulverized pulp that would be ready for
assaying and invariably resulted in low gold values. The mineralized core was bagged,
delivered to Samarinda and subsequently taken to Balikpapan, and usually some gold values
resulted. Normal mining industry practice would be to have all the core treated exactly the same
way through the same facilities. With the very large backlog of core to be sampled at Busang
we were surprised to note that during our period at the property the sample preparation facility,
which is very well equipped, was idle and the employees assigned to that area advised that they
had no material to process.
3. In reviewing the Kilborn intermediate feasibility study of November 1996 and in particular the
metallurgical work carried out under the supervision of Normet Pty. Ltd. in Australia, we were
struck by the statement that more than 90 percent of the gold in the Busang metallurgical
samples could be recovered in a gravity concentrate that represented about 6 percent of the
weight of the feed. Although one could expect to see such exceptional recoveries in a gravity
circuit for material coming from an alluvial deposit, we have never before seen such a response
for material coming from a primary deposit. In a mineralogical study carried out for Normet by
Roger Townend and Associates of Perth, they identified the gold particles and made the
following comment: "gold particles are liberated and mostly 100 to 400 microns. Some particles
show distinct gold-rich rims with argentian cores, other particles are of uniform colour of
varying silver content. Gold particle shapes were mostly rounded with beaded outlines." With
the very coarse liberated gold of up to 400 microns reported in this study, one would have
expected to see visible gold somewhere in the many thousands of metres of drill core from
Busang. However, there is no mention of visible gold in any of the documentation that we have
seen with the Kilborn feasibility study or resource estimate, or in the drill logs prepared by Bre-X geologists, other than in the mineralogical studies done on the concentrate samples.
4. In a parallel metallurgical investigation carried out at Hazen Research Inc. in Golden,
Colorado, and as reported on October 24, 1996 in a Hazen report included in the Kilborn
intermediate feasibility study, similar remarkable metallurgical and mineralogical characteristics
were discussed. In fact, if there has not been a typographical mistake, Hazen reported
recovering 91 percent of the gold in a gravity concentrate that was less than one percent of the
material delivered to the gravity circuit. Similarly, a mineralogical study was done by Hazen,
and they reported that the gold particles in the Busang composites were liberated as "relatively
coarse nuggets and minor flakes with an average size range of 60 to 180 microns. The gold
grains are typically very compact and often nearly spherical in shape." Photographs were
included as they were with the report prepared by Roger Townend and Associates, showing
coarse liberated gold lying amongst much smaller size particles of pyrite and other gangue
material. Hazen also pointed out in their test work that there was no observable trend with
respect to gold recovery and residual tailing grade as a function of fineness of grind. In other
words, it mattered little how fine the ore was ground in the grinding circuit, the gold was
already liberated and minimal grinding was required, which is most unusual for ore originating
from a primary deposit. The deposit was therefore a metallurgist's dream, but no reconciliation
was made with the fact that no visible gold had been reported in core samples. In their
mineralogical study, Hazen also commented on and provided a photomicrograph of the colour
of the Busang gold indicating that it was electrum but with a deep yellow rim suggesting
surface dissolution of some of the silver, a fact that has since been independently confirmed by
both Freeport and ourselves. Both Hazen and Normet have commented on the great difficulty in
repeating grades for assays of the same samples that were used in the metallurgical test work, a
feature that many others have commented on, and this is a reflection of the coarse, free, and
rounded nature of the gold grains in the Bre-X samples.
5. A very comprehensive petrographic report was prepared for Bre-X in September 1996 by
PetraScience Consultants Inc. of Vancouver, in which they reviewed the characteristics of
alteration and mineralization of 103 samples from Busang. All the samples were described in
great detail petrographically, from polished thin sections that allowed description of sulfides,
oxides, and silicate minerals. Amongst the conclusions from this report was the following
statement: "Gold is assumed to occur dominantly as free grains, as no unequivocal gold was
observed in this study, either as grains or in other sulfides." It is again remarkable that, in a
deposit where so much coarse gold has been observed in samples submitted for assaying,
none was observed in the 103 samples that were selected to represent the full range of rock
types and associated alteration in the Busang deposit.
6. With a deposit the size and grade reported for the Southeast Zone at Busang, and with its close
proximity to surface (based on the assays reported for many holes), one would have expected a
strong surface expression either through geochemical sampling of soils and sediments or
through sampling of outcrop. There is a geochemical anomaly over the Central Zone, which is
consistent with the mineralization reported to be there in the drilling program of 1989 and later.
We were unable to find any evidence of a geochemical survey on the Southeast Zone, but were
given a surface map showing where outcrops had been sampled, with very low gold values
resulting; this is not what would normally be expected for a deposit of several hundred million
tonnes grading 2 to 3 g/t, with almost no stripping required to commence mining.
A number of additional problems with the Bre-X situation have been listed by others (e.g., Jones et al., 1998;
Lawrence, 1998):
7. Three separate databases were provided to Freeport-McMoRan for their due diligence of early
1997. There were discrepancies of thousands of samples between any two of these databases.
One particular problem was the existence of 3864 sample numbers for which duplicate values
existed: one set showed ore grades whereas the other contained little or no gold.
8. Extremely unusual rapid rate of growth of reported resources during exploration.
9. The magnitude of the reported resource is so far above normal as to demand thorough
verification.
10. Normal industry practice is to have all core samples treated by the same facility with the same
protocol, rather than the two very different subsampling/analytical paths used by Bre-X
personnel for their "mineralized" and "fill-in" (unmineralized) categories of samples, which were
based on visual classification of sample material.
11. Publicity regarding the deposit routinely included the 50 percent of resources that were in the
Inferred category and to which no emphasis should have been attached.
"Target hardening" is the strategic strengthening of high-risk areas in exploration/evaluation to reduce
the risk of tampering. To effect target hardening it is imperative to clearly understand the details of all aspects
of information gathering and transfer, including collection of information, shipment of samples, sample
analysis, and reporting of assay results. This general sequence is known as the "chain of custody" of
samples/information, and the chain should be clearly documented. Safety lies to a large extent in adherence to
a selection of so-called "best practices" that are widely used and recommended throughout the mining industry.
An indication of some of the more important best practices is given in Table 1-2.
TABLE 1.2: A SELECTION OF BEST PRACTICES IN THE MINERAL INDUSTRY (after Rogers,
1998)
SECTION 2
2.0 STATISTICAL PARAMETERS COMMONLY USED IN ERROR ANALYSIS
2.1 Introduction
Quality control procedures, pertaining to assay data for use in mineral deposit appraisal, are aimed, in
part, at understanding sources and magnitudes of errors in the assay values reported from an analytical
laboratory. This information is used, where necessary, to improve aspects of the sampling-subsampling-analytical system in order to maintain errors at an acceptably low level. It is important to realize that errors in
assay data exist, however good the individuals are who provide the data. Furthermore, it is important to
appreciate that unless errors are known quantitatively it is not possible to be confident about whether or not
they are at acceptable levels or how they can best be reduced to acceptable levels. It is impossible to deal
quantitatively with errors in assay values without an understanding of some basic statistical concepts and
procedures, including measures of central tendency and dispersion, and various tests such as t- and F-tests.
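As a quick sketch of the measures this section introduces, the following computes them for a small set of hypothetical assay values using Python's standard library (the numbers are invented purely for illustration):

```python
import statistics as st

assays = [1.2, 0.8, 1.5, 2.1, 0.9, 1.1, 3.4, 1.0]  # hypothetical assays, g/t

mean = st.mean(assays)          # arithmetic mean (central tendency)
median = st.median(assays)      # median (robust to the 3.4 outlier)
variance = st.variance(assays)  # sample variance (n - 1 divisor)
sd = st.stdev(assays)           # sample standard deviation
sem = sd / len(assays) ** 0.5   # standard error of the mean
cv = 100 * sd / mean            # coefficient of variation, percent

print(f"mean={mean:.2f} median={median:.2f} s={sd:.2f} SEM={sem:.2f} CV={cv:.0f}%")
```

Note that the mean (1.50) exceeds the median (1.15) for these positively skewed values, the behaviour illustrated by the histograms discussed below.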
Figure 2.1: Histogram of a hypothetical data set of integer values (abscissa) illustrating that mode,
median and mean can differ for a single data set that is skewed. Numbers of items in each class are
listed in each bar of the histogram.
Figure 2.2: Three examples of drill-core assays presented as histograms. (A) negatively skewed,
(B) symmetric, (C) positively skewed. A smooth normal curve has been fitted to the symmetric
histogram to emphasize the similarity of histograms and data distributions.
Figure 2-3: A sequential plot of 58 analyses of a Pt standard. The lower horizontal line is the mean
value for reference. The upper horizontal line is one standard deviation above the mean. Note that
the mean value is biased high relative to the true mean because a single, obvious outlier has been
included in the estimate.
TABLE 2.1: CONSECUTIVE Pt AND Pd VALUES OBTAINED DURING NUMEROUS BATCH PGM
ANALYSES OF ROUTINE SAMPLES (Values in ppb)
In-House Std
Seq Pt Pd
1 30 140
2 41 103
3 31 124
4 116 100
5 28 102
6 33 103
7 60 103
8 29 109
9 35 86
10 33 98
11 33 106
12 38 93
13 30 91
14 26 86
15 37 100
16 57 84
17 47 134
18 32 122
19 35 113
20 27 145
21 31 117
22 32 99
23 58 104
24 33 86
25 33 100
26 31 115
27 28 85
28 38 96
29 30 89
30 42 152
31 42 99
32 39 98
33 23 101
34 32 120
35 30 92
36 28 96
37 30 109
38 33 93
39 32 93
40 36 89
41 22 97
42 46 81
43 27 99
44 38 95
45 30 99
46 31 103
47 49 71
48 19 76
49 23 93
50 27 88
51 22 79
52 31 75
53 38 79
54 47 91
55 28 98
56 29 94
57 53 97
58 36 125
The data in Table 2.1 are consecutive Pt and Pd analyses of pulp samples of an in-house standard (not
a CRM, certified reference material); each 2 to 6 consecutive analyses were included in separate analytical
batches analyzed by a reputable lab. The standard was constructed by mixing a number of subsamples of mill
feed and blending the composite material. Normally, supporting data would include many analyses by other
reputable labs and an analysis of the results to provide a best estimate of the true metal contents as well as
their 95% confidence limits. A plot of the Pt data is shown in Figure 2.3, where the presence of a single outlier
is evident. The statistics for the 58 Pt values are: m = 35.8 ppb, s = 13.9 ppb. If the outlier is removed these values
change to m = 34.4 ppb and s = 8.9 ppb; note that with exclusion of the outlier the mean is lowered by
100(1.4/35.8) = 3.9% whereas the standard deviation is reduced by 100(5/13.9) = 36%. The general precision
is then (200 x 8.9/34.4) = 51.7%, hardly the quality of reproducibility that one would hope for in a standard.
These data illustrate some of the practical problems encountered with in-house standards:
1. The presence of an outlier value (xi = 116 ppb Pt) in the Pt data; the outlier must be omitted to arrive at a
best value for the Pt content of the standard (see Figure 2.3).
2. A relatively high level of variability among batch mean values; examine the variability of averages of
groups of 3 to 6 values.
3. A relatively high level of variability of repeat values within individual batches; examine the internal
variability of local groups of 3 to 6 values.
Explanations for the high variability of analytical results are difficult without further tests but possibilities
include:
1. Inappropriate analytical procedure.
2. Poor procedures for homogenizing the pulped material of the standard. Special attention must be directed
to homogenizing the pulp material that comprises the standard.
3. Poor choice of material for a standard, perhaps mineralogically complex and difficult to homogenize
because of the inability to redistribute trace amounts of some of the valuable minerals uniformly
throughout the standard. This is a very likely cause of the poor precision in these data.
In addition, without supporting data from other independent labs, these data only provide a measure of
precision but not accuracy!
The data of Table 2.2 are a summary of average metal contents based on 24 repeat analyses of a standard
used to monitor internal lab reproducibility. The data by themselves do not provide a measure of accuracy
because the standard has not been treated as a CRM (certified reference material) so the metal contents have not
been established adequately. However, the data are typical and represent an example of the level of
reproducibility (precision) that can be expected for routine multielement analyses (low levels of concentration)
by reputable commercial labs. Note the very large differences in precision for various metals in the same
standard, a situation that is in part attributable to different abundance levels and in part to variable difficulty in
analyzing different metals with a single analytical procedure.
TABLE 2.2: AVERAGE GRADES OF 24 REPEAT ANALYSES* OF A SINGLE IN-HOUSE
STANDARD BY A REPUTABLE COMMERCIAL LABORATORY
Element and units   Average   Std. Dev.   Rel. Err. (s/m)   Precision 200(s/m)
Pd_ppb              3882      150.2       0.039             7.74%
Pt_ppb              221.5     17.03       0.076             15.4%
Cu_ppm              13390     601.5       0.045             9.0%
Au_ppb              87.3      15.7        0.180             35.9%
Ni_ppm              408       39.2        0.096             19.2%
Co_ppm              57.1      2.29        0.04              8.0%
* Standards were inserted with various subsets of samples by the client and were not recognizable
as standards by the lab.
The value m in equation 1 is also spoken of as the expected value of the data. Commonly, estimation of the
average using equation 1 provides an acceptable estimate of m although difficulties can arise in certain cases.
For example, the presence of an outlier in the data can bias the estimation of m in some situations;
consequently, data should be examined for outliers and they should be removed if they impose bias on the
estimate. Consider the case of repeated analyses of a standard by several labs for which one lab has a single
value that is a factor-of-three times the average of all labs. Inclusion of that single value in estimating the
average metal content of the standard will clearly bias the value to be applied to the standard. The data of Table
2.1 show the large difference in estimated average value with and without inclusion of the outlier (116 ppb Pt).
The average excluding the outlier compares well with averages obtained from other laboratories that formed
part of a round robin analytical program to establish an accepted value for the standard. The average that
includes the outlier is clearly out of line relative to data from all the other laboratories.
A second concern in evaluating the average value using equation 1 arises where there are very few
data. The arithmetic average based on only a few data is highly susceptible to a single high (not necessarily an
outlier) value. In such cases, where very few data are available, the median value is a better measure of central
tendency.
Figure 2.4: Histogram of differences of paired analyses of gold (Au1 − Au2) for pulps and rejects by
two different labs in 1983. The mean difference (vertical line through central class interval) is off-centre mainly because of a single value on the far right that destroys the symmetry of the histogram.
An important use of histograms is illustrated in Figure 2.4. In this example, duplicate analyses exist, an original
by the mine lab and a reject analysis by an independent lab. Generally, the purpose of analyses by an
independent lab is to monitor bias and duplicate pulp analyses are appropriate. The use of reject analyses for
this purpose introduces unnecessary errors (hence, wide spread of values about the mean) and clouds the
quantitative interpretation of the data.
2.2.2 Median
To find the median of a data set, arrange the data in either increasing or decreasing order, locate either
the mid value (odd number of data) or the two mid values (even number of data) and accept the corresponding
value or average of two, as the median. Median values rather than averages are an important part of the
Thompson-Howarth error estimation procedure to be described in a later section. For strongly skewed
distributions that are sampled by relatively few values, the median provides a more acceptable estimate of the
mean than does the arithmetic average.
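The procedure can be sketched in a few lines of Python; the assay values below are hypothetical, chosen to give a positively skewed set in which a single high value inflates the arithmetic average.

```python
from statistics import mean, median

# hypothetical positively skewed assays (g/t): one high value inflates the mean
assays = [0.4, 0.5, 0.6, 0.7, 0.9, 1.1, 8.5]

# median: sort the data, take the mid value (odd n) or average the two
# mid values (even n); statistics.median does exactly this
print(median(assays))   # → 0.7
print(mean(assays))     # pulled well above the median by the single high value
```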
2.3.2 Variance
The variance of a population is the mean squared difference from the average and is estimated from
a data set as follows:

s² = Σ(xi − m)²/(n − 1)    (2)

where s² is the variance, xi represents the n data items for i = 1 to n, and m is the arithmetic average of the data.
From the definition it is evident that the variance is a measure of average spread about the average value, albeit
in squared units. The variance is a fundamental measure of spread (dispersion) of data and is
widely used to characterize average error in a data set or a subset of the data. For example, equation 2 can be
applied to the Pt data of Table 2.1 (excluding the outlier) and the spread of values about the average, expressed as
a variance, is calculated to be 78.2 ppb² Pt.
The variance has the particularly useful property that it is the sum of the variances that arise from
different sources. As an example, consider the total error that arises in comparing the analyses of first half-core samples with the analyses of the other (second) half cores (i.e., the practical situation in which half cores
are the original sample and the remaining half cores are the check field samples). In this case the total error as a
variance, st², is given by (see Burn, 1981; Kratochvil and Taylor, 1981)

st² = sa² + sss² + ss²    (3)

where sa² is the analytical error, sss² is the subsampling error and ss² is the sampling (geologic) variability. With
appropriate duplicate data, each of these sources of variability can be estimated and, if necessary, consideration
can then be given to how best to improve the quality of the data.
Because it is only as squared values that errors are additive, variances are a sensible way of
considering the relative magnitudes of analytical, subsampling and sampling variabilities. Several examples are
shown in Table 2.3.
TABLE 2.3: SOURCES OF ERROR AS PERCENTAGES OF TOTAL ERROR (ALL ERRORS AS VARIANCES)
Analytical %   Subsampling %   Sampling %   Comment
5              15              80           Toronto Stock Exchange example
37             -               55           Platinum deposit
12.8           0*              87.2         Bushveld platinum
44.7           0*              55.3         Bushveld palladium
72.0           20.7            7.3          Epithermal gold
96.7           3.1             0.2          Smee and Stanley, 2005
* Subsampling error is zero because entire sample was pulped.
The figures of Table 2.3 are source error percentages of the total error, with all errors expressed as variances. It is
important to appreciate the significance of the squared aspect of errors expressed as variances. Consider a gold ore
averaging 2.5 g/t with an average total error, as a variance, of 4.4 (g/t)². Assume that the disposition of errors is
as in the Toronto Stock Exchange example of Table 2.3. Thus, the absolute value of the analytical error variance is
0.05 x 4.4 = 0.22 (g/t)², the absolute value of the subsampling error variance is 0.15 x 4.4 = 0.66 (g/t)², and the
absolute value of the sampling error variance is 0.8 x 4.4 = 3.52 (g/t)². It is more comprehensible to consider these
absolute errors as the square roots of the variances, so that the errors are in the same units as the assay data.
These square roots are, respectively: analytical error = 0.47 g/t, subsampling error = 0.81 g/t and sampling error
= 1.88 g/t. Clearly, no matter how much time, money and effort is directed toward reducing the analytical error,
the overall error will remain high.
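The arithmetic above can be reproduced directly. This sketch assumes the 5/15/80 percent disposition of error variance and the 4.4 (g/t)² total quoted in the text:

```python
from math import sqrt

total_var = 4.4                        # total error as a variance, (g/t)^2
fractions = {"analytical": 0.05,       # proportions of total error variance
             "subsampling": 0.15,
             "sampling": 0.80}

for source, f in fractions.items():
    var = f * total_var                # component variance, (g/t)^2
    # square root puts each error back in the units of the assay data
    print(f"{source:12s} variance = {var:.2f} (g/t)^2, error = {sqrt(var):.2f} g/t")
```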
In evaluating the significance of the information in Table 2.3, bear in mind that sa², although defined
as analytical error, involves the random error inherent in subsampling a pulp, generally one of the easiest tasks
of the sampling protocol to perform. Hence, in many cases it will be difficult to effect large improvements in
quality of data simply through improvements to sampling and analyzing of the pulps. In the example cited in
the preceding paragraph, a 50% decrease in the analytical error has a very small impact on the total error
because of the much larger errors implicit in subsampling and sampling. It is surprising how much effort
has been directed in the past to improving the quality of pulp analyses, considering that about 70 to
90 percent of the variability in quality generally arises from other sources.
Large subsampling errors can arise easily because of non-optimal sample reduction procedures. The
use of Gy's sampling equation (Appendix) and appropriate care taken to homogenize samples prior to each
mass reduction stage can result in a subsampling protocol that leads to a level of subsampling error as low as is
practicable. There are practical situations where the original sample is small enough that the entire sample can
be ground to a pulp, effectively negating subsampling as a source of error. Of course, this procedure results in
larger amounts of pulp than would otherwise be obtained, and thus a larger analytical error than for a smaller
amount of pulp, because the error in sampling the pulp for analysis is traditionally included in what is
reported as the analytical error.
The sampling error is commonly the major source of variability in duplicate sampling, as for example,
in the comparison of assays for pairs of half cores or the comparison of assays of duplicate samples taken from
piles of blasthole cuttings. In contrast to analytical and subsampling variability which are errors, the sampling
variability is largely a real geologic variability, that is, a real difference in metal abundances over very short
distances in the original rock, a major contributor to the so-called nugget effect of the geostatistician. The most
effective way of reducing the impact of geologic variability is to increase sample mass, perhaps by increasing
the length of drill core that comprises a sample and/or increasing the diameter of the drill core that is to be
sampled. Some variability between paired half core analyses can arise in manual core splitting because the two
half cores can differ substantially in mass. This source of inconsistency can be overcome in practice by
diamond sawing half cores.
The standard deviation, s, is the square root of the variance:

s = [Σ(xi − m)²/(n − 1)]^(1/2)    (4)
Thus, the standard deviation, like the variance, is a measure of the spread of data about the average value. A
particular advantage of the standard deviation is that it is in the same units as the data, rather than squared units
as in the case of the variance.
Figure 2.5: Two data distributions with common mean value but relatively wide and narrow dispersion
(spreads) of values.
There are several important uses for the standard deviation in dealing with duplicate data. The first
arises from the fact that many of the data distributions (cf. histograms) of concern in error analysis are normal
(Gaussian) in form, that is, they have the form of a bell-shaped curve. In such cases, the standard deviation is
the particular instrument by which probability ranges can be defined, i.e., the proportion of data that lie between
any two limits, or, the probability that a new or random draw from the population will lie within those limits.
For example, in a normal distribution, about 95 percent of the data are within the range m ± 2s. Similarly, 68
percent of the data are in the range m ± s. Tables are available in many statistical texts that allow the calculation
of the proportion of samples between any two selected values of the variable. The procedure is illustrated in
Appendix 1.
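Such proportions can also be computed from the normal cumulative distribution function rather than read from tables. The sketch below uses only the standard library error function; the mean and standard deviation are illustrative values, not data from the course:

```python
from math import erf, sqrt

def normal_cdf(x: float, m: float, s: float) -> float:
    """Proportion of a normal population with values below x."""
    return 0.5 * (1.0 + erf((x - m) / (s * sqrt(2.0))))

m, s = 34.4, 8.9   # illustrative mean and standard deviation (ppb)

# proportion of the population lying between m - 2s and m + 2s
p = normal_cdf(m + 2 * s, m, s) - normal_cdf(m - 2 * s, m, s)
print(f"{100 * p:.1f}% of values lie within m ± 2s")   # ≈ 95.4%
```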
The standard error of the mean, se, describes the dispersion of mean values calculated from samples of size n:

se = s/n^(1/2)    (5)

In this case, n samples have been used to determine the mean value. That mean value plus or minus two
standard errors of the mean will contain 95% of new mean values calculated from new samples of size n! It is
evident from the equation for the standard error of the mean that se will in all cases be less than s and that as n
increases, se can be very much smaller than s, even orders of magnitude smaller. Consequently, where
dispersion limits are attached to a standard sample it is essential to be clear as to the nature of the dispersion
being quoted: first, is the standard deviation or the standard error of the mean the basis of the quoted dispersion;
second, does the reported dispersion represent 1 or 2 standard deviations (or standard errors)?
The standard error of the mean (or a close equivalent) is the basis of many comparative statistical tests,
including so-called t-tests, that will be illustrated in a later section of this course.
An example of the use of the standard error of the mean is in placing confidence limits on estimates of mean
values (e.g., Kratochvil and Taylor, 1981)
μ = m ± t·se    (6)

where μ is the population mean, m is the mean of the sample, se = s/n^(1/2) is the standard error of the mean, n is the number
of values in the data set and t is a value obtained from tables of the t-distribution in statistical texts. If n > 30
then t = 1.96 (commonly approximated as 2) for 95% confidence limits.
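Equation 6 can be sketched as follows; the sample statistics are illustrative, and t is taken as 1.96 for a large n:

```python
from math import sqrt

m, s, n = 34.4, 8.9, 57        # illustrative sample mean, std. dev. and count
se = s / sqrt(n)               # standard error of the mean (eq. 5)
t = 1.96                       # t-value for 95% confidence when n > 30

lower, upper = m - t * se, m + t * se
print(f"95% confidence limits on the mean: {lower:.1f} to {upper:.1f} ppb")
```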
Metal contents of standards are never known exactly. They are generally assigned an average value, v,
estimated from the group of analyses deemed to be acceptable (see later section of course), with a 95% error
band that is based on the standard error of the mean, i.e., v ± 2se. This error limit on the quoted value of the
standard is generally relatively small because many analyses by many labs have been used in its estimation
(recall that se decreases as the number of analyses on which it is based increases). Replicate analyses by a
single lab will spread well beyond these error limits.
The coefficient of variation, CV, expresses the standard deviation as a percentage of the mean:

CV = 100(s/m)    (7)

Where the standard deviation represents an error, the ratio s/m is referred to widely as the relative error, er, and
is commonly quoted as a proportion:

er = s/m    (8)
The CV is widely used in the mining industry as a general way to identify symmetry or asymmetry in the
distribution (histogram) of values. Generally, if CVs are in the range 0 to 20%, values are symmetrically
distributed about the mean value. As the value of the CV increases beyond 100%, to as much as 300% or 400%
in extreme cases, the distribution becomes more and more asymmetric or skewed toward high values.
There are many practical cases where the relative error interpretation of s/m is appropriate, such as, (1)
replicate analyses of a standard, and (2) duplicate analyses that span a limited range of concentrations. In such
cases, the relative error can be multiplied by 200 to provide a common estimate of precision as a percentage.
Note that precision is widely quoted as the 95% confidence range of analyses, about the expected value and,
traditionally, is reported as a percentage. Examples are illustrated in Table 2.2 where the means and standard
deviations of 6 metals are reported. These values have been used to calculate both relative error and precision
for each of the elements. Note in particular the wide range in precisions for the various metals in the same
standard, from approximately 8% to about 36%. This variation is in part a function of the difficulty in analyzing
low abundances relative to high abundances of a metal, differences in mineralogies of metals and also the
inherent differences in analyzing different metals by a particular analytical method. The mere fact that detection
limits are likely to be different for different metals implies that analytical precision will also be different.
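The relative errors and precisions of Table 2.2 can be regenerated from the quoted means and standard deviations; small differences in the last digit relative to the tabled values reflect rounding:

```python
# (mean, standard deviation) pairs transcribed from Table 2.2
table_2_2 = {"Pd_ppb": (3882, 150.2), "Pt_ppb": (221.5, 17.03),
             "Cu_ppm": (13390, 601.5), "Au_ppb": (87.3, 15.7),
             "Ni_ppm": (408, 39.2),    "Co_ppm": (57.1, 2.29)}

for element, (m, s) in table_2_2.items():
    rel_err = s / m              # relative error (eq. 8)
    precision = 200 * rel_err    # precision as a 95% range, in percent
    print(f"{element}: rel. err. = {rel_err:.3f}, precision = {precision:.1f}%")
```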
2.5.1 Introduction
The simple linear model is simply the equation of a straight line. Consider a graph of x1 vs x2, where
x1 represents original analyses of many pulps and x2 represents a second set of analyses of the same pulps; in
other words, from each pulp we have an original and a second value. If the pulp duplicate analyses are plotted
on an x-y graph (scatter plot) we expect that if no bias exists between the two sets of analyses they will plot
scattered about the line y = x. Of course, there might be a component of bias, and the scatter about the y = x line
can be either minimal or very large; see, for example, Figures 2.6 and 2.7, in which duplicate analyses plot as a
linear trend with very large scatter to which a best-fit line has been fitted.
In fitting models that quantify a systematic relation between two variables, such as a set of duplicate
analyses, it is common to (i) determine or assume the form of the mathematical relation between the two
variables, and then (ii) adopt a method to calculate the model parameters specific to a data set. A wide range of
mathematical models are available and even with a single model type (e.g., linear) there are various choices to
be made in the calculation procedures available. For example, one might adopt a linear model to describe a
relationship between duplicate analyses. However, a number of very different linear models could arise
depending on the many different calculation methods and their implicit and explicit assumptions. An incorrect
choice of calculation method can lead to an inappropriate model from which incorrect statistical inference can
result. Paired analytical data are not immune from this problem.
Regression techniques incorporating a linear model are commonly used for the comparison of one set
of analyses (ys) with another (xs), as described by Till (1973), Ripley and Thompson (1987), Sinclair and
Bentzen (1998), and many others. The principal justification for use of a linear model is the expectation that,
with no bias, a set of duplicate analyses will be equivalent except for a component of random error. Hence, the
data are expected to cluster along and about the line y = x on a graph of y versus x. If the random difference is
small the spread away from the y = x line will be small; if the random differences are large the spread will be
large. Where there is a significant bias between the two sets, some, even most, of the plotted values will not be
centered on the y = x line. Instead, the data may be centered, all or in part, on another line of the form
y = bo + b1x + e    (8)

where
bo is the intercept of the line with the y axis (i.e., the value of y if x = 0),
b1 is the slope of the line (for any two points on the line, the slope is the ratio of the difference in y values to the difference in x values), and
e is a measure of spread of the data about the line (a standard deviation).
bo and b1 are called the parameters of the linear model; for a specific model, bo and b1 are constants. Once bo
and b1 are known, the linear equation can be solved by substituting any value of x and calculating the
corresponding value of y.
If y values are distributed normally about the line for any value of x, and bo and b1 are estimated by
a procedure known as least squares, then bo and b1 are also normally distributed (Miller and Kahn,
1962). The advantage of these parameters being normally distributed is that they can be used to make statistical
tests, specifically, of whether or not bias is recognizable in the data. In fact, because of the central limit theorem
these statistical tests can be made even if the underlying distributions are not normal, provided the amount of
data is large. As the amount of data increases (n > 40) and the data distribution becomes more symmetric, the
estimates of these parameters tend toward a normal distribution, regardless of the nature of the actual data
distribution.
If statistical tests form part of the evaluation of the significance of a linear model it is important that
the paired data cover an appropriate range of concentrations; otherwise conclusions are not generally
applicable.
The distribution of y values has the same spread regardless of the value of x.
There is no basis on which to select continuously curved models rather than two or more linear models for different
grade ranges.
In each of these cases there is a general expectation that the two sets of analyses will be identical, on average,
providing that no bias exists in any of the analyses. Reality commonly does not attain this ideal situation. Both
sampling and analytical procedures can lead to very different error patterns for different subsets of the total
data. Consequently, the subjective process of subsetting the data might be necessary. There are two additional
reasons why subsetting of data might be necessary:
1. Outliers.
2. Influential values.
Outliers are those values that differ very greatly from the vast majority of the data. As a rule, outliers are
fairly straightforward to recognize, although in some cases their recognition is subjective. Influential values
represent a small proportion of data that do not class as outliers but which have a very strong influence on the
particular model calculated for a data set. As an example, consider a data set consisting of 100 paired gold
analyses, 95 of which are about evenly scattered in the range 0 to 4 g Au/t. The remaining 5 values spread
between 6 and 10 g/t. The 5 high values might dominate the linear model to the point that the model is not
representative of the other 95 values. Clearly, in such a case it is wise to remove the 5 influential values and
calculate a model for the 95 values. The 5 values might or might not be described adequately by the model
based on the 95 values; if not, they must be considered separately.
In all 4 cases a general method known as least squares is used to determine the linear model and the result is
referred to as a best-fit model. In each case the term best fit means that a particular error criterion is
minimized relative to the linear model that is determined. Not all of these criteria are appropriate for comparing
replicate assay data.
The most widely available method of fitting a line to a set of paired data, traditional least squares, is
an example of an inappropriate least squares procedure that, in some cases, has been incorrectly applied to the
description of paired assay data. The reason that traditional least squares is inappropriate for such data is that
the method assumes that one of the variables (x) is perfectly known and places all the error in the second
variable (y). In reality, there are errors in both of the variables being compared and this must be taken into
account in defining the linear model. The problem with traditional least squares is well illustrated by an
example, gold assays of duplicate samples of blastholes from the Silbak Premier gold mine, illustrated in Figure
2.6 (from Sinclair and Bentzen, 1998). Two different lines are obtained, depending on which variable is taken
as y, the dependent variable. If we were to incorrectly accept these lines and test one variable statistically
against the other we would arrive at two opposing conclusions for the two lines, both conclusions being
incorrect. For example, if AUD is taken as y we would conclude that bias exists and that AUD underestimates
AU by about 27%; if AU is taken as y we would conclude that bias exists and that AU underestimates AUD by
about 29%. These two results are dramatically in conflict and clearly show that the traditional least squares
method is generally inappropriate as a means of defining a best fit linear model for paired assay data.
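This asymmetry of traditional least squares is easy to demonstrate with synthetic paired data carrying random error in both variables (a sketch only; the simulated values below are not the Silbak Premier assays):

```python
import random

random.seed(1)

# simulate duplicate assays: a common "true" grade plus independent errors
true = [random.uniform(0.5, 10.0) for _ in range(200)]
x = [t + random.gauss(0, 1.0) for t in true]   # original analyses
y = [t + random.gauss(0, 1.0) for t in true]   # duplicate analyses

def ols_slope(xs, ys):
    """Traditional least squares slope of ys on xs (error assumed all in ys)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

b_yx = ols_slope(x, y)   # y regressed on x
b_xy = ols_slope(y, x)   # x regressed on y
print(f"slope of y on x: {b_yx:.3f}")
print(f"slope of x on y: {b_xy:.3f}")
# both slopes fall below 1 even though neither variable underestimates the
# other, leading to the two contradictory conclusions described above
```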
A least squares procedure is required that produces a fair representation of the underlying trend in a set
of paired data. This is achieved best by using a method that takes into account the different errors that exist in
the two sets of data being compared (e.g. Mark and Church, 1974). Because errors are rarely known in detail in
the comparison of many kinds of variables, including assay data, several practical approaches have been offered
for producing least squares models so that relationships between paired variables are fairly determined.
Weighted least squares procedures (e.g., Ripley and Thompson, 1987) can be highly subjective
because of the manner by which the weights are determined. In some cases they provide linear models that lie
outside the limits defined by the two traditional least squares procedures. Consequently, weighted least squares
methods are not generally acceptable to give an unbiased treatment of paired assay data.
The major axis solution is based on minimizing the squared perpendicular distances from each point to
the line. This is equivalent to minimizing simultaneously in both the x and y directions. This procedure is
affected by differences in scale between the two variables being compared, not normally a problem in dealing
with paired quality control data.
The Reduced Major Axis (RMA) linear model combines a standardization of the two variables (i.e.,
divide each value by the standard deviation of the data) and a Major Axis least squares solution to determine
the linear model. This procedure avoids any concern of difference in scale of the two variables (e.g., where
large biases exist between the paired variables). Dent (1937) showed that for paired variables the maximum
likelihood estimator of the ratio of errors, where the errors are unknown, is (sy/sx)2, which is equivalent to an
RMA line through the data. In general, at the outset of a study, errors are unknown for paired analytical data.
A reduced major axis (RMA) regression is desirable where it is important that errors in both variables
be taken into account in establishing the relation between two variables (Sinclair and Bentzen, 1998). The
methodology for reduced major axis regression has been described in an earth science context by Agterberg
(1974), Till (1974), Miller and Kahn (1962) and Davis (1986). Till (1974) emphasizes the importance of using
RMA in comparing paired (duplicate) analytical data.
The general form of the RMA line is:

y = bo + b1x + e    (9)

where x and y are the duplicate analyses, bo is the y-axis intercept of the RMA linear model and b1 is the slope
of the model. For a set of paired data, b1 is estimated as

b1 = sy/sx    (10)

where sx and sy are the standard deviations of variables x and y, respectively, and bo is estimated from

bo = ȳ − b1x̄    (11)
where ȳ and x̄ are the mean values of y and x, respectively. Commonly we are interested in whether or not the
line passes through the origin because, if not, there is clearly a fixed bias of some kind. The standard error on
the y-axis intercept, so, is given by

so = sy{([1 − r]/n)(2 + [x̄/sx]²[1 + r])}^(1/2)

where r is the correlation coefficient between x and y. The standard error on the slope is

ss1 = (sy/sx)([1 − r²]/n)^(1/2)

The dispersion Sd about the reduced major axis is

Sd = {2(1 − r)(sx² + sy²)}^(1/2)
These errors can be taken as normally distributed (cf. Miller and Kahn, 1962) and can be used to test
whether the intercept error range includes zero (in which case the intercept cannot be distinguished from zero)
and the slope error range includes one (in which case the slope cannot be distinguished from one). The
dispersion about the RMA line can be used in several practical comparisons including (1) the comparison of
replicates of several standards by one laboratory with replicates of the same standards by another laboratory;
and (2) the comparison of inter- and intra-laboratory paired analyses for routine data spanning a wide range of
values.
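The RMA parameters and their standard errors can be computed directly from the expressions above; the sketch below assumes paired lists x and y, and the duplicate analyses shown are illustrative, not data from the course:

```python
from math import sqrt

def rma_fit(x, y):
    """Reduced major axis line y = b0 + b1*x with standard errors (eqs. 9-11)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x) / (n - 1))
    sy = sqrt(sum((b - my) ** 2 for b in y) / (n - 1))
    r = sum((a - mx) * (b - my) for a, b in zip(x, y)) / ((n - 1) * sx * sy)
    b1 = sy / sx                   # RMA slope (eq. 10)
    b0 = my - b1 * mx              # RMA intercept (eq. 11)
    se_b0 = sy * sqrt(((1 - r) / n) * (2 + (mx / sx) ** 2 * (1 + r)))
    se_b1 = (sy / sx) * sqrt((1 - r ** 2) / n)
    sd = sqrt(2 * (1 - r) * (sx ** 2 + sy ** 2))  # dispersion about the line
    return b0, b1, se_b0, se_b1, sd

# illustrative duplicate analyses (ppb)
x = [12.0, 15.5, 20.1, 25.0, 31.2, 40.8, 55.0, 70.3]
y = [11.5, 16.2, 19.5, 26.1, 30.0, 42.2, 53.8, 72.0]
b0, b1, se_b0, se_b1, sd = rma_fit(x, y)
print(f"slope = {b1:.3f} +/- {se_b1:.3f}, intercept = {b0:.3f} +/- {se_b0:.3f}")
# a slope error range that includes 1 and an intercept error range that
# includes 0 indicate no recognizable bias between the two sets
```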
Figure 2.6: Scatter plot of 122 duplicate blasthole samples analyzed for gold, Silbak Premier gold
mine, Stewart, British Columbia. Reduced major axis (RMA) and two traditional least squares linear
models are illustrated. Note the limited scatter of low values and the much greater scatter of high
values, suggesting the presence of two data subsets that should be investigated separately.
Figure 2.7: The data of Figure 2.6 subdivided arbitrarily into low- and high-grade subgroups. RMA
models have been fitted to each group, as well as biased linear models of traditional least squares.
Model   y-variable   Intercept and error¹   Slope and error¹   Dispersion about line²
RMA³    AUD          -0.173 (0.323)         0.952 (0.057)      4.12
TLS³    AUD           0.677 (0.035)         0.708 (0.191)      2.66
TLS³    AU            1.028 (0.192)         0.782 (0.038)      2.80
¹ Values in brackets are one standard error.
² One standard deviation (corrected from Sinclair and Bentzen, 1998).
³ RMA = reduced major axis (error in both variables); TLS = traditional least squares (error entirely in
y-variable).
Figure 2.8: Histogram of 412 differences in paired gold assays by two different labs (Lab A and Lab
B). The average difference is 231 ppb Au and the standard deviation is 1210 ppb.
A second example for paired analyses of duplicate blasthole samples from the Silbak Premier gold
mine, Stewart, British Columbia, is shown in Figure 2.9. These same data are used in the next subsection to
illustrate the use of scatter diagrams.
Figure 2.9: Histogram of differences in gold analyses of two independent samplings of piles of
blasthole cuttings, Silbak Premier gold mine, British Columbia.
Figure 2.10: A simple linear model to describe errors in paired data as a function of composition.
Scales for x and y are not quite, but should be, equal interval. Data are duplicate half-core samples
for a molybdenum deposit and show wide scatter about the y = x line (lower line), making the data
difficult to interpret visually. The upper line is a best-fit line (reduced major axis) with a slope of 1.2,
indicating, on average, a 20 percent bias between the two sets of results.
In some cases, two fundamentally different variables are displayed on a scatter diagram in order to
establish whether or not some form of correlation exists between them. For example, it may be that bulk density
of an ore is closely related to grade. If such a relation can be established it might be possible to estimate bulk
density from the grade rather than the more costly approach of physical measurements, although some
monitoring of the relation would be required.
Figure 2.11 shows some of the results of a quality control program for a molybdenum prospect, in this case duplicate half core
samples of about 5 feet in length. Graphic and numeric output are explained in the figure. The numeric output
for individual variables is well established in practical usage in the mineral industry. The numeric output for the
linear model (in this case a reduced major axis, RMA, model) is particularly important because the various
parameters allow statistical testing for the presence of bias and the quantitative determination of both random
error and bias, as will become evident later in the course.
Figure 2.11: Paired analyses of duplicate half core samples (two half cores of the same
hole interval). This is typical of P-res output; most of the features are labeled. The lower
left corner contains statistics of the ordinate (left) and the abscissa (lower) as well as the
correlation coefficient between the two variables. Above these statistics are the name of
the ordinate and a multiplier required to provide an appropriate scale for the values on
the diagram. The figures beneath the diagram are the parameters and errors of the
reduced major axis line (RMA). The figure itself shows the individual data points plotted,
the y = x line and the RMA line.
SECTION 3
3.0 STATISTICAL TESTS COMMONLY USED IN TREATING DUPLICATE AND
REPLICATE ANALYSES
3.1 Introduction
As an introduction to the various statistical tests that are important in dealing with paired assay data it
is useful to have an appreciation of the concept of probability in the context of statistics. Consider a situation
where a population has been sampled (e.g., 20,000 samples from a well-explored mineral deposit or 309
samples from a prospect) and a histogram of the resulting assays has been prepared. Commonly, the histogram
can be approximated closely by the well-known normal (bell-shaped) curve known as a probability density
function shown for the molybdenum data in Figure 2.2. In some other cases, the data can be transformed to
approximate a normal distribution. Normal distributions have the following general equation
y = [(2π)^(-0.5) s^(-1)] exp[-(xi - m)^2/(2s^2)]
(3-1)
Any normally distributed value xi can be converted to a standard normal value by
zi = (xi - m)/s
(3-2)
Figure 3.1: Standard normal distribution. z-values along the abscissa are numbers of
standard deviations to the left (-ve) or right
(+ve) of the mean (0). Percentage figures
indicate the percentage of area under the
curve between the two indicated z-values. All
normal distributions can be cast as a standard
normal distribution by subtracting the mean
from each value and dividing by the standard
deviation of the distribution. Consequently, the
standard normal distribution has a mean of
zero and a standard deviation (and variance)
of 1.0.
Fortunately, neither of these formulae need be used directly in quality control work. All normal distributions
can be reduced to this same standard normal equation; hence, this equation is a useful basis by which to
summarize the proportion of values that occur anywhere within a distribution. For example, the centrally-located range, (m - s) to (m + s), equivalent to the range z = -1 to z = +1, contains 68.26% of a normal
distribution. The remaining 31.74% of the values are located equally in the lower and higher tails of the
distribution, bounded respectively by the values (m - s) and (m + s), i.e., z = -1 and z = +1. Similarly, the mean
plus and minus 2 standard deviations encompasses 95.45% of the area under the normal curve, equivalent to
95.45% of the values that make up the data. In fact, the proportion of area under the normal curve between
minus infinity (-∞) and any value of z can be found in tabulations in most statistical texts. The difference
between two such estimates for z1 and z2 is the proportion of area between z1 and z2 and is equivalent to the
probability that a random value from the population will fall between z1 and z2. (Figure 3.2). Statistical tests
described in this course make use of this concept of probability, i.e., the percentage likelihood that a value
calculated from a data set (with a known form to the distribution) lies within a certain range that is defined by
the level at which a test is run.
The simple statistical tests to be considered here are examples of a large group of so-called tests of
significance. In general, such tests involve assuming a particular hypothesis (commonly referred to as the null
hypothesis) and then using data to generate a statistic whose value will be within defined limits if the
hypothesis is highly likely (or true!). Note the use of the term "highly likely"; statistical tests rarely lead to
absolute certainty; rather, they allow a person to make a statement such as "the result we have obtained is
within the limits of what would be expected 95 percent of the time if our hypothesis is true." This is
comparable to reporting a result of a national poll stating that 45% answered yes to a question, so the true
figure is within 3% of the value 45%, 19 times out of 20. In our terms, the mean value is 45%, the standard
deviation is 3/2 = 1.5%, and "19 times out of 20" is equivalent to 95%, i.e., 95% of the time the true mean value
of the poll will lie in the range 42% to 48%. But note that there is a 5% chance of making a wrong decision,
i.e., the true value has a 2.5% chance of being higher than 48% and a 2.5% chance of being lower than
42%. This chance of being wrong is what statisticians commonly refer to as the level of significance of
a statistical test, i.e., a test is significant at the 0.05 level, where 0.05 is the proportion of chances of being
wrong (equivalent to 5% chance of being wrong). Of course, the test could be done at some other level of
significance, say 0.01 or 0.1. However, tests of significance in sampling and analytical work in the mining
industry are commonly conducted at the 0.05 level and that will generally be the level used throughout this
course.
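The z-value arithmetic described above can be sketched in Python using only the standard library; the error function gives the normal cumulative distribution, and the function names here are illustrative rather than part of any established package:

```python
from math import erf, sqrt

def phi(z):
    """Cumulative area under the standard normal curve from minus infinity to z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def area_between(z1, z2):
    """Proportion of a normal distribution that falls between z1 and z2."""
    return phi(z2) - phi(z1)

# Proportions quoted in the text:
print(round(area_between(-1, 1), 4))   # 0.6827 -> the 68.26% of the text, to rounding
print(round(area_between(-2, 2), 4))   # 0.9545 -> the 95.45% of the text
```

The same `area_between` call gives the probability that a random value from the population falls between any two z-values, which is exactly the quantity the significance tests below rely on.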
Figure 3.2: Example of a normal (Gaussian) probability density function (pdf). The curve is symmetric
about the mean value of m = 0.76%Cu. Spread (dispersion) is measured by the standard deviation (s
= 0.28 in this case). Note that there are inflections in the curve at m ± s. Two arbitrary Cu values are
shown; subtracting the mean from each and dividing by s gives a z-value, and the proportion of area
under the curve lower than each value can be derived from tabulations.
whether or not the two labs/techniques produce data of equivalent quality. Part of the answer is to determine
whether or not the variability in the analyses of both labs could represent expected sampling variability from the same
parent population. Here, we will hypothesize (null hypothesis) that the two variances are equal, i.e.,
though numerically different, they are two possible sampling outcomes from the same parent population. To
test this we will use the ratio of the variances reported by the two labs and conduct an F-test.
Figure 3.3: An example of the F-distribution, typically positively skewed. The value fα is the critical
value that is tabulated for the particular degrees of freedom (n - 1) on which the distribution is based.
α is the probability of being wrong in hypothesis testing with the F-function, often taken as 0.05.
The F-distribution (cf. histogram of
variance ratios drawn from a normal population) is known (tabulated in statistical texts), so we know the
probability with which values will occur for any range of F values. In this case we conduct the test at the 0.05
level and use a 2-sided distribution, that is, we distribute the 5% chance of being wrong equally on the two tails
of the F-distribution. This gives us two critical F-valuesif the data provide a calculated F value between these
two critical values we accept the null hypothesis. Otherwise we reject the null hypothesis and conclude that the
two labs indeed have different random errors, in which case, the lab with the greatest variability is the poorer
quality of the two.
Details of the calculation: The standard deviations of the analyses by the two labs are s1 = 0.33 and s2 = 0.24 to give variances
of s1^2 = 0.1089 and s2^2 = 0.0576. If the two estimates of variance represent the same parent normal distribution, their ratio
should lie within an expected range of F-values characteristic of a normal distribution. These values, determined from
tabulations in many statistical texts, are approximately 0.60 and 1.68 (these values are the 2.5 and 97.5 percentiles of the F-distribution with 55 degrees of freedom for both variables, so we are conducting a statistical test at the 0.05 level). Fcalc =
0.1089/0.0576 = 1.89 which lies outside the expected range of values. Consequently, we conclude that the two variances are
highly unlikely to represent the same population. In other words, one lab/analytical method has a larger random error than does
the other lab; in this case, lab 2, with the lower standard deviation, has the lower random error.
This simple example illustrates how variability in the analyses of standard samples can be used to
measure relative quality of labs. This is particularly true where the labs in question do not realize which, of the
samples submitted to them, are standards. When a lab knows that a particular sample is a standard the lab might
be prone to give the sample special attention and thus produce better results than would apply to routine
samples. It is common practice to always put the larger variance in the numerator so that the F-value is always greater
than 1.0; hence, a single critical value of F for the 0.05 level is used to compare with the calculated F-value. In
particular, a calculated F-value less than the critical value means that we accept the null hypothesis (i.e., we
conclude that the two variances represent the same parent population), whereas a calculated value greater than
the critical value means that we reject the null hypothesis (i.e., the two variances do not represent the same
parent population).
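The F-test convention just described (larger variance in the numerator, compared against a single tabulated critical value) can be sketched as follows; the function name is illustrative and the critical value is supplied from tables, as in the worked example:

```python
def f_test_equal_variances(s1, s2, f_crit):
    """Two-sided F-test for equal variances with the larger variance in the
    numerator; f_crit is the tabulated upper critical value (e.g., 0.05 level)."""
    v1, v2 = s1 ** 2, s2 ** 2
    f_calc = max(v1, v2) / min(v1, v2)
    accept_null = f_calc <= f_crit      # True -> variances indistinguishable
    return f_calc, accept_null

# Worked example from the text: s1 = 0.33, s2 = 0.24, tabulated critical F ~ 1.68
f_calc, same_population = f_test_equal_variances(0.33, 0.24, 1.68)
print(round(f_calc, 2), same_population)   # 1.89 False -> the variances differ
```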
Figure 3.4: Histograms of 58 samples analyzed for Au by two different analysts using different analytical
techniques (after Fletcher, 1981).
t = (m1 - m2)/[(s1^2/n1) + (s2^2/n2)]^(1/2)
(3-3)
Figure 3.5: An example of the t-distribution. Note the symmetry. In hypothesis testing the error in the
test is distributed equally to each tail. The critical values tα and t1-α are the same in absolute value.
Critical t-values are determined from tables in many introductory statistical texts using a level of significance,
α, and degrees of freedom, df, determined as follows:
df = n1 + n2 - 2
(3-4)
For example, if n1 = 16, n2 = 9, and α = 0.05, then df = 23 and the critical t-value (from tabulations) is tcrit =
1.714. If a calculated (absolute) value of t is less than the critical value, the null hypothesis (means are the
same) is accepted, otherwise, the hypothesis is rejected and the means are said to be different at the 0.05 level.
For the null hypothesis that the two means are identical, but where the variances have been shown to
be different (by an F-test), the t-value is calculated as in the case for equal variances (equation 3-3). Degrees of
freedom, df, are determined as
df = [(s1^2/n1) + (s2^2/n2)]^2 / [(s1^2/n1)^2/n1 + (s2^2/n2)^2/n2]
(3-5)
This value for df will not necessarily be an integer but interpolation can be easily done in using it to estimate
critical t-values from published tables that are available in most introductory statistical texts.
Consider a comparison of mean values in the two sets of data used above in illustrating an F-test and
shown as histograms in Figure 3.4. The F-test demonstrated that the variances for the two labs/methods were
significantly different.
Details of the calculation: The mean values reported by Fletcher (1981) for the two sets of analyses of
a common standard sample by the two labs are m1 = 4.16 g/t and m2 = 3.43 g/t. Degrees of freedom
are calculated by equation 3-5 to be 73.1. (In this case the calculation is unnecessary: the calculated
value for df will be higher than the lowest n, which is 56 (a value above 30), and the critical t-value
defaults to 1.96 for an infinitely large sample.) The calculated t-value tcalc = (4.16 - 3.43)/[(0.33^2/56) +
(0.24^2/56)]^(1/2) = 13.4, very much larger than the critical value of 1.96. Consequently, we conclude that
the two means are highly likely to be different.
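The calculation in the box above can be reproduced in a few lines of Python; the function name is illustrative, and the critical value (1.96) is taken from tables exactly as in the text:

```python
from math import sqrt

def t_two_means(m1, s1, n1, m2, s2, n2):
    """t statistic for comparing two means, in the form used by the worked
    example: standard error built from s1^2/n1 + s2^2/n2."""
    return (m1 - m2) / sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# Fletcher (1981) standard-sample data: m1 = 4.16, m2 = 3.43 g/t, n = 56 each
t_calc = t_two_means(4.16, 0.33, 56, 3.43, 0.24, 56)
print(round(t_calc, 1))   # 13.4, far above the critical value of 1.96
```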
expected sampling variation from zero and cannot be distinguished from zero); if zero is not included in the
95% confidence range of the mean difference, then the mean difference is statistically different from zero and
bias exists between the two sets of data.
se = sd/n^(1/2)
(3-6)
The previous example of a t-test for the data illustrated in Figure 3.4 could be evaluated with a paired t-test if
the original data pairs were available and differences could be calculated for each pair.
Figure 3.6: Histogram of 412 differences in paired gold assays for two labs, A and B. The average
difference is 231 ppb Au and the standard deviation of differences is 1210.
As an example of a paired t-test consider the data of Figure 3.6 which shows a histogram of the
differences in Au analyses by two labs, A and B, for 412 samples. The question to be asked is whether or not
the differences (average diff = 231 ppb, standard deviation = 1210) are of a magnitude to be expected or
whether there is a global bias between the two labs? The standard error is se = 1210/412^(1/2) = 59.6. Confidence
limits on 231 are 231 ± 2 × 59.6 = 231 ± 119, that is, 112 to 350, a range that does not include zero.
Consequently, a global bias is demonstrated with lab B measuring higher than lab A by 231 ppb, on average.
Another way of expressing this global bias is as a percentage of the mean of the original values: if the mean
value of the data is 2.0 g/t Au (i.e., 2000 ppb), then the global bias is 100 x 231/2000 = 11.6%. While these
figures prove a significant bias, on average, between the results of two labs, it must be recognized that the
local bias can vary with different gold concentrations.
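A minimal sketch of this paired-difference bias check, using the Lab A/Lab B numbers from the text (the function name is an assumption, not an established routine):

```python
from math import sqrt

def global_bias_check(mean_diff, sd_diff, n):
    """Paired-difference check for global bias: a bias is indicated when zero
    lies outside mean_diff +/- 2 standard errors (standard error per 3-6)."""
    se = sd_diff / sqrt(n)
    lower, upper = mean_diff - 2 * se, mean_diff + 2 * se
    return lower, upper, not (lower <= 0.0 <= upper)

# Lab A vs Lab B: mean difference 231 ppb Au, sd of differences 1210, n = 412
low, high, biased = global_bias_check(231.0, 1210.0, 412)
print(round(low), round(high), biased)   # 112 350 True -> global bias demonstrated
```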
In conducting both the t-test and paired t-test one must be careful in advance to eliminate outlier
values; these are generally easily recognized as values extremely far out on the limbs of a histogram of
differences and can also generally be identified on a scatter diagram of the duplicate analyses.
A practical problem in conducting the t-test using assay data is that data commonly are highly
concentrated at low values which have a substantially different spread of differences compared with higher
values. Consequently, to apply the t-test fairly it may be necessary to subset the data into groups, each with
more-or-less uniform dispersion over the grade range of the group. Subdividing the data is a subjective
procedure although appropriate threshold values are commonly fairly evident from visual examination of a
scatter diagram. The effect of grouping all the data together is that n is very large and generates small values of
tcalc that fail to identify bias over part of the total range of data.
Figure 3.7: Schematic subsetting of data to compare averages by the application of a threshold to
omit lower values. The ellipses represent a field of plotted paired data. A and B illustrate a threshold
(solid line) applied to ordinate and abscissa respectively, to provide a biased comparison. D shows
the threshold applied to both abscissa and ordinate to give a fair comparison of values.
Another potential problem in dealing with paired data can arise where the two average values below
(or above) a threshold value (i.e., the averages of both the x- and y-values) are to be compared. There are many
practical situations where the two averages are expected to be more-or-less the same. If the two averages are
not the same it may indicate the presence of a bias between the two sets of measurements. It is common
practice, for example, for only paired values above some threshold (e.g., cutoff grade) to be compared. This
comparison of average values can be biased if the limiting grade (threshold) for a subset is applied to one
member of the pair only (see Figure 3.7). To offset this bias, any cutoff grade used to limit a subset of paired
data should be applied to both members of the pair, or, better still, the data should be subsetted based on the
average value of a pair. In Figure 3.7 the reason for bias if the threshold is applied to the data of only one lab is
obvious: values below the threshold are allowed for one lab but are preferentially excluded for the other.
Figure 3.8: Histogram of differences in gold analyses of two independent samplings of piles of
blasthole cuttings, Silbak Premier gold mine, British Columbia. The tabulated data are sufficient to
conduct a paired t-test.
This example is modified from Sinclair and Bentzen (1998) and their paper can be examined for more
detail. A histogram of differences for duplicate blasthole samples analyzed for gold, is plotted in Figure 3.8. For
these data (n = 125, one extreme outlier has been omitted) the mean difference is 0.397 g/t Au with a standard
deviation of 3.04. This information provides a t-value of
t = (0.397 - 0)/(3.04/125^(1/2)) = 1.46
The t-value is well below the critical value (e.g., Walpole and Myers, 1978) of 1.96, so no global bias can be
identified. An equivalent result is obtained if 3 influential values are omitted.
A second useful test can be applied to the data. One can count the number of paired samples that plot on
each side of the y = x line. In an unbiased sample roughly equal numbers of data should plot on each side of the
line just as with a large number of coin tosses one expects heads to appear about half the time. In general, the
number of positive differences gives the number of samples on one side of the line; clearly, all remaining
samples are on the other side of the line (except for equal valued pairs). For example, if there are a total of 122
data points (or coin tosses) the expectation is that 61 of them will, on average, plot on one side of the y = x line
and 61 will plot on the other side. The sampling distribution of this mean value is binomial (closely
approximated by a normal distribution) with a standard error of (npq)1/2 where n = 122, p = 0.5 (the proportion
of positive values) and q = 0.5 (the proportion of negative values). Consequently, the 95% confidence limits for
a mean value where n = 122 are 61 ± 2(npq)^(1/2) = 61 ± 11. For the Silbak Premier blasthole data there are 78
positive differences in the data set of 122 values, a number that is well outside the expected limits of 50 to 72.
Hence, we can conclude that there is an abnormal distribution of plotted values relative to the line y = x (and
probably a small bias) even though the paired t-test was not able to identify a bias.
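This counting test can be sketched as follows, assuming the normal approximation to the binomial used in the text (standard error (npq)^(1/2) with p = q = 0.5); the function name is illustrative:

```python
from math import sqrt

def sign_test_about_line(n_pairs, n_positive):
    """Check of symmetry about the y = x line: under no bias the count of
    positive differences is binomial with p = q = 0.5 (normal approximation)."""
    expected = 0.5 * n_pairs
    se = sqrt(n_pairs * 0.5 * 0.5)          # (npq)**0.5
    lower, upper = expected - 2 * se, expected + 2 * se
    return lower, upper, lower <= n_positive <= upper

# Silbak Premier blasthole data: 122 pairs, 78 positive differences
low, high, symmetric = sign_test_about_line(122, 78)
print(round(low), round(high), symmetric)   # 50 72 False -> asymmetric scatter
```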
A second example is illustrated in Figure 3.7 involving original and check Ag analyses for Equity
Silver mine. Of the 23 values, 21 plot on one side of the y = x line. Because of the distribution of values, the t-test does not indicate bias. However, the chance of getting such an extreme distribution of data about the y = x
line is very small. Confidence limits on the expected value of 11.5 are ±2(npq)^(1/2) = 2(23 × 0.5 × 0.5)^(1/2) =
±4.8. Hence, 95% of the time, the distribution about the y = x line should be no more extreme than 11.5 ± 4.8,
i.e., 6.7 (say 6) and 17.
Figure 3.7: Check of mine analyses for silver by an independent lab, Equity Silver mine. C =
concentrate; O = ore; T = tails.
y = bo + b1x
(3-7)
where x and y are the duplicate (paired) analyses, bo is the y-axis intercept of the RMA linear model and b1 is
the slope of the model. For a set of paired data, b1 is estimated as
b1 = sy/sx
(3-8)
where sx and sy are the standard deviations of variables x and y, respectively, and bo is estimated from
bo = ȳ - b1x̄
(3-9)
where ȳ and x̄ are the mean values of y and x, respectively. Commonly we are interested in whether or not the
fitted line (model) passes through the origin because, if not, there is clearly a fixed bias of some kind. The
standard error on the y-axis intercept, so, is given by
so = sy{([1 - r]/n)(2 + [x̄/sx]^2 [1 + r])}^(1/2)
(3-10)
where r is the correlation coefficient between x and y. If the range bo ± 2so contains the value zero then the calculated
value is a likely sampling variation of a true value of zero.
Similarly, we commonly want to know if the fitted line has a slope of 1.0, i.e., the paired values, on average, are
equal. The standard error on the slope is
ss1 = (sy/sx)([1 - r^2]/n)^(1/2)
(3-11)
If the range b1 ± 2ss1 includes 1.0, then the calculated value of the slope is not statistically distinguishable from
the value 1.0 at the level α = 0.05. Note that if we are checking for the coincidence of a fitted line with the y
= x line then we must be able to show both that the intercept is a likely sampling outcome of an average value
of zero and that the calculated slope is a likely sampling outcome of a slope of 1.0.
The dispersion Sd about the reduced major axis is
Sd = {2(1 - r)(sx^2 + sy^2)}^(1/2)
(3-12)
This dispersion combines the errors of the two members of each pair
Sd^2 = sxp^2 + syp^2
(3-13)
where sxp and syp represent the average precision of x and y respectively as one standard deviation. It follows
that if x and y represent the same conditions (i.e., same lab and same methodology such that sxp and syp are
estimates of the same precision) the average error of the procedure (as a variance), savg2, can be determined as
follows:
savg^2 = Sd^2/2
(3-14)
These errors can be taken as normally distributed (cf. Miller and Kahn, 1962). The dispersion about the RMA
line can be used in several practical comparisons, including (1) the comparison of replicates of several standards
by one laboratory with replicates of the same standards by another laboratory; and (2) the comparison of inter- and intra-laboratory paired analyses for routine data spanning a wide range of values. Similarly, it follows that
where x and y values represent two different labs (a common situation with much paired data in the mining
industry), the average precision determined from such duplicate data is not the precision of either lab but is a
form of average precision of the two labs. Such paired data, by themselves, contain no basis for determining the
average precision of either of the two labs in question.
Because the errors are normally distributed, the estimate of average precision (one standard deviation)
can be used to estimate the mean absolute difference between pairs of values, mad, as follows:
mad = 0.8 savg
(3-15)
The average absolute difference is a useful parameter because it quantifies the differences to be expected
between pairs of duplicate data.
Of course, it is relatively easy to calculate the mean absolute difference directly from a set of paired data;
however, equation 3-15 turns out to be an extremely good estimate in practice and is easily obtained by
calculation.
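Equations 3-8 through 3-12 can be gathered into a small Python sketch. The paired data below are hypothetical values invented purely for illustration, and the function name is an assumption:

```python
from math import sqrt
from statistics import mean, stdev

def rma_fit(x, y):
    """Reduced major axis fit: slope (3-8), intercept (3-9), their standard
    errors (3-10, 3-11), and the dispersion about the line (3-12)."""
    n = len(x)
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    r = sum((a - mx) * (b - my) for a, b in zip(x, y)) / ((n - 1) * sx * sy)
    b1 = sy / sx                                                    # slope
    b0 = my - b1 * mx                                               # intercept
    so = sy * sqrt(((1 - r) / n) * (2 + (mx / sx) ** 2 * (1 + r)))  # se of b0
    ss1 = (sy / sx) * sqrt((1 - r ** 2) / n)                        # se of b1
    sd = sqrt(2 * (1 - r) * (sx ** 2 + sy ** 2))                    # dispersion
    return b0, so, b1, ss1, sd

# Hypothetical duplicate assays (g/t), invented for illustration only
x = [0.5, 1.1, 2.0, 3.2, 4.1, 5.0, 6.3, 7.4]
y = [0.6, 1.0, 2.2, 3.0, 4.4, 4.8, 6.6, 7.1]
b0, so, b1, ss1, sd = rma_fit(x, y)
no_fixed_bias = abs(b0) <= 2 * so          # intercept indistinguishable from 0?
no_prop_bias = abs(b1 - 1.0) <= 2 * ss1    # slope indistinguishable from 1?
print(round(b1, 3), no_fixed_bias, no_prop_bias)   # 0.987 True True
```

For these invented pairs both tests pass, so the fitted line cannot be distinguished from y = x: no fixed or proportional bias is indicated.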
(one standard deviation) and a mean absolute difference of 2.33 g/t. Blasthole samples are unbiased but have a
very large random error.
Close inspection of the diagram reveals that the dispersion of data is not uniform throughout the range
of the data. Specifically, for lower grade data the dispersion about the line is much less than for higher grades.
Consequently, formal statistical tests, as done above for illustration, are not appropriate; the data must be
subsetted (subjectively) and the linear model for each subset considered individually (see Figure sinben-1.bmp).
Note that such subsetting provides a better description of the paired data because examination of the scatter plot
for the entire data set clearly shows that the higher grade values are substantially more dispersed than are the
lower grade values.
Figure 3.8: Scatter plot of 122 duplicate blasthole samples analyzed for gold (g/t), Silbak Premier gold
mine, Stewart, B. C. Reduced major axis (RMA) and two traditional least squares linear models are
illustrated. Note the limited scatter of low values and the much greater scatter of high values,
suggesting the presence of two data subsets that should be investigated separately.
TABLE 3-1: PARAMETERS OF VARIOUS LINEAR MODELS SHOWN ON FIGURE 3.8
Model   y-variable   intercept and error1   slope and error1   dispersion about line2
RMA3    AUD          -0.173(0.324)          0.9515(0.057)      4.12
TLS3    AUD          0.677(0.191)           0.708(0.035)       2.66
TLS3    AU           1.028(0.192)           0.782(0.038)       2.80
1 Values in brackets are one standard error
2 One standard deviation (corrected from Sinclair and Bentzen, 1998)
3 RMA = reduced major axis; TLS = traditional least squares
The data of Table 3-1 are sufficient to do a quantitative analysis of the relation between the duplicate samples
plotted in Figure 3.8. Consider TLS first: in both cases the y-intercept cannot be distinguished from zero
because the range ±2s about the estimated intercept includes zero. On the other hand, in both TLS cases the
slope is statistically different from 1.0 because the estimated value ±2s does not include 1.0. Consequently,
both TLS models lead to an interpretation of bias between the two samplings, interestingly, in opposing
directions; that is, in one model y is biased high, in the other x is biased high. These conflicting results should
clearly emphasize the error of using a TLS linear model in dealing with paired quality control data.
The RMA model can also be tested in the same manner as the TLS model; the intercept is seen to be
indistinguishable from zero and the slope is indistinguishable from 1.0. The dispersion can be used to estimate
the total error, st, as follows:
st^2 = (4.12)^2/2 = 8.49, so st = 2.91 g/t
At an average grade of 7 g/t, this error translates into a sampling/analytical precision of 200 × 2.91/7 = 83%
(Note that precision as a percent varies with concentration). This precision is an average for the data of Figure
3.8 and is influenced strongly by the very wide scatter of the higher values (note the different symbols on
Figure 3.8 that subdivide the data into two groups with obvious different scatter). Consequently, the
calculations above are simply to illustrate procedure. It would be more appropriate to subdivide the data as
illustrated in Figure 3.8 and treat each of the two subgroups separately in the same manner as above as
illustrated in Figure 2.6. Note that the mean absolute difference of the sample pairs can be estimated as 0.8 ×
st = 2.3 g/t. This value simply means that the average (positive) difference between paired sample values is 2.3 g/t;
some differences will be larger, some smaller.
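The tests and error estimates applied to the RMA row of Table 3-1 can be checked numerically; this sketch simply replays the arithmetic of the text:

```python
from math import sqrt

# RMA parameters from Table 3-1 (duplicate blasthole Au data of Figure 3.8)
b0, se_b0 = -0.173, 0.324          # intercept and its standard error
b1, se_b1 = 0.9515, 0.057          # slope and its standard error
s_d = 4.12                         # dispersion about the RMA line

no_fixed_bias = abs(b0) <= 2 * se_b0         # intercept vs zero
no_prop_bias = abs(b1 - 1.0) <= 2 * se_b1    # slope vs 1.0
s_t = sqrt(s_d ** 2 / 2)                     # total error, one standard deviation
precision_pct = 200 * s_t / 7.0              # percent precision at 7 g/t average grade
mad = 0.8 * s_t                              # mean absolute difference (3-15)
print(no_fixed_bias, no_prop_bias, round(s_t, 2), round(precision_pct), round(mad, 1))
# True True 2.91 83 2.3
```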
SECTION 4
4.0 PRACTICAL MEASURES OF SAMPLING AND ANALYTICAL ERRORS
4.1 The nature of errors
Errors fall naturally into two categories, random and systematic (biased). Random error is the more-or-less symmetric dispersion (spread) of individual measurements about a mean value; systematic error or bias
occurs where a set of analyses depart, on average, in a regular manner from the corresponding true or reference
metal contents. In sampling and assaying random error always exists and its average magnitude is relatively
easy to quantify. Systematic error might or might not be identifiable in a particular set of data and although
obvious in some cases, commonly will have to be identified by statistical test. The distinction between random
and systematic error is illustrated in Figure 4.1 where accuracy is shown by values (dots) that cluster about the
center of the target and bias is demonstrated by data clusters whose centers are removed from the target center.
In Figure 4.1 it is evident that a set of analyses can be (A) imprecise and inaccurate, (B) imprecise and
accurate, (C) precise and inaccurate, and (D) precise and accurate. Consequently, in properly understanding
errors in a data set, we must distinguish and quantify both the degree of accuracy (cf. bias) and the degree of
precision (ability to reproduce values). A more traditional view of accuracy and precision is illustrated in Figure
4.2. Both of these images of accuracy and precision are limited in that they provide a global view, whereas
practice has amply demonstrated that the nature of error can differ widely as a function of concentration.
Obvious explanations for this include the use of different methodology for different sample types, and the
presence of different styles of mineralization as an approximate function of concentration (e.g., low grade
disseminations versus high grade veinlets in gold deposits).
Figure 4.1: Graphic illustration of accuracy and precision. (A) imprecise and inaccurate, (B) imprecise
and accurate, (C) precise and inaccurate, (D) precise and accurate.
Figure 4.2: Types of error in measurement data, random and systematic errors, each with narrow
(good quality) and wide (poor quality) precision. μ is the true population value that is being estimated
by the mean of a distribution of values shown by the symmetric curves (analogous to histograms).
The great majority of assay results are subject to random errors that can be described by a normal
distribution. As an example, consider a histogram of 29 analyses of Canmet standard CH-3 (reported value =
1.40 g Au/tonne) by one check laboratory, to which has been fitted a normal curve (Table 4.1 and Figure). The
normal curve and the histogram have the same mean and standard deviation (m = 1.38 g Au/tonne; s = 0.11). In
this example, the distribution of replicate analyses is well-described by a normal distribution. In fact, in the
great majority of cases error is distributed normally. Exceptions to this situation, rare and generally
recognizable for assay data of mineral deposits, are summarized in Table 1 (Thompson and Howarth, 1976a).
TABLE 1: SOURCES OF NON-GAUSSIAN (NON-NORMAL) ERROR DISTRIBUTION*
2.
The sample is heterogeneous, the analyte being largely or completely concentrated in a small
proportion of the particles constituting the sample, e.g., tin as cassiterite in sediments.
3.
The precision of the (analytical) method is poor, and the calibration is intrinsically non-linear, e.g., in
the region of the detection limit of spectrographic methods, where the calibration is logarithmic.
4.
The concentration levels are within an order of magnitude of the digital resolution of the instrument.
For example, lead concentrations determined by atomic-absorption spectroscopy are commonly recorded as
integer multiples of 0.1 μg ml-1 with no intermediate values. The final values, referring to the original samples
(after multiplying the instrumental value by a factor), take only discrete values, such as 0, 5, 10, 15, ... ppm.
This custom produces a discontinuous frequency distribution of error.
5.
The concentration levels are near the detection limit, and sub-zero readings are set to zero. Alternately,
readings below the detection limit are set to the detection limit or recorded as "less than". In this connection,
it is worth emphasizing that, while the idea of negative (or even zero) concentration has no physical
significance, a negative measurement of concentration is feasible and, when considered statistically (i.e., as an
estimate with confidence limits), meaningful.
6.
The data set contains wild results or "fliers". These values can be distinguished conceptually from
ordinary random variations as arising from mistakes or gross errors in procedure. In short, they really belong to
a different population of results.
*Summarized from Thompson and Howarth (1976a)
Errors in assay data are more complex than the simple concept of Figures 4.1 and 4.2. A general model
for considering sampling and related errors based on duplicate analyses is summarized in Figure 4.3 (after
Sinclair and Blackwell, 2002). This model is concerned with paired data values where the expectation is that
one set of values will, on average, be reproduced by a second corresponding set, providing there is no bias
between the two sets. For example, we might be comparing two analyses for each of 127 pulps; or, assays of
one half core might be compared with assays of corresponding second halves of core, etc. In each of these cases
we do not expect the paired values to be identical because of inherent errors but we generally hope that in a
large data set the paired values will be identical on average. The term "identical on average" means that
differences for pairs will have an average value of zero, which implies that no bias is evident.
Figure 4.3: A simple linear model to describe errors in paired data as a function of composition.
Scales for x and y are equal interval. (a) Dispersion of paired data about the y=x line results from
random error in both x and y, always present. (b) Proportional bias plus random error produces a
linear trend through the origin with slope different from 1.0. (c) Random error plus a fixed bias
produces a line with slope =1.0 and a nonzero y-intercept. (d) A general model incorporates random
error, fixed bias and proportional bias to produce a linear array of plotted data with slope different
from 1.0 and a nonzero y-intercept. After Sinclair and Blackwell (2002).
If no bias is present in a graph of paired data, the data points will be scattered about the y = x line. This
scatter is the random error inherent in the paired data. In some cases the paired data points do not scatter about
the y = x line, but are displaced from that reference line. Two idealized situations are common,
(1) scatter about a line that passes through the origin but is not parallel to the y = x line (proportional bias as
shown in Figure 4.3b), and
(2) scatter about a line parallel to the y = x line but not passing through the origin (fixed bias as shown in
Figure 4.3c).
Of course, where bias exists, it can be a combination of both proportional bias and fixed bias, and a more
general model is required to explain the distribution of paired data as illustrated in Figures 4.3b and 4.3d. These
models all incorporate straight line relations on a graph of paired analyses. In rare cases, other mathematical
models than straight lines might be appropriate. In general, however, in modelling duplicate assay data, there is
no reason to suspect anything other than a linear model and the scatter of data is such that small departures
from a linear model generally cannot be identified with confidence.
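The bias models of Figure 4.3 can be illustrated with synthetic paired data. The sketch below is a hedged illustration, not P-res output; the general model y = a + b·x + random error covers all four cases, and every parameter value here is invented purely for demonstration.

```python
import random

def simulate_pairs(x_values, fixed_bias=0.0, prop_bias=1.0, noise_sd=1.0, seed=42):
    """Generate duplicate values y = fixed_bias + prop_bias * x + random error,
    the general linear error model of Figure 4.3d (illustrative sketch only)."""
    rng = random.Random(seed)
    return [fixed_bias + prop_bias * x + rng.gauss(0.0, noise_sd) for x in x_values]

x = [float(i) for i in range(1, 101)]

y_random = simulate_pairs(x)                                  # Fig. 4.3a: random error only
y_prop   = simulate_pairs(x, prop_bias=1.2)                   # Fig. 4.3b: proportional bias
y_fixed  = simulate_pairs(x, fixed_bias=5.0)                  # Fig. 4.3c: fixed bias
y_both   = simulate_pairs(x, fixed_bias=5.0, prop_bias=1.2)   # Fig. 4.3d: both biases

# With no bias, pair differences average near zero ("identical on average");
# a fixed bias shifts the mean difference by roughly the bias amount.
mean_diff_unbiased = sum(yi - xi for xi, yi in zip(x, y_random)) / len(x)
mean_diff_fixed = sum(yi - xi for xi, yi in zip(x, y_fixed)) / len(x)
```

Plotting any of these y lists against x reproduces the corresponding panel of Figure 4.3.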
One complication that does occur commonly, is the fact that very different error patterns arise for
different concentration levels. For example, in gold deposits, a background, disseminated mineralization can be
sampled in a representative fashion much better than can higher grade, more erratically distributed gold-bearing
veins. The two styles of mineralization (disseminated vs. erratic veins) can have very different character to their
errors. Similarly, for many metals, low-grade and high-grade values might be assayed by different methods and
thus have very different character to their errors. It is evident that there are good reasons for data to be divided
into subsets, each of which should be analyzed separately for characterizing errors. From a purely practical
point of view, data might be subsetted on the basis of abundance/density of data in various concentration ranges
simply because low values are very much more abundant than are high values so that average errors inherent in
linear models for the entire data set are not representative of either high or low values. In brief, subsetting of
data for error analysis might be necessary because of
(1) different styles of mineralization,
(2) different sampling and/or analytical methods for different concentration ranges, and
(3) very different amounts of data for different concentration ranges.
Figure 4.4: Idealized examples of patterns exhibited on scatter plots of paired quality control data
incorporating sampling and analytical errors. (a) Random error plus outlier. (b) Two different random
errors as a function of composition, perhaps resulting from different analytical methods or different
styles of mineralization. (c) Random error plus proportional bias at low values, only random error at
high values, perhaps resulting from errors in calibration of standards. (d) Difference in random error
as a function of concentration, perhaps arising from disseminated versus nugget styles of
mineralization. (e) Difference in random error as a function of concentration, plus bias in the high
valued data group, possibly resulting from segregation during sampling or subsampling by one of the
operators involved in obtaining the paired data. (f) Proportional bias such as might arise by incorrect
calibration of a standard that was then diluted to form standards of lower concentrations. After
Sinclair and Blackwell (2002).
The simple linear model to describe error in paired data is widely used in the mining industry and
elsewhere and has the advantage of being both easily understood and easily implemented for quantitative
determination of errors. In general, it is not adequate to fit a linear model to data by eye. Instead, an appropriate
statistically sound model fitting procedure must be used, for example, an appropriate variation of what is
known as least squares procedures. For the modeling described here, the use of traditional least squares fitting
procedures, in which all the error is attributed to one of the variables being compared, is not only inappropriate
but incorrect, and can lead to serious errors, as we have seen in the previously described example of blasthole
sampling at the Silbak Premier gold mine (Sinclair and Bentzen, 1998).
Figure 4.5: Tube-sampling Ag analyses (g/t) of 42 blasthole piles, Equity Silver mine, versus best
weighted value of the Ag content of the piles. Note the dispersion.
Consider a practical example of the application of the simple linear model from Giroux and Sinclair
(1986), illustrated in Figures 4.5 and 4.6. The example involves 42 piles of blasthole cuttings that were sampled
by two methods (tube and channel) and then sampled in their entirety. Ag assays were obtained for all 3 types
of samplings so that a weighted average grade (best) could be calculated for each cuttings pile. The two
sampling methods can then be compared individually with the best value. Because the abscissa is the same in
both cases, any difference in scatter about the best fit line can be attributed to differences in error between the
two sampling methods. The dispersion (standard deviation) for tube sampling is 26.2 g/t (Figure 4.5) whereas
for channel sampling it is 3.8 g/t. Clearly, there is substantially less scatter in the case of channel sampling and
one can conclude that channel sampling produces better results than does tube sampling. In addition, in both
cases the statistical data indicate there is no bias, only random error because the y-intercepts cannot be
distinguished from zero and the slopes cannot be distinguished from 1.0.
Figure 4.6: Channel sampling Ag analyses (g/t) of 42 blasthole cuttings piles, Equity Silver mine,
versus best weighted value of the Ag content of the piles. Compare dispersion with that of Figure 4.5.
P = 200 (s/m)    (4-1)

where P is precision expressed in percent, s is the standard deviation of replicate analyses and m is their mean.
Consider a simple example involving replicate analyses of a standard as summarized in Table 4-1. Both Labs A
and B are slightly low, on average, for Au: the reported value for CH-3 is 1.40 gpt. The average apparent
bias for Lab A is 1.368 - 1.40 = -0.032 gpt, or -2.29%; the average apparent bias for Lab B is 1.378 - 1.40 =
-0.022 gpt, or -1.57%. Both of these apparent biases can be tested formally to determine if they are simply
examples of random error in a zero-bias situation or are real. In either case, the amount is small and is within a
range that is generally acceptable in routine analyses by a commercial lab. In the case of Lab C, the average Cu
content is 0.829%, virtually identical with the reported value of 0.83%. Recall that precision can be determined
from the relative error simply by multiplying by 200. Note the roughly comparable precisions for Au for Labs
A and B and the very much better precision obtained by Lab C for Cu (compared with Au). Precisions can
differ markedly for different metals even within the same standard! Traditionally, base metals are estimated with
better precision than are precious metals.
TABLE 4-1: REPLICATE ANALYSES OF STANDARD CH-3

            BAugpt*   CCupct*
            1.46      0.83
            1.48      0.82
            1.40      0.84
            1.61      0.83
            1.50      0.80
            1.25      0.83
            1.45      0.82
            1.28      0.84
            1.39      0.82
            1.23      0.84
            1.31      0.83
            1.44      0.82
            1.27      0.83
            1.34      0.84
            1.26      0.84

            AAugpt*   BAugpt*   CCupct*
Average     1.368     1.378     0.829
Stdev       0.091     0.112     0.011
Rel. Err.   0.067     0.081     0.013
Precision   13.4%     16.2%     2.6%
n           29        15        15

*Variable name includes lab designation (A, B, C), metal (Au, Cu)
and units (grams per ton or percent).
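The summary rows of Table 4-1 for Labs B and C can be reproduced from the listed replicate values. A hedged sketch using only the Python standard library; the 29 individual Lab A values are not repeated here, and the table's 2.6% precision for Cu reflects rounding of the relative error to 0.013 before multiplying by 200.

```python
import statistics

# Replicate analyses of standard CH-3, transcribed from Table 4-1:
b_au = [1.46, 1.48, 1.40, 1.61, 1.50, 1.25, 1.45, 1.28,
        1.39, 1.23, 1.31, 1.44, 1.27, 1.34, 1.26]   # Lab B, Au (g/t)
c_cu = [0.83, 0.82, 0.84, 0.83, 0.80, 0.83, 0.82, 0.84,
        0.82, 0.84, 0.83, 0.82, 0.83, 0.84, 0.84]   # Lab C, Cu (%)

def summarize(values):
    """Mean, standard deviation, relative error s/m, and precision = 200 * (s/m) in percent."""
    m = statistics.mean(values)
    s = statistics.stdev(values)     # n - 1 in the denominator
    rel_err = s / m
    return m, s, rel_err, 200.0 * rel_err

mean_b, sd_b, rel_b, prec_b = summarize(b_au)
mean_c, sd_c, rel_c, prec_c = summarize(c_cu)
```

Running this reproduces the tabulated values: Lab B averages 1.378 g/t with a precision of 16.2%, and Lab C averages 0.829% Cu with a much smaller standard deviation.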
t = d̄ / (s_d/√n)    (4-2)

where d̄ is the mean of the paired differences, s_d is their standard deviation and n is the number of pairs.
Note that the differences themselves can also be used to test for global bias: the average difference should not
be significantly different from zero or else bias exists.
4.4.1 Assumptions
The Thompson-Howarth method of quantifying error assumes (i) a normal distribution to the error and
(ii) that the relative error (i.e., s/m) in sampling and analysis changes as the true average grade increases
(Figure 4.7). The assumption of normality of error distribution is generally, but not always, met (see Table 1.1).
The assumption regarding change in error as a function of concentration, taken as a linear relation of absolute
error versus concentration, is more problematic. In general, experience demonstrates that average absolute error
does increase as a function of increasing concentration. However, it is also common that assay data are obtained
using different sampling procedures, different labs and different analytical approaches even within the same
lab, all of which contribute to complicating the pattern of error versus concentration if such data are considered
together.
The Thompson-Howarth method is restricted to quantification of random error as a linear function
of metal concentration; bias is ignored and, in fact, is obscured by the method; consequently, some other
method must be used to test for bias. If bias is present in a set of paired data, the Thompson-Howarth method is
an inappropriate means of quantifying error. In addition, the Thompson-Howarth method is most appropriate
for data from the same laboratory, hence, it is generally an unacceptable method for comparing data from two
different labs. The reason is that any error quantified by the T-H method is an average of the errors of the two
components of paired data. Where the data represent a single lab, the T-H method produces an estimate of the
average random error for the particular sampling and/or analytical protocol used by that lab; where the data
represent two different labs, the T-H method produces an estimate of the average error of the two labs (possibly
including a component for bias), whereas what we really want to know is the error of the principal lab
producing the data in question and the presence or absence of bias between the two labs. Of course, where bias
is demonstrated between two labs, additional information is required in order to ascertain which, if either, lab is
correct.
1. For a set of paired (duplicate) data, determine the mean concentration of each pair [(x1 + x2)/2] and the
corresponding absolute difference in concentrations (|x1 - x2|).
2. Arrange paired data in order of increasing concentration, using means of pairs. A minimum of 50 pairs is
recommended for geochemical data.
3. Divide the full data set into successive ordered groups of 11 pairs (for geochemical data). That is, the first
eleven pairs in the ordered set is group 1, the second 11 pairs is group 2, and so on. The authors have found
that for assay data (generally more accurate than geochemical data) as few as 7 pairs per group is
commonly adequate.
4. For each group find the group mean value (i.e., mean concentration of the 11 mean values) and the median
value of pair differences (i.e., the middle value of the absolute differences).
Figure 4.7a: From Thompson and Howarth (1973). Illustration of the T-H method of estimating
precision for geochemical duplicate data. Vertical dashed lines divide the data into subgroups of 11.
For each subgroup, individual data items are shown as open circles, and an average point for each
group (black circle) plots the average grade of the 11 sample pairs versus the median difference of
sample pairs. A line is fitted to the average values representing average error as a function of
concentration. See text for details.
5. Plot the coordinates of these two parameters on an x-y graph and pass a straight line through the points.
Because there are so few points it is generally unwise to use a least squares fitting procedure (unusual line
equations can result); it is better to fit a sensible line by eye.
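The tabulation in steps 1 to 4 can be sketched in a short script. This is a hedged illustration assuming plain lists of paired assays; the function name and test data are invented, and the line through the resulting points is still best fitted by eye, as step 5 recommends.

```python
import statistics

def thompson_howarth_points(x1, x2, group_size=11):
    """Steps 1-4 above: form pair means and absolute differences, sort by
    pair mean, group into successive sets, and return one point per group:
    (group mean concentration, median absolute difference)."""
    pairs = sorted(zip(x1, x2), key=lambda p: (p[0] + p[1]) / 2.0)
    points = []
    for i in range(0, len(pairs) - group_size + 1, group_size):
        group = pairs[i:i + group_size]
        means = [(a + b) / 2.0 for a, b in group]
        diffs = [abs(a - b) for a, b in group]
        points.append((statistics.mean(means), statistics.median(diffs)))
    return points

# Hypothetical example: 22 pairs differing by a constant 0.5 units
x1 = [float(i) for i in range(1, 23)]
x2 = [i + 0.5 for i in range(1, 23)]
points = thompson_howarth_points(x1, x2)   # two groups of 11 pairs
```

For assay data, passing group_size=7 matches the smaller group size the text suggests is commonly adequate.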
Examples are shown in Figure 4.7a and Figure 4.7b. The example of Figure 4.7a illustrates a least squares line
fitted to the median/mean values for each of the groups of 11 analyses (separated by vertical dashed lines) for a
very well-behaved set of data. Generally, it is not wise to use a least squares fitting routine because with a small
number of points controlling the line (9 in the case illustrated) a single point that is out-of-line can lead to a
peculiar and unrealistic equation. Consequently, it is preferable to fit a line to the median/average values by
eye. The problem arises in Figure 4.7b, where only 5 points control the least squares line; it is evident that the
four lowest points describe a line substantially different from the least squares line illustrated, and that the
highest point has an adverse effect on the model for error. In this latter case it seems that a single linear model
does not represent the data well, a situation that can easily be taken into account manually but which cannot be
dealt with in a computer routine involving a single linear model.
Figure 4.7b: Sampling and analytical errors of gold analyses as a function of composition based on
55 duplicate samples from the J&L massive sulphide deposit, southeastern B. C. Ordinate is standard
deviation of replicate analyses (or absolute difference of paired values); abscissa is average
concentration of replicate analyses. Filled squares are individual samples that have been replicated;
+ signs are median differences (or standard deviations) versus averages of successive groups of 11
sets of sample replicates. A linear model has been fitted to the + signs to describe error as a function
of composition. From Sinclair and Blackwell (2002).
The conceptual models of Figures 4.3 and 4.4 make use of a straight line describing the data.
Commonly, such a line has been generated as a statistical best fit using the method of least squares. Sinclair and
Bentzen (1998), among others, recommend that the particular linear equation used to define the model should
be the reduced major axis solution, which has been described previously in the course.
The reduced major axis (RMA) linear model combines a standardization of the two variables (i.e.,
effectively divides each value by the standard deviation) and a major axis, least squares solution to determine
the statistical parameters of the linear equation. This procedure avoids any concern about widely different
errors in the two members of the pairs. An RMA regression is desirable when it is important that errors in both
variables be taken into account in establishing the relation between two variables. Many other writers, including
Agterberg (1974), Till (1974), Davis (1986) and Miller and Kahn (1961), discuss the use of the reduced major
axis solution in the context of analytical data.
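The RMA computation as described, with the slope given by the ratio of standard deviations carrying the sign of the correlation, can be sketched briefly. This is a hedged illustration, not the P-res implementation; the example data are invented.

```python
import math
import statistics

def rma_fit(x, y):
    """Reduced major axis line y = a + b*x: the slope is sign(r) * (sy/sx),
    treating error in both variables symmetrically, unlike ordinary least
    squares, which attributes all error to y."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    # Pearson correlation, computed directly to supply the sign of the slope.
    r = sum((a - mx) * (b - my) for a, b in zip(x, y)) / ((len(x) - 1) * sx * sy)
    slope = math.copysign(sy / sx, r)
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical example: paired values lying exactly on y = 2x + 1
slope, intercept = rma_fit([1.0, 2.0, 3.0, 4.0, 5.0], [3.0, 5.0, 7.0, 9.0, 11.0])
```

Note that the slope does not depend on which member of each pair is plotted on x, apart from inversion, which is exactly the symmetry wanted when both members carry error.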
The simple linear model has been used to illustrate much of this course material to this point. Consider
a simple example to illustrate the interpretive procedure. The data are 74 pulps that were analyzed a second
time for platinum group elements by the same sampling and analytical protocol as a means of examining
analytical variability as a function of concentration. Let us follow through a detailed examination of the
resulting paired analyses for Pt using output from the P-res program.
1. Examine the entire data set on an x-y plot. Note the presence of one extremely high outlier (224200,
237300 ppb; agreement within 7%). The presence of the outlier strongly skews the linear model and
related statistics, to the point that statistics are meaningless relative to the great mass of data. It is important
to view the data with outliers removed in order to allow a fair interpretation of the bulk of the data.
2. Remove the high outlier, for example, by setting lower and higher limits to the data that are displayed on
the x-y plot, and view the remaining data on an expanded scale. Note a group of the 8 highest values (Pthi)
that are widely dispersed along the concentration axes, and the remaining lower data that cluster along the
line near the origin (Ptlo). Form subsets of the two groups of data by fencing the Ptlo values (with P-res)
and moving them to a new symbol.
Figure 4.8: Scatter plot of the eight highest platinum values (the single extreme outlier removed). See text.
3. Examine the high Pt group alone by turning off the symbol for Ptlo (Figure 4.8). Note that the linear model
for the 8 high values indicates that the slope is different from 1.0 and the intercept is different from zero,
implying the presence of bias. This would seem to indicate that the two sets of results are not in good
agreement. However, note that the intercept is positive and the slope is less than 1.0, so the two effects
partly compensate for each other. In fact, examination of the plotted points shows that the lowest 6 are well
described by the y = x line; the upper two are below the line and control the model, imposing both the
low slope and the positive intercept on the model. The two highest samples do not represent a satisfactory
representative sampling of values near 2000 ppb, so we can conclude that there is no evidence of bias. As a
further test we could conduct a paired t-test. The mean value of original minus duplicate is 23.25 ppb
with a standard deviation of 74.46, giving a standard error of 74.46/√8 = 26.3. Zero is contained within the
95 percent confidence limits of the mean, indicating that no bias is evident.
Figure 4.9: Scatter plot of 65 paired low platinum assays for pulps. See text.
4. Examine an expanded plot of the low Pt data by centering on the cluster and zooming in (Figure 4.9). The
plot shows the RMA line very close to the y = x line. The statistics of the RMA line indicate that the slope
is slightly but significantly different from 1.0 (a 3% bias is indicated) and the intercept cannot be
distinguished from zero. Consequently, we conclude that there is a small bias between the two sets of data,
of an amount that is commonly encountered both between and within labs.
In summary: a single high outlier is reproduced satisfactorily, a group of relatively high Pt values (100 to
2400 ppb) is reproduced acceptably with some indication that values near 2000 ppb should be closely
monitored, and low Pt values (1 to 100 ppb) are duplicated satisfactorily despite the presence of a 3% bias
between the two sets of data.
Note that in this interpretive procedure, the simple linear model has been applied independently to each of two
subgroups of data. The subgroups are defined on the basis of data density.
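The paired t-test used in step 3 can be reproduced from the summary numbers quoted there. A hedged sketch using only the standard library; the two-sided 95% critical t value for 7 degrees of freedom (2.365) is hard-coded to keep the example self-contained.

```python
import math

# Summary numbers quoted in step 3 (original minus duplicate, Pt in ppb):
mean_diff = 23.25
sd_diff = 74.46
n = 8

se = sd_diff / math.sqrt(n)        # standard error of the mean difference
t_crit = 2.365                     # two-sided 95% critical t for n - 1 = 7 df
ci_low = mean_diff - t_crit * se
ci_high = mean_diff + t_crit * se

# Zero inside the 95% confidence interval: no bias is evident.
no_bias_evident = ci_low <= 0.0 <= ci_high
```

The standard error works out to about 26.3 ppb, matching the value in the text, and the interval comfortably contains zero.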
Consider the Ag analyses of Figure 4.5, in particular, the top ten values reproduced in Figure 4.10. These data
are well fitted by a linear model, the parameters of which are shown on the figure.
There are situations where it is inappropriate to apply a linear model to duplicate analyses, in
particular, where there are very few pairs of duplicate data. Some approaches to dealing with such situations are
summarized in the following table from Stanley and Lawrie (2007), which shows the close relation of a variety
of formulae that have been used in the past.
Measurement                          Duplicate-pair formula                  Relationship with CV
Coefficient of Variation (CV)        CV = √2 |x1 - x2| / (x1 + x2)           CV
Relative Precision (RP)              RP = 2√2 |x1 - x2| / (x1 + x2)          RP = 2 CV
Relative Variance (RV)               RV = 2 (x1 - x2)² / (x1 + x2)²          RV = CV²
Absolute Relative Difference (ARD)   ARD = 2 |x1 - x2| / (x1 + x2)           ARD = √2 CV
Half Absolute Relative
  Difference (HARD)                  HARD = |x1 - x2| / (x1 + x2)            HARD = CV / √2

For n duplicate pairs, each measure is averaged over the pairs; for example,
CV = (1/n) Σ √2 |x1i - x2i| / (x1i + x2i).

From Stanley and Lawrie (E&MG, v. 6, 2007)
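The relationships among these duplicate-pair measures can be verified numerically for a single pair. A sketch only; in practice each measure is averaged over all pairs, and the example pair values are invented.

```python
import math

def pair_measures(x1, x2):
    """The duplicate-pair error measures of the table above, for a single pair.
    CV follows from s = |x1 - x2| / sqrt(2) and m = (x1 + x2) / 2."""
    d = abs(x1 - x2)
    total = x1 + x2
    cv = math.sqrt(2.0) * d / total      # coefficient of variation
    return {
        "CV": cv,
        "RP": 2.0 * cv,                  # relative precision
        "RV": cv ** 2,                   # relative variance
        "ARD": 2.0 * d / total,          # absolute relative difference = sqrt(2) * CV
        "HARD": d / total,               # half absolute relative difference = CV / sqrt(2)
    }

m = pair_measures(10.0, 12.0)   # hypothetical duplicate pair
```

All five measures are one-to-one functions of CV, which is the close relation the table makes explicit.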
SECTION 5
5.0 SOURCES OF ERRORS: SAMPLING, SUBSAMPLING AND ANALYSIS
5.1 Introduction
The quality of assay data is irrevocably tied to the design and implementation of appropriate sampling
and subsampling procedures or protocols. Samples are, first and foremost, to be taken in a fashion such that
they are fair representations of larger volumes of in situ rock or fragmental ground (Vallee, 1998). Then the
samples must be treated in a manner that maintains the integrity of the metal content throughout sample
handling and eventual analysis. Duplicate sampling data generally do not provide a clear indication of the
representativeness of samples to a much larger volume of rock; that test is best made with a bulk sampling
program. Duplicate cores (adjacent/facing half core samples) do provide insight to very short range variability
(the nugget effect) but give no indication of the relation of more widely spaced (5, 10, 50 meters) samples; thus,
the duplicate core data do no more than provide a minimum insight into the variability that exists between
samples and much larger volumes of ore. In practice, this larger range variability is best evaluated by bulk
sampling although, if a semivariogram model is known, geostatistical methods are available to examine such
variability theoretically.
5.2 Sampling
The term sample can mean different things to different people, particularly statisticians and
geologists/miners. A sample to a statistician generally means a relatively large number (n) of items for which
some quality has been measured (e.g., 283 assay values for 2m half core samples). On the other hand, to a
geologist a sample is the amount of rock or fragmental material that has been taken for purposes of analysis to
determine an estimate of metal content; the metal content is one item in a statistical sample. In the former
meaning a sample is many values; in the latter, a single value. Statisticians use terms to describe various
distribution patterns of the items (geological samples) that make up their sample: random, random stratified, etc.
There is a tendency for practical geological sampling to be classed as a modified, regular sampling pattern
because there commonly is an underlying regularity (e.g., drill holes located at 50m centres) modified by some
additional irregularly located drill holes that have been positioned for a variety of reasons (e.g., to establish
rock characteristics, metallurgical character, geological continuity, etc.).
Samples of rock material, of the order of 0.5 to 5 kilograms, are taken routinely by a variety of
procedures, to represent a very much larger mass of material, commonly 10,000 to 100,000 times larger than
the samples themselves. This general situation is true of much routine work in the mining industry.
Consequently, samples must be taken in a manner that is as representative as possible, and, of course, each
sample must be treated subsequently in a manner that maintains the integrity of the overall metal content as the
sample is reduced in quantity to an amount that is amenable to chemical or instrumental analysis.
Sampling methodology is not a major topic for this course, but is the essential first step in a series of
undertakings designed to produce an assay value that is as representative of a larger mass of rock, as is
reasonably feasible. Consequently, errors implicit in the sampling procedure must be as small as can be
reasonably expected. Sampling error can only be monitored if a sampling program involves the analysis of
duplicate samples, duplicates of subsamples and duplicate pulps, preferably all taken at the same time as the
original samples. Duplicate samples are taken through the subsampling and analytical procedures
independently of the original samples so contain errors from both subsampling and analysis as well as the
fundamental sampling error itself. Each of these errors can be expressed as a variance and their sum represents
total error in a data set; recall the fundamental relation of errors, as follows:
s²total = s²sampling + s²subsampling + s²analytical
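The additivity of error variances can be illustrated numerically; the component standard deviations below are invented purely for illustration.

```python
import math

# Hypothetical component errors expressed as standard deviations (e.g., g/t):
s_sampling = 0.30
s_subsampling = 0.20
s_analytical = 0.10

# Errors add as variances, not as standard deviations:
total_variance = s_sampling**2 + s_subsampling**2 + s_analytical**2
s_total = math.sqrt(total_variance)
```

The total standard deviation (about 0.374 here) is well below the 0.60 a naive sum of standard deviations would give, and the largest component dominates: halving the analytical error alone would barely change the total.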
Sampling techniques and related quality control aspects are discussed by Vallee (1998) and Sinclair
and Blackwell (2002); Vallee's classification is summarized in Table 5-1.
TABLE 5-1: CLASSIFICATION OF SAMPLE TYPES
Sample Type
Description
Point
Linear
Linear samples have one very long dimension relative to the other two; they
include channel samples, linear chip samples, drill hole samples (core and
cuttings). They commonly range from 0.5 to several kg.
Panel
Panel samples (or planar samples) are made up of multiple chips or fragments
collected from a surface such as the wall or roof of a drift, raise or stope. They
commonly range from 1 to 5 kg.
Broken Rock
Rock fragments (muck) from surface trenches, drifts, slashes, raises, etc. Such
samples are commonly collected during exploration, deposit appraisal and mine
development. The source mass may vary from hundreds of kilograms to
hundreds of tons, but generally the samples collected are minuscule in
comparison, varying between 2 and 10 kg.
Large Volume
Bulk and trial stopes are examples of large volume samples that commonly
range from hundreds to thousands of tonnes. Bulk samples often are obtained
from various surface or underground workings.
assay. Generally, acceptable quality control procedures demand that the core be split along its axis with one
half being retained for future reference and the second half forming a sample for assay. The use of whole core
for samples should be avoided if possible. (Highly specialized sampling might include the assaying of entire
core as in the case of drilling purely for metallurgical purposes.) A range of core dimensions can be obtained.
Vallee (1998) indicates that BQ core size (36mm diameter) has replaced AQ (26mm diameter) as the industry
standard in Canada despite higher cost, because the larger diameter provides better core recovery and a larger,
more representative sample. Of course, in any particular deposit evaluation, what is considered an acceptable
core size depends very much on the rock character and grade variability in the deposit.
The smaller the core diameter, the greater will be the grade variability of half core samples;
consequently, the greater will be the variability between two facing half core samples using small core
diameter, relative to larger core diameter. The sampling variability between facing half cores can also be
increased somewhat where manual splitting gives widely varying masses to the two so-called half-core
samples (i.e., the two halves differ substantially in mass). This source of variability can be
overcome by using a diamond saw to cut the core along its axis to provide two half cores of equal size.
Variability of core sample assays can be decreased by increasing the mass of core in the sample. This
can be achieved by (1) using the entire core rather than half core samples (generally undesirable); (2) increasing
the length of core that comprises a sample (perhaps by constructing composites of shorter assay lengths), and
(3) increasing the diameter of the core recovered by drilling. For the most part, half cores should be retained.
The selection of sample length should be considered early in the exploration of a deposit. In general, during
early stages of exploration of a deposit, a shorter sample length can be desirable in order to understand grade
variability as a function of geology. Later, samples can be longer and the earlier, shorter sample values can be
combined (composited) to an equivalent length. Core size (diameter) can be controlled by the necessity of
having a high core recovery to provide faith in resulting assays. Otherwise, core diameter can be selected to
provide samples (normally half cores) that adequately represent the deposit.
Core recovery can be gauged approximately by visual estimation during core logging. A preferred
technique is to weigh the core for a given length of drilling and compare the weight with that expected for perfect recovery. This
procedure requires that an appropriate bulk density be known. Note that such a procedure could occasionally
lead to calculated recoveries in excess of 100 percent (this arises from the use of average bulk density values
for samples that are substantially more dense). Where core recovery is low, assay values become suspect. What
an acceptable lower limit to core recovery will be, is dependent on the mineralogical characteristics of the core.
Soft and/or cleavable minerals can be lost preferentially due to impact and grinding of core pieces during
drilling. Those soft minerals might be either ore or waste. In some such cases both core and sludge can be
assayed and a weighted assay produced. Of course, this combined assay has a different support than do assays
based on half cores. It is useful in some cases to weigh both sludge and core to determine if, together, they
represent an adequate level of recovery; the procedure, however, is relatively costly in man-hours and may
require settling containers of a size not easily obtained.
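The weight-based recovery check described above can be sketched briefly. All numbers here are hypothetical: the BQ diameter is taken as 36.5 mm and the bulk density is an assumed average, which is exactly the assumption that can push calculated recoveries over 100 percent for denser intervals.

```python
import math

def core_recovery(measured_mass_kg, diameter_m, length_m, bulk_density_kg_m3):
    """Recovery as measured mass over the theoretical mass of a full cylinder
    of core; values above 1.0 can occur when the assumed average bulk density
    is too low for a particular interval, as noted in the text."""
    theoretical_mass = math.pi * (diameter_m / 2.0) ** 2 * length_m * bulk_density_kg_m3
    return measured_mass_kg / theoretical_mass

# Hypothetical example: 7.6 kg recovered from 3 m of BQ core (taken as
# 36.5 mm diameter) at an assumed average bulk density of 2700 kg/m3.
recovery = core_recovery(7.6, 0.0365, 3.0, 2700.0)
```

The same function applied to core plus sludge masses gives the combined recovery check mentioned above.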
quality control such as the use of Gy's equation for fundamental sampling error. Application of Gy's formula
helps to ensure that a sample is representative of a very large mass of cuttings. Proper application of Gy's
fundamental formula implies adequate sample homogenization prior to each mass reduction stage.
Of course, cuttings piles are prone to stratification because of the way in which they accumulate, a
stratification that is the reverse of the true stratification in the hole. For example, if the bottom of the hole is
relatively rich in heavy sulphides then the top of the cuttings pile will be similarly enriched. This stratification
must be considered in adopting an appropriate sampling plan for cuttings piles, perhaps by riffling an entire
cuttings pile as part of the sampling protocol, or by taking a slice or pie-shaped segment of the pile. The widely
used tube sampling procedure (see Sinclair and Blackwell, 2002) generally, at best, will have a large random
error.
Figure 5.2: Surface view of a conical pile of blasthole cuttings sampled by 4 scoops (x's) from a tube
and 4 channels. Sampling is adapted to the asymmetry of the pile. The black circle is the blasthole.
Figure 5.3 compares results of a sampling experiment involving duplicate sampling of blasthole
cuttings piles (Figure 5.2). Tube sampling and channel sampling assays are compared with a best value
(weighted average of tube sample, channel sample and remaining material). Clearly, the dispersion about the y
= x line is less for the channel sample relative to the tube sample. In other words, the random error of channel
sampling is less than the random error of tube sampling.
Figure 5.3a: Tube sampling results vs best value, Equity Silver Mine.
Figure 5.3b: Scatter plot of channel sampling results vs. best estimate, blasthole samples,
Equity Silver Mine.
Sampling tests are useful, even necessary, in adopting an appropriate sampling protocol for crushed
material. Figure 5.4 is an example of paired assay data for two different sampling procedures applied to reverse
circulation cuttings. A so-called "regular" procedure involved directing all cuttings to a cyclone and taking a
1/8th split as the sample; a so-called "total" procedure, in addition to a 1/8th split comparable to that
just described, also collected overflow from the cyclone and weighted that grade into the final assay that was
reported. A superficial examination of Figure 5.4 might lead
59
Figure 5.4: Assay data for two sampling methods for reverse circulation drilling. See text for details.
one to the conclusion that there is good agreement between results of the two sampling procedures. However, 9
relatively high influential values have a strong control on the linear model. Clearly the 9 high values are
unbiased because they scatter either side of the y = x line. However, there is a strong concentration of data (75
of the 84 values) clustered near the origin and it is worthwhile to examine these data in expanded format. That
lower grade cluster is shown in Figure 5.5 where a strong proportional bias (Total samples assay about 26
percent less than corresponding Regular assays, on average) is demonstrated by the reduced major axis line. In
this case the conclusion that seems evident is that gold is not distributed evenly among the various size fractions
of the drill
cuttings; consequently, the Regular (routine) sampling procedure must be replaced by a more representative
procedure.
Figure 5.5: Expansion of low grade reverse circulation sample data of Figure 5.4.
5.3 Subsampling
A mineralized sample commonly is a mixture of relatively large but variable-sized, solid fragments
that must be reduced in both particle size and mass to a small amount of finely ground material that is analyzed
to determine metal content. For example, a 2-metre length of half (split) BQ core has an ideal volume of
0.00104 m3 which, for a bulk density of 2.9 g/cc, translates to a mass of about 3000 grams, a quantity that must
be reduced by 2 orders of magnitude (to about 30 grams) for traditional fire assay. This overall procedure,
known as a subsampling protocol, involves a series of steps of alternating particle size reduction and mass
reduction that can be demonstrated by a simple example plotted on a sample reduction diagram (Figure 5.6).
Suppose that a sample consisting of 1 meter of half-core weighs 2700 grams and consists of fragments up to 10
centimeters in length (point #1, Figure 5.6). The sample might be crushed so that the maximum particle
diameter is 0.5 centimeters (point #2, Figure 5.6). Then the crushed material is homogenized and a portion is
taken, say one quarter of the original sample, perhaps by riffling (point #3, Figure 5.6). This smaller portion is
then further crushed and/or ground to a much smaller particle size (point #4, Figure 5.6) and again the material
is homogenized and a fraction is taken, perhaps one-quarter of the material (point #5, Figure 5.6). The
remaining 3/4ths of the material at this stage (about 2700 x 1/4 x 3/4 = 506 grams) might be saved as a sample reject
and the one-quarter taken will be further ground to provide a sample pulp (point #6, Figure 5.6), part of which
will be analyzed (point #7, Figure 5.6). Assuming no loss of material during size reduction, the amount of
material forming the pulp is 1/4 x 1/4 = 1/16th of the original sample, or 2700/16 = 169 grams, of which 30 grams
(approximately 1 assay ton) normally will be measured out for actual analysis by fire assay. The rejects and
unused pulps commonly are retained for a specified period of time, perhaps several years, in case they are
required for quality control or other purposes. Such rejects and pulps are also available for reanalysis during
audit, due diligence and feasibility procedures or when significant errors are suspected in assay data.
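The mass-reduction bookkeeping above is simple enough to sketch in a few lines; the sketch below assumes the hypothetical protocol of the example (a 2700 g half-core sample, split twice at one-quarter, no loss of material at any stage).

```python
# Sketch of the mass-reduction arithmetic described above, using the
# hypothetical protocol from the text: a 2700 g half-core sample split
# twice at one-quarter, assuming no loss of material at any stage.
def pulp_mass(initial_mass_g, split_fractions):
    """Mass remaining after a sequence of splits (fractions retained)."""
    mass = initial_mass_g
    for fraction in split_fractions:
        mass *= fraction
    return mass

sample_g = 2700.0
pulp_g = pulp_mass(sample_g, [0.25, 0.25])  # two quarter splits = 1/16
reject_g = sample_g * 0.25 * 0.75           # 3/4 saved after the 2nd split

print(pulp_g)    # -> 168.75 (about 169 g of pulp; ~30 g is fire assayed)
print(reject_g)  # -> 506.25 (the ~506 g reject of the text)
```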
Figure 5.6: Hypothetical example of a sample reduction scheme (essential elements of a sampling
protocol) plotted on a sample reduction diagram. The step-like pattern shows alternating stages of
particle size reduction and mass reduction (circled numbers) to eventually end with a subsample of
pulp that will be analyzed. See text for details.
There is an extensive literature on the procedures to be used to optimize the sample reduction
procedures so that the errors (both bias and reproducibility) in the analyses are acceptably small and, hence, a
reported assay value is truly representative of the initial field sample (e.g., Gy, 1979; Francois-Bongarcon,
1998; Ingamells, 1974). Gy's approach to designing an optimal sampling protocol is discussed in a separate
section. Regardless of how well designed a sample reduction scheme is, errors, however small or large, exist
and it is essential to monitor the quality of data obtained. Such monitoring provides the basic assurance that a
specific level of data quality is being maintained and allows the recognition of procedural difficulties as they
arise. A number of practical considerations in minimizing and monitoring subsampling errors are given in Table
2.
Gy (1979) relates subsampling error to the masses of the sample and the lot being sampled:

1/MS - 1/ML = s²/(C d³)    (5-1)

where
MS [grams] = weight of sample
ML [grams] = weight of lot being sampled
C [g/cm³] = sampling constant
d [cm] = diameter of the largest fragments
s (fraction) = relative standard deviation (error)

Note that where ML is very large relative to MS, 1/ML becomes negligible and the left hand side of the equation
reduces to 1/MS so that equation 5-1 reduces to

MS = C d³/s²    (5-2)

The sampling constant is the product of four factors:

C = f g ℓ m    (5-3)

where
m, the mineralogic parameter (g/cc), is m = [(1 - xL)/xL] [(1 - xL)δx + xLδw] where
xL is the grade of the valuable mineral (as a proportion rather than percent: 7% becomes 0.07)
δx is the density of the valuable mineral in g/cc
δw is the density of the waste material (i.e., everything but the valuable
mineral)
63
Figure 5.7: A subsampling protocol for a bulk sampling program on the Ortiz gold deposit.
The liberation factor ℓ (dimensionless) depends on the ratio d/do:

d/do     ℓ
<1       1.0
1-4      0.8
4-10     0.4
10-40    0.2

where d is the mesh size that retains the upper 5% of fragments and do is the effective liberation size of the
valuable material.
f is the form parameter (dimensionless), which is 0.5 for most practical situations except for highly platy
materials such as flakes of gold (for which f = 0.2).
g is the size distribution factor (dimensionless), which for unsorted material is taken as 0.25. For material sorted
into screened fractions the value of g for each screen fraction is approximately 0.5.
The estimation of the sampling constant, C, is fundamental to the practical application of Gy's
sampling equation. Values can range over many orders of magnitude. For gold deposits alone, Sketchley (1998)
has shown values ranging over 5 orders of magnitude as illustrated in Table Sketch-C.bmp. Very early in the
history of deposit evaluation, when little detail is known about the mineralogic characteristics of the material
being assayed, a very crude approach to quality control is to use Gy's safety diagram, which guarantees a
precision of 5% (one standard deviation) for most materials (many gold deposits excepted). An example is
shown in Figure 5.8, where a safety line is defined by
MS = 125,000 d³    (5-4)

where MS is in grams and d is in centimetres; the line contains the points (0.2 mm, 1 g) and (2 mm, 1000 g). In
describing the safety line, Gy (1979) states: "It is valid for all geological materials such as core samples, with
the exception of gold ores, irrespective of the critical content, with the exception of very low grade ores. This
rule is obviously the result of a compromise between cost and reproducibility. When low cost is regarded as
more important than reproducibility, then the factor 125,000 may be reduced to 60,000. When, on the contrary,
precision is regarded as more important than cost, which should be nearly always the case, the factor can be
doubled to 250,000. The reader should understand that what is important is the order of magnitude of the
sample weight."
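The safety-line rule quoted above can be checked directly; the function below simply evaluates MS = 125,000 d³ for a top particle size given in centimetres, reproducing the two anchor points mentioned in the text.

```python
# The safety line quoted above: MS = 125,000 * d**3, with MS in grams
# and d the top particle size in centimetres. This merely checks the
# two anchor points of the line given in the text.
def safety_mass(d_cm, factor=125_000):
    """Minimum 'safe' sample mass in grams for top size d_cm."""
    return factor * d_cm ** 3

print(round(safety_mass(0.02), 3))  # 0.2 mm top size -> 1.0 g
print(round(safety_mass(0.2), 3))   # 2 mm top size -> 1000.0 g
```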
It is important to fully comprehend what Gy's equation does and does not do. First and foremost, the
equation assumes that optimal homogenization of the sample occurs prior to each stage of mass reduction. In
other words, poor procedures can destroy the validity of the equation. Moreover, there is a common
misapprehension that Gy's equation tells us something about the solid, in-place material from which the broken
material was extracted. It must be clearly understood that Gy's equation applies to broken and/or fragmented
material and is aimed at optimizing the eventual assay attributed to that broken/fragmented material. Applying
Gy's safety diagram or, more specifically, his equation, will not compensate for unrepresentative samples. In
the case of a half-core sample, a carefully obtained analytical result tells us nothing about how representative
the value is of the second half of the core. In other words, when applied to a sample of broken material derived
from solid rock, the quality of analysis obtained using Gy's equation bears no relation to the real sampling
variability within the corresponding solid rock material.
5.5 Analysis
In practice, analytical error includes a small subsampling error that arises inherently when a small
amount of a pulp is selected for analysis. Analyzing pulps in duplicate, therefore, is a standard means of
documenting analytical error. The regular use of standard reference materials and blanks is a check on the
presence or absence of bias in the analytical procedures as well as providing specific examples of the random
error inherent in the analytical procedures. Regular analyses of pulps by a second, competent lab, is a common,
useful means of monitoring for analytical bias. It is important to realize that such duplicate data do not provide
a quantitative estimate of precision of either lab, but do provide an average precision of the two labs. The
purpose of this monitoring by an independent lab is to check for bias, not to determine precision. The precision
of a lab can be determined by having that lab, itself, analyze a series of duplicate pulps. Consider a case where
duplicate analyses by two labs (subscripts a and b) have been used to demonstrate an average relative error of
5% for gold values that average 1.5 g/t, that is, s = 0.075 g/t and s² = 0.075² = 0.00563. We know that variances
are additive so (sa² + sb²)/2 = 0.00563, or, sa² + sb² = 0.01126. It is an interesting exercise to substitute various
values for one of the variances in this latter equation and examine the results as average precisions for both
labs; this is done in Table 5-3.
TABLE 5-3: CALCULATED PRECISIONS FOR VARIOUS ERROR DISTRIBUTIONS IN
THE RELATION sa² + sb² = 0.01126

sa²    sa     sa/m   Pra     sb²     sb     sb/m   Prb
.01    .1     .0667  13.3%   .00126  .0355  .0237  4.7%
.005   .0707  .0471  9.4%    .00626  .0791  .0528  10.5%
.004   .0633  .0422  8.4%    .00726  .0852  .0568  11.4%
.003   .0548  .0365  7.3%    .00826  .0909  .0606  12.1%
.002   .0447  .0298  6.0%    .00926  .0962  .0642  12.8%
For a relative error of 0.05, the average precision is 10%. The data of Table 5-3 demonstrate that even where the
average precision of the two labs is reasonable, the two labs can differ in precision by a factor of two.
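The Table 5-3 arithmetic can be reproduced in a few lines; the only inputs are the mean grade and the total variance from the worked example above.

```python
# Reproducing the Table 5-3 arithmetic: split the total variance
# sa**2 + sb**2 = 0.01126 between two labs and express each share as a
# precision, Pr = 2*s/m (percent), for the 1.5 g/t mean of the example.
mean_grade = 1.5      # g/t Au
total_var = 0.01126   # sa^2 + sb^2

for sa2 in (0.01, 0.005, 0.004, 0.003, 0.002):
    sb2 = total_var - sa2
    pr_a = 200 * sa2 ** 0.5 / mean_grade  # precision of lab a, percent
    pr_b = 200 * sb2 ** 0.5 / mean_grade  # precision of lab b, percent
    print(f"sa2={sa2:.3f}  Pr_a={pr_a:4.1f}%  Pr_b={pr_b:4.1f}%")
```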
A variety of analytical methods commonly are available for samples from an evaluation project. It is
not our purpose here to deal with the details of various methods, rather we want to indicate that different
methods might be appropriate depending on factors such as
(i) the composition of the matrix of samples being analyzed,
(ii) the compositional range being investigated, and
(iii) the quality required for the data.
Because of the wide variety of analytical methods/procedures, we are commonly faced with comparing data
obtained by more than one method; there is every reason to be on guard for differences between methods, in
particular, biases and differences in amount of average random error.
The explorationist/mine geologist should determine an adequate analytical procedure in discussions
with analysts. Consider the case of gold analyses, as discussed by Hoffman et al (1998), for which many
analytical techniques exist. Of course, for most routine deposit evaluation, the various tried and true variants of
the fire assay techniques are relied on widely. However, it is apparent in Table (hoffman.bmp) that other
techniques might be desirable for special purposes or for particular ranges of gold content.
Figure 5.9: Scatter plot of original Mo assays (x) versus a replicate assay (y) for an operating
molybdenum mine. Original analyses were low by an average of about 5% and were found to be the
source of the discrepancy.
The same approach applies to analyses of a wide range of metals. Springett documents three
techniques of tungsten analysis with a substantial bias among them. Schwarz et al (1983) found biased results
for one of two analytical methods for molybdenum as illustrated in Figure 5.9. Figure 5.10 is a plot of 31 pulps
analyzed for Pd by both NiS and Pb fusion procedures. In this case the NiS fusion assays are about 25% lower,
on average, than the corresponding Pb fusion procedures. Because of the scatter of data, the linear model does
not prove a significant difference between the two procedures. However, a simple t-test at the 0.05 level
indicates an average bias.
Figure 5.10: Scatter plot of thirty one pulps analyzed by both NiS and Pb fusions for Pd.
SECTION 6
6.0 MONITORING AND QUANTIFYING DATA QUALITY
6.1 Introduction
The use of duplicate data as a means of monitoring and quantifying the quality of assay data is well
ingrained in mineral exploration literature (e.g., Sketchley, 1998). The theory and background of duplicate
samples in providing quality control of assay data have been well known and in intermittent use for many years
(e.g., Burns, 1981; Thompson and Howarth, 1976; Kratochvil and Taylor, 1981). Nevertheless, the use of
duplicate samples has increased substantially in recent years, stemming largely from the Bre-X fiasco of the
mid-1990s and various legislation and guidelines that emerged as a response to that incident.
In addition to the use of duplicate sample analyses, replicate analyses of appropriate standard
materials are an essential part of a quality control program for assays. There is no advantage to being able to
demonstrate a precision of 5 percent on pulps within a laboratory if the lab has a 10 percent high bias that has
not been documented! Consequently, in addition to the analysis of duplicate samples as a monitor on quality, it
is important to periodically analyze standard samples that have known metal abundances, including those
highly specialized standards with zero metal content, called blanks.
of that found on the property under evaluation; hence, there should be no serious matrix differences between
the property standard and the great majority of samples analyzed.
Standards are used principally as a check on lab accuracy. Standards are analyzed sequentially along
with routine samples and the analytical results for each standard should be plotted sequentially so that any
systematic variations over time can be identified readily and dealt with where necessary. Significant variations
from the recommended values of certified reference materials indicate that bias is present in the laboratory
procedure and rectification is in order. Some of the idealized patterns that arise and their explanations are given
in Figure 6.1.
The statistics for repeat analyses of two local commercial standards, UMT-1 and WPR-1, are given in
Table 2 and provide an indication of how analytical data should be summarized as well as the level of
variability that can arise.
TABLE 6-2: SUMMARY STATISTICS FOR TWO COMMERCIAL STANDARDS ANALYZED AS
PART OF A DEPOSIT RE-EVALUATION PROJECT

              Pt (ppb)                 Pd (ppb)
Standard  n   Avg    sd    CV    se    Avg    sd   CV    se
UMT-1     20  125.2  8.2   6.6%  1.8   108.7  6.8  6.3%  1.5
WPR-1     21  276.4  10.8  3.9%  2.4   231.9  8.3  3.6%  1.8

n = number of analyses; avg = average; sd = standard deviation; se = standard error; CV = coefficient of
variation
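A minimal sketch of the Table 6-2 statistics (avg, sd, CV and se) for replicate analyses of a standard; the six replicate values fed in below are invented for illustration.

```python
import math

# Summary statistics of the kind reported in Table 6-2 for replicate
# analyses of a standard. The six replicate values are invented.
def summarize(values):
    n = len(values)
    avg = sum(values) / n
    sd = math.sqrt(sum((v - avg) ** 2 for v in values) / (n - 1))
    return {"n": n, "avg": avg, "sd": sd,
            "CV%": 100 * sd / avg, "se": sd / math.sqrt(n)}

stats = summarize([120.1, 130.3, 118.9, 127.5, 124.8, 129.6])  # ppb, invented
print({k: round(v, 2) for k, v in stats.items()})
```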
Figure 6.1: (a) Time plots of analytical data for two Mo standards, E and D. The expected (known)
values with which these replicates should be compared are 0.2347 ± 0.0184 for E and 0.1160 ± 0.0038
for D. Note bias in analyses of both standards. (b) Idealized time plot to illustrate interpretation of
patterns. In all cases, assume an expected value of 1.5 ppm Au. Results 0-10 are accurate and
precise except for one outlier; results 11-20 are accurate but not precise; results 21-30 show a
systematic trend that warrants investigation; results 31-40 are precise but not accurate; results 41-50
are both precise and accurate.
Note that repeat analyses of standards of all types provide data that can be used to estimate error quantitatively.
In the case of the two metals summarized in Table 6-2 the error (dispersion) is reported as standard deviation
(sd) and coefficient of variation (CV). Recall that multiplying the CV by 2 gives the average precision, i.e., for
Pd in UMT-1 the precision is Pr = 2 x 6.3 = 12.6%. Where standards are recognizable as such to the analyst,
precisions determined from their analyses are apt to be optimistic relative to precisions of unknown samples.
In-house standards commonly are not analyzed widely enough that the true metal content is known with the
same confidence as for certified reference materials. If the in-house analyses are obtained with sufficient quality
control involving the use of known standards, however, they can still provide a reasonable control for
monitoring bias as well as precision.
The analytical results for the standards of Table 6-2 are acceptable because of the following features:
1. examination of histograms of the data (not shown) indicates that they are distributed symmetrically about
the mean values;
2. the standard deviations (sd) are small relative to the corresponding mean values, as shown also by the
small coefficients of variation (CV) given by the formula 100*sd/avg;
3. the 95 percent confidence limits for the mean (avg ± 2*se) define a range that is relatively small as a
percentage of the mean.
Practical use of the repeat analyses of a standard requires sufficient replicates to show whether or not various
types of trend exist; such trends may be obvious, as illustrated in Figure 6.1, or might be less obvious visually
and require statistical testing. Consider a typical example of 58 sequential analyses of a Pt standard (Figure 6.2).
A variety of statistics for these data are given in Table 6-3. Consider the data with the outlier removed and
divided into two
TABLE 6-3: SUMMARY STATISTICS FOR REPLICATE ANALYSES OF A Pt STANDARD

n    Mean   sd     Range       Remarks
58   35.78  13.89  19-116 ppb  Total available data
57   34.37  8.914  19-60       Data less one outlier (4th analysis)
22*  36.50  10.06  26-60       First 23 results less outlier
35*  33.03  7.976  19-58       Final 35 results

*Division into subgroups is arbitrary and is based on a visual examination of a graph of the data (grade versus
sequence no.).
subgroups, 22 early analyses and 35 later analyses. These two groups can be tested for bias. First determine an
F-value: F = 10.06²/7.976² = 1.59. This is less than Fcrit = 1.99 for α = 0.05 and df = (34, 21) derived from
tabulations in statistical texts, so the two variabilities cannot be distinguished. A t-value can be determined as
follows:
t = (33.03 - 36.50)/(10.06²/22 + 7.976²/35)^1/2 = -1.37. The absolute value compares with tcrit = 1.96 for α = 0.05
and df = 55, so the two means are indistinguishable; in other words, no bias can be identified.
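The subgroup bias check above can be sketched in code using the Table 6-3 statistics; note that the t denominator pairs each standard deviation with its own group size (n = 22 for sd 10.06, n = 35 for sd 7.976).

```python
import math

# F- and t-statistics for the two Table 6-3 subgroups described above.
def f_statistic(s1, s2):
    """Variance ratio, larger variance over smaller."""
    v1, v2 = s1 ** 2, s2 ** 2
    return max(v1, v2) / min(v1, v2)

def t_statistic(m1, s1, n1, m2, s2, n2):
    """Two-sample t with a Welch-style denominator."""
    return (m1 - m2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

F = f_statistic(10.06, 7.976)
t = t_statistic(33.03, 7.976, 35, 36.50, 10.06, 22)
print(round(F, 2), round(abs(t), 2))  # -> 1.59 1.37 (both below critical)
```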
A common practice in dealing with analytical data for a standard is to view the data as a histogram, as
shown in Figure 6.2 for the data summarized in Table 6-3. In this example, the outlier stands out clearly from the
well-defined bell shape of the remaining data and serves as pictorial justification for omitting the outlier value in
estimating a representative mean value of the replicate analyses. The histogram, however, is not useful for
recognizing subpopulations of the data, which are more likely to be evident on a probability plot.
Figure 6.2: Fifty-eight sequential analyses of a Pt standard. Note the outlier. Also, note the possibility
that early analyses are higher, on average, than later analyses. See text for discussion.
6.2.2 Blanks
Blanks are samples or pulps that are known to contain negligible (effectively zero) contents of an
element or elements (metals) for which assays are being determined. They are used for two main purposes, (1)
to monitor contamination during subsampling and (2) to monitor contamination in the analytical environment.
Occasionally blanks are inserted at various places in a sample sequence to check that samples have been kept in
order throughout the various subsampling and analytical processes. Long (1998) suggests that in low grade ores
blanks are not particularly effective for this latter purpose because many of the low-grade samples can be near
the analytical detection limit.
Figure 6.4: Concentration-difference plot for 4 molybdenum standards. The ordinate is a difference
between analyses of the standards and their accepted values; abscissa is concentration. Two of
these correspond to the time plot of Figure 6.1a.
1. To allow for more limited use of relatively expensive certified reference materials.
2. To monitor the quality of original data with compositions between those of certified reference materials or
internal standards.
3. To monitor potential analytical problems in the principal lab.
Duplicate pulp analyses involving data from two labs are commonly, but incorrectly, used to estimate precision.
It should be apparent that any so-called precision obtained by comparing analyses from one lab with analyses
from a second lab will be intermediate between the true precisions of the two labs. If the precision of neither of
the two labs is known for data consistent with the duplicate data, no quantitative estimate of the precision of
either lab is possible. If the precision of one of the labs is known for data consistent with the duplicate data, it is
possible to estimate the precision of the second lab; this arises because the dispersion (as a variance) about the
RMA line is the sum of the dispersions/analytical errors (as variances) for the two labs, i.e., sa1² + sa2² = sd²
(recall the example in Table 5-3 of Section 5).
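The variance decomposition just described can be sketched as follows; the paired dispersion and known-lab values below are illustrative assumptions, not data from the text.

```python
import math

# Sketch of the decomposition described above: the dispersion of
# paired-lab data (as a variance) is the sum of the two labs'
# analytical variances, sa1**2 + sa2**2 = sd**2. Numbers are assumed.
def second_lab_sd(paired_sd, known_lab_sd):
    """Analytical sd of the second lab given the paired dispersion."""
    return math.sqrt(paired_sd ** 2 - known_lab_sd ** 2)

# If the paired dispersion is 0.10 g/t and lab 1 is known to run at 0.06:
print(round(second_lab_sd(0.10, 0.06), 3))  # -> 0.08
```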
About 3 in every 40 to 100 duplicate pulps (depending on the scale of the project) should be sent to an
independent lab as a routine monitor on the principal lab. In addition, pulps of property standards in use by the
principal lab should be added to the routine sample duplicates sent to the check lab.
There is no such thing as too much data in most practical quality control situations. Consider 30 drill holes
that intersect a narrow vein such that there is only 1 sample per drill hole. This provides 30 samples, perhaps 10
of which should be done in duplicate as described later. Now suppose that 30 drill holes, each of 100m length
in a massive, mineralized epithermal gold zone are sampled as 2-meter lengths of half core. Each hole
represents 50 samples for a total of 1500 samples. Of these 1500 samples about 150 should be analyzed in
duplicate and should include about 50 duplicate half cores, 50 duplicates generated from reject material and 50
duplicate pulps. In addition, 50 pulps should be duplicated by an independent, reputable lab. These duplicate
samples are in addition to blanks and standards that are included with analytical batches. Labs traditionally
insert their own blanks and standards, but the client should also submit blind (i.e., unknown to the lab) blanks
and standards. Moreover, the client should receive the lab's results for the lab blanks and standards, as well as
detailed information on the reported values for these materials.
Following QC data verification, an appropriate approach to interpretation includes:
1. A general examination of the data for irregularities, outliers, subpopulations that might require separate
interpretation, etc., using histograms, probability plots and scatter (x-y) diagrams.
2. Examine scatter diagrams of various sets of duplicate data to identify influential data (especially a small
number of scattered high values that might control a linear model but might bear no particular relation to
the error characteristics of lower values).
3. In general, divide the data into appropriate subgroups to be evaluated separately; base this division on
variations in spread about the y = x line and on data density as a function of concentration. Divide into
ranges using the average of pairs; an exception is the case where there is a change in method at a particular
value.
4. Examine sequential analyses of standards on a value-time plot. Such a plot should be updated and reviewed
as each new batch of data is received. The mean and spread of assay values reported for the standards
should be compared with the reported (or best) value for the standard.
5. For each appropriate subgroup, examine duplicate data from the check and principal labs for bias by
comparing a linear model with the y = x line. Quantify the bias, if present, for each subgroup. An average
precision for the two labs can be determined but is not always necessary and can be misleading in that the
uninitiated might incorrectly assume that the estimated precision is a precision for the data generated by the
principal lab.
6. For the duplicate samples by the principal lab (e.g., duplicate half cores), evaluate the results together with
the duplicate pulp analyses for a reject sample of the corresponding original half cores. Determine the
magnitudes of sampling, subsampling and analytical errors.
Figure 6.5: Scatter plot of duplicate Pt analyses of pulps from first half of core, by due diligence lab.
The first estimate of analytical precision obtained from the data of Figure 6.5 is

2sa² = 265.6² = 70,543.4 so that sa² = 35,271.7 and sa = ±187.8

A second estimate of analytical error can be obtained in the same way from the data in Figure 6.6, that is, 2sa² =
222.6² so that sa² = 24,775.4 and sa = ±157.4. These two estimates can be averaged (by averaging the variances)
to give a best estimate of the average analytical error of the due diligence lab based on these data alone; that is,
average sa² = (35,271.7 + 24,775.4)/2 = 30,023.5 to give an average sa of ±173.3. This average analytical error
can be transformed into an average precision based on an average grade of 1172 ppb. The average relative error
is sa/m = 173.3/1172 = 0.148, to give an average precision of 200 x 0.148 = 29.6%.
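The duplicate-pulp calculation above can be written directly in code; the dispersion values and the 1172 ppb mean grade are the ones quoted in the text.

```python
import math

# Duplicate-pulp precision as described above: for pairs analyzed by a
# single lab, (dispersion)**2 = 2*sa**2, so sa and the average precision
# follow directly. Dispersions and mean grade are from the text.
def analytical_sd(dispersion_sd):
    """sa from the dispersion of duplicate pairs: 2*sa^2 = dispersion^2."""
    return math.sqrt(dispersion_sd ** 2 / 2.0)

sa1 = analytical_sd(265.6)  # first duplicate set (Figure 6.5)
sa2 = analytical_sd(222.6)  # second duplicate set (Figure 6.6)
sa_avg = math.sqrt((sa1 ** 2 + sa2 ** 2) / 2.0)  # average the variances
precision = 200 * sa_avg / 1172.0                # percent
print(round(sa_avg, 1), round(precision, 1))     # -> 173.3 29.6
```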
Figure 6.6: Scatter plot of duplicate Pt analyses of pulp from second half of core by due diligence lab.
Figure 6.7: Scatter plot of original core analysis versus first same pulp analysis by due diligence lab.
Figure 6.8: Scatter plot of original pulp analysis and second same pulp analysis by due diligence lab.
Data of the type displayed in Figures 6.7 and 6.8 are commonly used incorrectly to estimate precision. The fact
is that if a precision is determined from data such as those of Figure 6.7, the resulting value is an average
precision of the two labs involved, and one lab could be much worse than the other. Let us calculate the
precision illustrated by Figure 6.7 as follows:
st² = ss² + sa²

where st² is the total error (as a variance), ss² the sampling error and sa² the analytical error. We know sa² from
the calculations we have done above and we have four separate estimates of st. Consequently, we can calculate
an average sampling error. The 4 estimates of st arise from the following pairings, which all involve a
comparison from one half core to the other half core:
AARLPT1 vs ACOREPT3
AARLPT1 vs ACOREPT4
AARLPT2 vs ACOREPT3
AARLPT2 vs ACOREPT4
These comparisons are shown in Figures 6.9 to 6.12.
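The error separation can be sketched as below; the four st estimates are hypothetical placeholders (the actual values come from the pairings of Figures 6.9 to 6.12), while sa² is the average analytical variance computed earlier.

```python
import math

# Sketch of the error separation described above: each half-core-vs-
# half-core pairing yields a total-error estimate st; subtracting the
# analytical variance sa**2 leaves the sampling variance ss**2.
# The four st values below are hypothetical placeholders.
sa2 = 30_023.5                               # average analytical variance
st_estimates = [310.0, 295.0, 305.0, 320.0]  # ppb, hypothetical

st2_avg = sum(st ** 2 for st in st_estimates) / len(st_estimates)
ss = math.sqrt(st2_avg - sa2)                # average sampling error
print(round(ss, 1))  # -> 254.2
```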
Figure 6.9: First half-core vs. second half core, ACOREPT4 vs AARLPT1
Figure 6.13: Two hypothetical grade distributions centered on the same mean value, a wide
dispersion representing quarter-core data and a narrow dispersion representing half-core data.
It is an established fact that sample variance (i.e., sample grade variability) is a function of sample
size, generally increasing as the sample size decreases (Journel and Huijbregts, 1979). In simple terms, this
means that a data set based on physically small samples will have more high and low values than will a data set
based on physically larger samples. The principle is illustrated in Figure 6.13, where two grade distributions
(ideal normal curves) are centered on the same mean value; smaller samples produce distributions with
broader dispersion. Note the impact if a cutoff grade is applied to the two curves: the grade distribution for
smaller samples produces high-grade values that do not exist in larger volumes and the forecasted average
grade is biased high.
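The support effect described above is easy to demonstrate by simulation; the two normal populations below are illustrative, not deposit data.

```python
import random
import statistics

# Support-effect demonstration: small samples (broader dispersion) and
# large samples (narrower) share the same mean, but above a cutoff the
# small-sample set returns a higher average grade. Values illustrative.
random.seed(42)
true_mean, cutoff = 1.5, 2.0  # g/t

small = [random.gauss(true_mean, 1.0) for _ in range(100_000)]  # quarter-core-like
large = [random.gauss(true_mean, 0.5) for _ in range(100_000)]  # half-core-like

avg_small = statistics.mean(g for g in small if g > cutoff)
avg_large = statistics.mean(g for g in large if g > cutoff)
print(avg_small > avg_large)  # -> True: smaller support inflates grade above cutoff
```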
This principle of increasing grade dispersion with decreasing sample size becomes a matter of concern in
auditing and due diligence work that involves resampling as a check on the validity of data. The common
practice of quartering core (i.e., halving some of the remaining half cores) leads to the comparison of assays for
two sets of data of differing sample support. In many practical cases the differences in dispersion (standard
deviation) are of the order of a few percent and are not a matter of concern. In other cases, especially those with
a high nugget effect and/or a strongly skewed distribution, the use of quarter cores can lead to very large
differences between quarter-core values and corresponding half-core values.
Part of a property evaluation in a New Brunswick PGM prospect produced a set of half-core assays in
the early 1990s. A few years later the remaining half cores were themselves halved to produce quarter cores that
were also analyzed for PGMs (Table 6.4). These data have a mean value of about 78 ppb Pt and standard
deviations of about 60.6 and 101.4 for half-core and quarter-core data respectively.
TABLE 6.4: MEANS (m) AND STANDARD DEVIATIONS (s) OF QUARTER-CORE AND
HALF-CORE ASSAYS (ppb)

              Pt            Au           Pd
              m      s      m     s     m      s
Quarter-core  76.2   101.4  21.0  33.3  274.4  379.5
Half-core     78.5   60.6   21.4  23.3  286.0  214.7
Figure 6.14: Idealized illustration of a comparison of quarter-core and half-core assay distributions
discussed in text.

Consider the impact of applying a cutoff grade of xc = 100 to both the quarter-core and the half-core data sets. It
is evident from Figure 6.14 that the quarter-core data above cutoff will produce an average grade substantially
higher than will the half-core data above cutoff grade.

It is evident from these data that comparison of quarter-core with half-core assays is not a fair comparison,
particularly for the very high values of quarter-core, which are clearly of special interest but for which the
quarter-core data are likely to be biased high relative to the half-core results.
A Solution

A fairer way to compare quarter-core results with half-core analyses in this case is
to make 2-meter composites of the quarter-core values and compare them with each of the
corresponding 1-meter half-core data. Values of composites can be compared first
with the upper half-core values and then with the lower half-core values.
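The compositing fix can be sketched as follows; the assay list is hypothetical.

```python
# Sketch of the compositing fix suggested above: average consecutive
# pairs of equal-length quarter-core assays into composites before
# comparing supports. The assay list is hypothetical.
def composite(assays, runs=2):
    """Equal-length composites of consecutive assay runs."""
    return [sum(assays[i:i + runs]) / runs
            for i in range(0, len(assays) - runs + 1, runs)]

quarter_core = [40.0, 310.0, 55.0, 12.0, 95.0, 140.0]  # ppb, hypothetical
print(composite(quarter_core))  # -> [175.0, 33.5, 117.5]
```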
Example 2:
In general, it is important to appreciate whether a metal analysis is total metal or a partial extraction.
For example, nickel can occur in the sulphide form as well as being tied up in silicate lattices; in sulphide-rich
deposits only the sulphide form is generally of interest because the Ni tied up in silicate lattices is not
recoverable. Consequently, an appropriate method of analysis is required. An example is illustrated in Figure
6.16, where a 4-acid digestion produces a total metal analysis whereas a 3-acid digestion approximates sulphide
Ni. The figure clearly shows a fixed bias of about 75 ppm Ni that represents the average Ni content of olivine in
this example.
Figure 6.16:
Example 3
This example involves a comparison of instrumental analyses for uranium versus chemical analyses.
Abundant data are illustrated in Figure 6.17 where two types of bias are evident. The first is for low values
where decay equilibrium did not prevail; the second is for higher values, where a linear trend to the data is
apparent above the y = x line.

Figure 6.17: A scatter diagram of uranium analyses by an instrumental counting procedure and
chemical analysis.

Generally, in such
cases, the chemical analyses are taken as being closer to the true values.
Figure 6.18: Scatter diagram of first half-core analyses for Cu (Cu_ppm) versus values for facing
half-cores obtained about 4 months later in a hot humid climate.
There is a complex relation between the paired data, including evidence that high values were overestimated in the initial data relative to the duplicates. This result could arise from a sampling bias at the time of initial sampling, but could also reflect some selective leaching of Cu during the interval between the two samplings. Ni values also show a significant bias, with the original data high relative to the later data (Figure 6.19). It is interesting to note that the original precious metal values (Pt, Pd, Au) are almost perfectly reproduced by the duplicates 4 months later. This leads one to believe that a slight preferential leaching of Cu and Ni took place during the 4-month interval.
Figure 6.19: Scatter diagram of original Ni analyses versus duplicate half-core analyses obtained about 4 months later.
Substantial disparity can arise in attempting to verify very old assay data, with, say, tens of years between original analyses and duplicate analyses. Old data generally lack the highly organized quality control procedures required today. The problem of potential alteration of old half-cores, as illustrated in the previous example, is accentuated by the time interval involved. Furthermore, the conditions of original analysis and the subsampling protocol may be impossible to reconstruct from available records and information. This is a general problem encountered in audits and due diligence work.
Consider the example of Figures 6.20 and 6.21, for which duplicate half-core analyses for Ni were obtained about 30 years after the original analyses. In this case a proportional bias is evident, with the newer analyses about 10% lower than the original data, on average. In such a case it was essential to have confidence in the newer analyses, because the subsampling and analytical protocol for the original data could not be reconstructed. Consequently, the newer data required substantial checking by independent labs and by analytical methods other than those used initially. In addition, it was essential to evaluate the quality of the core and how it might or might not have been affected during its 30 years in storage. In this particular case the environment was cold temperate, the core showed little evidence of alteration and, on investigation, the original lab was thought to be suspect.
Figure 6.20: Scatter plot of 1105 original half-core analyses versus the second half-core analyses obtained after a 30-year interval.
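A proportional bias such as the roughly 10% difference described above can be summarized by the mean ratio of paired assays; a minimal sketch with made-up numbers (not the 1105 pairs of Figure 6.20):

```python
def mean_ratio(new_assays, old_assays):
    """Mean of new/old assay ratios for positive originals; a value near
    0.9 indicates the newer analyses run about 10% below the originals."""
    ratios = [n / o for n, o in zip(new_assays, old_assays) if o > 0]
    return sum(ratios) / len(ratios)

# Hypothetical Ni assays (%): originals vs duplicates 30 years later.
old_ni = [1.00, 2.00, 0.50, 1.50]
new_ni = [0.90, 1.80, 0.45, 1.35]
print(round(mean_ratio(new_ni, old_ni), 3))  # 0.9
```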
Figure 6.21: Expanded scatter plot of the relatively low-valued data of Figure 6.20.
The data of Figure 6.20 do not indicate a particularly serious problem. However, if the relatively few high values are considered influential and removed, the remaining abundant lower values indicate an average bias of more than 10 percent, with the original assays high relative to the later assays. Clearly, this is a serious problem in what was perceived to be a low-grade deposit.
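The influence of the few high values can be checked by recomputing the bias with pairs above a chosen cutoff excluded; a sketch with invented assays (the cutoff value and data are assumptions for illustration):

```python
def relative_bias_pct(old_assays, new_assays, cutoff=None):
    """Percent bias of original assays relative to duplicates, optionally
    restricted to pairs whose original value lies below a cutoff."""
    pairs = [(o, n) for o, n in zip(old_assays, new_assays)
             if cutoff is None or o < cutoff]
    mean_old = sum(o for o, _ in pairs) / len(pairs)
    mean_new = sum(n for _, n in pairs) / len(pairs)
    return 100.0 * (mean_old - mean_new) / mean_new

# Hypothetical assays: one influential high pair masks the low-grade bias.
old_a = [0.30, 0.33, 0.36, 9.0]
new_a = [0.27, 0.30, 0.32, 9.0]
print(round(relative_bias_pct(old_a, new_a), 1))              # 1.0
print(round(relative_bias_pct(old_a, new_a, cutoff=1.0), 1))  # 11.2
```

Here the overall bias looks negligible, but the abundant low values alone show a bias above 10 percent, mirroring the situation described in the text.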