
Measurement and Instrumentation

Lecture 5
Errors during the measurement process
II

Classification of Error
Errors arising during the measurement process can be divided into two
groups, known as systematic errors and random errors.
Systematic Errors
Sources of systematic error
Errors due to environmental inputs
Wear in instrument components
Connecting leads
Reduction of systematic error
Careful instrument design, Method of opposing inputs, High-gain
feedback, Calibration, Manual correction of output reading, Intelligent
instruments
Random errors
Random errors in measurements are caused by unpredictable
variations in the measurement system.
They are usually observed as small perturbations of the
measurement either side of the correct value, i.e. positive errors
and negative errors occur in approximately equal numbers for a
series of measurements made of the same constant quantity.
Therefore, random errors can largely be eliminated by
calculating the average of a number of repeated
measurements, provided that the measured quantity remains
constant during the process of taking the repeated
measurements.

Assume the measurements


{x1, x2, x3, x4, x5, x6, x7, x8, x9}

Random errors
Statistical analysis - 1

Mean (Average)
Xmean = (x1+x2+x3+x4+x5+x6+x7+x8+x9) / 9
Median
The median is the middle value when the measurements
in the data set are written down in ascending order of
magnitude. For the above set of nine values, the fifth value is the median.

Random errors
Statistical analysis - 2

Example 1
Measurement Set A (11 measurements)
398 420 394 416 404 408 400 420 396 413 430
Xmean = 409, Xmedian = 408
Example 2
Measurement Set B (11 measurements)
409 406 402 407 405 404 407 404 407 407 408
Xmean = 406, Xmedian = 407
Which one is more reliable?

Random errors
Statistical analysis - 3

Example 3
Measurement Set C (23 measurements)
409 406 402 407 405 404 407 404 407 407 408 406 410
406 405 408 406 409 406 405 409 406 407
Xmean = 406.5, Xmedian = 406
The median value tends towards the mean value as
the number of measurements increases
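
The mean and median values quoted for the three measurement sets can be checked with a short script. This is only a sketch using Python's standard `statistics` module; the names `set_a`, `set_b`, `set_c` and `summarize` are illustrative and do not come from the lecture.

```python
from statistics import mean, median

# Measurement sets A, B and C as listed on the slides.
set_a = [398, 420, 394, 416, 404, 408, 400, 420, 396, 413, 430]
set_b = [409, 406, 402, 407, 405, 404, 407, 404, 407, 407, 408]
set_c = set_b + [406, 410, 406, 405, 408, 406, 409, 406, 405, 409, 406, 407]


def summarize(values):
    """Return (mean, median) of a list of repeated measurements."""
    return mean(values), median(values)


for name, data in (("A", set_a), ("B", set_b), ("C", set_c)):
    m, med = summarize(data)
    print(f"Set {name}: n = {len(data)}, mean = {m:.1f}, median = {med}")
```

For set C the mean and median differ by about 0.5, compared with 1 for sets A and B.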

Random errors
Statistical analysis - 4

Variance and Standard Deviation

Deviation (error) in each measurement:
$d_i = x_i - X_{mean}$

Variance:
$V = \frac{d_1^2 + d_2^2 + \cdots + d_n^2}{n - 1}$

Standard deviation:
$\sigma = \sqrt{V}$

Random errors
Statistical analysis - 5
Measurement Set A (11 measurements)
398 420 394 416 404 408 400 420 396 413 430
Xmean = 409, V = 137, σ = 11.7

Measurement Set B (11 measurements)
409 406 402 407 405 404 407 404 407 407 408
Xmean = 406, V = 4.2, σ = 2.05

Measurement Set C (23 measurements)
409 406 402 407 405 404 407 404 407 407 408 406 410
406 405 408 406 409 406 405 409 406 407
Xmean = 406.5, V = 3.53, σ = 1.88
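
The quoted values of V and the standard deviation are consistent with dividing the sum of squared deviations by (n - 1), which is what `statistics.variance` and `statistics.stdev` do. A minimal sketch to reproduce them (the helper name `spread` is an assumption made for illustration):

```python
from statistics import stdev, variance

set_a = [398, 420, 394, 416, 404, 408, 400, 420, 396, 413, 430]
set_b = [409, 406, 402, 407, 405, 404, 407, 404, 407, 407, 408]
set_c = set_b + [406, 410, 406, 405, 408, 406, 409, 406, 405, 409, 406, 407]


def spread(values):
    """Return (variance, standard deviation) using the (n - 1) divisor."""
    return variance(values), stdev(values)


for name, data in (("A", set_a), ("B", set_b), ("C", set_c)):
    v, s = spread(data)
    print(f"Set {name}: V = {v:.2f}, sigma = {s:.2f}")
```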

Random errors
Statistical analysis - 6
Note that the smaller values of V and σ for measurement set B
compared with A correspond with the respective size of the spread
in the range between maximum and minimum values for the two
sets.
Thus, as V and σ decrease for a measurement set, we are able
to express greater confidence that the calculated mean or median
value is close to the true value, i.e. that the averaging process has
reduced the random error value close to zero.
Comparing V and σ for measurement sets B and C, V and σ get
smaller as the number of measurements increases, confirming
that confidence in the mean value increases as the number of
measurements increases.

Graphical Data Analysis
Histogram
Measurement Set C (23 measurements)
409 406 402 407 405 404 407 404 407 407 408 406 410
406 405 408 406 409 406 405 409 406 407
The simplest way of analyzing measurement data graphically is to
draw a histogram, in which bands of equal width across the range
of measurement values are defined and the number of measurements
within each band is counted.
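
The band-counting step can be sketched in a few lines of Python. The band width of 2 and the lower edge of 401.5 are assumptions chosen for illustration, since the slide does not state which bands were used; the function name `histogram` is also hypothetical.

```python
set_c = [409, 406, 402, 407, 405, 404, 407, 404, 407, 407, 408, 406, 410,
         406, 405, 408, 406, 409, 406, 405, 409, 406, 407]

BAND_WIDTH = 2.0    # assumed band width, not given in the lecture
LOWER_EDGE = 401.5  # assumed lower edge of the first band


def histogram(values, lower_edge, band_width):
    """Count the measurements in consecutive equal-width bands [lo, lo + width)."""
    counts = {}
    for v in values:
        index = int((v - lower_edge) // band_width)
        lo = lower_edge + index * band_width
        counts[lo] = counts.get(lo, 0) + 1
    return dict(sorted(counts.items()))


bands = histogram(set_c, LOWER_EDGE, BAND_WIDTH)
for lo, count in bands.items():
    print(f"{lo:.1f} to {lo + BAND_WIDTH:.1f}: {'#' * count} ({count})")
```

With these assumed bands, the counts are symmetric about the central band. Bands that contain no measurement are simply omitted by this sketch.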

Graphical Data Analysis
Frequency Distribution
As the number of measurements increases, smaller bands can be
defined for the histogram, which retains its basic shape but then
consists of a larger number of smaller steps on each side of the
peak. In the limit, as the number of measurements approaches
infinity, the histogram becomes a smooth curve known as a
frequency distribution curve.
The ordinate of this curve is the frequency of occurrence of each
deviation value, f(D), and the abscissa is the magnitude of
deviation, D.


Graphical Data Analysis
Properties of the Frequency Distribution
The area under the curve is unity:

$$\int_{-\infty}^{\infty} f(D)\,dD = 1$$

The probability of observing a value less than or equal to $D_i$ is

$$F(D_i) = P(D \le D_i) = \int_{-\infty}^{D_i} f(D)\,dD$$

F is known as the cumulative distribution function (c.d.f.).
The cumulative distribution function F varies between 0 and 1.

Graphical Data Analysis
Gaussian distribution

The Gaussian distribution is defined as

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-m)^2 / 2\sigma^2}$$

where m is the mean and σ is the standard deviation.

For D = x - m, f(D) is known as the error frequency distribution:

$$f(D) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-D^2 / 2\sigma^2}$$


Graphical Data Analysis
Standard Gaussian distribution

Substituting z = (x - m)/σ, a new Gaussian distribution with
zero mean (m = 0) and unit standard deviation (σ = 1) is
obtained.
This new form is called the standard Gaussian curve:

$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$

and

$$F(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz$$

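
Values of F(z) are normally read from a standard Gaussian table, but they can also be computed. The sketch below uses the identity F(z) = (1 + erf(z/√2))/2 with `math.erf` from the Python standard library; the function names `gaussian_pdf` and `gaussian_cdf` are illustrative only.

```python
import math


def gaussian_pdf(z):
    """Standard Gaussian curve f(z) with zero mean and unit standard deviation."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


def gaussian_cdf(z):
    """Cumulative distribution F(z), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for z in (-1.0, 0.0, 1.0):
    print(f"z = {z:+.1f}: f(z) = {gaussian_pdf(z):.4f}, F(z) = {gaussian_cdf(z):.4f}")
```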

Graphical Data Analysis
Standard Gaussian Tables
[Table: standard Gaussian table, F(z) against z]


Graphical Data Analysis
Example - 1
How many measurements in a data set subject to random
errors lie outside deviation boundaries of +σ and -σ, i.e. how
many measurements have a deviation greater than |σ|?
The required number is represented by the sum of the
two shaded areas in the Figure.


Graphical Data Analysis
Example - 2
P(E < -σ or E > +σ) = P(E < -σ) + P(E > +σ)
If E = -σ, then z = -1 and hence
P(E < -σ) = F(z = -1) = 1 - F(1) = 1 - 0.8413 = 0.1587
F(1) = 0.8413 (from Gaussian Table)
Similarly, P(E > +σ) = 1 - F(1) = 1 - 0.8413 = 0.1587.
Then P(E < -σ) + P(E > +σ) = 0.1587 + 0.1587 = 0.3174 = 31.74%
i.e. 32% of the measurements lie outside the ±σ
boundaries, and so 68% of the measurements lie inside.


Graphical Data Analysis
Example - 3

Deviation     % of data points     Probability of any particular
boundaries    within boundary      data point being outside boundary
±σ            68.0                 32.0%
±1.96σ        95.0                 5.0%
±2σ           95.4                 4.6%
±3σ           99.7                 0.3%

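
The percentages for each deviation boundary follow from the standard Gaussian distribution, since the fraction of data within ±kσ is F(k) - F(-k) = erf(k/√2). A small sketch (the function name `fraction_within` is an assumption for illustration):

```python
import math


def fraction_within(k):
    """Fraction of Gaussian-distributed data lying within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2.0))


for k in (1.0, 1.96, 2.0, 3.0):
    inside = 100.0 * fraction_within(k)
    print(f"+/-{k:g} sigma: {inside:.1f}% inside, {100.0 - inside:.1f}% outside")
```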

Graphical Data Analysis
Standard error of the mean
The error between the mean of a finite data set and the true
measurement value (mean of the infinite data set) is defined
as the standard error of the mean, α. This is calculated as

$$\alpha = \frac{\sigma}{\sqrt{n}}$$

The measurement value obtained from a set of n measurements,
{x1, x2, ........., xn}, can then be expressed as

$$x = X_{mean} \pm \alpha$$


Standard error of the mean
Example
Measurement Set C (23 measurements)
409 406 402 407 405 404 407 404 407 407 408 406 410 406
405 408 406 409 406 405 409 406 407
Xmean = 406.5, V = 3.53, σ = 1.88 and the standard error of the
mean α = σ/√23 = 0.39
The length can therefore be expressed as 406.5 ± 0.4 (68%
confidence limit).
However, it is more usual to express measurements with 95%
confidence limits (±2σ boundaries).
In this case, 2σ = 3.76, 2α = 0.78 and the length can be expressed
as 406.5 ± 0.8 (95% confidence limits).

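
The standard error of the mean for set C can be reproduced as the sample standard deviation divided by the square root of the number of measurements. This is a sketch only; `standard_error` and `alpha` are illustrative names.

```python
import math
from statistics import mean, stdev

set_c = [409, 406, 402, 407, 405, 404, 407, 404, 407, 407, 408, 406, 410,
         406, 405, 408, 406, 409, 406, 405, 409, 406, 407]


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))


x_mean = mean(set_c)
alpha = standard_error(set_c)
print(f"68% confidence limits: {x_mean:.1f} +/- {alpha:.1f}")
print(f"95% confidence limits: {x_mean:.1f} +/- {2 * alpha:.1f}")
```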

Standard error of the mean
Estimation of random error in a single measurement
For a 95% confidence interval, the maximum likely deviation in
a single measurement is ±1.96σ. However, this only expresses
the maximum likely deviation of the measurement from the
calculated mean of the reference measurement set, which is
not the true value as observed earlier. Thus the calculated
value for the standard error of the mean has to be added to
the likely maximum deviation value. Thus, the maximum likely
error in a single measurement can be expressed as:
Error = ±(1.96σ + α)
where σ is the standard deviation and α is the standard error of
the mean of the reference measurement set.

Estimation of random error in a single measurement
Example
Suppose that a standard mass is measured 30 times with the same
instrument to create a reference data set, and the calculated values
of the standard deviation σ and the standard error of the mean α
are σ = 0.43 and α = 0.08. If the instrument is then used to
measure an unknown mass and the reading is 105.6 kg, how should
the mass value be expressed with a 95% confidence interval?

Solution
Error = ±(1.96σ + α) = ±(1.96 × 0.43 + 0.08) = ±0.92
The mass value should therefore be expressed as:
105.6 ± 0.9 kg.

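
The arithmetic of this example can be written as a one-line function. A minimal sketch, assuming the 95% multiplier of 1.96 used in the lecture; the function name `max_likely_error` is hypothetical.

```python
def max_likely_error(sigma, alpha, z=1.96):
    """Maximum likely error in a single measurement: z * sigma + alpha."""
    return z * sigma + alpha


# Values from the standard-mass example: sigma = 0.43, alpha = 0.08.
error = max_likely_error(sigma=0.43, alpha=0.08)
reading = 105.6
print(f"Mass = {reading:.1f} +/- {error:.1f} kg (95% confidence)")
```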

Distribution of manufacturing tolerances
Example

An integrated circuit chip contains 10^5 transistors. The transistors have a
mean current gain of 20 and a standard deviation of 2. Calculate the
following:
(a) the number of transistors with a current gain between 19.8 and 20.2
(b) the number of transistors with a current gain greater than 17.

Solution
(a) X = 19.8 => z = -0.1 and X = 20.2 => z = 0.1
P(19.8 < X < 20.2) = P(-0.1 < z < 0.1) = 0.0796 (0.0796 x 10^5 = 7960
transistors)
(b) X = 17 => z = -1.5 => P(X > 17) = P(z > -1.5) = 0.9332 (93 320
transistors)

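
The two transistor counts can be cross-checked without a table by evaluating F(z) through the error function. The sketch below assumes the gains are exactly Gaussian; its results differ slightly from the slide values because the slide uses table entries rounded to four decimal places. All names are illustrative.

```python
import math

N_TRANSISTORS = 10**5
MEAN_GAIN = 20.0
SIGMA_GAIN = 2.0


def gaussian_cdf(z):
    """Standard Gaussian cumulative distribution F(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def count_between(lo, hi):
    """Expected number of transistors with lo < gain < hi."""
    z_lo = (lo - MEAN_GAIN) / SIGMA_GAIN
    z_hi = (hi - MEAN_GAIN) / SIGMA_GAIN
    return N_TRANSISTORS * (gaussian_cdf(z_hi) - gaussian_cdf(z_lo))


def count_above(x):
    """Expected number of transistors with gain greater than x."""
    z = (x - MEAN_GAIN) / SIGMA_GAIN
    return N_TRANSISTORS * (1.0 - gaussian_cdf(z))


print(f"(a) between 19.8 and 20.2: about {count_between(19.8, 20.2):.0f}")
print(f"(b) greater than 17: about {count_above(17.0):.0f}")
```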

Graphical Data Analysis
Chi-Squared Distribution

The standard deviation of the distribution of the mean
values was quantified as the standard error of the mean.
It is also useful for many purposes to look at the distribution
of the variance of successive sets of samples of N
measurements that form part of a Gaussian distribution.
This is expressed as the chi-squared distribution F(χ²),
where χ² is given by:

$$\chi^2 = \frac{k\,\sigma_x^2}{\sigma^2}$$

where $\sigma_x^2$ is the variance of a sample of N measurements
and $\sigma^2$ is the variance of the infinite data set that sets of N
samples are part of. k is a constant known as the number
of degrees of freedom and is equal to (N-1).

Graphical Data
Analysis
Chi-Squared Distribution
The χ² distribution expresses the expected variation
due to random chance of the variance of a sample away
from the variance of the infinite population that the
sample is part of. The magnitude of this expected
variation depends on what level of random chance we
set. The level of random chance is normally expressed
as a level of significance, which is usually denoted by
the symbol α.



Graphical Data Analysis
Chi-Squared Distribution
Meaning of Symbol (α) For Single Measurement


Graphical data Analysis


Goodness of fit to a Gaussian Distribution
The degree to which a set of data fits a Gaussian
distribution should always be tested before any analysis
is carried out. This test can be carried out in one of three
ways:
(A) Inspecting the shape of histogram
(B) Using a Normal Probability Plot
(C) The χ² Test


Graphical data Analysis


Goodness of fit to a Gaussian Distribution
(A)Inspecting the shape of histogram
The simplest way to test for Gaussian distribution of
data is to plot a histogram and look for a bell shape.
Deciding whether the histogram confirms a Gaussian
distribution is a matter of judgment. For a Gaussian
distribution, there must always be approximate
symmetry about the line through the center of the
histogram, the highest point of the histogram must
always coincide with this line of symmetry, and the
histogram must get progressively smaller either side of
this point.


Graphical data Analysis


Goodness of fit to a Gaussian Distribution
(B) Using a Normal Probability Plot
A normal probability plot involves dividing data values
into a number of ranges and plotting the cumulative
probability of summed data frequencies against data
values on graph paper. This line should be a straight line
if the data distribution is Gaussian.
However, careful judgment is required, as only a finite
number of data values can be used and therefore the
line drawn will not be entirely straight even if the
distribution is Gaussian.


Graphical data Analysis


Goodness of fit to a Gaussian Distribution
(C) The χ² Test
The χ² distribution provides a more formal method for
testing whether data follow a Gaussian distribution. The
principle of the χ² test is to divide data into p equal
width bins and to count the number of measurements ni
in each bin, using exactly the same procedure as done
to draw a histogram. The expected number of
measurements n'i in each bin for a Gaussian distribution
is also calculated.
Before proceeding any further, a check must be made
at this stage to confirm that at least 80% of the bins
have a data count greater than a minimum number for
both ni and n'i.

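
The first steps of the χ² test described above (observed counts, expected Gaussian counts, and the 80% check) can be sketched as follows. The bin edges and the minimum count of 4 are assumptions made purely for illustration, since the slide does not specify either value; all function names are hypothetical, and the expected counts use the sample mean and sample standard deviation as estimates of the Gaussian parameters.

```python
import math
from statistics import mean, stdev

set_c = [409, 406, 402, 407, 405, 404, 407, 404, 407, 407, 408, 406, 410,
         406, 405, 408, 406, 409, 406, 405, 409, 406, 407]


def gaussian_cdf(z):
    """Standard Gaussian cumulative distribution F(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def observed_counts(values, edges):
    """Number of measurements n_i in each bin [edges[i], edges[i + 1])."""
    return [sum(lo <= v < hi for v in values) for lo, hi in zip(edges, edges[1:])]


def expected_counts(values, edges):
    """Expected number n'_i in each bin for a Gaussian fitted to the sample."""
    m, s, n = mean(values), stdev(values), len(values)
    return [n * (gaussian_cdf((hi - m) / s) - gaussian_cdf((lo - m) / s))
            for lo, hi in zip(edges, edges[1:])]


def enough_data(observed, expected, min_count, required_fraction=0.8):
    """Check that enough bins exceed min_count for both n_i and n'_i."""
    ok = sum(1 for o, e in zip(observed, expected) if o > min_count and e > min_count)
    return ok / len(observed) >= required_fraction


edges = [401.5 + 2.0 * i for i in range(6)]  # assumed equal-width bins
n_obs = observed_counts(set_c, edges)
n_exp = expected_counts(set_c, edges)
print("observed:", n_obs)
print("expected:", [round(e, 1) for e in n_exp])
print("check passed:", enough_data(n_obs, n_exp, min_count=4))  # 4 is an assumed minimum
```

With these assumed bins and minimum count, only three of the five bins qualify, so the check is not passed for this small data set.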

Graphical Data
Analysis
Student t Distribution
When the number of measurements of a quantity is
particularly small and statistical analysis of the
distribution of error values is required, the possible
deviation of the mean of measurements from the true
measurement value (the mean of the infinite population
that the sample is part of) may be significantly greater
than is suggested by analysis based on a z distribution.
In response to this, a statistician called William Gosset
developed an alternative distribution function that gives
a more accurate prediction of the error distribution when
the number of samples is small. He published this under
the pseudonym Student and the distribution is
commonly called the Student t distribution.

Aggregation of Measurement System Errors


Errors in measurement systems often arise from two or
more different sources, and these must be aggregated
in the correct way in order to obtain a prediction of the
total likely error in output readings from the
measurement system.
Two different forms of aggregation are required:
(1) A single measurement component may have both
systematic and random errors
(2) A measurement system may consist of several
measurement components that each have separate
errors.
