
Statistics of Measurements and Reliability

Kristiaan Schreve
Stellenbosch University
kschreve@sun.ac.za

January 26, 2015


Overview

1  Introduction
2  Some Important Concepts
3  Excel Demonstration
4  Graphing Data
     Choosing the right type of graph
     Guidelines for creating good scientific graphs
5  Calculating Averages with Excel
6  Standard Deviation and Variance
7  Z Scores
8  Higher Order Distribution Descriptors
9  Frequency and Histograms
10 Box-and-whisker Plots
11 The Normal Distribution
12 Confidence Limits
     Sampling distributions
     Central limit theorem
     Limits of confidence
     t-distribution
     Normal distribution and t-distribution confidence limits compared
13 One-sample Hypothesis Testing
     Some revision
     Hypothesis testing
     Summary of one-sample hypothesis tests
14 Two-sample Hypothesis Testing
     Hypotheses for two-sample means testing
     Hypotheses for two-sample variance testing
     Summary of two-sample hypothesis tests
15 Analysis of Variance - Part One
     Introduction to ANOVA
     Single factor ANOVA
     After the F-test
16 Regression
     Linear regression
     Testing hypotheses about regression
     Excel's R-squared
     Excel functions for regression
     Multiple regression
     Guidelines
17 Correlation
     Pearson's correlation coefficient
     Correlation and regression
     Testing hypotheses about correlation
18 Uncertainty of Measurement
     Evaluation of standard uncertainty
     Type A evaluation of standard uncertainty
     Type B evaluation of standard uncertainty
     Law of propagation of uncertainty for uncorrelated quantities
     Law of propagation of uncertainty for correlated quantities
     Determining expanded uncertainty
     Reporting uncertainty
     Example
19 Selecting the Right Method

Some Important Concepts I

[7]: pp. 10-17

Samples and Populations


Some Important Concepts II

Probability

Pr(event) = \frac{\text{Number of ways the event can occur}}{\text{Total number of possible events}}

Conditional Probability

Pr(event|condition)

Some Important Concepts III

Hypothesis
A statement of what you are trying to prove.
What is the probability of obtaining the data, given that this hypothesis is correct?
Can only be rejected.

Null hypothesis
H0

Alternate hypothesis
H1

Some Important Concepts IV

Type I error
Rejecting H0 when you should not.

Type II error
Not rejecting H0 when you should.

Excel Demonstration I

[7]: pp. 37-55

Accessing statistical functions (p. 37)
Array functions (p. 38)
  Just remember to press Ctrl+Shift+Enter to complete the function
Naming cells or arrays (p. 42)
Data analysis tools (p. 51)

Graphing Data I

[7]: pp. 65-96

Choosing the right type of graph

Column graphs
  E.g. show percentage change over time for nominal values
  Discrete data: open space between columns
  Continuous data: no space between columns

Graphing Data II
Choosing the right type of graph

Avoid 3D. (In the example figure it works, to show a zero value.)

Graphing Data III
Choosing the right type of graph

Pie graph
  E.g. show percentages that make up one total
  Avoid 3D effects; they can distort the ability to distinguish between the sizes of the slices
  As few slices as possible

Graphing Data V
Choosing the right type of graph

Line graph
  E.g. show trends, or relationships between parameters

Figure: Global Temperature

Graphing Data VII
Choosing the right type of graph

Bar graph
  E.g. make a point about reaching a goal
  Good if the labels on the horizontal axis take too much space
  Arrange in ascending/descending order whenever appropriate

Graphing Data IX
Choosing the right type of graph

Linear regression
  E.g. show relationship between parameters
  Use with great care!

Figure: Regression example

Graphing Data I

Not in textbook

Guidelines for creating good scientific graphs

Avoid colour graphs
  Black & white printers
  Colour blindness: up to 10% of the male population suffers from red-green colour blindness (www.colour-blindness.com)
  Using colour in presentations is OK.
Don't wear out the viewer's eyes
  Pie graphs: avoid too many slices
  Line graphs: avoid too many series/lines
Avoid unnecessary junk - it distracts from the main message (grid lines, 3D effects, etc.)
Include all information (axis labels, units, appropriate legends)
Excel's smooth scatter plots are almost always a bad idea
Independent variable on the horizontal axis
Dependent variable on the vertical axis

Graphing Data II
Guidelines for creating good scientific graphs

Use regression with great care
  The order of the regression must be appropriate for the number of data points and the trend in the data, e.g. don't fit a quadratic polynomial to only 3 data points.
  In general, don't extrapolate beyond the data range.
  Give an indication of the goodness of fit, see Figure 3.
  Give the confidence limits, see Figure 18.
  Check that the regression curve gives a valid prediction, e.g. a curve fitted to data that predicts temperature in Kelvin cannot give negative values.
  Samples that are too large can be bad (see pp. 417 in textbook)
When plotting experimental data, use markers, with no lines between them, see Figure 17.

Graphing Data III
Guidelines for creating good scientific graphs

Whenever appropriate, include variability in your graphs (error bars...). Also indicate what the error bars mean (95% confidence, min/max range, standard deviation, etc.), see Figure 17.
Graphing a categorical (discrete) variable as though it is a quantitative variable is just wrong (see Fig 19-1 in the textbook).
Choose the range of the variables appropriately, see Figure 5.
When the dependent and independent variable have the same unit, make sure that the axes have the same scale, see Figure 3.

Graphing Data IV
Guidelines for creating good scientific graphs

Figure: An example of how NOT to plot categorical data.

Graphing Data V
Guidelines for creating good scientific graphs

Figure: Use appropriate vertical range [4]

Graphing Data VI
Guidelines for creating good scientific graphs

Table: Data set A, Running Times. [3]

Name      Time [s]
Thomas    19
Anthony   26
Emma      18
Jaspal    19.6
Lisa      21
Meena     22
Navtej    27
Nicola    23
Sandeep   17
Tanya     23

Graphing Data VII
Guidelines for creating good scientific graphs

Figure: Charts based on data in Table 1 [3]

Graphing Data VIII
Guidelines for creating good scientific graphs

Horizontal bars are useful for a large number of bars
  Also useful if there is too much text for the horizontal axis
  The rank of each athlete is clearly visible on the bottom graph
  None of the graphs shows the distribution of the data

Graphing Data IX
Guidelines for creating good scientific graphs

Figure: Pie chart based on data in Table 1 [3]

Pie graphs are generally OK for showing discrete data
  Must show parts of a whole - not the case here!

Graphing Data X
Guidelines for creating good scientific graphs

Figure: Histogram showing distribution of data in Table 1 [3]

Histograms show continuous data - no spaces between the bars.

Graphing Data XI
Guidelines for creating good scientific graphs

Figure: Correct histogram showing distribution of data in Table 1 [3]

Graphing Data XII
Guidelines for creating good scientific graphs

Table: Data set B: Wind in January [3]

Wind type      Days
Strong wind    10
Calm           5
Gale           7
Light breeze   9
Total          31

Graphing Data XIII
Guidelines for creating good scientific graphs

Figure: Bar chart based on data in Table 2 [3]

Discrete data should have spaces between columns
  The sequence of wind categories is not helpful

Graphing Data XIV
Guidelines for creating good scientific graphs

Figure: Bar chart based on the data in Table 2 [3]

Meaningless to compare "Total" to the wind categories. It looks like another category.

Graphing Data XV
Guidelines for creating good scientific graphs

Figure: Bar chart based on the data in Table 2 [3]

Note the discontinuity at the start of the Y-axis. This distorts the effect of the columns.

Graphing Data XVI
Guidelines for creating good scientific graphs

Figure: Correct bar chart based on the data in Table 2 [3]

Graphing Data XVII
Guidelines for creating good scientific graphs

Figure: This is how you show a discontinuity in an axis.

Graphing Data XVIII
Guidelines for creating good scientific graphs

Figure: Pie chart based on the data in Table 2 [3]

The data in Table 2 is ideal for pie charts.
  Including the "Total" makes no sense, since the pie graph represents the components of the total.

Graphing Data XIX
Guidelines for creating good scientific graphs

Figure: Correct pie chart based on the data in Table 2 [3]

Graphing Data XX
Guidelines for creating good scientific graphs

Figure: Graphing experimental data. Error bars show the measurement error range.

Graphing Data XXI
Guidelines for creating good scientific graphs

Figure: Graphing regression curves

Graphing Data XXII-XXIV
Guidelines for creating good scientific graphs

Figures: Three examples of bad graphs

Calculating averages with Excel I

[7]: pp. 97-112

Mean (Excel: AVERAGE, AVERAGEA, AVERAGEIF, AVERAGEIFS, TRIMMEAN)
  (We don't do the geometric mean or harmonic mean on pp. 106-107)
Median (Excel: MEDIAN)
Mode (Excel: MODE.MULT, MODE.SNGL)

Standard Deviation and Variance I

[7]: pp. 113-123

Population variance

\sigma^2 = \frac{\sum (X - \mu)^2}{N}

Excel functions: VAR.P and VARPA

Standard Deviation and Variance II

Sample variance

s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}

Excel functions: VAR.S and VARA

Why divide by (N - 1)? Calculating the average of the sample, \bar{X}, effectively takes away one degree of freedom.

Standard Deviation and Variance III

Standard deviation of a population

\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}

Excel functions: STDEV.P and STDEVPA
NOTE: the standard deviation has the same unit as the original measurements

Standard Deviation and Variance IV

Standard deviation of a sample

s = \sqrt{s^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}

Excel functions: STDEV.S and STDEVA

NOTE: whenever presenting a mean, always provide a standard deviation as well
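As a quick cross-check of these definitions, here is a minimal Python sketch. The data are the running times from Table 1; numpy's ddof argument switches between the population and sample formulas, mirroring Excel's VAR.P/VAR.S and STDEV.P/STDEV.S:

```python
import numpy as np

# Running times from Table 1 (Data set A)
x = np.array([19, 26, 18, 19.6, 21, 22, 27, 23, 17, 23])

pop_var = x.var()          # divides by N, like Excel's VAR.P
samp_var = x.var(ddof=1)   # divides by N - 1, like VAR.S
pop_sd = x.std()           # like STDEV.P
samp_sd = x.std(ddof=1)    # like STDEV.S; same unit as the data (seconds)
```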

Z Scores I

[7]: pp. 131-145

How do you compare scores in one year to another year for, say, Mechatronics 424?
Z scores take the mean as a zero point and the standard deviation as a unit of measure. Therefore, for a sample

z = \frac{X - \bar{X}}{s}

and for a population

z = \frac{X - \mu}{\sigma}
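A minimal sketch of the sample version, using hypothetical class marks (this is what Excel's STANDARDIZE does per value):

```python
import numpy as np

scores = np.array([55, 60, 72, 48, 80])             # hypothetical class marks
z = (scores - scores.mean()) / scores.std(ddof=1)   # sample z scores
print(z)                                            # mean 0, standard deviation 1
```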

Z Scores II

IQ scores are typically transformed Z scores

IQ = 16z + 100

The implication of this formula: the mean IQ score is 100, and the standard deviation of IQ scores is 16.

Z Scores III

Excel functions related to Z scores
  STANDARDIZE
  PERCENTILE.EXC, PERCENTILE.INC
  PERCENTRANK.EXC, PERCENTRANK.INC
  QUARTILE.EXC, QUARTILE.INC

Higher Order Distribution Descriptors I

[7]: pp. 152-156

Descriptors
  Variance: Describes the spread in the data.
  Skewness: Describes how symmetrically the data is distributed.
  Kurtosis: Describes whether or not there is a peak in the distribution close to the mean.

Higher Order Distribution Descriptors II

Skewness
Excel function: SKEW

skewness = \frac{\sum (X - \bar{X})^3}{(N - 1) s^3}

Higher Order Distribution Descriptors III

Kurtosis
Excel function: KURT

kurtosis = \frac{\sum (X - \bar{X})^4}{(N - 1) s^4} - 3

Frequency and Histograms I

[7]: pp. 156-160

Frequency: Excel function: FREQUENCY - Remember: it is an array function.
Histogram: Use the Data Analysis Tool

Frequency and Histograms II

Histogram: Shows the number of items in a certain category.
Frequency distribution: Shows the percentage of the total in a certain category, i.e. the histogram number for the category is divided by the total number of samples in the histogram.
Histograms and frequency distributions are good for studying central tendencies, i.e. the tendency of all values in a sample of random variables to be scattered around a certain value.
The following is a guideline for the number of intervals K (from [2])

K = 1.87 (N - 1)^{0.4} + 1

As N, the number of measurements, becomes large, choose K \approx \sqrt{N} [2]
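A minimal sketch of the interval guideline applied to a histogram (the data are hypothetical; np.histogram plays the role of Excel's FREQUENCY):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(50, 10, size=200)         # hypothetical measurements

N = len(x)
K = round(1.87 * (N - 1) ** 0.4 + 1)     # guideline from [2]; about 17 for N = 200

counts, edges = np.histogram(x, bins=K)  # frequencies per interval
freq_dist = counts / N                   # frequency distribution (fractions of total)
```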

Box-and-whisker Plots I

Not in textbook

Figure: Box-and-whisker plot generated with Python

Box-and-whisker Plots II

Gives an indication of the distribution of the data
Compare with a histogram
Useful to compare different distributions
Matlab and Python both have useful tools to create these plots. It is more difficult with Excel.
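A minimal matplotlib sketch along those lines, with three hypothetical samples of increasing spread:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
samples = [rng.normal(0, s, size=100) for s in (1, 2, 3)]  # hypothetical distributions

plt.boxplot(samples, labels=["s = 1", "s = 2", "s = 3"])   # one box per sample
plt.ylabel("Value")
plt.show()
```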

Box-and-whisker Plots III

Figure: Box-and-whisker plot generated with Python

Box-and-whisker Plots IV

Example (Showing results of robot movement - Table)

Box-and-whisker Plots V

Example (Showing results of robot movement - Box-and-whisker plot)

The Normal Distribution I

[7]: pp. 173-183

f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}

f(x)  Probability density
σ  Standard deviation
μ  Mean

The Normal Distribution II

Properties of the normal curve [8], pp. 141
  The point where the curve reaches its maximum is at x = μ
  The curve is symmetric about a vertical line through x = μ
  Points of inflection at x = μ ± σ. It is concave downward if μ − σ < x < μ + σ, concave upward otherwise.
  Approaches the horizontal axis asymptotically in both directions away from x = μ
  The total area under the curve above the horizontal axis is 1.
Other names for the normal curve
  Gaussian curve
  Bell curve

The Normal Distribution III

Standard Normal Distribution
  μ = 0
  σ = 1
  If Z scores are normally distributed, they will fit the standard normal distribution.
  Normal distribution of IQ scores

The Normal Distribution IV

Cumulative Normal Distribution
Gives the cumulative area under the normal distribution.

F(x) = \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{(t - \mu)^2}{2\sigma^2}} \, dt

Figure: Cumulative Normal Distribution

The Normal Distribution V

The vertical axis gives the area under the normal distribution to the left of x.
  Asymptotically approaches 1.
Areas under the normal distribution are used to calculate probabilities as follows:
Probability of an event between two values:

P(x_1 < x < x_2) = \frac{1}{\sigma \sqrt{2\pi}} \int_{x_1}^{x_2} e^{-\frac{(t - \mu)^2}{2\sigma^2}} \, dt

The Normal Distribution VI

Figure: Probability of event between x_1 and x_2

The grey area is the probability of an event, x, between x_1 and x_2, i.e. P(x_1 < x < x_2)
F(x_1) is the probability of an event, x, less than x_1, i.e. P(x < x_1). This is found from the cumulative distribution function.
Therefore P(x_1 < x < x_2) = F(x_2) - F(x_1)

The Normal Distribution VII

In Excel: P(x_1 < x < x_2) = NORM.DIST(x_2,mean,standard deviation,TRUE) - NORM.DIST(x_1,mean,standard deviation,TRUE)
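The same calculations in Python, as a minimal sketch (scipy's norm.cdf corresponds to NORM.DIST with TRUE; the mean and standard deviation are the IQ parameters used earlier):

```python
from scipy.stats import norm

mu, sigma = 100, 16                                 # IQ-style normal distribution

p_between = norm.cdf(120, mu, sigma) - norm.cdf(90, mu, sigma)  # P(90 < x < 120)
p_less = norm.cdf(105, mu, sigma)                   # P(x < 105)
p_more = 1 - norm.cdf(140, mu, sigma)               # P(x > 140), the MENSA question
iq_90th = norm.ppf(0.90, mu, sigma)                 # inverse lookup, like NORM.INV
```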

The Normal Distribution VIII

Probability of an event less than a value:

P(x < x_1) = \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{x_1} e^{-\frac{(t - \mu)^2}{2\sigma^2}} \, dt

Figure: Probability of event less than x_1

The Normal Distribution IX

The grey area is the probability of an event, x, less than x_1, i.e. P(x < x_1) = F(x_1)
In Excel: P(x < x_1) = NORM.DIST(x_1,mean,standard deviation,TRUE)

The Normal Distribution X

Probability of an event more than a value:

P(x > x_1) = \frac{1}{\sigma \sqrt{2\pi}} \int_{x_1}^{\infty} e^{-\frac{(t - \mu)^2}{2\sigma^2}} \, dt

Figure: Probability of event more than x_1

The Normal Distribution XI

The grey area is the probability of an event, x, more than x_1, i.e. P(x > x_1) = 1 - F(x_1)
Note: the cumulative distribution gives the area to the left of x_1. Since we are interested in the area to the right, we must subtract F(x_1) from 1.
In Excel: P(x > x_1) = 1 - NORM.DIST(x_1,mean,standard deviation,TRUE)

The Normal Distribution XII

Excel functions
  NORM.DIST, NORM.S.DIST
  NORM.INV, NORM.S.INV
  Use NORM.DIST(x,mean,standard deviation,TRUE) for the cumulative distribution function
  Use NORM.DIST(x,mean,standard deviation,FALSE) for the probability density function

The Normal Distribution XIII

Example (Interpreting the normal curve [3], pp. 291)

The example refers to the distribution of normal IQ scores.
  What proportion of the population measures an IQ less than 105?
  90% of the population will have an IQ below what value?
  The top 1% of the population will have an IQ above what value?
  What range of IQs defines the 95% interval?
  Someone with a measured IQ in excess of 140 is considered eligible for MENSA. What is the probability that a randomly chosen person falls in this category?

Confidence Limits I

[7]: pp. 187-189

Sampling distributions

A sampling distribution is the distribution of all possible values of a statistic for a given sample size.
  Remember, the statistic can be anything, e.g. the mean or the standard deviation.
  We are talking about a statistic because we are talking about samples, not populations, which would have parameters.
  In other words, if we repeatedly take samples from the same population, we would get a slightly different statistic, say the mean, each time. The sampling distribution is the description of all the possible values that the statistic can have.
The sampling distribution therefore has its own mean and standard deviation.
  The mean of the sampling distribution of the mean is μ_x̄.
  The standard deviation of the sampling distribution is called the standard error.
  The standard error is denoted as σ_x̄.

Confidence Limits I

[7]: pp. 189-195

Central limit theorem

Theorem (Central limit theorem)
If \bar{X} is the mean of a random sample of size n taken from a population with mean μ and finite variance σ², then the limiting form of the distribution of

Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}

as n \to \infty, is the standard normal distribution with μ = 0 and σ = 1. [8]

Confidence Limits II
Central limit theorem

Implications of the central limit theorem.
  The sampling distribution of the mean is approximately a normal distribution if the sample size is large enough (i.e. 30 or more samples).
  The mean of the sampling distribution of the mean is the same as the population mean, μ = μ_x̄.
  The standard error (or standard deviation of the sampling distribution of the mean) is equal to the population standard deviation divided by the square root of the sample size, σ_x̄ = σ/√N.
  The population does not have to be a normal distribution.
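A minimal simulation sketch of these implications, using a hypothetical and decidedly non-normal population:

```python
import numpy as np

rng = np.random.default_rng(0)

# Exponential population: skewed, non-normal, mean = 2.0 and sigma = 2.0
# Draw 10 000 samples of size n = 30 and keep each sample's mean
means = rng.exponential(scale=2.0, size=(10_000, 30)).mean(axis=1)

print(means.mean())                     # close to the population mean, 2.0
print(means.std(), 2.0 / np.sqrt(30))   # standard error close to sigma / sqrt(n)
# A histogram of `means` is approximately normal, as the theorem promises.
```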

Confidence Limits I

[7]: pp. 195-199

Limits of confidence

Theorem (Confidence interval of μ; σ known)
If x̄ is the mean of a random sample of size n from a population with known variance σ², a (1 − α)100% confidence interval for μ is given by

\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}

where z_{α/2} is the z value leaving an area of α/2 to the right. [8]

Note: for non-normal populations, n > 30 still gives good results thanks to the central limit theorem.
Work through the example on pp. 195-198.
Excel functions: CONFIDENCE.NORM, CONFIDENCE.T
Note: only use CONFIDENCE.NORM when n > 30 or when the population is normally distributed.
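A minimal sketch of the z-based interval (the sample numbers are assumed for illustration; norm.ppf(1 - alpha/2) supplies z_{α/2}, so the half-width matches CONFIDENCE.NORM):

```python
import numpy as np
from scipy.stats import norm

xbar, sigma, n = 102.5, 16.0, 36   # assumed sample mean, known sigma, sample size
alpha = 0.05                       # for a 95% confidence interval

half_width = norm.ppf(1 - alpha / 2) * sigma / np.sqrt(n)
print(xbar - half_width, xbar + half_width)   # confidence limits for mu
```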

Confidence Limits I

[7]: pp. 199-201

t-distribution

What if the sample size is < 30 or the distribution is not normal? The t-distribution works better.

t = \frac{\bar{x} - \mu}{s / \sqrt{n}}

Confidence Limits II
t-distribution

The shape of the distribution depends on the degrees of freedom, or df.

Figure: From [6]

Confidence Limits III
t-distribution

Theorem (Confidence interval for μ; σ unknown)
If x̄ and s are the mean and standard deviation of a random sample from a normal population with unknown variance σ², a (1 − α)100% confidence interval for μ is given by

\bar{x} - t_{\alpha/2} \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2} \frac{s}{\sqrt{n}}

where t_{α/2} is the t value with n − 1 degrees of freedom, leaving an area of α/2 to the right. [8]

Excel functions:
  T.INV, T.INV.2T
  T.DIST, T.DIST.2T, T.DIST.RT
Repeat the example on pp. 195-198, but use t-scores.
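A minimal sketch of the t-based interval, with a hypothetical small sample (stats.sem computes s/√n):

```python
import numpy as np
from scipy import stats

x = np.array([4.9, 5.1, 5.0, 4.8, 5.2, 5.0])    # hypothetical measurements, n = 6

lo, hi = stats.t.interval(0.95, df=len(x) - 1,
                          loc=x.mean(),
                          scale=stats.sem(x))   # sem = s / sqrt(n)
print(lo, hi)                                   # 95% confidence limits for mu
```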

Confidence Limits I

Not in textbook

Normal distribution and t-distribution confidence limits compared

Figure: Comparison of 90% confidence limits for the normal and t-distributions

Confidence Limits II
Normal distribution and t-distribution confidence limits compared

Note: the range for μ from the t-distribution is much larger than for the normal distribution.

One-sample Hypothesis Testing I

[7]: pp. 203-204

Some revision

Hypothesis  Essentially a guess about the way the world works.
Null hypothesis H0  The data won't show anything new or interesting. Any deviation from the norm is strictly due to chance.
Alternative hypothesis H1  Explains the world differently.
H0  Can only reject or not reject. Can never accept a hypothesis.
Type I error  Incorrectly rejecting H0.
Type II error  Not rejecting H0 when it should have been rejected.

Hypothesis testing is about setting criteria for rejecting H0. This sets the probability of making a Type I error. The probability is called α.

One-sample Hypothesis Testing I

[7]: pp. 205-209

Hypothesis testing

Figure: From [6]

One-sample Hypothesis Testing II
Hypothesis testing

α and β are areas that show the probabilities of making decision errors.
α is typically 0.05. This corresponds to a 5% chance of making a Type I error. It also represents the likelihood that the sample mean x̄ is in that shaded region.
β represents the likelihood that x̄ is in the H1 distribution.
β is never set beforehand. It depends on the distributions and where α is set.

One-sample Hypothesis Testing III
Hypothesis testing

Example on pp. 207-209

One-sample Hypothesis Testing IV
Hypothesis testing

Guidelines for writing the hypotheses [8], pp. 299
  For a simple direction such as more than, less than, superior to, inferior to, etc., state H1 as an appropriate inequality (< or >). H0 will be stated with the = sign.
  If the claim suggests an equality and direction such as at least, equal to or greater, at most, no more than, etc., then state H0 using (≤ or ≥). State H1 with the opposite inequality (< or >) sign.
  If no direction is claimed (two-tailed tests), state H1 with ≠ and H0 with =.

One-sample Hypothesis Testing V
Hypothesis testing

One-sided (or one-tailed) tests are stated as

H0: μ = μ0 (or μ ≤ μ0)
H1: μ > μ0

or

H0: μ = μ0 (or μ ≥ μ0)
H1: μ < μ0

Two-sided (or two-tailed) tests are stated as

H0: μ = μ0
H1: μ ≠ μ0

One-sample Hypothesis Testing VI
Hypothesis testing

Reject H0, with the variance known, if x̄ > b or x̄ < a, where

a = \mu_0 - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}
b = \mu_0 + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}

Figure: From [8]

One-sample Hypothesis Testing VII
Hypothesis testing

The above is for a two-tailed test. A similar test can be formulated for a one-tailed hypothesis.

One-sample Hypothesis Testing VIII
Hypothesis testing

Tests on a single mean (variance unknown)

Reject H0 at significance level α for

t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}

when

t > t_{\alpha/2, n-1} or t < -t_{\alpha/2, n-1}

Excel function: T.DIST
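A minimal sketch of this test (the data and μ0 are hypothetical; scipy returns the two-tailed p-value directly):

```python
import numpy as np
from scipy import stats

x = np.array([5.1, 4.8, 5.3, 5.0, 4.7, 5.2, 5.4, 4.9])  # hypothetical sample
mu0 = 5.0                                                # hypothesised mean

t_stat, p_two_tailed = stats.ttest_1samp(x, popmean=mu0)
reject = p_two_tailed < 0.05                             # two-tailed test at alpha = 0.05
```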

One-sample Hypothesis Testing IX
Hypothesis testing

Hypotheses involving variances

What if the hypothesis uses a variance rather than a mean?

H0: σ² = σ0² (or σ² ≤ σ0²)
H1: σ² > σ0²

or

H0: σ² = σ0² (or σ² ≥ σ0²)
H1: σ² < σ0²

One-sample Hypothesis Testing X
Hypothesis testing

Two-sided (or two-tailed) tests are stated as

H0: σ² = σ0²
H1: σ² ≠ σ0²

One-sample Hypothesis Testing XI
Hypothesis testing

Hypotheses involving variances

The chi-square distribution is used in the hypothesis test.
Like the t-distribution, it also involves the degrees of freedom in the sample (df = n − 1).

\chi^2 = \frac{(N - 1) s^2}{\sigma_0^2}

One-sample Hypothesis Testing XIII
Hypothesis testing

H0 is rejected at significance level α under the following conditions
  One-tailed hypothesis
    For H1: σ² < σ0², reject if χ² < χ²_{1−α}
    For H1: σ² > σ0², reject if χ² > χ²_α
  Two-tailed hypothesis
    Reject if χ² < χ²_{1−α/2} or χ² > χ²_{α/2}
Excel functions
  CHISQ.DIST, CHISQ.DIST.RT
  CHISQ.INV, CHISQ.INV.RT
  CHISQ.TEST
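A minimal sketch of the one-tailed version H1: σ² > σ0² (the data and σ0² are hypothetical; chi2.ppf(1 - alpha, df) is the critical value with area α to the right, like CHISQ.INV.RT):

```python
import numpy as np
from scipy.stats import chi2

x = np.array([2.1, 1.9, 2.4, 2.0, 2.3, 1.8, 2.2])  # hypothetical sample
sigma0_sq = 0.02                                    # hypothesised population variance
alpha = 0.05

n = len(x)
chi_sq = (n - 1) * x.var(ddof=1) / sigma0_sq        # test statistic
critical = chi2.ppf(1 - alpha, df=n - 1)            # leaves area alpha to the right
reject = chi_sq > critical
```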

One-sample Hypothesis Testing I

Not in textbook

Summary of one-sample hypothesis tests

Mean, σ known: z = (x̄ − μ0)/(σ/√n)
  H0: μ = μ0 (or μ ≥ μ0)   H1: μ < μ0   Reject if z < −z_α
  H0: μ = μ0 (or μ ≤ μ0)   H1: μ > μ0   Reject if z > z_α
  H0: μ = μ0               H1: μ ≠ μ0   Reject if z < −z_{α/2} or z > z_{α/2}

Mean, σ unknown: t = (x̄ − μ0)/(s/√n), df = n − 1
  H0: μ = μ0 (or μ ≥ μ0)   H1: μ < μ0   Reject if t < −t_α
  H0: μ = μ0 (or μ ≤ μ0)   H1: μ > μ0   Reject if t > t_α
  H0: μ = μ0               H1: μ ≠ μ0   Reject if t < −t_{α/2} or t > t_{α/2}

Variance: χ² = (n − 1)s²/σ0², df = n − 1
  H0: σ² = σ0² (or σ² ≥ σ0²)   H1: σ² < σ0²   Reject if χ² < χ²_{1−α,df}
  H0: σ² = σ0² (or σ² ≤ σ0²)   H1: σ² > σ0²   Reject if χ² > χ²_{α,df}
  H0: σ² = σ0²                 H1: σ² ≠ σ0²   Reject if χ² < χ²_{1−α/2,df} or χ² > χ²_{α/2,df}

Two-sample Hypothesis Testing I

[7]: pp. 219-235

Hypotheses for two-sample means testing

Objective: do the two samples come from two different populations or not?
Null hypothesis: The difference between the two samples is strictly due to chance. They come from the same population.
Alternative hypothesis: There is a real difference between the samples. They come from different populations.

Two-sample Hypothesis Testing II
Hypotheses for two-sample means testing

One-tailed tests

H0: μ1 − μ2 = 0        or        H0: μ1 − μ2 = 0
H1: μ1 − μ2 > 0                  H1: μ1 − μ2 < 0

Two-tailed tests

H0: μ1 − μ2 = 0
H1: μ1 − μ2 ≠ 0

Two-sample Hypothesis Testing III
Hypotheses for two-sample means testing

Hypothesis testing procedure
1. Write the hypotheses, H0 and H1
2. Select the probability α for making a Type I error
3. Calculate x̄1, x̄2, s1 and s2, and the test statistic
4. Compare the test statistic to a sampling distribution of test statistics (see next slides)
5. Reject (or do not reject) H0

Two-sample Hypothesis Testing IV
Hypotheses for two-sample means testing

For this type of testing, the sampling distribution of the difference between means is needed.
The sampling distribution of the difference between means is the distribution of all possible values of differences between pairs of sample means, with the sample sizes held constant from pair to pair.

Two-sample Hypothesis Testing V
Hypotheses for two-sample means testing

Figure: From [6]

Two-sample Hypothesis Testing VI
Hypotheses for two-sample means testing

NOTE:
  All samples from population 1 must have the same size.
  All samples from population 2 must have the same size.
  The two sample sizes are not necessarily equal.
Characteristics of the sampling distribution of the difference between means according to the Central Limit Theorem
  For large samples, it is approximately normally distributed.
  For normally distributed populations, it is normally distributed.
  The mean is the difference between the population means

\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2

  The standard deviation (or standard error of the difference between means) is

\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}}

Two-sample Hypothesis Testing VII
Hypotheses for two-sample means testing

Tests on two means (variance known). Reject H0 at significance level α for

z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}}}

when H1: μ1 − μ2 < 0 (one-tailed tests)

z < -z_\alpha

or H1: μ1 − μ2 > 0 (one-tailed tests)

z > z_\alpha

or (two-tailed tests)

z > z_{\alpha/2} or z < -z_{\alpha/2}

Two-sample Hypothesis Testing IX
Hypotheses for two-sample means testing

Tests on two means (variance unknown, but equal)

The Central Limit Theorem is no longer applicable. Now, rather use the t-distribution.
Calculate the pooled estimate of the standard error of the difference between means.

s_p^2 = \frac{(N_1 - 1) s_1^2 + (N_2 - 1) s_2^2}{(N_1 - 1) + (N_2 - 1)}

df = (N_1 - 1) + (N_2 - 1)

Two-sample Hypothesis Testing X
Hypotheses for two-sample means testing

Reject H0 at significance level α for

t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{N_1} + \frac{1}{N_2}}}

when H1: μ1 − μ2 < 0 (one-tailed tests)

t < -t_{\alpha, df}

or H1: μ1 − μ2 > 0 (one-tailed tests)

t > t_{\alpha, df}

or (two-tailed tests)

t > t_{\alpha/2, df} or t < -t_{\alpha/2, df}
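A minimal sketch of the pooled test, reusing two of the training-method samples from the ANOVA section (scipy's default ttest_ind assumes equal variances, matching this slide):

```python
import numpy as np
from scipy import stats

x1 = np.array([95, 92, 89, 90, 99, 88, 96, 98, 95])      # Method 1 scores
x2 = np.array([83, 89, 85, 89, 81, 89, 90, 82, 84, 80])  # Method 2 scores

t_stat, p_two_tailed = stats.ttest_ind(x1, x2)  # pooled-variance t test
reject = p_two_tailed < 0.05
```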

Two-sample Hypothesis Testing XI
Hypotheses for two-sample means testing

Tests on two means (variance unknown, and unequal)

The same test as the previous test (two means, variance unknown), but the degrees of freedom are adjusted as follows [5], pp. 356:

df = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}

df will in general not be an integer. Round down to the nearest integer to use the t table.

Two-sample Hypothesis Testing XII
Hypotheses for two-sample means testing

Reject H0 at significance level α for

t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}}

when H1: μ1 − μ2 < 0 (one-tailed tests)

t < -t_{\alpha, df}

or H1: μ1 − μ2 > 0 (one-tailed tests)

t > t_{\alpha, df}

or (two-tailed tests)

t > t_{\alpha/2, df} or t < -t_{\alpha/2, df}

Two-sample Hypothesis Testing XIII
Hypotheses for two-sample means testing

Hypothesis testing of paired samples [5], pp. 359

One-tailed test                      Two-tailed test
H0: (μ1 − μ2) = D0                   H0: (μ1 − μ2) = D0
H1: (μ1 − μ2) > D0                   H1: (μ1 − μ2) ≠ D0
[or H1: (μ1 − μ2) < D0]

t = \frac{\bar{d} - D_0}{s_d / \sqrt{n}}; df = n - 1

Assumptions
  The relative frequency distribution of the population of differences is approximately normal.
  The paired differences are randomly selected from the population of differences.
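A minimal sketch for paired data with D0 = 0 (the before/after readings are hypothetical; ttest_rel works on the pairwise differences):

```python
import numpy as np
from scipy import stats

before = np.array([10.2, 9.8, 11.1, 10.5, 9.9])   # hypothetical paired readings
after = np.array([9.9, 9.5, 10.8, 10.6, 9.6])

t_stat, p_two_tailed = stats.ttest_rel(before, after)   # tests (mu1 - mu2) = 0
```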

Two-sample Hypothesis Testing I

[7]: pp. 239-248

Hypotheses for two-sample variance testing

Comparing the variances of two samples

Two-tailed hypothesis

H0: σ1² = σ2²
H1: σ1² ≠ σ2²

To compare the variances of two samples, the F-test is used.
The test statistic is the F-ratio

F = \frac{s_a^2}{s_b^2}

where s_a² > s_b²

To draw a conclusion, the F-distribution is needed.

Two-sample Hypothesis Testing II
Hypotheses for two-sample variance testing

Figure: From [6]

Two-sample Hypothesis Testing III
Hypotheses for two-sample variance testing

NOTE
The distribution depends on two dfs, df_a and df_b.
df_a = n_a − 1
df_b = n_b − 1

Two-sample Hypothesis Testing IV
Hypotheses for two-sample variance testing

Reject H0 at significance level α when

F > F_{1-\alpha/2}(df_a, df_b) or F < F_{\alpha/2}(df_a, df_b)

The F-test can be used to see if the variances of two samples differ significantly before deciding which t-test to use for testing the difference between the means. In this case, we are not looking for small differences between the variances, therefore it is desirable to choose a higher α, say 0.2, for the variance test.

Excel functions
  F.TEST
  F.DIST, F.DIST.RT
  F.INV, F.INV.RT
Data analysis tool: F-test two sample for variances
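A minimal sketch of the two-tailed F-ratio test (the samples are hypothetical; because the larger variance goes on top, only the upper critical value from f.ppf needs checking):

```python
import numpy as np
from scipy.stats import f

x1 = np.array([4.2, 4.8, 5.1, 3.9, 4.5, 5.0])   # hypothetical samples
x2 = np.array([4.4, 4.5, 4.6, 4.4, 4.7, 4.5])

v1, v2 = x1.var(ddof=1), x2.var(ddof=1)
if v1 >= v2:                                    # larger variance on top: s_a^2 > s_b^2
    F, dfa, dfb = v1 / v2, len(x1) - 1, len(x2) - 1
else:
    F, dfa, dfb = v2 / v1, len(x2) - 1, len(x1) - 1

alpha = 0.2                                     # deliberately high, as suggested above
reject = F > f.ppf(1 - alpha / 2, dfa, dfb)     # since F >= 1, the upper tail suffices
```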

Two-sample Hypothesis Testing I

Not in textbook

Summary of two-sample hypothesis tests

σ1 and σ2 known: z = ((x̄1 − x̄2) − (μ1 − μ2)) / √(σ1²/N1 + σ2²/N2)
  H0: μ1 − μ2 = 0   H1: μ1 − μ2 < 0   Reject if z < −z_α
  H0: μ1 − μ2 = 0   H1: μ1 − μ2 > 0   Reject if z > z_α
  H0: μ1 − μ2 = 0   H1: μ1 − μ2 ≠ 0   Reject if z < −z_{α/2} or z > z_{α/2}

σ1 and σ2 unknown but equal: t = ((x̄1 − x̄2) − (μ1 − μ2)) / (s_p √(1/N1 + 1/N2)), df = N1 + N2 − 2
  H0: μ1 − μ2 = 0   H1: μ1 − μ2 < 0   Reject if t < −t_{α,df}
  H0: μ1 − μ2 = 0   H1: μ1 − μ2 > 0   Reject if t > t_{α,df}
  H0: μ1 − μ2 = 0   H1: μ1 − μ2 ≠ 0   Reject if t < −t_{α/2,df} or t > t_{α/2,df}

σ1 and σ2 unknown and unequal: t = ((x̄1 − x̄2) − (μ1 − μ2)) / √(s1²/N1 + s2²/N2),
df = (s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1) ]
  H0: μ1 − μ2 = 0   H1: μ1 − μ2 < 0   Reject if t < −t_{α,df}
  H0: μ1 − μ2 = 0   H1: μ1 − μ2 > 0   Reject if t > t_{α,df}
  H0: μ1 − μ2 = 0   H1: μ1 − μ2 ≠ 0   Reject if t < −t_{α/2,df} or t > t_{α/2,df}

Paired samples: t = (d̄ − D0)/(s_d/√n), df = n − 1
  H0: μ1 − μ2 = D0   H1: μ1 − μ2 < D0   Reject if t < −t_{α,df}
  H0: μ1 − μ2 = D0   H1: μ1 − μ2 > D0   Reject if t > t_{α,df}
  H0: μ1 − μ2 = D0   H1: μ1 − μ2 ≠ D0   Reject if t < −t_{α/2,df} or t > t_{α/2,df}

Variances: F = s_a²/s_b², df_a = n_a − 1, df_b = n_b − 1
  H0: σ1² = σ2²   H1: σ1² < σ2²   Reject if F < F_α(df_a, df_b)
  H0: σ1² = σ2²   H1: σ1² > σ2²   Reject if F > F_{1−α}(df_a, df_b)
  H0: σ1² = σ2²   H1: σ1² ≠ σ2²   Reject if F < F_{α/2}(df_a, df_b) or F > F_{1−α/2}(df_a, df_b)

Analysis of Variance - Part One

[7]: pp. 251-253

Introduction to ANOVA

Example (Based on Table 12-1, [6])

Table: Data from Three Training Methods

                     Method 1   Method 2   Method 3
                     95         83         68
                     92         89         75
                     89         85         79
                     90         89         74
                     99         81         75
                     88         89         81
                     96         90         73
                     98         82         77
                     95         84
                                80
Mean                 93.44      85.20      75.25
Variance             16.28      14.18      15.64
Standard Deviation   4.03       3.77       3.96

Analysis of Variance - Part One I
Introduction to ANOVA

Example (Continued...)
Hypothesis

H0: μ1 = μ2 = μ3
H1: Not H0
α = 0.05

Performing multiple t-tests possibly sets us up for a disaster. Let's see why:
  The chance of NOT making a Type I error with one comparison, with a significance level of α = 0.05, is 95%.
  So, for 3 samples, 3 tests must be done: Method 1 vs Method 2, Method 1 vs Method 3 and Method 2 vs Method 3.

Analysis of Variance - Part One II
Introduction to ANOVA

Each test will have a probability of NOT making a Type I error of p_i = 95%.
The combined probability of NOT making a Type I error is therefore

p(p_1 p_2 p_3) = 0.95 × 0.95 × 0.95 = 0.86

Therefore, the combined chance (note, this is covered in chapter 16) of making a Type I error is

1 − p(p_1 p_2 p_3) = 0.14 or 14%

In general, the chance of making a Type I error increases as 1 − (1 − α)^N where N is the number of t-tests.

Analysis of Variance - Part One III
Introduction to ANOVA

Table: Increasing chance of making a Type I error for multiple t-tests, from [6]

Number of samples   Number of tests   Pr(at least one significant t)
3                   3                 0.14
4                   6                 0.26
5                   10                0.40
6                   15                0.54
7                   21                0.66
8                   28                0.76
9                   36                0.84
10                  45                0.90

Analysis of Variance - Part One IV
Introduction to ANOVA

The idea with ANOVA is to separate the total variability into the following components [8]
1. Variability between samples, measuring systematic and random variation.
2. Variability within samples, measuring only random variation.
Finally, determine if component 1 is more significant than component 2.

Analysis of Variance - Part One V
Introduction to ANOVA

The idea can also be illustrated with the following plots.
The figure shows a single factor experiment at two levels, i.e. two treatments.

Figure: From [5] pp. 627

Is there sufficient evidence to indicate a difference between the population means?

Analysis of Variance - Part One VI
Introduction to ANOVA

How about these two plots?

Figure: From [5] pp. 627

What statistics of the two samples in these plots did we intuitively use to make a decision on the difference between the population means?

Analysis of Variance - Part One I

[7]: pp. 253-265

Single factor ANOVA

Recall the definition of the sample variance

s^2 = \frac{\sum (x - \bar{x})^2}{N - 1}

This is often called the Mean Square, because it is almost a mean of squared deviations.
  Numerator: sum of squares = \sum (x - \bar{x})^2
  Denominator: degrees of freedom, df

Analysis of Variance - Part One II
Single factor ANOVA

We can calculate the following variances (or mean squares) (alternative definitions are derived from [8], pp. 472).

MS_T = \frac{SS_T}{df_T} = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}^2 - \left( \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij} \right)^2 / \sum_{i=1}^{k} n_i}{\sum_{i=1}^{k} n_i - 1}

Mean Square for all the data.
  Subscript T is for total data.
  Numerator: Total sum of squares
  Denominator: Total degrees of freedom. All the data - 1.

Analysis of Variance - Part One III
Single factor ANOVA

In the second equation
  k is the number of samples or treatments
  n_i is the number of data points in the i-th sample
  y_ij is the j-th data point from the i-th sample.

MS_W = \frac{SS_W}{df_W} = \frac{\sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}^2 - \sum_{i=1}^{k} \left( \sum_{j=1}^{n_i} y_{ij} \right)^2 / n_i}{\sum_{i=1}^{k} (n_i - 1)}

Mean squares within samples. It is a pooled estimate of the population variance.

Analysis of Variance - Part One IV
Single factor ANOVA

An indication of the variance within samples.
  Subscript W stands for within
  Numerator: Within samples sum of squares
  Denominator: Sum of degrees of freedom of each sample

MS_B = \frac{SS_B}{df_B} = \frac{SS_T - SS_W}{df_T - df_W}

Mean squares between samples. Indicates how the means differ.
  Subscript B stands for between
  Numerator: Between samples sum of squares

Analysis of Variance - Part One V
Single factor ANOVA

  Denominator: Number of samples - 1

Note that

SS_B + SS_W = SS_T and df_B + df_W = df_T

Note that both MS_W and MS_B are estimates of the population variance. If there is a meaningful difference between the variances, then the samples cannot all come from the same population, and therefore there is a meaningful difference between the samples that cannot be attributed just to random errors.

ANOVA translates

H0: μ1 = μ2 = ... = μk
H1: Not H0

Analysis of Variance - Part One VI
Single factor ANOVA

into

H0: σ_B² ≤ σ_W²
H1: σ_B² > σ_W²

The variances are compared with the F-distribution. The test statistic is therefore

f = \frac{MS_B}{MS_W}

Reject H0 at significance level α if f > f_α
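A minimal sketch using the three training methods from the earlier table (scipy's f_oneway performs exactly this single-factor F test):

```python
import numpy as np
from scipy import stats

m1 = np.array([95, 92, 89, 90, 99, 88, 96, 98, 95])
m2 = np.array([83, 89, 85, 89, 81, 89, 90, 82, 84, 80])
m3 = np.array([68, 75, 79, 74, 75, 81, 73, 77])

f_stat, p_value = stats.f_oneway(m1, m2, m3)   # f = MS_B / MS_W
reject = p_value < 0.05                        # H0: mu1 = mu2 = mu3
```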

Analysis of Variance - Part One I

[7]: pp. 258-261

After the F-test

If H0 is rejected, how can you find where the differences lie?

Planned comparisons
  Also called a priori tests
  Essentially t-tests comparing the means of different samples.
  The test statistic is

t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{MS_W \left[ \frac{1}{n_1} + \frac{1}{n_2} \right]}}

  The hypotheses are:

H0: μ1 ≤ μ2
H1: μ1 > μ2

  The rest of the test is a standard t-test with df = df_W.

Analysis of Variance - Part One II
After the F-test

Unplanned comparisons
There may be some situations where the conditions for the t-test mentioned above are not met. This is then called an unplanned comparison.
  Also known as a posteriori or post hoc tests.
  Numerous tests are available...

Regression I

[7]: pp. 293-299

Linear regression

Regression II
Linear regression

Figure: Left: Scatter plot. Right: With linear trend line.

Regression I

[7]: pp. 299-306

Testing hypotheses about regression

Residual variance of estimate

s_{yx}^2 = \frac{\sum (y - y')^2}{N - 2} = \frac{\sum (y - y')^2}{N - n - 1}

n is the degree of the polynomial fitted to the data. In the linear case, n = 1.
N is the number of data points.
y − y' is the difference between the measured and predicted value.

Regression II
Testing hypotheses about regression

Standard error of estimate

s_{yx} = \sqrt{s_{yx}^2} = \sqrt{\frac{\sum (y - y')^2}{N - 2}}

Hypothesis

H0: No real relationship
H1: Not H0

Similar to ANOVA, the hypothesis will compare variances. Therefore, rewrite

Regression III
Testing hypotheses about regression

H0: σ²_Regression ≤ σ²_Residual
H1: σ²_Regression > σ²_Residual

To find the variances, we need the sums of squares and their corresponding degrees of freedom.

Regression IV
Testing hypotheses about regression

Figure: Deviations in a scatter plot, from [6]

Regression V
Testing hypotheses about regression

SS_{Residual} = \sum (y - y')^2

This represents the variability around the regression curve.

SS_{Regression} = \sum (y' - \bar{y})^2

This represents the gain in prediction by using a regression curve rather than just the average of the data.

SS_{Total} = \sum (y - \bar{y})^2

This represents the total variance.

Regression VI
Testing hypotheses about regression

The following identities hold

SS_{Residual} + SS_{Regression} = SS_{Total}
df_{Residual} + df_{Regression} = df_{Total}
df_{Residual} = N - 2
df_{Total} = N - 1

Regression VII
Testing hypotheses about regression

Similar to ANOVA, we use mean squares for the variances

MS_{Regression} = \frac{SS_{Regression}}{df_{Regression}}
MS_{Residual} = \frac{SS_{Residual}}{df_{Residual}}
MS_{Total} = \frac{SS_{Total}}{df_{Total}}

Test the hypothesis with an F test

F = \frac{MS_{Regression}}{MS_{Residual}}

Reject H0 at significance level α if F > F_α

Regression VIII
Testing hypotheses about regression

Testing the slope
(Note, this is a different approach from the textbook on pp. 267)
Is the slope different from zero? Or, is the mean an equally good predictor?
Hypotheses

H0: β = 0
H1: β ≠ 0

This is a standard one-sample, two-tailed, t-test. In what follows, β = 0.
The test statistic is

t = \frac{b - \beta}{s_b}; df = N - 2

Regression IX
Testing hypotheses about regression

The denominator estimates the standard error of the slope

s_b = \frac{s_{yx}}{s_x \sqrt{N - 1}}

s_{yx} = \sqrt{\frac{\sum (y - y')^2}{N - 2}}

s_x = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}}

Regression X
Testing hypotheses about regression

Testing the intercept
Is the intercept not zero?
Hypotheses

H0: α = 0
H1: α ≠ 0

This is a standard one-sample, two-tailed, t-test. In what follows, α = 0 (here α denotes the population intercept).
The test statistic is

t = \frac{a - \alpha}{s_a}; df = N - 2

s_a = s_{yx} \sqrt{\frac{1}{N} + \frac{\bar{x}^2}{(N - 1) s_x^2}}
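A minimal sketch of these tests with hypothetical data (scipy's linregress reports the two-sided p-value for the slope test, plus standard errors for both coefficients):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])        # hypothetical data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

res = stats.linregress(x, y)
print(res.slope, res.intercept)          # b and a
print(res.pvalue)                        # two-tailed test of H0: beta = 0
print(res.stderr, res.intercept_stderr)  # s_b and s_a
print(res.rvalue ** 2)                   # Excel's R-squared
```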

Regression I

Not in textbook

Excel's R-squared

Coefficient of Determination

R^2 = \frac{SS_{Regression}}{SS_{Total}}

When R² is close to 1, there is a good correlation. When R² is close to 0, not so!

Regression I

[7]: pp. 307-319

Excel functions for regression

SLOPE
INTERCEPT
STEYX
FORECAST
TREND
LINEST
Data analysis tool: Regression

Multiple regression I

[7]: pp. 320-327

Regression for more than one independent variable.
E.g. a plane:

y = a + b_1 x_1 + b_2 x_2

Any number of independent variables is possible.

y = a + \sum b_i x_i

Other types of fitting are also possible in Excel (logarithmic, exponential, higher order polynomials, etc.). Make careful decisions about the trend in the data and choose an appropriate model. Use hypothesis testing to test your assumptions.

Regression

Not in textbook

Guidelines

Give an indication of the goodness of fit.
Report the range of the dependent variable(s) for which the regression was done and therefore the range for which the goodness of fit test is valid.
Check the validity of the prediction of the regression result over the range of the dependent variable. E.g. sometimes the predicted result must be a positive value (e.g. the score of the tut test). If the regression result allows the possibility of predicting a negative value in this case, the result must be reconsidered.
Fit the lowest order curve possible.

Correlation I

[7]: pp. 331-334

Pearson's correlation coefficient

Correlation is an alternative to regression for looking at relationships between parameters. With regression it is possible to make predictions. With correlation it is easier to say that some relationships are stronger than others.
Positive correlation means that as one parameter increases, the other also increases.
Negative correlation means that as one parameter increases, the other decreases.
Note that correlation does not imply causality. (The same is true for regression.)

Correlation II
Pearson's correlation coefficient

Pearson's product-moment correlation coefficient

r = \frac{\frac{1}{N-1} \sum (x - \bar{x})(y - \bar{y})}{s_x s_y} = \frac{\mathrm{cov}(x, y)}{s_x s_y}

Numerator: the covariance represents how x and y vary together.
Denominator: the standard deviations of the x and y variables.

r = −1 implies perfect negative correlation (the minimum value r can have)
r = 1 implies perfect positive correlation (the maximum value r can have)
r = 0 implies no correlation.
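A minimal sketch with hypothetical paired observations (pearsonr returns r together with a two-sided p-value):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])      # hypothetical paired observations
y = np.array([2.3, 2.9, 4.1, 4.8, 6.0])

r, p_two_sided = stats.pearsonr(x, y)
print(r, r ** 2)                             # r and the coefficient of determination
```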

Correlation I

[7]: pp. 334-337

Correlation and regression

r^2 = \frac{SS_{Regression}}{SS_{Total}}

r² is just Excel's Coefficient of Determination.
R² = 0.667 implies SS_Regression is 66.7% of SS_Total. To find out if that is significant, do a hypothesis test...

Correlation I

[7]: pp. 338-340

Testing hypotheses about correlation

Is the correlation coefficient greater than zero?
The sample statistic is r.
Test for positive correlation

H0: ρ ≤ 0
H1: ρ > 0

The test statistic, with (N − 2) degrees of freedom:

t = \frac{r - \rho}{s_r}

where ρ = 0 and

s_r = \sqrt{\frac{1 - r^2}{N - 2}}

Reject H0 at significance level α if t > t_α.
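A minimal sketch of this test, reusing the numbers from the example on the next slide (N = 102, r = 0.195):

```python
import numpy as np
from scipy.stats import t as t_dist

N, r, alpha = 102, 0.195, 0.05

s_r = np.sqrt((1 - r ** 2) / (N - 2))
t_stat = r / s_r                              # rho = 0 under H0; about 1.988
t_crit = t_dist.ppf(1 - alpha / 2, df=N - 2)  # 1.984, the value used in the example
reject = t_stat > t_crit
```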

Correlation II
Testing hypotheses about correlation

Example (Too much data for regression? [6] pp. 371)

Say N = 102 and α = 0.05.
Say r = 0.195.
Is it a significant correlation?

t = \frac{r \sqrt{N - 2}}{\sqrt{1 - r^2}} = 1.988

t_α = 1.984. Since t > t_α, reject H0. We suspect the correlation is significant.
BUT
r² = 0.038, which implies that SS_Regression is just 4% of SS_Total.

Correlation III
Testing hypotheses about correlation

Example (Too much data for regression? Continued...)

N − 2   r       t       t_α     Reject?
120     0.195   2.178   1.980   Yes
110     0.195   2.085   1.982   Yes
100     0.195   1.988   1.984   Yes
90      0.195   1.886   1.987   No
80      0.195   1.778   1.990   No

Correlation IV
Testing hypotheses about correlation

Do two correlation coefficients differ?

H0: ρ1 = ρ2
H1: ρ1 ≠ ρ2

We have to transform the r value with

z_r = 0.5 [\ln(1 + r) - \ln(1 - r)]

The test statistic is then

z = \frac{z_1 - z_2}{\sigma_{z_1 - z_2}}

where

Correlation V
Testing hypotheses about correlation

\sigma_{z_1 - z_2} = \sqrt{\frac{1}{N_1 - 3} + \frac{1}{N_2 - 3}}

Reject H0 at significance level α if

z < -z_{\alpha/2} or z > z_{\alpha/2}
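A minimal sketch of this comparison (the r values and sample sizes are hypothetical):

```python
import numpy as np
from scipy.stats import norm

r1, n1 = 0.62, 50      # hypothetical correlation from sample 1
r2, n2 = 0.41, 60      # hypothetical correlation from sample 2
alpha = 0.05

z1 = 0.5 * (np.log(1 + r1) - np.log(1 - r1))   # Fisher transform of r1
z2 = 0.5 * (np.log(1 + r2) - np.log(1 - r2))
se = np.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))

z = (z1 - z2) / se
reject = abs(z) > norm.ppf(1 - alpha / 2)      # two-tailed test
```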

Uncertainty of Measurement I

Not in textbook

Based on ISO Guide 98-3 [1].

The formal standard for the expression of uncertainty in measurement.
The "true value" of a measurand can never be known.
Therefore, the "measurement error" can also never be known.
Measurement results therefore should be expressed in statistical terms, i.e. as a distribution.
Therefore, we should report some nominal value, e.g. the mean value, with some expression of the measurement uncertainty.

Uncertainty of Measurement II

Example (Measuring Power Dissipated from a Resistor [1])
If a potential difference V is applied to the terminals of a temperature-dependent resistor that has a resistance of R_0 at the defined temperature t_0 and a linear temperature coefficient of resistance α, the power P (the measurand) dissipated by the resistor at the temperature t depends on V, R_0, α and t according to

P = f(V, R_0, \alpha, t) = \frac{V^2}{R_0 [1 + \alpha (t - t_0)]}

P is never directly measured. We will measure V and t. With enough repetitions, measurement uncertainties for V and t can be found. Hopefully, the uncertainty in the reference values of R_0, α and t_0 is known. Then we need a method to propagate the uncertainty of these values to the uncertainty of the measurand P.

Uncertainty of Measurement III

The example illustrates a few things
The measurand is seldom measured directly. Often it is derived from a functional relationship such as

Y = f(X_1, X_2, ..., X_N)

  Y is the measurand.
  The X_i are either known from measurements or from some prior knowledge (e.g. a catalogue value).
There are two types of evaluation of standard uncertainty
  Type A is determined from statistical analysis of a set of measurements.
  Type B is determined by any other means.
We need a method to propagate uncertainty (see the law of propagation of uncertainty later in this section).

Uncertainty of Measurement IV

To find the mean value of the measurand, do you take the mean of the input quantities, or do you first calculate the measurand for each set of measurements and then take the mean of the measurand?

Uncertainty of Measurement V

Example (When to calculate the mean)
The table shows voltage and temperature readings for the power dissipated by the resistor in the previous example. If R_0 = 4.33 Ω, α = 0.00393 °C⁻¹ and t_0 = 20 °C, the mean power dissipated is
  21.43545 W if P is calculated for each data point and then the mean of the 10 power values is taken.
  21.43568 W if the mean voltage (10.006565 V) and mean temperature (40.0563 °C) are used.
The difference is due to the nonlinear function for P. The GUM guide [1] states that for nonlinear relations, the measurand for each data point must be calculated and then the mean of the set of measurands must be taken.

Voltage [V]   Temperature [°C]
10.030        39.930
9.991         39.962
9.971         39.916
10.023        40.102
10.000        39.949
10.039        40.250
10.073        40.315
9.987         39.921
9.935         40.124
10.017        40.093
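A minimal sketch reproducing both numbers from the example (values taken from the table above):

```python
import numpy as np

V = np.array([10.030, 9.991, 9.971, 10.023, 10.000,
              10.039, 10.073, 9.987, 9.935, 10.017])    # volt
t = np.array([39.930, 39.962, 39.916, 40.102, 39.949,
              40.250, 40.315, 39.921, 40.124, 40.093])  # deg C
R0, alpha, t0 = 4.33, 0.00393, 20.0

P = V ** 2 / (R0 * (1 + alpha * (t - t0)))              # one P per data point
print(P.mean())                                         # ~21.43545 W (preferred)

P_of_means = V.mean() ** 2 / (R0 * (1 + alpha * (t.mean() - t0)))
print(P_of_means)                                       # ~21.43568 W
```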

Uncertainty of Measurement I

Not in textbook

Evaluation of standard uncertainty

From the examples it is clear that there are two types of uncertainty.
  One is based on a set of repeated measurements (Type A). In the example, it is the standard uncertainty of the temperature t and voltage V.
  Another is based on other information, e.g. data sheets (Type B). In the example, it is the standard uncertainty of the constants R_0, α and t_0.

Uncertainty of Measurement I

Not in textbook

Type A evaluation of standard uncertainty

Type A standard uncertainty is based on repeated measurements.
It is typically estimated with

s_{\bar{x}} = \frac{s}{\sqrt{N}}

Note, it is the standard error (or standard deviation of the sampling distribution of the mean).
It can also be evaluated by other means, depending on the situation.
It is important to always report the degrees of freedom with the Type A standard uncertainty.

Uncertainty of Measurement I

Not in textbook

Type B evaluation of standard uncertainty

Type B standard uncertainty is NOT based on repeated


measurements.
Typical sources of information [1]
previous measurement data
previous experience and good engineering judgement
manufacturers specifications
data provided in calibration and other certificates
uncertainties assigned to reference data taken from handbooks.

If the source does not give the standard uncertainty explicitly, it may
be derived. The GUM Guide [1] gives several examples in section 4.3.
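
As an illustration, one common derivation in the spirit of GUM section 4.3: when a source quotes only bounds of plus or minus a with no distribution, a rectangular (uniform) distribution is assumed and the standard uncertainty becomes a/√3. A sketch with a hypothetical spec value:

```python
import math

a = 0.005                  # hypothetical data sheet bound: +/- 0.005 ohm
u_rect = a / math.sqrt(3)  # standard uncertainty for a rectangular distribution
print(u_rect)              # ~0.0029 ohm
```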


Uncertainty of Measurement

Not in textbook

Law of propagation of uncertainty for uncorrelated quantities

When the measurand is not directly measured, as in the example, the
standard uncertainty of the measurand depends on the combined
Type A and Type B standard uncertainties.
It can be shown, if the input quantities are independent, that the
combined standard uncertainty is

$$ s_c^2(y) = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x_i} \right)^2 s^2(x_i) $$

This is called the law of propagation of uncertainty.


f is the function Y = f(X1, X2, ..., XN) and the xi are the estimates of the Xi.
Note that the partial derivatives essentially scale the input uncertainties;
they are sometimes called sensitivity coefficients.
If the partial derivatives cannot be calculated directly, they may be
evaluated numerically, or estimated experimentally (see sections 5.1.3
and 5.1.4 in the GUM Guide [1]).
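
A sketch of the law with numerically estimated sensitivity coefficients (central differences, in the spirit of GUM section 5.1.3); the helper below and its step size are illustrative assumptions, not course material:

```python
import numpy as np

def combined_variance(f, x, s2, rel_step=1e-6):
    """Return s_c^2(y) = sum_i (df/dx_i)^2 * s^2(x_i) for independent inputs.

    f  : function of a vector of input estimates
    x  : input estimates x_i
    s2 : input variances s^2(x_i)
    """
    x = np.asarray(x, dtype=float)
    var = 0.0
    for i in range(len(x)):
        h = rel_step * max(1.0, abs(x[i]))
        xp, xm = x.copy(), x.copy()
        xp[i] += h
        xm[i] -= h
        dfdx = (f(xp) - f(xm)) / (2 * h)  # sensitivity coefficient df/dx_i
        var += dfdx**2 * s2[i]
    return var

# e.g. for the resistor example, with z = [V, t]:
# f = lambda z: z[0]**2 / (4.33 * (1 + 0.00393 * (z[1] - 20.0)))
```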

Uncertainty of Measurement

Not in textbook

Law of propagation of uncertainty for correlated quantities

The law of propagation of uncertainty for correlated input quantities is

$$ s_c^2(y) = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x_i} \right)^2 s^2(x_i) + 2 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} s(x_i, x_j) $$

s(x_i, x_j) is the estimated covariance associated with x_i and x_j. It is
calculated as

$$ s(x_i, x_j) = \frac{1}{N(N-1)} \sum_{k=1}^{N} (x_{i,k} - \bar{x}_i)(x_{j,k} - \bar{x}_j) $$

EXCEL: Covariance is calculated with COVARIANCE.P (populations)
or COVARIANCE.S (samples).
How do you handle a situation where some quantities are correlated
and some not?
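
As a concrete check, Excel's COVARIANCE.S can be reproduced in NumPy (both divide by N - 1); with the voltage and temperature readings from the earlier table this yields the covariance used in the worked example below:

```python
import numpy as np

V = np.array([10.030, 9.991, 9.971, 10.023, 10.000,
              10.039, 10.073, 9.987, 9.935, 10.017])
t = np.array([39.930, 39.962, 39.916, 40.102, 39.949,
              40.250, 40.315, 39.921, 40.124, 40.093])

cov_Vt = np.cov(V, t, ddof=1)[0, 1]  # matches Excel's COVARIANCE.S
print(cov_Vt)                        # ~0.00296, as used in the example
```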

Uncertainty of Measurement

Not in textbook

Determining expanded uncertainty

In some practical cases the combined uncertainty is insufficient to
capture the uncertainty.
The expanded uncertainty is

$$ U = k s_c(y) $$

where k is the coverage factor.
Typically, 2 ≤ k ≤ 3.
The result of the measurement is then typically expressed as Y = y ± U.
k can be chosen to cover a certain confidence interval, in which case
the confidence level should also be given.
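
A short sketch of choosing k from the t-distribution for a stated confidence level; SciPy stands in here for Excel's T.INV.2T, and the numbers anticipate the reporting example later in this section:

```python
from scipy import stats

k = stats.t.ppf(0.975, df=9)  # two-sided 95 % coverage, 9 degrees of freedom
U = k * 0.35                  # expanded uncertainty for sc = 0.35 mg
print(k, U)                   # k ~ 2.26, U ~ 0.79 mg
```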


Uncertainty of Measurement I

Not in textbook

Reporting uncertainty

In general, give all the information needed to repeat the evaluation.
Rather report too much than too little.
What is reported should be in line with the intended use of the
measurement result, e.g. a calibration certificate for a nano-metre
precision measurement device would require a lot more information
than a laser distance sensor you can buy at the local hardware store.
Consider including the following [1]:
clearly describe the methods used to calculate the measurement result
and its uncertainty from the experimental observation (Type A
standard uncertainty) and input data (Type B standard uncertainty)
list all the uncertainty components and document fully how they were
evaluated.
present the data analysis in such a way that each of its important steps
can be readily followed and the calculation of the reported result can
be independently repeated

Uncertainty of Measurement II
Reporting uncertainty

give all the corrections and constants used in the analysis and their
sources
in the case of reporting expanded uncertainty, report the coverage
factor.

The numerical result of the uncertainty is reported in one of the
following four ways. (Assume a mass ms of an object weighing about
100 g is being reported.) The words below in parentheses may be
omitted. [1]
ms = 100,021 47 g with (a combined standard uncertainty)
sc = 0,35 mg
ms = 100,021 47(35) g, where the number in parentheses is the
numerical value of (the combined standard uncertainty) sc referred to
the corresponding last digits of the quoted result.
ms = 100,021 47(0,000 35) g, where the number in parentheses is the
numerical value of (the combined standard uncertainty) sc expressed in
the unit of the quoted result.

Uncertainty of Measurement III


Reporting uncertainty

ms = (100,021 47 ± 0,000 35) g, where the number following the ±
symbol is the numerical value of (the combined standard
uncertainty) sc and not a confidence interval.

Report an expanded uncertainty as
ms = (100,021 47 ± 0,000 79) g, where the number following the ±
symbol is the numerical value of (an expanded uncertainty) U = k sc,
with U determined from (a combined standard uncertainty)
sc = 0,35 mg and (a coverage factor) k = 2,26 based on the
t-distribution for ν = 9 degrees of freedom, and defines an interval
estimated to have a level of confidence of 95 percent.


Uncertainty of Measurement I
Example

Continue with example from beginning of section.


Use

$$ s_c^2(y) = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x_i} \right)^2 s^2(x_i) + 2 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} s(x_i, x_j) $$

to calculate the combined uncertainty for

$$ P = f(V, R_0, \alpha, t) = \frac{V^2}{R_0 [1 + \alpha (t - t_0)]} $$

Let


Uncertainty of Measurement II
Example

x1 = V
x2 = R0
x3 = α
x4 = t
Ignore the uncertainty contribution of t0. Assume it is a very well known
reference value with negligible uncertainty. Then


Uncertainty of Measurement III


Example

$$ \frac{\partial f}{\partial V} = \frac{2V}{R_0 [1 + \alpha (t - t_0)]} $$

$$ \frac{\partial f}{\partial R_0} = -\frac{V^2}{R_0^2 [1 + \alpha (t - t_0)]} $$

$$ \frac{\partial f}{\partial \alpha} = -\frac{(t - t_0) V^2}{[\alpha (t - t_0) + 1]^2 R_0} $$

$$ \frac{\partial f}{\partial t} = -\frac{\alpha V^2}{[\alpha (t - t_0) + 1]^2 R_0} $$
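
The hand derivation can be verified symbolically; a short SymPy aside (not a course tool):

```python
import sympy as sp

V, R0, alpha, t, t0 = sp.symbols('V R_0 alpha t t_0', positive=True)
P = V**2 / (R0 * (1 + alpha * (t - t0)))

for x in (V, R0, alpha, t):
    print(x, sp.simplify(sp.diff(P, x)))  # the four sensitivity coefficients
```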
Evaluate these derivatives at the mean values of V, R0, α and t, i.e.
V̄ = 10.007 V, R0 = 4.33 Ω, α = 0.00393 and t̄ = 40.056 °C.
This gives


Uncertainty of Measurement IV
Example

$$ \frac{\partial f}{\partial V} = 4.284, \quad \frac{\partial f}{\partial R_0} = -4.950, \quad \frac{\partial f}{\partial \alpha} = -398.506, \quad \frac{\partial f}{\partial t} = -0.078 $$

Assume only V and t are correlated. Hence, from EXCEL, find
s(V̄, t̄) = 0.00296. Also, from the data we can find


Uncertainty of Measurement V
Example

s²(V̄) = 0.00149
s²(t̄) = 0.02076
Finally, let's assume that somehow we know that
s²(R0) = 0.001
s²(α) = 0.02
Now it is straightforward to calculate s_c²(P).
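
A sketch of the final assembly, plugging the values above into the propagation law; only the (V, t) pair contributes a covariance term, so all other cross terms vanish:

```python
# Sensitivity coefficients and (co)variances quoted above
dfdV, dfdR0, dfdalpha, dfdt = 4.284, -4.950, -398.506, -0.078
s2_V, s2_t, s2_R0, s2_alpha = 0.00149, 0.02076, 0.001, 0.02
cov_Vt = 0.00296

sc2_P = (dfdV**2 * s2_V + dfdR0**2 * s2_R0
         + dfdalpha**2 * s2_alpha + dfdt**2 * s2_t
         + 2 * dfdV * dfdt * cov_Vt)  # single covariance term for (V, t)
print(sc2_P)  # note that the assumed s^2(alpha) dominates this total
```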


Selecting the Right Method I


Not in textbook

Method | Typical Use
Confidence interval of μ; σ known | Calculate confidence limits for your estimate of the population mean. You know the population variance.
Confidence interval of μ; σ unknown | Calculate confidence limits for your estimate of the population mean. You do not know the population variance.
One sample hypothesis test: z-test on single mean | You have one sample and some guess of the population mean. You want to know if the guess is right or how it differs. You know the population variance.
One sample hypothesis test: t-test on single mean | You have one sample and some guess of the population mean. You want to know if the guess is right or how it differs. You do not know the population variance.

Selecting the Right Method II


One sample hypothesis test: χ²-test on single variance | You have one sample and some guess of the population variance. You want to know if the guess is right or how it differs.
Two sample hypothesis test: z-test on two means | You have two samples and want to know if they are the same or not. You know the population variance.
Two sample hypothesis test: t-test on two means with equal variances | You have two samples and want to know if they are the same or not. You do not know the population variance, but know that they are equal.
Two sample hypothesis test: t-test on two means with unknown, unequal variances | You have two samples and want to know if they are the same or not. You have no knowledge about the population variance.
Two sample hypothesis test: paired samples | Comparing two samples, but the specimens in the two samples are somehow linked. You do not know the population variance.

Selecting the Right Method III


Two sample hypothesis test: F-test | Comparing the variances of two samples.
Single factor ANOVA | Seeing if there is a difference in the means of more than two samples.
Single factor ANOVA: Planned comparison | A priori t-tests on the means of selected samples to find out if there is a significant difference.
Single factor ANOVA: Unplanned comparison | A posteriori test on sample means. Not covered in this course.
Regression | If you suspect there is a trend between the dependent and independent variables.
Regression: F-test | Test the above-mentioned suspicion.
Regression: Testing the slope | See if the slope of the linear regression curve is significant, otherwise the mean is an equally good predictor.

Selecting the Right Method IV


Regression: Testing the intercept | See if the intercept plays a significant role. Otherwise it could have been zero.
Regression: Coefficient of determination R² | Indication of goodness of fit. Is not a hypothesis test. Should be combined with an F-test for the regression.
Correlation: Pearson's correlation coefficient | Similar to the coefficient of determination, but distinguishes between positive and negative correlation. Tests if data is correlated, but does not tell how. Is not a hypothesis test.
Correlation: Is the correlation coefficient greater than zero? | Hypothesis test to evaluate the correlation coefficient.
Correlation: Do two correlation coefficients differ? | Is there a new correlation between the data?

References I
[1] Uncertainty of measurement, part 3: Guide to the expression of
uncertainty in measurement, 1995.
[2] R.S. Figliola and D.E. Beasley. Theory and Design for Mechanical
Measurements. Wiley, Hoboken, 4th edition, 2006.
[3] A. Graham. Statistics: A Complete Introduction. Hodder &
Stoughton, 2013.
[4] D. Huff and I. Geis. How to Lie with Statistics. Norton, New York,
1954.
[5] W. Mendenhall and T. Sincich. Statistics for Engineering and the
Sciences. MacMillan, New York, 3rd edition, 1992. INBO 519.502462 MEN.

References II
[6] J. Schmuller. Statistical Analysis with Excel for Dummies. Wiley,
Hoboken, 2nd edition, 2009.
[7] J. Schmuller. Statistical Analysis with Excel for Dummies. Wiley,
Hoboken, 3rd edition, 2013.
[8] R.E. Walpole and R.H. Myers. Probability and Statistics for
Engineers and Scientists. MacMillan, New York, 4th edition, 1990.


The End

