Data Analysis
Outline
Background
terminology
Descriptive statistics
Inferential statistics
Graphs & tables
Scientific Method
OBSERVATION
QUESTION
HYPOTHESIS
PREDICTION
TEST
Hypochlora alba
Artemisia ludoviciana
10/6/2011
Hypotheses
Null Hypothesis (Ho)
There is no difference between the means of
treatments A and B.
Null Hypothesis
There is no effect of plant host color on
Hypochlora alba color.
Alternative Hypothesis
The color of the host plant affects the color of
Hypochlora alba.
Good Hypotheses
Null Hypothesis
There is no change between the means from
year one to year two.
Alternative Hypothesis
There is a change between the means from year
one to year two.
Types of Experiments
Descriptive:
Does not alter the environment
Natural experiment
Shows patterns, but does not identify the mechanism explaining those patterns
Manipulative:
Alters the environment using treatments
Isolates the variable of interest (identifies the mechanism)
Controls for all other variables
Variables
Degrees of Freedom
Descriptive Statistics
Mean:
Average of all the numbers
Median:
Middle number of the group
Mode:
Number that appears the most frequently
Range:
Difference between the largest and smallest
number
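The four measures above can be computed with Python's standard `statistics` module. This is a minimal sketch; the `heights` sample is hypothetical and only serves to illustrate the definitions.

```python
import statistics

# Hypothetical sample of plant heights (cm); values are illustrative only.
heights = [12, 15, 15, 18, 20, 22, 24]

mean = statistics.mean(heights)            # average of all the numbers
median = statistics.median(heights)        # middle number of the sorted group
mode = statistics.mode(heights)            # number that appears most frequently
value_range = max(heights) - min(heights)  # largest minus smallest number

print(mean, median, mode, value_range)
```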
Describing Variation
Normal Distribution
1. Standard Deviation
2. Standard Error of Mean
3. Confidence Intervals
Symbols
$x$ = sample value
$\bar{x}$ = sample mean
$s^2$ = sample variance
$\mu$ = population mean
$\sigma^2$ = population variance
$n$ = sample size
$\alpha$ = level of significance
Standard Deviation
Sum of Squares
$SS = \sum (x - \bar{x})^2$
Variance
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Standard Deviation
$S.D. = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Degrees of Freedom
$d.f. = n - 1$
Standard Deviation
$S.D. = \sqrt{\frac{SS}{d.f.}}$
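As a rough sketch of how the sum of squares, degrees of freedom, and standard deviation fit together, the function names below (`sum_of_squares`, `sample_sd`) and the sample values are hypothetical. The result is checked against the standard library's `statistics.stdev`, which also divides by n - 1.

```python
import math
import statistics

def sum_of_squares(sample):
    """SS: sum of squared deviations from the sample mean."""
    x_bar = sum(sample) / len(sample)
    return sum((x - x_bar) ** 2 for x in sample)

def sample_sd(sample):
    """S.D. = sqrt(SS / d.f.), with d.f. = n - 1."""
    df = len(sample) - 1
    return math.sqrt(sum_of_squares(sample) / df)

# Hypothetical sample; the mean is 5.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

print(sum_of_squares(sample))    # SS
print(sample_sd(sample))         # S.D. from SS and d.f.
print(statistics.stdev(sample))  # same quantity from the standard library
```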
Standard Deviation
General Rules
About 68% of values fall within 1 s.d. of the mean
About 95% fall within 2 s.d.
About 99.7% fall within 3 s.d.
*If the data are normally distributed
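The percentages in these rules are rounded. For a normal distribution, the fraction of values within k standard deviations of the mean is erf(k / sqrt(2)). A small sketch using only the standard library follows; the helper name `fraction_within` is hypothetical.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k s.d. of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(fraction_within(k), 4))
```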
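Standard Deviation
$S.D. = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Standard Error of Mean
$S.E.M. = \frac{\sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}}{\sqrt{n}}$
Standard Deviation
$S.D. = \sqrt{\frac{SS}{d.f.}}$
Standard Error of Mean
$S.E.M. = \frac{s.d.}{\sqrt{n}}$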
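A short sketch of the standard error of the mean, computed as the sample standard deviation divided by the square root of the sample size. The function name `standard_error` and the sample values are hypothetical.

```python
import math
import statistics

def standard_error(sample):
    """S.E.M. = s.d. / sqrt(n), using the n - 1 sample standard deviation."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical sample; the same mean and spread always give a smaller
# S.E.M. as the sample size n grows.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_error(sample))
```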
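Confidence Intervals
Confidence Interval
$C.I. = \bar{x} \pm z_{(1-\alpha)} \frac{s.d.}{\sqrt{n}}$
Use:
Large sample size
Normal distribution
Confidence Interval
$C.I. = \bar{x} \pm t_{(\alpha, d.f.)} \frac{s.d.}{\sqrt{n}}$
where $1 - \alpha$ is the confidence level
Use:
Small sample size (<30)
Student's t distribution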
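The sketch below builds an interval as mean ± critical value × standard error. It assumes a two-sided 95% interval, so the z critical value is taken at the 0.975 quantile; that convention is an assumption, not something the slides state. The standard library has no t distribution, so the t critical value (2.365 for d.f. = 7) is read from a standard t table. The function name and sample are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def confidence_interval(sample, critical_value):
    """Return (lower, upper) = mean -/+ critical_value * S.E.M."""
    x_bar = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    half_width = critical_value * sem
    return x_bar - half_width, x_bar + half_width

# Hypothetical sample with n = 8, so d.f. = 7.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

# z critical value for a two-sided 95% interval (assumed convention).
z_crit = NormalDist().inv_cdf(0.975)

# t critical value for a two-sided 95% interval with d.f. = 7,
# taken from a standard t table.
t_crit = 2.365

z_interval = confidence_interval(sample, z_crit)
t_interval = confidence_interval(sample, t_crit)
print(z_interval)
print(t_interval)
```

With a small sample, the t-based interval is wider than the z-based one, which reflects the extra uncertainty in estimating the standard deviation from few observations.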
Reporting Means
[Figure: examples of low, medium, and high variability]
Inferential Statistics
Uses sample statistics to make inferences about
population parameters
Inherent variation in population
Are the differences you see due to inherent
variation or your treatments?
Must test hypotheses
Use test statistic and P values
Types of Values
Discrete values:
Values are distinct
Categorical data
Ex. presence/absence
Continuous values:
One value flows into the next
There are an infinite number of other possible
values in between any two values
Ex. plant height
P-values:
Probability of obtaining results at least as extreme as those observed if the null hypothesis is true
Value tells you whether you reject or fail to reject the null hypothesis
Standard to reject if P < 0.05 (α = 0.05)
Ex. P = 0.02
Reject Ho; if Ho were true, results this extreme would occur only 2% of the time
Ex. P = 0.88
Fail to reject Ho; if Ho were true, results this extreme would occur 88% of the time
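The decision rule in the two examples can be written as a one-line comparison against the significance level. This is a minimal sketch; the function name `decide` is hypothetical and the default of 0.05 follows the standard threshold given above.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject Ho when P is below the significance level."""
    return "reject Ho" if p_value < alpha else "fail to reject Ho"

print(decide(0.02))
print(decide(0.88))
```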
Types of Samples
Independent Samples
Different sampling units each time
Paired Samples
Re-sample already established units
Ex. permanent plots
Statistical Tests
Depend upon:
Discrete or continuous values
Independent or paired samples
Chi-square test for independence
Independent sample t-test
Paired t-test
Analysis of variance
Repeated measures analysis of variance
Chi-Square
[Table: chi-square example with counts 25, 55, 30, 85]
Independent-sample t-test
t-test:
Continuous data
Tests the difference between sample means of two groups
t-test test statistic:
t = difference between two means / standard error of the difference between the two means
$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}$
where:
$\bar{x}_1$ = mean of sample 1
$\bar{x}_2$ = mean of sample 2
$s_{\bar{x}_1 - \bar{x}_2}$ = standard error of the difference between the two means
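A sketch of the test statistic follows. The slides do not say how the standard error of the difference is computed, so this example assumes the unpooled (Welch) form, sqrt(s1²/n1 + s2²/n2); a pooled-variance version would differ when group variances or sizes differ. The function name and the two samples are hypothetical.

```python
import math
import statistics

def t_statistic(sample1, sample2):
    """t = (mean1 - mean2) / standard error of the difference.

    Assumption: unpooled (Welch) standard error of the difference.
    """
    mean_diff = statistics.mean(sample1) - statistics.mean(sample2)
    se_diff = math.sqrt(
        statistics.variance(sample1) / len(sample1)
        + statistics.variance(sample2) / len(sample2)
    )
    return mean_diff / se_diff

# Hypothetical measurements from two independent groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]
print(t_statistic(group_a, group_b))
```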
Two-tailed t-test
Ho: There is no change in population mean.
One-tailed t-test
Ho: There is no increase in population mean.
Note: If you fail to reject Ho, this could mean no change or a decrease in population mean.
**More powerful than a two-tailed test in detecting a true change in the predicted direction.
The one-tailed t-test P-value is half the two-tailed t-test P-value when the observed change is in the predicted direction.
Analysis of Variance
$F = \frac{s^2_{\text{between groups}}}{s^2_{\text{within groups}}}$
$F = \frac{MS_{\text{between groups}}}{MS_{\text{within groups}}}$
Multiple comparisons
Paired t-test
Statistical Tests
Independent Samples
Chi-square test
Independent-sample t-test
Analysis of Variance (ANOVA)
Paired Samples
Paired t-test
Repeated-measures ANOVA
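The grouping above can be read as a lookup from data type and sample type to a test. The sketch below is one hedged reading of it: the function name `choose_test` and its parameters are hypothetical, and the rule that more than two groups calls for ANOVA instead of a t-test is an assumption added here.

```python
def choose_test(value_type, sample_type, n_groups=2):
    """Suggest a test from value type, sample type, and number of groups.

    Assumption: two groups use a t-test, more than two use an ANOVA.
    """
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")
    if sample_type not in ("independent", "paired"):
        raise ValueError("sample_type must be 'independent' or 'paired'")

    if value_type == "discrete":
        if sample_type == "independent":
            return "Chi-square test"
        raise ValueError("paired discrete data are not covered by this list")

    if value_type == "continuous":
        if sample_type == "independent":
            if n_groups == 2:
                return "Independent-sample t-test"
            return "Analysis of Variance (ANOVA)"
        if n_groups == 2:
            return "Paired t-test"
        return "Repeated-measures ANOVA"

    raise ValueError("value_type must be 'discrete' or 'continuous'")

print(choose_test("continuous", "paired"))
print(choose_test("continuous", "independent", n_groups=3))
```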
Repeated-measures ANOVA
Means
1990 = 0.44
1994 = 0.38
P = 0.55
Means
1990 = 0.44
1994 = 0.38
P = 0.0009
Reporting Statistics
Statistical Values
Test statistic:
Paired samples
Statistical Assumptions
Independent samples
See Elzinga p. 244-245
Sample Size
Why?
Constraints to sample size?
Time
Money
2. Size
3. Number
Statistical Error
False-Change Error = α
No change has taken place but sampling detects change
Controlled by the P-value threshold (α)
P-value = probability of observing a difference at least this large if the null hypothesis is true, that is, if no change actually occurred and the difference is due to chance alone
Missed-Change Error = β
Real change has taken place but sampling does NOT detect change
Increase statistical power
Power = 1 − β
Ex.
If β = 0.70, power = 0.30
If β = 0.05, power = 0.95
I want to be at least 95% certain of detecting a real change.
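The relation between the missed-change error rate and power is simple arithmetic, sketched here with a hypothetical helper named `power`. Being at least 95% certain of detecting a real change corresponds to a missed-change error rate of at most 0.05.

```python
def power(missed_change_error):
    """Power = 1 - missed-change (Type II) error rate."""
    if not 0 <= missed_change_error <= 1:
        raise ValueError("error rate must be between 0 and 1")
    return 1 - missed_change_error

# Rounded for display, since floating-point subtraction is inexact.
print(round(power(0.70), 2))
print(round(power(0.05), 2))
```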
General recommendations/experience
Graphically
Equations
Computer Programs
Resources
See Elzinga p. 74-87 (graphing)
See Elzinga p. 141-154, Appendix 16 & 18 (computer programs)
See Herrick vol. II Appendix C (recommendations & calculations)
Tables
Table 3
Mean values (± standard error) of exchangeable cations in soil of control and NaCl-treated soils from experiment #1 after plants were harvested in the second year of the experiment. ** = P ≤ 0.05, *** = P ≤ 0.0001.
            Na (ppm)    Ca (ppm)      Mg (ppm)    K (ppm)
Control     79 ± 4      3690 ± 77     195 ± 9     267 ± 20
NaCl        247 ± 24    3133 ± 133    150 ± 6     204 ± 16
            ***         **            **          **
Newingham and Belnap 2006
Histogram
[Figure: histogram; x-axis: Age, y-axis: Number of Cases]
Scatterplot
Error bars:
Whiskers on means to guide whether bars/points are statistically (significantly) different (ex. standard deviation, standard error, 95% confidence intervals)
[Figure: bar graphs with S.D. and S.E. error bars; panels: June 2004, August 2004, September 2004; axis label: Population; treatments: Ambient CO2 and Elevated CO2, Fertilizer and No Fertilizer; ** marks a significant difference]
Letters Denote Significant Differences
[Figure: bar graph with S.E. error bars; treatments: No Clipping and Clipping, Benomyl and No Benomyl]
Interesting Results!