Lesson 2 Page 2 of 34
Copyright The Regents of the University of California 2006 Copyright Dr. Jean-Xavier Guinard 2006
Terms
First, let's go over some terms we will use often.

1. Descriptive statistics are used to describe the data (e.g., graphs, tables, averages, ranges, etc.).
2. Inferential statistics are used to infer, from a sample, facts about the population it came from.

It follows that a parameter is a fact regarding a population, whereas a statistic is a fact regarding a sample. Statistical tests for inferential statistics are divided into parametric and nonparametric tests. Parametric tests are used to analyze data from interval or ratio scales (continuously distributed, following a normal distribution), while nonparametric tests are designed to handle ordinal data (ranks) and nominal data (categories).
Summary Statistics
There are two major distinctions in summary statistics:

- Measures of central tendency (i.e., where do most of the numbers fall?)
- Measures of dispersion (i.e., how much spread is there in the numbers?)
3. The variance measures spread around the mean:
   - population: σ² = Σ(x − μ)² / N
   - sample: s² = Σ(x − x̄)² / (n − 1)
4. The standard deviation is the square root of the variance.
5. The coefficient of variation equals the standard deviation divided by the mean.
The range of the sample is not a good estimate of the range of the population: it is biased downward (too small). This bias shrinks as the sample size increases. The interquartile range eliminates the effect of outliers on the range.
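The summary statistics above can be sketched with Python's standard library; the data values and names below are illustrative, not from the lesson.

```python
# A minimal sketch of the summary statistics discussed above, using only
# Python's standard library. The sample data are illustrative.
import statistics

data = [4.2, 5.1, 3.8, 4.9, 5.5, 4.4, 6.0, 4.7]

mean = statistics.mean(data)             # central tendency
median = statistics.median(data)         # central tendency, robust to outliers
variance = statistics.variance(data)     # sample variance (n - 1 divisor)
s = statistics.stdev(data)               # standard deviation = sqrt(variance)
cv = s / mean                            # coefficient of variation
data_range = max(data) - min(data)       # sample range: underestimates population range

# Interquartile range: spread of the middle 50%, unaffected by extreme values.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(f"mean={mean:.3f} sd={s:.3f} CV={cv:.3f} range={data_range:.2f} IQR={iqr:.2f}")
```

The same quantities are available as built-in functions in Excel, Minitab, SAS, and SPSS.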
You should familiarize yourself with the basic ways in which summary statistics are calculated from data spreadsheets in your favorite office software (e.g., Microsoft Excel) and/or statistical software (Minitab, SAS, SPSS, etc.). Also consult the univariate
Topic 2.3: The Null Hypothesis and Type I and Type II Errors
The Null Hypothesis (H0)
The Null Hypothesis (H0) may be a hypothesis stating that there is NO DIFFERENCE:

- Between two samples
- Between the means of two sets of numbers
- Between the number of people or things in various categories
The power of a statistical test, P, is defined as:

P = 1 − β

In discrimination testing, P is the probability of finding a difference if one actually exists, i.e., the probability of making the correct decision that the two samples are perceptibly different. The power of the test P depends on:

- The magnitude of the difference between samples,
- The size of α, and
- The number of judges performing the test
In practice, we set the desired level of P to determine how many judges should be recruited to conduct the test. This is what is known as power analysis. It is an important tool for anyone involved in experimental design and data collection, and one that we will examine in more detail in our discrimination testing lesson.
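As a concrete illustration of how power grows with panel size, here is a stdlib-only sketch of a power calculation for a triangle test (guessing probability 1/3). The discriminator proportion p_alt = 0.55 is an assumed value for illustration, not a figure from the lesson.

```python
# Hedged sketch: power of a triangle test as a function of panel size.
# Assumes H0: p = 1/3 (pure guessing); p_alt is an illustrative alternative.
from math import comb

def binom_sf(k, n, p):
    """Upper tail P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def triangle_test_power(n_judges, p_alt, alpha=0.05, p_null=1/3):
    # Critical count: smallest c with P(X >= c | p_null) <= alpha.
    c = next(c for c in range(n_judges + 1) if binom_sf(c, n_judges, p_null) <= alpha)
    # Power: probability of reaching that count when p_alt is the true proportion.
    return binom_sf(c, n_judges, p_alt)

# More judges -> higher power to detect the same underlying difference.
for n in (24, 48, 96):
    print(n, round(triangle_test_power(n, p_alt=0.55), 3))
```

In practice one inverts this calculation: fix the desired power and solve for the smallest panel size that achieves it.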
Confidence Interval
A confidence interval gives a sense of where a sample mean is likely to fall. It is typically set at ± 2 s.e.m. (standard error of the mean) for a 95% probability level, a value derived from the critical value of t at p < 0.05 (for a high number of degrees of freedom).
As mentioned above, the standard error of the mean is equal to the standard deviation divided by the square root of N, the number of observations from which the mean is derived.
A z-score is simply the distance from the mean, expressed in units of standard deviation.
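That definition is one line of arithmetic; the values here are illustrative.

```python
# z-score: distance from the mean in standard deviation units,
# z = (x - mean) / sd. Inputs below are illustrative.
def z_score(x, mean, sd):
    return (x - mean) / sd

print(z_score(7.0, 5.0, 1.0))   # 2.0: two standard deviations above the mean
print(z_score(4.0, 5.0, 2.0))   # -0.5: half a standard deviation below
```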
2. Two-sample t-test, related samples. This test determines whether two samples were drawn from the same population (means not significantly different) or from different populations (means significantly different), in a related-samples design. We would use this t-test to compare the mean ratings for a panel of judges in two conditions - for example, rating the intensity of an attribute in a sample under white light vs. red light.
3. Two-sample t-test, independent (unrelated) samples. This test determines whether two samples were drawn from the same population (means not significantly different) or from different populations (means significantly different), in an independent-samples design - for example, comparing the mean ratings given to a sample by two different panels.
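The two t statistics described above can be sketched with the standard library; the ratings and the white-light/red-light scenario names are illustrative, and for real analyses the computed t would be compared against a t table (or a software p-value).

```python
# Sketch, stdlib only: related-samples and independent-samples t statistics.
import statistics

def paired_t(x, y):
    """Related-samples t: based on the mean and sd of the pairwise differences."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return statistics.mean(d) / (statistics.stdev(d) / n ** 0.5), n - 1

def independent_t(x, y):
    """Independent-samples t with a pooled variance estimate."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / (sp2 * (1 / nx + 1 / ny)) ** 0.5, nx + ny - 2

white = [5.2, 6.1, 5.8, 6.4, 5.5, 6.0]   # same judges, white light (illustrative)
red   = [4.8, 5.6, 5.9, 6.0, 5.1, 5.4]   # same judges, red light (illustrative)

t, df = paired_t(white, red)
print(f"paired t = {t:.3f} on {df} df")   # compare to the tabled critical t
```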
The calculated value for r is compared to a table of critical values for Pearson's Product-Moment correlation coefficient. Those are given for various significance levels, and for one-tailed and two-tailed scenarios. Note that the degrees of freedom for a correlation coefficient are n - 2, where n is the number of pairs of X and Y observations used to calculate r. This is the one time when degrees of freedom are not equal to n - 1!
Both linear correlation and linear regression are based on the assumption that there is a linear relationship between the data sets. THIS IS AN IMPORTANT AND SOMETIMES OVERLOOKED ASSUMPTION. Linear correlation is based on four additional assumptions:

- Each pair of X and Y values must be independent.
- The data must come from a bivariate normal distribution.
- Generally, X and Y should be randomly sampled.
- X and Y should be homoscedastic (of equal variance).
In practice, these assumptions are rarely checked. A significant correlation between two variables X and Y may imply that:

1. X causes Y
2. Y causes X
3. X and Y are both caused by some other factor
4. None of the above
That is, correlation does NOT necessarily imply CAUSALITY. Another consideration in looking at the significance of correlation coefficients is that the tables only tell us the level of confidence with which we can say that r is not zero. They do not tell us that this r will be of practical value. For example, an r of 0.2 could be significantly different from zero at the .05 level, but this would only mean that X accounts for four percent of the variance in Y. (Note: r² is called the coefficient of determination and indicates the proportion of the variance of Y accounted for by X; in this example 0.2² = 0.04, or 4%.)
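Pearson's r, its degrees of freedom (n − 2), and the coefficient of determination r² can be sketched as follows; the X and Y values are illustrative.

```python
# Sketch: Pearson's product-moment correlation coefficient from its definition.
def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # illustrative pairs of observations
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
df = len(x) - 2                 # note: n - 2, not n - 1
# r^2 is the share of variance in Y accounted for by X; even a "significant"
# r of 0.2 explains only 0.2^2 = 4% of that variance.
print(f"r = {r:.4f}, df = {df}, r^2 = {r*r:.4f}")
```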
Sometimes, this can make a difference in the significance of the F-ratio. How do we run an analysis of variance and then read and interpret the results? This will be an important component of the tutorial at the end of the lesson, and the focus of one of the exercises for this lesson. We will use SAS (Statistical Analysis Systems - PC Version) to run our ANOVA. But you may use any other software with that capability. We usually present the results as a table showing F-ratios and their significance for the sources of variation (treatments) and their interactions as shown in the table below from your reading assignment.
LSD = t √(2 MSE / n)

Where:

n = number of treatments (e.g., number of judges, samples or replications) in a one-way ANOVA or one-factor repeated-measures analysis of variance
t = t-value for a two-tailed test with the degrees of freedom of the error term (that number is 1.98 with a high enough number of degrees of freedom) and an alpha level of 5, 1 or 0.1%

If the difference between two treatment means is larger than the LSD, then the means are deemed to be significantly different at the corresponding alpha level (5, 1 or 0.1%). Duncan's studentized range statistic q goes as follows:
This q value must exceed a tabulated value based on the number of means being compared. All of these multiple mean comparison tests are typically included as options in the statistical software you use. It is simply a matter of writing in the one or two lines in your program requesting that these multiple mean comparisons be run after the main ANOVA procedure. We will go through that procedure with our ANOVA example in the tutorial, and you will make multiple mean comparisons in an ANOVA assignment as well.
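As a sketch of Fisher's LSD in use: the standard form is LSD = t √(2 MSE / n), with MSE the ANOVA error mean square and n the number of observations behind each treatment mean. The t of 1.98 is the large-df two-tailed 5% value mentioned earlier; the MSE and sample means below are invented for illustration.

```python
# Hedged sketch of a multiple mean comparison with Fisher's LSD.
def lsd(t_crit, mse, n_per_mean):
    return t_crit * (2 * mse / n_per_mean) ** 0.5

mse, n = 0.85, 12                       # illustrative ANOVA error mean square, n
threshold = lsd(1.98, mse, n)

means = {"sample A": 5.6, "sample B": 6.4, "sample C": 5.8}   # illustrative
pairs = [("sample A", "sample B"), ("sample A", "sample C"), ("sample B", "sample C")]
for a, b in pairs:
    diff = abs(means[a] - means[b])
    verdict = "different" if diff > threshold else "not different"
    print(f"{a} vs {b}: |diff| = {diff:.2f}, LSD = {threshold:.2f} -> {verdict}")
```

Statistical packages run the same comparison automatically when the LSD option is requested after the ANOVA.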
ρ = 1 − (6 Σd²) / (N(N² − 1))

Where:

ρ = the Spearman correlation coefficient (rho)
Σd² = the sum of the squares of the differences between ranks
N = the number of cases

The table below helps summarize the parallels between parametric and nonparametric statistics.
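Spearman's rho, ρ = 1 − 6 Σd² / (N(N² − 1)), is a one-line calculation when there are no tied ranks; the rankings below are illustrative.

```python
# Sketch of Spearman's rank-order correlation (no-ties formula).
def spearman_rho(rank_x, rank_y):
    n = len(rank_x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))   # sum of squared rank differences
    return 1 - 6 * d2 / (n * (n * n - 1))

judge1 = [1, 2, 3, 4, 5, 6]   # rankings of six samples by two judges (illustrative)
judge2 = [2, 1, 4, 3, 5, 6]

print(round(spearman_rho(judge1, judge2), 3))   # 0.886: strong agreement
```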
The experimental unit is the smallest subdivision of the experimental material such that any two could be assigned to different treatments.
Xij = μ + Ti + Bj + Eij

Where:

Xij is the observed value for the ith treatment and jth judge
μ is the mean
Ti is the effect of the ith treatment
Bj is the effect of the jth block
Eij are random errors assumed to be normally and independently distributed with variance σ²e

The variance σ²e includes the variation due to panelists, experimental materials, and other errors not controlled by the design.
The RCB design is frequently used when trained panels must evaluate several samples in replicate (not feasible in one single session). In this case, it is best to have each judge evaluate all samples in a single session, and then return to evaluate them again in another session, etc. In this type of study, the blocks are the sessions, and the samples are randomized across judges within each block.
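The RCB model Xij = μ + Ti + Bj + Eij can be made concrete with a hand-computed two-way partition of the sums of squares (treatments = samples, blocks = judges or sessions). The ratings grid below is invented for illustration; in practice software such as SAS produces this table.

```python
# Sketch: ANOVA partition for a randomized complete block design.
# Rows = blocks (judges/sessions), columns = treatments (samples). Illustrative data.
data = [
    [5.1, 6.2, 4.8],   # block 1
    [5.4, 6.0, 5.0],   # block 2
    [5.0, 6.5, 4.6],   # block 3
    [5.3, 6.1, 4.9],   # block 4
]
b, t = len(data), len(data[0])
grand = sum(sum(row) for row in data) / (b * t)

ss_total = sum((x - grand) ** 2 for row in data for x in row)
ss_block = t * sum((sum(row) / t - grand) ** 2 for row in data)
ss_treat = b * sum((sum(data[i][j] for i in range(b)) / b - grand) ** 2 for j in range(t))
ss_error = ss_total - ss_block - ss_treat   # Eij: what the design leaves unexplained

ms_treat = ss_treat / (t - 1)
ms_error = ss_error / ((b - 1) * (t - 1))
print(f"F(treatments) = {ms_treat / ms_error:.2f} on {t - 1} and {(b - 1) * (t - 1)} df")
```

Blocking removes the judge/session variation (ss_block) from the error term, which is why the design gives a more sensitive test of the treatments.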
These parameters are not independent and the following requirements apply:
rt = bk = N and λ(t − 1) = r(k − 1)
where N is the total number of observations in the experiment. The layout for a BIB design is:
A BIB design is used when there are too many treatments in the experiment for the judges to evaluate all the samples in a single session (block). In this case, judges evaluate subsets of samples during different sessions.
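The parameter constraints can be checked mechanically before any samples are poured. The sketch below assumes the standard BIB notation (t treatments, b blocks, k treatments per block, r replications per treatment, λ the number of times each pair of treatments appears together); the specific design checked (t = 7, b = 7, k = 3) is a classic textbook plan, not one from this lesson.

```python
# Sketch: verifying the balanced incomplete block constraints
# r*t = b*k = N and lambda*(t - 1) = r*(k - 1).
def is_valid_bib(t, b, k, r, lam):
    return r * t == b * k and lam * (t - 1) == r * (k - 1)

print(is_valid_bib(t=7, b=7, k=3, r=3, lam=1))   # True: a workable plan
print(is_valid_bib(t=8, b=7, k=3, r=3, lam=1))   # False: cannot be balanced
```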
In designing a crossover home-use test, two groups of consumers are formed - I and II. Consumers are assigned randomly to the two groups. Group I uses product A in the first period and product B in the second; Group II does the reverse. Two assumptions of this model, which may not always be met in a home-use test, are that (1) there is no product-by-period interaction; that is, the difference between products A and B is the same regardless of the sequence in which they were evaluated; and (2) there are no order or carry-over effects from one product to the other. Neither assumption is likely to hold true, however.
Tables

- Table of critical values for chi-square
- Table of critical values for Pearson's product-moment correlation coefficient
- Table of Spearman rank-order correlation values
- Table of critical values of t (Student's t-test)