Ask yourself 2 questions: What kind of data have you collected? What is your goal?

| Goal | Measurement (from Gaussian population) | Rank, score or measurement (from non-Gaussian population) | Binomial (2 possible outcomes) |
|---|---|---|---|
| Describe one group | Mean, SD | Median, interquartile range | Proportion |
| Compare a group to a hypothetical value | One-sample t test | Wilcoxon test | Chi-square or binomial test |
| Compare 2 unpaired groups | Unpaired t test | Mann-Whitney test | Fisher's test (chi-square for large samples) |
| Compare 2 paired groups | Paired t test | Wilcoxon test | McNemar's test |
| Compare 3 unmatched groups | One-way ANOVA | Kruskal-Wallis test | Chi-square test |
| Compare 3 matched groups | Repeated measures ANOVA | Friedman test | Cochran's Q |
| Quantify association between 2 variables | Pearson correlation | Spearman correlation | Contingency coefficients |
| Predict value from another measured variable | Simple linear regression or nonlinear regression | Nonparametric regression | Simple logistic regression |
| Predict value from several measured or binomial variables | Multiple linear regression or multiple nonlinear regression | — | Multiple logistic regression |
Normality test
Graphs --- Legacy Dialogs --- Histogram. A normal distribution should appear as a bell-shaped curve.
Normal Q-Q Plot. To determine normality graphically, we can use the output of a normal Q-Q plot. If the data are normally distributed, the data points will lie close to the diagonal line. If the data points stray from the line in an obvious non-linear fashion, the data are not normally distributed. As we can see from the normal Q-Q plot below, the data are normally distributed.
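The same graphical check can be sketched outside SPSS. A minimal example using SciPy's `probplot` (an assumption on my part, not part of the original SPSS workflow), with made-up placeholder data:

```python
import numpy as np
from scipy import stats

# Hypothetical sample standing in for your dataset (placeholder data)
rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=10, size=200)

# probplot returns the theoretical vs. ordered sample quantiles plus a
# least-squares fit; r close to 1 means the points hug the diagonal line.
(theoretical_q, ordered_vals), (slope, intercept, r) = stats.probplot(sample, dist="norm")

print(f"Q-Q correlation r = {r:.3f}")  # near 1.0 for normally distributed data
```

Passing `plot=plt` (with matplotlib imported as `plt`) would draw the Q-Q plot itself rather than just returning the fitted values.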
A normality test determines whether one distribution (e.g. your dataset) is significantly different from another (e.g. a normal distribution) and produces a numerical answer: yes or no. The above table presents the results from two well-known tests of normality, namely the Kolmogorov-Smirnov test and the Shapiro-Wilk test. The Shapiro-Wilk test is more appropriate for small sample sizes (< 50 samples), while the Kolmogorov-Smirnov test is used if the sample size is greater than 50. We can see from the above table that for the "Beginner", "Intermediate" and "Advanced" course groups the dependent variable, "Time", was normally distributed. How do we know this? If the Sig. value of the Shapiro-Wilk test is greater than 0.05, the data are normal. If it is below 0.05, the data deviate significantly from a normal distribution.
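Both tests are available in SciPy; a hedged sketch with placeholder data (note that, strictly, the plain Kolmogorov-Smirnov test assumes the normal parameters are known in advance, whereas here they are estimated from the sample):

```python
import numpy as np
from scipy import stats

# Hypothetical small sample (n < 50, so Shapiro-Wilk is preferred)
rng = np.random.default_rng(1)
small_sample = rng.normal(loc=25, scale=5, size=40)

w_stat, p_shapiro = stats.shapiro(small_sample)

# K-S test against a normal with the sample's own mean/SD
ks_stat, p_ks = stats.kstest(
    small_sample, "norm",
    args=(small_sample.mean(), small_sample.std(ddof=1)),
)

for name, p in [("Shapiro-Wilk", p_shapiro), ("Kolmogorov-Smirnov", p_ks)]:
    verdict = "consistent with normality" if p > 0.05 else "deviates from normality"
    print(f"{name}: p = {p:.3f} -> {verdict}")
```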
How to recognise a normal (and non-normal) distribution: In a perfect normal frequency distribution, the mean, median and mode are equal. The data are continuous and symmetrically distributed around the central point. This does not mean that there are no outliers, but the data are not bimodal (or multimodal). In a perfect normal frequency distribution: 68% of samples fall within 1 standard deviation of the mean; 95% of samples fall within 2 s.d. of the mean; 99.7% of samples fall within 3 s.d. of the mean.
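The 68-95-99.7 rule above is easy to verify numerically; a small sketch with simulated data (the sample size and seed are arbitrary choices of mine):

```python
import numpy as np

# Simulate a large normal sample and check the empirical rule
rng = np.random.default_rng(42)
data = rng.normal(loc=0, scale=1, size=100_000)

mean, sd = data.mean(), data.std()
for k, expected in [(1, 68.3), (2, 95.4), (3, 99.7)]:
    within = np.mean(np.abs(data - mean) <= k * sd) * 100
    print(f"within {k} SD: {within:.1f}% (theory ~{expected}%)")
```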
PARAMETRIC TESTS
In a parametric test a sample statistic is obtained to estimate the population parameter. Because this estimation process involves a sample, a sampling distribution, and a population, certain parametric assumptions are required to ensure all components are compatible with each other. For example, it is assumed that the sample data have a normal distribution and that scores in different groups have homogeneous variances. Parametric tests normally involve data expressed in absolute numbers or values rather than ranks. They include the t test (one-sample t test, independent-samples t test, paired-samples t test) and analysis of variance (ANOVA).
1. T-test
The t-test is a parametric test that assesses whether the means of two groups are statistically different from each other. The t test assumes that the data analysed: are continuous, interval data comprising a whole population or sampled randomly from a population; have a normal distribution; and come from groups whose sample sizes do not differ hugely.
Whereas correlation (a measure of association between two variables) and regression (simple regression examines the relationship between one dependent and one independent variable) are used to examine relationships between data sets, the t test is the primary example of how to examine differences between data sets. As with any statistical investigation, the starting point is to think about what question you want to ask.
Yes, there is an instructional effect taking place in the computer class. The observed mean difference is -4.5172. Since the value of t is -3.820 at p < .001, the mean difference (-4.5172) between pretest and posttest is statistically significant. Because the Sig. value of 0.001 is less than 0.05, the null hypothesis is rejected. Therefore, it can be inferred that there was an instructional effect taking place in the computer class.
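A paired-samples t test of this kind can also be sketched with SciPy's `ttest_rel`. The pretest/posttest scores below are hypothetical placeholders, not the values behind the SPSS output above:

```python
from scipy import stats

# Hypothetical pretest/posttest scores for the same students (placeholder data)
pretest  = [55, 60, 48, 62, 70, 53, 58, 65, 50, 61]
posttest = [60, 66, 50, 68, 74, 59, 61, 70, 57, 66]

# ttest_rel tests whether the mean of the paired differences is zero
t_stat, p_value = stats.ttest_rel(pretest, posttest)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null: posttest scores differ significantly from pretest scores.")
```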
b. One-Sample T Test
The One-Sample T Test compares the mean score of a sample to a known value. Usually, the known value is a population mean. E.g. a researcher may want to check whether the average IQ score for a group of students differs from 100 Analyze menu --- Compare Means --- One-Sample T Test --- Move the dependent variable into the "Test Variables" box --- Type in the value you wish to compare your sample to in the box called "Test Value."
Assumption: -The dependent variable is normally distributed. You can check for normal distribution with a Q-Q plot.
Example: Hypotheses: Null: There is no significant difference between the sample mean and the population mean. Alternate: There is a significant difference between the sample mean and the population mean. SPSS Output: Following is a sample output of a one-sample T test. We compared the mean level of self-esteem for our sample of Wellesley college students to a known population value of 3.9. First, we see the descriptive statistics.
The mean of our sample is 4.04, which is slightly higher than our population mean of 3.9. Next, we see the results of our one-sample T test:
Our t value is 2.288 and our significance value is .024. There is a significant difference between the sample mean and the population value (the significance is less than .05). Therefore, we can say that our sample mean of 4.04 is significantly greater than the population mean of 3.9.
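The same one-sample comparison can be sketched with SciPy's `ttest_1samp`, here using the IQ example mentioned earlier with made-up placeholder scores:

```python
from scipy import stats

# Hypothetical IQ scores for a group of students (placeholder data)
iq_scores = [102, 98, 110, 105, 97, 112, 108, 101, 95, 109, 104, 99]

# Compare the sample mean against the known population value of 100
t_stat, p_value = stats.ttest_1samp(iq_scores, popmean=100)
print(f"sample mean = {sum(iq_scores)/len(iq_scores):.2f}, t = {t_stat:.3f}, p = {p_value:.3f}")
```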
H0: There is no difference between seedlings under the light and in the dark H1: There is sig. difference between seedlings under the light and in the dark
In the Group 1 and Group 2 text boxes, type in the values that define the respective groups.
Yes. The mean difference in seedlings sprouted between the two treatments (light and dark) was -2.900. The value of t, which is -3.179, was statistically significant (p = 0.005). Therefore, the null hypothesis is rejected. The columns labeled "Levene's Test for Equality of Variances" tell us whether an assumption of the t-test has been met. The t-test assumes that the variability of each group is approximately equal. If that assumption isn't met, then a special form of the t-test should be used. Look at the column labeled "Sig." under the heading "Levene's Test for Equality of Variances". In this example, the significance (p value) of Levene's test is .184. If this value is less than or equal to your α level for the test (usually .05), then you can reject the null hypothesis that the variability of the two groups is equal, implying that the variances are unequal. If the p value is less than or equal to the α level, then you should use the bottom row of the output (the row labeled "Equal variances not assumed"). If the p value is greater than your α level, then you should use the middle row of the output (the row labeled "Equal variances assumed"). In this example, .184 is larger than α = .05, so we will assume that the variances are equal and we will use the middle row of the output.
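This two-step procedure (Levene's test, then the appropriate t test) can be sketched with SciPy. The seedling counts below are hypothetical placeholders, not the data behind the SPSS output above:

```python
from scipy import stats

# Hypothetical seedling counts under two treatments (placeholder data)
light = [12, 15, 14, 16, 13, 15, 17, 14, 16, 15]
dark  = [10, 11, 13, 12, 11, 12, 10, 13, 11, 12]

# Levene's test first: p > 0.05 means we may assume equal variances
lev_stat, lev_p = stats.levene(light, dark)
equal_var = lev_p > 0.05

# equal_var=False gives Welch's t-test, SciPy's analogue of the
# "Equal variances not assumed" row in the SPSS output
t_stat, p_value = stats.ttest_ind(light, dark, equal_var=equal_var)
print(f"Levene p = {lev_p:.3f} -> equal_var={equal_var}")
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```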
Non-parametric test
Also called distribution-free tests. Non-parametric techniques are ideal for use when: you have data that are measured on nominal (categorical) or ordinal (ranked) scales; you have very small samples; or your data do not meet the stringent assumptions of the parametric techniques (interval data are converted to rank-ordered data, i.e. merely ordered from the lowest to the highest value).
Wilcoxon signed-rank test Wilcoxon-Mann-Whitney (WMW) test Kruskal-Wallis test Friedman's test
1. Wilcoxon signed-rank test
The Ranks table provides some interesting data on the comparison of participants' Before (Pre) and After (Post) Pain Scores. We can see from the table's legend that 11 participants had a higher pre-acupuncture-treatment Pain Score than after their treatment. However, 4 participants had a higher Pain Score after treatment and 10 participants saw no change in their Pain Score.
A Wilcoxon Signed Ranks Test showed that a 4 week, twice weekly acupuncture treatment course did not elicit a statistically significant change in lower back pain in individuals with existing lower back pain (Z = -1.807, P = 0.071). Indeed, median Pain Score rating was 5.0 both pre- and post-treatment.
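SciPy's `wilcoxon` performs the same paired, non-parametric comparison. The pain scores below are hypothetical placeholders, not the study data above:

```python
from scipy import stats

# Hypothetical pre/post treatment pain scores (0-10 scale; placeholder data)
pre  = [7, 5, 8, 6, 5, 7, 4, 6, 5, 8, 6, 7]
post = [5, 5, 6, 6, 4, 5, 4, 5, 5, 6, 6, 5]

# wilcoxon drops zero differences by default (zero_method="wilcox"),
# which mirrors how the Ranks table excludes ties
stat, p_value = stats.wilcoxon(pre, post)
print(f"W = {stat}, p = {p_value:.3f}")
```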
2. Mann-Whitney U test
The above table is very useful as it indicates which group had the highest cholesterol concentration, namely the group with the highest mean rank. In this case, the diet group had the highest cholesterol concentrations.
From these data it can be concluded that there is a statistically significant difference between the exercise and diet treatment groups' median cholesterol concentrations at the end of both treatments (U = 110, P = 0.014). It can further be concluded that the exercise treatment elicited statistically significantly lower cholesterol concentrations than the dietary group (P = 0.016).
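A Mann-Whitney U test of this shape can be sketched with SciPy's `mannwhitneyu`. The cholesterol values below are hypothetical placeholders, not the trial data above:

```python
from scipy import stats

# Hypothetical end-of-trial cholesterol concentrations (mmol/L; placeholder data)
exercise = [4.1, 4.5, 4.3, 4.8, 4.2, 4.6, 4.4, 4.0, 4.7, 4.3]
diet     = [5.0, 5.4, 4.9, 5.2, 5.6, 5.1, 4.8, 5.3, 5.5, 5.0]

# Two-sided test of whether the two groups' distributions differ
u_stat, p_value = stats.mannwhitneyu(exercise, diet, alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.4f}")
```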
3. Kruskal-Wallis Test
The Kruskal-Wallis test is the nonparametric equivalent of the one-way ANOVA. It is an extension of the Mann-Whitney test that allows the comparison of more than two independent groups. It is used when we wish to compare three or more sets of scores that come from different groups. Analyze > Nonparametric Tests > K Independent Samples. E.g. A medical researcher would like to investigate anecdotal evidence with a study. The researcher identifies 3 well-known anti-depressive drugs which might have a positive effect on pain after a 4-week period of prescribing the drugs, and labels them Drug A, Drug B and Drug C.
We can report that there was a statistically significant difference between the different drug treatments (H(2) = 8.520, P = 0.014) with a mean rank of 35.33 for Drug A, 34.83 for Drug B and 21.35 for Drug C.
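SciPy's `kruskal` runs the same test. The pain scores below are hypothetical placeholders, not the values behind the H statistic reported above:

```python
from scipy import stats

# Hypothetical 4-week pain scores under three drugs (placeholder data)
drug_a = [6, 7, 5, 8, 7, 6, 7, 8]
drug_b = [6, 8, 7, 7, 6, 8, 7, 6]
drug_c = [4, 3, 5, 4, 3, 5, 4, 4]

# Kruskal-Wallis H test across the three independent groups
h_stat, p_value = stats.kruskal(drug_a, drug_b, drug_c)
print(f"H(2) = {h_stat:.3f}, p = {p_value:.4f}")
```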
4. Friedman test
The Friedman test is the non-parametric alternative to the one-way ANOVA with repeated measures. It is used to test for differences between groups when the dependent variable being measured is ordinal. It can also be used for continuous data that have violated the assumptions necessary to run the one-way ANOVA with repeated measures, for example marked deviations from normality. Analyze > Nonparametric Tests > K Related Samples. E.g. Each subject runs once listening to no music at all, once listening to classical music and once listening to dance music, in a random order. At the end of each run, subjects are asked to record how hard the running session felt on a scale of 1 to 10, with 1 being easy and 10 extremely hard. A Friedman test is then run to see if there are differences between the music types in perceived effort.
We can see, from our example, that there is an overall statistically significant difference between the mean ranks of the related groups. It is important to note that the Friedman Test is an omnibus test like its parametric alternative - that is, it tells you whether there are overall differences but does not pinpoint which groups in particular differ from each other.
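The running example can be sketched with SciPy's `friedmanchisquare`, which takes one array per related condition. The effort ratings below are hypothetical placeholders:

```python
from scipy import stats

# Hypothetical perceived-effort ratings (1-10) for the same runners under
# three music conditions (placeholder data)
no_music  = [9, 8, 9, 7, 8, 9, 8, 9, 7, 8]
classical = [7, 6, 7, 6, 7, 6, 7, 7, 6, 6]
dance     = [5, 5, 4, 5, 4, 5, 5, 4, 5, 5]

# Friedman test across the three related (repeated-measures) conditions
chi_sq, p_value = stats.friedmanchisquare(no_music, classical, dance)
print(f"chi-square = {chi_sq:.3f}, p = {p_value:.4f}")
```

Like its parametric counterpart, this is an omnibus result; pinpointing which pairs of conditions differ would still require post-hoc pairwise comparisons.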
One-way ANOVA
Assumptions:
- The independent variable consists of two or more categorical independent groups.
- The dependent variable is either interval or ratio (continuous).
- The dependent variable is approximately normally distributed for each category of the independent variable.
- Equality of variances between the independent groups (homogeneity of variances).
- Independence of cases.
At this point, it is important to realise that the one-way ANOVA is an omnibus test statistic and cannot tell you which specific groups were significantly different from each other, only that at least two groups were. To determine which specific groups differed from each other you need to use a post-hoc test. Post-hoc tests are described later. Analyze > Compare Means > One-Way ANOVA > Post Hoc > Continue. E.g. There are 3 packages: a beginner, an intermediate and an advanced course for a particular spreadsheet program, with 10 persons on each course. When they all returned from the training they were given a problem to solve using the spreadsheet program, and how long it took them to complete the problem was timed. Compare the three courses (beginner, intermediate, advanced) to see if there are any differences in the average time it took to complete the problem.
The descriptives table (see above) provides some very useful descriptive statistics including the mean, standard deviation and 95% confidence intervals for the dependent variable (Time) for each separate group (Beginners, Intermediate & Advanced) as well as when all groups are combined (Total).
One of the assumptions of the one-way ANOVA is that the variances of the groups you are comparing are similar. The table Test of Homogeneity of Variances (see below) shows the result of Levene's Test of Homogeneity of Variance, which tests for similar variances. If the significance value is greater than 0.05 (found in the Sig. column) then you have homogeneity of variances. We can see from this example that Levene's F Statistic has a significance value of 0.901 and, therefore, the assumption of homogeneity of variance is met. What if the Levene's F statistic was significant? This would mean that you do not have similar variances and you will need to refer to the Robust Tests of Equality of Means Table instead of the ANOVA Table.
The significance level is 0.021 (P = .021), which is below 0.05 and, therefore, there is a statistically significant difference in the mean length of time to complete the spreadsheet problem between the different courses taken. This is great to know but we do not know which of the specific groups differed.
If there was a violation of the assumption of homogeneity of variances, we could still determine whether there were significant differences between the groups by using the Welch test instead of the traditional ANOVA. As with the ANOVA test, if the significance value is less than 0.05 then there are statistically significant differences between groups.
From the results so far we know that there are significant differences between the groups as a whole. The table below, Multiple Comparisons, shows which groups differed from each other. The Tukey post-hoc test is generally the preferred test for conducting post-hoc tests on a one-way ANOVA, but there are many others. We can see from the table below that there is a significant difference in time to complete the problem between the group that took the beginner course and the intermediate course (P = 0.046), as well as between the beginner course and advanced course (P = 0.034). However, there were no differences between the groups that took the intermediate and advanced course (P = 0.989). There was a statistically significant difference between groups as determined by one-way ANOVA (F(2,27) = 4.467, p = .021). A Tukey post-hoc test revealed that the time to complete the problem was statistically significantly lower after taking the intermediate (23.6 ± 3.3 min, P = .046) and advanced (23.4 ± 3.2 min, P = .034) course compared to the beginner course (27.2 ± 3.0 min). There were no statistically significant differences between the intermediate and advanced groups (P = .989).
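The ANOVA-plus-Tukey workflow can be sketched with SciPy (`tukey_hsd` requires SciPy >= 1.8). The completion times below are hypothetical placeholders, not the values behind the SPSS tables above:

```python
from scipy import stats

# Hypothetical completion times (minutes) for the three courses (placeholder data)
beginner     = [28, 26, 29, 27, 25, 30, 26, 28, 27, 26]
intermediate = [23, 24, 22, 25, 23, 24, 22, 23, 25, 24]
advanced     = [22, 24, 23, 22, 25, 23, 24, 22, 23, 24]

# Omnibus one-way ANOVA across the three independent groups
f_stat, p_value = stats.f_oneway(beginner, intermediate, advanced)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")

# Tukey HSD post-hoc test: pairwise comparisons with family-wise error control
res = stats.tukey_hsd(beginner, intermediate, advanced)
print(res)
```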
1. Pearson correlation
There are numerous methods for calculating correlations: the parametric Pearson correlation (the r value) and the non-parametric Spearman correlation. The Pearson product-moment correlation coefficient is a measure of the strength and direction of association that exists between two variables measured on at least an interval scale. Pearson correlation coefficients (r) can take on only values from -1 to +1. The sign in front indicates whether there is a positive correlation (as one variable increases, so too does the other) or a negative correlation (as one variable increases, the other decreases). The size of the absolute value (ignoring the sign) provides an indication of the strength of the relationship. At the other extreme, an r of zero implies an absence of correlation: there is no relationship between the two variables.
Correlation does not mean causation. There are 2 reasons why we cannot make causal statements: (1) we don't know the direction of the cause (does X cause Y or does Y cause X?); (2) a third variable may be involved that is responsible for the covariance between X and Y. Correlation indicates whether there is a relationship between 2 variables, but not what causes the relationship or what the relationship means. Analyze > Correlate > Bivariate... E.g. A researcher wishes to know whether a person's height is related to how well they perform in a long jump. The researcher recruits untrained individuals from the general population, measures their height and gets them to perform a long jump. They then investigate whether there is an association between height and long jump performance.
The table presents the Pearson correlation coefficient, the significance value and the sample size that the calculation is based on. In this example we can see that the Pearson correlation coefficient, r, is 0.777 and that this is statistically significant (P < 0.0005).
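SciPy's `pearsonr` returns the same pair of numbers (r and its significance). The height and long-jump values below are hypothetical placeholders, not the study data above:

```python
from scipy import stats

# Hypothetical height (cm) and long-jump distance (m) pairs (placeholder data)
height    = [165, 170, 172, 168, 180, 175, 178, 185, 160, 182]
long_jump = [3.1, 3.4, 3.5, 3.2, 3.9, 3.6, 3.8, 4.1, 2.9, 4.0]

# Pearson correlation coefficient and its two-sided p value
r, p_value = stats.pearsonr(height, long_jump)
print(f"r = {r:.3f}, p = {p_value:.4f}")
```

`stats.spearmanr(height, long_jump)` would give the non-parametric alternative mentioned above.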
The correlation coefficient may be understood in various ways: scatter plots; the slope of the regression line of z-scores.
For correlation it does not make any difference which variable goes on the x-axis and which goes on the y-axis. However, for linear regression the predictor variable goes on the x-axis and the variable being predicted goes on the y-axis. If knowing the value of the variable on the x-axis gives a strong ability to predict the value of the variable on the y-axis, then the points should fall near the regression line, depending on the accuracy of the prediction.
2. Regression
Regression predicts an outcome from one or more variables. Simple regression examines the outcome from a single variable; multiple regression combines several variables to predict an outcome. It involves constructing a single statistical model consisting of a straight line: the line of best fit between the variables. This is achieved by considering the differences between the data points and this line of best fit, which are known as residuals. The variable we are using to predict the other variable's value is called the independent variable (predictor variable). The variable we wish to predict is called the dependent variable (outcome variable). Analyze > Regression > Linear... E.g. A salesman for a large car brand is interested in determining whether there is a relationship between an individual's income and the price they pay for a car. They will use this information to determine which cars to offer potential customers in new areas where average income is known.
The first table of interest is the Model Summary table. This table provides the R and R2 value. The R value is 0.873, which represents the simple correlation and, therefore, indicates a high degree of correlation. The R2 value indicates how much of the dependent variable, price, can be explained by the independent variable, income. In this case, 76.2% can be explained, which is very large.
The next table is the ANOVA table. This table indicates that the regression model predicts the outcome variable significantly well. How do we know this? Look at the "Regression" row and go to the Sig. column. This indicates the statistical significance of the regression model that was applied. Here, P < 0.0005 which is less than 0.05 and indicates that, overall, the model applied is significantly good enough in predicting the outcome variable.
The table below, Coefficients, provides us with information on each predictor variable. This provides us with the information necessary to predict price from income. We can see that both the constant and income contribute significantly to the model (by looking at the Sig. column). By looking at the B column under the Unstandardized Coefficients column we can present the regression equation as: Price = 8287 + 0.564(Income)
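A simple linear regression of this form can be sketched with SciPy's `linregress`, which returns the intercept, slope, r value and significance in one call. The income/price pairs below are hypothetical placeholders, not the figures behind the SPSS output above:

```python
from scipy import stats

# Hypothetical income vs. car-price pairs (same currency units; placeholder data)
income = [25_000, 32_000, 40_000, 28_000, 55_000, 47_000, 60_000, 35_000]
price  = [16_000, 21_000, 27_000, 18_000, 38_000, 31_000, 42_000, 23_000]

# Fit price as a linear function of income
res = stats.linregress(income, price)
print(f"Price = {res.intercept:.0f} + {res.slope:.3f} * Income")
print(f"r = {res.rvalue:.3f}, R^2 = {res.rvalue**2:.3f}, p = {res.pvalue:.5f}")
```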
Correlation or regression
Regression and correlation are similar and easily confused. In some situations it makes sense to perform both calculations.
Calculate correlation if:
- You measured both X and Y in each subject and wish to quantify how well they are associated. Calculate the Pearson (parametric) correlation coefficient if you can assume that both X and Y are sampled from normally-distributed populations; otherwise calculate the Spearman (nonparametric) correlation coefficient.
- Don't calculate a correlation coefficient if you manipulated the X variable (e.g. in an experiment).
Calculate regression only if:
- One of the variables (X) is likely to precede or cause the other variable (Y).
- Choose linear regression if you manipulated the X variable, e.g. in an experiment.
It makes a difference which variable is called X and which is called Y, as linear regression calculations are not symmetrical with respect to X and Y: if you swap the two variables, you will obtain a different regression line. In contrast, correlation calculations are symmetrical with respect to X and Y: if you swap the labels X and Y, you will still get the same correlation coefficient. Correlation gives an estimate of the degree of association between the variables. Regression attempts to describe the dependence of a variable on one (or more) explanatory variables; it implicitly assumes that there is a one-way causal effect from the explanatory variable(s) to the response variable, regardless of whether the path of effect is direct or indirect.
Consider samples of the leg length and skull size from a population of elephants. It would be reasonable to suggest that these two variables are associated in some way, as elephants with short legs tend to have small heads and elephants with long legs tend to have big heads. We might demonstrate an association exists by performing a correlation analysis. However, would regression be an appropriate tool to describe a relationship between head size and leg length? Does an increase in skull size cause an increase in leg length? Does a decrease in leg length cause the skull to shrink? It is meaningless to apply a causal regression analysis to these variables as they are interdependent and one is not wholly dependent on the other, but more likely some other factor that affects them both (e.g. food supply, genetic makeup). On the other hand, consider these two variables: crop yield and temperature. They are measured independently, one by a weather station thermometer and the other by Farmer Giles' scales. While correlation analysis might show a high degree of association between these two variables, regression analysis might also be able to demonstrate the dependence of crop yield on temperature. However, careless use of regression analysis would also demonstrate that temperature is dependent on crop yield: this would suggest that if you grow really big crops you'll be guaranteed a hot summer! Dumb, or what?
The above table allows us to understand that both males and females prefer to learn using online materials vs. books.
When reading this table we are interested in the results of the continuity correction. We can see here that chi-square(1) = 0.487, P = 0.485. This tells us that there is no statistically significant association between Gender and Preferred Learning Medium; that is, males and females equally prefer online learning vs. books.
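A chi-square test of independence on a 2x2 table like this can be sketched with SciPy's `chi2_contingency`, which applies the Yates continuity correction to 2x2 tables by default. The counts below are hypothetical placeholders, not the crosstab behind the output above:

```python
from scipy import stats

# Hypothetical Gender x Preferred Learning Medium counts (placeholder data)
#                 online  books
observed = [[21, 9],    # males
            [25, 15]]   # females

# correction=True (the default) applies the Yates continuity correction
chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"chi-square({dof}) = {chi2:.3f}, p = {p_value:.3f}")
```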