
Statistics Notes

Introduction

POPULATION - Complete set of events in which you are interested.

The total set of individual objects or persons of interest in a study.

SAMPLE - Set of actual observations. A subset of the population that is actually observed.

DESCRIPTIVE STATISTICS - Used to summarize sample data. Concerned with techniques that are used to describe or characterize the obtained data. An example would be a statement about the average length of time it takes a mouse to lick its paw when placed on a warm surface, or the average time it takes a morphine-injected mouse to do the same thing. In all cases, descriptive statistics merely describe what the data have to say about some phenomenon.

INFERENTIAL STATISTICS - Allow us to make generalizations from samples to populations. Based on samples, we make inferences about parameters. Involves techniques that use the obtained sample data to infer to populations. Using data collected from a small sample to infer something about the larger population; the danger here is that the sample used must be large enough to account for the variability in the larger population.

PARAMETERS - The characteristics of the population about which we make inferences using the sample data.

STATISTICS - Numerical values summarizing sample data. The corresponding characteristics of the sample data, upon which we base our inferences about the parameters.

---> Based on sample statistics, we make inferences about population parameters.

Population (Greek letters): parameters are constants
μ (mu) - population mean
ρ (rho) - population correlation
σ (sigma) - population standard deviation
σ² - population variance

Sample (Roman letters): statistics are variables
x̄ (x-bar) - sample mean
r - sample correlation
s - sample standard deviation
s² - sample variance

Statistics are only one piece of the research process (and a small piece at that!). Also, it is very easy to use statistics improperly ... be careful of the techniques you use (and critical of what you read).

Chapter 2: The Measurement Process

OPERATIONALIZATION - Carefully and specifically defining the variables (or characteristics) you are interested in studying in a way that is measurable and repeatable.

VARIABLE - A characteristic that can take on more than one value among members of a sample or population. (e.g., sample statistics)
Discrete Variable - a variable that can take on any of a small set of values (ex. - number of fingers on a hand).
Continuous Variable - a variable that can take on any value (ex. - height).
Independent Variable - a variable that is controlled or manipulated in order to measure its effect on the dependent variable.
Dependent Variable - the variable that is being measured; "the effect of the IV on the DV."

CONSTANT - A characteristic that does not change in a given situation. (e.g., population parameters)

4 Properties of Measurements (Cumulative)
1. Distinctiveness - Objects that are different get different scores.
2. Magnitude/Ordinality - The natural ordering of the numbers reflects the natural ordering of the trait being measured.
3. Equal Interval Size - The unit of measurement is the same size everywhere on the scale.
4. Absolute Zero - Assigning the score of zero means an absence of the trait being measured.

Levels of Measurement vs. Properties of Measurement:

Level      Distinctiveness   Magnitude/Ordinality   Equal Interval Size   Absolute Zero
Nominal    YES               NO                     NO                    NO
Ordinal    YES               YES                    NO                    NO
Interval   YES               YES                    YES                   NO
Ratio      YES               YES                    YES                   YES

Nominal - Numbers on a football team Ordinal - First, second, third place Interval - Temperature Ratio - Counts, Measurements *** Most attitude scales are ordinal

PROBABILITY SAMPLING TECHNIQUES - Probability sampling techniques allow us to specify the likelihood that any particular member of the population will be selected for the sample. This allows statistical inference.
Simple Random Sample - A technique whereby each member of the population of interest has a (theoretically) equal chance of being included in the sample.
Systematic Random Sample - Let k denote the ratio given by k = N/n, where n is the desired sample size. A systematic random sample is a technique whereby a member is chosen at random out of the first k names in the sampling frame, and then every kth member listed after that one is chosen.
Stratified Random Sample - Obtained by dividing the population into separate groups, called strata, and then taking a random sample from each stratum. Age, income, etc.
Cluster Sample - Obtained by first dividing the population into large groups, called clusters, and then randomly selecting from each cluster. Geographic, etc.

NONPROBABILITY SAMPLING TECHNIQUES - Techniques for which it is not possible to specify the likelihood of choosing a particular member of the population for the sample; therefore, statistical inference is not applicable.
Quota Sampling - Nonrandom sampling until a predetermined criterion is reached.
Volunteer Sampling - Convenience sampling; nonsystematic sampling of those who are willing to participate (volunteers).

SAMPLING ERROR - Refers to the error that is made when a statistic based on a sample is used to estimate or predict the value of a population parameter.

Descriptive Techniques

3.1 - Tabular and Graphical Description

FREQUENCY DISTRIBUTION - A listing of categories of possible values for a variable, together with a tabulation of the number of observations in each category. A distribution in which the values of the dependent variable are tabled or plotted against their frequency of occurrence. Should contain the following properties:
Mutually Exclusive: Each subject can be classified into one and only one interval (category).
Exhaustive: Set up to include all possible values of the variable in question; every subject can be classified.

RELATIVE FREQUENCY - A proportion between 0 and 1 that represents the proportion of the total set of observations that is in that category. Computed by dividing the number of observations in a category by the total number of observations in the entire distribution.

HISTOGRAM - Graph in which rectangles are used to represent frequencies of observations within each interval. The height of a bar over a particular category represents the relative number of observations in that category. Put rectangles over the real limits of score intervals so that the height of the rectangle tells the frequency of that interval.

STEM-AND-LEAF DIAGRAMS (DISPLAY) - Graphical display representing original data arranged into a histogram.
STEM - The first digit of a two-digit number. Vertical axis of display containing the leading digits.
LEAVES - The second digit of a two-digit number. Horizontal axis of display containing the trailing digits.

Relationship between the SAMPLE DISTRIBUTION and the POPULATION DISTRIBUTION: As the sample size increases, the sample distribution more closely approximates the population distribution. A blurry picture that slowly gets clearer as sample size increases.

3.2 - Measures of Central Tendency - Numerical values referring to the center of the distribution. Helps to answer the question: Where does the distribution tend to center?

SAMPLE MEAN (x̄) - The sum of the sample measurements divided by the number of scores (or sample size). The average.
Properties -
1. Appropriate only for data measured on at least an interval scale.
2. The mean is the balance point or center of gravity of the observations.
3. Several means can be combined using weighted averages to obtain an overall mean: (n₁x̄₁ + n₂x̄₂) / (n₁ + n₂)

SAMPLE MEDIAN (Med) - The measurement that falls in the middle when the sample measurements are ordered according to their magnitudes. May be an actual observed score (odd number of observations), or the average of two observed scores (even number of observations). The score corresponding to the point having 50% of the observations below it when observations are arranged in numerical order. The 50th percentile point.
Properties -
1. Appropriate for data measured on at least an ordinal scale.
2. For symmetric distributions, the mean and median are identical.
3. For skewed distributions, the mean lies toward the direction of the skew relative to the median.
4. The median is insensitive to extreme scores and to the distances of the measurements from the middle measurement. Therefore, the median does not use all available information, and so is not as useful as the mean for some inferential purposes; however, it may be more appropriate for highly skewed distributions, since it is relatively unaffected by outliers.

MODE (Mo) - The value that occurs most frequently in the sample. The most commonly occurring score. The midpoint in the most frequently occurring interval.
Properties -
1. Appropriate for all levels of measurement.
2. Useful for distributions that have more than one hump or distinct mound.
3. Mean, median, and mode are identical for a unimodal, symmetrical distribution.
4. Although not used as often as the mean or median, the mode may be especially appropriate under certain circumstances (bimodal, trimodal, etc., distributions; detecting two or more distinct subpopulations within a sample).

Modal Category - The category or interval with the highest frequency.
Modality - The term referring to the number of major peaks in a distribution. The number of "humps."
Bimodal (Trimodal) - A distribution having two (three) distinct peaks.

PERCENTILE - The point below which a specified percentage of the observations fall.
PERCENTILE POINT - The score below which a given percentage (%) of people fall.
PERCENTILE RANK - The percentage of people falling below some score.
QUARTILES - Those points that cut off the bottom and top quarter of a distribution. The 25th percentile is called the lower quartile. The 75th percentile is called the upper quartile.
Interquartile Range - The range of the middle 50% of the observations.
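A minimal Python sketch of the central tendency and percentile measures defined above, using only the standard library; the small data set is made up for illustration:

import statistics

scores = [3, 5, 5, 6, 7, 8, 8, 8, 10, 12]   # hypothetical sample data

mean = statistics.mean(scores)       # balance point of the observations
median = statistics.median(scores)   # 50th percentile point
mode = statistics.mode(scores)       # most frequently occurring score

# statistics.quantiles with n=4 returns the three cut points Q1, Q2, Q3
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1                        # range of the middle 50% of observations

print(f"mean={mean}, median={median}, mode={mode}")
print(f"lower quartile={q1}, upper quartile={q3}, IQR={iqr}")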

3.3 Measures of Dispersion

SAMPLE RANGE - The difference between the largest and smallest measurements in a sample.

SAMPLE VARIANCE (s²) - Sum of the squared deviations about the mean divided by n - 1.
s² = Σ(xᵢ - x̄)² / (n - 1)

DEVIATION - The difference between a single observed score and the sample mean.

SAMPLE STANDARD DEVIATION (s or σ̂) - The positive square root of the variance.
s = [Σ(xᵢ - x̄)² / (n - 1)]^(1/2)

EMPIRICAL RULE - If the histogram of a collection of measurements is approximately bell-shaped, then:
1. Approximately 68% of the observations are within one SD of the mean.
2. Approximately 95% of the observations are within two SDs of the mean.
3. All or nearly all (approximately 99%) of the observations are within three SDs of the mean.
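A short Python sketch checking the empirical rule on simulated, roughly bell-shaped data (the normal population values 100 and 15 are arbitrary choices for illustration):

import random
import statistics

random.seed(1)
data = [random.gauss(100, 15) for _ in range(10_000)]   # roughly bell-shaped sample

xbar = statistics.mean(data)
s = statistics.stdev(data)   # uses the n - 1 denominator, as defined above

for k in (1, 2, 3):
    within = sum(1 for x in data if abs(x - xbar) <= k * s) / len(data)
    print(f"within {k} SD: {within:.3f}")   # expect roughly .68, .95, .997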

NORMAL DISTRIBUTION - specific distribution having a characteristic bell-shaped form.

Probability Distributions

4.1 Probability Distributions for Discrete and Continuous Variables

PROBABILITY - The probability of a particular outcome in a random observation on a variable is the proportion of times that outcome would occur in the long run in repeated random sampling on the variable.

The probability distribution of a discrete variable is such that a probability is assigned to each possible value of the variable. Each of the probabilities is a number between 0 and 1, and the sum of the probabilities of all possible values equals 1. - Give example using dice

The probability distribution of a continuous variable is one in which probabilities can be assigned to intervals of numbers. The probability that the variable falls in any particular interval is between 0 and 1, and the probability assigned to the interval containing all the possible values equals 1. - Give example using income

4.2 The Normal Probability Distribution

NORMAL DISTRIBUTION - A specific type of distribution having a characteristic bell-shaped form. The family of normal distributions is specified by a collection of symmetric bell-shaped curves, each characterized by the value of its mean μ (any real number) and its standard deviation σ (any positive number). The family has the property that for each fixed number z, the probability concentrated to the right of μ + zσ is the same for all normal distributions.

STANDARD NORMAL DISTRIBUTION - The standard normal distribution is the normal distribution with parameters μ = 0 and σ = 1. A normal distribution with a mean equal to 0 (zero) and variance (and standard deviation) equal to 1 (one). Denoted N(0,1).
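The notes call for a dice example of a discrete probability distribution; the Python sketch below builds that distribution and, for the standard normal distribution just described, evaluates a few probabilities with math.erf (the only assumption is the choice of two fair dice):

import math
from collections import Counter
from fractions import Fraction

# Discrete example: probability distribution of the sum of two fair dice
outcomes = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]
counts = Counter(d1 + d2 for d1, d2 in outcomes)
dist = {value: Fraction(n, 36) for value, n in sorted(counts.items())}
print("P(sum = 7) =", dist[7])
print("total probability:", sum(dist.values()))   # equals 1

# Standard normal N(0,1): probability within z standard deviations of the mean
def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

for z in (1.0, 1.96, 2.58):
    print(f"P(-{z} < Z < {z}) = {normal_cdf(z) - normal_cdf(-z):.3f}")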

Z-SCORES - The z-score of a score Y is the number of standard deviations that Y is from μ. - Contrast normal distribution and standard normal distribution.

4.3 Sampling Distributions

SAMPLING DISTRIBUTION - A sampling distribution is a probability distribution that specifies the probabilities of the possible values of a sample statistic, or the distribution of a statistic over repeated sampling from a specified population. For example, the sampling distribution of the mean is the distribution of sample means over repeated sampling from one population.
* A theoretical frequency distribution of the scores for, or values of, a variable (or a statistic such as a mean). Any statistic that can be computed on a sample has a sampling distribution. A sampling distribution is constructed by assuming that an infinite number of samples of a given size have been drawn from a particular population and that their distributions have been recorded. Then the statistic, such as the mean, is computed for the scores of each of these hypothetical samples; then this infinite number of statistics is arranged in a distribution to arrive at the sampling distribution. The sampling distribution is compared with the actual sample statistic to determine if that statistic is or is not likely to be the way it is due to chance. (Note: Be sure not to confuse the sampling distribution with a sample distribution - the distribution of scores on a variable in a sample.)

It is hard to overestimate the importance of sampling distributions of statistics. The entire process of inferential statistics (by which we move from known information about samples to inferences about populations) depends on sampling distributions. We use sampling distributions to calculate the probability that sample statistics could be due to chance and thus to decide whether something that is true of a sample statistic is also likely to be true of a population parameter.

4.4 The Central Limit Theorem - Consider a random sample of n measurements from a population distribution having mean μ and standard deviation σ. Then, if n is sufficiently large, the sampling distribution of Ȳ is approximately a normal distribution with mean μ and standard error σ_Ȳ = σ/√n.

Review of Population, Sample, and Sampling Distributions

At this point, it may be instructive to review the three types of distributions among which it is necessary to distinguish in the remainder of this text--the population distribution, the distribution of the sample, and the sampling distribution.

1. The population distribution: This is the distribution from which we select the sample. This distribution is usually unknown in practice, so we formulate inferences about certain characteristics of it, such as the parameters μ and σ if the variable is at the interval level.

2. The distribution of the sample: This is the distribution of the measurements that we actually observe--that is, the sample observations Y1, Y2, ..., Yn. The sample distribution may be graphically displayed in the form of a histogram of the observations, or numerically described by such sample statistics as Ȳ, the median, the sample standard deviation s, and so forth. The larger the sample size n, the closer this distribution should resemble the population distribution, and the closer the sample statistics (such as Ȳ) should be to the corresponding parameters of the population (such as μ).

3. The sampling distribution of a statistic: This is the probability distribution of some sample statistic, such as Ȳ. A sampling distribution describes how much variability there will tend to be in the value of a statistic among samples of a certain size. - Give examples
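A minimal Python simulation of the Central Limit Theorem: repeated samples are drawn from a skewed (exponential) population, and the sampling distribution of the mean comes out approximately normal with mean μ and standard error σ/√n. The population and sample sizes are arbitrary choices for illustration.

import random
import statistics

random.seed(2)
mu = sigma = 10.0    # an exponential population with mean 10 also has sigma = 10
n = 50               # size of each random sample
n_samples = 5_000    # number of repeated samples

sample_means = [
    statistics.mean(random.expovariate(1 / mu) for _ in range(n))
    for _ in range(n_samples)
]

print("mean of sample means:", round(statistics.mean(sample_means), 3))    # ~ mu
print("SD of sample means:  ", round(statistics.stdev(sample_means), 3))   # ~ sigma / sqrt(n)
print("theoretical SE:      ", round(sigma / n ** 0.5, 3))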

Statistical Inference: Estimation

5.1 Point Estimation

POINT ESTIMATE - A point estimate of a parameter is a sample statistic that is used to predict the value of that parameter. The specific value taken as the estimate of a parameter.
Unbiased: An estimator is unbiased if its sampling distribution is centered around the parameter (it has the parameter as its mean). That is, if we took repeated samples of some fixed size n and used an unbiased estimator, then the average of these estimates would equal the parameter value.
Efficient: An estimator is efficient if it tends to be relatively close to the population parameter it estimates. That is, it has a small standard error relative to other estimators of the parameter.
---> The most common point estimate of the population mean (μ) is the sample mean (Ȳ).

Point Estimates of the Mean and Standard Deviation - The sample mean Ȳ is the most intuitively appealing estimate of the population mean μ, since the formulas for the two measures are similar. In fact, Ȳ is unbiased and generally quite efficient. We use the symbol ^ (a "hat") over a parameter to represent an estimate of that parameter.
Example: μ̂ = Ȳ is an estimate of the population mean μ
σ̂ = s denotes an estimate of the population standard deviation σ

5.2 Confidence Interval for a Mean Large Samples An interval estimate for a parameter is customarily referred to as a confidence interval. CONFIDENCE INTERVAL; CONFIDENCE COEFFICIENT - A confidence interval for a parameter is an interval of numbers within which the value of the parameter is believed to lie. An interval, with limits at either end, with a specified probability of including the parameter being estimated. The likelihood that the interval contains the true value of the parameter is called the confidence coefficient. A common approach for constructing a confidence interval consists of taking the point estimate of the parameter and adding and subtracting some multiple (such as a z-value) of the standard error of that estimate. A confidence interval is a range of scores. A confidence coefficient is a percentage.

LARGE SAMPLE CONFIDENCE INTERVAL FOR μ - A large-sample confidence interval for μ is:

Ȳ ± z σ̂_Ȳ = Ȳ ± z(σ̂/√n)

Where:
Ȳ = sample mean
z = z-score
σ̂_Ȳ = standard error of the sample mean
σ̂ = estimate of the population standard deviation (sample standard deviation)
n = sample size

The z-value is chosen so that the probability concentrated under a normal curve within z standard deviations of the mean equals the confidence coefficient.

Z-Value    Confidence Interval
1.96       95%
2.33       98%
2.58       99%
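A minimal Python sketch of the large-sample confidence interval formula above; the sample values are hypothetical:

import statistics

sample = [72, 68, 75, 80, 66, 71, 77, 74, 69, 73,
          70, 76, 78, 65, 72, 79, 71, 74, 68, 75,
          73, 70, 77, 72, 69, 76, 74, 71, 78, 70]   # n = 30 hypothetical scores

n = len(sample)
ybar = statistics.mean(sample)
s = statistics.stdev(sample)    # estimate of the population SD
se = s / n ** 0.5               # estimated standard error of the mean

z = 1.96                        # 95% confidence
lower, upper = ybar - z * se, ybar + z * se
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")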

The width of a confidence interval: 1. Increases as the confidence coefficient increases 2. Decreases as the sample size increases Properties of the Confidence Interval for a Mean 1. The greater the desired confidence, the wider the confidence interval

2. The width of the confidence interval is the difference between the upper endpoint Ȳ + z(σ̂/√n) and the lower endpoint Ȳ - z(σ̂/√n). This width equals 2z(σ̂/√n), which is inversely proportional to the square root of the sample size. The larger n is, the narrower is the width of the interval.

5.3 Confidence Interval for a Proportion Large Samples

Point and Interval Estimation for a Proportion - Let π denote the parameter representing the proportion of the defined population classified in some specific category. The formula for the standard error of the sample proportion π̂ is:

σ̂_π̂ = √[π̂(1 - π̂) / n]

LARGE SAMPLE CONFIDENCE INTERVAL FOR π
A large-sample 100(1 - α)% confidence interval for the population proportion π is:

π̂ ± z_α/2 σ̂_π̂, which equals: π̂ ± z_α/2 √[π̂(1 - π̂) / n]

where π̂ is the sample proportion (there must be more than 5 observations both in the category and not in it).

5.4 Choice of Sample Size

Sample Size for Estimating Proportions - We must decide on the degree of precision desired in our estimate. Second, a decision must be made regarding the probability with which the specified amount of error will not be exceeded.

SAMPLE SIZE REQUIRED FOR ESTIMATING π
The sample size n needed to ensure that, with probability at least 1 - α, the error of estimation of π by π̂ is no greater than B, is:

n = .25(z_α/2 / B)², where:
n = sample size
.25 = the value of π(1 - π) when π = .5; note: this is a safe (conservative) value, since π(1 - π) can never exceed .25
z_α/2 = the z-value, or number of standard errors
B = the bound on the error

SAMPLE SIZE REQUIRED FOR ESTIMATING μ
The sample size n needed to ensure that, with probability at least 1 - α, the error of estimation of μ by Ȳ is no greater than B, is:

n = σ²(z_α/2 / B)², where:
n = sample size
σ² = the population variance
z_α/2 = the z-value, or number of standard errors
B = the bound on the error

Other Considerations in Determining Sample Size

We've talked about:
Precision corresponds to the width of a confidence interval.
Confidence refers to the probability that the interval will contain the estimated parameter.

We also need to consider:
Degree of Variability for the variables being measured. Generally, the more heterogeneous the population, the larger the sample needs to be.
Amount and Type of Analyses. Usually, the more complex the analyses (that is, the more variables analyzed simultaneously), the larger the sample needed to make the results meaningful.
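A minimal Python sketch of the proportion confidence interval and the two sample-size formulas from sections 5.3-5.4; the survey numbers, error bounds, and assumed population SD are hypothetical:

import math

# Suppose 240 of 600 hypothetical respondents favor a proposal
n, favor = 600, 240
p_hat = favor / n
se = math.sqrt(p_hat * (1 - p_hat) / n)

z = 1.96   # 95% confidence
print(f"95% CI for pi: ({p_hat - z*se:.3f}, {p_hat + z*se:.3f})")

# Sample size needed to estimate pi within B = .03 with 95% confidence,
# using the conservative value .25 for pi(1 - pi)
B = 0.03
n_prop = 0.25 * (z / B) ** 2
print("n for proportion:", math.ceil(n_prop))    # about 1,068

# Sample size needed to estimate mu within B = 2 units with 95% confidence,
# assuming (hypothetically) the population SD is about 15
sigma, B_mean = 15, 2
n_mean = sigma ** 2 * (z / B_mean) ** 2
print("n for mean:", math.ceil(n_mean))          # about 217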

Statistical Inference: Testing and Hypotheses

HYPOTHESIS - A hypothesis is a prediction about some aspect of a variable or a collection of variables. Hypotheses are derived from theory and they serve as guides to research. When a hypothesis can be stated in terms of one or more parameters of the appropriate population distribution(s), statistical methods can be used to test its validity.

6.1 Elements of a Statistical Test - There are five basic elements of statistical tests of hypotheses about a parameter: Assumptions, Hypotheses, Test Statistic, P-value, and Conclusion.

1. Assumptions - All statistical tests are based on assumptions that must be met in order for the test to be valid. Among them:
A. The assumed scale of measurement of the variable: Each test is specifically designed for a certain level of measurement.
B. The form of the population distribution: For many tests, the variable must be continuous, or even normally distributed.
C. The method of sampling: The formulas for most tests require simple random sampling.
D. The sample size: Many tests rely on results similar to the Central Limit Theorem and require a certain minimum sample size in order to be valid.

2. Hypotheses - A statistical test focuses on two hypotheses about the value of a parameter.
Null Hypothesis (Ho) - The statistical hypothesis tested by the statistical procedure; usually a hypothesis of no difference or no relationship.
Alternative Hypothesis (Ha) - The hypothesis that is adopted when Ho is rejected; usually the same as the research hypothesis. It consists of an alternative set of parameter values to those given in the null hypothesis.
Rejection Region - The set of outcomes of an experiment that will lead to rejection of Ho.

3. Test Statistic - The test statistic typically involves a point estimate of the parameter about which the hypotheses are made. Knowledge of the sampling distribution allows us to calculate the probability that specific values of the statistic would occur if the null hypothesis were actually true.

4. P-value - The P-value is the probability, when Ho is true, of getting a test statistic value at least as favorable to Ha as the value actually observed. The P-value is used as a measure of the weight of evidence supporting the null hypothesis; moderate to large values indicate that the data are consistent with Ho.

5. Conclusion - The researcher should routinely report the P-value, so that others can judge the extent of the evidence against Ho. In practice, researchers generally require very small P-values, such as P < .05, in order to conclude that the data contain sufficient evidence to reject Ho.

In summary, the Elements of a Statistical Test:
1. Assumptions: Measurement scale, population, sample.
2. Hypotheses: Null hypothesis (Ho); Alternative hypothesis (Ha).
3. Test Statistic
4. P-value: Weight of evidence supporting Ho.
5. Conclusion: Report P-value; formal decision (reject or don't reject Ho) optional.

6.2 Test for a Mean Large Samples

1. Assumptions - A random sample of size n ≥ 30 is selected. Measurements are obtained on a variable measured on at least an interval scale.

2. Hypotheses -
Ho: μ = μ0
Ha: μ > μ0; Ha: μ < μ0; or Ha: μ ≠ μ0

3. Test Statistic -

z = (Ȳ - μ0) / σ̂_Ȳ

4. P-value - The probability of obtaining a z-score more extreme than the observed z-score (e.g., above it on the real number line for Ha: μ > μ0). Usually .05 is set as the critical P-value for rejection of Ho.

5. Conclusion - The smaller P is, the more evidence there is against Ho and in favor of Ha.

6.3 Test for a Proportion Large Samples

Assumptions - The size of the random sample must be large enough so that the sampling distribution of π̂ is approximately normal. The normal approximation for the sampling distribution of π̂ is reasonably good when:

n > 5 / [min(π0, 1 - π0)]

where the notation min(π0, 1 - π0) denotes the minimum of the numbers π0 and 1 - π0. For example, for testing Ho: π = .5, we need n > 5 / .5 = 10, whereas for testing Ho: π = .9 (or Ho: π = .1) we need n > 5 / .1 = 50. This sample size requirement reflects the fact that the sampling distribution of π̂ tends to be more skewed when π is near 0 or near 1, and larger sample sizes are then needed before a symmetric bell shape is achieved.

Hypotheses -
Ho: π = π0
Ha: π > π0; Ha: π < π0; or Ha: π ≠ π0

Test Statistic -
z = (π̂ - π0) / σ_π̂

P-value - Determined just as in the test for a mean.

Conclusion - The smaller P is, the more evidence there is against Ho and in favor of Ha.

6.4 Small-Sample Inference for a Mean

The t Distribution - For small sample sizes, the sampling distribution of Ȳ might not be normal, and σ̂_Ȳ might not be very close to σ_Ȳ. In this case, we use the t distribution.

Properties of the t probability distribution:
1. The t distribution is symmetric about 0. This property is analogous to the property that the sampling distribution of the z-statistic (the standard normal distribution) is also symmetric about 0.
2. The dispersion of the t distribution depends on the degrees of freedom. The standard deviation of the t distribution always exceeds 1, but decreases to 1 as df (and hence n) increases without limit.
3. The t distribution is mound-shaped about 0, but it has more probability concentrated in the tails than does the standard normal distribution. The larger the value of df, though, the more closely it resembles the standard normal distribution. In the limit as df increases indefinitely, the two distributions are identical.
4. Since the t distribution has a slightly different shape for each distinct value of df, a different set of t values is needed for each df value.
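Before turning to the small-sample procedure, here is a minimal Python sketch of the two large-sample tests from 6.2 and 6.3 above, using math.erf for the standard normal tail probability; all data values are hypothetical:

import math
import statistics

def normal_cdf(z):
    """Standard normal cumulative probability via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# --- 6.2 Large-sample test for a mean: Ho: mu = 70 vs. Ha: mu != 70 ---
sample = [72, 75, 68, 80, 77, 74, 71, 79, 73, 76,
          70, 78, 69, 81, 72, 75, 74, 77, 73, 76,
          71, 78, 75, 72, 79, 74, 70, 76, 73, 77]   # n = 30 hypothetical scores
mu0 = 70
ybar = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))
z_mean = (ybar - mu0) / se
p_mean = 2 * (1 - normal_cdf(abs(z_mean)))           # two-sided P-value
print(f"mean test: z = {z_mean:.2f}, P = {p_mean:.4f}")

# --- 6.3 Large-sample test for a proportion: Ho: pi = .5 vs. Ha: pi != .5 ---
n, successes, pi0 = 200, 116, 0.5
pi_hat = successes / n
se0 = math.sqrt(pi0 * (1 - pi0) / n)                 # standard error under Ho
z_prop = (pi_hat - pi0) / se0
p_prop = 2 * (1 - normal_cdf(abs(z_prop)))
print(f"proportion test: z = {z_prop:.2f}, P = {p_prop:.4f}")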

1. Assumptions - A random sample of n < 30 is selected. The variable is measured on at least an interval level scale and has a normal population distribution.

2. Hypotheses - Same as in the large-sample test for a mean.
Ho: μ = μ0
Ha: μ > μ0; Ha: μ < μ0; or Ha: μ ≠ μ0

3. Test Statistic -
t = (Ȳ - μ0) / σ̂_Ȳ = (Ȳ - μ0) / (s / √n)

4. P-value - Similar to the large-sample calculation, except different sets of t values are required for each distinct value of df.

5. Conclusion - Specify the smallest number that we know to exceed P. The smaller P is, the more evidence there is against Ho and in favor of Ha.

* The t distribution can also be used to construct confidence intervals for a mean when the size of the random sample is too small to use the large-sample method:
CI = Ȳ ± tc(s / √n), where df = n - 1 for the t-value.

6.5 Small-Sample Inference for a Proportion

The Binomial Distribution
In general, the binomial distribution arises in the following context:
1. We have a sequence of n observations, and for each we observe the classification on some categorical variable.
2. The probability that an observation is classified in a particular category remains the same for each observation.
3. The outcome of each observation does not depend on the outcome of other observations.

Probabilities for a Binomial Distribution: If the probability of being classified in a given category equals π for each observation, then the probability that X out of n independent observations are classified in that category, denoted by P(X), equals:

P(X) = [n! / (X!(n - X)!)] π^X (1 - π)^(n - X)

X = 0,1,2,...,n
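A minimal Python sketch of the small-sample t statistic and the binomial formula just given. The sample data are hypothetical, and the t critical value (2.262 for df = 9, two-sided α = .05) is taken from a standard t table:

import math
import statistics

# --- 6.4 Small-sample t test: Ho: mu = 100 vs. Ha: mu != 100, n = 10 ---
sample = [104, 98, 110, 103, 95, 108, 101, 99, 107, 102]   # hypothetical scores
mu0 = 100
n = len(sample)
t_stat = (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))
t_crit = 2.262   # t table value, df = n - 1 = 9, two-sided .05
print(f"t = {t_stat:.2f}; reject Ho at .05?", abs(t_stat) > t_crit)

# --- 6.5 Binomial probabilities: P(X) for n = 10 observations, pi = .3 ---
def binomial_p(x, n, pi):
    """P(X = x) = [n! / (x!(n - x)!)] * pi^x * (1 - pi)^(n - x)."""
    return math.comb(n, x) * pi ** x * (1 - pi) ** (n - x)

probs = [binomial_p(x, 10, 0.3) for x in range(11)]
print("P(X = 3) =", round(probs[3], 4))
print("total probability:", round(sum(probs), 4))   # equals 1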

6.6 Making Decisions in Statistical Tests - The α-level is a number such that Ho is rejected if the P-value is less than its value. The α-levels that are traditionally chosen are .10, .05, .01, and .001. The choice of the α-level reflects how careful the researcher wishes to be in making an inference.

                          The Real World
Decision                  Ho True                Ha True
Don't reject Ho (P > α)   Right                  Type II error (β)
Reject Ho (P < α)         Type I error (α)       Right (1 - β = Power)

Type I Error - The error of rejecting Ho when it is true.
Alpha (α) - The probability of a Type I error.
Type II Error - The error of not rejecting Ho when it is false.
Beta (β) - The probability of a Type II error.
Power - The probability of correctly rejecting a false Ho.

A Statistical Test of Hypothesis is Composed of Five Elements:
1. Assumptions about the scale of measurement of the variable, the form of the population distribution, the sampling method, and the sample size.
2. Null and alternative hypotheses about the value of the parameter.
3. A test statistic that can be used to describe how consistent the observed data are with what would be expected if the null hypothesis were true.
4. The P-value, which describes in a particular sense how likely the observed result would be to occur if the null hypothesis were true.
5. A conclusion based on the sample evidence about the null hypothesis.

The large-sample tests for a mean and for a proportion are founded on the Central Limit Theorem and use the standard normal sampling distribution. The small-sample test for a mean uses the t sampling distribution, and the small-sample test for a proportion uses the binomial sampling distribution. In each test, small P-values occur when the test statistic is far from the mean of the

sampling distribution under the null hypothesis. The sample size is a crucial factor in both the estimation and hypothesis-testing procedures. Small sample sizes tend to yield wide confidence intervals for a parameter, and thus imprecise estimations. With small sample sizes it is difficult to reject false null hypotheses, especially when the actual parameter value is not very different from the value stated in the null hypothesis.

FACTORS THAT AFFECT POWER:
1. Effect Size - How strongly the IV affects the DV. In general, the larger the effect size, the more powerful the test (ex. - spotlight vs. nightlight in darkness).
2. Sample Size - Large sample sizes reduce standard error ---> increased power. But large sample sizes can backfire if they are too large (i.e., some census data): you may get statistically significant, but meaningless, results.
3. Alpha (α) - Increasing alpha increases power (but this also increases the risk of a Type I (α) error). To avoid Type I error, use a tiny alpha. To avoid Type II error, use a larger alpha.
4. One- vs. Two-Tailed Tests - One-tailed tests are more powerful than two-tailed tests, but you have to pick the right tail!
5. Precision - Measuring variables with more precision leads to greater power (ex. - using a scale for weight as opposed to "guessing").
6. Choice of Test - Always opt for the more powerful test.
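A minimal Python sketch of how effect size and sample size drive power, for a one-tailed large-sample z test of Ho: μ = μ0 against a specific alternative μa. All numbers are hypothetical, and the calculation is the standard normal-approximation power formula:

import math

def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def power_one_tailed(mu0, mu_a, sigma, n, z_alpha=1.645):
    """Power of the upper-tailed z test (z_alpha = 1.645 corresponds to alpha = .05)."""
    se = sigma / math.sqrt(n)
    # Ho is rejected when Ybar > mu0 + z_alpha * se; find that probability when mu = mu_a
    return 1 - normal_cdf(z_alpha - (mu_a - mu0) / se)

# Larger n (and a larger effect or alpha) raises power
for n in (20, 50, 100):
    print(n, round(power_one_tailed(mu0=100, mu_a=105, sigma=15, n=n), 3))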

Comparison of Two Groups

INDEPENDENT SAMPLES - The probability that a particular member of one population is chosen is not dependent on which members are chosen from the other sample.
DEPENDENT SAMPLES - Occurs when members of one sample are naturally matched with members of the other sample (married couples, twins, pre-post, etc.).
DICHOTOMOUS VARIABLE - A variable having only two categories, levels, or possible outcomes.
DEPENDENT VARIABLE (RESPONSE VARIABLE) - The variable about which comparisons are made.
INDEPENDENT VARIABLE - The variable by which the groups are defined.

7.1 Nominal Scales: Difference of Proportions

Confidence Interval for π2 - π1 - A natural way to compare two population proportions is to describe the difference between them, π2 - π1.

If two estimates are formed from two independent samples, the variance of the sampling distribution for their difference (or their sum) is the sum of the variances of the sampling distributions of the two separate estimates.

Large Sample Confidence Interval for π2 - π1 - A large-sample (1 - α)% confidence interval for π2 - π1 is:

(π̂2 - π̂1) ± z_α/2 σ̂_(π̂2 - π̂1), or
(π̂2 - π̂1) ± z_α/2 √[π̂1(1 - π̂1)/n1 + π̂2(1 - π̂2)/n2]

Testing Hypotheses About π2 - π1 - The population proportions π2 and π1 can also be compared through a test of the hypothesis Ho: π2 = π1, or equivalently, Ho: π2 - π1 = 0.

z = (Estimate - Null hypothesis value) / Standard error = (π̂2 - π̂1) / σ̂_(π̂2 - π̂1)

7.2 Interval Scales: Difference of Means

Confidence Interval for μ2 - μ1 - The most common procedure for comparing two groups on a characteristic measured on at least an interval scale is to make inferences about their means μ1 and μ2. From the central limit theorem, we know that if the samples are sufficiently large (in this case, n1 and n2 should both be at least 20), we can assume that the sampling distributions of Ȳ1 and Ȳ2 are approximately normal about μ1 and μ2, respectively. This leads us to form the following confidence interval for μ2 - μ1:

(Ȳ2 - Ȳ1) ± z_α/2 σ̂_(Ȳ2 - Ȳ1)

Large Sample Confidence Interval for μ2 - μ1: A large-sample (1 - α)% confidence interval for μ2 - μ1 is:

(Ȳ2 - Ȳ1) ± z_α/2 √(σ̂1²/n1 + σ̂2²/n2)

Testing Hypotheses About μ1 and μ2: A large-sample (n1 ≥ 20, n2 ≥ 20) test of the hypothesis that μ1 = μ2 can also be based on Ȳ2 - Ȳ1 and σ̂_(Ȳ2 - Ȳ1). The standard form of the z test statistic is:

z = (Estimate of parameter - Null hypothesis value of parameter) / Standard error of estimate

Small Sample Inferences for μ2 - μ1 - If either sample size n1 or n2 is less than 20, an alternate formula must be used for confidence intervals and tests. Page 174

Paired Differences for Dependent Samples: Dependent samples occur when the observations in one sample are matched with observations in another sample (e.g., pre-post tests).

Confidence interval: page 176
Test statistic: page 176

7.3 Ordinal Scales: Wilcoxon Test (Also referred to as the Mann-Whitney Test): When at least ordinal measurement has been achieved for the variables being studied, the Wilcoxon-Mann-Whitney test may be used to test whether two independent groups have been drawn from the same population. This is one of the most powerful of the nonparametric tests, and it is a very useful alternative to the parametric t test when the researcher wishes to avoid the t test's assumptions or when the measurement in the research is weaker than interval scaling. Assumes independent random samples, but makes no assumption about the form of the population distribution. The scores for each observation are ranked, and then the ranks for each group are summed. These sums are then compared to a table of probabilities for each possible outcome. All probabilities at or below (above) the value are summed to arrive at the P-value.

MANN-WHITNEY TEST - A nonparametric test for comparing the central tendency of two independent samples.
NONPARAMETRIC (DISTRIBUTION-FREE) TESTS - Statistical tests that do not rely on parameter estimation or precise distributional assumptions.
RANKED DATA - Data for which the observations have been replaced by their numerical ranks from lowest to highest.

Comparing Ordinal Categorical Distributions: In social science research using Likert-type scales, the primary statistical procedure is usually a test of whether there is a difference between the two groups.
RIDIT SCORE - The ridit score for a response equals the proportion of observations below that category plus half the proportion in that category.
WILCOXON'S MATCHED-PAIRS SIGNED-RANKS TEST - A nonparametric test for comparing the central tendency of two matched (related) samples.

7.4 Comparing Parametric and Nonparametric Statistics: One might wonder whether anything is sacrificed by using a nonparametric procedure when a parametric procedure is actually justified, in the sense that the assumptions for using that parametric procedure are completely fulfilled. Intuitively, we might believe that nonparametric procedures are not as good in that case, since they use only the ordinal characteristics of the data. However, statisticians have shown that some nonparametric procedures are very nearly as good even in the exact case for which the parametric tests are designed. For example, the Wilcoxon test is never much less efficient than its parametric equivalents, yet it can be much more efficient if the population distributions are highly nonnormal.
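A minimal Python sketch pulling together the two large-sample comparisons above: the difference-of-proportions test from 7.1 and the difference-of-means confidence interval from 7.2. All group data are hypothetical:

import math
import statistics

def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# --- 7.1 Difference of proportions: group 1 vs. group 2 (hypothetical counts) ---
n1, favor1 = 250, 95
n2, favor2 = 300, 150
p1, p2 = favor1 / n1, favor2 / n2
se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = (p2 - p1) / se_diff
p_value = 2 * (1 - normal_cdf(abs(z)))
print(f"difference of proportions: {p2 - p1:.3f}, z = {z:.2f}, P = {p_value:.4f}")

# --- 7.2 Difference of means: 95% confidence interval (hypothetical scores) ---
group1 = [23, 27, 31, 25, 29, 24, 30, 26, 28, 27, 25, 29, 31, 26, 28, 24, 30, 27, 25, 29]
group2 = [30, 34, 29, 36, 33, 31, 35, 32, 30, 34, 33, 31, 36, 29, 32, 35, 30, 33, 31, 34]
ybar1, ybar2 = statistics.mean(group1), statistics.mean(group2)
se = math.sqrt(statistics.variance(group1) / len(group1) +
               statistics.variance(group2) / len(group2))
lower = (ybar2 - ybar1) - 1.96 * se
upper = (ybar2 - ybar1) + 1.96 * se
print(f"95% CI for mu2 - mu1: ({lower:.2f}, {upper:.2f})")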

Measuring Association for Categorical Variables

Cross-Classification Tables: Example using Table 8.1 (page 199):

                                  Religious Preference
Attitude        Liberal Protestant   Conservative Protestant   Catholic   None   Row Total
Favor           103                  182                        80         16     381
Oppose          187                  238                        286        74     785
Column Total    290                  420                        366        90     1,166

- This is a 2 (# rows) x 4 (# columns) table, with 3 df [(r-1)(c-1)]
- Row totals and column totals are marginal frequencies
- The collection of marginal frequencies is known as the marginal distribution

Conditional Distribution (Using percentages):

                                  Religious Preference
Attitude          Liberal Protestant   Conservative Protestant   Catholic   None
Favor             35.5                 43.3                       21.9       17.8
Oppose            64.5                 56.7                       78.1       82.2
Total Percentage  100.0                100.0                      100.0      100.0
Sample Size       290                  420                        366        90

Joint Distribution (Using percentages):

                                  Religious Preference
Attitude          Liberal Protestant   Conservative Protestant   Catholic   None
Favor             8.8                  15.6                       6.9        1.4
Oppose            16.0                 20.4                       24.5       6.3
Total Percentage  24.8                 36.0                       31.4       7.7
Sample Size       290                  420                        366        90

Statistical Independence and Dependence - Two categorical variables are statistically independent if the population conditional distributions on one of them are identical for each of the levels of the other. The variables are statistically dependent if the conditional distributions are not identical.

Example of a Cross-Classification Exhibiting Statistical Independence:

                                  Religious Preference
Attitude        Liberal Protestant   Conservative Protestant   Catholic      None        Row Total
Favor           750 (30%)            1,200 (30%)                900 (30%)     150 (30%)   3,000
Oppose          1,750 (70%)          2,800 (70%)                2,100 (70%)   350 (70%)   7,000
Column Total    2,500                4,000                      3,000         500         10,000

- The percentage of people who favor (or oppose) does not depend on the religious group they are in.

MARGINAL TOTALS - Row and column totals; the totals for the levels of one variable summed across the levels of the other variable.

Chi-Square Test of Independence: Designed to test for independence between two nominal variables. It is appropriate for random samples or stratified random samples, and the sample size must usually be large (generally > 5 for each cell entry). Ho: the variables are statistically independent; Ha: the variables are statistically dependent; df = (r-1)(c-1).

- Chi-Square Statistic: χ² = Σ[(fo - fe)² / fe]

* Note: The chi-square statistic provides a measure of how close the observed frequencies are to the frequencies that would be expected if the variables were independent. This statistic indicates how certain we can be that the variables are dependent, not how strong that dependence is. We must use measures of association for that.

The Chi-Square Distribution - Some of the main properties of the chi-square distribution are:
1. It is skewed to the right.
2. It is concentrated on the positive part of the real line. It is impossible to get a negative χ² test statistic value, since it involves a sum of squared differences divided by positive expected frequencies. The minimum possible value, χ² = 0, occurs when the observed frequency in each cell equals the expected frequency for that cell, that is, when the variables are completely independent in the sample.

3. The exact shape of the distribution depends on the degrees of freedom. For the test of independence, the formula for the degrees of freedom is df = (r - 1)(c - 1). For a 2 x 4 table, for example, df = (2 - 1)(4 - 1) = 1(3) = 3. The mean and standard deviation of the chi-square distribution depend on the value of df. Larger table dimensions produce larger values for df = (r - 1)(c - 1). As df increases, the shape of the chi-square curve more closely resembles a normal curve.
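A minimal Python sketch of the chi-square test of independence applied to the Table 8.1 frequencies above; the critical value 7.815 (χ², df = 3, α = .05) is from a standard chi-square table:

# Observed frequencies from Table 8.1 (rows: Favor, Oppose; columns: religious groups)
observed = [
    [103, 182, 80, 16],
    [187, 238, 286, 74],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency for each cell = (row total * column total) / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(len(observed))
    for j in range(len(observed[0]))
)

df = (len(observed) - 1) * (len(observed[0]) - 1)   # (r - 1)(c - 1) = 3
print(f"chi-square = {chi_square:.1f}, df = {df}")
print("reject independence at .05?", chi_square > 7.815)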

Measures of Association: A Measure of Association is a statistic that summarizes some aspect of the degree of the statistical dependence between two or more variables. The magnitude of the measure of association may be referred to as the proportional reduction in error.

The Four Elements of a Measure of Association for Nominal Variables:

1. A rule for predicting the classification of each member on the dependent variable, ignoring information about the classification of that member on the independent variable. These predictions are based on the marginal distribution of the dependent variable.

2. A rule for predicting the classification of each member on the dependent variable, using the information about the classification of that member on the independent variable. These predictions are based on the frequencies of observations in the various categories of the dependent variable within each level of the independent variable.

3. A definition of what is meant by a prediction error. For nominal variables, an error consists of a mis-classification of a member on the dependent variable. The total number of times that the wrong category of the dependent variable is predicted using Rule 1 is denoted by E1. The total number of classification errors using Rule 2 is denoted by E2.

4. The definition of the measure. The difference between the total numbers of classification errors committed using the two rules is E1 - E2. There are (E1 - E2) fewer errors made when the independent variable is used in making the predictions, so E1 - E2 is the reduction in error. The measure of association given by this proportional reduction in error (PRE) is:

PRE = (E1 - E2) / E1; yields the percentage increase in prediction

Example: The Goodman & Kruskal Tau
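A minimal Python sketch of the PRE idea using Goodman and Kruskal's tau on the Table 8.1 data, with attitude as the dependent variable. The notes do not spell out the error definitions for tau, so the proportional-prediction rules used below are an assumption (they are the usual ones for this measure):

# Rows are levels of the independent variable (religious group);
# columns are categories of the dependent variable (Favor, Oppose).
table = [
    [103, 187],   # Liberal Protestant
    [182, 238],   # Conservative Protestant
    [80, 286],    # Catholic
    [16, 74],     # None
]

n = sum(sum(row) for row in table)
col_totals = [sum(col) for col in zip(*table)]

# Rule 1: predict the dependent variable from its marginal distribution alone
E1 = n - sum(c ** 2 for c in col_totals) / n

# Rule 2: predict within each level of the independent variable
E2 = sum(sum(row) - sum(f ** 2 for f in row) / sum(row) for row in table)

tau = (E1 - E2) / E1
print(f"E1 = {E1:.1f}, E2 = {E2:.1f}, tau = {tau:.3f}")   # PRE interpretation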

Common Properties of Ordinal Measures: There are several ordinal measures of association, among them gamma, Kendall's tau-b and tau-c, Spearman's rho, and Somers' d. All of these measures are quite similar in their basic purposes and characteristics, and have some common properties.

1. Ordinal measures of association take on values between -1 and +1. When the variables are at least ordinal in level, it is possible to distinguish between two types of association - positive and negative. Positive association between variables X and Y results when a member ranked high on X tends also to be ranked high on Y, and those ranked low on X tend to be ranked low on Y.

2. The population values of ordinal measures equal 0 if the variables are statistically independent. If an ordinal measure equals 0, however, the variables need not be statistically independent.

3. The stronger the relationship, the larger the absolute value of the measure. A value of .6 or -.6 represents a stronger association than a value of .3 or -.3, for example.

4. With the exception of Somers' d, the ordinal measures of association are symmetric. By this, we mean that their values are not based on the identification of a dependent and independent variable. The measures assume the same value when Y is the independent variable as when Y is the dependent variable.

Concordant and Discordant Pairs: A pair of observations is concordant if the member that ranks higher on one variable also ranks higher on the other variable. A pair of observations is discordant if the member that ranks higher on one variable ranks lower on the other.

Example:

                    Months Worked in Year Before Revolution
Attitude        6 or less      7-9            10 or more     Total
Favorable       54 (85.7)      14 (73.7)      65 (61.9)      133
Indecisive      6 (9.5)        2 (10.5)       14 (13.3)      22
Hostile         3 (4.8)        3 (15.8)       26 (24.8)      32
Total Percent   100.0          100.0          100.0
Sample Size     63             19             105            187

- The cells denoted with //// indicate pairs that are concordant with the shaded cell, 65 (61.9); the cells denoted with \\\\ indicate pairs that are discordant with the shaded cell, 54 (85.7).

In brief: Most analyses of relationships between categorical variables involve the construction of cross-classification tables and percentage distributions (called conditional distributions) within rows or columns. Two variables are said to be independent if the conditional distributions on one variable are equal for each level of the other variable. The chi-square test is used to test for independence between two nominal variables, based on sample data. The strength of association between two nominal variables can be measured by the difference of proportions for 2 x 2 tables and by Goodman and Kruskal's tau for larger tables. Ordinal measures of association measure the extent to which subjects' rankings on one variable are associated with their rankings on the other variable. Most of these measures are based on the relative occurrences of concordant and discordant pairs.
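A minimal Python sketch counting concordant and discordant pairs in the table above and combining them into gamma, (C - D)/(C + D); gamma is one of the ordinal measures listed earlier and is used here only to illustrate how the pair counts are built:

# Rows (attitude) and columns (months worked) are both ordered from low to high:
# attitude: Favorable < Indecisive < Hostile; months: 6 or less < 7-9 < 10 or more
table = [
    [54, 14, 65],   # Favorable
    [6, 2, 14],     # Indecisive
    [3, 3, 26],     # Hostile
]

nrows, ncols = len(table), len(table[0])
C = D = 0
for i in range(nrows):
    for j in range(ncols):
        for k in range(i + 1, nrows):
            for l in range(ncols):
                if l > j:       # second member higher on both variables: concordant
                    C += table[i][j] * table[k][l]
                elif l < j:     # higher on one variable, lower on the other: discordant
                    D += table[i][j] * table[k][l]

gamma = (C - D) / (C + D)
print(f"concordant pairs C = {C}, discordant pairs D = {D}, gamma = {gamma:.3f}")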

Extra Stuff (FYI)

ONE-DIMENSIONAL TESTS - Chi-Square Goodness-of-Fit Test; Goodness-of-Fit Test; Model Fitting
CHI-SQUARE TEST (χ²) - A statistical test often used for analyzing categorical data. An advantage is that only nominal-level data are needed. Also called "Contingency Table Analysis" (Math) and "Crosstabs" (Sociology).
Properties -
Mutually Exclusive: Each subject can be classified into only one category.
Exhaustive: Every subject can be classified.
CONTINGENCY TABLE - A two-dimensional table in which each observation is classified on the basis of two variables simultaneously. Table of frequencies.
CONTINGENCY TABLE ANALYSIS - Analyzing tables of frequencies, such as Chi-square, Crosstabs, etc.
GOODNESS-OF-FIT TEST (MODEL FITTING) - A test for comparing observed frequencies with theoretically predicted frequencies. Tests how well a model fits the table (fo - fe).
EXPECTED FREQUENCIES - The expected value for the number of observations in a cell if Ho is true. The expected frequency for a cell can be derived by multiplying the row and column totals for that cell, and then dividing by the total sample size.
CHI-SQUARE TEST FOR INDEPENDENCE - Ho: the variables are independent; Ha: the variables are dependent (fo - fe).
STATISTICAL INDEPENDENCE - What χ² is testing; Ho: "the variables are not related to each other."

Linear Regression and Correlation

The Linear Regression Model - Allows us to determine whether an association exists between two variables and, if so, the direction and strength of that association. The term linear function refers to the fact that the formula yi = α + βx maps out a straight-line graph when various x-values are substituted into the formula, for particular values of α and β. The linear function yi = α + βx is said to be deterministic, because there is only one fixed

corresponding value of Y for each value of X. This is unrealistic, since we expect some variability in the Y-values among subjects with the same X-value. Therefore, a more appropriate model is the probabilistic model, which allows for variability: yi = α + βx + ε, where ε represents the deviation of a particular observation from the mean, or point estimate.

BEST LEAST SQUARES REGRESSION LINE - The line through the scatterplot that minimizes the sum of squared errors.

Regression Equation: ŷi = r(sy/sx)(xi - x̄) + ȳ
- Where: x is the predictor and y is the criterion
Reduce to: ŷi = α + βxi (linear equation for a line)
- Where: α is the y-intercept and β is the slope

LINEAR RELATIONSHIP - A situation in which the best-fitting regression line is a straight line.
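A minimal Python sketch of the least squares line using the regression equation above: the slope is r(sy/sx) and the intercept follows from forcing the line through (x̄, ȳ). The small data set is hypothetical:

import math
import statistics

x = [2, 4, 5, 7, 8, 10, 11, 13]        # predictor (hypothetical)
y = [15, 22, 24, 31, 33, 40, 41, 49]   # criterion (hypothetical)

xbar, ybar = statistics.mean(x), statistics.mean(y)
sx, sy = statistics.stdev(x), statistics.stdev(y)

# Pearson r computed from deviation scores
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
r = sxy / math.sqrt(sum((xi - xbar) ** 2 for xi in x) * sum((yi - ybar) ** 2 for yi in y))

slope = r * sy / sx               # b = r(sy/sx)
intercept = ybar - slope * xbar   # the line passes through (xbar, ybar)

print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
print("prediction at x = 9:", round(intercept + slope * 9, 2))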

PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENT (r) - The most common correlation coefficient. A number between -1 and +1 that tells the direction and strength of the linear relationship between 2 variables.
Strength - How closely the points in a scatterplot hug a line.
Direction - Positive or negative.

PROPERTIES OF r: The properties of the Pearson correlation are quite similar in many respects to those of the ordinal measures of association.
1. -1 ≤ r ≤ +1 -- The standardized version of the slope, unlike b, is constrained to fall between -1 and +1.
2. r has the same sign as b. Since r is just the slope b multiplied by the ratio of two (positive) estimated standard deviations, the sign is preserved.
3. r = ±1 when all the sample points fall exactly on the prediction line. These correspond to perfect positive and perfect negative linear associations.
4. The larger the absolute value of r, the stronger the degree of linear association. Two variables with a correlation of (±).80 are more strongly related than two variables with a correlation of (±).40.
5. The value of r is not dependent on the units in which the variables are measured.
6. The correlation is a symmetric measure. If we formed a prediction equation that used Y to predict X, the correlation would be the same as in using X to predict Y.
7. The correlation is appropriate for use only when a straight line is a reasonable model for the relationship (that is, when X and Y are linearly related).
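A short Python sketch illustrating properties 5 and 6 above: r is unchanged when a variable is rescaled to different units and when X and Y swap roles. The data are hypothetical:

import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

height_in = [60, 62, 65, 67, 70, 72, 74]          # inches (hypothetical)
weight_lb = [110, 125, 140, 150, 165, 180, 195]   # pounds (hypothetical)

r = pearson_r(height_in, weight_lb)
r_cm = pearson_r([h * 2.54 for h in height_in], weight_lb)   # change of units
r_swapped = pearson_r(weight_lb, height_in)                  # swap X and Y

print(round(r, 4), round(r_cm, 4), round(r_swapped, 4))      # all identical
print("r squared (proportion of variance in Y shared with X):", round(r ** 2, 3))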

POINTS TO CONSIDER WHEN DEALING WITH r:
1. When r = 0 (or is very low), remember the relationship could be non-linear.
2. Attenuation due to restriction of range.
3. Beware of combining groups with different relationships.
4. Correlation does not imply causation.
5. Interpreting r²:
a. You don't interpret r, you interpret r².
b. r² = the proportion of variance in y explained by/accounted for by/shared with x.

Extra Stuff

PREDICTOR VARIABLE - The variable from which a prediction is made.
RANKED DATA - Data for which the observations have been replaced by their numerical ranks from lowest to highest.
REGRESSION - Used to predict one variable from the other.
SCATTER DIAGRAM (SCATTERPLOT, SCATTERGRAM) - A figure in which the individual data points are plotted in two-dimensional space.
SPEARMAN'S CORRELATION COEFFICIENT FOR RANKED DATA (rs) - A correlation coefficient on ranked data.
INTERCEPT - The value of Y when X is 0 (zero); denoted as α in the linear regression function.
LEAST SQUARES CRITERION - Pick the line that minimizes the sum of squared errors.
REGRESSION COEFFICIENTS - The general name given to the slope and the intercept; most often refers just to the slope.
RESIDUAL - The difference between the obtained value and the predicted value of Y, i.e., Y - Ŷ.
RESIDUAL VARIANCE (ERROR VARIANCE) - The square of the standard error of estimate.

SLOPE - The amount of change in Y for a one-unit change in X.
STANDARD ERROR OF ESTIMATE (sy.x) - A measure of the average deviation of the observations about the regression line (the square root of the residual variance).
STANDARDIZED REGRESSION COEFFICIENT, β (beta) - The regression coefficient that results from data that have been standardized.
SUM OF SQUARES - The sum of the squared deviations around some point.
CORRELATION - The relationship between variables. Describing the relationship between variables.
CURVILINEAR RELATIONSHIP - A situation that is best represented by something other than a straight line.
DICHOTOMOUS VARIABLES - Variables that can take on only two different values.
PHI (φ) - The correlation coefficient when both variables are measured as dichotomies.
POINT BISERIAL CORRELATION (rpb) - The correlation coefficient when one of the variables is measured as a dichotomy.
