A common problem in research is the formulation of a test procedure, or set of rules, that leads to the acceptance or rejection of some statement or hypothesis. For example, a pharmacy researcher might be required to decide on the basis of experimental evidence whether a newly developed drug is superior to the one presently in use; a medical technologist might have to decide on the basis of sample data whether there is a difference between the results of two laboratory procedures; a botanist might wish to establish that a certain plant possesses a characteristic different from the rest of its family because of its geographical location; or a biochemist might wish to establish that there is a relationship between the genes and the behavioral patterns of certain animals. The procedures for establishing a set of rules that lead to the acceptance or rejection of these kinds of statements comprise a major area of biostatistical inference called hypothesis testing.
One of the principal objectives of research is comparison: How does one group differ from another? Specifically, we may encounter questions such as these: What is the mean serum cholesterol level of a group of middle-aged men? How does it differ from that of women? From that of men of other ages? How does today’s level differ from that of a decade ago? Is the latest drug effective in reducing cholesterol levels? What are the effects of various diets on serum cholesterol levels?
These are typical questions that can be handled by the primary tools of classical statistical inference – estimation and hypothesis testing. The unknown characteristic, or parameter, of a population is usually estimated from a statistic computed from a sample. Ordinarily, we are interested in estimating the mean and the standard deviation of some characteristic of the population. The purpose of statistical inference is to reach conclusions from our data and to support our conclusions with probability statements. With such information, we will be able to decide whether an observed effect is real or due to chance. In this lesson, we will use a single sample to explain hypothesis testing.
Before getting into the step-by-step procedure of a test of significance, you will find it helpful to look over the following definitions.
Hypothesis. A statement of belief used in the evaluation of population values.
Null hypothesis, H _{0} . A claim that there is no difference between the population mean µ and the hypothesized value µ _{0} .
Alternative hypothesis, H _{1} . A claim that disagrees with the null hypothesis. If the null hypothesis is rejected, we are left with no choice but to accept the alternative hypothesis that µ is not equal to µ _{0} . Sometimes referred to as the research hypothesis.
Test Statistic. A statistic used to determine the relative position of the mean in the hypothesized probability distribution of sample means.
Critical region. The region at the far end of the distribution. If only one end of the distribution, commonly termed “the tail”, is involved, the test is referred to as a one-tailed test; if both ends are involved, it is known as a two-tailed test. When the computed Z or t falls in the critical region, we reject the null hypothesis. The critical region is sometimes called the rejection region. The probability that the test statistic falls in the critical region when the null hypothesis is true is denoted by α.
Critical Value. The number that divides the normal distribution into the region where we will reject the null hypothesis and the region where we fail to reject the null hypothesis.
Significance level. The probability α of rejecting the null hypothesis when it is true, chosen before the test is carried out. It is usually set at 0.05 or lower, which corresponds to a confidence level of 95% or higher.
Statistical hypothesis. An assertion or conjecture concerning one or more populations.
Type I error. Rejection of the null hypothesis when it is true.
Type II error. Acceptance of (failure to reject) the null hypothesis when it is false.
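The meaning of a type I error can be illustrated with a small simulation. The sketch below is not part of the original lesson: it assumes a normal population, borrows the values 53 and 5.5 from Example 1 purely as illustrative settings, and uses a hypothetical helper name, simulate_type_i_rate. It draws many samples from a population for which the null hypothesis really is true and counts how often a two-tailed z test rejects it anyway.

```python
import random
from statistics import NormalDist


def simulate_type_i_rate(mu0=53.0, sigma=5.5, n=100, alpha=0.05,
                         trials=2000, seed=0):
    """Estimate how often a two-tailed z test rejects a TRUE null hypothesis.

    All default values are illustrative assumptions, not data from a study.
    """
    rng = random.Random(seed)                      # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # upper critical value
    std_err = sigma / n ** 0.5                     # standard error of the mean
    rejections = 0
    for _ in range(trials):
        # The sample is drawn with true mean mu0, so H0 is true by construction.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / std_err
        if abs(z) > z_crit:
            rejections += 1                        # a type I error occurred
    return rejections / trials


rate = simulate_type_i_rate()
print(f"Estimated type I error rate: {rate:.3f}")
```

Under these assumptions the estimated rate should land near the chosen α, which is what it means to say that α is the probability of a type I error.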
The following explains how to determine whether a test is a left-tailed, right-tailed, or two-tailed test.
The type of test is determined by the alternative hypothesis ( H _{1} ).
Left-Tailed Test. H _{1} : parameter < value. Notice that the inequality points to the left. Decision rule: Reject H _{0} if t.s. < c.v., where t.s. is the test statistic and c.v. is the critical value.
Right-Tailed Test. H _{1} : parameter > value. Notice that the inequality points to the right. Decision rule: Reject H _{0} if t.s. > c.v.
Two-Tailed Test. H _{1} : parameter ≠ value. Another way to write "not equal" is "< or >". Notice that the inequality points to both sides. Decision rule: Reject H _{0} if t.s. < c.v. (left) or t.s. > c.v. (right).
The decision rule can be summarized as follows:
Reject H _{0} if the test statistic falls in the critical region (Reject H _{0} if the test statistic is more extreme than the critical value)
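The decision rules for the three kinds of test can be sketched in code. This is a minimal illustration for a z test only, assuming the critical values come from the standard normal distribution; the function names critical_values and reject_null are hypothetical and chosen just for this sketch.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return (lower, upper) critical values for a z test; None where unused."""
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.inv_cdf(alpha), None
    if tail == "right":
        return None, std_normal.inv_cdf(1 - alpha)
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)  # alpha is split between tails
        return -upper, upper
    raise ValueError("tail must be 'left', 'right', or 'two'")


def reject_null(test_stat, alpha, tail):
    """Reject H0 when the test statistic is more extreme than a critical value."""
    lower, upper = critical_values(alpha, tail)
    in_left_region = lower is not None and test_stat < lower
    in_right_region = upper is not None and test_stat > upper
    return in_left_region or in_right_region


# The same test statistic can lead to different decisions,
# depending on the alternative hypothesis.
for tail in ("left", "right", "two"):
    print(tail, reject_null(1.8, alpha=0.05, tail=tail))
```

Because a two-tailed test splits α between both ends, its critical values lie farther from zero than the single critical value of a one-tailed test at the same significance level.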
The type I error and type II error are related. A decrease in the probability of one generally results in an increase in the probability of the other. The size of the critical region, and therefore the probability α of committing a type I error, can always be reduced by adjusting the critical value(s). An increase in the sample size n will reduce α and β simultaneously, where β is the probability of committing a type II error. If the null hypothesis is false, β is a maximum when the true value of the parameter is close to the hypothesized value. The greater the distance between the true value and the hypothesized value, the smaller β will be.
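These relationships can be checked numerically for a two-tailed z test. The sketch below assumes a known σ and a normally distributed sample mean; the function name type_ii_error and the "true" means of 54 and 55 are hypothetical choices made for illustration, with 53 and 5.5 borrowed from Example 1.

```python
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha=0.05):
    """Probability beta of failing to reject H0: mu = mu0 when the mean is mu_true.

    Two-tailed z test with known sigma (an assumption of this sketch).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Distance between the true and hypothesized means, in standard-error units.
    shift = (mu_true - mu0) / (sigma / n ** 0.5)
    # beta is the chance that Z lands between the two critical values.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# A larger sample reduces beta for a fixed alpha.
print(type_ii_error(53, 54, 5.5, n=25), type_ii_error(53, 54, 5.5, n=100))
# A true mean farther from the hypothesized value reduces beta.
print(type_ii_error(53, 54, 5.5, n=25), type_ii_error(53, 55, 5.5, n=25))
# A smaller alpha (narrower critical region) increases beta.
print(type_ii_error(53, 54, 5.5, n=25, alpha=0.05),
      type_ii_error(53, 54, 5.5, n=25, alpha=0.01))
```

In each printed pair, comparing the two numbers shows the direction of the trade-off described above.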
1. State the null hypothesis H _{0} that µ = µ _{0} . Choose an appropriate alternative hypothesis H _{1} from one of the alternatives µ < µ _{0} , µ > µ _{0} , or µ ≠ µ _{0} .
2. Choose a significance level of size α.
3. Select the appropriate test statistic and compute the value of the test statistic from the sample data.
4. Establish the critical region.
5. Decision: Reject H _{0} if the test statistic has a value in the critical region; otherwise, fail to reject H _{0} .
6. Conclusion.
Population Standard Deviation Known
If the population standard deviation, sigma (σ), is known, then the sample mean has a normal distribution (exactly if the population is normal, approximately for large samples), and you will use the z-score formula for sample means: z = (x̄ − µ _{0}) / (σ / √n). The critical value is obtained from the normal table, or from the bottom line of the t-table.
Population Standard Deviation Unknown
If the population standard deviation, sigma (σ), is unknown, then the standardized sample mean has a Student's t distribution (assuming the population is approximately normal), and you will use the t-score formula for sample means. The test statistic is very similar to that for the z-score, except that σ has been replaced by the sample standard deviation s and z has been replaced by t: t = (x̄ − µ _{0}) / (s / √n).
The critical value is obtained from the t-table. The degrees of freedom for this test are n − 1. If you are performing a t-test where you found the statistics on the calculator (as opposed to being given them in the problem), use the VARS key to pull up the statistics when calculating the test statistic. This saves data entry and avoids round-off errors.
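For readers working in software rather than on a calculator, the t statistic can be computed as in the sketch below. The ten ages are invented for illustration and do not come from any study, and the helper name one_sample_t is hypothetical. Python's standard library has no t-table, so the critical value 2.262 (two-tailed, α = 0.05, 9 degrees of freedom) is typed in from a printed table, just as it would be by hand.

```python
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mu = mu0."""
    n = len(sample)
    # stdev() is the sample standard deviation s (divisor n - 1).
    t = (mean(sample) - mu0) / (stdev(sample) / n ** 0.5)
    return t, n - 1


# Hypothetical sample of 10 ages, used only to show the mechanics.
ages = [51, 56, 49, 58, 55, 53, 57, 52, 54, 60]
t_stat, df = one_sample_t(ages, mu0=53)

# Two-tailed critical value for alpha = 0.05 and df = 9, read from a t-table.
T_CRIT = 2.262
reject = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```

Working from the stored data, rather than retyping a rounded mean and standard deviation, serves the same purpose as the calculator advice above: it avoids round-off errors.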
Example 1. To illustrate the basic concepts of a test of significance, let us again consider the Honolulu Heart Study. Suppose someone claims that the mean age of the population of 7683 individuals is 53.00 years. How can you verify (or reject) this claim? Start by drawing, say, a sample of 100 persons. Suppose the sample mean equals 54.85. Now the question is, “What is the likelihood of finding a sample mean of 54.85 in a sample of 100 from a distribution whose true mean, µ, is 53, given that σ = 5.50?”
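1. H _{0} : µ = 53
H _{1} : µ ≠ 53
2. Significance level: α = 0.05
3. Test statistic: Z = (x̄ − µ _{0}) / (σ / √n) = (54.85 − 53) / (5.5 / √100) = 3.36
4. Critical region: Z = ±1.96; that is, reject H _{0} if Z < −1.96 or Z > 1.96.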
5. Decision: Reject the null hypothesis, because Z = 3.36 falls in the critical region (3.36 > 1.96).
6. Conclusion: The sample mean of 54.85 is significantly different from (greater than) the hypothesized population mean of 53; that is, the sample probably came from a population with a mean other than 53.
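The arithmetic in Example 1 can be reproduced with a few lines of code. The sketch below uses only the numbers given in the example; the variable names are arbitrary, and the critical value is computed from the standard normal distribution instead of being read from a table.

```python
from statistics import NormalDist

# Values taken from Example 1.
mu0 = 53.0      # hypothesized population mean
x_bar = 54.85   # observed sample mean
sigma = 5.5     # known population standard deviation
n = 100         # sample size
alpha = 0.05    # significance level

# Step 3: test statistic for a sample mean with known sigma.
z = (x_bar - mu0) / (sigma / n ** 0.5)

# Step 4: two-tailed critical value, with alpha split between the tails.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: decision.
reject = abs(z) > z_crit
print(f"Z = {z:.2f}, critical values = +/-{z_crit:.2f}, reject H0: {reject}")
```

The computed Z rounds to 3.36 and the critical value to 1.96, matching steps 3 and 4 of the example.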
Useful links:
http://www.math.virginia.edu/~der/usem170/Chapter11/sld008.htm