In statistics, a hypothesis must be set up and defined before a survey or research study begins. This is termed a
statistical hypothesis: an assumption about a population parameter. It is by no means certain that this
assumption will prove to be true. Hypothesis testing refers to the predefined formal procedures that
statisticians use to decide whether to accept or reject hypotheses. It is defined as the process of
choosing between hypotheses about a probability distribution on the basis of observed data.
Hypothesis testing is a core topic in statistics. In research, a hypothesis is a tentative but testable statement
about the phenomenon under study. The null hypothesis is the hypothesis that the researcher sets out to
challenge; generally, it represents the current explanation or accepted view of the feature
being tested. Hypothesis testing comprises the procedures used to determine which outcomes
would lead to rejection of the null hypothesis at a specified level of significance. This tells us
whether the observed results carry enough information to overturn the conventional wisdom built into the
null hypothesis.
Hypothesis testing is used in the context of a research study to evaluate and analyze
the study's results. Let us learn more about this topic.
Hypothesis testing is one of the most important concepts in statistics. A statistical hypothesis is an assumption
about a population parameter. This assumption may or may not be true. The methodology employed by the analyst
depends on the nature of the data used and the goals of the analysis. The goal is to either accept or reject the null
hypothesis.
Hypothesis Testing Terms
1. Test Statistic
The decision whether to accept or reject the null hypothesis is made based on this value. The test statistic is
computed from a defined formula based on a sampling distribution such as t, z, or F. If the calculated test statistic
is less than the critical value, we accept the null hypothesis; otherwise, we reject it.
The z test statistic is used for testing the mean of a large sample. It is given by

z = (x̄ − μ) / (σ / √n)

where x̄ is the sample mean, μ is the population mean, σ is the population standard deviation, and n is the sample size.
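As an illustrative sketch, the z statistic can be computed directly from this formula; the sample figures below are made up for the example:

```python
import math

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    # z = (x_bar - mu) / (sigma / sqrt(n)), the large-sample test of a mean
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

# Hypothetical numbers: H0 says mu = 100; a sample of n = 64 has mean 103,
# and the population standard deviation is sigma = 12.
z = z_statistic(103, 100, 12, 64)
print(z)  # 2.0
```

Here σ/√n = 12/8 = 1.5, so the sample mean lies two standard errors above the hypothesized mean.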
2. Level of Significance
The probability threshold at which a null hypothesis is accepted or rejected is called the level of significance. The
level of significance is denoted by α.
3. Critical Value
The critical value is the value that divides the sampling distribution into two regions: the acceptance region and the
rejection region. If the computed test statistic falls in the rejection region, we reject the null hypothesis; otherwise,
we accept it. The critical value depends on the level of significance and on the alternative hypothesis.
4. Alternative Hypothesis
The alternative hypothesis is one-sided if the parameter is hypothesized to be larger or smaller than the null
hypothesis value. It is two-sided when the parameter may differ from the null hypothesis value in either direction.
The null hypothesis is usually tested against an alternative hypothesis (H1), which can take one of three forms:
H1: B1 > 1, a one-sided alternative hypothesis.
H1: B1 < 1, also a one-sided alternative hypothesis.
H1: B1 ≠ 1, a two-sided alternative hypothesis; that is, the true value is either greater or less than 1.
5. P-Value
The P-value is the probability of observing a sample statistic as extreme as, or more extreme than, the test
statistic actually obtained, assuming the null hypothesis is true. Equivalently, it is the probability of seeing the
observed difference, or a greater one, purely by chance if the null hypothesis holds. The larger the P-value, the
weaker the evidence against the null hypothesis.
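As a sketch, the two-sided P-value for a z statistic can be computed from the standard normal distribution using only the Python standard library:

```python
import math

def p_value_two_sided(z):
    # P(|Z| >= |z|) under a standard normal null distribution,
    # using the identity 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))

# A z statistic of 2.0 gives a two-sided P-value of about 0.0455,
# so at the 5% significance level we would (narrowly) reject H0.
print(round(p_value_two_sided(2.0), 4))  # 0.0455
```

Note how the more extreme the statistic, the smaller the P-value, matching the rule that a large P-value means weak evidence against the null hypothesis.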
Hypothesis Benefits and Process
Hypothesis testing begins with a hypothesis made about a population parameter. Data are then collected from an
appropriate sample, and the information obtained from the sample is used to decide how likely it is that the
hypothesized population parameter is correct. The purpose of hypothesis testing is not to question the computed
value of the sample statistic itself, but to make a judgement about the difference between the sample statistic and a
hypothesized population parameter.
Hypothesis Testing Steps
We illustrate the five steps of hypothesis testing in the context of testing a specified value for a population
proportion. The procedure for hypothesis testing is given below:
Set up the null hypothesis and the alternative hypothesis.
Decide on the test criterion to be used.
Calculate the test statistic using the given values from the sample.
Find the critical value at the required level of significance and degrees of freedom.
Decide whether to accept or reject the hypothesis: if the calculated test statistic value is less than the critical value,
we accept the hypothesis; otherwise we reject it.
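The five steps above can be sketched in code for a two-sided large-sample z test; all sample figures here are hypothetical:

```python
import math
from statistics import NormalDist

def z_test(sample_mean, pop_mean, pop_sd, n, alpha=0.05):
    # Step 1: set up H0: mu = pop_mean against H1: mu != pop_mean.
    # Step 2: the test criterion is a two-sided large-sample z test at level alpha.
    # Step 3: calculate the test statistic from the sample values.
    z = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))
    # Step 4: find the critical value at the required level of significance.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 5: reject H0 when the statistic falls beyond the critical value.
    return "reject H0" if abs(z) > critical else "accept H0"

# Hypothetical sample: mean 103 from n = 64 observations, sigma = 12,
# tested against H0: mu = 100.
print(z_test(103, 100, 12, 64))  # reject H0 (z = 2.0 exceeds about 1.96)
```

`NormalDist().inv_cdf` supplies the critical value directly, so the same sketch works for any significance level, not just the tabulated 5%.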
Different Types of Hypothesis:
There are 5 different types of hypothesis as follows:
1) Simple Hypothesis
If a hypothesis specifies the population completely, including both the functional form and the parameters, it is
called a simple hypothesis.
Example:
The hypothesis “The population is normal with mean 15 and standard deviation 5” is a simple hypothesis.
2) Composite Hypothesis or Multiple Hypothesis
If the hypothesis does not specify the population completely, leaving some parameters undetermined, it is a
composite hypothesis or multiple hypothesis.
Example:
The hypothesis “The population is normal with mean 15” (standard deviation unspecified) is a composite or multiple hypothesis.
3) Parametric Hypothesis
A hypothesis which specifies only the parameters of the probability density function is called a parametric
hypothesis.
4) Non-Parametric Hypothesis
If a hypothesis specifies only the form of the density function in the population, it is called a non-parametric
hypothesis.
5) Null and Alternative Hypothesis
A null hypothesis is a statistical hypothesis that is stated for possible acceptance; it is the original
hypothesis. Any hypothesis other than the null hypothesis is called an alternative hypothesis. When the null
hypothesis is rejected, we accept the alternative hypothesis. The null hypothesis is denoted by H0 and the
alternative hypothesis by H1.
Example:
When we want to test whether the population mean is 30, the null hypothesis is “The population mean is 30” and the
alternative hypothesis is “The population mean is not 30”.
Logic of Hypothesis Testing
The logic of hypothesis testing is similar to the "presumed innocent until proven guilty". In hypothesis testing, we
assume that the null hypothesis is a possible truth until the sample data conclusively demonstrate otherwise. A
hypothesis test is a statistical method that uses sample data to evaluate a hypothesis about a population.
The probability of rejecting the null hypothesis when it is true is called the Type I error rate, whereas the probability
of accepting the null hypothesis when it is false is called the Type II error rate. The probability of a Type II error is denoted by β.
Example:
Suppose a toy manufacturer and its main supplier agree that the quality of each shipment will meet a particular
benchmark. Our null hypothesis is that the quality is 90%. If we reject a shipment whose quality is in fact at least
90%, we have committed a Type I error (rejecting a true null hypothesis). If we accept a shipment whose quality is
below 90%, we have committed a Type II error (accepting a false null hypothesis).
Power of the Test
The power of a test is defined as the probability that the test rejects the null hypothesis when the alternative
hypothesis is true.
For a fixed level of significance, increasing the sample size decreases the probability of a Type II error, which in
turn increases the power. So the best way to increase the power is to increase the sample size.
Only one of the two errors is possible in any given test: a Type I error can occur only when the null hypothesis is
true, and a Type II error only when it is false.
The power of a test equals 1 minus the probability of a Type II error: Power = 1 − β.
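As a sketch of this relationship (all numbers are hypothetical), the power of an upper-tailed z test can be computed from the normal distribution, showing that power grows with the sample size:

```python
import math
from statistics import NormalDist

def power_upper_tailed(mu0, mu1, sigma, n, alpha=0.05):
    # Power of the upper-tailed z test of H0: mu = mu0 against the
    # alternative mu = mu1 (> mu0): Power = 1 - beta.
    z_alpha = NormalDist().inv_cdf(1 - alpha)       # critical value
    shift = (mu1 - mu0) / (sigma / math.sqrt(n))    # effect size in z units
    return 1 - NormalDist().cdf(z_alpha - shift)

# For a fixed alpha, a larger sample gives higher power against the
# same alternative:
small_n = power_upper_tailed(100, 103, 12, 16)
large_n = power_upper_tailed(100, 103, 12, 64)
print(small_n < large_n)  # True
```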
Multiple Hypothesis Testing
The problem of multiple hypothesis testing arises when more than one hypothesis must be tested
simultaneously for statistical significance. Multiple hypothesis testing occurs in a wide variety of fields and for a
variety of purposes.
An alternative framing of multiple hypothesis testing is as a multiple decision problem. When considering multiple
testing problems, the concern is with Type I errors for hypotheses that are true and Type II errors for hypotheses
that are false. The evaluation of the procedures is based on criteria that balance these two kinds of error.
Bayesian Hypothesis Testing
Bayesian hypothesis testing involves specifying a hypothesis and collecting evidence that supports or does not
support it. The accumulated evidence is used to express the degree of belief in the hypothesis in probabilistic
terms. The probability assigned to the hypothesis may become very high or very low: hypotheses with a high
posterior probability are accepted as true, and those with a low posterior probability are rejected as false.
Bayesian hypothesis testing works just like any other type of Bayesian inference. Consider the case where only two
hypotheses, H1 and H2, are under consideration.
The probability of our data, P(x), takes into account the possibility that each hypothesis under consideration is
true:

P(x) = P(x | H1) P(H1) + P(x | H2) P(H2)

If the P-value is smaller than the level of significance, the sample data fail to support the null hypothesis and we
reject H0; if the P-value is larger, we fail to reject the null hypothesis.
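Under the assumption of two exhaustive hypotheses with known likelihoods, the Bayesian update can be sketched as follows (the likelihood and prior values are hypothetical):

```python
def posterior_h1(lik_h1, lik_h2, prior_h1=0.5):
    # Bayes' rule for two exhaustive hypotheses:
    # P(H1 | x) = P(x | H1) P(H1) / [P(x | H1) P(H1) + P(x | H2) P(H2)]
    prior_h2 = 1.0 - prior_h1
    evidence = lik_h1 * prior_h1 + lik_h2 * prior_h2   # P(x)
    return lik_h1 * prior_h1 / evidence

# Hypothetical likelihoods: the data are four times as probable under H1
# as under H2, and both hypotheses start with equal prior belief.
print(posterior_h1(0.8, 0.2))  # 0.8
```

The denominator is exactly the P(x) expansion above, so the posterior degrees of belief for H1 and H2 always sum to one.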
Hypothesis Testing Example
Making Assumptions
In hypothesis testing we make assumptions about the level of measurement of the variable, the sampling method, the
shape of the population distribution, and the sample size. In our example, we made these assumptions:
We used a random sample.
Our variable, price, is at the interval-ratio level of measurement.
N > 50, so we need not assume a normal population.
Hypothesis Tests
Statisticians follow a formal process to determine whether to reject a null hypothesis, based on sample data. This
process, called hypothesis testing, consists of four steps.
State the hypotheses. This involves stating the null and alternative hypotheses. The
hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true,
the other must be false.
Formulate an analysis plan. The analysis plan describes how to use sample data to evaluate
the null hypothesis. The evaluation often focuses around a single test statistic.
Analyze sample data. Find the value of the test statistic (mean score, proportion, t statistic,
z-score, etc.) described in the analysis plan.
Interpret results. Apply the decision rule described in the analysis plan. If the value of the
test statistic is unlikely, based on the null hypothesis, reject the null hypothesis.
Decision Errors
Two types of errors can result from a hypothesis test.
Type I error. A Type I error occurs when the researcher rejects a null hypothesis when it is
true. The probability of committing a Type I error is called the significance level. This
probability is also called alpha, and is often denoted by α.
Type II error. A Type II error occurs when the researcher fails to reject a null hypothesis
that is false. The probability of committing a Type II error is called Beta, and is often
denoted by β. The probability of not committing a Type II error is called the Power of the
test.
Decision Rules
The analysis plan includes decision rules for rejecting the null hypothesis. In practice, statisticians describe these
decision rules in two ways: with reference to a P-value or with reference to a region of acceptance.
P-value. The strength of evidence in support of a null hypothesis is measured by the P-
value. Suppose the test statistic is equal to S. The P-value is the probability of observing a
test statistic as extreme as S, assuming the null hypothesis is true. If the P-value is less than
the significance level, we reject the null hypothesis.
Region of acceptance. The region of acceptance is a range of values. If the test statistic
falls within the region of acceptance, the null hypothesis is not rejected. The region of
acceptance is defined so that the chance of making a Type I error is equal to the significance
level.
The set of values outside the region of acceptance is called the region of rejection. If the
test statistic falls within the region of rejection, the null hypothesis is rejected. In such
cases, we say that the hypothesis has been rejected at the α level of significance.
These approaches are equivalent. Some statistics texts use the P-value approach; others use the region of
acceptance approach. In subsequent lessons, this tutorial will present examples that illustrate each approach.
One-Tailed and Two-Tailed Tests
A test of a statistical hypothesis, where the region of rejection is on only one side of the sampling distribution, is
called a one-tailed test. For example, suppose the null hypothesis states that the mean is less than or equal to 10.
The alternative hypothesis would be that the mean is greater than 10. The region of rejection would consist of a
range of numbers located on the right side of sampling distribution; that is, a set of numbers greater than 10.
A test of a statistical hypothesis, where the region of rejection is on both sides of the sampling distribution, is called
a two-tailed test. For example, suppose the null hypothesis states that the mean is equal to 10. The alternative
hypothesis would be that the mean is less than 10 or greater than 10. The region of rejection would consist of a
range of numbers located on both sides of sampling distribution; that is, the region of rejection would consist partly
of numbers that were less than 10 and partly of numbers that were greater than 10.
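The difference between the two kinds of test can be sketched numerically: for a z statistic, the one-tailed P-value uses a single tail of the standard normal distribution, while the two-tailed P-value uses both (a sketch using only the Python standard library):

```python
from statistics import NormalDist

def tail_p_values(z):
    # One-tailed (upper) p-value: area in the right tail only, P(Z > z).
    upper = 1 - NormalDist().cdf(z)
    # Two-tailed p-value: area in both tails beyond |z|.
    two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return upper, two_sided

one_tail, two_tail = tail_p_values(2.0)
print(round(one_tail, 4), round(two_tail, 4))  # 0.0228 0.0455
```

For the same statistic, the two-tailed P-value is twice the one-tailed value, which is why a two-tailed test demands stronger evidence to reject the null hypothesis at a given significance level.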