
Make a descriptive note on hypothesis testing in reference to testing the significance of the difference of means using information from large samples.


The note should answer the following specific questions:

a) What is a hypothesis and what are its types?

b) What are confidence and significance levels?

c) What are the errors associated with decisions based on hypothesis testing?

d) What is critical value and critical region?

e) What is test statistic? What is its formula?

f) How are decisions made on the basis of statistical calculations?

g) What is the importance of hypothesis testing in business?

Ans: Hypothesis testing is one of the most important aspects of the theory of decision making. It consists of the decision rules required for drawing probabilistic inferences about population parameters. It often involves deciding, at a given point in time, whether a given population parameter is the same as before, is as claimed, or has changed.

A quantitative statement about a population parameter is called a hypothesis. In other words, it is an assumption made about the population parameter whose validity is then tested. It may or may not be found valid on verification. The act of verification involves testing the validity of such an assumption; when this is undertaken on the basis of sample evidence, it is called testing a statistical hypothesis, or a test of significance. In other words, a procedure for assessing the significance of a statistic, or of the difference between two independent statistics, is known as a test of significance. By testing the hypothesis we can find out whether it deserves acceptance or rejection. The truth or falsity of a statistical hypothesis is judged on the basis of the information contained in the sample, which may be consistent or inconsistent with the hypothesis; accordingly, the hypothesis is accepted or rejected. Acceptance of a hypothesis means only that the sample provides no sufficient evidence to reject it; it does not necessarily imply that the hypothesis is true. The main goal of hypothesis testing is to judge, on the basis of sample information, whether the difference between the hypothesized population parameter and the sample statistic is significant or not.

Generally, two complementary hypotheses are set up at one time. If one of the hypotheses is accepted, then the other is rejected, and vice versa. The two complementary hypotheses set up in hypothesis testing are the null hypothesis and the alternative hypothesis.

Null hypothesis

A statistical hypothesis or assumption made about the population parameter, whose validity is tested for the purpose of possible acceptance, is called the null hypothesis. The null hypothesis is also called the hypothesis of no difference. We should adopt a neutral or "null" attitude regarding the outcome of the sample while setting up the null hypothesis. The null hypothesis is usually denoted by H₀ (H sub zero). For example, the null hypothesis may be set up as follows.

I. If we want to test the significance of the difference between a sample statistic and a population parameter, or between two sample statistics, then we set up the null hypothesis that there is no significant difference between the sample statistic and the population parameter, or between the two sample statistics. This means that the difference is just due to fluctuations of sampling.

II. If we want to test any statement about the population parameter, we set up the null hypothesis that it is true. For example, if the population mean μ has a specified value μ₀, then we set up the null hypothesis as

H₀: μ = μ₀. That is, the population mean has the specified value μ₀. In other words, there is no significant difference between the sample mean (x̄) and the population mean (μ).

Alternative hypothesis

A hypothesis complementary to the null hypothesis is called an alternative hypothesis. In other words, a hypothesis which is set up against the null hypothesis is called an alternative hypothesis. An alternative hypothesis is also called the hypothesis of difference. It is usually denoted by H₁ (H sub one). For example, if we are interested in testing the null hypothesis that the population mean μ has a specified value μ₀, then the null hypothesis is set up as:

H₀: μ = μ₀. That is, the population mean has the specified value μ₀.

We will consider three possible alternative hypotheses.

I. H₁: μ ≠ μ₀. That is, the population mean is not equal to μ₀ (a two-tailed alternative).

II. H₁: μ > μ₀. That is, the population mean is greater than μ₀ (a right-tailed alternative).

III. H₁: μ < μ₀. That is, the population mean is less than μ₀ (a left-tailed alternative).


Out of the above three possible alternative hypotheses, we should select only one, depending on the nature of the problem involving the parameter to which the null hypothesis relates.

Types of errors in testing of hypothesis (type I and type II)

The decision to accept or reject the null hypothesis H₀ is made on the basis of the information supplied by the sample data. There are four possibilities in testing a hypothesis:

I. Accepting the null hypothesis when the null hypothesis is true.

II. Rejecting the null hypothesis when the null hypothesis is true.

III. Accepting the null hypothesis when the null hypothesis is false.

IV. Rejecting the null hypothesis when the null hypothesis is false.

The above decisions can be presented in the following table.

              Accept H₀                          Reject H₀
  H₀ true     Correct decision (no error)        Wrong decision (Type I error)
              Probability = 1 - α                Probability = α
  H₀ false    Wrong decision (Type II error)     Correct decision (no error)
              Probability = β                    Probability = 1 - β
In the above cases, decisions I and IV are correct decisions, while decisions II and III are wrong.

In testing a hypothesis, we may commit two types of errors. The error committed in rejecting the null hypothesis H₀ when it is true is called a Type I error, or the error of the first kind, and its probability is denoted by α. The error committed in accepting the null hypothesis H₀ when it is false is called a Type II error, or the error of the second kind, and its probability is denoted by β. Thus,

P(rejecting H₀ when it is true) = P(Type I error) = α, and

P(accepting H₀ when it is false) = P(Type II error) = β,

where α and β are also called the sizes of the Type I error and Type II error respectively.

While inspecting the quality of a manufactured lot, a Type I error amounts to rejecting a good lot and a Type II error amounts to accepting a bad lot. Accordingly,

α = P(Type I error) = P(rejecting a good lot), and

β = P(Type II error) = P(accepting a bad lot),

where α and β are also known as the producer's risk and the consumer's risk respectively.

Efforts are made to reduce both Type I and Type II errors, but it is not possible to reduce both at the same time. The probability of making one type of error can be reduced only if we are willing to increase the probability of making the other type of error.
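
As an illustration of this trade-off, the following simulation sketch (not part of the standard treatment above; Python with NumPy and SciPy is assumed) estimates both error probabilities for a right-tailed z test of H₀: μ = 0 when the true mean is actually 0.5. The sample size, the true mean and the number of trials are all illustrative choices.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, trials = 25, 20_000          # illustrative sample size and number of simulated samples

for alpha in (0.10, 0.05, 0.01):
    z_crit = norm.ppf(1 - alpha)

    # Type I error: H0 (mu = 0) is true, but the test rejects it
    means_h0 = rng.normal(0.0, 1.0, size=(trials, n)).mean(axis=1)
    type_1 = np.mean(means_h0 * np.sqrt(n) > z_crit)

    # Type II error: H0 is false (true mu = 0.5), but the test accepts it
    means_h1 = rng.normal(0.5, 1.0, size=(trials, n)).mean(axis=1)
    type_2 = np.mean(means_h1 * np.sqrt(n) <= z_crit)

    print(f"alpha = {alpha:4.2f}: estimated Type I = {type_1:.3f}, Type II = {type_2:.3f}")
```

Lowering α (for example from 0.05 to 0.01) visibly raises the estimated Type II error, which is exactly the trade-off described above.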

Level of significance.

The maximum size of the Type I error that we are prepared to risk is called the level of significance. In other words, the probability of rejecting a true null hypothesis is called the level of significance and is denoted by α. Symbolically, it is defined as

α = P(Type I error) = P(rejecting a true null hypothesis)

The complement 1 - α, expressed as a percentage, is called the confidence level; for example, a 95% confidence level corresponds to α = 5%. Hypotheses are generally tested at the 1% or 5% level of significance, the most commonly used level in practice being 5%. If we adopt α = 5%, it means that in about 5 samples out of 100 we are likely to reject a correct H₀.
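
As a small aside (SciPy assumed, since the note itself names no software), the critical z values corresponding to these common significance levels can be looked up as follows; 1.645 and 1.96 at the 5% level are the values most often quoted for one- and two-tailed large-sample tests.

```python
from scipy.stats import norm

# Critical z values for common significance levels (large-sample z tests)
for alpha in (0.10, 0.05, 0.01):
    one_tailed = norm.ppf(1 - alpha)       # cutoff for a one-tailed test
    two_tailed = norm.ppf(1 - alpha / 2)   # cutoff for |z| in a two-tailed test
    print(f"alpha = {alpha:4.2f}: one-tailed z = {one_tailed:.3f}, two-tailed z = {two_tailed:.3f}")
```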

One tailed and two tailed test

The critical region may be represented by a portion of the area under the normal probability curve of the sampling distribution of the statistic in two ways:

I. two tails or sides under the curve, or

II. one tail or side under the curve, which is either the right or the left tail.

A test of hypothesis whose critical region is represented by both tails under the normal curve is called a two-tailed test. In other words, a two-tailed test is one in which the null hypothesis is rejected if the sample value is significantly higher or lower than the hypothesized value of the population parameter. A test whose critical region lies entirely in one tail of the curve is called a one-tailed test.

Critical value and critical region

A critical value is a line on a graph that splits the graph into sections. One or two of the sections is the "rejection region"; if your test value falls into that region, then you reject the null hypothesis.

[Figure: sampling distribution (normal curve) for a one-tailed test, with the rejection region in one tail; the critical value is the bold vertical line at the left edge of that region.]

The critical value approach involves determining whether the observed test statistic is "likely" or "unlikely" by checking whether it is more extreme than would be expected if the null hypothesis were true. That is, it entails comparing the observed test statistic to a cutoff value, called the "critical value." If the test statistic is more extreme than the critical value, then the null hypothesis is rejected in favor of the alternative hypothesis. If the test statistic is not as extreme as the critical value, then the null hypothesis is not rejected.

Specifically, the four steps involved in using the critical value approach to conducting
any hypothesis test are:

1. Specify the null and alternative hypotheses.

2. Using the sample data and assuming the null hypothesis is true, calculate the value of the test statistic. To conduct a hypothesis test for the population mean μ when the population standard deviation is unknown, we use the t statistic, which follows a t-distribution with n - 1 degrees of freedom.

3. Determine the critical value by finding the value of the known distribution of the test statistic such that the probability of making a Type I error, which is denoted α (the Greek letter "alpha") and is called the "significance level of the test," is small (typically 0.01, 0.05, or 0.10).

4. Compare the test statistic to the critical value. If the test statistic is more extreme in the
direction of the alternative than the critical value, reject the null hypothesis in favor of the
alternative hypothesis. If the test statistic is less extreme than the critical value, do not reject the
null hypothesis.
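
A minimal sketch of these four steps in Python (SciPy assumed; all the numbers below are hypothetical and chosen only for illustration) for a right-tailed test of a population mean:

```python
import math
from scipy.stats import t

# Step 1: hypotheses (hypothetical figures): H0: mu = 50  vs  H1: mu > 50
mu0, alpha = 50.0, 0.05

# Step 2: test statistic from assumed sample summaries
x_bar, s, n = 52.0, 6.0, 25
t_stat = (x_bar - mu0) / (s / math.sqrt(n))

# Step 3: critical value of the t distribution with n - 1 degrees of freedom
t_crit = t.ppf(1 - alpha, df=n - 1)

# Step 4: compare and decide
decision = "reject H0" if t_stat > t_crit else "do not reject H0"
print(f"t = {t_stat:.3f}, critical value = {t_crit:.3f}: {decision}")
```

With these illustrative figures, t is about 1.667 and falls short of the critical value of about 1.711, so H₀ would not be rejected at the 5% level.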

Test statistic

A test statistic is a statistic (a quantity derived from the sample) used in statistical
hypothesis testing. A hypothesis test is typically specified in terms of a test statistic, considered
as a numerical summary of a data set that reduces the data to one value that can be used to
perform the hypothesis test. In general, a test statistic is selected or defined in such a way as to
quantify, within observed data, behaviors that would distinguish the null from the alternative
hypothesis, where such an alternative is prescribed, or that would characterize the null
hypothesis if there is no explicitly stated alternative hypothesis.

An important property of a test statistic is that its sampling distribution under the null
hypothesis must be calculable, either exactly or approximately, which allows p-values to be
calculated. A test statistic shares some of the same qualities of a descriptive statistic, and many
statistics can be used as both test statistics and descriptive statistics. However, a test statistic is
specifically intended for use in statistical testing, whereas the main quality of a descriptive
statistic is that it is easily interpretable. Some informative descriptive statistics, such as the
sample range, do not make good test statistics since it is difficult to determine their sampling
distribution.

Example

For example, suppose the task is to test whether a coin is fair (i.e. has equal probabilities
of producing a head or a tail). If the coin is flipped 100 times and the results are recorded, the
raw data can be represented as a sequence of 100 heads and tails. If there is interest in the
marginal probability of obtaining a head, only the number T out of the 100 flips that produced a
head needs to be recorded. But T can also be used as a test statistic in one of two ways:

The exact sampling distribution of T under the null hypothesis is the binomial distribution with parameters 0.5 and 100.

The value of T can be compared with its expected value under the null hypothesis of 50, and, since the sample size is large, a normal distribution can be used as an approximation to the sampling distribution, either for T or for the revised test statistic T - 50.

Using one of these sampling distributions, it is possible to compute either a one-tailed or two-
tailed p-value for the null hypothesis that the coin is fair. Note that the test statistic in this case
reduces a set of 100 numbers to a single numerical summary that can be used for testing.
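
A brief sketch of this coin example in Python (SciPy assumed; the observed count of 58 heads is an illustrative number, not given in the text) showing both the exact binomial p-value and the large-sample normal approximation:

```python
from scipy.stats import norm, binom

n, p0 = 100, 0.5           # 100 flips, fair-coin null hypothesis
t_obs = 58                 # illustrative observed number of heads (hypothetical)

# Exact two-tailed p-value from the binomial sampling distribution of T
p_exact = 2 * min(binom.cdf(t_obs, n, p0), binom.sf(t_obs - 1, n, p0))

# Large-sample normal approximation: z = (T - n*p0) / sqrt(n*p0*(1 - p0))
z = (t_obs - n * p0) / (n * p0 * (1 - p0)) ** 0.5
p_approx = 2 * norm.sf(abs(z))

print(f"exact p = {p_exact:.4f}, normal-approximation p = {p_approx:.4f}")
```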

The different formulas for the test statistic are given below:

I. Standardized test statistic (z score), used when the population standard deviation σ is known or the sample is large:

z = (x̄ - μ) / (σ / √n)

II. t score (single population), used when σ is unknown and the sample is small:

t = (x̄ - μ) / (s / √n), with n - 1 degrees of freedom

III. t score (two populations), for the difference of two means:

t = (x̄₁ - x̄₂) / √(s₁²/n₁ + s₂²/n₂)
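
Because the question specifically concerns the significance of the difference of means from large samples, the following sketch shows that z test in Python (SciPy assumed; the two sets of summary figures are hypothetical):

```python
import math
from scipy.stats import norm

# Hypothetical large-sample summaries for two independent groups
n1, x1_bar, s1 = 120, 52.3, 8.1
n2, x2_bar, s2 = 150, 50.1, 7.6
alpha = 0.05

# z statistic for H0: mu1 = mu2 against H1: mu1 != mu2 (two-tailed)
se = math.sqrt(s1**2 / n1 + s2**2 / n2)
z = (x1_bar - x2_bar) / se

z_crit = norm.ppf(1 - alpha / 2)      # two-tailed critical value, about 1.96
p_value = 2 * norm.sf(abs(z))

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.4f}")
print("reject H0" if abs(z) > z_crit else "do not reject H0")
```

With these illustrative figures, |z| exceeds 1.96, so the difference between the two sample means would be judged significant at the 5% level.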

Problem

The mean life of a particular type of battery is 75 hours. A sample of 9 batteries is chosen and found to have a standard deviation of 10 hours and a mean of 80 hours. Find the standardized test statistic.

Solution:

The population standard deviation isn't known, so we use the t-score formula, t = (x̄ - μ₀) / (s / √n).

Step 1: Plug the information into the formula and solve:

x̄ = sample mean = 80

μ₀ = population mean = 75

s = sample standard deviation = 10

n = sample size = 9

t = (80 - 75) / (10 / √9) = 5 / (10/3) = 1.5

This means that the standardized test statistic (in this case, the t-score) is 1.5.
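
The same figure can be checked in code (SciPy assumed); the comparison with a 5% critical value is added only as an illustration, since the problem itself asks only for the statistic:

```python
import math
from scipy.stats import t

x_bar, mu0, s, n = 80, 75, 10, 9

t_stat = (x_bar - mu0) / (s / math.sqrt(n))   # (80 - 75) / (10 / 3) = 1.5
t_crit = t.ppf(0.95, df=n - 1)                # right-tailed critical value at the 5% level

print(f"t = {t_stat:.2f}, critical value = {t_crit:.2f}")
print("reject H0" if t_stat > t_crit else "do not reject H0")
```

Since 1.5 is smaller than the critical value of about 1.86 (8 degrees of freedom), H₀ would not be rejected at the 5% level.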

Decisions made on the basis of statistical calculations

Statistical Decision-Making Process: Unlike deterministic decision-making processes, such as linear optimization by solving systems of equations or parametric systems of equations, in decision making under uncertainty the variables are often more numerous and more difficult to measure and control. However, the steps are the same. They are:

1. Simplification

2. Building a decision model


3. Testing the model

4. Using the model to find the solution:

It is a simplified representation of the actual situation.

It need not be complete or exact in all respects.

It concentrates on the most essential relationships and ignores the less essential ones.

It is more easily understood than the empirical (i.e., observed) situation, and hence permits the problem to be solved more readily with minimum time and effort.

5. It can be used again and again for similar problems, or can be modified.

Common Statistical Terminology with Applications:

Population: A population is any entire collection of people, animals, plants or things on which we may collect data. It is the entire group of interest, which we wish to describe or about which we wish to draw conclusions. For example, the lives of the light bulbs manufactured by, say, GE constitute the population of concern.

Qualitative and Quantitative Variables: Any object or event which can vary in successive observations, either in quantity or in quality, is called a "variable." Variables are classified accordingly as quantitative or qualitative. A qualitative variable, unlike a quantitative variable, does not vary in magnitude in successive observations. The values of quantitative and qualitative variables are called "variates" and "attributes", respectively.

Variable: A characteristic or phenomenon which may take different values, such as weight or gender, since these differ from individual to individual.

Randomness: Randomness means unpredictability. The fascinating fact about inferential statistics is that, although each random observation may not be predictable when taken alone, collectively they follow a predictable pattern called a distribution function. For example, it is a fact that the distribution of a sample average is approximately normal for sample sizes over 30. In other words, an extreme value of the sample mean is less likely than an extreme value among a few raw observations.
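
A small simulation sketch (NumPy assumed; the exponential population is an arbitrary illustrative choice) showing this behaviour of the sample average for samples of size 30:

```python
import numpy as np

rng = np.random.default_rng(0)

# Skewed "population": exponential distribution with mean 1
population_mean = 1.0
sample_size, n_samples = 30, 10_000

# Draw many samples of size 30 and record each sample's mean
sample_means = rng.exponential(scale=1.0, size=(n_samples, sample_size)).mean(axis=1)

print(f"mean of sample means   : {sample_means.mean():.3f}  (population mean = {population_mean})")
print(f"std dev of sample means: {sample_means.std():.3f}  (theory: {1.0 / np.sqrt(sample_size):.3f})")
```

The averages cluster tightly around the population mean, with a spread close to the theoretical σ/√n, which is why large-sample tests of means can rely on the normal distribution.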

Importance of hypothesis testing in business

Business owners like to know how their decisions will affect their business. Before making decisions, managers may explore the benefits of hypothesis testing, that is, experimenting with decisions in a "laboratory" setting. By making such tests, managers can have more confidence in their decisions.

How Hypotheses Help:

Essentially, good hypotheses lead decision-makers like you to new and better ways to achieve your business goals. When you need to make decisions such as how much you should spend on advertising or what effect a price increase will have on your customer base, it's easy to make wild assumptions or get lost in analysis paralysis. A business hypothesis solves this problem because, at the start, it's based on some foundational information. In all of science, hypotheses are grounded in theory. Theory tells you what you can generally expect from a certain line of inquiry. A hypothesis based on years of business research in a particular area, then, helps you focus, define and appropriately direct your research. You won't go on a wild goose chase to prove or disprove it. A hypothesis predicts the relationship between two variables. If you want to study pricing and customer loyalty, you won't waste your time and resources studying tangential areas.

Much of running a small business is a gamble, buoyed by boldness, intuition and guts. But wise business leaders also conduct formal and informal research to inform their business decisions. Good research starts with a good hypothesis, which is simply a statement making a prediction based on a set of observations. For example, if you're considering offering flexible work hours to your employees, you might hypothesize that this policy change will positively affect their productivity and contribute to your bottom line. The ultimate job of the hypothesis in business is to serve as a guidepost to your testing and research methods.

Conclusion

A hypothesis is generally a speculative statement that needs to be verified in a research work. During hypothesis formulation, it is important to keep the statement simple, precise and clear, and to derive it from an existing body of knowledge. The two complementary hypothesis categories are the null hypothesis and the alternative hypothesis.
