ANOVA Case Study
1. INTRODUCTION
Losing a job can be the most upsetting event in an individual’s life. A job loss means a lower
living standard in the present, anxiety about the future, and reduced self-esteem. In addition,
unemployment affects not only an individual but also the economy as a whole. As the country
develops, unemployment has become a focus and a priority for the growth of Vietnam’s
economy.
The unemployment rate is one of the economic indicators used in determining the general
state of the economy and its potential for growth. Based on the unemployment rate,
economists assess the economic climate and analyze where the country is
heading in terms of jobs and outlook. The unemployment rate also gives job seekers an idea of
how competitive the job market is and helps them decide on an appropriate course of action.
Depending on the unemployment figures, the government may step in, offering assistance to
jumpstart the economy.
The unemployment rate is affected by many factors, such as geography, education and custom.
Therefore, out of concern for the unemployment situation in Vietnam, we decided to carry out
a project to check whether the unemployment rate differs from region to region in Vietnam
and to determine the effect of the demographic factor on the differences in the growth of
these regions.
To answer this question, analysis of variance (ANOVA) was chosen as the statistical method.
Specifically, we test the equality of the mean annual unemployment rates in urban areas across
the eight regions: Northwest (Tay Bac), Northeast (Dong Bac), Red River Delta (Dong Bang Song
Hong), North Central Coast (Bac Trung Bo), South Central Coast (Nam Trung Bo), Central
Highlands (Tay Nguyen), Southeast (Dong Nam Bo) and Mekong River Delta (Dong Bang Song
Cuu Long). From the results of this analysis, we can offer some recommendations to the
government for solving the unemployment issue, a critical economic problem.
After analyzing and testing the collected data, we found sufficient statistical
evidence to believe that differences in the unemployment rate between these regions exist.
Specifically, as the value of the test statistic (F = 6.05) exceeds the
critical value ($F_{0.05,7,64} = 2.16$), we reject $H_0$.
2. RESEARCH METHODOLOGY
2.1 POPULATIONS
For this analysis, the experimental units are the years in which the unemployment rate was
recorded. The response variable is the annual unemployment rate in urban areas for each
region. The factor that defines the populations is the region of the country.
As a result, there are eight populations corresponding to the eight regions:
Northwest, Northeast, Red River Delta, North Central Coast, South Central Coast, Central
Highlands, Southeast and Mekong River Delta.
Since the level of significance is the probability of making a Type I error, we chose the
common significance level α = 0.05 for the ANOVA and all supporting tests in this case study,
which means a 5 in 100 chance of making a Type I error. This level of significance keeps the
risk of wrongly rejecting a true null hypothesis acceptably low for the ANOVA and the other tests.
We have eight populations representing the eight regions of Vietnam. For each population, we
check normality by producing a histogram of the sample drawn from that
population. As none of these eight histograms is extremely non-normal, we assume that the
eight populations are normally distributed.
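The histogram check described above can be sketched in a few lines of Python. This is only an illustration: the sample values and the bin width of 0.5 are hypothetical and are not taken from the study's data.

```python
from collections import Counter
import math

def text_histogram(sample, bin_width):
    """Return the rows of a text histogram: one row per bin, one '*' per observation."""
    # Assign each observation to the bin [b * bin_width, (b + 1) * bin_width).
    counts = Counter(math.floor(x / bin_width) for x in sample)
    rows = []
    for b in range(min(counts), max(counts) + 1):
        low = b * bin_width
        rows.append(f"{low:5.1f}-{low + bin_width:5.1f} | {'*' * counts[b]}")
    return rows

# Hypothetical nine yearly unemployment rates (%) for one region, for illustration only.
hypothetical_sample = [5.1, 5.6, 5.9, 6.0, 6.2, 6.4, 6.5, 6.9, 7.4]
rows = text_histogram(hypothetical_sample, bin_width=0.5)
for row in rows:
    print(row)
```

A roughly symmetric, single-peaked shape is what the informal check looks for. With only nine observations per region, such a plot can rule out extreme non-normality but cannot confirm normality.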
To save time, we test only the difference between the lowest and highest sample variances. If
these two do not differ significantly, we can believe that the variances of the eight
populations are equal.
$F = \frac{MST}{MSE}$
$\alpha = 0.05$
Total: sum of squares = 72.717, d.f. = 71
Step 6: Conclusion
Therefore, there is enough statistical evidence to infer at the 5% level of significance that
differences exist among the annual urban unemployment rates of the eight regions.
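As a minimal sketch of how the statistic F = MST/MSE is obtained, the function below computes the one-way ANOVA quantities from a list of samples. The toy data are hypothetical and only exercise the function; the values 6.05 and 2.16 in the decision rule are the ones reported in this study.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (SST, SSE, MST, MSE, F) for a list of samples, one sample per treatment."""
    k = len(groups)                      # number of treatments (regions)
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-treatments variation
    sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-treatments (error) variation
    sse = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    mst = sst / (k - 1)
    mse = sse / (n - k)
    return sst, sse, mst, mse, mst / mse

# Hypothetical toy data (three treatments, three observations each), not the study's data.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
sst, sse, mst, mse, f_toy = one_way_anova(toy_groups)

# Decision rule with the values reported in this study.
F_OBSERVED = 6.05   # test statistic reported in the text
F_CRITICAL = 2.16   # F(0.05; 7, 64) as reported in the text
reject_h0 = F_OBSERVED > F_CRITICAL
```

With eight regions and nine years per region, k - 1 = 7 and n - k = 64, which matches the degrees of freedom of the critical value and the total of 71.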
Because the objective is to compare two population means, the two populations are
assumed to be normally distributed based on the histograms of their samples, and the
population variances are believed to be equal, we conduct a t-test for the population means.
The detailed procedure of this test is described in the appendix. The result of this test shows
that we have enough statistical evidence to infer at the 5% level of significance that these two
regions have different unemployment rates. This test makes the result of the ANOVA more
believable.
4. EVALUATION
4.1 LIMITATIONS
In the entire process of ANOVA testing, we have to admit that there are some limitations
regarding our collected data and the assumptions.
Firstly, regarding the conditions required for applying the ANOVA hypothesis testing
procedure, we had to assume that the variances of the eight populations are equal and that the
populations are normally distributed. In fact, the F-test for population variances indicates
that the population variances, whose sample estimates range from 0.252675 to 1.681775, may
not be equal. In addition, the normality requirement is checked solely on the basis of the
histograms, so the normality assumption may not be entirely correct. These limitations affect
the result of our test, making it less than fully reliable.
Secondly, our topic covers only one factor affecting the unemployment rates in the eight
main regions of Vietnam, the geographic factor, although several other factors affect this
rate, such as government policies, the quality of the workforce and the types of
unemployment. Therefore, we can only explain the differences in terms of geography,
which is not enough to address the problem of unemployment.
The last main limitation concerns the data used in our research. Because of the
limited data sources, the figures are not recent: they are the annual
unemployment rates measured in the period from 2000 to 2008. Another problem is that we
only took the data for the urban areas of the eight regions, so the result is not enough to
provide a general view of the unemployment issue.
4.2 IMPLICATION
In spite of the drawbacks mentioned above, this study offers some insight into the
differences in the unemployment rate among regions in Vietnam. In more detail, the
results of this research show the importance of the geographical factor to the unemployment
rate in particular, and to the labor market and the whole economy in general. Moreover, this
may also suggest further studies into this issue, preferably on a larger scale and with more
sophistication.
5.2 RECOMMENDATIONS
From the figures used in this research, it is undeniable that the average unemployment
rate in Vietnam was quite high in the period from 2000 to 2008. Unemployment negatively
affects the living standard of people who are out of work and influences the
economy as well as the whole society. As geography is one major factor affecting the
unemployment rate, more research should be conducted to explain the effect of the geographic
factor on the unemployment rate. The government should consider the results of this research
to find relevant solutions to the unemployment problem.
In addition, we recommend that follow-up studies examine other factors affecting
unemployment. They should also adopt more sophisticated testing methods so that the
unemployment rate is understood more clearly.
Lastly, it is highly advisable that subsequent research cover a larger population with
up-to-date figures, which will help enhance the meaningfulness of the test.
REFERENCES
• Selvanathan, A., Selvanathan, S., Keller, G. and Warrack, B., 2004, Australian Business
Statistics, 3rd edition, Thomson, Australia
• Levine, D.M., Stephan, D.F., Krehbiel, T.C. and Berenson, M.L., 2004, Statistics for
Managers, 5th edition, Pearson, USA
• General Statistics Office, ‘The unemployment rate of labor force in urban area by regions
(1996-2008)’, downloaded March 18th 2011 at
http://www.molisa.gov.vn/docs/SLTK/DetailSLTK/tabid/215/DocID/4686/TabModuleSettingsId/496/language/vi-VN/Default.aspx
APPENDIX
1. HISTOGRAMS
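$H_0: \sigma_1^2 / \sigma_2^2 = 1$
$H_a: \sigma_1^2 / \sigma_2^2 \neq 1$
As we want to compare the two population variances and the populations are independently
sampled, we use the F-statistic:
$F = s_1^2 / s_2^2$
$\alpha = 0.05$
$F_{1-\alpha/2,\nu_1,\nu_2} = F_{0.975,8,8} \approx 0.23$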
$s_1^2 \approx 1.68$
$s_2^2 \approx 0.25$
$F = s_1^2 / s_2^2 = 1.68 / 0.25 = 6.72$
Step 6: Conclusion
There is enough evidence to conclude at 5% level of significance that the variances of these
two populations differ.
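$H_0: \sigma_1^2 / \sigma_2^2 = 1$
$H_a: \sigma_1^2 / \sigma_2^2 \neq 1$
As we want to compare the two population variances and the populations are independently
sampled, we use the F-statistic:
$F = s_1^2 / s_2^2$
$\alpha = 0.05$
$F_{1-\alpha/2,\nu_1,\nu_2} = F_{0.975,8,8} \approx 0.23$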
$s_1^2 \approx 1.68$
$s_2^2 \approx 0.45$
$F = s_1^2 / s_2^2 = 1.68 / 0.45 = 3.73$
Step 6: Conclusion
There is not enough evidence to conclude at 5% level of significance that the variances of
these two populations differ.
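The two variance tests above can be replayed with a short sketch. The sample variances and the lower critical value 0.23 come from the text. The upper critical value $F_{0.025,8,8} \approx 4.43$ is not shown in the text and is assumed here from standard F tables (it is also the reciprocal of the lower value, up to rounding, because both degrees of freedom equal 8).

```python
def variance_ratio_test(s1_sq, s2_sq, lower_crit, upper_crit):
    """Two-tailed F-test for equal variances: return (F, reject_H0)."""
    f_stat = s1_sq / s2_sq
    # Reject H0 when F falls in either tail of the F distribution.
    reject = f_stat < lower_crit or f_stat > upper_crit
    return f_stat, reject

LOWER_CRIT = 0.23  # F(0.975; 8, 8), as given in the text
UPPER_CRIT = 4.43  # F(0.025; 8, 8), assumed from standard tables (not in the text)

# First test: highest versus lowest sample variance.
f_first, reject_first = variance_ratio_test(1.68, 0.25, LOWER_CRIT, UPPER_CRIT)
# Second test: the two samples later compared with the t-test.
f_second, reject_second = variance_ratio_test(1.68, 0.45, LOWER_CRIT, UPPER_CRIT)

print(round(f_first, 2), reject_first)
print(round(f_second, 2), reject_second)
```

Under these assumptions the first ratio falls in the upper rejection region and the second does not, which agrees with the two conclusions stated above.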
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 \neq 0$
Because the objective is to compare two population means, the two populations are
independently sampled and normally distributed, and the variances are unknown but assumed
to be equal (as tested above), we use the t-statistic for populations with equal variances:
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$
$\alpha = 0.05$
Reject $H_0$ if $|t| > t_{0.025,16} \approx 2.12$
                Central Highlands    Red River Delta
Observations    9                    9
D.f             16
t Stat          -4.7568011
Step 6: Conclusion
As $|t| \approx 4.76 > 2.12$, we reject $H_0$.
Therefore, there is enough statistical evidence to infer at the 5% level of significance that the
unemployment rates in the Red River Delta and Central Highlands regions are different.
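A minimal sketch of the equal-variances t-statistic used above is given below. The summary statistics passed to the function are hypothetical, since the sample means of the two regions are not reproduced in this appendix; only the sample sizes (9 and 9) and the critical value 2.12 come from the text.

```python
import math

def pooled_t(mean1, mean2, var1, var2, n1, n2):
    """Equal-variances t-statistic for H0: mu1 - mu2 = 0; return (t, degrees of freedom)."""
    df = n1 + n2 - 2
    # Pooled variance: average of the sample variances weighted by degrees of freedom.
    sp_sq = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t_stat = (mean1 - mean2) / math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    return t_stat, df

T_CRITICAL = 2.12  # t(0.025; 16), as used in the text

# Hypothetical summary statistics, for illustration only.
t_stat, df = pooled_t(mean1=2.0, mean2=4.0, var1=1.0, var2=1.0, n1=9, n2=9)
reject_h0 = abs(t_stat) > T_CRITICAL  # two-tailed decision rule
```

The sign of t only reflects which region is labeled as sample 1, so the two-tailed rule compares the absolute value with the critical value.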