1. Introduction
The Narrowing Male-Female Unemployment Differential
Unemployment rates of developed and developing countries pose a complicated puzzle. Most
developed countries have higher unemployment rates than developing countries.
Unemployment (or joblessness) occurs when people are without work and actively seeking
work. The unemployment rate is a measure of the prevalence of unemployment and is calculated as a
percentage by dividing the number of unemployed individuals by all individuals currently in the labor
force. During periods of recession, an economy usually experiences a relatively high unemployment
rate. According to an International Labour Organization report, more than 197 million people globally,
or 6% of the world's workforce, were without a job in 2012.
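The calculation described above can be sketched in a few lines of Python. The function name and the input figures are hypothetical and are chosen only to illustrate the arithmetic; they are not taken from the survey.

```python
def unemployment_rate(unemployed: float, labour_force: float) -> float:
    """Return the unemployment rate as a percentage of the labour force."""
    if labour_force <= 0:
        raise ValueError("labour_force must be positive")
    return 100.0 * unemployed / labour_force


# Hypothetical figures: 60 unemployed people in a labour force of 1000.
example_rate = unemployment_rate(unemployed=60, labour_force=1000)
print(f"Unemployment rate: {example_rate:.1f}%")
```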
As gender roles have followed the formation of agricultural and then industrial societies, newly
developed professions and fields of occupation have been frequently inflected by gender. Some
examples of the ways in which gender affects a field include:
Note that these gender restrictions may not be universal in time and place, and that they operate to
restrict both men and women. However, in practice, norms and laws have historically restricted
women's access to particular occupations; civil rights laws and cases have thus primarily focused on
equal access to and participation by women in the workforce. These barriers may also be manifested in
hidden bias and by means of many micro-inequities.
Women in the workforce earning wages or a salary are part of a modern phenomenon, one that
developed at the same time as the growth of paid employment for men; yet women have been
challenged by inequality in the workforce. Until modern times, legal and cultural practices, combined
with the inertia of longstanding religious and educational conventions, restricted women's entry and
participation in the workforce. Economic dependency upon men, and consequently the poor socioeconomic status of women, have had the same impact, particularly as occupations have become
professionalized over the 19th and 20th centuries.
The main objective of the study is to identify how the unemployment rate of men affects the
unemployment rate of female workers.
In addition, by summarizing the data using descriptive statistics methods and presenting the main
features graphically, we can also compare how each variable affects the other.
2. PRESENTATION OF INFORMATION
The data given is a sample taken from a labour force survey conducted by a research company in Sri
Lanka. The survey was carried out as a household survey and all the members of a randomly selected
household, in the working age population (i.e. age 15) were
considered in the survey.
During the 1975-2010 period the unemployment rate for women was higher than the rate for men in
all but three years. In recent years, however, a dramatic and unanticipated narrowing of the
male-female unemployment rate differential has occurred. In 2007 and 2008 the female rate was less
than the male rate, and since 2009 the female rate has exceeded the male rate by a historically
small amount. The relatively high female unemployment rate has been taken as evidence of the
disadvantages women face in the job market, or of their relatively weak attachment to the labour
force. Since the narrowing of the male-female rate differential could indicate a change in these
underlying factors, a new examination of the male-female unemployment differential seems appropriate.
Year   Male UE   Female UE
1975   15.1      15.7
1976   12.8      14.4
1977   12.8      13.6
1978   12.8      13.3
1979   15.3      16.0
1980   14.2      14.9
1981   13.8      14.8
1982   14.1      14.7
1983   16.8      16.8
1984   15.2      15.9
1985   15.4      15.9
1986   16.4      17.2
1987   15.2      16.2
1988   15.2      16.5
1989   14.6      16.2
1990   14.0      15.5
1991   13.2      14.8
1992   13.1      15.2
1993   12.9      14.8
1994   12.8      14.7
1995   14.4      15.9
1996   15.3      16.9
1997   15.0      16.6
1998   14.2      16.0
1999   14.9      16.7
2000   17.9      19.3
2001   17.1      18.6
2002   16.3      18.2
2003   15.3      17.2
2004   15.1      16.8
2005   16.9      17.4
2006   17.4      17.9
2007   19.9      19.4
2008   19.9      19.2
2009   17.4      17.6
2010   17.0      17.4
3. METHODOLOGY
Statistics is a field of mathematics that pertains to data analysis. Statistical methods and equations can
be applied to a data set in order to analyze and interpret results, explain variations in the data, or predict
future data. A few examples of statistical information we can calculate are:
Span of values over which the data set occurs (range), and
Middle value of the ordered data set (median)
Mean
The mean is obtained by dividing the sum of observed values by the number of observations, n.
Although data points fall above, below, or on the mean, it can be considered a good estimate for
predicting subsequent data points. The formula for the mean is:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Median
The median is the middle value of a set of data containing an odd number of values, or the average of
the two middle values of a set of data with an even number of values. The median is especially helpful
when separating data into two equal sized bins.
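As a minimal sketch, the mean and median of the two series can be computed with Python's standard `statistics` module. The list names `MALE_UE` and `FEMALE_UE` are our own; the values are the yearly rates for 1975-2010 from the data table in Section 2.

```python
import statistics

# Yearly unemployment rates (%), 1975-2010, in chronological order.
MALE_UE = [15.1, 12.8, 12.8, 12.8, 15.3, 14.2, 13.8, 14.1, 16.8, 15.2, 15.4, 16.4,
           15.2, 15.2, 14.6, 14.0, 13.2, 13.1, 12.9, 12.8, 14.4, 15.3, 15.0, 14.2,
           14.9, 17.9, 17.1, 16.3, 15.3, 15.1, 16.9, 17.4, 19.9, 19.9, 17.4, 17.0]
FEMALE_UE = [15.7, 14.4, 13.6, 13.3, 16.0, 14.9, 14.8, 14.7, 16.8, 15.9, 15.9, 17.2,
             16.2, 16.5, 16.2, 15.5, 14.8, 15.2, 14.8, 14.7, 15.9, 16.9, 16.6, 16.0,
             16.7, 19.3, 18.6, 18.2, 17.2, 16.8, 17.4, 17.9, 19.4, 19.2, 17.6, 17.4]

for name, values in (("Male", MALE_UE), ("Female", FEMALE_UE)):
    mean = statistics.mean(values)      # sum of the values divided by n
    median = statistics.median(values)  # n = 36 is even: average of the two middle values
    print(f"{name}: mean = {mean:.4f}, median = {median:.2f}")
```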
Standard Deviation
The standard deviation gives an idea of how close the entire set of data is to the average value. Data
sets with a small standard deviation have tightly grouped, precise data. Data sets with large standard
deviations have data spread out over a wide range of values.
Variance
In probability theory and statistics, the variance is a measure of how far a set of numbers is spread out.
It is one of several descriptors of a probability distribution, describing how far the numbers lie from
the mean (expected value).
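The sketch below computes the variance and standard deviation of both series. It assumes the sample versions (divisor n - 1), which is what `statistics.variance` and `statistics.stdev` implement; the list names are our own and the values come from the data table in Section 2.

```python
import statistics

# Yearly unemployment rates (%), 1975-2010, in chronological order.
MALE_UE = [15.1, 12.8, 12.8, 12.8, 15.3, 14.2, 13.8, 14.1, 16.8, 15.2, 15.4, 16.4,
           15.2, 15.2, 14.6, 14.0, 13.2, 13.1, 12.9, 12.8, 14.4, 15.3, 15.0, 14.2,
           14.9, 17.9, 17.1, 16.3, 15.3, 15.1, 16.9, 17.4, 19.9, 19.9, 17.4, 17.0]
FEMALE_UE = [15.7, 14.4, 13.6, 13.3, 16.0, 14.9, 14.8, 14.7, 16.8, 15.9, 15.9, 17.2,
             16.2, 16.5, 16.2, 15.5, 14.8, 15.2, 14.8, 14.7, 15.9, 16.9, 16.6, 16.0,
             16.7, 19.3, 18.6, 18.2, 17.2, 16.8, 17.4, 17.9, 19.4, 19.2, 17.6, 17.4]

for name, values in (("Male", MALE_UE), ("Female", FEMALE_UE)):
    variance = statistics.variance(values)  # sample variance, divisor n - 1
    std_dev = statistics.stdev(values)      # square root of the sample variance
    print(f"{name}: variance = {variance:.3f}, std. deviation = {std_dev:.5f}")
```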
Skewness
Skewness is the degree of departure from symmetry of a
distribution. A positively skewed distribution has a "tail" which is
pulled in the positive direction. A negatively skewed distribution
has a "tail" which is pulled in the negative direction.
Since we have a large sample, it is appropriate to use the Fisher-Pearson statistic to calculate skewness.
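A possible implementation of the Fisher-Pearson skewness coefficient is sketched below. We assume the sample-size-adjusted form, g1 multiplied by sqrt(n(n-1))/(n-2), which is the variant commonly reported by statistical packages; the function name is hypothetical and the data are the series from Section 2.

```python
import math

# Yearly unemployment rates (%), 1975-2010, in chronological order.
MALE_UE = [15.1, 12.8, 12.8, 12.8, 15.3, 14.2, 13.8, 14.1, 16.8, 15.2, 15.4, 16.4,
           15.2, 15.2, 14.6, 14.0, 13.2, 13.1, 12.9, 12.8, 14.4, 15.3, 15.0, 14.2,
           14.9, 17.9, 17.1, 16.3, 15.3, 15.1, 16.9, 17.4, 19.9, 19.9, 17.4, 17.0]
FEMALE_UE = [15.7, 14.4, 13.6, 13.3, 16.0, 14.9, 14.8, 14.7, 16.8, 15.9, 15.9, 17.2,
             16.2, 16.5, 16.2, 15.5, 14.8, 15.2, 14.8, 14.7, 15.9, 16.9, 16.6, 16.0,
             16.7, 19.3, 18.6, 18.2, 17.2, 16.8, 17.4, 17.9, 19.4, 19.2, 17.6, 17.4]


def adjusted_skewness(values):
    """Adjusted Fisher-Pearson skewness: g1 * sqrt(n(n-1)) / (n-2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5                            # unadjusted Fisher-Pearson coefficient
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)


print(f"Male skewness   = {adjusted_skewness(MALE_UE):.3f}")
print(f"Female skewness = {adjusted_skewness(FEMALE_UE):.3f}")
```

A positive result corresponds to a tail pulled in the positive direction, as described above.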
Kurtosis
Kurtosis is the degree of peakedness of a distribution.
Most departures from normality display combinations of both skewness and kurtosis different from a
normal distribution.
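Kurtosis can be computed in the same spirit. The sketch below assumes the sample excess kurtosis (zero for a normal distribution) with the usual small-sample correction; the function name is hypothetical and the data are the series from Section 2.

```python
# Yearly unemployment rates (%), 1975-2010, in chronological order.
MALE_UE = [15.1, 12.8, 12.8, 12.8, 15.3, 14.2, 13.8, 14.1, 16.8, 15.2, 15.4, 16.4,
           15.2, 15.2, 14.6, 14.0, 13.2, 13.1, 12.9, 12.8, 14.4, 15.3, 15.0, 14.2,
           14.9, 17.9, 17.1, 16.3, 15.3, 15.1, 16.9, 17.4, 19.9, 19.9, 17.4, 17.0]
FEMALE_UE = [15.7, 14.4, 13.6, 13.3, 16.0, 14.9, 14.8, 14.7, 16.8, 15.9, 15.9, 17.2,
             16.2, 16.5, 16.2, 15.5, 14.8, 15.2, 14.8, 14.7, 15.9, 16.9, 16.6, 16.0,
             16.7, 19.3, 18.6, 18.2, 17.2, 16.8, 17.4, 17.9, 19.4, 19.2, 17.6, 17.4]


def excess_kurtosis(values):
    """Bias-corrected sample excess kurtosis (0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    s2 = sum((x - mean) ** 2 for x in values) / (n - 1)     # sample variance
    fourth = sum((x - mean) ** 4 for x in values) / s2 ** 2  # sum of z-scores to the 4th power
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction


print(f"Male excess kurtosis   = {excess_kurtosis(MALE_UE):.3f}")
print(f"Female excess kurtosis = {excess_kurtosis(FEMALE_UE):.3f}")
```

A positive value indicates a distribution more peaked than the normal, and a negative value a flatter one.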
Correlation
When two sets of data are strongly linked together, we say they have a high correlation.
The correlation value shows how strong the relationship is (not how steep the line is), and whether
it is positive or negative.
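As an illustration, the Pearson correlation coefficient between the two series can be computed directly from its definition. The function name is our own and the data are the series from Section 2.

```python
import math

# Yearly unemployment rates (%), 1975-2010, in chronological order.
MALE_UE = [15.1, 12.8, 12.8, 12.8, 15.3, 14.2, 13.8, 14.1, 16.8, 15.2, 15.4, 16.4,
           15.2, 15.2, 14.6, 14.0, 13.2, 13.1, 12.9, 12.8, 14.4, 15.3, 15.0, 14.2,
           14.9, 17.9, 17.1, 16.3, 15.3, 15.1, 16.9, 17.4, 19.9, 19.9, 17.4, 17.0]
FEMALE_UE = [15.7, 14.4, 13.6, 13.3, 16.0, 14.9, 14.8, 14.7, 16.8, 15.9, 15.9, 17.2,
             16.2, 16.5, 16.2, 15.5, 14.8, 15.2, 14.8, 14.7, 15.9, 16.9, 16.6, 16.0,
             16.7, 19.3, 18.6, 18.2, 17.2, 16.8, 17.4, 17.9, 19.4, 19.2, 17.6, 17.4]


def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(MALE_UE, FEMALE_UE)
print(f"r = {r:.3f}")
```

The sign of r gives the direction of the relationship, and its magnitude (between 0 and 1) gives the strength.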
Are there periodicities in the data, maybe controlled by daily or annual cycles?
Is there a trend?
Are two time series (e.g. global temperature and CO2 levels) correlated? If so, is there a delay
between the two?
A variety of methods have been invented to investigate such problems. Although I will endeavor to
make things simple, it cannot be denied that time series analysis is a rather complicated and technical
field, with many pitfalls and subtleties. The case studies will demonstrate some of these issues.
Regression
Linear regression uses one independent variable to explain and/or predict the outcome of Y.
The general form of linear regression is:
Linear Regression: Y = mX + b
Where:
Y = the variable that we are trying to predict
X = the variable that we are using to predict Y
b = the intercept
m = the slope
Regression takes a group of random variables, thought to be predicting Y, and tries to find a
mathematical relationship between them. This relationship is typically in the form of a straight line
(linear regression) that best approximates all the individual data points. Regression is often used to
determine how much specific factors such as the price of a commodity, interest rates, particular
industries or sectors influence the price movement of an asset.
When we choose to analyze our data using linear regression, part of the process involves checking
that the data can actually be analyzed using linear regression. We need to do this because linear
regression is only appropriate if our data "passes" six assumptions that are required for it to
give a valid result.
Assumption #1: Two variables should be measured at the interval or ratio level (i.e., they
are continuous). Examples of variables that meet this criterion include revision time (measured
in hours), intelligence (measured using IQ score), exam performance (measured from 0 to 100),
weight (measured in kg), and so forth.
Assumption #2: There needs to be a linear relationship between the two variables. Whilst there
are a number of ways to check whether a linear relationship exists between the two variables,
a simple one is to plot the dependent variable against the independent variable on a scatterplot
and then visually inspect the scatterplot for linearity.
Assumption #3: There should be no significant outliers. Outliers are simply single data points
within data that do not follow the usual pattern. The following scatterplots highlight the
potential impact of outliers:
The problem with outliers is that they can have a negative effect on the regression equation that
is used to predict the value of the dependent (outcome) variable based on the independent
(predictor) variable. This will reduce the predictive accuracy of the results.
Assumption #4: There should be independence of observations, which can be checked using
the Durbin-Watson statistic.
Assumption #5: Data needs to show homoscedasticity, which is where the variances along the
line of best fit remain similar as you move along the line.
Assumption #6: Finally, we need to check that the residuals (errors) of the regression line
are approximately normally distributed.
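To make the independence check in Assumption #4 concrete, the sketch below fits a least-squares line and computes the Durbin-Watson statistic from its residuals. The helper names are hypothetical, the fit uses the standard closed-form least-squares expressions, and the data are the series from Section 2.

```python
# Yearly unemployment rates (%), 1975-2010, in chronological order.
MALE_UE = [15.1, 12.8, 12.8, 12.8, 15.3, 14.2, 13.8, 14.1, 16.8, 15.2, 15.4, 16.4,
           15.2, 15.2, 14.6, 14.0, 13.2, 13.1, 12.9, 12.8, 14.4, 15.3, 15.0, 14.2,
           14.9, 17.9, 17.1, 16.3, 15.3, 15.1, 16.9, 17.4, 19.9, 19.9, 17.4, 17.0]
FEMALE_UE = [15.7, 14.4, 13.6, 13.3, 16.0, 14.9, 14.8, 14.7, 16.8, 15.9, 15.9, 17.2,
             16.2, 16.5, 16.2, 15.5, 14.8, 15.2, 14.8, 14.7, 15.9, 16.9, 16.6, 16.0,
             16.7, 19.3, 18.6, 18.2, 17.2, 16.8, 17.4, 17.9, 19.4, 19.2, 17.6, 17.4]


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = m*x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    m = sxy / sxx
    return m, mean_y - m * mean_x


def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


m, b = fit_line(MALE_UE, FEMALE_UE)
residuals = [yi - (m * xi + b) for xi, yi in zip(MALE_UE, FEMALE_UE)]
dw = durbin_watson(residuals)
print(f"Durbin-Watson = {dw:.3f}")
```

The statistic ranges from 0 to 4. Values near 2 suggest no first-order autocorrelation in the residuals, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.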
The above equation is more appropriate for a small data set where number of observations (n) 10, but
since our sample size is 36, the Fisher-Pearson equation would be more meaningful.
[Equation: Fisher-Pearson skewness calculation; not legible in the source]
The summary statistics for the female unemployment data set are calculated as follows.
The female unemployment data set is bi-modal, with modes at 14.8 and 15.9; therefore we calculate
skewness for both.
[Equation: skewness calculations for the female unemployment data; not legible in the source]
Given below are the descriptive statistics computed for the data set.
Table 1: Descriptive Statistics

Statistic               Male Unemployment   Female Unemployment
N                       36                  36
Minimum                 12.80               13.30
Maximum                 19.90               19.40
Mean                    15.2694             16.3389
Std. Deviation          1.85547             1.52994
Variance                3.443               2.341
Skewness (Statistic)    .734                .209
Skewness (Std. Error)   .393                .393
Kurtosis (Statistic)    .380                -.330
Kurtosis (Std. Error)   .768                .768
The mean unemployment rate is higher for the female workers.
Male unemployment has a positive kurtosis, which indicates that its leptokurtic distribution has a
higher peak than the normal distribution and heavier tails.
Female unemployment has a negative kurtosis, which indicates that its platykurtic distribution
has a lower peak than a normal distribution and lighter tails.
From the box-and-whisker plot we can see that female unemployment is close to symmetrically
distributed, while male unemployment is skewed.
In both distributions the median is less than the mean, indicating that they are positively skewed.
Figure 1
The scatter plot (Figure 1) shows how the unemployment rates for males and females change over the
time period. It also shows that both rates move similarly, indicating a positive correlation.
Validating assumptions
Assumption #1: Both data sets are percentage values and thus measured at ratio level.
Assumption #2: The scatter plot above shows a positive correlation, suggesting that a linear
relationship exists between the two variables.
Assumption #3: No significant outliers are found on examining the scatter plot (Figure 1).
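Assumption #4: Independence of observations is taken to be satisfied, since the sampling of one
person does not affect the outcome of another. The Durbin-Watson statistic (0.361) does not lie
between the lower and upper critical bounds; a value this far below 2 usually signals positive
autocorrelation in the residuals, so this assumption should be treated with caution.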
Figure 2
Assumption #6: Normality of errors. This assumption is often tested by simply plotting the
standardized residuals (each residual divided by its standard error) on a histogram with a
superimposed normal distribution.
Figure 3
From the histogram and the normal P-P plot we can see that the residuals follow a normal distribution.
Hence we can conclude that the data set meets all six assumptions that were examined, and a
regression analysis can be carried out on the data set.
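The standardized residuals used in the normality check can be sketched as follows. As a simplifying assumption, every residual is divided by the overall residual standard error, sqrt(SSE / (n - 2)), rather than by an observation-specific standard error; the variable names are our own and the data are the series from Section 2.

```python
import math

# Yearly unemployment rates (%), 1975-2010, in chronological order.
MALE_UE = [15.1, 12.8, 12.8, 12.8, 15.3, 14.2, 13.8, 14.1, 16.8, 15.2, 15.4, 16.4,
           15.2, 15.2, 14.6, 14.0, 13.2, 13.1, 12.9, 12.8, 14.4, 15.3, 15.0, 14.2,
           14.9, 17.9, 17.1, 16.3, 15.3, 15.1, 16.9, 17.4, 19.9, 19.9, 17.4, 17.0]
FEMALE_UE = [15.7, 14.4, 13.6, 13.3, 16.0, 14.9, 14.8, 14.7, 16.8, 15.9, 15.9, 17.2,
             16.2, 16.5, 16.2, 15.5, 14.8, 15.2, 14.8, 14.7, 15.9, 16.9, 16.6, 16.0,
             16.7, 19.3, 18.6, 18.2, 17.2, 16.8, 17.4, 17.9, 19.4, 19.2, 17.6, 17.4]

n = len(MALE_UE)
mean_x, mean_y = sum(MALE_UE) / n, sum(FEMALE_UE) / n

# Least-squares fit of female unemployment on male unemployment.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(MALE_UE, FEMALE_UE))
sxx = sum((x - mean_x) ** 2 for x in MALE_UE)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

residuals = [y - (slope * x + intercept) for x, y in zip(MALE_UE, FEMALE_UE)]
sse = sum(e ** 2 for e in residuals)      # residual sum of squares
residual_se = math.sqrt(sse / (n - 2))    # residual standard error
standardized = [e / residual_se for e in residuals]

within_two = sum(1 for z in standardized if abs(z) <= 2)
print(f"SSE = {sse:.3f}, residual standard error = {residual_se:.3f}")
print(f"{within_two} of {n} standardized residuals lie within +/-2")
```

For roughly normal residuals, about 95% of the standardized values would be expected to fall within +/-2; a histogram of `standardized` gives the visual check described above.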
Xsum - The sum of all the values in the x column (Male UE).
Ysum - The sum of all the values in the y column (Female UE).
XYsum - The sum of the products of the xn and yn that are recorded at the same time (vertical on this
chart).
X2sum - The total of each value in the x column squared and then added together.
Y2sum - The total of each value in the y column squared and then added together.
N - The total number of elements (or trials in your experiment).
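The best form for our line is slope-intercept form, which looks like y = mx + b. Therefore, it is only
necessary to compute m and b to determine the best-fit line. Those values can be computed by the
following equations:
$$m = \frac{N \cdot \mathrm{XYsum} - \mathrm{Xsum} \cdot \mathrm{Ysum}}{N \cdot \mathrm{X2sum} - (\mathrm{Xsum})^2}, \qquad b = \frac{\mathrm{Ysum} - m \cdot \mathrm{Xsum}}{N}$$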
After plugging in the values that we found, we get: m = 0.767 and b = 4.632.
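The calculation can be reproduced with a short script that builds the sums defined above and applies the standard closed-form least-squares expressions for m and b. We assume these are the expressions the text refers to; the variable names mirror Xsum, Ysum, XYsum and X2sum, and the data are the series from Section 2.

```python
# Yearly unemployment rates (%), 1975-2010, in chronological order.
MALE_UE = [15.1, 12.8, 12.8, 12.8, 15.3, 14.2, 13.8, 14.1, 16.8, 15.2, 15.4, 16.4,
           15.2, 15.2, 14.6, 14.0, 13.2, 13.1, 12.9, 12.8, 14.4, 15.3, 15.0, 14.2,
           14.9, 17.9, 17.1, 16.3, 15.3, 15.1, 16.9, 17.4, 19.9, 19.9, 17.4, 17.0]
FEMALE_UE = [15.7, 14.4, 13.6, 13.3, 16.0, 14.9, 14.8, 14.7, 16.8, 15.9, 15.9, 17.2,
             16.2, 16.5, 16.2, 15.5, 14.8, 15.2, 14.8, 14.7, 15.9, 16.9, 16.6, 16.0,
             16.7, 19.3, 18.6, 18.2, 17.2, 16.8, 17.4, 17.9, 19.4, 19.2, 17.6, 17.4]

n = len(MALE_UE)                                         # N
x_sum = sum(MALE_UE)                                     # Xsum
y_sum = sum(FEMALE_UE)                                   # Ysum
xy_sum = sum(x * y for x, y in zip(MALE_UE, FEMALE_UE))  # XYsum
x2_sum = sum(x * x for x in MALE_UE)                     # X2sum

# Closed-form least-squares slope and intercept.
m = (n * xy_sum - x_sum * y_sum) / (n * x2_sum - x_sum ** 2)
b = (y_sum - m * x_sum) / n

print(f"m = {m:.3f}, b = {b:.3f}")
```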
Since we have a large data set, SPSS was used to analyze and fit the regression line to the data set.
Table 2: Model Summary

R       R Square   Adjusted R Square
.930    .865       .861
This table provides the R and R² values. The R value is 0.930, which represents the simple
correlation and indicates a high degree of correlation. The R² value indicates how much of the
variation in the dependent variable, "female unemployment", can be explained by the independent
variable, "male unemployment". In this case, 86.5% can be explained, which is very large.
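For reference, R, R² and the adjusted R² can be recomputed from the raw data. The adjusted value below assumes the usual formula 1 - (1 - R²)(n - 1)/(n - k - 1) with k = 1 predictor; the variable names are our own and the data are the series from Section 2.

```python
import math

# Yearly unemployment rates (%), 1975-2010, in chronological order.
MALE_UE = [15.1, 12.8, 12.8, 12.8, 15.3, 14.2, 13.8, 14.1, 16.8, 15.2, 15.4, 16.4,
           15.2, 15.2, 14.6, 14.0, 13.2, 13.1, 12.9, 12.8, 14.4, 15.3, 15.0, 14.2,
           14.9, 17.9, 17.1, 16.3, 15.3, 15.1, 16.9, 17.4, 19.9, 19.9, 17.4, 17.0]
FEMALE_UE = [15.7, 14.4, 13.6, 13.3, 16.0, 14.9, 14.8, 14.7, 16.8, 15.9, 15.9, 17.2,
             16.2, 16.5, 16.2, 15.5, 14.8, 15.2, 14.8, 14.7, 15.9, 16.9, 16.6, 16.0,
             16.7, 19.3, 18.6, 18.2, 17.2, 16.8, 17.4, 17.9, 19.4, 19.2, 17.6, 17.4]

n = len(MALE_UE)
k = 1  # number of predictors (male unemployment only)

x_sum, y_sum = sum(MALE_UE), sum(FEMALE_UE)
xy_sum = sum(x * y for x, y in zip(MALE_UE, FEMALE_UE))
x2_sum = sum(x * x for x in MALE_UE)
y2_sum = sum(y * y for y in FEMALE_UE)

# Pearson correlation written in terms of the raw sums.
r = (n * xy_sum - x_sum * y_sum) / math.sqrt(
    (n * x2_sum - x_sum ** 2) * (n * y2_sum - y_sum ** 2)
)
r_squared = r ** 2
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R = {r:.3f}, R^2 = {r_squared:.3f}, adjusted R^2 = {adjusted_r_squared:.3f}")
```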
Table 3: ANOVA

Model 1      Sum of Squares   df   Mean Square   F         Sig.
Regression   70.828           1    70.828        217.010   .000
Residual     11.097           34   .326
Total        81.926           35
The next table is the ANOVA table. It indicates that the regression model predicts the
outcome variable significantly well. We say this because the Sig. value for the regression is .000,
which indicates the statistical significance of the regression model that was applied. Here,
p < 0.0005, which is less than 0.05, and indicates that, overall, the model applied can
statistically significantly predict the outcome variable.
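The sums of squares and the F statistic in the ANOVA table can be rebuilt from the data as sketched below. The p-value is left out because it needs the F distribution, which is not in the Python standard library; the variable names are our own and the data are the series from Section 2.

```python
# Yearly unemployment rates (%), 1975-2010, in chronological order.
MALE_UE = [15.1, 12.8, 12.8, 12.8, 15.3, 14.2, 13.8, 14.1, 16.8, 15.2, 15.4, 16.4,
           15.2, 15.2, 14.6, 14.0, 13.2, 13.1, 12.9, 12.8, 14.4, 15.3, 15.0, 14.2,
           14.9, 17.9, 17.1, 16.3, 15.3, 15.1, 16.9, 17.4, 19.9, 19.9, 17.4, 17.0]
FEMALE_UE = [15.7, 14.4, 13.6, 13.3, 16.0, 14.9, 14.8, 14.7, 16.8, 15.9, 15.9, 17.2,
             16.2, 16.5, 16.2, 15.5, 14.8, 15.2, 14.8, 14.7, 15.9, 16.9, 16.6, 16.0,
             16.7, 19.3, 18.6, 18.2, 17.2, 16.8, 17.4, 17.9, 19.4, 19.2, 17.6, 17.4]

n = len(MALE_UE)
mean_x, mean_y = sum(MALE_UE) / n, sum(FEMALE_UE) / n

# Least-squares fit of female unemployment on male unemployment.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(MALE_UE, FEMALE_UE))
sxx = sum((x - mean_x) ** 2 for x in MALE_UE)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
fitted = [slope * x + intercept for x in MALE_UE]

ss_regression = sum((f - mean_y) ** 2 for f in fitted)              # explained variation
ss_residual = sum((y - f) ** 2 for y, f in zip(FEMALE_UE, fitted))  # unexplained variation
ss_total = sum((y - mean_y) ** 2 for y in FEMALE_UE)

df_regression, df_residual = 1, n - 2
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual

print(f"SS regression = {ss_regression:.3f}, SS residual = {ss_residual:.3f}, "
      f"SS total = {ss_total:.3f}")
print(f"F({df_regression}, {df_residual}) = {f_statistic:.2f}")
```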
Table 4: Coefficients

Model 1             Unstandardized B   Std. Error   Standardized Beta   t        Sig.
(Constant)          4.632              .800                             5.787    .000
Male Unemployment   .767               .052         .930                14.731   .000
The Coefficients table above provides information on each predictor variable. This gives us
the information we need to predict female unemployment from male unemployment. We can see from
the Sig. column that both the constant and male unemployment contribute significantly to the model.
From the B column under Unstandardized Coefficients, we can present the regression equation as:
Female unemployment = 4.632 + 0.767 × (Male unemployment)
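Using the unstandardized coefficients from Table 4 (constant 4.632, slope 0.767), a prediction can be sketched as below. The function name and the example input of 15.0% are hypothetical and serve only to show how the coefficients are applied.

```python
# Unstandardized coefficients reported in Table 4.
INTERCEPT = 4.632  # (Constant)
SLOPE = 0.767      # Male Unemployment


def predict_female_ue(male_ue: float) -> float:
    """Predicted female unemployment rate (%) for a given male rate (%)."""
    return INTERCEPT + SLOPE * male_ue


# Hypothetical input: a male unemployment rate of 15.0%.
print(f"Predicted female rate: {predict_female_ue(15.0):.3f}%")
```

Predictions of this kind are only meaningful for male rates within the observed range of the data (12.8% to 19.9%).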
[Figure: Male and female unemployment rate by year, 1975-2010, with linear trend lines. Male: y = 0.1103x + 13.229, R² = 0.3923. Female: y = 0.1094x + 14.316, R² = 0.5672.]
A linear trend line is a best-fit straight line that is used with simple linear data sets. Data is linear if the pattern in its data points
resembles a line. A linear trend line usually shows that something is increasing or decreasing at a steady rate.
In the data set, the linear trend lines show that both male and female unemployment move consistently over the years. However,
the R-squared values (the fraction of variance explained by a model) are low, so the lines are not a good fit to the data.
This might have occurred because we used annual data, which can mask seasonal and cyclical effects; a better line of fit might
have been possible with monthly or quarterly data.
Further time series modeling is not meaningful here, because seasonal decomposition cannot be carried out on annual data. We take
the low R-squared values as a sign that a seasonal or cyclical component is present.
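The trend lines and their R-squared values can be recomputed with the sketch below. It assumes the years are indexed as x = 1, 2, ..., 36 (1975 = 1), as spreadsheet trend lines on a category axis usually are; the helper name is hypothetical and the data are the series from Section 2.

```python
# Yearly unemployment rates (%), 1975-2010, in chronological order.
MALE_UE = [15.1, 12.8, 12.8, 12.8, 15.3, 14.2, 13.8, 14.1, 16.8, 15.2, 15.4, 16.4,
           15.2, 15.2, 14.6, 14.0, 13.2, 13.1, 12.9, 12.8, 14.4, 15.3, 15.0, 14.2,
           14.9, 17.9, 17.1, 16.3, 15.3, 15.1, 16.9, 17.4, 19.9, 19.9, 17.4, 17.0]
FEMALE_UE = [15.7, 14.4, 13.6, 13.3, 16.0, 14.9, 14.8, 14.7, 16.8, 15.9, 15.9, 17.2,
             16.2, 16.5, 16.2, 15.5, 14.8, 15.2, 14.8, 14.7, 15.9, 16.9, 16.6, 16.0,
             16.7, 19.3, 18.6, 18.2, 17.2, 16.8, 17.4, 17.9, 19.4, 19.2, 17.6, 17.4]


def linear_trend(values):
    """Fit values against the index 1..n; return (slope, intercept, r_squared)."""
    n = len(values)
    index = list(range(1, n + 1))
    mean_t, mean_v = sum(index) / n, sum(values) / n
    stv = sum((t - mean_t) * (v - mean_v) for t, v in zip(index, values))
    stt = sum((t - mean_t) ** 2 for t in index)
    svv = sum((v - mean_v) ** 2 for v in values)
    slope = stv / stt
    intercept = mean_v - slope * mean_t
    r_squared = stv ** 2 / (stt * svv)  # share of variance explained by the trend
    return slope, intercept, r_squared


for name, series in (("Male", MALE_UE), ("Female", FEMALE_UE)):
    slope, intercept, r2 = linear_trend(series)
    print(f"{name}: y = {slope:.4f}x + {intercept:.3f}, R^2 = {r2:.4f}")
```

An R-squared well below 1 means that a straight line in time leaves much of the year-to-year variation unexplained.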
5. GENERAL DISCUSSION
From our results we can conclude the following.
According to our data set, the gap between male and female unemployment declines year by year and
the two rates move in a similar pattern, so each can be used to predict the other.
Our preferred explanations focus on the restrictions on the set of available jobs that are acceptable to
women, mainly due to the presence of young children that create frictions to their employment.
When mothers return to work after childbirth, they have to search the set of available vacancies, which
takes time and effort. But many firms have increased workplace assistance that helps mothers of young
children return to the previous firm in typical work, and these offers are immediately apparent without
the need for job search. So returning mothers, on average, now face fewer frictions in finding work after
childbirth. There is also evidence that new jobs taken by women are increasingly likely to continue into
a second year, which would also lower the inflow rate into unemployment. These pieces of evidence
may be consistent with a lowering of the natural rate of female unemployment, although of course that
is only one interpretation.
In order to get a better view of how the two variables behave, we can carry out further analysis by
obtaining monthly or quarterly data and by considering other factors that affect employment, such as
education level, health and inflation.