ΣX = 134, ΣY = 96, Σ(X*X) = 2174, Σ(Y*Y) = 1250, ΣXY = 1517
Ques. Explain how to calculate the Spearman Rank Correlation coefficient with
the help of an example.
Macro Economics 25
Ans. When the observations or measurements of the bivariate variable are based on
the ordinal scale in the form of ranks, the rank difference coefficient of
correlation is computed by using the following formula:
$$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$$
Where: ρ = the Spearman's Rank Correlation coefficient
D = difference between paired ranks
N = Number of subjects or items ranked
Steps to calculate Spearman's Rank Correlation Coefficient:
Step 1: Assign ranks to the values of variable X in descending order, i.e. the highest
value gets rank 1, the second highest value gets rank 2 and so on.
Step 2: Assign ranks to the values of variable Y in descending order, i.e. the highest
value gets rank 1, the second highest value gets rank 2 and so on.
Step 3: Calculate the difference between the ranks of the two variables for each pair to find D.
Step 4: Square the values of D and find their sum, i.e. ΣD².
Step 5: Put the values in the above formula to get the value of the rank correlation coefficient.
In case of repeated ranks, the tied values are each assigned the mean of the ranks
they occupy, i.e. if a value is repeated at the 5th and 6th positions then each is
assigned rank 5.5 and the next value gets the 7th rank. Similarly, if a value is
repeated thrice, the mean of the three ranks is allotted to each value, i.e. if a
value is repeated at the 5th, 6th and 7th positions then all three are assigned
rank 6 and the next value gets the 8th rank.
Example:
The following data give the scores of 10 students on two trials. Compute the
correlation between the scores of two trials by rank correlation method.
X     Y     Rank on X   Rank on Y   D      D*D
10    16    6.5         5.5         1      1
15    16    3           5.5         -2.5   6.25
11    24    5           1.5         3.5    12.25
14    18    4           4           0      0
16    22    2           3           -1     1
20    24    1           1.5         -0.5   0.25
10    14    6.5         7.5         -1     1
8     10    9           10          -1     1
7     12    10          9           1      1
9     14    8           7.5         0.5    0.25
Total                               0      24
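The final substitution into the formula can be checked with a short Python sketch. The function names `average_ranks` and `spearman_rho` are illustrative only and not part of the original notes. The sketch ranks in descending order, gives tied values the mean of their ranks, and then applies the rank difference formula as the steps describe. Note that this simple formula is exact only when there are no ties; with ties, as here, it is the usual textbook approximation.

```python
def average_ranks(values):
    """Rank values in descending order; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2  # positions are 1-based ranks
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Return (rho, sum of squared rank differences) using 1 - 6*sum(D^2)/(N(N^2-1))."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    n = len(x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1)), sum_d2


# Scores of the 10 students on the two trials (from the table above).
x = [10, 15, 11, 14, 16, 20, 10, 8, 7, 9]
y = [16, 16, 24, 18, 22, 24, 14, 10, 12, 14]

rho, sum_d2 = spearman_rho(x, y)
print(sum_d2, round(rho, 3))
```

With ΣD² = 24 and N = 10 the formula gives 1 - 144/990, roughly 0.85, which indicates a high positive correlation between the two trials.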
Ques. What is the interpretation of different values of correlation coefficient?
Ans.
Size of Correlation              Interpretation
+1 / -1                          Perfect positive / perfect negative correlation
+.90 to +.99 / -.90 to -.99      Very high positive / negative correlation
+.70 to +.90 / -.70 to -.90      High positive / negative correlation
+.50 to +.70 / -.50 to -.70      Moderate positive / negative correlation
+.30 to +.50 / -.30 to -.50      Low positive / negative correlation
+.10 to +.30 / -.10 to -.30      Very low positive / negative correlation
+.00 to +.10 / -.00 to -.10      Markedly low and negligible positive / negative correlation
Ques. What is the importance and use of Correlation in Statistical analysis?
Ans. Correlation is one of the most widely used analytic procedures in the field of
Educational Measurement and Evaluation. It not only describes the relationship
of paired variables, but it is also useful in:
1. Prediction of one variable - the dependent variable, on the basis of the other
variable, the independent variable.
2. Determining the reliability and validity of the test or the question paper.
3. Determining the role of various correlates to a certain ability.
4. Factor analysis technique for determining the factor loadings of the
underlying variables in human abilities.
Chapter 2
Regression Analysis
Ques. Explain the concept of Regression analysis in statistical enquiry.
Ans. Regression analysis is a statistical tool for the investigation of relationships
between variables. Usually, the investigator seeks to ascertain the causal effect of
one variable upon another: the effect of a price increase upon demand, for
example, or the effect of changes in the money supply upon the inflation rate. To
explore such issues, the investigator assembles data on the underlying variables
of interest and employs regression to estimate the quantitative effect of the causal
variables upon the variable that they influence. The investigator also typically
assesses the statistical significance of the estimated relationships, that is, the
degree of confidence that the true relationship is close to the estimated
relationship.
For purposes of illustration, suppose that we wish to identify and quantify the
factors that determine earnings in the labor market. A moment's reflection
suggests a myriad of factors that are associated with variations in earnings across
individuals: occupation, age, experience, educational attainment, motivation,
and innate ability come to mind, perhaps along with factors such as race and
gender that can be of particular concern to lawyers. For the time being, let us
restrict attention to a single factor, call it education. Regression analysis with a
single explanatory variable is termed simple regression.
At the outset of any regression study, one formulates some hypothesis about the
relationship between the variables of interest, here, education and earnings.
Common experience suggests that better educated people tend to make more
money. It further suggests that the causal relation likely runs from education to
earnings rather than the other way around. Thus, the tentative hypothesis is that
higher levels of education cause higher levels of earnings, other things being
equal.
To investigate this hypothesis, imagine that we gather data on education and
earnings for various individuals. Let E denote education in years of schooling for
each individual, and let I denote that individual's earnings in dollars per year.
We can plot this information for all of the individuals in the sample using a two-
dimensional diagram, conventionally termed a scatter diagram. Each point in
the diagram represents an individual in the sample.
The diagram indeed suggests that higher values of E tend to yield higher values
of I, but the relationship is not perfect: it seems that knowledge of E does not
suffice for an entirely accurate prediction about I. To refine the hypothesis
further, it is natural to suppose that people in the labor force with no education
nevertheless make some positive amount of money, and that education increases
earnings above this baseline. We might also suppose that education affects
income in a linear fashion, that is, each additional year of schooling adds the
same amount to income. This linearity assumption is common in regression
studies but is by no means essential to the application of the technique, and can
be relaxed where the investigator has reason to suppose a priori that the
relationship in question is nonlinear.
Then, the hypothesized relationship between education and earnings may be
written
I = a + bE + e
where
a = a constant amount (what one earns with zero education);
b = the effect in dollars of an additional year of schooling on income,
hypothesized to be positive; and
e = the noise term reflecting other factors that influence earnings.
The variable I is termed the dependent or endogenous variable; E is termed
the independent, explanatory, or exogenous variable; a is the constant
term and b the coefficient of the variable E. The estimated error for each
observation is defined as the vertical distance between the value of I along the
estimated line I = a + bE (generated by plugging the actual value of E into this
equation) and the true value of I for the same observation. Superimposing a
candidate line on the scatter diagram, the estimated errors for each observation
may be seen as follows:
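As a rough numerical counterpart to the diagram, the sketch below computes the estimated errors for one candidate line. The education and earnings figures, the candidate values of a and b, and the function name `estimated_errors` are all hypothetical, chosen only for illustration; they do not come from the original text. The error is taken as the observed value of I minus the value along the line I = a + bE.

```python
# Hypothetical sample: years of schooling (E) and annual earnings in dollars (I).
education = [8, 10, 12, 14, 16]
earnings = [18000, 24000, 27000, 35000, 38000]


def estimated_errors(a, b, E, I):
    """Observed I minus the value on the candidate line a + b*E, per observation."""
    return [i_obs - (a + b * e) for e, i_obs in zip(E, I)]


# Candidate line: a = 2000 (baseline earnings), b = 2200 (dollars per extra year).
errors = estimated_errors(2000, 2200, education, earnings)
for e, err in zip(education, errors):
    print(f"E = {e:2d}  estimated error = {err:6d}")
```

Points above the candidate line have positive errors and points below it have negative errors; a different choice of a and b would give a different set of errors.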
Ques. What is the mathematical formulation to find the value of parameters in a
regression analysis? Explain with an example.
Ans. Suppose we reckon that some variable of interest, y, is driven by some other
variable x. We then call y the dependent variable and x the independent variable.
In addition, suppose that the relationship between y and x is basically linear, but
is inexact: besides its determination by x, y has a random component, u, which
we call the disturbance or error.
Let i index the observations on the data pairs (x, y). The simple linear model
formalizes the ideas just stated:
$$Y_i = \beta_0 + \beta_1 X_i + u_i$$
The parameters $\beta_0$ and $\beta_1$ represent the y-intercept and the slope of the
relationship, respectively.
In order to work with this model we need to make some assumptions about the
behavior of the error term. For now we'll assume three things:
$E(u_i) = 0$: u has a mean of zero for all i
$E(u_i^2) = \sigma^2$: it has the same variance for all i
$E(u_i u_j) = 0,\ i \neq j$: no correlation across observations
We define the estimated error or residual associated with each pair of data values
as the actual $Y_i$ value minus the prediction based on $X_i$ along with the estimated
coefficients:
$$\hat{u}_i = Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i$$
In a scatter diagram of y against x, this is the vertical distance between observed
$Y_i$ and the fitted value. The most common technique for determining the coefficients
$\beta_0$ and $\beta_1$ is Ordinary Least Squares (OLS): values for $\hat{\beta}_0$ and $\hat{\beta}_1$ are chosen so as to
minimize the sum of the squared residuals or SSR. The SSR may be written as
$$SSR = \sum_i \hat{u}_i^2 = \sum_i (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2$$
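The minimization of SSR is a calculus exercise: we need to find the partial
derivatives of SSR with respect to both $\hat{\beta}_0$ and $\hat{\beta}_1$ and set them equal to zero. This
generates two equations (known as the normal equations of least squares) in the two
unknowns, $\hat{\beta}_0$ and $\hat{\beta}_1$. These equations are then solved jointly to yield the estimated
coefficients:
$$\hat{\beta}_1 = \frac{N \sum X_i Y_i - \sum X_i \sum Y_i}{N \sum X_i^2 - \left(\sum X_i\right)^2}, \qquad \hat{\beta}_0 = \frac{\sum Y_i - \hat{\beta}_1 \sum X_i}{N}$$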
Example:
X Values Y Values
60 3.1
61 3.6
62 3.8
63 4
65 4.1
Step 1: Count the number of values.
N = 5
Step 2: Find XY and X² for each pair of values.
See the table below.
X Value Y Value X*Y X*X
60 3.1 60 * 3.1 = 186 60 * 60 = 3600
61 3.6 61 * 3.6 = 219.6 61 * 61 = 3721
62 3.8 62 * 3.8 = 235.6 62 * 62 = 3844
63 4 63 * 4 = 252 63 * 63 = 3969
65 4.1 65 * 4.1 = 266.5 65 * 65 = 4225
Step 3: Find ΣX, ΣY, ΣXY, ΣX².
ΣX = 311
ΣY = 18.6
ΣXY = 1159.7
ΣX² = 19359
Step 4: Substitute in the slope formula given above.
Slope (β1) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²)
= ((5)*(1159.7) - (311)*(18.6)) / ((5)*(19359) - (311)²)
= (5798.5 - 5784.6) / (96795 - 96721)
= 13.9 / 74
= 0.19 (rounded from 0.1878)
Step 5: Now substitute in the intercept formula given above.
Intercept (β0) = (ΣY - β1(ΣX)) / N
= (18.6 - 0.19(311)) / 5
= (18.6 - 59.09) / 5
= -40.49 / 5
= -8.098
Step 6: Then substitute these values in the regression equation.
Regression Equation: y = β0 + β1x
= -8.098 + 0.19x
Suppose we want to know the approximate y value for x = 64. Then we
can substitute the value in the above equation.
Regression Equation: y = β0 + β1x
= -8.098 + 0.19(64)
= -8.098 + 12.16
= 4.06
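The same calculation can be sketched in Python as a check on the arithmetic. The function name `ols_fit` is illustrative and not part of the original notes; it simply applies the slope and intercept formulas used in Steps 4 and 5 to the five data pairs. Because the code keeps the slope unrounded (about 0.1878 rather than 0.19), the intercept comes out near -7.96 instead of -8.098, while the prediction at x = 64 still agrees with the hand calculation to two decimal places.

```python
def ols_fit(x, y):
    """Simple linear regression of y on x using the summation formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


# Data from the worked example above.
x = [60, 61, 62, 63, 65]
y = [3.1, 3.6, 3.8, 4.0, 4.1]

intercept, slope = ols_fit(x, y)
prediction = intercept + slope * 64  # estimated y at x = 64

print(round(slope, 4), round(intercept, 4), round(prediction, 2))
```

Rounding intermediate results, as in the hand calculation, is convenient but shifts the intercept noticeably; carrying full precision until the last step is the safer habit.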
Ques. What are the strength and limitations of regression analysis?
Ans. Strengths
Provides an opportunity to specify hypotheses concerning the nature of effects
(action theory), as well as explanatory factors.
If successfully executed, it can produce a quantitative estimate of net effects.
Limitations
The technique is demanding because it requires quantitative data relating to
several thousand individuals.
Implementing the data collection can be time-consuming and expensive.
In the case of a circular relationship between the variables, this method of analysis fails to produce a result.
Chapter 3
Interpolation
Ques. Explain the concept of interpolation.
Ans. Interpolation is defined as the technique of estimating the value of $Y_x$ for any
intermediate value of the variable X, where $Y_x = f(x)$ denotes the value of Y at X = x.
Suppose we are given the values $x_0, x_1, x_2, \ldots, x_n$ of X, and let the corresponding values
of Y be $Y_0, Y_1, Y_2, \ldots, Y_n$ respectively. If we want to estimate the value of $Y_x$
for any value of X between the limits $x_0$ and $x_n$, this can be done by applying
the technique of interpolation.
There are two main methods of Interpolation:
1. Binomial Expansion
2. Newton's method
Ques. Explain the binomial method of Interpolation.
Ans. Binomial Expansion method is used when values of independent variable X are
at equal intervals but one, two or more values of the dependent variable may be
missing. These missing values can be easily interpolated by using the following
results of the calculus of finite differences.
Suppose we are given (n+1) equidistant arguments but the entry corresponding to
any one of them is missing. Thus, we are given n entries and hence we can
express the function Y = f(X) by a polynomial of (n-1)th degree.
By fundamental theorem of finite differences, since Y = f(X) is a polynomial of
(n-1)th degree, (n-1)th order differences are constant, and nth and higher order
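differences are zero. Symbolically,
$$\Delta^n Y_x = 0, \quad \text{i.e.} \quad (E - 1)^n Y_x = 0$$
where $\Delta$ is the forward difference operator and $E$ is the shift operator, so that $E\,Y_x = Y_{x+h}$ for interval $h$ and $\Delta = E - 1$.
In particular, taking x = a (the first argument), we get
$$(E - 1)^n Y_a = 0$$
Expanding by the binomial theorem, we get
$$E^n Y_a - \binom{n}{1} E^{n-1} Y_a + \binom{n}{2} E^{n-2} Y_a - \cdots + (-1)^n Y_a = 0$$
$$Y_{a+nh} - \binom{n}{1} Y_{a+(n-1)h} + \binom{n}{2} Y_{a+(n-2)h} - \cdots + (-1)^n Y_a = 0$$
This is a single equation in the one missing entry, which can therefore be solved for it.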
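A minimal Python sketch of the binomial expansion method follows. The function name `fill_missing` and the sample series are hypothetical, introduced only to illustrate the idea: with n + 1 equidistant arguments and one missing entry, the nth order difference of the first entry is set to zero and the resulting equation is solved for the missing value.

```python
from math import comb


def fill_missing(values):
    """Estimate the single missing entry (None) among n + 1 equidistant values
    by setting the nth order difference of the first entry to zero."""
    n = len(values) - 1
    missing = [i for i, v in enumerate(values) if v is None]
    if len(missing) != 1:
        raise ValueError("exactly one entry must be missing")
    k = missing[0]
    # Coefficient of Y_i in the expansion of (E - 1)^n Y_0 is (-1)^(n - i) * C(n, i).
    coeffs = [(-1) ** (n - i) * comb(n, i) for i in range(n + 1)]
    known = sum(c * v for i, (c, v) in enumerate(zip(coeffs, values)) if i != k)
    return -known / coeffs[k]


# Hypothetical series at x = 1, 2, 3, 4, 5 with the value at x = 3 missing.
series = [1, 4, None, 16, 25]
estimate = fill_missing(series)
print(estimate)
```

For this series the equation is Y4 - 4Y3 + 6Y2 - 4Y1 + Y0 = 0, i.e. 25 - 64 + 6Y2 - 16 + 1 = 0, so the missing value is estimated as 9. The known values here follow a quadratic, so the estimate is exact; for real data it is only an approximation resting on the polynomial assumption.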
Chapter 4
Association of Attributes
Ques. What do you mean by attributes? What are the different criteria used for
studying the association of attributes?
Ans. An attribute is a quality or characteristic. Some examples of attributes
are gender, beauty and honesty. In the study of attributes, the objects are
classified according to the presence or absence of the attribute in them. Two
attributes are said to be associated if they are not independent but are related in
some way or the other.
The presence of attributes is represented by capital letters of the English
alphabet, i.e. A, B, C, D and so on, and their absence is denoted by small letters
of the Greek alphabet, i.e. α, β, γ, δ and so on.
We have the following criteria of studying the association between two attributes:
1. Proportion Method
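This method consists in comparing the proportion in which one attribute is present
among objects that possess the other attribute with the proportion among objects
that do not possess it.
Two attributes A and B are said to be:
Positively associated if (AB)/(B) > (Aβ)/(β) or (AB)/(A) > (αB)/(α)
Negatively associated if (AB)/(B) < (Aβ)/(β) or (AB)/(A) < (αB)/(α)
However, if (AB)/(B) = (Aβ)/(β) or (AB)/(A) = (αB)/(α), then A and B are independent.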
2. Comparison of Observed and Expected frequencies
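Two attributes A and B are said to be:
Positively associated if (AB) > (A)(B)/N or δ = (AB) - (A)(B)/N > 0
Negatively associated if (AB) < (A)(B)/N or δ = (AB) - (A)(B)/N < 0
However, if (AB) = (A)(B)/N or δ = 0, then A and B are independent.
Here N is the total number of observations and (A)(B)/N is the frequency of (AB)
expected if A and B were independent.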
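3. Yule's Coefficient of Association
This coefficient is a mathematical measure of the extent of association between
two attributes. Yule's Coefficient is represented by Q and is given by
$$Q = \frac{(AB)(\alpha\beta) - (A\beta)(\alpha B)}{(AB)(\alpha\beta) + (A\beta)(\alpha B)}$$
Q lies between -1 and +1. The attributes are:
Perfectly positively associated if Q = +1
Perfectly negatively associated if Q = -1
Independent if Q = 0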
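4. Coefficient of Colligation
The coefficient is represented by Y and is given by
$$Y = \frac{1 - \sqrt{\dfrac{(A\beta)(\alpha B)}{(AB)(\alpha\beta)}}}{1 + \sqrt{\dfrac{(A\beta)(\alpha B)}{(AB)(\alpha\beta)}}}$$
Relation between Q and Y
$$Q = \frac{2Y}{1 + Y^2}$$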
Ques. Out of 715 literates in a particular city of India, number of criminals was 8;
while out of 975 illiterates in the same city, 17 were criminals. Find out if
illiteracy and criminality are associated or independent by using all the
criteria of association of attributes.
Ans. Let us define the attributes:
A: Illiteracy B: Criminality
α: Literacy β: Non-criminality
Then, in the usual notation, we are given
(A) = 975, (AB) = 17, (α) = 715, (αB) = 8, (Aβ) = 958, (αβ) = 707, (B) = 25, (β) = 1665, N = 1690
Proportion Method
(AB)/(B) = 17/25 = 68% and (Aβ)/(β) = 958/1665 = 57.54%
Now, since 68% > 57.54%, this implies (AB)/(B) > (Aβ)/(β).
Therefore A and B are positively related.
Comparison of Observed and Expected frequencies
(AB) = 17
(A)(B)/N = (975 × 25)/1690 = 14.42
Since 17 > 14.42, this implies (AB) > (A)(B)/N.
Therefore A and B are positively related.
Yule's Coefficient of Association
Q = (17 × 707 - 958 × 8) / (17 × 707 + 958 × 8) = (12019 - 7664) / (12019 + 7664) = 4355/19683 = 0.22
Hence, there is a low degree of positive association between the two attributes A and
B.
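The criteria applied above can be cross-checked with a short Python sketch. The function name `association_measures` and the lowercase names `a` and `b` (standing for the absence of A and B, i.e. α and β) are illustrative choices, not notation from the original notes. The function takes the four cell frequencies of the 2 × 2 classification and returns the quantities used by each criterion, including the coefficient of colligation.

```python
from math import sqrt


def association_measures(AB, Ab, aB, ab):
    """Measures of association from the four cell frequencies:
    AB = (AB), Ab = (A beta), aB = (alpha B), ab = (alpha beta)."""
    N = AB + Ab + aB + ab
    A, B, beta = AB + Ab, AB + aB, Ab + ab
    expected_AB = A * B / N                      # frequency expected under independence
    Q = (AB * ab - Ab * aB) / (AB * ab + Ab * aB)
    root = sqrt((Ab * aB) / (AB * ab))
    Y = (1 - root) / (1 + root)
    return {
        "prop_A_among_B": AB / B,                # (AB)/(B)
        "prop_A_among_beta": Ab / beta,          # (A beta)/(beta)
        "expected_AB": expected_AB,
        "delta": AB - expected_AB,
        "Q": Q,
        "Y": Y,
    }


# Illiteracy (A) and criminality (B) data from the question above.
result = association_measures(AB=17, Ab=958, aB=8, ab=707)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

All of the criteria point the same way: the proportion, the positive value of δ and the positive Q each indicate a positive but weak association, and the computed Y satisfies Q = 2Y/(1 + Y²).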
Multiple choice questions
Set 1
1 A laboratory assistant measures the weight of a sample of bread. This variable is:
(a) nominal (b) ordinal (c) interval (d) ratio
2 A tutor grades his students as A, B, C, D or E. This variable is:
(a) nominal (b) ordinal (c) interval (d) ratio
3 Departments of Merlin plc are coded 1 to 7. This variable is:
(a) nominal (b) ordinal (c) interval (d) ratio
4 A weather forecaster predicts tomorrow's temperature in (°C). This variable is:
(a) nominal (b) ordinal (c) interval (d) ratio
5 An athlete wears numbers on his vest. This variable is:
(a) nominal (b) ordinal (c) interval (d) ratio
6 A doctor records the state of health of a patient as good. This variable is:
(a) nominal (b) ordinal (c) interval (d) ratio
The following graph describing the marks for a group of students is for use with
questions 7 and 8
[Figure: histogram of the students' marks. Horizontal axis: Marks % (0 to 100); vertical axis: frequency density, students per 10 mark interval (0 to 10).]
7 A reasonable estimate for the mode is:
(a) 43% (b) 45% (c) 47% (d) 50%
8 The number of students in the sample is:
(a) 8 (b) 11 (c) 45 (d) 48
The following graph describes the same data and is for use with questions 9 to 13
[Figure: cumulative percentage curve of the students' marks. Horizontal axis: Marks % (0 to 100); vertical axis: cumulative percent (0 to 100).]
9 A reasonable estimate for the median is:
(a) 5 (b) 50 (c) 52 (d) 55
10 A reasonable estimate for the interquartile range is:
(a) 0.05 (b) 23 (c) 50 (d) 75
11 The percentage of students with marks over 40 could be:
(a) 30 (b) 40 (c) 60 (d) 75
12 The percentage of students who scored between 40 and 70 could be:
(a) 30 (b) 40 (c) 60 (d) 75
13 The top 20% of the students scored over:
(a) 20 (b) 65 (c) 80 (d) 90
The following information is for questions 14 to 20
The profits, in 000, from a random sample of eight weeks from three connected
newsagents, X, Y and Z, were found to be:
Week 1 2 3 4 5 6 7 8
Shop X 13 22 21 16 21 17 25 23
Shop Y 29 26 33 34 36 38 28 31
Shop Z 17 ? 20 18 18 17 23 19
14 The median weekly profit for shop X is: (000)
(a) 18 (b) 19 (c) 20 (d) 21
15 The range of weekly profit for shop X is: (000)
(a) 13 to 25 (b) 16 to 21 (c) 5 (d) 12
16 The estimated mean of all weekly profits for shop Y is: (000)
(a) 30.5 (b) 30.7 (c) 31.9 (d) 32.0
17 The estimated standard deviation of the weekly profit of shop Y is: (000)
(a) 3.7 (b) 3.9 (c) 4.0 (d) 4.1
18 If the mean weekly profit for shop Z was 19 250, the missing value is:
(a) 21 (b) 22 (c) 23 (d) 24
19 The modal weekly profit for shop X over the eight weeks is: (000)
(a) 21 (b) 19 (c) 18 (d) 16
20 In total, the three shops employ three men who earn on average 5.20 per hour and seven
women who earn on average 3.80 per hour. The average hourly pay for all these ten
workers is:
(a) 4.15 (b) 4.22 (c) 4.50 (d) 4.64
The following frequency distribution of salaries represents a sample from a
company and is for use in questions 21 to 24
Salary (000)          Frequency
10 and under 12       4
12 and under 14       5
14 and under 16       8
16 and under 18       7
18 and under 20       4
20 and under 25       2
25 and under 30       3
30 and under 50       3
50 and under 100      1
Set 2
21 A reasonable estimate for the median value is: (000)
(a) Between 18 and 20 (b) Between 16 and 18 (c) Between 14 and 16 (d) 55
22 The best estimate for the modal value is: (000)
(a) Between 14 and 16 (b) Between 16 and 18 (c) Between 50 and 100 (d) 8
23 The best estimate for the mean is: (000)
(a) 15.0 (b) 20.2 (c) 19.0 (d) 54.9
24 The best estimation of the standard deviation for the whole company is: (000)
(a) 8.0 (b) 11.9 (c) 12.1 (d) 15.0
25 One reason that the sample mean is the usual estimator of the population mean is that:
(a) The average of all sample means equals the population mean
(b) The sample mean equals the population mean
(c) The sample mean is unaffected by extreme values
(d) The sample mean occurs more often than the mode or the median
26 In a five horse race two horses have the probability of 0.25 of winning, two more have
the probability of 0.2 of winning. The probability that the fifth horse will win is:
(a) 0.10 (b) 0.33 (c) 0.45 (d) 0.55
27 If three coins are tossed, the probability of getting exactly one head is:
(a) 1/3 (b) 1/8 (c) 2/8 (d) 3/8
28 If a pair of dice are rolled the probability of getting a double is:
(a) 1/36 (b) 1/18 (c) 1/6 (d) 1/3
The following information is for questions 29 to 34
A survey of a sample of users of home computers by region revealed that they used the
following hardware:
Hardware
Region CD ROM Modem Both Neither
South 13 7 11 17
North 15 6 14 19
West 10 8 10 20
The probability that a user selected at random has:
29 A CD ROM only:
(a) 0.278 (b) 0.271 (c) 0.253 (d) 0.208
30 Comes from the North:
(a) 0.400 (b) 0.395 (c) 0.360 (d) 0.339
31 Neither CD ROM nor modem is:
(a) 0.373 (b) 0.357 (c) 0.354 (d) 0.352
32 A CD ROM only and comes from the South is:
(a) 0.573 (b) 0.513 (c) 0.087 (d) 0.081
33 A modem only or comes from the North is:
(a) 0.500 (b) 0.460 (c) 0.050 (d) 0.040
34 Both CD ROM and modem given that they are from the West is:
(a) 0.075 (b) 0.208 (c) 0.553 (d) 0.728
The following information is for use with questions 35 to 40
Paper cups of coffee sold in a university canteen may be large, medium or small.
They may be sold to academics, administrators or students. A random sample taken one
day produced the following information:
Size of container
                 Large   Medium   Small
Academics        15      25       20
Administrators   10      20       5
Students         25      10       10
A cup is selected at random.
35 The probability that it is a small one is:
(a) 0.429 (b) 0.333 (c) 0.250 (d) 0.143
36 The probability that it is bought by a student is:
(a) 0.357 (b) 0.321 (c) 0.333 (d) 0.286
37 The probability that it is small or bought by a student is:
(a) 0.500 (b) 0.286 (c) 0.222 (d) 0.071
38 The probability that it is large and bought by an administrator is:
(a) 0.286 (b) 0.250 (c) 0.200 (d) 0.071
39 The probability that a medium cup is bought by an academic is:
(a) 0.643 (b) 0.455 (c) 0.417 (d) 0.179
40 The probability that it is not small if bought by an academic:
(a) 0.750 (b) 0.667 (c) 0.666 (d) 0.250
Set 3
The following information is for use with questions 41 to 45
A haulier has a large fleet of vehicles. The annual mileage follows a normal distribution
with mean 74 000 and standard deviation 3000.
41 The probability that a vehicle selected at random does over 75 000 a year is:
(a) 0.131 (b) 0.369 (c) 0.631 (d) 0.839
42 The probability that a vehicle selected at random does over 70 000 a year is:
(a) 0.907 (b) 0.593 (c) 0.407 (d) 0.093
43 The probability that a vehicle selected at random does between
73 000 and 77 000 miles a year is:
(a) 0.971 (b) 0.712 (c) 0.471 (d) 0.212
44 The probability that a vehicle selected at random does between
70 000 and 73 000 miles a year is:
(a) 0.131 (b) 0.278 (c) 0.540 (d) 0.778
45 The mileage exceeded by the top 10% of the vehicles is:
(a) 70 150 (b) 71 300 (c) 76 700 (d) 77 850
The following information is for use with questions 46 to 51
Garden fertiliser is packed into plastic bags which are nominally 1.5 kg. The net weights
of these bags are known to follow a normal distribution with a mean weight of 1.55 kg
and a standard deviation of 0.05 kg. A bag is selected at random:
46 The probability that the bag contains less than 1.5 kg is:
(a) 0.841 (b) 0.659 (c) 0.341 (d) 0.159
47 The probability that the bag contains less than 1.6 kg is:
(a) 0.841 (b) 0.659 (c) 0.341 (d) 0.159
48 The probability that the bag contains between 1.51 and 1.57 kg is:
(a) 0.867 (b) 0.556 (c) 0.444 (d) 0.133
49 The probability that the bag contains between 1.48 and 1.53 kg is:
(a) 0.736 (b) 0.575 (c) 0.425 (d) 0.264
50 The weight exceeded by 10% of all the bags is:
(a) 1.61 kg (b) 1.56 kg (c) 1.54 kg (d) 1.49 kg
51 The 20th percentile of the distribution is:
(a) 1.59 kg (b) 1.58 kg (c) 1.52 kg (d) 1.51 kg
The following information is for use with questions 52 to 53
A random sample of nine durations for the completion of a task was found to be normally
distributed with a mean of 20 minutes and a standard deviation of 12 minutes.
52 The table value used in the calculation of the 95% confidence interval for the mean of all
the times taken to complete the task is:
(a) 1.86 (b) 1.96 (c) 2.26 (d) 2.31
53 The 95% confidence interval used to estimate the mean of all the times is: (minutes)
(a) 17.7 to 22.3 (b) 16.9 to 23.1
(c) 10.8 to 29.2 (d) 16.0 to 24.0
54 A 95% confidence interval can be interpreted as meaning:
(a) It includes 95% of the population;
(b) There is a 95% chance that it includes the sample mean;
(c) 95% of samples would provide confidence intervals which include the population
mean;
(d) None of the above.
The following information is for use with questions 55 to 58
A company has a large staff of machine operators. The Manager selects a random sample
of eight male machine operators and finds that during a particular year the hours of
overtime they worked were:
142 127 171 137 161 183 148 124
55 The estimated mean and standard deviation of all the male operators respectively are:
(a) 149 and 19.6 (b) 149 and 21.0 (c) 145 and 19.6 (d) 145 and 21.0
56 Assuming normal distribution, the 95% confidence interval for the population mean is:
(a) 132 to 167 (b) 135 to 164 (c) 35 (d) 29
57 Assuming normal distribution, the 99% confidence interval for the population mean is:
(a) 130 to 168 (b) 123 to 175 (c) 52 (d) 38
58 If the mean of a further sample of six female operators selected at random had an average
of 37 hours overtime, the best estimate for the overall average number of hours overtime
is:
(a) 101 (b) 99 (c) 93 (d) 91
The following information is for questions 59 to 61
The time taken by packers at a mail order firm is known to be normally distributed. A
random sample of 10 packers were timed at a particular plant in order to estimate the
mean time taken by all the packers with the following results:
Packer A B C D E F G H I J
Time (min) 4.2 6.5 7.1 4.6 9.3 4.5 6.2 8.3 7.4 5.2
59 The mean and standard deviation needed in order to calculate a 95% confidence interval
for the mean packing time are:
(a) x̄ = 6.33, s = 1.72 (b) x̄ = 6.33, s = 1.632
(c) x̄ = 7.03, s = 1.72 (d) x̄ = 7.03, s = 1.632
Set 4
60 The table value to make use of in your calculation is:
(a) 1.83 (b) 1.96 (c) 2.23 (d) 2.26
61 The confidence interval produced is: (minutes)
(a) 5.10 to 7.56 (b) 5.16 to 7.50 (c) 5.20 to 7.45 (d) 5.26 to 7.40
62 In hypothesis testing the significance level is the risk of:
(a) Rejecting H0 when H0 is correct
(b) Rejecting H0 when H1 is correct
(c) Rejecting H1 when H1 is correct
(d) Rejecting H1 when H0 is correct
63 An example of a two-tailed alternative hypothesis is:
(a) H1: μ < 0 (b) H1: μ = 0 (c) H1: μ ≠ 0 (d) H1: μ > 0
The following information is for use with questions 64 to 70
A new method of training was to be introduced into your company. In order to test its
efficiency the training manager paired his next batch of recruits on equal mental ability.
He then trained one of each pair by the old method, X, and the other by the new method,
Y. Giving them both the same test produced the following results:
Pair A B C D E F G H I J
Old Method X 75 45 81 60 56 59 61 48 39 56
New Method Y 70 46 93 75 63 57 67 52 61 71
64 A one-sample t-test is carried out on the results obtained by the old method to see if the
mean mark is 65. The test statistic calculated would be:
(a) 1.644 (b) 1.732 (c) 1.826 (d) 2.802
65 The critical value for comparing this test statistic with, at 5% significance, would be:
(a) 1.81 (b) 1.96 (c) 2.23 (d) 2.26
66 If the magnitude of the test statistic is less than the critical value and H1 is two tailed we
should:
(a) Reject H0 (b) Not reject H0 (c) Accept H1 (d) Accept neither
67 A paired t-test is carried out, at 5% significance, on the above marks in order to see if
training by the new method is more efficient. In analysing the results the critical value
should be obtained for:
(a) 9 degrees of freedom (b) 10 degrees of freedom
(c) 19 degrees of freedom (d) 20 degrees of freedom
68 The critical value obtained for this test would be:
(a) 1.81 (b) 1.83 (c) 2.23 (d) 2.26
69 The test statistic calculated from these results in order to carry out the paired t-test is:
(a) 2.66 (b) 2.80 (c) 3.76 (d) 3.96
70 If the null hypothesis is rejected, the conclusion would be:
(a) The new method, Y, probably produces different marks to the old method, X.
(b) The new method, Y, probably produces a worse mark than the old method, X
(c) The new method, Y, probably produces a better mean mark than the old method, X.
(d) The new method, Y, probably produces a different mean mark to the old method, X.
The following information is for use with questions 71 to 73
x 1 2 3 4 5 6 7 8
y 23 18 17 14 10 6 5 1
Σx = 36 Σy = 94 Σx² = 204 Σy² = 1500 Σxy = 295
71 The value of the correlation coefficient between x and y is:
(a) 0.993 (b) 0.533 (c) -0.533 (d) -0.993
72 In order for the correlation coefficient to be significant, at 5%, its absolute value must be:
(a) less than 0.707 (b) more than 0.707 (c) more than 0.632 (d) more than 0.632
73 The regression line can be described by the equation:
(a) y = 25.5 - 3.05x (b) y = -25.5 + 3.05x (c) y = 3.05 - 25.5x (d) y = -3.05 + 25.5x
74 The gradient of the regression line estimates:
(a) The value of y divided by the value of x
(b) The increase in y for an increase of 1 in x
(c) The value of y when x is zero
(d) The average increase in y for an increase of 1 in x
The following information is for questions 75 to 78
The numbers of a particular type of sports car sold in five different towns are given
below, together with the populations of those towns in thousands:
Population (x) 37 56 98 72 114
No. of cars sold (y) 8 9 17 12 24
(Σx = 377 Σy = 70 Σx² = 32289 Σy² = 1154 Σxy = 6066)
75 The regression equation of y on x is:
(a) A = -1.38, B = 0.204 (b) y = 12.0 + 4.53x
(c) y = -1.38 + 0.204x (d) A = 12.0, B = 4.53
76 The correlation coefficient is:
(a) 0.961 (b) 0.871 (c) 0.204 (d) -1.38
77 In this example, we use x for the towns' populations because:
(a) It is the first variable given (b) It is the independent variable
(c) It is what we are trying to estimate (d) It is the larger variable
78 The expected sales of this particular car in a town of 80 000 would be:
(a) 15 (b) 16 (c) 17 (d) 18
The following information is for questions 79 and 80
The statistics and computing marks for a random sample of first year students were:
Statistics 72 67 53 80 63 45 32 54
Computing 66 59 72 68 54 60 49 50
79 The correlation coefficient describing the association between the two sets of marks is:
(a) +0.4625 (b) - 0.4625 (c) +0.5325 (d) - 0.5325
80 The regression equation suitable for predicting computing marks from statistics marks is:
(a) y = 42.76 + 0.292x (b) y = 42.76 - 0.292x
(c) y = 0.168 + 0.972x (d) y = -0.168 + 0.972x
Set 5
The following information is for questions 81 to 87
The profits (000) of the XYZ garages plc during the period 1997 to 2000 were:
Year 1997 1998 1999 2000
Profit 167 145 196 204
Index 100
81 The index value for 1998 was:
(a) -22 (b) 78 (c) 87 (d) 115
82 The index value for 2000 was:
(a) 37 (b) 104 (c) 108 (d) 122
83 If the index for 2001 is 110, the profit is: (000)
(a) 177 (b) 184 (c) 214 (d) 224
Answers to multichoice questions
Question Answer Question Answer Question Answer Question Answer
1 d 22 a 43 c 64 b
2 b 23 b 44 b 65 d
3 a 24 c 45 d 66 b
4 c 25 a 46 d 67 a
5 a 26 a 47 a 68 b
6 b 27 d 48 c 69 b
7 c 28 c 49 d 70 c
8 d 29 c 50 a 71 d
9 c 30 c 51 d 72 b
10 b 31 a 52 d 73 a
11 d 32 c 53 c 74 d
12 c 33 b 54 c 75 c
13 b 34 b 55 b 76 a
14 d 35 c 56 a 77 b
15 d 36 b 57 b 78 a
16 c 37 a 58 a 79 c
17 d 38 d 59 a 80 a
18 b 39 b 60 d 81 c
19 a 40 b 61 a 82 d
20 b 41 b 62 a 83 b