Important Instructions:
a) THIS IS AN OPEN BOOK AND NOTES EXAMINATION.
b) There are six questions in the paper. For full credit, attempt all questions.
c) Please write your answers in the space provided. You may use the back of the sheets if the space provided is insufficient. However, make sure that you clearly mark the question number. Answers written in any other place may not be evaluated.
d) Please provide pointed answers. You will get full credit only if you show all the relevant steps.
e) Do not write until the start signal is given, and stop writing immediately once the stop signal is given. Anyone not following these instructions will be given an F grade.
_____________________________________________________________________
FOR OFFICE USE ONLY
QN.               1    2    3    4    5    6    TOTAL
Marks obtained
Out of            5   11   14   10   10   10    60
1. A bank offers short-term loans at an interest rate of 10% per annum, accruing annually, with the option of zero payment for the first year. The average loan amount per customer in January 2012 was μ (unknown). A survey of 100 customers availing this loan, taken in January 2013, revealed that the average amount they owed the bank at that point in time was 121,000, with a sample standard deviation of 110,000.
a) [1 Mark] A point estimate of μ is (i) 100,000 (ii) 110,000 (iii) 121,000 (iv) 133,100 (v) none of (i)-(iv)
b) [1 Mark] A point estimate of the standard error of the point estimator of μ is (i) 10,000 (ii) 11,000 (iii) 100,000 (iv) 110,000 (v) none of (i)-(iv)
c) [1 mark] Can it be assumed that the sample mean is distributed approximately normally? Justify.
d) [2 marks] Would your answer to part c) change if you learn that 90% of the people who took loans from the bank did not borrow more than 20,000? Justify.
2. After learning that the demands of their courses are causing the students to spend nearly sleepless nights, the professors at a leading management school decide to take steps to reduce the students' workload. After a month, they ask 18 randomly chosen students about the number of hours they were sleeping on average before and after the steps were taken. The sample mean and standard deviation of the responses (after − before) were X̄_D = 3.56 and S_D = 1.21 respectively.
a) [1 mark] Assuming μ_D to be the population mean difference (after − before) in the amount of sleep, formulate the appropriate null and alternative hypotheses to test the efficacy of the steps taken by the faculty.
Null:
Alternative:
b) [3 marks] Clearly stating the assumption required, perform a test of your hypotheses at a 5% level of significance (using either the p-value or the critical value method).
c) [3 marks] What is the power of your test if μ_D = 1 (hour)?
d) [2 marks] Compute and interpret a 99% confidence interval for μ_D.
e) [1 mark] Based on the above confidence interval, what can you conclude about the p-value (p) for testing H_0: μ_D = 2 vs H_a: μ_D ≠ 2? (Tick the most appropriate one.) i) p 0.01 ii) p 0.02 iii) p 0.01 iii) p 0.99 iv) p 0.02 v) p 0.99 (Note: Unless you attempt at least one of part d) and part f), part e) will not be graded.)
f) [1 Mark] The reason for the choice you made in part e) is
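The arithmetic behind parts b) and d) can be sketched in a few lines of Python. This is only an illustrative check, not a model answer. It assumes the hypotheses are H_0: μ_D = 0 against H_a: μ_D > 0, that the differences are approximately normal, and that the critical values 1.740 and 2.898 are read from a standard t-table with 17 degrees of freedom. All variable names are hypothetical.

```python
import math

# Summary statistics given in the question
n = 18          # number of students sampled
mean_d = 3.56   # sample mean of (after - before), in hours
sd_d = 1.21     # sample standard deviation of the differences

# Standard error of the mean difference and the t statistic
# (assumption: H_0: mu_D = 0 versus H_a: mu_D > 0)
se = sd_d / math.sqrt(n)
t_stat = (mean_d - 0.0) / se

# Assumed t-table values for n - 1 = 17 degrees of freedom
T_UPPER_5_PERCENT = 1.740     # one-sided 5% critical value
T_UPPER_HALF_PERCENT = 2.898  # used for a two-sided 99% interval

reject_h0 = t_stat > T_UPPER_5_PERCENT
ci_99 = (mean_d - T_UPPER_HALF_PERCENT * se,
         mean_d + T_UPPER_HALF_PERCENT * se)

print(f"standard error = {se:.4f}, t = {t_stat:.2f}, reject H_0: {reject_h0}")
print(f"99% CI for mu_D: ({ci_99[0]:.3f}, {ci_99[1]:.3f})")
```

The same standard error feeds both the test and the interval, which is why the interval in part d) can be used to reason about the p-value in part e).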
3. An alarming number of Indians suffer from Type II diabetes, which is usually regarded as a lifestyle-related disease. Suppose the central government is currently running an awareness campaign to increase citizens' awareness of the disease and to promote the importance of exercise in keeping it in check. A leading newspaper conducts a survey to see whether exercise really helps to check diabetes. In doing so, they follow the exercise routines of some diabetics for a month. The results of the survey are tabulated below.
a) Suppose you want to perform a chi-square test of independence to check for association between the duration of exercise and the change in sugar level.
i) [1 mark] State the null and alternative hypotheses for the test.
Null:
Alternative:
ii) [1 mark] What is the distribution of the test statistic under the null hypothesis? (Specify the complete distribution.)
iii) [3 marks] Compute the value of the test statistic. (Show all the necessary steps for full credit.)
Hours of exercise per week    Sugar level increased    Sugar level decreased
Less than 2 hours             81                       79
More than 2 hours             145                      215
iv) [2 marks] Based on your test statistic above, make a decision (at 5% level of significance) and interpret it (in the context of the actual variables).
b) [2 marks] Estimate and interpret the relative risk of an increased sugar level for exercising less relative to exercising more.
c) i) [2 marks] Estimate and interpret the odds ratio of an increased sugar level for exercising less compared to exercising more.
ii) [1 mark] Suppose Mr. X mistakenly assumes that it is the sugar level that influences the duration of exercise. Accordingly, he works with the above table but with exercise levels in the columns and sugar levels along the rows. Then Mr. X's odds ratio will be the same as the one in (c): True/False (circle the right alternative.)
d) [2 marks] Obtain an estimate of Yule's Q for this problem and interpret your value.
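The quantities asked for in Question 3 can all be computed directly from the four cell counts. The sketch below is one way to check the arithmetic, using the standard Pearson chi-square statistic for a 2x2 table. The cell labels a, b, c, d and the other variable names are hypothetical, and the code makes no decision or interpretation.

```python
# 2x2 table from the question: rows = exercise level, columns = sugar level
a, b = 81, 79     # less than 2 hours: increased, decreased
c, d = 145, 215   # more than 2 hours: increased, decreased

n = a + b + c + d
row_totals = (a + b, c + d)
col_totals = (a + c, b + d)
observed = ((a, b), (c, d))

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells,
# where expected = row total * column total / grand total under independence
chi_sq = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / n
        chi_sq += (observed[i][j] - expected) ** 2 / expected

# Risk of an increased sugar level: less exercise relative to more exercise
relative_risk = (a / (a + b)) / (c / (c + d))

# Odds of an increased sugar level: less exercise relative to more exercise
odds_ratio = (a * d) / (b * c)

# Yule's Q = (ad - bc) / (ad + bc), which equals (OR - 1) / (OR + 1)
yules_q = (a * d - b * c) / (a * d + b * c)

print(f"chi-square = {chi_sq:.3f}")
print(f"relative risk = {relative_risk:.3f}, odds ratio = {odds_ratio:.3f}")
print(f"Yule's Q = {yules_q:.3f}")
```

Swapping rows and columns leaves a*d and b*c unchanged, which is the point behind part c) ii).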
4. The CEO of a company that owns five resorts in Rajasthan wants to evaluate and compare the satisfaction levels of guests at these resorts. Accordingly, the company's research department randomly sampled 101 people who had stayed at each of these places in the past six months and asked them to rate 10 different aspects of their stay on a scale of 0-10, with 0 = very poor and 10 = excellent. Hence the total score may vary between 0 and 100. Based on these values, the department would like to perform a one-way ANOVA on the mean satisfaction ratings for the five resorts. Assume that the scores are approximately normal.
a) [2 marks] State the null and alternative hypotheses (explaining any notation that you use).
Null:
Alternative:
b) [4 marks] Suppose the sum of squares between (SSB or SSTr) is 1350 while the sum of squares within (SSW or SSE) is 5600. Based on this information, complete the following ANOVA table.
Source            DF    Sum of Squares    Mean Squares    F statistic
Between groups
Within groups                                             -----
Total                                     ------          -----
c) [1 mark] The degrees of freedom of your test statistic are (tick 1 option) i) 5 ii) 101 iii) (5, 500) iv) (4, 500) v) (5, 505)
d) [2 marks] Based on the test above, your decision will be (tick 1 option)
i) To reject H_0 at α = 0.1 but not at α = 0.01 or 0.05
ii) To reject H_0 at α = 0.05 and 0.1 but not at α = 0.01
iii) To reject H_0 at α = 0.01 but not at α = 0.05 or 0.1
iv) To reject H_0 at α = 0.1, 0.05 and 0.01
v) Not to reject H_0 at any of the above significance levels.
Note: If you did not answer both parts b) and c), you must justify your answer for part d) below:
e) [1 mark] Based on your decision above, what can you conclude about the satisfaction levels at the five resorts (at α = 0.05)? (Choose only one.)
i) All the resorts differ significantly with regard to the mean satisfaction levels.
ii) At least three of the resorts differ significantly with regard to the mean satisfaction levels.
iii) At least one of the resorts differs significantly from the rest with regard to the mean satisfaction levels.
iv) There is no significant difference between the resorts with regard to the mean satisfaction levels.
v) We cannot conclude anything definite regarding the mean satisfaction levels at the five resorts.
Note: If you did not answer part d), you must justify your answer for part e) below:
5. Suppose that the marketing manager in charge of promoting a new Hindi movie feels that 70% of the revenue for the movie will come from West and North India, but he is not sure about the breakup. He further feels that he can expect twice the revenue from the East compared to the South. The actual numbers of viewers of the movie from the four regions, during the first week, are categorized in the form of a contingency table as shown below. (Assume ticket prices are the same everywhere.)
                     North India    South India    East India    West India
Number of viewers    157,000        42,000         86,000        166,000
a) [2 Marks] Write down the appropriate null and alternative hypotheses for testing the goodness of fit of the manager's assertion.
Null:
Alternative:
b) [1 Mark] Identify the distribution of the test statistic under H_0. You must specify the full distribution.
c) [5 Marks] Compute the value of the appropriate test statistic. Show all the necessary steps.
d) [2 Marks] State your conclusion when testing at the 5% level of significance. Interpret the result in the context of the manager's problem.
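One possible reading of the manager's assertion is sketched below as a goodness-of-fit computation. It rests on assumptions that are not stated in this form in the question: North and West are collapsed into one category because only their combined 70% share is specified, the remaining 30% is split 2:1 between East and South, and each viewer is treated as one observation (revenue is proportional to viewers because ticket prices are equal). The dictionary and variable names are hypothetical.

```python
# Viewer counts from the question
viewers = {"North": 157_000, "South": 42_000, "East": 86_000, "West": 166_000}
total = sum(viewers.values())

# Assumption: collapse North and West, since only their combined share is given
observed = {
    "North+West": viewers["North"] + viewers["West"],
    "East": viewers["East"],
    "South": viewers["South"],
}

# Assumed hypothesised shares: 70% North+West, and the remaining 30%
# split so that East is twice South (20% and 10%)
shares = {"North+West": 0.70, "East": 0.20, "South": 0.10}

# Pearson goodness-of-fit statistic: sum of (O - E)^2 / E over categories
chi_sq = sum(
    (observed[k] - total * shares[k]) ** 2 / (total * shares[k])
    for k in observed
)

# Degrees of freedom: number of categories minus one (no parameters estimated)
df = len(observed) - 1

print(f"chi-square = {chi_sq:.2f} on {df} degrees of freedom")
```

With counts this large, even small departures from the hypothesised shares produce a very large statistic, which is worth keeping in mind when interpreting the result in part d).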
6. In a region there are two big cities, City A and City B. There is a perception that people in City A are wealthier than those in City B. In fact, a recent newspaper report claims that people in City A earn twice as much as people in City B. It is also known that the population of City B is twice that of City A. Let the population of City A be N.
a) Suppose for the moment that the assertion of the newspaper is true. Let the mean monthly income of City A be 2μ, and that of City B be μ, where μ is unknown. Now, a survey has been conducted to obtain simple random samples (with replacement) of sizes n and 4n from City A and City B respectively. Let the sample means and sample standard deviations of the samples from the two cities be (X̄_A, S_A) and (X̄_B, S_B) respectively.
i) [2 Marks] Prove that any linear combination of X̄_A and X̄_B of the form c X̄_A / 2 + (1 − c) X̄_B, where c is any real number, unbiasedly estimates μ.
ii) [3 Marks] Assume that the variances of the income distributions for the two cities are equal, say σ². Find the value of c for which the variance of c X̄_A / 2 + (1 − c) X̄_B would be the smallest.
b) Now, let the mean income of City A be μ_A and the mean income of City B be μ_B. We want to test the null H_0: μ_A = 2μ_B against the alternative H_1: μ_A ≠ 2μ_B. Suppose again that the variances of the income levels of both cities are equal, say σ², which is unknown. Samples of size 100 from City A and size 400 from City B are obtained, with sample means and standard deviations X̄_A = 19,500, S_A = 1000, X̄_B = 10,500, S_B = 900 respectively. To avoid complications, let's assume that the samples are obtained with replacement.
i) [1 Mark] A point estimate of μ_A − 2μ_B is (tick only one) I) 9000 II) 450 III) −1500 IV) −750 V) 1500
ii) [1 Mark] Variance of the point estimator of μ_A − 2μ_B is (tick only one)
I) σ²
II) 2σ²
III) 5σ²
IV) σ² (1/100 + 1/400)
V) σ² (1/100 + 2/400)
VI) σ² (1/100 + 4/400)
VII) σ² (4/100 + 1/400)
VIII) σ² (1000/100 + 900/400)
iii) [1 Mark] A point estimate of σ, correct up to 2 decimal places, is (tick only one) I) 920.00 II) 920.74 III) 920.87 IV) 950.00 V) 955.69
iv) [2 Marks] Construct an approximate 95% confidence interval for μ_A − 2μ_B, and hence test H_0: μ_A = 2μ_B against the alternative H_1: μ_A ≠ 2μ_B at a 5% level of significance. What is your conclusion?
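The calculations in part b) can be checked numerically with the sketch below. It assumes the usual pooled estimate of the common variance, independence of the two samples, and the normal critical value 1.96 for an approximate 95% interval. The variable names are hypothetical, and the code is an illustration rather than a model answer.

```python
import math

# Sample summaries from the question
n_a, mean_a, sd_a = 100, 19_500, 1_000   # City A
n_b, mean_b, sd_b = 400, 10_500, 900     # City B

# Point estimate of mu_A - 2 * mu_B
estimate = mean_a - 2 * mean_b

# Pooled estimate of the common variance sigma^2 (equal-variance assumption)
pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
pooled_sd = math.sqrt(pooled_var)

# For independent samples:
# Var(Xbar_A - 2 * Xbar_B) = sigma^2 / n_a + 4 * sigma^2 / n_b
se = pooled_sd * math.sqrt(1 / n_a + 4 / n_b)

# Assumed normal critical value for an approximate 95% interval
Z_975 = 1.96
ci_95 = (estimate - Z_975 * se, estimate + Z_975 * se)

# Reject H_0: mu_A = 2 * mu_B at the 5% level if 0 lies outside the interval
reject_h0 = not (ci_95[0] <= 0 <= ci_95[1])

print(f"estimate = {estimate}, pooled sd = {pooled_sd:.2f}, se = {se:.2f}")
print(f"95% CI: ({ci_95[0]:.1f}, {ci_95[1]:.1f}), reject H_0: {reject_h0}")
```

The factor 4 on the City B term comes from the coefficient 2 on X̄_B being squared in the variance.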