
Uploaded by ana_pacios


1. Getting Started with Stata

The point of this discussion section is to get you started using the statistical software package Stata.

Starting from the Excel dataset cps98.csv (available on the course website), load the data into Stata. This file contains data on average hourly earnings, education, gender, and age for a sample of workers in 1998. A quick way to do this is to save the Excel file and then load it into Stata using the insheet command. You might find it easier to use the Import command under the File menu.

Use the sum command to summarize the variables in the dataset. What are the average hourly earnings and their standard deviation in the sample? How many male workers took the survey? What are the average, minimum, and maximum age of the respondents?

Use the tab age command to look at the distribution of age in the sample. What is the mode of this distribution? What is the median (approximately)?

Use the hist ahe command to plot the histogram of average hourly earnings. What can you say about the shape of its distribution?

Use the sum ahe if female == 1 command to calculate the average earnings for females. What are the average earnings for males? Do you find the difference economically significant?

Use the sum if ahe > 40 command to see who the top earners are: their age, gender, and education level. How about those with hourly earnings less than 3?

Use the gen ahe2 = ahe*ahe command to generate a new variable ahe2 equal to hourly wages squared. Use the scatter ahe2 ahe, title(Ahe2) command to graph the relationship between ahe and ahe2. Now try adding the , xlabel(0(2)50) ylabel(0(1000)3000) option to see how to change the axes in your graph.

Use the pwcorr command to calculate the sample correlation between average hourly earnings and age. Does it come out with the expected sign? Plot the relationship with the scatter command. Does this correlation change for top and bottom earners?

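2. We know that, by definition,

$$\bar X = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad s_X^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar X)^2, \qquad s_{XY} = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y).$$

a. Prove that $\sum_{i=1}^{n} (X_i - \bar X) = 0$.

b. Prove that $\sum_{i=1}^{n} (X_i - \bar X)\,Y_i = (n-1)\,s_{XY}$.

c. Prove that $\sum_{i=1}^{n} (X_i - \bar X)(X_i - \bar Y + 5) = (n-1)\,s_X^2$.

d. Prove that $\sum_{i=1}^{n}\sum_{j=1}^{k} (1 + X_i - \bar X)(Y_j - \bar Y)^2 = n(k-1)\,s_Y^2$.

Solution:

a.

$$\sum_{i=1}^{n} (X_i - \bar X) = \sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \bar X = \sum_{i=1}^{n} X_i - n\bar X = n\bar X - n\bar X = 0.$$

b. We prove one preliminary result:

$$\begin{aligned}
\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)
&= \sum_{i=1}^{n} \left( X_i Y_i - X_i \bar Y - \bar X Y_i + \bar X \bar Y \right) \\
&= \sum_{i=1}^{n} X_i Y_i - \bar Y \sum_{i=1}^{n} X_i - \bar X \sum_{i=1}^{n} Y_i + n \bar X \bar Y \\
&= \sum_{i=1}^{n} X_i Y_i - n \bar X \bar Y - \bar X \sum_{i=1}^{n} Y_i + n \bar X \bar Y \\
&= \sum_{i=1}^{n} X_i Y_i - \bar X \sum_{i=1}^{n} Y_i \\
&= \sum_{i=1}^{n} (X_i - \bar X)\,Y_i .
\end{aligned}$$

Therefore $Y_i$ can be used in place of $Y_i - \bar Y$, and the result still holds:

$$\sum_{i=1}^{n} (X_i - \bar X)\,Y_i = \sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y) = (n-1)\,s_{XY}.$$

c.

$$\begin{aligned}
\sum_{i=1}^{n} (X_i - \bar X)(X_i - \bar Y + 5)
&= \sum_{i=1}^{n} (X_i - \bar X)\,X_i - \bar Y \sum_{i=1}^{n} (X_i - \bar X) + 5 \sum_{i=1}^{n} (X_i - \bar X) \\
&= \sum_{i=1}^{n} (X_i - \bar X)\,X_i - 0 + 0 \\
&= (n-1)\,s_{XX} = (n-1)\,s_X^2 ,
\end{aligned}$$

where the second line uses part a and the last line uses part b with $Y_i = X_i$.

d.

$$\begin{aligned}
\sum_{i=1}^{n}\sum_{j=1}^{k} (1 + X_i - \bar X)(Y_j - \bar Y)^2
&= \sum_{i=1}^{n}\sum_{j=1}^{k} (Y_j - \bar Y)^2 + \sum_{i=1}^{n} (X_i - \bar X) \sum_{j=1}^{k} (Y_j - \bar Y)^2 \\
&= n \sum_{j=1}^{k} (Y_j - \bar Y)^2 + 0 \\
&= n(k-1)\,s_Y^2 + 0 ,
\end{aligned}$$

since $\sum_{j=1}^{k} (Y_j - \bar Y)^2 = (k-1)\,s_Y^2$ when $\bar Y$ and $s_Y^2$ are computed from $Y_1, \dots, Y_k$, and the second sum vanishes by part a.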
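The four identities in Problem 2 can also be checked numerically. The following is a minimal sketch in Python (standard library only), not part of the original solution; the sample sizes, the random data, and the helper names `mean` and `sample_cov` are illustrative assumptions. The list `w` plays the role of the sample $Y_1, \dots, Y_k$ in part d.

```python
import random

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def sample_cov(x, y):
    """Sample covariance with the 1/(n-1) divisor; sample_cov(x, x) is the sample variance."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

random.seed(0)                                  # arbitrary seed, only for reproducibility
n, k = 8, 5                                     # illustrative sample sizes (assumed)
x = [random.gauss(0, 1) for _ in range(n)]
y = [random.gauss(2, 3) for _ in range(n)]      # paired with x (parts b and c)
w = [random.gauss(1, 2) for _ in range(k)]      # the k-observation sample of part d

xbar, ybar, wbar = mean(x), mean(y), mean(w)

# Left-hand sides of parts a to d.
part_a = sum(xi - xbar for xi in x)
part_b = sum((xi - xbar) * yi for xi, yi in zip(x, y))
part_c = sum((xi - xbar) * (xi - ybar + 5) for xi in x)
part_d = sum((1 + xi - xbar) * (wj - wbar) ** 2 for xi in x for wj in w)

# Compare each left-hand side with the claimed right-hand side, up to rounding error.
tol = 1e-9
checks = {
    "a": abs(part_a) < tol,
    "b": abs(part_b - (n - 1) * sample_cov(x, y)) < tol,
    "c": abs(part_c - (n - 1) * sample_cov(x, x)) < tol,
    "d": abs(part_d - n * (k - 1) * sample_cov(w, w)) < tol,
}
print(checks)
```

A numerical check on one random draw is not a proof, but it is a quick way to catch a sign or index error in the algebra.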

3. Stock & Watson 3.5 (note: part a is a little tricky)

A survey of 1055 registered voters is conducted, and the voters are asked to choose between candidate A and candidate B. Let $p$ denote the fraction of voters in the population who prefer candidate A, and let $\hat p$ denote the fraction of voters in the sample who prefer candidate A.

a. You are interested in the competing hypotheses $H_0: p = 0.5$ vs. $H_1: p \neq 0.5$. Suppose you decide to reject $H_0$ if $|\hat p - 0.5| > 0.02$.
   i. What is the size of this test?
   ii. Compute the power of this test if $p = 0.53$.

b. In the survey, $\hat p = 0.54$.
   i. Test $H_0: p = 0.5$ vs. $H_1: p \neq 0.5$ using a 5% significance level.
   ii. Test $H_0: p = 0.5$ vs. $H_1: p > 0.5$ using a 5% significance level.
   iii. Construct a 95% confidence interval for $p$.
   iv. Construct a 99% confidence interval for $p$.
   v. Construct a 50% confidence interval for $p$.

c. Suppose that the survey is carried out 20 times, using independently selected voters in each survey. For each of these 20 surveys, a 95% confidence interval for $p$ is constructed.
   i. What is the probability that the true value of $p$ is contained in all 20 of these confidence intervals?
   ii. How many of these confidence intervals do you expect to contain the true value of $p$?

d. In survey jargon, the margin of error is $1.96 \times SE(\hat p)$; that is, it is $\tfrac{1}{2}$ the length of the 95% confidence interval. Suppose you wanted to design a survey that had a margin of error of at most 1%. That is, you wanted $\Pr(|\hat p - p| > 0.01) \le 0.05$. How large should $n$ be if the survey uses simple random sampling?


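Solution:

a. i. The size is given by $\Pr(|\hat p - 0.5| > 0.02)$, where the probability is computed assuming that $p = 0.5$.

$$\begin{aligned}
\Pr(|\hat p - 0.5| > 0.02)
&= 1 - \Pr(-0.02 \le \hat p - 0.5 \le 0.02) \\
&= 1 - \Pr\left( \frac{-0.02}{\sqrt{(0.5 \times 0.5)/1055}} \le \frac{\hat p - 0.5}{\sqrt{(0.5 \times 0.5)/1055}} \le \frac{0.02}{\sqrt{(0.5 \times 0.5)/1055}} \right) \\
&= 1 - \Pr\left( -1.30 \le \frac{\hat p - 0.5}{\sqrt{(0.5 \times 0.5)/1055}} \le 1.30 \right) \\
&= 0.19,
\end{aligned}$$

where the final equality uses the central limit theorem approximation (and the normal tables).

ii. The power is given by $\Pr(|\hat p - 0.5| > 0.02)$, where the probability is computed assuming that $p = 0.53$.

$$\begin{aligned}
\Pr(|\hat p - 0.5| > 0.02)
&= 1 - \Pr(-0.02 \le \hat p - 0.5 \le 0.02) \\
&= 1 - \Pr\left( \frac{-0.05}{\sqrt{(0.53 \times 0.47)/1055}} \le \frac{\hat p - 0.53}{\sqrt{(0.53 \times 0.47)/1055}} \le \frac{-0.01}{\sqrt{(0.53 \times 0.47)/1055}} \right) \\
&= 1 - \Pr\left( -3.25 \le \frac{\hat p - 0.53}{\sqrt{(0.53 \times 0.47)/1055}} \le -0.65 \right) \\
&= 0.74,
\end{aligned}$$

where the final equality uses the central limit theorem approximation (and the normal tables).

b. i. $t = \dfrac{0.54 - 0.5}{\sqrt{(0.54 \times 0.46)/1055}} = 2.61$ and $\Pr(|t| > 2.61) = 0.009$, so that the null is rejected at the 5% level.

ii. $\Pr(t > 2.61) = 0.0045$, so that the null is rejected at the 5% level.

iii. $0.54 \pm 1.96\sqrt{(0.54 \times 0.46)/1055} = 0.54 \pm 0.03$, or 0.51 to 0.57.

iv. $0.54 \pm 2.58\sqrt{(0.54 \times 0.46)/1055} = 0.54 \pm 0.04$, or 0.50 to 0.58.

v. $0.54 \pm 0.67\sqrt{(0.54 \times 0.46)/1055} = 0.54 \pm 0.01$, or 0.53 to 0.55.

c. i. The probability is 0.95 in any single survey, and there are 20 independent surveys, so the probability is $0.95^{20} = 0.36$.

ii. 95% of the 20 confidence intervals, or 19.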
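The figures in parts a through c can be reproduced without normal tables. The sketch below is an illustrative check in Python, not part of the original solution; it uses `statistics.NormalDist` from the standard library, and the helper names `se` and `ci` are assumptions introduced here. Because it uses exact normal quantiles rather than the rounded table values 1.96, 2.58, and 0.67, results agree with the solution only after rounding.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()          # standard normal distribution
n = 1055                  # survey sample size from the problem statement

def se(p, n=n):
    """Standard error of a sample proportion under population (or estimated) fraction p."""
    return sqrt(p * (1 - p) / n)

# a.i  Size: reject when |p_hat - 0.5| > 0.02, evaluated under p = 0.5.
z0 = 0.02 / se(0.5)
size = 2 * (1 - Z.cdf(z0))

# a.ii Power: same rejection rule, evaluated under p = 0.53.
lo = (0.48 - 0.53) / se(0.53)     # lower edge of the acceptance region, standardized
hi = (0.52 - 0.53) / se(0.53)     # upper edge of the acceptance region, standardized
power = 1 - (Z.cdf(hi) - Z.cdf(lo))

# b    Tests and confidence intervals with p_hat = 0.54.
p_hat = 0.54
t = (p_hat - 0.5) / se(p_hat)
p_two_sided = 2 * (1 - Z.cdf(t))
p_one_sided = 1 - Z.cdf(t)

def ci(level):
    """Two-sided confidence interval for p at the given level, centered on p_hat."""
    crit = Z.inv_cdf(0.5 + level / 2)
    half = crit * se(p_hat)
    return p_hat - half, p_hat + half

# c    Twenty independent 95% intervals.
prob_all_cover = 0.95 ** 20
expected_covering = 0.95 * 20

print(round(size, 2), round(power, 2), round(t, 2))
print(round(p_two_sided, 3), round(p_one_sided, 4))
for level in (0.95, 0.99, 0.50):
    print(level, tuple(round(v, 2) for v in ci(level)))
print(round(prob_all_cover, 2), expected_covering)
```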


d. The relevant equation is $1.96 \times SE(\hat p) < 0.01$, or $1.96 \times \sqrt{p(1-p)/n} < 0.01$. Thus $n$ must be chosen so that

$$n > \frac{1.96^2\, p(1-p)}{0.01^2},$$

so that the answer depends on the value of $p$. Note that the largest value that $p(1-p)$ can take on is 0.25 (that is, $p = 0.5$ makes $p(1-p)$ as large as possible). Thus if

$$n > \frac{1.96^2 \times 0.25}{0.01^2} = 9604,$$

then the margin of error is less than 0.01 for all values of $p$.
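The bound in part d is simple arithmetic, and a short Python sketch makes the dependence on $p$ easy to explore. The function name `sample_size_bound` and the comparison value $p = 0.3$ are illustrative assumptions, not part of the original solution.

```python
def sample_size_bound(margin, p=0.5, z=1.96):
    """Lower bound on n from z * sqrt(p(1-p)/n) < margin, i.e. n > z^2 p(1-p) / margin^2."""
    return z ** 2 * p * (1 - p) / margin ** 2

worst_case = sample_size_bound(0.01)          # p = 0.5 maximizes p(1-p); about 9604
milder_case = sample_size_bound(0.01, p=0.3)  # a smaller bound when p is away from 0.5

print(round(worst_case), round(milder_case))
```

Using $p = 0.5$ is the conservative design choice: it guarantees the 1% margin of error whatever the true value of $p$ turns out to be.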
