
Introductory SAS tutorial

Prof. David Mendonça
Thursday, 13 June 2002, 3p-5p in CO-lab


1 Background on SAS

1.1 What is SAS and why do we care?

SAS is the standard for large-scale statistical computing operations (e.g., marketing, data analysis) in a wide variety of industries (e.g., pharmaceutical, retail). SAS serves more than 38,000 business, government, and university sites in 199 countries, including 98% of the top 100 companies on the Fortune 500 and 90% of all companies on the Fortune 500. It is also the largest privately held software company in the world.

There are many facets to SAS software (e.g., Base SAS and various modules such as SAS/GIS, SAS/GRAPH, SAS/STAT, SAS/OR, etc.). Today we will concentrate on Base SAS and SAS/STAT.

2 Basics

The online help is extensive but labyrinthine. Windows-specific help is at Base SAS>Host Specific>MS Windows.

Most parts of SAS programs are multi-platform. Be careful, though: some operating systems are case-sensitive (for example, in file names).

2.1 SAS Work Area

Fig. 1. Display system options

    proc options;
    run;

Explanations of all the displayed options are available in the online help.

2.2 File I/O

Objective: Understand and use data import/export from/to other applications (e.g., Excel, MATLAB).

First create a directory called SAS Demos under the directory My SAS Files, then another directory, Tutor, within the directory SAS Demos. Next create a SAS library called Tutor. It's a good idea never to move these libraries around once they're established. If you do, all links to them will have to be updated.

Go to http://web.njit.edu/~mendonca. Get the framingham file and put it in the SAS Demos directory.

Data Input

Two approaches are presented here. Complete information is found in the discussion of the SAS DATA step. Open SAS Online Help and go to section Base SAS>SAS Language Reference: Concepts>DATA Step Concepts.

Fig. 2. Input Approach 1: General (import from delimited file)

    DATA Tutor.fham;
        /* this is the Framingham study data set */
        INFILE "C:\Documents and Settings\mendonca\My Documents\My SAS Files\SAS Demos\Tutor\framingham.csv" delimiter = ',';
        /* note: if the variable is text, put the dollar sign after its name */
        INPUT cause $ age chol sex $;
    RUN;

Save as importcsv.sas.

Fig. 3. Input Approach 2: Specific (import Excel via ODBC)

    PROC IMPORT OUT= Tutor.GENL
        DATAFILE= "C:\Documents and Settings\mendonca\My Documents\SAS Demos\Tutor\General.xls"
        DBMS=EXCEL2000 REPLACE;
        GETNAMES=YES;
    RUN;

See SAS Help for PROC IMPORT under the SAS procedures guide. Download General.xls.

Fig. 4. Output Approach

    PROC EXPORT DATA= Tutor.Genl
        OUTFILE= "C:\Documents and Settings\mendonca\My Documents\SAS Demos\Tutor\General.xls"
        DBMS=EXCEL2000 REPLACE;
    RUN;

For other ideas, see SAS Help for PROC EXPORT under the SAS procedures guide. Save as inputoutput.sas.
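To try the delimited-input approach of Fig. 2 without the external file, the same INFILE and INPUT statements can read rows typed directly into the program. The following is only a sketch: the data set name work.demo and the three data rows are made up for illustration and are not taken from the Framingham file.

```sas
/* Hypothetical rows, invented for illustration; not the Framingham data */
DATA work.demo;
    /* read the in-stream lines below, splitting fields on commas */
    INFILE DATALINES DELIMITER = ',';
    /* the dollar sign marks cause and sex as character variables */
    INPUT cause $ age chol sex $;
    DATALINES;
CHD,52,240,M
Other,47,198,F
CHD,61,275,F
;
RUN;

/* list the rows to confirm they were read as intended */
PROC PRINT DATA = work.demo;
RUN;
```

Writing to the WORK library keeps the example temporary, so it does not touch the Tutor library.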

Statistical Operations

Objectives: Understand how to generate descriptive statistics, perform a t-test and run a general linear model.

2.3 Descriptive Statistics

To have a look at some basic summary statistics, use PROC MEANS. See SAS Help for The Means Procedure at Base SAS>SAS Procedures>Procedures.

Fig. 5. PROC MEANS

    /*Basic data description*/
    DATA heart;
        SET Tutor.fham;
    RUN;

    /*The data must first be sorted by the vars in the BY statement of PROC MEANS*/
    PROC SORT DATA = heart;
        by cause sex;
    RUN;

    PROC MEANS DATA = heart; /*the data step is redundant*/
        VAR age chol;
        BY cause sex;
    RUN;
    /*the output is visible in the Results window*/
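As noted in Fig. 5, the BY statement needs sorted input. A possible alternative, sketched here on the assumption that Tutor.fham was created as in Fig. 2, is the CLASS statement of PROC MEANS, which groups the observations without a prior PROC SORT. The list of statistics on the PROC MEANS line is an illustrative choice, not part of the original example.

```sas
/* Assumes Tutor.fham exists with variables cause, sex, age and chol */
PROC MEANS DATA = Tutor.fham N MEAN STD MIN MAX;
    CLASS cause sex;   /* grouping variables; no sorting required */
    VAR age chol;      /* analysis variables */
RUN;
```

With CLASS, the results appear in a single table with one row per combination of cause and sex, rather than one table per BY group.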

2.4 Statistical Tests

SAS has the capability of performing a very wide variety of parametric and non-parametric statistical tests. Tests based on the Student's t distribution are common and so are discussed here.

The TTEST procedure performs t tests for one sample, two samples, and paired observations. The one-sample t test compares the mean of the sample to a given number. The two-sample t test compares the mean of the first sample minus the mean of the second sample to a given number. The paired observations t test compares the mean of the differences in the observations to a given number [PROC TTEST documentation].

See help in SAS/STAT User's Guide>The PROC TTEST procedure. Get file PROC_TTESTdata.sas.

Fig. 6. PROC TTEST

    proc ttest data=graze;
        class GrazeType;
        var WtGain;
    run;

Is this a one- or two-sample test? What is the null hypothesis? Should the results be based on the assumption of equal or unequal variance? What is the result?

2.5 General Linear Models

The GLM procedure uses the method of least squares to fit general linear models. Among the statistical methods available in PROC GLM are regression, analysis of variance, analysis of covariance, multivariate analysis of variance, and partial correlation [PROC GLM documentation]. So, it does all analyses that PROC REG does and more:

- simple regression
- multiple regression
- analysis of variance (ANOVA), especially for unbalanced data
- analysis of covariance
- response-surface models
- weighted regression
- polynomial regression
- partial correlation
- multivariate analysis of variance (MANOVA)
- repeated measures analysis of variance

A limitation is that it does not have a built-in PLOT option.

Help: SAS/STAT User's Guide>The PROC GLM procedure. Download the file

Fig. 7. PROC GLM

    proc glm data=fitness;
        model Oxygen=Age Weight RunTime RunPulse RestPulse MaxPulse;
    run;
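Fig. 7 shows PROC GLM used for multiple regression. To illustrate the analysis of variance use mentioned in this section, here is a minimal sketch of a one-way ANOVA. It assumes that running PROC_TTESTdata.sas left a data set named graze, with variables GrazeType and WtGain, in the WORK library, as used in Fig. 6.

```sas
/* Assumes WORK.graze (GrazeType, WtGain) was created by PROC_TTESTdata.sas */
proc glm data=graze;
    class GrazeType;            /* treat GrazeType as a categorical factor */
    model WtGain = GrazeType;   /* one-way ANOVA of weight gain by grazing type */
    means GrazeType;            /* group means and standard deviations */
run;
quit;
```

When the factor has only two levels, the F test from this model agrees with the equal-variance two-sample t test (F equals the square of t), which gives a way to cross-check the PROC TTEST output.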

3 Graphing Procedures

Objective: Understand how to generate a scatter plot. See SAS Help for The PLOT Procedure at Base SAS>SAS Procedures>Procedures.

First fit the model and save the predicted values and residuals in an output data set, then plot the residuals against the predicted values.

Fig. 8. PROC PLOT

    proc glm data=fitness;
        model Oxygen=Age Weight RunTime RunPulse RestPulse MaxPulse;
        /*need access to output data*/
        output out=Tutor.mfitness predicted=Oxygenhat r=resid stdr=eresid;
    run;

    DATA mfitness;
        SET Tutor.mfitness;
    RUN;

    PROC PLOT DATA = mfitness;
        plot resid*Oxygenhat;
        title 'Physical Fitness Analysis';
        title2 'Model: Oxygen=Age Weight RunTime RunPulse RestPulse MaxPulse';
        title3 'Residuals vs. Predicted';
    run;
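A residual plot is usually read against the zero line. As a sketch, assuming the data set mfitness from Fig. 8 is still in the WORK library, the VREF= option of the PLOT statement draws a horizontal reference line, and a quoted character after the equals sign sets the plotting symbol. The title text here is an illustrative choice.

```sas
/* Assumes WORK.mfitness from Fig. 8 exists with variables resid and Oxygenhat */
PROC PLOT DATA = mfitness;
    /* plot residuals (vertical) against predicted values (horizontal),
       using '*' as the symbol and a reference line at resid = 0 */
    plot resid*Oxygenhat='*' / vref=0;
    title 'Residuals vs. Predicted, with reference line at zero';
run;
quit;
```

Note that setting TITLE here replaces the first title line and clears the TITLE2 and TITLE3 lines set in Fig. 8.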

Links

http://www.sas.com
http://gsbapp2.uchicago.edu/sas/sashtml/main.htm (look under Base SAS, SAS/STAT and SAS/GRAPH)
http://www.stat.wisc.edu/computing/sas/intro.html
http://www.stat.wisc.edu/computing/sas/ (miscellaneous links)
http://www.sas-jobs.com/
http://www.sasusers.com/

5 Contact Information

David Mendonça
Email: djm@njit.edu
Phone: x5212
