Outline
Copying a data set
Changing attributes
Appending data sets
Procedures
Rename variables
Alternatives to logical OR constructs
Formats
Indexing
Disk Space
Views
Proc Sort and disk space
Hashing merging data sets
1); then output work.data1; then output work.data2; then output work.data3;
Copying a dataset:
/* inefficient */
data work.data;
   set lib1.data;
run;

/* efficient */
proc datasets lib = work nolist;
   copy in = lib1 out = work;
   select data;
quit;
Changing Attributes:
/* inefficient */
/* reads & writes one observation at a time */
data work.data;
   set lib1.data;
   label age = 'Years';
   format salary dollar10.;
   rename cars = autos;
run;
Changing Attributes:
/* efficient */
proc datasets lib = work nolist;
   copy in = lib1 out = work;
   select data;
   modify data (label = "Demographic Data");
      label age = 'Years';
      format salary dollar10.;
      rename cars = autos;
   change data = demograph;
   contents data = demograph;
quit;
Appending datasets:
/* inefficient */
/* reads and writes one observation at a time */
data work.data1;
   set work.data1 work.data2;
run;
Appending datasets:
/* efficient */
proc datasets nolist;
   append base = work.data1 data = work.data2;
quit;
Rename:
Occurs just once, at compile time, not execution time.
Might be able to avoid reading the data one observation at a time.
A';
B';
C'; unknown';
data work.data;
   set lib1.data;
   where put(status, $status.) = 'A';
run;

OR

data work.data;
   set lib1.data (where = (put(status, $status.) = 'A'));
run;
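A minimal sketch of how a format lookup can stand in for a chain of OR conditions. The code-to-group mapping below is hypothetical (the raw values '1' to '5' and the data set work.codes are assumptions made for illustration, not the presenter's actual format definition); only the name $status and the WHERE expression come from the slide.

```sas
/* Hypothetical mapping: raw status codes are grouped into labels. */
proc format;
   value $status
      '1', '2', '3' = 'A'
      '4', '5'      = 'B'
      other         = 'unknown';
run;

/* Small illustrative input data set (assumed, not from the slides). */
data work.codes;
   length status $1;
   input status $;
   datalines;
1
4
3
9
;
run;

/* One format lookup replaces: status = '1' or status = '2' or status = '3' */
data work.group_a;
   set work.codes;
   where put(status, $status.) = 'A';
run;
```

If the grouping changes later, only the format definition needs editing; the WHERE expression stays the same.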
#   Index       Variables
1   composite   var1 var2
2   var1
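One way indexes like those in the listing above could be created is with PROC DATASETS, which adds them to an existing data set without a DATA step rewrite. This is a sketch under assumptions: the small work.data built here is a made-up stand-in, and the index names simply mirror the listing.

```sas
/* Made-up sample data so the example runs on its own. */
data work.data;
   do var1 = 1 to 3;
      do var2 = 1 to 2;
         output;
      end;
   end;
run;

proc datasets lib = work nolist;
   modify data;
      index create var1;                      /* simple index on var1 */
      index create composite = (var1 var2);   /* composite index on var1 and var2 */
   contents data = data;                      /* output includes the list of indexes */
quit;
```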
Use _NULL_ as the data set name when you do not need to create a dataset (e.g., when creating macro variables)
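A short sketch of the _NULL_ pattern: the step below reads a data set only to store its observation count in a macro variable, so no output data set is written. It assumes the sample data set sashelp.class that ships with SAS; the macro variable name nobs is arbitrary.

```sas
/* Read the data, write nothing: only a macro variable is created. */
data _null_;
   set sashelp.class end = last;
   if last then call symputx('nobs', _n_);   /* _n_ equals the observation count on the last row */
run;

%put NOTE: sashelp.class has &nobs observations.;
```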
Use the KEEP and/or DROP data set options (on input and/or output) or statements to limit the variables.
Use the WHERE data set option (on input and/or output) or statement to limit the observations.
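A sketch of these options working together, again assuming the sample data set sashelp.class; the output name work.teens and the derived variable heavy are hypothetical. KEEP and WHERE on input limit what is read into the step, and DROP on output removes a variable that was needed only for a calculation.

```sas
data work.teens (drop = weight);                 /* weight is not written out */
   set sashelp.class (keep  = name age weight    /* read only three variables */
                      where = (age >= 13));      /* read only matching observations */
   heavy = (weight > 100);                       /* 1 if weight exceeds 100, else 0 */
run;
```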
title 'Printing View work.class';
proc print data = work.class heading = h n;
run;
/* print source statements in log */
proc sql;
   describe view work.subview;
quit;
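For context, a view such as work.subview might be defined as sketched below. The SELECT statement and the use of sashelp.class are assumptions for illustration; the slides do not show the actual definition. The view stores only the query, so the subset takes no disk space until it is read.

```sas
proc sql;
   /* Hypothetical definition: only the query text is stored, not the rows. */
   create view work.subview as
      select name, age
      from sashelp.class
      where age >= 13;

   /* Writes the stored query to the log. */
   describe view work.subview;
quit;

/* The view is read like any data set; the query runs at this point. */
proc print data = work.subview;
run;
```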
Tagsort:
Uses tags to retrieve the records from the input data set in sorted order
Not supported by the multi-threaded sort. Best used when the total length of BY variables is small compared to length of entire observation.
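A minimal sketch of the TAGSORT option, assuming the sample data set sashelp.class and a hypothetical output name work.class_sorted. Only the BY variable and an observation tag are held in the temporary sort files, which is where the disk space saving comes from.

```sas
/* TAGSORT sorts BY values plus tags, then fetches full records in sorted order. */
proc sort data = sashelp.class out = work.class_sorted tagsort;
   by age;
run;
```

The trade-off is typically longer run time, since each record is retrieved from the input data set after the tags are sorted.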
* Write obs to the output data set(s); output <output data set>;
run;
Questions?