Sunday, 20 September 2015

Introduction to SAS by example...

What is SAS?
SAS stands for statistical analysis system. SAS is a collection of modules that are used to process and analyze data. SAS
is also an extremely powerful, general-purpose programming language.
Here we are going to see examples from base SAS programming which are particularly solved problems from the ‘SAS by example by Ron Cody’.

Getting data file into SAS
Here is an example of getting your data from text file to SAS dataset. Here default delimiter is a blank space (‘ ‘).

The program:
libname myprac1 '/folders/myfolders/';

data myprac1.veg;
   infile "/folders/myfolders/veggies.txt";
   input Name $ Code $ Days Number Price;
   CostPerSeed = Price / Number;
run;
proc print data=myprac1.veg;
run;

Output:
Obs
Name
Code
Days
Number
Price
CostPerSeed
1
Cucumber
50104-A
55
30
195
6.50000
2
Cucumber
51789-A
56
30
215
7.16667
3
Carrot
50179-A
68
1500
355
0.23667
4
Carrot
50872-A
65
1500
325
0.21667
5
Beetroot
57224-A
75
200
375
1.87500
6
Beetroot
62471-A
80
200
395
1.97500
7
Brinjal
57828-A
66
200
295
1.47500
8
Brinjal
52233-A
70
30
225
7.50000

Above code starts from defining private library name with statement ‘libname’. Default library for SAS datasets are ‘Work’ library. Proc step is used to perform various process on your dataset. Here, we are printing dataset.

Conditional processing
SAS evaluates the expression in an IF-THEN statement to produce a result that is either non-zero, zero, or missing. A non-zero and nonmissing result causes the expression to be true; a result of zero or missing causes the expression to be false.
If the conditions that are specified in the IF clause are met, the IF-THEN statement executes a SAS statement for observations that are read from a SAS data set, for records in an external file, or for computed values. An optional ELSE statement gives an alternative action if the THEN clause is not executed. The ELSE statement, if used, must immediately follow the IF-THEN statement.

Below are some problems from Chapter 7 of Cody`s book given with their solution code and output. 

NOTE: Problem statements are given in each code snippet in first comment section. Various comments have been incorporated in order to improve the readability. Learning explanations are given below every code snippet.

Code: ch7_1.sas

Please note that problem statements are given in code snippet in comment section as below.

/*

Problem 1

Run the program here to create a temporary SAS data set called School:

data school;

input Age Quiz : $1. Midterm Final;

/* Add you statements here 

datalines;

12 A 92 95

12 B 88 88

13 C 78 75

13 A 92 93

12 F 55 62

13 B 88 82

;*//*

Using IF and ELSE IF statements, compute two new variables as follows: Grade

(numeric), with a value of 6 if Age is 12 and a value of 8 if Age is 13.

The quiz grades have numerical equivalents as follows: A = 95, B = 85, C = 75,

D = 70, and F = 65. Using this information, compute a course grade (Course) as a

weighted average of the Quiz (20%), Midterm (30%) and Final (50%).;*/



libname a15022 '/folders/myfolders/my_prac';



data a15022.school;

input Age Quiz : $1. Midterm Final;

/* Add you statements here */



if Age eq 12 then Grade=6;

else if Age eq 13 then Grade=8;



if Quiz eq 'A' then Course_Grade=(95*20+Midterm*30+Final*50)/100;

if Quiz eq 'B' then Course_Grade=(85*20+Midterm*30+Final*50)/100;

if Quiz eq 'C' then Course_Grade=(75*20+Midterm*30+Final*50)/100;

if Quiz eq 'F' then Course_Grade=(65*20+Midterm*30+Final*50)/100;

datalines;

12 A 92 95

12 B 88 88

13 C 78 75

13 A 92 93

12 F 55 62

13 B 88 82

;



proc print data=a15022.school;

run;

Learning: Use of conditional statements to assign grades based on scores.

Output:
Obs
Age
Quiz
Midterm
Final
Grade
Course_Grade
1
12
A
92
95
6
94.1
2
12
B
88
88
6
87.4
3
13
C
78
75
8
75.9
4
13
A
92
93
8
93.1
5
12
F
55
62
6
60.5
6
13
B
88
82
8
84.4


Code: ch7_2.sas

/*

2. Using the SAS data set Hosp, use PROC PRINT to list observations for Subject

values of 5, 100, 150, and 200. Do this twice, once using OR operators and once

using the IN operator.

Note: Subject is a numeric variable.

*/

libname a15022 '/folders/myfolders/my_prac';

**Creating Database;

data a15022.hosp;

   do j = 1 to 1000;

      AdmitDate = int(ranuni(1234)*1200 + 15500);

      quarter = intck('qtr','01jan2002'd,AdmitDate);

      do i = 1 to quarter;

         if ranuni(0) lt .1 and weekday(AdmitDate) eq 1 then

            AdmitDate = AdmitDate + 1;

         if ranuni(0) lt .1 and weekday(AdmitDate) eq 7 then

            AdmitDate = AdmitDate - int(3*ranuni(0) + 1);

         DOB = int(25000*Ranuni(0) + '01jan1920'd);

         DischrDate = AdmitDate + abs(10*rannor(0) + 1);

         Subject + 1;

         output;

      end;

   end;

   drop i j;

   format AdmitDate DOB DischrDate mmddyy10.;

run;



proc print data=a15022.hosp;

      where subject eq 5 or subject eq 100

      or subject eq 150 or subject eq 200;

      run;

      

      

proc print data=a15022.hosp;

      where subject in (5,100,150,200);

run;

Learning: PROC PRINT with or and in operator. Multiple ‘or’ can be replaced with ‘in’ statement.

Output:
Obs
AdmitDate
quarter
DOB
DischrDate
Subject
5
08/03/2003
6
06/28/1928
08/12/2003
5
100
11/05/2004
11
09/10/1980
11/15/2004
100
150
03/31/2005
12
02/07/1923
04/11/2005
150
200
07/02/2004
10
09/25/1931
07/10/2004
200


Obs
AdmitDate
quarter
DOB
DischrDate
Subject
5
08/03/2003
6
06/28/1928
08/12/2003
5
100
11/05/2004
11
09/10/1980
11/15/2004
100
150
03/31/2005
12
02/07/1923
04/11/2005
150
200
07/02/2004
10
09/25/1931
07/10/2004
200


Code: Ch7_7.sas

/*Using the Sales data set, list all the observations where Region is North and

Quantity is less than 60. Include in this list any observations where the customer

name (Customer) is Pet's are Us*/



libname a15022 '/folders/myfolders/my_prac';



*Data set SALES;

data a15022.sales;

   input    EmpID     :       $4. 

            Name      &      $15.

            Region    :       $5.

            Customer  &      $18.

            Date      : mmddyy10.

            Item      :       $8.

            Quantity  :        5.

            UnitCost  :  dollar9.;

   TotalSales = Quantity * UnitCost;

/*   format date mmddyy10. UnitCost TotalSales dollar9.;*/

   drop Date;

datalines;

1843 George Smith  North Barco Corporation  10/10/2006 144L 50 $8.99

1843 George Smith  South Cost Cutter's  10/11/2006 122 100 $5.99

1843 George Smith  North Minimart Inc.  10/11/2006 188S 3 $5,199

1843 George Smith  North Barco Corporation  10/15/2006 908X 1 $5,129

1843 George Smith  South Ely Corp.  10/15/2006 122L 10 $29.95

0177 Glenda Johnson  East Food Unlimited  9/1/2006 188X 100 $6.99

0177 Glenda Johnson  East Shop and Drop  9/2/2006 144L 100 $8.99

1843 George Smith  South Cost Cutter's  10/18/2006 855W 1 $9,109

9888 Sharon Lu  West Cost Cutter's  11/14/2006 122 50 $5.99

9888 Sharon Lu  West Pet's are Us  11/15/2006 100W 1000 $1.99

0017 Jason Nguyen  East Roger's Spirits  11/15/2006 122L 500 $39.99

0017 Jason Nguyen  South Spirited Spirits  12/22/2006 407XX 100 $19.95

0177 Glenda Johnson  North Minimart Inc.  12/21/2006 777 5 $10.500

0177 Glenda Johnson  East Barco Corporation  12/20/2006 733 2 $10,000

1843 George Smith  North Minimart Inc.  11/19/2006 188S 3 $5,199

;

proc print data=a15022.sales;

run;



proc print data=a15022.sales;

where Region eq 'North' and Quantity lt 60 and Customer like "%Pet's are Us%";

run;

Learning: ‘where’ condition is used to apply constraints on dataset in PROC PRINT statement.
Conditional operator used here are eq-> Equal to, lt-> Less than.

Output:

           
Obs
EmpID
Name
Region
put
Item
Quantity
UnitCost
TotalSales
1
1843
George Smith
North
Barco Corporation
144L
50
8.99
449.5
2
1843
George Smith
South
Cost Cutter's
122
100
5.99
599.0
3
1843
George Smith
North
Minimart Inc.
188S
3
5199.00
15597.0
4
1843
George Smith
North
Barco Corporation
908X
1
5129.00
5129.0
5
1843
George Smith
South
Ely Corp.
122L
10
29.95
299.5
6
0177
Glenda Johnson
East
Food Unlimited
188X
100
6.99
699.0
7
0177
Glenda Johnson
East
Shop and Drop
144L
100
8.99
899.0
8
1843
George Smith
South
Cost Cutter's
855W
1
9109.00
9109.0
9
9888
Sharon Lu
West
Cost Cutter's
122
50
5.99
299.5
10
9888
Sharon Lu
West
Pet's are Us
100W
1000
1.99
1990.0
11
0017
Jason Nguyen
East
Roger's Spirits
122L
500
39.99
19995.0
12
0017
Jason Nguyen
South
Spirited Spirits
407XX
100
19.95
1995.0
13
0177
Glenda Johnson
North
Minimart Inc.
777
5
10.50
52.5
14
0177
Glenda Johnson
East
Barco Corporation
733
2
10000.00
20000.0
15
1843
George Smith
North
Minimart Inc.
188S
3
5199.00
15597.0


Performing iterative processing

Looping is very important programming structure, especially when you have to run particular type of instructions multiple times.

Code: Ch8_1.sas

/*Run the program here to create a temporary SAS data set called Vitals

Using this data set, create a new data set (NewVitals) with the following new

variables:

For subjects less than 50 years of age:

If Pulse is less than 70, set PulseGroup equal to Low;

otherwise, set PulseGroup equal to High.

If SBP is less than 130, set SBPGroup equal to Low;

otherwise, set SBPGroup equal to High.



For subjects greater than or equal to 50 years of age:

If Pulse is less than 74, set PulseGroup equal to Low;

otherwise, set PulseGroup equal to High.

If SBP is less than 140, set SBPGroup equal to Low;

otherwise, set SBPGroup equal to High.

You may assume there are no missing values for Pulse or SBP*/



libname a15022 '/folders/myfolders/my_prac';

data a15022.vitals;

input ID : $3.

Age

Pulse

SBP

DBP;

label SBP = "Systolic Blood Pressure"

DBP = "Diastolic Blood Pressure";

datalines;

001 23 68 120 80

002 55 72 188 96

003 78 82 200 100

004 18 58 110 70

005 43 52 120 82

006 37 74 150 98

007 . 82 140 100

;

data a15022.NewVitals;

      set a15022.vitals;

if age lt 50 then do;

      if Pulse lt 70 then PulseGroup='Low';

      else PulseGroup='High';

      if SBP lt 130 then SBPGroup='Low';

      else SBPGroup='High';

      end;

if age ge 50 then do;

      if Pulse lt 74 then PulseGroup='Low';

      else PulseGroup='High';

      if SBP lt 140 then SBPGroup='Low';

      else SBPGroup='High';

      end;

      run;

      

proc print data=a15022.NewVitals;

run;

Learning: ‘do’ in a if statement enables us to perform number of operations within that if condition.

Output:
Obs
ID
Age
Pulse
SBP
DBP
PulseGroup
SBPGroup
1
001
23
68
120
80
Low
Low
2
002
55
72
188
96
Low
Hig
3
003
78
82
200
100
Hig
Hig
4
004
18
58
110
70
Low
Low
5
005
43
52
120
82
Low
Low
6
006
37
74
150
98
Hig
Hig
7
007
.
82
140
100
Hig
Hig

Code: Ch8_2.sas

/*

Run the program here to create a temporary SAS data set (MonthSales):

data monthsales;

input month sales @@;

/* add your line(s) here */

/*datalines;

1 4000 2 5000 3 . 4 5500 5 5000 6 6000 7 6500 8 4500

9 5100 10 5700 11 6500 12 7500

;

Modify this program so that a new variable, SumSales, representing Sales to date, is

added to the data set. Be sure that the missing value for Sales in month 3 does not

result in a missing value for SumSales.

*/

libname a15022 '/folders/myfolders/my_prac';

data a15022.monthsales;

input month sales @@;

/* add your line(s) here */

SumSales+sales;

datalines;

1 4000 2 5000 3 . 4 5500 5 5000 6 6000 7 6500 8 4500

9 5100 10 5700 11 6500 12 7500

;



proc print data=a15022.monthsales;

run;

Learning: Sum function is another way of writing x=x+1. In above problem, we are creating new column SumSales as SumSales=Sumsales+sales.

Output:
Obs
month
sales
SumSales
1
1
4000
4000
2
2
5000
9000
3
3
.
9000
4
4
5500
14500
5
5
5000
19500
6
6
6000
25500
7
7
6500
32000
8
8
4500
36500
9
9
5100
41600
10
10
5700
47300
11
11
6500
53800
12
12
7500
61300


Code: ch8_4.sas

/*

Count the number of missing values for the variables A, B, and C in the Missing data

set. Add the cumulative number of missing values to each observation (use variable

names MissA, MissB, and MissC). Use the MISSING function to test for the missing

values.

*/

*Setting up private library;

libname a15022 '/folders/myfolders/my_prac';

* data step ;
data a15022.mss;

*reading from missing.txt file;

infile '/folders/myfolders/Data_Sources/missing.txt';

input A B C;

* conditional statement with 'do';

if missing(A) then do;

 MissA+1;  **Using sum statement;

end;

if missing(B) then do;

 MissB+1;

end;

if missing(C) then do;

 MissC+1;

end;

run;



proc print data=a15022.mss;

run;

Learning: Above example illustrates use of sum function for calculating number of missing values.
Output given below is just upto 10 observations.

Output:
Obs
A
B
C
MissA
MissB
MissC
1
1
2
3
0
0
0
2
4
5
.
0
0
1
3
6
7
8
0
0
1
4
9
10
11
0
0
1


Sub setting and combining datasets

SAS provides various method for sub setting as well as combining different datasets.

Code: Ch10_1.sas
/*1. Using the SAS data set Blood, create two temporary SAS data sets called Subset_A
and Subset_B. Include in both of these data sets a variable called Combined equal to
.001 times WBC plus RBC. Subset_A should consist of observations from Blood
where Gender is equal to Female and BloodType is equal to AB. Subset_B should
consist of all observations from Blood where Gender is equal to Female, BloodType
is equal to AB, and Combined is greater than or equal to 14. */


* Creating dataset;
libname a15022'/folders/myfolders/my_prac';
data a15022.blood;
   infile '/folders/myfolders/Data_Sources/blood.txt' truncover;
   length Gender $ 6 BloodType $ 2 AgeGroup $ 5;
   input Subject
         Gender
         BloodType
         AgeGroup
         WBC
         RBC
         Chol;
   label Gender = "Gender"
         BloodType = "Blood Type"
         AgeGroup = "Age Group"
         Chol = "Cholesterol";
run;

proc print data=a15022.blood;


data a15022.subset_A;
set a15022.blood;
Combine=(WBC+RBC)*0.001;
where Gender='Female' and BloodType='AB';
run;

proc print data=a15022.subset_A;
run;

data a15022.subset_B;
set a15022.blood;
Combine=(WBC+RBC)*0.001;
where Gender='Female' and BloodType='AB' and ((WBC+RBC)*0.001) >= 14;
run;

proc print data=a15022.subset_B;
run;

Learning: In above problem, subset A and subset B are being derived based on condition given in ‘where’ statement.

Output:
Obs
Gender
BloodType
AgeGroup
Subject
WBC
RBC
Chol
Combine
1
Female
AB
Young
1
7710
7.40
258
7.71740
2
Female
AB
Old
25
7480
4.70
183
7.48470
3
Female
AB
Young
78
7410
5.82
175
7.41582
4
Female
AB
Young
79
.
4.61
69
.
5
Female
AB
Young
101
7610
4.60
162
7.61460
6
Female
AB
Old
142
9380
5.82
218
9.38582
7
Female
AB
Young
180
6220
5.58
.
6.22558
8
Female
AB
Old
199
6810
5.54
.
6.81554
9
Female
AB
Old
255
8080
5.45
144
8.08545
10
Female
AB
Young
260
7680
.
127
.
11
Female
AB
Young
288
6810
8.26
138
6.81826
12
Female
AB
Young
363
7950
5.24
.
7.95524
13
Female
AB
Old
366
7350
4.72
135
7.35472
14
Female
AB
Old
414
.
5.98
.
.
15
Female
AB
Old
449
7480
3.37
186
7.48337
16
Female
AB
Young
459
6620
6.19
184
6.62619
17
Female
AB
Old
528
6310
6.30
.
6.31630
18
Female
AB
Old
544
6360
4.88
.
6.36488
19
Female
AB
Old
652
9110
5.41
.
9.11541
20
Female
AB
Old
802
7200
3.37
185
7.20337

Code: Ch10_2.sas
/*
2. Using the a15022S data set Hosp, create a temporary a15022S data set called Monday2002,
consisting of observations from Hosp where the admission date (AdmitDate) falls on
a Monday and the year is 2002. Include in this new data set a variable called Age,
computed as the person’s age as of the admission date, rounded to the nearest year.

*/

*Setting up library;
libname a15022'/folders/myfolders/my_prac';

*create dataset Monday2002;
data a15022.Monday2002;
      set a15022.hosp;
Age= round(yrdif(DOB,AdmitDate),1); *Age calculation using yrdif function;
where weekday(AdmitDate)=2 and year(AdmitDate)=2002; *Condition for subsetting;
run;
proc print data=a15022.Monday2002;
run;



*Data set HOSP;
data a15022.hosp;
   do j = 1 to 1000;
      AdmitDate = int(ranuni(1234)*1200 + 15500);
      quarter = intck('qtr','01jan2002'd,AdmitDate);
      do i = 1 to quarter;
         if ranuni(0) lt .1 and weekday(AdmitDate) eq 1 then
            AdmitDate = AdmitDate + 1;
         if ranuni(0) lt .1 and weekday(AdmitDate) eq 7 then
            AdmitDate = AdmitDate - int(3*ranuni(0) + 1);
         DOB = int(25000*Ranuni(0) + '01jan1920'd);
         DischrDate = AdmitDate + abs(10*rannor(0) + 1);
         Subject + 1;
         output;
      end;
   end;
   drop i j;
   format AdmitDate DOB DischrDate mmddyy10.;
run;

proc print data=a15022.hosp;
run;

Learning: ‘where’ condition in data step enables us to derive a subset from given dataset.

Output:
Obs
AdmitDate
quarter
DOB
DischrDate
Subject
Age
1
11/11/2002
3
10/01/1928
11/18/2002
48
74
2
11/11/2002
3
05/08/1940
11/19/2002
49
63
3
11/11/2002
3
09/26/1942
11/15/2002
50
60
4
07/22/2002
2
10/14/1986
07/25/2002
69
16
5
07/22/2002
2
09/05/1981
07/23/2002
70
21
6
10/14/2002
3
01/28/1961
10/20/2002
289
42
7
10/14/2002
3
04/25/1922
10/25/2002
290
80
8
10/14/2002
3
11/06/1948
10/29/2002
291
54
9
12/23/2002
3
09/07/1949
01/02/2003
303
53
10
12/23/2002
3
10/31/1927
12/26/2002
304
75


Working with numeric functions

SAS functions are an essential tool in DATA step programming. They perform such tasks
as rounding numbers, computing dates from month-day-year values, summing and
averaging the values of SAS variables, and hundreds of other tasks.

Code: ch11_2.sas
/* Problem
2. Count the number of missing values for WBC, RBC, and Chol in the Blood data set.
Use the MISSING function to detect missing values.
*/
libname a15022 '/folders/myfolders/my_prac';

*Data set BLOOD;
data a15022.blood;
   infile '/folders/myfolders/Data_Sources/blood.txt' truncover;
   length Gender $ 6 BloodType $ 2 AgeGroup $ 5;
   input Subject
         Gender
         BloodType
         AgeGroup
         WBC
         RBC
         Chol;
   label Gender = "Gender"
         BloodType = "Blood Type"
         AgeGroup = "Age Group"
         Chol = "Cholesterol";
run;


data a15022.miss;
set a15022.blood;
if missing(Chol) then MissChol + 1;
if missing(WBC) then MissWBC + 1;
if missing(RBC) then MissRBC + 1;
drop Subject Gender BloodType AgeGroup WBC RBC Chol;
run;

proc print data=a15022.miss;
run;
Learning: 
‘missing’ function lets you detect missing value in dataset.

Output:
Obs
MissChol
MissWBC
MissRBC
1
0
0
0
2
1
0
0
3
1
0
0
4
2
0
0
5
2
1
0
6
2
1
0
7
2
1
0
8
2
1
0
9
2
2
0
10
3
2
0


Working with Character function

These functions enable you to search for character strings, take strings apart and put them back
together again, and remove selected characters from a string.

Code: ch12_2.sas
/*
2. Using the data set Mixed, create a temporary SAS data set (also called Mixed) with
the following new variables:
a. NameLow – Name in lowercase
b. NameProp – Name in proper case
c.(Bonus – difficult) NameHard – Name in proper case without using the
PROPCASE function
*/


libname a15022 '/folders/myfolders/my_prac';

*Data set MIXED;
data a15022.mixed;
   input Name & $20. ID;
datalines;
Daniel Fields  123
Patrice Helms  233
Thomas chien  998
;

data a15022.mixed;
set a15022.mixed;
Namelow=lowcase(Name); * fun lowcase to covert it into lower case;
Nameprop=propcase(Name); * fun uppercase to convert it into upper case;

* taking substring of first character in every word and making it upcase;
* then, concatenating all substrings to form full name with proper case;
fchar=upcase(substr(Name,1,1));
fname=cats(fchar,substr(Name,2,index(Name,' ')-1));
lchar=upcase(substr(Name,index(Name,' ')+1,1));
lname= cats(lchar,substr(Name,index(Name,' ')+2,length(Name)));
Namehard=catx(' ',fname,lname);
drop fname lname fchar lchar;
run;

proc print data=a15022.mixed;
run;

Learning: In above code, up case function has been used to make observation uppercase. Another functions such as cats-for concatenation, substr- for taking out substring, catx- concatenating with space in between has been used to make observation in Propercase which can also be done with in-built function ‘Propcase’.

Output :
Obs
Name
ID
Namelow
Nameprop
Namehard
1
Daniel Fields
123
daniel fields
Daniel Fields
Daniel Fields
2
Patrice Helms
233
patrice helms
Patrice Helms
Patrice Helms
3
Thomas chien
998
thomas chien
Thomas Chien
Thomas Chien


Arrays

Typically arrays are used to perform a similar operation on a group of variables.

Code: ch13_2.sas
/*
. Redo Problem 1, except use data set Survey2.
Note: Ques1–Ques5 are numeric variables.*/


*Data set SURVEY2;
data a15022.survey2;
   input ID
         (Q1-Q5)(1.);
datalines;
535 13542
012 55443
723 21211
007 35142
;

* making dataset named survey1;
data a15022.survey1;
set a15022.survey;
* Creating a temporary array;
array temp{5} _temporary_;
*creating array for survey1;
array Q{5} Ques1-Ques5;
do i=1 to 5;
temp{6-i}=Q{i};
end;

do i=1 to 5;
Q{i}=temp{i};
end;
drop i;
proc print data=a15022.survey1;
run;

Learning: In above example, arrays have been used to perform operation on Question column of Survey2 dataset. Here, we are reversing values of column Ques1 to Ques5 in resulting survey1 table. _temporary_ lets us create a temporary array.

Output:
Obs
ID
Gender
Age
Salary
Ques1
Ques2
Ques3
Ques4
Ques5
1
1
M
23
28000
3
2
1
2
1
2
2
F
55
76123
1
1
2
5
4
3
3
M
38
36500
1
2
2
2
2
4
4
F
67
128000
4
2
2
3
5
5
5
M
22
23060
2
4
3
3
3
6
6
M
63
90000
3
4
5
3
2
7
7
F
45
76100
3
3
4
3
5

Displaying your data

Code: Ch14_2.sas
/*
Use PROC PRINT (without any DATA steps) to create a listing like the one here.
Note: The variables in the Hosp data set are Subject, AdmitDate (Admission Date),
DischrDate (Discharge Date), and DOB (Date of Birth).
Selected Patients from HOSP Data Set
Admitted in September of 2004
Older than 83 years of age
--------------------------------------
Date of Admission Discharge
Subject Birth Date Date
401 03/21/1921 09/13/2004 09/22/2004
407 08/26/1920 09/13/2004 09/18/2004
409 01/01/1921 09/13/2004 10/02/2004
2577 04/30/1920 09/27/2004 09/27/2004
6889 10/26/1920 09/17/2004 09/22/2004
7495 02/11/1921 09/21/2004 09/22/2004
Number of Patients = 6
*/

libname a15022 '/folders/myfolders/my_prac';

title "Hosp Dataset";
title2 "Patients admitted in sept 2004 and age >83";
title3 "Older than 83 years of age";

proc print data=a15022.hosp
           n='Number of Patients = '
           label
           double;
   where Year(AdmitDate) eq 2004 and
         Month(AdmitDate) eq 9 and
         yrdif(DOB,AdmitDate,'Actual') ge 83;
   id Subject;
   var DOB AdmitDate DischrDate;
   label AdmitDate = 'Admission Date'
         DischrDate = 'Discharge Date'
         DOB = 'Date of Birth';
run;

Learning: Label statement lets you present data well. In given example data from Hosp dataset is filtered and presented with label names.

Output:
Hosp Dataset
Patients admitted in sept 2004 and age >83
Older than 83 years of age
Subject
Date of Birth
Admission Date
Discharge Date
401
03/21/1921
09/13/2004
09/22/2004
407
08/26/1920
09/13/2004
09/18/2004
409
01/01/1921
09/13/2004
10/02/2004
2577
04/30/1920
09/27/2004
09/27/2004
6889
10/26/1920
09/17/2004
09/22/2004
7495
02/11/1921
09/21/2004
09/22/2004
Number of Patients = 6 


Creating customized reports
Although you can customize the output produced by PROC PRINT, there are times when
you need a bit more control over the appearance of your report. PROC REPORT was
developed to fit this need.
Code: Ch15_1.sas
/*
Using the SAS data set BloodPressure, produce a report showing Gender and Age,
along with a new variable called AgeGroup. Values of AgeGroup are <= 50 or
> 50 depending on the value of Age. Label this variable “Age Group.” Your report
should look like this:
Patient Age Groups
Age
Gender Age Group
M 23 <= 50
F 68 > 50
M 55 > 50
F 28 <= 50
M 35 <= 50
M 45 <= 50
F 48 <= 50
F 78 > 50
316 Learning SAS by Example: A Programmer’s Guide
6. Using the SAS data set BloodPressure, produce a report showing Gender, Age, SBP,
and DBP. Order the report in Gender and Age order as shown here:
Subject's in Gender and Age Order
Systolic Diastolic
Blood Blood
Gender Age Pressure Pressure
F 28 120 70
48 138 88
68 110
*/
libname a15022 '/folders/myfolders/my_prac';

*Data set BLOODPRESSURE;
data a15022.bloodpressure;
   input Gender : $1.
         Age
         SBP
         DBP;
datalines;
M 23 144 90
F 68 110 62
M 55 130 80
F 28 120 70
M 35 142 82
M 45 150 96
F 48 138 88
F 78 132 76
;



title "Patient Age Groups";
proc report data=a15022.bloodpressure nowd;
   column Gender Age AgeGroup;
   define Gender / width=6;
   define Age / display width=5;
   define AgeGroup / computed "Age Group";
   compute AgeGroup / character length=5;
      if Age gt 50 then AgeGroup = '> 50';
      else if not missing(Age) then AgeGroup = '<= 50';
   endcomp;
run;


Learning: Above example shows usage of PROC report to generate report with a new column as AgeGroup and defining width og existing table columns.

Output:
Patient Age Groups
Gender
Age
Age Group
M
23
<= 50
F
68
> 50
M
55
> 50
F
28
<= 50
M
35
<= 50
M
45
<= 50
F
48
<= 50
F
78
> 50

Please click on below link to get all the program with data sets..
Programs and datasets...