Friday, 2 October 2015

SAS problems

Chapter 15 : Creating Customized Reports

Although you can customize the output produced by PROC PRINT, there are times when you need a bit more control over the appearance of your report. PROC REPORT was developed to fit this need. Not only can you control the appearance of every column of your report, you can produce summary reports as well as detail listings.

Code: ch15_6.sas

/*6.Using the SAS data set Blood Pressure, produce a report showing Gender, Age, SBP,
and DBP. Order the report in Gender and Age order as shown here:
Subject's in Gender and Age Order
Systolic Diastolic
Blood Blood
Gender Age Pressure Pressure
F 28 120 70
48 138 88
68 110 62
78 132 76
M 23 144 90
35 142 82
45 150 96
55 130 80*/

title "Subject's in Gender and Age Order";
proc report data=a15022.bloodpressure nowd;
column Gender Age SBP DBP;
define Gender / order width=6;
define Age / order width=5;
define SBP / display "Systolic Blood Pressure" width=8;
define DBP / display "Diastolic Blood Pressure" width=9;
run;

Learning:
Program defines format for each column such as age, blood pressure. It outputs a report which is easy and nice to read and understand.

Output:

Subject's in Gender and Age Order
Gender
Age
Systolic Blood Pressure
Diastolic Blood Pressure
F
28
120
70

48
138
88

68
110
62

78
132
76
M
23
144
90

35
142
82

45
150
96

55
130
80

Code: ch15_8.sas

/*

Using the data set Blood, produce a report like the one here. The numbers in the table

are the average WBC and RBC counts for each combination of blood type and

gender.

Average Blood Counts by Gender

ƒƒƒƒƒƒGenderƒƒƒƒƒƒ ƒƒƒƒƒƒGenderƒƒƒƒƒƒ

Blood Female Male Female Male

Type WBC WBC RBC RBC

ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ

A 7,218 7,051 5.47 5.46

AB 7,421 6,893 5.43 5.69

B 6,716 6,991 5.52 5.42

O 7,050 6,930 5.53 5.48

*/

/*blood dataset*/

data a15022.blood;

   infile '/folders/myfolders/programs/blood.txt' truncover;

   length Gender $ 6 BloodType $ 2 AgeGroup $ 5;

   input Subject 

         Gender 

         BloodType 

         AgeGroup

         WBC 

         RBC 

         Chol;

   label Gender = "Gender"

         BloodType = "Blood Type"

         AgeGroup = "Age Group"

         Chol = "Cholesterol";

run;



title "Average Blood Counts by Gender";

proc report data=a15022.blood nowd headline;

column BloodType Gender,WBC Gender,RBC;

define BloodType / group 'Blood Type' width=5;

define Gender / across width=8 '-Gender-';

define WBC / analysis mean format=comma8.;

define RBC / analysis mean format=8.2;

run;

Learning:
We could aggregate data by gender and then by blood type using above code.
Output:
Average Blood Counts by Gender

Gender
Gender

Female
Male
Female
Male
Blood Type
WBC
WBC
RBC
RBC
A
7,218
7,051
5.47
5.46
AB
7,421
6,893
5.43
5.69
B
6,716
6,991
5.52
5.42
O
7,050
6,930
5.53
5.48

Chapter 16: Summarizing your data

You may have thought of PROC MEANS (or PROC SUMMARY) primarily as a way to
generate summary reports, reporting the sums and means of your numeric variables.
However, these procedures are much more versatile and can be used to create summary
data sets that can then be analyzed with more DATA or PROC steps.
All the examples in this chapter use PROC MEANS rather than PROC SUMMARY,
even when all you want is an output data set. The reason for this is that using PROC
MEANS with a NOPRINT option is identical to using PROC SUMMARY
.

Code: Ch16_4.sas

/*Chapter 16*/



/*Repeat Problem 3 (CLASS statement only), except group small and medium school
sizes together. Do this by writing a new format for SchoolSize (values are S, M, and
L). Do not use any DATA steps.*/



/*College Dataset*/

proc format library=a15022;

   value $yesno 'Y','1' = 'Yes'

                'N','0' = 'No'

                ' '     = 'Not Given';

   value $size 'S' = 'Small'

               'M' = 'Medium'

               'L' = 'Large'

                ' ' = 'Missing';

   value $gender 'F' = 'Female'

                 'M' = 'Male'

                 ' ' = 'Not Given';

run;



data a15022.college;

   length StudentID $ 5 Gender SchoolSize $ 1;

   do i = 1 to 100;

      StudentID = put(round(ranuni(123456)*10000),z5.);

      if ranuni(0) lt .4 then Gender = 'M';

      else Gender = 'F';

      if ranuni(0) lt .3 then SchoolSize = 'S';

      else if ranuni(0) lt .7 then SchoolSize = 'M';

      else SchoolSize = 'L';

      if ranuni(0) lt .2 then Scholarship = 'Y';

      else Scholarship = 'N';

      GPA = round(rannor(0)*.5 + 3.5,.01);

      if GPA gt 4 then GPA = 4;

      ClassRank = int(ranuni(0)*60 + 41);

      if ranuni(0) lt .1 then call missing(ClassRank);

      if ranuni(0) lt .05 then call missing(SchoolSize);

      if ranuni(0) lt .05 then call missing(GPA);

      output;

   end;

   format Gender $gender1. 

          SchoolSize $size. 

          Scholarship $yesno.;

   drop i;

run;



/*PRogram;*/



proc format;

value $groupsize

'S','M' = 'Small and Medium'

'L' = 'Large';

run;

title "Statistics on the College Data Set";

title2 "Broken down by School Size";

proc means data=a15022.college

n mean median min max maxdec=2;

class SchoolSize;

var ClassRank GPA;

format SchoolSize $groupsize.;

run;

Learning:
Usage of proc means to generate more detail report and finding specific statistic like min, max, median.

Output:
Statistics on the College Data Set
Broken down by School Size
The MEANS Procedure
SchoolSize
N Obs
Variable
N
Mean
Median
Minimum
Maximum
Large
18
ClassRank
GPA
15
16
72.80
3.61
71.00
3.61
45.00
2.98
98.00
4.00
Small and Medium
78
ClassRank
GPA
69
75
71.83
3.47
73.00
3.62
41.00
2.29
100.00
4.00


Code: Ch16_5.sas



/*Using the SAS data set College, report the mean GPA for the following categories of

ClassRank: 0–50 = bottom half, 51–74 = 3rd quartile, and 75 to 100 = top

quarter. Do this by creating an appropriate format. Do not use a DATA step.*/



proc format;

value rank 0-50 = 'Bottom Half'

51-74 = 'Third Quartile'

75-100 = 'Top Quarter';

run;

title "Statistics on the College Data Set";

title2 "Broken down by School Size";

proc means data=a15022.college

n mean maxdec=2;

class ClassRank;

var GPA;

format ClassRank rank.;

run;

Learning:
In above program we are diving students in quartiles according to their marks.
Output:
Statistics on the College Data Set
Broken down by School Size
The MEANS Procedure
Analysis Variable : GPA
ClassRank
N Obs
N
Mean
Bottom Half
15
14
3.78
Third Quartile
34
34
3.52
Top Quarter
39
35
3.42

Chapter 7: Counting frequencies
PROC FREQ can be used to count frequencies of both character and numeric variables, in one-way, two-way, and three-way tables. In addition, you can use PROC FREQ to create output data sets containing counts and percentages. Finally, if you are statistically inclined, you can use this procedure to compute various statistics such as chi-square, odds ratio, and relative risk.

Code: Ch17_1.sas

/*Chapter 17*/



/*Using the SAS data set Blood, generate one-way frequencies for the variables

Gender, BloodType, and AgeGroup. Use the appropriate options to omit the

cumulative statistics and percentages.*/



title "One-way Frequencies from BLOOD Data Set";

proc freq data=a15022.blood;

tables Gender BloodType AgeGroup / nocum nopercent;

run;

Learning:
In above program we learn usage of proc freq and tables statement to count and display one way frequencies.

Output:

One-way Frequencies from BLOOD Data Set
The FREQ Procedure
Gender
Gender
Frequency
Female
440
Male
560

Blood Type
BloodType
Frequency
A
412
AB
44
B
96
O
448

Age Group
AgeGroup
Frequency
Old
598
Young
402

Code: Ch17_2.sas





/*5. Program 17-2*/

/*

Using the SAS data set BloodPressure, generate frequencies for the variable Age.

Use a user-defined format to group ages into three categories: 40 and younger, 41 to

60, and 61 and older. Use the appropriate options to omit the cumulative statistics

and percentages.

*/

proc format;

value agegrp low-40 = '40 and lower'

41-60 = '41 to 60'

61-high = '61 and higher';

run;

title "Using a Format to Regroup Values";

proc freq data=a15022.bloodpressure;

tables age / nocum nopercent;

format age agegrp.;

run;

Learning: Above program shows frequency as per age group using tables statement.

Output:

Using a Format to Regroup Values
The FREQ Procedure
Age
Frequency
40 and lower
3
41 to 60
3
61 and higher
2

Please click on below link to find all the programs…