Sunday, October 4, 2009

Multi-Value Ranges

This was one of my interview questions.

"How do you write a proc format for overlapping values? Can you write down the format for variable: age?"

I have pasted the same here. The multilabel option helps a user in defining format for overlapping values.

proc format;
value age(multilabel)
0 - 12 = "Children"
13 - 19 = "Teenager"
20 - 25 = "Young Adult";
0 - 19 = "Children & Teenager";
low - 25 = "Children, Teenager & Young Adult";
25 - high = "Adult";
run;

Another example:

proc format; 
value cat(multilabel)
90 - high = "90% improvement"
75 - 90 = "75% improvement"
50 - 75 = "50% improvement";
run;

The below example is different one in which the proc format is used for the usual range values...

proc format; 
value rv
120 - high = '5'
60 < 120 = '4'
35 < 60 = '3'
25 < 35 = '2
25 = '1';
run;

What is IDE???


IDE/ Integrated development environment is a programming environment integrated into a software application. It incorporates a GUI builder, a code editor, a compiler and/or an interpreter & a debugger.

The basic features of this IDE for SAS include:
  • Collection of edit macros and templates
  • Creation of new templates
  • SAS highlighting features
  • Indention
  • Fast commenting
  • Expansion of keystrokes into SAS language constructs
  • Development, testing, and fixing (with SAS/PC).

These built-in features in an "Integrated development environment" facilitates a programmer maximize his/ her efficiency and manage time with a fairly large work load.

More: http://multi-edit-2008.software.informer.com/11.2/

Wednesday, September 23, 2009

Use of %STR, %EVAL and %SYSEVALF


This is a code used for the programming of adverse event table (where a programmer has to repeatedly use FREQ). Here in this macro, one can notice the use of %str.

What is the use of %str in this code???

This helps in masking of special characters in the parameter value. The use of %str help the macro processor understand that the passed parameter value is “nothing other than text”.

For example: if the following code*** is executed without a %str and if a parameter value 'Body system' is passed to the macro, it does not comprehend the meaning of &txt = 'Body system’!
***
%if %str
(&txt) eq 'Body system' %then %do;

Instead the log displays a message saying that the "condition is false".

Hence the use of %str is absolutely necessary when parameter value contains blank/ special characters.

Same way, %eval is for evaluation of integers and %sysevalf is for floatingpoint values.

***** A CODE FOR FINDING FREQUENCY *****;

%macro freq(indt=, sort=, cond=, byvar=, class=, var=, ord=, txt=, outdt=);
proc sort data=&indt out=sdat; by &sort;
proc means data=sdat completetypes nway chartype missing noprint;
where &cond;
by &byvar;
class &class /preloadfmt missing mlf order=formatted;
var &var;
format trtcd trtf.;
output out=_freq n=ccnt mean = mn;
run;

data freq; length txt txt1 txt2 col1 $200;
set _freq;

ord=⩝ txt=&txt;

%if &ord=4 %then %do;
%if %str(&txt) eq 'Body system' %then %do;
txt1=strip(aebodsys); txt2=''; ord1=1;
%end;
%if %str
(&txt) eq 'Preferred term' %then %do;
txt1=strip(aebodsys); txt2=strip(aedecod); ord1=2;
%end;
%end;

%else %do;
txt1=''; txt2=''; ord1=0;
%end;

if trtcd=1 then deno = &trt1;
else if trtcd=2 then deno = &trt2;
else deno = &trt3;

if ord in(3,4,5) then do; per = (ccnt/deno)*100;
col1 = right(put(ccnt,3.))right(put('('trim(left(put(per,4.1)))'%)',8.));
end;
else if ord in (2) then do; col1 = right(put(mn,4.1)); end;
else if ord in (1) then do; col1 = right(put(mn,5.2)); end;
else do; col1 = right(put(ccnt,3.)); end;
run;

***** TRANSPOSED FOR DISPLAY *****;

proc sort data=freq; by txt txt1 txt2 ord ord1;
proc transpose data=freq out=freq1;
by txt txt1 txt2 ord ord1;
id trtcd ;
var col1;
run;

data &outdt;
set freq1(drop=_name_);
run;
%mend freq;

%freq(indt=zer, sort= trtcd, cond = , byvar= trtcd, class= , var=v, ord=0, txt='Subjects treated', outdt=zer1);


Monday, June 29, 2009

Output a dataset from proc contents!


proc contents data = check
out = check1(keep=name format label length name type varnum);
run;

Many of us use the procedure: proc contents but only few make use of the resulting dataset. The output dataset of proc contents helps the user visualize data items (in the form of a dataset). For example one can see the name, format, label, length, name, type, varnum etc.

Friday, June 26, 2009

Template for merging with SUPP's


%macro mer(seq, select);
data lb1;
set lb;
seq=&seq;
run;

data supp;
set supplb;
if strip(qnam)=&select;
seq=input(idvarval,best.);
run;

proc sort data=lb1; by usubjid seq;
proc sort data=supp; by usubjid seq;
data test;
merge lb1(in=a) supp(in=b);
by usubjid seq;
if a & b;
run;

proc sort data = test out = test1
dupout = dups nodupkey;
by studyid rdomain usubjid idvar idvarval qnam qlabel qval qorig qeval;
run;
%mend;

%mer(lbseq, 'CS');

Sunday, June 7, 2009

Conversion of character date & time to numeric


The following macro dtcn helps in conversion of character date and time to numeric date and time.

%macro dtcn(indt,dtmn);
length dt1 $10 tm1 $6 dt1n $20;
if index(&indt,'T') ne 0 then do;
dt1 = scan(&indt,1,'T');
tm1 = scan(&indt,2,'T');
if dt1 ne '' and tm1 ne '' then dt1n = put(input(dt1,yymmdd10.),date9.)' 'trim(left(tm1));
if dt1 ne '' and tm1 ne '' then &dtmn = input(dt1n,datetime15.);
else if dt1 ne '' and tm1 eq '' then &dtmn =input(put(input(dt1,yymmdd10.),date9.),date9.);
end;
else &dtmn = .;
%mend;


data date;
xxstdtc='2009-01-02T17:30';
run;

data date1;
format xxstdtn datetime15.;
set date;
%dtcn(xxstdtc, xxstdtn);
run;

Saturday, June 6, 2009

Find out sum


There are two possible ways to find out the sum of variable x.

data dummy;
input pt $1 seq x;
datalines;
1 1 6
1 2 1
1 3 .
2 1 4
3 1 .
4 1 .
4 2 2
5 1 3
6 1 .
7 1 3
;
run;
proc sort data=dummy; by pt seq; run;

1. One can either transpose the value of variable x (by identifier seq) & find the sum.

proc transpose data=dummy out=dummy1 prefix=_;
by pt;
id seq;
var x;
run;

data sumt;
set dummy1;
val=sum(_1,_2,_3);
run;

2. Or can use the following code to calculate it.

The code given below calculates the sum of variable x without transposing data:

data sum;
set dummy;
by pt;
retain val;
if first.pt then do;
*set the value to missing when it reads first.pt;
val = .;
*& finds where x is non-missing to fill the column val with first.pt value;
if x ne . then val = x;
end;
*does an: addition of two value & retain the same;
else if x ne . & val ne . then val = val+x;
*incase when first.pt is missing & other records of the same pt are available: then fills the column val with value of x;
else if x ne . & val eq . then val = x;
run;

data sum1;
set sum;
by pt;
if last.pt;
keep pt x val;
run;

Monday, April 27, 2009

A Fwd that I enjoyed reading...


A Priest, a Doctor and a CRO employee die and go to heaven,
they hear God's voice and it says:

“My dear sons I am really very sorry but the heaven is full and I can accommodate only one of you, so to choose the right person, one by one tell me what you have done in your lifetime.”

The Priest goes up first and says:

"Well God I'm a priest, I am your humble servant and have spent all my life working to spread your message."

The Doctor goes up next and says:

"Well I'm a doctor and I have helped thousands of people recover from their illnesses"

The CRO employee goes up says:

"Well I worked in ***** Clinical Research India Pvt Ltd and........", Before the CRO employee could say anything further, the heaven's gate opened and God came with tears in his eyes and said to the CRO employee… Say no more my son come with me, you have already been through HELL...

Wednesday, April 8, 2009

How to read variable names?


The system dataset SASHELP.VCOLUMN is a SAS view. This view is where the attributes (like: variable names, labels and formats) of a dataset are stored as the records of a dataset. This is like an output dataset from the CONTENTS procedure.

For example, in the code below: If the DM dataset is executed, the corresponding attributes of this dataset gets saved in a view called sashelp.vcolumn (which can later be set for analysis).

*** RUN DM DATASET ***;

data dm;
set dm;
run;

*** SET SASHELP.VCOLUMN TO READ VARIABLE NAMES ***;

data test;
set sashelp.vcolumn(where=(libname="WORK" and memname in ("DM"))) end=last;
if find (label, "Date"); *capture date variables;
run;

Saturday, April 4, 2009

Checklists for validation


1) Log Check:
  • Is the program log clean (with no errors, warnings, uninitialized values, character to numeric conversions and vice versa?)
2) Cosmetic Check:
  • Does the output match mock up?
  • Are all words spelled correctly?
  • Does the number [1] in header match footnote number?
  • Does each footnote end with a period except reference?
  • Does the footnote refer to corresponding listing? (For Tables)
  • Are all variables aligned: Numeric – Right & decimal aligned, Character – Left aligned?
  • Is there any truncation for character variables?
  • Are ordering / display of subtitle consistent across all listings?
  • Does 5 subjects in QC program (a random pick) match output?
3) Logic Check:
  • Are the TLFs subset for right population?
  • Does the number of records displayed match with QC program?
  • Does the output content in each column relate to the annotated specification?
  • Does the output make logical sense for the given data?
  • Is there any duplicates in the given data?

Use of #BYVAL(variable-name)


How to customize our title display??? How do we insert a text in the title???

This option #byval helps the user insert text at the position it is placed in the title statement.

For example in the code below:

The use of #byval option with the variable TRTGRP in the BY statement (within a proc step) makes its dynamic value get displayed in the title. As result of which we get different treatment groups displayed in the title part.

title1 "Listing of Patient Population";
title2 "Treatment Group=#byval(trtgrp)";
proc report data=dummy nowd headline headskip spacing=1 missing;
by trtgrp;
column(subj rand saft ittp eval);

define subj/ order_ width = 20 left spacing = 0 "Subject" ;
define rand/ order_ width = 20 center "Randomized";
define saft/ display width = 20 center "Safety" ;
define ittp/ display width = 20 center "ITT";
define eval/ display width = 20 center "Evaluable";

break after subj / skip;
run;


Saturday, February 21, 2009

Labeling with proc transpose


data new;
input subject $ value $ id $ label $;
cards;
111 03 A One
112 03 A One
111 04 B Two
113 03 B Two
114 04 C Three
;
run;

proc sort data=new;
by subject;
run;

proc transpose data=new out=new1;
by subject;
id id;
var value;
idlabel label;
run;

The ID statement identifies a variable whose values will supply the SAS names for variables in the transposed data.

is "FLOW" not working?


Then try this code:

var = compress(temp,compress(temp,'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890!@#$%^&*()_+?<>{}]["".,/= '));

This code compresses the invisible characters that are present in the variable named temp and it helps in perfect alignment of text with FLOW function.

Saturday, January 31, 2009

Assigning GRADES


data assign;
set draft;
*temperature in celsius;
temp=temperature;

*INCREASE IN TEMPERATURE;
if (temp^= . & temp>= 38)
then do;
if 38 <= temp<= 39 then grade= 'GRADE1'; else if 39 < grade=" 'GRADE2';"> 40 then grade= 'GRADE3';
end;
*DECREASE IN TEMPERATURE;
else if (temp^= . & temp<= 35) then do;
if 35 >= temp> 32 then grade= 'GRADE2';
else if 32 >= temp> 28 then grade= 'GRADE3';
else if temp<= 28 then grade= 'GRADE4'; end;

run;

Visit in Window???


data window;
set draft;
by usubjid visitnum;

if visitday ne . & visitday lt 0 then day= 0;
else day= visitday;
*bringing the preceding day using lag function;
preday=lag(day);
*identifying observation through lag of usubjid;
lagusid = lag(usubjid);
*(day-preceding day) gives the day difference;
if usubjid = lagusid then do;

if day ne . & preday ne . then daydiff=(day-preday);
end;


*VISIT WINDOW FOR SCREENING;
if day ne . & visitnum in (1) then do;
if day le 0 then viwind = 'Yes';
else viwind = 'No';
end;
*VISIT WINDOW FOR VISIT 1 2 & 3;
if daydiff ne . & visitnum in (2,3,4) then do;
if 5 le daydiff le 7 then viwind = 'Yes';
else viwind = 'No';
end;
*VISIT WINDOW FOR END OF STUDY;
if daydiff ne . & visitnum in (5)
then do;
if 0 le daydiff le 1 then viwind = 'Yes';
else viwind = 'No';
end;
*VISIT WINDOW FOR FOLLOW-UP;
if daydiff ne . & visitnum in (6) then do;
if daydiff le 25 then viwind = 'Yes';
else viwind = 'No';
end;


run;