Saturday, March 22, 2008
Character array for creating a dummy dataset
data one;
input visit$ 1-10;
datalines;
DAY 8
DAY 15
DAY 29
DAY 57
DAY 85
;
run;
-
*create a dummy dataset with 3 different xxorres values for each visit present in dataset one;
data dummy;
set one;
length xxorres $ 20;
*temporary character array holding the three possible results;
array res{3} $ 20 _temporary_ ('COMPLETE CLEARANCE' 'INCOMPLETE CLEARANCE' 'UNABLE TO ASSESS');
*write one observation per visit and result;
do i = 1 to dim(res);
xxorres = res{i};
output;
end;
run;
proc sort data=dummy; by i; run;
proc print; run;
-
More: http://dist.stat.tamu.edu/flash/SAS/
-
*display zero for missing values using a character array;
data display;
set disp;
*new is the array name, l1 to l4 are character variables in disp;
array new{*} $ l1-l4;
do i=1 to dim(new);
if new{i}='' then new{i}='0';
end;
run;
-
Tuesday, March 18, 2008
"When" for data analysis
- "When ORRES is populated, STAT (& REASND) remains blank. When STAT (& REASND) is populated, ORRES remains blank".
- "When the raw dataset and the derived dataset have the same name, the derived dataset replaces the raw dataset".
- "When two datasets are merged & one of them is missing a few usubjids: the if a and b; condition is preferred to avoid blanks".
- "When TERM is missing, the derived dataset has blank values". Ex: To drop observations without a value, the if aeterm ne " "; condition is used.
- "When the derived dataset has both TERM and STAT: the if aeterm ne " " or aestat ne " "; condition is used".
- "When the result is not missing in the raw dataset, then ORRES=result, STRESC=ORRES and STRESN=the numeric form of STRESC". Usually ORRES & STRESC are displayed as "Y" / "N" instead of "Yes" / "No".
- "When STRESC is "Y" / "N", STRESN is left blank (rather than 1 / 2)".
- "When END DATE is " ", then ENRF="ONGOING"; Example: MH dataset".
- "When a PARTIAL DATE is present in a raw dataset, the same is displayed in the derived dataset".
- "When a sequence number is assigned, it remains unique for all observations within a usubjid".
- "When working on the display of DAYS, programmers check that DAY is not displayed as 0".
- When the variables in a SAS data set are not in the desired order, one of the following statements (ATTRIB, ARRAY, FORMAT, INFORMAT, LENGTH, RETAIN) is used before the SET statement to re-order them.
- When a split character is used in listing/table programming, the same split character must be used in PROC REPORT together with the FLOW option. The FLOW option keeps the split character "*" from being displayed directly in the output.
- When the following warning is received in the log, the PROC PRINT is removed from the program: "WARNING: Data too long for column "TEST"; truncated to 125 characters to fit."
- When an order value is missing in the final dataset of a table and the same order variable is used in PROC REPORT, the row with the missing order is not displayed.
- When different order numbers are given to the "result HEADER" and the "RESULT" (in a listing/table) and the same order variable is used in the BREAK AFTER statement (break after ord / skip;), a blank line appears between them. To avoid this, the same order number is given to both the header and the result.
Age (years)-> result header
blank line ------------
N->result
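The "if a and b;" rule from the list above can be sketched as follows. This is only a minimal illustration: the datasets dm and ae, their variables and their values are made up for the demo, and subject 103 is deliberately missing from dm.

```sas
/* hypothetical demo data: subject 103 has no record in dm */
data dm;
  input usubjid $ sex $;
  datalines;
101 F
102 M
;
run;

data ae;
  input usubjid $ aeterm $;
  datalines;
101 HEADACHE
103 NAUSEA
;
run;

proc sort data=dm; by usubjid; run;
proc sort data=ae; by usubjid; run;

data both;
  /* a and b are temporary flags: 1 when the dataset contributes to the row */
  merge dm(in=a) ae(in=b);
  by usubjid;
  /* keep only subjects present in both datasets */
  if a and b;
run;
```

With this demo data only subject 101 is kept, so no row carries a blank sex or a blank aeterm.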
Saturday, March 15, 2008
10 Essential Proc's
PROC name [DATA=dataset [dsoptions] ] [options];
[other PROC-specific statements;]
[BY varlist;]
RUN;
where:
name:------ identifies the procedure you want to use.
dataset:---- identifies the SAS data set to be used by the procedure; if omitted, the last data set to have been created during the session is used.
dsoptions:-- specifies the data set options to be used.
varlist:----- specifies the variables that define the groups to be processed separately. The data set must already be sorted by these same variables.
options:---- specifies the PROC-specific options to be used.
/***********************************************************/
PROC COMPARE BASE = mydataset
COMPARE = otherds;
RUN;
/***********************************************************/
2) SAS Language allows users to assign their own formats to values in their data. Usually, this involves users asking SAS to replace numbers in their data with some kind of labels. The most common example is the user request that: SAS take '1's and '2's in a variable called SEX and format those values as the words 'female' and 'male' when they are displayed in output.
PROC FORMAT;
VALUE sexfmt
1="Female"
2="Male";
RUN;
FORMAT sex sexfmt.;
O U
-----------------------
1 female
2 male
2 male
1 female
2 male
O: original data, U: output using user-defined format
The original data stay the same -- coded as numbers -- but whenever the format is requested, the output shows the assigned labels instead of the original numbers. These kinds of formats are called 'user-defined formats'. The formats already known to SAS are called SAS formats.
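Here is a small end-to-end sketch of the user-defined format in use. The dataset demo and its values are hypothetical; the FORMAT statement is placed inside PROC PRINT, so the labels only affect how that step displays the data.

```sas
proc format;
  value sexfmt
    1 = "Female"
    2 = "Male";
run;

/* hypothetical demo data */
data demo;
  input id sex;
  datalines;
1 1
2 2
3 2
;
run;

/* sex is still stored as 1 / 2; only the display changes */
proc print data=demo noobs;
  format sex sexfmt.;
run;
```

Putting the same FORMAT statement in the DATA step instead would attach the format to the variable permanently.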
/***********************************************************/
PROC SQL;
SELECT STATE, SALES,
(SALES * .05) AS TAX
FROM USSALES;
QUIT;
PROC SQL combines the functionality of the DATA and PROC steps in a single step. It can sort, summarize, subset, join (merge), and concatenate datasets, create new variables, and print the results or create a new table or view, all in one step.
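As a rough sketch of doing several of these things in one step, the query below subsets, summarizes, sorts and creates a new table. The rows in USSALES and the table name statetax are assumptions made for the demo.

```sas
/* hypothetical demo data for the USSALES table */
data ussales;
  input state $ sales;
  datalines;
NC 1000
NC 2500
TX 4000
;
run;

proc sql;
  /* subset, summarize, sort and save in a single step */
  create table statetax as
  select state,
         sum(sales)       as total_sales,
         sum(sales) * .05 as tax
  from ussales
  where sales > 0
  group by state
  order by total_sales desc;
quit;
```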
/***********************************************************/
- Data set level
Name
Engine
Creation date
Number of observations
Number of variables
File size (bytes)
- Variable level
Name
Type
Length
Formats
Position
Label
PROC CONTENTS DATA=temp [options];
RUN;
A few options:
POSITION: output lists the variables by their position in the data set (default is alphabetical).
SHORT: output is just the list of variable names.
OUT=dataset: creates a data set where each observation describes one variable from the original data set.
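A short sketch of these options, assuming the SASHELP.CLASS sample data set that ships with most SAS installations is available. The output name class_vars is made up.

```sas
/* list variables in data set order as well as alphabetically */
proc contents data=sashelp.class position;
run;

/* write one observation per variable to a data set, without printed output */
proc contents data=sashelp.class out=class_vars noprint;
run;
```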
/***********************************************************/
5) Values of all the variables in the data set will be listed (by PROC PRINT) unless a variables (VAR) statement accompanies the procedure. The VAR statement can also be used to specify the order in which variables are to be printed.
PROC PRINT DATA=temp [options];
RUN;
A few options:
N: The number of observations is printed after the listing.
UNIFORM: This option specifies that the values for each variable be printed in the same columns on every page of the output.
DOUBLE: This option forces SAS to double space the output.
ROUND: This option causes SAS to round variables being summed.
NOOBS: This option suppresses the observation number in the printed output.
LABEL: Use information from LABEL statements (where defined) as the column headings rather than the variable names.
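For illustration, a sketch that combines a VAR statement with some of the options above. It assumes the SASHELP.CLASS sample data set is available, and the label text is made up.

```sas
proc print data=sashelp.class noobs label n;
  /* VAR limits the listing to these variables, in this order */
  var name age height;
  label height = "Height (in)";
run;
```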
/***********************************************************/
6) The FREQ procedure produces one-way frequency tables and n-way cross tabulations. For example, to obtain frequency counts for all variables in the SURVEY data set, enter: PROC FREQ DATA=SURVEY; RUN; The step below cross-tabulates AGE by HEIGHT in the CLASS data set:
PROC FREQ DATA=CLASS;
TABLES AGE*HEIGHT;
RUN;
Observe that:
- Tables can be produced for numeric and character variables.
- More than one TABLES statement can be used.
- When the TABLES statement is omitted, one-way frequency tables are printed for each variable in the data set.
- There is no limit to the number of variables in a TABLES request.
- For a table of two or more variables an * must be given between each pair of variable names.
The following example illustrates the generation of one-, two-, and three-way frequency tables:
PROC FREQ;
TABLES sex;
TABLES sex * r1;
TABLES sex * r1 * r2;
RUN;
The first TABLES statement generates a one-way frequency table for the variable SEX. The second TABLES statement generates a two-way frequency table for the row variable SEX and the column variable R1. The third TABLES statement generates a three-way frequency table for the variables SEX, R1, and R2; R1 is the row variable and R2 is the column; a new table is printed for each different value of the variable SEX.
/***********************************************************/
7) The MEANS procedure is used to produce simple descriptive statistics on numeric variables in a SAS data set.
PROC MEANS DATA=CLASS;
VAR HEIGHT WEIGHT;
RUN;
/***********************************************************/
8) PROC TRANSPOSE helps to reshape data from long to wide.
data long1;
input famid year faminc;
cards;
1111 96 40000
1111 97 40500
1111 98 41000
2222 96 45000
2222 97 45400
2222 98 45800
3333 96 75000
3333 97 76000
3333 98 77000
;
run;
proc sort data=long1; by famid; run;
*incyr: income year, prefix is used for identification;
PROC TRANSPOSE DATA=long1 OUT=wide1 PREFIX=incyr;
*same BY variable as in the proc sort step;
BY FAMID;
*the values of this variable (with the prefix) become the new column names;
ID YEAR;
*the values of this variable fill the new columns;
VAR FAMINC;
RUN;
Result:
Obs  famid  _NAME_  incyr96  incyr97  incyr98
1    1111   faminc  40000    40500    41000
2    2222   faminc  45000    45400    45800
3    3333   faminc  75000    76000    77000
/***********************************************************/
9) PROC SORT sorts a data set. The program below sorts the data set called "DRAFT" on the variables "USUBJID" and "VISIT" and saves the sorted data as "DRAFT1". The original data set remains unchanged because OUT=DRAFT1 specifies that the sorted data should be placed in DRAFT1.
PROC SORT DATA=DRAFT OUT=DRAFT1;
BY USUBJID VISIT;
RUN ;
*new data;
DATA DRAFT2;
SET DRAFT1;
BY USUBJID VISIT;
*FIRST.VISIT keeps only the first obs of each USUBJID and VISIT group;
IF FIRST.VISIT;
RUN;
(The NODUPKEY option of PROC SORT can also be used to remove observations with duplicate BY values.)
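A minimal sketch of the NODUPKEY route, using a made-up DRAFT data set and a made-up output name. It keeps one observation per USUBJID and VISIT, much like the FIRST.VISIT step above.

```sas
/* hypothetical demo data with a duplicate USUBJID / VISIT pair */
data draft;
  input usubjid $ visit $ result;
  datalines;
101 V1 5
101 V1 7
101 V2 6
;
run;

/* NODUPKEY drops observations whose BY values repeat */
proc sort data=draft out=draft_nodup nodupkey;
  by usubjid visit;
run;
```

With this demo data, draft_nodup ends up with two observations, one for V1 and one for V2.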
/***********************************************************/
10) PROC REPORT controls how the data is presented, and a COMPUTE block can introduce lines of text into the report.
PROC REPORT .......;
column sex;
.....;
COMPUTE BEFORE sex;
line @1 'Gender';
line @1 sex $200.;
endcomp;
RUN;
Monday, March 10, 2008
TRANWRD, COMPRESS & INDEXW
TRANWRD Function: "Replaces all occurrences of a word".
name=tranwrd(name, "Miss", "Ms.");
put name;
Value: Miss Joan Smith
Result: Ms. Joan Smith
More: http://www.asu.edu/sas/sasdoc/sashtml/lgref/z0215027.htm
COMPRESS Function: "Removes specific characters from a character string".
a='AB C D ';
b=compress(a);
put b;
Value: 'AB C D ';
Result: ABCD
More: http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000212246.htm
INDEXW Function: "Returns the first position in the character-value at which the find-string occurs as a whole word". If the find-string is not found as a word, the function returns 0.
indexw(STRING1,"the");
Value: STRING1 = "there is a the here"
Result: 12 (the word "the")
More: http://support.sas.com/publishing/pubcat/chaps/59343.pdf
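The three functions can be tried together in one DATA _NULL_ step. This is only a sketch with made-up variable names; the PUT statements write the results to the log.

```sas
data _null_;
  length name $ 20;
  name = "Miss Joan Smith";
  /* replace every occurrence of "Miss" with "Ms." */
  name = tranwrd(name, "Miss", "Ms.");
  put name=;            /* name=Ms. Joan Smith */

  a = 'AB C D ';
  /* with a single argument, COMPRESS removes blanks */
  b = compress(a);
  put b=;               /* b=ABCD */

  string1 = "there is a the here";
  /* the "the" inside "there" is skipped because it is not a whole word */
  pos = indexw(string1, "the");
  put pos=;             /* pos=12 */
run;
```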
-
Wednesday, March 5, 2008
Formulas for creating Vital Signs Dataset
For temperature:
Conversion of Fahrenheit to Celsius:
Temperature_in_Celsius = (5/9)*(Tf-32); *where Tf is the "temperature in fahrenheit";
Conversion of Celsius to Fahrenheit :
Temperature_in_Fahrenheit = (9/5)*Tc+32; *where Tc is the "temperature in celsius";
For height:
Conversion of Inches to Centimeters:
Centimeters = inches * 2.54;
Conversion of Centimeters to Inches:
Inches = centimeters * 0.3937;
For weight:
Conversion of Pounds to Kilograms:
Kilograms = lbs / 2.2;
Conversion of Kilograms to Pounds:
Pounds = kg * 2.2;
For BMI:
*Weight in Pounds:
BMI = (Weight in Pounds / ((Height in inches) * (Height in inches))) * 703
*Weight in Kilograms:
BMI = Weight in Kilograms / ((Height in Meters) * (Height in Meters))
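Put together in a DATA step, the conversions above might look like this. The data set name, the variable names and the single input record are assumptions for the demo.

```sas
data vs;
  input tempf height_in weight_lb;
  /* Fahrenheit to Celsius */
  tempc     = (5/9) * (tempf - 32);
  /* inches to centimeters */
  height_cm = height_in * 2.54;
  /* pounds to kilograms, using the approximate factor 2.2 */
  weight_kg = weight_lb / 2.2;
  /* BMI from kilograms and meters */
  bmi       = weight_kg / ((height_cm / 100) ** 2);
  datalines;
98.6 70 154
;
run;

proc print data=vs noobs;
run;
```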
-
Saturday, February 23, 2008
Industry Reports
The Thomson CenterWatch 2007 Survey of Investigative Sites in the U.S. finds that: sites have rated Kendle, Covance, Omnicare as Top CROs to work with.

Sunday, February 10, 2008
Basics & Books to Read
The Little SAS Book
More:
Visit: "SaS9 Learn to Lead"
Thursday, February 7, 2008
Preparing for SAS Base Certification?
- The best way to prepare for certification is to read the book: SAS Certification Prep Guide: Base Programming for SAS 9. This book helps in understanding the basics of SAS Programming.
- After mastering the book, one can try solving the questions in this blog: http://sascert.blogspot.com/ (which is very useful). There are about 120 questions and they are worth reading.
- For all who plan to take the exam: "Only practical programming makes someone a better programmer".
SAS Certified Base Programmer Credential for SAS 9
2 hours duration
70 multiple-choice questions
Must answer 46 correctly to pass
SAS Programming I: Essentials
Course Contents
Getting Started with SAS
- overview of the SAS System
- introduction to SAS programs
- running SAS programs
- mastering fundamental concepts
- diagnosing and correcting syntax errors
- exploring the SAS environment (self-study)
Getting Familiar with SAS Data Sets
- explaining the concept of a SAS data library
Producing List Reports
- getting started with the PRINT procedure
- sequencing and grouping observations
- identifying observations (self-study)
- using special WHERE statement operators (self-study)
Enhancing Output
- customizing report appearance
- formatting data values
- creating HTML reports
Creating SAS Data Sets
- reading raw data files using column input and formatted input
- examining data errors
- assigning variable attributes
- changing variable attributes (self-study)
- reading Microsoft Excel spreadsheets (self-study)
Programming with the DATA Step
- reading SAS data sets and creating variables
- executing statements conditionally
- dropping and keeping variables (self-study)
- reading date fields from Microsoft Excel spreadsheets (self-study)
Combining SAS Data Sets
- concatenating SAS data sets
- merging SAS data sets
- combining SAS data sets using additional features (self-study)
Producing Summary Reports
- introduction to summary reports
- generating basic summary reports
- using the REPORT procedure
- creating reports using the TABULATE procedure (self-study)
Introduction to Graphics using SAS/GRAPH Software (Self-Study)
- producing bar and pie charts
- enhancing output
- producing plots
Additional Resources:-
Using SAS Enterprise Guide
- creating the files needed for the course
- understanding functional areas in SAS Enterprise Guide
- naming a project
- working with existing code
- resizing windows in SAS Enterprise Guide
- modifying code
- executing SAS code
- viewing SAS Enterprise Guide output
- diagnosing and correcting syntax errors
- creating SAS programs
- accessing data sources with the LIBNAME statement
- renaming a code node in the Process Flow window
- submitting programs
- saving projects
- the Output Delivery System (ODS) and SAS Enterprise Guide
- copying SAS programs within a project
Introduction to Graphics Using SAS Enterprise Guide
- producing and modifying a vertical bar chart
- producing and modifying a pie chart
- producing a horizontal bar chart
- producing a two-dimensional plot
SAS Programming II: Manipulating Data with the DATA Step
Course Contents
Introduction
- review of SAS basics
- review of DATA step processing
- review of displaying SAS data sets
- working with existing SAS data sets
Controlling Input and Output
- outputting multiple observations
- writing to multiple SAS data sets
- selecting variables and observations
- writing to external files
Summarizing Data
- creating an accumulating total variable
- accumulating totals for a group of data
Reading and Writing Different Types of Data
- reading delimited raw data files
- controlling when a record loads
- reading hierarchical raw data files
Data Transformations
- manipulating character variables
- manipulating numeric variables
- manipulating numeric variables based on dates
- converting variable type
Processing Data Iteratively
- performing DO loop processing
- performing SAS array processing
Combining SAS Data Sets
- match-merging two or more SAS data sets
- performing simple joins using the SQL procedure (self-study)
Learning More
- identifying additional resources
For further details: http://support.sas.com/certify/index.html
Tuesday, February 5, 2008
For SAS Beginners: Self-Paced e-Learning
- Open this page in Internet Explorer and click on "SUPPORT & TRAINING".
- This hyperlink will take you to another page: http://support.sas.com/
- In the left corner of the page, you can find LEARNING CENTER: Training which when clicked would expand to:
Self-Paced e-Learning (click here to sign up)
- Not registered yet? Sign up now.
- Once this form is filled, the user can see Free Tutorials
"TUTORIAL: Getting Started with SAS(R) Free:" => Add to Cart.
- Now, click on the Your Cart symbol in the top right corner...
This blog:
"welcome to SAS world": http://learningworld-baithi.blogspot.com/ is also a useful link. A few interesting topics include: SAS Clinical Questions & SAS Programming
Monday, February 4, 2008
SAS: Numeric conversion of variables containing special characters and alphabets
- Sometimes a programmer has to convert a few lab results (stored in a character variable) to numeric form.
- These variables may contain numbers (123456789.) as well as special characters (*/()*&%$,:'+-<>") and letters.
- In such a case, the following code converts the variable to numeric only when it contains none of those characters:
if indexc(result,"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz*/()*&%$,:'+-<>") = 0 and result ne " " then do;
numericresult=input(result,best.);
end;
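As a hedged alternative to the INDEXC check (a different technique, not the one used above), the ?? modifier of the INPUT function suppresses the invalid-data messages and simply leaves the result missing when the text is not a number. The data set names and values below are made up.

```sas
/* hypothetical lab results stored as character */
data labs;
  length result $ 12;
  input result $;
  datalines;
12.5
<5
N/A
140
;
run;

data labs_num;
  set labs;
  /* ?? keeps the log clean; non-numeric text gives a missing value */
  numericresult = input(result, ?? best12.);
run;
```

Here 12.5 and 140 convert, while <5 and N/A stay missing. Unlike the INDEXC check, this route also accepts text such as 1E3 (scientific notation), so the explicit character list is the stricter choice.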
Wednesday, January 23, 2008
This is how I started coding......... in SAS
data marriedlife;
input motherinlaw sisterinlaw;
datalines;
1220000 560000
;
run;
***** deriving husband *******;
data husband_temp;
set marriedlife;
if motherinlaw ne . and sisterinlaw ne . then do;
husband = motherinlaw + sisterinlaw;
end;
drop motherinlaw sisterinlaw;
run;
data laterinlife;
input salary pension;
datalines;
1570000 650000
;
run;
***** deriving life *******;
data mylife_permanent;
set laterinlife;
if salary ne . and pension ne . then do;
husband = salary + pension;
mylife=husband;
end;
drop husband;
run;
Just for fun ;)
Wednesday, January 9, 2008
Hell to Hell: A joke that all CRO employees would love to read
Politician said:
"I miss my country. I want to call my country and see how everybody is doing there."
He called and talked for about 5 minutes, and then he asked:
"Well, devil, how much do I need to pay for the call????"
The devil said:
"Five million dollars".
The Politician wrote him a cheque and went back to sit on his chair.
Thief was so jealous and he started screaming:
"My turnnn! I wanna call my group members, I want to see how everybody is doing there"
He called and talked for about 2 minutes, and then he asked:
"Well, devil, how much do I need to pay for the call????"
The devil said:
"Ten million dollars".
With a smug look on his face, he made a cheque and went back to sit on his chair.
CRO employee was even more jealous & started screaming:
"I want to call my office friends, managers and partners"
He called other CRO employees and talked for twenty hours:- about job, audit files, timelines, new clients....., he talked & talked & talked,then he asked:
"Well, devil how much do I need to pay for the call????"
The devil said:
"Twenty dollars".
CRO employee was stunned & said:
"Twenty dollars Only ?????" !!!!!!!!
Devil said:
"Calling from Hell to Hell is considered as Local Call".
[Courtesy: http://m-nicky.livejournal.com/1926.html]
Wednesday, October 24, 2007
Biometrics, the final touch..... of clinical research
I. Deriving the dataset:
For example: The raw dataset may contain two variables namely,
a) Birth date of the patient &
b) Visit date of the patient
but it may not contain a variable called: “age”.
In order to derive the variable “age” from birth date and visit date, programmers write code such as [age = int ( (visitdt - birthdt) / 365.25 );], which gives the age in completed years. Likewise, many other variables are derived from the raw dataset to make a derived dataset. These derived datasets make table programming much easier.
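A minimal sketch of such an age derivation in completed years, assuming birthdt and visitdt are SAS date variables. The data set name and the single record are made up.

```sas
data derived;
  input birthdt :date9. visitdt :date9.;
  format birthdt visitdt date9.;
  /* age in completed years at the visit */
  age = int((visitdt - birthdt) / 365.25);
  datalines;
15MAR1970 22OCT2007
;
run;

proc print data=derived noobs;
run;
```

For this record the visit falls 13735 days after the birth date, so age comes out as 37.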
II. Listing:
The next step is the listing of data from the derived datasets (with little or no SAS programming).
III. Tables:
Here the programmers subset data according to the mock.
Listings and Tables are done based on these data:
1) Demographics
2) Patient disposition
3) Vital signs / Lab test
4) Adverse event
5) Study drug
6) Disease progression / Death
7) Efficacy
The ultimate aim of Bio (Biological) metrics (Measurement) is to find out:
1) The difference between treatment groups
2) Change from baseline to end of study and
3) Comparison of patient subsets
Sunday, October 14, 2007
How to remove the deleted page from Google's cache?
Personal information submitted to job search sites can show up in Google search. To remove it, the page first has to be deleted from the website. Even after deletion, the page might remain in Google's cache and keep appearing in the search results. To delete the page from Google's cache, the following link can be used:
Webpage removal request tool:
https://www.google.com/webmasters/tools/removals?pli=1
This tool is used to remove:-
- Sensitive information from Google
- Outdated or "dead" link in the Google search results
- Inappropriate webpage or image that appears in SafeSearch filtered results
Tuesday, September 25, 2007
Learning Basics of Clinical Trials
- Investigators
- Nurse
- Data managers and
- Statisticians
It helps the user learn:- History, Basics, Informed consent, IRB Review, Ongoing Protections and International Research in Clinical Trials
Login> http://cme.cancer.gov/clinicaltrials/learning/humanparticipant-protections.asp
Wednesday, September 19, 2007
List of Universities providing Clinical Research Courses
1. Department of Bioinformatics,
University of Pune, Pune
2. H.V. Desai Eye Hospital
Mohammad Wadi, Hadapsar, Pune
3. A335 Shivalik Enclave,
New Delhi
4. Sankara Nethralaya, 18,
College Road, Chennai
5. T John College 88/1 Gottigere,
Bannerghatta Road Bangalore
---
C) Manipal University: http://www.crra.manipal.edu/manipaladva.html
Manipal Universal Learning Pvt. Ltd
Airport Road, Bangalore
Friday, August 11, 2006
Y Clinical Research in India
Clinical research is another emerging field of life science. The term clinical trial means “evaluation of a new drug”. The stages of drug discovery are listed below:
- Drug Target Identification and Validation (Target is a biological molecule which plays an important role in the genesis or development of an illness)
- Developing Leads from Hits, Lead Optimization and Development ('Hit' is a chemical compound that shows activity in primary screen and 'Lead' is a compound with confirmed activity that warrants development)
- Drug Development Process: This process includes- Chemistry and synthesis of drug molecules, In vitro and animal model efficacy, Defining the appropriate dose, Defining the right animal models, In vivo dosage, Pre- clinical drug development stage, Nonclinical drug development stage, Clinical drug development stage, Manufacturing & Filing an Investigational new drug Application
In the drug development process described above, the clinical drug development stage, i.e. the clinical trials, is an expensive process that involves 4 phases:
· Phase 1 Safety and Tolerance Study - (20 to 30 healthy volunteers are involved in the study to check for safety & dosage)
· Phase 2 Efficacy Studies - (100 to 300 patient volunteers are involved to check for efficacy & side effects)
· Phase 3 Definitive Safety and Efficacy Studies - (1000 to 5000 patient volunteers are involved for monitoring reactions to long term drug use)
· Phase 4 studies – (are done once the drug has received market approval)
It takes about 15 years and $800 million for researchers to develop a single drug, but the cost of conducting clinical research in India would be considerably lower because India provides a number of well-trained life science graduates & other required facilities at a lower cost.
