Friday, December 3, 2010

Performance tuning while working with large datasets


1. WHERE to subset data
2. KEEP / DROP to reduce CPU time
3. LENGTH to reduce variable size
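4. CHARACTER variables should be used wherever possible (for example, for codes and IDs that are never used in arithmetic)
5. IF-THEN/ELSE instead of a series of independent IF statements to improve efficiency
6. MACROS for repetitive code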
7. PROC SORT only when needed
8. PROC SQL to reduce the number of steps
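9. INDEX to read a subset of a large dataset
10. PROC COPY to copy a dataset together with its indexes
11. COMPRESS to reduce the number of bytes needed to store each observation
12. DATA _NULL_ for processing that does not need an output dataset
13. PROC APPEND instead of SET to add observations to an existing dataset
14. PROCs with a CLASS statement should be used instead of BY-group processing, since CLASS does not require sorted input
15. SASFILE to reduce I/O processing by holding a dataset in memory
16. STORED PROGRAM FACILITY to compile complex DATA steps once and reuse them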
17. BUFSIZE to set the size of the input/output buffers
18. REUSE to control whether free space in a compressed dataset is reused
19. POINTOBS to allow random access to a compressed dataset by observation number
20. NOMACRO to conserve memory when the macro facility is not needed
21. KILL (in PROC DATASETS) to delete unwanted datasets
22. FORMAT / INFORMAT instead of IF-THEN/ELSE for lookup logic
23. VIEWS to create virtual tables
24. SAS FUNCTIONS to perform common tasks
25. COMBINE steps to reduce the number of DATA and/or PROC steps
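
A few of the tips above can be sketched in one short program. This is only an illustration, not a benchmark: the datasets (WORK.SALES, WORK.SALES_EAST, WORK.SALES_ALL), the variables (CUST_ID, REGION, AMOUNT, SIZE, PRIORITY) and the format SIZEFMT are all made-up names, and the sample data is tiny. It touches tips 1, 2, 3, 5, 12, 13, 22 and 25.

```sas
/* All dataset, variable and format names here are hypothetical. */

/* Small sample dataset so the sketch runs on its own */
data work.sales;
   input cust_id $ region $ amount;
   datalines;
C001 EAST 50
C002 WEST 2500
C003 EAST 1200
C004 EAST 300
;
run;

/* Tip 22: a format replaces a chain of IF-THEN/ELSE for a lookup */
proc format;
   value sizefmt
      low  -< 100  = 'Small'
      100  -< 1000 = 'Medium'
      1000 -  high = 'Large';
run;

/* Tips 1, 2, 3, 5, 25: subset and trim variables as the data is read,
   set lengths explicitly, and use ELSE so that later conditions are
   skipped once one is true, all in a single DATA step */
data work.sales_east (keep=cust_id amount size priority);
   length size $6 priority $1;
   set work.sales (keep=cust_id region amount
                   where=(region = 'EAST'));
   size = put(amount, sizefmt.);
   if amount >= 1000 then priority = 'H';
   else if amount >= 100 then priority = 'M';
   else priority = 'L';
run;

/* Tip 13: PROC APPEND adds the new rows without rereading the base
   dataset (the base is created if it does not exist yet) */
proc append base=work.sales_all data=work.sales_east;
run;

/* Tip 12: DATA _NULL_ runs DATA step logic without writing a dataset;
   here it writes a running total to the log */
data _null_;
   set work.sales_east end=last;
   total + amount;
   if last then put 'Total amount for EAST: ' total;
run;
```

On real data the gain from each technique depends on the dataset size, the number of variables and the storage involved, so it is worth comparing the real time and CPU time reported in the log before and after each change.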
