Sunday, May 25, 2008

List of Basic Questions For SAS Beginners

This section, which will be updated regularly, brings together a list of basic exercises (a homework list) for SAS beginners, compiled at the request of some readers. For convenience, it is arranged in the same order as the SAS BASICS reference manuals. I will first point out the key points and then fill in examples over time.

Data Set Options:

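  • Data set options override the corresponding system options set in the OPTIONS statement, but only for the data set they are attached to
  • Data set options can be used in the DATA statement or the SET statement. Watch the effect carefully: an option in the DATA statement applies to the data set being written, while the same option in the SET statement applies to the data set being read.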
  • Top data set options:
  • Difference between ALTER=, PW=, READ=, and WRITE= for managing passwords, and ENCRYPT= (the most restrictive); if the ENCRYPT= option is used and the password is lost, the only way to get the data back is to recreate the data set
  • The importance of CNTLLEV=LIB/MEM/REC for simultaneous shared access to data sets
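  • KEEP=, DROP=, RENAME=(old-name=new-name ...)
  • The difference between FIRSTOBS= and OBS= (FIRSTOBS= is the number of the first observation to process; OBS= is the number of the last one, not a count)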
  • GENMAX= vs. GENNUM= usage.
  • Usefulness of IDXNAME=, IDXWHERE=, and INDEX=
  • SORTEDBY=
  • WHERE= and WHEREUP=YES/NO
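A few of the points above can be tried out with a short sketch. It assumes the SASHELP.CLASS sample data set that ships with SAS (variables Name, Sex, Age, Height, Weight); the WORK data set names and the variable TALL are made up for illustration.

```sas
/* Options on SET apply while the data set is being READ:          */
/* only observations 5 through 10 and three variables come in.     */
data work.in_side;
    set sashelp.class(firstobs=5 obs=10 keep=name sex age);
run;

/* Options on DATA apply while the data set is being WRITTEN:      */
/* Height is available inside the step but is not stored, and      */
/* Sex is stored under the new name Gender.                        */
data work.out_side(drop=height rename=(sex=gender));
    set sashelp.class;
    tall = (height > 60);   /* Height can still be used here */
run;

/* WHERE= on the input data set filters rows before they           */
/* reach the DATA step.                                             */
data work.teens;
    set sashelp.class(keep=name age where=(age >= 13));
run;
```

Comparing WORK.IN_SIDE and WORK.OUT_SIDE with PROC CONTENTS is a quick way to see the DATA statement vs. SET statement difference.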

Exercises (you have a basic SAS data set, say FIRST, with variables VAR1, VAR2, VAR3, VAR4, VAR5, VAR6, VAR7, VAR8, VAR9, VAR10, VAR11):

  1. Read observations 5 through 10 and print them, with page options that provide a line width of 132 columns and a page length of 64 rows.
  2. Store the observations you read from FIRST in a data set called SECOND, protected with READ=mypass and created with GENMAX=2.
  3. The SECOND data set should allow two people to read and work on it at the same time. (Note: otherwise the data set stays locked by whoever opened it first, and even the creator cannot modify it.)
  4. Keep only VAR7 and VAR8, renaming them GENDER and AGE, and at the same time select only those who are aged 75 and above. Name this data set THIRD.
  5. Modify the THIRD data set created in exercise 4, with an index name (IDXNAME) built from VAR1, VAR2, and VAR6 (FirstName, LastName, ZIP), using _ as the connector.
  6. Create data set FOURTH from THIRD, creating an index on VAR10 (which is PHONENUM).
  7. Create data set FIFTH from FOURTH, sorted by VAR9 (which is CUSTOMER_NUM), keeping only those records in which VAR11 (SSNUM) is not missing.
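As a starting point, here is one possible sketch for some of the exercises, not a definitive solution. It builds a hypothetical stand-in for FIRST with 20 rows in which every variable is numeric (a real FIRST would hold names, ZIP codes, and so on). Because THIRD keeps only two variables, the index step in this sketch reads from FIRST instead, which is an assumption made purely so the code runs.

```sas
/* Hypothetical stand-in for FIRST: 20 rows, VAR1-VAR11, all numeric */
data work.first;
    array v{11} var1-var11;
    do i = 1 to 20;
        do j = 1 to 11;
            v{j} = i * j;
        end;
        var8 = 60 + i;                       /* pretend AGE           */
        if mod(i, 4) = 0 then var11 = .;     /* some missing SSNUM    */
        output;
    end;
    drop i j;
run;

/* Exercise 1: observations 5 through 10, 132 columns by 64 rows */
options linesize=132 pagesize=64;
proc print data=work.first(firstobs=5 obs=10);
run;

/* Exercise 2: read password plus generation data sets */
data work.second(read=mypass genmax=2);
    set work.first(firstobs=5 obs=10);
run;

/* Exercise 4: filter on input with the old name, rename on output */
data work.third(rename=(var7=gender var8=age));
    set work.first(keep=var7 var8 where=(var8 >= 75));
run;

/* Exercises 5 and 6 (idea only): INDEX= builds a simple index on  */
/* VAR10 and a composite index named with _ as the connector.      */
data work.fourth(index=(var10 var1_var2_var6=(var1 var2 var6)));
    set work.first;
run;

/* Exercise 7: keep rows with VAR11 present, sorted by VAR9 */
proc sort data=work.fourth(where=(var11 is not missing)) out=work.fifth;
    by var9;
run;
```

Exercise 3 is left out on purpose: CNTLLEV= sets the level at which shared access is controlled, and trying it properly needs an environment where more than one user or session opens the same data set.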

The next summary will cover FORMATS.
