Tuesday, March 11, 2008

A List of SAS Questions Every Statistician Should Be Able to Answer - This List Will Evolve

  1. What is the PDV (program data vector)?
  2. Why do we need the RETAIN statement? Give an example and relate it to the PDV.
  3. How do we attach the total number of observations to every record without entering the count manually? What is the code?
  4. How do we read multiple records to create a single observation per individual?
  5. How do we output multiple individual records from a single record (assuming data for several individuals are stored in a single record)?
  6. What is the purpose of first.account and last.account when we set data sets by account? Think of an example where this is useful. Why are these variables not automatically available outside the data step? (HINT: relate it to PDV basics)
  7. How do we find macro variable names and their values? (HINT: use PROC SQL)
  8. What is the "c" statistic in logistic regression output? How is it related to the area under the ROC curve?
  9. What are sensitivity and specificity? Which is more important in a marketing modeling context?
  10. What happens in the PDV when we use the statements "if _n_=1 then set a; set b;"?
  11. How do we subset observations at the input level?
  12. What is the difference between obs=10 in the OPTIONS statement and obs=10 in the INFILE statement?
  13. What is the purpose of the MISSOVER, FLOWOVER, DSD, and DLM options in the INFILE statement?
  14. List the important automatic variables that SAS creates (while a SET statement executes) and that are not available outside the data step.
  15. ...
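
For question 8, a small numeric check may help. The sketch below is written in Python rather than SAS, and the function names (c_statistic, roc_auc) and the six toy observations are made up for illustration. It assumes the usual definition of c as the share of concordant (event, non-event) pairs, with tied pairs counted as one half, and compares it with the trapezoidal area under the ROC curve built from the same scores.

```python
from itertools import product


def c_statistic(labels, scores):
    """Share of concordant (event, non-event) pairs; ties count one half."""
    events = [s for y, s in zip(labels, scores) if y == 1]
    non_events = [s for y, s in zip(labels, scores) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0
    ties = 0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1
        elif e == n:
            ties += 1
    return (concordant + 0.5 * ties) / (len(events) * len(non_events))


def roc_auc(labels, scores):
    """Trapezoidal area under the ROC curve, one point per distinct score."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one event and one non-event")
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        # Classify as "event" every observation scoring at or above t.
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        points.append((fp / neg, tp / pos))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical toy data: 1 = event, 0 = non-event, with one tied score.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.8, 0.6, 0.4, 0.3]

print(c_statistic(labels, scores))
print(roc_auc(labels, scores))
```

On this toy data the two printed numbers agree, which is the relationship the question is asking about; the half weight on ties is what keeps them equal when scores repeat.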
