Sunday, February 19, 2012

SAS 1001 Tips - Tip 11 - How to read varying-length character variables

When a data step reads a character variable with list input and no length has been declared, SAS assigns the variable a default length of 8 characters.
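A minimal sketch of that default in action. The data set name default_len, the variables city and visits, and the two data lines are made up for illustration and are not part of the import file discussed in this tip.

```sas
data default_len;
    /* Plain list input with no LENGTH statement or informat:
       city is created with the default length of 8 */
    input city $ visits;
    datalines;
Johannesburg 120
Lima 45
;
run;

proc print data=default_len;
run;
```

With this setup the first city should be stored as Johannes, because only the first 8 characters fit in the variable. Declaring a length or an informat before the value is read avoids the truncation.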

Sometimes the data contains character variables whose values vary in length from record to record, either because of the nature of the data (for example, a variable that captures web page URLs) or because each value is itself a collection of items. Suppose the data arrives like this in a file called F:\import.txt:

origin        products              TotalUnits
China         printer, printer ink  5,678
Mexico        stem tomoto           22,212
USA           pears apples          121,425
South Africa

One way to read this is as follows, assuming the fields in the file are separated by two or more spaces:

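data import;
    /* firstobs=2 skips the header line; truncover stops a short line
       such as "South Africa" from pulling values from the next line */
    infile 'F:\import.txt' firstobs=2 truncover;
    /* & allows single blanks inside a value, so fields must be
       separated by two or more blanks; comma10. reads 5,678 as 5678 */
    input origin & $12. products & $25. units : comma10.;
run;

proc print data=import;
run;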


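The & modifier tells SAS that a character value may contain single embedded blanks: the value is read until two or more consecutive blanks, the informat width (25 characters for products), or the end of the line is reached. The alternative modifier is :, which also reads up to the informat width but stops at the first blank, so a value such as South Africa would be cut off at South. Because & relies on two or more blanks to end a value, the fields in the file must be separated by at least two spaces, and the dsd option should not be combined with a blank delimiter, since dsd treats consecutive delimiters as missing values.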
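The difference between the two modifiers can be sketched with inline data. The data set names, the variables page and hits, and the two data lines below are hypothetical and chosen only to show the behaviour; fields are separated by two spaces.

```sas
/* Colon modifier: the character value ends at the first blank */
data with_colon;
    input page : $25. hits : comma10.;
    datalines;
home  1,204
contact us  87
;
run;

/* Ampersand modifier: the value ends at two consecutive blanks */
data with_ampersand;
    input page & $25. hits : comma10.;
    datalines;
home  1,204
contact us  87
;
run;

proc print data=with_colon;
run;

proc print data=with_ampersand;
run;
```

In with_colon, the second record should give page the value contact, after which SAS tries to read us as a number and sets hits to missing with an invalid-data note in the log. In with_ampersand, page should hold contact us and hits should be 87.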
