SAS Tips: The LENGTH statement
In SAS, the length of a variable is the number of bytes SAS allocates for storing the variable. It is not necessarily the same as the number of characters in the variable.
Why use it? To reduce the size (in terms of disk space) of SAS data sets.
By default, SAS uses 8 bytes to store numeric variables. Variables containing integer values can be stored using fewer than 8 bytes.
Length in bytes | Largest integer represented exactly |
3 | 8,192 |
4 | 2,097,152 |
5 | 536,870,912 |
6 | 137,438,953,472 |
7 | 35,184,372,088,832 |
8 | 9,007,199,254,740,992 |
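The figures in the table follow a simple pattern if one assumes the IEEE 754 double-precision layout that SAS uses on Windows and UNIX hosts (1 sign bit, 11 exponent bits, the remainder mantissa): a variable of length n bytes represents every integer exactly up to 2**(8n - 11). The following sketch is not taken from the manual; it simply prints those limits to the log. Hosts that use a different floating-point representation (e.g. mainframes) have different limits.

```sas
data _null_;
  do len = 3 to 8;
    /* 8*len bits in total: 1 sign bit, 11 exponent bits and      */
    /* 8*len - 12 mantissa bits, plus 1 implicit leading bit,     */
    /* giving 8*len - 11 bits of integer precision                */
    largest = 2**(8*len - 11);
    put len= largest= comma25.;
  end;
run;
```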
Data sets containing many integer variables (as is common with data collected by questionnaire) or indicator variables can be reduced in size by more than 50%.
Variables containing real numbers should be left with the default length of 8.
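To see why, the TRUNC function can be used to mimic storing a value in fewer bytes. In this minimal sketch (the variable names are illustrative), 0.1 has no exact binary representation, so cutting it down to 4 bytes changes its value and an equality comparison with the original fails.

```sas
data _null_;
  x = 0.1;
  y = trunc(x, 4);   /* value of x as it would be stored with length 4 */
  if y ne x then
    put 'Truncated value differs from the original: ' x= y=;
run;
```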
Examples using the LENGTH statement:
data one; length sex age 3 pnr 6; ... run;
data two; length pnr 6 default=3; ... run;
Details are given on pages 429-30 of the SAS Language: Reference manual (version 6, first edition), which is reproduced in full in the online help, and in the relevant host companion.
Note that specifying a length less than that required will result in a loss of precision without any warning being given (see the example on page 92 of the SAS Language: Reference manual). For example, the following code will produce the output shown below:
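data temp;
  length x 4 y 3;      /* x is stored in 4 bytes, y in 3 bytes */
  do x=9000 to 9010;
    y=x;               /* exact in memory; truncated when written to the data set */
    output;
  end;
run;
proc print;
run;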
OBS       X       Y
  1    9000    9000
  2    9001    9000
  3    9002    9002
  4    9003    9002
  5    9004    9004
  6    9005    9004
  7    9006    9006
  8    9007    9006
  9    9008    9008
 10    9009    9008
 11    9010    9010
The variable X has length 4, which can store integers up to 2,097,152 without loss of precision, whereas the variable Y has length 3, which can store integers only up to 8,192 without loss of precision. In the range used here a 3-byte variable can hold only even integers, so each odd value of X is truncated to the even integer below it when stored in Y.
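Because no warning is given, it can be worth checking the data before shortening a variable. One possible check, sketched here on the assumption that the data set TEMP from the example above exists, uses the TRUNC function to compare each value with what would survive storage in 3 bytes.

```sas
data _null_;
  set temp;
  /* trunc(x, 3) returns x as it would be stored in 3 bytes */
  if trunc(x, 3) ne x then
    put 'Length 3 would alter the value ' x=;
run;
```

For the data above, a message is written to the log for each odd value of X.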
Variable lengths can also be assigned using the ATTRIB statement, which can also assign other variable characteristics (e.g. formats, informats, labels).
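As an illustration, the following sketch sets lengths and labels in a single ATTRIB statement and then lists the resulting attributes with PROC CONTENTS. The data set name, labels and values are made up for the example.

```sas
data three;
  attrib pnr length=6 label='Person number'
         sex length=3 label='Sex'
         age length=3 label='Age in years' format=3.;
  pnr = 100001; sex = 1; age = 42;   /* made-up values */
run;

proc contents data=three;
run;
```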