SAS Learning Post

Determining the size of a SAS data set

When developing SAS® data sets, program code and/or applications, efficiency is not always given the attention it deserves, particularly in the early phases of development. Since data sizes and system performance can affect the behavior of a program or application, SAS users may want to access information about a data set’s content and size. To find out, for example, how much disk space a data set is using, users can make a few calculations or query metadata to obtain this information. This tip explores a few ways to determine, or estimate, the size of a data set, a question many users ask when discussing SAS performance and tuning techniques.

Using PROC SQL and DICTIONARY.TABLES

The SAS System collects valuable information (known as “metadata”) about all known SAS libraries, data sets (tables), catalogs, indexes, macros, system options, views and more, and makes it available through a collection of “read-only” tables called Dictionary tables and their SASHELP view counterparts. One specific Dictionary table, TABLES, and its SASHELP view equivalent, VTABLE, contain details about a SAS session’s data sets. The following PROC SQL code uses a SELECT clause to retrieve the LIBNAME, MEMNAME and FILESIZE columns from the TABLES Dictionary table, with a WHERE clause on LIBNAME, MEMNAME and MEMTYPE, to display the size of the CARS data set. FILESIZE is selected twice so that it can be shown with both the SIZEKMG. and SIZEK. formats.

PROC SQL and Dictionary.TABLES:

PROC SQL ;
  TITLE 'Filesize for CARS Data Set' ;
  SELECT LIBNAME,
         MEMNAME,
         FILESIZE FORMAT=SIZEKMG.,
         FILESIZE FORMAT=SIZEK.
    FROM DICTIONARY.TABLES
      WHERE LIBNAME = 'SASHELP'
        AND MEMNAME = 'CARS'
        AND MEMTYPE = 'DATA' ;
QUIT ;

Results

[Image: PROC SQL output for the CARS data set]

Analysis

As shown in the results above, the CARS data set file size is 192KB. Note: When the SIZEKMG. format is specified in a FORMAT= option, SAS determines whether to display the value in KB (kilobytes), MB (megabytes) or GB (gigabytes), and divides the numeric file size in bytes by the corresponding value:

Unit   Divisor
KB     1024
MB     1048576
GB     1073741824
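
As a quick check of the divisors above, the arithmetic can be reproduced in a short DATA step. This is only a sketch: it hard-codes the 196608-byte size reported for CARS in this tip, and the variable names BYTES, KB and MB are made up for illustration.

```sas
/* Sketch: reproduce the KB and MB conversions by hand.        */
/* 196608 is the byte count reported for CARS in this tip;     */
/* BYTES, KB and MB are illustrative variable names.           */
DATA _NULL_ ;
  BYTES = 196608 ;
  KB = BYTES / 1024 ;        /* 196608 / 1024    = 192    */
  MB = BYTES / 1048576 ;     /* 196608 / 1048576 = 0.1875 */
  PUT BYTES= KB= MB= ;       /* write the raw values to the SAS log */
  PUT 'Formatted: ' BYTES SIZEKMG. ;  /* let SAS choose the unit */
RUN ;
```

The first PUT statement writes the unformatted values to the log, and the second writes the same byte count through the SIZEKMG. format so the two can be compared.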

Using PROC PRINT and SASHELP.VTABLE

In the next example, a PROC PRINT step accesses three columns in the VTABLE SASHELP view, specifically LIBNAME, MEMNAME and FILESIZE, to display the size of the CARS data set.

PROC PRINT and SASHELP.VTABLE

PROC PRINT DATA=SASHELP.VTABLE NOOBS ;
  VAR LIBNAME MEMNAME FILESIZE ;
  WHERE LIBNAME = 'SASHELP'
    AND MEMNAME = 'CARS' ;
  FORMAT FILESIZE SIZEKMG. ;
  TITLE 'Filesize for SASHELP.CARS Data Set' ;
RUN ;

Results

[Image: PROC PRINT output for the SASHELP.CARS data set]
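
The VTABLE view also carries the number of observations (NOBS) and the observation length in bytes (OBSLEN), so a rough size estimate can be calculated and set beside the reported FILESIZE. The following is a minimal sketch, not a definitive method: the column alias EST_BYTES is a made-up name, and the estimate ignores header and page overhead, so it will usually come out somewhat smaller than FILESIZE (and can differ more for compressed data sets).

```sas
/* Sketch: estimate data size as observations x observation length */
/* and compare it with the reported file size.                     */
/* EST_BYTES is an illustrative alias, not a VTABLE column.        */
PROC SQL ;
  TITLE 'Estimated vs. Reported Size for SASHELP.CARS' ;
  SELECT LIBNAME,
         MEMNAME,
         NOBS,
         OBSLEN,
         NOBS * OBSLEN AS EST_BYTES FORMAT=SIZEKMG.,
         FILESIZE FORMAT=SIZEKMG.
    FROM SASHELP.VTABLE
      WHERE LIBNAME = 'SASHELP'
        AND MEMNAME = 'CARS'
        AND MEMTYPE = 'DATA' ;
QUIT ;
TITLE ;  /* clear the title so it does not carry over to later steps */
```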

Using DATA _NULL_, SASHELP.VEXTFL and CALL SYMPUTX

In the final example, a FILENAME statement assigns the fileref MYFILE to the physical file of the CARS data set, and a DATA _NULL_ step reads the matching row from the VEXTFL SASHELP view. An assignment statement converts the FILESIZE value from bytes (196608) to megabytes. CALL SYMPUTX then stores the result in the macro variable FILESIZE, left-justifying the value and trimming trailing blanks.

DATA _NULL_ and SASHELP.VEXTFL

FILENAME MYFILE 'C:\Program Files\SAS9.4\SASFoundation\9.4\CORE\SASHELP\Cars.sas7bdat' ;
DATA _NULL_ ;
  SET SASHELP.VEXTFL (WHERE=(FILEREF='MYFILE')) ;
  /* Calculate the Filesize in MB */
  FILESIZE = FILESIZE / (1024 ** 2) ;
  /* Store the result in the macro variable FILESIZE */
  CALL SYMPUTX ('FILESIZE',FILESIZE) ;
RUN ;

Results

[Image: results for the DATA _NULL_ example]
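
Once the DATA _NULL_ step has run, the macro variable FILESIZE can be referenced anywhere macro variables are resolved. The lines below are a small sketch that assumes the step above completed and found the MYFILE fileref; the message and title text are illustrative only.

```sas
/* Sketch: assumes the DATA _NULL_ step above has run and      */
/* created the macro variable FILESIZE (size in megabytes).    */
%PUT NOTE: SASHELP.CARS occupies &FILESIZE MB on disk. ;

/* The same value can be placed in a title; double quotes      */
/* are needed so that &FILESIZE is resolved.                   */
TITLE "SASHELP.CARS uses &FILESIZE MB" ;
```

If the file path in the FILENAME statement does not exist on the system, no usable size is returned and the macro variable will not hold a meaningful value, so it is worth checking the log before relying on it.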
