SAS – Sort Data Sets

Next »»
Data sets in SAS can be sorted on any of the variables present in them. This helps both in data analysis and performing other options like merging etc. Sorting can happen on any single variable as well as multiple variables. The SAS procedure used to carry out the sorting in SAS data set is named PROC SORT. The result after sorting is stored in a new data set and the original data set remains unchanged.

Syntax

The basic syntax for sort operation in data set in SAS is:

PROC SORT DATA=original dataset OUT=Sorted dataset;
BY variable name;
Following is the description of the parameters used:

  • variable name is the column name on which the sorting happens.
  • Original dataset is the dataset name to be sorted.
  • Sorted dataset is the dataset name after it is sorted.

Example
Let’s consider the below SAS data set containing the employee details of an organization. We can sort the data set on salary by using the following code.

DATA Employee;
INPUT empid name $ salary DEPT $ ;
DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;

PROC SORT DATA=Employee OUT=Sorted_sal ;
BY salary;
RUN ;

PROC PRINT DATA=Sorted_sal;
RUN ;

When the above code is executed, we get the following output.

SAS - Sort Data Sets

Reverse Sorting

The default sorting option is in ascending order, which means the observations are arranged as per the lower to higher value of the sorted variable. But we may also want the sort to happen in ascending order.

Example
In the below code reverse sorting is achieved by using DESCENDING statement.

DATA Employee;
INPUT empid name $ salary DEPT $ ;
DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;

PROC SORT DATA=Employee OUT=Sorted_sal_reverse ;
BY DESCENDING salary;
RUN ;

PROC PRINT DATA=Sorted_sal_reverse;
RUN ;

When the above code is executed, we get the following output.

SAS - Sort Data Sets

Sorting Multiple Variables

Sorting can be applied to multiple variables by using them with the BY statement. The variables get sorted with a priority from left to right.

Example
In the below code the data set is sorted first on the variable department name and next on the variable name salary.

DATA Employee;
INPUT empid name $ salary DEPT $ ;
DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 OPS
3 Mike 611.5 IT
4 Ryan 729.1 HR
5 Gary 843.25 FIN
6 Tusar 578.6 IT
7 Pranab 632.8 OPS
8 Rasmi 722.5 FIN
;
RUN;

PROC SORT DATA=Employee OUT=Sorted_dept_sal ;
BY salary DEPT;
RUN ;

PROC PRINT DATA=Sorted_dept_sal;
RUN ;

When the above code is executed, we get the following output.

SAS - Sort Data Sets

Next »»