Graphically Speaking

Category highlighting

When presenting information in the form of a graph, we show the data and let the reader draw the inferences. Often, however, we want to draw the reader's attention to some aspect of the graph or data. For one such case, a user asked how to highlight one (or more) boxes in a box plot.

A simple and effective way to do this is to draw an “underglow” effect behind the boxes or bars that need highlighting. This is easily done by drawing reference lines at specific category values behind the box plot or bar chart, as shown below.

In the first case, I decided to highlight the “Coronary Heart Disease” category. The graph and the SGPLOT code are shown below.

title 'Cholesterol by Death Cause';
proc sgplot data=sashelp.heart noborder;
  refline 'Coronary Heart Disease' / axis=x
          lineattrs=(thickness=70 color=yellow) transparency=0.6;
  vbox cholesterol / category=deathcause;
  yaxis offsetmin=0.05 offsetmax=0.05 display=(noline noticks) grid;
  xaxis offsetmin=0.1 offsetmax=0.1 display=(nolabel);
run;

In the graph and code above, I have drawn a REFLINE on the x-axis at the value “Coronary Heart Disease”. The reference line is yellow, with its thickness set to 70 pixels. The “70” is a guess that works well for this case, so the solution is very specific, with hard-coded values. Note that I also have to set the x-axis and y-axis offsets to prevent the thick reference line from skewing them.
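Since the original question was about highlighting one or more boxes, here is a minimal sketch of the same technique with two categories. It assumes that “Cancer” is one of the DeathCause values in sashelp.heart and that the REFLINE statement is given both values in a single list; the thickness of 70 is still the same hand-tuned guess.

```sas
/* Sketch: highlight two categories with one REFLINE statement.     */
/* Assumes 'Cancer' is a DeathCause value in sashelp.heart.         */
/* thickness=70 is a hand-tuned guess, as in the example above.     */
title 'Cholesterol by Death Cause';
proc sgplot data=sashelp.heart noborder;
  refline 'Coronary Heart Disease' 'Cancer' / axis=x
          lineattrs=(thickness=70 color=yellow) transparency=0.6;
  vbox cholesterol / category=deathcause;
  yaxis offsetmin=0.05 offsetmax=0.05 display=(noline noticks) grid;
  xaxis offsetmin=0.1 offsetmax=0.1 display=(nolabel);
run;
```

Listing the values keeps everything in one statement, but the category names remain hard-coded in the program.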

Alternatively, we may be able to determine the categories to highlight from some criterion in DATA step code. Such a case is shown below. Here, I used the MEANS procedure to compute the mean city mileage by car type, and then decided to highlight all car types with a mean mileage > 20. I do this by creating another column called “Highlight” and copying the type value into it when the mean mileage > 20. Now I can draw the reference lines from the “Highlight” column in the data itself. The graph code is shown below; the data preparation steps are in the full program at the end.

title 'Average City Mileage by Type';
proc sgplot data=cars noborder;
  refline highlight / axis=x
          lineattrs=(thickness=100 color=gold) transparency=0.4;
  vbar type / response=mean dataskin=pressed fillattrs=graphdata2 barwidth=0.7;
  yaxis offsetmin=0.0 offsetmax=0.05 display=(noline noticks) grid;
  xaxis offsetmin=0.1 offsetmax=0.1 display=(nolabel);
run;

In this case, the REFLINE uses the “Highlight” column, and thick reference lines are drawn wherever the Highlight column contains a category name. The thickness of the reference line is still hard-coded.

Clearly, it would be better not to hard-code the thickness of the reference line. The space between the category values varies with each graph, so we need an option that sets the thickness relative to that spacing. Also, this should not adversely impact the axis offsets.

Such a feature is planned for the V9.40M5 release of the SAS SGPLOT procedure and GTL. You can use a new option, “DISCRETETHICKNESS=fraction”, to make the reference line thickness a fraction of the midpoint spacing, as shown below.

title 'Average City Mileage by Type';
proc sgplot data=cars noborder;
  refline highlight / axis=x discretethickness=0.9
          lineattrs=(color=gold) transparency=0.4;
  vbar type / response=mean dataskin=pressed fillattrs=graphdata3 barwidth=0.7;
  yaxis offsetmin=0.0 display=(noline noticks) grid;
  xaxis display=(nolabel);
run;

Here, DISCRETETHICKNESS=0.9, so the reference line is 90% of the midpoint spacing, regardless of the pixel width of that spacing. Also, the axis offsets are not adversely impacted. So, this is a sneak preview of a new option coming with SAS 9.40M5 that will make such customization easier and more scalable.
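The same option should carry over to the earlier box plot. The following is only a sketch, assuming a release that supports DISCRETETHICKNESS (SAS 9.40M5 or later); the hard-coded thickness and the explicit axis offsets from the first example are dropped, on the assumption that they are no longer needed.

```sas
/* Sketch: box plot highlight sized relative to the midpoint spacing. */
/* Assumes SAS 9.40M5 or later, where DISCRETETHICKNESS is available. */
title 'Cholesterol by Death Cause';
proc sgplot data=sashelp.heart noborder;
  refline 'Coronary Heart Disease' / axis=x discretethickness=0.9
          lineattrs=(color=yellow) transparency=0.6;
  vbox cholesterol / category=deathcause;
  yaxis display=(noline noticks) grid;
  xaxis display=(nolabel);
run;
```

Because the band is defined as a fraction of the midpoint spacing, it should keep its proportion if the graph is resized or the number of categories changes.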

Full SAS 9.40M3 SGPLOT code: Highlight

%let gpath=C:\;
ods html close;
%let w=5in;
%let h=3in;
%let dpi=200;
ods listing gpath="&gpath" image_dpi=&dpi;

/*--Highlighted box plot--*/
ods graphics / reset width=&w height=&h imagename='BoxHighlight';
title 'Cholesterol by Death Cause';
proc sgplot data=sashelp.heart noborder;
refline ‘Coronary Heart Disease’ / axis=x
lineattrs=(thickness=70 color=yellow) transparency=0.6 ;
vbox cholesterol / category=deathcause;
yaxis offsetmin=0.05 offsetmax=0.05 display=(noline noticks) grid;
xaxis offsetmin=0.1 offsetmax=0.1 display=(nolabel);
run;

/*--Compute means--*/
proc means data=sashelp.cars noprint;
class type;
var mpg_city;
output out=cars(where=(_type_=1))
Mean=mean;
run;
data cars;
set cars(where=(type ne 'Hybrid'));
if mean > 20 then highlight=type;
run;
/*proc print; run;*/

/*--Highlighted BarChart--*/
ods graphics / reset width=&w height=&h imagename='BarHighlight';
title 'Average City Mileage by Type';
proc sgplot data=cars noborder;
refline highlight / axis=x
lineattrs=(thickness=100 color=gold) transparency=0.4 ;
vbar type / response=mean dataskin=pressed fillattrs=graphdata2 barwidth=0.7;
yaxis offsetmin=0.0 offsetmax=0.05 display=(noline noticks) grid;
xaxis offsetmin=0.1 offsetmax=0.1 display=(nolabel);
run;

title;
footnote;
