SAS Learning Post

# Are more women getting STEM degrees?

For the past several years, efforts have been under way to recruit more women into the STEM (science, technology, engineering, and math) fields. I recently saw an interesting graph showing the percentage of bachelor’s degrees conferred to women in the US, and I wondered if I could tweak that graph a bit to focus on the STEM majors.

But before we get started on the technical graphs, here’s a fun ‘graduation’ photo… This is a picture of my friend Margie, graduating from Hurricanes U. She’s a huge Carolina Hurricanes hockey fan, and has earned the nickname “Clever Sign Chick” for the signs she holds up during the games. So, of course, when the Hurricanes offered a special training camp, she was head of the class!

And now, let’s get technical … Here’s a snapshot of Randy Olson’s original graph. It’s difficult to show so many lines on a graph without it looking like spaghetti, but I think he did a pretty good job. In particular, I like how each line is labeled, and the colors of the labels match the colors of the lines (eliminating the need for a color legend).

Randy showed the data source URL in a footnote, but it still took a bit of digging to determine exactly which tables on the NCES (National Center for Education Statistics) website he had used. I finally determined that the data was in the ‘325.x’ tables, so I downloaded the Excel spreadsheets and wrote some SAS code to import them (17 separate tables, each with the data in a slightly different range of cells). The basic plot was easy. The tricky part was annotating the text labels for each line, making the label colors match the line colors, and adjusting the y-position of the labels so they didn’t overlap.
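For readers curious about the label-overlap step, here is a minimal sketch of one way to do it. This is not the SAS code I used (that isn’t shown in this post), and it is written in Python purely for illustration. The function name `spread_labels`, the `min_gap` parameter, and the sample values are all hypothetical. The idea is to sort the labels by the y-value where each line ends, then push a label upward whenever it sits too close to the one below it.

```python
def spread_labels(y_values, min_gap):
    """Return label y-positions, in the original order, nudged upward
    so that no two labels are closer together than min_gap."""
    # Visit the labels from the lowest y-value to the highest.
    order = sorted(range(len(y_values)), key=lambda i: y_values[i])
    adjusted = list(y_values)
    previous = None
    for i in order:
        # If this label would crowd the one below it, move it up.
        if previous is not None and adjusted[i] - previous < min_gap:
            adjusted[i] = previous + min_gap
        previous = adjusted[i]
    return adjusted


# Made-up end-of-line values (percent), not actual NCES data.
end_values = [10.0, 11.0, 50.0, 10.5]
label_positions = spread_labels(end_values, min_gap=3.0)
print(label_positions)  # [10.0, 16.0, 50.0, 13.0]
```

This greedy approach only ever moves labels upward, so a crowded cluster near the top of the axis could run past the axis limit. A fuller version might spread labels in both directions or clamp them to the axis range.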

And here’s my new version of the plot:

Here’s a list of the changes I made:

• I wanted to emphasize the STEM fields, so I used darker colors for those lines and lighter colors for the other lines.
• To make the STEM lines even easier to identify, I also added plot markers to those lines.
• I made my axis go to 100%, rather than just 90%.
• I made the 50% reference line darker and bolder, since that’s the important balance point.
• I mentioned in the footnote exactly which set of NCES tables the data came from.
• I used the full descriptive text from the tables as the line labels, rather than shortening the text (the shortened names in the original graph could be a little misleading – for example, I used “Public Administration and Social Services” rather than just “Public Administration”).
• And I got the latest data, so I was able to extend the graph out to 2015 (instead of just 2010).

Were there any surprises or insights you were able to derive from this graph? The biggest surprise for me was the “Computer and Information Sciences” line: I had assumed that the percentage of women in this field had been increasing, but apparently it peaked in the mid-1980s and has since dropped to roughly half of that peak.