This post is adapted from the blog of General Assembly, a Priceonomics Data Studio customer. Does your company have interesting data? Become a Priceonomics customer.

***

How diverse will a lucrative, growing field like data science be in the future?

Will it end up like computer science today (not very diverse) or computer science a few decades ago (much more so)? One way to forecast the future demographic composition of data science is to look at who is studying data science and its prerequisite skills today. For data science, the results are not encouraging.

A recent article in Forbes notes, “Women hold only about 26% of data jobs in the United States. There are a few proposed reasons for the gender gap: a lack of STEM education for women early on in life, lack of mentorship for women in data science, and human resources rules and regulations not catching up to gender balance policies, to name a few.” 

Moreover, federal civil rights data demonstrate that “black and Latino high school students are being shortchanged in their access to high-level math and science courses that could prepare them for college” and for careers in fields like data science.

Just how diverse is data science? More specifically, if we look at the study of data science as a predictor of future participation in the field, what is the gender and demographic breakdown of its students compared to other fields?

We analyzed data from Priceonomics customer General Assembly, an education company that trains students in data science and other technical fields. Looking at its part-time programs (which typically reach students who already have jobs and are looking to expand their skill set as they pursue a promotion or a career shift), here’s what we found:

While great strides toward gender parity have been made in fields like web development and user experience (UX) design, data science, a relatively newer concentration, still has a ways to go.

Of all the technical education fields we studied, data science had the lowest representation of female students, at just 35.3%.

Additionally, among these same technical fields, data science had the lowest percentage of African American and Latino/Hispanic students enrolled.

Gender and Data Science

For our analysis, we went through five months’ worth (September 2016 through January 2017) of anonymized enrollment data for part-time General Assembly students (those enrolled in 10- to 12-week evening courses). We chose to focus on part-time data (rather than the full-time program) because the sample size was bigger, though the results would be similar.

First, let’s take a look specifically at the gender breakdown of students in these courses.

[Chart: gender breakdown of part-time students by course]

Data source: General Assembly

On average, part-time courses skew more female (56.5%) than male (42.3%).

Some courses, like Product Management and Data Analytics, seem to come close to gender parity. Front-End Web Development falls right around the average across all courses, and in Digital Marketing and User Experience Design, both more consumer-facing fields, two-thirds or more of students are women.

But the Data Science course shows the largest composition of male students and the lowest of female students, at just 35.3%.

Race and Ethnicity in Data Science

Turning to the same anonymized data set, let’s now look at race and ethnicity.

[Table: race and ethnicity of part-time students by course]

Data source: General Assembly

At first glance, it appears that Data Science courses fare pretty well in diversity: The percentage of enrolled students who are white (46.1%) is less than average (46.9%).

But looking specifically at Hispanic/Latino and African-American students, the course has, by far, the lowest total percentage of students.

[Chart: share of Hispanic/Latino and African-American students by course]

Data source: General Assembly

To put this data in context, the population of the United States is 62% white, 17% Hispanic or Latino, 12% African-American, and 6% Asian/Pacific Islander.

Just 11.8% of part-time Data Science enrollees are Hispanic/Latino or African-American. That’s 5.7 percentage points below the overall average, and nearly half of the figure in the Front-End Web Development courses.
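For readers who want to check the arithmetic, here is a minimal Python sketch that uses only the figures quoted above. It assumes the 5.7 gap is measured in percentage points (which would put the overall average at roughly 17.5%) and that the two U.S. population shares can simply be added together; the variable names are our own and purely illustrative.

```python
# Figures quoted in the article: percent of part-time enrollees who are
# Hispanic/Latino or African-American.
data_science_share = 11.8
gap_points = 5.7  # assumption: the gap is in percentage points

# Overall average across all part-time courses implied by those two numbers.
overall_average = data_science_share + gap_points

# The same gap expressed relative to the average, as a percentage.
relative_shortfall = gap_points / overall_average * 100

# U.S. population shares quoted in the article (percent).
us_hispanic_latino = 17
us_african_american = 12
us_combined = us_hispanic_latino + us_african_american

# How the Data Science share compares with the combined U.S. share.
share_of_us_level = data_science_share / us_combined

print(f"Implied overall average: {overall_average:.1f}%")
print(f"Relative shortfall vs. average: {relative_shortfall:.1f}%")
print(f"Combined U.S. population share: {us_combined}%")
print(f"Data Science share as a fraction of U.S. share: {share_of_us_level:.2f}")
```

Separating the percentage-point gap from the relative shortfall matters here because the two can sound very different even though they describe the same difference.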

Education in Data Science

This data set also gives us insight into the highest level of education attained by part-time enrollees across these courses.

On average, Data Science students come in with the highest degree attainment.

[Chart: highest level of education attained by part-time students, by course]

Data source: General Assembly

Across all courses, 85.4% of these part-time students have a bachelor’s degree or higher; in Data Science, that figure is 93.8%. This seems to be driven largely by the fact that there are far more master’s and Ph.D. graduates in Data Science (37.7%) than the overall average (24.%). A surprisingly high 3.7% of Data Science students hold a Ph.D., more than triple the average of 1.2%.
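The degree comparison can be checked the same way. This is a small Python sketch built only from the percentages quoted above; the variable names are illustrative, and the overall master’s-and-Ph.D. average is left out because its value is incomplete in the text.

```python
# Figures quoted in the article (percent of part-time students).
phd_data_science = 3.7
phd_overall = 1.2
bachelors_plus_data_science = 93.8
bachelors_plus_overall = 85.4

# How many times more common a Ph.D. is among Data Science students.
phd_ratio = phd_data_science / phd_overall

# Gap in bachelor's-or-higher attainment, in percentage points.
bachelors_gap_points = bachelors_plus_data_science - bachelors_plus_overall

print(f"Ph.D. share ratio (Data Science vs. overall): {phd_ratio:.2f}x")
print(f"Bachelor's-or-higher gap: {bachelors_gap_points:.1f} points")
```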

Data Science seems to draw from a smaller, more specialized pool, which could, in part, perpetuate diversity issues.

Data Science Is Still New

Female and minority students have made positive strides in coding and tech education in this data set.

When coding and web development started getting increasingly popular two decades ago, the fields were almost entirely dominated by men, most of whom were white.

Looking at the data here, though, it’s clear things have changed dramatically: Front-End Web Development courses are now 57% female and boast the highest percentage of students of color of any course. Since data science is still a relatively new field, it is possible things may just take some time to equalize, but it’s entirely possible they won’t unless the issue is addressed directly.

***

Note: If you’re a company that wants to work with Priceonomics to turn your data into great stories, learn more about the Priceonomics Data Studio. Top image via Flickr user gdsteam.