What they don’t teach you about data science when you do a data science degree
In a couple of weeks’ time, I’ll attend my Zoom graduation and I’ll officially receive my data science degree, MSc with Distinction! As a social scientist and management scholar through and through, I worked a lot to develop my programming, statistics, and computer science knowledge. I started the degree terrified by matrix multiplication, and ended it being able to explain Stochastic Gradient Descent while watching videos of rotating tesseracts. I learned a lot.
But as a social scientist and management scholar, I was also able to see all this knowledge from a different perspective than your usual computer science or statistics major. I was constantly thinking about how all that I was learning would apply in the business context. As it turned out, I wasn’t the only one. In one particularly tough lecture, back before the pandemic, someone at the back of the lecture room raised their hand and asked: “…but how can I use this to help my company?” This, we were told, we’d have to learn on the job.
As I reflect on everything I now know about data science, I realize how much richer these skills are in the business context if someone also happens to know about management. Data science graduates who come from computer science, math or statistics backgrounds are great at all things big data and analytics, but there are some crucial elements of their jobs they’re not usually taught as part of their degrees.
How to understand business problems?
Every data science project starts from a business problem. Someone in the company identifies an issue, and thinks data science can help but doesn’t know how. This person, usually from the business, then meets with data scientists and assumes the problem, once identified, is as obvious to them as it is to management. Data scientists with limited business backgrounds and management knowledge will find it rather difficult to understand what the companies they work for struggle with, and how they can help. This can result in a lot of frustration, especially when, after listening to an embattled senior executive explain that the company is losing too many customers this quarter, the data scientist shrugs and says, “Well, spend more money on marketing and just get new customers instead.” Data scientists are not, by default, taught about businesses and their problems.
How to turn business problems into analytics questions?
Even if a business problem is well explained and the data scientist understands its importance, relevance, ramifications and context, there’s still a long way from this point to training a predictive model that produces promising answers. The first hurdle is to turn a business problem into a question that can be answered with data and analytics. Many new data scientists will see business problems through the analytical frames they were trained to work with, rather than trying to start from the problem and turn it into a series of potential questions that they can test-drive with the business. Even the most accurate predictive model will be useless if it predicts a variable that’s not really related to the problem.
How to give meaning to results?
Data scientists tend to be told that their job is done when the model is accurate, the data flows, and the predictions are happily generated. They’re perhaps even led to believe that all management cares about is the output of the model and how well it does on the test dataset. Well, I’m sorry to break it to all the aspiring data scientists out there. Management is interested in how the results of your work help solve the problem the company is facing. Unfortunately, they often can’t figure it out by themselves just by looking at your model. And at the same time, data scientists aren’t trained to package their work in a way that shows this link between the output of the model and the problem.
How to work with others in the company?
Most future data scientists I met when doing my degree were computer science, statistics or math majors. Nothing in what we studied prepared us for the reality of data science in workplaces, where data scientists often have to interact and work with people from other, very different professions. A data scientist will have to speak jargon-free to an HR professional to understand how some data points were collected. Or she will have to get back to that executive with clear, to-the-point questions about the problem. Most likely, you won’t be taught how to do this as part of your degree.
How to communicate findings effectively?
In one particularly interesting assignment, I had to conduct a Principal Component Analysis on a complex dataset and then report on my work. I responded as a good data scientist would – with lots of very busy diagrams, more numbers, and even more acronyms. This got me a very good grade. But it wouldn’t get me far in the business context. Data scientists aren’t routinely trained to communicate about their work in a way that can be well understood by those who are most interested: their stakeholders. This can be unpleasant for both sides. Data scientists end up disappointed that their hard work isn’t getting traction, and the business is disappointed that it didn’t get the insights it needed.
As I register for my virtual graduation, I want future data scientists to know what they most likely won’t be taught during their data science degree so that they’re aware of it and can act in advance. It turns out the five areas I talk about here are extremely important parts of data science at work. Don’t wait to learn these skills on the job. You’re much more likely to be perceived as a valuable asset and a good potential hire if you develop them in advance. How? Choose a data science degree with a business focus, or one that’s offered at a business school, or prepare your own learning plan.