Many people who chose to study statistics, like myself, have heard many terms for the roles they might pursue after their academic career. You might hear “Data Analyst”, “Data Engineer”, “Statistician”, and “Data Scientist”, just to name a few. “Data Scientist” has become one of the more popular titles that companies use to describe someone who analyzes data, even when the description of the role does not match it. So what does this term really mean, and how is it different from “Statistician”, which is often the role companies are actually hiring for?

By definition, a Data Scientist is someone who interprets data to help organizations draw insights that they can apply to their business decisions. Some of their duties include collecting and cleaning data, integrating and storing data, preparing data for transfer, and using machine learning models. Most Data Scientists have a background in math/statistics, computer science, engineering, business/economics, and/or the natural sciences. Unlike the other roles listed in the previous paragraph, being a Data Scientist is more than just analyzing data; it is painting a picture using raw numbers and text so your stakeholders can make decisions without needing the strong math/statistics background that you have acquired.

So how is that different from a “Statistician”? For starters, a Statistician is someone who uses mathematical and statistical methods to collect and interpret data to solve problems. Some might say that a Data Scientist does everything a Statistician does, but has added responsibilities that require more technical skills. A Statistician has similar duties to a Data Scientist, but usually does the prep work for what a Data Scientist does. For example, a Statistician might collect the data through surveys and experiments for someone else to model, while the Data Scientist is the one actually modeling it. A Statistician's data might also be easier to analyze, since it tends to be smaller, cleaner, and structured, whereas a Data Scientist would have a large, messy, unstructured data set. All in all, a Data Scientist will usually be someone whose technical skills are a little more advanced and whose mathematics and statistics background is at least on par with a Statistician's.

So when companies loosely throw out the term “Data Scientist”, ask yourself whether that is really what you would be doing, or whether the work is more in line with what a Statistician does. Are you doing the prep work of a Statistician, or are you building the machine learning models that a Data Scientist would build? Are you only dealing with smaller, structured data sets like a Statistician, or large, unstructured ones like a Data Scientist? Also think about what you want to do. If taking on the tasks of a Data Scientist right out of school sounds scary, or you want more experience first, start with a role as a Statistician, and take the leap as you gain confidence and those technical skills.

As the author of this blog, I can confidently say that if I had pursued a job right after my undergraduate career, I would have fallen into the line of work of a Statistician. The purpose of my Master’s education was to gain more advanced technical skills and a deeper statistics background so that I could assume the role of a Data Scientist. Up until my last role, I was given smaller data sets on which it was not overly hard to implement the basic models I learned in my undergraduate education. It wasn’t until this past summer that I was given extremely large, integrated, unstructured data that I had to find ways to model. I had to use existing machine learning techniques, along with exploring model creation, to optimize a result for a business question/problem the company wanted solved. This experience was the first time I felt the jump from Statistician to Data Scientist. For once, I was painting a picture using words and numbers. And while there was a learning curve, I enjoyed the challenge, and I want to keep improving my skills until I graduate, hopefully in May 2024, so that I too can be part of one of the fastest-growing fields in the United States as a Data Scientist.

