The role of data scientist grew alongside the internet, as did a few related engineering jobs. Data engineers and data analysts work with data scientists to complete the whole “big-data” picture. Together they determine the data platform requirements, choose the basic and advanced algorithms, and deliver the visual tools needed to analyze the data and present it, along with the value it creates, to other departments in an easily understood and insightful format.
Who is a data scientist?
Typically a data scientist has an advanced degree in mathematics or physics. A PhD isn’t uncommon and a master’s degree is a prerequisite. Data scientists have a strong understanding of statistical modeling and of how to build and customize advanced mathematical algorithms. This is both within their comfort zone and where they excel. I heard one person describe a data scientist as “a rock star statistician with above average software engineering skills.” Yet when you ask data scientists how they came into their profession, the paths are varied. This is a relatively new position, so we don’t have 20 years of history to track how a data scientist progresses. There is also a bit of overlap between the roles of data scientist and data engineer.
Besides working on advanced algorithms, a data scientist is hands-on with A/B testing and has advanced knowledge of multivariate testing and the design of experiments. A strong data scientist can customize and change models after they are built, and the especially capable ones can tailor those models to your business problem.
Who is a data engineer?
A data engineer is “a rock star software engineer with a decent understanding of statistics.” If you have a business problem, you need a data engineer. These are the folks who provide the platform upon which the data can be modeled. Their core value lies in their ability to prepare the data pipeline with clean data. A strong understanding of file systems, distributed computing and databases completes the skill set needed to be a good data engineer.
Data engineers have a decent understanding of algorithms, so they should be able to run basic models. With more advanced business needs comes a requirement for more sophisticated algorithms. Many times these needs outstrip the knowledge of the data engineer, and that’s when you need to call in a data scientist.
Who is a data analyst?
Data analysts understand the business side of the equation. They know how to ask the right questions and are really good at data analysis, visualization and the presentation of data. Whether presenting to another data analyst or a C-level executive, data analysts are the superstars of data extraction, pattern recognition and finding the insight within reams of data.
If you or your company is thinking about jumping onto the big data bandwagon, start by defining the business problem that you want to solve with big data. Then figure out what you need: Is it data capture, retrieval, warehousing or analytics? Then write the job description accordingly, and be prepared: you may need more than one person to truly play the big data game well.