In simple terms, a data scientist’s job is to analyze data for actionable insights.
Specific tasks include:
- Identifying the data-analytics problems that offer the greatest opportunities to the organization
- Determining the correct data sets and variables
- Collecting large sets of structured and unstructured data from disparate sources
- Cleaning and validating the data to ensure accuracy, completeness, and uniformity
- Devising and applying models and algorithms to mine the stores of big data
- Analyzing the data to identify patterns and trends
- Interpreting the data to discover solutions and opportunities
- Communicating findings to stakeholders using visualization and other means
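To make the cleaning, validating, and analyzing steps concrete, here is a minimal Python sketch. The records, field names, and the helper functions `clean` and `average_sales_by_region` are all hypothetical, invented for illustration; real projects typically work with far larger data sets and dedicated libraries.

```python
from statistics import mean

# Hypothetical raw records, as they might arrive from two different sources.
raw_records = [
    {"region": " North ", "month": "2023-01", "sales": "1200"},
    {"region": "north", "month": "2023-02", "sales": "1350"},
    {"region": "South", "month": "2023-01", "sales": ""},      # missing value
    {"region": "south", "month": "2023-02", "sales": "980"},
    {"region": "North", "month": "2023-03", "sales": "n/a"},   # invalid value
    {"region": "South", "month": "2023-03", "sales": "1040"},
]


def clean(records):
    """Normalize region names and set aside rows whose sales value is not a number."""
    cleaned, rejected = [], []
    for row in records:
        region = row["region"].strip().title()  # " North " and "north" both become "North"
        try:
            sales = float(row["sales"])
        except ValueError:
            rejected.append(row)  # keep rejected rows so they can be reviewed later
            continue
        cleaned.append({"region": region, "month": row["month"], "sales": sales})
    return cleaned, rejected


def average_sales_by_region(records):
    """Group cleaned rows by region and average their sales."""
    by_region = {}
    for row in records:
        by_region.setdefault(row["region"], []).append(row["sales"])
    return {region: mean(values) for region, values in by_region.items()}


cleaned, rejected = clean(raw_records)
averages = average_sales_by_region(cleaned)

print(f"kept {len(cleaned)} rows, rejected {len(rejected)}")
for region, avg in sorted(averages.items()):
    print(f"{region}: {avg:.1f}")
```

Setting rejected rows aside rather than silently discarding them is one reasonable choice here: it lets you check whether the missing values point to a problem in how the data was collected.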
In the book Doing Data Science, the authors describe the data scientist’s duties this way:
“More generally, a data scientist is someone who knows how to extract meaning from and interpret data, which requires both tools and methods from statistics and machine learning, as well as being human. She spends a lot of time in the process of collecting, cleaning, and munging data, because data is never clean. This process requires persistence, statistics, and software engineering skills—skills that are also necessary for understanding biases in the data, and for debugging logging output from code.
Once she gets the data into shape, a crucial part is exploratory data analysis, which combines visualization and data sense. She’ll find patterns, build models, and algorithms—some with the intention of understanding product usage and the overall health of the product, and others to serve as prototypes that ultimately get baked back into the product. She may design experiments, and she is a critical part of data-driven decision making. She’ll communicate with team members, engineers, and leadership in clear language and with data visualizations so that even if her colleagues are not immersed in the data themselves, they will understand the implications.”
Source: O’Neil, C., and Schutt, R. Doing Data Science. First edition.
Would you make a good data scientist?
To find out, ask yourself: Do you . . .
- hold a degree in mathematics, statistics, computer science, management information systems, or marketing?
- have substantial work experience in any of these areas?
- have an interest in data collection and analysis?
- enjoy individualized work and problem solving?
- communicate well both verbally and visually?
- want to broaden your skills and take on new challenges?
If you answered yes to any of these questions, you may find a lot to like in the field of data science.
Data scientists need knowledge of math or statistics. A natural curiosity is also important, as is creative and critical thinking. What can you do with all the data? What undiscovered opportunities lie hidden within? To realize the data’s full potential, you need a knack for connecting the dots and a desire to search out answers to questions that have not yet been asked.
Data scientists are also highly educated. According to industry resource KDnuggets, 88 percent of data scientists have at least a master’s degree and 46 percent have PhDs.
You also need some background in computer programming so you can devise the models and algorithms necessary to mine the stores of big data. Python and R are two of the premier programming environments for data science.
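As a rough illustration of what a very simple model looks like in Python, the sketch below fits a straight line to a handful of data points using ordinary least squares and uses it to project the next value. The monthly figures and the `fit_line` helper are hypothetical, and practicing data scientists would usually rely on an established statistics or machine-learning library rather than writing this by hand.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: site visits (in thousands) over five months.
months = [1, 2, 3, 4, 5]
visits = [110, 125, 138, 155, 170]

slope, intercept = fit_line(months, visits)
forecast = slope * 6 + intercept  # projected visits for month 6

print(f"trend: about {slope:.1f} thousand more visits per month")
print(f"month 6 forecast: about {forecast:.1f} thousand visits")
```

A straight line is the simplest possible model. It is useful for spotting a trend, but it assumes the trend continues unchanged, which real data often does not.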
You must be something of an entrepreneur, with a head for business strategy. Although you may work with other data specialists or even with an interdisciplinary team of professionals, you will not succeed unless you can devise your own methods and build your own infrastructure to slice and dice the data in ways that lead to new discoveries and new visions for the future.
You must also be able to communicate complex ideas to your nontechnical stakeholders in a way they can easily understand. Data-science software tools can help you visualize your findings, but you will also need the verbal communication skills to tell the story clearly.