Data Science – The Boring Definition
Let’s take a look at the textbook definition of Data Science. It reads something like this:
Data science is an interdisciplinary field focused on extracting meaningful information from large sets of data. It combines the scientific method, math and statistics, programming, advanced analytics, AI, and even storytelling to uncover and explain the business insights buried in data.
It’s well said, but this definition is somewhat abstract. Walking through a couple of use cases will help clear up some of the doubts you may have. We will introduce some real-life use cases later on.
Simply put, data science is the process of discovering insights from data and using them to make changes for the better.
- In a business setting, data science is the process of collecting, preparing, analyzing, and mining business data to help the business make better decisions or build better data-driven products, resulting in better business outcomes.
- In social science and the public sector, data science is applied to many kinds of data, and the analytics results are used to benefit society. Examples include criminal justice, education, economic and workforce development, energy, environment, public health, transportation and infrastructure, and public safety.
Data science means different things to companies at different stages of the data science maturity curve.
- In startups where data is not the core product, data science probably means applying data analytics to keep the business teams informed and support decision making.
- For consumer-facing businesses that have established data teams, data science is usually applied as data-driven processes to optimize key business metrics such as revenue, daily active users, and retention.
- For companies that consider data a core business strength, data science may be applied in product design, where it directly impacts the user experience and can lead to company growth.
Regardless of company size, maturity stage, or use case, data scientists seem to be doing pretty high-impact projects!
Data Science Lifecycle
Data science is more than just data analysis. A typical data science project moves through several stages. A general lifecycle may include:
- Business Understanding
- Data Acquisition and Understanding
- Data Analysis and ML Modeling
- Model Validation and Interpretation
- Model Deployment
- Monitoring & Refinement
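The stages listed above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the data is synthetic, the "model" is a one-feature least-squares line standing in for whatever analysis or ML technique a real project would use, and every function name (`acquire_data`, `fit_line`, `needs_refinement`, and so on) is hypothetical rather than taken from any particular library.

```python
import random
import statistics

# Business understanding (assumed question): can ad spend predict revenue?

def acquire_data(n=200, seed=0):
    """Data acquisition: synthetic (ad_spend, revenue) pairs, for illustration only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        spend = rng.uniform(0, 100)
        revenue = 3.0 * spend + 20.0 + rng.gauss(0, 5)
        rows.append((spend, revenue))
    return rows

def split(rows, train_fraction=0.8):
    """Hold back part of the data for validation."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def fit_line(rows):
    """Modeling: ordinary least squares with a single feature."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Deployment: in this sketch, the 'deployed' model is just a function call."""
    slope, intercept = model
    return slope * x + intercept

def mean_absolute_error(model, rows):
    """Validation: average absolute gap between prediction and actual value."""
    return statistics.fmean([abs(predict(model, x) - y) for x, y in rows])

def needs_refinement(model, new_rows, baseline_mae, tolerance=2.0):
    """Monitoring: flag the model when error on fresh data exceeds
    `tolerance` times the error measured at validation time."""
    return mean_absolute_error(model, new_rows) > tolerance * baseline_mae

rows = acquire_data()
train, holdout = split(rows)
model = fit_line(train)
holdout_mae = mean_absolute_error(model, holdout)

# Simulated drift: the relationship between spend and revenue has changed.
drifted = [(float(x), 1.0 * x + 20.0) for x in range(0, 100, 10)]
print("holdout MAE:", round(holdout_mae, 2))
print("refine after drift?", needs_refinement(model, drifted, holdout_mae))
```

In a real project each function would be replaced by something far more involved (data pipelines, feature engineering, a proper model registry, production monitoring), but the order of the steps and the feedback loop from monitoring back to modeling stay the same.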
In layman’s terms, here is one way to understand the data science lifecycle:
- The business faces challenges and raises some assumptions
- The data science team tries to understand the business challenges and figures out what type of data to use for analysis
- Data scientists apply their analytics magic to come up with data solutions
- The results are explained to the business team, which uses the data science team’s work to improve the business product or process
- After business user acceptance, the data science insights get turned into a product that requires care and maintenance in production (monitoring and data ops)
- Data scientists analyze the feedback data and keep iterating on their magic tricks to refine the entire process
In practice, a data scientist doesn’t always get to work on the entire lifecycle. Depending on the data science team and its maturity, data scientists may focus on specific stages of the lifecycle. For example:
- Some data scientists might focus on machine learning and advanced analytics.
- Some data scientists might focus more on the ML engineering side.
- Some citizen data scientists may spend more time on data exploration, visualization, and business communication.
Data Science Use Cases
Data Science has so many use cases. That’s one of the reasons why it’s such an appealing career for many. Let’s take a look at some of the real industry projects that WeCloudData students have worked on in the past.
Data Science in Healthcare
- Our students helped medical doctors apply machine learning to a small sample of patient data to predict heart failure. The key research was included in a publication.
- In another project, our students and project managers helped a healthcare startup build a knowledge graph that’s used as a backend tool to power knowledge search in an application built for nurses.
- Our students also helped another healthcare startup collect web data through scraping and build the company’s first visualization dashboard.
Data Science in VR/AR
- Our students worked with AR game player engagement data and GPS data to help the client build an interactive dashboard for AR player engagement insights
Data Science in Digital & Media
- Our students helped a digital & media company build audience look-alike models using big data tools such as Snowflake and Apache Spark
Data Science in Accounting & Productivity
- Our students helped a Receipt Management app startup classify and categorize scanned receipt images using machine learning and NLP
Data Science in Personalization
- WeCloudData students helped a Media & Publishing company build personalized recommendation engines to help improve user engagement and retention.
- Our students helped a consumer electronics company’s marketing team build customer and store segmentation models and also created predictive churn models for better user retention management.
Data Science in Supply Chain
- Our students helped several supply chain clients improve their time series forecasting models using deep learning techniques, resulting in better inventory management and revenue forecasting.
Data Science in Sports & Entertainment
- Our students helped a sports analytics startup build company-wide and client-facing visualization dashboards to help analyze and monitor its AI sports game prediction models.
Myths about Data Science
There are some common misunderstandings about data science and data scientists.
- Data scientist is the sexiest job of the 21st century
  - Fact: data science is roughly 60% data wrangling, 20% modeling, and 20% reporting
- Every problem can be solved by data science and big data
  - Fact: we don’t need a machine learning model for everything; business intuition works most of the time
- Data scientists only focus on machine learning and let the data engineers worry about the rest
  - Fact: data scientists need to know basic data engineering skills, because integration is important
- Machine learning is the most important part of the data science lifecycle
We hope this article helped you understand the data science lifecycle and its use cases. Read on to learn more about a career in data science and how WeCloudData can support you on your journey into the field.