Consulting Case Study: Job Market Analysis

October 19, 2021

Executive Summary

WeCloudData is one of the fastest-growing Data & AI training companies in the world. Since 2016, WeCloudData has trained thousands of students and clients, helping them level up their data skills and mature their data organizations. Understanding the job market is a central business need for many organizations, and especially for HR departments and recruiters. By combining data engineering techniques with a cloud toolchain, WeCloudData helped a client establish a continuous flow of current job market data, with the analytical capabilities and dashboards needed to drive the business forward and stay competitive.


The client required current and historical job data to:

  1. Address stakeholder questions and communicate job market trends
  2. Help the leadership team make data-driven business decisions
  3. Match employees & clients with the most relevant jobs

Given these business needs, manual ad hoc searches on job boards once a week or once a month were not enough. Furthermore, publicly available job boards do not let users combine and aggregate their data into custom graphs or dashboards. The client needed to build its own internal data pipeline, flexible enough to meet the business requirements for a job market analysis platform & dashboard.


To meet these requirements, WeCloudData helped the client use a suite of cloud platforms & tools to build a data pipeline in multiple stages:

  1. Ingest job data from multiple sources and store the raw data in a cloud data lake
  2. Process the raw data with Python & Spark
  3. Load the intermediate and final data sets into a data lake, a Postgres database and Redshift
  4. Push the data to downstream analytics, BI and dashboard applications
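
The first three stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the client's implementation: a local folder stands in for the cloud data lake, plain Python stands in for Spark, and SQLite stands in for Postgres and Redshift. All function names, field names and sample records are hypothetical.

```python
import json
import sqlite3
import tempfile
from datetime import date
from pathlib import Path


def ingest(records, source, lake_dir, run_date):
    """Stage 1: write raw records, unmodified, to a date-partitioned folder."""
    out_dir = Path(lake_dir) / "raw" / f"source={source}" / f"date={run_date.isoformat()}"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "jobs.json"
    out_path.write_text(json.dumps(records))
    return out_path


def process(raw_paths):
    """Stage 2: normalize fields and drop duplicate postings across sources."""
    seen, cleaned = set(), []
    for path in raw_paths:
        for rec in json.loads(Path(path).read_text()):
            title = rec.get("title", "").strip().lower()
            company = rec.get("company", "").strip().lower()
            key = (title, company)
            if not title or key in seen:
                continue  # skip empty titles and repeated postings
            seen.add(key)
            cleaned.append({"title": title, "company": company})
    return cleaned


def load(rows, conn, run_date):
    """Stage 3: load cleaned rows into a relational table, tagged by snapshot date."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs (snapshot_date TEXT, title TEXT, company TEXT)"
    )
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?)",
        [(run_date.isoformat(), r["title"], r["company"]) for r in rows],
    )
    conn.commit()


def run_pipeline(lake_dir, conn, run_date):
    # Hypothetical sample payloads standing in for two job sources.
    board_a = [{"title": "Data Engineer ", "company": "Acme"}]
    board_b = [
        {"title": "data engineer", "company": "ACME"},
        {"title": "Data Analyst", "company": "Globex"},
    ]
    paths = [
        ingest(board_a, "board_a", lake_dir, run_date),
        ingest(board_b, "board_b", lake_dir, run_date),
    ]
    rows = process(paths)
    load(rows, conn, run_date)
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as lake:
        connection = sqlite3.connect(":memory:")
        print(run_pipeline(lake, connection, date.today()))
```

Keeping the raw files untouched in stage 1 means the processing step can be rerun or changed later without going back to the sources. The fourth stage is not shown here, since it depends on the BI tool each team picks.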


This architecture is flexible enough to ingest from a variety of data sources and allow different business units to use the analytics, BI or dashboard tool of their choice to pull, aggregate and query the jobs data daily or query historical snapshots of the entire database.
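
As a rough illustration of the two access patterns mentioned here (a daily view and historical snapshots), the sketch below assumes a hypothetical `jobs` table in which every row carries a `snapshot_date`. SQLite is used only so the example runs anywhere; the table layout, column names and sample rows are assumptions rather than the client's schema.

```python
import sqlite3


def latest_snapshot_counts(conn):
    """Daily view: postings per title in the most recent snapshot."""
    return conn.execute(
        """
        SELECT title, COUNT(*) FROM jobs
        WHERE snapshot_date = (SELECT MAX(snapshot_date) FROM jobs)
        GROUP BY title
        ORDER BY title
        """
    ).fetchall()


def counts_over_time(conn, title):
    """Historical view: posting count per snapshot date for one title."""
    return conn.execute(
        """
        SELECT snapshot_date, COUNT(*) FROM jobs
        WHERE title = ?
        GROUP BY snapshot_date
        ORDER BY snapshot_date
        """,
        (title,),
    ).fetchall()


def demo_connection():
    """Build an in-memory table with hypothetical rows for two snapshot dates."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (snapshot_date TEXT, title TEXT, company TEXT)")
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?)",
        [
            ("2021-10-18", "data engineer", "acme"),
            ("2021-10-19", "data engineer", "acme"),
            ("2021-10-19", "data engineer", "globex"),
            ("2021-10-19", "data analyst", "globex"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(latest_snapshot_counts(conn))
    print(counts_over_time(conn, "data engineer"))
```

Because both queries are plain SQL over one table, any analytics or dashboard tool that can connect to the database could issue the same kind of query.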


WeCloudData helped the client build a flexible data pipeline that serves multiple business units, each requiring different sets, views and timelines of job market data. The team achieved this by combining cloud and open source tools in a modular setup, taking advantage of relatively cheap cloud storage, the versatility of Python and Spark's powerful processing engine. The client intends to build on and improve this data pipeline by moving towards a more serverless architecture and adding DevOps tools & workflows.

by Beam Data