Consulting Case Study: Topic Modelling on Technician Notes

Client Info: Our client is one of Canada’s largest construction vehicle suppliers. They employ thousands of skilled technicians across multiple provinces to support their clients and are known for their excellent customer service and quality of work. The technicians handle repair and maintenance services for their clients’ purchased vehicles. For each job that a technician […]

Consulting Case Study: E-commerce Customer Segmentation

Background: Our client manufactures consumer electronics such as mobile devices, printers, and computer monitors, and has led the electronic goods merchant wholesalers industry for many years. Their advanced data analytics team connected with WeCloudData for a machine learning solution to predict their top merchandise sales and marketing strategies on their […]

Consulting Case Study: Lookalike Models for Audience Expansion

Background: Our client is one of the largest news publishers in North America. With print and digital formats that reach millions of readers every week, they lead the national discussion by engaging audiences through their prestigious coverage of news, politics, business, investing and lifestyle topics, across multiple platforms. The WeCloudData team worked with the client’s […]

Consulting Case Study: Integrated AI Content Search

Executive Summary: WeCloudData is one of the fastest-growing Data & AI training companies in the world. Since 2016, WeCloudData has trained and helped thousands of students and clients level up their data skills and mature their data organizations. As organizations continue to undergo digital transformations all over the world, enterprises are experiencing pains that […]

Consulting Case Study: Recommender Systems

Client Info: Our client is one of Canada’s most well-established and decorated news outlets. They have received numerous journalism awards and reach millions of readers with their print and digital content across all news categories. In the early to mid-2010s, our client began to shift its focus towards […]

Streaming Stack Overflow Data Using Kinesis Firehose

The blog is posted by WeCloudData’s student Sneha Mehrin. An overview of how to ingest Stack Overflow data using Kinesis Firehose and Boto3 and store it in S3. This article is part of a series and a continuation of the previous post. Why use streaming data ingestion? Traditional enterprises follow a methodology of batch processing where you […]

Preprocessing Criteo Dataset for Prediction of Click Through Rate on Ads

The blog is posted by WeCloudData’s student Amany Abdelhalim. In this post, I will take you through the steps that I performed to preprocess the Criteo dataset. Some Aspects to Consider when Preprocessing the Data: The Criteo dataset is an online advertising dataset released by Criteo Labs. It contains feature values and click feedback […]

Let’s Read Customer Reviews (actually-make machines do it!)

The blog is posted by WeCloudData’s Big Data course student Udayan Maurya. Customer reviews are invaluable for understanding the gaps in your product-market fit. If you sell your products on e-platforms such as Amazon, eBay, App Store, Play Store, YouTube, etc., then you are in luck. You have direct access to your customers’ minds. However, to leverage customer’s […]

Live Twitter Sentiment Analysis

The blog is posted by WeCloudData’s Big Data course student Udayan Maurya. This Live Twitter Sentiment Analyzer helps track the current sentiment for a given track word. In this document, I will describe the workflow I followed to develop this SaaS app. Contents: Data Pipeline Map; Data Collection; Preparing Data for Data Analysis; Training the […]

An Introduction to Big Data & ML Pipeline in AWS

The blog is posted by WeCloudData’s Big Data course student Abhilash Mohapatra. This story presents an easy path for the items below in AWS: Build a Big Data Pipeline for both Static and Streaming Data. Process Data in Apache Hadoop using Hive. Load processed data to a Data Warehouse solution like Redshift and RDS like MySQL. […]

Kijiji House Price Analysis using Python

This is the first project that I have done for WeCloudData. The purpose of this project is to find the relationship between housing prices in Toronto (GTA) and location, house size, number of bedrooms and number of bathrooms. We start by scraping data from Kijiji through URL requests. Then we parse our data source […]

Predictive Churn Modeling Using Python

This blog is posted by WeCloudData’s Data Science Bootcamp student Austin Jung. Customer churn is a common business problem in many industries. Losing customers is costly for any business, so identifying unhappy customers early on gives you a chance to offer them incentives to stay. In this post, I am going to talk about machine […]

Credit Scoring Using Machine Learning

A credit score is a numeric expression measuring a person’s creditworthiness. Banks usually use it to support decision-making about credit applications. In this blog, I will talk about how to develop a standard scorecard with Python (Pandas, Sklearn), which is the most popular and simplest form for credit scoring, to measure […]

Fraud Analytics: ML Tutorial on Dealing with an Imbalanced Dataset

This blog is posted by WeCloudData’s Immersive Bootcamp student Anthony Chen. Fraud analytics presents a challenge that people may gloss over at first: the problem of the imbalanced dataset. How do we approach it? What angle should we start from? What kind of performance measures do we use? The goal of this article is […]

Introduction to Machine Learning In Healthcare

Machine learning applications in healthcare were a great hit with the NYC audience. At least 130 enthusiastic attendees joined the Bots and AI Meetup on December 10th, with the crowd extending far to the back of the room. Lucy He of Flatiron Health kicked off the night with an examination of machine learning’s impact in medical study cohort selection. […]