Visualizing New York City Taxi Data

October 28, 2019

[Student Project] Visualizing New York City Taxi Data

This blog post was created by WeCloudData’s Data Science Bootcamp alum Yaoyu Cui.

Please find the complete dashboard on https://goo.gl/gXGTEw

Tableau is one of the most popular visualization tools in the data science community. Besides its data preprocessing and programming capabilities, it also provides powerful mapping functionality. The task behind this blog post concerned one New York taxi company’s pickup data for 2014 and specified the use of Python, SQL tools, local weather data, and Tableau. To make it more interesting and to demonstrate Tableau’s mapping functionality, I found a Shapefile of New York City (link below).

https://www1.nyc.gov/site/planning/data-maps/open-data/dwn-nynta.page

This is what it looks like in Tableau:

The task data contains four months of pickup locations (latitude and longitude), Date/Time, and Base, and nothing else (see image below). A separate weather file was also provided, including date, temperature, humidity, wind speed, precipitation, etc.

Data Preprocessing:
Before setting the stage, we must ask: what links these data sets together, and what useful information can we get out of them? To answer these questions, the data was broken down into more pieces, like so:
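As a rough illustration of this kind of breakdown, the sketch below splits a Date/Time string into a date, a day-of-week ('week') value, and an hour using only the Python standard library. The sample rows, the Base values, the `break_down` helper, and the `%m/%d/%Y %H:%M:%S` text format are assumptions made for the example; the original post does not show its preprocessing code.

```python
from datetime import datetime

# Hypothetical sample rows; values and the Date/Time text format are assumptions.
raw_rows = [
    {"Date/Time": "4/1/2014 0:11:00", "Lat": 40.7690, "Lon": -73.9549, "Base": "B02512"},
    {"Date/Time": "4/4/2014 17:45:00", "Lat": 40.7267, "Lon": -74.0345, "Base": "B02512"},
]

def break_down(row, fmt="%m/%d/%Y %H:%M:%S"):
    """Return a copy of the row with date, day-of-week, and hour columns added."""
    ts = datetime.strptime(row["Date/Time"], fmt)
    return {
        **row,
        "date": ts.date().isoformat(),  # e.g. "2014-04-01", handy as a join key for weather
        "week": ts.strftime("%A"),      # day of the week, e.g. "Tuesday"
        "hour": ts.hour,                # 0-23
    }

rows = [break_down(r) for r in raw_rows]
for r in rows:
    print(r["date"], r["week"], r["hour"])
```

Keeping the date as its own column makes it straightforward to join each pickup to a daily weather record later.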

The ‘week’ column represents the day of the week. The raw latitude and longitude of each pickup were mapped to the name of the matching neighborhood (NTA) in the NYC shapefile, for later use in the map. This was done in Python with the GeoPandas package. The shapefile provided by NYC uses an uncommon Coordinate Reference System (CRS), and it took me quite a while to figure out the corresponding CRS code.
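To show the idea behind mapping a coordinate to a neighborhood, here is a minimal sketch that swaps in a plain-Python ray-casting point-in-polygon test instead of GeoPandas. It does not reproduce the author's code or the CRS code mentioned above. The two rectangular outlines and the names `point_in_polygon`, `neighborhoods`, and `neighborhood_of` are hypothetical; with GeoPandas, the same step would typically be a spatial join (`geopandas.sjoin`) after reprojecting both layers to a common CRS with `to_crs`.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray cast from the point toward +longitude.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside  # each crossing flips inside/outside
    return inside

# Hypothetical, heavily simplified outlines (not real NTA boundaries).
neighborhoods = {
    "Area A": [(-74.00, 40.74), (-73.98, 40.74), (-73.98, 40.76), (-74.00, 40.76)],
    "Area B": [(-73.98, 40.74), (-73.96, 40.74), (-73.96, 40.76), (-73.98, 40.76)],
}

def neighborhood_of(lon, lat):
    """Return the first neighborhood whose outline contains the point, else None."""
    for name, outline in neighborhoods.items():
        if point_in_polygon(lon, lat, outline):
            return name
    return None

print(neighborhood_of(-73.99, 40.75))
print(neighborhood_of(-73.90, 40.75))
```

This only works when the points and the outlines are expressed in the same coordinate system, which is why the CRS of the shapefile matters.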

The three files were then joined in Tableau, and more columns were generated using Tableau functions:

Visualization:

The image below is the final outcome of the dashboard of Manhattan in April:

Note that Tableau provides many powerful interaction options. The dashboard is made up of three sheets, and a filter applied on one sheet updates all sheets that use the same data source. A tooltip with summary information appears on hover. Any neighborhood, day, or hour can act as a filter, and several filters can be active at once (the image below selects the rush hour of a certain day in a certain neighborhood):

Data Analysis:

Now let’s talk about the data and what we found (Tableau provides a data summary at the sheet level, but not on the dashboard):

The data contains 1.8 million pickups in three months; 81% of them are from Manhattan, and 18.76% are from Manhattan Midtown South.

From April to June 2014, New York City had seven rainy periods, each lasting about two days. Five of the seven coincided with obvious abnormal pickup peaks in Manhattan. The exceptions were May 10th and June 9th, when pickups showed no increase at all.
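One way to check a rain effect like this outside Tableau is to compare each rainy day's pickup count with a dry-day baseline. The sketch below is an assumption-laden illustration: the counts and precipitation values are made up (they are not the real 2014 data), and the `rainy_day_report` helper and its 1.2x threshold are hypothetical choices.

```python
from statistics import median

# Made-up daily pickup counts and precipitation (inches); NOT the real data.
daily_pickups = {
    "2014-04-28": 100,
    "2014-04-29": 105,
    "2014-04-30": 160,
    "2014-05-01": 98,
    "2014-05-10": 102,
}
precipitation = {
    "2014-04-28": 0.0,
    "2014-04-29": 0.0,
    "2014-04-30": 1.2,
    "2014-05-01": 0.0,
    "2014-05-10": 0.4,
}

def rainy_day_report(pickups, rain, threshold=1.2):
    """Map each rainy day to True if its pickups exceed threshold x the dry-day median."""
    dry_counts = [n for day, n in pickups.items() if rain.get(day, 0.0) == 0.0]
    baseline = median(dry_counts)
    report = {}
    for day, n in pickups.items():
        if rain.get(day, 0.0) > 0.0:
            report[day] = n > threshold * baseline
    return report

report = rainy_day_report(daily_pickups, precipitation)
print(report)
```

A median baseline is used here so that a few unusually busy days do not inflate the reference level; a per-weekday baseline would be a natural refinement given how strongly the day of the week affects pickups.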

Other than the weather, the most influential factor is the day of the week. The lowest pickup counts are always on Mondays, whereas the peaks are on Fridays and Saturdays. Regarding the hour, a local peak appears during the morning rush hour; by 14:00 pickups already surpass the morning peak, and by the 17:00 rush hour they triple it, at about 3 pickups per minute.
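The day-of-week and hour patterns described above come from simple counts. A minimal sketch of that aggregation, assuming hypothetical timestamps in ISO format (not the real pickups), could look like this:

```python
from collections import Counter
from datetime import datetime

# Hypothetical pickup timestamps; the ISO text format is chosen for the example.
timestamps = [
    "2014-04-07 08:15:00",  # Monday morning
    "2014-04-11 17:05:00",  # Friday evening
    "2014-04-11 17:40:00",  # Friday evening
    "2014-04-12 14:20:00",  # Saturday afternoon
    "2014-04-12 17:55:00",  # Saturday evening
]

parsed = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps]

# Count pickups per day of the week and per hour of the day.
by_weekday = Counter(ts.strftime("%A") for ts in parsed)
by_hour = Counter(ts.hour for ts in parsed)

# Hour with the most pickups in this toy sample.
busiest_hour, busiest_count = by_hour.most_common(1)[0]
print(by_weekday)
print(busiest_hour, busiest_count)
```

Dividing an hourly count by 60 and by the number of days covered gives a pickups-per-minute rate comparable to the figure quoted above.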

Conclusion:

Tableau is a convenient tool for data science and analytics tasks, and it works well with SQL databases. Its built-in data preprocessing and programming functions save a considerable amount of editing time. Tableau performs very well with geographic data and visualizations. On top of that, it provides many audience-friendly interaction features.

