
Data Science

Big Data for Data Scientists

The big data course teaches data scientists the skills needed to scale data science solutions. If you want to build portfolio projects that help you stand out, upskill to become a senior data scientist, or gain the engineering skills needed to become a data science solution architect, this is the right course for you.

This course is suitable for learners who want to

  • work with large-scale ML problems
  • learn how to work in a cloud-first environment
  • elevate data science portfolios via end-to-end big data analytics projects
Format: Online Live
Duration: 6 weeks, 60 hours
Upcoming start date: Feb 13
Registration deadline: February 13, 2024

About the Course

Big Data for Data Scientists is a 6-week, project-based course that teaches data scientists the tools they need to work on large-scale data science problems. The entire course is built around an end-to-end, real-time machine learning problem. Students will learn cutting-edge big data frameworks and tools such as AWS, Apache Spark, Amazon SageMaker, and Databricks. They will also learn how to train machine learning models at scale and deploy them for real-time use.


  • What you will learn
    • AWS Cloud
    • Big Data & Spark
    • Scaling ML with Spark ML
    • MLOps with SageMaker


  • Case-based learning with real-life datasets
    • AdTech – Fraud Detection
    • AdTech – CTR Prediction
    • Recommender Systems
    • Search Engine (Knowledge base)
    • Social Media Analytics

WeCloudData is the perfect place to grow your career


Wenle W., Senior Big Data Developer

After listening to and comparing big data courses in different places in Toronto, I signed up for all of the WeCloudData courses right after Shaohua’s info session without any hesitation. He is not only very knowledgeable and experienced but also teaches so clearly and methodically, which is way beyond my expectations. I have finished the Python and big data courses so far; both are well-organized and project-oriented throughout. You can start by applying what you have learned, exploring your own tools from there, and building up step by step as you learn more throughout the courses.

Whenever I got stuck, I could get all the help I needed from the instructor, TA, and teammates. Sometimes I also got motivated by other teams and pushed forward by my instructor to continue working on my project through in-class progress reports and presentations. The things I learned here and the final project presentation benefited me a lot, giving me confidence in my big data interviews and helping me land job offers. Quite often, I feel I know more than my interviewers!

Many thanks to WeCloudData, it is a great learning platform with a great instructor, TAs and classmates!!

Ranked #1 Data Training Program


Be ready for the new economy

WeCloudData Bootcamps are designed to be project-based. We not only cover essential theories, but also teach how to apply tools and platforms that are in high demand today. Our program curriculum is also highly adaptive to the latest market trends. 

Module 1
Introduction to AWS
Modern data scientists need to become familiar with cloud technologies. Most production data science solutions are implemented on a private or public cloud. This module teaches data scientists the fundamentals of cloud computing, such as storage and compute. Students will learn how to work with AWS Python SDKs to store and retrieve big data, and how to scale analytics solutions to more powerful cloud instances.
  • Understand the concepts of cloud computing
  • Become familiar with AWS’s data science and ML solution ecosystem
  • Understand different use cases of data science in the cloud
  • Learn how to launch EC2 instances/servers on AWS
  • Learn how to retrieve and store big data in AWS S3
Key Skill:
AWS (Amazon Web Services), Boto, S3, Cloud Storage, EC2, Cloud Instance
Module 2
Introduction to Big Data
Data scientists working in big tech, insurance, telecom, retail, and e-commerce often deal with large amounts of data, ranging from hundreds of gigabytes to hundreds of terabytes. In most of those cases, traditional servers and personal computers are not ideal tools, and data scientists need to learn big data tools such as Snowflake and Spark. This module introduces data science learners to the world of big data, with an overview of distributed systems and the big data landscape.
  • Learn how to scale Pandas data processing pipelines
  • Learn when and how to scale ML workloads using Ray and Polars
  • Get familiar with the concepts of distributed systems
  • Learn the basic concepts of MapReduce
Key Skill:
Ray, Polars, Pandas, Dask, Spark
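
To give a feel for the MapReduce concept listed above, here is a minimal single-machine sketch in plain Python. It is only an illustration, not course material: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample documents are hypothetical, and a real framework would run the map and reduce steps on many machines in parallel.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(mapped_pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values for each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on a list of documents."""
    mapped = []
    for doc in documents:
        # In a distributed setting, each document could be mapped on a different worker.
        mapped.extend(map_phase(doc))
    return reduce_phase(shuffle_phase(mapped))


# Hypothetical sample input
docs = ["big data big models", "data pipelines"]
counts = word_count(docs)
print(counts)
```

Because each document is mapped independently and each key is reduced independently, both phases can be spread across workers, which is what makes the pattern suitable for large datasets.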
Module 3
Big Data with Apache Spark
Apache Spark is one of the most popular big data frameworks for data scientists. This module introduces the Spark DataFrame API and teaches you how to scale data processing using Spark SQL and Spark DataFrames. You will experience the performance of Spark on Amazon EMR and Databricks.
  • Understand the Spark ecosystem
  • Learn how to read big data from S3 and Parquet
  • Learn how to process datasets from 50 GB to terabytes in size using Spark DataFrame
Key Skill:
Spark SQL, Spark DataFrame, Apache Spark, Databricks, Parquet, Distributed Computing, MapReduce
Module 4
Scaling Machine Learning with Spark
This module teaches students how to train large-scale ML models using Spark. Students will learn the basics of distributed ML algorithms and how to train and tune models on data-parallel problems.
  • Know when to use Spark for machine learning
  • Learn how to train and tune machine learning models using Spark ML
  • Learn how to train and tune machine learning models using Ray Train
  • Understand the specific use cases of Spark in recommender systems
  • Understand the specific use cases of Spark in ad-tech
  • Understand the specific use cases of Spark in sentiment analysis and text classification
Key Skill:
Spark ML, Distributed Machine Learning, Recommender System, Click-through Rate Prediction, CTR, Sentiment Analysis, Text Classification
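
The idea behind data-parallel training mentioned in this module can be sketched in plain Python. This is a toy illustration under stated assumptions, not how Spark ML or Ray Train is implemented: the one-parameter model `y = w * x`, the function names, and the sample data are all hypothetical, and the per-shard gradients are computed in a loop here where a real system would compute them on separate workers.

```python
def shard_gradient(w, shard):
    """Sum of gradients of 0.5 * (w * x - y) ** 2 over one data shard."""
    return sum((w * x - y) * x for x, y in shard)


def data_parallel_fit(data, num_workers=4, lr=0.01, steps=200):
    """Fit y = w * x by gradient descent, with gradients computed per shard."""
    # Split the data into one shard per (simulated) worker.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w = 0.0
    for _ in range(steps):
        # Each worker computes a gradient on its own shard only.
        grads = [shard_gradient(w, shard) for shard in shards]
        # The driver combines the partial gradients and updates the shared weight.
        w -= lr * sum(grads) / len(data)
    return w


# Hypothetical sample data generated from y = 3 * x
data = [(float(x), 3.0 * x) for x in range(1, 11)]
w = data_parallel_fit(data)
print(round(w, 4))
```

Each worker only needs its own shard and the current weight, so the expensive pass over the data is split across workers while the model update stays small and centralized.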
Module 5
MLOps with SageMaker
This module introduces learners to the world of MLOps. Students will learn how to work with end-to-end platforms such as Amazon SageMaker to build, train, tune, and deploy machine learning models. This module will prepare students for a more advanced MLOps Engineer course.
  • Learn how to work with SageMaker Studio
  • Learn how to prepare data using SageMaker Data Wrangler
  • Learn how to train and deploy models using SageMaker
  • Learn how to monitor ML models using SageMaker
Key Skill:
Amazon SageMaker, SageMaker Data Wrangler, SageMaker Studio, SageMaker Endpoint

Learn from the best

We’ve brought together a team of highly skilled and experienced instructors to help you learn effectively. Our instructors have a passion for teaching and a wealth of real-world experiences in their respective fields, so you can be confident that you’re learning from the best.


Portfolio Experience Building

Make yourself hireable and stand out from the crowd by working on personal big data projects. Here’s what you will experience:

  • Choose a big data problem to focus on
  • Write a project proposal
  • Set up AWS infrastructure
  • Design the ML system diagram
  • Implement and deploy end-to-end ML solutions on AWS
  • Code review with your learning mentor
  • Present your portfolio project
  • Publish your work online

Upcoming Start Dates

Oct 17 - Nov 21, Online Live
Feb 13 - Mar 19, Online Live
Apr 23 - May 28, Online Live

Explore your personalized learning path

Big Data for Data Scientists
  • Case-based learning
  • Portfolio project mentoring
  • Flexible payment plan
Recommended Short Courses
$4,000 - $5,000
  • Enrich your DS experience with advanced DE and AI skills
  • Get alumni discount for other DE, AI, and MLE courses
  • Short courses to consider after completing this course ⇩
Upgrade to Bootcamp
$8,000 - $16,000
  • Upgrade to the DS or AI bootcamp and get alumni discount
  • Get extensive 1-1 career mentoring and job support
  • Get the flexibility to create your own bootcamp

Start Learning With WeCloudOpen

WeCloudOpen is here to help you unlock your full potential in tech with our free courses and workshops. Learn the fundamentals of coding and data, and become a proficient tech professional in no time!

WeCloudOpen Course

Our comprehensive courses on Python and SQL are the perfect way to start your journey into the world of tech. WeCloudOpen ensures you learn the basics without any hassle.

WeCloudOpen Workshop

Our free workshops cover topics like Business Intelligence, Data Science, Data Engineering, DevOps, and Machine Learning, giving you a head start in your tech career.

Student Success

What our graduates are saying


M Chowdhury

I would highly recommend WeCloudData to anyone who wants to learn practical, applied Data Science and Big Data (Spark, Hadoop, AWS stack, Databases, SQL, NoSQL, Python, Machine Learning). I found the team of instructors very helpful. Shaohua is highly experienced in the field. His teaching style is very user-friendly, and he breaks down difficult topics so they are easy to understand.


Grace T.

The lectures have been amazing! Both instructors are awesome – it’s obvious that they are experts in this domain. They both have the ability to explain concepts in such a way that they can be understood. The slides are also very helpful and provide a solid reference point for the topics discussed thus far. Overall, very happy that I am taking this course and would definitely recommend to colleagues and friends.

Let WeCloud Accelerate Your Career in Tech

Have questions?

Want more details about this course? Unsure about which path to take? Apply now to reserve a spot or make an appointment with our learning advisor. 

Start learning with WeCloudOpen

Join WeCloudOpen and start learning today! We provide open courses, career guides, and learning resources. It’s a great way to start your career in tech!


Frequently asked questions about the bootcamp
Learners joining this course will need solid Python programming skills. Some machine learning experience is also required since this course focuses on scalable machine learning. Learners should know packages such as pandas and scikit-learn. Some Linux experience is helpful but not required.
The Big Data course is very hands-on by design. Learners will start working on a capstone project in the 3rd week. There are lots of exercises that will keep learners busy. The lectures are also taught in a hands-on fashion. Learners will follow instructors and TAs to complete labs.
This big data course is designed to prepare students for data scientist jobs that require big data skills. It’s taught at an intermediate to advanced level, and students will complete an end-to-end project. It will help you build awesome portfolio projects and stand out.
Learners will build an end-to-end big data pipeline to add to their data science portfolio. You will be required to come up with your own original ideas, create the services in AWS, load big data into the data lake, and build a machine learning pipeline using Spark ML to solve an ML big data challenge. At the end of the project, you will present your work to experienced instructors and the entire class to get feedback.
While both are big data related courses, this course is for data scientists who want to scale data science and machine learning solutions. It is not a course for data engineers who don’t work with machine learning.
Yes. We have labs on a weekly basis and students work with lab instructors and project mentors when they work on the capstone projects.
No, most of the “big data” processing in this course is done in the cloud. From day one, students learn how to work with AWS, and as long as you have a good internet connection and an AWS account, you’ll be able to do the course.
Yes, you will need to buy AWS credits to take this course. We will mostly work with the free Databricks community account and use AWS free credits. If you go over the free credit limit, you’ll need to pay for the service. Learners shouldn’t expect to spend more than $50 USD on AWS in this course.
Yes, learners need to have Python and machine learning knowledge. If you don’t have ML skills yet, we recommend you take WeCloudData’s machine learning short course first.
Yes, payment plans are available for this course. You can inquire about the payment plan by filling out the course inquiry form. Details are on the course package page, which you are redirected to after filling out the form.
Scholarships are available for bootcamp students. We also offer different kinds of discounts, including an alumni discount.
Some students have their employers cover the tuition. You can always ask your employer about it. We’re happy to provide the curriculum and enrolment letter.
View our Big Data for Data Scientists course package