Upon completing the course, you will be able to:
Use key components of the Apache Hadoop ecosystem (HDFS, MapReduce with streaming, Hive, and Spark) to store and process massive amounts of data.
Import, clean, and query data using Spark SQL and Spark RDDs.
Use the Spark Machine Learning Library (MLlib) to build machine learning models.
Use Amazon Web Services (AWS) to deploy Hadoop and Spark clusters.
Use a cluster to process large datasets that cannot fit on your personal computer.
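To give a flavour of the MapReduce-with-streaming model listed above: with Hadoop Streaming, the mapper and reducer are ordinary programs that read from stdin and write to stdout, and the framework sorts the mapper's output by key before handing it to the reducer. The sketch below is a minimal, self-contained Python illustration of that pattern for word counting; the function names (`mapper`, `reducer`, `wordcount`) are illustrative, and the framework's shuffle/sort step is simulated in-process rather than run on a real cluster.

```python
from itertools import groupby

def mapper(lines):
    # Emit (word, 1) for every word, as a Hadoop Streaming mapper would
    # print "word\t1" lines to stdout.
    for line in lines:
        for word in line.strip().split():
            yield word.lower(), 1

def reducer(pairs):
    # Hadoop delivers pairs grouped (sorted) by key; sum counts per word.
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

def wordcount(lines):
    # sorted() stands in for the framework's shuffle-and-sort phase.
    return dict(reducer(sorted(mapper(lines))))

# Example: wordcount(["to be or not to be"])
# returns {"be": 2, "not": 1, "or": 1, "to": 2}
```

On a real cluster the same mapper and reducer logic would run as separate scripts over HDFS data, which is exactly how the course exercises scale the idea to datasets that do not fit on one machine.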
It is hard, but you can do it. Plus, we're here to help.
Instructor & Manager, Big Data Analytics, TD
A data science enthusiast who likes to dig into data to find useful trends and patterns that support the business, Faraz has more than 10 years of experience in IT and data science in the banking and high-tech industries. He now leads a team of highly talented and effective data scientists and data engineers at one of the major banks in Canada, working on a variety of data science projects. Before joining the bank, Faraz managed a team of data scientists at BlackBerry, where they predicted and inferred user attributes for effective ad targeting, producing a significant revenue lift, and built a content recommendation solution that increased user stickiness. Faraz is also a great instructor. Teaching Big Data Analytics Tools at Ryerson, he drew on his work experience to deliver a highly informative and practical course. He has trained more than 100 professionals in big data technologies such as Hadoop, Spark, and data science.